Looking through the attributes for DMA mappings, it appears that by
default dma_map_sg will try and create a kernel accessible map of the
page. We never access this, as we either have a struct page already or
an iomap, so we can request that the dma mapper does not create one.
Without a kernel map in place, one presumes the rest of the memory
control attributes do not apply. We also explicitly control the caches
around the mappings, so we can ask it not to bother synchronising itself.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200706224308.22636-1-chris@chris-wilson.co.uk
Getting wedged device on driver init is pretty much unrecoverable.
Since we're running various scenarios that may potentially hit this in
CI (module reload / selftests / hotunplug), and if it happens, it means
that we can't trust any subsequent CI results, we should just apply the
taint to let the CI know that it should reboot (CI checks taint between
test runs).
v2: Comment that WEDGED_ON_INIT is non-recoverable, distinguish
WEDGED_ON_INIT from WEDGED_ON_FINI (Chris)
v3: Appease checkpatch, fixup search-replace logic expression mindbomb
in assert (Chris)
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Petri Latvala <petri.latvala@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200706144107.204821-1-michal@hardline.pl
Normally i85x/i865 3D activity will block FBC until a 2D blit
occurs. I suppose this was meant to avoid recompression while
3D activity is still going on but the frame hasn't yet been
presented. Unfortunately that also means that a page flipped
3D workload will permanently block FBC even if it only renders
a single frame and then does nothing.
Since we are using software render tracking anyway we might as
well flip the chicken bit so that 3D does not block FBC. This
will avoid the permament FBC blockage in the aforemention use
case, but thanks to the software tracking the compressor will
not disturb 3D rendering activity.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702153723.24327-5-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Unlike all the other pre-snb desktop platforms i865 actually
supports FBC. Let's enable it.
Quote from the spec:
"DevSDG provides the same Run-Length Encoded Frame Buffer
Compression (RLEFBC) function as exists in DevMGM."
As i865 only has the one pipe we want to skip massaging the
plane<->pipe assignment aimed at getting FBC+LVDS working on
the mobile platforms.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702153723.24327-4-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Consult the actual plane stride instead of the fb stride. The two
will disagree when we remap the gtt. The plane stride is what the
hw will be fed so that's what we should look at for the FBC
retrictions/cfb allocation.
Since we no longer require a fence we are going to attempt using
FBC with remapping, and so we should look at correct stride.
With 90/270 degree rotation the plane stride is stored in units
of pixels, so we need to conver it to bytes for the purposes
of calculating the cfb stride. Not entirely sure if this matches
the hw behaviour though. Need to reverse engineer that at some
point...
We also need to reorder the pixel format check vs. stride check
to avoid triggering a spurious WARN(stride & 63) with cpp==1 and
plane stride==32.
v2: Try to deal with rotated stride and related WARN
Cc: José Roberto de Souza <jose.souza@intel.com>
Fixes: 691f7ba58d ("drm/i915/display/fbc: Make fences a nice-to-have for GEN9+")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702153723.24327-2-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Avoid waking up the device and taking stale locks if we know that the
object is not currently mmapped. This is particularly useful as not many
object are actually mmapped and so we can destroy them without waking
the device up, and gives us a little more freedom of workqueue ordering
during shutdown.
v2: Pull the release_mmap() into its single user in freeing the objects,
where there can not be any race with a concurrent user of the freed
object. Or so one hopes!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>,
Link: https://patchwork.freedesktop.org/patch/msgid/20200702163623.6402-2-chris@chris-wilson.co.uk
We have a mix of dport, intel_dport, intel_dig_port and dig_port to
reference a intel_digital_port struct. Numbers are around
5 intel_dport
36 dport
479 intel_dig_port
352 dig_port
Since we already removed the intel_ prefix from most of our other
structs, do the same here and prefer dig_port.
v2: rename everything in i915, not just a few display sources and
reword commit message (from Matt Roper)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200701045054.23357-1-lucas.demarchi@intel.com
If the driver gets stuck holding the kernel timeline, we cannot issue a
heartbeat and so fail to discover that the driver is indeed stuck and do
not issue a GPU reset (which would hopefully unstick the driver!).
Switch to using a trylock so that we can query if the heartbeat's
timeline mutex is locked elsewhere, and then use the timer to probe if it
remains stuck at the same spot for consecutive heartbeats, indicating
that the mutex has not been released and the engine has not progressed.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200702095219.963-1-chris@chris-wilson.co.uk
intel_dp_set_source_rates() calls intel_dp_is_edp(), which is unsafe to
use before encoder_type is set. This caused GEN11+ to incorrectly strip
HBR3 from source rates for edp. Move intel_dp_set_source_rates() to
after encoder_type is set. Add comment to intel_dp_is_edp() describing
unsafe usages.
v2: Alter intel_dp_set_source_rates final position (Ville/Manasi).
Remove outdated comment (Ville).
Slight optimization of control flow in intel_dp_init_connector.
Slight rewording in commit message.
Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200630233310.10191-1-matthew.s.atwood@intel.com
When the reference clock is 38.4MHz, using the current TBT PLL
fractional divider value results in a slightly off TBT link frequency.
This causes an endless loop of link training success followed by a bad
link signaling and retraining at least on a Dell WD19TB TBT dock. The
workaround provided by the HW team is to divide the fractional divider
value by two. This fixed the link training problem on the ThinkPad dock.
The same workaround is needed on some EHL platforms and for combo PHY
PLLs, these will be addressed in a follow-up.
Bspec: 49204
References: HSDES#22010772725
References: HSDES#14011861142
Reported-and-tested-by: Khaled Almahallawy <khaled.almahallawy@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Khaled Almahallawy <khaled.almahallawy@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200629185848.20550-1-imre.deak@intel.com
The obj->lut_list is traversed when the object is closed as the file
table is destroyed during process termination. As this occurs before we
kill any outstanding context if, due to some bug or another, the closure
is blocked, then we fail to shootdown any inflight operations
potentially leaving the GPU spinning forever. As we only need to guard
the list against concurrent closures and insertions, the hold is short
and merits being treated as a simple spinlock.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200701084439.17025-1-chris@chris-wilson.co.uk
Rearrange the allocation of the mm_struct registration to avoid
allocating underneath the i915->mm_lock, so that we avoid tainting the
lock (and in turn many other locks that may be held as i915->mm_lock is
taken, and those locks we may want on the free [shrinker] paths). In
doing so, we convert the lookup to be RCU protected by courtesy of
converting the free-worker to be an rcu_work.
v2: Remember to use hash_rcu variants to protect the list iteration from
concurrent add/del.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200619194038.5088-1-chris@chris-wilson.co.uk
The default fbc1 compression interval we use is 500 frames. That
translates to over 8 seconds typically. That's rather excessive
so let's drop it to 1 second.
The hardware will not attempt recompression unless at least one
line has been modified, so a shorter compression interval should
not cause extra bandwidth use in the purely idle scenario. Of
course in the mostly idle case we are possibly going to recompress
a bit more.
Should really try to find some kind of sweet spot to minimize
the energy usage...
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429101034.8208-11-ville.syrjala@linux.intel.com
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
The current fence_y_offset calculation is broken. I think it more or
less used to do the right thing, but then I changed the plane code
to put the final x/y source offsets back into the src rectangle so
now it's just subtraacting the same value from itself. The code would
never have worked if we allowed the framebuffer to have a non-zero
offset.
Let's do this in a better way by just calculating the fence_y_offset
from the final plane surface offset. Note that we don't align the
plane surface address to fence rows so with horizontal panning there's
often a horizontal offset from the fence start to the surface address
as well. We have no way to tell the hardware about that so we just
ignore it. Based on some quick tests the invlidation still happens
correctly. I presume due to the invalidation nuking at least the full
line (or a segment of multiple lines).
Fixes: 54d4d719fa ("drm/i915: Overcome display engine stride limits via GTT remapping")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429101034.8208-4-ville.syrjala@linux.intel.com
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Currently there is no null check for a failed memory allocation
on the dsb object and without this a null pointer dereference
error can occur. Fix this by adding a null check.
Note: added a drm_err message in keeping with the error message style
in the function.
Addresses-Coverity: ("Dereference null return")
Fixes: afeda4f3b1 ("drm/i915/dsb: Pre allocate and late cleanup of cmd buffer")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200616114221.73971-1-colin.king@canonical.com