Setting a write-back cache policy in the MOCS entry definition also
implies snooping, which has a considerable overhead. This is
unexpected for a few reasons:
- From user-space's point of view since it didn't want a coherent
surface (it didn't set the buffer as such via the set caching IOCTL).
- There is a separate MOCS entry field for snooping (which we never
set).
- This MOCS table is about caching in (e)LLC and there is no (e)LLC on
BXT. There is a separate table for L3 cache control.
Considering the above the current behavior of snooping looks like an
unintentional side-effect of the WB setting. Changing it to be LLC-UC
gets rid of the snooping without any ill-effects. For a coherent
surface the application would use a separate MOCS entry at index 1 and
call the set caching IOCTL to setup the PTE entries for the
corresponding buffer to be snooped. In the future we could also add a
new MOCS entry for coherent surfaces.
This resulted in 70% improvement in synthetic texturing benchmarks.
Kudos to Valtteri Rantala, Eero Tamminen and Michael T Frederick and
Ville who helped to narrow the source of problem to the kernel and to
the snooping behaviour in particular.
With a follow-up change to adjust the 3rd entry value
igt/gem_mocs_settings is passing after this change.
v2:
- Rebase on v2 of patch 1/2.
v3:
- Set the entry as LLC uncached instead of PTE-passthrough. This way
we also keep snooping disabled, but we also make the cacheability/
coherency setting indepent of the PTE which is managed by the
kernel. (Chris)
CC: Rong R Yang <rong.r.yang@intel.com>
CC: Yakui Zhao <yakui.zhao@intel.com>
CC: Valtteri Rantala <valtteri.rantala@intel.com>
CC: Eero Tamminen <eero.t.tamminen@intel.com>
CC: Michael T Frederick <michael.t.frederick@intel.com>
CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
CC: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Acked-by: Zhao Yakui <yakui.zhao@intel.com>
Tested-by: Rong R Yang <rong.r.yang@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467380406-11954-3-git-send-email-imre.deak@intel.com
(cherry picked from commit 6bee14ed1e)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Unfortunately, there's two situations where we lose hpd right now:
- Runtime suspend
- When we've shut off all of the power wells on Valleyview/Cherryview
While it would be nice if this didn't cause issues, this has the
ability to get us in some awkward states where a user won't be able to
get their display to turn on. For instance; if we boot a Valleyview
system without any monitors connected, it won't need any of it's power
wells and thus shut them off. Since this causes us to lose HPD, this
means that unless the user knows how to ssh into their machine and do a
manual reprobe for monitors, none of the monitors they connect after
booting will actually work.
Eventually we should come up with a better fix then having to enable
polling for this, since this makes rpm a lot less useful, but for now
the infrastructure in i915 just isn't there yet to get hpd in these
situations.
Changes since v1:
- Add comment explaining the addition of the if
(!mode_config->poll_running) in intel_hpd_init()
- Remove unneeded if (!dev->mode_config.poll_enabled) in
i915_hpd_poll_init_work()
- Call to drm_helper_hpd_irq_event() after we disable polling
- Add cancel_work_sync() call to intel_hpd_cancel_work()
Changes since v2:
- Apparently dev->mode_config.poll_running doesn't actually reflect
whether or not a poll is currently in progress, and is actually used
for dynamic module paramter enabling/disabling. So now we instead
keep track of our own poll_running variable in dev_priv->hotplug
- Clean i915_hpd_poll_init_work() a little bit
Changes since v3:
- Remove the now-redundant connector loop in intel_hpd_init(), just
rely on intel_hpd_poll_enable() for setting connector->polled
correctly on each connector
- Get rid of poll_running
- Don't assign enabled in i915_hpd_poll_init_work before we actually
lock dev->mode_config.mutex
- Wrap enabled assignment in i915_hpd_poll_init_work() in READ_ONCE()
for doc purposes
- Do the same for dev_priv->hotplug.poll_enabled with WRITE_ONCE in
intel_hpd_poll_enable()
- Add some comments about racing not mattering in intel_hpd_poll_enable
Changes since v4:
- Rename intel_hpd_poll_enable() to intel_hpd_poll_init()
- Drop the bool argument from intel_hpd_poll_init()
- Remove redundant calls to intel_hpd_poll_init()
- Rename poll_enable_work to poll_init_work
- Add some kerneldoc for intel_hpd_poll_init()
- Cross-reference intel_hpd_poll_init() in intel_hpd_init()
- Just copy the loop from intel_hpd_init() in intel_hpd_poll_init()
Changes since v5:
- Minor kerneldoc nitpicks
Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Lyude <cpaul@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit 19625e85c6)
One of the things preventing us from using polling is the fact that
calling valleyview_crt_detect_hotplug() when there's a VGA cable
connected results in sending another hotplug. With polling enabled when
HPD is disabled, this results in a scenario like this:
- We enable power wells and reset the ADPA
- output_poll_exec does force probe on VGA, triggering a hpd
- HPD handler waits for poll to unlock dev->mode_config.mutex
- output_poll_exec shuts off the ADPA, unlocks dev->mode_config.mutex
- HPD handler runs, resets ADPA and brings us back to the start
This results in an endless irq storm getting sent from the ADPA
whenever a VGA connector gets detected in the middle of polling.
Somewhat based off of the "drm/i915: Disable CRT HPD around force
trigger" patch Ville Syrjälä sent a while back
Cc: stable@vger.kernel.org
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Lyude <cpaul@redhat.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit b236d7c842)
While VGA hotplugging worked(ish) before, it looks like that was mainly
because we'd unintentionally enable it in
valleyview_crt_detect_hotplug() when we did a force trigger. This
doesn't work reliably enough because whenever the display powerwell on
vlv gets disabled, the values set in VLV_ADPA get cleared and
consequently VGA hotplugging gets disabled. This causes bugs such as one
we found on an Intel NUC, where doing the following sequence of
hotplugs:
- Disconnect all monitors
- Connect VGA
- Disconnect VGA
- Connect HDMI
Would result in VGA hotplugging becoming disabled, due to the powerwells
getting toggled in the process of connecting HDMI.
Changes since v3:
- Expose intel_crt_reset() through intel_drv.h and call that in
vlv_display_power_well_init() instead of
encoder->base.funcs->reset(&encoder->base);
Changes since v2:
- Use intel_encoder structs instead of drm_encoder structs
Changes since v1:
- Instead of handling the register writes ourself, we just reuse
intel_crt_detect()
- Instead of resetting the ADPA during display IRQ installation, we now
reset them in vlv_display_power_well_init()
Cc: stable@vger.kernel.org
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Lyude <cpaul@redhat.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
[danvet: Rebase over dev_priv/drm_device embedding.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
(cherry picked from commit 9504a89247)
Never go to sleep waiting on the GPU without first ensuring that we will
get woken up.
We have a choice of queuing the hangcheck before every schedule() or the
first time we wakeup. In order to simply accommodate both the signaler
and the ordinary waiter, move the queuing to the common point of
enabling the irq. We lose the paranoid safety of ensuring that the
hangcheck is active before the sleep, but avoid code duplication (and
redundant hangcheck queuing).
Testcase: igt/prime_busy
Fixes: c81d46138d ("drm/i915: Convert trace-irq to the breadcrumb waiter")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1468055535-19740-2-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
(cherry picked from commit 232af392fd)
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Move the encoder cloning check to happen earlier in the modeset. The
main benefit will be that the debug output from a failed modeset will
be less confusing as output_types can not indicate an invalid
configuration during the later computation stages.
For instance, what happened to me was kms_setmode was attempting one
of its invalid cloning checks during which it asked for DP+VGA cloning
on HSW. In this case the DP .compute_config() was executed after
the FDI .compute_config() leaving the DP link clock (1.62 in this case)
in port_clock, and then later the FDI BW computation tried to use that
as the FDI link clock (which should always be 2.7). 1.62 x 2 wasn't
enough for the mode it was trying to use, and so it ended up rejecting
the modeset, not because of an invalid cloning configuration, but
because of supposedly running out of FDI bandwidth. Took me a while
to figure out what had actually happened.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1466621833-5054-12-git-send-email-ville.syrjala@linux.intel.com
Rather than using wait_for_atomic() when chacking for a response from
the GuC, we can get the effect of a hybrid spin/sleep wait by breaking
it into two stages. First, spin-wait for up to 10us to minimise latency
for "quick" commands; then, if that times out, sleep-wait for up 10ms
(the maximum allowed for a "slow" command).
Being able to do this depends on the recent patch
18f4b84 drm/i915: Use atomic waits for short non-atomic ones
and is similar to the hybrid approach in
1758b90 drm/i915: Use a hybrid scheme for fast register waits
(although we can't use that as-is, because that interface doesn't quite
match what we need here).
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467815411-21756-1-git-send-email-david.s.gordon@intel.com
As we inspect both the tasklet (to check for an active bottom-half) and
set the irq-posted flag at the same time (both in the interrupt handler
and then in the bottom-halt), group those two together into the same
cacheline. (Not having total control over placement of the struct means
we can't guarantee the cacheline boundary, we need to align the kmalloc
and then each struct, but the grouping should help.)
v2: Try a couple of different names for the state touched by the user
interrupt handler.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467805142-22219-3-git-send-email-chris@chris-wilson.co.uk
Since drm_i915_private is now a subclass of drm_device we do not need to
chase the drm_i915_private->dev backpointer and can instead simply
access drm_i915_private->drm directly.
text data bss dec hex filename
1068757 4565 416 1073738 10624a drivers/gpu/drm/i915/i915.ko
1066949 4565 416 1071930 105b3a drivers/gpu/drm/i915/i915.ko
Created by the coccinelle script:
@@
struct drm_i915_private *d;
identifier i;
@@
(
- d->dev->i
+ d->drm.i
|
- d->dev
+ &d->drm
)
and for good measure the dev_priv->dev backpointer was removed entirely.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467711623-2905-4-git-send-email-chris@chris-wilson.co.uk
As we only ever keep the first error state around, we can avoid some
work that can be quite intrusive if we don't record the error the second
time around. This does move the race whereby the user could discard one
error state as the second is being captured, but that race exists in the
current code and we hope that recapturing error state is only done for
debugging.
Note that as we discard the error state for simulated errors, igt that
exercise error capture continue to function.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467618513-4966-3-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
After Joonas complained about using READ_ONCE() on the only use of the
variable in the function, where the intent was to simply document that
the read was intentionally racy and unlocked, I switched the READ_ONCE()
over to lockless_dereference(). However, in linux-next that has a
stronger type-check to only allow pointers and is no longer
interchangeable with READ_ONCE(), see commit 331b6d8c7a
("locking/barriers: Validate lockless_dereference() is used on a pointer
type")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: 67d97da349 ("drm/i915: Only start retire worker when idle")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467705276-707-1-git-send-email-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Some IS_ and HAS_ macros can return any non-zero value for true.
One potential problem with that is that someone could assign
them to integers and be surprised with the result. Therefore it
is probably safer to do the conversion to 0/1 in the macros
themselves.
Luckily this does not seem to have an effect on code size.
Only one call site was getting bit by this and a patch for
that has been sent as "drm/i915/guc: Protect against HAS_GUC_*
returning true values other than one".
v2: Added some extra braces as suggested by checkpatch.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/1467643823-9798-1-git-send-email-tvrtko.ursulin@linux.intel.com
Since we now subclass struct drm_device, we can save pointer dances by
noting the equivalence of struct drm_device and struct drm_i915_private,
i.e. by using to_i915().
text data bss dec hex filename
1073824 4562 416 1078802 107612 drivers/gpu/drm/i915/i915.ko
1068976 4562 416 1073954 106322 drivers/gpu/drm/i915/i915.ko
Created by the coccinelle script:
@@
expression E;
identifier p;
@@
- struct drm_i915_private *p = E->dev_private;
+ struct drm_i915_private *p = to_i915(E);
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467628477-25379-1-git-send-email-chris@chris-wilson.co.uk
igt likes to inject GPU hangs into its command streams. However, as we
expect these hangs, we don't actually want them recorded in the dmesg
output or stored in the i915_error_state (usually). To accommodate this
allow userspace to set a flag on the context that any hang emanating
from that context will not be recorded. We still do the error capture
(otherwise how do we find the guilty context and know its intent?) as
part of the reason for random GPU hang injection is to exercise the race
conditions between the error capture and normal execution.
v2: Split out the request->ringbuf error capture changes.
v3: Move the flag defines next to the intel_context->flags definition
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-9-git-send-email-chris@chris-wilson.co.uk
The request tells us where to read the ringbuf from, so use that
information to simplify the error capture. If no request was active at
the time of the hang, the ring is idle and there is no information
inside the ring pertaining to the hang.
Note carefully that this will reduce the amount of information stored in
the error state - any ring without an active request will not be
recorded.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Dave Gordon <david.s.gordon@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1467616119-4093-8-git-send-email-chris@chris-wilson.co.uk