When doing a modeset where the sink is transitioning from D3 to D0 , it
would sometimes be possible for the initial power_up_phy() to start
timing out. This would only be observed in the last action before the
sink went into D3 mode was intel_dp_sink_dpms(DRM_MODE_DPMS_OFF). We
originally thought this might be an issue with us accidentally shutting
off the aux block when putting the sink into D3, but since the DP spec
mandates that sinks must wake up within 1ms while we have 100ms to
respond to an ESI irq, this didn't really add up. Turns out that the
problem is more subtle then that:
It turns out that the timeout is from us not enabling DPMS on the MST
hub before actually trying to initiate sideband communications. This
would cause the first sideband communication (power_up_phy()), to start
timing out because the sink wasn't ready to respond. Afterwards, we
would call intel_dp_sink_dpms(DRM_MODE_DPMS_ON) in
intel_ddi_pre_enable_dp(), which would actually result in waking up the
sink so that sideband requests would work again.
Since DPMS is what lets us actually bring the hub up into a state where
sideband communications become functional again, we just need to make
sure to enable DPMS on the display before attempting to perform sideband
communications.
Changes since v1:
- Remove comment above if (!intel_dp->is_mst) - vsryjala
- Move intel_dp_sink_dpms() for MST into intel_dp_post_disable_mst() to
keep enable/disable paths symmetrical
- Improve commit message - dhnkrn
Changes since v2:
- Only send DPMS off when we're disabling the last sink, and only send
DPMS on when we're enabling the first sink - dhnkrn
Changes since v3:
- Check against is_mst, not intel_dp->is_mst - dhnkrn/vsyrjala
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Tested-by: Laura Abbott <labbott@redhat.com>
Cc: stable@vger.kernel.org
Fixes: ad260ab32a ("drm/i915/dp: Write to SET_POWER dpcd to enable MST hub.")
Link: https://patchwork.freedesktop.org/patch/msgid/20180407011053.22437-1-lyude@redhat.com
Currently, we rely on inspecting the hangcheck state from within the
i915_reset() routines to determine which engines were guilty of the
hang. This is problematic for cases where we want to run
i915_handle_error() and call i915_reset() independently of hangcheck.
Instead of relying on the indirect parameter passing, turn it into an
explicit parameter providing the set of stalled engines which then are
treated as guilty until proven innocent.
While we are removing the implicit stalled parameter, also make the
reason into an explicit parameter to i915_reset(). We still need a
back-channel for i915_handle_error() to hand over the task to the locked
waiter, but let's keep that its own channel rather than incriminate
another.
This leaves stalled/seqno as being private to hangcheck, with no more
nefarious snooping by reset, be it whole-device or per-engine. \o/
The only real issue now is that this makes it crystal clear that we
don't actually do any testing of hangcheck per se in
drv_selftest/live_hangcheck, merely of resets!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180406220354.18911-2-chris@chris-wilson.co.uk
BSpec says:
"Second level interrupt events are stored in the GT INT DW. GT INT DW is
a double buffered structure. A snapshot of events is taken when SW reads
GT INT DW. From the time of read to the time of SW completely clearing
GT INT DW (to indicate end of service), all incoming interrupts are logged
in a secondary storage structure. this guarantees that the record of
interrupts SW is servicing will not change while under service".
We read GT INT DW in several places now:
- The IRQ handler (banks 0 and 1) where, hopefully, it is completely
cleared (operation now covered with the irq lock).
- The 'reset' interrupts functions for RPS and GuC logs, where we clear
the bit we are interested in and leave the others for the normal
interrupt handler.
- The 'enable' interrupts functions for RPS and GuC logs, as a measure
of precaution. Here we could relax a bit and don't check GT INT DW
at all or, if we do, at least we should clear the offending bit
(which is what this patch does).
Note that, if every bit is cleared on reading GT INT DW, the register
won't be locked. Also note that, according to the BSpec, GT INT DW
cannot be cleared without first servicing the Selector & Shared IIR
registers.
v2:
- Remove some code duplication (Tvrtko)
- Make sure GT_INTR_DW are protected by the irq spinlock, since it's a
global resource (Tvrtko)
v3: Optimize the spinlock (Tvrtko)
v4: Rebase.
v5:
- take spinlock on outer scope to please sparse (Mika)
- use raw_reg accessors (Mika)
v6: omit the continue in looping banks (Michel)
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Sagar Arun Kamble <sagar.a.kamble@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> (v4)
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180406093237.14548-1-mika.kuoppala@linux.intel.com
We don't handle resetting the kernel context very well, or presumably any
context executing its breadcrumb commands in the ring as opposed to the
batchbuffer and flush. If we trigger a device reset twice in quick
succession while the kernel context is executing, we may end up skipping
the breadcrumb. This is really only a problem for the selftest as
normally there is a large interlude between resets (hangcheck), or we
focus on resetting just one engine and so avoid repeatedly resetting
innocents.
Something to try would be a preempt-to-idle to quiesce the engine before
reset, so that innocent contexts would be spared the reset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
CC: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180330131801.18327-1-chris@chris-wilson.co.uk
If we skip the intel_prepare_reset(), we should also skip the
intel_display_reset(). If we we use a flag set by intel_prepare_reset()
then we do not have to second guess based on external user controlled
state whether or not the prepare was called before deciding to finish
it after the reset. igt/gem_eio is one such example that may tweak
i915.reset faster than the code is expecting, leading to
[ 190.233528] =====================================
[ 190.233534] WARNING: bad unlock balance detected!
[ 190.233540] 4.16.0-rc7-g335ef9849310-drmtip_10+ #1 Tainted: G U
[ 190.233547] -------------------------------------
[ 190.233553] gem_eio/1348 is trying to release lock (crtc_ww_class_acquire) at:
[ 190.233569] [<ffffffff895c7810>] drm_modeset_acquire_fini+0x0/0x60
[ 190.233575] but there are no more locks to release!
[ 190.233580]
other info that might help us debug this:
[ 190.233588] 3 locks held by gem_eio/1348:
[ 190.233592] #0: (&f->f_pos_lock){+.+.}, at: [<00000000ab90c784>] __fdget_pos+0x3a/0x50
[ 190.233607] #1: (sb_writers#11){.+.+}, at: [<00000000e1529265>] vfs_write+0x188/0x1a0
[ 190.233622] #2: (&attr->mutex){+.+.}, at: [<0000000011f40afe>] simple_attr_write+0x36/0xd0
[ 190.233635]
stack backtrace:
[ 190.233644] CPU: 0 PID: 1348 Comm: gem_eio Tainted: G U 4.16.0-rc7-g335ef9849310-drmtip_10+ #1
[ 190.233655] Hardware name: Dell Inc. OptiPlex GX280 /0G8310, BIOS A04 02/09/2005
[ 190.233664] Call Trace:
[ 190.233674] dump_stack+0x67/0x95
[ 190.233682] ? drm_modeset_backoff+0x1b0/0x1b0
[ 190.233690] print_unlock_imbalance_bug+0xd2/0xe0
[ 190.233698] ? drm_modeset_backoff+0x1b0/0x1b0
[ 190.233704] lock_release+0x23e/0x300
[ 190.233712] drm_modeset_acquire_fini+0x16/0x60
[ 190.233835] intel_finish_reset+0x72/0x160 [i915]
[ 190.233894] i915_reset_device+0x1e9/0x240 [i915]
[ 190.233953] ? __intel_get_crtc_scanline+0x1c0/0x1c0 [i915]
[ 190.233962] ? work_on_cpu_safe+0x50/0x50
[ 190.234020] i915_handle_error+0x1f2/0x470 [i915]
[ 190.234031] ? __might_fault+0x39/0x90
[ 190.234037] ? __might_fault+0x39/0x90
[ 190.234099] i915_wedged_set+0x7f/0xc0 [i915]
[ 190.234107] simple_attr_write+0xb0/0xd0
[ 190.234117] full_proxy_write+0x51/0x80
[ 190.234125] __vfs_write+0x21/0x140
[ 190.234133] ? rcu_read_lock_sched_held+0x6f/0x80
[ 190.234140] ? rcu_sync_lockdep_assert+0x29/0x50
[ 190.234147] ? __sb_start_write+0x152/0x1f0
[ 190.234152] ? __sb_start_write+0x168/0x1f0
[ 190.234159] vfs_write+0xbd/0x1a0
[ 190.234166] SyS_write+0x40/0xa0
[ 190.234173] ? do_syscall_64+0x19/0x1b0
[ 190.234180] do_syscall_64+0x6b/0x1b0
[ 190.234188] entry_SYSCALL_64_after_hwframe+0x42/0xb7
[ 190.234196] RIP: 0033:0x7f84c1b392b7
[ 190.234201] RSP: 002b:00007f84b6755b00 EFLAGS: 00000293 ORIG_RAX: 0000000000000001
[ 190.234211] RAX: ffffffffffffffda RBX: 0000000000000046 RCX: 00007f84c1b392b7
[ 190.234218] RDX: 0000000000000002 RSI: 000055ec20abc8d6 RDI: 0000000000000046
[ 190.234225] RBP: 000055ec20abc8d6 R08: 0000000000000000 R09: 0000000000000000
[ 190.234231] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000000002
[ 190.234238] R13: 0000000000000000 R14: 00007f84b0000b20 R15: 000055ec20ce4eb8
Testcase: igt/gem_eio
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180405123714.3638-1-chris@chris-wilson.co.uk
As per DP spec when R0 mismatch is detected, HDCP source supported
re-read the R0 atleast twice.
And For HDMI and DP minimum wait required for the R0 availability is
100mSec. So this patch changes the wait time to 100mSec but retries
twice with the time interval of 100mSec for each attempt.
This patch is needed for DP HDCP1.4 CTS Test: 1A-06.
v2:
No Change
v3:
Comment on R0 retry is moved closer to the code[Seanpaul]
v4:
Removing unwanted noise introduced in v3.
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1522669822-2508-1-git-send-email-ramalingam.c@intel.com
Commit 'aee3bac0a3a8 ("drm/i915/psr: Tie PSR2 support to Y
coordinate requirement")' got merged to drm-intel-next-queued
but the variable was defined commit 'c5fe47327b06 ("drm: Add PSR
version 3 macro") who was merged through drm-misc.
So backmerging to get drm-intel-next-queued compiling back again.
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Although i915 don't implement aux sync frame through tests was
findout that pannels can do selective update when the y-coordinate
is also included in SDP, that is why it is required to run PSR2 in
i915.
So moving to only one place the sink requirements that the actual
driver needs to enable PSR2.
Also intel_psr2_config_valid() is called every time the crtc config
is computed, wasting some time every time it was checking for
Y coordinate requirement.
This allow us to nuke y_cord_support and some of VSC setup code that
was handling a scenario that would never happen(PSR2 without Y
coordinate).
Also here renaming intel_dp_get_y_cord_status() to
intel_dp_get_y_coord_required() as it more accurate to the name and
function of bit according to eDP spec.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180328223046.16125-4-jose.souza@intel.com
eDP spec states that aux frame is required to do PSR2 selective
update but i915 don't fully implement it. It sends the aux frame
sync messages but the value is always zero as the GTC is not enabled
in driver.
Through tests was findout that pannels can do selective update when
the y-coordinate is also included in SDP, that is why it is required
to run PSR2 in i915.
A dummy value is not useful at all to sink, so removing everything
related to aux frame sync, if GTC is enabled we can bring this back.
Cc: Vathsala Nagaraju <vathsala.nagaraju@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180328223046.16125-3-jose.souza@intel.com
Only sleep and repeat when asked for a full device reset (ALL_ENGINES)
and avoid using sleeping waits when asked for a per-engine reset. The
goal is to be able to use a per-engine reset from hardirq/softirq/timer
context. A consequence is that our individual wait timeouts are a
thousand times shorter, on the order of a hundred microseconds rather
than hundreds of millisecond. This may make hitting the timeouts more
common, but hopefully the fallover to the full-device reset will be
sufficient to pick up the pieces.
Note, that the sleeps inside older gen (pre-gen8) have been left as they
are only used in full device reset mode.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
CC: Michel Thierry <michel.thierry@intel.com>
Cc: Jeff McGee <jeff.mcgee@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180329224519.13598-1-chris@chris-wilson.co.uk
- Mask mode type garbage from userspace (Ville)
Something went wrong on the misc tree side, but I'll pull the patch directly.
* 'drm-misc-next-fixes' of git://anongit.freedesktop.org/drm/drm-misc:
drm: Fix uabi regression by allowing garbage mode->type from userspace