It is debatable whether having an error message on suspend for forcibly
cancelling outstanding work is worthwhile. We want to know if it occurs
in the wild (as we will then have to reconsider the approach!), but
equally is not fatal across suspend, as upon resume we automatically
clear the wedged status.
However, CI does trigger this scenario with gem_eio/suspend; as there we
are intentionally wedging the device upon suspend. The dilemma is how
not to trigger a failure report for the dmesg spam, for which the
quickest response is to suppress the warning in the kernel. I'd rather
mark it as accepted in gem_eio, but for now detecting when gem_eio is
playing games and cancelling the warning for that case seems a barely
acceptable hack.
Testcase: igt/gem_eio/suspend
Reference: 5861b013e2 ("drm/i915: Do a synchronous switch-to-kernel-context on idling")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308134512.19115-1-chris@chris-wilson.co.uk
If PSR1 is active when pipe CRC is enabled the CRC calculations will
be inhibit by the transition to low power states that PSR1 brings.
So lets force a PSR1 exit and as soon as pipe CRC is enabled it will
block PSR1 activation and avoid CRC timeouts when running IGT tests.
There is a little window between the call to force exit PSR and the
write to pipe CRC registers that needs to happen within the minimum
of 6 idles frames otherwise PSR1 will be active again causing the CRC
timeouts but anyways this will at least reduce the occurrence of CRC
timeouts.
This can possibily fix issues present right now but I did not found
any open, I mostly got this issue from previous CI runs of this
series, bellow some exambles:
* igt@kms_color@pipe-b-ctm-0-75:
- shard-apl: PASS -> FAIL +9
* igt@kms_cursor_legacy@flip-vs-cursor-busy-crc-legacy:
- shard-apl: PASS -> DMESG-FAIL +17
* igt@kms_frontbuffer_tracking@fbc-1p-primscrn-indfb-pgflip-blt:
- shard-kbl: PASS -> DMESG-FAIL +12
* igt@kms_pipe_crc_basic@read-crc-pipe-c:
- shard-kbl: PASS -> FAIL +7
v6: s/PSR/PSR1 (Dhinakaran)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308000050.6226-8-jose.souza@intel.com
When PSR2 is active aka after the number of frames programmed in
PSR2_CTL 'Frames Before SU Entry' hardware stops to generate CRC
interrupts causing IGT tests to fail due timeout.
This same behavior don't happen with PSR1, as soon as pipe CRC is
enabled it blocks PSR1 activation so CRC calculation continues to
happens normaly.
This patch also set mode_changed as true when PSR is available to
force atomic check functions to compute new PSR state, otherwise PSR2
would not be disabled.
v4: Only setting mode_changed if has_psr is set(Dhinakaran)
v3: Reusing intel_crtc_crc_prepare() and crc_enabled, only setting
mode_changed if it can do PSR.
v2: Changed commit description to describe that PSR2 inhibit CRC
calculations.
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308000050.6226-6-jose.souza@intel.com
In any commit, intel_modeset_pipe_config() will initialilly clear
and then recalculate most of the pipe states but it leave intel
specific color features states in reset state.
If after intel_pipe_config_compare() is detected that a fastset is
possible it will mark update_pipe as true and unsed mode_changed,
causing the color features state to be kept in reset state and then
latter being committed to hardware disabling the color features.
This issue can be reproduced by any code patch that duplicates the
actual(with color features already enabled) state and only mark
mode_changed as true.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308000050.6226-3-jose.souza@intel.com
Introduce a mutex to start locking the HW contexts independently of
struct_mutex, with a view to reducing the coarse struct_mutex. The
intel_context.pin_mutex is used to guard the transition to and from being
pinned on the gpu, and so is required before starting to build any
request. The intel_context will then remain pinned until the request
completes, but the mutex can be released immediately unpin completion of
pinning the context.
A slight variant of the above is used by per-context sseu that wants to
inspect the pinned status of the context, and requires that it remains
stable (either !pinned or pinned) across its operation. By using the
pin_mutex to serialise operations while pin_count==0, we can take that
pin_mutex for stabilise the boolean pin status.
v2: for Tvrtko!
* Improved commit message.
* Dropped _gpu suffix from gen8_modify_rpcs_gpu.
v3: Repair the locking for sseu selftests
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308132522.21573-7-chris@chris-wilson.co.uk
In preparation for an ever growing number of engines and so ever
increasing static array of HW contexts within the GEM context, move the
array over to an rbtree, allocated upon first use.
Unfortunately, this imposes an rbtree lookup at a few frequent callsites,
but we should be able to mitigate those by moving over to using the HW
context as our primary type and so only incur the lookup on the boundary
with the user GEM context and engines.
v2: Check for no HW context in guc_stage_desc_init
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308132522.21573-4-chris@chris-wilson.co.uk
PPS locking is a thing on pre-DDI, up to and including CPT and PPT.
The PPS divisor register exists up to gen 9 BC, replaced by a field in
the control register starting from gen 9 LP, i.e. BXT, GLK, and CNP on.
Commit b0a08bec96 ("drm/i915/bxt: eDP Panel Power sequencing") stopped
using the divisor register, but inadvertently conflated the PPS unlock
in the change. No longer doing the unlocking was the right thing to do,
however we should've stopped already at LPT (or DDI platforms).
Deconflate the two.
Arguably this could be moved away from here altogether, but this is the
minimally intrusive change for now.
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305135215.29862-1-jani.nikula@intel.com
Currently we assume that we know the order in which requests run and so
can determine if we need to reissue a switch-to-kernel-context prior to
idling. That assumption does not hold for the future, so instead of
tracking which barriers have been used, simply determine if we have ever
switched away from the kernel context by using the engine and before
idling ensure that all engines that have been used since the last idle
are synchronously switched back to the kernel context for safety (and
else of shrinking memory while idle).
v2: Use intel_engine_mask_t and ALL_ENGINES
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308093657.8640-3-chris@chris-wilson.co.uk
When the system idles, we switch to the kernel context as a defensive
measure (no users are harmed if the kernel context is lost). Currently,
we issue a switch to kernel context and then come back later to see if
the kernel context is still current and the system is idle. However,
if we are no longer privy to the runqueue ordering, then we have to
relax our assumptions about the logical state of the GPU and the only
way to ensure that the kernel context is currently loaded is by issuing
a request to run after all others, and wait for it to complete all while
preventing anyone else from issuing their own requests.
v2: Pull wedging into switch_to_kernel_context_sync() but only after
waiting (though only for the same short delay) for the active context to
finish.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190308093657.8640-1-chris@chris-wilson.co.uk
Now with the watermarks fixes merged, Icelake is stable enough to
have the alpha support protection flag removed.
We have a few ICL machines in our CI and it is mostly green with
failures in tests that will not impact future linux installations.
Also there is no warnings, errors, flickering or any visual defects
while doing ordinary tasks like browsing and editing documents in a
dual monitor setup.
As a reminder i915.alpha_support was created to protect
future linux installation's iso images that might contain a
kernel from the enabling time of the new platform. Without this
protection most of linux installation was recommending
nomodeset option during installation that was getting stick
there after installation.
Specifically, alpha support says nothing about the development
state of the hardware, and everything about the state of the
driver in a kernel release.
This is semantically no different from the old
preliminary_hw_support flag, but the old one was all too often
interpreted as (preliminary hw) support instead of the intended
(preliminary) hw support, and it was misleading for everyone.
Hence the rename.
Reference: https://intel-gfx-ci.01.org/tree/drm-tip/fi-icl-y.html
Reference: https://intel-gfx-ci.01.org/tree/drm-tip/shard-iclb.html
Cc: James Ausmus <james.ausmus@intel.com>
Cc: Jani Saarinen <jani.saarinen@intel.com>
Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190305221153.359-1-jose.souza@intel.com
To facilitate the next patch to allow preemptible kernels not to incur
the wrath of hangcheck, we need to ensure that we can still suspend and
shutdown. That is we will not be able to rely on hangcheck to terminate
a blocking kernel and instead must manually do so ourselves. The
advantage is that we can apply more pressure!
As we now perform a GPU reset to clean up any residual kernels, we leave
the GPU in an unknown state and in particular can not talk to the GuC
before we reinitialise it following resume. For example, we no longer
need to tell the GuC to suspend itself, as it is already reset.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190307104530.21745-2-chris@chris-wilson.co.uk