When we write to ELSP, it triggers a context preemption at the earliest
arbitration point (3DPRIMITIVE, some PIPECONTROLs, a few other
operations and the explicit MI_ARB_CHECK). If this is to the same
context, it triggers a LITE_RESTORE where the RING_TAIL is merely
updated (used currently to chain requests from the same context
together, avoiding bubbles). However, if it is to a different context, a
full context-switch is performed and it will start to execute the new
context saving the image of the old for later execution.
Previously we avoided preemption by only submitting a new context when
the old was idle. But now we wish embrace it, and if the new request has
a higher priority than the currently executing request, we write to the
ELSP regardless, thus triggering preemption, but we tell the GPU to
switch to our special preemption context (not the target). In the
context-switch interrupt handler, we know that the previous contexts
have finished execution and so can unwind all the incomplete requests
and compute the new highest priority request to execute.
It would be feasible to avoid the switch-to-idle intermediate by
programming the ELSP with the target context. The difficulty is in
tracking which request that should be whilst maintaining the dependency
change, the error comes in with coalesced requests. As we only track the
most recent request and its priority, we may run into the issue of being
tricked in preempting a high priority request that was followed by a
low priority request from the same context (e.g. for PI); worse still
that earlier request may be our own dependency and the order then broken
by preemption. By injecting the switch-to-idle and then recomputing the
priority queue, we avoid the issue with tracking in-flight coalesced
requests. Having tried the preempt-to-busy approach, and failed to find
a way around the coalesced priority issue, Michal's original proposal to
inject an idle context (based on handling GuC preemption) succeeds.
The current heuristic for deciding when to preempt are only if the new
request is of higher priority, and has the privileged priority of
greater than 0. Note that the scheduler remains unfair!
v2: Disable for gen8 (bdw/bsw) as we need additional w/a for GPGPU.
Since, the feature is now conditional and not always available when we
have a scheduler, make it known via the HAS_SCHEDULER GETPARAM (now a
capability mask).
v3: Stylistic tweaks.
v4: Appease Joonas with a snippet of kerneldoc, only to fuel to fire of
the preempt vs preempting debate.
Suggested-by: Michal Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Ben Widawsky <benjamin.widawsky@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Zhi Wang <zhi.a.wang@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-8-chris@chris-wilson.co.uk
In the next few patches, we wish to enable different features for the
scheduler, some which may subtlety change ABI (e.g. allow requests to be
reordered under different circumstances). So we need to make sure
userspace is cognizant of the changes (if they care), by which we employ
the usual method of a GETPARAM. We already have an
I915_PARAM_HAS_SCHEDULER (which notes the existing ability to reorder
requests to avoid bubbles), and now we wish to extend that to be a
bitmask to describe the different capabilities implemented.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-7-chris@chris-wilson.co.uk
With preemption, we will want to "unsubmit" a request, taking it back
from the hw and returning it to the priority sorted execution list. In
order to know where to insert it into that list, we need to remember
its adjust priority (which may change even as it was being executed).
This also affects reset for execlists as we are now unsubmitting the
requests following the reset (rather than directly writing the ELSP for
the inflight contexts). This turns reset into an accidental preemption
point, as after the reset we may choose a different pair of contexts to
submit to hw.
GuC is not updated as this series doesn't add preemption to the GuC
submission, and so it can keep benefiting from the early pruning of the
DFS inside execlists_schedule() for a little longer. We also need to
find a way of reducing the cost of that DFS...
v2: Include priority in error-state
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-6-chris@chris-wilson.co.uk
Add another perma-pinned context for using for preemption at any time.
We cannot just reuse the existing kernel context, as first and foremost
we need to ensure that we can preempt the kernel context itself, so
require a distinct context id. Similar to the kernel context, we may
want to interrupt execution and switch to the preempt context at any
time, and so it needs to be permanently pinned and available.
To compensate for yet another permanent allocation, we shrink the
existing context and the new context by reducing their ringbuffer to the
minimum.
v2: Assert that we never allocate a request from the preemption context.
v3: Limit perma-pin to engines that may preempt.
v4: Onion cleanup for early driver death
v5: Onion ordering in main driver cleanup as well.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-4-chris@chris-wilson.co.uk
Supporting fine-granularity preemption levels may require changes in
userspace batch buffer programming. Therefore, we need to fallback to
safe default values, rather that use hardware defaults. Userspace is
still able to enable fine-granularity, since we're whitelisting the
register controlling it in WaEnablePreemptionGranularityControlByUMD.
v2: Extend w/a to cover Cannonlake
v3: Fix commentary to include both fake w/a names.
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003203453.15692-2-chris@chris-wilson.co.uk
According to Spec for SKL+: "Isochronous Priority Control.
If enabled, Display sends demoted requests once the transition
watermark is reached. If transition watermark is not enabled,
Display sends demoted requests when the display buffer is full."
The commit 'e57f1c02155f ("drm/i915/gen9+: Add has_ipc flag in
device info structure")' introduced that as gen9+ but missing many
SKL Skus.
I believe the reason for that is Spec also mentions workarounds for
SKL-ALL: "IPC (Isoch Priority Control) may cause underflows
WA: Do not enable IPC in register ARB_CTL2"
It seems lame to add the feature and forever disable it,
but it will avoid a mistake of enabling it when we are reorganizing
the feature definitions on i915_pci.c later.
It will also allow us to probably extend that workaround for
other platforms.
Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171003063652.17248-1-rodrigo.vivi@intel.com
The common lane power down flag of a DPIO PHY has a funky semantic:
after the initial enabling of the PHY (so from a disabled state) this
flag will be clear. It will be set only after the PHY will be used for
the first time (for instance due to enabling the corresponding pipe) and
then become unused (due to disabling the pipe). During the initial PHY
enablement we don't know which of the above phases we are in, so move
the check for the flag where this is known, the HW readout code. This is
where the rest of lane power down status checks are done anyway.
This fixes at least a problem on GLK where after module reloading, the
common lane power down flag of PHY1 is set, but the PHY is actually
powered-on and properly set up. The GRC readout code for other PHYs will
hence think that PHY1 is not powered initially and disable it after the
GRC readout. This will cause the AUX power well related to PHY1 to get
disabled in a stuck state, timing out when we try to enable it later.
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Fixes: e93da0a013 ("drm/i915/bxt: Sanitiy check the PHY lane power down status")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102777
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171002135307.26117-1-imre.deak@intel.com
On GLK and CNL enabling a pipe with its pipe scaler enabled will result
in a FIFO underrun. This happens only once after driver loading or
system/runtime resume, more specifically after power well 1 gets
enabled; subsequent modesets seem to be free of underruns. The BSpec
workaround for this is to disable the pipe scaler clock gating for the
duration of modeset. Based on my tests disabling clock gating must be
done before enabling pipe scaling and we can re-enable it after the pipe
is enabled and one vblank has passed.
For consistency I also checked if plane scaling would cause the same
problem, but that doesn't seem to trigger this problem.
The patch is based on an earlier version from Ander.
v2 (Rodrigo):
- Set also CLKGATE_DIS_PSL bits 8 and 9.
- Add also the BSpec workaround ID.
Cc: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100302
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171002075557.32615-1-imre.deak@intel.com
One of the differences I spotted for GEN8+ platforms when
compared to older platforms is that spec for BDW+ includes
this sentence:
"The first CRC done indication after CRC is first enabled is
from only a partial frame, so it will not have the expected
CRC result."
This is an indication that on BDW+ platforms, by the time
we receive the interrupt the CRC is not accurate yet for
the full frame. That would be ok, because we are already
skipping the first CRC for all platforms. However the comment
on the code state that it is for some unknown reason. Also,
on CHV (gen8 lp) we were already discarding the second CRC
as well to make sure we have a reliable CRC on hand.
So based on all ou tests and bugs it seems that it is not
on CHV that needs to discard 2 first CRCs, but all BDW+
platforms.
Starting on SKL we have this CRC done bit (24), but the
experiments around the use of this bit wasn't that stable
as just discarding the second CRC. So, let's for now
just move with CHV solution for all gen8+ platforms and
make our CI a bit more stable.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102374
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101309
Cc: Mika Kahola <mika.kahola@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tested-by: Mika Kahola <mika.kahola@intel.com>
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170928002040.7917-1-rodrigo.vivi@intel.com
First feature pull for 4.15. Highlights:
- Per VM BO support
- Lots of powerplay cleanups
- Powerplay support for CI
- pasid mgr for kfd
- interrupt infrastructure for recoverable page faults
- SR-IOV fixes
- initial GPU reset for vega10
- prime mmap support
- ttm page table debugging improvements
- lots of bug fixes
* 'drm-next-4.15' of git://people.freedesktop.org/~agd5f/linux: (232 commits)
drm/amdgpu: clarify license in amdgpu_trace_points.c
drm/amdgpu: Add gem_prime_mmap support
drm/amd/powerplay: delete dead code in smumgr
drm/amd/powerplay: delete SMUM_FIELD_MASK
drm/amd/powerplay: delete SMUM_WAIT_INDIRECT_FIELD
drm/amd/powerplay: delete SMUM_READ_FIELD
drm/amd/powerplay: delete SMUM_SET_FIELD
drm/amd/powerplay: delete SMUM_READ_VFPF_INDIRECT_FIELD
drm/amd/powerplay: delete SMUM_WRITE_VFPF_INDIRECT_FIELD
drm/amd/powerplay: delete SMUM_WRITE_FIELD
drm/amd/powerplay: delete SMU_WRITE_INDIRECT_FIELD
drm/amd/powerplay: move macros to hwmgr.h
drm/amd/powerplay: move PHM_WAIT_VFPF_INDIRECT_FIELD to hwmgr.h
drm/amd/powerplay: move SMUM_WAIT_VFPF_INDIRECT_FIELD_UNEQUAL to hwmgr.h
drm/amd/powerplay: move SMUM_WAIT_INDIRECT_FIELD_UNEQUAL to hwmgr.h
drm/amd/powerplay: add new helper functions in hwmgr.h
drm/amd/powerplay: use SMU_IND_INDEX/DATA_11 pair
drm/amd/powerplay: refine powerplay code.
drm/amd/powerplay: delete dead code in hwmgr.h
drm/amd/powerplay: refine interface in struct pp_smumgr_func
...
Getting started with v4.15 features:
- Cannonlake workarounds (Rodrigo, Oscar)
- Infoframe refactoring and fixes to enable infoframes for DP (Ville)
- VBT definition updates (Jani)
- Sparse warning fixes (Ville, Chris)
- Crtc state usage fixes and cleanups (Ville)
- DP vswing, pre-emph and buffer translation refactoring and fixes (Rodrigo)
- Prevent IPS from interfering with CRC capture (Ville, Marta)
- Enable Mesa to advertise ARB_timer_query (Nanley)
- Refactor GT number into intel_device_info (Lionel)
- Avoid eDP DP AUX CH timeouts harder (Manasi)
- CDCLK check improvements (Ville)
- Restore GPU clock boost on missed pageflip vblanks (Chris)
- Fence register reservation API for vGPU (Changbin)
- First batch of CCS fixes (Ville)
- Finally, numerous GEM fixes, cleanups and improvements (Chris)
* tag 'drm-intel-next-2017-09-07' of git://anongit.freedesktop.org/git/drm-intel: (100 commits)
drm/i915: Update DRIVER_DATE to 20170907
drm/i915/cnl: WaThrottleEUPerfToAvoidTDBackPressure:cnl(pre-prod)
drm/i915: Lift has-pinned-pages assert to caller of ____i915_gem_object_get_pages
drm/i915: Display WA #1133 WaFbcSkipSegments:cnl, glk
drm/i915/cnl: Allow the reg_read ioctl to read the RCS TIMESTAMP register
drm/i915: Move device_info.has_snoop into the static tables
drm/i915: Disable MI_STORE_DATA_IMM for i915g/i915gm
drm/i915: Re-enable GTT following a device reset
drm/i915/cnp: Wa 1181: Fix Backlight issue
drm/i915: Annotate user relocs with __user
drm/i915: Constify load detect mode
drm/i915/perf: Remove __user from u64 in drm_i915_perf_oa_config
drm/i915: Silence sparse by using gfp_t
drm/i915: io unmap functions want __iomem
drm/i915: Add __rcu to radix tree slot pointer
drm/i915: Wake up the device for the fbdev setup
drm/i915: Add interface to reserve fence registers for vGPU
drm/i915: Use correct path to trace include
drm/i915: Fix the missing PPAT cache attributes on CNL
drm/i915: Fix enum pipe vs. enum transcoder for the PCH transcoder
...
UAPI Changes:
Cross-subsystem Changes:
Core Changes:
- DP SDP defines (Ville)
- polish for scdc helpers (Thierry Reding)
- fix lifetimes for connector/plane state across crtc changes (Maarten
Lankhorst).
- sparse fixes (Ville+Thierry)
- make legacy kms ioctls all interruptible (Maarten)
- push edid override into the edid helpers (out of probe helpers)
(Jani)
- DP ESI defines for link status (DK)
Driver Changes:
- drm-panel is now in drm-misc!
- minor panel-simple cleanups/refactoring by various folks
- drm_bridge_add cleanup (Inki Dae)
- constify a few i2c_device_id structs (Arvind Yadav)
- More patches from Noralf's fb/gem helper cleanup
- bridge/synopsis: reset fix (Philippe Cornu)
- fix tracepoint include handling in drivers (Thierry)
- rockchip: lvds support (Sandy Huang)
- move sun4i into drm-misc fold (Maxime Ripard)
- sun4i: refactor driver load + support TCON backend/layer muxing
(Chen-Yu Tsai)
- pl111: support more pl11x variants (Linus Walleij)
- bridge/adv7511: robustify probing/edid handling (Lars-Petersen
Clausen)
New hw support:
- S6E63J0X03 panel (Hoegeun Kwon)
- OTM8009A panel (Philippe CORNU)
- Seiko 43WVF1G panel (Marco Franchi)
- tve200 driver (Linus Walleij)
Plus assorted of tiny patches all over, including our first outreachy
patches from applicants for the winter round!
* tag 'drm-misc-next-2017-09-20' of git://anongit.freedesktop.org/git/drm-misc: (101 commits)
drm: add backwards compatibility support for drm_kms_helper.edid_firmware
drm: handle override and firmware EDID at drm_do_get_edid() level
drm/dp: DPCD register defines for link status within ESI field
drm/rockchip: Replace dev_* with DRM_DEV_*
drm/tinydrm: Drop driver registered message
drm/gem-fb-helper: Use debug message on gem lookup failure
drm/imx: Use drm_gem_fb_create() and drm_gem_fb_prepare_fb()
drm/bridge: adv7511: Constify HDMI CODEC platform data
drm/bridge: adv7511: Enable connector polling when no interrupt is specified
drm/bridge: adv7511: Remove private copy of the EDID
drm/bridge: adv7511: Properly update EDID when no EDID was found
drm/crtc: Convert setcrtc ioctl locking to interruptible.
drm/atomic: Convert pageflip ioctl locking to interruptible.
drm/legacy: Convert setplane ioctl locking to interruptible.
drm/legacy: Convert cursor ioctl locking to interruptible.
drm/atomic: Convert atomic ioctl locking to interruptible.
drm/atomic: Prepare drm_modeset_lock infrastructure for interruptible waiting, v2.
drm/tve200: Clean up panel bridging
drm/doc: Update todo.rst
drm/dp/mst: Sideband message transaction to power up/down nodes
...