Writes to CURSURFLIVE in TGL are causing IOMMU errors and visual
glitches that are often reproduced when executing CPU intensive
workloads while a eDP 4K panel is attached.
Manually exiting PSR causes the frontbuffer to be updated without
glitches and the IOMMU errors are also gone but this comes at the cost
of less time with PSR active.
So using this workaround until this issue is root caused and a better
fix is found.
The current code is already ready to enable PSR after this exit if
there is not other frontbuffer modifications.
Adding a new if block in psr_force_hw_tracking_exit() instead of reuse
the else/gen8- block because the plan is to revert this workaround
as soon as a better solution is found.
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Tested-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201002231627.24528-1-jose.souza@intel.com
As we disable the interrupt during suspend, also reset the irq_mask to
short-circuit subsystems that later try to turn off their interrupt
source.
<4>[ 101.816730] i915 0000:00:02.0: drm_WARN_ON(!intel_irqs_enabled(dev_priv))
<4>[ 101.816853] WARNING: CPU: 3 PID: 4241 at drivers/gpu/drm/i915/i915_irq.c:343 ilk_update_display_irq+0xb3/0x130 [i915]
v2: Reset irq_mask for i8xx_irq_reset as well, and split patch to focus
on only i915->irq_mask
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201022114246.28566-1-chris@chris-wilson.co.uk
The block comment for cnl_program_nearest_filter_coefs() has a wonderful
diagram, but although it is marked up as kerneldoc does not use the
markup for providing the function definition.
drivers/gpu/drm/i915/display/intel_display.c:6341: warning: Function parameter or member 'dev_priv' not described in 'cnl_program_nearest_filter_coefs'
drivers/gpu/drm/i915/display/intel_display.c:6341: warning: Function parameter or member 'pipe' not described in 'cnl_program_nearest_filter_coefs'
drivers/gpu/drm/i915/display/intel_display.c:6341: warning: Function parameter or member 'id' not described in 'cnl_program_nearest_filter_coefs'
drivers/gpu/drm/i915/display/intel_display.c:6341: warning: Function parameter or member 'set' not described in 'cnl_program_nearest_filter_coefs'
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021185649.17759-1-chris@chris-wilson.co.uk
There is no GEN6_RPSTAT1 on ILK. Instead of reading that let's
try to get the same information from MEMSTAT_ILK. At least it
seems to track MEMSWCTL frequency request perfectly on my ILK.
It needs the same invert trick as the request value.
We don't want to put the invert thing into intel_gpu_freq()
and intel_freq_opcode() because that would incorrectly invert
the min/max/etc frequencies also.
One day someone might want to reverse engineer the formula for
converting these numbers to Hz, but for now we'll just report
them raw.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021131443.25616-2-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Restore RPS for ILK-M. We lost it when an extra HAS_RPS()
check appeared in intel_rps_enable().
Unfortunaltey this just makes the performance worse on my
ILK because intel_ips insists on limiting the GPU freq to
the minimum. If we don't do the RPS init then intel_ips will
not limit the frequency for whatever reason. Either it can't
get at some required information and thus makes wrong decisions,
or we mess up some weights/etc. and cause it to make the wrong
decisions when RPS init has been done, or the entire thing is
just wrong. Would require a bunch of reverse engineering to
figure out what's going on.
Cc: stable@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Fixes: 9c878557b1 ("drm/i915/gt: Use the RPM config register to determine clk frequencies")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201021131443.25616-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
GEN >= 10 hardware supports the programmable scaler filter.
Attach scaling filter property for CRTC and plane for GEN >= 10
hardwares and program scaler filter based on the selected filter
type.
changes since v3:
* None
changes since v2:
* Use updated functions
* Add ps_ctrl var to contain the full PS_CTRL register value (Ville)
* Duplicate the scaling filter in crtc and plane hw state (Ville)
changes since v1:
* None
Changes since RFC:
* Enable properties for GEN >= 10 platforms (Ville)
* Do not round off the crtc co-ordinate (Danial Stone, Ville)
* Add new functions to handle scaling filter setup (Ville)
* Remove coefficient set 0 hardcoding.
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020161427.6941-5-pankaj.laxminarayan.bharadiya@intel.com
Integer scaling (IS) is a nearest-neighbor upscaling technique that
simply scales up the existing pixels by an integer
(i.e., whole number) multiplier.Nearest-neighbor (NN) interpolation
works by filling in the missing color values in the upscaled image
with that of the coordinate-mapped nearest source pixel value.
Both IS and NN preserve the clarity of the original image. Integer
scaling is particularly useful for pixel art games that rely on
sharp, blocky images to deliver their distinctive look.
Introduce functions to configure the scaler filter coefficients to
enable nearest-neighbor filtering.
Bspec: 49247
changes since v6:
* Trust compiler, remove pointless inline keyword from cnl_coef_tap()
& cnl_nearest_filter_coef() functions (Ville)
changes since v4:
* Make cnl_coef_tap(), cnl_nearest_filter_coef() inline (Uma)
changes since v3:
* None
changes since v2:
* Move APIs from 5/5 into this patch.
* Change filter programming related function names to cnl_*, move
filter select bits related code into inline function (Ville)
changes since v1:
* Rearrange skl_scaler_setup_nearest_neighbor_filter() to iterate the
registers directly instead of the phases and taps (Ville)
changes since RFC:
* Refine the skl_scaler_setup_nearest_neighbor_filter() logic (Ville)
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020161427.6941-4-pankaj.laxminarayan.bharadiya@intel.com
Introduce per-plane and per-CRTC scaling filter properties to allow
userspace to select the driver's default scaling filter or
Nearest-neighbor(NN) filter for upscaling operations on CRTC and
plane.
Drivers can set up this property for a plane by calling
drm_plane_create_scaling_filter() and for a CRTC by calling
drm_crtc_create_scaling_filter().
NN filter works by filling in the missing color values in the upscaled
image with that of the coordinate-mapped nearest source pixel value.
NN filter for integer multiple scaling can be particularly useful for
for pixel art games that rely on sharp, blocky images to deliver their
distinctive look.
changes since: v6:
* Move property doc to existing "Standard CRTC Properties" and
"Plane Composition Properties" doc comments (Simon)
changes since v3:
* Refactor code, add new function for common code (Ville)
changes since v2:
* Create per-plane and per-CRTC scaling filter property (Ville)
changes since v1:
* None
changes since RFC:
* Add separate properties for plane and CRTC (Ville)
Link: https://github.com/xbmc/xbmc/pull/18194
Link: https://github.com/xbmc/xbmc/pull/18567
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Acked-by: Simon Ser <contact@emersion.fr>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020161427.6941-2-pankaj.laxminarayan.bharadiya@intel.com
Currently we call .hpd_irq_setup() directly just before display
resume, and follow it with another call via intel_hpd_init()
just afterwards. Assuming the hpd pins are marked as enabled
during the open-coded call these two things do exactly the
same thing (ie. enable HPD interrupts). Which even makes sense
since we definitely need working HPD interrupts for MST sideband
during the display resume.
So let's nuke the open-coded call and move the intel_hpd_init()
call earlier. However we need to leave the poll_init_work stuff
behind after the display resume as that will trigger display
detection while we're resuming. We don't want that trampling over
the display resume process. To make this a bit more symmetric
we turn this into a intel_hpd_poll_{enable,disable}() pair.
So we end up with the following transformation:
intel_hpd_poll_init() -> intel_hpd_poll_enable()
lone intel_hpd_init() -> intel_hpd_init()+intel_hpd_poll_disable()
.hpd_irq_setup()+resume+intel_hpd_init() -> intel_hpd_init()+resume+intel_hpd_poll_disable()
If we really would like to prevent all *long* HPD processing during
display resume we'd need some kind of software mechanism to simply
ignore all long HPDs. Currently we appear to have that just for
fbdev via ifbdev->hpd_suspended. Since we aren't exploding left and
right all the time I guess that's mostly sufficient.
For a bit of history on this, we first got a mechanism to block
hotplug processing during suspend in commit 15239099d7 ("drm/i915:
enable irqs earlier when resuming") on account of moving the irq enable
earlier. This then got removed in commit 50c3dc970a ("drm/fb-helper:
Fix hpd vs. initial config races") because the fdev initial config
got pushed to a later point. The second ad-hoc hpd_irq_setup() for
resume was added in commit 0e32b39cee ("drm/i915: add DP 1.2 MST
support (v0.7)") to be able to do MST sideband during the resume.
And finally we got a partial resurrection of the hpd blocking
mechanism in commit e8a8fedd57 ("drm/i915: Block fbdev HPD
processing during suspend"), but this time it only prevent fbdev
from handling hpd while resuming.
v2: Leave the poll_init_work behind
v3: Remove the extra intel_hpd_poll_disable() from display reset (Lyude)
Add the missing intel_hpd_poll_disable() to display init (Imre)
Cc: Lyude Paul <lyude@redhat.com>
Cc: Imre Deak <imre.deak@intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201013181137.30560-1-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Rename intel_dp_sink_dpms() to intel_dp_set_power()
so one doesn't always have to convert from the DPMS
enum values to the actual DP D-states.
Also when dealing with a branch device this has nothing to
do with any sink, so the old name was nonsense anyway.
Also adjust the debug message accordingly, and pimp it
with the standard encoder id+name thing.
Trivial bits done with cocci:
@@
expression DP;
@@
(
- intel_dp_sink_dpms(DP, DRM_MODE_DPMS_OFF)
+ intel_dp_set_power(DP, DP_SET_POWER_D3)
|
- intel_dp_sink_dpms(DP, DRM_MODE_DPMS_ON)
+ intel_dp_set_power(DP, DP_SET_POWER_D0)
)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201016194800.25581-2-ville.syrjala@linux.intel.com
Reviewed-by: Imre Deak <imre.deak@intel.com>
The "mmio" writes into vgpu registers are simple memory traps from the
guest into the host. We do not need to assert in the guest that the
device is awake for the io as we do not write to the device itself.
However, over time we have refactored all the mmio accessors with the
result that the vgpu reuses the gen2 accessors and so inherits the
assert for runtime-pm of the native device. The assert though has
actually been there since commit 3be0bf5acc ("drm/i915: Create vGPU
specific MMIO operations to reduce traps").
References: 3be0bf5acc ("drm/i915: Create vGPU specific MMIO operations to reduce traps")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Yan Zhao <yan.y.zhao@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20200811092532.13753-1-chris@chris-wilson.co.uk
Underruns happens when plane height + y offset is not a modulo of 4
when FBC is enabled. It happens when scanline is at vactive - 10 but
that is not feasible to do from the software side so here completely
disabling FBC when height + y offset matches to avoid visual glitches.
Specification says that it only affects TGL display C stepping and
newer but to simply the check and as TGL is already in final costumers
hands, pre-production display stepping A and B was also included.
BSpec: 52887 ICL
BSpec: 52888 EHL/JSL
BSpec: 52890/55378 TGL
BSpec: 53508 DG1
BSpec: 53273 RKL
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201019175609.28715-1-jose.souza@intel.com
intel_dp_ycbcr420_config() is rather pointless. Just inline it
directly into intel_dp_compute_config(). This gets rid of the
ugly double assignment of output_format.
Not really sure what the best policy would be when the user
supplies a mode classified by the display as "YCbCr 4:2:0
only", but we know that we can't do YCbCr 4:2:0 output. For
now keep the current behaviour of just silently upgrade
it to RGB 4:4:4.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924184156.24491-3-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Remove the lspcon special case from intel_dp_compute_config() and
just treat it like any other DFP than can do 4:4:4->4:2:0 conversion.
The only difference between the two codepaths was that the lspcon
code tried to already halve port_clock. That was just total nonsense
as we hadn't even computed the base port_clock at that time.
All that stuff happens intel_dp_compute_link_config*() and it
already takes care of the 4:2:0 clock reduction.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200924184156.24491-2-ville.syrjala@linux.intel.com
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
WAC6entrylatency is trying to fix excessive rc6 entry latency caused
by the extra delay from FBC_LLC_READ_CTRL, which is there for some
extra sync with uncore for frame buffer caching in LLC.
Reading through the hsd the recommendation was to set the FBC_LLC_FULLY_OPEN
bit to disable this extra delay entirely. This can be done whenever fb LLC
caching is not used. The alternative suggestion was to reduce the delay to
eg. 0x5 via updated BIOS programming instructions. But all the kbl/cfl
machines I've seen still have the default 0xff programmed. As we never use
fb LLC caching let's just apply the w/a to all skl derivatives to get
consistent rc6 latencies.
I was able to measure the effect of FBC_LLC_READ_CTRL to rc6 latency
via forcewake. Here's a graph of some of the results:
sleep;fw_req=1;wait fw_ack==1;sleep;fw_req=0;wait fw_ack==0
fw_ack==1 duration
160us +----------------------------------------------------------------+
| + + $$+ + + |
| $$ $ $ ******$$ ** $ $**$* #########$$######|
140us |-$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$$*$$$$$$$$$$$$$$$$ $$$$$$|
| $ * # |
| $ * # |
120us |$+ * # +-|
|$ * # |
|$ * # # |
100us |$+ ************######################## +-|
|$ * *# |
|$ ***** ######### |
80us |$+ * # #### ## +-|
|$ **** ### # # |
| ** #### FBC_LLC_READ_CTRL: 0x8000 ******* |
60us |-###### FBC_LLC_READ_CTRL: 0xffff #######-|
|## + + FBC_LLC_READ_CTRL: 0x400000ff $$$$$$$ |
+----------------------------------------------------------------+
0ms 10ms 20ms 30ms 40ms 50ms 60ms
sleep duration
The default FBC_LLC_READ_CTRL value of 0xff is documented to give us
a 170usec delay. That tracks well with the knees at 0xffff->~44msec and
0x8000->~22msec we see in the graph.
We can see that if we sleep longer than the FBC_LLC_READ_CTRL delay
we always observe the full (~145usec) rc6 wakeup latency. But if we sleep
for less than the FBC_LLC_READ_CTRL delay we see a quicker fw wakeup,
presumably due the hardware not having yet entered rc6 fully.
The other plateaus in the graph I suspect correspond to some shallower
internal rc states.
v2: s/usec/msec/ typo in commit msg
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716190426.17047-2-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>