We should never BUG_ON on any corruption in CTB descriptor as
data there can be also modified by the GuC. Instead we can
use flag "is_in_error" to indicate that we will not process
any further messages over this CTB (until reset). While here
move descriptor error reporting to the function that actually
touches that descriptor.
Note that unexpected content of the specific CT messages, that
still complies with generic CT message format, shall not trigger
disabling whole CTB, as that might just indicate new unsupported
message types.
v2: drop redundant message (Daniele)
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200117082039.65644-2-michal.wajdeczko@intel.com
Turns out we actually already have some companies, such as Lenovo,
shipping machines with AMOLED screens that don't allow controlling the
backlight through the usual PWM interface and only allow controlling it
through the standard EDP DPCD interface. One example of one of these
laptops is the X1 Extreme 2nd Generation.
Since we've got systems that need this turned on by default now to have
backlight controls working out of the box, let's start auto-detecting it
for systems by default based on what the VBT tells us. We do this by
changing the default value for the enable_dpcd_backlight module param
from 0 to -1.
Tested-by: AceLan Kao <acelan.kao@canonical.com>
Tested-by: Perry Yuan <pyuan@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116211623.53799-6-lyude@redhat.com
For eDP panels, it appears it's expected that so long as the panel is in
DPCD control mode that the brightness value is never set to 0. Instead,
if the desired effect is to set the panel's backlight to 0 we're
expected to simply turn off the backlight through the
DP_EDP_DISPLAY_CONTROL_REGISTER.
We already do the latter correctly in intel_dp_aux_disable_backlight().
But, we make the mistake of writing the DPCD registers in the wrong
order when enabling the backlight in intel_dp_aux_enable_backlight()
since we currently enable the backlight through
DP_EDP_DISPLAY_CONTROL_REGISTER before writing the brightness level. On
the X1 Extreme 2nd Generation, this appears to have the potential of
confusing the panel in such a way that further attempts to set the
brightness don't actually change the backlight as expected and leave it
off. Presumably, this happens because the incorrect register writing
order briefly leaves the panel with DPCD mode enabled and a 0 brightness
level set.
So, reverse the order we write the DPCD registers when enabling the
panel backlight so that we write the brightness value first, and enable
the backlight second. This fix appears to be the final bit needed to get
the backlight on the ThinkPad X1 Extreme 2nd Generation's AMOLED screen
working.
Tested-by: AceLan Kao <acelan.kao@canonical.com>
Tested-by: Perry Yuan <pyuan@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116211623.53799-4-lyude@redhat.com
Currently we always determine the initial panel brightness level by
simply reading the value from DP_EDP_BACKLIGHT_BRIGHTNESS_MSB/LSB. This
seems wrong though, because if the panel is not currently in DPCD
control mode there's not really any reason why there would be any
brightness value programmed in the first place.
This appears to be the case on the Lenovo ThinkPad X1 Extreme 2nd
Generation, where the default value in these registers is always 0 on
boot despite the fact the panel runs at max brightness by default.
Getting the initial brightness value correct here is important as well,
since the panel on this laptop doesn't behave well if it's ever put into
DPCD control mode while the brightness level is programmed to 0.
So, let's fix this by checking what the current backlight control mode
is before reading the brightness level. If it's in DPCD control mode, we
return the programmed brightness level. Otherwise we assume 100%
brightness and return the highest possible brightness level. This also
prevents us from accidentally programming a brightness level of 0.
This is one of the many fixes that gets backlight controls working on
the ThinkPad X1 Extreme 2nd Generation with optional 4K AMOLED screen.
Changes since v1:
* s/DP_EDP_DISPLAY_CONTROL_REGISTER/DP_EDP_BACKLIGHT_MODE_SET_REGISTER/
- Jani
Tested-by: AceLan Kao <acelan.kao@canonical.com>
Tested-by: Perry Yuan <pyuan@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116211623.53799-3-lyude@redhat.com
Max backlight value for the panel was being calculated using byte
count i.e. 0xffff if 2 bytes are supported for backlight brightness
and 0xff if 1 byte is supported. However, EDP_PWMGEN_BIT_COUNT
determines the number of active control bits used for the brightness
setting. Thus, even if the panel uses 2 byte setting, it might not use
all the control bits. Thus, max backlight should be set based on the
value of EDP_PWMGEN_BIT_COUNT instead of assuming 65535 or 255.
Additionally, EDP_PWMGEN_BIT_COUNT was being updated based on the VBT
frequency which results in a different max backlight value. Thus,
setting of EDP_PWMGEN_BIT_COUNT is moved to setup phase instead of
enable so that max backlight can be calculated correctly. Only the
frequency divider is set during the enable phase using the value of
EDP_PWMGEN_BIT_COUNT.
This is based off the original patch series from Furquan Shaikh
<furquan@google.com>:
https://patchwork.freedesktop.org/patch/317255/?series=62326&rev=3
Changes since original patch:
* Remove unused intel_dp variable in intel_dp_aux_setup_backlight()
* Fix checkpatch issues
* Make sure that we rewrite the pwmgen bit count whenever we bring the
panel out of D3 mode
v2 by Jani:
* rebase
* fix readb return value check
Cc: Furquan Shaikh <furquan@google.com>
Tested-by: AceLan Kao <acelan.kao@canonical.com>
Tested-by: Perry Yuan <pyuan@redhat.com>
Signed-off-by: Lyude Paul <lyude@redhat.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116211623.53799-2-lyude@redhat.com
Currently, we skip error capture upon forced preemption. We apply forced
preemption when there is a higher priority request that should be
running but is being blocked, and we skip inline error capture so that
the preemption request is not further delayed by a user controlled
capture -- extending the denial of service.
However, preemption reset is also used for heartbeats and regular GPU
hangs. By skipping the error capture, we remove the ability to debug GPU
hangs.
In order to capture the error without delaying the preemption request
further, we can do an out-of-line capture by removing the guilty request
from the execution queue and scheduling a worker to dump that request.
When removing a request, we need to remove the entire context and all
descendants from the execution queue, so that they do not jump past.
Closes: https://gitlab.freedesktop.org/drm/intel/issues/738
Fixes: 3a7a92aba8 ("drm/i915/execlists: Force preemption")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-3-chris@chris-wilson.co.uk
In order to support out-of-line error capture, we need to remove the
active request from HW and put it to one side while a worker compresses
and stores all the details associated with that request. (As that
compression may take an arbitrary user-controlled amount of time, we
want to let the engine continue running on other workloads while the
hanging request is dumped.) Not only do we need to remove the active
request, but we also have to remove its context and all requests that
were dependent on it (both in flight, queued and future submission).
Finally once the capture is complete, we need to be able to resubmit the
request and its dependents and allow them to execute.
v2: Replace stack recursion with a simple list.
v3: Check all the parents, not just the first, when searching for a
stuck ancestor!
References: https://gitlab.freedesktop.org/drm/intel/issues/738
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-2-chris@chris-wilson.co.uk
Add a debugfs subdirectory i915_params with all the i915 module
parameters. This is a first step, with lots of boilerplate, and not much
benefit yet.
This will result in a new device specific debugfs directory at
/sys/kernel/debug/dri/<N>/i915_params duplicating the module specific
sysfs directory at /sys/module/i915/parameters/. Going forward, all
users of the parameters should use the debugfs, with the module
parameters being phased out.
Add debugfs permissions to I915_PARAMS_FOR_EACH(). This duplicates the
mode with module parameter sysfs, but the goal is to make the module
parameters read-only initial values for device specific parameters.
0 mode will bypass debugfs creation. Use it for verbose_state_checks
which will need special attention in follow-up work.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/600101c8433e7caf9303663fc85a9972fa1f05e7.1575560168.git.jani.nikula@intel.com
intel_prepare_plane_fb() will always pin plane_state->hw.fb whenever
it is present. We copy that from the master plane to the slave plane,
but we fail to copy the corresponding ggtt view. Thus when it comes time
to pin the slave plane's fb we use some stale ggtt view left over from
the last time the plane was used as a non-slave plane. If that previous
use involved 90/270 degree rotation or remapping we'll try to shuffle
the pages of the new fb around accordingingly. However the new
fb may be backed by a bo with less pages than what the ggtt view
rotation/remapped info requires, and so we we trip a GEM_BUG().
Steps to reproduce on icl:
1. plane 1: whatever
plane 6: largish !NV12 fb + 90 degree rotation
2. plane 1: smallish NV12 fb
plane 6: make invisible so it gets slaved to plane 1
3. GEM_BUG()
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/issues/951
Fixes: 1f594b209f ("drm/i915: Remove special case slave handling during hw programming, v3.")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200110183228.8199-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
If 'CONFIG_DRM_I915_CAPTURE_ERROR' not configured, there is an error
when compile the kernel:
drivers/gpu/drm/i915/gt/intel_reset.c:
In function intel_gt_handle_error:
drivers/gpu/drm/i915/gt/intel_reset.c:1233:3:
error: too few arguments to function i915_capture_error_state
i915_capture_error_state(gt->i915);
^~~~~~~~~~~~~~~~~~~~~~~~
In file included
from ./drivers/gpu/drm/i915/i915_drv.h:97:0,
from ./drivers/gpu/drm/i915/display/intel_display_types.h:46,
from drivers/gpu/drm/i915/gt/intel_reset.c:10:
./drivers/gpu/drm/i915/i915_gpu_error.h:267:20: note: declared here
static inline void i915_capture_error_state(struct drm_i915_private *dev_priv,
Fixes: 742379c0c4 ("drm/i915: Start chopping up the GPU error capture")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200113081942.15982-1-zhangxiaoxu5@huawei.com
If 'CONFIG_DRM_I915_CAPTURE_ERROR' not configured, there are some
errors like:
drivers/gpu/drm/i915/i915_irq.o:
In function `i915_vma_capture_finish':
./drivers/gpu/drm/i915/i915_gpu_error.h:312:
multiple definition of `i915_vma_capture_finish'
drivers/gpu/drm/i915/i915_drv.o:
./drivers/gpu/drm/i915/i915_gpu_error.h:312: first defined here
So, add 'static inline' on the defineation of the 'i915_vma_capture_finish'
Fixes: d713e3ab93fdc("drm/i915: Correct typo in i915_vma_compress_finish stub")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zhang Xiaoxu <zhangxiaoxu5@huawei.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200113104009.13274-1-zhangxiaoxu5@huawei.com
Life is usually easier when we pass around intel_ types instead
of drm_ types. In this case it might not be, but I think being
consistent is a good thing anyway. Also some of this might get
cleaned up a bit more later as we keep propagating the intel_
types further.
@find@
identifier F =~ "^intel_attached_.*";
identifier C;
@@
F(struct drm_connector *C)
{
...
}
@@
identifier find.F;
identifier find.C;
@@
F(
- struct drm_connector *C
+ struct intel_connector *connector
)
{
<...
- C
+ &connector->base
...>
}
@@
identifier find.F;
expression C;
@@
- F(C)
+ F(to_intel_connector(C))
@@
expression C;
@@
- to_intel_connector(&C->base)
+ C
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191204180549.1267-3-ville.syrjala@linux.intel.com
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
There seems to be some undocumented bandwidth
bottleneck/dependency which scales with CDCLK,
causing FIFO underruns when CDCLK is too low,
even when it's correct from BSpec point of view.
Currently for TGL platforms we calculate
min_cdclk initially based on pixel_rate divided
by 2, accounting for also plane requirements,
however in some cases the lowest possible CDCLK
doesn't work and causing the underruns.
We've found experimentally that raising cdclk to
at least pixel_rate (rather than pixel_rate/2)
eliminates these underruns, so let's use this as a
temporary workaround until the hardware team
can suggest a more precise remedy.
Explicitly stating here that this seems to be currently
rather a Hack, than final solution.
v2: Use clamp operation instead of min(Matt Roper)
v3: - Fixed commit message(Matt Roper)
- Now using pixel_rate instead of max_cdclk(Jani Nikula)
- Switched to max from clamp(Ville Syrjälä)
Hopefully this hybrid satisfies everyone :)
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/issues/402
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200109220547.23817-1-stanislav.lisovskiy@intel.com