Since we're inhibiting context save of preempt context, we're no longer
tracking the position of HEAD/TAIL. With GuC, we're adding a new
breadcrumb for each preemption, which means that the HW will do more and
more breadcrumb writes. Eventually the ring is filled, and we're
submitting the preemption context with HEAD==TAIL==0, which won't result
in breadcrumb write, but will trigger hangcheck instead.
Instead of writing a new preempt breadcrumb for each preemption, let's
just fill the ring once at init time (which also saves a couple of
instructions in the tasklet).
v2: Assert that context save restore is inhibited, don't assert on ring
alignment. (Chris)
v3: Cleanup checkpatch.
Fixes: 517aaffe0c ("drm/i915/execlists: Inhibit context save/restore for the fake preempt context")
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180226163800.21745-1-michal.winiarski@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Sometimes we need to boost the priority of an in-flight request, which
may lead to the situation where the second submission port then contains
a higher priority context than the first and so we need to inject a
preemption event. To do so we must always check inside
execlists_dequeue() whether there is a priority inversion between the
ports themselves as well as the head of the priority sorted queue, and we
cannot just skip dequeuing if the queue is empty.
As Michał noted, this doesn't simply extend to handling more than 2-port
submission, as we may need to reorder within the array of executing
requests which themselves are lower priority than the first. A task for
later!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180222142229.14517-1-chris@chris-wilson.co.uk
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Currently the FBC code doesn't handle the 90/270 degree rotated case
correctly. We would need the GTT tracking to monitor the fence on the
normal GTT view (the rotated view doesn't even have a fence). Not quite
sure how we should program the fence Y offset etc. in that case. For now
we'll end up disabling FBC with 90/270 degree rotation. Add a FIXME
to remind people about this fact.
v2: Reword the text (Chris)
Move the FIXME to the fbc code
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221160235.11134-7-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Currently we pin a fence on every plane doing tiled scanout. The
number of planes we have available is fast apporaching the number
of fences so we really should stop wasting them. Only FBC needs
the fence on gen4+, so let's use fences only for the primary planes
on those platforms.
v2: drop the tiling check from plane_uses_fence() as the obj is
NULL during initial_plane_config() and we don't rally need the
check since i915_vma_pin_fence() does the check anyway
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221184807.577-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Some panels support limited range output (16-235) compared
to full range RGB values (0-255). Also userspace can control
the RGB range using "Broadcast RGB" property. Currently the
code to handle full range to limited range is broken. This
patch fixes the same by properly scaling down all the full
range co-efficients with limited range scaling factor.
v2: Fixed Ville's review comments.
v3: Changed input to const and used correct data types as
suggested by Ville
v4: Fixed some missing data type corrections.
Signed-off-by: Johnson Lin <johnson.lin@intel.com>
Signed-off-by: Uma Shankar <uma.shankar@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1517327489-26128-1-git-send-email-uma.shankar@intel.com
It turns out that HSW has a register that tells us how many EUs are
disabled per half-slice (roughly a similar notion to subslice). We
didn't read those registers so far as most userspace drivers didn't
need those values prior to Gen8, but an internal library would like to
have access to this.
Since we already have the getparam interface, there is no harm in
exposing this.
v2: Rename bits value (Joonas)
v3: s/GEM_BUG_ON/MISSING_CASE/ (Joonas)
v4: s/GEM_BUG_ON/MISSING_CASE/ again... (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180221204902.23084-1-lionel.g.landwerlin@intel.com
i965 and g4x still have the pipe select bits in the plane control
registers, they're just hardcoded to select a specific pipe. However
plane C on i965 can still move between the pipes, thus we should
program the pipe select bits on i965 if we want to expose plane C
some day.
Since there is no harm in programming the bits on any plane on
i965/g4x let's just always set them. This will also make our
pre-computed register value match what the hardware register
would read, should we want to cross check the two.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180130203807.13721-2-ville.syrjala@linux.intel.com
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
G4x cursor control registers still allow us to write to the pipe select
bits even though cursors are supposed to be fixed to a specific pipe.
Bspec tells us that we should only ever write 0 to these bits. Let's
follow that recommendation. On ilk+ the bits become hardwired to 0.
Also looks like ICL repurposes these bits for some other use, so
we had better stop setting them to bogus values there.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180130203807.13721-1-ville.syrjala@linux.intel.com
Reviewed-by: Mika Kahola <mika.kahola@intel.com>
Add some compile time assrts to the frontbuffer tracking to make sure
that we have enough bits per pipe to cover all the planes, and that we
have enough total bits to cover all the planes across all pipes.
We'll ignore any potential clash between the overlay bit and the
plane bits because that will allow us to keep using a total of 32
bits for the foreseeable future.
While at it change the macros to use BIT() and GENMASK(). The latter
gets rid of the hardcoded 0xff and thus means we can change the
number of bits per pipe by just changing
INTEL_FRONTBUFFER_BITS_PER_PIPE.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180124183642.32549-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
During igt, we frequently call into the driver to reset both HW and
driver state (idling the device, waiting for it to become idle and
freeing off old objects) to ensure that we start each test/subtest/pass
from known state. This process incurs an RCU barrier or two to ensure
that any such pending frees are indeed flushed before we return.
However, unconditionally waiting on the RCU barrier adds needless delay
to many callers, which adds up to several seconds when repeated thousands
of times. We can skip the rcu_barrier() if by tracking how many outstanding
frees we have, we know there are none.
The same path is used along suspend, where we may be able to save the
unconditional RCU barrier.
To put it into perspective with a completely meaningless
microbenchmark, igt/gem_sync/idle is improved from 50ms to 30us on bdw.
v2: Remove the extra synchronize_rcu() inside i915_drop_caches_set()
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180219220631.25001-1-chris@chris-wilson.co.uk
The compiler is not automatically caching the i915->regs address inside
a register and emitting a load for every mmio access. For simple
functions like gen8_gt_irq_handler that are already using the raw
accessors, we can open-code them for substantial savings:
add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-83 (-83)
Function old new delta
gen8_gt_irq_handler 290 266 -24
gen8_gt_irq_ack 181 122 -59
Total: Before=954637, After=954554, chg -0.01%
v2: Add raw_reg_read/raw_reg_write.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180219100926.16554-1-chris@chris-wilson.co.uk
The in-lined comments for channel.port doesn't follow the syntax
described at kernel-doc document, causing the following warning:
$ ./scripts/kernel-doc -none drivers/gpu/drm/i915/intel_dpio_phy.c
drivers/gpu/drm/i915/intel_dpio_phy.c:154: warning: Function parameter or member 'channel.port' not described in 'bxt_ddi_phy_info'
While the best would be for the Kernel to deduce that from the
context, supporting it is not trivial. So, let's just stick with
the existing syntax.
[Jani: depends on "scripts: kernel-doc: support in-line comments on
nested structs/unions" to actually fix the warning.]
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/9ba9ac773f4f9e60770bd9169b0e46ac974d858a.1518788761.git.mchehab@s-opensource.com
Hitting the failure path through check_digital_port_conflicts triggers:
================================================
WARNING: lock held when returning to user space!
4.16.0-rc1-CI-kasan_1+ #1 Tainted: G W
------------------------------------------------
kms_3d/1439 is leaving the kernel with locks still held!
1 lock held by kms_3d/1439:
#0: (drm_connector_list_iter){.+.+}, at: [<000000003745d183>] intel_atomic_check+0x1d9d/0x3ff0 [i915]
Rearrange the code to have a single exit path through the unlock.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180215091425.42364-1-maarten.lankhorst@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
i915 is the only driver using those fields in the drm_gem_object
structure, so they only waste memory for all other drivers.
Move the fields into drm_i915_gem_object instead and patch the i915 code
with the following sed commands:
sed -i "s/obj->base.read_domains/obj->read_domains/g" drivers/gpu/drm/i915/*.c drivers/gpu/drm/i915/*/*.c
sed -i "s/obj->base.write_domain/obj->write_domain/g" drivers/gpu/drm/i915/*.c drivers/gpu/drm/i915/*/*.c
Change is only compile tested.
v2: move fields around as suggested by Chris.
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20180216124338.9087-1-christian.koenig@amd.com
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>