Commit Graph

968153 Commits

Author SHA1 Message Date
Chris Wilson
0da3f2500a drm/i915/gt: Disable arbitration around Braswell's pdp updates
Braswell's pdp workaround is full of dragons, that may be being angered
when they are interrupted. Let's not take that risk and disable
arbitration during the update.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210111105735.21515-1-chris@chris-wilson.co.uk
2021-01-11 23:07:38 +00:00
Dan Carpenter
6a3daee1b3 drm/i915/selftests: Fix some error codes
These error paths return success instead of negative error codes as
intended.

Fixes: c92724de6d ("drm/i915/selftests: Try to detect rollback during batchbuffer preemption")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/X/xMdcewtft7+QFM@mwanda
2021-01-11 13:19:17 +00:00
Chris Wilson
baa7c2cd99 drm/i915: Refactor marking a request as EIO
When wedging the device, we cancel all outstanding requests and mark
them as EIO. Rather than duplicate the small function to do so between
each submission backend, export one.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210109163455.28466-3-chris@chris-wilson.co.uk
2021-01-09 18:29:07 +00:00
Chris Wilson
e3aabe31fd drm/i915/gt: Mark up a debug-only function
drivers/gpu/drm/i915//gt/intel_workarounds.c:1394:20: error: function 'is_nonpriv_flags_valid' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
static inline bool is_nonpriv_flags_valid(u32 flags)

This is only used by debug build, so mark it as maybe-unused to keep the
compiler from complaining.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210109163455.28466-2-chris@chris-wilson.co.uk
2021-01-09 18:29:06 +00:00
Chris Wilson
a42f4dd2bf drm/i915/gt: Remove unused function 'dword_in_page'
>> drivers/gpu/drm/i915/gt/intel_lrc.c:17:28: error: unused function 'dword_in_page' [-Werror,-Wunused-function]
   static inline unsigned int dword_in_page(void *addr)

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210109163455.28466-1-chris@chris-wilson.co.uk
2021-01-09 18:29:06 +00:00
Chris Wilson
9a437ccb84 drm/i915/gt: Exercise lrc_wa_ctx initialisation failure
Inject a fault into lrc_init_wa_ctx() to ensure that we can tolerate a
failure to construct the workarounds.

v2: Avoid mentioning an error for fault-injection, other CI will
complain about the dmesg spam.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210109114453.27798-1-chris@chris-wilson.co.uk
2021-01-09 16:02:57 +00:00
Chris Wilson
9b3a8f558d drm/i915/gt: Disable arbitration on no-preempt requests
If a request is submitted and known to require no preemption, disable
arbitration around the batch which prevents the HW from handling a
preemption request during the payload.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-7-chris@chris-wilson.co.uk
2021-01-08 21:35:57 +00:00
Chris Wilson
751f82b353 drm/i915/gt: Only disable preemption on gen8 render engines
The reason why we did not enable preemption on Broadwater was due to
missing GPGPU workarounds. Since this only applies to rcs0, only
restrict rcs0 (and our global capabilities).

While this does not affect exposing a preemption capability to
userspace, it does affect our internal decisions on whether to use
timeslicing and semaphores between individual engines.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-6-chris@chris-wilson.co.uk
2021-01-08 21:35:56 +00:00
Chris Wilson
b1ad5f6d68 drm/i915/gt: Only retire on the last breadcrumb if the last request
We use the completion of the last active breadcrumb to retire the
requests along a timeline. This is purely opportunistic as nothing
guarantees that any particular timeline is terminated by a breadcrumb;
except for parking the engine where we explicitly add a breadcrumb so
that we park quickly and do an explicit retire upon signaling to reduce
the latency dramatically (avoiding a retire worker roundtrip).

With scheduling, we anticipate retiring completed timelines as a matter
of course. Performing the same action from inside the breadcrumbs is
intended to provide similar functionality for legacy ringbuffer
submission.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-5-chris@chris-wilson.co.uk
2021-01-08 21:35:56 +00:00
Chris Wilson
2b2985a417 drm/i915/gt: Restore ce->signal flush before releasing virtual engine
Before we mark the virtual engine as no longer inflight, flush any
ongoing signaling that may be using the ce->signal_link along the
previous breadcrumbs. On switch to a new physical engine, that link will
be inserted into the new set of breadcrumbs, causing confusion to an
ongoing iterator.

This patch undoes a last minute mistake introduced into commit
bab0557c8d ("drm/i915/gt: Remove virtual breadcrumb before transfer"),
whereby instead of unconditionally applying the flush, it was only
applied if the request itself was going to be reused.

v2: Generalise and cancel all remaining ce->signals

Fixes: bab0557c8d ("drm/i915/gt: Remove virtual breadcrumb before transfer")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-4-chris@chris-wilson.co.uk
2021-01-08 21:35:55 +00:00
Chris Wilson
0399d0e33a drm/i915/selftests: Rearrange ktime_get to reduce latency against CS
In our tests where we measure the elapsed time on both the CPU and CS
using a udelay, our CS results match the udelay much more accurately
than the ktime (even when using ktime_get_fast_ns). With preemption
disabled, we can go one step lower than ktime and use local_clock.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2919
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-3-chris@chris-wilson.co.uk
2021-01-08 21:35:54 +00:00
Chris Wilson
c318a203ea drm/i915/selftests: Skip unstable timing measurements
If any of the perf tests run into 0 time, not only are we liable to
divide by zero, but the result would be highly questionable.
Nevertheless, let's not have a div-by-zero error.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-2-chris@chris-wilson.co.uk
2021-01-08 21:35:54 +00:00
Chris Wilson
5b4dc95cf7 drm/i915/gt: Prevent use of engine->wa_ctx after error
On error we unpin and free the wa_ctx.vma, but do not clear any of the
derived flags. During lrc_init, we look at the flags and attempt to
dereference the wa_ctx.vma if they are set. To protect the error path
where we try to limp along without the wa_ctx, make sure we clear those
flags!

Reported-by: Matt Roper <matthew.d.roper@intel.com>
Fixes: 604a8f6f1e ("drm/i915/lrc: Only enable per-context and per-bb buffers if set")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.15+
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-1-chris@chris-wilson.co.uk
2021-01-08 21:35:53 +00:00
Chris Wilson
4386b8e5ad drm/i915/gt: Remove timeslice suppression
In the next^W future patch, we remove the strict priority system and
continuously re-evaluate the relative priority of tasks. As such we need
to enable the timeslice whenever there is more than one context in the
pipeline. This simplifies the decision and removes some of the tweaks to
suppress timeslicing, allowing us to lift the timeslice enabling to a
common spot at the end of running the submission tasklet.

One consequence of the suppression is that it was reducing fairness
between virtual engines on an over saturated system; undermining the
principle for timeslicing.

v2: Commentary
v3: Commentary for the right cancel_timer()
v4: Add tracing for why we need a timeslice

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2802
Testcase: igt/gem_exec_balancer/fairslice
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210107132322.28373-1-chris@chris-wilson.co.uk
2021-01-07 21:37:15 +00:00
Chris Wilson
c185a16ece drm/i915: Wrap our timer_list.expires checking
Refactor our timer_list.expires checking into its own timer_active()
helper.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210107123541.17153-1-chris@chris-wilson.co.uk
2021-01-07 21:37:14 +00:00
Chris Wilson
88b39600da drm/i915/selftests: Improve handling of iomem around stolen
Use memset_io() on the iomem, and silence sparse as we copy from the
iomem to normal system pages.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210106123939.18435-2-chris@chris-wilson.co.uk
2021-01-06 15:37:56 +00:00
Chris Wilson
989536a4e6 drm/i915/selftests: Break out of the lrc layout test after register mismatch
AFter detecting a register mismatch between the protocontext and the
image generated by HW, immediately break out of the double loop.
Otherwise we end up with a second confusing error message.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210106123939.18435-1-chris@chris-wilson.co.uk
2021-01-06 15:37:19 +00:00
Chris Wilson
8d03344b9d drm/i915/selftests: Switch 4k kmalloc to use get_free_page for alignment
In generating the reference LRC, we want a page-aligned address for
simplicity in computing the offsets within. This then shares the
computation for the HW LRC which is mapped and so page aligned, making
the comparison straightforward. It seems that kmalloc(4k) is not always
returning from a 4k-aligned slab cache (which would give us a page aligned
address) so force alignment by explicitly allocating a page.

Reported-by: "Gote, Nitin R" <nitin.r.gote@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: "Gote, Nitin R" <nitin.r.gote@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200923114156.17749-1-chris@chris-wilson.co.uk
2021-01-05 13:06:02 +00:00
Chris Wilson
0e58de9fc9 drm/i915/gt: Check the virtual still matches upon locking
If another sibling is able to claim the virtual request, by the time we
inspect the request under the lock it may no longer match the local
engine.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2877
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210104115145.24460-4-chris@chris-wilson.co.uk
2021-01-05 09:18:05 +00:00
Chris Wilson
0a7d355ec6 drm/i915/gt: Allow failed resets without assertion
If the engine reset fails, we will attempt to resume with the current
inflight submissions. When that happens, we cannot assert that the
engine reset cleared the pending submission, so do not.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2878
Fixes: 16f2941ad3 ("drm/i915/gt: Replace direct submit with direct call to tasklet")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210104115145.24460-3-chris@chris-wilson.co.uk
2021-01-05 09:17:22 +00:00
Chris Wilson
c864e9abaf drm/i915: Set rawclk earlier during mmio probe
Older platforms use rawclk to derive the CS clock. rawclk is being
determined during intel_device_info_init(), and so that needs to be
pushed slightly earlier.

Fixes: f170523a7b ("drm/i915/gt: Consolidate the CS timestamp clocks")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210104115145.24460-2-chris@chris-wilson.co.uk
2021-01-05 09:15:57 +00:00
Chris Wilson
6895649bf1 drm/i915/selftests: Set error returns
A few missed PTR_ERR() upon create_gang() errors.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210104115145.24460-1-chris@chris-wilson.co.uk
2021-01-05 09:14:41 +00:00
Chris Wilson
bf3997a541 drm/i915/selftests: Guard against redifinition of SZ_8G
In the near future, upstream will introduce a SZ_8G macro that is
slightly different to our own. Employ a temporary ifndef to avoid
compilation failure until we have backmerged.

References: 8b0fac44bd ("sizes.h: add SZ_8G/SZ_16G/SZ_32G macros")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210104171511.32684-1-chris@chris-wilson.co.uk
2021-01-04 20:31:06 +00:00
Chris Wilson
2b2779917a drm/i915/gt: Rearrange hsw workarounds
Some rcs0 workarounds were being incorrectly applied to the GT, and so
we failed to restore the expected register settings after a reset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210104114914.30165-2-chris@chris-wilson.co.uk
2021-01-04 12:36:31 +00:00
Chris Wilson
81dc2ddc26 drm/i915/gt: Rearrange snb workarounds
Some rcs0 workarounds were being incorrectly applied to the GT, and so
we failed to restore the expected register settings after a reset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210104114914.30165-1-chris@chris-wilson.co.uk
2021-01-04 12:36:30 +00:00
Arnd Bergmann
bb80d8784d drm/i915: fix shift warning
Randconfig builds on 32-bit machines show lots of warnings for
the i915 driver for passing a 32-bit value into __const_hweight64():

drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c:2584:9: error: shift count >= width of type [-Werror,-Wshift-count-overflow]
        return hweight64(VDBOX_MASK(&i915->gt));
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
include/asm-generic/bitops/const_hweight.h:29:49: note: expanded from macro 'hweight64'
 #define hweight64(w) (__builtin_constant_p(w) ? __const_hweight64(w) : __arch_hweight64(w))

Change it to hweight_long() to avoid the warning.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210103135158.3591442-1-arnd@kernel.org
2021-01-03 14:12:48 +00:00
Maarten Lankhorst
093a0bea62 drm/i915: Populate logical context during first pin.
This allows us to remove pin_map from state allocation, which saves
us a few retry loops. We won't need this until first pin, anyway.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201231170405.22843-1-chris@chris-wilson.co.uk
2021-01-01 22:16:01 +00:00
Matt Roper
9fb87fb3fd drm/i915: Clarify error message on failed workaround
Let's modify the "workaround lost" error message slightly to make it
more clear what the various numbers represent.  Also, the 'expected'
value needs to be &'d with wa->read so that it doesn't include the mask
bits for masked registers (those bits are write-only in the hardware and
will usually always read out as 0's).

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201231191103.854519-1-matthew.d.roper@intel.com
2021-01-01 22:15:03 +00:00
Chris Wilson
4e5c8a99e1 drm/i915: Drop i915_request.lock requirement for intel_rps_boost()
Since we use a flag within i915_request.flags to indicate when we have
boosted the request (so that we only apply the boost) once, this can be
used as the serialisation with i915_request_retire() to avoid having to
explicitly take the i915_request.lock which is more heavily contended.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201231093149.19086-1-chris@chris-wilson.co.uk
2020-12-31 15:15:05 +00:00
Chris Wilson
9c080b0f96 drm/i915/gt: Pull context closure check from request submit to schedule-in
We only need to evaluate the current status of the context when it is
scheduled in, we will force a reschedule when the context is closed
propagating the change to inflight contexts.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201231093946.11649-1-chris@chris-wilson.co.uk
2020-12-31 11:30:03 +00:00
Chris Wilson
7904e0819d drm/i915/gt: Cancel submitted requests upon context reset
Since we process schedule-in of a context after submitting the request,
if we decide to reset the context at that time, we also have to cancel
the requets we have marked for submission.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201230220028.17089-1-chris@chris-wilson.co.uk
2020-12-31 09:13:24 +00:00
Chris Wilson
cecb2af42c drm/i915/gt: Taint the reset mutex with the shrinker
Declare that, under extreme circumstances, the shrinker may need to wait
upon a request, in which case reset must not itself deadlock in order to
ensure forward progress of the driver. That is since the shrinker may
depend upon a reset, any reset cannot touch the shrinker.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201229141626.4773-1-chris@chris-wilson.co.uk
2020-12-30 13:36:54 +00:00
Chris Wilson
cc1557cadf drm/i915/gem: Peek at the inflight context
If supported by the backend, we can quickly look at the context's
inflight engine rather than search along the active list to confirm.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201229144114.31686-1-chris@chris-wilson.co.uk
2020-12-29 19:28:45 +00:00
Chris Wilson
70960ab275 drm/i915/gt: Define guc firmware blob for older Cometlakes
When splitting the Coffeelake define to also identify Cometlakes, I
missed the double fw_def for Coffeelake. That is only newer Cometlakes
use the cml specific guc firmware, older Cometlakes should use kbl
firmware.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2859
Fixes: 5f4ae2704d ("drm/i915: Identify Cometlake platform")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: <stable@vger.kernel.org> # v5.9+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201229120828.29931-1-chris@chris-wilson.co.uk
2020-12-29 16:21:21 +00:00
Chris Wilson
fe7bcfaeb2 drm/i915/gt: Refactor heartbeat request construction and submission
Pull the individual strands of creating a custom heartbeat requests into
a pair of common functions. This will reduce the number of changes we
will need to make in future.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224160213.29521-1-chris@chris-wilson.co.uk
2020-12-24 18:07:26 +00:00
Matthew Auld
26ebc511e7 drm/i915: clear the gpu reloc batch
The reloc batch is short lived but can exist in the user visible ppGTT,
and since it's backed by an internal object, which lacks page clearing,
we should take care to clear it upfront.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224151358.401345-2-matthew.auld@intel.com
Cc: stable@vger.kernel.org
2020-12-24 15:25:41 +00:00
Matthew Auld
eeb52ee6c4 drm/i915: clear the shadow batch
The shadow batch is an internal object, which doesn't have any page
clearing, and since the batch_len can be smaller than the object, we
should take care to clear it.

Testcase: igt/gen9_exec_parse/shadow-peek
Fixes: 4f7af1948a ("drm/i915: Support ro ppgtt mapped cmdparser shadow buffers")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224151358.401345-1-matthew.auld@intel.com
Cc: stable@vger.kernel.org
2020-12-24 15:25:41 +00:00
Chris Wilson
177b7a52a1 drm/i915/gt: ce->inflight updates are now serialised
Since schedule-in and schedule-out are now both always under the tasklet
bitlock, we can reduce the individual atomic operations to simple
instructions and worry less.

This notably eliminates the race observed with intel_context_inflight in
__engine_unpark().

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2583
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-9-chris@chris-wilson.co.uk
2020-12-24 15:02:41 +00:00
Chris Wilson
ac1a6d7310 drm/i915/gt: Simplify virtual engine handling for execlists_hold()
Now that the tasklet completely controls scheduling of the requests, and
we postpone scheduling out the old requests, we can keep a hanging
virtual request bound to the engine on which it hung, and remove it from
te queue. On release, it will be returned to the same engine and remain
in its queue until it is scheduled; after which point it will become
eligible for transfer to a sibling. Instead, we could opt to resubmit the
request along the virtual engine on unhold, making it eligible for load
balancing immediately -- but that seems like a pointless optimisation
for a hanging context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-8-chris@chris-wilson.co.uk
2020-12-24 15:02:41 +00:00
Chris Wilson
f81475bb5b drm/i915/gt: Resubmit the virtual engine on schedule-out
Having recognised that we do not change the sibling until we schedule
out, we can then defer the decision to resubmit the virtual engine from
the unwind of the active queue to scheduling out of the virtual context.
This improves our resilence in virtual engine scheduling, and should
eliminate the rare cases of gem_exec_balance failing.

By keeping the unwind order intact on the local engine, we can preserve
data dependency ordering while doing a preempt-to-busy pass until we
have determined the new ELSP. This means that if we try to timeslice
between a virtual engine and a data-dependent ordinary request, the pair
will maintain their relative ordering and we will avoid the
resubmission, cancelling the timeslicing until further change.

The dilemma though is that we then may end up in a situation where the
'demotion' of the virtual request to an ordinary request in the engine
queue results in filling the ELSP[] with virtual requests instead of
spreading the load across the engines. To compensate for this, we mark
each virtual request and refuse to resubmit a virtual request in the
secondary ELSP slots, thus forcing subsequent virtual requests to be
scheduled out after timeslicing. By delaying the decision until we
schedule out, we will avoid unnecessary resubmission.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2079
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2098
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-7-chris@chris-wilson.co.uk
2020-12-24 15:02:40 +00:00
Chris Wilson
66e40750d2 drm/i915/gt: Shrink the critical section for irq signaling
Let's only wait for the list iterator when decoupling the virtual
breadcrumb, as the signaling of all the requests may take a long time,
during which we do not want to keep the tasklet spinning.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-6-chris@chris-wilson.co.uk
2020-12-24 15:02:39 +00:00
Chris Wilson
bab0557c8d drm/i915/gt: Remove virtual breadcrumb before transfer
The issue with stale virtual breadcrumbs remain. Now we have the problem
that if the irq-signaler is still referencing the stale breadcrumb as we
transfer it to a new sibling, the list becomes spaghetti. This is a very
small window, but that doesn't stop it being hit infrequently. To
prevent the lists being tangled (the iterator starting on one engine's
b->signalers but walking onto another list), always decouple the virtual
breadcrumb on schedule-out and make sure that the walker has stepped out
of the lists.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-5-chris@chris-wilson.co.uk
2020-12-24 15:02:39 +00:00
Chris Wilson
6f0726b480 drm/i915/gt: Defer schedule_out until after the next dequeue
Inside schedule_out, we do extra work upon idling the context, such as
updating the runtime, kicking off retires, kicking virtual engines.
However, if we are in a series of processing single requests per
contexts, we may find ourselves scheduling out the context, only to
immediately schedule it back in during dequeue. This is just extra work
that we can avoid if we keep the context marked as inflight across the
dequeue. This becomes more significant later on for minimising virtual
engine misses.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-4-chris@chris-wilson.co.uk
2020-12-24 15:02:38 +00:00
Chris Wilson
2efa2c522a drm/i915/gt: Decouple inflight virtual engines
Once a virtual engine has been bound to a sibling, it will remain bound
until we finally schedule out the last active request. We can not rebind
the context to a new sibling while it is inflight as the context save
will conflict, hence we wait. As we cannot then use any other sibliing
while the context is inflight, only kick the bound sibling while it
inflight and upon scheduling out the kick the rest (so that we can swap
engines on timeslicing if the previously bound engine becomes
oversubscribed).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-3-chris@chris-wilson.co.uk
2020-12-24 15:02:37 +00:00
Chris Wilson
64b7a3fa7e drm/i915/gt: Use virtual_engine during execlists_dequeue
Rather than going back and forth between the rb_node entry and the
virtual_engine type, store the ve local and reuse it. As the
container_of conversion from rb_node to virtual_engine requires a
variable offset, performing that conversion just once shaves off a bit
of code.

v2: Keep a single virtual engine lookup, for typical use.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-2-chris@chris-wilson.co.uk
2020-12-24 15:02:37 +00:00
Chris Wilson
16f2941ad3 drm/i915/gt: Replace direct submit with direct call to tasklet
Rather than having special case code for opportunistically calling
process_csb() and performing a direct submit while holding the engine
spinlock for submitting the request, simply call the tasklet directly.
This allows us to retain the direct submission path, including the CS
draining to allow fast/immediate submissions, without requiring any
duplicated code paths, and most importantly greatly simplifying the
control flow by removing reentrancy. This will enable us to close a few
races in the virtual engines in the next few patches.

The trickiest part here is to ensure that paired operations (such as
schedule_in/schedule_out) remain under consistent locking domains,
e.g. when pulled outside of the engine->active.lock

v2: Use bh kicking, see commit 3c53776e29 ("Mark HI and TASKLET
softirq synchronous").
v3: Update engine-reset to be tasklet aware

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-1-chris@chris-wilson.co.uk
2020-12-24 15:02:35 +00:00
Chris Wilson
6d393ef5ff drm/i915/gem: Optimistically prune dma-resv from the shrinker.
As we shrink an object, also see if we can prune the dma-resv of idle
fences it is maintaining a reference to.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201223122051.4624-2-chris@chris-wilson.co.uk
2020-12-23 21:58:00 +00:00
Chris Wilson
d7d82f5d5c drm/i915/gt: Prefer recycling an idle fence
If we want to reuse a fence that is in active use by the GPU, we have to
wait an uncertain amount of time, but if we reuse an inactive fence, we
can change it right away. Loop through the list of available fences
twice, ignoring any active fences on the first pass.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201223122051.4624-1-chris@chris-wilson.co.uk
2020-12-23 21:58:00 +00:00
Chris Wilson
f170523a7b drm/i915/gt: Consolidate the CS timestamp clocks
Pull the GT clock information [used to derive CS timestamps and PM
interval] under the GT so that is it local to the users. In doing so, we
consolidate the two references for the same information, of which the
runtime-info took note of a potential clock source override and scaling
factors.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201223122359.22562-2-chris@chris-wilson.co.uk
2020-12-23 21:10:41 +00:00
Chris Wilson
8391c9b28c drm/i915/selftests: Confirm CS_TIMESTAMP / CTX_TIMESTAMP share a clock
We assume that both timestamps are driven off the same clock [reported
to userspace as I915_PARAM_CS_TIMESTAMP_FREQUENCY]. Verify that this is
so by reading the timestamp registers around a busywait (on an otherwise
idle engine so there should be no preemptions).

v2: Icelake (not ehl, nor tgl) seems to be using a fixed 80ns interval
for, and only for, CTX_TIMESTAMP -- or it may be GPU frequency and the
test is always running at maximum frequency?. As far as I can tell, this
isolated change in behaviour is undocumented.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201223122359.22562-1-chris@chris-wilson.co.uk
2020-12-23 21:10:40 +00:00