This mode allows to assign EUs to pools which can process work collectively.
The command to enable this mode should be issued as part of context initialization.
The pooled mode is global, once enabled it has to stay the same across all
contexts until HW reset hence this is sent in auxiliary golden context batch.
Thanks to Mika for the preliminary review and comments.
v2: explain why this is enabled in golden context, use feature flag while
enabling the support (Chris)
v3: Include only kernel support as userspace support is not available yet.
User space clients need to know when the pooled EU feature is present
and enabled on the hardware so that they can adapt work submissions.
Create a new device info flag for this purpose.
Set has_pooled_eu to true in the Broxton static device info - Broxton
supports the feature in hardware and the driver will enable it by
default.
We need to add getparam ioctls to enable userspace to query availability of
this feature and to retrieve min. no of eus in a pool but we will expose
them once userspace support is available. Opensource users for this feature
are mesa, libva and beignet.
Beignet team is currently working on adding userspace support.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Cc: Winiarski, Michal <michal.winiarski@intel.com>
Cc: Zou, Nanhai <nanhai.zou@intel.com>
Cc: Yang, Rong R <rong.r.yang@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Armin Reese <armin.c.reese@intel.com>
Cc: Tim Gore <tim.gore@intel.com>
Signed-off-by: Jeff McGee <jeff.mcgee@intel.com>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Convert all static functions in i915_guc_submission.c that currently
take a 'dev' pointer to take 'dev_priv' instead (there are three,
guc_client_alloc(), guc_client_free(), and gem_allocate_guc_obj().
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Rename these remaining function prefixes to better align with the
corresponding SKL functions.
No functional change.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
So far we configured a static lane latency optimization during driver
loading/resuming. The specification changed at one point and now this
configuration depends on the lane count, so move the configuration
to modeset time accordingly.
It's not clear when this lane configuration takes effect. The
specification only requires that the programming is done before enabling
the port. On CHV OTOH the lanes start to power up already right after
enabling the PLL. To be safe preserve the current order and set things
up already before enabling the PLL.
v2: (Ander)
- Simplify the optimization mask calculation.
- Use the correct pipe_config always during the calculation instead
of the bogus intel_crtc->config.
CC: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95476
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
So far we depended on the HW to dynamically power down unused PHYs and
so we enabled them manually once during driver loading/resuming. There
are indications however that we can achieve better power savings by
manual powering toggling. So make the PHY enabling/disabling to happen
on-demand whenever we need either the corresponding AUX or port
functionality. CHV does this already by enabling the PHY along the
corresponding PHY common lane power wells there, do the same on BXT by
adding virtual power wells for the same purpose.
Also sanity check the common lane power down ack signal from the PHY. Do
this only when the PHY is enabled, since it's not clear at what point
the HW power/clock gates things.
While at it rename broxton_ prefix to bxt_ in related function names to
better align with the SKL code.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
These helpers will be needed by the next patch, so factor them out.
No functional change.
v2:
- Move the refcount==0 WARN to the new put helper. (Ville)
CC: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
A follow-up patch moves the PHY enabling to the power well code where
enabling/disabling the PHYs will happen independently. Because of this
waiting for the GRC calibration in PHY1 asynchronously would need some
additional logic. Instead of adding that let's keep things simple for now
and wait synchronously. My measurements showed that the calibration
takes ~4ms.
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
It seems pretty clear that bitwise OR was intended here and not logical
OR.
Fixes: 6fc29133ea ('drm/i915/gen9: Add WaDisableSkipCaching')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com>
Calling drm_framebuffer_unregister_private() in intel_fbdev_destroy() is
superfluous because the framebuffer will subsequently be unregistered by
drm_framebuffer_free() when unreferenced in drm_framebuffer_remove().
The call is a leftover, when it was introduced by commit 362063619c
("drm: revamp framebuffer cleanup interfaces"), struct intel_framebuffer
was still embedded in struct intel_fbdev rather than being a pointer as
it is today, and drm_framebuffer_remove() wasn't used yet.
As a bonus, the ID of the framebuffer is no longer 0 in the debug log:
Before:
[ 39.680874] [drm:drm_mode_object_unreference] OBJ ID: 0 (3)
[ 39.680878] [drm:drm_mode_object_unreference] OBJ ID: 0 (2)
[ 39.680884] [drm:drm_mode_object_unreference] OBJ ID: 0 (1)
After:
[ 102.504649] [drm:drm_mode_object_unreference] OBJ ID: 45 (3)
[ 102.504651] [drm:drm_mode_object_unreference] OBJ ID: 45 (2)
[ 102.504654] [drm:drm_mode_object_unreference] OBJ ID: 45 (1)
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/5031860caad67faa0f1be5965331ef048a311a01.1465383212.git.lukas@wunner.de
This patch adds support for extending the pread/pwrite functionality
for objects not backed by shmem. The access will be made through
gtt interface. This will cover objects backed by stolen memory as well
as other non-shmem backed objects.
v2: Drop locks around slow_user_access, prefault the pages before
access (Chris)
v3: Rebased to the latest drm-intel-nightly (Ankit)
v4: Moved page base & offset calculations outside the copy loop,
corrected data types for size and offset variables, corrected if-else
braces format (Tvrtko/kerneldocs)
v5: Enabled pread/pwrite for all non-shmem backed objects including
without tiling restrictions (Ankit)
v6: Using pwrite_fast for non-shmem backed objects as well (Chris)
v7: Updated commit message, Renamed i915_gem_gtt_read to i915_gem_gtt_copy,
added pwrite slow path for non-shmem backed objects (Chris/Tvrtko)
v8: Updated v7 commit message, mutex unlock around pwrite slow path for
non-shmem backed objects (Tvrtko)
v9: Corrected check during pread_ioctl, to avoid shmem_pread being
called for non-shmem backed objects (Tvrtko)
v10: Moved the write_domain check to needs_clflush and tiling mode check
to pwrite_fast (Chris)
v11: Use pwrite_fast fallback for all objects (shmem and non-shmem backed),
call fast_user_write regardless of pagefault in previous iteration
v12: Use page-by-page copy for slow user access too (Chris)
v13: Handled EFAULT, Avoid use of WARN_ON, put_fence only if whole obj
pinned (Chris)
v14: Corrected datatypes/initializations (Tvrtko)
Testcase: igt/gem_stolen, igt/gem_pread, igt/gem_pwrite
Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1465548783-19712-1-git-send-email-ankitprasad.r.sharma@intel.com
In pwrite_fast, map an object page by page if obj_ggtt_pin fails. First,
we try a nonblocking pin for the whole object (since that is fastest if
reused), then failing that we try to grab one page in the mappable
aperture. It also allows us to handle objects larger than the mappable
aperture (e.g. if we need to pwrite with vGPU restricting the aperture
to a measely 8MiB or something like that).
v2: Pin pages before starting pwrite, Combined duplicate loops (Chris)
v3: Combined loops based on local patch by Chris (Chris)
v4: Added i915 wrapper function for drm_mm_insert_node_in_range (Chris)
v5: Renamed wrapper function for drm_mm_insert_node_in_range (Chris)
v5: Added wrapper for drm_mm_remove_node() (Chris)
v6: Added get_pages call before pinning the pages (Tvrtko)
Added remove_mappable_node() wrapper for drm_mm_remove_node() (Chris)
v7: Added size argument for insert_mappable_node (Tvrtko)
v8: Do not put_pages after pwrite, do memset of node in the wrapper
function (insert_mappable_node) (Chris)
v9: Rebase (Ankit)
Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
This utility function is a companion to i915_gem_object_get_page() that
uses the same cached iterator for the scatterlist to perform fast
sequential lookup of the dma address associated with any page within the
object.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Introduced a new vm specfic callback insert_page() to program a single pte in
ggtt or ppgtt. This allows us to map a single page in to the mappable aperture
space. This can be iterated over to access the whole object by using space as
meagre as page size.
v2: Added low level rpm assertions to insert_page routines (Chris)
v3: Added POSTING_READ post register write (Tvrtko)
v4: Rebase (Ankit)
v5: Removed wmb() and FLUSH_CTL from insert_page, caller to take care
of it (Chris)
v6: insert_page not working correctly without FLSH_CNTL write, added the
write again.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ankitprasad Sharma <ankitprasad.r.sharma@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Past evidence with system hangs and hsds tie
WaForceEnableNonCoherent and WaDisableHDCInvalidation to
WaForceContextSaveRestoreNonCoherent. Documentation
states that WaForceContextSaveRestoreNonCoherent would
not be needed on skl past E0 but evidence proved otherwise. See
commit <510650e8b2ab> ("drm/i915/skl: Fix spurious gpu hang with gt3/gt4
revs"). In this scope consider kbl to be skl with a bigger revision than
E0 so play it safe and bind these two workarounds to the
WaForceContextSaveRestoreNonCoherent, and apply to all gen9.
v2: fix comment (Matthew)
References: HSD#2134449, HSD#2131413
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1465309159-30531-7-git-send-email-mika.kuoppala@intel.com
The atomic version of intel_pre_plane_update did not check
for HAS_GMCH_DISPLAY before calling intel_set_memory_cxsr().
While this doesn't cause any issues on its own (it will
return without doing anything if the hardware doesn't
have the required feature), the drm_wait_one_vblank() that
is needed if memory self-refresh is disabled introduces
an unnecessary delay in the suspend path.
In cases where i915 is on the critical path it means that
we slow down suspend by 16.8ms on platforms that don't
need to disable memory self-refresh.
Signed-off-by: David Weinehall <david.weinehall@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1463662236-18192-1-git-send-email-david.weinehall@linux.intel.com
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Apparently some CHV boards failed to hook up the port presence straps
for HDMI ports as well (earlier we assumed this problem only affected
eDP ports). So let's check the VBT in addition to the strap, and if
either one claims that the port is present go ahead and register the
relevant connector.
While at it, change port D to register DP before HDMI as we do for ports
B and C since
commit 457c52d87e ("drm/i915: Only ignore eDP ports that are connected")
Also print a debug message when we register a HDMI connector to aid
in diagnosing missing/incorrect ports. We already had such a print for
DP/eDP.
v2: Improve the comment in the code a bit, note the port D change in
the commit message
Cc: Radoslav Duda <radosd@radosd.com>
Tested-by: Radoslav Duda <radosd@radosd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96321
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/1464945463-14364-1-git-send-email-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
When resetting and reloading the GuC, the GuC submission management code
also needs to destroy and recreate the GuC client(s). Currently this is
done by a separate call from the GuC loader, but really, it's just an
internal detail of the submission code. So here we remove the call from
the loader (which is too late, really, because the GuC has already been
reloaded at this point) and put it into guc_submission_init() instead.
This means that any preexisting client is destroyed *before* the GuC
(re)load and then recreated after, iff the firmware was successfully
loaded. If the GuC reload fails, we don't recreate the client, so
fallback to execlists mode (if active) won't leak the client object
(previously, the now-unusable client would have been left allocated,
and leaked if the driver were unloaded).
Signed-off-by: Dave Gordon <david.s.gordon@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>