Let intel_guc_init_fw() focus on determining and fetching the correct
firmware.
This patch introduces intel_uc_sanitize_options() that is called from
intel_sanitize_options().
Then, if we have GuC, we can call intel_guc_init_fw() conditionally
and we do not have to do the internal checks.
v2: fix comment, notify when nuking GuC explicitly enabled (M. Wajdeczko)
v3: fix comment again, change the nuke message (M. Wajdeczko)
v4: update title to reflect new function name + rebase
v5: text && remove 2 unnecessary checks (M. Wajdeczko)
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Instead of calling intel_guc_init() and intel_huc_init() one by one this
patch introduces intel_uc_init_fw() function that calls them both.
Called functions are renamed accordingly.
Trying to have subject_verb_object ordering and more descriptive names,
the intel_huc_init() and intel_guc_init() functions are renamed.
For guc_init():
* `intel_guc` is the subject, so those functions now take intel_guc
structure, instead of the dev_priv
* init is the verb
* fw is the object which better describes the function's role
huc_init() change follows the same reasoning.
v2: settle on intel_uc_fetch_fw name (M. Wajdeczko)
v3: yet another rename - intel_uc_init_fw (J. Lahtinen)
v4: non-trivial rebase
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The file fits better.
Additionally rename it to intel_uc_prepare_fw(), as the function does
more than simple fetch.
`obj` cleanup in the function is also fixed (i.e. removed). In the fail
scenario it was always 'put' but there's no possible flow that
initializes the obj properly and then goes to the fail label.
v2: remove second declaration, reorder (M. Wajdeczko)
v3: non-trivial rebase
v4: remove obj cleanup in the fail scenario (C. Wilson)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
GuC historically has two "startup" functions called _init() and _setup().
Then HuC came with its _init() and _load().
This commit renames intel_guc_setup() and intel_huc_load() to
*uc_init_hw() as they are called from i915_gem_init_hw().
The aim is to be consistent in that entry points called during
particular driver init phases (e.g. init_hw) are all suffixed by that
phase. When reading the leaf functions, it should be clear at what stage
during the driver load they are called and therefore what operations are
legal at that point.
Also, since the functions start with intel_guc and intel_huc they take
the appropriate structure.
v2: commit message update (Chris Wilson)
v3: change taken parameters to be more "semantic" (M. Wajdeczko)
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
The 33rd entry in the pre-CSC gamma table in Geminilake can represent a
value of 1.0 as 17 bits fixed point with one integer bit. However, the
table was generated such that the value of 1.0 would be 0.ffff with
all the intervals scaled accordingly. For instance, 0.5 mapped to
0.7fff instead of 0.8000.
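The difference between the two scalings can be shown with a small
standalone sketch. The helper names are hypothetical and not taken from
the driver; the only facts assumed from the text are the 33 entries and
the fixed point format with one integer bit and 16 fraction bits.

```c
#include <assert.h>
#include <stdint.h>

#define LUT_ENTRIES 33u	/* entries 0..32, the last one represents 1.0 */

/* Old scaling: 1.0 is approximated by 0.ffff, so every interval shrinks. */
static uint32_t lut_entry_old(unsigned int i)
{
	assert(i < LUT_ENTRIES);
	return i * 0xffffu / (LUT_ENTRIES - 1);
}

/* New scaling: 1.0 is exactly 1 << 16 (one integer bit, 16 fraction bits). */
static uint32_t lut_entry_new(unsigned int i)
{
	assert(i < LUT_ENTRIES);
	return i * (1u << 16) / (LUT_ENTRIES - 1);
}
```

With these helpers the midpoint entry (index 16) is 0x7fff under the old
scaling and 0x8000 under the new one, and the last entry (index 32) moves
from 0xffff to 0x10000.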
For a reason that is not clear to the author, the rounding seems to be
different when a cursor plane is used, leading to some seemingly random
failures of the kms_cursor_crc igt tests. The differences weren't
perceptible at 8bpc with images captured by a Chamelium device, but did
cause CRC mismatches.
Signed-off-by: Ander Conselvan de Oliveira <ander.conselvan.de.oliveira@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170310101835.29845-1-ander.conselvan.de.oliveira@intel.com
There's really not a reason afaics that we can't just clean up
everything at the end, in the terminal postclose hook: Since this is
closing a file descriptor we know no one else can have a reference or
a thread doing something with that drm_file except the close code.
Ordering shouldn't matter, as long as we don't kfree before we clean
stuff up.
In the past this was more relevant when drivers still had to track and
clean up pending drm events, but that's all done by the core now.
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Liviu Dudau <Liviu.Dudau@arm.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170308141257.12119-13-daniel.vetter@ffwll.ch
The trouble we have is that we can't really test all the shrinker
recursion stuff exhaustively in BAT because any kind of thrashing
stress test just takes too long.
But that leaves a really big gap open, since shrinker recursions are
one of the most annoying bugs. Now lockdep already has support for
checking allocation deadlocks:
- Direct reclaim paths are marked up with
lockdep_set_current_reclaim_state() and
lockdep_clear_current_reclaim_state().
- Any allocation paths are marked with lockdep_trace_alloc().
If we simply mark up our debugfs with the reclaim annotations, any
code and locks taken in there will automatically complete the picture
with any allocation paths we already have, as long as we have a simple
testcase in BAT which throws out a few objects using this interface.
No stress test or thrashing needed at all.
v2: Need to EXPORT_SYMBOL_GPL to make it compile as a module.
v3: Fixup rebase fail (spotted by Chris).
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://patchwork.freedesktop.org/patch/msgid/20170312205340.16202-1-daniel.vetter@ffwll.ch
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Optimize the plane register accesses a little bit by grabbing
the uncore lock manually across the entire pile of accesses and
using I915_READ_FW().
This helps keep the pipe update vblank evade critical section
below our 100 usec deadline, particularly with lockdep enabled.
And in general we want to keep that critical section as short
as possible as it's executed with interrupts disabled.
Not all plane updates currently happen from within the vblank evade
critical section, so we must use the irqsave/irqrestore variants
of the spinlock functions in the plane hooks.
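The locking pattern can be modelled in userspace as a rough sketch. All
names below are hypothetical and none of them exist in i915; the model
only counts lock acquisitions and does not attempt to represent
interrupt state, so it says nothing about the irqsave/irqrestore part.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_NUM_REGS 8u

/* Hypothetical userspace model of a register file guarded by one lock. */
struct mock_uncore {
	unsigned int lock_acquisitions;
	int locked;
	uint32_t regs[MOCK_NUM_REGS];
};

static void mock_lock(struct mock_uncore *u)
{
	assert(!u->locked);
	u->locked = 1;
	u->lock_acquisitions++;
}

static void mock_unlock(struct mock_uncore *u)
{
	assert(u->locked);
	u->locked = 0;
}

/* Models a locked accessor: takes and drops the lock on every access. */
static void mock_write(struct mock_uncore *u, unsigned int reg, uint32_t val)
{
	mock_lock(u);
	u->regs[reg] = val;
	mock_unlock(u);
}

/* Models a raw "_FW" style accessor: the caller must hold the lock. */
static void mock_write_fw(struct mock_uncore *u, unsigned int reg, uint32_t val)
{
	assert(u->locked);
	u->regs[reg] = val;
}

/* Before: one lock round trip per register. */
static void update_plane_slow(struct mock_uncore *u)
{
	unsigned int reg;

	for (reg = 0; reg < MOCK_NUM_REGS; reg++)
		mock_write(u, reg, reg);
}

/* After: one lock round trip across the entire pile of accesses. */
static void update_plane_fast(struct mock_uncore *u)
{
	unsigned int reg;

	mock_lock(u);
	for (reg = 0; reg < MOCK_NUM_REGS; reg++)
		mock_write_fw(u, reg, reg);
	mock_unlock(u);
}
```

Both variants leave the same register contents behind; the second one
simply pays for the lock once instead of once per register.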
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170309154434.29303-5-ville.syrjala@linux.intel.com
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Baytrail PMIC vs. PMU race fixes from Hans de Goede
This time the right version (v4), with the compile fix.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Currently, we sum the render and media cycles (on different engines) to
compute a percentage - but we fail to factor in the duplication into the
threshold calculations. This makes us very eager to upclock!
If we just consider the maximum busy cycles of either counter, we should
have an accurate reflection on whether there are cycles to spare to
handle the workload at this frequency.
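As a minimal sketch of the reasoning above (the structure and function
names are hypothetical, and the percentage threshold is an illustrative
parameter rather than the driver's actual tuning), compare summing the
two counters against taking their maximum:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sample of busy cycles over one evaluation interval. */
struct busy_sample {
	uint64_t render;	/* busy cycles seen by the render counter */
	uint64_t media;		/* busy cycles seen by the media counter */
	uint64_t elapsed;	/* total cycles in the interval */
};

/* Summing counts overlapping time twice, so it can exceed 100%. */
static bool above_threshold_sum(const struct busy_sample *s, unsigned int pct)
{
	assert(pct <= 100);
	return 100 * (s->render + s->media) > (uint64_t)pct * s->elapsed;
}

/* Taking the maximum reflects how busy the busiest counter really was. */
static bool above_threshold_max(const struct busy_sample *s, unsigned int pct)
{
	uint64_t busy = s->render > s->media ? s->render : s->media;

	assert(pct <= 100);
	return 100 * busy > (uint64_t)pct * s->elapsed;
}
```

For a sample where both counters are busy for half of the interval, the
summed variant reports 100% and crosses a 90% threshold, while the
maximum variant reports 50% and does not.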
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170309211232.28878-2-chris@chris-wilson.co.uk
On Baytrail, we manually calculate busyness over the evaluation interval
to avoid issues with miscalculations with RC6 enabled. However, it turns
out that the DOWN_EI interrupt generator is completely bust - it
operates in two modes, continuous or never. Neither of which is
conducive to good behaviour. Stop unmasking the DOWN_EI interrupt and just
compute everything from the UP_EI which does seem to correspond to the
desired interval.
v2: Fixup gen6_rps_pm_mask() as well
v3: Inline vlv_c0_above() to combine the now identical elapsed
calculation for up/down and simplify the threshold testing
Fixes: 43cf3bf084 ("drm/i915: Improved w/a for rps on Baytrail")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v4.1+
Link: http://patchwork.freedesktop.org/patch/msgid/20170309211232.28878-1-chris@chris-wilson.co.uk
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
gcc-4.4.4 has issues with anonymous union initializers.
In file included from drivers/gpu/drm/i915/selftests/i915_selftest.c:68:
drivers/gpu/drm/i915/selftests/i915_mock_selftests.h:11: error: unknown field 'mock' specified in initializer
drivers/gpu/drm/i915/selftests/i915_mock_selftests.h:11: warning: missing braces around initializer
drivers/gpu/drm/i915/selftests/i915_mock_selftests.h:11: warning: (near initialization for 'mock_selftests[0].<anonymous>')
drivers/gpu/drm/i915/selftests/i915_mock_selftests.h:12: error: unknown field 'mock' specified in initializer
drivers/gpu/drm/i915/selftests/i915_mock_selftests.h:13: error: unknown field 'm
...
Work around this.
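One common way to sidestep this class of error, sketched here with a
simplified and hypothetical stand-in for the selftest table (the real
structure and macros are not reproduced), is to wrap the initializer of
the anonymous union in its own braces so that the member is not named
from the enclosing structure:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a selftest table entry. */
struct selftest {
	const char *name;
	union {			/* anonymous union (C11 / GNU extension) */
		int (*mock)(void);
		int (*live)(void *data);
	};
};

static int mock_sanitycheck(void)
{
	return 0;
}

static const struct selftest mock_selftests[] = {
	/*
	 * Older compilers reject ".mock = fn" written at the struct level.
	 * The extra braces initialize the anonymous union positionally and
	 * only name the member inside the union's own initializer.
	 */
	{ .name = "sanitycheck", { .mock = mock_sanitycheck } },
};

/* Run one entry of the table; the index must be in range. */
static int run_mock_selftest(size_t idx)
{
	assert(idx < sizeof(mock_selftests) / sizeof(mock_selftests[0]));
	return mock_selftests[idx].mock();
}
```

The braced form is also accepted by current compilers, so it does not
need to be conditional on the compiler version.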
Fixes: 953c7f82eb ("drm/i915: Provide a hook for selftests")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: http://patchwork.freedesktop.org/patch/msgid/20170310090314.3142-1-chris@chris-wilson.co.uk
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we allow the user to convert a GTT mmap address into a userptr, we
may end up in recursion hell, where currently we hit a mutex deadlock
but other possibilities include use-after-free during the
unbind/cancel_userptr.
[ 143.203989] gem_userptr_bli D 0 902 898 0x00000000
[ 143.204054] Call Trace:
[ 143.204137] __schedule+0x511/0x1180
[ 143.204195] ? pci_mmcfg_check_reserved+0xc0/0xc0
[ 143.204274] schedule+0x57/0xe0
[ 143.204327] schedule_timeout+0x383/0x670
[ 143.204374] ? trace_hardirqs_on_caller+0x187/0x280
[ 143.204457] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 143.204507] ? usleep_range+0x110/0x110
[ 143.204657] ? irq_exit+0x89/0x100
[ 143.204710] ? retint_kernel+0x2d/0x2d
[ 143.204794] ? trace_hardirqs_on_caller+0x187/0x280
[ 143.204857] ? _raw_spin_unlock_irq+0x33/0x60
[ 143.204944] wait_for_common+0x1f0/0x2f0
[ 143.205006] ? out_of_line_wait_on_atomic_t+0x170/0x170
[ 143.205103] ? wake_up_q+0xa0/0xa0
[ 143.205159] ? flush_workqueue_prep_pwqs+0x15a/0x2c0
[ 143.205237] wait_for_completion+0x1d/0x20
[ 143.205292] flush_workqueue+0x2e9/0xbb0
[ 143.205339] ? flush_workqueue+0x163/0xbb0
[ 143.205418] ? __schedule+0x533/0x1180
[ 143.205498] ? check_flush_dependency+0x1a0/0x1a0
[ 143.205681] i915_gem_userptr_mn_invalidate_range_start+0x1c7/0x270 [i915]
[ 143.205865] ? i915_gem_userptr_dmabuf_export+0x40/0x40 [i915]
[ 143.205955] __mmu_notifier_invalidate_range_start+0xc6/0x120
[ 143.206044] ? __mmu_notifier_invalidate_range_start+0x51/0x120
[ 143.206123] zap_page_range_single+0x1c7/0x1f0
[ 143.206171] ? unmap_single_vma+0x160/0x160
[ 143.206260] ? unmap_mapping_range+0xa9/0x1b0
[ 143.206308] ? vma_interval_tree_subtree_search+0x75/0xd0
[ 143.206397] unmap_mapping_range+0x18f/0x1b0
[ 143.206444] ? zap_vma_ptes+0x70/0x70
[ 143.206524] ? __pm_runtime_resume+0x67/0xa0
[ 143.206723] i915_gem_release_mmap+0x1ba/0x1c0 [i915]
[ 143.206846] i915_vma_unbind+0x5c2/0x690 [i915]
[ 143.206925] ? __lock_is_held+0x52/0x100
[ 143.207076] i915_gem_object_set_tiling+0x1db/0x650 [i915]
[ 143.207236] i915_gem_set_tiling_ioctl+0x1d3/0x3b0 [i915]
[ 143.207377] ? i915_gem_set_tiling_ioctl+0x5/0x3b0 [i915]
[ 143.207457] drm_ioctl+0x36c/0x670
[ 143.207535] ? debug_lockdep_rcu_enabled.part.0+0x1a/0x30
[ 143.207730] ? i915_gem_object_set_tiling+0x650/0x650 [i915]
[ 143.207793] ? drm_getunique+0x120/0x120
[ 143.207875] ? __handle_mm_fault+0x996/0x14a0
[ 143.207939] ? vm_insert_page+0x340/0x340
[ 143.208028] ? up_write+0x28/0x50
[ 143.208086] ? vm_mmap_pgoff+0x160/0x190
[ 143.208163] do_vfs_ioctl+0x12c/0xa60
[ 143.208218] ? debug_lockdep_rcu_enabled+0x35/0x40
[ 143.208267] ? ioctl_preallocate+0x150/0x150
[ 143.208353] ? __do_page_fault+0x36a/0x6e0
[ 143.208400] ? mark_held_locks+0x23/0xc0
[ 143.208479] ? up_read+0x1f/0x40
[ 143.208526] ? entry_SYSCALL_64_fastpath+0x5/0xc6
[ 143.208669] ? __fget_light+0xa7/0xc0
[ 143.208747] SyS_ioctl+0x41/0x70
To prevent the possibility of a deadlock, we defer scheduling the worker
until after we have proven that given the current mm, the userptr range
does not overlap a GGTT mmapping. If another thread tries to remap the
GGTT over the userptr before the worker is scheduled, it will be stopped
by its invalidate-range flushing the current work, before the deadlock
can occur.
v2: Improve discussion of how we end up in the deadlock.
v3: Don't forget to mark the userptr as active after a successful
gup_fast. Rename overlaps_ggtt to noncontiguous_or_overlaps_ggtt.
v4: Fix test ordering between invalid GTT mmapping and range completion
(Tvrtko)
Reported-by: Michał Winiarski <michal.winiarski@intel.com>
Testcase: igt/gem_userptr_blits/map-fixed-invalidate-gup
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20170308215903.24171-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>