Commit Graph

60422 Commits

Author SHA1 Message Date
Chris Wilson
36104fcf8f drm/i915: Flush context free work on cleanup
Throw in a flush_work() to specifically flush the context cleanup work
before the module is unloaded.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112248
Fixes: a4e7ccdac3 ("drm/i915: Move context management under GEM")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191112150051.1603-1-chris@chris-wilson.co.uk
(cherry picked from commit 5f00cac921)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-18 16:35:57 +02:00
Dave Airlie
17cc51390c Merge branch 'vmwgfx-next' of git://people.freedesktop.org/~thomash/linux into drm-next
Two minor cleanups / fixes for -next.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: =?UTF-8?q?Thomas=20Hellstr=C3=B6m=20=28VMware=29?=
Link: https://patchwork.freedesktop.org/patch/msgid/20191114131703.8607-1-thomas_os@shipmail.org
2019-11-15 12:34:45 +10:00
Dave Airlie
2d0720f5a4 Merge tag 'drm-intel-next-fixes-2019-11-14' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
- PMU "Frequency" is reported as accumulated cycles
- Avoid OOPS in dumb_create IOCTL when no CRTCs
- Mitigation for userptr put_pages deadlock with trylock_page
- Fix to avoid freeing heartbeat request too early
- Fix LRC coherency issue
- Fix Bugzilla #112212: Avoid screen corruption on MST
- Error path fix to unlock context on failed context VM SETPARAM
- Always consider holding preemption a privileged op in perf/OA
- Preload LUTs if the hw isn't currently using them to avoid color flash on VLV/CHV
- Protect context while grabbing its name for the request
- Don't resize aliasing ppGTT size
- Smaller fixes picked by tooling

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191114085213.GA6440@jlahtine-desk.ger.corp.intel.com
2019-11-15 12:16:43 +10:00
YueHaibing
b4011644b0 drm/vmwgfx: remove set but not used variable 'srf'
drivers/gpu/drm/vmwgfx/vmwgfx_surface.c:339:22:
 warning: variable srf set but not used [-Wunused-but-set-variable]

'srf' is never used, so can be removed.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
2019-11-14 08:41:36 +01:00
Thomas Hellstrom
e2e966636a drm/ttm, drm/vmwgfx: Use a configuration option for the TTM dma page pool
Drivers like vmwgfx may want to test whether the dma page pool is present
or not. Since it's activated by default by TTM if compiled-in, define a
hidden configuration option that the driver can test for.

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2019-11-14 08:41:23 +01:00
Dave Airlie
dfce90259d Backmerge i915 security patches from commit 'ea0b163b13ff' into drm-next
This backmerges the branch that ended up in Linus' tree. It removes
all the changes for the rc6 patches from Linus' tree in favour of
a patch that is based on a large refactor that occured.

Otherwise it all looks good.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-11-14 11:09:06 +10:00
Imre Deak
2248a28384 drm/i915/gen8+: Add RC6 CTX corruption WA
In some circumstances the RC6 context can get corrupted. We can detect
this and take the required action, that is disable RC6 and runtime PM.
The HW recovers from the corrupted state after a system suspend/resume
cycle, so detect the recovery and re-enable RC6 and runtime PM.

v2: rebase (Mika)
v3:
- Move intel_suspend_gt_powersave() to the end of the GEM suspend
  sequence.
- Add commit message.
v4:
- Rebased on intel_uncore_forcewake_put(i915->uncore, ...) API
  change.
v5:
- Rebased on latest upstream gt_pm refactoring.
v6:
- s/i915_rc6_/intel_rc6_/
- Don't return a value from i915_rc6_ctx_wa_check().
v7:
- Rebased on latest gt rc6 refactoring.

Signed-off-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
[airlied: pull this later version of this patch into drm-next
to make resolving the conflict mess easier.]
Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-11-14 10:51:54 +10:00
Dave Airlie
94bc7f56a8 Merge tag 'arcpgu-updates-2019.07.18' of github.com:abrodkin/linux into drm-next
This is a pretty simple improvement that allows to find encoder
as the one and only (ARC PGU doesn't support more than one) endpoint
instead of using non-standard "encoder-slave" property.

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alexey Brodkin <Alexey.Brodkin@synopsys.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CY4PR1201MB0120FDB10A777345F9C27720A1C90@CY4PR1201MB0120.namprd12.prod.outlook.com
2019-11-14 09:28:46 +10:00
Dave Airlie
3447fd0c9d Merge tag 'drm-misc-next-fixes-2019-11-13' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
- Fix memory leak in gpu debugfs node's release (Johan)

Cc: Johan Hovold <johan@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113211056.GA78440@art_vandelay
2019-11-14 08:51:11 +10:00
Dave Airlie
0990ca235d Merge tag 'drm-next-5.5-2019-11-08' of git://people.freedesktop.org/~agd5f/linux into drm-next
drm-next-5.5-2019-11-08:

amdgpu:
- Enable VCN dynamic powergating on RV/RV2
- Fixes for Navi14
- Misc Navi fixes
- Fix MSI-X tear down
- Misc Arturus fixes
- Fix xgmi powerstate handling
- Documenation fixes

scheduler:
- Fix static code checker warning
- Fix possible thread reactivation while thread is stopped
- Avoid cleanup if thread is parked

radeon:
- SI dpm fix ported from amdgpu

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108212713.5078-1-alexander.deucher@amd.com
2019-11-14 08:43:07 +10:00
Johan Hovold
a64fc11b9a drm/msm: fix memleak on release
If a process is interrupted while accessing the "gpu" debugfs file and
the drm device struct_mutex is contended, release() could return early
and fail to free related resources.

Note that the return value from release() is ignored.

Fixes: 4f776f4511 ("drm/msm/gpu: Convert the GPU show function to use the GPU state")
Cc: stable <stable@vger.kernel.org>     # 4.18
Cc: Jordan Crouse <jcrouse@codeaurora.org>
Cc: Rob Clark <robdclark@gmail.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Sean Paul <seanpaul@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20191010131333.23635-2-johan@kernel.org
2019-11-13 15:34:15 -05:00
Dave Airlie
77e0723bd2 Merge v5.4-rc7 into drm-next
We have the i915 security fixes to backmerge, but first
let's clear the decks for other drivers to avoid a bigger
mess.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2019-11-14 05:53:10 +10:00
Gwan-gyeong Mun
789c4aea3f drm/i915: Split a setting of MSA to MST and SST
The setting of MSA is done by the DDI .pre_enable() hook. And when we are
using MST, the MSA is only set to first mst stream by calling of
DDI .pre_eanble() hook. It raies issues to non-first mst streams.
Wrong MSA or missed MSA packets might show scrambled screen or wrong
screen.

This splits a setting of MSA to MST and SST cases. And In the MST case it
will call a setting of MSA after an allocating of Virtual Channel from
MST encoder pre_enable callback.

Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112212
Fixes: 0c06fa1560 ("drm/i915/dp: Add support of BT.2020 Colorimetry to DP MSA")
Fixes: d4a415dcda ("drm/i915: Fix MST oops due to MSA changes")
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106212636.502471-1-gwan-gyeong.mun@intel.com
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
[vsyrjala: nuke spurious newline]
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
(cherry picked from commit bd8c9cca88)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191113125241.20547-1-ville.syrjala@linux.intel.com
2019-11-13 15:00:10 +02:00
Chris Wilson
98ae6fb3f1 drm/i915/execlists: Move reset_active() from schedule-out to schedule-in
The gem_ctx_persistence/smoketest was detecting an odd coherency issue
inside the LRC context image; that the address of the ring buffer did
not match our associated struct intel_ring. As we set the address into
the context image when we pin the ring buffer into place before the
context is active, that leaves the question of where did it get
overwritten. Either the HW context save occurred after our pin which
would imply that our idle barriers are broken, or we overwrote the
context image ourselves. It is only in reset_active() where we dabble
inside the context image outside of a serialised path from schedule-out;
but we could equally perform the operation inside schedule-in which is
then fully serialised with the context pin -- and remains serialised by
the engine pulse with kill_context(). (The only downside, aside from
doing more work inside the engine->active.lock, was the plan to merge
all the reset paths into doing their context scrubbing on schedule-out
needs more thought.)

Fixes: d12acee84f ("drm/i915/execlists: Cancel banned contexts on schedule-out")
Testcase: igt/gem_ctx_persistence/smoketest
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-3-chris@chris-wilson.co.uk
(cherry picked from commit 31b61f0ef9)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-13 12:26:35 +02:00
Chris Wilson
cee7fb437e drm/i915/userptr: Try to acquire the page lock around set_page_dirty()
set_page_dirty says:

	For pages with a mapping this should be done under the page lock
	for the benefit of asynchronous memory errors who prefer a
	consistent dirty state. This rule can be broken in some special
	cases, but should be better not to.

Under those rules, it is only safe for us to use the plain set_page_dirty
calls for shmemfs/anonymous memory. Userptr may be used with real
mappings and so needs to use the locked version (set_page_dirty_lock).

However, following a try_to_unmap() we may want to remove the userptr and
so call put_pages(). However, try_to_unmap() acquires the page lock and
so we must avoid recursively locking the pages ourselves -- which means
that we cannot safely acquire the lock around set_page_dirty(). Since we
can't be sure of the lock, we have to risk skip dirtying the page, or
else risk calling set_page_dirty() without a lock and so risk fs
corruption.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=203317
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112012
Fixes: 5cc9ed4b9a ("drm/i915: Introduce mapping of user pages into video memory (userptr) ioctl")
References: cb6d7c7dc7 ("drm/i915/userptr: Acquire the page lock around set_page_dirty()")
References: 505a8ec7e1 ("Revert "drm/i915/userptr: Acquire the page lock around set_page_dirty()"")
References: 6dcc693bc5 ("ext4: warn when page is dirtied without buffers")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111133205.11590-1-chris@chris-wilson.co.uk
(cherry picked from commit 0d4bbe3d40)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-12 10:22:58 +02:00
Chris Wilson
a7d87b70d6 drm/i915/pmu: "Frequency" is reported as accumulated cycles
We report "frequencies" (actual-frequency, requested-frequency) as the
number of accumulated cycles so that the average frequency over that
period may be determined by the user. This means the units we report to
the user are Mcycles (or just M), not MHz.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191109105356.5273-1-chris@chris-wilson.co.uk
(cherry picked from commit e88866ef02)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-12 10:22:46 +02:00
Chris Wilson
d231c15aff drm/i915: Protect context while grabbing its name for the request
Inside print_request(), we query the context/timeline name. Nothing
immediately protects the context from being freed if the request is
complete -- we rely on serialisation by the caller to keep the name
valid until they finish using it. Inside intel_engine_dump(), we
generally only print the requests in the execution queue protected by the
engine->active.lock, but we also show the pending execlists ports which
are not protected and so require a rcu_read_lock to keep the pointer
valid.

[ 1695.700883] BUG: KASAN: use-after-free in i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.700981] Read of size 8 at addr ffff8887344f4d50 by task gem_ctx_persist/2968
[ 1695.701068]
[ 1695.701156] CPU: 1 PID: 2968 Comm: gem_ctx_persist Tainted: G     U            5.4.0-rc6+ #331
[ 1695.701246] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[ 1695.701334] Call Trace:
[ 1695.701424]  dump_stack+0x5b/0x90
[ 1695.701870]  ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.701964]  print_address_description.constprop.7+0x36/0x50
[ 1695.702408]  ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702856]  ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702947]  __kasan_report.cold.10+0x1a/0x3a
[ 1695.703390]  ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.703836]  i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.704241]  print_request+0x82/0x2e0 [i915]
[ 1695.704638]  ? fwtable_read32+0x133/0x360 [i915]
[ 1695.705042]  ? write_timestamp+0x110/0x110 [i915]
[ 1695.705133]  ? _raw_spin_lock_irqsave+0x79/0xc0
[ 1695.705221]  ? refcount_inc_not_zero_checked+0x91/0x110
[ 1695.705306]  ? refcount_dec_and_mutex_lock+0x50/0x50
[ 1695.705709]  ? intel_engine_find_active_request+0x202/0x230 [i915]
[ 1695.706115]  intel_engine_dump+0x2c9/0x900 [i915]

Fixes: c36eebd9ba ("drm/i915/gt: execlists->active is serialised by the tasklet")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-1-chris@chris-wilson.co.uk
(cherry picked from commit fecffa4668)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-12 10:22:41 +02:00
Lionel Landwerlin
2b3c7f0db8 drm/i915/perf: always consider holding preemption a privileged op
The ordering of the checks in the existing code can lead to holding
preemption not being considered as privileged op.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 9cd20ef780 ("drm/i915/perf: allow holding preemption on filtered ctx")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111095308.2550-1-lionel.g.landwerlin@intel.com
(cherry picked from commit 0b0120d4c7)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-12 10:22:37 +02:00
Ben Hutchings
ea0b163b13 drm/i915/cmdparser: Fix jump whitelist clearing
When a jump_whitelist bitmap is reused, it needs to be cleared.
Currently this is done with memset() and the size calculation assumes
bitmaps are made of 32-bit words, not longs.  So on 64-bit
architectures, only the first half of the bitmap is cleared.

If some whitelist bits are carried over between successive batches
submitted on the same context, this will presumably allow embedding
the rogue instructions that we're trying to reject.

Use bitmap_zero() instead, which gets the calculation right.

Fixes: f8c08d8fae ("drm/i915/cmdparser: Add support for backward jumps")
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com>
2019-11-11 08:13:49 -08:00
Ville Syrjälä
f77021372e drm/i915: Preload LUTs if the hw isn't currently using them
The LUTs are single buffered so in order to program them without
tearing we'd have to do it during vblank (actually to be 100%
effective it has to happen between start of vblank and frame start).
We have no proper mechanism for that at the moment so we just
defer loading them after the vblank waits have happened. That
is not quite sufficient (especially when committing multiple pipes
whose vblanks don't line up) so the LUT load will often leak into
the following frame causing tearing.

However in case the hardware wasn't previously using the LUT we
can preload it before setting the enable bit (which is double
buffered so won't tear). Let's determine if we can do such
preloading and make it happen. Slight variation between the
hardware requires some platforms specifics in the checks.

Hans is seeing ugly colored flash on VLV/CHV macchines (GPD win
and Asus T100HA) when the gamma LUT gets loaded for the first
time as the BIOS has left some junk in the LUT memory.

v2: Deal with uapi vs. hw crtc state split
    s/GCM/CGM/ typo fix

Cc: Hans de Goede <hdegoede@redhat.com>
Fixes: 051a6d8d3c ("drm/i915: Move LUT programming to happen after vblank waits")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191030190815.7359-1-ville.syrjala@linux.intel.com
Tested-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
(cherry picked from commit 0ccc42a2fd)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-11 11:44:43 +02:00
Ville Syrjälä
aeec766133 drm/i915: Don't oops in dumb_create ioctl if we have no crtcs
Make sure we have a crtc before probing its primary plane's
max stride. Initially I thought we can't get this far without
crtcs, but looks like we can via the dumb_create ioctl.

Not sure if we shouldn't disable dumb buffer support entirely
when we have no crtcs, but that would require some amount of work
as the only thing currently being checked is dev->driver->dumb_create
which we'd have to convert to some device specific dynamic thing.

Cc: stable@vger.kernel.org
Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Fixes: aa5ca8b742 ("drm/i915: Align dumb buffer stride to 4k to allow for gtt remapping")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106172349.11987-1-ville.syrjala@linux.intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
(cherry picked from commit baea9ffe64)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-11 10:30:12 +02:00
Chris Wilson
3cac195875 drm/i915: Leave the aliasing-ppgtt size alone
The hidden aliasing-ppgtt's size is never revealed, as we only inspect
the front GTT when engaged. However, we were "fixing" the hidden ppgtt
to match, with the net result that we ended up leaking the unused
portion on Braswell were we preallocated the entire set of top level
PDP, see gen8_preallocate_top_level_pdp().

[   26.025364] DMA-API: pci 0000:00:02.0: device driver has pending DMA allocations while released from device [count=2]
[   26.025364] One of leaked entries details: [device address=0x0000000230778000] [size=4096 bytes] [mapped with DMA_BIDIRECTIONAL] [mapped as single]
[   26.025683] WARNING: CPU: 0 PID: 415 at kernel/dma/debug.c:894 dma_debug_device_change+0x1a4/0x1f0
[   26.025905] Modules linked in: i915(E-) intel_powerclamp(E) nls_ascii(E) nls_cp437(E) crct10dif_pclmul(E) crc32_pclmul(E) vfat(E) crc32c_intel(E) fat(E) ghash_clmulni_intel(E) prime_numbers(E) intel_gtt(E) i2c_algo_bit(E) efi_pstore(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) fb_sys_fops(E) evdev(E) drm(E) aesni_intel(E) glue_helper(E) crypto_simd(E) cryptd(E) intel_cstate(E) sg(E) efivars(E) pcspkr(E) video(E) button(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) sd_mod(E) lpc_ich(E) ahci(E) mfd_core(E) i2c_i801(E) libahci(E) i2c_designware_pci(E) i2c_designware_core(E)
[   26.026613] CPU: 0 PID: 415 Comm: rmmod Tainted: G            E     5.4.0-rc6+ #25
[   26.026837] Hardware name:  /, BIOS PYBSWCEL.86A.0027.2015.0507.1758 05/07/2015
[   26.027080] RIP: 0010:dma_debug_device_change+0x1a4/0x1f0
[   26.027319] Code: 89 54 24 08 e8 ad 60 62 00 48 8b 54 24 08 48 89 c6 41 57 4d 89 e9 49 89 d8 44 89 f1 41 54 48 c7 c7 e0 61 06 82 e8 c1 aa f5 ff <0f> 0b 5a 59 48 83 3c 24 00 0f 85 97 26 00 00 8b 05 77 47 92 01 85
[   26.027600] RSP: 0018:ffff888228d2fcc8 EFLAGS: 00010282
[   26.027831] RAX: 0000000000000000 RBX: 0000000230778000 RCX: 0000000000000000
[   26.028053] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffed10451a5f8f
[   26.028279] RBP: ffff88823480c0b0 R08: 0000000000000001 R09: ffffed1046e83eb1
[   26.028500] R10: ffffed1046e83eb0 R11: ffff88823741f587 R12: ffffffff82067340
[   26.028725] R13: 0000000000001000 R14: 0000000000000002 R15: ffffffff82067480
[   26.028952] FS:  00007fdf3ed174c0(0000) GS:ffff888237400000(0000) knlGS:0000000000000000
[   26.029185] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   26.029405] CR2: 000055e211109030 CR3: 0000000230139000 CR4: 00000000001006f0
[   26.029622] Call Trace:
[   26.029846]  notifier_call_chain+0x67/0xa0
[   26.030076]  blocking_notifier_call_chain+0x5a/0x80
[   26.030305]  device_release_driver_internal+0x20d/0x260
[   26.030535]  driver_detach+0x7b/0xe1
[   26.030761]  bus_remove_driver+0x8c/0x153
[   26.030993]  pci_unregister_driver+0x2d/0xf0
[   26.032603]  i915_exit+0x16/0x1c [i915]

Reported-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Fixes: 1eda701eac ("drm/i915/gtt: Recursive cleanup for gen8")
References: c082afac86 ("drm/i915: Move aliasing_ppgtt underneath its i915_ggtt")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106221223.7437-1-chris@chris-wilson.co.uk
(cherry picked from commit 2b0a4fc25a)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-11 10:30:02 +02:00
Jani Nikula
56a327f983 drm/i915/display: only include intel_dp_link_training.h where needed
The intel_dp_link_training.h include has no need or place in
intel_display.h. Include it in intel_display.c instead.

Cc: Manasi Navare <manasi.d.navare@intel.com>
Fixes: eadf6f9170 ("drm/i915/display/icl: Enable master-slaves in trans port sync")
Reviewed-by: Manasi Navare <manasi.d.navare@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029103947.7535-1-jani.nikula@intel.com
(cherry picked from commit 3c954c418e)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-11 10:29:57 +02:00
Chris Wilson
6300c66372 drm/i915/gem: Fix error path to unlock if the GEM context is closed
When inside the lock, remember to unlock even if you want to leave
early.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: a4e7ccdac3 ("drm/i915: Move context management under GEM")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106144155.25727-1-chris@chris-wilson.co.uk
(cherry picked from commit feba2b8146)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-11 10:29:52 +02:00
Chris Wilson
d29926fa5f drm/i915/gt: Only drop heartbeat.systole if the sole owner
Mika spotted that only using cancel_delayed_work() could mean that we
attempted to clear the heartbeat.systole while the worker was still
running. Rectify the situation by only touching the systole from outside
the worker if we suceeded in cancelling the worker before it could run.
The worker is expected to clean up by itself upon idling.

Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Fixes: 058179e72e ("drm/i915/gt: Replace hangcheck by heartbeats")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106133129.17732-1-chris@chris-wilson.co.uk
(cherry picked from commit 841e867286)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2019-11-11 10:29:47 +02:00
Alex Deucher
53dbc27ad5 drm/amdgpu/powerplay: fix AVFS handling with custom powerplay table
When a custom powerplay table is provided, we need to update
the OD VDDC flag to avoid AVFS being enabled when it shouldn't be.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205393
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-08 12:30:24 -05:00
Kenneth Feng
558491dda0 drm/amd/powerplay: dynamically disable ds and ulv for compute
This is to improve the performance in the compute mode
for vega10. For example, the original performance for a rocm
bandwidth test: 2G internal GPU copy, is about 99GB/s.
With the idle power features disabled dynamically, the porformance
is promoted to about 215GB/s.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-08 12:30:17 -05:00
Evan Quan
875dc7c4ff drm/amd/powerplay: correct Arcturus OD support
OD is not supported on Arcturus. Thus the
pp_od_clk_voltage sysfs interface is also not supported.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-08 12:30:09 -05:00
changzhu
eebc7f4d7f drm/amdgpu: allow direct upload save restore list for raven2
It will cause modprobe atombios stuck problem in raven2 if it doesn't
allow direct upload save restore list from gfx driver.
So it needs to allow direct upload save restore list for raven2
temporarily.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-08 12:29:52 -05:00
Dave Airlie
3ca3a9eab7 Merge tag 'drm-misc-next-fixes-2019-11-06' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
msm: Fix up a6xx debugbus register names (Sharat)
mst: Avoid u64 division (Sean)

Cc: Sharat Masetty <smasetty@codeaurora.org>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106202730.GA199896@art_vandelay
2019-11-08 16:48:22 +10:00
Dave Airlie
23aae183ff Merge tag 'drm-intel-next-fixes-2019-11-07' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
One RCU fix and fix for suspend GEM_BUG_ON (with dependencies).

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107145058.GA17401@jlahtine-desk.ger.corp.intel.com
2019-11-08 13:59:02 +10:00
Dave Airlie
393fdfdb4a Merge tag 'mediatek-drm-next-5.5-2' of https://github.com/ckhu-mediatek/linux.git-tags into drm-next
Mediatek DRM next for Linux 5.5 - 2

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: CK Hu <ck.hu@mediatek.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1573093419.13645.5.camel@mtksdaap41
2019-11-08 13:19:55 +10:00
Dave Airlie
ff9234583d Merge tag 'drm-fixes-5.4-2019-11-06' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
drm-fixes-5.4-2019-11-06:

amdgpu:
- Fix navi14 display issue root cause and revert workaround
- GPU reset scheduler interaction fix
- Fix fan boost on multi-GPU
- Gfx10 and sdma5 fixes for navi
- GFXOFF fix for renoir
- Add navi14 PCI ID
- GPUVM fix for arcturus

radeon:
- Port an SI power fix from amdgpu

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107032241.1021217-1-alexander.deucher@amd.com
2019-11-08 13:07:58 +10:00
Dave Airlie
67322bec97 Merge tag 'drm-intel-fixes-2019-11-06' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Fix HPD poll to avoid kworker consuming a lot of cpu cycles.
- Do not use TBT type for non Type-C ports.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191106213958.GA16525@intel.com
2019-11-08 13:07:44 +10:00
Dave Airlie
72d74a06e1 Merge tag 'drm-misc-fixes-2019-11-07-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
- Some new documentation for GEM shmem madvise helpers
 - Fix for a state dereference in atomic self-refresh helpers
 - One compilation fix for c2p fbdev helpers

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107082215.GA34850@gilmour.lan
2019-11-08 12:12:57 +10:00
Andrey Grodzovsky
a28fda312a drm/amdgpu: Avoid accidental thread reactivation.
Problem:
During GPU reset we call the GPU scheduler to suspend it's
thread, those two functions in amdgpu also suspend and resume
the sceduler for their needs but this can collide with GPU
reset in progress and accidently restart a suspended thread
before time.

Fix:
Serialize with GPU reset.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Andrey Grodzovsky
2b6f717c33 drm/sched: Avoid job cleanup if sched thread is parked.
When the sched thread is parked we assume ring_mirror_list is
not accessed from here.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Andrey Grodzovsky
7c55adb0a9 Revert "drm/amdgpu: dont schedule jobs while in reset"
This reverts commit 89b3d86403.

We will do a proper fix in next patch.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Andrey Grodzovsky
83a7772ba2 drm/sched: Use completion to wait for sched->thread idle v2.
Removes thread park/unpark hack from drm_sched_entity_fini and
by this fixes reactivation of scheduler thread while the thread
is supposed to be stopped.

v2: Per sched entity completion.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Jonathan Kim
cb5932f866 drm/amdgpu: fix vega20 pstate status change
vega20 only requires all devices be set to same pstate level for low
pstate and not high.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Evan Quan <Evan.Quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Kevin Wang
2af8153126 drm/amdgpu: fix sysfs interface pcie_replay_count error on navi asic
the asic callback function of get_pcie_replay_count is not implement on navi asic,
it will cause null pinter error when read this interface.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kent Russell <kent.russell@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Emily Deng
e31dcdcfab drm/amdgpu: Need to disable msix when unloading driver
For driver reload test, it will report "can't enable
MSI (MSI-X already enabled)".

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Oak Zeng
f6baa07497 drm/amdgpu: Add comments to gmc structure
Explain fields like aper_base, agp_start etc. The definition
of those fields are confusing as they are from different view
(CPU or GPU). Add comments for easier understand.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <Alex.Deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-07 18:08:07 -05:00
Alex Deucher
2c409ba81b drm/radeon: fix si_enable_smc_cac() failed issue
Need to set the dte flag on this asic.

Port the fix from amdgpu:
5cb818b861 ("drm/amd/amdgpu: fix si_enable_smc_cac() failed issue")

Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2019-11-06 22:06:23 -05:00
Alex Deucher
77a3160221 drm/amdgpu/renoir: move gfxoff handling into gfx9 module
To properly handle the option parsing ordering.

Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 22:06:23 -05:00
changzhu
440a7a54e7 drm/amdgpu: add warning for GRBM 1-cycle delay issue in gfx9
It needs to add warning to update firmware in gfx9
in case that firmware is too old to have function to
realize dummy read in cp firmware.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 22:06:23 -05:00
changzhu
589b64a7e3 drm/amdgpu: add dummy read by engines for some GCVM status registers in gfx10
The GRBM register interface is now capable of bursting 1 cycle per
register wr->wr, wr->rd much faster than previous muticycle per
transaction done interface.  This has caused a problem where
status registers requiring HW to update have a 1 cycle delay, due
to the register update having to go through GRBM.

For cp ucode, it has realized dummy read in cp firmware.It covers
the use of WAIT_REG_MEM operation 1 case only.So it needs to call
gfx_v10_0_wait_reg_mem in gfx10. Besides it also needs to add warning to
update firmware in case firmware is too old to have function to realize
dummy read in cp firmware.

For sdma ucode, it hasn't realized dummy read in sdma firmware. sdma is
moved to gfxhub in gfx10. So it needs to add dummy read in driver
between amdgpu_ring_emit_wreg and amdgpu_ring_emit_reg_wait for sdma_v5_0.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 22:06:23 -05:00
Evan Quan
6a299d7aaa drm/amdgpu: register gpu instance before fan boost feature enablment
Otherwise, the feature enablement will be skipped due to wrong count.

Fixes: beff74bc6e ("drm/amdgpu: fix a race in GPU reset with IB test (v2)")
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 22:06:23 -05:00
Kevin Wang
38264de0dc drm/amd/swSMU: fix smu workload bit map error
fix workload bit (WORKLOAD_PPLIB_COMPUTE_BIT) map error
on vega20 and navi asic.

fix commit:
drm/amd/powerplay: add function get_workload_type_map for swsmu

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 22:06:23 -05:00
Alex Deucher
ef177d11d6 drm/amdgpu: Improve RAS documentation (v2)
Clarify some areas, clean up formatting, add section for
unrecoverable error handling.

v2: fix grammatical errors

Reviewed-by: Yong Zhao <yong.zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2019-11-06 16:27:48 -05:00