- Add CWSR (compute wave save restore) support for GFX8 (Carrizo)
- Fix SDMA user-mode queues support for GFX7 (Kaveri)
- Add SDMA user-mode queues support for GFX8 (Carrizo)
- Allow HWS (hardware scheduling) to schedule multiple processes concurrently
- Add debugfs support
- Simplify process locking and lock dependencies
- Refactoring topology code to prepare for dGPU support + fixes to that code
- Add option to generate dummy/virtual CRAT table when its missing or deformed
- Recognize CPUs other then APUs as compute entities
- Various clean ups and bug fixes
I have not yet sent the dGPU topology code because it depends on a patch
for the PCI subsystem that adds PCIe atomics support. Once that patch is
upstreamed we can continue with the rest of the dGPU code.
* tag 'drm-amdkfd-next-2017-12-24' of git://people.freedesktop.org/~gabbayo/linux: (53 commits)
drm/amdgpu: Add support for reporting VRAM usage
drm/amdkfd: Ignore ACPI CRAT for non-APU systems
drm/amdkfd: Module option to disable CRAT table
drm/amdkfd: Add AQL Queue Memory flag on topology
drm/amdkfd: Fixup incorrect info in the CZ CRAT table
drm/amdkfd: Add perf counters to topology
drm/amdkfd: Add topology support for dGPUs
drm/amdkfd: Add topology support for CPUs
drm/amdkfd: Fix sibling_map[] size
drm/amdkfd: Simplify counting of memory banks
drm/amdkfd: Turn verbose topology messages into pr_debug
drm/amdkfd: sync IOLINK defines to thunk spec
drm/amdkfd: Support enumerating non-GPU devices
drm/amdkfd: Decouple CRAT parsing from device list update
drm/amdkfd: Reorganize CRAT fetching from ACPI
drm/amdkfd: Group up CRAT related functions
drm/amdkfd: Fix memory leaks in kfd topology
drm/amdkfd: Topology: Fix location_id
drm/amdkfd: Update number of compute unit from KGD
drm/amd: Remove get_vmem_size from KGD-KFD interface
...
- Allow internal page allocation to fail (Chris)
- More improvements on logs, dumps, and trace (Chris, Michal)
- Coffee Lake important fix for stolen memory (Lucas)
- Continue to make GPU reset more robust as well
improving selftest coverage for it (Chris)
- Unifying debugfs return codes (Michal)
- Using existing helper for testing obj pages (Matthew)
- Organize and improve gem_request tracepoints (Lionel)
- Protect DDI port to DPLL map from theoretical race (Rodrigo)
- ... and consequently fixing the indentation on this DDI clk selection function (Chris)
- ... and consequently properly serializing non-blocking modesets (Ville)
- Add support for horizontal plane flipping on Cannonlake (Joonas)
- Two Cannonlake Workarounds for better stability (Rafael)
- Fix mess around PSR registers (DK)
- More Coffee Lake PCI IDs (Rodrigo)
- Remove CSS modifiers on pipe C of Geminilake (Krisman)
- Disable all planes for load detection (Ville)
- Reorg on i915 display headers (Michal)
- Avoid enabling movntdqa optimization on hypervisor guest (Changbin)
GVT:
- more mmio switch optimization (Weinan)
- cleanup i915_reg_t vs. offset usage (Zhenyu)
- move write protect handler out of mmio handler (Zhenyu)
* tag 'drm-intel-next-2017-12-22' of git://anongit.freedesktop.org/drm/drm-intel: (55 commits)
drm/i915: Update DRIVER_DATE to 20171222
drm/i915: Show HWSP in intel_engine_dump()
drm/i915: Assert that the request is on the execution queue before being removed
drm/i915/execlists: Show preemption progress in GEM_TRACE
drm/i915: Put all non-blocking modesets onto an ordered wq
drm/i915: Disable GMBUS clock gating around GMBUS transfers on gen9+
drm/i915: Clean up the PNV bit banging vs. GMBUS clock gating w/a
drm/i915: No need to power up PG2 for GMBUS on BXT
drm/i915: Disable DC states around GMBUS on GLK
drm/i915: Do not enable movntdqa optimization in hypervisor guest
drm/i915: Dump device info at once
drm/i915: Add pretty printer for runtime part of intel_device_info
drm/i915: Update intel_device_info_runtime_init() parameter
drm/i915: Move intel_device_info definitions to its own header
drm/i915: Move opregion definitions to dedicated intel_opregion.h
drm/i915: Move display related definitions to dedicated header
drm/i915: Move some utility functions to i915_util.h
drm/i915/gvt: move write protect handler out of mmio emulation function
drm/i915/gvt: cleanup usage for typed mmio reg vs. offset
drm/i915/gvt: Fix pipe A enable as default for vgpu
...
Looking at a CI failure with an ominous line of
[ 362.550715] hangcheck current seqno ffffff6b, last ffffff8c, hangcheck ffffff6b [6016 ms], inflight 118
with no apparent cause for the seqno to be negative, left me wondering
if someone had scribbled over the HWSP. So include the HWSP in the
engine dump to see if there are more signs of random scribbling.
v2: Fix row pointer, i is now incremented by 8 so doesn't need scaling
by 8, and we don't need to keep volatile here as the status_page isn't
marked up as volatile itself.
v3: Use hexdump, with suppression of identical lines. (Tvrtko)
Which results in
HWSP:
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
*
00000040 00000001 00000000 00000018 00000002 00000001 00000000 00000018 00000000
00000060 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000003
00000080 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
*
000000c0 00000002 00000000 00000000 00000000 00000000 00000000 00000000 00000000
000000e0 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
*
instead of 128 lines of mostly 0s.
v4: Tidy up the locals
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171222182521.18106-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
We should only attempt to remove requests from the execution queue that
are on the execution queue. These are the requests that have been
assigned a global_seqno, so we can assert that we only attempt to remove
requests with a nonzero global_seqno. Afterwards we assert that we
remove them in order, i.e. the global_seqno matches the engine's seqno,
but that leaves a small loophole for an unattached request on an unused
engine.
We can then make the same assertion on queuing the request to the
execution engine, it must have a zero global_seqno or else we are queuing
the same request twice.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171222141959.3006-1-chris@chris-wilson.co.uk
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
We have plenty of global registers and whatnot programmed without
any further locking by the modeset code. Currently non-bocking
modesets are allowed to execute in parallel which could corrupt
said registers.
To avoid the problem let's run all non-blocking modesets on an
ordered workqueue. We still put page flips etc. to system_unbound_wq
allowing page flips on one pipe to execute in parallel with page flips
or a modeset on a another pipe (assuming no known state is shared
between them, at which point they would have been added to the same
atomic commit and serialized that way).
Blocking modesets are already serialized with each other by
connection_mutex, and thus are safe. To serialize them with
non-blocking modesets we just flush the workqueue before executing
blocking modesets.
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Fixes: 94f050246b ("drm/i915: nonblocking commit")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20171113133622.8593-1-ville.syrjala@linux.intel.com
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Gen9+ need to disable GMBUS clock gating when doing multi part
transfers. Otherwise clock gating will kick in when GMBUS is in
the WAIT state and presumably that will corrupt the transfer.
This is documented as Display WA #0868.
Apparently older hardware doesn't allow clock gating in the WAIT
state and thus are unaffected by this problem.
v2: Limit the PCH w/a to gen9 and gen10 only (DK)
Actually change it to check the PCH type instead since
it's the PCH that actually contains the GMBUS hardware
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20171221202432.17373-1-ville.syrjala@linux.intel.com
Our QA reported a problem caused by movntdqa instructions. Currently,
the KVM hypervisor doesn't support VEX-prefix instructions emulation.
If users passthrough a GPU to guest with vfio option 'x-no-mmap=on',
then all access to the BARs will be trapped and emulated. The KVM
hypervisor would raise an inertal error to qemu which cause the guest
killed. (Since 'movntdqa' ins is not supported.)
This patch try not to enable movntdqa optimization if the driver is
running in hypervisor guest.
Signed-off-by: Changbin Du <changbin.du@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1513924309-3113-1-git-send-email-changbin.du@intel.com
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
It's a bit confusing that page write protect handler is live in
mmio emulation handler. This moves it to stand alone gvt ops.
Also remove unnecessary check of write protected page access
in mmio read handler and cleanup handling of failsafe case.
v2: rebase
Reviewed-by: Xiong Zhang <xiong.y.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
We had previous hack that tried to accept either i915_reg_t or offset
value to access vGPU virtual/shadow regs which broke that purpose to
be type safe in context. This one trys to explicitly separate the usage
of typed mmio reg with real offset.
Old vgpu_vreg(offset) helper is used only for offset now with new
vgpu_vreg_t(reg) is used for i915_reg_t only. Convert left usage
of that to new helper.
Also fixed left KASAN warning issues caused by previous hack.
v2: rebase, fixup against recent mmio switch change
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
observed igt drv_module_reload test case failure on 4.15.0
rc2 kernel with panic due to no active pipe available.
the gpu will reset during unload/load and make pipe config reg
lost which can cause kernel panic issue happen.
this patch is to move pipe enabling to emulate_mointor_status_chagne
to handle vgpu reset case as well.
Fixes: 7e60590208 ("drm/i915/gvt: enabled pipe A default on creating vgpu")
Signed-off-by: Xiaolin Zhang <xiaolin.zhang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
(cherry picked from commit f5f00e7dcc)
Always requires properly defined i915_reg_t type for MMIO handler
definition.
Fix kasan warning of "drivers/gpu/drm/i915/gvt/handlers.c:2397:1: error: the frame size of 32120 bytes is larger than 8192 bytes"
Reviewed-by: Zhi Wang <zhi.a.wang@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
drm-misc-next for 4.16:
Core Changes:
- mostly doc updates and some fbdev improvements
* tag 'drm-misc-next-2017-12-21' of git://anongit.freedesktop.org/drm/drm-misc:
drm/framebuffer: Print task that allocated the fb in debug info.
drm/fb-helper: Add drm_fb_helper_defio_init()
drm/fb-helper: Update DOC with new helpers
drm/docs: Add todo entry for drm_fb_helper_fbdev_setup()
drm/fb-helper: Add drm_fb_helper_fbdev_setup/teardown()
drm/fb-helper: Set/clear dev->fb_helper in dummy init/fini
drm/stm: ltdc: Remove unnecessary platform_get_resource() error check
drm/stm: dsi: Remove unnecessary platform_get_resource() error check
drm/doc: Move legacy kms helpers to the very end
drm/atomic: document how to handle driver private objects
drm/syncobj: some kerneldoc polish
drm/print: Unconfuse kerneldoc
drm/edid: kerneldoc for is_hdmi2_sink
* 'drm-next-4.16' of git://people.freedesktop.org/~agd5f/linux: (171 commits)
drm/amdgpu: fix test for shadow page tables
drm/amd/display: Expose dpp1_set_cursor_attributes
drm/amd/display: Update FMT and OPPBUF functions
drm/amd/display: check for null before calling is_blanked
drm/amd/display: dal 3.1.27
drm/amd/display: Fix unused variable warnings.
drm/amd/display: Only blank DCN when we have set_blank implementation
drm/amd/display: Put dcn_mi_registers with other structs
drm/amd/display: hubp refactor
drm/amd/display: integrating optc pseudocode
drm/amd/display: Call validate_fbc should_enable_fbc
drm/amd/display: Clean up DCN cursor code
drm/amd/display: fix 180 full screen pipe split
drm/amd/display: reprogram surface config on scaling change
drm/amd/display: Remove dwbc from pipe_ctx
drm/amd/display: Use the maximum link setting which EDP reported.
drm/amd/display: Add hdr_supported flag
drm/amd/display: fix global sync param retrieval when not pipe splitting
drm/amd/display: Update HUBP
drm/amd/display: fix rotated surface scaling
...
omapdrm changes for v4.16
* support memory bandwidth limits
* DSI command mode panel cleanups for N9
* DMM error handling
* tag 'omapdrm-4.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tomba/linux: (27 commits)
drm: omapdrm: Simplify platform registration
drm: omapdrm: Remove filename from header and fix copyright tag
drm/omap: DMM: Check for DMM readiness after successful transaction commit
drm/omap: DMM: Print information if we received an error interrupt
drm/omap: DMM: In case of error/timeout in wait_status() print the reason
drm/omap: DMM: Fix DMM_IRQSTAT_ERR_MASK definition
drm: omapdrm: Deconstruct the omap_drv.h header.
drm: omapdrm: venc: Return error code on OF parsing failure
drm: omapdrm: dpi: Remove dpi_data port_initialized field
drm: omapdrm: dss: Make dss_dump_clocks() function static
drm: omapdrm: dss: Set the DMA coherent mask
drm: omapdrm: Remove unused omap_dss_find_device() function
drm: omapdrm: Pass drm_device to omap_gem_resume()
drm: omapdrm: dpi: Don't treat GPIO probe deferral as an error
drm/omap: panel-dsi-cm: switch to gpiod
drm/omap: panel-dsi-cm: add external backlight support
drm/omap: panel-dsi-cm: add physical size support
drm/omap: panel-dsi-cm: add regulator support
drm/omap: panel-dsi-cm: fix driver
drm/omap: add support for physical size hints from display drivers
...
drm/tilcdc changes for 4.16
* tag 'tilcdc-4.16' of https://github.com/jsarha/linux:
drm/tilcdc: make tilcdc_mode_hvtotal() static
drm/tilcdc: Remove drm_framebuffer_get() and *_put() calls
drm/tilcdc: ensure nonatomic iowrite64 is not used
- Fix documentation build issues (Randy, Markus)
- Fix timestamp frequency calculation for perf on CNL (Lionel)
- New DMC firmware for Skylake (Anusha)
- GTT flush fixes and other GGTT write track and refactors (Chris)
- Taint kernel when GPU reset fails (Chris)
- Display workarounds organization (Lucas)
- GuC and HuC initialization clean-up and fixes (Michal)
- Other fixes around GuC submission (Michal)
- Execlist clean-ups like caching ELSP reg offset and improving log readability (Chri\
s)
- Many other improvements on our logs and dumps (Chris)
- Restore GT performance in headless mode with DMC loaded (Tvrtko)
- Stop updating legacy fb parameters since FBC is not using anymore (Daniel)
- More selftest improvements (Chris)
- Preemption fixes and improvements (Chris)
- x86/early-quirks improvements for Intel graphics stolen memory. (Joonas, Matthew)
- Other improvements on Stolen Memory code to be resource centric. (Matthew)
- Improvements and fixes on fence allocation/release (Chris).
GVT:
- fixes for two coverity scan errors (Colin)
- mmio switch code refine (Changbin)
- more virtual display dmabuf fixes (Tina/Gustavo)
- misc cleanups (Pei)
- VFIO mdev display dmabuf interface and gvt support (Tina)
- VFIO mdev opregion support/fixes (Tina/Xiong/Chris)
- workload scheduling optimization (Changbin)
- preemption fix and temporal workaround (Zhenyu)
- and misc fixes after refactor (Chris)
* tag 'drm-intel-next-2017-12-14' of git://anongit.freedesktop.org/drm/drm-intel: (87 commits)
drm/i915: Update DRIVER_DATE to 20171214
drm/i915: properly init lockdep class
drm/i915: Show engine state when hangcheck detects a stall
drm/i915: make CS frequency read support missing more obvious
drm/i915/guc: Extract doorbell verification into a function
drm/i915/guc: Extract clients allocation to submission_init
drm/i915/guc: Extract doorbell creation from client allocation
drm/i915/guc: Call invalidate after changing the vfunc
drm/i915/guc: Extract guc_init from guc_init_hw
drm/i915/guc: Move GuC workqueue allocations outside of the mutex
drm/i915/guc: Move shared data allocation away from submission path
drm/i915: Unwind i915_gem_init() failure
drm/i915: Ratelimit request allocation under oom
drm/i915: Allow fence allocations to fail
drm/i915: Mark up potential allocation paths within i915_sw_fence as might_sleep
drm/i915: Don't check #active_requests from i915_gem_wait_for_idle()
drm/i915/fence: Use rcu to defer freeing of irq_work
drm/i915: Dump the engine state before declaring wedged from wait_for_engines()
drm/i915: Bump timeout for wait_for_engines()
drm/i915: Downgrade misleading "Memory usable" message
...
validate_fbc never fails a modeset. It's simply used to decide whether
to use FBC or not. Calling it validate_fbc might be confusing to some so
rename it to should_enable_fbc.
With that let's also remove the DC_STATUS return code and return bool
and make enable_fbc a void function since we never check it's return
value and probably never want to anyways.
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When plane size changes, we need to reprogram surface pitch in addition
to viewport and scaler. This change is a conservative way to make this happen.
However it could be more optimized to move pitch programming into
mem_program_viewport.
Signed-off-by: Eric Yang <Eric.Yang2@amd.com>
Reviewed-by: Andrew Jiang <Andrew.Jiang@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>