Alloc GEM buffers backed by noncoherent memory on SoCs where it is
actually faster than write-combine.
This dramatically speeds up software rendering on these SoCs, even for
tasks where write-combine memory should in theory be faster (e.g. simple
blits).
v3: The option is now selected per-SoC instead of being a module
parameter.
v5: - Fix drm_atomic_get_new_plane_state() used to retrieve the old
state
- Use custom drm_gem_fb_create()
- Only check damage clips and sync DMA buffers if non-coherent
buffers are used
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210523170415.90410-4-paul@crapouillou.net
This function can be used by drivers that use damage clips and have
CMA GEM objects backed by non-coherent memory. Calling this function
in a plane's .atomic_update ensures that all the data in the backing
memory have been written to RAM.
v3: - Only sync data if using GEM objects backed by non-coherent memory.
- Use a drm_device pointer instead of device pointer in prototype
v5: - Rename to drm_fb_cma_sync_non_coherent
- Invert loops for better cache locality
- Only sync BOs that have the non-coherent flag
- Move to drm_fb_cma_helper.c to avoid circular dependency
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210523170415.90410-3-paul@crapouillou.net
Having GEM buffers backed by non-coherent memory is interesting in the
particular case where it is faster to render to a non-coherent buffer
then sync the data cache, than to render to a write-combine buffer, and
(by extension) much faster than using a shadow buffer. This is true for
instance on some Ingenic SoCs, where even simple blits (e.g. memcpy)
are about three times faster using this method.
Add a 'map_noncoherent' flag to the drm_gem_cma_object structure, which
can be set by the drivers when they create the dumb buffer.
Since this really only applies to software rendering, disable this flag
as soon as the CMA objects are exported via PRIME.
v3: New patch. Now uses a simple 'map_noncoherent' flag to control how
the objects are mapped, and use the new dma_mmap_pages function.
v4: Make sure map_noncoherent is always disabled when creating GEM
objects meant to be used with dma-buf.
Signed-off-by: Paul Cercueil <paul@crapouillou.net>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20210523170415.90410-2-paul@crapouillou.net
All DPT FB color plane surface base addresses must be 2MB aligned. On
ADL_P this means that the offsets in CCS FB object must be also 2MB
aligned. Adjusting unaligned offsets for these FBs during commit time
(compensating with the x/y offsets) doesn't work, since the big
alignment would most probably lead to an x/y offset mismatch error
between the main and CCS planes.
We can overcome this limitation by remapping CCS FBs, so that each color
plane is at an aligned offset, leaving x/y for each plane unadjusted
during commit and so not causing an x/y mismatch error. However
remapping for CCS FBs will be done as a follow-up, so for now require
that user space allocates the FB obj with properly aligned planes.
v2: s/SZ_2M/512*4k/ for clarity. (Ville)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210524172703.2113058-1-imre.deak@intel.com
The driver currently disables the LTTPR non-transparent link training
mode for sinks with a DPCD_REV<1.4, based on the following description
of the LTTPR DPCD register range in DP standard 2.0 (at the 0xF0000
register description):
""
LTTPR-related registers at DPCD Addresses F0000h through F02FFh are valid
only for DPCD r1.4 (or higher).
"""
The transparent link training mode should still work fine, however the
implementation for this in some retimer FWs seems to be broken, see the
References: link below.
After discussions with DP standard authors the above "DPCD r1.4" does
not refer to the DPCD revision (stored in the DPCD_REV reg at 0x00000),
rather to the "LTTPR field data structure revision" stored in the
0xF0000 reg. An update request has been filed at vesa.org (see
wg/Link/documentComment/3746) for the upcoming v2.1 specification to
clarify the above description along the following lines:
"""
LTTPR-related registers at DPCD Addresses F0000h through F02FFh are
valid only for LT_TUNABLE_PHY_REPEATER_FIELD_DATA_STRUCTURE_REV 1.4 (or
higher)
"""
Based on my tests Windows uses the non-transparent link training mode
for DPCD_REV==1.2 sinks as well (so presumably for all DPCD_REVs), and
forcing it to use transparent mode on ICL/TGL platforms leads to the
same LT failure as reported at the References: link.
Based on the above let's assume that the transparent link training mode
is not well tested/supported and align the code to the correct
interpretation of what the r1.4 version refers to.
Reported-and-tested-by: Casey Harkins <caseyharkins@gmail.com>
Tested-by: Khaled Almahallawy <khaled.almahallawy@intel.com>
References: https://gitlab.freedesktop.org/drm/intel/-/issues/3415
Fixes: 264613b406 ("drm/i915: Disable LTTPR support when the DPCD rev < 1.4")
Cc: <stable@vger.kernel.org> # v5.11+
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Khaled Almahallawy <khaled.almahallawy@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210512212809.1234701-1-imre.deak@intel.com
(cherry picked from commit cb4920cc40)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
When main component is not probed, by example when the dw-hdmi module is
not loaded yet or in probe defer, the following crash appears on shutdown:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000038
...
pc : meson_drv_shutdown+0x24/0x50
lr : platform_drv_shutdown+0x20/0x30
...
Call trace:
meson_drv_shutdown+0x24/0x50
platform_drv_shutdown+0x20/0x30
device_shutdown+0x158/0x360
kernel_restart_prepare+0x38/0x48
kernel_restart+0x18/0x68
__do_sys_reboot+0x224/0x250
__arm64_sys_reboot+0x24/0x30
...
Simply check if the priv struct has been allocated before using it.
Fixes: fa0c16caf3 ("drm: meson_drv add shutdown function")
Reported-by: Stefan Agner <stefan@agner.ch>
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
Tested-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210430082744.3638743-1-narmstrong@baylibre.com
Fix table returned when port_clock > 270000:
drivers/gpu/drm/i915/display/intel_ddi_buf_trans.c:752:47: error: variable 'adlp_dkl_phy_dp_ddi_trans_hbr2_hbr3' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]
Initial version of the patch had it in a single table, but on second
version the table got split, but we continued to reference just one of
them.
Fixes: ca96288226 ("drm/i915/adl_p: Define and use ADL-P specific DP translation tables")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Anusha Srivatsa <anusha.srivatsa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210521005209.4058702-1-lucas.demarchi@intel.com
The PM Runtime docs specifically call out the need to call
pm_runtime_dont_use_autosuspend() in the remove() callback if
pm_runtime_use_autosuspend() was called in probe():
> Drivers in ->remove() callback should undo the runtime PM changes done
> in ->probe(). Usually this means calling pm_runtime_disable(),
> pm_runtime_dont_use_autosuspend() etc.
We should do this. This fixes a warning splat that I saw when I was
testing out the panel-simple's remove().
Fixes: 3235b0f20a ("drm/panel: panel-simple: Use runtime pm to avoid excessive unprepare / prepare")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210517130450.v7.1.I9e947183e95c9bd067c9c1d51208ac6a96385139@changeid
At boot, we can't rely on the vc4_get_crtc_encoder since we don't have a
state yet and thus will not be able to figure out which connector is
attached to our CRTC.
However, we have a muxing bit in the CRTC register we can use to get the
encoder currently connected to the pixelvalve. We can thus read that
register, lookup the associated register through the vc4_pv_data
structure, and then pass it to vc4_crtc_disable so that we can perform
the proper operations.
Fixes: 875a4d5368 ("drm/vc4: drv: Disable the CRTC at boot time")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210507150515.257424-6-maxime@cerno.tech
The vc4_get_crtc_encoder function currently only works when the
connector->state->crtc pointer is set, which is only true when the
connector is currently enabled.
However, we use it as part of the disable path as well, and our lookup
will fail in that case, resulting in it returning a null pointer we
can't act on.
We can access the connector that used to be connected to that crtc
though using the old connector state in the disable path.
Since we want to support both the enable and disable path, we can
support it by passing the state accessor variant as a function pointer,
together with the atomic state.
Fixes: 792c3132bc ("drm/vc4: encoder: Add finer-grained encoder callbacks")
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210507150515.257424-5-maxime@cerno.tech
The vc4_set_crtc_possible_masks is meant to run over all the encoders
and then set their possible_crtcs mask to their associated pixelvalve.
However, since the commit 39fcb28083 ("drm/vc4: txp: Turn the TXP into
a CRTC of its own"), the TXP has been turned to a CRTC and encoder of
its own, and while it does indeed register an encoder, it no longer has
an associated pixelvalve. The code will thus run over the TXP encoder
and set a bogus possible_crtcs mask, overriding the one set in the TXP
bind function.
In order to fix this, let's skip any virtual encoder.
Cc: <stable@vger.kernel.org> # v5.9+
Fixes: 39fcb28083 ("drm/vc4: txp: Turn the TXP into a CRTC of its own")
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20210507150515.257424-3-maxime@cerno.tech
The VEC's DAC on BCM2711 is slightly different compared to the one on
BCM283x and needs different configuration. In particular, bit 3
(mask 0x8) switches the BCM2711 DAC input to "self-test input data",
which makes the output unusable. Separating two compatible variants in
devicetrees and the DRM driver was therefore necessary.
The configurations used for both variants have been borrowed from
Raspberry Pi (model 3B for BCM283x, 4B for BCM2711) firmware defaults.
Signed-off-by: Mateusz Kwiatkowski <kfyatek+publicgit@gmail.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Reviewed-by: Dave Stevenson <dave.stevenson@raspberrypi.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210520150344.273900-4-maxime@cerno.tech
The first parameter passed to container_of() is the pointer to the work
structure passed to the worker and never NULL. The NULL check on the
result of container_of() is therefore unnecessary and misleading.
Remove it.
This change was made automatically with the following Coccinelle script.
@@
type t;
identifier v;
statement s;
@@
<+...
(
t v = container_of(...);
|
v = container_of(...);
)
...
when != v
- if (\( !v \| v == NULL \) ) s
...+>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Treat it like ATIF and check both the dGPU and APU for
the method. This is required because ATCS may be hung
off of the APU in ACPI on A+A systems.
v2: add back accidently removed ACPI handle check.
v3: Fix incorrect atif check (Colin)
Fix uninitialized variable (Colin)
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Avoid spamming the log. The backlight controller on DCN chips
gets powered down when the display is off, so if you attempt to
set the backlight level when the display is off, you'll get this
message. This isn't a problem as we cache the requested backlight
level if it's adjusted when the display is off and set it again
during modeset.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: nicholas.choi@amd.com
Cc: harry.wentland@amd.com
[Why]
FS video support regressed GPU scaling and the scaled buffer ends up
stuck in the top left of the screen at native size - full, aspect,
center scaling modes do not function.
This is because decide_crtc_timing_for_drm_display_mode() does not
get called when scaling is enabled.
[How]
Split recalculate timing and scaling into two different flags.
We don't want to call drm_mode_set_crtcinfo() for scaling, but we
do want to call it for FS video.
Optimize and move preferred_refresh calculation next to
decide_crtc_timing_for_drm_display_mode() like it used to be since
that's not used for FS video.
We don't need to copy over the VIC or polarity in the case of FS video
modes because those don't change.
Fixes: 6f59f229f8 ("drm/amd/display: Skip modeset for front porch change")
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
KFD userptr BOs and SG BOs used for DMA mappings can be preempted with
CWSR. Therefore we can use preemptible placement and avoid unwanted
evictions due to GTT accounting.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SG BOs such as dmabuf imports and userptr BOs do not consume system
resources directly. Instead they point to resources owned elsewhere.
They typically get evicted by DMABuf move notifiers of MMU notifiers.
If those notifiers don't need to wait for hardware fences (i.e. the SG
BOs are used in a preemptible context), then we don't need to limit
them to the GTT size and we don't need TTM to evict them.
Create a new placement for such preemptible SG BOs that does not impose
artificial size limits and TTM evictions.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Bandwidth calculations are triggered for non zero streams, and
in case of 0 streams, these calculations were skipped with
pstate status not being updated.
[How]
As the pstate status is applicable for non zero streams, check
added for allowing 0 streams inline with dcn internal bandwidth
validations.
Signed-off-by: Bindu Ramamurthy <bindu.r@amd.com>
Reviewed-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
For default cases,we should return 0. Otherwise resume will
abort because of the wrong return value.
Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fixes the following W=1 kernel build warning(s):
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:98: warning: expecting prototype for amdgpu_vce_init(). Prototype was for amdgpu_vce_sw_init() instead
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:214: warning: expecting prototype for amdgpu_vce_fini(). Prototype was for amdgpu_vce_sw_fini() instead
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:590: warning: expecting prototype for amdgpu_vce_cs_validate_bo(). Prototype was for amdgpu_vce_validate_bo() instead
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:724: warning: expecting prototype for amdgpu_vce_cs_parse(). Prototype was for amdgpu_vce_ring_parse_cs() instead
drivers/gpu/drm/amd/amdgpu/amdgpu_vce.c:960: warning: expecting prototype for amdgpu_vce_cs_parse_vm(). Prototype was for amdgpu_vce_ring_parse_cs_vm() instead
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>