The dma_map_sgtable() call (used to invalidate cache) overwrites sgt->nents
with 1, so msm_iommu_pagetable_map maps only the first physical segment.
To fix this problem use for_each_sgtable_sg(), which uses orig_nents.
Fixes: b145c6e65e ("drm/msm: Add support to create a local pagetable")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Link: https://lore.kernel.org/r/20220613221019.11399-1-jonathan@marek.ca
Signed-off-by: Rob Clark <robdclark@chromium.org>
Following commit 17e822f759 ("drm/msm: fix unbalanced
pm_runtime_enable in adreno_gpu_{init, cleanup}"), any call to
adreno_unbind() will disable runtime PM twice, as indicated by the call
trees below:
adreno_unbind()
-> pm_runtime_force_suspend()
-> pm_runtime_disable()
adreno_unbind()
-> gpu->funcs->destroy() [= aNxx_destroy()]
-> adreno_gpu_cleanup()
-> pm_runtime_disable()
Note that pm_runtime_force_suspend() is called right before
gpu->funcs->destroy() and both functions are called unconditionally.
With recent addition of the eDP AUX bus code, this problem manifests
itself when the eDP panel cannot be found yet and probing is deferred.
On the first probe attempt, we disable runtime PM twice as described
above. This then causes any later probe attempt to fail with
[drm:adreno_load_gpu [msm]] *ERROR* Couldn't power up the GPU: -13
preventing the driver from loading.
As there seem to be scenarios where the aNxx_destroy() functions are not
called from adreno_unbind(), simply removing pm_runtime_disable() from
inside adreno_unbind() does not seem to be the proper fix. This is what
commit 17e822f759 ("drm/msm: fix unbalanced pm_runtime_enable in
adreno_gpu_{init, cleanup}") intended to fix. Therefore, instead check
whether runtime PM is still enabled, and only disable it in that case.
Fixes: 17e822f759 ("drm/msm: fix unbalanced pm_runtime_enable in adreno_gpu_{init, cleanup}")
Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Link: https://lore.kernel.org/r/20220606211305.189585-1-luzmaximilian@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
If a GEM object is allocated, and then exported as a dma-buf fd which is
mmap'd before or without the GEM buffer being directly mmap'd, the
vma_node could be unitialized. This leads to a situation where the CPU
mapping is not correctly torn down in drm_vma_node_unmap().
Fixes: e551655399 ("drm: call drm_gem_object_funcs.mmap with fake offset")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://lore.kernel.org/r/20220531200857.136547-1-robdclark@gmail.com
5.19 fixes for msm-next
- Fix to add minimum ICC vote in the msm_mdss pm_resume path to address
bootup splats
- Fix to avoid dereferencing without checking in WB encoder
- Fix to avoid crash during suspend in DP driver by ensuring interrupt
mask bits are updated
- Remove unused code from dpu_encoder_virt_atomic_check()
- Fix to remove redundant init of dsc variable
Signed-off-by: Rob Clark <robdclark@chromium.org>
In commit a670ff578f ("drm/msm/dpu: always use mdp device to scale
bandwidth") we fully moved interconnect stuff to the DPU driver. This
had no change for sc7180 but _did_ have an impact for other SoCs. It
made them match the sc7180 scheme.
Unfortunately, the sc7180 scheme seems like it was a bit broken.
Specifically the interconnect needs to be on for more than just the
DPU driver's AXI bus. In the very least it also needs to be on for the
DSI driver's AXI bus. This can be seen fairly easily by doing this on
a ChromeOS sc7180-trogdor class device:
set_power_policy --ac_screen_dim_delay=5 --ac_screen_off_delay=10
sleep 10
cd /sys/bus/platform/devices/ae94000.dsi/power
echo on > control
When you do that, you'll get a warning splat in the logs about
"gcc_disp_hf_axi_clk status stuck at 'off'".
One could argue that perhaps what I have done above is "illegal" and
that it can't happen naturally in the system because in normal system
usage the DPU is pretty much always on when DSI is on. That being
said:
* In official ChromeOS builds (admittedly a 5.4 kernel with backports)
we have seen that splat at bootup.
* Even though we don't use "autosuspend" for these components, we
don't use the "put_sync" variants. Thus plausibly the DSI could stay
"runtime enabled" past when the DPU is enabled. Techncially we
shouldn't do that if the DPU's suspend ends up yanking our clock.
Let's change things such that the "bare minimum" request for the
interconnect happens in the mdss driver again. That means that all of
the children can assume that the interconnect is on at the minimum
bandwidth. We'll then let the DPU request the higher amount that it
wants.
It should be noted that this isn't as hacky of a solution as it might
initially appear. Specifically:
* Since MDSS and DPU individually get their own references to the
interconnect then the framework will actually handle aggregating
them. The two drivers are _not_ clobbering each other.
* When the Qualcomm interconnect driver aggregates it takes the max of
all the peaks. Thus having MDSS request a peak, as we're doing here,
won't actually change the total interconnect bandwidth (it won't be
added to the request for the DPU). This perhaps explains why the
"average" requested in MDSS was historically 0 since that one
_would_ be added in.
NOTE also that in the downstream ChromeOS 5.4 and 5.15 kernels, we're
also seeing some RPMH hangs that are addressed by this fix. These
hangs are showing up in the field and on _some_ devices with enough
stress testing of suspend/resume. Specifically right at suspend time
with a stack crawl that looks like this (from chromeos-5.15 tree):
rpmh_write_batch+0x19c/0x240
qcom_icc_bcm_voter_commit+0x210/0x420
qcom_icc_set+0x28/0x38
apply_constraints+0x70/0xa4
icc_set_bw+0x150/0x24c
dpu_runtime_resume+0x50/0x1c4
pm_generic_runtime_resume+0x30/0x44
__genpd_runtime_resume+0x68/0x7c
genpd_runtime_resume+0x12c/0x20c
__rpm_callback+0x98/0x138
rpm_callback+0x30/0x88
rpm_resume+0x370/0x4a0
__pm_runtime_resume+0x80/0xb0
dpu_kms_enable_commit+0x24/0x30
msm_atomic_commit_tail+0x12c/0x630
commit_tail+0xac/0x150
drm_atomic_helper_commit+0x114/0x11c
drm_atomic_commit+0x68/0x78
drm_atomic_helper_disable_all+0x158/0x1c8
drm_atomic_helper_suspend+0xc0/0x1c0
drm_mode_config_helper_suspend+0x2c/0x60
msm_pm_prepare+0x2c/0x40
pm_generic_prepare+0x30/0x44
genpd_prepare+0x80/0xd0
device_prepare+0x78/0x17c
dpm_prepare+0xb0/0x384
dpm_suspend_start+0x34/0xc0
We don't completely understand all the mechanisms in play, but the
hang seemed to come and go with random factors. It's not terribly
surprising that the hang is gone after this patch since the line of
code that was failing is no longer present in the kernel.
Fixes: a670ff578f ("drm/msm/dpu: always use mdp device to scale bandwidth")
Fixes: c33b7c0389 ("drm/msm/dpu: add support for clk and bw scaling for display")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Tested-by: Jessica Zhang <quic_jesszhan@quicinc.com> # RB3 (sdm845) and
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/487884/
Link: https://lore.kernel.org/r/20220531160059.v2.1.Ie7f6d4bf8cce28131da31a43354727e417cae98d@changeid
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
dp_catalog_ctrl_reset() will software reset DP controller. But it will
not reset programmable registers to default value. DP driver still have
to clear mask bits to interrupt status registers to disable interrupts
after software reset of controller.
At current implementation, dp_ctrl_reset_irq_ctrl() will software reset dp
controller but did not call dp_catalog_ctrl_enable_irq(false) to clear hpd
related interrupt mask bits to disable hpd related interrupts due to it
mistakenly think hpd related interrupt mask bits will be cleared by software
reset of dp controller automatically. This mistake may cause system to crash
during suspending procedure due to unexpected irq fired and trigger event
thread to access dp controller registers with controller clocks are disabled.
This patch fixes system crash during suspending problem by removing "enable"
flag condition checking at dp_ctrl_reset_irq_ctrl() so that hpd related
interrupt mask bits are cleared to prevent unexpected from happening.
Changes in v2:
-- add more details commit text
Changes in v3:
-- add synchrons_irq()
-- add atomic_t suspended
Changes in v4:
-- correct Fixes's commit ID
-- remove synchrons_irq()
Changes in v5:
-- revise commit text
Changes in v6:
-- add event_lock to protect "suspended"
Changes in v7:
-- delete "suspended" flag
Fixes: 989ebe7bc4 ("drm/msm/dp: do not initialize phy until plugin interrupt received")
Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/486591/
Link: https://lore.kernel.org/r/1652804494-19650-1-git-send-email-quic_khsieh@quicinc.com
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
5.19 fixes for msm-next
- Limiting WB modes to max sspp linewidth
- Fixing the supported rotations to add 180 back for IGT
- Fix to handle pm_runtime_get_sync() errors to avoid unclocked access
in the bind() path for dpu driver
- Fix the irq_free() without request issue which was a big-time
hitter in the CI-runs.
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Signed-off-by: Rob Clark <robdclark@chromium.org>
of_parse_phandle() returns a node pointer with refcount
incremented, we should use of_node_put() on it when not need anymore.
a6xx_gmu_init() passes the node to of_find_device_by_node()
and of_dma_configure(), of_find_device_by_node() will takes its
reference, of_dma_configure() doesn't need the node after usage.
Add missing of_node_put() to avoid refcount leak.
Fixes: 4b565ca5a2 ("drm/msm: Add A6XX device support")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Akhil P Oommen <quic_akhilpo@quicinc.com>
Link: https://lore.kernel.org/r/20220512121955.56937-1-linmq006@gmail.com
Signed-off-by: Rob Clark <robdclark@chromium.org>
Commit 7d8e9a9050 ("drm/msm/dsi: move DSI host powerup to modeset
time") caused sc7180 Chromebooks that use the parade-ps8640 bridge
chip to fail to turn the display back on after it turns off.
Unfortunately, it doesn't look easy to fix the parade-ps8640 driver to
handle the new power sequence. The Linux driver has almost nothing in
it and most of the logic for this bridge chip is in black-box firmware
that the bridge chip uses.
Also unfortunately, reverting the patch will break "tc358762".
The long term solution here is probably Dave Stevenson's series [1]
that would give more flexibility. However, that is likely not a quick
fix.
For the short term, we'll look at the compatible of the next bridge in
the chain and go back to the old way for the Parade PS8640 bridge
chip. If it's found that other bridge chips also need this workaround
then we can add them to the list or consider inverting the
condition. However, the hope is that the framework will not take too
much longer to land and we won't have to add anything other than
ps8640 here.
[1] https://lore.kernel.org/r/cover.1646406653.git.dave.stevenson@raspberrypi.com
Fixes: 7d8e9a9050 ("drm/msm/dsi: move DSI host powerup to modeset time")
Suggested-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Link: https://lore.kernel.org/r/20220513131504.v5.1.Ia196e35ad985059e77b038a41662faae9e26f411@changeid
Signed-off-by: Rob Clark <robdclark@chromium.org>
For the past several releases I have been assisting Rob by writing,
collecting, testing and integrating patches for non-GPU and non-core
parts of MSM DRM driver, while Rob is more interested in improving the
GPU-related part. Let's note this in the MAINTAINERS file.
While we are at it, per Rob's suggestion let's also promote Abhinav
Kumar to M: (as he is actively working on the driver) and switch Sean
Paul to R: (since he isn't doing much on msm these days).
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Acked-by: Rob Clark <robdclark@gmail.com>
Link: https://lore.kernel.org/r/20220429215324.3729441-1-dmitry.baryshkov@linaro.org
Signed-off-by: Rob Clark <robdclark@chromium.org>
When rebooting on my sc7280-herobrine based device, I got a
crash. Upon debugging, I found that I was in msm_drv_shutdown() and my
"pdev" was the one associated with mdss_probe().
From source, I found that mdss_probe() has the line:
platform_set_drvdata(pdev, mdss);
...where "mdss" is of type "struct msm_mdss *".
Also from source, I saw that in msm_drv_shutdown() we have the line:
struct msm_drm_private *priv = platform_get_drvdata(pdev);
This is a mismatch and is the root of the problem.
Further digging made it apparent that msm_drv_shutdown() is only
supposed to be used for parts of the msm display framework that also
call msm_drv_probe() but mdss_probe() doesn't call
msm_drv_probe(). Let's remove the shutdown functon from msm_mdss.c.
Digging a little further, code inspection found that two drivers that
use msm_drv_probe() weren't calling msm_drv_shutdown(). Let's add it
to them.
Fixes: 6874f48bb8 ("drm/msm: make mdp5/dpu devices master components")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Patchwork: https://patchwork.freedesktop.org/patch/484975/
Link: https://lore.kernel.org/r/20220504163900.v2.1.Iaebd35e60160fc0f2a50fac3a0bf3b298c0637c8@changeid
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
mdp5_get_global_state runs the risk of hitting a -EDEADLK when acquiring
the modeset lock, but currently mdp5_pipe_release doesn't check for if
an error is returned. Because of this, there is a possibility of
mdp5_pipe_release hitting a NULL dereference error.
To avoid this, let's have mdp5_pipe_release check if
mdp5_get_global_state returns an error and propogate that error.
Changes since v1:
- Separated declaration and initialization of *new_state to avoid
compiler warning
- Fixed some spelling mistakes in commit message
Changes since v2:
- Return 0 in case where hwpipe is NULL as this is considered normal
behavior
- Added 2nd patch in series to fix a similar NULL dereference issue in
mdp5_mixer_release
Reported-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
Fixes: 7907a0d77c ("drm/msm/mdp5: Use the new private_obj state")
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Patchwork: https://patchwork.freedesktop.org/patch/485179/
Link: https://lore.kernel.org/r/20220505214051.155-1-quic_jesszhan@quicinc.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Event thread supposed to exit from its while loop after kthread_stop().
However there may has possibility that event thread is pending in the
middle of wait_event due to condition checking never become true.
To make sure event thread exit its loop after kthread_stop(), this
patch OR kthread_should_stop() into wait_event's condition checking
so that event thread will exit its loop after kernal_stop().
Changes in v2:
-- correct spelling error at commit title
Changes in v3:
-- remove unnecessary parenthesis
-- while(1) to replace while (!kthread_should_stop())
Reported-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Fixes: 570d3e5d28 ("drm/msm/dp: stop event kernel thread when DP unbind")
Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/484576/
Link: https://lore.kernel.org/r/1651595136-24312-1-git-send-email-quic_khsieh@quicinc.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
At normal operation, transmit phy test pattern has to be terminated before
DP controller switch to video ready state. However during phy compliance
testing, transmit phy test pattern should not be terminated until end of
compliance test which usually indicated by unplugged interrupt.
Only stop sending the train pattern in dp_ctrl_on_stream() if we're not
doing compliance testing. We also no longer reset 'p_level' and
'v_level' within dp_ctrl_on_link() due to both 'p_level' and 'v_level'
are acquired from link status at previous dpcd read and we like to use
those level to start link training.
Changes in v2:
-- add more details commit text
-- correct Fixes
Changes in v3:
-- drop unnecessary braces
Fixes: 2e0adc765d ("drm/msm/dp: do not end dp link training until video is ready")
Signed-off-by: Kuogee Hsieh <quic_khsieh@quicinc.com>
Reviewed-by: Stephen Boyd <swboyd@chromium.org>
Patchwork: https://patchwork.freedesktop.org/patch/483564/
Link: https://lore.kernel.org/r/1650995939-28467-3-git-send-email-quic_khsieh@quicinc.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>