Changes for v7.1
CI:
- Uprev mesa
- Restore CI jobs for Qualcomm APQ8016 and APQ8096 devices
Core:
- Switched to of_get_available_child_by_name()
DPU:
- Fixes for DSC panels
- Fixed brownout because of the frequency / OPP mismatch
- Quad pipe preparation (not enabled yet)
- Switched to virtual planes by default
- Dropped VBIF_NRT support
- Added support for Eliza platform
- Reworked alpha handling
- Switched to correct CWB definitions on Eliza
- Dropped dummy INTF_0 on MSM8953
- Corrected INTFs related to DP-MST
DP:
- Removed debug prints looking into PHY internals
DSI:
- Fixes for DSC panels
- RGB101010 support
- Support for SC8280XP
- Moved PHY bindings from display/ to phy/
GPU:
- Preemption support for x2-85 and a840
- IFPC support for a840
- SKU detection support for x2-85 and a840
- Expose AQE support (VK ray-pipeline)
- Avoid locking in VM_BIND fence signaling path
- Fix to avoid reclaim in GPU snapshot path
- Disallow foreign mapping of _NO_SHARE BOs
- Couple a6xx gpu snapshot fixes
- Various other fixes
HDMI:
- Fixed infoframes programming
MDP5:
- Dropped support for MSM8974v1
- Dropped now unused code for MSM8974 v1 and SDM660 / MSM8998
Also misc small fixes
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <rob.clark@oss.qualcomm.com>
Link: https://patch.msgid.link/CACSVV012vn73BaUfk=Hw4WkQHZNPHiqfifWEunAqMc2EGOWUEQ@mail.gmail.com
Recent chipsets like Glymur supports a new mechanism for SKU detection.
A new CX_MISC register exposes the combined (or final) speedbin value
from both HW fuse register and the Soft Fuse register. Implement this new
SKU detection along with a new quirk to identify the GPUs that has soft
fuse support.
There is a side effect of this patch on A4x and older series. The
speedbin field in the MSM_PARAM_CHIPID will be 0 instead of 0xffff. This
should be okay as Mesa correctly handles it. Speedbin was not even a
thing when those GPUs' support were added.
Signed-off-by: Akhil P Oommen <akhilpo@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/714676/
Message-ID: <20260327-a8xx-gpu-batch2-v2-12-2b53c38d2101@oss.qualcomm.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
msm_ioctl_gem_info_get_metadata() always returns 0 regardless of
errors. When copy_to_user() fails or the user buffer is too small,
the error code stored in ret is ignored because the function
unconditionally returns 0. This causes userspace to believe the
ioctl succeeded when it did not.
Additionally, kmemdup() can return NULL on allocation failure, but
the return value is not checked. This leads to a NULL pointer
dereference in the subsequent copy_to_user() call.
Add the missing NULL check for kmemdup() and return ret instead of 0.
Note that the SET counterpart (msm_ioctl_gem_info_set_metadata)
correctly returns ret.
Fixes: 9902cb999e ("drm/msm/gem: Add metadata")
Cc: stable@vger.kernel.org
Signed-off-by: Yasuaki Torimaru <yasuakitorimaru@gmail.com>
Patchwork: https://patchwork.freedesktop.org/patch/714478/
Message-ID: <20260325114635.383241-1-yasuakitorimaru@gmail.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
Previously, in case there was no more work to do, recover worker
wouldn't trigger recovery and would instead rely on the gpu going to
sleep and then resuming when more work is submitted.
Recover_worker will first increment the fence of the hung ring so, if
there's only one job submitted to a ring and that causes an hang, it
will early out.
There's no guarantee that the gpu will suspend and resume before more
work is submitted and if the gpu is in a hung state it will stay in that
state and probably trigger a timeout again.
Just stop checking and always recover the gpu.
Signed-off-by: Anna Maniscalco <anna.maniscalco2000@gmail.com>
Cc: stable@vger.kernel.org
Patchwork: https://patchwork.freedesktop.org/patch/704066/
Message-ID: <20260210-recovery_suspend_fix-v1-1-00ed9013da04@gmail.com>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
We sometimes get into a situtation where GPU hangcheck fails to
recover GPU:
[..]
msm_dpu ae01000.display-controller: [drm:hangcheck_handler] *ERROR* (IPv4: 1): hangcheck detected gpu lockup rb 0!
msm_dpu ae01000.display-controller: [drm:hangcheck_handler] *ERROR* (IPv4: 1): completed fence: 7840161
msm_dpu ae01000.display-controller: [drm:hangcheck_handler] *ERROR* (IPv4: 1): submitted fence: 7840162
msm_dpu ae01000.display-controller: [drm:hangcheck_handler] *ERROR* (IPv4: 1): hangcheck detected gpu lockup rb 0!
msm_dpu ae01000.display-controller: [drm:hangcheck_handler] *ERROR* (IPv4: 1): completed fence: 7840162
msm_dpu ae01000.display-controller: [drm:hangcheck_handler] *ERROR* (IPv4: 1): submitted fence: 7840163
[..]
The problem is that msm_job worker is blocked on gpu->lock
INFO: task ring0:155 blocked for more than 122 seconds.
Not tainted 6.6.99-08727-gaac38b365d2c #1
task:ring0 state:D stack:0 pid:155 ppid:2 flags:0x00000008
Call trace:
__switch_to+0x108/0x208
schedule+0x544/0x11f0
schedule_preempt_disabled+0x30/0x50
__mutex_lock_common+0x410/0x850
__mutex_lock_slowpath+0x28/0x40
mutex_lock+0x5c/0x90
msm_job_run+0x9c/0x140
drm_sched_main+0x514/0x938
kthread+0x114/0x138
ret_from_fork+0x10/0x20
which is owned by recover worker, which is waiting for DMA fences
from a memory reclaim path, under the very same gpu->lock
INFO: task ring0:155 is blocked on a mutex likely owned by task gpu-worker:154.
task:gpu-worker state:D stack:0 pid:154 ppid:2 flags:0x00000008
Call trace:
__switch_to+0x108/0x208
schedule+0x544/0x11f0
schedule_timeout+0x1f8/0x770
dma_fence_default_wait+0x108/0x218
dma_fence_wait_timeout+0x6c/0x1c0
dma_resv_wait_timeout+0xe4/0x118
active_purge+0x34/0x98
drm_gem_lru_scan+0x1d0/0x388
msm_gem_shrinker_scan+0x1cc/0x2e8
shrink_slab+0x228/0x478
shrink_node+0x380/0x730
try_to_free_pages+0x204/0x510
__alloc_pages_direct_reclaim+0x90/0x158
__alloc_pages_slowpath+0x1d4/0x4a0
__alloc_pages+0x9f0/0xc88
vm_area_alloc_pages+0x17c/0x260
__vmalloc_node_range+0x1c0/0x420
kvmalloc_node+0xe8/0x108
msm_gpu_crashstate_capture+0x1e4/0x280
recover_worker+0x1c0/0x638
kthread_worker_fn+0x150/0x2d8
kthread+0x114/0x138
So no one can make any further progress.
Forbid recover/fault worker to enter memory reclaim (under
gpu->lock) to address this deadlock scenario.
Cc: Tomasz Figa <tfiga@chromium.org>
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Reviewed-by: Rob Clark <rob.clark@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/700978/
Message-ID: <20260127073341.2862078-1-senozhatsky@chromium.org>
Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
drm-misc-next for v7.1:
UAPI Changes:
amdxdna:
- support per-BO memory-usage queries
docs:
- Improve UAPI documentation
panthor:
- extend timestamp query with flags
Core Changes:
edid:
- provide enum drm_output_color_format; mass-convert drivers
gem-dma:
- use drm_dev_dma_dev() for DMA mappings
- set VM_DONTDUMP on mmap
mipi-dbi:
- drop simple-display; mass-convert drivers
prime:
- use drm_dev_dma_dev() for DMA mappings
ttm:
- improve handling of gfp_retry_mayfail
Driver Changes:
amdgpu:
- use atomic_create_state for private_obj
amdxdna:
- refactor GEM implementation
- fixes
bridge:
- provide clear-and-put helper for reliable cleanup
- analogix_dp: Use DP helpers for link training
- lontium-lt8713sx: Fix 64-bit division and Kconfig
- samsung-dsim: Use clear-and-put
imagination:
- improve power-off sequence
- support context-reset notification from firmware
komeda:
- support Arm China Linlon D6 plus DT bindings
mediatek:
- use drm_dev_dma_dev() for DMA mappings
panel:
- support Himax HX83121A plus DT bindings
- support JuTouch JT070TM041 plus DT bindings
- support Samsung S6E8FC0 plus DT bindings
- himax-hx83102c: support Samsung S6E8FC0 plus DT bindings; support backlight
- ili9806e: support Rocktech RK050HR345-CT106A plus DT bindings
- simple: support Tianma TM050RDH03 plus DT bindings
panthor:
- support various sources for timestamp queries
- fixes
omapdrm:
- use atomic_create_state for private_obj
rcar-du:
- fix suspend/resume wrt VSP interface
- fix leak of device_link
- clean up
sun4i:
- use drm_dev_dma_dev() for DMA mappings
tegra:
- use atomic_create_state for private_obj
xe:
- send 'none' recovery method for XE_WEDGED_MODE_UPON_ANY_HANG_NO_RESET
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20260326151812.GA76082@linux.fritz.box
The content of every half of screen is sent out via one interface in
dual-DSI case. The content for every interface is blended by a LM
pair in quad-pipe case, thus a LM pair should not blend any content
that cross the half of screen in this case. Clip plane into pipes per
left and right half screen ROI if topology is quad pipe case.
The clipped rectangle on every half of screen is futher handled by two
pipes if its width exceeds a limit for a single pipe.
For non-virtual-plane case, there is always one stage config to serve
a LM or LM pair. So the clipping does not occur when interating stages
in this case. The plane is mapped to 2 pipes only when width or clock
rate exceeds hardware constrain within stage check.
Signed-off-by: Jun Nie <jun.nie@linaro.org>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Jessica Zhang <jessica.zhang@oss.qualcomm.com>
Patchwork: https://patchwork.freedesktop.org/patch/711324/
Link: https://lore.kernel.org/r/20260312-msm-next-quad-pipe-split-v19-3-4ffa2b06c996@linaro.org
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>