linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-06-03 19:42:23 -04:00

Author	SHA1	Message	Date
Nathan Chancellor	53676e4d44	drm/msm: Restore second parameter name in purge() and evict() After commit `3392291fc5` ("drm/msm: Fix shrinker deadlock"), all supported versions of clang warn (or error with CONFIG_WERROR=y): drivers/gpu/drm/msm/msm_gem_shrinker.c:105:58: error: omitting the parameter name in a function definition is a C23 extension [-Werror,-Wc23-extensions] 105 \| purge(struct drm_gem_object obj, struct ww_acquire_ctx ) \| ^ drivers/gpu/drm/msm/msm_gem_shrinker.c:117:58: error: omitting the parameter name in a function definition is a C23 extension [-Werror,-Wc23-extensions] 117 \| evict(struct drm_gem_object obj, struct ww_acquire_ctx ) \| ^ 2 errors generated. With older but supported versions of GCC, this is an unconditional hard error: drivers/gpu/drm/msm/msm_gem_shrinker.c: In function 'purge': drivers/gpu/drm/msm/msm_gem_shrinker.c:105:35: error: parameter name omitted purge(struct drm_gem_object obj, struct ww_acquire_ctx ) ^~~~~~~~~~~~~~~~~~~~~~~ drivers/gpu/drm/msm/msm_gem_shrinker.c: In function 'evict': drivers/gpu/drm/msm/msm_gem_shrinker.c:117:35: error: parameter name omitted evict(struct drm_gem_object obj, struct ww_acquire_ctx ) ^~~~~~~~~~~~~~~~~~~~~~~ Restore the parameter name to clear up the warnings, renaming it "unused" to make it clear it is only needed to satisfy the prototype of drm_gem_lru_scan(). Cc: stable@vger.kernel.org Fixes: `3392291fc5` ("drm/msm: Fix shrinker deadlock") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-05-24 10:31:24 -07:00
Dave Airlie	84335a9985	Merge tag 'drm-xe-fixes-2026-05-21' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - SRIOV related fixes (Wajdeczko, Mohanram) - Fix leak and double-free (Lin) - Multi-cast register fixes (Gustavo) - Multi-queue fix (Niranjana) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/ag9rR5VwCdkA0lzI@intel.com	2026-05-23 07:57:08 +10:00
Dave Airlie	4378a41165	Merge tag 'mediatek-drm-fixes-20260521' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes - 20260521 1. fix sparse warnings Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://patch.msgid.link/20260521135649.4681-1-chunkuang.hu@kernel.org	2026-05-22 08:31:08 +10:00
Dave Airlie	71d9e1561a	Merge tag 'drm-misc-fixes-2026-05-21' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: amdxdna: - remove mmap and export for ubuf bridge: - chipone-icn6211: managed bridge cleanup - lt66121: acquire reset GPIO - megachips: fix clean up on failed IRQ requests gem: - clean up LRU locking v3d: - fix UAF in error code paths - release GEM-object ref on free'd jobs virtio: - use uninterruptible resv locking in plane updates Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20260521071456.GA14644@localhost.localdomain	2026-05-22 07:01:04 +10:00
Shuicheng Lin	4d25342543	drm/xe/oa: Fix exec_queue leak on width check in stream open In xe_oa_stream_open_ioctl(), when param.exec_q->width > 1 the function returns -EOPNOTSUPP directly, skipping the existing err_exec_q cleanup path. The exec_queue reference obtained by xe_exec_queue_lookup() is leaked. The exec queue holds a reference on the xe_file, which is only dropped during queue teardown. The leaked lookup ref is not on the file's exec_queue xarray, so file close cannot release it. This keeps both the exec queue and the file private state pinned indefinitely. Jump to err_exec_q instead of returning directly so the reference is released. Fixes: `f0ed39830e` ("xe/oa: Fix query mode of operation for OAR/OAC") Assisted-by: Claude:claude-opus-4.6 Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20260514203210.593488-1-shuicheng.lin@intel.com Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> (cherry picked from commit 339fa0be9e4a5d69fa47e91f4a36574224fb478f) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-05-21 09:56:49 -04:00
Dave Airlie	aee43aaf26	Merge tag 'amd-drm-fixes-7.1-2026-05-20' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-7.1-2026-05-20: amdgpu: - Userq fixes - VPE fix - SMU 15 fix - Misc fixes - VCE fixes - DC bios parsing fixes - DC aux fix - Mode1 reset fix - RAS fixes amdkfd: - Misc fixes radeon: - CS parser fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patch.msgid.link/20260520181359.28421-1-alexander.deucher@amd.com	2026-05-21 16:29:45 +10:00
Dave Airlie	30afd245e2	Merge tag 'drm-intel-fixes-2026-05-20' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Fix joiner color pipeline selection [display] (Chaitanya Kumar Borah) - Fix readback for target_rr in Adaptive Sync SDP [dp] (Ankit Nautiyal) - Apply Intel DPCD workaround when SDP on prior line used [psr] (Jouni Högander) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tursulin@igalia.com> Link: https://patch.msgid.link/ag1hKBRKwwv9JOMW@linux	2026-05-21 11:50:28 +10:00
Dave Airlie	5b4a47dc54	Merge tag 'drm-msm-fixes-2026-05-17' of https://gitlab.freedesktop.org/drm/msm into drm-fixes Fixes for v7.1: Core: - Fixed bindings for SM8650, SM8750 and Eliza - Don't use UTS_RELEASE directly - Fix typo in clock-names property DPU: - Fixed CWB description on Kaanapali - Fixed scanline strides for YUV UBWC formats - Stopped DSI register dumping to access past the end of region DSI: - Fix dumping unaligned regions GPU: - Fix GMEM_BASE for a6xx gen3 - Fix userspace reachable crash on a2xx-a4xx - Fix sysprof_active for counter collection with IFPC enabled GPUs - Fix shrinker lockdep Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <rob.clark@oss.qualcomm.com> Link: https://patch.msgid.link/CACSVV02cTK7h=d0uqanRE-cj35THDqFjqsTB_2zQV1Mcw77aNw@mail.gmail.com	2026-05-21 10:12:22 +10:00
Deepanshu Kartikey	9af1b6e175	drm/virtio: use uninterruptible resv lock for plane updates virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush() lock the framebuffer BO's dma_resv via virtio_gpu_array_lock_resv() and ignore its return value. The function can fail with -EINTR from dma_resv_lock_interruptible() (signal during lock wait) or with -ENOMEM from dma_resv_reserve_fences() (fence slot allocation), leaving the resv lock not held. The queue path then walks the object array and calls dma_resv_add_fence(), which requires the lock held; with lockdep enabled this trips dma_resv_assert_held(): WARNING: drivers/dma-buf/dma-resv.c:296 at dma_resv_add_fence+0x71e/0x840 Call Trace: virtio_gpu_array_add_fence virtio_gpu_queue_ctrl_sgs virtio_gpu_queue_fenced_ctrl_buffer virtio_gpu_cursor_plane_update drm_atomic_helper_commit_planes drm_atomic_helper_commit_tail commit_tail drm_atomic_helper_commit drm_atomic_commit drm_atomic_helper_update_plane __setplane_atomic drm_mode_cursor_universal drm_mode_cursor_common drm_mode_cursor_ioctl drm_ioctl __x64_sys_ioctl Beyond the WARN, mutating the dma_resv fence list without the lock races with concurrent readers/writers and can corrupt the list. Both call sites run inside the .atomic_update plane callback, which DRM atomic helpers do not allow to fail (by the time it runs, the commit has been signed off to userspace and there is no clean rollback path). Moving the lock acquisition to .prepare_fb was rejected because the broader lock scope deadlocks against other BO locking paths in the same atomic commit. Introduce virtio_gpu_lock_one_resv_uninterruptible() that uses dma_resv_lock() instead of dma_resv_lock_interruptible(). This eliminates the -EINTR failure mode -- the realistic syzbot trigger -- without extending the lock hold across the commit. The helper locks a single BO and rejects nents > 1 with -EINVAL; both fix sites lock exactly one BO. Use it from virtio_gpu_cursor_plane_update() and virtio_gpu_resource_flush(); check the return value to handle the remaining -ENOMEM case from dma_resv_reserve_fences() by freeing the objs and skipping the plane update for that frame. The framebuffer BOs touched here are not shared with other contexts and lock contention is expected to be brief, so the loss of signal-interruptibility is acceptable. Other callers of virtio_gpu_array_lock_resv() (the ioctl paths) continue to use the interruptible variant. The bug was reported by syzbot, triggered via fault injection (fail_nth) on the DRM_IOCTL_MODE_CURSOR path, which forces the -ENOMEM branch in dma_resv_reserve_fences(). Reported-by: syzbot+72bd3dd3a5d5f39a0271@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=72bd3dd3a5d5f39a0271 Fixes: `5cfd31c5b3` ("drm/virtio: fix virtio_gpu_cursor_plane_update().") Cc: stable@vger.kernel.org Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Link: https://patch.msgid.link/20260519082247.34470-1-kartikey406@gmail.com	2026-05-20 18:12:11 +03:00
Christian König	b6fe4ff340	drm/amdgpu: fix handling in amdgpu_userq_create Well mostly the same issues the other code had as well: 1. Memory allocation while holding the userq_mutex lock is forbidden! 2. Things were created/started/published in the wrong order. 3. The reset lock was taken in the wrong order and seems to be unnecessary in the first place. 4. Error messages on invalid input parameters can spam the logs. 5. Error messages on memory allocation failures are usually superfluous as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 89e50de5654dbe7a137e03d78629542e17ba7202)	2026-05-19 12:25:32 -04:00
Vitaliy Triang3l Kuzmin	d093c01d30	drm/radeon/evergreen_cs: Add missing NULL prefix check in surface check 'evergreen_surface_check' is called with a NULL warning prefix when handling potentially recoverable issues or just to compute the alignment requirements, and 'evergreen_surface_check' is called again in case of failure (with the correct prefix, as opposed to NULL), therefore, the initial check must not print a warning, because the surface may be accepted successfully after having been corrected, however if it isn't, the final check will print the warning anyway. The surface check functions specific to array modes already implement this behavior, but the 'evergreen_surface_check' function itself doesn't. This is also supposed to fix the "'%s' directive argument is null [-Werror=format-overflow=]" compiler warning. Fixes: `285484e2d5` ("drm/radeon: add support for evergreen/ni tiling informations v11") Reported-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Vitaliy Triang3l Kuzmin <ml@triang3l.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e20ea411c99f6968af35fd03e9ee21f70d799144)	2026-05-19 12:16:16 -04:00
Sunil Khatri	b6a28b77b8	drm/amdgpu: userq_va_mapped should remain true once done Multiple queues need these bo_va objects belonging to the same uq_mgr. So once they are mapped, let's not unmap them, as at any point in time any of the queues might be using them. Also, userq_va_mapped should be a boolean rather than an atomic. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5c02889ea22575c3bcfdf212e65fac316cbc6c6a)	2026-05-19 12:15:49 -04:00
Ce Sun	cd7cfcdb4d	drm/amdgpu: avoid integer overflow in VA range check The original addition operation in 64-bit unsigned type may encounter overflow situations. To prevent such issues and safely reject invalid inputs, the check_add_overflow() function is used. Signed-off-by: Ce Sun <cesun102@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit cc768f4dd0bb9083c813683eeec44fc23921f771)	2026-05-19 12:15:41 -04:00
Xiang Liu	893fea60f8	drm/amd/ras: Fix UMC error address allocation leak amdgpu_umc_handle_bad_pages() allocates err_data->err_addr before querying UMC error information. In the direct and firmware query paths, the pointer is reassigned to a fresh allocation before the original buffer is released, so the initial allocation is leaked on each handled event. Free the existing buffer before replacing it in those query paths so the function exit cleanup only owns the active allocation. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 911b1bdd22c3712a22b60fcc58f7b9f2d07b0803)	2026-05-19 12:15:24 -04:00
Yifan Zhang	353f7430d1	drm/amdgpu: unmap all user mappings of framebuffer and doorbell before mode1 reset During Mode 1 reset, the ASIC undergoes a reset cycle and becomes temporarily inaccessible via PCIe. Any attempt to access framebuffer or MMIO registers during this window can result in uncompleted PCIe transactions, leading to NMI panics or system hangs. To prevent this, Unmap all of the applications mappings of the framebuffer and doorbell BARs before mode1 reset. Also prevent new mappings from coming in during the reset process. v2: remove inode in kfd_dev (Christian) v3: correct unmap offset (Felix), remove prevent new mappings part to avoid deadlock (Christian) Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 70cadefcc6160c575b04f763ada34c20e868d577)	2026-05-19 12:14:55 -04:00
Harry Wentland	6c92f6d960	drm/amd/display: Validate payload length and link_index in dc_process_dmub_aux_transfer_async [Why&How] dc_process_dmub_aux_transfer_async() copies payload->length bytes into a 16-byte stack buffer (dpaux.data[16]) guarded only by an ASSERT(), which is a no-op in release builds. If a caller ever passes length > 16 this results in a stack buffer overflow via memcpy. Additionally, link_index is used to dereference dc->links[] without bounds checking against dc->link_count, risking an out-of-bounds access. Replace the ASSERT with a hard runtime check that returns false when payload->length exceeds the destination buffer size, and add a bounds check for link_index before it is used. Assisted-by: GitHub Copilot:Claude claude-4-opus Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ba4caa9fecdf7a38f98c878ad05a8a64148b6881) Cc: stable@vger.kernel.org	2026-05-19 12:13:56 -04:00
Harry Wentland	86d2b20644	drm/amd/display: Validate GPIO pin LUT table size before iterating [Why&How] The GPIO pin table parsers in get_gpio_i2c_info() and bios_parser_get_gpio_pin_info() derive an element count from the VBIOS table_header.structuresize field, then iterate over gpio_pin[] entries. However, GET_IMAGE() only validates that the table header itself fits within the BIOS image. If the VBIOS reports a structuresize larger than the actual mapped data, the loop reads past the end of the BIOS image, causing an out-of-bounds read. Fix this by calling bios_get_image() to validate that the full claimed structuresize is accessible within the BIOS image before entering the loop in both functions. Assisted-by: GitHub Copilot:claude-opus-4-6 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ba5e95b43b773ae1bf1f66ee6b31eb774e65afe3) Cc: stable@vger.kernel.org	2026-05-19 12:13:29 -04:00
Harry Wentland	cd86529ec6	drm/amd/display: Fix integer overflow in bios_get_image() [Why&How] The bounds check in bios_get_image() computes 'offset + size' using unsigned 32-bit arithmetic before comparing against bios_size. If a VBIOS image contains a near-UINT32_MAX offset the addition wraps to a small value, the comparison passes, and the function returns a wild pointer past the VBIOS mapping. Additionally, the comparison uses '<' (strict), which incorrectly rejects the valid exact-fit case where offset + size == bios_size. Fix both issues by restructuring the check to avoid the addition entirely: first reject if offset alone exceeds bios_size, then check size against the remaining space (bios_size - offset). This eliminates the overflow and correctly permits exact-fit accesses. Assisted-by: GitHub Copilot:claude-opus-4.6 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d40fb392af659c4a02b560319f226842f6ec1a95) Cc: stable@vger.kernel.org	2026-05-19 12:13:07 -04:00
David Francis	6dc2c49a70	drm/amdkfd: Check bounds for allocate_sdma_queue restore_sdma_id allocate_sdma_queue has an option where the sdma queue id can be specified (used by CRIU). We weren't bounds-checking that value. Confirm it's less than the maximum number of queues. Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit bfe9a7545b2a7be1c543f1741e16f2d5ec4116ae)	2026-05-19 12:11:43 -04:00
Sunil Khatri	0978406224	drm/amdgpu: use atomic operation to achieve lockless serialization In amdgpu_seq64_alloc there is a possibility that two different cores from two separate NODES can try to get, and could get, the same free slot. So this fixes that race using atomic test_and_set and clear operations. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4d50a14d346141e03a7c3905e496d91e048bc30c)	2026-05-19 12:11:34 -04:00
David Francis	a1d4b228e3	drm/amdkfd: Check bounds on allocate_doorbell allocate_doorbell has an option to set the doorbell id to a specific value (used by CRIU). This value was not bounds checked. Check to confirm it's less than KFD_MAX_NUM_OF_QUEUES_PER_PROCESS. Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1f087bb8cf9e8797633da35c85435e557ef74d06)	2026-05-19 12:11:26 -04:00
Timur Kristóf	0c61a9732a	drm/amdgpu/vce3: Fix VCE 3 firmware size and offsets The VCPU BO contains the actual FW at an offset, but it was not calculated into the VCPU BO size. Subtract this from the FW size to make sure there is no out of bounds access. This may fix VM faults when using VCE 3. Cc: John Olender <john.olender@gmail.com> Fixes: `e982262214` ("drm/amdgpu: recalculate VCE firmware BO size") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 15c369257bd85f47a514744f960c5a51c867716f)	2026-05-19 12:11:07 -04:00
Timur Kristóf	5dc3d16cd0	drm/amdgpu/vce2: Fix VCE 2 firmware size and offsets The VCPU BO contains the actual FW at an offset, but it was not calculated into the VCPU BO size. Subtract this from the FW size to make sure there is no out of bounds access. Additionally, increase the VCE_V2_0_DATA_SIZE to have extra space after the VCE handles. Also increase the data size used for each VCE handle. The FW needs 23744 bytes, use 24K to be safe. This fixes VM faults when using VCE 2. Cc: John Olender <john.olender@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4802 Fixes: `e982262214` ("drm/amdgpu: recalculate VCE firmware BO size") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a20d21df625548c1738c0745f753c5d6eb823bc3)	2026-05-19 12:11:00 -04:00
Timur Kristóf	f5a247e037	drm/amdgpu/vce1: Stop using amdgpu_vce_resume The VCE1 firmware works slightly differently and is already loaded by vce_v1_0_load_fw(). It doesn't actually need to call amdgpu_vce_resume(). Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 33d8951405e2dd81ac61edebc680e2dfb6b4fc9f)	2026-05-19 12:10:53 -04:00
Timur Kristóf	3e5a1d5bb2	drm/amdgpu/vce1: Fix VCE 1 firmware size and offsets The VCPU BO contains the actual FW at an offset, but it was not calculated into the VCPU BO size. Subtract this from the FW size to make sure there is no out of bounds access. Make sure the stack and data offsets are aligned to the 32K TLB size. Check that the FW microcode actually fits in the space that is reserved for it. Fixes: `d4a640d4b9` ("drm/amdgpu/vce1: Implement VCE1 IP block (v2)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c16fe59f622a080fc457a57b3e8f14c780699449)	2026-05-19 12:10:45 -04:00
Timur Kristóf	3ebcab1132	drm/amdgpu/vce1: Don't repeat GTT MGR node allocation Only allocate entries from the GTT manager when the VCE GTT node is not allocated yet. This prevents the possibility of allocating them multiple times, which causes issues during GPU reset and suspend/resume. Fixes: `71aec08f80` ("amdgpu/vce: use amdgpu_gtt_mgr_alloc_entries") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 8d2a20c1721cb17e22821e1b4ecbb02d475d91c5)	2026-05-19 12:10:37 -04:00
Timur Kristóf	12b60cf345	drm/amdgpu/vce1: Check if VRAM address is lower than GART. Previously, I had assumed this was not possible so it was OK to not handle it, but now we got a report from a user who has a board that is configured this way. When the VCPU BO is already located in a low 32-bit address in VRAM (eg. when VRAM is mapped to the low address space), don't do the workaround. Fixes: `71aec08f80` ("amdgpu/vce: use amdgpu_gtt_mgr_alloc_entries") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f370ec9b164698a9ca1a7b59bfbea07f70df769d)	2026-05-19 12:10:31 -04:00
Timur Kristóf	d993851b6d	drm/amdgpu/vce1: Remove superfluous address check The same thing is already checked a few lines above. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c1dc555e760dbfc4a4710f7270f525a03d433af8)	2026-05-19 12:10:25 -04:00
Timur Kristóf	9f907adb66	drm/amdgpu/vce1: Check that the GPU address is < 128 MiB When ensuring the low 32-bit address, make sure it is less than 128 MiB, otherwise the VCE seems to fail to initialize. This seems to be an undocumented limitation of the firmware validation mechanism. Note that in case of VCE1 the BAR address is zero and we can't change it also due to the firmware validator. When programming the mmVCE_VCPU_CACHE_OFFSETn registers, don't AND them with a mask. This is incorrect because the register mask is actually 0x0fffffff and useless because we already ensure the addresses are below the limit. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e729ae5f3ac73c861c062080ac8c3d666c972404)	2026-05-19 12:10:19 -04:00
Timur Kristóf	4d798ea071	drm/amdgpu: Align amdgpu_gtt_mgr entries to TLB size on Tahiti (v2) The TLB is organized in groups of 8 entries, each one is 4K. On Tahiti, the HW requires these GART entries to be 32K-aligned. This fixes a VCE 1 firmware validation failure that can happen after suspend/resume since we use amdgpu_gtt_mgr for VCE 1. v2: - Change variable declaration order - Add comment about "V bit HW bug" Fixes: `698fa62f56` ("drm/amdgpu: Add helper to alloc GART entries") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 530411b465ef0b2c0cc18c2e3d7e38422b1117d1)	2026-05-19 12:10:13 -04:00
Sunday Clement	48b13bfbdf	drm/amdkfd: Fix OOB memory exposure in get_wave_state() The get_wave_state() function for v9 trusts cp_hqd_cntl_stack_size and cp_hqd_cntl_stack_offset values read directly from the MQD, which are written by GPU microcode and fully attacker-controlled on the CRIU-restore path (via AMDKFD_IOC_RESTORE_PROCESS with H3). This leads to an unbounded copy_to_user() that can leak adjacent GTT/kernel memory. If offset > size, integer underflow produces a ~4 GiB read length; if size is set to 1 MiB against a 4 KiB allocation, we leak 1 MiB of adjacent kernel memory (other queues' MQDs, ring buffers, KASLR pointers). Fix by clamping both cp_hqd_cntl_stack_size to the actual allocated buffer size (q->ctl_stack_size) and cp_hqd_cntl_stack_offset to the clamped size before performing arithmetic and copy_to_user(). This ensures we never read beyond the allocated kernel BO regardless of attacker-supplied MQD field values. Signed-off-by: Sunday Clement <Sunday.Clement@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7ef144458f48d5589e36f1b3d83e83db2e5c5ba5)	2026-05-19 12:10:04 -04:00
Yang Wang	d796558def	drm/amd/pm: fix memleak of dpm_policies on smu v15 In smu_v15_0_fini_smc_tables, dpm_policies was not freed or NULLed, causing a memory leak. Add kfree() and NULL assignment to properly release memory and avoid dangling pointers. Fixes: `2beedc3a92` ("drm/amd/pm: Add initial support for smu v15_0_8") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 014f329074f688b9b49383e8b70e79e9ef99359e)	2026-05-19 12:09:43 -04:00
Lijo Lazar	9baf02bf88	drm/amdgpu: Fix discovery offset check under VF Discovery table may be kept at offset 0 by host driver. Remove the validation check. Fixes: `01bdc7e219` ("drm/amdgpu: New interface to get IP discovery binary v3") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Ellen Pan <yunru.pan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d3f5bbd007133c64a20e81ef290a93e46c75df40)	2026-05-19 12:09:09 -04:00
Sunil Khatri	d892a6eca7	drm/amdgpu: remove va cursors for all mappings va_cursor struct needs to be cleaned even if the mapping has been removed already. Also simplify it by make it a void function as return value check isn't needed as its called during tear down. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 4d35a45c9b4c1ac5b6e3219f83c3db706b675fa2)	2026-05-19 12:09:01 -04:00
Amir Shetaia	2c34c7b88b	drm/amdgpu: reject non-user addresses early in GEM_USERPTR ioctl amdgpu_gem_userptr_ioctl() currently accepts any value of args->addr and only discovers an out-of-range pointer much later, inside amdgpu_gem_object_create() and the HMM mirror registration path. Userspace can drive that path with kernel-side virtual addresses; the get_user_pages() layer rejects them, but only after the driver has already allocated a GEM object and started wiring up notifier state that then has to be torn down on failure. Add an access_ok() guard at the top of the ioctl, right after the existing page-alignment check and before flag validation, so any address that does not lie within the calling task's user address range is rejected with -EFAULT before any allocation occurs. No legitimate ROCm/HSA userspace passes kernel-mode pointers through this interface, so this is defense-in-depth rather than a behaviour change for valid callers; -EFAULT matches the convention already used by other uaccess-style rejections in the kernel. Also add an explicit #include <linux/uaccess.h>; access_ok() is otherwise only available transitively through other headers in this translation unit. Signed-off-by: Amir Shetaia <Amir.Shetaia@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7a076df36397d780d7e4fb595287b4980451a7f5)	2026-05-19 12:08:47 -04:00
Alan Liu	b6074630a4	drm/amdgpu/vpe: Force collaborate sync after TRAP VPE1 could possibly hang and fail to power off at the end of commands in collaboration mode. This workaround adds a COLLAB_SYNC after TRAP to force instances synchronized to avoid VPE1 fail to power off. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alan liu <haoping.liu@amd.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/5171 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a8b749c5c5afb7e5daa2bfb95d958fb3c6b8f055) Cc: stable@vger.kernel.org	2026-05-19 12:08:13 -04:00
Sunil Khatri	0be97436bf	drm/amdgpu/userq: update the vm task info during signal ioctl Page faults do not have process information correctly populated, as vm->task is not set during vm_init but should be updated during real submission. So set that up during signal_ioctl to get the correct submission process details. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a9b14d88b4d83e21ab965f23d1fb7b07b87e0517)	2026-05-19 12:08:02 -04:00
Sunil Khatri	291df3dc7d	drm/amdgpu/userq: cancel reset work while tear down in progress While tear down of a userq_mgr is happening when all the queues are free we should cancel any reset work if pending before exiting. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 160164609f71f774c4f661227a9b7a370a86b112)	2026-05-19 12:07:52 -04:00
Christian König	c8ed2de0f2	drm/amdgpu: rework userq reset work handling It is illegal to schedule reset work from another reset work! Fix this by scheduling the userq reset work directly on the work queue of the reset domain. Not fully tested, I leave that to the IGT test cases. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit fd9200ccefab94f27877d1943761d6b0ccbd89c8)	2026-05-19 12:07:42 -04:00
Sunil Khatri	be045c5c83	drm/amdgpu/userq: pin mqd and fw object bo to avoid eviction mqd and fw objects are queue core objects which should remain valid and never be unmapped or evicted for user queues to work properly. If these buffers are evicted, the hw continues to use the invalid addresses, causing page faults and system hangs. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a3bbf32a336939a1d21b9561f8e53333b684b7ef)	2026-05-19 12:07:36 -04:00
Sunil Khatri	eb359cc314	drm/amdgpu/userq: use drm_exec in amdgpu_userq_fence_read_wptr To access the bo from vm mapping first lock the root bo and then the object bo of the mapping to make sure both locks are taken safely. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 3aab50410653fe7eb35eb6f9c2b27e3549ab09e6)	2026-05-19 12:07:28 -04:00
Niranjana Vishwanathapura	00907da212	drm/xe/multi_queue: Fix secondary queue error case If xe_lrc_create() fails, the secondary queue added to the multi-queue group list is not removed before freeing the queue. Fix error path handling for secondary queues by removing it from the multi-queue group list at the right place. Reported-by: Sebastian Österlund <sebastian.osterlund@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/work_items/7979 Fixes: `d716a5088c` ("drm/xe/multi_queue: Handle tearing down of a multi queue") Cc: stable@vger.kernel.org # v7.0+ Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20260518191639.320890-2-niranjana.vishwanathapura@intel.com (cherry picked from commit d2d23c12789cf69eddc35b8d38cd8eaabd0168f1) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-05-19 11:29:50 -04:00
Osama Abdelkader	d45d5c819f	drm/bridge: megachips: remove bridge when irq request fails If devm_request_threaded_irq() fails after drm_bridge_add(), remove the bridge before returning. Keep drm_bridge_add() rather than devm_drm_bridge_add(): registration is tied to the STDP4028 device while ge_b850v3_register() may complete from either I2C probe; devm would not unwind the bridge if the other client's probe fails. Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com> Fixes: `fcfa0ddc18` ("drm/bridge: Drivers for megachips-stdpxxxx-ge-b850v3-fw (LVDS-DP++)") Cc: stable@vger.kernel.org Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Tested-by: Ian Ray <ian.ray@gehealthcare.com> Link: https://patch.msgid.link/20260430195700.80317-1-osama.abdelkader@gmail.com Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>	2026-05-19 10:46:33 +02:00
Osama Abdelkader	73d01051e8	drm/bridge: chipone-icn6211: use devm_drm_bridge_add in i2c probe Use devm_drm_bridge_add() so the bridge is released if probe fails after registration, and drop drm_bridge_remove() in chipone_i2c_probe. Signed-off-by: Osama Abdelkader <osama.abdelkader@gmail.com> Fixes: `8dde6f7452` ("drm: bridge: icn6211: Add I2C configuration support") Cc: stable@vger.kernel.org Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Link: https://patch.msgid.link/20260430194944.78119-1-osama.abdelkader@gmail.com Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>	2026-05-19 10:14:06 +02:00
Jouni Högander	4703049f76	drm/i915/psr: Apply Intel DPCD workaround when SDP on prior line used There is Intel specific workaround DPCD address containing workaround for case where SDP is on prior line. Apply this workaround according to values in the offset. Fixes: `61e887329e` ("drm/i915/xelpd: Handle PSR2 SDP indication in the prior scanline") Cc: <stable@vger.kernel.org> # v5.15+ Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patch.msgid.link/20260515095756.2799483-4-jouni.hogander@intel.com (cherry picked from commit c3fe899fbeac86ea4a5ca9dd845b2cbc0da46249) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2026-05-19 08:03:17 +01:00
Jouni Högander	f30bece421	drm/i915/psr: Read Intel DPCD workaround register Read Intel DPCD workaround register and store it into intel_connector->dp.psr_caps. psr_caps was chosen as currently it contains only PSR workaround for PSR2 SDP on prior scanline implementation. Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patch.msgid.link/20260515095756.2799483-3-jouni.hogander@intel.com (cherry picked from commit c48ff24d0f4ab7ad696b2d35ad64ce7e049c668c) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2026-05-19 08:03:10 +01:00
Jouni Högander	fbceb39b53	drm/i915/psr: Add defininitions for INTEL_WA_REGISTER_CAPS DPCD register EDP specification says: "If either VSC SDP is unable to be transmitted 100 ns before the SU region, the Source device may optionally transmit the VSC SDP during the prior video scan line’s HBlank period There is a Intel specific drm dp register currently containing bits related how TCON can support PSR2 with SDP on prior line." Unfortunately many panels are having problems in implementing this. So there is a custom Intel specific DPCD register (INTEL_WA_REGISTER_CAPS) to figure out if this is properly implemented on a panel or if panel doesn't require that 100 ns delay before the SU region. Here are the definitions in this custom DPCD address: 0 = Panel doesn't support SDP on prior line 1 = Panel supports SDP on prior line 2 = Panel doesn't have 100ns requirement 3 = Reserved Add definitions for this new register and it's values into new header intel_dpcd.h. v2: add INTEL_DPCD_ prefix to definitions Bspec: 74741 Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patch.msgid.link/20260515095756.2799483-2-jouni.hogander@intel.com (cherry picked from commit 1da1c9294825f08f622c473480d185680c2a3b75) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2026-05-19 08:02:56 +01:00
Ankit Nautiyal	f87abd0c66	drm/i915/dp: Fix readback for target_rr in Adaptive Sync SDP Correct the bit-shift logic to properly readback the 10 bit target_rr from DB3 and DB4. v2: Align the style with readback for vtotal. (Ville) Fixes: `12ea892916` ("drm/i915/dp: Add Read/Write support for Adaptive Sync SDP") Cc: Mitul Golani <mitulkumar.ajitkumar.golani@intel.com> Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260511123218.1589830-2-ankit.k.nautiyal@intel.com (cherry picked from commit f7abc4af2b19240a145a221461dfe756cc01d74a) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2026-05-19 07:45:25 +01:00
Chaitanya Kumar Borah	86ed2d96db	drm/i915/display: Copy color pipeline from plane in the primary joiner pipe When copying plane color state in a joiner configuration, use the plane in the primary joiner pipe since it carries the pipeline number selected by the user-space. This assumes that all pipes in the joiner are symmetric in their plane color capabilities. Cc: stable@vger.kernel.org # v6.19+ Fixes: `a78f1b6baf` ("drm/i915/color: Add framework to program CSC") Tested-by: Vidya Srinivas <vidya.srinivas@intel.com> Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Link: https://patch.msgid.link/20260511053213.3122314-2-chaitanya.kumar.borah@intel.com (cherry picked from commit e8308fb5e05ca08ddfb8b46f6d947a6e3fd80cd7) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2026-05-19 07:45:23 +01:00
Maíra Canal	6eb6e5acaf	drm/v3d: Release indirect CSD GEM reference on CPU job free v3d_get_cpu_indirect_csd_params() takes a reference to the indirect BO via drm_gem_object_lookup() and stashes it in cpu_job->indirect_csd.indirect, but nothing on the CPU job teardown path ever drops that reference. Drop the extra reference in v3d_cpu_job_free(). The NULL check covers ioctl errors before the lookup ran and CPU job types other than V3D_CPU_JOB_TYPE_INDIRECT_CSD, which leave the field zero-initialised. Cc: stable@vger.kernel.org Fixes: `18b8413b25` ("drm/v3d: Create a CPU job extension for a indirect CSD job") Assisted-by: Claude:claude-opus-4.7 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Link: https://patch.msgid.link/20260515-v3d-cpu-job-leaks-v1-2-7f147cbbf935@igalia.com Signed-off-by: Maíra Canal <mcanal@igalia.com>	2026-05-18 19:59:51 -03:00

124604 Commits