linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-16 19:31:42 -04:00

Author	SHA1	Message	Date
Dave Airlie	5ba61d8a25	Merge tag 'mediatek-drm-fixes-20260323' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes - 20260323 1. dsi: Store driver data before invoking mipi_dsi_host_register Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://patch.msgid.link/20260323160135.39609-1-chunkuang.hu@kernel.org	2026-03-28 08:05:36 +10:00
Dave Airlie	83318d0c1f	Merge tag 'drm-xe-fixes-2026-03-26' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Fix UAF in SRIOV migration restore (Winiarski) - Updates to HW W/a (Roper) - VMBind remap fix (Auld) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/acUgq2q2DrCUzFql@intel.com	2026-03-27 17:48:48 +10:00
Dave Airlie	aab01a8808	Merge tag 'drm-misc-fixes-2026-03-26' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes A page mapping fix for shmem fault handler, a power-off fix for ivpu, a GFP_* flag fix for syncobj, and a MAINTAINERS update. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patch.msgid.link/20260326-lush-cuddly-limpet-ab2aa9@houat	2026-03-27 17:46:26 +10:00
Dave Airlie	355223cb84	Merge tag 'drm-intel-fixes-2026-03-26' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - DP tunnel error handling fix - Spurious GMBUS timeout fix - Unlink NV12 planes earlier - Order OP vs. timeout correctly in __wait_for() Signed-off-by: Dave Airlie <airlied@redhat.com> From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patch.msgid.link/acTdjAoOGkzl3dcc@jlahtine-mobl	2026-03-27 17:19:34 +10:00
Matthew Auld	bfe9e314d7	drm/xe: always keep track of remap prev/next During 3D workload, user is reporting hitting: [ 413.361679] WARNING: drivers/gpu/drm/xe/xe_vm.c:1217 at vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe], CPU#7: vkd3d_queue/9925 [ 413.361944] CPU: 7 UID: 1000 PID: 9925 Comm: vkd3d_queue Kdump: loaded Not tainted 7.0.0-070000rc3-generic #202603090038 PREEMPT(lazy) [ 413.361949] RIP: 0010:vm_bind_ioctl_ops_unwind+0x1e2/0x2e0 [xe] [ 413.362074] RSP: 0018:ffffd4c25c3df930 EFLAGS: 00010282 [ 413.362077] RAX: 0000000000000000 RBX: ffff8f3ee817ed10 RCX: 0000000000000000 [ 413.362078] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 413.362079] RBP: ffffd4c25c3df980 R08: 0000000000000000 R09: 0000000000000000 [ 413.362081] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8f41fbf99380 [ 413.362082] R13: ffff8f3ee817e968 R14: 00000000ffffffef R15: ffff8f43d00bd380 [ 413.362083] FS: 00000001040ff6c0(0000) GS:ffff8f4696d89000(0000) knlGS:00000000330b0000 [ 413.362085] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033 [ 413.362086] CR2: 00007ddfc4747000 CR3: 00000002e6262005 CR4: 0000000000f72ef0 [ 413.362088] PKRU: 55555554 [ 413.362089] Call Trace: [ 413.362092] <TASK> [ 413.362096] xe_vm_bind_ioctl+0xa9a/0xc60 [xe] Which seems to hint that the vma we are re-inserting for the ops unwind is either invalid or overlapping with something already inserted in the vm. It shouldn't be invalid since this is a re-insertion, so must have worked before. Leaving the likely culprit as something already placed where we want to insert the vma. Following from that, for the case where we do something like a rebind in the middle of a vma, and one or both mapped ends are already compatible, we skip doing the rebind of those vma and set next/prev to NULL. As well as then adjust the original unmap va range, to avoid unmapping the ends. However, if we trigger the unwind path, we end up with three va, with the two ends never being removed and the original va range in the middle still being the shrunken size. If this occurs, one failure mode is when another unwind op needs to interact with that range, which can happen with a vector of binds. For example, if we need to re-insert something in place of the original va. In this case the va is still the shrunken version, so when removing it and then doing a re-insert it can overlap with the ends, which were never removed, triggering a warning like above, plus leaving the vm in a bad state. With that, we need two things here: 1) Stop nuking the prev/next tracking for the skip cases. Instead relying on checking for skip prev/next, where needed. That way on the unwind path, we now correctly remove both ends. 2) Undo the unmap va shrinkage, on the unwind path. With the two ends now removed the unmap va should expand back to the original size again, before re-insertion. v2: - Update the explanation in the commit message, based on an actual IGT of triggering this issue, rather than conjecture. - Also undo the unmap shrinkage, for the skip case. With the two ends now removed, the original unmap va range should expand back to the original range. v3: - Track the old start/range separately. vma_size/start() uses the va info directly. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7602 Fixes: `8f33b4f054` ("drm/xe: Avoid doing rebinds") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260318100208.78097-2-matthew.auld@intel.com (cherry picked from commit `aec6969f75`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-03-25 08:05:33 -04:00
Tvrtko Ursulin	06f4297134	drm/syncobj: Fix xa_alloc allocation flags The xarray conversion blindly and wrongly replaced idr_alloc with xa_alloc and kept the GFP_NOWAIT. It should have been GFP_KERNEL to account for idr_preload it removed. Fix it. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Fixes: `fec2c3c01f` ("drm/syncobj: Convert syncobj idr to xarray") Reported-by: Himanshu Girotra <himanshu.girotra@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Himanshu Girotra <himanshu.girotra@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net> Link: https://lore.kernel.org/r/20260324111019.22467-1-tvrtko.ursulin@igalia.com	2026-03-25 08:05:35 +00:00
Alex Deucher	90d239cc53	drm/amd/display: Fix DCE LVDS handling LVDS does not use an HPD pin so it may be invalid. Handle this case correctly in link encoder creation. Fixes: `7c8fb3b8e9` ("drm/amd/display: Add hpd_source index check for DCE60/80/100/110/112/120 link encoders") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5012 Cc: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Cc: Roman Li <roman.li@amd.com> Reviewed-by: Roman Li <roman.li@amd.com> Reviewed-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `3b5620f7ee`) Cc: stable@vger.kernel.org	2026-03-24 13:55:47 -04:00
Donet Tom	4e9597f22a	drm/amdgpu: Handle GPU page faults correctly on non-4K page systems During a GPU page fault, the driver restores the SVM range and then maps it into the GPU page tables. The current implementation passes a GPU-page-size (4K-based) PFN to svm_range_restore_pages() to restore the range. SVM ranges are tracked using system-page-size PFNs. On systems where the system page size is larger than 4K, using GPU-page-size PFNs to restore the range causes two problems: Range lookup fails: Because the restore function receives PFNs in GPU (4K) units, the SVM range lookup does not find the existing range. This will result in a duplicate SVM range being created. VMA lookup failure: The restore function also tries to locate the VMA for the faulting address. It converts the GPU-page-size PFN into an address using the system page size, which results in an incorrect address on non-4K page-size systems. As a result, the VMA lookup fails with the message: "address 0xxxx VMA is removed". This patch passes the system-page-size PFN to svm_range_restore_pages() so that the SVM range is restored correctly on non-4K page systems. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `074fe395fb`)	2026-03-24 13:55:29 -04:00
Yang Wang	28922a43fd	drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v14 Forcibly disable the OD_FAN_CURVE feature when temperature or PWM range is invalid, otherwise PMFW will reject this configuration on smu v14.0.2/14.0.3. example: $ sudo cat /sys/bus/pci/devices/<BDF>/gpu_od/fan_ctrl/fan_curve OD_FAN_CURVE: 0: 0C 0% 1: 0C 0% 2: 0C 0% 3: 0C 0% 4: 0C 0% OD_RANGE: FAN_CURVE(hotspot temp): 0C 0C FAN_CURVE(fan speed): 0% 0% $ echo "0 50 40" \| sudo tee fan_curve kernel log: [ 969.761627] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]! [ 1010.897800] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]! Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `ab4905d466`) Cc: stable@vger.kernel.org	2026-03-24 13:54:42 -04:00
Srinivasan Shanmugam	429aec2bc0	drm/amdkfd: Fix NULL pointer check order in kfd_ioctl_create_process In kfd_ioctl_create_process(), the pointer 'p' is used before checking if it is NULL. The code accesses p->context_id before validating 'p'. This can lead to a possible NULL pointer dereference. Move the NULL check before using 'p' so that the pointer is validated before access. Fixes the below: drivers/gpu/drm/amd/amdgpu/../amdkfd/kfd_chardev.c:3177 kfd_ioctl_create_process() warn: variable dereferenced before check 'p' (see line 3174) Fixes: `cc6b66d661` ("amdkfd: introduce new ioctl AMDKFD_IOC_CREATE_PROCESS") Cc: Zhu Lingshan <lingshan.zhu@amd.com> Cc: Felix Kuehling <felix.kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `19d4149b22`)	2026-03-24 13:54:19 -04:00
Alex Deucher	9da4f9964a	drm/amd/display: check if ext_caps is valid in BL setup LVDS connectors don't have extended backlight caps so check if the pointer is valid before accessing it. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5012 Fixes: `1454642960` ("drm/amd: Re-introduce property to control adaptive backlight modulation") Cc: Mario Limonciello <mario.limonciello@amd.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `3f797396d7`) Cc: stable@vger.kernel.org	2026-03-24 13:53:46 -04:00
Srinivasan Shanmugam	7150850146	drm/amdgpu: Fix fence put before wait in amdgpu_amdkfd_submit_ib amdgpu_amdkfd_submit_ib() submits a GPU job and gets a fence from amdgpu_ib_schedule(). This fence is used to wait for job completion. Currently, the code drops the fence reference using dma_fence_put() before calling dma_fence_wait(). If dma_fence_put() releases the last reference, the fence may be freed before dma_fence_wait() is called. This can lead to a use-after-free. Fix this by waiting on the fence first and releasing the reference only after dma_fence_wait() completes. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c:697 amdgpu_amdkfd_submit_ib() warn: passing freed memory 'f' (line 696) Fixes: `9ae55f030d` ("drm/amdgpu: Follow up change to previous drm scheduler change.") Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `8b9e5259ad`)	2026-03-24 13:53:33 -04:00
Matt Roper	56781a4597	drm/xe: Implement recent spec updates to Wa_16025250150 The hardware teams noticed that the originally documented workaround steps for Wa_16025250150 may not be sufficient to fully avoid a hardware issue. The workaround documentation has been augmented to suggest programming one additional register; make the corresponding change in the driver. Fixes: `7654d51f1f` ("drm/xe/xe2hpg: Add Wa_16025250150") Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://patch.msgid.link/20260319-wa_16025250150_part2-v1-1-46b1de1a31b2@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit `a31566762d`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-03-24 09:29:10 -04:00
Imre Deak	77fcf58df1	drm/i915/dp_tunnel: Fix error handling when clearing stream BW in atomic state Clearing the DP tunnel stream BW in the atomic state involves getting the tunnel group state, which can fail. Handle the error accordingly. This fixes at least one issue where drm_dp_tunnel_atomic_set_stream_bw() failed to get the tunnel group state returning -EDEADLK, which wasn't handled. This lead to the ctx->contended warn later in modeset_lock() while taking a WW mutex for another object in the same atomic state, and thus within the same already contended WW context. Moving intel_crtc_state_alloc() later would avoid freeing saved_state on the error path; this stable patch leaves that simplification for a follow-up. Cc: Uma Shankar <uma.shankar@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: <stable@vger.kernel.org> # v6.9+ Fixes: `a4efae87ec` ("drm/i915/dp: Compute DP tunnel BW during encoder state computation") Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7617 Reviewed-by: Michał Grzelak <michal.grzelak@intel.com> Reviewed-by: Uma Shankar <uma.shankar@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patch.msgid.link/20260320092900.13210-1-imre.deak@intel.com (cherry picked from commit `fb69d0076e`) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2026-03-24 08:00:00 +02:00
Yang Wang	3e6dd28a11	drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v13 Forcibly disable the OD_FAN_CURVE feature when temperature or PWM range is invalid, otherwise PMFW will reject this configuration on smu v13.0.x example: $ sudo cat /sys/bus/pci/devices/<BDF>/gpu_od/fan_ctrl/fan_curve OD_FAN_CURVE: 0: 0C 0% 1: 0C 0% 2: 0C 0% 3: 0C 0% 4: 0C 0% OD_RANGE: FAN_CURVE(hotspot temp): 0C 0C FAN_CURVE(fan speed): 0% 0% $ echo "0 50 40" \| sudo tee fan_curve kernel log: [ 756.442527] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]! [ 777.345800] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]! Closes: https://github.com/ROCm/amdgpu/issues/208 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `470891606c`) Cc: stable@vger.kernel.org	2026-03-23 14:49:15 -04:00
Asad Kamal	2f0e491fae	drm/amd/pm: Return -EOPNOTSUPP for unsupported OD_MCLK on smu_v13_0_6 When SET_UCLK_MAX capability is absent, return -EOPNOTSUPP from smu_v13_0_6_emit_clk_levels() for OD_MCLK instead of 0. This makes unsupported OD_MCLK reporting consistent with other clock types and allows callers to skip the entry cleanly. Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `d82e0a72d9`) Cc: stable@vger.kernel.org	2026-03-23 14:48:52 -04:00
Asad Kamal	cdbc3b62cf	drm/amd/pm: Skip redundant UCLK restore in smu_v13_0_6 Only reapply UCLK soft limits during PP_OD_RESTORE_DEFAULT when the current max differs from the DPM table max. This avoids redundant SMC updates and prevents -EINVAL on restore when no change is needed. Fixes: `b7a9003445` ("drm/amd/pm: Allow setting max UCLK on SMU v13.0.6") Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `17f11bbbc7`)	2026-03-23 14:48:45 -04:00
Alex Hung	37c2caa167	drm/amd/display: Fix drm_edid leak in amdgpu_dm [WHAT] When a sink is connected, aconnector->drm_edid was overwritten without freeing the previous allocation, causing a memory leak on resume. [HOW] Free the previous drm_edid before updating it. Reviewed-by: Roman Li <roman.li@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `52024a94e7`) Cc: stable@vger.kernel.org	2026-03-23 14:48:32 -04:00
Eric Huang	14b81abe7b	drm/amdgpu: prevent immediate PASID reuse case PASID resue could cause interrupt issue when process immediately runs into hw state left by previous process exited with the same PASID, it's possible that page faults are still pending in the IH ring buffer when the process exits and frees up its PASID. To prevent the case, it uses idr cyclic allocator same as kernel pid's. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `8f1de51f49`) Cc: stable@vger.kernel.org	2026-03-23 14:48:06 -04:00
Ruijing Dong	2d300ebfc4	drm/amdgpu: fix strsep() corrupting lockup_timeout on multi-GPU (v3) amdgpu_device_get_job_timeout_settings() passes a pointer directly to the global amdgpu_lockup_timeout[] buffer into strsep(). strsep() destructively replaces delimiter characters with '\0' in-place. On multi-GPU systems, this function is called once per device. When a multi-value setting like "0,0,0,-1" is used, the first GPU's call transforms the global buffer into "0\00\00\0-1". The second GPU then sees only "0" (terminated at the first '\0'), parses a single value, hits the single-value fallthrough (index == 1), and applies timeout=0 to all rings — causing immediate false job timeouts. Fix this by copying into a stack-local array before calling strsep(), so the global module parameter buffer remains intact across calls. The buffer is AMDGPU_MAX_TIMEOUT_PARAM_LENGTH (256) bytes, which is safe for the stack. v2: wrap commit message to 72 columns, add Assisted-by tag. v3: use stack array with strscpy() instead of kstrdup()/kfree() to avoid unnecessary heap allocation (Christian). This patch was developed with assistance from Claude (claude-opus-4-6). Assisted-by: Claude:claude-opus-4-6 Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Ruijing Dong <ruijing.dong@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `94d79f51ef`) Cc: stable@vger.kernel.org	2026-03-23 14:47:47 -04:00
Yussuf Khalil	aed3d041ab	drm/amd/display: Do not skip unrelated mode changes in DSC validation Starting with commit `17ce8a6907` ("drm/amd/display: Add dsc pre-validation in atomic check"), amdgpu resets the CRTC state mode_changed flag to false when recomputing the DSC configuration results in no timing change for a particular stream. However, this is incorrect in scenarios where a change in MST/DSC configuration happens in the same KMS commit as another (unrelated) mode change. For example, the integrated panel of a laptop may be configured differently (e.g., HDR enabled/disabled) depending on whether external screens are attached. In this case, plugging in external DP-MST screens may result in the mode_changed flag being dropped incorrectly for the integrated panel if its DSC configuration did not change during precomputation in pre_validate_dsc(). At this point, however, dm_update_crtc_state() has already created new streams for CRTCs with DSC-independent mode changes. In turn, amdgpu_dm_commit_streams() will never release the old stream, resulting in a memory leak. amdgpu_dm_atomic_commit_tail() will never acquire a reference to the new stream either, which manifests as a use-after-free when the stream gets disabled later on: BUG: KASAN: use-after-free in dc_stream_release+0x25/0x90 [amdgpu] Write of size 4 at addr ffff88813d836524 by task kworker/9:9/29977 Workqueue: events drm_mode_rmfb_work_fn Call Trace: <TASK> dump_stack_lvl+0x6e/0xa0 print_address_description.constprop.0+0x88/0x320 ? dc_stream_release+0x25/0x90 [amdgpu] print_report+0xfc/0x1ff ? srso_alias_return_thunk+0x5/0xfbef5 ? __virt_addr_valid+0x225/0x4e0 ? dc_stream_release+0x25/0x90 [amdgpu] kasan_report+0xe1/0x180 ? dc_stream_release+0x25/0x90 [amdgpu] kasan_check_range+0x125/0x200 dc_stream_release+0x25/0x90 [amdgpu] dc_state_destruct+0x14d/0x5c0 [amdgpu] dc_state_release.part.0+0x4e/0x130 [amdgpu] dm_atomic_destroy_state+0x3f/0x70 [amdgpu] drm_atomic_state_default_clear+0x8ee/0xf30 ? drm_mode_object_put.part.0+0xb1/0x130 __drm_atomic_state_free+0x15c/0x2d0 atomic_remove_fb+0x67e/0x980 Since there is no reliable way of figuring out whether a CRTC has unrelated mode changes pending at the time of DSC validation, remember the value of the mode_changed flag from before the point where a CRTC was marked as potentially affected by a change in DSC configuration. Reset the mode_changed flag to this earlier value instead in pre_validate_dsc(). Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5004 Fixes: `17ce8a6907` ("drm/amd/display: Add dsc pre-validation in atomic check") Signed-off-by: Yussuf Khalil <dev@pp3345.net> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `cc7c7121ae`)	2026-03-23 14:38:56 -04:00
Michał Winiarski	87997b6c65	drm/xe/pf: Fix use-after-free in migration restore When an error is returned from xe_sriov_pf_migration_restore_produce(), the data pointer is not set to NULL, which can trigger use-after-free in subsequent .write() calls. Set the pointer to NULL upon error to fix the problem. Fixes: `1ed30397c0` ("drm/xe/pf: Add support for encap/decap of bitstream to/from packet") Reported-by: Sebastian Österlund <sebastian.osterlund@intel.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7230 Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://patch.msgid.link/20260217154118.176902-1-michal.winiarski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> (cherry picked from commit `4f53d8c6d2`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2026-03-23 10:10:50 -04:00
Ville Syrjälä	bfa71b7a9d	drm/i915: Unlink NV12 planes earlier unlink_nv12_plane() will clobber parts of the plane state potentially already set up by plane_atomic_check(), so we must make sure not to call the two in the wrong order. The problem happens when a plane previously selected as a Y plane is now configured as a normal plane by user space. plane_atomic_check() will first compute the proper plane state based on the userspace request, and unlink_nv12_plane() later clears some of the state. This used to work on account of unlink_nv12_plane() skipping the state clearing based on the plane visibility. But I removed that check, thinking it was an impossible situation. Now when that situation happens unlink_nv12_plane() will just WARN and proceed to clobber the state. Rather than reverting to the old way of doing things, I think it's more clear if we unlink the NV12 planes before we even compute the new plane state. Cc: stable@vger.kernel.org Reported-by: Khaled Almahallawy <khaled.almahallawy@intel.com> Closes: https://lore.kernel.org/intel-gfx/20260212004852.1920270-1-khaled.almahallawy@intel.com/ Tested-by: Khaled Almahallawy <khaled.almahallawy@intel.com> Fixes: `6a01df2f1b` ("drm/i915: Remove pointless visible check in unlink_nv12_plane()") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260316163953.12905-2-ville.syrjala@linux.intel.com Reviewed-by: Uma Shankar <uma.shankar@intel.com> (cherry picked from commit `017ecd0498`) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2026-03-23 09:06:43 +02:00
Ville Syrjälä	6ad2a661ff	drm/i915: Order OP vs. timeout correctly in __wait_for() Put the barrier() before the OP so that anything we read out in OP and check in COND will actually be read out after the timeout has been evaluated. Currently the only place where we use OP is __intel_wait_for_register(), but the use there is precisely susceptible to this reordering, assuming the ktime_*() stuff itself doesn't act as a sufficient barrier: __intel_wait_for_register(...) { ... ret = __wait_for(reg_value = intel_uncore_read_notrace(...), (reg_value & mask) == value, ...); ... } Cc: stable@vger.kernel.org Fixes: `1c3c1dc66a` ("drm/i915: Add compiler barrier to wait_for") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patch.msgid.link/20260313110740.24620-1-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit `a464bace04`) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2026-03-23 09:06:40 +02:00
Samasth Norway Ananda	08441f10f4	drm/i915/gmbus: fix spurious timeout on 512-byte burst reads When reading exactly 512 bytes with burst read enabled, the extra_byte_added path breaks out of the inner do-while without decrementing len. The outer while(len) then re-enters and gmbus_wait() times out since all data has been delivered. Decrement len before the break so the outer loop terminates correctly. Fixes: `d5dc0f43f2` ("drm/i915/gmbus: Enable burst read") Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patch.msgid.link/20260316231920.135438-2-samasth.norway.ananda@oracle.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit `4ab0f09ee7`) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2026-03-23 09:06:36 +02:00
Luca Leonardo Scorcia	4cfdfeb6ac	drm/mediatek: dsi: Store driver data before invoking mipi_dsi_host_register The call to mipi_dsi_host_register triggers a callback to mtk_dsi_bind, which uses dev_get_drvdata to retrieve the mtk_dsi struct, so this structure needs to be stored inside the driver data before invoking it. As drvdata is currently uninitialized it leads to a crash when registering the DSI DRM encoder right after acquiring the mode_config.idr_mutex, blocking all subsequent DRM operations. Fixes the following crash during mediatek-drm probe (tested on Xiaomi Smart Clock x04g): Unable to handle kernel NULL pointer dereference at virtual address 0000000000000040 [...] Modules linked in: mediatek_drm(+) drm_display_helper cec drm_client_lib drm_dma_helper drm_kms_helper panel_simple [...] Call trace: drm_mode_object_add+0x58/0x98 (P) __drm_encoder_init+0x48/0x140 drm_encoder_init+0x6c/0xa0 drm_simple_encoder_init+0x20/0x34 [drm_kms_helper] mtk_dsi_bind+0x34/0x13c [mediatek_drm] component_bind_all+0x120/0x280 mtk_drm_bind+0x284/0x67c [mediatek_drm] try_to_bring_up_aggregate_device+0x23c/0x320 __component_add+0xa4/0x198 component_add+0x14/0x20 mtk_dsi_host_attach+0x78/0x100 [mediatek_drm] mipi_dsi_attach+0x2c/0x50 panel_simple_dsi_probe+0x4c/0x9c [panel_simple] mipi_dsi_drv_probe+0x1c/0x28 really_probe+0xc0/0x3dc __driver_probe_device+0x80/0x160 driver_probe_device+0x40/0x120 __device_attach_driver+0xbc/0x17c bus_for_each_drv+0x88/0xf0 __device_attach+0x9c/0x1cc device_initial_probe+0x54/0x60 bus_probe_device+0x34/0xa0 device_add+0x5b0/0x800 mipi_dsi_device_register_full+0xdc/0x16c mipi_dsi_host_register+0xc4/0x17c mtk_dsi_probe+0x10c/0x260 [mediatek_drm] platform_probe+0x5c/0xa4 really_probe+0xc0/0x3dc __driver_probe_device+0x80/0x160 driver_probe_device+0x40/0x120 __driver_attach+0xc8/0x1f8 bus_for_each_dev+0x7c/0xe0 driver_attach+0x24/0x30 bus_add_driver+0x11c/0x240 driver_register+0x68/0x130 __platform_register_drivers+0x64/0x160 mtk_drm_init+0x24/0x1000 [mediatek_drm] do_one_initcall+0x60/0x1d0 do_init_module+0x54/0x240 load_module+0x1838/0x1dc0 init_module_from_file+0xd8/0xf0 __arm64_sys_finit_module+0x1b4/0x428 invoke_syscall.constprop.0+0x48/0xc8 do_el0_svc+0x3c/0xb8 el0_svc+0x34/0xe8 el0t_64_sync_handler+0xa0/0xe4 el0t_64_sync+0x198/0x19c Code: 52800022 941004ab 2a0003f3 37f80040 (29005a80) Fixes: `e4732b590a` ("drm/mediatek: dsi: Register DSI host after acquiring clocks and PHY") Signed-off-by: Luca Leonardo Scorcia <l.scorcia@gmail.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20260225094047.76780-1-l.scorcia@gmail.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2026-03-22 14:15:44 +00:00
Dave Airlie	a6e77320ba	Merge tag 'drm-xe-fixes-2026-03-19' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - A number of teardown fixes (Daniele, Matt Brost, Zhanjun, Ashutosh) - Skip over non-leaf PTE for PRL generation (Brian) - Fix an unitialized variable (Umesh) - Fix a missing runtime PM reference (Sanjay) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/abxj4_dBHYBiSvDG@fedora	2026-03-21 02:17:59 +10:00
Dave Airlie	a15130d588	Merge tag 'amd-drm-fixes-7.0-2026-03-19' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-7.0-2026-03-19: amdgpu: - Fix gamma 2.2 colorop TFs - BO list fix - LTO fix - DC FP fix - DisplayID handling fix - DCN 2.01 fix - MMHUB boundary fixes - ISP fix - TLB fence fix - Hainan pm fix radeon: - Hainan pm fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patch.msgid.link/20260319131013.36639-1-alexander.deucher@amd.com	2026-03-21 01:58:36 +10:00
Dave Airlie	437eccb1a8	Merge tag 'drm-misc-fixes-2026-03-19' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes A doc warning fix and a memory leak fix for vmwgfx, a deadlock fix and interrupt handling fixes for imagination, a locking fix for pagemap_until, a UAF fix for drm_dev_unplug, and a multi-channel audio handling fix for dw-hdmi-qp. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://patch.msgid.link/20260319-lush-righteous-malamute-e7bb98@houat	2026-03-21 01:52:36 +10:00
Pedro Demarchi Gomes	fc3bbf34e6	drm/shmem-helper: Fix huge page mapping in fault handler When running ./tools/testing/selftests/mm/split_huge_page_test multiple times with /sys/kernel/mm/transparent_hugepage/shmem_enabled and /sys/kernel/mm/transparent_hugepage/enabled set as always the following BUG occurs: [ 232.728858] ------------[ cut here ]------------ [ 232.729458] kernel BUG at mm/memory.c:2276! [ 232.729726] Oops: invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI [ 232.730217] CPU: 19 UID: 60578 PID: 1497 Comm: llvmpipe-9 Not tainted 7.0.0-rc1mm-new+ #19 PREEMPT(lazy) [ 232.730855] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-9.fc43 06/10/2025 [ 232.731360] RIP: 0010:walk_to_pmd+0x29e/0x3c0 [ 232.731569] Code: d8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 89 ea 48 89 de 4c 89 f7 e8 ae 85 ff ff 85 c0 0f 84 1f fe ff ff 31 db eb d0 <0f> 0b 48 89 ea 48 89 de 4c 89 f7 e8 92 8b ff ff 85 c0 75 e8 48 b8 [ 232.732614] RSP: 0000:ffff8881aa6ff9a8 EFLAGS: 00010282 [ 232.732991] RAX: 8000000142e002e7 RBX: ffff8881433cae10 RCX: dffffc0000000000 [ 232.733362] RDX: 0000000000000000 RSI: 00007fb47840b000 RDI: 8000000142e002e7 [ 232.733801] RBP: 00007fb47840b000 R08: 0000000000000000 R09: 1ffff110354dff46 [ 232.734168] R10: fffffbfff0cb921d R11: 00000000910da5ce R12: 1ffffffff0c1fcdd [ 232.734459] R13: 1ffffffff0c23f36 R14: ffff888171628040 R15: 0000000000000000 [ 232.734861] FS: 00007fb4907f86c0(0000) GS:ffff888791f2c000(0000) knlGS:0000000000000000 [ 232.735265] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 232.735548] CR2: 00007fb47840be00 CR3: 000000015e6dc000 CR4: 00000000000006f0 [ 232.736031] Call Trace: [ 232.736273] <TASK> [ 232.736500] get_locked_pte+0x1f/0xa0 [ 232.736878] insert_pfn+0x9f/0x350 [ 232.737190] ? __pfx_pat_pagerange_is_ram+0x10/0x10 [ 232.737614] ? __pfx_insert_pfn+0x10/0x10 [ 232.737990] ? __pfx_css_rstat_updated+0x10/0x10 [ 232.738281] ? __pfx_pfn_modify_allowed+0x10/0x10 [ 232.738552] ? lookup_memtype+0x62/0x180 [ 232.738761] vmf_insert_pfn_prot+0x14b/0x340 [ 232.739012] ? __pfx_vmf_insert_pfn_prot+0x10/0x10 [ 232.739247] ? __pfx___might_resched+0x10/0x10 [ 232.739475] drm_gem_shmem_fault.cold+0x18/0x39 [ 232.739677] ? rcu_read_unlock+0x20/0x70 [ 232.739882] __do_fault+0x251/0x7b0 [ 232.740028] do_fault+0x6e1/0xc00 [ 232.740167] ? __lock_acquire+0x590/0xc40 [ 232.740335] handle_pte_fault+0x439/0x760 [ 232.740498] ? mtree_range_walk+0x252/0xae0 [ 232.740669] ? __pfx_handle_pte_fault+0x10/0x10 [ 232.740899] __handle_mm_fault+0xa02/0xf30 [ 232.741066] ? __pfx___handle_mm_fault+0x10/0x10 [ 232.741255] ? find_vma+0xa1/0x120 [ 232.741403] handle_mm_fault+0x2bf/0x8f0 [ 232.741564] do_user_addr_fault+0x2d3/0xed0 [ 232.741736] ? trace_page_fault_user+0x1bf/0x240 [ 232.741969] exc_page_fault+0x87/0x120 [ 232.742124] asm_exc_page_fault+0x26/0x30 [ 232.742288] RIP: 0033:0x7fb4d73ed546 [ 232.742441] Code: 66 41 0f 6f fb 66 44 0f 6d dc 66 44 0f 6f c6 66 41 0f 6d f1 66 0f 6c fc 66 45 0f 6c c1 66 44 0f 6f c9 66 0f 6d ca 66 0f db f0 <66> 0f df 04 08 66 44 0f 6c ca 66 45 0f db c2 66 44 0f df 10 66 44 [ 232.743193] RSP: 002b:00007fb4907f68a0 EFLAGS: 00010206 [ 232.743565] RAX: 00007fb47840aa00 RBX: 00007fb4d73ec070 RCX: 0000000000001400 [ 232.743871] RDX: 0000000000002800 RSI: 0000000000003c00 RDI: 0000000000000001 [ 232.744150] RBP: 0000000000000004 R08: 0000000000001400 R09: 00007fb4d73ec060 [ 232.744433] R10: 000055f0261a4288 R11: 00007fb4c013da40 R12: 0000000000000008 [ 232.744712] R13: 0000000000000000 R14: 4332322132212110 R15: 0000000000000004 [ 232.746616] </TASK> [ 232.746711] Modules linked in: nft_nat nft_masq veth bridge stp llc snd_seq_dummy snd_hrtimer snd_seq snd_seq_device snd_timer snd soundcore overlay rfkill nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr ppdev 9pnet_virtio 9pnet parport_pc i2c_piix4 netfs pcspkr parport i2c_smbus joydev sunrpc vfat fat loop dm_multipath nfnetlink vsock_loopback vmw_vsock_virtio_transport_common vmw_vsock_vmci_transport zram lz4hc_compress vmw_vmci lz4_compress vsock e1000 bochs serio_raw ata_generic pata_acpi scsi_dh_rdac scsi_dh_emc scsi_dh_alua i2c_dev fuse qemu_fw_cfg [ 232.749308] ---[ end trace 0000000000000000 ]--- [ 232.749507] RIP: 0010:walk_to_pmd+0x29e/0x3c0 [ 232.749692] Code: d8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc 48 89 ea 48 89 de 4c 89 f7 e8 ae 85 ff ff 85 c0 0f 84 1f fe ff ff 31 db eb d0 <0f> 0b 48 89 ea 48 89 de 4c 89 f7 e8 92 8b ff ff 85 c0 75 e8 48 b8 [ 232.750428] RSP: 0000:ffff8881aa6ff9a8 EFLAGS: 00010282 [ 232.750645] RAX: 8000000142e002e7 RBX: ffff8881433cae10 RCX: dffffc0000000000 [ 232.750954] RDX: 0000000000000000 RSI: 00007fb47840b000 RDI: 8000000142e002e7 [ 232.751232] RBP: 00007fb47840b000 R08: 0000000000000000 R09: 1ffff110354dff46 [ 232.751514] R10: fffffbfff0cb921d R11: 00000000910da5ce R12: 1ffffffff0c1fcdd [ 232.751837] R13: 1ffffffff0c23f36 R14: ffff888171628040 R15: 0000000000000000 [ 232.752124] FS: 00007fb4907f86c0(0000) GS:ffff888791f2c000(0000) knlGS:0000000000000000 [ 232.752441] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 232.752674] CR2: 00007fb47840be00 CR3: 000000015e6dc000 CR4: 00000000000006f0 [ 232.752983] Kernel panic - not syncing: Fatal exception [ 232.753510] Kernel Offset: disabled [ 232.754643] ---[ end Kernel panic - not syncing: Fatal exception ]--- This happens when two concurrent page faults occur within the same PMD range. One fault installs a PMD mapping through vmf_insert_pfn_pmd(), while the other attempts to install a PTE mapping via vmf_insert_pfn(). The bug is triggered because a pmd_trans_huge is not expected when walking the page table inside vmf_insert_pfn. Avoid this race by adding a huge_fault callback to drm_gem_shmem_vm_ops so that PMD-sized mappings are handled through the appropriate huge page fault path. Fixes: `211b9a39f2` ("drm/shmem-helper: Map huge pages in fault handler") Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patch.msgid.link/20260319015224.46896-1-pedrodemargomes@gmail.com Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>	2026-03-20 09:15:39 +01:00
Sanjay Yadav	65d046b2d8	drm/xe: Fix missing runtime PM reference in ccs_mode_store ccs_mode_store() calls xe_gt_reset() which internally invokes xe_pm_runtime_get_noresume(). That function requires the caller to already hold an outer runtime PM reference and warns if none is held: [46.891177] xe 0000:03:00.0: [drm] Missing outer runtime PM protection [46.891178] WARNING: drivers/gpu/drm/xe/xe_pm.c:885 at xe_pm_runtime_get_noresume+0x8b/0xc0 Fix this by protecting xe_gt_reset() with the scope-based guard(xe_pm_runtime)(xe), which is the preferred form when the reference lifetime matches a single scope. v2: - Use scope-based guard(xe_pm_runtime)(xe) (Shuicheng) - Update commit message accordingly Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/7593 Fixes: `480b358e7d` ("drm/xe: Do not wake device during a GT reset") Cc: <stable@vger.kernel.org> # v6.19+ Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Suggested-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Sanjay Yadav <sanjay.kumar.yadav@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20260313071608.3459480-2-sanjay.kumar.yadav@intel.com (cherry picked from commit `7937ea733f`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2026-03-19 18:05:04 +01:00
Matthew Brost	01f2557aa6	drm/xe: Open-code GGTT MMIO access protection GGTT MMIO access is currently protected by hotplug (drm_dev_enter), which works correctly when the driver loads successfully and is later unbound or unloaded. However, if driver load fails, this protection is insufficient because drm_dev_unplug() is never called. Additionally, devm release functions cannot guarantee that all BOs with GGTT mappings are destroyed before the GGTT MMIO region is removed, as some BOs may be freed asynchronously by worker threads. To address this, introduce an open-coded flag, protected by the GGTT lock, that guards GGTT MMIO access. The flag is cleared during the dev_fini_ggtt devm release function to ensure MMIO access is disabled once teardown begins. Cc: stable@vger.kernel.org Fixes: `919bb54e98` ("drm/xe: Fix missing runtime outer protection for ggtt_remove_node") Reviewed-by: Zhanjun Dong <zhanjun.dong@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260310225039.1320161-8-zhanjun.dong@intel.com (cherry picked from commit `4f3a998a17`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2026-03-19 17:13:54 +01:00
Umesh Nerlige Ramappa	e6e3ea52bf	drm/xe/lrc: Fix uninitialized new_ts when capturing context timestamp Getting engine specific CTX TIMESTAMP register can fail. In that case, if the context is active, new_ts is uninitialized. Fix that case by initializing new_ts to the last value that was sampled in SW - lrc->ctx_timestamp. Flagged by static analysis. v2: Fix new_ts initialization (Ashutosh) Fixes: `bb63e7257e` ("drm/xe: Avoid toggling schedule state to check LRC timestamp in TDR") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patch.msgid.link/20260312125308.3126607-2-umesh.nerlige.ramappa@intel.com (cherry picked from commit `466e75d480`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2026-03-19 14:23:04 +01:00
Ashutosh Dixit	9be6fd9fbd	drm/xe/oa: Allow reading after disabling OA stream Some OA data might be present in the OA buffer when OA stream is disabled. Allow UMD's to retrieve this data, so that all data till the point when OA stream is disabled can be retrieved. v2: Update tail pointer after disable (Umesh) Fixes: `efb315d0a0` ("drm/xe/oa/uapi: Read file_operation") Cc: stable@vger.kernel.org Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Umesh Nerlige Ramappa<umesh.nerlige.ramappa@intel.com> Link: https://patch.msgid.link/20260313053630.3176100-1-ashutosh.dixit@intel.com (cherry picked from commit `4ff57c5e8d`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2026-03-19 14:22:58 +01:00
Brian Nguyen	38b8dcde23	drm/xe: Skip over non leaf pte for PRL generation The check using xe_child->base.children was insufficient in determining if a pte was a leaf node. So explicitly skip over every non-leaf pt and conditionally abort if there is a scenario where a non-leaf pt is interleaved between leaf pt, which results in the page walker skipping over some leaf pt. Note that the behavior being targeted for abort is PD[0] = 2M PTE PD[1] = PT -> 512 4K PTEs PD[2] = 2M PTE results in abort, page walker won't descend PD[1]. With new abort, ensuring valid PRL before handling a second abort. v2: - Revert to previous assert. - Revised non-leaf handling for interleaf child pt and leaf pte. - Update comments to specifications. (Stuart) - Remove unnecessary XE_PTE_PS64. (Matthew B) v3: - Modify secondary abort to only check non-leaf PTEs. (Matthew B) Fixes: `b912138df2` ("drm/xe: Create page reclaim list on unbind") Signed-off-by: Brian Nguyen <brian3.nguyen@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Link: https://patch.msgid.link/20260305171546.67691-6-brian3.nguyen@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit `1d12358752`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2026-03-19 14:22:53 +01:00
Zhanjun Dong	7838dd8367	drm/xe/guc: Ensure CT state transitions via STOP before DISABLED The GuC CT state transition requires moving to the STOP state before entering the DISABLED state. Update the driver teardown sequence to make the proper state machine transitions. Fixes: `ee4b32220a` ("drm/xe/guc: Add devm release action to safely tear down CT") Cc: stable@vger.kernel.org Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260310225039.1320161-6-zhanjun.dong@intel.com (cherry picked from commit `dace8cb003`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2026-03-19 14:22:39 +01:00
Matthew Brost	e0f82655df	drm/xe: Trigger queue cleanup if not in wedged mode 2 The intent of wedging a device is to allow queues to continue running only in wedged mode 2. In other modes, queues should initiate cleanup and signal all remaining fences. Fix xe_guc_submit_wedge to correctly clean up queues when wedge mode != 2. Fixes: `7dbe8af13c` ("drm/xe: Wedge the entire device") Cc: stable@vger.kernel.org Reviewed-by: Zhanjun Dong <zhanjun.dong@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260310225039.1320161-4-zhanjun.dong@intel.com (cherry picked from commit `e25ba41c82`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2026-03-19 14:22:33 +01:00
Matthew Brost	fb3738693c	drm/xe: Forcefully tear down exec queues in GuC submit fini In GuC submit fini, forcefully tear down any exec queues by disabling CTs, stopping the scheduler (which cleans up lost G2H), killing all remaining queues, and resuming scheduling to allow any remaining cleanup actions to complete and signal any remaining fences. Split guc_submit_fini into device related and software only part. Using device-managed and drm-managed action guarantees the correct ordering of cleanup. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org Reviewed-by: Zhanjun Dong <zhanjun.dong@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260310225039.1320161-3-zhanjun.dong@intel.com (cherry picked from commit `a6ab444a11`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2026-03-19 14:22:28 +01:00
Matthew Brost	26c638d560	drm/xe: Always kill exec queues in xe_guc_submit_pause_abort xe_guc_submit_pause_abort is intended to be called after something disastrous occurs (e.g., VF migration fails, device wedging, or driver unload) and should immediately trigger the teardown of remaining submission state. With that, kill any remaining queues in this function. Fixes: `7c4b7e34c8` ("drm/xe/vf: Abort VF post migration recovery on failure") Cc: stable@vger.kernel.org Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260310225039.1320161-2-zhanjun.dong@intel.com (cherry picked from commit `78f3bf00be`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2026-03-19 14:22:22 +01:00
Daniele Ceraolo Spurio	9b72283ec9	drm/xe/guc: Fail immediately on GuC load error By using the same variable for both the return of poll_timeout_us and the return of the polled function guc_wait_ucode, the return value of the latter is overwritten and lost after exiting the polling loop. Since guc_wait_ucode returns -1 on GuC load failure, we lose that information and always continue as if the GuC had been loaded correctly. This is fixed by simply using 2 separate variables. Fixes: `a4916b4da4` ("drm/xe/guc: Refactor GuC load to use poll_timeout_us()") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://patch.msgid.link/20260303001732.2540493-2-daniele.ceraolospurio@intel.com (cherry picked from commit `c85ec5c575`) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2026-03-19 14:22:17 +01:00
Rahul Bukte	0162ab3220	drm/i915/gt: Check set_default_submission() before deferencing When the i915 driver firmware binaries are not present, the set_default_submission pointer is not set. This pointer is dereferenced during suspend anyways. Add a check to make sure it is set before dereferencing. [ 23.289926] PM: suspend entry (deep) [ 23.293558] Filesystems sync: 0.000 seconds [ 23.298010] Freezing user space processes [ 23.302771] Freezing user space processes completed (elapsed 0.000 seconds) [ 23.309766] OOM killer disabled. [ 23.313027] Freezing remaining freezable tasks [ 23.318540] Freezing remaining freezable tasks completed (elapsed 0.001 seconds) [ 23.342038] serial 00:05: disabled [ 23.345719] serial 00:02: disabled [ 23.349342] serial 00:01: disabled [ 23.353782] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ 23.358993] sd 1:0:0:0: [sdb] Synchronizing SCSI cache [ 23.361635] ata1.00: Entering standby power mode [ 23.368863] ata2.00: Entering standby power mode [ 23.445187] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 23.452194] #PF: supervisor instruction fetch in kernel mode [ 23.457896] #PF: error_code(0x0010) - not-present page [ 23.463065] PGD 0 P4D 0 [ 23.465640] Oops: Oops: 0010 [#1] SMP NOPTI [ 23.469869] CPU: 8 UID: 0 PID: 211 Comm: kworker/u48:18 Tainted: G S W 6.19.0-rc4-00020-gf0b9d8eb98df #10 PREEMPT(voluntary) [ 23.482512] Tainted: [S]=CPU_OUT_OF_SPEC, [W]=WARN [ 23.496511] Workqueue: async async_run_entry_fn [ 23.501087] RIP: 0010:0x0 [ 23.503755] Code: Unable to access opcode bytes at 0xffffffffffffffd6. [ 23.510324] RSP: 0018:ffffb4a60065fca8 EFLAGS: 00010246 [ 23.515592] RAX: 0000000000000000 RBX: ffff9f428290e000 RCX: 000000000000000f [ 23.522765] RDX: 0000000000000000 RSI: 0000000000000282 RDI: ffff9f428290e000 [ 23.529937] RBP: ffff9f4282907070 R08: ffff9f4281130428 R09: 00000000ffffffff [ 23.537111] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9f42829070f8 [ 23.544284] R13: ffff9f4282906028 R14: ffff9f4282900000 R15: ffff9f4282906b68 [ 23.551457] FS: 0000000000000000(0000) GS:ffff9f466b2cf000(0000) knlGS:0000000000000000 [ 23.559588] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 23.565365] CR2: ffffffffffffffd6 CR3: 000000031c230001 CR4: 0000000000f70ef0 [ 23.572539] PKRU: 55555554 [ 23.575281] Call Trace: [ 23.577770] <TASK> [ 23.579905] intel_engines_reset_default_submission+0x42/0x60 [ 23.585695] __intel_gt_unset_wedged+0x191/0x200 [ 23.590360] intel_gt_unset_wedged+0x20/0x40 [ 23.594675] gt_sanitize+0x15e/0x170 [ 23.598290] i915_gem_suspend_late+0x6b/0x180 [ 23.602692] i915_drm_suspend_late+0x35/0xf0 [ 23.607008] ? __pfx_pci_pm_suspend_late+0x10/0x10 [ 23.611843] dpm_run_callback+0x78/0x1c0 [ 23.615817] device_suspend_late+0xde/0x2e0 [ 23.620037] async_suspend_late+0x18/0x30 [ 23.624082] async_run_entry_fn+0x25/0xa0 [ 23.628129] process_one_work+0x15b/0x380 [ 23.632182] worker_thread+0x2a5/0x3c0 [ 23.635973] ? __pfx_worker_thread+0x10/0x10 [ 23.640279] kthread+0xf6/0x1f0 [ 23.643464] ? __pfx_kthread+0x10/0x10 [ 23.647263] ? __pfx_kthread+0x10/0x10 [ 23.651045] ret_from_fork+0x131/0x190 [ 23.654837] ? __pfx_kthread+0x10/0x10 [ 23.658634] ret_from_fork_asm+0x1a/0x30 [ 23.662597] </TASK> [ 23.664826] Modules linked in: [ 23.667914] CR2: 0000000000000000 [ 23.671271] ------------[ cut here ]------------ Signed-off-by: Rahul Bukte <rahul.bukte@sony.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://patch.msgid.link/20260203044839.1555147-1-suraj.kandpal@intel.com (cherry picked from commit `daa199abc3`) Fixes: `ff44ad51eb` ("drm/i915: Move engine->submit_request selection to a vfunc") Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2026-03-18 11:53:20 +02:00
Alex Deucher	86650ee224	drm/radeon: apply state adjust rules to some additional HAINAN vairants They need a similar workaround. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1839 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `87327658c8`) Cc: stable@vger.kernel.org	2026-03-17 18:04:15 -04:00
Alex Deucher	9787f7da18	drm/amdgpu: apply state adjust rules to some additional HAINAN vairants They need a similar workaround. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1839 Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `0de31d92a1`) Cc: stable@vger.kernel.org	2026-03-17 18:04:03 -04:00
Alex Deucher	e9f58ff991	drm/amdgpu: rework how we handle TLB fences Add a new VM flag to indicate whether or not we need a TLB fence. Userqs (KFD or KGD) require a TLB fence. A TLB fence is not strictly required for kernel queues, but it shouldn't hurt. That said, enabling this unconditionally should be fine, but it seems to tickle some issues in KIQ/MES. Only enable them for KFD, or when KGD userq queues are enabled (currently via module parameter). Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4798 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4749 Fixes: `f3854e04b7` ("drm/amdgpu: attach tlb fence to the PTs update") Cc: Christian König <christian.koenig@amd.com> Cc: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `69c5fbd2b9`) Cc: stable@vger.kernel.org	2026-03-17 18:03:09 -04:00
Jonas Karlman	cffcb42c57	drm/bridge: dw-hdmi-qp: fix multi-channel audio output Channel Allocation (PB4) and Level Shift Information (PB5) are configured with values from PB1 and PB2 due to the wrong offset being used. This results in missing audio channels or incorrect speaker placement when playing multi-channel audio. Use the correct offset to fix multi-channel audio output. Fixes: `fd0141d1a8` ("drm/bridge: synopsys: Add audio support for dw-hdmi-qp") Reported-by: Christian Hewitt <christianshewitt@gmail.com> Signed-off-by: Jonas Karlman <jonas@kwiboo.se> Signed-off-by: Christian Hewitt <christianshewitt@gmail.com> Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Link: https://patch.msgid.link/20260228112822.4056354-1-christianshewitt@gmail.com Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>	2026-03-17 18:15:16 +01:00
Maarten Lankhorst	6bee098b91	drm: Fix use-after-free on framebuffers and property blobs when calling drm_dev_unplug When trying to do a rather aggressive test of igt's "xe_module_load --r reload" with a full desktop environment and game running I noticed a few OOPSes when dereferencing freed pointers, related to framebuffers and property blobs after the compositor exits. Solve this by guarding the freeing in drm_file with drm_dev_enter/exit, and immediately put the references from struct drm_file objects during drm_dev_unplug(). Related warnings for framebuffers on the subtest: [ 739.713076] ------------[ cut here ]------------ WARN_ON(!list_empty(&dev->mode_config.fb_list)) [ 739.713079] WARNING: drivers/gpu/drm/drm_mode_config.c:584 at drm_mode_config_cleanup+0x30b/0x320 [drm], CPU#12: xe_module_load/13145 .... [ 739.713328] Call Trace: [ 739.713330] <TASK> [ 739.713335] ? intel_pmdemand_destroy_state+0x11/0x20 [xe] [ 739.713574] ? intel_atomic_global_obj_cleanup+0xe4/0x1a0 [xe] [ 739.713794] intel_display_driver_remove_noirq+0x51/0xb0 [xe] [ 739.714041] xe_display_fini_early+0x33/0x50 [xe] [ 739.714284] devm_action_release+0xf/0x20 [ 739.714294] devres_release_all+0xad/0xf0 [ 739.714301] device_unbind_cleanup+0x12/0xa0 [ 739.714305] device_release_driver_internal+0x1b7/0x210 [ 739.714311] device_driver_detach+0x14/0x20 [ 739.714315] unbind_store+0xa6/0xb0 [ 739.714319] drv_attr_store+0x21/0x30 [ 739.714322] sysfs_kf_write+0x48/0x60 [ 739.714328] kernfs_fop_write_iter+0x16b/0x240 [ 739.714333] vfs_write+0x266/0x520 [ 739.714341] ksys_write+0x72/0xe0 [ 739.714345] __x64_sys_write+0x19/0x20 [ 739.714347] x64_sys_call+0xa15/0xa30 [ 739.714355] do_syscall_64+0xd8/0xab0 [ 739.714361] entry_SYSCALL_64_after_hwframe+0x4b/0x53 and [ 739.714459] ------------[ cut here ]------------ [ 739.714461] xe 0000:67:00.0: [drm] drm_WARN_ON(!list_empty(&fb->filp_head)) [ 739.714464] WARNING: drivers/gpu/drm/drm_framebuffer.c:833 at drm_framebuffer_free+0x6c/0x90 [drm], CPU#12: xe_module_load/13145 [ 739.714715] RIP: 0010:drm_framebuffer_free+0x7a/0x90 [drm] ... [ 739.714869] Call Trace: [ 739.714871] <TASK> [ 739.714876] drm_mode_config_cleanup+0x26a/0x320 [drm] [ 739.714998] ? __drm_printfn_seq_file+0x20/0x20 [drm] [ 739.715115] ? drm_mode_config_cleanup+0x207/0x320 [drm] [ 739.715235] intel_display_driver_remove_noirq+0x51/0xb0 [xe] [ 739.715576] xe_display_fini_early+0x33/0x50 [xe] [ 739.715821] devm_action_release+0xf/0x20 [ 739.715828] devres_release_all+0xad/0xf0 [ 739.715843] device_unbind_cleanup+0x12/0xa0 [ 739.715850] device_release_driver_internal+0x1b7/0x210 [ 739.715856] device_driver_detach+0x14/0x20 [ 739.715860] unbind_store+0xa6/0xb0 [ 739.715865] drv_attr_store+0x21/0x30 [ 739.715868] sysfs_kf_write+0x48/0x60 [ 739.715873] kernfs_fop_write_iter+0x16b/0x240 [ 739.715878] vfs_write+0x266/0x520 [ 739.715886] ksys_write+0x72/0xe0 [ 739.715890] __x64_sys_write+0x19/0x20 [ 739.715893] x64_sys_call+0xa15/0xa30 [ 739.715900] do_syscall_64+0xd8/0xab0 [ 739.715905] entry_SYSCALL_64_after_hwframe+0x4b/0x53 and then finally file close blows up: [ 743.186530] Oops: general protection fault, probably for non-canonical address 0xdead000000000122: 0000 [#1] SMP [ 743.186535] CPU: 3 UID: 1000 PID: 3453 Comm: kwin_wayland Tainted: G W 7.0.0-rc1-valkyria+ #110 PREEMPT_{RT,(lazy)} [ 743.186537] Tainted: [W]=WARN [ 743.186538] Hardware name: Gigabyte Technology Co., Ltd. X299 AORUS Gaming 3/X299 AORUS Gaming 3-CF, BIOS F8n 12/06/2021 [ 743.186539] RIP: 0010:drm_framebuffer_cleanup+0x55/0xc0 [drm] [ 743.186588] Code: d8 72 73 0f b6 42 05 ff c3 39 c3 72 e8 49 8d bd 50 07 00 00 31 f6 e8 3a 80 d3 e1 49 8b 44 24 10 49 8d 7c 24 08 49 8b 54 24 08 <48> 3b 38 0f 85 95 7f 02 00 48 3b 7a 08 0f 85 8b 7f 02 00 48 89 42 [ 743.186589] RSP: 0018:ffffc900085e3cf8 EFLAGS: 00010202 [ 743.186591] RAX: dead000000000122 RBX: 0000000000000001 RCX: ffffffff8217ed03 [ 743.186592] RDX: dead000000000100 RSI: 0000000000000000 RDI: ffff88814675ba08 [ 743.186593] RBP: ffffc900085e3d10 R08: 0000000000000000 R09: 0000000000000000 [ 743.186593] R10: 0000000000000000 R11: 0000000000000000 R12: ffff88814675ba00 [ 743.186594] R13: ffff88810d778000 R14: ffff888119f6dca0 R15: ffff88810c660bb0 [ 743.186595] FS: 00007ff377d21280(0000) GS:ffff888cec3f8000(0000) knlGS:0000000000000000 [ 743.186596] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 743.186596] CR2: 000055690b55e000 CR3: 0000000113586003 CR4: 00000000003706f0 [ 743.186597] Call Trace: [ 743.186598] <TASK> [ 743.186603] intel_user_framebuffer_destroy+0x12/0x90 [xe] [ 743.186722] drm_framebuffer_free+0x3a/0x90 [drm] [ 743.186750] ? trace_hardirqs_on+0x5f/0x120 [ 743.186754] drm_mode_object_put+0x51/0x70 [drm] [ 743.186786] drm_fb_release+0x105/0x190 [drm] [ 743.186812] ? rt_mutex_slowunlock+0x3aa/0x410 [ 743.186817] ? rt_spin_lock+0xea/0x1b0 [ 743.186819] drm_file_free+0x1e0/0x2c0 [drm] [ 743.186843] drm_release_noglobal+0x91/0xf0 [drm] [ 743.186865] __fput+0x100/0x2e0 [ 743.186869] fput_close_sync+0x40/0xa0 [ 743.186870] __x64_sys_close+0x3e/0x80 [ 743.186873] x64_sys_call+0xa07/0xa30 [ 743.186879] do_syscall_64+0xd8/0xab0 [ 743.186881] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 743.186882] RIP: 0033:0x7ff37e567732 [ 743.186884] Code: 08 0f 85 a1 38 ff ff 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 55 bf 01 00 [ 743.186885] RSP: 002b:00007ffc818169a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 [ 743.186886] RAX: ffffffffffffffda RBX: 00007ffc81816a30 RCX: 00007ff37e567732 [ 743.186887] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000012 [ 743.186888] RBP: 00007ffc818169d0 R08: 0000000000000000 R09: 0000000000000000 [ 743.186889] R10: 0000000000000000 R11: 0000000000000246 R12: 000055d60a7996e0 [ 743.186889] R13: 00007ffc81816a90 R14: 00007ffc81816a90 R15: 000055d60a782a30 [ 743.186892] </TASK> [ 743.186893] Modules linked in: rfcomm snd_hrtimer xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_addrtype nft_compat x_tables nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables overlay cfg80211 bnep mtd_intel_dg snd_hda_codec_intelhdmi mtd snd_hda_codec_hdmi nls_utf8 mxm_wmi intel_wmi_thunderbolt gigabyte_wmi wmi_bmof xe drm_gpuvm drm_gpusvm_helper i2c_algo_bit drm_buddy drm_ttm_helper ttm video drm_suballoc_helper gpu_sched drm_client_lib drm_exec drm_display_helper cec drm_kunit_helpers drm_kms_helper kunit x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_alc882 snd_hda_codec_realtek_lib snd_hda_codec_generic snd_hda_intel snd_soc_avs snd_soc_hda_codec snd_hda_ext_core snd_hda_codec snd_hwdep snd_hda_core snd_intel_dspcfg snd_soc_core snd_compress ac97_bus snd_pcm snd_seq snd_seq_device snd_timer i2c_i801 i2c_mux snd i2c_smbus btusb btrtl btbcm btmtk btintel bluetooth ecdh_generic rfkill ecc mei_me mei ioatdma dca wmi nfsd drm i2c_dev fuse nfnetlink [ 743.186938] ---[ end trace 0000000000000000 ]--- And for property blobs: void drm_mode_config_cleanup(struct drm_device *dev) { ... list_for_each_entry_safe(blob, bt, &dev->mode_config.property_blob_list, head_global) { drm_property_blob_put(blob); } Resulting in: [ 371.072940] BUG: unable to handle page fault for address: 000001ffffffffff [ 371.072944] #PF: supervisor read access in kernel mode [ 371.072945] #PF: error_code(0x0000) - not-present page [ 371.072947] PGD 0 P4D 0 [ 371.072950] Oops: Oops: 0000 [#1] SMP [ 371.072953] CPU: 0 UID: 1000 PID: 3693 Comm: kwin_wayland Not tainted 7.0.0-rc1-valkyria+ #111 PREEMPT_{RT,(lazy)} [ 371.072956] Hardware name: Gigabyte Technology Co., Ltd. X299 AORUS Gaming 3/X299 AORUS Gaming 3-CF, BIOS F8n 12/06/2021 [ 371.072957] RIP: 0010:drm_property_destroy_user_blobs+0x3b/0x90 [drm] [ 371.073019] Code: 00 00 48 83 ec 10 48 8b 86 30 01 00 00 48 39 c3 74 59 48 89 c2 48 8d 48 c8 48 8b 00 4c 8d 60 c8 eb 04 4c 8d 60 c8 48 8b 71 40 <48> 39 16 0f 85 39 32 01 00 48 3b 50 08 0f 85 2f 32 01 00 48 89 70 [ 371.073021] RSP: 0018:ffffc90006a73de8 EFLAGS: 00010293 [ 371.073022] RAX: 000001ffffffffff RBX: ffff888118a1a930 RCX: ffff8881b92355c0 [ 371.073024] RDX: ffff8881b92355f8 RSI: 000001ffffffffff RDI: ffff888118be4000 [ 371.073025] RBP: ffffc90006a73e08 R08: ffff8881009b7300 R09: ffff888cecc5b000 [ 371.073026] R10: ffffc90006a73e90 R11: 0000000000000002 R12: 000001ffffffffc7 [ 371.073027] R13: ffff888118a1a980 R14: ffff88810b366d20 R15: ffff888118a1a970 [ 371.073028] FS: 00007f1faccbb280(0000) GS:ffff888cec2db000(0000) knlGS:0000000000000000 [ 371.073029] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 371.073030] CR2: 000001ffffffffff CR3: 000000010655c001 CR4: 00000000003706f0 [ 371.073031] Call Trace: [ 371.073033] <TASK> [ 371.073036] drm_file_free+0x1df/0x2a0 [drm] [ 371.073077] drm_release_noglobal+0x7a/0xe0 [drm] [ 371.073113] __fput+0xe2/0x2b0 [ 371.073118] fput_close_sync+0x40/0xa0 [ 371.073119] __x64_sys_close+0x3e/0x80 [ 371.073122] x64_sys_call+0xa07/0xa30 [ 371.073126] do_syscall_64+0xc0/0x840 [ 371.073130] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 371.073132] RIP: 0033:0x7f1fb3501732 [ 371.073133] Code: 08 0f 85 a1 38 ff ff 49 89 fb 48 89 f0 48 89 d7 48 89 ce 4c 89 c2 4d 89 ca 4c 8b 44 24 08 4c 8b 4c 24 10 4c 89 5c 24 08 0f 05 <c3> 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 f3 0f 1e fa 55 bf 01 00 [ 371.073135] RSP: 002b:00007ffe8e6f0278 EFLAGS: 00000246 ORIG_RAX: 0000000000000003 [ 371.073136] RAX: ffffffffffffffda RBX: 00007ffe8e6f0300 RCX: 00007f1fb3501732 [ 371.073137] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000012 [ 371.073138] RBP: 00007ffe8e6f02a0 R08: 0000000000000000 R09: 0000000000000000 [ 371.073139] R10: 0000000000000000 R11: 0000000000000246 R12: 00005585ba46eea0 [ 371.073140] R13: 00007ffe8e6f0360 R14: 00007ffe8e6f0360 R15: 00005585ba458a30 [ 371.073143] </TASK> [ 371.073144] Modules linked in: rfcomm snd_hrtimer xt_addrtype xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp nft_compat x_tables nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables overlay cfg80211 bnep snd_hda_codec_intelhdmi snd_hda_codec_hdmi mtd_intel_dg mtd nls_utf8 wmi_bmof mxm_wmi gigabyte_wmi intel_wmi_thunderbolt xe drm_gpuvm drm_gpusvm_helper i2c_algo_bit drm_buddy drm_ttm_helper ttm video drm_suballoc_helper gpu_sched drm_client_lib drm_exec drm_display_helper cec drm_kunit_helpers drm_kms_helper kunit x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_alc882 snd_hda_codec_realtek_lib snd_hda_codec_generic snd_hda_intel snd_soc_avs snd_soc_hda_codec snd_hda_ext_core snd_hda_codec snd_hwdep snd_hda_core snd_intel_dspcfg snd_soc_core snd_compress ac97_bus snd_pcm snd_seq snd_seq_device snd_timer i2c_i801 btusb i2c_mux i2c_smbus btrtl snd btbcm btmtk btintel bluetooth ecdh_generic rfkill ecc mei_me mei ioatdma dca wmi nfsd drm i2c_dev fuse nfnetlink [ 371.073198] CR2: 000001ffffffffff [ 371.073199] ---[ end trace 0000000000000000 ]--- Add a guard around file close, and ensure the warnings from drm_mode_config do not trigger. Fix those by allowing an open reference to the file descriptor and cleaning up the file linked list entry in drm_mode_config_cleanup(). Cc: <stable@vger.kernel.org> # v4.18+ Fixes: `bee330f3d6` ("drm: Use srcu to protect drm_device.unplugged") Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20260313151728.14990-4-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>	2026-03-17 17:49:12 +01:00
Pratap Nirujogi	3fc4648b53	drm/amdgpu: Fix ISP segfault issue in kernel v7.0 Add NULL pointer checks for dev->type before accessing dev->type->name in ISP genpd add/remove functions to prevent kernel crashes. This regression was introduced in v7.0 as the wakeup sources are registered using physical device instead of ACPI device. This led to adding wakeup source device as the first child of AMDGPU device without initializing dev-type variable, and resulted in segfault when accessed it in the amdgpu isp driver. Fixes: `057edc58aa` ("ACPI: PM: Register wakeup sources under physical devices") Suggested-by: Bin Du <Bin.Du@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `c51632d1ed`)	2026-03-17 12:19:29 -04:00
Alex Deucher	f39e127027	drm/amdgpu/gmc9.0: add bounds checking for cid The value should never exceed the array size as those are the only values the hardware is expected to return, but add checks anyway. Cc: Benjamin Cheng <benjamin.cheng@amd.com> Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `e14d468304`) Cc: stable@vger.kernel.org	2026-03-17 12:19:23 -04:00
Alex Deucher	9c52f49545	drm/amdgpu/mmhub4.2.0: add bounds checking for cid The value should never exceed the array size as those are the only values the hardware is expected to return, but add checks anyway. Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `dea5f235ba`) Cc: stable@vger.kernel.org	2026-03-17 12:19:17 -04:00
Alex Deucher	3cdd405831	drm/amdgpu/mmhub4.1.0: add bounds checking for cid The value should never exceed the array size as those are the only values the hardware is expected to return, but add checks anyway. Reviewed-by: Benjamin Cheng <benjamin.cheng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `04f063d850`) Cc: stable@vger.kernel.org	2026-03-17 12:19:11 -04:00

1 2 3 4 5 ...

122554 Commits