linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-04-04 11:15:39 -04:00

Author	SHA1	Message	Date
Dave Airlie	9a3f210737	Merge tag 'drm-xe-fixes-2025-09-11' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Don't touch survivability_mode on fini (Michal) - Fixes around eviction and suspend (Thomas) - Extend Wa_13011645652 to PTL-H, WCL (Julia) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/aMLq7QlaEPHGKXKX@intel.com	2025-09-12 09:44:07 +10:00
Dave Airlie	dab1f85526	Merge tag 'drm-misc-fixes-2025-09-11' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes A maintainer update, an out-of-bound check for panthor and a revert for nouveau to fix a race. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/20250911-glistening-uakari-of-serendipity-06ceb1@houat	2025-09-12 09:34:48 +10:00
Dave Airlie	f2c8bbb6e9	Merge tag 'mediatek-drm-fixes-20250910' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes - 20250910 1. fix potential OF node use-after-free Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://lore.kernel.org/r/20250910231813.3526-1-chunkuang.hu@kernel.org	2025-09-12 09:31:27 +10:00
Dave Airlie	1d00adb873	Merge tag 'amd-drm-fixes-6.17-2025-09-10' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.17-2025-09-10: amdgpu: - PSP 11.x fix - DPCD quirk handling fix - DCN 3.5 PG fix - Audio suspend fix - OEM i2c clean up fix - Module unload memory leak fix - DC delay fix - ISP firmware fix - VCN fixes amdkfd: - P2P topology fix - APU mem limit calculation fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250910162855.2507853-1-alexander.deucher@amd.com	2025-09-12 09:24:57 +10:00
Dave Airlie	467360e295	Merge tag 'drm-intel-fixes-2025-09-10' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Fix size for for_each_set_bit() in abox iteration [display] (Jani Nikula) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tursulin@igalia.com> Link: https://lore.kernel.org/r/aMFUtRdJ46qK-EXl@linux	2025-09-12 09:21:20 +10:00
Dave Airlie	2c38074c36	Merge tag 'drm-rust-fixes-2025-09-05' of https://gitlab.freedesktop.org/drm/rust/kernel into drm-fixes - Add drm-rust tree to MAINTAINERS - Require CONFIG_64BIT for Nova Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/aLquN1YvdyI_6PJS@google.com	2025-09-12 09:05:42 +10:00
Johan Hovold	9ba2556cef	drm/mediatek: clean up driver data initialisation The platform and drm devices are only used to look up the drm device and its driver data respectively when initialising the driver data during bind(). Drop the reference counts as soon as they have been used to make the code more readable. Note that the crtc count is never incremented on lookup failures. Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250829090345.21075-3-johan@kernel.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2025-09-10 12:52:59 +00:00
Johan Hovold	4de37a48b6	drm/mediatek: fix potential OF node use-after-free The for_each_child_of_node() helper drops the reference it takes to each node as it iterates over children and an explicit of_node_put() is only needed when exiting the loop early. Drop the recently introduced bogus additional reference count decrement at each iteration that could potentially lead to a use-after-free. Fixes: `1f403699c4` ("drm/mediatek: Fix device/node reference count leaks in mtk_drm_get_all_drm_priv") Cc: Ma Ke <make24@iscas.ac.cn> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250829090345.21075-2-johan@kernel.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2025-09-10 12:49:37 +00:00
David Rosca	3318f2d20c	drm/amdgpu/vcn: Allow limiting ctx to instance 0 for AV1 at any time There is no reason to require this to happen on first submitted IB only. We need to wait for the queue to be idle, but it can be done at any time (including when there are multiple video sessions active). Signed-off-by: David Rosca <david.rosca@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `8908fdce06`) Cc: stable@vger.kernel.org	2025-09-09 16:42:26 -04:00
David Rosca	2b10cb58d7	drm/amdgpu/vcn4: Fix IB parsing with multiple engine info packages There can be multiple engine info packages in one IB and the first one may be common engine, not decode/encode. We need to parse the entire IB instead of stopping after finding first engine info. Signed-off-by: David Rosca <david.rosca@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `dc8f9f0f45`) Cc: stable@vger.kernel.org	2025-09-09 16:41:49 -04:00
Pratap Nirujogi	857ccfc19f	drm/amd/amdgpu: Declare isp firmware binary file Declare isp firmware file isp_4_1_1.bin required by isp4.1.1 device. Suggested-by: Alexey Zagorodnikov <xglooom@gmail.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `d97b74a833`) Cc: stable@vger.kernel.org	2025-09-09 16:41:15 -04:00
Alex Deucher	1d66c3f2b8	drm/amd/display: use udelay rather than fsleep This function can be called from an atomic context so we can't use fsleep(). Fixes: `01f60348d8` ("drm/amd/display: Fix 'failed to blank crtc!'") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4549 Cc: Wen Chen <Wen.Chen3@amd.com> Cc: Fangzhi Zuo <jerry.zuo@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `27e4dc2c05`)	2025-09-09 16:39:16 -04:00
Alex Deucher	7838fb5f11	drm/amdgpu: fix a memory leak in fence cleanup when unloading Commit `b61badd20b` ("drm/amdgpu: fix usage slab after free") reordered when amdgpu_fence_driver_sw_fini() was called. After that patch, amdgpu_fence_driver_sw_fini() effectively became a no-op, as the sched entities were never freed because the ring pointers were already set to NULL. Remove the NULL setting. Reported-by: Lin.Cao <lincao12@amd.com> Cc: Vitaly Prosyak <vitaly.prosyak@amd.com> Cc: Christian König <christian.koenig@amd.com> Fixes: `b61badd20b` ("drm/amdgpu: fix usage slab after free") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `a525fa37aa`) Cc: stable@vger.kernel.org	2025-09-09 16:38:26 -04:00
Julia Filipchuk	fd99415ec8	drm/xe: Extend Wa_13011645652 to PTL-H, WCL Expand workaround to additional graphics architectures. Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: intel-xe@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v6.17+ Signed-off-by: Julia Filipchuk <julia.filipchuk@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250903190122.1028373-2-julia.filipchuk@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `6fc957185e`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-09 13:20:36 -04:00
Thomas Hellström	eb5723a751	drm/xe: Block exec and rebind worker while evicting for suspend / hibernate When the xe pm_notifier evicts for suspend / hibernate, there might be racing tasks trying to re-validate again. This can lead to suspend taking excessive time or getting stuck in a live-lock. This behaviour becomes much worse with the fix that actually makes re-validation bring back bos to VRAM rather than letting them remain in TT. Prevent that by having exec and the rebind worker wait for a completion that is set to block by the pm_notifier before suspend and is signaled by the pm_notifier after resume / wakeup. It's probably still possible to craft malicious applications that block suspending. More work is pending to fix that. v3: - Avoid wait_for_completion() in the kernel worker since it could potentially cause work item flushes from freezable processes to wait forever. Instead terminate the rebind workers if needed and re-launch at resume. (Matt Auld) v4: - Fix some bad naming and leftover debug printouts. - Fix kerneldoc. - Use drmm_mutex_init() for the xe->rebind_resume_lock (Matt Auld). - Rework the interface of xe_vm_rebind_resume_worker (Matt Auld). Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/4288 Fixes: `c6a4d46ec1` ("drm/xe: evict user memory in PM notifier") Cc: Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.16+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250904160715.2613-4-thomas.hellstrom@linux.intel.com (cherry picked from commit `599334572a`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-09 13:20:31 -04:00
Thomas Hellström	d84820309e	drm/xe: Allow the pm notifier to continue on failure Its actions are opportunistic anyway and will be completed on device suspend. Marking as a fix to simplify backporting of the fix that follows in the series. v2: - Keep the runtime pm reference over suspend / hibernate and document why. (Matt Auld, Rodrigo Vivi): Fixes: `c6a4d46ec1` ("drm/xe: evict user memory in PM notifier") Cc: Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.16+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250904160715.2613-3-thomas.hellstrom@linux.intel.com (cherry picked from commit `ebd546fdff`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-09 13:20:26 -04:00
Thomas Hellström	5c87fee3c9	drm/xe: Attempt to bring bos back to VRAM after eviction VRAM+TT bos that are evicted from VRAM to TT may remain in TT also after a revalidation following eviction or suspend. This manifests itself as applications becoming sluggish after buffer objects get evicted or after a resume from suspend or hibernation. If the bo supports placement in both VRAM and TT, and we are on DGFX, mark the TT placement as fallback. This means that it is tried only after VRAM + eviction. This flaw has probably been present since the xe module was upstreamed but use a Fixes: commit below where backporting is likely to be simple. For earlier versions we need to open-code the fallback algorithm in the driver. v2: - Remove check for dgfx. (Matthew Auld) - Update the xe_dma_buf kunit test for the new strategy (CI) - Allow dma-buf to pin in current placement (CI) - Make xe_bo_validate() for pinned bos a NOP. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5995 Fixes: `a78a8da51b` ("drm/ttm: replace busy placement with flags v6") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v6.9+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250904160715.2613-2-thomas.hellstrom@linux.intel.com (cherry picked from commit `cb3d7b3b46`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-09 13:20:22 -04:00
Michal Wajdeczko	7934fdc25a	drm/xe/configfs: Don't touch survivability_mode on fini This is a user controlled configfs attribute, we should not modify that outside the configfs attr.store() implementation. Fixes: `bc417e54e2` ("drm/xe: Enable configfs support for survivability mode") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250904103521.7130-1-michal.wajdeczko@intel.com (cherry picked from commit `079a5c83db`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-09 13:20:17 -04:00
Yifan Zhang	5350355627	amd/amdkfd: correct mem limit calculation for small APUs Current mem limit check leaks some GTT memory (reserved_for_pt + reserved_for_ras + adev->vram_pin_size) for small APUs. Since carveout VRAM is tunable on APUs, there are three cases regarding the carveout VRAM size relative to GTT: 1. 0 < carveout < gtt apu_prefer_gtt = true, is_app_apu = false 2. carveout > gtt / 2 apu_prefer_gtt = false, is_app_apu = false 3. 0 = carveout apu_prefer_gtt = true, is_app_apu = true It doesn't make sense to check the limitation below in case 1 (default case, small carveout) because the values in the expression below mix carveout and gtt. adev->kfd.vram_used[xcp_id] + vram_needed > vram_size - reserved_for_pt - reserved_for_ras - atomic64_read(&adev->vram_pin_size) gtt: kfd.vram_used, vram_needed, vram_size carveout: reserved_for_pt, reserved_for_ras, adev->vram_pin_size In case 1, vram allocation will go to the gtt domain, so skip the vram check since the ttm_mem_limit check already covers this allocation. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `fa7c99f04f`)	2025-09-09 12:28:28 -04:00
Eric Huang	ce42a3b581	drm/amdkfd: fix p2p links bug in topology When creating p2p links, KFD needs to check the XGMI link against two conditions, hive_id and is_sharing_enabled, but the is_sharing_enabled check is missing, so add it to fix the error. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `36cc7d1317`)	2025-09-09 12:28:18 -04:00
Geoffrey McRae	1dfd2864a1	drm/amd/display: remove oem i2c adapter on finish Fixes a bug where unbinding of the GPU would leave the oem i2c adapter registered resulting in a null pointer dereference when applications try to access the invalid device. Fixes: `3d5470c973` ("drm/amd/display/dm: add support for OEM i2c bus") Cc: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Geoffrey McRae <geoffrey.mcrae@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `89923fb7ea`) Cc: stable@vger.kernel.org	2025-09-09 12:27:52 -04:00
Mario Limonciello (AMD)	60f71f0db7	drm/amd/display: Drop dm_prepare_suspend() and dm_complete() [Why] dm_prepare_suspend() was added in commit `50e0bae34f` ("drm/amd/display: Add and use new dm_prepare_suspend() callback") to allow display to turn off earlier in the suspend sequence. This caused a regression that HDMI audio sometimes didn't work properly after resume unless audio was playing during suspend. [How] Drop dm_prepare_suspend() callback. All code in it will still run during dm_suspend(). Also drop unnecessary dm_complete() callback. dm_complete() was used for failed prepare and also for any case of successful resume. The code in it already runs in dm_resume(). This change will introduce more time that the display is turned on during suspend sequence. The compositor can turn it off sooner if desired. Cc: Harry Wentland <harry.wentland@amd.com> Reported-by: Przemysław Kopa <prz.kopa@gmail.com> Closes: https://lore.kernel.org/amd-gfx/1cea0d56-7739-4ad9-bf8e-c9330faea2bb@kernel.org/T/#m383d9c08397043a271b36c32b64bb80e524e4b0f Reported-by: Kalvin <hikaph+oss@gmail.com> Closes: https://github.com/alsa-project/alsa-lib/issues/465 Closes: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/4809 Fixes: `50e0bae34f` ("drm/amd/display: Add and use new dm_prepare_suspend() callback") Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `2fd653b9bb`) Cc: stable@vger.kernel.org	2025-09-09 12:26:28 -04:00
Ovidiu Bunea	70f0b051f8	drm/amd/display: Correct sequences and delays for DCN35 PG & RCG [why] The current PG & RCG programming in driver has some gaps and incorrect sequences. [how] Added delays after ungating clocks to allow ramp up, increased polling to allow more time for power up, and removed the incorrect sequences. Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Charlene Liu <charlene.liu@amd.com> Signed-off-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Signed-off-by: Wayne Lin <wayne.lin@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `1bde5584e2`) Cc: stable@vger.kernel.org	2025-09-09 12:25:22 -04:00
Fangzhi Zuo	f5c32370db	drm/amd/display: Disable DPCD Probe Quirk Disable dpcd probe quirk to native aux. Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com> Reviewed-by: Imre Deak <imre.deak@intel.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4500 Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://lore.kernel.org/r/20250904191351.746707-1-Jerry.Zuo@amd.com Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `c5f4fb4058`) Cc: stable@vger.kernel.org # 6.16.y: `5281cbe0b5` Cc: stable@vger.kernel.org # 6.16.y: `0b4aa85e89` Cc: stable@vger.kernel.org # 6.16.y: `b87ed522b3` Cc: stable@vger.kernel.org # 6.16.y	2025-09-09 12:23:05 -04:00
Jani Nikula	cfa7b76597	drm/i915/power: fix size for for_each_set_bit() in abox iteration for_each_set_bit() expects size to be in bits, not bytes. The abox mask iteration uses bytes, but it works by coincidence, because the local variable holding the mask is unsigned long, and the mask only ever has bit 2 as the highest bit. Using a smaller type could lead to subtle and very hard to track bugs. Fixes: `62afef2811` ("drm/i915/rkl: RKL uses ABOX0 for pixel transfers") Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: stable@vger.kernel.org # v5.9+ Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20250905104149.1144751-1-jani.nikula@intel.com Signed-off-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit `7ea3baa6ef`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2025-09-09 09:08:37 +01:00
Lijo Lazar	440cec4ca1	drm/amdgpu: Wait for bootloader after PSPv11 reset Some PSPv11 SOCs take a longer time for PSP based mode-1 reset. Instead of checking for C2PMSG_33 status, add the callback wait_for_bootloader. Waiting for the bootloader to return to steady state is already part of the generic mode-1 reset flow. Increase the retry count for bootloader wait and also fix the mask to prevent fake pass. Fixes: `8345a71fc5` ("drm/amdgpu: Add more checks to PSP mailbox") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4531 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `32f73741d6`)	2025-09-08 11:05:53 -04:00
Dave Airlie	8b556ddeee	Merge tag 'amd-drm-fixes-6.17-2025-09-03' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.17-2025-09-03: amdgpu: - UserQ fixes - MES 11 fix - eDP/LVDS fix - Fix non-DC audio clean up - Fix duplicate cursor issue - Fix error path in PSP init Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250903221656.251254-1-alexander.deucher@amd.com	2025-09-05 08:06:34 +10:00
Chia-I Wu	a00f2015ac	drm/panthor: validate group queue count A panthor group can have at most MAX_CS_PER_CSG panthor queues. Fixes: `4bdca11507` ("drm/panthor: Add the driver frontend block") Signed-off-by: Chia-I Wu <olvaffe@gmail.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> # v1 Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Steven Price <steven.price@arm.com> Link: https://lore.kernel.org/r/20250903192133.288477-1-olvaffe@gmail.com	2025-09-04 15:59:23 +01:00
Dave Airlie	40bcf6ecf9	Merge tag 'drm-xe-fixes-2025-09-03' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Fix incorrect migration of backed-up object to VRAM (Thomas) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/aLiP26TiHkYxtBXL@intel.com	2025-09-04 12:52:19 +10:00
Dave Airlie	42e0a73bf7	Merge tag 'drm-misc-fixes-2025-09-03' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Two nouveau interrupt handling fixes, one race fix for ivpu, a race fix for drm_sched, and a clock fix for ti-sn65dsi86. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/qc2rd7bskgufjtyspbjflyjpswcnhyja6s7nm2yb67j7hezyey@yfn2w6n5trff	2025-09-04 12:36:11 +10:00
Philipp Stanner	d506703472	Revert "drm/nouveau: Remove waitque for sched teardown" This reverts: commit `bead880022` ("drm/nouveau: Remove waitque for sched teardown") commit `5f46f5c7af` ("drm/nouveau: Add new callback for scheduler teardown") from the drm/sched teardown leak fix series: https://lore.kernel.org/dri-devel/20250710125412.128476-2-phasta@kernel.org/ The aforementioned series removed a blocking waitqueue from nouveau_sched_fini(). It was mistakenly assumed that this waitqueue only prevents jobs from leaking, which the series fixed. The waitqueue, however, also guarantees that all VM_BIND related jobs are finished in order, cleaning up mappings in the GPU's MMU. These jobs must be executed sequentially. Without the waitqueue, this is no longer guaranteed, because entity and scheduler teardown can race with each other. Revert all patches related to the waitqueue removal. Fixes: `bead880022` ("drm/nouveau: Remove waitque for sched teardown") Suggested-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250901083107.10206-2-phasta@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>	2025-09-03 23:16:59 +02:00
Colin Ian King	467e00b30d	drm/amd/amdgpu: Fix missing error return on kzalloc failure Currently the kzalloc failure check just reports the failure and sets the variable ret to -ENOMEM, which is not checked later for this specific error. Fix this by just returning -ENOMEM rather than setting ret. Fixes: `4fb9307154` ("drm/amd/amdgpu: remove redundant host to psp cmd buf allocations") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `1ee9d1a096`)	2025-09-03 16:27:56 -04:00
Michael Walle	bdd5a14e66	drm/bridge: ti-sn65dsi86: fix REFCLK setting The bridge has three bootstrap pins which are sampled to determine the frequency of the external reference clock. The driver will also (over)write that setting. But it seems this is racy after the bridge is enabled. It was observed that although the driver writes the correct value (by sniffing on the I2C bus), the register has the wrong value. The datasheet states that the GPIO lines have to be stable for at least 5us after asserting the EN signal. Thus, there seems to be some logic which samples the GPIO lines and this logic appears to overwrite the register value which was set by the driver. Waiting 20us after asserting the EN line resolves this issue. Fixes: `a095f15c00` ("drm/bridge: add support for sn65dsi86 bridge driver") Signed-off-by: Michael Walle <mwalle@kernel.org> Reviewed-by: Douglas Anderson <dianders@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Link: https://lore.kernel.org/r/20250821122341.1257286-1-mwalle@kernel.org	2025-09-02 09:56:05 -07:00
Thomas Hellström	379b3c983f	drm/xe: Fix incorrect migration of backed-up object to VRAM If an object is backed up to shmem it is incorrectly identified as not having valid data by the move code. This means moving to VRAM skips the -EMULTIHOP step and the bo is cleared. This causes all sorts of weird behaviour on DGFX if an already evicted object is targeted by the shrinker. Fix this by using ttm_tt_is_swapped() to identify backed-up objects. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5996 Fixes: `00c8efc318` ("drm/xe: Add a shrinker for xe bos") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v6.15+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://lore.kernel.org/r/20250828134837.5709-1-thomas.hellstrom@linux.intel.com (cherry picked from commit `1047bd8279`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-02 09:00:47 -04:00
Pierre-Eric Pelloux-Prayer	232674e1a6	drm/sched: Fix racy access to drm_sched_entity.dependency The drm_sched_job_unschedulable trace point can access entity->dependency after it was cleared by the callback installed in drm_sched_entity_add_dependency_cb, causing: BUG: kernel NULL pointer dereference, address: 0000000000000020 [...] Workqueue: comp_1.1.0 drm_sched_run_job_work [gpu_sched] RIP: 0010:trace_event_raw_event_drm_sched_job_unschedulable+0x70/0xd0 [gpu_sched] To fix this we either need to keep a reference to the fence before setting up the callbacks, or move the trace_drm_sched_job_unschedulable calls into drm_sched_entity_add_dependency_cb where they can be done earlier. Fixes: `76d97c870f` ("drm/sched: Trace dependencies for GPU jobs") Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://lore.kernel.org/r/20250901124032.1955-1-pierre-eric.pelloux-prayer@amd.com (cherry picked from commit `b2b8af21fe`) Signed-off-by: Maxime Ripard <mripard@kernel.org>	2025-09-02 12:58:56 +02:00
Danilo Krummrich	5c5a41a754	gpu: nova-core: depend on CONFIG_64BIT If built on architectures with CONFIG_ARCH_DMA_ADDR_T_64BIT=n nova-core produces the following build failures: error[E0308]: mismatched types --> drivers/gpu/nova-core/fb.rs:49:59 \| 49 \| hal::fb_hal(chipset).write_sysmem_flush_page(bar, page.dma_handle())?; \| ----------------------- ^^^^^^^^^^^^^^^^^ expected `u64`, found `u32` \| \| \| arguments to this method are incorrect \| note: method defined here --> drivers/gpu/nova-core/fb/hal.rs:19:8 \| 19 \| fn write_sysmem_flush_page(&self, bar: &Bar0, addr: u64) -> Result; \| ^^^^^^^^^^^^^^^^^^^^^^^ help: you can convert a `u32` to a `u64` \| 49 \| hal::fb_hal(chipset).write_sysmem_flush_page(bar, page.dma_handle().into())?; \| +++++++ error[E0308]: mismatched types --> drivers/gpu/nova-core/fb.rs:65:47 \| 65 \| if hal.read_sysmem_flush_page(bar) == self.page.dma_handle() { \| ------------------------------- ^^^^^^^^^^^^^^^^^^^^^^ expected `u64`, found `u32` \| \| \| expected because this is `u64` \| help: you can convert a `u32` to a `u64` \| 65 \| if hal.read_sysmem_flush_page(bar) == self.page.dma_handle().into() { \| +++++++ error: this arithmetic operation will overflow --> drivers/gpu/nova-core/falcon.rs:469:23 \| 469 \| .set_base((dma_start >> 40) as u16) \| ^^^^^^^^^^^^^^^^^ attempt to shift right by `40_i32`, which would overflow \| = note: `#[deny(arithmetic_overflow)]` on by default This is due to the code making assumptions on the width of dma_addr_t to be 64 bit. While this could technically be handled, it is rather painful to deal with, as the following example illustrates: pub(super) fn read_sysmem_flush_page_ga100(bar: &Bar0) -> DmaAddress { let addr = u64::from(regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR::read(bar).adr_39_08()) << FLUSH_SYSMEM_ADDR_SHIFT \| u64::from(regs::NV_PFB_NISO_FLUSH_SYSMEM_ADDR_HI::read(bar).adr_63_40()) << FLUSH_SYSMEM_ADDR_SHIFT_HI; addr.try_into().unwrap_or_else(\|_\| { kernel::warn_on!(true); 0 }) } At the same time there's not much value for nova-core to support 32-bit, given that the supported GPU architectures are Turing and later, hence depend on CONFIG_64BIT. Cc: John Hubbard <jhubbard@nvidia.com> Reported-by: Miguel Ojeda <ojeda@kernel.org> Closes: https://lore.kernel.org/lkml/20250828160247.37492-1-ojeda@kernel.org/ Fixes: `6554ad65b5` ("gpu: nova-core: register sysmem flush page") Fixes: `69f5cd67ce` ("gpu: nova-core: add falcon register definitions and base code") Reviewed-by: Alexandre Courbot <acourbot@nvidia.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Link: https://lore.kernel.org/r/20250828223954.351348-1-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>	2025-09-01 11:06:37 +02:00
Faith Ekstrand	2cb66ae604	nouveau: Membar between semaphore writes and the interrupt This ensures that the memory write and the interrupt are properly ordered and we won't wake up the kernel before the semaphore write has hit memory. Fixes: `b1ca384772` ("drm/nouveau/gv100-: switch to volta semaphore methods") Cc: stable@vger.kernel.org Signed-off-by: Faith Ekstrand <faith.ekstrand@collabora.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://lore.kernel.org/r/20250829021633.1674524-2-airlied@gmail.com Signed-off-by: Danilo Krummrich <dakr@kernel.org>	2025-08-29 18:36:51 +02:00
Dave Airlie	0ef5c4e4db	nouveau: fix disabling the nonstall irq due to storm code Nouveau has code that when it gets an IRQ with no allowed handler it disables it to avoid storms. However with nonstall interrupts, we often disable them from the drm driver, but still request their emission via the push submission. Just don't disable nonstall irqs ever in normal operation, the event handling code will filter them out, and the driver will just enable/disable them at load time. This fixes timeouts we've been seeing on/off for a long time, but they became a lot more noticeable on Blackwell. This doesn't fix all of them, there is a subsequent fence emission fix to fix the last few. Fixes: `3ebd64aa3c` ("drm/nouveau/intr: support multiple trees, and explicit interfaces") Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://lore.kernel.org/r/20250829021633.1674524-1-airlied@gmail.com [ Fix a typo and a minor checkpatch.pl warning; remove "v2" from commit subject. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>	2025-08-29 18:36:23 +02:00
Ivan Lipski	3ebf766c35	drm/amd/display: Clear the CUR_ENABLE register on DCN314 w/out DPP PG [Why&How] On DCN314, clearing the DPP SW structure without power gating it can cause a double cursor in full screen with non-native scaling. Add a W/A that clears the CURSOR0_CONTROL cursor_enable flag if dcn10_plane_atomic_power_down is called and DPP power gating is disabled. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4168 Reviewed-by: Sun peng (Leo) Li <sunpeng.li@amd.com> Signed-off-by: Ivan Lipski <ivan.lipski@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `645f74f1dc`) Cc: stable@vger.kernel.org	2025-08-29 11:15:08 -04:00
Alex Deucher	71403f58b4	drm/amdgpu: drop hw access in non-DC audio fini We already disable the audio pins in hw_fini so there is no need to do it again in sw_fini. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4481 Cc: oushixiong <oushixiong1025@163.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `5eeb16ca72`) Cc: stable@vger.kernel.org	2025-08-29 11:13:38 -04:00
Mario Limonciello	a8b79b0918	drm/amd: Re-enable common modes for eDP and LVDS [Why] Although compositors will add their own modes, Xorg won't use its own modes and will only stick to modes advertised by the driver. This means a user that used to pick 1024x768 could no longer access it unless the panel's native resolution was 1024x768. [How] Revert commit `6d396e7ac1` ("drm/amd/display: Disable common modes for LVDS") and commit `7948afb46a` ("drm/amd/display: Disable common modes for eDP"). The panel will still use scaling for any non-native modes due to commit `978fa2f6d0` ("drm/amd/display: Use scaling for non-native resolutions on eDP") Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4538 Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250828140856.2887993-1-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `c2fbf72fe3`)	2025-08-29 11:13:04 -04:00
Alex Deucher	5171848bdf	drm/amdgpu/mes11: make MES_MISC_OP_CHANGE_CONFIG failure non-fatal If the firmware is too old, just warn and return success. Fixes: `27b7915147` ("drm/amdgpu/mes: keep enforce isolation up to date") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4414 Cc: shaoyun.Liu@amd.com Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `9f28af76fa`) Cc: stable@vger.kernel.org	2025-08-29 11:12:44 -04:00
Jesse.Zhang	2d41a4bfee	drm/amdgpu/sdma: bump firmware version checks for user queue support Using the previous firmware could lead to problems with PROTECTED_FENCE_SIGNAL commands, specifically causing register conflicts between MCU_DBG0 and MCU_DBG1. The updated firmware versions ensure proper alignment and unification of the SDMA_SUBOP_PROTECTED_FENCE_SIGNAL value with SDMA 7.x, resolving these hardware coordination issues Fixes: `e8cca30d8b` ("drm/amdgpu/sdma6: add ucode version checks for userq support") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `aab8b689ad`) Cc: stable@vger.kernel.org	2025-08-29 11:11:45 -04:00
Dave Airlie	42d2abbfa8	Merge tag 'mediatek-drm-fixes-20250829' of https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes Mediatek DRM Fixes - 20250829 1. Add error handling for old state CRTC in atomic_disable 2. Fix DSI host and panel bridge pre-enable order 3. Fix device/node reference count leaks in mtk_drm_get_all_drm_priv 4. mtk_hdmi: Fix inverted parameters in some regmap_update_bits calls Signed-off-by: Dave Airlie <airlied@redhat.com> From: Chun-Kuang Hu <chunkuang.hu@kernel.org> Link: https://lore.kernel.org/r/20250828234116.4960-1-chunkuang.hu@kernel.org	2025-08-29 10:05:10 +10:00
Louis-Alexis Eyraud	c34414883f	drm/mediatek: mtk_hdmi: Fix inverted parameters in some regmap_update_bits calls In mtk_hdmi driver, a recent change replaced custom register access function calls by regmap ones, but two replacements by regmap_update_bits were done incorrectly, because original offset and mask parameters were inverted, so fix them. Fixes: `d6e25b3590` ("drm/mediatek: hdmi: Use regmap instead of iomem for main registers") Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250818-mt8173-fix-hdmi-issue-v1-1-55aff9b0295d@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2025-08-28 23:15:41 +00:00
Dave Airlie	49862587fa	Merge tag 'drm-msm-fixes-2025-08-26' of https://gitlab.freedesktop.org/drm/msm into drm-fixes Fixes for v6.17-rc4 Core/GPU: - fix comment doc warning in gpuvm - fix build with KMS disabled - fix pgtable setup/teardown race - global fault counter fix - various error path fixes - GPU devcoredump snapshot fixes - handle in-place VM_BIND remaps to solve turnip vm update race - skip re-emitting IBs for unusable VMs - Don't use %pK through printk - moved display snapshot init earlier, fixing a crash DPU: - Fixed crash in virtual plane checking code - Fixed mode comparison in virtual plane checking code DSI: - Adjusted width of resolution-related registers - Fixed locking issue on 14nm PLLs UBWC (per Bjorn's ack) - Added UBWC configuration for several missing platforms (fixing regression) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rob Clark <rob.clark@oss.qualcomm.com> Link: https://lore.kernel.org/r/CACSVV02+u1VW1dzuz6JWwVEfpgTj6Y-JXMH+vX43KsKTVsW+Yg@mail.gmail.com	2025-08-29 09:05:18 +10:00
Dave Airlie	4b1c24ef50	Merge tag 'amd-drm-fixes-6.17-2025-08-28' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.17-2025-08-28: amdgpu: - UserQ fixes - Revert CSA fix - SR-IOV fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250828173904.75850-1-alexander.deucher@amd.com	2025-08-29 08:50:44 +10:00
Dave Airlie	60d98e1a8d	Merge tag 'drm-misc-fixes-2025-08-28' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Several nouveau fixes to remove unused code, fix an error path and be less restrictive with the formats it accepts. A fix for amdgpu to pin vmapped dma-buf, and a revert for tegra for a regression in the dma-buf / GEM code. Signed-off-by: Dave Airlie <airlied@redhat.com> From: Maxime Ripard <mripard@redhat.com> Link: https://lore.kernel.org/r/20250828-hypersonic-colorful-squirrel-64f04b@houat	2025-08-29 08:44:53 +10:00
Alex Deucher	c767d74a9c	drm/amdgpu/userq: fix error handling of invalid doorbell If the doorbell is invalid, be sure to set the r to an error state so the function returns an error. Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `7e2a5b0a9a`) Cc: stable@vger.kernel.org	2025-08-27 14:01:52 -04:00
Jesse.Zhang	ee38ea0ae4	drm/amdgpu: update firmware version checks for user queue support The minimum firmware versions required for user queue functionality have been increased to address an issue where the queue privilege state was lost during queue connect operations. The problem occurred because the privilege state was being restored to its initial value at the beginning of the function, overwriting the state that was properly set during the queue connect case. This commit updates the minimum version requirements: - ME firmware from 2390 to 2420 - PFP firmware from 2530 to 2580 - MEC firmware from 2600 to 2650 - MES firmware remains at 120 These updated firmware versions contain the necessary fixes to properly maintain queue privilege state throughout connect operations. Fixes: `61ca97e959` ("drm/amdgpu: Add fw minimum version check for usermode queue") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `5f976c9939`) Cc: stable@vger.kernel.org	2025-08-27 14:01:32 -04:00

116970 Commits