linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-02-16 17:45:01 -05:00

Author	SHA1	Message	Date
Dave Airlie	00062ea01d	Merge tag 'drm-xe-fixes-2025-08-14' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes - Some more xe_migrate_access_memory fixes (Auld) - Defer buffer object shrinker write-backs and GPU waits (Thomas) - HWMON fix for clamping limits (Karthik) - SRIOV-PF: Set VF LMEM BAR size (Michal) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/aJ4MIZQurSo0uNxn@intel.com	2025-08-15 09:50:26 +10:00
Dave Airlie	4699c04b68	Merge tag 'drm-intel-fixes-2025-08-13' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Fix the implementation of wa_18038517565 [fbc] (Vinod Govindapillai) - Do not trigger Frame Change events from frontbuffer flush [psr] (Jouni Högander) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Tvrtko Ursulin <tursulin@igalia.com> Link: https://lore.kernel.org/r/aJ0HAh06VHWVdv63@linux	2025-08-15 09:05:05 +10:00
Michał Winiarski	94eae6ee4c	drm/xe/pf: Set VF LMEM BAR size LMEM is partitioned between multiple VFs and we expect that the more VFs we have, the less LMEM is assigned to each VF. This means that we can achieve full LMEM BAR access without the need to attempt full VF LMEM BAR resize via pci_resize_resource(). Always try to set the largest possible BAR size that allows to fit the number of enabled VFs and inform the user in case the resize attempt is not successful. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Link: https://lore.kernel.org/r/20250527120637.665506-7-michal.winiarski@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `32a4d1b98e`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-08-14 10:30:53 -04:00
Dave Airlie	68ad07df92	Merge tag 'amd-drm-fixes-6.17-2025-08-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.17-2025-08-13: amdgpu: - PSP fix - VRAM reservation fix - CSA fix - Process kill fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250813151905.2040816-1-alexander.deucher@amd.com	2025-08-14 08:16:31 +10:00
Dave Airlie	f858f63e1d	Merge tag 'drm-misc-next-fixes-2025-08-12' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: bridge: - fix OF-node leak - fix documentation fbdev-emulation: - pass correct format info to drm_helper_mode_fill_fb_struct() panfrost: - print correct RSS size Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250812064712.GA14554@2a02-2454-fd5e-fd00-2c49-c639-c55f-a125.dyn6.pyur.net	2025-08-14 07:51:34 +10:00
Liu01 Tong	aa5fc4362f	drm/amdgpu: fix task hang from failed job submission during process kill During process kill, drm_sched_entity_flush() will kill the vm entities. The following job submissions of this process will fail, and the resources of these jobs have not been released, nor have the fences been signalled, causing tasks to hang and timeout. Fix by check entity status in amdgpu_vm_ready() and avoid submit jobs to stopped entity. v2: add amdgpu_vm_ready() check before amdgpu_vm_clear_freed() in function amdgpu_cs_vm_handling(). Fixes: `1f02f2044b` ("drm/amdgpu: Avoid extra evict-restore process.") Signed-off-by: Liu01 Tong <Tong.Liu01@amd.com> Signed-off-by: Lin.Cao <lincao12@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `f101c13a87`)	2025-08-12 16:16:36 -04:00
Jack Xiao	040bc6d0e0	drm/amdgpu: fix incorrect vm flags to map bo It should use vm flags instead of pte flags to specify bo vm attributes. Fixes: `7946340fa3` ("drm/amdgpu: Move csa related code to separate file") Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `b08425fa77`)	2025-08-12 16:08:04 -04:00
YiPeng Chai	10ef476aad	drm/amdgpu: fix vram reservation issue The vram block allocation flag must be cleared before making vram reservation, otherwise reserving addresses within the currently freed memory range will always fail. Fixes: `c9cad937c0` ("drm/amdgpu: add drm buddy support to amdgpu") Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `d38eaf27de`)	2025-08-12 16:07:45 -04:00
Frank Min	e67b8afcb6	drm/amdgpu: Add PSP fw version check for fw reserve GFX command The fw reserved GFX command is only supported starting from PSP fw version 0x3a0e14 and 0x3b0e0d. Older versions do not support this command. Add a version guard to ensure the command is only used when the running PSP fw meets the minimum version requirement. This ensures backward compatibility and safe operation across fw revisions. Fixes: `a3b7f9c306` ("drm/amdgpu: reclaim psp fw reservation memory region") Signed-off-by: Frank Min <Frank.Min@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `065e23170a`)	2025-08-12 16:07:31 -04:00
Karthik Poosa	55d49f0616	drm/xe/hwmon: Add SW clamp for power limits writes Clamp writes to power limits powerX_crit/currX_crit, powerX_cap, powerX_max, to the maximum supported by the pcode mailbox when sysfs-provided values exceed this limit. Although the pcode already performs clamping, values beyond the pcode mailbox's supported range get truncated, leading to incorrect critical power settings. This patch ensures proper clamping to prevent such truncation. v2: - Address below review comments. (Riana) - Split comments into multiple sentences. - Use local variables for readability. - Add a debug log. - Use u64 instead of unsigned long. v3: - Change drm_dbg logs to drm_info. (Badal) v4: - Rephrase the drm_info log. (Rodrigo, Riana) - Rename variable max_mbx_power_limit to max_supp_power_limit, as limit is same for platforms with and without mailbox power limit support. Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Fixes: `92d44a422d` ("drm/xe/hwmon: Expose card reactive critical power") Fixes: `fb1b70607f` ("drm/xe/hwmon: Expose power attributes") Reviewed-by: Riana Tauro <riana.tauro@intel.com> Link: https://lore.kernel.org/r/20250808185310.3466529-1-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `d301eb950d`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-08-12 12:52:26 -04:00
Thomas Hellström	2dd7a47669	drm/xe: Defer buffer object shrinker write-backs and GPU waits When the xe buffer-object shrinker allows GPU waits and write-back, (typically from kswapd), perform multiple passes, skipping subsequent passes if the shrinker number of scanned objects target is reached. 1) Without GPU waits and write-back 2) Without write-back 3) With both GPU-waits and write-back This is to avoid stalls and costly write- and readbacks unless they are really necessary. v2: - Don't test for scan completion twice. (Stuart Summers) - Update tags. Reported-by: melvyn <melvyn2@dnsense.pub> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5557 Cc: Summers Stuart <stuart.summers@intel.com> Fixes: `00c8efc318` ("drm/xe: Add a shrinker for xe bos") Cc: <stable@vger.kernel.org> # v6.15+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20250805074842.11359-1-thomas.hellstrom@linux.intel.com (cherry picked from commit `80944d3341`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-08-12 12:52:26 -04:00
Matthew Auld	145832fbdd	drm/xe/migrate: prevent potential UAF If we hit the error path, the previous fence (if there is one) has already been put() prior to this, so doing a fence_wait could lead to UAF. Tweak the flow to do to the put() until after we do the wait. Fixes: `270172f64b` ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Maciej Patelczyk <maciej.patelczyk@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20250731093807.207572-8-matthew.auld@intel.com (cherry picked from commit `9b7ca35ed2`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-08-12 12:52:26 -04:00
Matthew Auld	4126cb327a	drm/xe/migrate: don't overflow max copy size With non-page aligned copy, we need to use 4 byte aligned pitch, however the size itself might still be close to our maximum of ~8M, and so the dimensions of the copy can easily exceed the S16_MAX limit of the copy command leading to the following assert: xe 0000:03:00.0: [drm] Assertion `size / pitch <= ((s16)(((u16)~0U) >> 1))` failed! platform: BATTLEMAGE subplatform: 1 graphics: Xe2_HPG 20.01 step A0 media: Xe2_HPM 13.01 step A1 tile: 0 VRAM 10.0 GiB GT: 0 type 1 WARNING: CPU: 23 PID: 10605 at drivers/gpu/drm/xe/xe_migrate.c:673 emit_copy+0x4b5/0x4e0 [xe] To fix this account for the pitch when calculating the number of current bytes to copy. Fixes: `270172f64b` ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Maciej Patelczyk <maciej.patelczyk@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20250731093807.207572-7-matthew.auld@intel.com (cherry picked from commit `8c2d61e0e9`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-08-12 12:52:26 -04:00
Matthew Auld	9d7a1cbebb	drm/xe/migrate: prevent infinite recursion If the buf + offset is not aligned to XE_CAHELINE_BYTES we fallback to using a bounce buffer. However the bounce buffer here is allocated on the stack, and the only alignment requirement here is that it's naturally aligned to u8, and not XE_CACHELINE_BYTES. If the bounce buffer is also misaligned we then recurse back into the function again, however the new bounce buffer might also not be aligned, and might never be until we eventually blow through the stack, as we keep recursing. Instead of using the stack use kmalloc, which should respect the power-of-two alignment request here. Fixes a kernel panic when triggering this path through eudebug. v2 (Stuart): - Add build bug check for power-of-two restriction - s/EINVAL/ENOMEM/ Fixes: `270172f64b` ("drm/xe: Update xe_ttm_access_memory to use GPU for non-visible access") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Maciej Patelczyk <maciej.patelczyk@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20250731093807.207572-6-matthew.auld@intel.com (cherry picked from commit `38b34e928a`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-08-12 12:52:26 -04:00
Jouni Högander	184889dfe0	drm/i915/psr: Do not trigger Frame Change events from frontbuffer flush We want to get rid of triggering "Frame Change" events from frontbuffer flush calls. We are about to move using TRANS_PUSH register for this on LunarLake and onwards. Touching TRANS_PUSH register from fronbuffer flush would be problematic as it's written by DSB as well. Fix this by using intel_psr_exit when flush or invalidate is done on LunarLake and onwards. This is not possible on AlderLake and MeteorLake due to HW bug in PSR2 disable. This patch is also fixing problems with cursor plane where cursor is disappearing or duplicate cursor is seen on the screen. v2: Commit message updated Bspec: 68927, 68934, 66624 Reported-by: Janna Martl <janna.martl109@gmail.com> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/5522 Fixes: `411ad63877` ("drm/i915/psr: Use SFF_CTL on invalidate/flush for LunarLake onwards") Tested-by: Janna Martl <janna.martl109@gmail.com> Signed-off-by: Jouni Högander <jouni.hogander@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/20250801062905.564453-1-jouni.hogander@intel.com (cherry picked from commit `46fb38cb20`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2025-08-12 09:05:11 +01:00
Vinod Govindapillai	fd56b9c950	drm/i915/fbc: fix the implementation of wa_18038517565 As per the wa_18038517565, we need to disable FBC compressor clock gating before enabling FBC and enable after disabling FBC. Placing the enabling of clock gating in the fbc deactivate function can make the above wa logic go wrong in case of frontbuffer rendering FBC mechanism. FBC deactivate can get called during fb invalidate and then the corresponding FBC activate can get called without properly disabling the clock gating and can result in compression stalled. So move the enable clock gating at the end of one FBC session after FBC is completely disabled for a pipe. Bspec: 74212, 72197, 69741, 65555 Fixes: `010363c461` ("drm/i915/display: implement wa_18038517565") Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com> Reviewed-by: Jouni Högander <jouni.hogander@intel.com> Link: https://lore.kernel.org/r/20250729124648.288497-1-vinod.govindapillai@intel.com (cherry picked from commit `82dde0407a`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2025-08-12 09:05:07 +01:00
Adrián Larumbe	54d4f44551	drm/panfrost: Print RSS for tiler heap BO's in debugfs GEMS file Otherwise it would display the virtual allocation size, which is often much bigger than the RSS. Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com> Fixes: `e48ade5e23` ("drm/panfrost: show device-wide list of DRM GEM objects over DebugFS") Tested-by: Christopher Healy <healych@amazon.com> Reviewed-by: Daniel Stone <daniels@collabora.com> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250808010235.2831853-1-adrian.larumbe@collabora.com	2025-08-12 08:31:47 +02:00
Linus Torvalds	0227b49b50	Merge tag 'gpio-updates-for-v6.17-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux Pull gpio updates from Bartosz Golaszewski: "As discussed: there's a small commit that removes the legacy GPIO line value setter callbacks as they're no longer used and a big, treewide commit that renames the new ones to the old names across all GPIO drivers at once. While at it: there are also two fixes that I picked up over the course of the merge window: - remove unused, legacy GPIO line value setters from struct gpio_chip - rename the new set callbacks back to the original names treewide - fix interrupt handling in gpio-mlxbf2 - revert a buggy immutable irqchip conversion" * tag 'gpio-updates-for-v6.17-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/brgl/linux: treewide: rename GPIO set callbacks back to their original names gpio: remove legacy GPIO line value setter callbacks gpio: mlxbf2: use platform_get_irq_optional() Revert "gpio: pxa: Make irq_chip immutable"	2025-08-09 08:15:43 +03:00
Linus Torvalds	ffe8ac927d	Merge tag 'drm-next-2025-08-08' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Dave Airlie: "This is the fixes that built up in the merge window, mostly amdgpu and xe with one i915 display fix, seems like things are pretty good for rc1. i915: - DP LPFS fixes xe: - SRIOV: PF fixes and removal of need of module param - Fix driver unbind around Devcoredump - Mark xe driver as BROKEN if kernel page size is not 4kB amdgpu: - GC 9.5.0 fixes - SMU fix - DCE 6 DC fixes - mmhub client ID fixes - VRR fix - Backlight fix - UserQ fix - Legacy reset fix - Misc fixes amdkfd: - CRIU fix - Debugfs fix" * tag 'drm-next-2025-08-08' of https://gitlab.freedesktop.org/drm/kernel: (28 commits) drm/amdgpu: add missing vram lost check for LEGACY RESET drm/amdgpu/discovery: fix fw based ip discovery drm/amdkfd: Destroy KFD debugfs after destroy KFD wq amdgpu/amdgpu_discovery: increase timeout limit for IFWI init drm/amdgpu: Update SDMA firmware version check for user queue support drm/amdgpu: Add NULL check for asic_funcs drm/amd/display: Revert "drm/amd/display: Fix AMDGPU_MAX_BL_LEVEL value" drm/amd/display: fix a Null pointer dereference vulnerability drm/amd/display: Add primary plane to commits for correct VRR handling drm/amdgpu: update mmhub 3.3 client id mappings drm/amdgpu: update mmhub 3.0.1 client id mappings drm/amdgpu: Retain job->vm in amdgpu_job_prepare_job drm/amd/display: Fix DCE 6.0 and 6.4 PLL programming. drm/amd/display: Don't overwrite dce60_clk_mgr drm/amdkfd: Fix checkpoint-restore on multi-xcc drm/amd: Restore cached manual clock settings during resume drm/amd: Restore cached power limit during resume drm/amdgpu: Update external revid for GC v9.5.0 drm/amdgpu: Update supported modes for GC v9.5.0 Mark xe driver as BROKEN if kernel page size is not 4kB ...	2025-08-08 06:48:14 +03:00
Dave Airlie	64c6275194	Merge tag 'amd-drm-fixes-6.17-2025-08-07' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-fixes-6.17-2025-08-07: amdgpu: - GC 9.5.0 fixes - SMU fix - DCE 6 DC fixes - mmhub client ID fixes - VRR fix - Backlight fix - UserQ fix - Legacy reset fix - Misc fixes amdkfd: - CRIU fix - Debugfs fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250807132030.1168068-1-alexander.deucher@amd.com	2025-08-08 08:17:13 +10:00
Dave Airlie	10acca927f	Merge tag 'drm-xe-next-fixes-2025-08-06' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next - SRIOV: PF fixes and removal of need of module param (Michal) - Fix driver unbind around Devcoredump (Bala) - Mark xe driver as BROKEN if kernel page size is not 4kB (Simon) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/aJNXnIAp2Cq-2pZj@intel.com	2025-08-08 05:50:11 +10:00
Bartosz Golaszewski	d9d87d90cc	treewide: rename GPIO set callbacks back to their original names The conversion of all GPIO drivers to using the .set_rv() and .set_multiple_rv() callbacks from struct gpio_chip (which - unlike their predecessors - return an integer and allow the controller drivers to indicate failures to users) is now complete and the legacy ones have been removed. Rename the new callbacks back to their original names in one sweeping change. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>	2025-08-07 10:07:06 +02:00
Alex Deucher	81699fe81b	drm/amdgpu: add missing vram lost check for LEGACY RESET Legacy resets reset the memory controllers so VRAM contents may be unreliable after reset. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `aae94897b6`) Cc: stable@vger.kernel.org	2025-08-06 16:54:25 -04:00
Alex Deucher	514678da56	drm/amdgpu/discovery: fix fw based ip discovery We only need the fw based discovery table for sysfs. No need to parse it. Additionally parsing some of the board specific tables may result in incorrect data on some boards. just load the binary and don't parse it on those boards. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4441 Fixes: `80a0e82829` ("drm/amdgpu/discovery: optionally use fw based ip discovery") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `62eedd150f`) Cc: stable@vger.kernel.org	2025-08-06 16:54:04 -04:00
Amber Lin	2e58401a24	drm/amdkfd: Destroy KFD debugfs after destroy KFD wq Since KFD proc content was moved to kernel debugfs, we can't destroy KFD debugfs before kfd_process_destroy_wq. Move kfd_process_destroy_wq prior to kfd_debugfs_fini to fix a kernel NULL pointer problem. It happens when /sys/kernel/debug/kfd was already destroyed in kfd_debugfs_fini but kfd_process_destroy_wq calls kfd_debugfs_remove_process. This line debugfs_remove_recursive(entry->proc_dentry); tries to remove /sys/kernel/debug/kfd/proc/<pid> while /sys/kernel/debug/kfd is already gone. It hangs the kernel by kernel NULL pointer. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Eric Huang <jinhuieric.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `0333052d90`) Cc: stable@vger.kernel.org	2025-08-06 16:52:08 -04:00
Xaver Hugl	928587381b	amdgpu/amdgpu_discovery: increase timeout limit for IFWI init With a timeout of only 1 second, my rx 5700XT fails to initialize, so this increases the timeout to 2s. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3697 Signed-off-by: Xaver Hugl <xaver.hugl@kde.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `9ed3d7bdf2`) Cc: stable@vger.kernel.org	2025-08-06 16:51:26 -04:00
Imre Deak	c0a8e4443d	drm/radeon: Pass along the format info from .fb_create() to drm_helper_mode_fill_fb_struct() Plumb the format info from .fb_create() all the way to drm_helper_mode_fill_fb_struct() to avoid the redundant lookup. For the fbdev case a manual drm_get_format_info() lookup is needed. The patch is based on the driver parts of the patchset at Link: below, which missed converting the radeon driver. Due to the absence of this change in the patchset at Link:, after the Fixed: commit below, radeon_framebuffer_init() -> drm_helper_mode_fill_fb_struct() set drm_framebuffer::format incorrectly to NULL, which lead to the !fb->format WARN() in drm_framebuffer_init() and causing framebuffer creation to fail. This patch fixes both of these issues. v2: Amend the commit log mentioning the functional issues the patch fixes. (Tomi) Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: amd-gfx@lists.freedesktop.org Cc: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Fixes: `41ab92d35c` ("drm: Make passing of format info to drm_helper_mode_fill_fb_struct() mandatory") Link: https://lore.kernel.org/all/20250701090722.13645-1-ville.syrjala@linux.intel.com Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://lore.kernel.org/r/20250805175752.690504-4-imre.deak@intel.com	2025-08-06 15:26:16 +03:00
Imre Deak	d2b524c906	drm/nouveau: Pass along the format info from .fb_create() to drm_helper_mode_fill_fb_struct() Plumb the format info from .fb_create() all the way to drm_helper_mode_fill_fb_struct() to avoid the redundant lookup. The patch is based on the driver parts of the patchset at Link: below, which missed converting the nouveau driver. Due to the absence of this change in the patchset at Link:, after the Fixed: commit below, nouveau_framebuffer_new() -> drm_helper_mode_fill_fb_struct() set drm_framebuffer::format incorrectly to NULL, which lead to the !fb->format WARN() in drm_framebuffer_init() and causing framebuffer creation to fail. This patch fixes both of these issues. v2: Amend the commit log mentioning the functional issues the patch fixes. (Tomi) Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Lyude Paul <lyude@redhat.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Cc: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Cc: nouveau@lists.freedesktop.org Fixes: `41ab92d35c` ("drm: Make passing of format info to drm_helper_mode_fill_fb_struct() mandatory") Link: https://lore.kernel.org/all/20250701090722.13645-1-ville.syrjala@linux.intel.com Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Danilo Krummrich <dakr@kernel.org> Reviewed-by: James Jones <jajones@nvidia.com> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Tested-by: James Jones <jajones@nvidia.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://lore.kernel.org/r/20250805175752.690504-3-imre.deak@intel.com	2025-08-06 15:26:15 +03:00
Imre Deak	f8f6e72fe2	drm/omap: Pass along the format info from .fb_create() to drm_helper_mode_fill_fb_struct() Plumb the format info from .fb_create() all the way to drm_helper_mode_fill_fb_struct() to avoid the redundant lookup. For the fbdev case a manual drm_get_format_info() lookup is needed. The patch is based on the driver parts of the patchset at Link: below, which missed converting the omap driver. Due to the absence of this change in the patchset at Link:, after the Fixed: commit below, omap_framebuffer_init() -> drm_helper_mode_fill_fb_struct() set drm_framebuffer::format incorrectly to NULL, which lead to the !fb->format WARN() in drm_framebuffer_init() and causing framebuffer creation to fail. This patch fixes both of these issues. v2: Amend the commit log mentioning the functional issues the patch fixes. (Tomi) Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Cc: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Maxime Ripard <mripard@kernel.org> Fixes: `41ab92d35c` ("drm: Make passing of format info to drm_helper_mode_fill_fb_struct() mandatory") Reported-by: Mark Brown <broonie@kernel.org> Closes: https://lore.kernel.org/all/98b3a62c-91ff-4f91-a58b-e1265f84180b@sirena.org.uk Link: https://lore.kernel.org/all/20250701090722.13645-1-ville.syrjala@linux.intel.com Tested-by: Mark Brown <broonie@kernel.org> Tested-by: Linux Kernel Functional Testing <lkft@linaro.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://lore.kernel.org/r/20250805175752.690504-2-imre.deak@intel.com	2025-08-06 15:26:14 +03:00
Dave Airlie	48bb97cff9	Merge tag 'drm-intel-next-fixes-2025-08-05' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next drm/i915 fixes for v6.17-rc1: - Fixes around DP LFPS (Low-Frequency Periodic Signaling) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Jani Nikula <jani.nikula@intel.com> Link: https://lore.kernel.org/r/e1147bede8f219682419d198022cfe8d9d4edc28@intel.com	2025-08-06 06:11:29 +10:00
Jesse.Zhang	124ffa2970	drm/amdgpu: Update SDMA firmware version check for user queue support This commit fixes a firmware version check for enabling user queue support in SDMA v7.0. The previous version check (7836028) was incorrect and could lead to issues with PROTECTED_FENCE_SIGNAL commands causing register conflicts between MCU_DBG0 and MCU_DBG1. Fixes: `8c011408ed` ("drm/amdgpu/sdma7: add ucode version checks for userq support") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `92e2449241`) Cc: stable@vger.kernel.org	2025-08-04 15:48:14 -04:00
Lijo Lazar	c2fe914d50	drm/amdgpu: Add NULL check for asic_funcs If driver load fails too early, asic_funcs pointer remains unassigned. Add NULL check to sanitize unwind path. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `582bf7c515`) Cc: stable@vger.kernel.org	2025-08-04 15:47:50 -04:00
Mario Limonciello	8e6a18cbf3	drm/amd/display: Revert "drm/amd/display: Fix AMDGPU_MAX_BL_LEVEL value" This reverts commit `66abb99699`. This broke custom brightness curves but it wasn't obvious because of other related changes. Custom brightness curves are always from a 0-255 input signal. The correct fix was to fix the default value which was done by [1]. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4412 Link: https://lore.kernel.org/amd-gfx/0f094c4b-d2a3-42cd-824c-dc2858a5618d@kernel.org/T/#m69f875a7e69aa22df3370b3e3a9e69f4a61fdaf2 Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Roman Li <roman.li@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `6ec8a5cbec`) Cc: stable@vger.kernel.org	2025-08-04 15:46:36 -04:00
Siyang Liu	1bcf63a443	drm/amd/display: fix a Null pointer dereference vulnerability [Why] A null pointer dereference vulnerability exists in the AMD display driver's (DC module) cleanup function dc_destruct(). When display control context (dc->ctx) construction fails (due to memory allocation failure), this pointer remains NULL. During subsequent error handling when dc_destruct() is called, there's no NULL check before dereferencing the perf_trace member (dc->ctx->perf_trace), causing a kernel null pointer dereference crash. [How] Check if dc->ctx is non-NULL before dereferencing. Link: https://lore.kernel.org/r/tencent_54FF4252EDFB6533090A491A25EEF3EDBF06@qq.com Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> (Updated commit text and removed unnecessary error message) Signed-off-by: Siyang Liu <Security@tencent.com> Signed-off-by: Roman Li <roman.li@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `9dd8e2ba26`) Cc: stable@vger.kernel.org	2025-08-04 15:44:55 -04:00
Michel Dänzer	3477c1b097	drm/amd/display: Add primary plane to commits for correct VRR handling amdgpu_dm_commit_planes calls update_freesync_state_on_stream only for the primary plane. If a commit affects a CRTC but not its primary plane, it would previously not trigger a refresh cycle or affect LFC, violating current UAPI semantics. Fixes e.g. atomic commits affecting only the cursor plane being limited to the minimum refresh rate. Don't do this for the legacy cursor ioctls though, it would break the UAPI semantics for those. Suggested-by: Xaver Hugl <xaver.hugl@kde.org> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3034 Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `cc7bfba959`) Cc: stable@vger.kernel.org	2025-08-04 15:43:58 -04:00
Alex Deucher	9f9bddfa31	drm/amdgpu: update mmhub 3.3 client id mappings Update the client id mapping so the correct clients get printed when there is a mmhub page fault. v2: fix typos spotted by David Wu. v3: fix additional typo spotted by David. Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `e932f4779a`) Cc: stable@vger.kernel.org	2025-08-04 15:42:15 -04:00
Alex Deucher	0bae62cc98	drm/amdgpu: update mmhub 3.0.1 client id mappings Update the client id mapping so the correct clients get printed when there is a mmhub page fault. Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `2a2681eda7`) Cc: stable@vger.kernel.org	2025-08-04 15:41:52 -04:00
YuanShang	c00d8b79fd	drm/amdgpu: Retain job->vm in amdgpu_job_prepare_job The field job->vm is used in function amdgpu_job_run to get the page table re-generation counter and decide whether the job should be skipped. Specifically, function amdgpu_vm_generation checks if the VM is valid for this job to use. For instance, if a gfx job depends on a cancelled sdma job from entity vm->delayed, then the gfx job should be skipped. Fixes: `26c95e838e` ("drm/amdgpu: set the VM pointer to NULL in amdgpu_job_prepare") Signed-off-by: YuanShang <YuanShang.Mao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `ed76936c6b`) Cc: stable@vger.kernel.org	2025-08-04 15:41:09 -04:00
Timur Kristóf	1c8dc3e088	drm/amd/display: Fix DCE 6.0 and 6.4 PLL programming. Apparently, both DCE 6.0 and 6.4 have 3 PLLs, but PLL0 can only be used for DP. Make sure to initialize the correct amount of PLLs in DC for these DCE versions and use PLL0 only for DP. Also, on DCE 6.0 and 6.4, the PLL0 needs to be powered on at initialization as opposed to DCE 6.1 and 7.x which use a different clock source for DFS. The following functions were used as reference from the old radeon driver implementation of DCE 6.x: - radeon_atom_pick_pll - atombios_crtc_set_disp_eng_pll Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `35222b5934`) Cc: stable@vger.kernel.org	2025-08-04 15:39:42 -04:00
Timur Kristóf	4db9cd5548	drm/amd/display: Don't overwrite dce60_clk_mgr dc_clk_mgr_create accidentally overwrites the dce60_clk_mgr with the dce_clk_mgr, causing incorrect behaviour on DCE6. Fix it by removing the extra dce_clk_mgr_construct. Fixes: `62eab49faa` ("drm/amd/display: hide VGH asic specific structs") Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `bbddcbe36a`) Cc: stable@vger.kernel.org	2025-08-04 15:39:21 -04:00
David Yat Sin	f6c0f3d244	drm/amdkfd: Fix checkpoint-restore on multi-xcc GPUs with multi-xcc have multiple MQDs per queue. This patch saves and restores all the MQDs within the partition. Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `a578f2a58c`) Cc: stable@vger.kernel.org	2025-08-04 15:38:49 -04:00
Mario Limonciello	796ff8a7e0	drm/amd: Restore cached manual clock settings during resume If the SCLK limits have been set before S3 they will not be restored. The limits are however cached in the driver and so they can be restored by running a commit sequence during resume. Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250725031222.3015095-3-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `4e9526924d`) Cc: stable@vger.kernel.org	2025-08-04 15:37:26 -04:00
Mario Limonciello	ed4efe426a	drm/amd: Restore cached power limit during resume The power limit will be cached in smu->current_power_limit but if the ASIC goes into S3 this value won't be restored. Restore the value during SMU resume. Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20250725031222.3015095-2-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `26a609e053`) Cc: stable@vger.kernel.org	2025-08-04 15:37:05 -04:00
Lijo Lazar	05c8b69051	drm/amdgpu: Update external revid for GC v9.5.0 Use different external revid for GC v9.5.0 SOCs. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `21c6764ed4`) Cc: stable@vger.kernel.org	2025-08-04 15:32:37 -04:00
Lijo Lazar	389d79a195	drm/amdgpu: Update supported modes for GC v9.5.0 For GC v9.5.0 SOCs, both CPX and QPX compute modes are also supported in NPS2 mode. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Mangesh Gadre <Mangesh.Gadre@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `9d1ac25c7f`) Cc: stable@vger.kernel.org	2025-08-04 15:32:07 -04:00
Simon Richter	022906afdf	Mark xe driver as BROKEN if kernel page size is not 4kB This driver, for the time being, assumes that the kernel page size is 4kB, so it fails on loong64 and aarch64 with 16kB pages, and ppc64el with 64kB pages. Signed-off-by: Simon Richter <Simon.Richter@hogyros.de> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: stable@vger.kernel.org # v6.8+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://lore.kernel.org/r/20250802024152.3021-1-Simon.Richter@hogyros.de (cherry picked from commit `0521a86822`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-08-04 11:59:11 -04:00
Michal Wajdeczko	cb7a3f949a	drm/xe/pf: Make sure PF is ready to configure VFs The PF driver might be resumed just to configure VFs, but since it is doing some asynchronous GuC reconfigurations after fresh reset, we should wait until all pending works are completed. This is especially important in case of LMEM provisioning, since we also need to update the LMTT and send invalidation requests to all GuCs, which are expected to be already in the VGT mode. Fixes: `68ae022278` ("drm/xe/pf: Force GuC virtualization mode") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250801142822.180530-3-michal.wajdeczko@intel.com (cherry picked from commit `c6c86441c4`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-08-04 11:59:06 -04:00
Michal Wajdeczko	c286ce6b01	drm/xe/pf: Disable PF restart worker on device removal We can't let restart worker run once device is removed, since other data that it might want to access could be already released. Explicitly disable worker as part of device cleanup action. Fixes: `a4d1c5d0b9` ("drm/xe/pf: Move VFs reprovisioning to worker") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://lore.kernel.org/r/20250801142822.180530-2-michal.wajdeczko@intel.com (cherry picked from commit `a424353937`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-08-04 11:59:01 -04:00
Balasubramani Vivekanandan	465f1dba74	drm/xe/devcoredump: Defer devcoredump initialization during probe Doing devcoredump initializing before GT though look harmless, it leads to problem during driver unbind. Because of this order, GT/Engine release functions will be called before xe devcoredump release function (xe_driver_devcoredump_fini) leading to the following kernel crash[1] because the devcoredump functions might still use GT/Engine datastructures after those are freed. The following crash is observed while running the IGT xe_wedged@wedged-at-any-timeout. The test forces a wedged state by submitting a workload which hangs. Then does a unbind/rebind of the driver to recover from the wedged state. The hanged workload leads to a devcoredump. The following crash is noticed when the devcoredump capture races with the driver unbind. During driver unbind, the release function hw_engine_fini() will be called which assigns NULL to hwe->gt. But the same data structure is accessed during the coredump capture in the function xe_engine_snapshot_print by reading snapshot->hwe->gt. With this patch, we make sure the devcoredump is stopped before deinitializing the core driver functions. [1]: BUG: kernel NULL pointer dereference, address: 0000000000000000 Workqueue: events_unbound xe_devcoredump_deferred_snap_work [xe] RIP: 0010:xe_engine_snapshot_print+0x47/0x420 [xe] Call Trace: <TASK> ? drm_printf+0x64/0x90 __xe_devcoredump_read+0x23f/0x2d0 [xe] ? __pfx___drm_printfn_coredump+0x10/0x10 ? __pfx___drm_puts_coredump+0x10/0x10 xe_devcoredump_deferred_snap_work+0x17a/0x190 [xe] process_one_work+0x22e/0x6f0 worker_thread+0x1e8/0x3d0 ? __pfx_worker_thread+0x10/0x10 kthread+0x11f/0x250 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x47/0x70 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 v2: Detailed commit description (Rodrigo) v3: FIXME added (Rodrigo, Stuart) Fixes: `4209d635a8` ("drm/xe: Remove devcoredump during driver release") Reviewed-by: Stuart Summers <stuart.summers@intel.com> Link: https://lore.kernel.org/r/20250731061300.14320-1-balasubramani.vivekanandan@intel.com Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Link: https://lore.kernel.org/r/20250801052356.21885-1-balasubramani.vivekanandan@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `1fdc4c381f`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-08-04 11:58:56 -04:00
Michal Wajdeczko	df9bdd4381	drm/xe/pf: Enable SR-IOV PF mode by default We already claim official support for SR-IOV PF/VF modes on PTL and BMG platforms, but by default we start the Xe driver on those platforms in non-virtualized mode (native) since we still have max_vfs modparam set to disable creation of the VFs. It's time to let the Xe driver support SR-IOV PF mode by default. We were already testing this on our CI, which was relying on the patch that was enabling it for CONFIG_DRM_XE_DEBUG used by our CI. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Thomas Hellstrom <thomas.hellstrom@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250722182618.30811-3-michal.wajdeczko@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `a2b461bd6f`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-08-04 11:56:18 -04:00

1 2 3 4 5 ...

116123 Commits