Currently the TIMER_BASE_CONFIG0 register gets initialized to a fixed
value as initially found in vendor driver code supporting the RK3588
SoC. As a matter of fact the value matches the rate of the HDMI TX
reference clock, which is roughly 428.57 MHz.
However, on RK3576 SoC that rate is slightly lower, i.e. 396.00 MHz, and
the incorrect register configuration breaks CEC functionality.
Set the timer base according to the actual reference clock rate that
shall be provided by the platform driver. Otherwise fallback to the
vendor default.
While at it, also drop the unnecessary empty lines in
dw_hdmi_qp_init_hw().
Signed-off-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Signed-off-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20250903-rk3588-hdmi-cec-v4-2-fa25163c4b08@collabora.com
Use a vblank timer to simulate the vblank interrupt. The DRM vblank
helpers provide an implementation on top of Linux' hrtimer. Qxl
enables and disables the timer as part of the CRTC. The atomic_flush
callback sets up the event. Like vblank interrupts, the vblank timer
fires at the rate of the display refresh.
Most userspace limits its page flip rate according to the DRM vblank
event. Qxl's virtual hardware does not provide vblank interrupts, so
DRM sends each event ASAP. With the fast access times of virtual display
memory, the event rate is much higher than the display mode's refresh
rate; creating the next page flip almost immediately. This leads to
excessive CPU overhead from even small display updates, such as moving
the mouse pointer.
This problem affects qxl and all other virtual displays. See [1] for
a discussion in the context of hypervdrm.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/dri-devel/SN6PR02MB415702B00D6D52B0EE962C98D46CA@SN6PR02MB4157.namprd02.prod.outlook.com/ # [1]
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Link: https://lore.kernel.org/r/20251008122911.231674-1-tzimmermann@suse.de
Use a vblank timer to simulate the vblank interrupt. The DRM vblank
helpers provide an implementation on top of Linux' hrtimer. Cirrus-qemu
enables and disables the timer as part of the CRTC. The atomic_flush
callback sets up the event. Like vblank interrupts, the vblank timer
fires at the rate of the display refresh.
Most userspace limits its page flip rate according to the DRM vblank
event. Cirrus-qemu' virtual hardware does not provide vblank interrupts,
so DRM sends each event ASAP. With the fast access times of virtual display
memory, the event rate is much higher than the display mode's refresh
rate; creating the next page flip almost immediately. This leads to
excessive CPU overhead from even small display updates, such as moving
the mouse pointer.
This problem affects cirrus-qemu and all other virtual displays. See [1]
for a discussion in the context of hypervdrm.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/dri-devel/SN6PR02MB415702B00D6D52B0EE962C98D46CA@SN6PR02MB4157.namprd02.prod.outlook.com/ # [1]
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Link: https://lore.kernel.org/r/20251008121450.227997-1-tzimmermann@suse.de
Use a vblank timer to simulate the vblank interrupt. The DRM vblank
helpers provide an implementation on top of Linux' hrtimer. Bochs
enables and disables the timer as part of the CRTC. The atomic_flush
callback sets up the event. Like vblank interrupts, the vblank timer
fires at the rate of the display refresh.
Most userspace limits its page flip rate according to the DRM vblank
event. Bochs' virtual hardware does not provide vblank interrupts, so
DRM sends each event ASAP. With the fast access time of virtual display
memory, the event rate is much higher than the display mode's refresh
rate; creating the next page flip almost immediately. This leads to
excessive CPU overhead from even small display updates, such as moving
the mouse pointer.
This problem affects bochs and all other virtual displays. See [1] for
a discussion in the context of hypervdrm.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/dri-devel/SN6PR02MB415702B00D6D52B0EE962C98D46CA@SN6PR02MB4157.namprd02.prod.outlook.com/ # [1]
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Link: https://lore.kernel.org/r/20251008093931.19138-1-tzimmermann@suse.de
Store each hardware's CRTC memory threshold in the specific instance
of struct ast_device_quirks. Removes the calls to IS_AST_GENn() from
ast_set_crtthd_reg().
The values stored in the registers appear to be plain limits. Hence
write them in the driver in decimal format instead of hexadecimal.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://lore.kernel.org/r/20251007150343.273718-4-tzimmermann@suse.de
Init the new field dclk_table in struct ast_device to the per-gen
table of DRAM clock parameters. Use the field during modesetting.
The table is static, so a setup is only required once. Removes the
call to IS_AST_GEN() from the atomic commit's code path.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://lore.kernel.org/r/20251007150343.273718-2-tzimmermann@suse.de
On SoCs, like the SAM9X75, which embed the XLCDC ip, the registers that
configure the unified scaling engine were not filled with proper values.
Indeed, for YCbCr formats, the VXSCFACT bitfield of the HEOCFG25
register and the HXSCFACT bitfield of the HEOCFG27 register were
incorrect.
For 4:2:0 formats, both vertical and horizontal factors for
chroma chanels should be divided by 2 from the factors for the luma
channel. Hence:
HEOCFG24.VXSYFACT = VFACTOR
HEOCFG25.VSXCFACT = VFACTOR / 2
HEOCFG26.HXSYFACT = HFACTOR
HEOCFG27.HXSCFACT = HFACTOR / 2
However, for 4:2:2 formats, only the horizontal factor for chroma
chanels should be divided by 2 from the factor for the luma channel;
the vertical factor is the same for all the luma and chroma channels.
Hence:
HEOCFG24.VXSYFACT = VFACTOR
HEOCFG25.VXSCFACT = VFACTOR
HEOCFG26.HXSYFACT = HFACTOR
HEOCFG27.HXSCFACT = HFACTOR / 2
Fixes: d498771b0b ("drm: atmel_hlcdc: Add support for XLCDC using IP specific driver ops")
Signed-off-by: Cyrille Pitchen <cyrille.pitchen@microchip.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>
Link: https://lore.kernel.org/r/20241014094942.325211-1-manikandan.m@microchip.com
Signed-off-by: Manikandan Muralidharan <manikandan.m@microchip.com>
The `ttm_resource_manager_usage()` function currently assumes `man->bdev` is non-NULL when accessing `man->bdev->lru_lock`.
However, in scenarios where the resource manager is not fully initialized (e.g., APU platforms that lack dedicated VRAM, or incomplete manager setup),
`man->bdev` may remain NULL. This leads to a NULL pointer dereference when attempting to acquire the `lru_lock`, triggering kernel OOPS.
Fix this by adding an explicit safety check for `man->bdev` before accessing its members:
- Use `WARN_ON_ONCE(!man->bdev)` to emit a one-time warning (a soft assertion) when `man->bdev` is NULL. This helps catch invalid usage patterns during debugging without breaking production workflows.
- Return 0 immediately if `man->bdev` is NULL, as a non-initialized manager cannot have valid resource usage to report.
Suggested-by: Christian König <christian.koenig@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Link: https://lore.kernel.org/r/20251013085849.1735612-1-Jesse.Zhang@amd.com
The state pointer found in the struct drm_atomic_state internals for
most object is a bit ambiguous, and confusing when those internals also
have old state and new state.
After the recent cleanups, the state pointer only use is to point to the
state we need to free when destroying the atomic state.
We can thus rename it something less ambiguous, and hopefully more
meaningful.
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20251008-drm-rename-state-v2-1-49b490b2676a@kernel.org
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Pull more drm fixes from Dave Airlie:
"Just the follow up fixes for rc1 from the next branch, amdgpu and xe
mostly with a single v3d fix in there.
amdgpu:
- DC DCE6 fixes
- GPU reset fixes
- Secure diplay messaging cleanup
- MES fix
- GPUVM locking fixes
- PMFW messaging cleanup
- PCI US/DS switch handling fix
- VCN queue reset fix
- DC FPU handling fix
- DCN 3.5 fix
- DC mirroring fix
amdkfd:
- Fix kfd process ref leak
- mmap write lock handling fix
- Fix comments in IOCTL
xe:
- Fix build with clang 16
- Fix handling of invalid configfs syntax usage and spell out the
expected syntax in the documentation
- Do not try late bind firmware when running as VF since it shouldn't
handle firmware loading
- Fix idle assertion for local BOs
- Fix uninitialized variable for late binding
- Do not require perfmon_capable to expose free memory at page
granularity. Handle it like other drm drivers do
- Fix lock handling on suspend error path
- Fix I2C controller resume after S3
v3d:
- fix fence locking"
* tag 'drm-next-2025-10-11-1' of https://gitlab.freedesktop.org/drm/kernel: (34 commits)
drm/amd/display: Incorrect Mirror Cositing
drm/amd/display: Enable Dynamic DTBCLK Switch
drm/amdgpu: Report individual reset error
drm/amdgpu: partially revert "revert to old status lock handling v3"
drm/amd/display: Fix unsafe uses of kernel mode FPU
drm/amd/pm: Disable VCN queue reset on SMU v13.0.6 due to regression
drm/amdgpu: Fix general protection fault in amdgpu_vm_bo_reset_state_machine
drm/amdgpu: Check swus/ds for switch state save
drm/amdkfd: Fix two comments in kfd_ioctl.h
drm/amd/pm: Avoid interface mismatch messaging
drm/amdgpu: Merge amdgpu_vm_set_pasid into amdgpu_vm_init
drm/amd/amdgpu: Fix the mes version that support inv_tlbs
drm/amd: Check whether secure display TA loaded successfully
drm/amdkfd: Fix mmap write lock not release
drm/amdkfd: Fix kfd process ref leaking when userptr unmapping
drm/amdgpu: Fix for GPU reset being blocked by KIQ I/O.
drm/amd/display: Disable scaling on DCE6 for now
drm/amd/display: Properly disable scaling on DCE6
drm/amd/display: Properly clear SCL_*_FILTER_CONTROL on DCE6
drm/amd/display: Add missing DCE6 SCL_HORZ_FILTER_INIT* SRIs
...
Pull drm fixes from Dave Airlie:
"Some fixes leftover from our fixes branch, just nouveau and vmwgfx:
nouveau:
- Return errno code from TTM move helper
vmwgfx:
- Fix null-ptr access in cursor code
- Fix UAF in validation
- Use correct iterator in validation"
* tag 'drm-fixes-2025-10-11' of https://gitlab.freedesktop.org/drm/kernel:
drm/nouveau: fix bad ret code in nouveau_bo_move_prep
drm/vmwgfx: Fix copy-paste typo in validation
drm/vmwgfx: Fix Use-after-free in validation
drm/vmwgfx: Fix a null-ptr access in the cursor snooper
Use a vblank timer to simulate the vblank interrupt. The DRM vblank
helpers provide an implementation on top of Linux' hrtimer. Virtgpu
enables and disables the timer as part of the CRTC. The atomic_flush
callback sets up the event. Like vblank interrupts, the vblank timer
fires at the rate of the display refresh.
Most userspace limits its page flip rate according to the DRM vblank
event. Virtgpu's virtual hardware does not provide vblank interrupts, so
DRM sends each event ASAP. With the fast access times of virtual display
memory, the event rate is much higher than the display mode's refresh
rate; creating the next page flip almost immediately. This leads to
excessive CPU overhead from even small display updates, such as moving
the mouse pointer.
This problem affects virtgpu and all other virtual displays. See [1] for
a discussion in the context of hypervdrm.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/dri-devel/SN6PR02MB415702B00D6D52B0EE962C98D46CA@SN6PR02MB4157.namprd02.prod.outlook.com/ # [1]
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://lore.kernel.org/r/20251008130701.246988-1-tzimmermann@suse.de
In `nouveau_bo_move_prep`, if `nouveau_mem_map` fails, an error code
should be returned. Currently, it returns zero even if vmm addr is not
correctly mapped.
Cc: stable@vger.kernel.org
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
Fixes: 9ce523cc3b ("drm/nouveau: separate buffer object backing memory from nvkm structures")
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
If reinitialization of one of the GPUs fails after reset, it logs
failure on all subsequent GPUs eventhough they have resumed
successfully.
A sample log where only device at 0000:95:00.0 had a failure -
amdgpu 0000:15:00.0: amdgpu: GPU reset(19) succeeded!
amdgpu 0000:65:00.0: amdgpu: GPU reset(19) succeeded!
amdgpu 0000:75:00.0: amdgpu: GPU reset(19) succeeded!
amdgpu 0000:85:00.0: amdgpu: GPU reset(19) succeeded!
amdgpu 0000:95:00.0: amdgpu: GPU reset(19) failed
amdgpu 0000:e5:00.0: amdgpu: GPU reset(19) failed
amdgpu 0000:f5:00.0: amdgpu: GPU reset(19) failed
amdgpu 0000:05:00.0: amdgpu: GPU reset(19) failed
amdgpu 0000:15:00.0: amdgpu: GPU reset end with ret = -5
To avoid confusion, report the error for each device
separately and return the first error as the overall result.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The CI systems are pointing out list corruptions, so we still need to
fix something here.
Keep the asserts, but revert the lock changes for now.
Fixes: 59e4405e9e ("drm/amdgpu: revert to old status lock handling v3")
Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The point of isolating code that uses kernel mode FPU in separate
compilation units is to ensure that even implicit uses of, e.g., SIMD
registers for spilling occur only in a context where this is permitted,
i.e., from inside a kernel_fpu_begin/end block.
This is important on arm64, which uses -mgeneral-regs-only to build all
kernel code, with the exception of such compilation units where FP or
SIMD registers are expected to be used. Given that the compiler may
invent uses of FP/SIMD anywhere in such a unit, none of its code may be
accessible from outside a kernel_fpu_begin/end block.
This means that all callers into such compilation units must use the
DC_FP start/end macros, which must not occur there themselves. For
robustness, all functions with external linkage that reside there should
call dc_assert_fp_enabled() to assert that the FPU context was set up
correctly.
Fix this for the DCN35, DCN351 and DCN36 implementations.
Cc: Austin Zheng <austin.zheng@amd.com>
Cc: Jun Lei <jun.lei@amd.com>
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Leo Li <sunpeng.li@amd.com>
Cc: Rodrigo Siqueira <siqueira@igalia.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Disable VCN reset capability for the program 4 as it's
causing regressions.
Fixes: 9d20f37a10 ("drm/amd/pm: Add VCN reset support for SMU v13.0.6")
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
PMFW interface version is not used by some IP implementations like SMU
v13.0.6/12, instead rely on PMFW version checks. Avoid the log if
interface version is not used.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
As KFD no longer uses a separate PASID, the global amdgpu_vm_set_pasid()function is no longer necessary.
Merge its functionality directly intoamdgpu_vm_init() to simplify code flow and eliminate redundant locking.
v2: remove superflous check
adjust amdgpu_vm_fin and remove amdgpu_vm_set_pasid (Chritian)
v3: drop amdgpu_vm_assert_locked (Chritian)
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4614
Fixes: 59e4405e9e ("drm/amdgpu: revert to old status lock handling v3")
Reviewed-by: Christian König <christian.koenig@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
MES version 0x83 is not stable to use the inv_tlbs API. Defer it to 0x84 vertsion.
Fixes: 85442bac84 ("drm/amd/amdgpu: Fix the mes version that support inv_tlbs")
Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com>
Reviewed-by: Michael Chen <michael.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Not all renoir hardware supports secure display. If the TA is present
but the feature isn't supported it will fail to load or send commands.
This shows ERR messages to the user that make it seems like there is
a problem.
[How]
Check the resp_status of the context to see if there was an error
before trying to send any secure display commands.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1415
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If mmap write lock is taken while draining retry fault, mmap write lock
is not released because svm_range_restore_pages calls mmap_read_unlock
then returns. This causes deadlock and system hangs later because mmap
read or write lock cannot be taken.
Downgrade mmap write lock to read lock if draining retry fault fix this
bug.
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
kfd_lookup_process_by_pid hold the kfd process reference to ensure it
doesn't get destroyed while sending the segfault event to user space.
Calling kfd_lookup_process_by_pid as function parameter leaks the kfd
process refcount and miss the NULL pointer check if app process is
already destroyed.
Fixes: 2d274bf709 ("amd/amdkfd: Trigger segfault for early userptr unmmapping")
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is some probability that reset workqueue is blocked by KIQ I/O for 10+ seconds after gpu hangs.
So we need to add a in_reset check during each KIQ register poll.
Signed-off-by: Heng Zhou <Heng.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Scaling doesn't work on DCE6 at the moment, the current
register programming produces incorrect output when using
fractional scaling (between 100-200%) on resolutions higher
than 1080p.
Disable it until we figure out how to program it properly.
Fixes: 7c15fd86aa ("drm/amd/display: dc/dce: add initial DCE6 support (v10)")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SCL_SCALER_ENABLE can be used to enable/disable the scaler
on DCE6. Program it to 0 when scaling isn't used, 1 when used.
Additionally, clear some other registers when scaling is
disabled and program the SCL_UPDATE register as recommended.
This fixes visible glitches for users whose BIOS sets up a
mode with scaling at boot, which DC was unable to clean up.
Fixes: b70aaf5586 ("drm/amd/display: dce_transform: add DCE6 specific macros,functions")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>