linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-28 16:35:01 -04:00

Author	SHA1	Message	Date
Srinivasan Shanmugam	7d83c129a8	drm/amdgpu: Fix parameter annotation in vcn_v5_0_0_is_idle Update parameter description in the vcn_v5_0_0_is_idle function Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1231: warning: Function parameter or struct member 'ip_block' not described in 'vcn_v5_0_0_is_idle' drivers/gpu/drm/amd/amdgpu/vcn_v5_0_0.c:1231: warning: Excess function parameter 'handle' description in 'vcn_v5_0_0_is_idle' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:05 -05:00
Salah Triki	cbf85b9cb8	bluetooth: btusb: Initialize .owner field of force_poll_sync_fops Initialize .owner field of force_poll_sync_fops to THIS_MODULE in order to prevent btusb from being unloaded while its operations are in use. Fixes: `800fe5ec30` ("Bluetooth: btusb: Add support for queuing during polling interval") Signed-off-by: Salah Triki <salah.triki@gmail.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2025-02-27 16:50:05 -05:00
Philip Yang	fe9d0061c4	drm/amdkfd: debugfs hang_hws skip GPU with MES debugfs hang_hws is used by GPU reset test with HWS, for MES this crash the kernel with NULL pointer access because dqm->packet_mgr is not setup for MES path. Skip GPU with MES for now, MES hang_hws debugfs interface will be supported later. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:05 -05:00
Philip Yang	7919b4cad5	drm/amdkfd: Fix pqm_destroy_queue race with GPU reset If GPU in reset, destroy_queue return -EIO, pqm_destroy_queue should delete the queue from process_queue_list and free the resource. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:05 -05:00
Srinivasan Shanmugam	0f3fda3117	drm/amdgpu: Fix parameter annotations for VCN clock gating functions The previous references to a non-existent `adev` parameter have been removed & corrected to reflect the use of the `vinst` pointer, which points to the VCN instance structure, in the below files: - vcn_v1_0.c - vcn_v2_0.c - vcn_v3_0.c Fixes the below with gcc W=1: drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c:624: warning: Function parameter or struct member 'vinst' not described in 'vcn_v1_0_enable_clock_gating' drivers/gpu/drm/amd/amdgpu/vcn_v1_0.c:624: warning: Excess function parameter 'adev' description in 'vcn_v1_0_enable_clock_gating' drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:376: warning: Function parameter or struct member 'vinst' not described in 'vcn_v2_0_mc_resume' drivers/gpu/drm/amd/amdgpu/vcn_v2_0.c:376: warning: Excess function parameter 'adev' description in 'vcn_v2_0_mc_resume' drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:776: warning: Function parameter or struct member 'vinst' not described in 'vcn_v3_0_disable_clock_gating' drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:776: warning: Excess function parameter 'adev' description in 'vcn_v3_0_disable_clock_gating' drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:776: warning: Excess function parameter 'inst' description in 'vcn_v3_0_disable_clock_gating' drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:965: warning: Function parameter or struct member 'vinst' not described in 'vcn_v3_0_enable_clock_gating' drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:965: warning: Excess function parameter 'adev' description in 'vcn_v3_0_enable_clock_gating' drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c:965: warning: Excess function parameter 'inst' description in 'vcn_v3_0_enable_clock_gating' Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:05 -05:00
Philip Yang	f0b4440cdc	drm/amdkfd: Fix mode1 reset crash issue If HW scheduler hangs and mode1 reset is used to recover GPU, KFD signal user space to abort the processes. After process abort exit, user queues still use the GPU to access system memory before h/w is reset while KFD cleanup worker free system memory and free VRAM. There is use-after-free race bug that KFD allocate and reuse the freed system memory, and user queue write to the same system memory to corrupt the data structure and cause driver crash. To fix this race, KFD cleanup worker terminate user queues, then flush reset_domain wq to wait for any GPU ongoing reset complete, and then free outstanding BOs. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:04 -05:00
Philip Yang	1b9366c601	drm/amdkfd: KFD release_work possible circular locking If waiting for gpu reset done in KFD release_work, thers is WARNING: possible circular locking dependency detected #2 kfd_create_process kfd_process_mutex flush kfd release work #1 kfd release work wait for amdgpu reset work #0 amdgpu_device_gpu_reset kgd2kfd_pre_reset kfd_process_mutex Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock((work_completion)(&p->release_work)); lock((wq_completion)kfd_process_wq); lock((work_completion)(&p->release_work)); lock((wq_completion)amdgpu-reset-dev); To fix this, KFD create process move flush release work outside kfd_process_mutex. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:04 -05:00
Philip Yang	ee3ed10066	drm/amdkfd: Remove kfd_process_hw_exception worker With GPU reset-domain worker implemented, KFD hw_exception worker is not needed any more, just call amdgpu_amdkfd_gpu_reset directly from kfd_hws_hang. Suggested-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:04 -05:00
Asad Kamal	f9234217d0	drm/amd/amdgpu: Add support for xgmi_v6_4_1 Add support for xgmi_v6_4_1 and use it appropriate places Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:04 -05:00
Lijo Lazar	485993e2f1	drm/amdgpu: Add xgmi speed/width related info Add APIs to initialize XGMI speed, width details and get to max bandwidth supported. It is assumed that a device only supports same generation of XGMI links with uniform width. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:04 -05:00
Lijo Lazar	6f16d101da	drm/amdgpu: Move xgmi definitions to xgmi header Move definitions related to xgmi to amdgpu_xgmi header Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:04 -05:00
Kenneth Feng	0107c595c5	drm/amd/pm: add fan abnormal detection add fan abnormal detection on smu v14.0.2&smu v14.0.3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:04 -05:00
Xiaogang Chen	509d662a57	drm/amdkfd: remove kfd_pasid.c from amdgpu driver build Since kfd uses pasid values from graphic driver now do not need use kfd pasid fucntions. Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:04 -05:00
David Yat Sin	e90711946b	drm/amdkfd: clamp queue size to minimum If queue size is less than minimum, clamp it to minimum to prevent underflow when writing queue mqd. Signed-off-by: David Yat Sin <David.YatSin@amd.com> Reviewed-by: Jay Cornwall <jay.cornwall@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:04 -05:00
André Almeida	9c696cc57c	drm/amdgpu: Create a debug option to disable ring reset Prior to the addition of ring reset, the debug option `debug_disable_soft_recovery` could be used to force a full device reset. Now that we have ring reset, create a debug option to disable them in amdgpu, forcing the driver to go with the full device reset path again when both options are combined. This option is useful for testing and debugging purposes when one wants to test the full reset from userspace. Signed-off-by: André Almeida <andrealmeid@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:04 -05:00
Ma Ke	63e6a77ccf	drm/amd/display: Fix null check for pipe_ctx->plane_state in resource_build_scaling_params Null pointer dereference issue could occur when pipe_ctx->plane_state is null. The fix adds a check to ensure 'pipe_ctx->plane_state' is not null before accessing. This prevents a null pointer dereference. Found by code review. Fixes: `3be5262e35` ("drm/amd/display: Rename more dc_surface stuff to plane_state") Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Ma Ke <make24@iscas.ac.cn> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:04 -05:00
Alex Deucher	c0a01660de	Documentation/gpu: remove duplicate entries in different glossaries Some items were defined in both the general and DC glossaries. Remove the duplicate entries. Fixes: `2df30ae0ba` ("Documentation/gpu: Add acronyms for some firmware components") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Rodrigo Siqueira <siqueira@igalia.com> Reviewed-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:04 -05:00
Alex Deucher	1d72fc2e9e	drm/amdgpu/mes11: drop amdgpu_mes_suspend()/amdgpu_mes_resume() calls They are noops on GFX11 for most firmware versions. KFD already handles its own queues and they should already be unmapped at this point so even if this runs, it's not doing anything. Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:03 -05:00
Colin Ian King	eaa3feb16d	drm/amdgpu: Fix spelling mistake "initiailize" -> "initialize" and grammar There is a spelling mistake and a grammatical error in a dev_err message. Fix it. Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:03 -05:00
Xiang Liu	00f85667fa	drm/amdgpu: Decode deferred error type in aca bank parser In the case of poison inband log, the error type need to be specified by checking the deferred or poison bit of status register. v2: check both deferred and poison bit Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:03 -05:00
Le Ma	5b5f01eff7	drm/amdgpu: add sdma page queue irq processing for sdma442 Add the trap irq processing for page queue of sdma442 Signed-off-by: Le Ma <le.ma@amd.com> Reviewed-by and Tested-by: Jesse Zhang <jesse.zhang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:03 -05:00
Kenneth Feng	7d37bcab97	drm/amd/pm: disable gfxoff on the specific sku disable gfxoff on the specific sku based on the requirement Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:03 -05:00
Xiang Liu	d4bd7a50ca	drm/amdgpu: Report generic instead of unknown boot time errors Change the DMESG reporting of unknown errors to "Boot Controller Generic Error" to align with the RAS SPEC and provide more clarity to customers. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:03 -05:00
Lijo Lazar	b965e42530	drm/amdgpu: Fix logic to fetch supported NPS modes Correct the logic to find supported NPS modes from firmware. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reported-by: Ava Zhang <niandong.zhang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Fixes: `30eb41f5d1` ("drm/amdgpu: Use firmware supported NPS modes") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:03 -05:00
Xiang Liu	906d2859e1	drm/amdgpu: Disable fru_id field in CPER section The fru_id field is disabled cause of mis-matching defination between CPER spec and driver. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:50:03 -05:00
Srinivasan Shanmugam	fddc450263	drm/amdkfd: Fix Circular Locking Dependency in 'svm_range_cpu_invalidate_pagetables' This commit addresses a circular locking dependency in the svm_range_cpu_invalidate_pagetables function. The function previously held a lock while determining whether to perform an unmap or eviction operation, which could lead to deadlocks. Fixes the below: [ 223.418794] ====================================================== [ 223.418820] WARNING: possible circular locking dependency detected [ 223.418845] 6.12.0-amdstaging-drm-next-lol-050225 #14 Tainted: G U OE [ 223.418869] ------------------------------------------------------ [ 223.418889] kfdtest/3939 is trying to acquire lock: [ 223.418906] ffff8957552eae38 (&dqm->lock_hidden){+.+.}-{3:3}, at: evict_process_queues_cpsch+0x43/0x210 [amdgpu] [ 223.419302] but task is already holding lock: [ 223.419303] ffff8957556b83b0 (&prange->lock){+.+.}-{3:3}, at: svm_range_cpu_invalidate_pagetables+0x9d/0x850 [amdgpu] [ 223.419447] Console: switching to colour dummy device 80x25 [ 223.419477] [IGT] amd_basic: executing [ 223.419599] which lock already depends on the new lock. [ 223.419611] the existing dependency chain (in reverse order) is: [ 223.419621] -> #2 (&prange->lock){+.+.}-{3:3}: [ 223.419636] __mutex_lock+0x85/0xe20 [ 223.419647] mutex_lock_nested+0x1b/0x30 [ 223.419656] svm_range_validate_and_map+0x2f1/0x15b0 [amdgpu] [ 223.419954] svm_range_set_attr+0xe8c/0x1710 [amdgpu] [ 223.420236] svm_ioctl+0x46/0x50 [amdgpu] [ 223.420503] kfd_ioctl_svm+0x50/0x90 [amdgpu] [ 223.420763] kfd_ioctl+0x409/0x6d0 [amdgpu] [ 223.421024] __x64_sys_ioctl+0x95/0xd0 [ 223.421036] x64_sys_call+0x1205/0x20d0 [ 223.421047] do_syscall_64+0x87/0x140 [ 223.421056] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 223.421068] -> #1 (reservation_ww_class_mutex){+.+.}-{3:3}: [ 223.421084] __ww_mutex_lock.constprop.0+0xab/0x1560 [ 223.421095] ww_mutex_lock+0x2b/0x90 [ 223.421103] amdgpu_amdkfd_alloc_gtt_mem+0xcc/0x2b0 [amdgpu] [ 223.421361] add_queue_mes+0x3bc/0x440 [amdgpu] [ 223.421623] unhalt_cpsch+0x1ae/0x240 [amdgpu] [ 223.421888] kgd2kfd_start_sched+0x5e/0xd0 [amdgpu] [ 223.422148] amdgpu_amdkfd_start_sched+0x3d/0x50 [amdgpu] [ 223.422414] amdgpu_gfx_enforce_isolation_handler+0x132/0x270 [amdgpu] [ 223.422662] process_one_work+0x21e/0x680 [ 223.422673] worker_thread+0x190/0x330 [ 223.422682] kthread+0xe7/0x120 [ 223.422690] ret_from_fork+0x3c/0x60 [ 223.422699] ret_from_fork_asm+0x1a/0x30 [ 223.422708] -> #0 (&dqm->lock_hidden){+.+.}-{3:3}: [ 223.422723] __lock_acquire+0x16f4/0x2810 [ 223.422734] lock_acquire+0xd1/0x300 [ 223.422742] __mutex_lock+0x85/0xe20 [ 223.422751] mutex_lock_nested+0x1b/0x30 [ 223.422760] evict_process_queues_cpsch+0x43/0x210 [amdgpu] [ 223.423025] kfd_process_evict_queues+0x8a/0x1d0 [amdgpu] [ 223.423285] kgd2kfd_quiesce_mm+0x43/0x90 [amdgpu] [ 223.423540] svm_range_cpu_invalidate_pagetables+0x4a7/0x850 [amdgpu] [ 223.423807] __mmu_notifier_invalidate_range_start+0x1f5/0x250 [ 223.423819] copy_page_range+0x1e94/0x1ea0 [ 223.423829] copy_process+0x172f/0x2ad0 [ 223.423839] kernel_clone+0x9c/0x3f0 [ 223.423847] __do_sys_clone+0x66/0x90 [ 223.423856] __x64_sys_clone+0x25/0x30 [ 223.423864] x64_sys_call+0x1d7c/0x20d0 [ 223.423872] do_syscall_64+0x87/0x140 [ 223.423880] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 223.423891] other info that might help us debug this: [ 223.423903] Chain exists of: &dqm->lock_hidden --> reservation_ww_class_mutex --> &prange->lock [ 223.423926] Possible unsafe locking scenario: [ 223.423935] CPU0 CPU1 [ 223.423942] ---- ---- [ 223.423949] lock(&prange->lock); [ 223.423958] lock(reservation_ww_class_mutex); [ 223.423970] lock(&prange->lock); [ 223.423981] lock(&dqm->lock_hidden); [ 223.423990] * DEADLOCK * [ 223.423999] 5 locks held by kfdtest/3939: [ 223.424006] #0: ffffffffb82b4fc0 (dup_mmap_sem){.+.+}-{0:0}, at: copy_process+0x1387/0x2ad0 [ 223.424026] #1: ffff89575eda81b0 (&mm->mmap_lock){++++}-{3:3}, at: copy_process+0x13a8/0x2ad0 [ 223.424046] #2: ffff89575edaf3b0 (&mm->mmap_lock/1){+.+.}-{3:3}, at: copy_process+0x13e4/0x2ad0 [ 223.424066] #3: ffffffffb82e76e0 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}, at: copy_page_range+0x1cea/0x1ea0 [ 223.424088] #4: ffff8957556b83b0 (&prange->lock){+.+.}-{3:3}, at: svm_range_cpu_invalidate_pagetables+0x9d/0x850 [amdgpu] [ 223.424365] stack backtrace: [ 223.424374] CPU: 0 UID: 0 PID: 3939 Comm: kfdtest Tainted: G U OE 6.12.0-amdstaging-drm-next-lol-050225 #14 [ 223.424392] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [ 223.424401] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS PRO WIFI/X570 AORUS PRO WIFI, BIOS F36a 02/16/2022 [ 223.424416] Call Trace: [ 223.424423] <TASK> [ 223.424430] dump_stack_lvl+0x9b/0xf0 [ 223.424441] dump_stack+0x10/0x20 [ 223.424449] print_circular_bug+0x275/0x350 [ 223.424460] check_noncircular+0x157/0x170 [ 223.424469] ? __bfs+0xfd/0x2c0 [ 223.424481] __lock_acquire+0x16f4/0x2810 [ 223.424490] ? srso_return_thunk+0x5/0x5f [ 223.424505] lock_acquire+0xd1/0x300 [ 223.424514] ? evict_process_queues_cpsch+0x43/0x210 [amdgpu] [ 223.424783] __mutex_lock+0x85/0xe20 [ 223.424792] ? evict_process_queues_cpsch+0x43/0x210 [amdgpu] [ 223.425058] ? srso_return_thunk+0x5/0x5f [ 223.425067] ? mark_held_locks+0x54/0x90 [ 223.425076] ? evict_process_queues_cpsch+0x43/0x210 [amdgpu] [ 223.425339] ? srso_return_thunk+0x5/0x5f [ 223.425350] mutex_lock_nested+0x1b/0x30 [ 223.425358] ? mutex_lock_nested+0x1b/0x30 [ 223.425367] evict_process_queues_cpsch+0x43/0x210 [amdgpu] [ 223.425631] kfd_process_evict_queues+0x8a/0x1d0 [amdgpu] [ 223.425893] kgd2kfd_quiesce_mm+0x43/0x90 [amdgpu] [ 223.426156] svm_range_cpu_invalidate_pagetables+0x4a7/0x850 [amdgpu] [ 223.426423] ? srso_return_thunk+0x5/0x5f [ 223.426436] __mmu_notifier_invalidate_range_start+0x1f5/0x250 [ 223.426450] copy_page_range+0x1e94/0x1ea0 [ 223.426461] ? srso_return_thunk+0x5/0x5f [ 223.426474] ? srso_return_thunk+0x5/0x5f [ 223.426484] ? lock_acquire+0xd1/0x300 [ 223.426494] ? copy_process+0x1718/0x2ad0 [ 223.426502] ? srso_return_thunk+0x5/0x5f [ 223.426510] ? sched_clock_noinstr+0x9/0x10 [ 223.426519] ? local_clock_noinstr+0xe/0xc0 [ 223.426528] ? copy_process+0x1718/0x2ad0 [ 223.426537] ? srso_return_thunk+0x5/0x5f [ 223.426550] copy_process+0x172f/0x2ad0 [ 223.426569] kernel_clone+0x9c/0x3f0 [ 223.426577] ? __schedule+0x4c9/0x1b00 [ 223.426586] ? srso_return_thunk+0x5/0x5f [ 223.426594] ? sched_clock_noinstr+0x9/0x10 [ 223.426602] ? srso_return_thunk+0x5/0x5f [ 223.426610] ? local_clock_noinstr+0xe/0xc0 [ 223.426619] ? schedule+0x107/0x1a0 [ 223.426629] __do_sys_clone+0x66/0x90 [ 223.426643] __x64_sys_clone+0x25/0x30 [ 223.426652] x64_sys_call+0x1d7c/0x20d0 [ 223.426661] do_syscall_64+0x87/0x140 [ 223.426671] ? srso_return_thunk+0x5/0x5f [ 223.426679] ? common_nsleep+0x44/0x50 [ 223.426690] ? srso_return_thunk+0x5/0x5f [ 223.426698] ? trace_hardirqs_off+0x52/0xd0 [ 223.426709] ? srso_return_thunk+0x5/0x5f [ 223.426717] ? syscall_exit_to_user_mode+0xcc/0x200 [ 223.426727] ? srso_return_thunk+0x5/0x5f [ 223.426736] ? do_syscall_64+0x93/0x140 [ 223.426748] ? srso_return_thunk+0x5/0x5f [ 223.426756] ? up_write+0x1c/0x1e0 [ 223.426765] ? srso_return_thunk+0x5/0x5f [ 223.426775] ? srso_return_thunk+0x5/0x5f [ 223.426783] ? trace_hardirqs_off+0x52/0xd0 [ 223.426792] ? srso_return_thunk+0x5/0x5f [ 223.426800] ? syscall_exit_to_user_mode+0xcc/0x200 [ 223.426810] ? srso_return_thunk+0x5/0x5f [ 223.426818] ? do_syscall_64+0x93/0x140 [ 223.426826] ? syscall_exit_to_user_mode+0xcc/0x200 [ 223.426836] ? srso_return_thunk+0x5/0x5f [ 223.426844] ? do_syscall_64+0x93/0x140 [ 223.426853] ? srso_return_thunk+0x5/0x5f [ 223.426861] ? irqentry_exit+0x6b/0x90 [ 223.426869] ? srso_return_thunk+0x5/0x5f [ 223.426877] ? exc_page_fault+0xa7/0x2c0 [ 223.426888] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 223.426898] RIP: 0033:0x7f46758eab57 [ 223.426906] Code: ba 04 00 f3 0f 1e fa 64 48 8b 04 25 10 00 00 00 45 31 c0 31 d2 31 f6 bf 11 00 20 01 4c 8d 90 d0 02 00 00 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 41 41 89 c0 85 c0 75 2c 64 48 8b 04 25 10 00 [ 223.426930] RSP: 002b:00007fff5c3e5188 EFLAGS: 00000246 ORIG_RAX: 0000000000000038 [ 223.426943] RAX: ffffffffffffffda RBX: 00007f4675f8c040 RCX: 00007f46758eab57 [ 223.426954] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011 [ 223.426965] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 [ 223.426975] R10: 00007f4675e81a50 R11: 0000000000000246 R12: 0000000000000001 [ 223.426986] R13: 00007fff5c3e5470 R14: 00007fff5c3e53e0 R15: 00007fff5c3e5410 [ 223.427004] </TASK> v2: To resolve this issue, the allocation of the process context buffer (`proc_ctx_bo`) has been moved from the `add_queue_mes` function to the `pqm_create_queue` function. This change ensures that the buffer is allocated only when the first queue for a process is created and only if the Micro Engine Scheduler (MES) is enabled. (Felix) v3: Fix typo s/Memory Execution Scheduler (MES)/Micro Engine Scheduler in commit message. (Lijo) Fixes: `438b39ac74` ("drm/amdkfd: pause autosuspend when creating pdd") Cc: Jesse Zhang <jesse.zhang@amd.com> Cc: Yunxiang Li <Yunxiang.Li@amd.com> Cc: Philip Yang <Philip.Yang@amd.com> Cc: Alex Sierra <alex.sierra@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 16:49:32 -05:00
Jie Zhang	b1f07bc58d	drm/msm/a6xx: Add support for Adreno 623 Add support for Adreno 623 GPU found in QCS8300 chipsets. Signed-off-by: Jie Zhang <quic_jiezh@quicinc.com> Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/640056/ Signed-off-by: Rob Clark <robdclark@chromium.org>	2025-02-27 13:05:23 -08:00
Jie Zhang	11cdb81b3c	drm/msm/a6xx: Fix gpucc register block for A621 Adreno 621 has a different memory map for GPUCC block. So update a6xx_gpu_state code to dump the correct set of gpucc registers. Signed-off-by: Jie Zhang <quic_jiezh@quicinc.com> Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/640055/ Signed-off-by: Rob Clark <robdclark@chromium.org>	2025-02-27 13:05:23 -08:00
Jie Zhang	378a621999	drm/msm/a6xx: Split out gpucc register block Some GPUs have different memory map for GPUCC block. So split out the gpucc range from a6xx_gmu_cx_registers to a separate block to accommodate those GPUs. Signed-off-by: Jie Zhang <quic_jiezh@quicinc.com> Signed-off-by: Akhil P Oommen <quic_akhilpo@quicinc.com> Patchwork: https://patchwork.freedesktop.org/patch/640052/ Signed-off-by: Rob Clark <robdclark@chromium.org>	2025-02-27 13:05:23 -08:00
Dan Carpenter	0b305b7cad	drm/msm/gem: Fix error code msm_parse_deps() The SUBMIT_ERROR() macro turns the error code negative. This extra '-' operation turns it back to positive EINVAL again. The error code is passed to ERR_PTR() and since positive values are not an IS_ERR() it eventually will lead to an oops. Delete the '-'. Fixes: `866e43b945` ("drm/msm: UAPI error reporting") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Patchwork: https://patchwork.freedesktop.org/patch/637625/ Signed-off-by: Rob Clark <robdclark@chromium.org>	2025-02-27 12:58:38 -08:00
Benjamin Chan	dce1b82398	drm/amdgpu: Add amdisp pinctrl MFD resource AMDISP GPIO control uses a dedicated pinctrl driver, and requires MFD hotadd GPIO resources. Co-developed-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Signed-off-by: Benjamin Chan <benjamin.chan@amd.com> Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:49 -05:00
Alex Deucher	4343f814e5	drm/amdgpu/mes12: drop amdgpu_mes_suspend()/amdgpu_mes_resume() calls They are noops on GFX12. There is no suspend/resume all support in firmware so the function doesn't do anything. KFD already handles its own queues and they should already be unmapped at this point so even if this runs, it's not doing anything. Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:43 -05:00
Dr. David Alan Gilbert	82c13da746	drm/amd/display: Remove unused optc3_fpu_set_vrr_m_const The last use of optc3_fpu_set_vrr_m_const() was removed in 2022's commit `64f991590f` ("drm/amd/display: Fix a compilation failure on PowerPC caused by FPU code") which removed the only caller (with a similar) name. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:40 -05:00
Pratap Nirujogi	a67e75beff	drm/amdgpu: Replace DRM_ERROR() with drm_err() DRM_ERROR() is no longer preferred. Replace DRM_ERROR() usage with drm_err() in isp driver. Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:36 -05:00
Luan Arcanjo	b5838d1517	drm/amd/display/dc: Refactor remove duplications All dce command_table_helper's shares a copy-pasted collection of copy-pasted functions, which are: phy_id_to_atom, clock_source_id_to_atom_phy_clk_src_id, and engine_bp_to_atom. This patch removes the multiple copy-pasted by moving them to the command_table_helper.c and make the command_table_helper's calls the functions implemented by the command_table_helper.c instead. The changes were not tested on actual hardware. I am only able to verify that the changes keep the code compileable and do my best to to look repeatedly if I am not actually changing any code. This is the version 4 of the PATCH, fixed comments about licence in the new files and the matches From email to Signed-off-by email. Fixed comments about using command_table_helper instead of creating a dce_common Signed-off-by: Luan Icaro Pinto Arcanjo <luanicaro@usp.br> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:32 -05:00
Alex Deucher	4d1b653571	drm/amdgpu/vcn: use dev_info() for firmware information To properly handle multiple GPUs. Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:32 -05:00
Alex Deucher	c51aa7923e	drm/amdgpu/vcn: optimize firmware storage If each instance uses the same fw image, only store one copy in the driver. Acked-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:32 -05:00
Alex Deucher	31a37dfc8f	drm/amdgpu/vcn5.0.1: use generic set_power_gating_state helper No need for an IP specific version. Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:32 -05:00
Alex Deucher	9b648fa54c	drm/amdgpu/vcn5.0.0: use generic set_power_gating_state helper No need for an IP specific version. Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:32 -05:00
Alex Deucher	4bb5879322	drm/amdgpu/vcn4.0.5: use generic set_power_gating_state helper No need for an IP specific version. Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:31 -05:00
Alex Deucher	1ee6b2bff2	drm/amdgpu/vcn4.0.3: use generic set_power_gating_state helper No need for an IP specific version. Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:31 -05:00
Alex Deucher	8bdfa5756b	drm/amdgpu/vcn4.0: use generic set_power_gating_state helper No need for an IP specific version. Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:31 -05:00
Alex Deucher	38c0d9882a	drm/amdgpu/vcn3.0: use generic set_power_gating_state helper No need for an IP specific version. Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:31 -05:00
Alex Deucher	bd32af6faa	drm/amdgpu/vcn2.5: use generic set_power_gating_state helper No need for an IP specific version. Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:31 -05:00
Alex Deucher	3389dd059f	drm/amdgpu/vcn2.0: use generic set_power_gating_state helper No need for an IP specific version. Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:31 -05:00
Alex Deucher	cac3dc89f2	drm/amdgpu/vcn1.0: use generic set_power_gating_state helper No need for an IP specific version. Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:31 -05:00
Alex Deucher	a2cf2a883c	drm/amdgpu/vcn: add a generic helper for set_power_gating_state It's common for all VCN variants. Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:31 -05:00
Alex Deucher	4ce4fe2720	drm/amdgpu/vcn: use per instance callbacks for idle work handler Use the vcn instance power gating callbacks rather than the IP powergating callback. This limits power gating to only the instance in use rather than all of the instances. Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:31 -05:00
Alex Deucher	592846e3fe	drm/amdgpu/vcn5.0.1: add set_pg_state callback Rework the code as a vcn instance callback. Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:31 -05:00
Alex Deucher	f2eb0a66ca	drm/amdgpu/vcn5.0.0: add set_pg_state callback Rework the code as a vcn instance callback. Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-02-27 15:52:31 -05:00

... 23 24 25 26 27 ...

1339240 Commits