linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-08 00:29:36 -04:00

Author	SHA1	Message	Date
Sunil Khatri	58d283801d	drm/amdgpu: add vcn_v3_0 ip dump support Add support of vcn ip dump in the devcoredump for vcn_v3_0. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-27 17:28:35 -04:00
Sunil Khatri	50d10d9271	drm/amdgpu: add macro to calculate offset with instance Add macro definition which calculate offset of the register with index override. This is useful in case when there is an array of registers which is common for all instances. To read registers in that case it is easy to define registers once and the index value is manually passed to calculate proper offset of register for each instance. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-27 17:28:28 -04:00
Sunil Khatri	f3392e662e	drm/amdgpu: add vcn ip dump ptr in vcn global struct Add pointer to the vcn ip dump in the vcn global structure to be accessible for all vcn version via global adev. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-27 17:28:17 -04:00
Jiapeng Chong	8f28c465a4	drm/amd/display: remove unneeded semicolon No functional modification involved. ./drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:481:2-3: Unneeded semicolon. ./drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:3783:168-169: Unneeded semicolon. ./drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_core/dml2_core_dcn4_calcs.c:3782:166-167: Unneeded semicolon. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9575 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-25 17:43:50 -04:00
Jonathan Kim	e06b71b231	drm/amdkfd: allow users to target recommended SDMA engines Certain GPUs have better copy performance over xGMI on specific SDMA engines depending on the source and destination GPU. Allow users to create SDMA queues on these recommended engines. Close to 2x overall performance has been observed with this optimization. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-25 17:43:41 -04:00
Kenneth Feng	60c30ba7ba	drm/amdgpu/pm: support gpu_metrics sysfs interface for smu v14.0.2/3 support gpu_metrics sysfs interface for smu v14.0.2/3 Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-25 17:43:27 -04:00
Jiapeng Chong	f3c681f0c3	drm/amd/display: use swap() in sort() Use existing swap() function rather than duplicating its implementation. ./drivers/gpu/drm/amd/display/dc/dml2/dml21/src/dml2_pmo/dml2_pmo_dcn3.c:17:29-30: WARNING opportunity for swap(). Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=9573 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-25 17:43:23 -04:00
Nathan Chancellor	fdedd77b0e	drm/amd/display: Reapply `2fde4fdddc` Commit `2563391e57` ("drm/amd/display: DML2.1 resynchronization") blew away the compiler warning fix from commit `2fde4fdddc` ("drm/amd/display: Avoid -Wenum-float-conversion in add_margin_and_round_to_dfs_grainularity()"), causing the warning to reappear. drivers/gpu/drm/amd/amdgpu/../display/dc/dml2/dml21/src/dml2_dpmm/dml2_dpmm_dcn4.c:183:58: error: arithmetic between enumeration type 'enum dentist_divider_range' and floating-point type 'double' [-Werror,-Wenum-float-conversion] 183 \| divider = (unsigned int)(DFS_DIVIDER_RANGE_SCALE_FACTOR * (vco_freq_khz / clock_khz)); \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. Apply the fix again to resolve the warning. Fixes: `2563391e57` ("drm/amd/display: DML2.1 resynchronization") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-25 17:42:20 -04:00
Colin Ian King	75c3f06fd9	drm/amd/display: Fix spelling mistake "tolarance" -> "tolerance" There is a spelling mistake in a dml2_printf message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-25 17:42:14 -04:00
Alex Deucher	17c6baff3d	drm/radeon: properly handle vbios fake edid sizing The comment in the vbios structure says: // = 128 means EDID length is 128 bytes, otherwise the EDID length = ucFakeEDIDLength*128 This fake edid struct has not been used in a long time, so I'm not sure if there were actually any boards out there with a non-128 byte EDID, but align the code with the comment. Reviewed-by: Thomas Weißschuh <linux@weissschuh.net> Reported-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lists.freedesktop.org/archives/amd-gfx/2024-June/109964.html Fixes: `c324acd503` ("drm/radeon/kms: parse the extended LCD info block") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-25 17:41:58 -04:00
Alex Deucher	8155566a26	drm/amdgpu: properly handle vbios fake edid sizing The comment in the vbios structure says: // = 128 means EDID length is 128 bytes, otherwise the EDID length = ucFakeEDIDLength*128 This fake edid struct has not been used in a long time, so I'm not sure if there were actually any boards out there with a non-128 byte EDID, but align the code with the comment. Reviewed-by: Thomas Weißschuh <linux@weissschuh.net> Reported-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lists.freedesktop.org/archives/amd-gfx/2024-June/109964.html Fixes: `d38ceaf99e` ("drm/amdgpu: add core driver (v4)") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-25 17:41:46 -04:00
Philip Yang	629568d25f	drm/amdkfd: Validate queue cwsr area and eop buffer size When creating KFD user compute queue, check if queue eop buffer size, cwsr area size, ctl stack size equal to the size of KFD node properities. Check the entire cwsr area which may split into multiple svm ranges aligned to granularity boundary. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-24 14:46:06 -04:00
Philip Yang	517fff221c	drm/amdkfd: Store queue cwsr area size to node properties Use the queue eop buffer size, cwsr area size, ctl stack size calculation from Thunk, store the value to KFD node properties. Those will be used to validate queue eop buffer size, cwsr area size, ctl stack size when creating KFD user compute queue. Those will be exposed to user space via sysfs KFD node properties, to remove the duplicate calculation code from Thunk. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-24 14:45:58 -04:00
ZhenGuo Yin	47c0388b05	drm/amdgpu: reset vm state machine after gpu reset(vram lost) [Why] Page table of compute VM in the VRAM will lost after gpu reset. VRAM won't be restored since compute VM has no shadows. [How] Use higher 32-bit of vm->generation to record a vram_lost_counter. Reset the VM state machine when vm->genertaion is not equal to the new generation token. v2: Check vm->generation instead of calling drm_sched_entity_error in amdgpu_vm_validate. v3: Use new generation token instead of vram_lost_counter for check. Signed-off-by: ZhenGuo Yin <zhenguo.yin@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-24 14:45:24 -04:00
Srinivasan Shanmugam	08ae395ea2	drm/amd/display: Add null check for set_output_gamma in dcn30_set_output_transfer_func This commit adds a null check for the set_output_gamma function pointer in the dcn30_set_output_transfer_func function. Previously, set_output_gamma was being checked for nullity at line 386, but then it was being dereferenced without any nullity check at line 401. This could potentially lead to a null pointer dereference error if set_output_gamma is indeed null. To fix this, we now ensure that set_output_gamma is not null before dereferencing it. We do this by adding a nullity check for set_output_gamma before the call to set_output_gamma at line 401. If set_output_gamma is null, we log an error message and do not call the function. This fix prevents a potential null pointer dereference error. drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:401 dcn30_set_output_transfer_func() error: we previously assumed 'mpc->funcs->set_output_gamma' could be null (see line 386) drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c 373 bool dcn30_set_output_transfer_func(struct dc dc, 374 struct pipe_ctx pipe_ctx, 375 const struct dc_stream_state stream) 376 { 377 int mpcc_id = pipe_ctx->plane_res.hubp->inst; 378 struct mpc mpc = pipe_ctx->stream_res.opp->ctx->dc->res_pool->mpc; 379 const struct pwl_params params = NULL; 380 bool ret = false; 381 382 / program OGAM or 3DLUT only for the top pipe/ 383 if (pipe_ctx->top_pipe == NULL) { 384 /program rmu shaper and 3dlut in MPC/ 385 ret = dcn30_set_mpc_shaper_3dlut(pipe_ctx, stream); 386 if (ret == false && mpc->funcs->set_output_gamma) { ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ If this is NULL 387 if (stream->out_transfer_func.type == TF_TYPE_HWPWL) 388 params = &stream->out_transfer_func.pwl; 389 else if (pipe_ctx->stream->out_transfer_func.type == 390 TF_TYPE_DISTRIBUTED_POINTS && 391 cm3_helper_translate_curve_to_hw_format( 392 &stream->out_transfer_func, 393 &mpc->blender_params, false)) 394 params = &mpc->blender_params; 395 / there are no ROM LUTs in OUTGAM */ 396 if (stream->out_transfer_func.type == TF_TYPE_PREDEFINED) 397 BREAK_TO_DEBUGGER(); 398 } 399 } 400 --> 401 mpc->funcs->set_output_gamma(mpc, mpcc_id, params); ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Then it will crash 402 return ret; 403 } Fixes: `d99f13878d` ("drm/amd/display: Add DCN3 HWSEQ") Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Hersen Wu <hersenxs.wu@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-24 14:44:16 -04:00
Tim Huang	0b071245dd	drm/amdgpu: add missed harvest check for VCN IP v4/v5 To prevent below probe failure, add a check for models with VCN IP v4.0.6 where VCN1 may be harvested. v2: Apply the same check to VCN IP v4.0 and v5.0. [ 54.070117] RIP: 0010:vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu] [ 54.071055] Code: 80 fb ff 8d 82 00 80 fe ff 81 fe 00 06 00 00 0f 43 c2 49 69 d5 38 0d 00 00 48 8d 71 04 c1 e8 02 4c 01 f2 48 89 b2 50 f6 02 00 <89> 01 48 8b 82 50 f6 02 00 48 8d 48 04 48 89 8a 50 f6 02 00 c7 00 [ 54.072408] RSP: 0018:ffffb17985f736f8 EFLAGS: 00010286 [ 54.072793] RAX: 00000000000000d6 RBX: ffff99a82f680000 RCX: 0000000000000000 [ 54.073315] RDX: ffff99a82f680000 RSI: 0000000000000004 RDI: ffff99a82f680000 [ 54.073835] RBP: ffffb17985f73730 R08: 0000000000000001 R09: 0000000000000000 [ 54.074353] R10: 0000000000000008 R11: ffffb17983c05000 R12: 0000000000000000 [ 54.074879] R13: 0000000000000000 R14: ffff99a82f680000 R15: 0000000000000001 [ 54.075400] FS: 00007f8d9c79a000(0000) GS:ffff99ab2f140000(0000) knlGS:0000000000000000 [ 54.075988] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 54.076408] CR2: 0000000000000000 CR3: 0000000140c3a000 CR4: 0000000000750ef0 [ 54.076927] PKRU: 55555554 [ 54.077132] Call Trace: [ 54.077319] <TASK> [ 54.077484] ? show_regs+0x69/0x80 [ 54.077747] ? __die+0x28/0x70 [ 54.077979] ? page_fault_oops+0x180/0x4b0 [ 54.078286] ? do_user_addr_fault+0x2d2/0x680 [ 54.078610] ? exc_page_fault+0x84/0x190 [ 54.078910] ? asm_exc_page_fault+0x2b/0x30 [ 54.079224] ? vcn_v4_0_5_start_dpg_mode+0x9be/0x36b0 [amdgpu] [ 54.079941] ? vcn_v4_0_5_start_dpg_mode+0xe6/0x36b0 [amdgpu] [ 54.080617] vcn_v4_0_5_set_powergating_state+0x82/0x19b0 [amdgpu] [ 54.081316] amdgpu_device_ip_set_powergating_state+0x64/0xc0 [amdgpu] [ 54.082057] amdgpu_vcn_ring_begin_use+0x6f/0x1d0 [amdgpu] [ 54.082727] amdgpu_ring_alloc+0x44/0x70 [amdgpu] [ 54.083351] amdgpu_vcn_dec_sw_ring_test_ring+0x40/0x110 [amdgpu] [ 54.084054] amdgpu_ring_test_helper+0x22/0x90 [amdgpu] [ 54.084698] vcn_v4_0_5_hw_init+0x87/0xc0 [amdgpu] [ 54.085307] amdgpu_device_init+0x1f96/0x2780 [amdgpu] [ 54.085951] amdgpu_driver_load_kms+0x1e/0xc0 [amdgpu] [ 54.086591] amdgpu_pci_probe+0x19f/0x550 [amdgpu] [ 54.087215] local_pci_probe+0x48/0xa0 [ 54.087509] pci_device_probe+0xc9/0x250 [ 54.087812] really_probe+0x1a4/0x3f0 [ 54.088101] __driver_probe_device+0x7d/0x170 [ 54.088443] driver_probe_device+0x24/0xa0 [ 54.088765] __driver_attach+0xdd/0x1d0 [ 54.089068] ? __pfx___driver_attach+0x10/0x10 [ 54.089417] bus_for_each_dev+0x8e/0xe0 [ 54.089718] driver_attach+0x22/0x30 [ 54.090000] bus_add_driver+0x120/0x220 [ 54.090303] driver_register+0x62/0x120 [ 54.090606] ? __pfx_amdgpu_init+0x10/0x10 [amdgpu] [ 54.091255] __pci_register_driver+0x62/0x70 [ 54.091593] amdgpu_init+0x67/0xff0 [amdgpu] [ 54.092190] do_one_initcall+0x5f/0x330 [ 54.092495] do_init_module+0x68/0x240 [ 54.092794] load_module+0x201c/0x2110 [ 54.093093] init_module_from_file+0x97/0xd0 [ 54.093428] ? init_module_from_file+0x97/0xd0 [ 54.093777] idempotent_init_module+0x11c/0x2a0 [ 54.094134] __x64_sys_finit_module+0x64/0xc0 [ 54.094476] do_syscall_64+0x58/0x120 [ 54.094767] entry_SYSCALL_64_after_hwframe+0x6e/0x76 Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-24 14:44:05 -04:00
Yifan Zhang	3b37e2725a	drm/amdgpu: skip kfd init if GFX is not ready. avoid kfd init crash in that case. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Tested-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-24 14:43:58 -04:00
Philip Yang	305cd109b7	drm/amdkfd: Validate user queue update Ensure update queue new ring buffer is mapped on GPU with correct size. Decrease queue old ring_bo queue_refcount and increase new ring_bo queue_refcount. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-24 14:43:49 -04:00
Philip Yang	b049504e21	drm/amdkfd: Validate user queue svm memory residency Queue CWSR area maybe registered to GPU as svm memory, create queue to ensure svm mapped to GPU with KFD_IOCTL_SVM_FLAG_GPU_ALWAYS_MAPPED flag. Add queue_refcount to struct svm_range, to track queue CWSR area usage. Because unmap mmu notifier callback return value is ignored, if application unmap the CWSR area while queue is active, pr_warn message in dmesg log. To be safe, evict user queue. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-24 14:43:28 -04:00
Alex Deucher	bd4bea5ab2	drm/amdgpu/gfx9.4.3: Enable bad opcode interrupt For the bad opcode case, it will cause CP/ME hang. The firmware will prevent the ME side from hanging by raising a bad opcode interrupt. And the driver needs to perform a vmid reset when receiving the interrupt. Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:45:46 -04:00
Alex Deucher	238352b494	drm/amdgpu/gfx9: Enable bad opcode interrupt For the bad opcode case, it will cause CP/ME hang. The firmware will prevent the ME side from hanging by raising a bad opcode interrupt. And the driver needs to perform a vmid reset when receiving the interrupt. Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:45:46 -04:00
Jesse Zhang	5ebca62eb8	drm/amdgpu/gfx12: Enable bad opcode interrupt For the bad opcode case, it will cause CP/ME hang. The firmware will prevent the ME side from hanging by raising a bad opcode interrupt. And the driver needs to perform a vmid reset when receiving the interrupt. v2: update irq naming (drop priv) (Alex) Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:45:46 -04:00
Jesse Zhang	bc6c2a6f64	drm/amdgpu/gfx10: Enable bad opcode interrupt For the bad opcode case, it will cause CP/ME hang. The firmware will prevent the ME side from hanging by raising a bad opcode interrupt. And the driver needs to perform a vmid reset when receiving the interrupt. v2: update irq naming (drop priv) (Alex) Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:45:45 -04:00
Jesse Zhang	a790902237	drm/amdgpu/gfx11: Enable bad opcode interrupt For the bad opcode case, it will cause CP/ME hang. The firmware will prevent the ME side from hanging by raising a bad opcode interrupt. And the driver needs to perform a vmid reset when receiving the interrupt. v2: update irq naming (drop priv) (Alex) Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:45:45 -04:00
Alex Deucher	acddd5cf70	drm/amdgpu/gfx: add bad opcode interrupt Add the irq source for bad opcodes. Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:45:45 -04:00
Alex Deucher	48695573d2	drm/amdgpu/gfx9: properly handle error ints on all pipes Need to handle the interrupt enables for all pipes. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:45:45 -04:00
Alex Deucher	3987932176	drm/amdgpu/gfx12: properly handle error ints on all pipes Need to handle the interrupt enables for all pipes. v2: fix indexing (Jessie) Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:45:45 -04:00
Alex Deucher	2662b7d9d8	drm/amdgpu/gfx11: properly handle error ints on all pipes Need to handle the interrupt enables for all pipes. v2: fix indexing (Jessie) Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:45:45 -04:00
Alex Deucher	4b95cec689	drm/amdgpu/gfx10: properly handle error ints on all pipes Need to handle the interrupt enables for all pipes. v2: fix indexing (Jessie) Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:45:45 -04:00
Alex Deucher	af4808ac40	drm/amdgpu/gfx12: enable wave kill for compute queues It should work the same for compute as well as gfx. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:45:45 -04:00
Alex Deucher	f53f526f70	drm/amdgpu/gfx11: enable wave kill for compute queues It should work the same for compute as well as gfx. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:45:45 -04:00
Alex Deucher	a2737c404c	drm/amdgpu/gfx10: enable wave kill for compute queues It should work the same for compute as well as gfx. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:45:45 -04:00
Stanley.Yang	015b8a2fdf	drm/amdgpu: Fix eeprom max record count The eeprom table is empty before initializing, set eeprom table version first before initializing. Changed from V1: Reuse amdgpu_ras_set_eeprom_table_version function Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:45:45 -04:00
Srinivasan Shanmugam	c395fd47d1	drm/amd/display: Add NULL check for clk_mgr in dcn32_init_hw This commit addresses a potential null pointer dereference issue in the `dcn32_init_hw` function. The issue could occur when `dc->clk_mgr` is null. The fix adds a check to ensure `dc->clk_mgr` is not null before accessing its functions. This prevents a potential null pointer dereference. Reported by smatch: drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn32/dcn32_hwseq.c:961 dcn32_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 782) Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:43:13 -04:00
YiPeng Chai	8284951a6e	drm/amdgpu: fix ras UE error injection failure issue The ras command shared memory is allocated from VRAM and the response status of the command buffer will not be zero due to gpu being in fatal error state after ras UE error injection. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:43:06 -04:00
Philip Yang	834368eab3	drm/amdkfd: Ensure user queue buffers residency Add atomic queue_refcount to struct bo_va, return -EBUSY to fail unmap BO from the GPU if the bo_va queue_refcount is not zero. Create queue to increase the bo_va queue_refcount, destroy queue to decrease the bo_va queue_refcount, to ensure the queue buffers mapped on the GPU when queue is active. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:42:54 -04:00
Alex Deucher	22a9d5cbf8	drm/amdgpu/gfx9.4.3: implement wave kill for compute queues Based on gfx9.0 implementation. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:42:45 -04:00
Sunil Khatri	eac3b274aa	drm/amdgpu: add print support for sdma_v_4_4_2 ip_dump Add print support for ip dump for sdma_v_4_4_2 in devcoredump. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:42:37 -04:00
Srinivasan Shanmugam	4b6377f0e9	drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn401_init_hw This commit addresses a potential null pointer dereference issue in the `dcn401_init_hw` function. The issue could occur when `dc->clk_mgr` or `dc->clk_mgr->funcs` is null. The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs` is not null before accessing its functions. This prevents a potential null pointer dereference. Reported by smatch: drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn401/dcn401_hwseq.c:416 dcn401_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 225) Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:42:29 -04:00
Srinivasan Shanmugam	cba7fec864	drm/amd/display: Add NULL check for clk_mgr and clk_mgr->funcs in dcn30_init_hw This commit addresses a potential null pointer dereference issue in the `dcn30_init_hw` function. The issue could occur when `dc->clk_mgr` or `dc->clk_mgr->funcs` is null. The fix adds a check to ensure `dc->clk_mgr` and `dc->clk_mgr->funcs` is not null before accessing its functions. This prevents a potential null pointer dereference. Reported by smatch: drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn30/dcn30_hwseq.c:789 dcn30_init_hw() error: we previously assumed 'dc->clk_mgr' could be null (see line 628) Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:42:23 -04:00
Philip Yang	68e599db7a	drm/amdkfd: Validate user queue buffers Find user queue rptr, ring buf, eop buffer and cwsr area BOs, and check BOs are mapped on the GPU with correct size and take the BO reference. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:42:14 -04:00
Alex Deucher	9c7e69d2e1	drm/amdgpu/gfx9: enable wave kill for compute queues It should work the same for compute as well as gfx. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:42:11 -04:00
Alex Deucher	7e60ecc2b7	drm/amdgpu/gfx8: enable wave kill for compute queues It should work the same for compute as well as gfx. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:42:08 -04:00
Alex Deucher	12fb3e9c88	drm/amdgpu/gfx7: enable wave kill for compute queues It should work the same for compute as well as gfx. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:42:03 -04:00
Srinivasan Shanmugam	ac21404491	drm/amd/display: Add null check for head_pipe in dcn32_acquire_idle_pipe_for_head_pipe_in_layer This commit addresses a potential null pointer dereference issue in the `dcn32_acquire_idle_pipe_for_head_pipe_in_layer` function. The issue could occur when `head_pipe` is null. The fix adds a check to ensure `head_pipe` is not null before asserting it. If `head_pipe` is null, the function returns NULL to prevent a potential null pointer dereference. Reported by smatch: drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn32/dcn32_resource.c:2690 dcn32_acquire_idle_pipe_for_head_pipe_in_layer() error: we previously assumed 'head_pipe' could be null (see line 2681) Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:41:56 -04:00
Srinivasan Shanmugam	f22f4754aa	drm/amd/display: Add null check for head_pipe in dcn201_acquire_free_pipe_for_layer This commit addresses a potential null pointer dereference issue in the `dcn201_acquire_free_pipe_for_layer` function. The issue could occur when `head_pipe` is null. The fix adds a check to ensure `head_pipe` is not null before asserting it. If `head_pipe` is null, the function returns NULL to prevent a potential null pointer dereference. Reported by smatch: drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn201/dcn201_resource.c:1016 dcn201_acquire_free_pipe_for_layer() error: we previously assumed 'head_pipe' could be null (see line 1010) Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:41:47 -04:00
Srinivasan Shanmugam	d81873f9e7	drm/amd/display: Fix index out of bounds in DCN30 color transformation This commit addresses a potential index out of bounds issue in the `cm3_helper_translate_curve_to_hw_format` function in the DCN30 color management module. The issue could occur when the index 'i' exceeds the number of transfer function points (TRANSFER_FUNC_POINTS). The fix adds a check to ensure 'i' is within bounds before accessing the transfer function points. If 'i' is out of bounds, the function returns false to indicate an error. drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:180 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:181 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:182 cm3_helper_translate_curve_to_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:41:39 -04:00
Srinivasan Shanmugam	bdf6068102	drm/amd/display: Implement bounds check for stream encoder creation in DCN401 'stream_enc_regs' array is an array of dcn10_stream_enc_registers structures. The array is initialized with four elements, corresponding to the four calls to stream_enc_regs() in the array initializer. This means that valid indices for this array are 0, 1, 2, and 3. The error message 'stream_enc_regs' 4 <= 5 below, is indicating that there is an attempt to access this array with an index of 5, which is out of bounds. This could lead to undefined behavior Here, eng_id is used as an index to access the stream_enc_regs array. If eng_id is 5, this would result in an out-of-bounds access on the stream_enc_regs array. Thus fixing Buffer overflow error in dcn401_stream_encoder_create Found by smatch: drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn401/dcn401_resource.c:1209 dcn401_stream_encoder_create() error: buffer overflow 'stream_enc_regs' 4 <= 5 Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:41:29 -04:00
Srinivasan Shanmugam	b7e99058eb	drm/amd/display: Fix index out of bounds in degamma hardware format translation Fixes index out of bounds issue in `cm_helper_translate_curve_to_degamma_hw_format` function. The issue could occur when the index 'i' exceeds the number of transfer function points (TRANSFER_FUNC_POINTS). The fix adds a check to ensure 'i' is within bounds before accessing the transfer function points. If 'i' is out of bounds the function returns false to indicate an error. Reported by smatch: drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:594 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:595 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max drivers/gpu/drm/amd/amdgpu/../display/dc/dcn10/dcn10_cm_common.c:596 cm_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:41:19 -04:00
Srinivasan Shanmugam	bc50b614d5	drm/amd/display: Fix index out of bounds in DCN30 degamma hardware format translation This commit addresses a potential index out of bounds issue in the `cm3_helper_translate_curve_to_degamma_hw_format` function in the DCN30 color management module. The issue could occur when the index 'i' exceeds the number of transfer function points (TRANSFER_FUNC_POINTS). The fix adds a check to ensure 'i' is within bounds before accessing the transfer function points. If 'i' is out of bounds, the function returns false to indicate an error. Reported by smatch: drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:338 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.red' 1025 <= s32max drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:339 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.green' 1025 <= s32max drivers/gpu/drm/amd/amdgpu/../display/dc/dcn30/dcn30_cm_common.c:340 cm3_helper_translate_curve_to_degamma_hw_format() error: buffer overflow 'output_tf->tf_pts.blue' 1025 <= s32max Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-07-23 17:41:12 -04:00

1 2 3 4 5 ...

1283742 Commits