linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-10 06:49:29 -04:00

Author	SHA1	Message	Date
Liao Yuanhong	7a50377cea	drm/radeon/atom: Remove redundant ternary operators For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:39 -04:00
Liao Yuanhong	9ab06ab36d	drm/amd/pm/powerplay/smumgr: remove redundant ternary operators For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Swap variable positions on either side of '==' to enhance readability. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:39 -04:00
Liao Yuanhong	fa740b115b	drm/amd/pm/powerplay/hwmgr/ppatomctrl: Remove redundant ternary operators For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Swap variable positions on either side of '!=' to enhance readability. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:39 -04:00
Liao Yuanhong	db51c5d98b	amdgpu/pm/legacy: remove redundant ternary operators For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:39 -04:00
Liao Yuanhong	53c271b9a0	drm/amd/display: Remove redundant ternary operators For ternary operators in the form of "a ? true : false" or "a ? false : true", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:39 -04:00
Jesse.Zhang	54d18bc600	drm/amdgpu/userq: add a detect and reset callback Add a detect and reset callback and add the implementation for mes. The callback will detect all hung queues of a particular ip type (e.g., GFX or compute or SDMA) and reset them. v2: increase reset counter and set fence force completion v3: Removed userq_mutex in mes_userq_detect_and_reset since the driver holds it when calling Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:39 -04:00
Qianfeng Rong	cbda64f3f5	drm/amdkfd: Fix error code sign for EINVAL in svm_ioctl() Use negative error code -EINVAL instead of positive EINVAL in the default case of svm_ioctl() to conform to Linux kernel error code conventions. Fixes: `42de677f79` ("drm/amdkfd: register svm range") Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:39 -04:00
Alex Deucher	94bd7bf2c9	drm/amdgpu: don't enable SMU on cyan skillfish Cyan skillfish uses different SMU firmware. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:39 -04:00
Alex Deucher	fa819e3a7c	drm/amdgpu: add support for cyan skillfish gpu_info Some SOCs which are part of the cyan skillfish family rely on an explicit firmware for IP discovery. Add support for the gpu_info firmware. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:39 -04:00
Alex Deucher	9e6a5cf1a2	drm/amdgpu: add support for cyan skillfish without IP discovery For platforms without an IP discovery table. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:38 -04:00
Alex Deucher	e8529dbc75	drm/amdgpu: add ip offset support for cyan skillfish For chips that don't have IP discovery tables. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:38 -04:00
Srinivasan Shanmugam	38ab33dbea	drm/amdgpu: Fix function header names in amdgpu_connectors.c Align the function headers for `amdgpu_max_hdmi_pixel_clock` and `amdgpu_connector_dvi_mode_valid` with the function implementations so they match the expected kdoc style. Fixes the below: drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c:1199: warning: This comment starts with '/*', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Returns the maximum supported HDMI (TMDS) pixel clock in KHz. drivers/gpu/drm/amd/amdgpu/amdgpu_connectors.c:1212: warning: This comment starts with '/*', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Validates the given display mode on DVI and HDMI connectors. Fixes: `585b2f685c` ("drm/amdgpu: Respect max pixel clock for HDMI and DVI-D (v2)") Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:38 -04:00
Yifan Zhang	fa7c99f04f	amd/amdkfd: correct mem limit calculation for small APUs Current mem limit check leaks some GTT memory (reserved_for_pt reserved_for_ras + adev->vram_pin_size) for small APUs. Since carveout VRAM is tunable on APUs, there are three case regarding the carveout VRAM size relative to GTT: 1. 0 < carveout < gtt apu_prefer_gtt = true, is_app_apu = false 2. carveout > gtt / 2 apu_prefer_gtt = false, is_app_apu = false 3. 0 = carveout apu_prefer_gtt = true, is_app_apu = true It doesn't make sense to check below limitation in case 1 (default case, small carveout) because the values in the below expression are mixed with carveout and gtt. adev->kfd.vram_used[xcp_id] + vram_needed > vram_size - reserved_for_pt - reserved_for_ras - atomic64_read(&adev->vram_pin_size) gtt: kfd.vram_used, vram_needed, vram_size carveout: reserved_for_pt, reserved_for_ras, adev->vram_pin_size In case 1, vram allocation will go to gtt domain, skip vram check since ttm_mem_limit check already cover this allocation. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:38 -04:00
Geoffrey McRae	89923fb7ea	drm/amd/display: remove oem i2c adapter on finish Fixes a bug where unbinding of the GPU would leave the oem i2c adapter registered resulting in a null pointer dereference when applications try to access the invalid device. Fixes: `3d5470c973` ("drm/amd/display/dm: add support for OEM i2c bus") Cc: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Geoffrey McRae <geoffrey.mcrae@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:38 -04:00
Alex Deucher	276e8beb2a	drm/amdgpu/userq: add force completion helpers Add support for forcing completion of userq fences. This is needed for userq resets and asic resets so that we can set the error on the fence and force completion. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:38 -04:00
Alex Deucher	c5da9e9c02	drm/amdgpu: add user queue reset source Track resets from user queues. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:38 -04:00
Jesse.Zhang	724471254e	drm/amdgpu/mes12: implement detect and reset callback Implement support for the hung queue detect and reset functionality. v2: Always use AMDGPU_MES_SCHED_PIPE Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:38 -04:00
Jesse.Zhang	b28cfc8305	drm/amdgpu/mes11: implement detect and reset callback Implement support for the hung queue detect and reset functionality. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:38 -04:00
Jesse.Zhang	78e1222fbf	drm/amdgpu/mes: add front end for detect and reset hung queue Helper function to detect and reset hung queues. MES will return an array of doorbell indices of which queues are hung and were optionally reset. v2: Clear the doorbell array before detection Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:38:31 -04:00
Jesse.Zhang	6abd725fdf	drm/amd/amdgpu: Implement MES suspend/resume gang functionality for v12 This commit implements the actual MES (Micro Engine Scheduler) suspend and resume gang operations for version 12 hardware. Previously these functions were just stubs returning success. v2: Always use AMDGPU_MES_SCHED_PIPE Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>	2025-09-05 17:38:07 -04:00
Jesse.Zhang	2318336573	drm/amdgpu: Add preempt and restore callbacks to userq funcs Add two new function pointers to struct amdgpu_userq_funcs: - preempt: To handle preemption of user mode queues - restore: To restore preempted user mode queues These callbacks will allow the driver to properly manage queue preemption and restoration when needed, such as during context switching or priority changes. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 17:37:50 -04:00
Sunil Khatri	878f33f390	drm/amdgpu: fix the formating for debugfs print Fix the format of debugfs print in the mqd. Need to add a colon so parser can parse it properly. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 16:07:02 -04:00
Alex Deucher	1e18746381	drm/amd: add more cyan skillfish PCI ids Add additional PCI IDs to the cyan skillfish family. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 16:06:58 -04:00
Sunil Khatri	7e0fc7b2b7	drm/amdgpu: add more information in debugfs to pagetable dump Add more information in the debugfs which is needed to dump a pagetable correctly for userqueues where vmid is not known in the kernel. Signed-off-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 16:06:52 -04:00
Xiang Liu	f320ed01cf	drm/amdgpu: Correct info field of bad page threshold exceed CPER Correct valid_bits and ms_chk_bits of section info field for bad page threshold exceed CPER to match OOB's behavior. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 16:06:49 -04:00
Eric Huang	36cc7d1317	drm/amdkfd: fix p2p links bug in topology When creating p2p links, KFD needs to check XGMI link with two conditions, hive_id and is_sharing_enabled, but it is missing to check is_sharing_enabled, so add it to fix the error. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 16:06:43 -04:00
Qianfeng Rong	3f36e712a8	drm/radeon/ci_dpm: Use int type to store negative error codes Change the 'ret' variable in ci_populate_all_graphic_levels() and ci_populate_all_memory_levels() from u32 to int, as it needs to store either negative error codes or zero returned by other functions. Storing the negative error codes in unsigned type, doesn't cause an issue at runtime but can be confusing. Additionally, assigning negative error codes to unsigned type may trigger a GCC warning when the -Wsign-conversion flag is enabled. No effect on runtime. Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 16:06:37 -04:00
Liao Yuanhong	086f66edf9	drm/amdgpu/vcn: Remove redundant ternary operators For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 16:06:34 -04:00
Liao Yuanhong	9502b09933	drm/amdgpu/jpeg: Remove redundant ternary operators For ternary operators in the form of "a ? true : false", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 16:06:29 -04:00
Liao Yuanhong	8a4bc4508c	drm/amdgpu/ih: Remove redundant ternary operators For ternary operators in the form of "a ? false : true", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 16:00:15 -04:00
Liao Yuanhong	60df3e81d7	drm/amdgpu/gmc: Remove redundant ternary operators For ternary operators in the form of "a ? false : true", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 16:00:13 -04:00
Liao Yuanhong	d261e744af	drm/amdgpu/gfx: Remove redundant ternary operators For ternary operators in the form of "a ? false : true", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 16:00:09 -04:00
Liao Yuanhong	7670daf65a	drm/amdgpu/amdgpu_cper: Remove redundant ternary operators For ternary operators in the form of "a ? false : true", if 'a' itself returns a boolean result, the ternary operator can be omitted. Remove redundant ternary operators to clean up the code. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Liao Yuanhong <liaoyuanhong@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 16:00:04 -04:00
Colin Ian King	4320fd9e0d	drm/amd/amdgpu: Fix a less than zero check on a uint32_t struct field Currently the error check from the call to mes_v12_inv_tlb_convert_hub_id is always false because a uint32_t struct field hub_id is being used to to perform the less than zero error check. Fix this by using the int variable ret to perform the check. Fixes: `87e6505261` ("drm/amd/amdgpu : Use the MES INV_TLBS API for tlb invalidation on gfx12") Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-05 15:59:11 -04:00
Gustavo A. R. Silva	1e6d36e15b	drm/amdgpu/amdkfd: Avoid a couple hundred -Wflex-array-member-not-at-end warnings -Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Move the conflicting declarations to the end of the corresponding structures. Notice that `struct dev_pagemap` is a flexible structure, this is a structure that contains a flexible-array member. struct dev_pagemap always has room for at least one range. amdgpu only uses a single range. Therefore no change are needed to the allocation of struct amdgpu_device. Fix 283 of the following type of warnings: 283 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.h:111:28: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 17:36:20 -04:00
Colin Ian King	1ee9d1a096	drm/amd/amdgpu: Fix missing error return on kzalloc failure Currently the kzalloc failure check just sets reports the failure and sets the variable ret to -ENOMEM, which is not checked later for this specific error. Fix this by just returning -ENOMEM rather than setting ret. Fixes: `4fb9307154` ("drm/amd/amdgpu: remove redundant host to psp cmd buf allocations") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 15:57:10 -04:00
Timur Kristóf	98fb1d5596	drm/amd/pm: Print VCE clocks too in si_dpm (v3) They are part of a power state too and should be printed alongside the rest of the data from the power state. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 15:57:06 -04:00
Timur Kristóf	6df0768c0d	drm/amd/pm: Remove wm_low and wm_high fields from amdgpu_crtc (v2) These fields were only used by si_dpm and are not necessary anymore. They also may have been incorrect because: - wm_high was set to the LOW_WATERMARK field of watermark A. - wm_low was not set on DCE 6 and was always zero. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 15:57:01 -04:00
Timur Kristóf	7009e3af04	drm/amd/pm: Disable SCLK switching on Oland with high pixel clocks (v3) Port of commit `227545b9a0` ("drm/radeon/dpm: Disable sclk switching on Oland when two 4K 60Hz monitors are connected") This is an ad-hoc DPM fix, necessary because we don't have proper bandwidth calculation for DCE 6. We define "high pixelclock" for SI as higher than necessary for 4K 30Hz. For example, 4K 60Hz and 1080p 144Hz fall into this category. When two high pixel clock displays are connected to Oland, additionally disable shader clock switching, which results in a higher voltage, thereby addressing some visible flickering. v2: Add more comments. v3: Split into two commits for easier review. Fixes: `841686df9f` ("drm/amdgpu: add SI DPM support (v4)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 15:56:15 -04:00
Timur Kristóf	ed3803533c	drm/amd/pm: Disable MCLK switching with non-DC at 120 Hz+ (v2) According to pp_pm_compute_clocks the non-DC display code has "issues with mclk switching with refresh rates over 120 hz". The workaround is to disable MCLK switching in this case. Do the same for legacy DPM. Fixes: `6ddbd37f10` ("drm/amd/pm: optimize the amdgpu_pm_compute_clocks() implementations") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 15:55:44 -04:00
Timur Kristóf	9003a07468	drm/amd/pm: Treat zero vblank time as too short in si_dpm (v3) Some parts of the code base expect that MCLK switching is turned off when the vblank time is set to zero. According to pp_pm_compute_clocks the non-DC code has issues with MCLK switching with refresh rates over 120 Hz. v3: Add code comment to explain this better. Add an if statement instead of changing the switch_limit. Fixes: `841686df9f` ("drm/amdgpu: add SI DPM support (v4)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 15:55:34 -04:00
Timur Kristóf	ce02513012	drm/amd/pm: Adjust si_upload_smc_data register programming (v3) Based on some comments in dm_pp_display_configuration above the crtc_index and line_time fields, these values are programmed to the SMC to work around an SMC hang when it switches MCLK. According to Alex, the Windows driver programs them to: mclk_change_block_cp_min = 200 / line_time mclk_change_block_cp_max = 100 / line_time Let's use the same for the sake of consistency. Previously we used the watermark values, but it seemed buggy as the code was mixing up low/high and A/B watermarks, and was not saving a low watermark value on DCE 6, so mclk_change_block_cp_max would be always zero previously. Split this change off from the previous si_upload_smc_data to make it easier to bisect, in case it causes any issues. Fixes: `841686df9f` ("drm/amdgpu: add SI DPM support (v4)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 15:55:28 -04:00
Timur Kristóf	a43b2cec04	drm/amd/pm: Fix si_upload_smc_data (v3) The si_upload_smc_data function uses si_write_smc_soft_register to set some register values in the SMC, and expects the result to be PPSMC_Result_OK which is 1. The PPSMC_Result_OK / PPSMC_Result_Failed values are used for checking the result of a command sent to the SMC. However, the si_write_smc_soft_register actually doesn't send any commands to the SMC and returns zero on success, so this check was incorrect. Fix that by not checking the return value, just like other calls to si_write_smc_soft_register. v3: Additionally, when no display is plugged in, there is no need to restrict MCLK switching, so program the registers to zero. Fixes: `841686df9f` ("drm/amdgpu: add SI DPM support (v4)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 15:55:10 -04:00
Timur Kristóf	813d13524a	drm/amd/pm: Increase SMC timeout on SI and warn (v3) The SMC can take an excessive amount of time to process some messages under some conditions. Background: Sending a message to the SMC works by writing the message into the mmSMC_MESSAGE_0 register and its optional parameter into the mmSMC_SCRATCH0, and then polling mmSMC_RESP_0. Previously the timeout was AMDGPU_MAX_USEC_TIMEOUT, ie. 100 ms. Increase the timeout to 200 ms for all messages and to 1 sec for a few messages which I've observed to be especially slow: PPSMC_MSG_NoForcedLevel PPSMC_MSG_SetEnabledLevels PPSMC_MSG_SetForcedLevels PPSMC_MSG_DisableULV PPSMC_MSG_SwitchToSwState This fixes the following problems on Tahiti when switching from a lower clock power state to a higher clock state, such as when DC turns on a display which was previously turned off. * si_restrict_performance_levels_before_switch would fail (if the user previously forced high clocks using sysfs) * si_set_sw_state would fail (always) It turns out that both of those failures were SMC timeouts and that the SMC actually didn't fail or hang, just needs more time to process those. Add a warning when there is an SMC timeout to make it easier to identify this type of problem in the future. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 15:55:04 -04:00
Timur Kristóf	3a0c3a4035	drm/amd/pm: Disable ULV even if unsupported (v3) Always send PPSMC_MSG_DisableULV to the SMC, even if ULV mode is unsupported, to make sure it is properly turned off. v3: Simplify si_disable_ulv further. Always check the return value of amdgpu_si_send_msg_to_smc. Fixes: `841686df9f` ("drm/amdgpu: add SI DPM support (v4)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 15:54:34 -04:00
Timur Kristóf	c661219cd7	drm/amdgpu: Power up UVD 3 for FW validation (v2) Unlike later versions, UVD 3 has firmware validation. For this to work, the UVD should be powered up correctly. When DPM is enabled and the display clock is off, the SMU may choose a power state which doesn't power the UVD, which can result in failure to initialize UVD. v2: Add code comments to explain about the UVD power state and how UVD clock is turned on/off. Fixes: `b38f3e80ec` ("drm amdgpu: SI UVD v3_1 (v2)") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 15:54:03 -04:00
David Francis	85705b18ae	drm/amdgpu: Allow kfd CRIU with no buffer objects The kfd CRIU checkpoint ioctl would return an error if trying to checkpoint a process with no kfd buffer objects. This is a normal case and should not be an error. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 15:53:56 -04:00
David Francis	4d82724f7f	drm/amdgpu: Add mapping info option for GEM_OP ioctl Add new GEM_OP_IOCTL option GET_MAPPING_INFO, which returns a list of mappings associated with a given bo, along with their positions and offsets. Userspace for this and the previous change can be found at: https://github.com/checkpoint-restore/criu/pull/2613 Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 15:53:33 -04:00
David Francis	f9db1fc52c	drm/amdgpu: Add ioctl to get all gem handles for a process Add new ioctl DRM_IOCTL_AMDGPU_GEM_LIST_HANDLES. This ioctl returns a list of bos with their handles, sizes, and flags and domains. This ioctl is meant to be used during CRIU checkpoint and provide information needed to reconstruct the bos in CRIU restore. Userspace for this and the next change can be found at https://github.com/checkpoint-restore/criu/pull/2613 Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 15:34:00 -04:00
David Francis	0317e0e224	drm/amdgpu: Allow more flags to be set on gem create. The GEM create flag AMDGPU_GEM_CREATE_VRAM_WIPE_ON_RELEASE specifies that gem memory contains sensitive information and should be cleared to prevent snooping. The COHERENT and UNCACHED gem create flags enable memory features related to sharing memory across devices. For CRIU we need to re-create KFD BOs through the GEM_CREATE IOCTL, so allow those KFD specific flags here as well. This will also aid us in the future and allows to move the KFD components over using the render node for allocations. Signed-off-by: David Francis <David.Francis@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-02 15:34:00 -04:00

1 2 3 4 5 ...

1382560 Commits