linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-16 07:51:31 -04:00

Author	SHA1	Message	Date
Alex Deucher	1987c79b4f	drm/amdgpu/pm: align Hawaii mclk workaround with radeon Align the hawaii mclk workaround with radeon and windows. Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/1816 Fixes: `9f4b35411c` ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 9649528b637f668c5af9f2b83ca4ad8576ae2121) Cc: stable@vger.kernel.org	2026-05-05 10:15:11 -04:00
Alex Deucher	2a561b361b	drm/amdgpu/pm: add missing revision check for CI The ci_populate_all_memory_levels() workaround only applies to revision 0 SKUs. Link: https://gitlab.freedesktop.org/drm/amd/-/work_items/1816 Fixes: `9f4b35411c` ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1db15ba8f72f400bbad8ae0ce24fafc43429d4bd) Cc: stable@vger.kernel.org	2026-05-05 10:14:52 -04:00
Lijo Lazar	47a5dfc8ad	drm/amd/pm: Add fine grained flag to SMU v13.0.6 Gfx clock is fine grained on SMU v13.0.6/12 SOCs. Add the flag to report clock frequencies correctly. Fixes: `7380228401` ("drm/amd/pm: Use generic dpm table for SMUv13 SOCs") Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d4871d837bbf70173f63426a84fa80b39e408b9e)	2026-04-28 15:51:18 -04:00
Lijo Lazar	d6b99885b1	drm/amd/pm: Update emit clock logic If only one level is enabled in clock table, there is no need to follow the fine grained clock logic which expects a minimum of two levels (min/max). Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7f19097af1496dd908a044ca95862f32d05f02df)	2026-04-28 15:46:30 -04:00
Yang Wang	ccf8932ed8	drm/amd/pm: fix missing fine-grained dpm table flag on aldebaran Add the missing SMU_DPM_TABLE_FINE_GRAINED flag to aldebaran DPM table. This fixes the pp_dpm_sclk node issue caused by missing flag configuration. Fixes: `7ea1c722fe` ("drm/amd/pm: Use common helper for aldebaran dpm table") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 3427dea3a48ebddb491a26093f3627384b3cb2c2)	2026-04-24 11:09:43 -04:00
Yang Wang	4f2c86c62a	drm/amd/pm: add od table upload error message parsing for smu v14.0.x parse and print detailed reasons for od table upload failures to help users understand error causes. example: $ echo "0 30 40" | sudo tee fan_curve $ echo "1 40 30" | sudo tee fan_curve $ echo "c" | sudo tee fan_curve kernel log: [ 75.040174] amdgpu 0000:0a:00.0: Failed to upload overdrive table, ret:-5 [ 75.040178] amdgpu 0000:0a:00.0: Invalid overdrive table content: OD_FAN_CURVE_PWM_ERROR (13) [ 75.040181] amdgpu 0000:0a:00.0: Failed to upload overdrive table! Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-04-17 14:53:02 -04:00
Yang Wang	79d47bc4c7	drm/amd/pm: add read arg support to smu_cmn_update_table Extend the smu_cmn_update_table function to support reading a 32-bit return argument from the SMU firmware during table transfer operations. - Rename the original function to smu_cmn_update_table_read_arg - Add a uint32_t *read_arg output parameter to capture firmware response - Pass the read_arg pointer to the SMU message command - Keep full backward compatibility using a macro wrapper for the old API This allows the driver to retrieve status codes, results, or configuration feedback from the SMU firmware after table data transfer. No functional changes for existing users of the original smu_cmn_update_table() API. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-04-17 14:52:47 -04:00
Yang Wang	25fd8095a8	drm/amd/pm: fix runtime PM imbalance issue in amdgpu_pm.c Fix runtime PM counter imbalance to prevent device from failing to enter low power state Fixes: `a50d32c41f` ("drm/amd/pm: Deprecate print_clock_levels interface") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-04-17 14:51:15 -04:00
Srinivasan Shanmugam	a6d561a88c	drm/amd/pm: Fix mode2 reset ACK handling on aldebaran v2 aldebaran_mode2_reset() sends a mode2 reset message and waits for an acknowledgment from the SMU. The current ACK handling is incorrect. The wait loop runs only when ret is -ETIME. But after a successful async send, ret is 0. Because of this, the loop is skipped and the code does not wait for the reset acknowledgment. Also, the code checks for ret != 1 after calling smu_msg_wait_response(). However, smu_msg_wait_response() returns 0 on success and negative error codes on failure. So checking against 1 is wrong. Return -EOPNOTSUPP when the firmware does not support this reset message. Fix this by setting ret to -ETIME before entering the wait loop, checking for ret != 0 after getting the SMU response, and returning -EOPNOTSUPP when the firmware does not support the message. v2: - Update ACK check to use ret != 0 instead of ret != 1, since smu_msg_wait_response() returns 0 on success (Feifei) - Remove unnecessary handling for ret == 0 Fixes: `e42569d02a` ("drm/amd/pm: Modify mode2 msg sequence on aldebaran") Reported-by: Dan Carpenter <error27@gmail.com> Cc: Feifei Xu <Feifei.Xu@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-04-17 14:50:26 -04:00
Srinivasan Shanmugam	e81a492d12	drm/amd/pm: smu7: Remove stale error check in smu7_hwmgr_backend_init smu7_hwmgr_backend_init() is responsible for initializing the SMU7 power management backend. It allocates and sets up the backend structure, initializes voltage tables, configures dependency tables, and prepares platform-specific power and clock parameters. The function follows a typical pattern where each initialization step returns a status in "result", and failures are handled via a common "goto fail" path that performs cleanup. Commit 2c21648bb814 ("drm/amd/pm/smu7: Remove non-functional SMU7 voltage dependency on DAL") removed a function call in this initialization sequence, but left behind the corresponding error check. As a result, "result" is checked twice without being updated in between: result = smu7_init_voltage_dependency_on_display_clock_table(hwmgr); if (result) goto fail; ... if (result) goto fail; The second check is redundant and unreachable for any new failure, since no operation modifies "result" between the two checks. This triggers a Smatch warning about a duplicate zero check and reduces code clarity. Remove the stale error check to keep the control flow correct and readable. Fixes: `9f49e3d4cb` ("drm/amd/pm/smu7: Remove non-functional SMU7 voltage dependency on DAL") Reported-by: Dan Carpenter <error27@gmail.com> Cc: Timur Kristóf <timur.kristof@gmail.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-04-17 14:49:54 -04:00
Yang Wang	504f0098eb	drm/amd/pm: fix incorrect FeatureCtrlMask setting on smu v14.0.x OverDriveTable.FanMinimumPwm and FeatureCtrlMask.PP_OD_FEATURE_FAN_LEGACY_BIT have a hard dependency. Invalid handling of this dependency leads to disabled thermal monitoring and temperature boundary validation. v2: squash in typo fix (Yang) Fixes: `9710b84e2a` ("drm/amd/pm: add overdrive support on smu v14.0.2/3") Cc: stable@vger.kernel.org Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-04-17 14:45:04 -04:00
Yang Wang	48d1a5b33a	drm/amd/pm: fix memleak issue in smu_v15_0_8_get_gpu_metrics() remove unused code to avoid memleak issue. (NOTE: This bug occurs during internal branch switching) Fixes: `0a66ca3b35` ("drm/amd/pm: add get_gpu_metrics support for 15.0.8") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-04-03 13:54:28 -04:00
Yang Wang	e4465c0464	drm/amd/pm: optimize logic and remove unnecessary checks in smu v15.0.8 the following two sets of logic are clearly mutually exclusive in smu_v15_0_8_set_soft_freq_limited_range. remove unnecessary code logic to keep the code logic clear. e.g: if (smu_dpm->dpm_level != AMD_DPM_FORCED_LEVEL_MANUAL) return -EINVAL; if (smu_dpm->dpm_level == AMD_DPM_FORCED_LEVEL_MANUAL) { ... } Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-04-03 13:54:21 -04:00
Yang Wang	95e21dff47	drm/amd/pm: fix null pointer dereference issue in smu_v15_0_8_get_power_limit() Fix null pointer issues caused by coding errors Fixes: `e20e47bcb3` ("drm/amd/pm: add set{get}_power_limit support for smu 15.0.8") Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-04-03 13:53:09 -04:00
Yang Wang	592713a896	drm/amd/pm: correct mem_busy_percent display due to calculation errors PMFW may return invalid values due to internal calculation errors. so, the kmd driver must validate and sanitize the returned values to prevent issues caused by firmware calculation errors. For example, values 0xfffe (-2) and 0xffff (-1) are treated as invalid and clamped to 0. this applies to devices with CAB (Cache As Buffer) functionality. Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4905 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-04-03 13:52:44 -04:00
Lijo Lazar	3ecee2073c	drm/amd/pm: Use smu vram copy in SMUv15 Use smu vram copy wrapper function for vram copy operations in SMUv15.0.8 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-04-03 13:52:10 -04:00
Lijo Lazar	3bec582562	drm/amd/pm: Use smu vram copy in SMUv13 Use smu vram copy wrapper function for vram copy operations in SMUv13.0.6 and SMUv13.0.12. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-04-03 13:52:04 -04:00
Lijo Lazar	f2275ea90b	drm/amd/pm: Add smu vram copy function Add a wrapper function for copying data to/from vram. This additionally checks for any RAS fatal error. Copy cannot be trusted if any RAS fatal error happened as VRAM becomes inaccessible. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-04-03 13:51:54 -04:00
Timur Kristóf	4724bc5b8d	drm/amd/pm/smu7: Add SCLK cap for quirky Hawaii board On a specific Radeon R9 390X board, the GPU can "randomly" hang while gaming. Initially I thought this was a RADV bug and tried to work around this in Mesa: commit 8ea08747b86b ("radv: Mitigate GPU hang on Hawaii in Dota 2 and RotTR") However, I got some feedback from other users who are reporting that the above mitigation causes a significant performance regression for them, and they didn't experience the hang on their GPU in the first place. After some further investigation, it turns out that the problem is that the highest SCLK DPM level on this board isn't stable. Lowering SCLK to 1040 MHz (from 1070 MHz) works around the issue, and has a negligible impact on performance compared to the Mesa patch. (Note that increasing the voltage can also work around it, but we felt that lowering the SCLK is the safer option.) To solve the above issue, add an "sclk_cap" field to smu7_hwmgr and set this field for the affected board. The capped SCLK value correctly appears on the sysfs interface and shows up in GUI tools such as LACT. Fixes: `9f4b35411c` ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-30 16:49:01 -04:00
Timur Kristóf	baf28ec579	drm/amd/pm/ci: Fill DW8 fields from SMC In ci_populate_dw8() we currently just read a value from the SMU and then throw it away. Instead of throwing away the value, we should use it to fill other fields in DW8 (like radeon). Otherwise the value of the other fields is just cleared when we copy this data to the SMU later. Fixes: `9f4b35411c` ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-30 16:48:50 -04:00
Timur Kristóf	5facfd4c4c	drm/amd/pm/ci: Clear EnabledForActivity field for memory levels Follow what radeon did and what amdgpu does for other GPUs with SMU7. Fixes: `9f4b35411c` ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-30 16:48:41 -04:00
Timur Kristóf	d784759c07	drm/amd/pm/ci: Fix powertune defaults for Hawaii 0x67B0 There is no AMD GPU with the ID 0x66B0, this looks like a typo. It should be 0x67B0 which is actually part of the PCI ID list, and should use the Hawaii XT powertune defaults according to the old radeon driver. Fixes: `9f4b35411c` ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-30 16:48:18 -04:00
Timur Kristóf	9f49e3d4cb	drm/amd/pm/smu7: Remove non-functional SMU7 voltage dependency on DAL It looks like this was written for an old version of DC (DAL) and was never adapted afterwards. This was non-functional because it relied on the "dal_power_level" field which was never assigned anywhere in the code base. Also, it was not implemented for CI ASICs. Now superseded by the newer voltage dependency on display clock table added by the previous commit, let's remove. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-30 16:48:11 -04:00
Timur Kristóf	0138610c14	drm/amd/pm/smu7: Fix SMU7 voltage dependency on display clock The DCE (display controller engine) requires a minimum voltage in order to function correctly, depending on which clock level it currently uses. Add a new table that contains display clock frequency levels and the corresponding required voltages. The clock frequency levels are taken from DC (and the old radeon driver's voltage dependency table for CI in cases where its values were lower). The voltage levels are taken from the following function: phm_initializa_dynamic_state_adjustment_rule_settings(). Furthermore, in case of CI, call smu7_patch_vddc() on the new table to account for leakage voltage (like in radeon). Use the display clock value from amd_pp_display_configuration to look up the voltage level needed by the DCE. Send the voltage to the SMU via the PPSMC_MSG_VddC_Request command. The previous implementation of this feature was non-functional because it relied on a "dal_power_level" field which was never assigned; and it was not at all implemented for CI ASICs. I verified this on a Radeon R9 M380 which previously booted to a black screen with DC enabled (default since Linux 6.19), but now works correctly. Fixes: `599a7e9fe1` ("drm/amd/powerplay: implement smu7 hwmgr to manager asics with smu ip version 7.") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-30 16:45:22 -04:00
Timur Kristóf	9851f29cb0	drm/amd/pm/ci: Disable MCLK DPM on problematic CI ASICs There are two known cases where MCLK DPM can cause issues: Radeon R9 M380 found in iMac computers from 2015. The SMU in this GPU just hangs as soon as we send it the PPSMC_MSG_MCLKDPM_Enable command, even when MCLK switching is disabled, and even when we only populate one MCLK DPM level. Apply workaround to all devices with the same subsystem ID. Radeon R7 260X due to old memory controller microcode. We only flash the MC ucode when it isn't set up by the VBIOS, therefore there is no way to make sure that it has the correct ucode version. I verified that this patch fixes the SMU hang on the R9 M380 which would previously fail to boot. This also fixes the UVD initialization error on that GPU which happened because the SMU couldn't ungate the UVD after it hung. Fixes: `86457c3b21` ("drm/amd/powerplay: Add support for CI asics to hwmgr") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-30 16:44:54 -04:00
Timur Kristóf	894f0d34d6	drm/amd/pm/ci: Use highest MCLK on CI when MCLK DPM is disabled When MCLK DPM is disabled for any reason, populate the MCLK table with the highest MCLK DPM level, so that the ASIC can use the highest possible memory clock to get good performance even when MCLK DPM is disabled. Fixes: `9f4b35411c` ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-30 16:44:16 -04:00
Lijo Lazar	964e532d58	drm/amd/pm: Unify version check in SMUv14 Use common helper function for firmware version check and logging in SMUv14 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-30 15:14:38 -04:00
Lijo Lazar	353f200825	drm/amd/pm: Unify version check in SMUv12 Use common helper function for firmware version check and logging in SMUv12. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-30 15:13:06 -04:00
Lijo Lazar	6b0a611628	drm/amd/pm: Unify version check in SMUv11 Use common helper function for firmware version check and logging in SMUv11 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-30 15:12:56 -04:00
Asad Kamal	3cfe23b6b0	drm/amd/pm: Use str_enabled_disabled in amdgpu_pm sysfs Coccinelle flags hand-rolled "enabled"/"disabled" strings; use the shared str_enabled_disabled() helper from string_choices.h for npm_status and thermal throttling logging sysfs text. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202603251434.zIN2QYWn-lkp@intel.com/ Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-30 14:33:48 -04:00
Jesse.Zhang	6728daa259	drm/amd/pm: Enable VCN reset for pgm=4 with appropriate FW version Extend the VCN reset capability to include pgm=4 variants when the firmware version meets the required threshold (>= 0x04557100). This follows the existing pattern for pgm=0 and pgm=7, ensuring that VCN reset is enabled only on configurations where it is supported by the firmware. Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <jesse.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-24 13:32:04 -04:00
Yang Wang	bdb2b9e1e0	drm/amd/pm: add dedicated dram addr msg for smu v15 Add dedicated SMU Dram MSG mapping to avoid conflicts in SMU IP v15 common code for upcoming ASICs. add new smu msg: - SMU_MSG_SetDriverDramAddr - SMU_MSG_SetToolsDramAddr Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-24 13:30:28 -04:00
Yang Wang	ab4905d466	drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v14 Forcibly disable the OD_FAN_CURVE feature when temperature or PWM range is invalid, otherwise PMFW will reject this configuration on smu v14.0.2/14.0.3. example: $ sudo cat /sys/bus/pci/devices/<BDF>/gpu_od/fan_ctrl/fan_curve OD_FAN_CURVE: 0: 0C 0% 1: 0C 0% 2: 0C 0% 3: 0C 0% 4: 0C 0% OD_RANGE: FAN_CURVE(hotspot temp): 0C 0C FAN_CURVE(fan speed): 0% 0% $ echo "0 50 40" | sudo tee fan_curve kernel log: [ 969.761627] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]! [ 1010.897800] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]! Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-24 13:30:20 -04:00
Yang Wang	470891606c	drm/amd/pm: disable OD_FAN_CURVE if temp or pwm range invalid for smu v13 Forcibly disable the OD_FAN_CURVE feature when temperature or PWM range is invalid, otherwise PMFW will reject this configuration on smu v13.0.x example: $ sudo cat /sys/bus/pci/devices/<BDF>/gpu_od/fan_ctrl/fan_curve OD_FAN_CURVE: 0: 0C 0% 1: 0C 0% 2: 0C 0% 3: 0C 0% 4: 0C 0% OD_RANGE: FAN_CURVE(hotspot temp): 0C 0C FAN_CURVE(fan speed): 0% 0% $ echo "0 50 40" | sudo tee fan_curve kernel log: [ 756.442527] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]! [ 777.345800] amdgpu 0000:03:00.0: amdgpu: Fan curve temp setting(50) must be within [0, 0]! Closes: https://github.com/ROCm/amdgpu/issues/208 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:21:39 -04:00
Asad Kamal	7aaa09ab30	drm/amd/pm: Enable user specified gfx clock ranges Enable user specified gfx clock ranges for smu_15_0_8 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:19:12 -04:00
Hawking Zhang	df4929d76a	drm/amdgpu: Add smu v15_0_8 ip block Add smu v15_0_8 ip block Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Likun Gao <Likun.Gao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:19:09 -04:00
Asad Kamal	6a609b800c	drm/amd/pm: Add NPM support for smu_v15_0_8 Add node power management support for smu_v15_0_8 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:19:07 -04:00
Asad Kamal	8847d59969	drm/amd/pm: Add baseboard temperature metrics support Add baseboard temperature metrics support via system metrics table for smu_v15_0_8 v4: Add separate function to fill baseboard temperature, use 16, remove casting v5: Optimize to use single switch case (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:19:03 -04:00
Asad Kamal	e3b96f5b20	drm/amd/pm: Add gpuboard temperature metrics support Add gpuboard temperature metrics support via system metrics table for smu_v15_0_8 v3: Use per sensor attr id (Lijo) v4: Use s16 for temp, remove cast, use separate function to fill gpuboard temperature metrics data (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:18:58 -04:00
Asad Kamal	1415503db0	drm/amd/pm: Add read sensor support Add read sensor support for smu_v15_0_8 v2: Remove gfx voltage support (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:18:55 -04:00
Asad Kamal	a6c2ecd95e	drm/amd/pm: Add ppt1 support Add ppt1 support for smu_v15_0_8 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:18:53 -04:00
Asad Kamal	bc296e6c95	drm/amd/pm: Add get_thermal_temperature_range support Add get_thermal_temperature_range support smu_v15_0_8 v2: Remove sriov check (Lijo) v3: Restrict to 1VF mode(Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:18:46 -04:00
Asad Kamal	422b399b09	drm/amd/pm: Add od_edit_dpm_table support Add od_edit_dpm_table support for smu_v15_0_8 v2: Skip Gl2clk/Fclk (Lijo) v3: squash in set_performance_support (Asad) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:18:43 -04:00
Asad Kamal	c7de5a863c	drm/amd/pm: add populate_umd_state_clk support add populate_umd_state_clk support for smu 15.0.8 v2: remove gl2clk/socclk/fclk, restrict to only current min/max (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:18:36 -04:00
Asad Kamal	bf39c461ad	drm/amd/pm: Add emit clock support Add emit clock support and fetching other metrics data like temperature, clock for smu_v15_0_8 v2: Use umc count for hbm stack temperature (Lijo) v3: Use correct logic for hbm stacks (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:18:11 -04:00
Yang Wang	e20e47bcb3	drm/amd/pm: add set{get}_power_limit support for smu 15.0.8 export .set_power_limit & .get_power_limit interface for smu 15.0.8 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:18:07 -04:00
Yang Wang	39e0a73bde	drm/amd/pm: add get_unique_id support for smu 15.0.8 export .get_unique_id interface for smu 15.0.8 Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:18:03 -04:00
Yang Wang	0a66ca3b35	drm/amd/pm: add get_gpu_metrics support for 15.0.8 export .get_gpu_metrics interface for 15.0.8 v2: Remove members already exposed by other interfaces, use mask, logical conversion (Lijo) v3: Use correct logic for hbm stacks loop (Lijo) Remove buffer allocation v4: Make out of bound check outside loop (Lijo) v5: fix locking in error case (Alex) Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:17:59 -04:00
Asad Kamal	6bcea37ff2	drm/amd/pm: Add get_pm_metrics support for smu 15.0.8 export .get_pm_metrics interface for smu 15.0.8. v2: Make tmo as unsigned (Lijo) Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:17:56 -04:00
Asad Kamal	34b19dab0c	drm/amd/pm: Add default dpm table support for smu 15.0.8 Add default dpm table support for smu 15.0.8 v2: Remove lclk, move pptable check up, add missing clk (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2026-03-23 14:17:53 -04:00
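
Commit a6d561a88c (mode2 reset ACK handling on aldebaran) describes a wait loop that only runs while ret is -ETIME, and a response poll that returns 0 on success. The control flow it describes can be sketched roughly as below. This is a standalone illustration, not the driver code: struct fake_smu, fake_wait_response() and wait_for_reset_ack() are hypothetical names invented here, and the poll budget is an assumed parameter.

```c
#include <errno.h>

/* Hypothetical stand-in for the SMU: counts how many polls still time out. */
struct fake_smu {
	int polls_until_ack;
};

/*
 * Hypothetical stand-in for the response poll described in the commit:
 * returns 0 on success and a negative error code (-ETIME) while pending.
 */
static int fake_wait_response(struct fake_smu *smu)
{
	if (smu->polls_until_ack > 0) {
		smu->polls_until_ack--;
		return -ETIME;
	}
	return 0;
}

/*
 * Wait for the reset acknowledgment.
 * ret is seeded with -ETIME so the loop body runs at least once even
 * though the preceding asynchronous send returned 0.
 */
static int wait_for_reset_ack(struct fake_smu *smu, int max_polls)
{
	int ret = -ETIME;

	while (ret == -ETIME && max_polls-- > 0)
		ret = fake_wait_response(smu);

	/* Success is ret == 0; comparing against 1 would never match. */
	return ret;
}
```

Seeding ret before the loop is the part the commit message calls out: without it, a successful send leaves ret at 0 and the acknowledgment is never awaited.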


2268 Commits
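
Commit 592713a896 (mem_busy_percent) says that firmware-reported values such as 0xfffe (-2) and 0xffff (-1) are treated as invalid and clamped to 0. A minimal sketch of that kind of sanitizing step follows. The function name sanitize_busy_percent() and the 16-bit raw type are assumptions made for illustration; the commit message gives only these two values as examples, so the actual driver check may be broader.

```c
#include <stdint.h>

/*
 * Hypothetical helper: map the two sentinel values named in the commit
 * message to 0 and pass every other reading through unchanged.
 */
static uint32_t sanitize_busy_percent(uint16_t raw)
{
	if (raw == 0xfffe || raw == 0xffff)
		return 0;

	return raw;
}
```

With this helper, a reading of 42 stays 42, while either sentinel is reported as 0 rather than as a five-digit percentage.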