Merge tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux

Pull amd-pstate new content for 7.1 (2026-04-02) from Mario Limonciello:

"Add support for new features:
  * CPPC performance priority
  * Dynamic EPP
  * Raw EPP
  * New unit tests for new features
 Fixes for:
  * PREEMPT_RT
  * sysfs files being present when HW missing
  * Broken/outdated documentation"

* tag 'amd-pstate-v7.1-2026-04-02' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/superm1/linux: (22 commits)
  MAINTAINERS: amd-pstate: Step down as maintainer, add Prateek as reviewer
  cpufreq: Pass the policy to cpufreq_driver->adjust_perf()
  cpufreq/amd-pstate: Pass the policy to amd_pstate_update()
  cpufreq/amd-pstate-ut: Add a unit test for raw EPP
  cpufreq/amd-pstate: Add support for raw EPP writes
  cpufreq/amd-pstate: Add support for platform profile class
  cpufreq/amd-pstate: add kernel command line to override dynamic epp
  cpufreq/amd-pstate: Add dynamic energy performance preference
  Documentation: amd-pstate: fix dead links in the reference section
  cpufreq/amd-pstate: Cache the max frequency in cpudata
  Documentation/amd-pstate: Add documentation for amd_pstate_floor_{freq,count}
  Documentation/amd-pstate: List amd_pstate_prefcore_ranking sysfs file
  Documentation/amd-pstate: List amd_pstate_hw_prefcore sysfs file
  amd-pstate-ut: Add a testcase to validate the visibility of driver attributes
  amd-pstate-ut: Add module parameter to select testcases
  amd-pstate: Introduce a tracepoint trace_amd_pstate_cppc_req2()
  amd-pstate: Add sysfs support for floor_freq and floor_count
  amd-pstate: Add support for CPPC_REQ2 and FLOOR_PERF
  x86/cpufeatures: Add AMD CPPC Performance Priority feature.
  amd-pstate: Make certain freq_attrs conditionally visible
  ...
Rafael J. Wysocki
2026-04-04 20:55:56 +02:00
17 changed files with 1009 additions and 118 deletions

View File

@@ -493,6 +493,13 @@ Kernel parameters
disable
Disable amd-pstate preferred core.
amd_dynamic_epp=
[X86]
disable
Disable amd-pstate dynamic EPP.
enable
Enable amd-pstate dynamic EPP.
amijoy.map= [HW,JOY] Amiga joystick support
Map of devices attached to JOY0DAT and JOY1DAT
Format: <a>,<b>

View File

@@ -239,8 +239,12 @@ control its functionality at the system level. They are located in the
root@hr-test1:/home/ray# ls /sys/devices/system/cpu/cpufreq/policy0/*amd*
/sys/devices/system/cpu/cpufreq/policy0/amd_pstate_highest_perf
/sys/devices/system/cpu/cpufreq/policy0/amd_pstate_hw_prefcore
/sys/devices/system/cpu/cpufreq/policy0/amd_pstate_lowest_nonlinear_freq
/sys/devices/system/cpu/cpufreq/policy0/amd_pstate_max_freq
/sys/devices/system/cpu/cpufreq/policy0/amd_pstate_floor_freq
/sys/devices/system/cpu/cpufreq/policy0/amd_pstate_floor_count
/sys/devices/system/cpu/cpufreq/policy0/amd_pstate_prefcore_ranking
``amd_pstate_highest_perf / amd_pstate_max_freq``
@@ -264,14 +268,46 @@ This attribute is read-only.
``amd_pstate_hw_prefcore``
Whether the platform supports the preferred core feature and it has been
enabled. This attribute is read-only.
Whether the platform supports the preferred core feature and it has
been enabled. This attribute is read-only. This file is only visible
on platforms which support the preferred core feature.
``amd_pstate_prefcore_ranking``
The performance ranking of the core. This number doesn't have any unit, but
larger numbers are preferred at the time of reading. This can change at
runtime based on platform conditions. This attribute is read-only.
runtime based on platform conditions. This attribute is read-only. This file
is only visible on platforms which support the preferred core feature.
``amd_pstate_floor_freq``
The floor frequency associated with each CPU. Userspace can write any
value between ``cpuinfo_min_freq`` and ``scaling_max_freq`` into this
file. When the system is under power or thermal constraints, the
platform firmware will attempt to throttle the CPU frequency to the
value specified in ``amd_pstate_floor_freq`` before throttling it
further. This allows userspace to specify different floor frequencies
for different CPUs. For optimal results, threads of the same core
should have the same floor frequency value. This file is only visible
on platforms that support the CPPC Performance Priority feature.
``amd_pstate_floor_count``
The number of distinct Floor Performance levels supported by the
platform. For example, if this value is 2, then the number of unique
values obtained from the command ``cat
/sys/devices/system/cpu/cpufreq/policy*/amd_pstate_floor_freq |
sort -n | uniq`` should be at most this number for the behavior
described in ``amd_pstate_floor_freq`` to take effect. A zero value
implies that the platform supports unlimited floor performance levels.
This file is only visible on platforms that support the CPPC
Performance Priority feature.
**Note**: When ``amd_pstate_floor_count`` is non-zero, the frequency to
which the CPU is throttled under power or thermal constraints is
undefined when the number of unique values of ``amd_pstate_floor_freq``
across all CPUs in the system exceeds ``amd_pstate_floor_count``.
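The floor interface can be exercised entirely from userspace. The following
minimal C sketch (illustrative only, not part of the patch set) picks the
midpoint of the ``cpuinfo_min_freq``..``scaling_max_freq`` window for policy0
and writes it to ``amd_pstate_floor_freq``; the policy number and the midpoint
choice are arbitrary assumptions and error handling is kept minimal::

  /* floor_freq_demo.c - sketch of setting a per-CPU floor frequency.
   * The sysfs paths follow the description above; policy0 and the
   * midpoint value are illustrative only.
   */
  #include <stdio.h>

  static long read_khz(const char *path)
  {
      FILE *f = fopen(path, "r");
      long val = -1;

      if (!f)
          return -1;
      if (fscanf(f, "%ld", &val) != 1)
          val = -1;
      fclose(f);
      return val;
  }

  int main(void)
  {
      const char *base = "/sys/devices/system/cpu/cpufreq/policy0";
      char path[256];
      long min_khz, max_khz, floor_khz;
      FILE *f;

      snprintf(path, sizeof(path), "%s/cpuinfo_min_freq", base);
      min_khz = read_khz(path);
      snprintf(path, sizeof(path), "%s/scaling_max_freq", base);
      max_khz = read_khz(path);
      if (min_khz < 0 || max_khz < 0) {
          fprintf(stderr, "cpufreq sysfs not available\n");
          return 1;
      }

      /* pick a floor halfway between the allowed bounds (arbitrary) */
      floor_khz = min_khz + (max_khz - min_khz) / 2;

      snprintf(path, sizeof(path), "%s/amd_pstate_floor_freq", base);
      f = fopen(path, "w");
      if (!f) {
          perror("amd_pstate_floor_freq");
          return 1;
      }
      fprintf(f, "%ld\n", floor_khz);
      if (fclose(f)) {
          perror("amd_pstate_floor_freq write");
          return 1;
      }

      printf("requested a floor of %ld kHz on policy0\n", floor_khz);
      return 0;
  }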
``energy_performance_available_preferences``
@@ -280,16 +316,22 @@ A list of all the supported EPP preferences that could be used for
These profiles represent different hints that are provided
to the low-level firmware about the user's desired energy vs efficiency
tradeoff. ``default`` represents the epp value is set by platform
firmware. This attribute is read-only.
firmware. ``custom`` designates that integer values 0-255 may be written
as well. This attribute is read-only.
``energy_performance_preference``
The current energy performance preference can be read from this attribute.
and user can change current preference according to energy or performance needs
Please get all support profiles list from
``energy_performance_available_preferences`` attribute, all the profiles are
integer values defined between 0 to 255 when EPP feature is enabled by platform
firmware, if EPP feature is disabled, driver will ignore the written value
Coarse named profiles are available in the attribute
``energy_performance_available_preferences``.
Users can also write individual integer values between 0 and 255.
When dynamic EPP is enabled, writes to ``energy_performance_preference`` are
blocked even when the EPP feature is enabled by platform firmware. Lower EPP
values shift the bias towards improved performance, while higher EPP values
shift the bias towards power savings. The exact impact can change from one
platform to another.
If a valid integer was last written, then a number will be returned on future reads.
If a valid string was last written, then a string will be returned on future reads.
This attribute is read-write.
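A short illustrative sketch of the raw EPP semantics described above follows;
policy0 and the value 96 are arbitrary choices for the example, and the write
is expected to fail when dynamic EPP is enabled or the performance policy is
active::

  /* epp_raw_demo.c - sketch of a raw EPP write and read-back.
   * policy0 and the value 96 are illustrative; valid raw values are 0-255
   * and lower values bias the platform towards performance.
   */
  #include <stdio.h>

  #define EPP_PATH \
      "/sys/devices/system/cpu/cpufreq/policy0/energy_performance_preference"

  int main(void)
  {
      char buf[64];
      FILE *f;

      f = fopen(EPP_PATH, "w");
      if (!f) {
          perror("open for write");
          return 1;
      }
      fprintf(f, "%d\n", 96);
      if (fclose(f)) {
          perror("EPP write");
          return 1;
      }

      f = fopen(EPP_PATH, "r");
      if (!f) {
          perror("open for read");
          return 1;
      }
      /* because an integer was last written, a number is read back */
      if (fgets(buf, sizeof(buf), f))
          printf("current EPP: %s", buf);
      fclose(f);
      return 0;
  }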
``boost``
@@ -311,6 +353,24 @@ boost or `1` to enable it, for the respective CPU using the sysfs path
Other performance and frequency values can be read back from
``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.
Dynamic energy performance profile
==================================
The amd-pstate driver supports dynamically selecting the energy performance
profile based on whether the machine is running on AC or DC power.
Whether this behavior is enabled by default depends on the kernel config
option ``CONFIG_X86_AMD_PSTATE_DYNAMIC_EPP``. This behavior can also be
overridden at runtime via the global sysfs file
``/sys/devices/system/cpu/amd_pstate/dynamic_epp``.
When set to enabled, the driver will select a different energy performance
profile depending on whether the machine is running on battery or AC power.
The driver will also register with the platform profile handler to receive
notifications of the user's desired power state and react to those.
When set to disabled, the driver will not change the energy performance profile
based on the power source and will not react to the user's desired power state.
Attempting to manually write to the ``energy_performance_preference`` sysfs
file will fail when ``dynamic_epp`` is enabled.
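An illustrative sketch of how userspace might observe this behavior is shown
below; the global ``dynamic_epp`` location and the probe of policy0 are
assumptions for the example rather than part of the patch set::

  /* dynamic_epp_demo.c - sketch of probing the dynamic EPP state.
   * The global dynamic_epp path and policy0 are assumptions for this
   * example; the EPP write below is expected to be rejected (-EBUSY)
   * while dynamic EPP is enabled.
   */
  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
      char state[32] = "";
      FILE *f;

      f = fopen("/sys/devices/system/cpu/amd_pstate/dynamic_epp", "r");
      if (!f) {
          perror("dynamic_epp");
          return 1;
      }
      if (fgets(state, sizeof(state), f))
          state[strcspn(state, "\n")] = '\0';
      fclose(f);
      printf("dynamic EPP is %s\n", state);

      if (!strcmp(state, "enabled")) {
          f = fopen("/sys/devices/system/cpu/cpufreq/policy0/"
                    "energy_performance_preference", "w");
          if (f) {
              int bad = fprintf(f, "power\n") < 0;

              bad |= fclose(f) != 0;
              if (bad)
                  perror("EPP write rejected while dynamic EPP is on");
          }
      }
      return 0;
  }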
``amd-pstate`` vs ``acpi-cpufreq``
======================================
@@ -422,6 +482,13 @@ For systems that support ``amd-pstate`` preferred core, the core rankings will
always be advertised by the platform. But OS can choose to ignore that via the
kernel parameter ``amd_prefcore=disable``.
``amd_dynamic_epp``
When AMD pstate is in auto mode, dynamic EPP will control whether the kernel
autonomously changes the EPP mode. The default is configured by
``CONFIG_X86_AMD_PSTATE_DYNAMIC_EPP`` but can be explicitly enabled with
``amd_dynamic_epp=enable`` or disabled with ``amd_dynamic_epp=disable``.
User Space Interface in ``sysfs`` - General
===========================================
@@ -790,13 +857,13 @@ Reference
===========
.. [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming,
https://www.amd.com/system/files/TechDocs/24593.pdf
https://docs.amd.com/v/u/en-US/24593_3.44_APM_Vol2
.. [2] Advanced Configuration and Power Interface Specification,
https://uefi.org/sites/default/files/resources/ACPI_Spec_6_4_Jan22.pdf
.. [3] Processor Programming Reference (PPR) for AMD Family 19h Model 51h, Revision A1 Processors
https://www.amd.com/system/files/TechDocs/56569-A1-PUB.zip
https://docs.amd.com/v/u/en-US/56569-A1-PUB_3.03
.. [4] Linux Kernel Selftests,
https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html

View File

@@ -1234,9 +1234,9 @@ F: drivers/gpu/drm/amd/pm/
AMD PSTATE DRIVER
M: Huang Rui <ray.huang@amd.com>
M: Gautham R. Shenoy <gautham.shenoy@amd.com>
M: Mario Limonciello <mario.limonciello@amd.com>
R: Perry Yuan <perry.yuan@amd.com>
R: K Prateek Nayak <kprateek.nayak@amd.com>
L: linux-pm@vger.kernel.org
S: Supported
F: Documentation/admin-guide/pm/amd-pstate.rst

View File

@@ -415,7 +415,7 @@
*/
#define X86_FEATURE_OVERFLOW_RECOV (17*32+ 0) /* "overflow_recov" MCA overflow recovery support */
#define X86_FEATURE_SUCCOR (17*32+ 1) /* "succor" Uncorrectable error containment and recovery */
#define X86_FEATURE_CPPC_PERF_PRIO (17*32+ 2) /* CPPC Floor Perf support */
#define X86_FEATURE_SMCA (17*32+ 3) /* "smca" Scalable MCA */
/* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */

View File

@@ -765,12 +765,14 @@
#define MSR_AMD_CPPC_CAP2 0xc00102b2
#define MSR_AMD_CPPC_REQ 0xc00102b3
#define MSR_AMD_CPPC_STATUS 0xc00102b4
#define MSR_AMD_CPPC_REQ2 0xc00102b5
/* Masks for use with MSR_AMD_CPPC_CAP1 */
#define AMD_CPPC_LOWEST_PERF_MASK GENMASK(7, 0)
#define AMD_CPPC_LOWNONLIN_PERF_MASK GENMASK(15, 8)
#define AMD_CPPC_NOMINAL_PERF_MASK GENMASK(23, 16)
#define AMD_CPPC_HIGHEST_PERF_MASK GENMASK(31, 24)
#define AMD_CPPC_FLOOR_PERF_CNT_MASK GENMASK_ULL(39, 32)
/* Masks for use with MSR_AMD_CPPC_REQ */
#define AMD_CPPC_MAX_PERF_MASK GENMASK(7, 0)
@@ -778,6 +780,9 @@
#define AMD_CPPC_DES_PERF_MASK GENMASK(23, 16)
#define AMD_CPPC_EPP_PERF_MASK GENMASK(31, 24)
/* Masks for use with MSR_AMD_CPPC_REQ2 */
#define AMD_CPPC_FLOOR_PERF_MASK GENMASK(7, 0)
/* AMD Performance Counter Global Status and Control MSRs */
#define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS 0xc0000300
#define MSR_AMD64_PERF_CNTR_GLOBAL_CTL 0xc0000301
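For illustration of the new register layout (not from the patch itself), the
sketch below unpacks the floor-perf fields with plain shifts equivalent to the
GENMASK/FIELD_GET helpers; the register values are made-up examples rather
than values read from hardware.

/* cppc_floor_layout.c - sketch of the CPPC CAP1/REQ2 floor-perf bit layout.
 * Plain shifts stand in for the kernel's FIELD_GET(); the sample values
 * are invented for illustration, not read from an MSR.
 */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* CAP1 example: lowest=0x10, lownonlin=0x55, nominal=0xb4,
     * highest=0xe6, floor_perf_cnt=0x02 in bits 39:32 */
    uint64_t cap1 = 0x02e6b45510ULL;
    /* REQ2 example: floor_perf in bits 7:0 */
    uint64_t req2 = 0x80ULL;

    uint8_t floor_perf_cnt = (cap1 >> 32) & 0xff; /* AMD_CPPC_FLOOR_PERF_CNT_MASK */
    uint8_t floor_perf = req2 & 0xff;             /* AMD_CPPC_FLOOR_PERF_MASK */

    printf("floor_perf_cnt=%u floor_perf=%u\n", floor_perf_cnt, floor_perf);
    return 0;
}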

View File

@@ -52,6 +52,7 @@ static const struct cpuid_bit cpuid_bits[] = {
{ X86_FEATURE_CPB, CPUID_EDX, 9, 0x80000007, 0 },
{ X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
{ X86_FEATURE_AMD_FAST_CPPC, CPUID_EDX, 15, 0x80000007, 0 },
{ X86_FEATURE_CPPC_PERF_PRIO, CPUID_EDX, 16, 0x80000007, 0 },
{ X86_FEATURE_MBA, CPUID_EBX, 6, 0x80000008, 0 },
{ X86_FEATURE_X2AVIC_EXT, CPUID_ECX, 6, 0x8000000a, 0 },
{ X86_FEATURE_COHERENCY_SFW_NO, CPUID_EBX, 31, 0x8000001f, 0 },

View File

@@ -40,6 +40,7 @@ config X86_AMD_PSTATE
select ACPI_PROCESSOR
select ACPI_CPPC_LIB if X86_64
select CPU_FREQ_GOV_SCHEDUTIL if SMP
select ACPI_PLATFORM_PROFILE
help
This driver adds a CPUFreq driver which utilizes a fine grain
processor performance frequency control range instead of legacy
@@ -68,6 +69,18 @@ config X86_AMD_PSTATE_DEFAULT_MODE
For details, take a look at:
<file:Documentation/admin-guide/pm/amd-pstate.rst>.
config X86_AMD_PSTATE_DYNAMIC_EPP
bool "AMD Processor P-State dynamic EPP support"
depends on X86_AMD_PSTATE
default n
help
Allow the kernel to dynamically change the energy performance
preference (EPP) value in response to events such as ACPI platform
profile changes and AC adapter plug/unplug events.
This behavior can also be changed at runtime; this configuration
option only sets the kernel's default behavior.
config X86_AMD_PSTATE_UT
tristate "selftest for AMD Processor P-State driver"
depends on X86 && ACPI_PROCESSOR

View File

@@ -133,6 +133,41 @@ TRACE_EVENT(amd_pstate_epp_perf,
)
);
TRACE_EVENT(amd_pstate_cppc_req2,
TP_PROTO(unsigned int cpu_id,
u8 floor_perf,
bool changed,
int err_code
),
TP_ARGS(cpu_id,
floor_perf,
changed,
err_code),
TP_STRUCT__entry(
__field(unsigned int, cpu_id)
__field(u8, floor_perf)
__field(bool, changed)
__field(int, err_code)
),
TP_fast_assign(
__entry->cpu_id = cpu_id;
__entry->floor_perf = floor_perf;
__entry->changed = changed;
__entry->err_code = err_code;
),
TP_printk("cpu%u: floor_perf=%u, changed=%u (error = %d)",
__entry->cpu_id,
__entry->floor_perf,
__entry->changed,
__entry->err_code
)
);
#endif /* _AMD_PSTATE_TRACE_H */
/* This part must be outside protection */

View File

@@ -23,9 +23,12 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/bitfield.h>
#include <linux/cpufeature.h>
#include <linux/cpufreq.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/mm.h>
#include <linux/fs.h>
#include <linux/cleanup.h>
@@ -35,6 +38,11 @@
#include "amd-pstate.h"
static char *test_list;
module_param(test_list, charp, 0444);
MODULE_PARM_DESC(test_list,
"Comma-delimited list of tests to run (empty means run all tests)");
DEFINE_FREE(cleanup_page, void *, if (_T) free_page((unsigned long)_T))
struct amd_pstate_ut_struct {
const char *name;
@@ -48,16 +56,39 @@ static int amd_pstate_ut_acpi_cpc_valid(u32 index);
static int amd_pstate_ut_check_enabled(u32 index);
static int amd_pstate_ut_check_perf(u32 index);
static int amd_pstate_ut_check_freq(u32 index);
static int amd_pstate_ut_epp(u32 index);
static int amd_pstate_ut_check_driver(u32 index);
static int amd_pstate_ut_check_freq_attrs(u32 index);
static struct amd_pstate_ut_struct amd_pstate_ut_cases[] = {
{"amd_pstate_ut_acpi_cpc_valid", amd_pstate_ut_acpi_cpc_valid },
{"amd_pstate_ut_check_enabled", amd_pstate_ut_check_enabled },
{"amd_pstate_ut_check_perf", amd_pstate_ut_check_perf },
{"amd_pstate_ut_check_freq", amd_pstate_ut_check_freq },
{"amd_pstate_ut_check_driver", amd_pstate_ut_check_driver }
{"amd_pstate_ut_acpi_cpc_valid", amd_pstate_ut_acpi_cpc_valid },
{"amd_pstate_ut_check_enabled", amd_pstate_ut_check_enabled },
{"amd_pstate_ut_check_perf", amd_pstate_ut_check_perf },
{"amd_pstate_ut_check_freq", amd_pstate_ut_check_freq },
{"amd_pstate_ut_epp", amd_pstate_ut_epp },
{"amd_pstate_ut_check_driver", amd_pstate_ut_check_driver },
{"amd_pstate_ut_check_freq_attrs", amd_pstate_ut_check_freq_attrs },
};
static bool test_in_list(const char *list, const char *name)
{
size_t name_len = strlen(name);
const char *p = list;
while (*p) {
const char *sep = strchr(p, ',');
size_t token_len = sep ? sep - p : strlen(p);
if (token_len == name_len && !strncmp(p, name, token_len))
return true;
if (!sep)
break;
p = sep + 1;
}
return false;
}
static bool get_shared_mem(void)
{
bool result = false;
@@ -241,6 +272,111 @@ static int amd_pstate_set_mode(enum amd_pstate_mode mode)
return amd_pstate_update_status(mode_str, strlen(mode_str));
}
static int amd_pstate_ut_epp(u32 index)
{
struct cpufreq_policy *policy __free(put_cpufreq_policy) = NULL;
char *buf __free(cleanup_page) = NULL;
static const char * const epp_strings[] = {
"performance",
"balance_performance",
"balance_power",
"power",
};
struct amd_cpudata *cpudata;
enum amd_pstate_mode orig_mode;
bool orig_dynamic_epp;
int ret, cpu = 0;
int i;
u16 epp;
policy = cpufreq_cpu_get(cpu);
if (!policy)
return -ENODEV;
cpudata = policy->driver_data;
orig_mode = amd_pstate_get_status();
orig_dynamic_epp = cpudata->dynamic_epp;
/* disable dynamic EPP before running test */
if (cpudata->dynamic_epp) {
pr_debug("Dynamic EPP is enabled, disabling it\n");
amd_pstate_clear_dynamic_epp(policy);
}
buf = (char *)__get_free_page(GFP_KERNEL);
if (!buf)
return -ENOMEM;
ret = amd_pstate_set_mode(AMD_PSTATE_ACTIVE);
if (ret)
goto out;
for (epp = 0; epp <= U8_MAX; epp++) {
u8 val;
/* write all EPP values */
memset(buf, 0, PAGE_SIZE);
snprintf(buf, PAGE_SIZE, "%d", epp);
ret = store_energy_performance_preference(policy, buf, strlen(buf));
if (ret < 0)
goto out;
/* check if the EPP value reads back correctly for raw numbers */
memset(buf, 0, PAGE_SIZE);
ret = show_energy_performance_preference(policy, buf);
if (ret < 0)
goto out;
strreplace(buf, '\n', '\0');
ret = kstrtou8(buf, 0, &val);
if (!ret && epp != val) {
pr_err("Raw EPP value mismatch: %d != %d\n", epp, val);
ret = -EINVAL;
goto out;
}
}
for (i = 0; i < ARRAY_SIZE(epp_strings); i++) {
memset(buf, 0, PAGE_SIZE);
snprintf(buf, PAGE_SIZE, "%s", epp_strings[i]);
ret = store_energy_performance_preference(policy, buf, strlen(buf));
if (ret < 0)
goto out;
memset(buf, 0, PAGE_SIZE);
ret = show_energy_performance_preference(policy, buf);
if (ret < 0)
goto out;
strreplace(buf, '\n', '\0');
if (strcmp(buf, epp_strings[i])) {
pr_err("String EPP value mismatch: %s != %s\n", buf, epp_strings[i]);
ret = -EINVAL;
goto out;
}
}
ret = 0;
out:
if (orig_dynamic_epp) {
int ret2;
ret2 = amd_pstate_set_mode(AMD_PSTATE_DISABLE);
if (!ret && ret2)
ret = ret2;
}
if (orig_mode != amd_pstate_get_status()) {
int ret2;
ret2 = amd_pstate_set_mode(orig_mode);
if (!ret && ret2)
ret = ret2;
}
return ret;
}
static int amd_pstate_ut_check_driver(u32 index)
{
enum amd_pstate_mode mode1, mode2 = AMD_PSTATE_DISABLE;
@@ -270,12 +406,143 @@ static int amd_pstate_ut_check_driver(u32 index)
return ret;
}
enum attr_category {
ATTR_ALWAYS,
ATTR_PREFCORE,
ATTR_EPP,
ATTR_FLOOR_FREQ,
};
static const struct {
const char *name;
enum attr_category category;
} expected_freq_attrs[] = {
{"amd_pstate_max_freq", ATTR_ALWAYS},
{"amd_pstate_lowest_nonlinear_freq", ATTR_ALWAYS},
{"amd_pstate_highest_perf", ATTR_ALWAYS},
{"amd_pstate_prefcore_ranking", ATTR_PREFCORE},
{"amd_pstate_hw_prefcore", ATTR_PREFCORE},
{"energy_performance_preference", ATTR_EPP},
{"energy_performance_available_preferences", ATTR_EPP},
{"amd_pstate_floor_freq", ATTR_FLOOR_FREQ},
{"amd_pstate_floor_count", ATTR_FLOOR_FREQ},
};
static bool attr_in_driver(struct freq_attr **driver_attrs, const char *name)
{
int j;
for (j = 0; driver_attrs[j]; j++) {
if (!strcmp(driver_attrs[j]->attr.name, name))
return true;
}
return false;
}
/*
* Verify that for each mode the driver's live ->attr array contains exactly
* the attributes that should be visible. Expected visibility is derived
* independently from hw_prefcore, cpu features, and the current mode,
* not from the driver's own visibility functions.
*/
static int amd_pstate_ut_check_freq_attrs(u32 index)
{
enum amd_pstate_mode orig_mode = amd_pstate_get_status();
static const enum amd_pstate_mode modes[] = {
AMD_PSTATE_PASSIVE, AMD_PSTATE_ACTIVE, AMD_PSTATE_GUIDED,
};
bool has_prefcore, has_floor_freq;
int m, i, ret;
has_floor_freq = cpu_feature_enabled(X86_FEATURE_CPPC_PERF_PRIO);
/*
* Determine prefcore support from any online CPU's cpudata.
* hw_prefcore reflects the platform-wide decision made at init.
*/
has_prefcore = false;
for_each_online_cpu(i) {
struct cpufreq_policy *policy __free(put_cpufreq_policy) = NULL;
struct amd_cpudata *cpudata;
policy = cpufreq_cpu_get(i);
if (!policy)
continue;
cpudata = policy->driver_data;
has_prefcore = cpudata->hw_prefcore;
break;
}
for (m = 0; m < ARRAY_SIZE(modes); m++) {
struct freq_attr **driver_attrs;
ret = amd_pstate_set_mode(modes[m]);
if (ret)
goto out;
driver_attrs = amd_pstate_get_current_attrs();
if (!driver_attrs) {
pr_err("%s: no driver attrs in mode %s\n",
__func__, amd_pstate_get_mode_string(modes[m]));
ret = -EINVAL;
goto out;
}
for (i = 0; i < ARRAY_SIZE(expected_freq_attrs); i++) {
bool expected, found;
switch (expected_freq_attrs[i].category) {
case ATTR_ALWAYS:
expected = true;
break;
case ATTR_PREFCORE:
expected = has_prefcore;
break;
case ATTR_EPP:
expected = (modes[m] == AMD_PSTATE_ACTIVE);
break;
case ATTR_FLOOR_FREQ:
expected = has_floor_freq;
break;
default:
expected = false;
break;
}
found = attr_in_driver(driver_attrs,
expected_freq_attrs[i].name);
if (expected != found) {
pr_err("%s: mode %s: attr %s expected %s but is %s\n",
__func__,
amd_pstate_get_mode_string(modes[m]),
expected_freq_attrs[i].name,
expected ? "visible" : "hidden",
found ? "visible" : "hidden");
ret = -EINVAL;
goto out;
}
}
}
ret = 0;
out:
amd_pstate_set_mode(orig_mode);
return ret;
}
static int __init amd_pstate_ut_init(void)
{
u32 i = 0, arr_size = ARRAY_SIZE(amd_pstate_ut_cases);
for (i = 0; i < arr_size; i++) {
int ret = amd_pstate_ut_cases[i].func(i);
int ret;
if (test_list && *test_list &&
!test_in_list(test_list, amd_pstate_ut_cases[i].name))
continue;
ret = amd_pstate_ut_cases[i].func(i);
if (ret)
pr_err("%-4d %-20s\t fail: %d!\n", i+1, amd_pstate_ut_cases[i].name, ret);

View File

@@ -36,6 +36,7 @@
#include <linux/io.h>
#include <linux/delay.h>
#include <linux/uaccess.h>
#include <linux/power_supply.h>
#include <linux/static_call.h>
#include <linux/topology.h>
@@ -86,6 +87,11 @@ static struct cpufreq_driver amd_pstate_driver;
static struct cpufreq_driver amd_pstate_epp_driver;
static int cppc_state = AMD_PSTATE_UNDEFINED;
static bool amd_pstate_prefcore = true;
#ifdef CONFIG_X86_AMD_PSTATE_DYNAMIC_EPP
static bool dynamic_epp = CONFIG_X86_AMD_PSTATE_DYNAMIC_EPP;
#else
static bool dynamic_epp;
#endif
static struct quirk_entry *quirks;
/*
@@ -103,6 +109,7 @@ static struct quirk_entry *quirks;
* 2 balance_performance
* 3 balance_power
* 4 power
* 5 custom (for raw EPP values)
*/
enum energy_perf_value_index {
EPP_INDEX_DEFAULT = 0,
@@ -110,6 +117,7 @@ enum energy_perf_value_index {
EPP_INDEX_BALANCE_PERFORMANCE,
EPP_INDEX_BALANCE_POWERSAVE,
EPP_INDEX_POWERSAVE,
EPP_INDEX_CUSTOM,
EPP_INDEX_MAX,
};
@@ -119,6 +127,7 @@ static const char * const energy_perf_strings[] = {
[EPP_INDEX_BALANCE_PERFORMANCE] = "balance_performance",
[EPP_INDEX_BALANCE_POWERSAVE] = "balance_power",
[EPP_INDEX_POWERSAVE] = "power",
[EPP_INDEX_CUSTOM] = "custom",
};
static_assert(ARRAY_SIZE(energy_perf_strings) == EPP_INDEX_MAX);
@@ -129,7 +138,7 @@ static unsigned int epp_values[] = {
[EPP_INDEX_BALANCE_POWERSAVE] = AMD_CPPC_EPP_BALANCE_POWERSAVE,
[EPP_INDEX_POWERSAVE] = AMD_CPPC_EPP_POWERSAVE,
};
static_assert(ARRAY_SIZE(epp_values) == EPP_INDEX_MAX);
static_assert(ARRAY_SIZE(epp_values) == EPP_INDEX_MAX - 1);
typedef int (*cppc_mode_transition_fn)(int);
@@ -261,7 +270,6 @@ static int msr_update_perf(struct cpufreq_policy *policy, u8 min_perf,
if (fast_switch) {
wrmsrq(MSR_AMD_CPPC_REQ, value);
return 0;
} else {
int ret = wrmsrq_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ, value);
@@ -330,6 +338,75 @@ static inline int amd_pstate_set_epp(struct cpufreq_policy *policy, u8 epp)
return static_call(amd_pstate_set_epp)(policy, epp);
}
static int amd_pstate_set_floor_perf(struct cpufreq_policy *policy, u8 perf)
{
struct amd_cpudata *cpudata = policy->driver_data;
u64 value, prev;
bool changed;
int ret;
if (!cpu_feature_enabled(X86_FEATURE_CPPC_PERF_PRIO))
return 0;
value = prev = READ_ONCE(cpudata->cppc_req2_cached);
FIELD_MODIFY(AMD_CPPC_FLOOR_PERF_MASK, &value, perf);
changed = value != prev;
if (!changed) {
ret = 0;
goto out_trace;
}
ret = wrmsrq_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ2, value);
if (ret) {
changed = false;
pr_err("failed to set CPPC REQ2 value. Error (%d)\n", ret);
goto out_trace;
}
WRITE_ONCE(cpudata->cppc_req2_cached, value);
out_trace:
if (trace_amd_pstate_cppc_req2_enabled())
trace_amd_pstate_cppc_req2(cpudata->cpu, perf, changed, ret);
return ret;
}
static int amd_pstate_init_floor_perf(struct cpufreq_policy *policy)
{
struct amd_cpudata *cpudata = policy->driver_data;
u8 floor_perf;
u64 value;
int ret;
if (!cpu_feature_enabled(X86_FEATURE_CPPC_PERF_PRIO))
return 0;
ret = rdmsrq_on_cpu(cpudata->cpu, MSR_AMD_CPPC_REQ2, &value);
if (ret) {
pr_err("failed to read CPPC REQ2 value. Error (%d)\n", ret);
return ret;
}
WRITE_ONCE(cpudata->cppc_req2_cached, value);
floor_perf = FIELD_GET(AMD_CPPC_FLOOR_PERF_MASK,
cpudata->cppc_req2_cached);
/* Set a sane value for floor_perf if the default value is invalid */
if (floor_perf < cpudata->perf.lowest_perf) {
floor_perf = cpudata->perf.nominal_perf;
ret = amd_pstate_set_floor_perf(policy, floor_perf);
if (ret)
return ret;
}
cpudata->bios_floor_perf = floor_perf;
cpudata->floor_freq = perf_to_freq(cpudata->perf, cpudata->nominal_freq,
floor_perf);
return 0;
}
static int shmem_set_epp(struct cpufreq_policy *policy, u8 epp)
{
struct amd_cpudata *cpudata = policy->driver_data;
@@ -427,6 +504,7 @@ static int msr_init_perf(struct amd_cpudata *cpudata)
perf.lowest_perf = FIELD_GET(AMD_CPPC_LOWEST_PERF_MASK, cap1);
WRITE_ONCE(cpudata->perf, perf);
WRITE_ONCE(cpudata->prefcore_ranking, FIELD_GET(AMD_CPPC_HIGHEST_PERF_MASK, cap1));
WRITE_ONCE(cpudata->floor_perf_cnt, FIELD_GET(AMD_CPPC_FLOOR_PERF_CNT_MASK, cap1));
return 0;
}
@@ -565,15 +643,12 @@ static inline bool amd_pstate_sample(struct amd_cpudata *cpudata)
return true;
}
static void amd_pstate_update(struct amd_cpudata *cpudata, u8 min_perf,
static void amd_pstate_update(struct cpufreq_policy *policy, u8 min_perf,
u8 des_perf, u8 max_perf, bool fast_switch, int gov_flags)
{
struct cpufreq_policy *policy __free(put_cpufreq_policy) = cpufreq_cpu_get(cpudata->cpu);
struct amd_cpudata *cpudata = policy->driver_data;
union perf_cached perf = READ_ONCE(cpudata->perf);
if (!policy)
return;
/* limit the max perf when core performance boost feature is disabled */
if (!cpudata->boost_supported)
max_perf = min_t(u8, perf.nominal_perf, max_perf);
@@ -688,7 +763,7 @@ static int amd_pstate_update_freq(struct cpufreq_policy *policy,
if (!fast_switch)
cpufreq_freq_transition_begin(policy, &freqs);
amd_pstate_update(cpudata, perf.min_limit_perf, des_perf,
amd_pstate_update(policy, perf.min_limit_perf, des_perf,
perf.max_limit_perf, fast_switch,
policy->governor->flags);
@@ -713,13 +788,12 @@ static unsigned int amd_pstate_fast_switch(struct cpufreq_policy *policy,
return policy->cur;
}
static void amd_pstate_adjust_perf(unsigned int cpu,
static void amd_pstate_adjust_perf(struct cpufreq_policy *policy,
unsigned long _min_perf,
unsigned long target_perf,
unsigned long capacity)
{
u8 max_perf, min_perf, des_perf, cap_perf;
struct cpufreq_policy *policy __free(put_cpufreq_policy) = cpufreq_cpu_get(cpu);
struct amd_cpudata *cpudata;
union perf_cached perf;
@@ -750,22 +824,20 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
if (max_perf < min_perf)
max_perf = min_perf;
amd_pstate_update(cpudata, min_perf, des_perf, max_perf, true,
amd_pstate_update(policy, min_perf, des_perf, max_perf, true,
policy->governor->flags);
}
static int amd_pstate_cpu_boost_update(struct cpufreq_policy *policy, bool on)
{
struct amd_cpudata *cpudata = policy->driver_data;
union perf_cached perf = READ_ONCE(cpudata->perf);
u32 nominal_freq, max_freq;
u32 nominal_freq;
int ret = 0;
nominal_freq = READ_ONCE(cpudata->nominal_freq);
max_freq = perf_to_freq(perf, cpudata->nominal_freq, perf.highest_perf);
if (on)
policy->cpuinfo.max_freq = max_freq;
policy->cpuinfo.max_freq = cpudata->max_freq;
else if (policy->cpuinfo.max_freq > nominal_freq)
policy->cpuinfo.max_freq = nominal_freq;
@@ -950,13 +1022,15 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
WRITE_ONCE(cpudata->nominal_freq, nominal_freq);
/* max_freq is calculated according to (nominal_freq * highest_perf)/nominal_perf */
max_freq = perf_to_freq(perf, nominal_freq, perf.highest_perf);
WRITE_ONCE(cpudata->max_freq, max_freq);
lowest_nonlinear_freq = perf_to_freq(perf, nominal_freq, perf.lowest_nonlinear_perf);
WRITE_ONCE(cpudata->lowest_nonlinear_freq, lowest_nonlinear_freq);
/**
* Below values need to be initialized correctly, otherwise driver will fail to load
* max_freq is calculated according to (nominal_freq * highest_perf)/nominal_perf
* lowest_nonlinear_freq is a value between [min_freq, nominal_freq]
* Check _CPC in ACPI table objects if any values are incorrect
*/
@@ -1019,10 +1093,9 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
cpudata->nominal_freq,
perf.lowest_perf);
policy->cpuinfo.max_freq = policy->max = perf_to_freq(perf,
cpudata->nominal_freq,
perf.highest_perf);
policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
policy->driver_data = cpudata;
ret = amd_pstate_cppc_enable(policy);
if (ret)
goto free_cpudata1;
@@ -1035,6 +1108,12 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
if (cpu_feature_enabled(X86_FEATURE_CPPC))
policy->fast_switch_possible = true;
ret = amd_pstate_init_floor_perf(policy);
if (ret) {
dev_err(dev, "Failed to initialize Floor Perf (%d)\n", ret);
goto free_cpudata1;
}
ret = freq_qos_add_request(&policy->constraints, &cpudata->req[0],
FREQ_QOS_MIN, FREQ_QOS_MIN_DEFAULT_VALUE);
if (ret < 0) {
@@ -1049,7 +1128,6 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
goto free_cpudata2;
}
policy->driver_data = cpudata;
if (!current_pstate_driver->adjust_perf)
current_pstate_driver->adjust_perf = amd_pstate_adjust_perf;
@@ -1061,6 +1139,7 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
free_cpudata1:
pr_warn("Failed to initialize CPU %d: %d\n", policy->cpu, ret);
kfree(cpudata);
policy->driver_data = NULL;
return ret;
}
@@ -1071,6 +1150,7 @@ static void amd_pstate_cpu_exit(struct cpufreq_policy *policy)
/* Reset CPPC_REQ MSR to the BIOS value */
amd_pstate_update_perf(policy, perf.bios_min_perf, 0U, 0U, 0U, false);
amd_pstate_set_floor_perf(policy, cpudata->bios_floor_perf);
freq_qos_remove_request(&cpudata->req[1]);
freq_qos_remove_request(&cpudata->req[0]);
@@ -1078,6 +1158,167 @@ static void amd_pstate_cpu_exit(struct cpufreq_policy *policy)
kfree(cpudata);
}
static int amd_pstate_get_balanced_epp(struct cpufreq_policy *policy)
{
struct amd_cpudata *cpudata = policy->driver_data;
if (power_supply_is_system_supplied())
return cpudata->epp_default_ac;
else
return cpudata->epp_default_dc;
}
static int amd_pstate_power_supply_notifier(struct notifier_block *nb,
unsigned long event, void *data)
{
struct amd_cpudata *cpudata = container_of(nb, struct amd_cpudata, power_nb);
struct cpufreq_policy *policy __free(put_cpufreq_policy) = cpufreq_cpu_get(cpudata->cpu);
u8 epp;
int ret;
if (event != PSY_EVENT_PROP_CHANGED)
return NOTIFY_OK;
/* dynamic actions are only applied while platform profile is in balanced */
if (cpudata->current_profile != PLATFORM_PROFILE_BALANCED)
return 0;
epp = amd_pstate_get_balanced_epp(policy);
ret = amd_pstate_set_epp(policy, epp);
if (ret)
pr_warn("Failed to set CPU %d EPP %u: %d\n", cpudata->cpu, epp, ret);
return NOTIFY_OK;
}
static int amd_pstate_profile_probe(void *drvdata, unsigned long *choices)
{
set_bit(PLATFORM_PROFILE_LOW_POWER, choices);
set_bit(PLATFORM_PROFILE_BALANCED, choices);
set_bit(PLATFORM_PROFILE_PERFORMANCE, choices);
return 0;
}
static int amd_pstate_profile_get(struct device *dev,
enum platform_profile_option *profile)
{
struct amd_cpudata *cpudata = dev_get_drvdata(dev);
*profile = cpudata->current_profile;
return 0;
}
static int amd_pstate_profile_set(struct device *dev,
enum platform_profile_option profile)
{
struct amd_cpudata *cpudata = dev_get_drvdata(dev);
struct cpufreq_policy *policy __free(put_cpufreq_policy) = cpufreq_cpu_get(cpudata->cpu);
int ret;
switch (profile) {
case PLATFORM_PROFILE_LOW_POWER:
ret = amd_pstate_set_epp(policy, AMD_CPPC_EPP_POWERSAVE);
if (ret)
return ret;
break;
case PLATFORM_PROFILE_BALANCED:
ret = amd_pstate_set_epp(policy,
amd_pstate_get_balanced_epp(policy));
if (ret)
return ret;
break;
case PLATFORM_PROFILE_PERFORMANCE:
ret = amd_pstate_set_epp(policy, AMD_CPPC_EPP_PERFORMANCE);
if (ret)
return ret;
break;
default:
pr_err("Unknown Platform Profile %d\n", profile);
return -EOPNOTSUPP;
}
cpudata->current_profile = profile;
return 0;
}
static const struct platform_profile_ops amd_pstate_profile_ops = {
.probe = amd_pstate_profile_probe,
.profile_set = amd_pstate_profile_set,
.profile_get = amd_pstate_profile_get,
};
void amd_pstate_clear_dynamic_epp(struct cpufreq_policy *policy)
{
struct amd_cpudata *cpudata = policy->driver_data;
if (cpudata->power_nb.notifier_call)
power_supply_unreg_notifier(&cpudata->power_nb);
if (cpudata->ppdev) {
platform_profile_remove(cpudata->ppdev);
cpudata->ppdev = NULL;
}
kfree(cpudata->profile_name);
cpudata->dynamic_epp = false;
}
EXPORT_SYMBOL_GPL(amd_pstate_clear_dynamic_epp);
static int amd_pstate_set_dynamic_epp(struct cpufreq_policy *policy)
{
struct amd_cpudata *cpudata = policy->driver_data;
int ret;
u8 epp;
switch (cpudata->current_profile) {
case PLATFORM_PROFILE_PERFORMANCE:
epp = AMD_CPPC_EPP_PERFORMANCE;
break;
case PLATFORM_PROFILE_LOW_POWER:
epp = AMD_CPPC_EPP_POWERSAVE;
break;
case PLATFORM_PROFILE_BALANCED:
epp = amd_pstate_get_balanced_epp(policy);
break;
default:
pr_err("Unknown Platform Profile %d\n", cpudata->current_profile);
return -EOPNOTSUPP;
}
ret = amd_pstate_set_epp(policy, epp);
if (ret)
return ret;
cpudata->profile_name = kasprintf(GFP_KERNEL, "amd-pstate-epp-cpu%d", cpudata->cpu);
cpudata->ppdev = platform_profile_register(get_cpu_device(policy->cpu),
cpudata->profile_name,
policy->driver_data,
&amd_pstate_profile_ops);
if (IS_ERR(cpudata->ppdev)) {
ret = PTR_ERR(cpudata->ppdev);
goto cleanup;
}
/* only enable notifier if things will actually change */
if (cpudata->epp_default_ac != cpudata->epp_default_dc) {
cpudata->power_nb.notifier_call = amd_pstate_power_supply_notifier;
ret = power_supply_reg_notifier(&cpudata->power_nb);
if (ret)
goto cleanup;
}
cpudata->dynamic_epp = true;
return 0;
cleanup:
amd_pstate_clear_dynamic_epp(policy);
return ret;
}
/* Sysfs attributes */
/*
@@ -1088,14 +1329,9 @@ static void amd_pstate_cpu_exit(struct cpufreq_policy *policy)
static ssize_t show_amd_pstate_max_freq(struct cpufreq_policy *policy,
char *buf)
{
struct amd_cpudata *cpudata;
union perf_cached perf;
struct amd_cpudata *cpudata = policy->driver_data;
cpudata = policy->driver_data;
perf = READ_ONCE(cpudata->perf);
return sysfs_emit(buf, "%u\n",
perf_to_freq(perf, cpudata->nominal_freq, perf.highest_perf));
return sysfs_emit(buf, "%u\n", cpudata->max_freq);
}
static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *policy,
@@ -1165,40 +1401,60 @@ static ssize_t show_energy_performance_available_preferences(
return offset;
}
static ssize_t store_energy_performance_preference(
struct cpufreq_policy *policy, const char *buf, size_t count)
ssize_t store_energy_performance_preference(struct cpufreq_policy *policy,
const char *buf, size_t count)
{
struct amd_cpudata *cpudata = policy->driver_data;
ssize_t ret;
bool raw_epp = false;
u8 epp;
ret = sysfs_match_string(energy_perf_strings, buf);
if (ret < 0)
return -EINVAL;
if (cpudata->dynamic_epp) {
pr_debug("EPP cannot be set when dynamic EPP is enabled\n");
return -EBUSY;
}
if (!ret)
epp = cpudata->epp_default;
else
epp = epp_values[ret];
/*
* If the value matches a number, use that; otherwise see if it
* matches an entry in the energy_perf_strings array.
*/
ret = kstrtou8(buf, 0, &epp);
raw_epp = !ret;
if (ret) {
ret = sysfs_match_string(energy_perf_strings, buf);
if (ret < 0 || ret == EPP_INDEX_CUSTOM)
return -EINVAL;
if (ret)
epp = epp_values[ret];
else
epp = amd_pstate_get_balanced_epp(policy);
}
if (epp > 0 && policy->policy == CPUFREQ_POLICY_PERFORMANCE) {
if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE) {
pr_debug("EPP cannot be set under performance policy\n");
return -EBUSY;
}
ret = amd_pstate_set_epp(policy, epp);
if (ret)
return ret;
return ret ? ret : count;
cpudata->raw_epp = raw_epp;
return count;
}
EXPORT_SYMBOL_GPL(store_energy_performance_preference);
static ssize_t show_energy_performance_preference(
struct cpufreq_policy *policy, char *buf)
ssize_t show_energy_performance_preference(struct cpufreq_policy *policy, char *buf)
{
struct amd_cpudata *cpudata = policy->driver_data;
u8 preference, epp;
epp = FIELD_GET(AMD_CPPC_EPP_PERF_MASK, cpudata->cppc_req_cached);
if (cpudata->raw_epp)
return sysfs_emit(buf, "%u\n", epp);
switch (epp) {
case AMD_CPPC_EPP_PERFORMANCE:
preference = EPP_INDEX_PERFORMANCE;
@@ -1218,6 +1474,138 @@ static ssize_t show_energy_performance_preference(
return sysfs_emit(buf, "%s\n", energy_perf_strings[preference]);
}
EXPORT_SYMBOL_GPL(show_energy_performance_preference);
static ssize_t store_amd_pstate_floor_freq(struct cpufreq_policy *policy,
const char *buf, size_t count)
{
struct amd_cpudata *cpudata = policy->driver_data;
union perf_cached perf = READ_ONCE(cpudata->perf);
unsigned int freq;
u8 floor_perf;
int ret;
ret = kstrtouint(buf, 0, &freq);
if (ret)
return ret;
if (freq < policy->cpuinfo.min_freq || freq > policy->max)
return -EINVAL;
floor_perf = freq_to_perf(perf, cpudata->nominal_freq, freq);
ret = amd_pstate_set_floor_perf(policy, floor_perf);
if (!ret)
cpudata->floor_freq = freq;
return ret ?: count;
}
static ssize_t show_amd_pstate_floor_freq(struct cpufreq_policy *policy, char *buf)
{
struct amd_cpudata *cpudata = policy->driver_data;
return sysfs_emit(buf, "%u\n", cpudata->floor_freq);
}
static ssize_t show_amd_pstate_floor_count(struct cpufreq_policy *policy, char *buf)
{
struct amd_cpudata *cpudata = policy->driver_data;
u8 count = cpudata->floor_perf_cnt;
return sysfs_emit(buf, "%u\n", count);
}
cpufreq_freq_attr_ro(amd_pstate_max_freq);
cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
cpufreq_freq_attr_ro(amd_pstate_highest_perf);
cpufreq_freq_attr_ro(amd_pstate_prefcore_ranking);
cpufreq_freq_attr_ro(amd_pstate_hw_prefcore);
cpufreq_freq_attr_rw(energy_performance_preference);
cpufreq_freq_attr_ro(energy_performance_available_preferences);
cpufreq_freq_attr_rw(amd_pstate_floor_freq);
cpufreq_freq_attr_ro(amd_pstate_floor_count);
struct freq_attr_visibility {
struct freq_attr *attr;
bool (*visibility_fn)(void);
};
/* For attributes which are always visible */
static bool always_visible(void)
{
return true;
}
/* Determines whether prefcore related attributes should be visible */
static bool prefcore_visibility(void)
{
return amd_pstate_prefcore;
}
/* Determines whether energy performance preference should be visible */
static bool epp_visibility(void)
{
return cppc_state == AMD_PSTATE_ACTIVE;
}
/* Determines whether amd_pstate_floor_freq related attributes should be visible */
static bool floor_freq_visibility(void)
{
return cpu_feature_enabled(X86_FEATURE_CPPC_PERF_PRIO);
}
static struct freq_attr_visibility amd_pstate_attr_visibility[] = {
{&amd_pstate_max_freq, always_visible},
{&amd_pstate_lowest_nonlinear_freq, always_visible},
{&amd_pstate_highest_perf, always_visible},
{&amd_pstate_prefcore_ranking, prefcore_visibility},
{&amd_pstate_hw_prefcore, prefcore_visibility},
{&energy_performance_preference, epp_visibility},
{&energy_performance_available_preferences, epp_visibility},
{&amd_pstate_floor_freq, floor_freq_visibility},
{&amd_pstate_floor_count, floor_freq_visibility},
};
struct freq_attr **amd_pstate_get_current_attrs(void)
{
if (!current_pstate_driver)
return NULL;
return current_pstate_driver->attr;
}
EXPORT_SYMBOL_GPL(amd_pstate_get_current_attrs);
static struct freq_attr **get_freq_attrs(void)
{
bool attr_visible[ARRAY_SIZE(amd_pstate_attr_visibility)];
struct freq_attr **attrs;
int i, j, count;
for (i = 0, count = 0; i < ARRAY_SIZE(amd_pstate_attr_visibility); i++) {
struct freq_attr_visibility *v = &amd_pstate_attr_visibility[i];
attr_visible[i] = v->visibility_fn();
if (attr_visible[i])
count++;
}
/* amd_pstate_{max_freq, lowest_nonlinear_freq, highest_perf} should always be visible */
BUG_ON(!count);
attrs = kcalloc(count + 1, sizeof(struct freq_attr *), GFP_KERNEL);
if (!attrs)
return ERR_PTR(-ENOMEM);
for (i = 0, j = 0; i < ARRAY_SIZE(amd_pstate_attr_visibility); i++) {
if (!attr_visible[i])
continue;
attrs[j++] = amd_pstate_attr_visibility[i].attr;
}
return attrs;
}
static void amd_pstate_driver_cleanup(void)
{
@@ -1225,6 +1613,8 @@ static void amd_pstate_driver_cleanup(void)
sched_clear_itmt_support();
cppc_state = AMD_PSTATE_DISABLE;
kfree(current_pstate_driver->attr);
current_pstate_driver->attr = NULL;
current_pstate_driver = NULL;
}
@@ -1249,6 +1639,7 @@ static int amd_pstate_set_driver(int mode_idx)
static int amd_pstate_register_driver(int mode)
{
struct freq_attr **attr = NULL;
int ret;
ret = amd_pstate_set_driver(mode);
@@ -1257,6 +1648,22 @@ static int amd_pstate_register_driver(int mode)
cppc_state = mode;
/*
* Note: It is important to compute the attrs _after_
* re-initializing the cppc_state. Some attributes become
* visible only when cppc_state is AMD_PSTATE_ACTIVE.
*/
attr = get_freq_attrs();
if (IS_ERR(attr)) {
ret = (int) PTR_ERR(attr);
pr_err("Couldn't compute freq_attrs for current mode %s [%d]\n",
amd_pstate_get_mode_string(cppc_state), ret);
amd_pstate_driver_cleanup();
return ret;
}
current_pstate_driver->attr = attr;
/* at least one CPU supports CPB */
current_pstate_driver->boost_enabled = cpu_feature_enabled(X86_FEATURE_CPB);
@@ -1398,40 +1805,42 @@ static ssize_t prefcore_show(struct device *dev,
return sysfs_emit(buf, "%s\n", str_enabled_disabled(amd_pstate_prefcore));
}
cpufreq_freq_attr_ro(amd_pstate_max_freq);
cpufreq_freq_attr_ro(amd_pstate_lowest_nonlinear_freq);
static ssize_t dynamic_epp_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
return sysfs_emit(buf, "%s\n", str_enabled_disabled(dynamic_epp));
}
static ssize_t dynamic_epp_store(struct device *a, struct device_attribute *b,
const char *buf, size_t count)
{
bool enabled;
int ret;
ret = kstrtobool(buf, &enabled);
if (ret)
return ret;
if (dynamic_epp == enabled)
return -EINVAL;
/* reinitialize with desired dynamic EPP value */
dynamic_epp = enabled;
ret = amd_pstate_change_driver_mode(cppc_state);
if (ret)
dynamic_epp = false;
return ret ? ret : count;
}
cpufreq_freq_attr_ro(amd_pstate_highest_perf);
cpufreq_freq_attr_ro(amd_pstate_prefcore_ranking);
cpufreq_freq_attr_ro(amd_pstate_hw_prefcore);
cpufreq_freq_attr_rw(energy_performance_preference);
cpufreq_freq_attr_ro(energy_performance_available_preferences);
static DEVICE_ATTR_RW(status);
static DEVICE_ATTR_RO(prefcore);
static struct freq_attr *amd_pstate_attr[] = {
&amd_pstate_max_freq,
&amd_pstate_lowest_nonlinear_freq,
&amd_pstate_highest_perf,
&amd_pstate_prefcore_ranking,
&amd_pstate_hw_prefcore,
NULL,
};
static struct freq_attr *amd_pstate_epp_attr[] = {
&amd_pstate_max_freq,
&amd_pstate_lowest_nonlinear_freq,
&amd_pstate_highest_perf,
&amd_pstate_prefcore_ranking,
&amd_pstate_hw_prefcore,
&energy_performance_preference,
&energy_performance_available_preferences,
NULL,
};
static DEVICE_ATTR_RW(dynamic_epp);
static struct attribute *pstate_global_attributes[] = {
&dev_attr_status.attr,
&dev_attr_prefcore.attr,
&dev_attr_dynamic_epp.attr,
NULL
};
@@ -1501,9 +1910,7 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
policy->cpuinfo.min_freq = policy->min = perf_to_freq(perf,
cpudata->nominal_freq,
perf.lowest_perf);
policy->cpuinfo.max_freq = policy->max = perf_to_freq(perf,
cpudata->nominal_freq,
perf.highest_perf);
policy->cpuinfo.max_freq = policy->max = cpudata->max_freq;
policy->driver_data = cpudata;
ret = amd_pstate_cppc_enable(policy);
@@ -1523,15 +1930,27 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
if (amd_pstate_acpi_pm_profile_server() ||
amd_pstate_acpi_pm_profile_undefined()) {
policy->policy = CPUFREQ_POLICY_PERFORMANCE;
cpudata->epp_default = amd_pstate_get_epp(cpudata);
cpudata->epp_default_ac = cpudata->epp_default_dc = amd_pstate_get_epp(cpudata);
cpudata->current_profile = PLATFORM_PROFILE_PERFORMANCE;
} else {
policy->policy = CPUFREQ_POLICY_POWERSAVE;
cpudata->epp_default = AMD_CPPC_EPP_BALANCE_PERFORMANCE;
cpudata->epp_default_ac = AMD_CPPC_EPP_PERFORMANCE;
cpudata->epp_default_dc = AMD_CPPC_EPP_BALANCE_PERFORMANCE;
cpudata->current_profile = PLATFORM_PROFILE_BALANCED;
}
ret = amd_pstate_set_epp(policy, cpudata->epp_default);
if (dynamic_epp)
ret = amd_pstate_set_dynamic_epp(policy);
else
ret = amd_pstate_set_epp(policy, amd_pstate_get_balanced_epp(policy));
if (ret)
return ret;
goto free_cpudata1;
ret = amd_pstate_init_floor_perf(policy);
if (ret) {
dev_err(dev, "Failed to initialize Floor Perf (%d)\n", ret);
goto free_cpudata1;
}
current_pstate_driver->adjust_perf = NULL;
@@ -1540,6 +1959,7 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
free_cpudata1:
pr_warn("Failed to initialize CPU %d: %d\n", policy->cpu, ret);
kfree(cpudata);
policy->driver_data = NULL;
return ret;
}
@@ -1552,7 +1972,10 @@ static void amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
/* Reset CPPC_REQ MSR to the BIOS value */
amd_pstate_update_perf(policy, perf.bios_min_perf, 0U, 0U, 0U, false);
amd_pstate_set_floor_perf(policy, cpudata->bios_floor_perf);
if (cpudata->dynamic_epp)
amd_pstate_clear_dynamic_epp(policy);
kfree(cpudata);
policy->driver_data = NULL;
}
@@ -1607,24 +2030,39 @@ static int amd_pstate_epp_set_policy(struct cpufreq_policy *policy)
static int amd_pstate_cpu_online(struct cpufreq_policy *policy)
{
return amd_pstate_cppc_enable(policy);
struct amd_cpudata *cpudata = policy->driver_data;
union perf_cached perf = READ_ONCE(cpudata->perf);
u8 cached_floor_perf;
int ret;
ret = amd_pstate_cppc_enable(policy);
if (ret)
return ret;
cached_floor_perf = freq_to_perf(perf, cpudata->nominal_freq, cpudata->floor_freq);
return amd_pstate_set_floor_perf(policy, cached_floor_perf);
}
static int amd_pstate_cpu_offline(struct cpufreq_policy *policy)
{
struct amd_cpudata *cpudata = policy->driver_data;
union perf_cached perf = READ_ONCE(cpudata->perf);
int ret;
/*
* Reset CPPC_REQ MSR to the BIOS value, this will allow us to retain the BIOS specified
* min_perf value across kexec reboots. If this CPU is just onlined normally after this, the
* limits, epp and desired perf will get reset to the cached values in cpudata struct
*/
return amd_pstate_update_perf(policy, perf.bios_min_perf,
ret = amd_pstate_update_perf(policy, perf.bios_min_perf,
FIELD_GET(AMD_CPPC_DES_PERF_MASK, cpudata->cppc_req_cached),
FIELD_GET(AMD_CPPC_MAX_PERF_MASK, cpudata->cppc_req_cached),
FIELD_GET(AMD_CPPC_EPP_PERF_MASK, cpudata->cppc_req_cached),
false);
if (ret)
return ret;
return amd_pstate_set_floor_perf(policy, cpudata->bios_floor_perf);
}
static int amd_pstate_suspend(struct cpufreq_policy *policy)
@@ -1646,6 +2084,10 @@ static int amd_pstate_suspend(struct cpufreq_policy *policy)
if (ret)
return ret;
ret = amd_pstate_set_floor_perf(policy, cpudata->bios_floor_perf);
if (ret)
return ret;
/* set this flag to avoid setting core offline*/
cpudata->suspended = true;
@@ -1657,15 +2099,24 @@ static int amd_pstate_resume(struct cpufreq_policy *policy)
struct amd_cpudata *cpudata = policy->driver_data;
union perf_cached perf = READ_ONCE(cpudata->perf);
int cur_perf = freq_to_perf(perf, cpudata->nominal_freq, policy->cur);
u8 cached_floor_perf;
int ret;
/* Set CPPC_REQ to last sane value until the governor updates it */
return amd_pstate_update_perf(policy, perf.min_limit_perf, cur_perf, perf.max_limit_perf,
0U, false);
ret = amd_pstate_update_perf(policy, perf.min_limit_perf, cur_perf, perf.max_limit_perf,
0U, false);
if (ret)
return ret;
cached_floor_perf = freq_to_perf(perf, cpudata->nominal_freq, cpudata->floor_freq);
return amd_pstate_set_floor_perf(policy, cached_floor_perf);
}
static int amd_pstate_epp_resume(struct cpufreq_policy *policy)
{
struct amd_cpudata *cpudata = policy->driver_data;
union perf_cached perf = READ_ONCE(cpudata->perf);
u8 cached_floor_perf;
if (cpudata->suspended) {
int ret;
@@ -1678,7 +2129,8 @@ static int amd_pstate_epp_resume(struct cpufreq_policy *policy)
cpudata->suspended = false;
}
return 0;
cached_floor_perf = freq_to_perf(perf, cpudata->nominal_freq, cpudata->floor_freq);
return amd_pstate_set_floor_perf(policy, cached_floor_perf);
}
static struct cpufreq_driver amd_pstate_driver = {
@@ -1695,7 +2147,6 @@ static struct cpufreq_driver amd_pstate_driver = {
.set_boost = amd_pstate_set_boost,
.update_limits = amd_pstate_update_limits,
.name = "amd-pstate",
.attr = amd_pstate_attr,
};
static struct cpufreq_driver amd_pstate_epp_driver = {
@@ -1711,7 +2162,6 @@ static struct cpufreq_driver amd_pstate_epp_driver = {
.update_limits = amd_pstate_update_limits,
.set_boost = amd_pstate_set_boost,
.name = "amd-pstate-epp",
.attr = amd_pstate_epp_attr,
};
/*
@@ -1857,7 +2307,7 @@ static int __init amd_pstate_init(void)
return ret;
global_attr_free:
cpufreq_unregister_driver(current_pstate_driver);
amd_pstate_unregister_driver(0);
return ret;
}
device_initcall(amd_pstate_init);
@@ -1884,8 +2334,19 @@ static int __init amd_prefcore_param(char *str)
return 0;
}
static int __init amd_dynamic_epp_param(char *str)
{
if (!strcmp(str, "disable"))
dynamic_epp = false;
if (!strcmp(str, "enable"))
dynamic_epp = true;
return 0;
}
early_param("amd_pstate", amd_pstate_param);
early_param("amd_prefcore", amd_prefcore_param);
early_param("amd_dynamic_epp", amd_dynamic_epp_param);
MODULE_AUTHOR("Huang Rui <ray.huang@amd.com>");
MODULE_DESCRIPTION("AMD Processor P-state Frequency Driver");

View File

@@ -9,6 +9,7 @@
#define _LINUX_AMD_PSTATE_H
#include <linux/pm_qos.h>
#include <linux/platform_profile.h>
/*********************************************************************
* AMD P-state INTERFACE *
@@ -62,13 +63,20 @@ struct amd_aperf_mperf {
* @cpu: CPU number
* @req: constraint request to apply
* @cppc_req_cached: cached performance request hints
* @cppc_req2_cached: cached value of MSR_AMD_CPPC_REQ2
* @perf: cached performance-related data
* @prefcore_ranking: the preferred core ranking, the higher value indicates a higher
* priority.
* @floor_perf_cnt: Cached value of the number of distinct floor
* performance levels supported
* @bios_floor_perf: Cached value of the boot-time floor performance level from
* MSR_AMD_CPPC_REQ2
* @min_limit_freq: Cached value of policy->min (in khz)
* @max_limit_freq: Cached value of policy->max (in khz)
* @nominal_freq: the frequency (in khz) that mapped to nominal_perf
* @max_freq: the maximum frequency (in khz) possible under ideal conditions
* @lowest_nonlinear_freq: the frequency (in khz) that mapped to lowest_nonlinear_perf
* @floor_freq: Cached value of the user requested floor_freq
* @cur: Difference of Aperf/Mperf/tsc count between last and current sample
* @prev: Last Aperf/Mperf/tsc count value read from register
* @freq: current cpu frequency value (in khz)
@@ -78,6 +86,11 @@ struct amd_aperf_mperf {
* AMD P-State driver supports preferred core feature.
* @epp_cached: Cached CPPC energy-performance preference value
* @policy: Cpufreq policy value
* @suspended: If the CPU core is offlined
* @epp_default_ac: Default EPP value for AC power source
* @epp_default_dc: Default EPP value for DC power source
* @dynamic_epp: Whether dynamic EPP is enabled
* @power_nb: Notifier block for power events
*
* The amd_cpudata is key private data for each CPU thread in AMD P-State, and
* represents all the attributes and goals that AMD P-State requests at runtime.
@@ -87,14 +100,19 @@ struct amd_cpudata {
struct freq_qos_request req[2];
u64 cppc_req_cached;
u64 cppc_req2_cached;
union perf_cached perf;
u8 prefcore_ranking;
u8 floor_perf_cnt;
u8 bios_floor_perf;
u32 min_limit_freq;
u32 max_limit_freq;
u32 nominal_freq;
u32 max_freq;
u32 lowest_nonlinear_freq;
u32 floor_freq;
struct amd_aperf_mperf cur;
struct amd_aperf_mperf prev;
@@ -106,7 +124,16 @@ struct amd_cpudata {
/* EPP feature related attributes*/
u32 policy;
bool suspended;
u8 epp_default;
u8 epp_default_ac;
u8 epp_default_dc;
bool dynamic_epp;
bool raw_epp;
struct notifier_block power_nb;
/* platform profile */
enum platform_profile_option current_profile;
struct device *ppdev;
char *profile_name;
};
/*
@@ -123,5 +150,13 @@ enum amd_pstate_mode {
const char *amd_pstate_get_mode_string(enum amd_pstate_mode mode);
int amd_pstate_get_status(void);
int amd_pstate_update_status(const char *buf, size_t size);
ssize_t store_energy_performance_preference(struct cpufreq_policy *policy,
const char *buf, size_t count);
ssize_t show_energy_performance_preference(struct cpufreq_policy *policy, char *buf);
void amd_pstate_clear_dynamic_epp(struct cpufreq_policy *policy);
struct freq_attr;
struct freq_attr **amd_pstate_get_current_attrs(void);
#endif /* _LINUX_AMD_PSTATE_H */

View File

@@ -2221,7 +2221,7 @@ EXPORT_SYMBOL_GPL(cpufreq_driver_fast_switch);
/**
* cpufreq_driver_adjust_perf - Adjust CPU performance level in one go.
* @cpu: Target CPU.
* @policy: cpufreq policy object of the target CPU.
* @min_perf: Minimum (required) performance level (units of @capacity).
* @target_perf: Target (desired) performance level (units of @capacity).
* @capacity: Capacity of the target CPU.
@@ -2240,12 +2240,12 @@ EXPORT_SYMBOL_GPL(cpufreq_driver_fast_switch);
* parallel with either ->target() or ->target_index() or ->fast_switch() for
* the same CPU.
*/
void cpufreq_driver_adjust_perf(unsigned int cpu,
void cpufreq_driver_adjust_perf(struct cpufreq_policy *policy,
unsigned long min_perf,
unsigned long target_perf,
unsigned long capacity)
{
cpufreq_driver->adjust_perf(cpu, min_perf, target_perf, capacity);
cpufreq_driver->adjust_perf(policy, min_perf, target_perf, capacity);
}
/**

View File

@@ -3239,12 +3239,12 @@ static unsigned int intel_cpufreq_fast_switch(struct cpufreq_policy *policy,
return target_pstate * cpu->pstate.scaling;
}
static void intel_cpufreq_adjust_perf(unsigned int cpunum,
static void intel_cpufreq_adjust_perf(struct cpufreq_policy *policy,
unsigned long min_perf,
unsigned long target_perf,
unsigned long capacity)
{
struct cpudata *cpu = all_cpu_data[cpunum];
struct cpudata *cpu = all_cpu_data[policy->cpu];
u64 hwp_cap = READ_ONCE(cpu->hwp_cap_cached);
int old_pstate = cpu->pstate.current_pstate;
int cap_pstate, min_pstate, max_pstate, target_pstate;

View File

@@ -373,7 +373,7 @@ struct cpufreq_driver {
* conditions) scale invariance can be disabled, which causes the
* schedutil governor to fall back to the latter.
*/
void (*adjust_perf)(unsigned int cpu,
void (*adjust_perf)(struct cpufreq_policy *policy,
unsigned long min_perf,
unsigned long target_perf,
unsigned long capacity);
@@ -618,7 +618,7 @@ struct cpufreq_governor {
/* Pass a target to the cpufreq driver */
unsigned int cpufreq_driver_fast_switch(struct cpufreq_policy *policy,
unsigned int target_freq);
void cpufreq_driver_adjust_perf(unsigned int cpu,
void cpufreq_driver_adjust_perf(struct cpufreq_policy *policy,
unsigned long min_perf,
unsigned long target_perf,
unsigned long capacity);

View File

@@ -461,6 +461,7 @@ static void sugov_update_single_perf(struct update_util_data *hook, u64 time,
unsigned int flags)
{
struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
struct sugov_policy *sg_policy = sg_cpu->sg_policy;
unsigned long prev_util = sg_cpu->util;
unsigned long max_cap;
@@ -482,10 +483,10 @@ static void sugov_update_single_perf(struct update_util_data *hook, u64 time,
if (sugov_hold_freq(sg_cpu) && sg_cpu->util < prev_util)
sg_cpu->util = prev_util;
cpufreq_driver_adjust_perf(sg_cpu->cpu, sg_cpu->bw_min,
cpufreq_driver_adjust_perf(sg_policy->policy, sg_cpu->bw_min,
sg_cpu->util, max_cap);
sg_cpu->sg_policy->last_freq_update_time = time;
sg_policy->last_freq_update_time = time;
}
static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu, u64 time)

View File

@@ -1257,18 +1257,17 @@ impl<T: Driver> Registration<T> {
/// # Safety
///
/// - This function may only be called from the cpufreq C infrastructure.
/// - The pointer arguments must be valid pointers.
unsafe extern "C" fn adjust_perf_callback(
cpu: c_uint,
ptr: *mut bindings::cpufreq_policy,
min_perf: c_ulong,
target_perf: c_ulong,
capacity: c_ulong,
) {
// SAFETY: The C API guarantees that `cpu` refers to a valid CPU number.
let cpu_id = unsafe { CpuId::from_u32_unchecked(cpu) };
if let Ok(mut policy) = PolicyCpu::from_cpu(cpu_id) {
T::adjust_perf(&mut policy, min_perf, target_perf, capacity);
}
// SAFETY: The `ptr` is guaranteed to be valid by the contract with the C code for the
// lifetime of `policy`.
let policy = unsafe { Policy::from_raw_mut(ptr) };
T::adjust_perf(policy, min_perf, target_perf, capacity);
}
/// Driver's `get_intermediate` callback.

View File

@@ -415,7 +415,7 @@
*/
#define X86_FEATURE_OVERFLOW_RECOV (17*32+ 0) /* "overflow_recov" MCA overflow recovery support */
#define X86_FEATURE_SUCCOR (17*32+ 1) /* "succor" Uncorrectable error containment and recovery */
#define X86_FEATURE_CPPC_PERF_PRIO (17*32+ 2) /* CPPC Floor Perf support */
#define X86_FEATURE_SMCA (17*32+ 3) /* "smca" Scalable MCA */
/* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */