Pull more devicetree updates from Rob Herring:
- First part of DT header detangling dropping cpu.h from of_device.h
and replacing some includes with forward declarations. A handful of
drivers needed some adjustment to their includes as a result.
- Refactor of_device.h to be used by bus drivers rather than various
device drivers. This moves non-bus related functions out of
of_device.h. The end goal is for of_platform.h and of_device.h to
stop including each other.
- Refactor open coded parsing of "ranges" in some bus drivers to use DT
address parsing functions
- Add some new address parsing functions of_property_read_reg(),
of_range_count(), and of_range_to_resource() in preparation to
convert more open coded parsing of DT addresses to use them.
- Treewide clean-ups to use of_property_read_bool() and
of_property_present() as appropriate. The ones here are the ones that
didn't get picked up elsewhere.
* tag 'devicetree-for-6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (34 commits)
bus: tegra-gmi: Replace of_platform.h with explicit includes
hte: Use of_property_present() for testing DT property presence
w1: w1-gpio: Use of_property_read_bool() for boolean properties
virt: fsl: Use of_property_present() for testing DT property presence
soc: fsl: Use of_property_present() for testing DT property presence
sbus: display7seg: Use of_property_read_bool() for boolean properties
sparc: Use of_property_read_bool() for boolean properties
sparc: Use of_property_present() for testing DT property presence
bus: mvebu-mbus: Remove open coded "ranges" parsing
of/address: Add of_property_read_reg() helper
of/address: Add of_range_count() helper
of/address: Add support for 3 address cell bus
of/address: Add of_range_to_resource() helper
of: unittest: Add bus address range parsing tests
of: Drop cpu.h include from of_device.h
OPP: Adjust includes to remove of_device.h
irqchip: loongson-eiointc: Add explicit include for cpuhotplug.h
cpuidle: Adjust includes to remove of_device.h
cpufreq: sun50i: Add explicit include for cpu.h
cpufreq: Adjust includes to remove of_device.h
...
Add support for DLVR (Digital Linear Voltage Regulator) attributes,
which can be used to control RFIM.
Here instead of "fivr" another directory "dlvr" is created with DLVR
attributes:
/sys/bus/pci/devices/0000:00:04.0/dlvr
├── dlvr_freq_mhz
├── dlvr_freq_select
├── dlvr_hardware_rev
├── dlvr_pll_busy
├── dlvr_rfim_enable
└── dlvr_spread_spectrum_pct
└── dlvr_control_mode
└── dlvr_control_lock
Attributes
dlvr_freq_mhz (RO):
Current DLVR PLL frequency in MHz.
dlvr_freq_select (RW):
Sets DLVR PLL clock frequency.
dlvr_hardware_rev (RO):
DLVR hardware revision.
dlvr_pll_busy (RO):
PLL can't accept frequency change when set.
dlvr_rfim_enable (RW):
0: Disable RF frequency hopping, 1: Enable RF frequency hopping.
dlvr_control_mode (RW):
Specifies how frequencies are spread. 0: Down spread, 1: Spread in Center.
dlvr_control_lock (RW):
1: future writes are ignored.
dlvr_spread_spectrum_pct (RW)
A write to this register updates the DLVR spread spectrum percent value.
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Pull more thermal control changes for 6.4-rc1 from Daniel Lezcano:
"- Do preparating cleaning and DT bindings for RK3588 support
(Sebastian Reichel)
- Add driver support for RK3588 (Finley Xiao)
- Use devm_reset_control_array_get_exclusive() for the Rockchip driver
(Ye Xingchen)
- Detect power gated thermal zones and return -EAGAIN when reading the
temperature (Mikko Perttunen)
- Remove thermal_bind_params structure as it is unused (Zhang Rui)
- Drop unneeded quotes in DT bindings allowing to run yamllint (Rob
Herring)
- Update the power allocator documentation according to the thermal
trace relocation (Lukas Bulwahn)
- Fix sensor 1 interrupt status bitmask for the Mediatek LVTS sensor
(Chen-Yu Tsai)
- Use the dev_err_probe() helper in the Amlogic driver (Ye Xingchen)
- Add AP domain support to LVTS thermal controllers for mt8195
(Balsam CHIHI)
- Remove buggy call to thermal_of_zone_unregister() (Daniel Lezcano)
- Make thermal_of_zone_[un]register() private to the thermal OF code
(Daniel Lezcano)
- Create a private copy of the thermal zone device parameters
structure when registering a thermal zone (Daniel Lezcano)"
* tag 'thermal-v6.4-rc1-2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
thermal/core: Alloc-copy-free the thermal zone parameters structure
thermal/of: Unexport unused OF functions
thermal/drivers/bcm2835: Remove buggy call to thermal_of_zone_unregister
thermal/drivers/mediatek/lvts_thermal: Add AP domain for mt8195
dt-bindings: thermal: mediatek: Add AP domain to LVTS thermal controllers for mt8195
thermal: amlogic: Use dev_err_probe()
thermal/drivers/mediatek/lvts_thermal: Fix sensor 1 interrupt status bitmask
MAINTAINERS: adjust entry in THERMAL/POWER_ALLOCATOR after header movement
dt-bindings: thermal: Drop unneeded quotes
thermal/core: Remove thermal_bind_params structure
thermal/drivers/tegra-bpmp: Handle offline zones
thermal/drivers/rockchip: use devm_reset_control_array_get_exclusive()
dt-bindings: rockchip-thermal: Support the RK3588 SoC compatible
thermal/drivers/rockchip: Support RK3588 SoC in the thermal driver
thermal/drivers/rockchip: Support dynamic sized sensor array
thermal/drivers/rockchip: Simplify channel id logic
thermal/drivers/rockchip: Use dev_err_probe
thermal/drivers/rockchip: Simplify clock logic
thermal/drivers/rockchip: Simplify getting match data
Some older processors don't allow BIT(13) and BIT(15) in the current
mask set by "THERM_STATUS_CLEAR_CORE_MASK". This results in:
unchecked MSR access error: WRMSR to 0x19c (tried to
write 0x000000000000aaa8) at rIP: 0xffffffff816f66a6
(throttle_active_work+0xa6/0x1d0)
To avoid unchecked MSR issues, check CPUID for each relevant feature and
use that information to set the supported feature bits only in the
"clear" mask for cores. Do the same for the analogous package mask set
by "THERM_STATUS_CLEAR_PKG_MASK".
Introduce functions thermal_intr_init_core_clear_mask() and
thermal_intr_init_pkg_clear_mask() to set core and package mask bits,
respectively. These functions are called during initialization.
Fixes: 6fe1e64b60 ("thermal: intel: Prevent accidental clearing of HFI status")
Reported-by: Rui Salvaterra <rsalvaterra@gmail.com>
Link: https://lore.kernel.org/lkml/cdf43fb423368ee3994124a9e8c9b4f8d00712c6.camel@linux.intel.com/T/
Tested-by: Rui Salvaterra <rsalvaterra@gmail.com>
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 6.2+ <stable@kernel.org> # 6.2+
[ rjw: Renamed 2 funtions and 2 static variables, edited subject and
changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The caller of the function thermal_zone_device_register_with_trips()
can pass a thermal_zone_params structure parameter.
This one is used by the thermal core code until the thermal zone is
destroyed. That forces the caller, so the driver, to keep the pointer
valid until it unregisters the thermal zone if we want to make the
thermal zone device structure private the core code.
As the thermal zone device structure would be private, the driver can
not access to thermal zone device structure to retrieve the tzp field
after it passed it to register the thermal zone.
So instead of forcing the users of the function to deal with the tzp
structure life cycle, make the usage easier by allocating our own
thermal zone params, copying the parameter content and by freeing at
unregister time. The user can then create the parameters on the stack,
pass it to the registering function and forget about it.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230404075138.2914680-3-daniel.lezcano@linaro.org
The driver is using the devm_thermal_of_zone_device_register().
In the error path of the function calling
devm_thermal_of_zone_device_register(), the function
devm_thermal_of_zone_unregister() should be called instead of
thermal_of_zone_unregister(), otherwise this one will be called twice
when the device is freed.
The same happens for the remove function where the devm_ guarantee the
thermal_of_zone_unregister() will be called, so adding this call in
the remove function will lead to a double free also.
Use devm_ variant in the error path of the probe function.
Remove thermal_of_zone_unregister() in the remove function.
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ray Jui <rjui@broadcom.com>
Cc: Scott Branden <sbranden@broadcom.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230404075138.2914680-1-daniel.lezcano@linaro.org
The binary representation for sensor 1 interrupt status was incorrectly
assembled, when compared to the full table given in the same comment
section. The conversion into hex was also incorrect, leading to
incorrect interrupt status bitmask for sensor 1. This would cause the
driver to incorrectly identify changes for sensor 1, when in fact it
was sensor 0, or a sensor access time out.
Fix the binary and hex representations in the comments, and the actual
bitmask macro.
Fixes: f5f633b182 ("thermal/drivers/mediatek: Add the Low Voltage Thermal Sensor driver")
Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230328031017.1360976-1-wenst@chromium.org
Thermal zones located in power domains may not be accessible when
the domain is powergated. In this situation, reading the temperature
will return -BPMP_EFAULT. When evaluating trips, BPMP will internally
use -256C as the temperature for offline zones.
For smooth operation, for offline zones, return -EAGAIN when reading
the temperature and allow registration of zones even if they are
offline during probe.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Link: https://lore.kernel.org/r/20230330094904.2589428-1-cyndis@kapsi.fi
Pull thermal control material for 6.4-rc1 from Daniel Lezcano:
"- Add more thermal zone device encapsulation: prevent setting
structure field directly, access the sensor device instead the
thermal zone's device for trace, relocate the traces in
drivers/thermal (Daniel Lezcano)
- Use the generic trip point for the i.MX and remove the get_trip_temp
ops (Daniel Lezcano)
- Use the devm_platform_ioremap_resource() in the Hisilicon driver
(Yang Li)
- Remove R-Car H3 ES1.* handling as public has only access to the ES2
version and the upstream support for the ES1 has been shutdown (Wolfram
Sang)
- Add a delay after initializing the bank in order to let the time to
the hardware to initialze itself before reading the temperature
(Amjad Ouled-Ameur)
- Add MT8365 support (Amjad Ouled-Ameur)"
* tag 'thermal-v6.4-rc1-1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/thermal/linux:
thermal/drivers/ti: Use fixed update interval
thermal/drivers/stm: Don't set no_hwmon to false
thermal/drivers/db8500: Use driver dev instead of tz->device
thermal/core: Relocate the traces definition in thermal directory
thermal/drivers/hisi: Use devm_platform_ioremap_resource()
thermal/drivers/imx: Use the thermal framework for the trip point
thermal/drivers/imx: Remove get_trip_temp ops
thermal/drivers/rcar_gen3_thermal: Remove R-Car H3 ES1.* handling
thermal/drivers/mediatek: Add delay after thermal banks initialization
thermal/drivers/mediatek: Add support for MT8365 SoC
thermal/drivers/mediatek: Control buffer enablement tweaks
dt-bindings: thermal: mediatek: Add binding documentation for MT8365 SoC
Once thermal_list_lock has been acquired in
__thermal_cooling_device_register(), it is not necessary to drop it
and take it again until all of the thermal zones have been updated,
so change the code accordingly.
No expected functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Currently the TI thermal driver sets the sensor update interval based
on the polling of the thermal zone. In order to get the polling rate,
the code inspects the thermal zone device structure internals, thus
breaking the self-encapsulation of the thermal framework core
framework.
On the other side, we see the common polling rates set in the device
tree for the platforms using this driver are 500 or 1000 ms.
Setting the polling rate to 250 ms would be far enough to cover the
combination we found in the device tree.
Instead of accessing the thermal zone device structure polling rate,
let's use a common update interval of 250 ms for the driver.
Cc: Keerthy <j-keerthy@ti.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Dhruva Gole <d-gole@ti.com>
Acked-by: Keerthy <j-keerthy@ti.com>
Link: https://lore.kernel.org/r/20230307133735.90772-7-daniel.lezcano@linaro.org
The traces are exported but only local to the thermal core code. On
the other side, the traces take the thermal zone device structure as
argument, thus they have to rely on the exported thermal.h header
file. As we want to move the structure to the private thermal core
header, first we have to relocate those traces to the same place as
many drivers do.
Cc: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20230307133735.90772-2-daniel.lezcano@linaro.org
The thermal framework provides an API to get the trip related to a
trip point id. We want to consolidate the generic trip points code,
thus preventing the different drivers to deal with the trip points
after they registered them.
The set_trip_temp ops will be changed regarding the above changes but
first we need to rework a bit the different implementation in the
drivers.
The goal is to prevent using the trip id but use a trip point passed
as parameter which will contain all the needed information.
As we don't have the trip point passed as parameter yet, we get the
trip point using the generic trip thermal framewrok APIs and use it to
take exactly the same decisions.
The difference with this change and the previous code is from where we
get the thermal trip point (which is the same).
No functional change intended.
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Reviewed-by: Fabio Estevam <festevam@gmail.com>
Link: https://lore.kernel.org/r/20230309092821.1590586-2-daniel.lezcano@linaro.org
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
When cpumask is specified as a module parameter the value is
overwritten by the module init routine. This can easily be fixed
by checking to see if the mask has already been allocated in the
init routine.
When max_idle is specified as a module parameter a panic will occur.
The problem is that the idle_injection_cpu_mask is not allocated until
the module init routine executes. This can easily be fixed by allocating
the cpumask if it's not already allocated.
Fixes: ebf5197102 ("thermal: intel: powerclamp: Add two module parameters")
Signed-off-by: David Arcari <darcari@redhat.com>
Reviewed-by: Srinivas Pandruvada<srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
It is preferred to use typed property access functions (i.e.
of_property_read_<type> functions) rather than low-level
of_get_property/of_find_property functions for reading properties. As
part of this, convert of_get_property/of_find_property calls to the
recently added of_property_present() helper when we just want to test
for presence of a property and nothing more.
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 52f04f10b9 ("thermal: intel: int340x: processor_thermal: Fix
deadlock") addressed deadlock issue during user space trip update. But it
missed a case when thermal zone device is disabled when user writes 0.
Call to thermal_zone_device_disable() also causes deadlock as it also
tries to lock tz->lock, which is already claimed by trip_point_temp_store()
in the thermal core code.
Remove call to thermal_zone_device_disable() in the function
sys_set_trip_temp(), which is called from trip_point_temp_store().
Fixes: 52f04f10b9 ("thermal: intel: int340x: processor_thermal: Fix deadlock")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Cc: 6.2+ <stable@vger.kernel.org> # 6.2+
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
When the hwmon device node of a thermal zone device is not found,
using hwmon->device causes a kernel NULL pointer dereference.
Fixes: dec07d399c ("thermal: Don't use 'device' internal thermal zone structure field")
Reported-by: Preble Adam C <adam.c.preble@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
The lockdep_assert_held() calls added to cooling_device_stats_setup()
and cooling_device_stats_destroy() by commit 790930f442 ("thermal:
core: Introduce thermal_cooling_device_update()") trigger false-positive
lockdep reports in code paths that are not subject to race conditions
(before cooling device registration and after cooling device removal).
For this reason, remove the lockdep_assert_held() calls from both
cooling_device_stats_setup() and cooling_device_stats_destroy() and
add one to thermal_cooling_device_stats_reinit() that has to be called
under the cdev lock.
Fixes: 790930f442 ("thermal: core: Introduce thermal_cooling_device_update()")
Link: https://lore.kernel.org/linux-acpi/ZCIDTLFt27Ei7+V6@ideak-desk.fi.intel.com
Reported-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Commit 7c3d5c20dc ("thermal/core: Add a generic thermal_zone_get_trip()
function") stopped marking trip points with a zero temperature as
disabled, behavior that was originally introduced in commit 81ad4276b5
("Thermal: Ignore invalid trip points").
When using the mlxsw driver we see that when such trip points are not
disabled, the thermal subsystem repeatedly tries to set the state of the
associated cooling devices to the maximum state.
Address this by restoring the original behavior and mark trip points
with a zero temperature as disabled.
Fixes: 7c3d5c20dc ("thermal/core: Add a generic thermal_zone_get_trip() function")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Introduce a core thermal API function, thermal_cooling_device_update(),
for updating the max_state value for a cooling device and rearranging
its statistics in sysfs after a possible change of its ->get_max_state()
callback return value.
That callback is now invoked only once, during cooling device
registration, to populate the max_state field in the cooling device
object, so if its return value changes, it needs to be invoked again
and the new return value needs to be stored as max_state. Moreover,
the statistics presented in sysfs need to be rearranged in general,
because there may not be enough room in them to store data for all
of the possible states (in the case when max_state grows).
The new function takes care of that (and some other minor things
related to it), but some extra locking and lockdep annotations are
added in several places too to protect against crashes in the cases
when the statistics are not present or when a stale max_state value
might be used by sysfs attributes.
Note that the actual user of the new function will be added separately.
Link: https://lore.kernel.org/linux-pm/53ec1f06f61c984100868926f282647e57ecfb2d.camel@intel.com/
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Introduce a helper function, thermal_cooling_device_present(), for
checking if the given cooling device is in the list of registered
cooling devices to avoid some code duplication in a subsequent
patch.
No expected functional impact.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
When setting a trip point temperature from sysfs, there is an upper
bound check on the user input, but no lower bound check.
As hardware register has 7 bits for a trip point temperature, the offset
to tj_max of the input temperature must be equal to/less than 0x7f.
Or else,
1. bogus temperature is updated into the trip temperature bits.
2. the upper bits of the register can be polluted.
For example,
$ rdmsr 0x1b2
2000003
$ echo -180000 > /sys/class/thermal/thermal_zone1/trip_point_1_temp
$ rdmsr 0x1b2
3980003
Not only the trip point temp is set to 76C on this platform (tj_max is
100), the Power Notification (Bit 24) is also enabled erronously.
Fix the problem by adding lower bound check for sysfs input.
Reported-by: Dan Carpenter <error27@gmail.com>
Link: https://lore.kernel.org/all/add7a378-4d50-4ba1-81d3-a0c17db25a0b@kili.mountain/
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>