The feature of minimizing overhead of struct page associated with each
HugeTLB page aims to free its vmemmap pages (used as struct page) to save
memory, where is ~14GB/16GB per 1TB HugeTLB pages (2MB/1GB type). In
short, when a HugeTLB page is allocated or freed, the vmemmap array
representing the range associated with the page will need to be remapped.
When a page is allocated, vmemmap pages are freed after remapping. When a
page is freed, previously discarded vmemmap pages must be allocated before
remapping. More implementations and details can be found here [1].
The infrastructure of freeing vmemmap pages associated with each HugeTLB
page is already there, we can easily enable HUGETLB_PAGE_FREE_VMEMMAP for
arm64, the only thing to be fixed is flush_dcache_page() .
flush_dcache_page() need to be adapted to operate on the head page's flags
since the tail vmemmap pages are mapped with read-only after the feature
is enabled (clear operation is not permitted).
There was some discussions about this in the thread [2], but there was no
conclusion in the end. And I copied the concern proposed by Anshuman to
here and explain why those concern is superfluous. It is safe to enable
it for x86_64 as well as arm64.
1st concern:
'''
But what happens when a hot remove section's vmemmap area (which is
being teared down) is nearby another vmemmap area which is either created
or being destroyed for HugeTLB alloc/free purpose. As you mentioned
HugeTLB pages inside the hot remove section might be safe. But what about
other HugeTLB areas whose vmemmap area shares page table entries with
vmemmap entries for a section being hot removed ? Massive HugeTLB alloc
/use/free test cycle using memory just adjacent to a memory hotplug area,
which is always added and removed periodically, should be able to expose
this problem.
'''
Answer: At the time memory is removed, all HugeTLB pages either have been
migrated away or dissolved. So there is no race between memory hot remove
and free_huge_page_vmemmap(). Therefore, HugeTLB pages inside the hot
remove section is safe. Let's talk your question "what about other
HugeTLB areas whose vmemmap area shares page table entries with vmemmap
entries for a section being hot removed ?", the question is not
established. The minimal granularity size of hotplug memory 128MB (on
arm64, 4k base page), any HugeTLB smaller than 128MB is within a section,
then, there is no share PTE page tables between HugeTLB in this section
and ones in other sections and a HugeTLB page could not cross two
sections. In this case, the section cannot be freed. Any HugeTLB bigger
than 128MB (section size) whose vmemmap pages is an integer multiple of
2MB (PMD-mapped). As long as:
1) HugeTLBs are naturally aligned, power-of-two sizes
2) The HugeTLB size >= the section size
3) The HugeTLB size >= the vmemmap leaf mapping size
Then a HugeTLB will not share any leaf page table entries with *anything
else*, but will share intermediate entries. In this case, at the time
memory is removed, all HugeTLB pages either have been migrated away or
dissolved. So there is also no race between memory hot remove and
free_huge_page_vmemmap().
2nd concern:
'''
differently, not sure if ptdump would require any synchronization.
Dumping an wrong value is probably okay but crashing because a page table
entry is being freed after ptdump acquired the pointer is bad. On arm64,
ptdump() is protected against hotremove via [get|put]_online_mems().
'''
Answer: The ptdump should be fine since vmemmap_remap_free() only
exchanges PTEs or splits the PMD entry (which means allocating a PTE page
table). Both operations do not free any page tables (PTE), so ptdump
cannot run into a UAF on any page tables. The worst case is just dumping
an wrong value.
[1] https://lore.kernel.org/all/20210510030027.56044-1-songmuchun@bytedance.com/
[2] https://lore.kernel.org/all/20210518091826.36937-1-songmuchun@bytedance.com/
[songmuchun@bytedance.com: restructure the code comment inside flush_dcache_page()]
Link: https://lkml.kernel.org/r/20220414072646.21910-1-songmuchun@bytedance.com
Link: https://lkml.kernel.org/r/20220331065640.5777-2-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Tested-by: Barry Song <baohua@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Bodeddula Balasubramaniam <bodeddub@amazon.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Cc: Fam Zheng <fam.zheng@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
vmap() takes struct page *pages as one of arguments, and user may provide
an invalid pointer which may lead to corrupted translation table.
An example of such behaviour is erroneous usage of virt_to_page():
vaddr1 = dma_alloc_coherent()
page = virt_to_page() // Wrong here
...
vaddr2 = vmap(page)
memset(vaddr2) // Faulting here
virt_to_page() returns a wrong pointer if vaddr1 is not a linear kernel
address. The problem is that vmap() populates pte with bad pfn
successfully, and it's much harder to debug at memory access time. This
case should be caught by DEBUG_VIRTUAL being that enabled, but it's not
enabled in popular distros.
Kernel already checks the pages against NULL. In the case mentioned
above, however, the address is not NULL, and it's big enough so that the
hardware generated Address Size Abort on arm64:
[ 665.484101] Unhandled fault at 0xffff8000252cd000
[ 665.488807] Mem abort info:
[ 665.491617] ESR = 0x96000043
[ 665.494675] EC = 0x25: DABT (current EL), IL = 32 bits
[ 665.499985] SET = 0, FnV = 0
[ 665.503039] EA = 0, S1PTW = 0
[ 665.506167] Data abort info:
[ 665.509047] ISV = 0, ISS = 0x00000043
[ 665.512882] CM = 0, WnR = 1
[ 665.515851] swapper pgtable: 4k pages, 48-bit VAs, pgdp=00000000818cb000
[ 665.522550] [ffff8000252cd000] pgd=000000affcfff003, pud=000000affcffe003, pmd=0000008fad8c3003, pte=00688000a5217713
[ 665.533160] Internal error: level 3 address size fault: 96000043 [#1] SMP
[ 665.539936] Modules linked in: [...]
[ 665.616212] CPU: 178 PID: 13199 Comm: test Tainted: P OE 5.4.0-84-generic #94~18.04.1-Ubuntu
[ 665.626806] Hardware name: HPE Apollo 70 /C01_APACHE_MB , BIOS L50_5.13_1.0.6 07/10/2018
[ 665.636618] pstate: 80400009 (Nzcv daif +PAN -UAO)
[ 665.641407] pc : __memset+0x38/0x188
[ 665.645146] lr : test+0xcc/0x3f8
[ 665.650184] sp : ffff8000359bb840
[ 665.653486] x29: ffff8000359bb840 x28: 0000000000000000
[ 665.658785] x27: 0000000000000000 x26: 0000000000231000
[ 665.664083] x25: ffff00ae660f6110 x24: ffff00ae668cb800
[ 665.669382] x23: 0000000000000001 x22: ffff00af533e5000
[ 665.674680] x21: 0000000000001000 x20: 0000000000000000
[ 665.679978] x19: ffff00ae66950000 x18: ffffffffffffffff
[ 665.685276] x17: 00000000588636a5 x16: 0000000000000013
[ 665.690574] x15: ffffffffffffffff x14: 000000000007ffff
[ 665.695872] x13: 0000000080000000 x12: 0140000000000000
[ 665.701170] x11: 0000000000000041 x10: ffff8000652cd000
[ 665.706468] x9 : ffff8000252cf000 x8 : ffff8000252cd000
[ 665.711767] x7 : 0303030303030303 x6 : 0000000000001000
[ 665.717065] x5 : ffff8000252cd000 x4 : 0000000000000000
[ 665.722363] x3 : ffff8000252cdfff x2 : 0000000000000001
[ 665.727661] x1 : 0000000000000003 x0 : ffff8000252cd000
[ 665.732960] Call trace:
[ 665.735395] __memset+0x38/0x188
[...]
Interestingly, this abort happens even if copy_from_kernel_nofault() is
used, which is quite inconvenient for debugging purposes.
This patch adds a pfn_valid() check into vmap() path, so that invalid
mapping will not be created; WARN_ON() is used to let client code know
that something goes wrong, and it's not a regular EINVAL situation.
Link: https://lkml.kernel.org/r/20220422220410.1308706-1-yury.norov@gmail.com
Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Alexey Klimov <aklimov@redhat.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Ding Tianhong <dingtianhong@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
After commit 0e4b01df86 ("mm, memcg: throttle allocators when failing
reclaim over memory.high") allocating memory over memory.high became very
time consuming. But it's exactly what the memory.high test from cgroup
kselftests is doing: it tries to allocate 100M with 30M memory.high value.
It takes forever to complete.
In order to keep it passing (or failing) in a reasonable amount of time
let's try to allocate only a little over 30M: 31M to be precise.
With this change test_memcontrol finishes in a reasonable amount of
time:
$ time ./test_memcontrol
ok 1 test_memcg_subtree_control
ok 2 test_memcg_current
ok 3 test_memcg_min
ok 4 test_memcg_low
ok 5 test_memcg_high
ok 6 test_memcg_max
ok 7 test_memcg_oom_events
ok 8 test_memcg_swap_max
ok 9 test_memcg_sock
ok 10 test_memcg_oom_group_leaf_events
ok 11 test_memcg_oom_group_parent_events
ok 12 test_memcg_oom_group_score_events
real 0m2.273s
user 0m0.064s
sys 0m0.739s
Link: https://lkml.kernel.org/r/20220415000133.3955987-3-roman.gushchin@linux.dev
Signed-off-by: Roman Gushchin <roman.gushchin@linux.dev>
Reviewed-by: David Vernet <void@manifault.com>
Cc: Chris Down <chris@chrisdown.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
An application is suspected of having memory leak when its memory
consumption is high and keeps increasing. There are several commonly used
memory allocators: slab, cma, vmalloc, etc. The memory leak
identification can be sped up if the page information allocated by an
allocator can be analyzed separately.
This patch provides supports for memory allocator labelling for slab,
vmalloc, and cma. The pages allocated by slab and cma can be confirmed
from the "PFN" line according to the kernel codes, and the label of the
vmalloc allocator can be obtained by analyzing the stack trace. Thanks
for Vlastimil Babka's constructive suggestions.
Based on Yinan Zhang's study, the call chain of vmalloc() is vmalloc() ->
... -> __vmalloc_node_range() -> __vmalloc_area_node().
__vmalloc_area_node() requests memory through the interface of buddy
allocation system. In the current version, __vmalloc_area_node() uses
four interfaces: alloc_pages_bulk_array_mempolicy(),
alloc_pages_bulk_array_node(), alloc_pages() and alloc_pages_node(). By
disassembling the code, we find that __vmalloc_area_node() is expanded in
__vmalloc_node_range(). So __vmalloc_area_node is not in the stack trace.
On the test machine, the stack trace of pages allocated by vmalloc has the
following four forms:
__alloc_pages_bulk+0x230/0x6a0
__vmalloc_node_range+0x19c/0x598
alloc_pages_bulk_array_mempolicy+0xbc/0x278
__vmalloc_node_range+0x1e8/0x598
__alloc_pages+0x160/0x2b0
__vmalloc_node_range+0x234/0x598
alloc_pages+0xac/0x150
__vmalloc_node_range+0x44c/0x598
Therefore, in two consecutive lines of stacktrace, if the first line
contains the word "alloc_pages" and the second line contains the word
"__vmalloc_node_range", it can be determined that the page is allocated by
vmalloc. And the function offset and size are not the same on different
machines, so there is no need to match them.
At the same time, this patch updates the --cull and --sort options to
support allocator-based merge statistics and sorting. The added functions
are fully compatible with the original work. When using, you can use
"allocator", or abbreviated as "ator". Relevant updates have also been
made in the documentation(Documentation/vm/page_owner.rst).
Example:
./page_owner_sort <input> <output> --cull=st,pid,name,allocator
./page_owner_sort <input> <output> --sort=ator,pid,name
This work is coauthored by Jiajian Ye, Yinan Zhang, Shenghong Han,
Chongxi Zhao, Yuhong Feng and Yongqiang Liu.
Link: https://lkml.kernel.org/r/20220410132932.9402-1-caoyixuan2019@email.szu.edu.cn
Signed-off-by: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
Cc: Chongxi Zhao <zhaochongxi2019@email.szu.edu.cn>
Cc: Haowen Bai <baihaowen@meizu.com>
Cc: Jiajian Ye <yejiajian2018@email.szu.edu.cn>
Cc: Sean Anderson <seanga2@gmail.com>
Cc: Shenghong Han <hanshenghong2019@email.szu.edu.cn>
Cc: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
Cc: Yongqiang Liu <liuyongqiang13@huawei.com>
Cc: Yuhong Feng <yuhongf@szu.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When viewing page owner information, we may want to sort blocks of
information by multiple keys, since one single key does not uniquely
identify a block. Therefore, following adjustments are made:
1. Add a new --sort option to support sorting blocks of information by
multiple keys.
./page_owner_sort <input> <output> --sort=<order>
./page_owner_sort <input> <output> --sort <order>
<order> is a single argument in the form of a comma-separated list,
which offers a way to specify sorting order.
Sorting syntax is [+|-]key[,[+|-]key[,...]]. The ascending or descending
order can be specified by adding the + (ascending, default) or - (descend
-ing) prefix to the key:
./page_owner_sort <input> <output> [option] --sort -key1,+key2,key3...
For example, to sort the blocks first by task command name in lexicographic
order and then by pid in ascending numerical order, use the following:
./page_owner_sort <input> <output> --sort=name,+pid
To sort the blocks first by pid in ascending order and then by timestamp
of the page when it is allocated in descending order, use the following:
./page_owner_sort <input> <output> --sort=pid,-alloc_ts
2. Add explanations of a newly added --sort option in the function usage()
and the document(Documentation/vm/page_owner.rst).
This work is coauthored by
Yixuan Cao
Shenghong Han
Yinan Zhang
Chongxi Zhao
Yuhong Feng
Yongqiang Liu
Link: https://lkml.kernel.org/r/20220401024856.767-3-yejiajian2018@email.szu.edu.cn
Signed-off-by: Jiajian Ye <yejiajian2018@email.szu.edu.cn>
Cc: Chongxi Zhao <zhaochongxi2019@email.szu.edu.cn>
Cc: Shenghong Han <hanshenghong2019@email.szu.edu.cn>
Cc: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
Cc: Yongqiang Liu <liuyongqiang13@huawei.com>
Cc: Yuhong Feng <yuhongf@szu.edu.cn>
Cc: Haowen Bai <baihaowen@meizu.com>
Cc: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
When viewing page owner information, we may want to select blocks whose
PID/TGID/TASK_COMM_NAME appears in a user-specified list for data analysis
and aggregation. But currently page_owner_sort only supports selecting
blocks associated with only one specified PID/TGID/TASK_COMM_NAME.
Therefore, following adjustments are made to fix the problem:
1. Enhance selecting function to support the selection of multiple
PIDs/TGIDs/TASK_COMM_NAMEs.
The enhanced usages are as follows:
--pid <pidlist> Select by pid. This selects the blocks whose PID
numbers appear in <pidlist>.
--tgid <tgidlist> Select by tgid. This selects the blocks whose
TGID numbers appear in <tgidlist>.
--name <cmdlist> Select by task command name. This selects the
blocks whose task command name appear in <cmdlist>.
Where <pidlist>, <tgidlist>, <cmdlist> are single arguments in the form of
a comma-separated list,which offers a way to specify individual selecting
rules.
For example, if you want to select blocks whose tgids are 1, 2 or 3, you
have to use 4 commands as follows:
./page_owner_sort <input> <output1> --tgid=1
./page_owner_sort <input> <output2> --tgid=2
./page_owner_sort <input> <output3> --tgid=3
cat <output1> <output2> <output3> > <output>
With this patch, you can use only 1 command to obtain the same result as
above:
./page_owner_sort <input> <output1> --tgid=1,2,3
2. Update explanations of --pid, --tgid and --name in the function
usage() and the document(Documents/vm/page_owner.rst).
This work is coauthored by
Yixuan Cao
Shenghong Han
Yinan Zhang
Chongxi Zhao
Yuhong Feng
Yongqiang Liu
Link: https://lkml.kernel.org/r/20220401024856.767-2-yejiajian2018@email.szu.edu.cn
Signed-off-by: Jiajian Ye <yejiajian2018@email.szu.edu.cn>
Cc: Chongxi Zhao <zhaochongxi2019@email.szu.edu.cn>
Cc: Shenghong Han <hanshenghong2019@email.szu.edu.cn>
Cc: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
Cc: Yongqiang Liu <liuyongqiang13@huawei.com>
Cc: Yuhong Feng <yuhongf@szu.edu.cn>
Cc: Haowen Bai <baihaowen@meizu.com>
Cc: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Pull drm fixes from Dave Airlie:
"Another relatively quiet week, amdgpu leads the way, some i915 display
fixes, and a single sunxi fix.
amdgpu:
- Runtime pm fix
- DCN memory leak fix in error path
- SI DPM deadlock fix
- S0ix fix
amdkfd:
- GWS fix
- GWS support for CRIU
i915:
- Fix#5284: Backlight control regression on XMG Core 15 e21
- Fix black display plane on Acer One AO532h
- Two smaller display fixes
sunxi:
- Single fix removing applying PHYS_OFFSET twice"
* tag 'drm-fixes-2022-04-29' of git://anongit.freedesktop.org/drm/drm:
drm/amdgpu: keep mmhub clock gating being enabled during s2idle suspend
drm/amd/pm: fix the deadlock issue observed on SI
drm/amd/display: Fix memory leak in dcn21_clock_source_create
drm/amdgpu: don't runtime suspend if there are displays attached (v3)
drm/amdkfd: CRIU add support for GWS queues
drm/amdkfd: Fix GWS queue count
drm/sun4i: Remove obsolete references to PHYS_OFFSET
drm/i915/fbc: Consult hw.crtc instead of uapi.crtc
drm/i915: Fix SEL_FETCH_PLANE_*(PIPE_B+) register addresses
drm/i915: Check EDID for HDR static metadata when choosing blc
drm/i915: Fix DISP_POS_Y and DISP_HEIGHT defines
Pull networking fixes from Jakub Kicinski:
"Including fixes from bluetooth, bpf and netfilter.
Current release - new code bugs:
- bridge: switchdev: check br_vlan_group() return value
- use this_cpu_inc() to increment net->core_stats, fix preempt-rt
Previous releases - regressions:
- eth: stmmac: fix write to sgmii_adapter_base
Previous releases - always broken:
- netfilter: nf_conntrack_tcp: re-init for syn packets only,
resolving issues with TCP fastopen
- tcp: md5: fix incorrect tcp_header_len for incoming connections
- tcp: fix F-RTO may not work correctly when receiving DSACK
- tcp: ensure use of most recently sent skb when filling rate samples
- tcp: fix potential xmit stalls caused by TCP_NOTSENT_LOWAT
- virtio_net: fix wrong buf address calculation when using xdp
- xsk: fix forwarding when combining copy mode with busy poll
- xsk: fix possible crash when multiple sockets are created
- bpf: lwt: fix crash when using bpf_skb_set_tunnel_key() from
bpf_xmit lwt hook
- sctp: null-check asoc strreset_chunk in sctp_generate_reconf_event
- wireguard: device: check for metadata_dst with skb_valid_dst()
- netfilter: update ip6_route_me_harder to consider L3 domain
- gre: make o_seqno start from 0 in native mode
- gre: switch o_seqno to atomic to prevent races in collect_md mode
Misc:
- add Eric Dumazet to networking maintainers
- dt: dsa: realtek: remove realtek,rtl8367s string
- netfilter: flowtable: Remove the empty file"
* tag 'net-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits)
tcp: fix F-RTO may not work correctly when receiving DSACK
Revert "ibmvnic: Add ethtool private flag for driver-defined queue limits"
net: enetc: allow tc-etf offload even with NETIF_F_CSUM_MASK
ixgbe: ensure IPsec VF<->PF compatibility
MAINTAINERS: Update BNXT entry with firmware files
netfilter: nft_socket: only do sk lookups when indev is available
net: fec: add missing of_node_put() in fec_enet_init_stop_mode()
bnx2x: fix napi API usage sequence
tls: Skip tls_append_frag on zero copy size
Add Eric Dumazet to networking maintainers
netfilter: conntrack: fix udp offload timeout sysctl
netfilter: nf_conntrack_tcp: re-init for syn packets only
net: dsa: lantiq_gswip: Don't set GSWIP_MII_CFG_RMII_CLK
net: Use this_cpu_inc() to increment net->core_stats
Bluetooth: hci_sync: Cleanup hci_conn if it cannot be aborted
Bluetooth: hci_event: Fix creating hci_conn object on error status
Bluetooth: hci_event: Fix checking for invalid handle on error status
ice: fix use-after-free when deinitializing mailbox snapshot
ice: wait 5 s for EMP reset after firmware flash
ice: Protect vf_state check by cfg_lock in ice_vc_process_vf_msg()
...
Pull thermal control fixes from Rafael Wysocki:
"These take back recent chages that started to confuse users and fix up
an attr.show callback prototype in a driver.
Specifics:
- Stop warning about deprecation of the userspace thermal governor
and cooling device status interface, because there are cases in
which user space has to drive thermal management with the help of
them (Daniel Lezcano)
- Fix attr.show callback prototype in the int340x thermal driver
(Kees Cook)"
* tag 'thermal-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
thermal/governor: Remove deprecated information
Revert "thermal/core: Deprecate changing cooling device state from userspace"
thermal: int340x: Fix attr.show callback prototype
Pull power management fixes from Rafael Wysocki:
"These fix up recent intel_idle driver changes and fix some ARM cpufreq
driver issues.
Specifics:
- Fix issues with the Qualcomm's cpufreq driver (Dmitry Baryshkov,
Vladimir Zapolskiy).
- Fix memory leak with the Sun501 driver (Xiaobing Luo).
- Make intel_idle enable C1E promotion on all CPUs when C1E is
preferred to C1 (Artem Bityutskiy).
- Make C6 optimization on Sapphire Rapids added recently work as
expected if both C1E and C1 are "preferred" (Artem Bityutskiy)"
* tag 'pm-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
intel_idle: Fix SPR C6 optimization
intel_idle: Fix the 'preferred_cstates' module parameter
cpufreq: qcom-cpufreq-hw: Clear dcvs interrupts
cpufreq: fix memory leak in sun50i_cpufreq_nvmem_probe
cpufreq: qcom-cpufreq-hw: Fix throttle frequency value on EPSS platforms
cpufreq: qcom-hw: provide online/offline operations
cpufreq: qcom-hw: fix the opp entries refcounting
cpufreq: qcom-hw: fix the race between LMH worker and cpuhp
cpufreq: qcom-hw: drop affinity hint before freeing the IRQ
Pull ACPI fixes from Rafael WysockiL
"These fix up the ACPI processor driver after a change made during the
5.16 cycle that inadvertently broke falling back to shallower C-states
when C3 cannot be used.
Specifics:
- Make the ACPI processor driver avoid falling back to C3 type of
C-states when C3 cannot be requested (Ville Syrjälä)
- Revert a quirk that is not necessary any more after fixing the
underlying issue properly (Ville Syrjälä)"
* tag 'acpi-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPI: processor: idle: fix lockup regression on 32-bit ThinkPad T40"
ACPI: processor: idle: Avoid falling back to C3 type C-states
Pull x86 platform driver fixes from Hans de Goede:
"Highlights:
- asus-wmi bug-fixes
- intel-sdsu bug-fixes
- build (warning) fixes
- couple of hw-id additions"
* tag 'platform-drivers-x86-v5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86/intel: pmc/core: change pmc_lpm_modes to static
platform/x86/intel/sdsi: Fix bug in multi packet reads
platform/x86/intel/sdsi: Poll on ready bit for writes
platform/x86/intel/sdsi: Handle leaky bucket
platform/x86: intel-uncore-freq: Prevent driver loading in guests
platform/x86: gigabyte-wmi: added support for B660 GAMING X DDR4 motherboard
platform/x86: dell-laptop: Add quirk entry for Latitude 7520
platform/x86: asus-wmi: Fix driver not binding when fan curve control probe fails
platform/x86: asus-wmi: Potential buffer overflow in asus_wmi_evaluate_method_buf()
tools/power/x86/intel-speed-select: fix build failure when using -Wl,--as-needed
Pull regulator fix from Mark Brown:
"A minor fix for the DT binding documentation of the rt5190a driver"
* tag 'regulator-fix-v5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: dt-bindings: Revise the rt5190a buck/ldo description