Vishal Moola (Oracle)
bf2d4334f7
mm: add utility functions for ptdesc
...
Introduce utility functions setting the foundation for ptdescs. These
will also assist in the splitting out of ptdesc from struct page.
Functions that focus on the descriptor are prefixed with ptdesc_* while
functions that focus on the pagetable are prefixed with pagetable_*.
pagetable_alloc() is defined to allocate new ptdesc pages as compound
pages. This is to standardize ptdescs by allowing for one allocation and
one free function, in contrast to 2 allocation and 2 free functions.
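The single allocate/free pairing described above can be illustrated with a small userspace sketch. Everything here is hypothetical (struct ptdesc_sketch, the *_sketch helpers, SKETCH_PAGE_SIZE, SKETCH_MAX_ORDER) and only models the idea that the descriptor records its own order, so one free function can serve every allocation size. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_MAX_ORDER 10U

/* Hypothetical stand-in for a page table descriptor (not struct ptdesc). */
struct ptdesc_sketch {
	unsigned int order;	/* recorded at allocation time */
	void *table;		/* zeroed page table memory */
};

/* One allocation entry point for every order. */
static inline struct ptdesc_sketch *pagetable_alloc_sketch(unsigned int order)
{
	struct ptdesc_sketch *pt;
	size_t size = SKETCH_PAGE_SIZE << order;

	assert(order <= SKETCH_MAX_ORDER);
	pt = malloc(sizeof(*pt));
	if (!pt)
		return NULL;
	pt->table = aligned_alloc(SKETCH_PAGE_SIZE, size);
	if (!pt->table) {
		free(pt);
		return NULL;
	}
	memset(pt->table, 0, size);
	pt->order = order;
	return pt;
}

/* Size in bytes, derived from the order stored in the descriptor. */
static inline size_t pagetable_size_sketch(const struct ptdesc_sketch *pt)
{
	return SKETCH_PAGE_SIZE << pt->order;
}

/* One free entry point: the caller does not pass the order back in. */
static inline void pagetable_free_sketch(struct ptdesc_sketch *pt)
{
	if (!pt)
		return;
	free(pt->table);
	free(pt);
}
```

Because the order travels with the descriptor, callers of the free helper do not need to remember how the table was allocated.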
Link: https://lkml.kernel.org/r/20230807230513.102486-4-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com >
Cc: Arnd Bergmann <arnd@arndb.de >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com >
Cc: Dave Hansen <dave.hansen@linux.intel.com >
Cc: David Hildenbrand <david@redhat.com >
Cc: "David S. Miller" <davem@davemloft.net >
Cc: Dinh Nguyen <dinguyen@kernel.org >
Cc: Geert Uytterhoeven <geert@linux-m68k.org >
Cc: Geert Uytterhoeven <geert+renesas@glider.be >
Cc: Guo Ren <guoren@kernel.org >
Cc: Huacai Chen <chenhuacai@kernel.org >
Cc: Hugh Dickins <hughd@google.com >
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de >
Cc: Jonas Bonn <jonas@southpole.se >
Cc: Matthew Wilcox <willy@infradead.org >
Cc: Mike Rapoport (IBM) <rppt@kernel.org >
Cc: Palmer Dabbelt <palmer@rivosinc.com >
Cc: Paul Walmsley <paul.walmsley@sifive.com >
Cc: Richard Weinberger <richard@nod.at >
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de >
Cc: Yoshinori Sato <ysato@users.sourceforge.jp >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:52 -07:00
Vishal Moola (Oracle)
9a35de4ffc
pgtable: create struct ptdesc
...
Currently, page table information is stored within struct page. As part
of simplifying struct page, create struct ptdesc for page table
information.
Link: https://lkml.kernel.org/r/20230807230513.102486-3-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com >
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org >
Cc: Arnd Bergmann <arnd@arndb.de >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com >
Cc: Dave Hansen <dave.hansen@linux.intel.com >
Cc: David Hildenbrand <david@redhat.com >
Cc: "David S. Miller" <davem@davemloft.net >
Cc: Dinh Nguyen <dinguyen@kernel.org >
Cc: Geert Uytterhoeven <geert@linux-m68k.org >
Cc: Geert Uytterhoeven <geert+renesas@glider.be >
Cc: Guo Ren <guoren@kernel.org >
Cc: Huacai Chen <chenhuacai@kernel.org >
Cc: Hugh Dickins <hughd@google.com >
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de >
Cc: Jonas Bonn <jonas@southpole.se >
Cc: Matthew Wilcox <willy@infradead.org >
Cc: Palmer Dabbelt <palmer@rivosinc.com >
Cc: Paul Walmsley <paul.walmsley@sifive.com >
Cc: Richard Weinberger <richard@nod.at >
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de >
Cc: Yoshinori Sato <ysato@users.sourceforge.jp >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:51 -07:00
Vishal Moola (Oracle)
f7bda0d85d
mm: add PAGE_TYPE_OP folio functions
...
Patch series "Split ptdesc from struct page", v9.
The MM subsystem is trying to shrink struct page. This patchset
introduces a memory descriptor for page table tracking - struct ptdesc.
This patchset introduces ptdesc, splits ptdesc from struct page, and
converts many callers of page table constructor/destructors to use
ptdescs.
Ptdesc is a foundation to further standardize page tables, and eventually
allow for dynamic allocation of page tables independent of struct page.
However, the use of pages for page table tracking is quite deeply
ingrained and varied across architectures, so there is still a lot of
work to be done before that can happen.
This patch (of 31):
No folio equivalents for page type operations have been defined, so define
them for later folio conversions.
Also changes the Page##uname macros to take in const struct page* since we
only read the memory here.
Link: https://lkml.kernel.org/r/20230807230513.102486-1-vishal.moola@gmail.com
Link: https://lkml.kernel.org/r/20230807230513.102486-2-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com >
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org >
Cc: Arnd Bergmann <arnd@arndb.de >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Claudio Imbrenda <imbrenda@linux.ibm.com >
Cc: Dave Hansen <dave.hansen@linux.intel.com >
Cc: David Hildenbrand <david@redhat.com >
Cc: "David S. Miller" <davem@davemloft.net >
Cc: Dinh Nguyen <dinguyen@kernel.org >
Cc: Geert Uytterhoeven <geert@linux-m68k.org >
Cc: Huacai Chen <chenhuacai@kernel.org >
Cc: Hugh Dickins <hughd@google.com >
Cc: Jonas Bonn <jonas@southpole.se >
Cc: Matthew Wilcox <willy@infradead.org >
Cc: Paul Walmsley <paul.walmsley@sifive.com >
Cc: Richard Weinberger <richard@nod.at >
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de >
Cc: Yoshinori Sato <ysato@users.sourceforge.jp >
Cc: Geert Uytterhoeven <geert+renesas@glider.be >
Cc: Guo Ren <guoren@kernel.org >
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de >
Cc: Palmer Dabbelt <palmer@rivosinc.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:51 -07:00
Aneesh Kumar K.V
1a8c64e110
mm/memory_hotplug: embed vmem_altmap details in memory block
...
With memmap on memory, some architectures need more details about the
altmap, such as base_pfn and end_pfn, to unmap vmemmap memory. Instead of
computing them again when we remove a memory block, embed vmem_altmap
details in struct memory_block if the memory block uses the memmap on
memory feature.
[yangyingliang@huawei.com: fix error return code in add_memory_resource()]
Link: https://lkml.kernel.org/r/20230809081552.1351184-1-yangyingliang@huawei.com
Link: https://lkml.kernel.org/r/20230808091501.287660-7-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com >
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com >
Acked-by: Michal Hocko <mhocko@suse.com >
Acked-by: David Hildenbrand <david@redhat.com >
Cc: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Oscar Salvador <osalvador@suse.de >
Cc: Vishal Verma <vishal.l.verma@intel.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:49 -07:00
Aneesh Kumar K.V
e3c2bfdd33
mm/memory_hotplug: allow memmap on memory hotplug request to fallback
...
If not supported, fall back to not using memmap on memory. This avoids
the need for callers to do the fallback.
Link: https://lkml.kernel.org/r/20230808091501.287660-3-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com >
Acked-by: Michal Hocko <mhocko@suse.com >
Acked-by: David Hildenbrand <david@redhat.com >
Cc: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Oscar Salvador <osalvador@suse.de >
Cc: Vishal Verma <vishal.l.verma@intel.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:48 -07:00
Kefeng Wang
3f32c49ed6
mm: memtest: convert to memtest_report_meminfo()
...
It is better not to expose too many internal variables of memtest.
Add a helper, memtest_report_meminfo(), to show memtest results.
Link: https://lkml.kernel.org/r/20230808033359.174986-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com >
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org >
Cc: Matthew Wilcox <willy@infradead.org >
Cc: Tomas Mudrunka <tomas.mudrunka@gmail.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:47 -07:00
Mateusz Guzik
9a9d0b8299
mm: move dummy_vm_ops out of a header
...
Otherwise the kernel ends up with multiple copies:
$ nm vmlinux | grep dummy_vm_ops
ffffffff81e4ea00 d dummy_vm_ops.2
ffffffff81e11760 d dummy_vm_ops.254
ffffffff81e406e0 d dummy_vm_ops.4
ffffffff81e3c780 d dummy_vm_ops.7
While here prefix it with vma_.
Link: https://lkml.kernel.org/r/20230806231611.1395735-1-mjguzik@gmail.com
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com >
Cc: Matthew Wilcox <willy@infradead.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:46 -07:00
Suren Baghdasaryan
60081bf19b
mm: lock vma explicitly before doing vm_flags_reset and vm_flags_reset_once
...
Implicit vma locking inside vm_flags_reset() and vm_flags_reset_once() is
not obvious and makes it hard to understand where vma locking is happening.
Also in some cases (like in dup_userfaultfd()) vma should be locked earlier
than vma_flags modification. To make locking more visible, change these
functions to assert that the vma write lock is taken and explicitly lock
the vma beforehand. Fix userfaultfd functions which should lock the vma
earlier.
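The locking contract described above can be sketched in userspace as follows. The struct and the *_sketch helpers are hypothetical stand-ins, not the kernel API. The point is only that the flag-reset helper asserts the write lock instead of taking it, so the caller's lock call is visible at the call site.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a VMA with a write-lock marker. */
struct vma_sketch {
	bool write_locked;
	unsigned long vm_flags;
};

/* Explicit lock step, done by the caller. */
static inline void vma_start_write_sketch(struct vma_sketch *vma)
{
	vma->write_locked = true;
}

/* Asserts the lock rather than taking it implicitly. */
static inline void vm_flags_reset_sketch(struct vma_sketch *vma,
					 unsigned long flags)
{
	assert(vma->write_locked);
	vma->vm_flags = flags;
}
```

A caller would invoke vma_start_write_sketch() first and vm_flags_reset_sketch() afterwards; skipping the first call trips the assertion.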
Link: https://lkml.kernel.org/r/20230804152724.3090321-5-surenb@google.com
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org >
Signed-off-by: Suren Baghdasaryan <surenb@google.com >
Cc: Jann Horn <jannh@google.com >
Cc: Liam R. Howlett <Liam.Howlett@oracle.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:46 -07:00
Suren Baghdasaryan
ce2fc5fffd
mm: for !CONFIG_PER_VMA_LOCK equate write lock assertion for vma and mmap
...
When CONFIG_PER_VMA_LOCK=n, vma_assert_write_locked() should be equivalent
to mmap_assert_write_locked().
Link: https://lkml.kernel.org/r/20230804152724.3090321-3-surenb@google.com
Suggested-by: Jann Horn <jannh@google.com >
Signed-off-by: Suren Baghdasaryan <surenb@google.com >
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com >
Cc: Linus Torvalds <torvalds@linuxfoundation.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:45 -07:00
SeongJae Park
17e7c724d3
mm/damon/core: implement target type damos filter
...
One DAMON context can have multiple monitoring targets, and DAMOS schemes
are applied to all targets. In some cases, users need to apply different
schemes to different targets. Retrieving monitoring results via the DAMON
sysfs interface's 'tried_regions' directory is one good example.
Also, there could be cases where the cgroup DAMOS filter is not enough. All
such use cases can be worked around by having multiple DAMON contexts
that each have only a single target, but that is inefficient in terms of
resource usage, though the overhead is not estimated to be huge.
Implement a DAMON monitoring target based DAMOS filter for this case. Like
the address range DAMOS filter, handle these filters in the DAMON core
layer, since that is more efficient than doing so in the operations set
layer. This also means that regions that are filtered out by monitoring
target type DAMOS filters are counted as not tried by the scheme. Hence,
target granularity monitoring results retrieval via the DAMON sysfs
interface becomes available.
Link: https://lkml.kernel.org/r/20230802214312.110532-9-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org >
Cc: Brendan Higgins <brendanhiggins@google.com >
Cc: Jonathan Corbet <corbet@lwn.net >
Cc: Shuah Khan <shuah@kernel.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:37 -07:00
SeongJae Park
ab9bda001b
mm/damon/core: introduce address range type damos filter
...
Patch series "Extend DAMOS filters for address ranges and DAMON monitoring
targets"
There are use cases that need to apply DAMOS schemes to specific address
ranges or DAMON monitoring targets. NUMA nodes in the physical address
space, special memory objects in the virtual address space, and monitoring
target specific efficient monitoring results snapshot retrieval could be
examples of such use cases. This patchset extends the DAMOS filters
feature for such cases by implementing two more filter types, namely
address ranges and DAMON monitoring targets.
Patches sequence
----------------
The first seven patches are for the address ranges based DAMOS filter.
The first patch implements the filter feature and exposes it via the DAMON
kernel API. The second patch further exposes the feature to users via the
DAMON sysfs interface. The third and fourth patches implement unit tests
and selftests for the feature. Three patches (fifth to seventh) updating
the documents follow.
The following six patches are for the DAMON monitoring target based DAMOS
filter. The eighth patch implements the feature in the core layer and
exposes it via DAMON's kernel API. The ninth patch further exposes it to
users via the DAMON sysfs interface. The tenth patch adds a selftest, and
two patches (eleventh and twelfth) update documents.
[1] https://lore.kernel.org/damon/20230728203444.70703-1-sj@kernel.org/
This patch (of 13):
Users can know special characteristics of specific address ranges. NUMA
nodes or special objects or buffers in virtual address space could be such
examples. For such cases, DAMOS schemes could be required to be applied to
only specific address ranges. Implement yet another type of DAMOS filter
for the purpose.
Note that the existing filter types, namely the anon pages and memcg DAMOS
filters, need a page level type check. Because such a check can be done
efficiently in the operations set layer, those filters are handled in the
operations set layer. Specifically, only the paddr operations set
implementation supports these filters. Also, because statistics counting
is done in the DAMON core layer, the regions that are filtered out by these
filters are counted as tried but failed in the statistics.
Unlike those, address range based filters can be handled efficiently in the
core layer. Hence, do the handling in that layer, and count the regions
that are filtered out by those filters as not tried by the scheme.
This difference should be clearly documented.
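The core-layer accounting described above can be sketched as follows. All types and names are hypothetical, and partially overlapping regions are simply treated as not matching, which is a simplification and not a statement about how the kernel handles them. The sketch shows only that a region filtered out before the scheme runs never reaches the "tried" counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct addr_range_sketch {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
};

struct region_sketch {
	struct addr_range_sketch ar;
};

struct addr_filter_sketch {
	struct addr_range_sketch range;
	bool matching;	/* true: filter out regions inside the range */
};

/* A region matches only if it lies fully inside the filter range. */
static inline bool region_in_range_sketch(const struct region_sketch *r,
					  const struct addr_range_sketch *range)
{
	return range->start <= r->ar.start && r->ar.end <= range->end;
}

static inline bool filter_out_sketch(const struct addr_filter_sketch *f,
				     const struct region_sketch *r)
{
	assert(f->range.start <= f->range.end);
	return region_in_range_sketch(r, &f->range) == f->matching;
}

/* Returns the number of filtered regions; *nr_tried counts the rest. */
static inline size_t apply_scheme_sketch(const struct addr_filter_sketch *f,
					 const struct region_sketch *regions,
					 size_t nr, size_t *nr_tried)
{
	size_t i, nr_filtered = 0;

	*nr_tried = 0;
	for (i = 0; i < nr; i++) {
		if (filter_out_sketch(f, &regions[i])) {
			nr_filtered++;	/* skipped before being tried */
			continue;
		}
		(*nr_tried)++;
	}
	return nr_filtered;
}
```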
Link: https://lkml.kernel.org/r/20230802214312.110532-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20230802214312.110532-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org >
Cc: Brendan Higgins <brendanhiggins@google.com >
Cc: Jonathan Corbet <corbet@lwn.net >
Cc: Shuah Khan <shuah@kernel.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:35 -07:00
Miaohe Lin
ca39c5e7d1
mm/memcg: update obsolete comment above parent_mem_cgroup()
...
Since commit bef8620cd8 ("mm: memcg: deprecate the non-hierarchical
mode"), use_hierarchy is already deprecated. And it's further removed via
commit 9d9d341df4 ("cgroup: remove obsoleted broken_hierarchy and
warned_broken_hierarchy"). Update corresponding comment.
Link: https://lkml.kernel.org/r/20230801124359.2266860-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Cc: Michal Hocko <mhocko@suse.com >
Cc: Roman Gushchin <roman.gushchin@linux.dev >
Cc: Johannes Weiner <hannes@cmpxchg.org >
Cc: Shakeel Butt <shakeelb@google.com >
Cc: Muchun Song <songmuchun@bytedance.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:32 -07:00
Kefeng Wang
11250fd12e
mm: factor out VMA stack and heap checks
...
Patch series "mm: convert to vma_is_initial_heap/stack()", v3.
Add vma_is_initial_stack() and vma_is_initial_heap() helpers and use them
to simplify code.
This patch (of 4):
Factor out VMA stack and heap checks and name them vma_is_initial_stack()
and vma_is_initial_heap() for general use.
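A userspace sketch of what such helpers might look like is given below. The structs are hypothetical stand-ins, and the overlap tests against brk, start_brk and start_stack are an assumption about the checks being factored out; the commit message itself does not spell them out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for mm_struct and vm_area_struct. */
struct mm_sketch {
	unsigned long start_brk;
	unsigned long brk;
	unsigned long start_stack;
};

struct vma_sketch {
	unsigned long vm_start;
	unsigned long vm_end;
	const struct mm_sketch *vm_mm;
};

/* Assumed check: the VMA overlaps the [start_brk, brk] heap range. */
static inline bool vma_is_initial_heap_sketch(const struct vma_sketch *vma)
{
	assert(vma->vm_mm != NULL);
	return vma->vm_start <= vma->vm_mm->brk &&
	       vma->vm_end >= vma->vm_mm->start_brk;
}

/* Assumed check: the VMA contains the initial stack address. */
static inline bool vma_is_initial_stack_sketch(const struct vma_sketch *vma)
{
	assert(vma->vm_mm != NULL);
	return vma->vm_start <= vma->vm_mm->start_stack &&
	       vma->vm_end >= vma->vm_mm->start_stack;
}
```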
Link: https://lkml.kernel.org/r/20230728050043.59880-1-wangkefeng.wang@huawei.com
Link: https://lkml.kernel.org/r/20230728050043.59880-2-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com >
Reviewed-by: David Hildenbrand <david@redhat.com >
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org >
Cc: Christian Göttsche <cgzones@googlemail.com >
Cc: Alex Deucher <alexander.deucher@amd.com >
Cc: Arnaldo Carvalho de Melo <acme@kernel.org >
Cc: Christian Göttsche <cgzones@googlemail.com >
Cc: Christian König <christian.koenig@amd.com >
Cc: Daniel Vetter <daniel@ffwll.ch >
Cc: David Airlie <airlied@gmail.com >
Cc: Eric Paris <eparis@parisplace.org >
Cc: Felix Kuehling <felix.kuehling@amd.com >
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com >
Cc: Paul Moore <paul@paul-moore.com >
Cc: Stephen Smalley <stephen.smalley.work@gmail.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:31 -07:00
Kemeng Shi
67311a36e5
mm/page_ext: move page_ext_operations definition under CONFIG_PAGE_EXTENSION
...
page_ext_operations should only be defined when CONFIG_PAGE_EXTENSION is
enabled.
Besides, this may detect missing reliance on CONFIG_PAGE_EXTENSION from
future Page Extension clients at compile time.
Link: https://lkml.kernel.org/r/20230717113227.1897173-4-shikemeng@huaweicloud.com
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:31 -07:00
Miaohe Lin
5d241789df
mm/memcg: fix obsolete function name in mem_cgroup_protection()
...
Commit 45c7f7e1ef ("mm, memcg: decouple e{low,min} state mutations from
protection checks") changed the function name but not the corresponding
comment.
Link: https://lkml.kernel.org/r/20230727115934.657787-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Cc: Michal Hocko <mhocko@suse.com >
Cc: Roman Gushchin <roman.gushchin@linux.dev >
Cc: Johannes Weiner <hannes@cmpxchg.org >
Cc: Shakeel Butt <shakeelb@google.com >
Cc: Muchun Song <songmuchun@bytedance.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:28 -07:00
Kemeng Shi
c0a5d93a88
mm/page_ext: add common function to get client data from page_ext
...
Patch series "add page_ext_data to get client data in page_ext".
Current clients get data from page_ext by adding an offset that is
auto-generated in the page_ext core, which exposes the data layout design
inside the page_ext core. This series adds page_ext_data() to hide this
from clients.
Benefits include:
1. Future clients can call page_ext_data directly instead of defining
a new function like get_page_owner to get the data.
2. There is no change to clients if the layout of page_ext data changes.
This patch (of 3):
Add a common page_ext_data() function to get client data. This hides the
offset, which is auto-generated in the page_ext core, and with it the
design of the page_ext data layout.
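A minimal userspace sketch of such an accessor follows. The struct names and fields are hypothetical; the only idea taken from the text is that clients ask a common helper for their data instead of adding the core-generated offset themselves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical core header; client data is laid out after it. */
struct page_ext_sketch {
	unsigned long flags;
};

/* Hypothetical per-client descriptor; the core fills in the offset. */
struct page_ext_ops_sketch {
	size_t offset;
	size_t size;
};

/* Common accessor: the only place that knows how the layout works. */
static inline void *page_ext_data_sketch(struct page_ext_sketch *page_ext,
					 const struct page_ext_ops_sketch *ops)
{
	assert(page_ext != NULL && ops != NULL);
	return (char *)page_ext + ops->offset;
}
```

If the layout changes, only the offset computed by the core changes; client code that goes through the accessor stays the same.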
Link: https://lkml.kernel.org/r/20230718145812.1991717-1-shikemeng@huaweicloud.com
Link: https://lkml.kernel.org/r/20230718145812.1991717-2-shikemeng@huaweicloud.com
Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com >
Reviewed-by: Andrew Morton <akpm@linux-foudation.org >
Acked-by: Mike Rapoport (IBM) <rppt@kernel.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:27 -07:00
Matthew Wilcox (Oracle)
ca54f6d89d
zswap: make zswap_load() take a folio
...
Only convert a few easy parts of this function to use the folio passed in;
convert back to struct page for the majority of it. Removes three hidden
calls to compound_head().
Link: https://lkml.kernel.org/r/20230715042343.434588-6-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org >
Cc: Christoph Hellwig <hch@infradead.org >
Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com >
Cc: Johannes Weiner <hannes@cmpxchg.org >
Cc: Nhat Pham <nphamcs@gmail.com >
Cc: Vitaly Wool <vitaly.wool@konsulko.com >
Cc: Yosry Ahmed <yosryahmed@google.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:27 -07:00
Matthew Wilcox (Oracle)
074e3e262a
memcg: convert get_obj_cgroup_from_page to get_obj_cgroup_from_folio
...
As the one caller now has a folio, pass it in and use it. Removes three
calls to compound_head().
Link: https://lkml.kernel.org/r/20230715042343.434588-4-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org >
Cc: Christoph Hellwig <hch@infradead.org >
Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com >
Cc: Johannes Weiner <hannes@cmpxchg.org >
Cc: Nhat Pham <nphamcs@gmail.com >
Cc: Vitaly Wool <vitaly.wool@konsulko.com >
Cc: Yosry Ahmed <yosryahmed@google.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:26 -07:00
Matthew Wilcox (Oracle)
34f4c198bf
zswap: make zswap_store() take a folio
...
Patch series "Followup folio conversions for zswap".
With frontswap killed, it's worth converting the zswap_load() and
zswap_store() functions to take a folio instead of a page pointer. They
aren't converted to support large folios, but there are a lot of
unnecessary calls to compound_head() that are removed by these patches.
This patch (of 4):
Only convert a few easy parts of this function to use the folio passed in;
convert back to struct page for the majority of it. This does remove a
few hidden calls to compound_head().
Link: https://lkml.kernel.org/r/20230715042343.434588-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20230715042343.434588-3-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org >
Cc: Christoph Hellwig <hch@infradead.org >
Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com >
Cc: Johannes Weiner <hannes@cmpxchg.org >
Cc: Matthew Wilcox (Oracle) <willy@infradead.org >
Cc: Nhat Pham <nphamcs@gmail.com >
Cc: Vitaly Wool <vitaly.wool@konsulko.com >
Cc: Yosry Ahmed <yosryahmed@google.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:26 -07:00
Johannes Weiner
42c06a0e8e
mm: kill frontswap
...
The only user of frontswap is zswap, and has been for a long time. Have
swap call into zswap directly and remove the indirection.
[hannes@cmpxchg.org: remove obsolete comment, per Yosry]
Link: https://lkml.kernel.org/r/20230719142832.GA932528@cmpxchg.org
[fengwei.yin@intel.com: don't warn if none swapcache folio is passed to zswap_load]
Link: https://lkml.kernel.org/r/20230810095652.3905184-1-fengwei.yin@intel.com
Link: https://lkml.kernel.org/r/20230717160227.GA867137@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org >
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com >
Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com >
Acked-by: Nhat Pham <nphamcs@gmail.com >
Acked-by: Yosry Ahmed <yosryahmed@google.com >
Acked-by: Christoph Hellwig <hch@lst.de >
Cc: Domenico Cerasuolo <cerasuolodomenico@gmail.com >
Cc: Matthew Wilcox (Oracle) <willy@infradead.org >
Cc: Vitaly Wool <vitaly.wool@konsulko.com >
Cc: Vlastimil Babka <vbabka@suse.cz >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-21 13:37:26 -07:00
Aneesh Kumar K.V
27af67f356
powerpc/book3s64/mm: enable transparent pud hugepage
...
This is enabled only with radix translation and 1G hugepage size. This
will be used with devdax device memory with a namespace alignment of 1G.
Anon transparent hugepage is not supported even though we do have helpers
checking pud_trans_huge(). We should never find that return true. The
only expected pte bit combination is _PAGE_PTE | _PAGE_DEVMAP.
Some of the helpers are never expected to get called on hash translation
and hence are marked to call BUG() in such a case.
Link: https://lkml.kernel.org/r/20230724190759.483013-10-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Dan Williams <dan.j.williams@intel.com >
Cc: Joao Martins <joao.m.martins@oracle.com >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Mike Kravetz <mike.kravetz@oracle.com >
Cc: Muchun Song <muchun.song@linux.dev >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Oscar Salvador <osalvador@suse.de >
Cc: Will Deacon <will@kernel.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:55 -07:00
Aneesh Kumar K.V
104c49d5b6
powerpc/mm/trace: convert trace event to trace event class
...
A follow-up patch will add a pud variant for this same event. Using event
class makes that addition simpler.
No functional change in this patch.
Link: https://lkml.kernel.org/r/20230724190759.483013-9-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com >
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Dan Williams <dan.j.williams@intel.com >
Cc: Joao Martins <joao.m.martins@oracle.com >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Mike Kravetz <mike.kravetz@oracle.com >
Cc: Muchun Song <muchun.song@linux.dev >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Oscar Salvador <osalvador@suse.de >
Cc: Will Deacon <will@kernel.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:54 -07:00
Aneesh Kumar K.V
0b6f15824c
mm/vmemmap optimization: split hugetlb and devdax vmemmap optimization
...
Arm disabled hugetlb vmemmap optimization [1] because hugetlb vmemmap
optimization includes an update of both the permissions (writeable to
read-only) and the output address (pfn) of the vmemmap ptes. Some
architectures do not support that without unmapping the pte (marking it
invalid).
With DAX vmemmap optimization we don't require such pte updates and
architectures can enable DAX vmemmap optimization while having hugetlb
vmemmap optimization disabled. Hence split DAX optimization support into
a different config.
s390, loongarch and riscv don't have devdax support, so the DAX config is
not enabled for them. With this change, arm64 should be able to select
DAX optimization.
[1] commit 060a2c92d1 ("arm64: mm: hugetlb: Disable HUGETLB_PAGE_OPTIMIZE_VMEMMAP")
Link: https://lkml.kernel.org/r/20230724190759.483013-8-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Dan Williams <dan.j.williams@intel.com >
Cc: Joao Martins <joao.m.martins@oracle.com >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Mike Kravetz <mike.kravetz@oracle.com >
Cc: Muchun Song <muchun.song@linux.dev >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Oscar Salvador <osalvador@suse.de >
Cc: Will Deacon <will@kernel.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:54 -07:00
Aneesh Kumar K.V
54a948a1e9
mm/huge pud: use transparent huge pud helpers only with CONFIG_TRANSPARENT_HUGEPAGE
...
The pudp_set_wrprotect and move_huge_pud helpers are only used when
CONFIG_TRANSPARENT_HUGEPAGE is enabled. As with the pmdp_set_wrprotect and
move_huge_pmd helpers, use the architecture override only if
CONFIG_TRANSPARENT_HUGEPAGE is set.
Link: https://lkml.kernel.org/r/20230724190759.483013-7-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com >
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Dan Williams <dan.j.williams@intel.com >
Cc: Joao Martins <joao.m.martins@oracle.com >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Mike Kravetz <mike.kravetz@oracle.com >
Cc: Muchun Song <muchun.song@linux.dev >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Oscar Salvador <osalvador@suse.de >
Cc: Will Deacon <will@kernel.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:54 -07:00
Aneesh Kumar K.V
973bf6800c
mm: add pud_same similar to __HAVE_ARCH_P4D_SAME
...
This helps architectures to override pmd_same and pud_same independently.
Link: https://lkml.kernel.org/r/20230724190759.483013-6-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Dan Williams <dan.j.williams@intel.com >
Cc: Joao Martins <joao.m.martins@oracle.com >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Mike Kravetz <mike.kravetz@oracle.com >
Cc: Muchun Song <muchun.song@linux.dev >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Oscar Salvador <osalvador@suse.de >
Cc: Will Deacon <will@kernel.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:54 -07:00
Aneesh Kumar K.V
c1a6c536fb
mm/vmemmap: improve vmemmap_can_optimize and allow architectures to override
...
DAX vmemmap optimization requires an area of at least 2 * PAGE_SIZE within
the vmemmap so that the tail page mapping can point to the second PAGE_SIZE
area. Enforce that in the vmemmap_can_optimize() function.
Architectures like powerpc also want to enable vmemmap optimization
conditionally (only with radix MMU translation). Hence allow architecture
override.
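The size requirement and the override hook can be sketched like this. The 64-byte struct page size, the names, and the ">= 2" comparison are assumptions that follow the wording of the commit message; the exact condition used by the kernel may differ.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_STRUCT_PAGE_SIZE  64UL	/* assumed sizeof(struct page) */
#define SKETCH_MIN_VMEMMAP_PAGES 2UL	/* minimum stated in the text */

/* Generic check: does the vmemmap for this mapping span enough pages? */
static inline bool generic_vmemmap_can_optimize_sketch(unsigned long nr_pages,
						       unsigned long page_size)
{
	unsigned long nr_vmemmap_pages;

	assert(page_size != 0);
	nr_vmemmap_pages = nr_pages * SKETCH_STRUCT_PAGE_SIZE / page_size;
	return nr_vmemmap_pages >= SKETCH_MIN_VMEMMAP_PAGES;
}

/* An architecture could define its own version before this point. */
#ifndef vmemmap_can_optimize_sketch
#define vmemmap_can_optimize_sketch generic_vmemmap_can_optimize_sketch
#endif
```

For example, a 2M mapping of 4K pages has 8 vmemmap pages and qualifies, while a 2M mapping of 64K pages needs less than one vmemmap page and does not.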
Link: https://lkml.kernel.org/r/20230724190759.483013-4-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com >
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Dan Williams <dan.j.williams@intel.com >
Cc: Joao Martins <joao.m.martins@oracle.com >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Mike Kravetz <mike.kravetz@oracle.com >
Cc: Muchun Song <muchun.song@linux.dev >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Oscar Salvador <osalvador@suse.de >
Cc: Will Deacon <will@kernel.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:53 -07:00
Aneesh Kumar K.V
f32928ab6f
mm: change pudp_huge_get_and_clear_full take vm_area_struct as arg
...
We will use this in a later patch to do tlb flush when clearing pud
entries on powerpc. This is similar to commit 93a98695f2 ("mm: change
pmdp_huge_get_and_clear_full take vm_area_struct as arg")
Link: https://lkml.kernel.org/r/20230724190759.483013-3-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com >
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Dan Williams <dan.j.williams@intel.com >
Cc: Joao Martins <joao.m.martins@oracle.com >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Mike Kravetz <mike.kravetz@oracle.com >
Cc: Muchun Song <muchun.song@linux.dev >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Oscar Salvador <osalvador@suse.de >
Cc: Will Deacon <will@kernel.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:53 -07:00
Aneesh Kumar K.V
348ad1606f
mm/hugepage pud: allow arch-specific helper function to check huge page pud support
...
Patch series "Add support for DAX vmemmap optimization for ppc64", v6.
This patch series implements changes required to support DAX vmemmap
optimization for ppc64. The vmemmap optimization is only enabled with
radix MMU translation and 1GB PUD mapping with 64K page size.
The patch series also splits the hugetlb vmemmap optimization as a
separate Kconfig variable so that architectures can enable DAX vmemmap
optimization without enabling hugetlb vmemmap optimization. This should
enable architectures like arm64 to enable DAX vmemmap optimization while
they can't enable hugetlb vmemmap optimization. More details of the same
are in patch "mm/vmemmap optimization: Split hugetlb and devdax vmemmap
optimization".
With 64K page size for 16384 pages added (1G) we save 14 pages
With 4K page size for 262144 pages added (1G) we save 4094 pages
With 4K page size for 512 pages added (2M) we save 6 pages
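The three figures above can be reproduced with a short calculation, assuming sizeof(struct page) is 64 bytes and that two vmemmap pages are kept per mapping (both are assumptions, and the helper names are hypothetical). The vmemmap needs nr_pages * 64 / PAGE_SIZE pages, and everything beyond the two retained pages is saved.

```c
#include <assert.h>

#define SKETCH_STRUCT_PAGE_SIZE 64UL	/* assumed sizeof(struct page) */
#define SKETCH_RETAINED_PAGES   2UL	/* assumed pages kept per mapping */

/* Number of vmemmap pages needed to describe nr_pages pages. */
static inline unsigned long vmemmap_pages_sketch(unsigned long nr_pages,
						 unsigned long page_size)
{
	assert(page_size != 0);
	return nr_pages * SKETCH_STRUCT_PAGE_SIZE / page_size;
}

/* Pages saved once only the retained vmemmap pages stay allocated. */
static inline unsigned long vmemmap_pages_saved_sketch(unsigned long nr_pages,
							unsigned long page_size)
{
	unsigned long total = vmemmap_pages_sketch(nr_pages, page_size);

	return total > SKETCH_RETAINED_PAGES ?
	       total - SKETCH_RETAINED_PAGES : 0;
}
```

With these assumptions, 16384 pages of 64K need 16 vmemmap pages (14 saved), 262144 pages of 4K need 4096 (4094 saved), and 512 pages of 4K need 8 (6 saved).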
This patch (of 13):
Architectures like powerpc would like to enable transparent huge page pud
support only with radix translation. To support that add
has_transparent_pud_hugepage() helper that architectures can override.
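The override pattern can be sketched as follows. The names, the radix flag, and the generic default are all hypothetical; the sketch only illustrates an architecture supplying a runtime check that the generic code would otherwise define itself.

```c
#include <stdbool.h>

/* Hypothetical architecture side: support depends on a runtime mode. */
static bool radix_enabled_sketch = true;

static inline bool arch_has_transparent_pud_hugepage_sketch(void)
{
	return radix_enabled_sketch;
}
#define has_transparent_pud_hugepage_sketch \
	arch_has_transparent_pud_hugepage_sketch

/* Hypothetical generic side: default used only without an override. */
#ifndef has_transparent_pud_hugepage_sketch
static inline bool has_transparent_pud_hugepage_sketch(void)
{
	return true;
}
#endif
```

Generic callers use has_transparent_pud_hugepage_sketch() and get the architecture's answer whenever an override exists.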
[aneesh.kumar@linux.ibm.com: use the new has_transparent_pud_hugepage()]
Link: https://lkml.kernel.org/r/87tttrvtaj.fsf@linux.ibm.com
Link: https://lkml.kernel.org/r/20230724190759.483013-1-aneesh.kumar@linux.ibm.com
Link: https://lkml.kernel.org/r/20230724190759.483013-2-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com >
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Dan Williams <dan.j.williams@intel.com >
Cc: Joao Martins <joao.m.martins@oracle.com >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Mike Kravetz <mike.kravetz@oracle.com >
Cc: Muchun Song <muchun.song@linux.dev >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Oscar Salvador <osalvador@suse.de >
Cc: Will Deacon <will@kernel.org >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:53 -07:00
Matthew Wilcox (Oracle)
350f6bbca1
mm: allow per-VMA locks on file-backed VMAs
...
Remove the TCP layering violation by allowing per-VMA locks on all VMAs.
The fault path will immediately fail in handle_mm_fault(). There may be a
small performance reduction from this patch as a little unnecessary work
will be done on each page fault. See later patches for the improvement.
Link: https://lkml.kernel.org/r/20230724185410.1124082-3-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org >
Reviewed-by: Suren Baghdasaryan <surenb@google.com >
Cc: Arjun Roy <arjunroy@google.com >
Cc: Eric Dumazet <edumazet@google.com >
Cc: Punit Agrawal <punit.agrawal@bytedance.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:51 -07:00
Matthew Wilcox (Oracle)
284e059204
mm: remove CONFIG_PER_VMA_LOCK ifdefs
...
Patch series "Handle most file-backed faults under the VMA lock", v3.
This patchset adds the ability to handle page faults on parts of files
which are already in the page cache without taking the mmap lock.
This patch (of 10):
Provide lock_vma_under_rcu() when CONFIG_PER_VMA_LOCK is not defined to
eliminate ifdefs in the users.
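The idea can be sketched in userspace C: when the option is off, a static inline stub that always reports failure lets callers use one code path with no #ifdef of their own. The fault_path() caller and its return values are hypothetical, and the struct types are left opaque on purpose.

```c
#include <stddef.h>

struct mm_struct;
struct vm_area_struct;

#ifdef CONFIG_PER_VMA_LOCK
/* Real implementation lives elsewhere when the option is enabled. */
struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
					  unsigned long address);
#else
/* Stub: no per-VMA lock support, so the lookup never succeeds. */
static inline struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
							unsigned long address)
{
	(void)mm;
	(void)address;
	return NULL;
}
#endif

enum toy_fault_path { TOY_USE_MMAP_LOCK, TOY_USE_VMA_LOCK };

/* Hypothetical caller: identical source whether or not the option is set. */
static enum toy_fault_path fault_path(struct mm_struct *mm,
				      unsigned long address)
{
	struct vm_area_struct *vma = lock_vma_under_rcu(mm, address);

	if (!vma)
		return TOY_USE_MMAP_LOCK;	/* fall back to the mmap lock */
	return TOY_USE_VMA_LOCK;
}
```

Built without CONFIG_PER_VMA_LOCK, the caller always takes the fallback branch, which the compiler can fold away.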
Link: https://lkml.kernel.org/r/20230724185410.1124082-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20230724185410.1124082-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org >
Reviewed-by: Suren Baghdasaryan <surenb@google.com >
Cc: Punit Agrawal <punit.agrawal@bytedance.com >
Cc: Arjun Roy <arjunroy@google.com >
Cc: Eric Dumazet <edumazet@google.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:50 -07:00
Liam R. Howlett
da0892547b
maple_tree: re-introduce entry to mas_preallocate() arguments
...
The current preallocation strategy is to preallocate the absolute
worst-case allocation for a tree modification. The entry (or NULL) is
needed to know how many nodes are needed to write to the tree. Start by
adding the argument to the mas_preallocate() definition.
Link: https://lkml.kernel.org/r/20230724183157.3939892-8-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com >
Cc: Peng Zhang <zhangpeng.00@bytedance.com >
Cc: Suren Baghdasaryan <surenb@google.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:48 -07:00
Liam R. Howlett
c1297987cc
maple_tree: introduce __mas_set_range()
...
mas_set_range() resets the node to MAS_START, which will cause a re-walk
of the tree to the range. This is unnecessary when the maple state is
already at the correct location of the write. Add a function that only
sets the range to avoid unnecessary re-walking of the tree.
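A toy model of the distinction, assuming a simplified state with only a range and a position marker. The toy_* names are hypothetical stand-ins for struct ma_state and MAS_START and do not reproduce the maple tree code.

```c
/* Toy position marker: TOY_START means "walk again from the root". */
enum toy_pos { TOY_START, TOY_AT_NODE };

struct toy_state {
	unsigned long index;	/* first index of the range */
	unsigned long last;	/* last index of the range */
	enum toy_pos pos;	/* where the state currently points */
};

/* Only update the range; the current position in the tree is kept. */
static inline void toy_set_range_nowalk(struct toy_state *s,
					unsigned long start,
					unsigned long last)
{
	s->index = start;
	s->last = last;
}

/* Update the range and reset the position, forcing a re-walk. */
static inline void toy_set_range(struct toy_state *s,
				 unsigned long start,
				 unsigned long last)
{
	toy_set_range_nowalk(s, start, last);
	s->pos = TOY_START;
}
```

A caller that already sits at the right node would use the first variant and skip the walk; any other caller keeps using the resetting variant.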
Link: https://lkml.kernel.org/r/20230724183157.3939892-6-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com >
Cc: Peng Zhang <zhangpeng.00@bytedance.com >
Cc: Suren Baghdasaryan <surenb@google.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:48 -07:00
Liam R. Howlett
fd892593d4
mm: change do_vmi_align_munmap() tracking of VMAs to remove
...
The majority of the calls to munmap a vm range are within a single vma.
The maple tree is able to store a single entry at 0, with a size of 1 as
a pointer and avoid any allocations. Change do_vmi_align_munmap() to
store the VMAs being munmap()'ed into a tree indexed by the count. This
will leverage the ability to store the first entry without a node
allocation.
Storing the entries into a tree by the count and not the vma start and
end means changing the functions which iterate over the entries. Update
unmap_vmas() and free_pgtables() to take a maple state and a tree end
address to support this functionality.
Passing through the same maple state to unmap_vmas() and free_pgtables()
means the state needs to be reset between calls. This happens in the
static unmap_region() and exit_mmap().
Link: https://lkml.kernel.org/r/20230724183157.3939892-4-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com >
Cc: Peng Zhang <zhangpeng.00@bytedance.com >
Cc: Suren Baghdasaryan <surenb@google.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:47 -07:00
Jann Horn
90717566f8
mm: don't drop VMA locks in mm_drop_all_locks()
...
Despite its name, mm_drop_all_locks() does not drop _all_ locks; the mmap
lock is held write-locked by the caller, and the caller is responsible for
dropping the mmap lock at a later point (which will also release the VMA
locks).
Calling vma_end_write_all() here is dangerous because the caller might
have write-locked a VMA with the expectation that it will stay
write-locked until the mmap_lock is released, as usual.
This _almost_ becomes a problem in the following scenario:
An anonymous VMA A and an SGX VMA B are mapped adjacent to each other.
Userspace calls munmap() on a range starting at the start address of A and
ending in the middle of B.
Hypothetical call graph with additional notes in brackets:
do_vmi_align_munmap
[begin first for_each_vma_range loop]
vma_start_write [on VMA A]
vma_mark_detached [on VMA A]
__split_vma [on VMA B]
sgx_vma_open [== new->vm_ops->open]
sgx_encl_mm_add
__mmu_notifier_register [luckily THIS CAN'T ACTUALLY HAPPEN]
mm_take_all_locks
mm_drop_all_locks
vma_end_write_all [drops VMA lock taken on VMA A before]
vma_start_write [on VMA B]
vma_mark_detached [on VMA B]
[end first for_each_vma_range loop]
vma_iter_clear_gfp [removes VMAs from maple tree]
mmap_write_downgrade
unmap_region
mmap_read_unlock
In this hypothetical scenario, while do_vmi_align_munmap() thinks it still
holds a VMA write lock on VMA A, the VMA write lock has actually been
invalidated inside __split_vma().
The call from sgx_encl_mm_add() to __mmu_notifier_register() can't
actually happen here, as far as I understand, because we are duplicating
an existing SGX VMA, but sgx_encl_mm_add() only calls
__mmu_notifier_register() for the first SGX VMA created in a given
process. So this could only happen in fork(), not on munmap(). But in my
view it is just pure luck that this can't happen.
Also, we wouldn't actually have any bad consequences from this in
do_vmi_align_munmap(), because by the time the bug drops the lock on VMA
A, we've already marked VMA A as detached, which makes it completely
ineligible for any VMA-locked page faults. But again, that's just pure
luck.
So remove the vma_end_write_all(), so that VMA write locks are only ever
released on mmap_write_unlock() or mmap_write_downgrade().
Also add comments to document the locking rules established by this patch.
Link: https://lkml.kernel.org/r/20230720193436.454247-1-jannh@google.com
Fixes: eeff9a5d47 ("mm/mmap: prevent pagefault handler from racing with mmu_notifier registration")
Signed-off-by: Jann Horn <jannh@google.com >
Reviewed-by: Suren Baghdasaryan <surenb@google.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:46 -07:00
ZhangPeng
6d2790d95d
mm/page_io: introduce bio_first_folio_all()
...
Introduce bio_first_folio_all() to return a folio, which makes it easier
to use.
Link: https://lkml.kernel.org/r/20230721034451.16412-4-zhangpeng362@huawei.com
Signed-off-by: ZhangPeng <zhangpeng362@huawei.com >
Suggested-by: Matthew Wilcox (Oracle) <willy@infradead.org >
Cc: Christoph Hellwig <hch@infradead.org >
Cc: Kefeng Wang <wangkefeng.wang@huawei.com >
Cc: Nanyong Sun <sunnanyong@huawei.com >
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:45 -07:00
Miaohe Lin
ea09800bf1
mm: fix obsolete function name above debug_pagealloc_enabled_static()
...
Since commit 04013513cc ("mm, page_alloc: do not rely on the order of
page_poison and init_on_alloc/free parameters"), init_debug_pagealloc() is
converted to init_mem_debugging_and_hardening(). Later it's renamed to
mem_debugging_and_hardening_init() via commit f2fc4b44ec ("mm: move
init_mem_debugging_and_hardening() to mm/mm_init.c").
Link: https://lkml.kernel.org/r/20230720112806.3851893-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:41 -07:00
Alistair Popple
1af5a81099
mmu_notifiers: rename invalidate_range notifier
...
There are two main use cases for mmu notifiers. One is by KVM which uses
mmu_notifier_invalidate_range_start()/end() to manage a software TLB.
The other is to manage hardware TLBs which need to use the
invalidate_range() callback because HW can establish new TLB entries at
any time. Hence using start/end() can lead to memory corruption as these
callbacks happen too soon/late during page unmap.
mmu notifier users should therefore either use the start()/end() callbacks
or the invalidate_range() callbacks. To make this usage clearer rename
the invalidate_range() callback to arch_invalidate_secondary_tlbs() and
update documentation.
Link: https://lkml.kernel.org/r/6f77248cd25545c8020a54b4e567e8b72be4dca1.1690292440.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com >
Suggested-by: Jason Gunthorpe <jgg@nvidia.com >
Acked-by: Catalin Marinas <catalin.marinas@arm.com >
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com >
Cc: Andrew Donnellan <ajd@linux.ibm.com >
Cc: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com >
Cc: Frederic Barrat <fbarrat@linux.ibm.com >
Cc: Jason Gunthorpe <jgg@ziepe.ca >
Cc: John Hubbard <jhubbard@nvidia.com >
Cc: Kevin Tian <kevin.tian@intel.com >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Nicolin Chen <nicolinc@nvidia.com >
Cc: Robin Murphy <robin.murphy@arm.com >
Cc: Sean Christopherson <seanjc@google.com >
Cc: SeongJae Park <sj@kernel.org >
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com >
Cc: Will Deacon <will@kernel.org >
Cc: Zhi Wang <zhi.wang.linux@gmail.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:41 -07:00
Alistair Popple
ec8832d007
mmu_notifiers: don't invalidate secondary TLBs as part of mmu_notifier_invalidate_range_end()
...
Secondary TLBs are now invalidated from the architecture specific TLB
invalidation functions. Therefore there is no need to explicitly notify
or invalidate as part of the range end functions. This means we can
remove mmu_notifier_invalidate_range_end_only() and some of the
ptep_*_notify() functions.
Link: https://lkml.kernel.org/r/90d749d03cbab256ca0edeb5287069599566d783.1690292440.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com >
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com >
Cc: Andrew Donnellan <ajd@linux.ibm.com >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com >
Cc: Frederic Barrat <fbarrat@linux.ibm.com >
Cc: Jason Gunthorpe <jgg@ziepe.ca >
Cc: John Hubbard <jhubbard@nvidia.com >
Cc: Kevin Tian <kevin.tian@intel.com >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Nicolin Chen <nicolinc@nvidia.com >
Cc: Robin Murphy <robin.murphy@arm.com >
Cc: Sean Christopherson <seanjc@google.com >
Cc: SeongJae Park <sj@kernel.org >
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com >
Cc: Will Deacon <will@kernel.org >
Cc: Zhi Wang <zhi.wang.linux@gmail.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:41 -07:00
Alistair Popple
6bbd42e2df
mmu_notifiers: call invalidate_range() when invalidating TLBs
...
The invalidate_range() is going to become an architecture specific mmu
notifier used to keep the TLB of secondary MMUs such as an IOMMU in sync
with the CPU page tables. Currently it is called from separate code paths
to the main CPU TLB invalidations. This can lead to a secondary TLB not
getting invalidated when required and makes it hard to reason about when
exactly the secondary TLB is invalidated.
To fix this move the notifier call to the architecture specific TLB
maintenance functions for architectures that have secondary MMUs requiring
explicit software invalidations.
This fixes a SMMU bug on ARM64. On ARM64 PTE permission upgrades require
a TLB invalidation. This invalidation is done by the architecture
specific ptep_set_access_flags() which calls flush_tlb_page() if required.
However this doesn't call the notifier resulting in infinite faults being
generated by devices using the SMMU if it has previously cached a
read-only PTE in its TLB.
Moving the invalidations into the TLB invalidation functions ensures all
invalidations happen at the same time as the CPU invalidation. The
architecture specific flush_tlb_all() routines do not call the notifier as
none of the IOMMUs require this.
Link: https://lkml.kernel.org/r/0287ae32d91393a582897d6c4db6f7456b1001f2.1690292440.git-series.apopple@nvidia.com
Signed-off-by: Alistair Popple <apopple@nvidia.com >
Suggested-by: Jason Gunthorpe <jgg@ziepe.ca >
Tested-by: SeongJae Park <sj@kernel.org >
Acked-by: Catalin Marinas <catalin.marinas@arm.com >
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com >
Tested-by: Luis Chamberlain <mcgrof@kernel.org >
Cc: Andrew Donnellan <ajd@linux.ibm.com >
Cc: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com >
Cc: Frederic Barrat <fbarrat@linux.ibm.com >
Cc: John Hubbard <jhubbard@nvidia.com >
Cc: Kevin Tian <kevin.tian@intel.com >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Nicolin Chen <nicolinc@nvidia.com >
Cc: Robin Murphy <robin.murphy@arm.com >
Cc: Sean Christopherson <seanjc@google.com >
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com >
Cc: Will Deacon <will@kernel.org >
Cc: Zhi Wang <zhi.wang.linux@gmail.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:41 -07:00
Liam R. Howlett
19a462f06e
maple_tree: Be more strict about locking
...
Use lockdep to check the write path in the maple tree holds the lock in
write mode.
Introduce mt_write_lock_is_held() to check if the lock is held for
writing. Update the necessary checks for rcu_dereference_protected() to
use the new write lock check.
Link: https://lkml.kernel.org/r/20230714195551.894800-5-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Oliver Sang <oliver.sang@intel.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:40 -07:00
Liam R. Howlett
02fdb25fb4
mm/mmap: change detached vma locking scheme
...
Don't set the lock to the mm lock so that the detached VMA tree does not
complain about being unlocked when the mmap_lock is dropped prior to
freeing the tree.
Introduce mt_on_stack() for setting the external lock to NULL only when
LOCKDEP is used.
Move the destroying of the detached tree outside the mmap lock
altogether.
Link: https://lkml.kernel.org/r/20230719183142.ktgcmuj2pnlr3h3s@revolver
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Oliver Sang <oliver.sang@intel.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:40 -07:00
Liam R. Howlett
134d153c93
maple_tree: relax lockdep checks for on-stack trees
...
To support early release of the maple tree locks, do not lockdep check the
lock if it is set to NULL. This is intended for the special case on-stack
use of tracking entries and not for general use.
Link: https://lkml.kernel.org/r/20230714195551.894800-3-Liam.Howlett@oracle.com
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com >
Cc: Linus Torvalds <torvalds@linux-foundation.org >
Cc: Oliver Sang <oliver.sang@intel.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:39 -07:00
Sidhartha Kumar
affd26b1fb
mm/hugetlb: get rid of page_hstate()
...
Convert the last page_hstate() user to use folio_hstate() so page_hstate()
can be safely removed.
Link: https://lkml.kernel.org/r/20230719184145.301911-1-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com >
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com >
Cc: Matthew Wilcox (Oracle) <willy@infradead.org >
Cc: Muchun Song <songmuchun@bytedance.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:39 -07:00
Peng Zhang
cabdf74e6b
mm: kfence: allocate kfence_metadata at runtime
...
kfence_metadata is currently a static array. For the purpose of
allocating scalable __kfence_pool, we first change it to runtime
allocation of metadata. Since the size of an object of kfence_metadata is
1160 bytes, we can save at least 72 pages (with default 256 objects)
without enabling kfence.
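The "at least 72 pages" figure follows from the numbers in this message if a 4 KiB page size is assumed. That page size is an assumption here, and full_pages() is a hypothetical helper used only for the arithmetic.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of whole pages covered by nr_objs objects
 * of obj_size bytes each, for a given page size.
 */
static unsigned long full_pages(unsigned long obj_size,
				unsigned long nr_objs,
				unsigned long page_size)
{
	assert(page_size != 0);
	return (obj_size * nr_objs) / page_size;
}
```

With 1160-byte objects, 256 objects and 4096-byte pages, the array takes 296960 bytes, so full_pages(1160, 256, 4096) is 72.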
[akpm@linux-foundation.org: restore newline, per Marco]
Link: https://lkml.kernel.org/r/20230718073019.52513-1-zhangpeng.00@bytedance.com
Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com >
Reviewed-by: Marco Elver <elver@google.com >
Cc: Alexander Potapenko <glider@google.com >
Cc: Dmitry Vyukov <dvyukov@google.com >
Cc: Muchun Song <muchun.song@linux.dev >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:39 -07:00
Zhu, Lipeng
aee79d4e52
fs/address_space: add alignment padding for i_map and i_mmap_rwsem to mitigate a false sharing.
...
When running UnixBench/Shell Scripts, we observed high false sharing for
accessing i_mmap against i_mmap_rwsem.
UnixBench/Shell Scripts are typical load/execute command test scenarios,
which concurrently launch->execute->exit a lot of shell commands. A lot
of processes invoke vma_interval_tree_remove which touch "i_mmap", the
call stack:
----vma_interval_tree_remove
|----unlink_file_vma
| free_pgtables
| |----exit_mmap
| | mmput
| | |----begin_new_exec
| | | load_elf_binary
| | | bprm_execve
Meanwhile, a lot of processes touch 'i_mmap_rwsem' to acquire
the semaphore in order to access 'i_mmap'. In the existing 'address_space'
layout, 'i_mmap' and 'i_mmap_rwsem' are in the same cacheline.
The patch places the i_mmap and i_mmap_rwsem in separate cache lines to
avoid this false sharing problem.
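The effect of moving two hot fields onto separate cache lines can be checked with offsetof(). The sketch below uses explicit alignment on toy structs and assumes a 64-byte cache line. Per the v1->v2 note, the patch itself exchanges fields rather than padding, so this shows the goal, not the patch's method.

```c
#include <stdalign.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines. */
#define TOY_CACHELINE 64

/* Both hot fields land on the same line: writers to one disturb the other. */
struct toy_shared_line {
	long tree_root;		/* stand-in for a frequently written tree */
	long lock_word;		/* stand-in for a contended lock */
};

/* The lock is pushed to the start of the next cache line. */
struct toy_split_lines {
	long tree_root;
	alignas(TOY_CACHELINE) long lock_word;
};

/* Cache line index of a field, relative to a line-aligned object start. */
static size_t toy_cacheline_of(size_t offset)
{
	return offset / TOY_CACHELINE;
}
```

Comparing toy_cacheline_of(offsetof(...)) for the two fields shows line 0 for both in the first struct, and lines 0 and 1 in the second.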
With this patch, based on kernel v6.4.0, on Intel Sapphire Rapids
112C/224T platform, the score improves by ~5.3%. And perf c2c tool shows
the false sharing is resolved as expected, the symbol
vma_interval_tree_remove disappeared in cache line 0 after this change.
Baseline:
=================================================
Shared Cache Line Distribution Pareto
=================================================
-------------------------------------------------------------
0 3729 5791 0 0 0xff19b3818445c740
-------------------------------------------------------------
3.27% 3.02% 0.00% 0.00% 0x18 0 1 0xffffffffa194403b 604 483 389 692 203 [k] vma_interval_tree_insert [kernel.kallsyms] vma_interval_tree_insert+75 0 1
4.13% 3.63% 0.00% 0.00% 0x20 0 1 0xffffffffa19440a2 553 413 415 962 215 [k] vma_interval_tree_remove [kernel.kallsyms] vma_interval_tree_remove+18 0 1
2.04% 1.35% 0.00% 0.00% 0x28 0 1 0xffffffffa219a1d6 1210 855 460 1229 222 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+678 0 1
0.62% 1.85% 0.00% 0.00% 0x28 0 1 0xffffffffa219a1bf 762 329 577 527 198 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+655 0 1
0.48% 0.31% 0.00% 0.00% 0x28 0 1 0xffffffffa219a58c 1677 1476 733 1544 224 [k] down_write [kernel.kallsyms] down_write+28 0 1
0.05% 0.07% 0.00% 0.00% 0x28 0 1 0xffffffffa219a21d 1040 819 689 33 27 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+749 0 1
0.00% 0.05% 0.00% 0.00% 0x28 0 1 0xffffffffa17707db 0 1005 786 1373 223 [k] up_write [kernel.kallsyms] up_write+27 0 1
0.00% 0.02% 0.00% 0.00% 0x28 0 1 0xffffffffa219a064 0 233 778 32 30 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+308 0 1
33.82% 34.10% 0.00% 0.00% 0x30 0 1 0xffffffffa1770945 779 495 534 6011 224 [k] rwsem_spin_on_owner [kernel.kallsyms] rwsem_spin_on_owner+53 0 1
17.06% 15.28% 0.00% 0.00% 0x30 0 1 0xffffffffa1770915 593 438 468 2715 224 [k] rwsem_spin_on_owner [kernel.kallsyms] rwsem_spin_on_owner+5 0 1
3.54% 3.52% 0.00% 0.00% 0x30 0 1 0xffffffffa2199f84 881 601 583 1421 223 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+84 0 1
With this change:
-------------------------------------------------------------
0 556 838 0 0 0xff2780d7965d2780
-------------------------------------------------------------
0.18% 0.60% 0.00% 0.00% 0x8 0 1 0xffffffffafff27b8 503 453 569 14 13 [k] do_dentry_open [kernel.kallsyms] do_dentry_open+456 0 1
0.54% 0.12% 0.00% 0.00% 0x8 0 1 0xffffffffaffc51ac 510 199 428 15 12 [k] hugepage_vma_check [kernel.kallsyms] hugepage_vma_check+252 0 1
1.80% 2.15% 0.00% 0.00% 0x18 0 1 0xffffffffb079a1d6 1778 799 343 215 136 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+678 0 1
0.54% 1.31% 0.00% 0.00% 0x18 0 1 0xffffffffb079a1bf 547 296 528 91 71 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+655 0 1
0.72% 0.72% 0.00% 0.00% 0x18 0 1 0xffffffffb079a58c 1479 1534 676 288 163 [k] down_write [kernel.kallsyms] down_write+28 0 1
0.00% 0.12% 0.00% 0.00% 0x18 0 1 0xffffffffafd707db 0 2381 744 282 158 [k] up_write [kernel.kallsyms] up_write+27 0 1
0.00% 0.12% 0.00% 0.00% 0x18 0 1 0xffffffffb079a064 0 239 518 6 6 [k] rwsem_down_write_slowpath [kernel.kallsyms] rwsem_down_write_slowpath+308 0 1
46.58% 47.02% 0.00% 0.00% 0x20 0 1 0xffffffffafd70945 704 403 499 1137 219 [k] rwsem_spin_on_owner [kernel.kallsyms] rwsem_spin_on_owner+53 0 1
23.92% 25.78% 0.00% 0.00% 0x20 0 1 0xffffffffafd70915 558 413 500 542 185 [k] rwsem_spin_on_owner [kernel.kallsyms] rwsem_spin_on_owner+5 0 1
v1->v2: change padding to exchange fields.
Link: https://lkml.kernel.org/r/20230716145653.20122-1-lipeng.zhu@intel.com
Signed-off-by: Lipeng Zhu <lipeng.zhu@intel.com >
Reviewed-by: Tim Chen <tim.c.chen@linux.intel.com >
Cc: Alexander Viro <viro@zeniv.linux.org.uk >
Cc: Christian Brauner <brauner@kernel.org >
Cc: Yu Ma <yu.ma@intel.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:38 -07:00
Barry Song
f73419bb89
mm/tlbbatch: rename and extend some functions
...
This patch does some preparation work to extend batched TLB flush to
arm64, including:
- Extend set_tlb_ubc_flush_pending() and arch_tlbbatch_add_mm()
to accept an additional argument for address, architectures
like arm64 may need this for tlbi.
- Rename arch_tlbbatch_add_mm() to arch_tlbbatch_add_pending()
to match its current function since we don't need to handle
mm on architectures like arm64 and add_mm is not proper,
add_pending will make sense to both as on x86 we're pending the
TLB flush operations while on arm64 we're pending the synchronize
operations.
This intends no functional changes on x86.
Link: https://lkml.kernel.org/r/20230717131004.12662-3-yangyicong@huawei.com
Tested-by: Yicong Yang <yangyicong@hisilicon.com >
Tested-by: Xin Hao <xhao@linux.alibaba.com >
Tested-by: Punit Agrawal <punit.agrawal@bytedance.com >
Signed-off-by: Barry Song <v-songbaohua@oppo.com >
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com >
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com >
Reviewed-by: Xin Hao <xhao@linux.alibaba.com >
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com >
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com >
Cc: Jonathan Corbet <corbet@lwn.net >
Cc: Nadav Amit <namit@vmware.com >
Cc: Mel Gorman <mgorman@suse.de >
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com >
Cc: Arnd Bergmann <arnd@arndb.de >
Cc: Barry Song <baohua@kernel.org >
Cc: Darren Hart <darren@os.amperecomputing.com >
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com >
Cc: lipeifeng <lipeifeng@oppo.com >
Cc: Mark Rutland <mark.rutland@arm.com >
Cc: Peter Zijlstra <peterz@infradead.org >
Cc: Ryan Roberts <ryan.roberts@arm.com >
Cc: Steven Miao <realmz6@gmail.com >
Cc: Will Deacon <will@kernel.org >
Cc: Zeng Tao <prime.zeng@hisilicon.com >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:36 -07:00
Baoquan He
95da27c4c6
mm: ioremap: remove unneeded ioremap_allowed and iounmap_allowed
...
Now there are no users of ioremap_allowed and iounmap_allowed, clean
them up.
Link: https://lkml.kernel.org/r/20230706154520.11257-20-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com >
Reviewed-by: Christoph Hellwig <hch@lst.de >
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com >
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org >
Cc: Alexander Gordeev <agordeev@linux.ibm.com >
Cc: Arnd Bergmann <arnd@arndb.de >
Cc: Brian Cain <bcain@quicinc.com >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Christian Borntraeger <borntraeger@linux.ibm.com >
Cc: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Chris Zankel <chris@zankel.net >
Cc: David Laight <David.Laight@ACULAB.COM >
Cc: Geert Uytterhoeven <geert@linux-m68k.org >
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com >
Cc: Heiko Carstens <hca@linux.ibm.com >
Cc: Helge Deller <deller@gmx.de >
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com >
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de >
Cc: Jonas Bonn <jonas@southpole.se >
Cc: Matthew Wilcox <willy@infradead.org >
Cc: Max Filippov <jcmvbkbc@gmail.com >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Nathan Chancellor <nathan@kernel.org >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Niklas Schnelle <schnelle@linux.ibm.com >
Cc: Rich Felker <dalias@libc.org >
Cc: Stafford Horne <shorne@gmail.com >
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi >
Cc: Sven Schnelle <svens@linux.ibm.com >
Cc: Vasily Gorbik <gor@linux.ibm.com >
Cc: Vineet Gupta <vgupta@kernel.org >
Cc: Will Deacon <will@kernel.org >
Cc: Yoshinori Sato <ysato@users.sourceforge.jp >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:36 -07:00
Baoquan He
016fec9101
mm: move is_ioremap_addr() into new header file
...
Now is_ioremap_addr() is only used in kernel/iomem.c and will be used in
mm/ioremap.c. Move it into its own new header file linux/ioremap.h.
Link: https://lkml.kernel.org/r/20230706154520.11257-17-bhe@redhat.com
Suggested-by: Christoph Hellwig <hch@lst.de >
Signed-off-by: Baoquan He <bhe@redhat.com >
Reviewed-by: Christoph Hellwig <hch@lst.de >
Cc: Alexander Gordeev <agordeev@linux.ibm.com >
Cc: Arnd Bergmann <arnd@arndb.de >
Cc: Brian Cain <bcain@quicinc.com >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Christian Borntraeger <borntraeger@linux.ibm.com >
Cc: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Chris Zankel <chris@zankel.net >
Cc: David Laight <David.Laight@ACULAB.COM >
Cc: Geert Uytterhoeven <geert@linux-m68k.org >
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com >
Cc: Heiko Carstens <hca@linux.ibm.com >
Cc: Helge Deller <deller@gmx.de >
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com >
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de >
Cc: Jonas Bonn <jonas@southpole.se >
Cc: Kefeng Wang <wangkefeng.wang@huawei.com >
Cc: Matthew Wilcox <willy@infradead.org >
Cc: Max Filippov <jcmvbkbc@gmail.com >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Mike Rapoport (IBM) <rppt@kernel.org >
Cc: Nathan Chancellor <nathan@kernel.org >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Niklas Schnelle <schnelle@linux.ibm.com >
Cc: Rich Felker <dalias@libc.org >
Cc: Stafford Horne <shorne@gmail.com >
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi >
Cc: Sven Schnelle <svens@linux.ibm.com >
Cc: Vasily Gorbik <gor@linux.ibm.com >
Cc: Vineet Gupta <vgupta@kernel.org >
Cc: Will Deacon <will@kernel.org >
Cc: Yoshinori Sato <ysato@users.sourceforge.jp >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:35 -07:00
Baoquan He
dfdc6ba957
mm: ioremap: allow ARCH to have its own ioremap method definition
...
Architectures can be converted to GENERIC_IOREMAP to use the standard
ioremap_xxx() and iounmap() implementations. But some ARCH-es need
handling for ioremap_prot(), ioremap() and iounmap() that differs from the
standard methods.
In order to convert these ARCH-es to the GENERIC_IOREMAP method, allow
these architectures to have their own ioremap_prot(), ioremap() and
iounmap() definitions.
Link: https://lkml.kernel.org/r/20230706154520.11257-6-bhe@redhat.com
Signed-off-by: Baoquan He <bhe@redhat.com >
Acked-by: Arnd Bergmann <arnd@arndb.de >
Reviewed-by: Christoph Hellwig <hch@lst.de >
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com >
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org >
Cc: Alexander Gordeev <agordeev@linux.ibm.com >
Cc: Brian Cain <bcain@quicinc.com >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Christian Borntraeger <borntraeger@linux.ibm.com >
Cc: Christophe Leroy <christophe.leroy@csgroup.eu >
Cc: Chris Zankel <chris@zankel.net >
Cc: David Laight <David.Laight@ACULAB.COM >
Cc: Geert Uytterhoeven <geert@linux-m68k.org >
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com >
Cc: Heiko Carstens <hca@linux.ibm.com >
Cc: Helge Deller <deller@gmx.de >
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com >
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de >
Cc: Jonas Bonn <jonas@southpole.se >
Cc: Matthew Wilcox <willy@infradead.org >
Cc: Max Filippov <jcmvbkbc@gmail.com >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Nathan Chancellor <nathan@kernel.org >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Niklas Schnelle <schnelle@linux.ibm.com >
Cc: Rich Felker <dalias@libc.org >
Cc: Stafford Horne <shorne@gmail.com >
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi >
Cc: Sven Schnelle <svens@linux.ibm.com >
Cc: Vasily Gorbik <gor@linux.ibm.com >
Cc: Vineet Gupta <vgupta@kernel.org >
Cc: Will Deacon <will@kernel.org >
Cc: Yoshinori Sato <ysato@users.sourceforge.jp >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:33 -07:00
Christophe Leroy
7613366a19
mm/ioremap: define generic_ioremap_prot() and generic_iounmap()
...
Define a generic version of ioremap_prot() and iounmap() that
architectures can call after they have performed the necessary alteration
to parameters and/or necessary verifications.
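The layering described here can be sketched as an arch wrapper that checks and adjusts its arguments and then delegates to a shared back end. Everything below is a userspace toy: the toy_* functions, the reserved range, the protection bit and the returned "cookie" are all hypothetical and stand in for the real mapping code.

```c
/* Hypothetical constants for the toy model. */
#define TOY_IO_BASE        0x10000000UL	/* fake base of the mapping area */
#define TOY_RESERVED_END   0x1000UL	/* arch refuses to map below this */
#define TOY_PROT_UNCACHED  0x1UL	/* bit the arch always adds */

/* Records the protection bits the generic back end was called with. */
static unsigned long toy_last_prot;

/* Shared back end: performs the (fake) mapping, returns 0 on failure. */
static unsigned long toy_generic_ioremap_prot(unsigned long phys,
					      unsigned long size,
					      unsigned long prot)
{
	if (size == 0)
		return 0;
	toy_last_prot = prot;
	return TOY_IO_BASE + phys;
}

/* Arch wrapper: verify and alter parameters, then call the back end. */
static unsigned long toy_arch_ioremap_prot(unsigned long phys,
					   unsigned long size,
					   unsigned long prot)
{
	if (phys < TOY_RESERVED_END)
		return 0;		/* arch-specific verification */
	prot |= TOY_PROT_UNCACHED;	/* arch-specific alteration */
	return toy_generic_ioremap_prot(phys, size, prot);
}
```

The wrapper holds only the arch-specific policy, while the mapping work stays in one shared function.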
Link: https://lkml.kernel.org/r/20230706154520.11257-5-bhe@redhat.com
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu >
Signed-off-by: Baoquan He <bhe@redhat.com >
Reviewed-by: Christoph Hellwig <hch@lst.de >
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com >
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org >
Cc: Alexander Gordeev <agordeev@linux.ibm.com >
Cc: Arnd Bergmann <arnd@arndb.de >
Cc: Brian Cain <bcain@quicinc.com >
Cc: Catalin Marinas <catalin.marinas@arm.com >
Cc: Christian Borntraeger <borntraeger@linux.ibm.com >
Cc: Chris Zankel <chris@zankel.net >
Cc: David Laight <David.Laight@ACULAB.COM >
Cc: Geert Uytterhoeven <geert@linux-m68k.org >
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com >
Cc: Heiko Carstens <hca@linux.ibm.com >
Cc: Helge Deller <deller@gmx.de >
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com >
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de >
Cc: Jonas Bonn <jonas@southpole.se >
Cc: Matthew Wilcox <willy@infradead.org >
Cc: Max Filippov <jcmvbkbc@gmail.com >
Cc: Michael Ellerman <mpe@ellerman.id.au >
Cc: Nathan Chancellor <nathan@kernel.org >
Cc: Nicholas Piggin <npiggin@gmail.com >
Cc: Niklas Schnelle <schnelle@linux.ibm.com >
Cc: Rich Felker <dalias@libc.org >
Cc: Stafford Horne <shorne@gmail.com >
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi >
Cc: Sven Schnelle <svens@linux.ibm.com >
Cc: Vasily Gorbik <gor@linux.ibm.com >
Cc: Vineet Gupta <vgupta@kernel.org >
Cc: Will Deacon <will@kernel.org >
Cc: Yoshinori Sato <ysato@users.sourceforge.jp >
Signed-off-by: Andrew Morton <akpm@linux-foundation.org >
2023-08-18 10:12:32 -07:00