Commit Graph

SeongJae Park
c2b0cb96e7 selftests/damon/drgn_dump_damon_status: support quota goal_tuner dumping
Update drgn_dump_damon_status.py, which is used to dump the in-kernel
DAMON status for tests, to dump the goal_tuner setup status.

Link: https://lkml.kernel.org/r/20260310010529.91162-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:26 -07:00
SeongJae Park
c00863bc7c selftests/damon/_damon_sysfs: support goal_tuner setup
Add support for goal_tuner setup to the test-purpose DAMON sysfs
interface control helper, _damon_sysfs.py.

Link: https://lkml.kernel.org/r/20260310010529.91162-10-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:26 -07:00
SeongJae Park
d972d68d50 mm/damon/tests/core-kunit: test goal_tuner commit
Extend the damos_commit_quota() kunit test for the newly added
goal_tuner parameter.

Link: https://lkml.kernel.org/r/20260310010529.91162-9-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:26 -07:00
SeongJae Park
3eda936f2a Docs/ABI/damon: update for goal_tuner
Update the ABI document for the newly added goal_tuner sysfs file.

Link: https://lkml.kernel.org/r/20260310010529.91162-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:26 -07:00
SeongJae Park
d9cfe515d3 Docs/admin-guide/mm/damon/usage: document goal_tuner sysfs file
Update the DAMON usage document for the new sysfs file for goal-based
quota auto-tuning algorithm selection.

Link: https://lkml.kernel.org/r/20260310010529.91162-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:26 -07:00
SeongJae Park
5a242f9daf Docs/mm/damon/design: document the goal-based quota tuner selections
Update the design document for the newly added goal-based quota tuner
selection feature.

Link: https://lkml.kernel.org/r/20260310010529.91162-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:26 -07:00
SeongJae Park
e9a19cc85d mm/damon/sysfs-schemes: implement quotas->goal_tuner file
Add a new DAMON sysfs interface file, namely 'goal_tuner', under the DAMOS
quotas directory.  It is connected to the damos_quota->goal_tuner field.
Users can therefore select their preferred goal-based quota tuning
algorithm by writing the name of the tuner to the file.  Reading the file
returns the name of the currently selected tuner.

Link: https://lkml.kernel.org/r/20260310010529.91162-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:25 -07:00
SeongJae Park
af738a6a00 mm/damon/core: introduce DAMOS_QUOTA_GOAL_TUNER_TEMPORAL
Introduce a new goal-based DAMOS quota auto-tuning algorithm, namely
DAMOS_QUOTA_GOAL_TUNER_TEMPORAL ('temporal' for short).  The algorithm
aims to trigger the DAMOS action only for a limited time, to achieve the
goal as soon as possible.  During that period, it uses as much quota as
allowed.  Once the goal is achieved, it sets the quota to zero,
effectively deactivating the scheme.

Link: https://lkml.kernel.org/r/20260310010529.91162-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:25 -07:00
SeongJae Park
54419bbd0e mm/damon/core: allow quota goals set zero effective size quota
User-set explicit quotas (the size and time quotas) having a zero value
means the quotas are unset.  The effective size quota is set to the
minimum of the explicit quotas.  When quota goals are set, the goal-based
quota tuner can lower it further.  But the existing, single tuner never
sets the effective size quota to zero.  Relying on that fact, DAMON core
assumes a zero effective quota means the user has set no quota.

Multiple tuners are now allowed, though.  In the future, some tuners might
want to set a zero effective size quota, and there is no reason to
restrict that.  Under the current implementation, however, a zero
effective quota would be treated as no quota at all, making the scheme
work at its full speed.

Introduce a dedicated function for checking whether no quota is set.  The
function checks whether the user-set explicit quotas are zero and no goal
is installed.  It is decoupled from the effective quota value, and hence
allows future tuners to set a zero effective quota to deactivate the
scheme on purpose.

Link: https://lkml.kernel.org/r/20260310010529.91162-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:25 -07:00
SeongJae Park
8719c59c4b mm/damon/core: introduce damos_quota_goal_tuner
Patch series "mm/damon: support multiple goal-based quota tuning
algorithms".

Aim-oriented DAMOS quota auto-tuning uses a single tuning algorithm.  The
algorithm is designed to find a quota value that should be consistently
kept for achieving the aimed goal in the long term.  It is useful and
reliable for automatically operating systems in dynamic environments over
the long term.

As always, however, no single algorithm fits all.  When the environment
has static characteristics, or there are control towers not only in the
kernel space but also in the user space, the algorithm shows some
limitations.  In such environments, users want the kernel to work in a
more short-term, deterministic way.  There have been at least two reports
[1,2] of such cases.

Extend the DAMOS quota goal to support multiple quota tuning algorithms
that users can select from.  Keep the current algorithm as the default,
so as not to break existing users.  Also give it a name, "consist", as it
is designed to "consistently" apply the DAMOS action.  And introduce a
new tuning algorithm, namely "temporal".  It is designed to apply the
DAMOS action only temporarily, in a deterministic way.  In more detail,
as long as the goal is under-achieved, it uses the maximum quota
available.  Once the goal is over-achieved, it sets the quota to zero.

Tests
=====

I confirmed the feature is working as expected using the latest version of
the DAMON user-space tool, as below.

    $ # start DAMOS for reclaiming memory aiming 30% free memory
    $ sudo ./damo/damo start --damos_action pageout \
            --damos_quota_goal_tuner temporal \
            --damos_quota_goal node_mem_free_bp 30% 0 \
            --damos_quota_interval 1s \
            --damos_quota_space 100M

Note that versions >= 3.1.8 of the DAMON user-space tool support this
feature (--damos_quota_goal_tuner).  As expected, DAMOS stops reclaiming
memory as soon as the goal amount of free memory is reached.  When the
'consist' tuner is used, reclamation continues even after the goal amount
of free memory is reached, resulting in more than the goal amount of free
memory, again as expected.

Patch Sequence
==============

The first four patches implement the feature.  Patch 1 extends the core
API to allow multiple tuners and makes the current tuner the default and
only available tuner, namely 'consist'.  Patch 2 allows future tuners to
set a zero effective quota.  Patch 3 introduces the second tuner, namely
'temporal'.  Patch 4 further extends the DAMON sysfs API to let users use
it.

Three following patches (patches 5-7) update design, usage, and ABI
documents, respectively.

The final four patches (patches 8-11) add tests.  Patch 8 extends the
kunit test for online parameters commit to validate goal_tuner.  Patches
9-10 extend the testing-purpose DAMON sysfs control helper and the DAMON
status dumping tool to support the newly added feature.  Patch 11 extends
the existing online commit selftest to cover the new feature.

This patch (of 11):

The DAMOS quota goal feature uses a single feedback-loop based algorithm
for automatic tuning of the effective quota.  It is useful in dynamic
environments where systems are operated by the kernel alone over the long
term.  But no single algorithm fits all.  It is not easy to control in
environments having more static characteristics and user-space control
towers.  We have actually received multiple reports [1,2] of use cases
for which the algorithm is not optimal.

Introduce a new field of 'struct damos_quota', namely 'goal_tuner'.  It
specifies which tuning algorithm the given scheme should use, and allows
DAMON API callers to set it as they want.  Nonetheless, this commit
introduces no new tuning algorithm, but only the interface.  This commit
hence makes no behavioral change.  A new algorithm will be added by a
following commit.

Link: https://lkml.kernel.org/r/20260310010529.91162-2-sj@kernel.org
Link: https://lore.kernel.org/CALa+Y17__d=ZsM1yX+MXx0ozVdsXnFqF4p0g+kATEitrWyZFfg@mail.gmail.com [1]
Link: https://lore.kernel.org/20260204022537.814-1-yunjeong.mun@sk.com [2]
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:25 -07:00
Hui Zhu
86e69c020b mm/swap: strengthen locking assertions and invariants in cluster allocation
swap_cluster_alloc_table() requires several locks to be held by its
callers: ci->lock, the per-CPU swap_cluster lock, and, for non-solid-state
devices (non-SWP_SOLIDSTATE), the si->global_cluster_lock.

While most call paths (e.g., via cluster_alloc_swap_entry() or
alloc_swap_scan_list()) correctly acquire these locks before invocation,
the path through swap_reclaim_work() -> swap_reclaim_full_clusters() ->
isolate_lock_cluster() is distinct.  This path operates exclusively on
si->full_clusters, where the swap allocation tables are guaranteed to be
already allocated.  Consequently, isolate_lock_cluster() should never
trigger a call to swap_cluster_alloc_table() for these clusters.

Strengthen the locking and state assertions to formalize these invariants:

1. Add a lockdep_assert_held() for si->global_cluster_lock in
   swap_cluster_alloc_table() for non-SWP_SOLIDSTATE devices.
2. Reorder existing lockdep assertions in swap_cluster_alloc_table() to
   match the actual lock acquisition order (per-CPU lock, then global lock,
   then cluster lock).
3. Add a VM_WARN_ON_ONCE() in isolate_lock_cluster() to ensure that table
   allocations are only attempted for clusters being isolated from the
   free list. Attempting to allocate a table for a cluster from other
   lists (like the full list during reclaim) indicates a violation of
   subsystem invariants.

These changes ensure locking consistency and help catch potential
synchronization or logic issues during development.

[zhuhui@kylinos.cn: remove redundant comment, per Barry]
  Link: https://lkml.kernel.org/r/20260311022241.177801-1-hui.zhu@linux.dev
[zhuhui@kylinos.cn: initialize `flags', per Chris]
  Link: https://lkml.kernel.org/r/20260312023024.903143-1-hui.zhu@linux.dev
Link: https://lkml.kernel.org/r/20260310015657.42395-1-hui.zhu@linux.dev
Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
Reviewed-by: Youngjun Park <youngjun.park@lge.com>
Reviewed-by: Barry Song <baohua@kernel.org>
Acked-by: Chris Li <chrisl@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Kairui Song <kasong@tencent.com>
Cc: Kemeng Shi <shikemeng@huaweicloud.com>
Cc: Nhat Pham <nphamcs@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:25 -07:00
Anthony Yznaga
d239462787 mm: prevent droppable mappings from being locked
Droppable mappings must not be lockable.  There is a check for VMAs with
VM_DROPPABLE set in mlock_fixup() along with checks for other types of
unlockable VMAs which ensures this when calling mlock()/mlock2().

For mlockall(MCL_FUTURE), the check for unlockable VMAs is different.  In
apply_mlockall_flags(), if the flags parameter has MCL_FUTURE set, the
current task's mm's default VMA flag field mm->def_flags has VM_LOCKED
applied to it.  VM_LOCKONFAULT is also applied if MCL_ONFAULT is also set.
When these flags are set as default in this manner they are cleared in
__mmap_complete() for new mappings that do not support mlock.  A check for
VM_DROPPABLE in __mmap_complete() is missing resulting in droppable
mappings created with VM_LOCKED set.  To fix this and reduce that chance
of similar bugs in the future, introduce and use vma_supports_mlock().

Link: https://lkml.kernel.org/r/20260310155821.17869-1-anthony.yznaga@oracle.com
Fixes: 9651fcedf7 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings")
Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
Suggested-by: David Hildenbrand <david@kernel.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Pedro Falcato <pfalcato@suse.de>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Tested-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jason A. Donenfeld <jason@zx2c4.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:25 -07:00
Sergey Senozhatsky
301f392200 zram: unify and harden algo/priority params handling
We have two functions that accept algo= and priority= params -
algorithm_params_store() and recompress_store().  This patch unifies and
hardens handling of those parameters.

There are 4 possible cases:

- only priority= provided [recommended]
  We need to verify that provided priority value is
  within permitted range for each particular function.

- both algo= and priority= provided
  We cannot prioritize one over the other.  All we
  should do is verify that zram is configured in the
  way that user-space expects it to be.  Namely, that
  zram indeed has compressor algo= set up at the given
  priority=.

- only algo= provided [not recommended]
  We should lookup priority in compressors list.

- none provided [not recommended]
  Just use function's defaults.

Link: https://lkml.kernel.org/r/20260311084312.1766036-7-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Suggested-by: Minchan Kim <minchan@kernel.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: gao xu <gaoxu2@honor.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:25 -07:00
Sergey Senozhatsky
cedfa028b5 zram: remove chained recompression
Chained recompression has unpredictable behavior and is not useful in
practice.

First, systems usually configure just one alternative recompression
algorithm, which has slower compression/decompression but better
compression ratio.  A single alternative algorithm doesn't need chaining.

Second, even with multiple recompression algorithms, chained recompression
is suboptimal.  If a lower priority algorithm succeeds, the page is never
attempted with a higher priority algorithm, leading to worse memory
savings.  If a lower priority algorithm fails, the page is still attempted
with a higher priority algorithm, wasting resources on the failed lower
priority attempt.

In either case, the system would be better off targeting a specific
priority directly.

Chained recompression also significantly complicates the code.  Remove it.

Link: https://lkml.kernel.org/r/20260311084312.1766036-6-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: gao xu <gaoxu2@honor.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:24 -07:00
Sergey Senozhatsky
be5f13d948 zram: update recompression documentation
Emphasize usage of the `priority` parameter for recompression and explain
why `algo` parameter can lead to unexpected behavior and thus is not
recommended.

Link: https://lkml.kernel.org/r/20260311084312.1766036-5-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: gao xu <gaoxu2@honor.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:24 -07:00
Sergey Senozhatsky
5004a27edb zram: drop ->num_active_comps
It's not entirely correct to use ->num_active_comps for the max-prio
limit, as ->num_active_comps only gives the number of configured
algorithms, not the max configured priority.  For instance, in the
following theoretical example:

    [lz4] [nil] [nil] [deflate]

->num_active_comps is 2, while the actual max-prio is 3.

Drop ->num_active_comps and use ZRAM_MAX_COMPS instead.

Link: https://lkml.kernel.org/r/20260311084312.1766036-4-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Suggested-by: Minchan Kim <minchan@kernel.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: gao xu <gaoxu2@honor.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:24 -07:00
Sergey Senozhatsky
ed19b9d550 zram: do not autocorrect bad recompression parameters
Do not silently autocorrect bad recompression priority parameter value and
just error out.

Link: https://lkml.kernel.org/r/20260311084312.1766036-3-senozhatsky@chromium.org
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Suggested-by: Minchan Kim <minchan@kernel.org>
Cc: Brian Geffon <bgeffon@google.com>
Cc: gao xu <gaoxu2@honor.com>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:24 -07:00
Sergey Senozhatsky
241f9005b1 zram: do not permit params change after init
Patch series "zram: recompression cleanups and tweaks", v2.

This series is a somewhat random mix of fixups, recompression cleanups
and improvements, partly based on internal conversations.  A few patches
in the series remove unexpected or confusing behaviour, e.g.
auto-correction of a bad priority= param for recompression, which should
have always been just an error.  It also removes "chained recompression",
which has tricky, unexpected and at times confusing behaviour.  We also
unify and harden the handling of the algo/priority params.  There is also
the addition of a missing device lock in algorithm_params_store(), which
previously permitted modification of algo params while the device is
active.


This patch (of 6):

First, algorithm_params_store(), like any sysfs handler, should grab
device lock.

Second, like any write() sysfs handler, it should grab device lock in
exclusive mode.

Third, it should not permit changing algos' parameters after device init,
as this doesn't make sense - we cannot compress with one C/D dict and
then just change the C/D dict to a different one, for example.

Another thing to notice is that algorithm_params_store() accesses the
device's ->comp_algs for algo priority lookup, which in general should be
protected by the device lock in exclusive mode.

Link: https://lkml.kernel.org/r/20260311084312.1766036-1-senozhatsky@chromium.org
Link: https://lkml.kernel.org/r/20260311084312.1766036-2-senozhatsky@chromium.org
Fixes: 4eac932103 ("zram: introduce algorithm_params device attribute")
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Brian Geffon <bgeffon@google.com>
Cc: gao xu <gaoxu2@honor.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:24 -07:00
Pratyush Yadav
22bdab8e98 kho: drop restriction on maximum page order
KHO currently restricts the maximum order of a restored page to the
maximum order supported by the buddy allocator.  While this works fine
for much of the data passed across kexec, it is possible to have pages of
a larger order than MAX_PAGE_ORDER.

For one, kho_preserve_pages() can produce a larger order if the number of
pages is large enough, since it tries to combine multiple aligned 0-order
preservations into one higher-order preservation.

For another, upcoming support for hugepages can have gigantic hugepages
being preserved over KHO.

There is no real reason for this limit.  The KHO preservation machinery
can handle any page order.  Remove this artificial restriction on max page
order.

Link: https://lkml.kernel.org/r/20260309123410.382308-2-pratyush@kernel.org
Signed-off-by: Pratyush Yadav <pratyush@kernel.org>
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Cc: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:24 -07:00
Pratyush Yadav (Google)
91e74fa8b1 kho: make sure preservations do not span multiple NUMA nodes
The KHO restoration machinery is not capable of dealing with preservations
that span multiple NUMA nodes.  kho_preserve_folio() guarantees the
preservation will only span one NUMA node since folios can't span multiple
nodes.

This leaves kho_preserve_pages().  Semantically, kho_preserve_pages()
only deals with 0-order pages, so all preservations should be single
pages.  In practice, however, it combines preservations into higher
orders for efficiency.  This can result in a preservation spanning
multiple nodes.  Break up such a preservation into smaller orders when
that happens.

Link: https://lkml.kernel.org/r/20260309123410.382308-1-pratyush@kernel.org
Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
Suggested-by: Pasha Tatashin <pasha.tatashin@soleen.com>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Graf <graf@amazon.com>
Cc: Pasha Tatashin <pasha.tatashin@soleen.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:24 -07:00
David Hildenbrand (Arm)
396042fb2b KVM: PPC: remove hugetlb.h inclusion
hugetlb.h is no longer required now that we moved vma_kernel_pagesize() to
mm.h.

Link: https://lkml.kernel.org/r/20260309151901.123947-5-david@kernel.org
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Reviewed-by: Ritesh Harjani <ritesh.list@gmail.com>
Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:23 -07:00
David Hildenbrand (Arm)
e8301b6adc KVM: remove hugetlb.h inclusion
hugetlb.h is no longer required now that we moved vma_kernel_pagesize() to
mm.h.

Link: https://lkml.kernel.org/r/20260309151901.123947-4-david@kernel.org
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:23 -07:00
David Hildenbrand (Arm)
a9496e9e4b mm: move vma_mmu_pagesize() from hugetlb to vma.c
vma_mmu_pagesize() is also queried on non-hugetlb VMAs and does not really
belong into hugetlb.c.

PPC64 provides a custom override with CONFIG_HUGETLB_PAGE, see
arch/powerpc/mm/book3s64/slice.c, so we cannot easily make this a static
inline function.

So let's move it to vma.c and add some proper kerneldoc.

To make vma tests happy, add a simple vma_kernel_pagesize() stub in
tools/testing/vma/include/custom.h.

Link: https://lkml.kernel.org/r/20260309151901.123947-3-david@kernel.org
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:23 -07:00
David Hildenbrand (Arm)
341ffe82a7 mm: move vma_kernel_pagesize() from hugetlb to mm.h
Patch series "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c", v2.

Looking into vma_(kernel|mmu)_pagesize(), I realized that there is one
scenario where DAX would not do the right thing when the kernel is not
compiled with hugetlb support.

Without hugetlb support, vma_(kernel|mmu)_pagesize() will always return
PAGE_SIZE instead of using the ->pagesize() result provided by dax-device
code.

Fix that by moving vma_kernel_pagesize() to core MM code, where it
belongs.  I don't think this is stable material, but am not 100% sure.

Also, move vma_mmu_pagesize() while at it.  Remove the unnecessary
hugetlb.h inclusion from KVM code.


This patch (of 4):

In the past, only hugetlb had special "vma_kernel_pagesize()"
requirements, so it provided its own implementation.

In commit 05ea88608d ("mm, hugetlbfs: introduce ->pagesize() to
vm_operations_struct") we generalized that approach by providing a
vm_ops->pagesize() callback to be used by device-dax.

Once device-dax started using that callback in commit c1d53b92b9
("device-dax: implement ->pagesize() for smaps to report MMUPageSize") it
was missed that CONFIG_DEV_DAX does not depend on hugetlb support.

So building a kernel with CONFIG_DEV_DAX but without CONFIG_HUGETLBFS
would not pick up that value.

Fix it by moving vma_kernel_pagesize() to mm.h, providing only a single
implementation.  While at it, improve the kerneldoc a bit.

Ideally, we'd move vma_mmu_pagesize() to the header as well.  However,
its __weak symbol might be overridden by a PPC variant in hugetlb code.
So let's leave it there for now, as it really only matters for some
hugetlb oddities.

This was found by code inspection.

Link: https://lkml.kernel.org/r/20260309151901.123947-1-david@kernel.org
Link: https://lkml.kernel.org/r/20260309151901.123947-2-david@kernel.org
Fixes: c1d53b92b9 ("device-dax: implement ->pagesize() for smaps to report MMUPageSize")
Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: "Christophe Leroy (CS GROUP)" <chleroy@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:23 -07:00
Akinobu Mita
1eba4c9599 docs: mm: fix typo in numa_memory_policy.rst
Fix a typo: MPOL_INTERLEAVED -> MPOL_INTERLEAVE.

Link: https://lkml.kernel.org/r/20260310151837.5888-1-akinobu.mita@gmail.com
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:23 -07:00
SeongJae Park
a4e82de81f Docs/mm/damon/index: fix typo: autoamted -> automated
There is an obvious typo.  Fix it (s/autoamted/automated/).

Link: https://lkml.kernel.org/r/20260307195356.203753-8-sj@kernel.org
Fixes: 32d11b3208 ("Docs/mm/damon/index: simplify the intro")
Signed-off-by: SeongJae Park <sj@kernel.org>
Acked-by: wang lian <lianux.mm@gmail.com>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:23 -07:00
SeongJae Park
20675fc8c0 Docs/mm/damon/maintainer-profile: use flexible review cadence
The document mentions that the maintainer is working in the usual 9-5
fashion.  The maintainer nowadays prefers working in a more flexible way.
Update the document so that contributors do not get wrong expectations
about response times.

Link: https://lkml.kernel.org/r/20260307195356.203753-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Acked-by: wang lian <lianux.mm@gmail.com>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:23 -07:00
SeongJae Park
d7f00084f6 Docs/admin-guide/mm/damon/lru_sort: fix intervals autotune parameter name
The section name should be the same as the parameter name.  Fix it.

Link: https://lkml.kernel.org/r/20260307195356.203753-6-sj@kernel.org
Fixes: ed581147a4 ("Docs/admin-guide/mm/damon/lru_sort: document intervals autotuning")
Signed-off-by: SeongJae Park <sj@kernel.org>
Acked-by: wang lian <lianux.mm@gmail.com>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:22 -07:00
SeongJae Park
3802e1d98e mm/damon: document non-zero length damon_region assumption
DAMON regions are assumed to always be non-zero length.  There was
confusion [1] about this, probably due to the lack of documentation.
Document it.

Link: https://lkml.kernel.org/r/20260307195356.203753-5-sj@kernel.org
Link: https://lore.kernel.org/20251231070029.79682-1-sj@kernel.org/ [1]
Signed-off-by: SeongJae Park <sj@kernel.org>
Acked-by: wang lian <lianux.mm@gmail.com>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:22 -07:00
SeongJae Park
2a5f4454e0 mm/damon/core: clarify damon_set_attrs() usages
damon_set_attrs() is called for multiple purposes from multiple places.
Calling it in an unsafe context can pollute DAMON's internal state and
result in unexpected behaviors.  Clarify when it is safe to call, and
where it is being called.

Link: https://lkml.kernel.org/r/20260307195356.203753-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Acked-by: wang lian <lianux.mm@gmail.com>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:22 -07:00
SeongJae Park
fd83b0d1c4 mm/damon/tests/core-kunit: add a test for damon_is_last_region()
There was a bug [1] in damon_is_last_region().  Add a kunit test to avoid
reintroducing it.

Link: https://lkml.kernel.org/r/20260307195356.203753-3-sj@kernel.org
Link: https://lore.kernel.org/20260114152049.99727-1-sj@kernel.org/ [1]
Signed-off-by: SeongJae Park <sj@kernel.org>
Tested-by: wang lian <lianux.mm@gmail.com>
Reviewed-by: wang lian <lianux.mm@gmail.com>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:22 -07:00
SeongJae Park
5d6a520aff mm/damon/core: use mult_frac()
Patch series "mm/damon: improve/fixup/update ratio calculation, test and
documentation".

Yet another batch of misc/minor improvements and fixups.  Use mult_frac()
instead of open-coded rate calculations (patch 1).  Add a test for a
previously found and fixed bug (patch 2).  Improve and update comments and
documentation for easier code review and up-to-date information (patches
3-6).  Finally, fix an obvious typo (patch 7).


This patch (of 7):

There are multiple places in the core code that open-code rate
calculations.  Use mult_frac(), which exists for exactly this purpose and
is safer against overflow and precision loss.

Link: https://lkml.kernel.org/r/20260307195356.203753-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20260307195356.203753-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Acked-by: wang lian <lianux.mm@gmail.com>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:22 -07:00
SeongJae Park
23754a36cd mm/damon/core: use time_after_eq() in kdamond_fn()
damon_ctx->passed_sample_intervals and damon_ctx->next_*_sis are unsigned
long.  Those are compared in kdamond_fn() using normal comparison
operators, which is not overflow-safe.  Use time_after_eq(), which is
safe against overflows when correctly used, instead.

Link: https://lkml.kernel.org/r/20260307194915.203169-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:22 -07:00
SeongJae Park
f05e253637 mm/damon/core: use time_before() for next_apply_sis
damon_ctx->passed_sample_intervals and damos->next_apply_sis are unsigned
long, and compared via normal comparison operators, which is not
overflow-safe.  Use time_before(), which is safe against overflows when
correctly used, instead.

Link: https://lkml.kernel.org/r/20260307194915.203169-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:22 -07:00
SeongJae Park
7e6c650fdb mm/damon/core: remove damos_set_next_apply_sis() duplicates
Patch series "mm/damon/core: make passed_sample_intervals comparisons
overflow-safe".

DAMON accounts time using its own jiffies-like time counter, namely
damon_ctx->passed_sample_intervals.  The counter is incremented on each
iteration of kdamond_fn() main loop, which sleeps at least one sample
interval.  Hence the name is like that.

DAMON has time-periodic operations including monitoring results
aggregation and DAMOS action application.  DAMON sets the next time to do
each of such operations in the passed_sample_intervals unit.  It does the
operation when the counter becomes equal to or larger than the pre-set
value, and then updates the next time for the operation.  Note that the
operation is done not only when the values exactly match but also when the
time has passed, because the values can be updated for online-committed
DAMON parameters.

The counter is of 'unsigned long' type, and the comparison is done using
normal comparison operators, which is not overflow-safe.  This can cause
rare and limited but odd situations.

Let's suppose there is an operation that should be executed every 20
sampling intervals, and the passed_sample_intervals value for the next
execution of the operation is ULONG_MAX - 3.  Once
passed_sample_intervals reaches ULONG_MAX - 3, the operation will be
executed, and the next time value for doing the operation becomes 16
(ULONG_MAX - 3 + 20, wrapped around), since overflow happens.  In the
next iteration of the kdamond_fn() main loop, passed_sample_intervals is
larger than the next operation time value, so the operation will be
executed again.  It will continue executing the operation on every
iteration, until passed_sample_intervals also overflows.

Note that this will not be common or problematic in the real world.  The
sampling interval, which each passed_sample_intervals increment takes, is
5 ms by default, and is usually [auto-]tuned to hundreds of milliseconds.
That means it takes about 248 days or 4,971 days to overflow on 32 bit
machines when the sampling interval is 5 ms or 100 ms, respectively
(1<<32 * sampling_interval_in_seconds / 3600 / 24).  On 64 bit machines,
the numbers become about 2.9 billion and 58.5 billion years.  So the real
user impact is negligible.  But it is still better to fix this, as long as
the fix is simple and efficient.

Fix this by simply replacing the overflow-unsafe native comparison
operators with the existing overflow-safe time comparison helpers.

The first patch only cleans up the next DAMOS action application time
setup for consistency and reduced code.  The second and the third patches
update DAMOS action application time setup and rest, respectively.


This patch (of 3):

There is a function for damos->next_apply_sis setup, but some places
open-code it.  Use the helper consistently.

Link: https://lkml.kernel.org/r/20260307194915.203169-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:21 -07:00
SeongJae Park
bfb1523cde Docs/mm/damon/design: document the power-of-two limitation for addr_unit
The min_region_sz is set as max(DAMON_MIN_REGION_SZ / addr_unit, 1).
DAMON_MIN_REGION_SZ is the same as PAGE_SIZE, and addr_unit is what the
user can arbitrarily set.  Commit c80f46ac22 ("mm/damon/core: disallow
non-power of two min_region_sz") made min_region_sz always a power of
two.  Hence, addr_unit should be a power of two when it is smaller than
PAGE_SIZE.  Although 'addr_unit' is a user-exposed parameter, the rule is
not documented.  This can confuse users.  Specifically, if the user sets
addr_unit to a value that is smaller than PAGE_SIZE and not a power of
two, the setup will explicitly fail.

Document the rule in the design document.  Usage documents reference the
design document for details, so updating only the design document should
suffice.

Link: https://lkml.kernel.org/r/20260307194222.202075-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:21 -07:00
SeongJae Park
a260de7d45 mm/damon/tests/core-kunit: add a test for damon_commit_ctx()
Patch series "mm/damon: test and document power-of-2 min_region_sz
requirement".

Since commit c80f46ac22 ("mm/damon/core: disallow non-power of two
min_region_sz"), min_region_sz is always restricted to be a power of two.
Add a kunit test to confirm the functionality.  Also, the change adds a
restriction to the addr_unit parameter.  Clarify it in the document.


This patch (of 2):

Add a kunit test confirming that the change made in commit c80f46ac22
("mm/damon/core: disallow non-power of two min_region_sz") functions as
expected.

Link: https://lkml.kernel.org/r/20260307194222.202075-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: SeongJae Park <sj@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:21 -07:00
SeongJae Park
300252ebb1 selftests/damon/config: enable DAMON_DEBUG_SANITY
CONFIG_DAMON_DEBUG_SANITY is recommended for DAMON development and test
setups.  Enable it on the build config for DAMON selftests.

Link: https://lkml.kernel.org/r/20260306152914.86303-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:21 -07:00
SeongJae Park
09cbdf7dbe mm/damon/tests/.kunitconfig: enable DAMON_DEBUG_SANITY
CONFIG_DAMON_DEBUG_SANITY is recommended for DAMON development and test
setups.  Enable it on the default configurations for DAMON kunit test run.

Link: https://lkml.kernel.org/r/20260306152914.86303-10-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:21 -07:00
SeongJae Park
c556187b6e mm/damon/core: add damon_reset_aggregated() debug_sanity check
At the time of damon_reset_aggregated(), aggregation for the interval
should be completed, and hence nr_accesses and nr_accesses_bp should
match.  A few past bugs broke this assumption, caused by online parameter
updates and complicated nr_accesses handling changes.  Add a sanity check
for that under CONFIG_DAMON_DEBUG_SANITY.

Link: https://lkml.kernel.org/r/20260306152914.86303-9-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:21 -07:00
SeongJae Park
6aa1f78354 mm/damon/core: add damon_split_region_at() debug_sanity check
damon_split_region_at() should be called with the correct address to split
on.  Add a sanity check for that under CONFIG_DAMON_DEBUG_SANITY.

Link: https://lkml.kernel.org/r/20260306152914.86303-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:20 -07:00
SeongJae Park
c070da2391 mm/damon/core: add damon_merge_regions_of() debug_sanity check
damon_merge_regions_of() should be called only after aggregation is
finished and therefore each region's nr_accesses and nr_accesses_bp match.
There were bugs that broke this assumption during the development of
online DAMON parameter updates and monitoring results handling changes.
Add a sanity check for that under CONFIG_DAMON_DEBUG_SANITY.

Link: https://lkml.kernel.org/r/20260306152914.86303-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:20 -07:00
SeongJae Park
0bb7682fdb mm/damon/core: add damon_merge_two_regions() debug_sanity check
A data corruption could cause damon_merge_two_regions() to create
zero-length DAMON regions.  Add a sanity check for that under
CONFIG_DAMON_DEBUG_SANITY.

Link: https://lkml.kernel.org/r/20260306152914.86303-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:20 -07:00
SeongJae Park
242a764abe mm/damon/core: add damon_nr_regions() debug_sanity check
damon_target->nr_regions is introduced to get the number of regions
quickly, without having to always iterate the regions list.  Add a sanity
check that it stays consistent with the list, under
CONFIG_DAMON_DEBUG_SANITY.

Link: https://lkml.kernel.org/r/20260306152914.86303-5-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:20 -07:00
SeongJae Park
9a647920d0 mm/damon/core: add damon_del_region() debug_sanity check
damon_del_region() should be called for targets that have one or more
regions.  Add a sanity check for that under CONFIG_DAMON_DEBUG_SANITY.

Link: https://lkml.kernel.org/r/20260306152914.86303-4-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:20 -07:00
SeongJae Park
b0264a951c mm/damon/core: add damon_new_region() debug_sanity check
damon_new_region() is supposed to be called with only valid address range
arguments.  Do the check under DAMON_DEBUG_SANITY.

Link: https://lkml.kernel.org/r/20260306152914.86303-3-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:20 -07:00
SeongJae Park
62f0582875 mm/damon: add CONFIG_DAMON_DEBUG_SANITY
Patch series "mm/damon: add optional debugging-purpose sanity checks".

DAMON code has a few assumptions that can be critical if violated.
Validating the assumptions in code can be useful for finding such critical
bugs.  I was actually adding some such additional sanity checks in my
personal tree, and those were useful for finding bugs that I made during
the development of new patches.  We also found [1] that the assumptions
are sometimes misunderstood.  The validation can work as good
documentation for such cases.

Add some of such debugging-purpose sanity checks.  Because those
additional checks can impose more overhead, make them optional via a new
config, CONFIG_DAMON_DEBUG_SANITY, that is recommended for only
development and test setups.  As recommended, enable it for DAMON kunit
tests and selftests.

Note that the verification only does WARN_ON() for each insanity.  The
developer or tester may want to set panic_on_oops together with it, like
damon-tests/corr does [2].


This patch (of 10):

Add a new build config that enables additional DAMON sanity checks.  It
is recommended to be enabled only on development and test setups, since it
can impose additional overhead.

Link: https://lkml.kernel.org/r/20260306152914.86303-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20260306152914.86303-2-sj@kernel.org
Link: https://lore.kernel.org/20251231070029.79682-1-sj@kernel.org [1]
Link: a80fbee55e [2]
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:20 -07:00
Usama Arif
5a14198ec6 mm/migrate_device: document folio_get requirement before frozen PMD split
split_huge_pmd_address() with freeze=true splits a PMD migration entry
into PTE migration entries, consuming one folio reference in the process. 
The folio_get() before it provides this reference.

Add a comment explaining this relationship.  The expected folio refcount
at the start of migrate_vma_split_unmapped_folio() is 1.

Link: https://lkml.kernel.org/r/20260309212502.3922825-1-usama.arif@linux.dev
Signed-off-by: Usama Arif <usama.arif@linux.dev>
Suggested-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Nico Pache <npache@redhat.com>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Byungchul Park <byungchul@sk.com>
Cc: Gregory Price <gourry@gourry.net>
Cc: "Huang, Ying" <ying.huang@linux.alibaba.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Joshua Hahn <joshua.hahnjy@gmail.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Rakie Kim <rakie.kim@sk.com>
Cc: Ying Huang <ying.huang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:19 -07:00
Arnd Bergmann
d765108993 ubsan: turn off kmsan inside of ubsan instrumentation
The structure initialization in the two type mismatch handling functions
causes a call to __msan_memset() to be generated inside of a UACCESS
block, which in turn leads to an objtool warning about possibly leaking
uaccess-enabled state:

lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch+0xda: call to __msan_memset() with UACCESS enabled
lib/ubsan.o: warning: objtool: __ubsan_handle_type_mismatch_v1+0xf4: call to __msan_memset() with UACCESS enabled

Most likely __msan_memset() is safe to be called here and could be added
to the uaccess_safe_builtin[] list of safe functions, but seeing that the
ubsan file already has kasan, ubsan, and kcsan instrumentation disabled,
it is probably a good idea to also turn off kmsan here.  In particular,
this also avoids the risk of recursion between ubsan and kmsan checks in
other functions of this file.

I saw this happen while testing randconfig builds with clang-22, but did
not try older versions, or attempt to see which kernel change introduced
the warning.

Link: https://lkml.kernel.org/r/20260306150613.350029-1-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Kees Cook <kees@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:19 -07:00
Byungchul Park
db359fccf2 mm: introduce a new page type for page pool in page type
Currently, the condition 'page->pp_magic == PP_SIGNATURE' is used to
determine if a page belongs to a page pool.  However, with the planned
removal of @pp_magic, we should instead leverage the page_type in struct
page, such as PGTY_netpp, for this purpose.

Introduce and use the page type APIs e.g.  PageNetpp(), __SetPageNetpp(),
and __ClearPageNetpp() instead, and remove the existing APIs accessing
@pp_magic e.g.  page_pool_page_is_pp(), netmem_or_pp_magic(), and
netmem_clear_pp_magic().

Plus, add @page_type to struct net_iov at the same offset as in struct
page, so as to use the page_type APIs for struct net_iov as well.  While
at it, reorder @type and @owner in struct net_iov to avoid a hole and an
increase in the struct size.

This work was inspired by the following link:

  https://lore.kernel.org/all/582f41c0-2742-4400-9c81-0d46bf4e8314@gmail.com/

While at it, move the sanity check for page pool to the free path.

[byungchul@sk.com: gate the sanity check, per Johannes]
  Link: https://lkml.kernel.org/r/20260316223113.20097-1-byungchul@sk.com
Link: https://lkml.kernel.org/r/20260224051347.19621-1-byungchul@sk.com
Co-developed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Byungchul Park <byungchul@sk.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Zi Yan <ziy@nvidia.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrew Lunn <andrew+netdev@lunn.ch>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Brendan Jackman <jackmanb@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David S. Miller <davem@davemloft.net>
Cc: David Wei <dw@davidwei.uk>
Cc: Dragos Tatulea <dtatulea@nvidia.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Leon Romanovsky <leon@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Mark Bloch <mbloch@nvidia.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Saeed Mahameed <saeedm@nvidia.com>
Cc: Simon Horman <horms@kernel.org>
Cc: Stanislav Fomichev <sdf@fomichev.me>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Taehee Yoo <ap420073@gmail.com>
Cc: Tariq Toukan <tariqt@nvidia.com>
Cc: Usama Arif <usamaarif642@gmail.com>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-05 13:53:19 -07:00