khugepaged: reduce race probability between migration and khugepaged

Suppose a folio is under migration, and khugepaged is also trying to
collapse it.  collapse_pte_mapped_thp() will retrieve the folio from the
page cache via filemap_lock_folio(), thus taking a reference on the folio
and sleeping on the folio lock, since the lock is held by the migration
path.  Migration will then fail in __folio_migrate_mapping ->
folio_ref_freeze, because the extra reference makes the refcount higher
than expected.  Reduce the probability of this race (and hence of
migration failure) by bailing out early if we detect that the PMD is a
migration entry.
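
For illustration only, here is a userspace toy model of why the extra
reference makes the freeze fail.  It assumes the freeze behaves like a
compare-and-swap on the reference count; all toy_* names and the expected
count of 2 are hypothetical and are not kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for struct folio: only the reference count is modelled. */
struct toy_folio {
	atomic_int refcount;
};

/* Model of taking a reference, as a page cache lookup would. */
static void toy_folio_get(struct toy_folio *folio)
{
	atomic_fetch_add(&folio->refcount, 1);
}

/*
 * Model of a refcount freeze (assumption: compare-and-swap semantics).
 * Succeeds, setting the count to zero, only if the count is exactly
 * what the caller expects.
 */
static bool toy_folio_ref_freeze(struct toy_folio *folio, int expected)
{
	return atomic_compare_exchange_strong(&folio->refcount, &expected, 0);
}

/*
 * Returns true if the modelled migration manages to freeze the folio.
 * extra_ref simulates a concurrent lookup that grabbed a reference
 * before migration attempted the freeze.
 */
static bool toy_migration_freeze(bool extra_ref)
{
	struct toy_folio folio;
	int expected = 2;	/* arbitrary count the migration side expects */

	atomic_init(&folio.refcount, expected);
	if (extra_ref)
		toy_folio_get(&folio);
	return toy_folio_ref_freeze(&folio, expected);
}
```

In this model toy_migration_freeze(false) succeeds, while
toy_migration_freeze(true) fails because the count no longer matches the
expected value.  Not taking that reference in the first place is what the
early bail-out aims for.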

This fixes the migration-shared-anon-thp testcase failure on Apple M3.

Note that this is not a "fix", since it only reduces the chance of
khugepaged interfering with migration; both kernel functionalities are
deemed "best-effort".

Link: https://lkml.kernel.org/r/20250704040417.63826-1-dev.jain@arm.com
Signed-off-by: Dev Jain <dev.jain@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Oscar Salvador <osalvador@suse.de>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Mariano Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Dev Jain
2025-07-04 09:34:17 +05:30
committed by Andrew Morton
parent ee58e38489
commit 7f810385fd


@@ -941,6 +941,14 @@ static inline int check_pmd_state(pmd_t *pmd)
 	if (pmd_none(pmde))
 		return SCAN_PMD_NONE;
+	/*
+	 * The folio may be under migration when khugepaged is trying to
+	 * collapse it. Migration success or failure will eventually end
+	 * up with a present PMD mapping a folio again.
+	 */
+	if (is_pmd_migration_entry(pmde))
+		return SCAN_PMD_MAPPED;
 	if (!pmd_present(pmde))
 		return SCAN_PMD_NULL;
 	if (pmd_trans_huge(pmde))