From a6715d7ec472a476db17787697a4abda62962284 Mon Sep 17 00:00:00 2001
From: Evangelos Petrongonas
Date: Fri, 10 Apr 2026 01:16:05 +0000
Subject: [PATCH 1/4] kho: skip KHO for crash kernel

kho_fill_kimage() unconditionally populates the kimage with KHO metadata
for every kexec image type. When the image is a crash kernel, this is
problematic: the crash kernel can run in a small reserved region, and
the KHO scratch areas can sit outside it. The crash kernel then faults
during kho_memory_init() when it tries phys_to_virt() on the KHO FDT
address:

  Unable to handle kernel paging request at virtual address xxxxxxxx
  ...
  fdt_offset_ptr+...
  fdt_check_node_offset_+...
  fdt_first_property_offset+...
  fdt_get_property_namelen_+...
  fdt_getprop+...
  kho_memory_init+...
  mm_core_init+...
  start_kernel+...

kho_locate_mem_hole() already skips KHO logic for KEXEC_TYPE_CRASH
images, but kho_fill_kimage() was missing the same guard. As
kho_fill_kimage() is the single point that populates image->kho.fdt and
image->kho.scratch, fixing it here is sufficient for both arm64 and x86,
as the FDT and boot_params paths bail out when these fields are unset.
Fixes: d7255959b69a ("kho: allow kexec load before KHO finalization")
Signed-off-by: Evangelos Petrongonas
Reviewed-by: Mike Rapoport (Microsoft)
Link: https://patch.msgid.link/20260410011609.1103-1-epetron@amazon.de
Signed-off-by: Mike Rapoport (Microsoft)
---
 kernel/liveupdate/kexec_handover.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/liveupdate/kexec_handover.c b/kernel/liveupdate/kexec_handover.c
index 94762de1fe5f..4fde8325c49f 100644
--- a/kernel/liveupdate/kexec_handover.c
+++ b/kernel/liveupdate/kexec_handover.c
@@ -1702,7 +1702,7 @@ int kho_fill_kimage(struct kimage *image)
 	int err = 0;
 	struct kexec_buf scratch;
 
-	if (!kho_enable)
+	if (!kho_enable || image->type == KEXEC_TYPE_CRASH)
 		return 0;
 
 	image->kho.fdt = virt_to_phys(kho_out.fdt);

From 0fb1daf0b78d0e23b63b6b65de56d4a3fd83bc14 Mon Sep 17 00:00:00 2001
From: David Carlier
Date: Wed, 15 Apr 2026 06:23:00 +0100
Subject: [PATCH 2/4] mm/memfd_luo: report error when restoring a folio fails
 mid-loop

memfd_luo_retrieve_folios() initialises err to -EIO, but the
per-iteration calls to mem_cgroup_charge(), shmem_add_to_page_cache()
and shmem_inode_acct_blocks() reuse and overwrite err. Once any
iteration completes successfully, err becomes zero. If a later
iteration's kho_restore_folio() returns NULL, the failure path jumps to
put_folios without resetting err, so the function returns 0.

The caller memfd_luo_retrieve() then takes the success path, sets
args->file and reports the restore as successful, leaving userspace
with a partially populated memfd and no indication that anything went
wrong.

Set err to -EIO in the kho_restore_folio() failure branch so the error
is propagated to the caller.
Signed-off-by: David Carlier
Reviewed-by: Pratyush Yadav
Fixes: b3749f174d68 ("mm: memfd_luo: allow preserving memfd")
Link: https://patch.msgid.link/20260415052300.362539-1-devnexen@gmail.com
Signed-off-by: Mike Rapoport (Microsoft)
---
 mm/memfd_luo.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/memfd_luo.c b/mm/memfd_luo.c
index b02b503c750d..35d1247281e0 100644
--- a/mm/memfd_luo.c
+++ b/mm/memfd_luo.c
@@ -427,6 +427,7 @@ static int memfd_luo_retrieve_folios(struct file *file,
 		if (!folio) {
 			pr_err("Unable to restore folio at physical address: %llx\n",
 			       phys);
+			err = -EIO;
 			goto put_folios;
 		}
 		index = pfolio->index;

From d581fc99d3b958cb6e363104e9aab57f36aee6f3 Mon Sep 17 00:00:00 2001
From: David Carlier
Date: Thu, 23 Apr 2026 13:56:47 +0100
Subject: [PATCH 3/4] mm/memfd_luo: reject memfds whose page count exceeds
 UINT_MAX

memfd_luo_preserve_folios() declares max_folios as unsigned int and
computes it from the inode size, then passes it to memfd_pin_folios(),
which itself caps max_folios at unsigned int. For files whose base-page
count exceeds UINT_MAX (larger than 16 TiB with 4 KiB pages), the
assignment truncates silently: only a prefix of the file gets pinned
and preserved, while memfd_luo_preserve() still records the full inode
size in ser->size.

On retrieve, the inode is restored to the full size but only the
preserved prefix repopulates the page cache, so the tail comes back as
holes and user data is silently lost across the live update.

Reject such files at preserve time with -EFBIG rather than chunk the
pin loop, which would also require enlarging the preserved folios array
well beyond what is practical.
Fixes: b3749f174d68 ("mm: memfd_luo: allow preserving memfd")
Signed-off-by: David Carlier
Reviewed-by: Pasha Tatashin
Reviewed-by: Pratyush Yadav
Link: https://patch.msgid.link/20260423125648.152113-1-devnexen@gmail.com
Signed-off-by: Pasha Tatashin
---
 mm/memfd_luo.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/mm/memfd_luo.c b/mm/memfd_luo.c
index 35d1247281e0..94ae113f68f6 100644
--- a/mm/memfd_luo.c
+++ b/mm/memfd_luo.c
@@ -259,7 +259,7 @@ static int memfd_luo_preserve(struct liveupdate_file_op_args *args)
 	struct inode *inode = file_inode(args->file);
 	struct memfd_luo_folio_ser *folios_ser;
 	struct memfd_luo_ser *ser;
-	u64 nr_folios;
+	u64 nr_folios, inode_size;
 	int err = 0, seals;
 
 	inode_lock(inode);
@@ -285,7 +285,18 @@ static int memfd_luo_preserve(struct liveupdate_file_op_args *args)
 	}
 
 	ser->pos = args->file->f_pos;
-	ser->size = i_size_read(inode);
+	inode_size = i_size_read(inode);
+
+	/*
+	 * memfd_pin_folios() caps at UINT_MAX folios; refuse larger
+	 * files to avoid silently preserving only a prefix.
+	 */
+	if (DIV_ROUND_UP_ULL(inode_size, PAGE_SIZE) > UINT_MAX) {
+		err = -EFBIG;
+		goto err_free_ser;
+	}
+
+	ser->size = inode_size;
 	ser->seals = seals;
 
 	err = memfd_luo_preserve_folios(args->file, &ser->folios,

From 7b0b68b2b95606e65594958686833e53423f58f2 Mon Sep 17 00:00:00 2001
From: David Carlier
Date: Thu, 23 Apr 2026 13:56:48 +0100
Subject: [PATCH 4/4] mm/memfd_luo: document preservation of file seals

Commit 8a552d68a86e ("mm: memfd_luo: preserve file seals") started
preserving file seals across live update and restoring them via
memfd_add_seals() on retrieve, but the DOC header was not updated and
still listed seals under "Non-Preserved Properties" as being unsealed
on restore.

Move the Seals entry to the "Preserved Properties" section and describe
the actual behavior, including the MEMFD_LUO_ALL_SEALS restriction that
both preserve and retrieve enforce.
Fixes: 8a552d68a86e ("mm: memfd_luo: preserve file seals")
Signed-off-by: David Carlier
Reviewed-by: Pratyush Yadav
Reviewed-by: Pasha Tatashin
Link: https://patch.msgid.link/20260423125648.152113-2-devnexen@gmail.com
Signed-off-by: Pasha Tatashin
---
 mm/memfd_luo.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/memfd_luo.c b/mm/memfd_luo.c
index 94ae113f68f6..59de210bee5f 100644
--- a/mm/memfd_luo.c
+++ b/mm/memfd_luo.c
@@ -50,6 +50,11 @@
  * memfds are always opened with ``O_RDWR`` and ``O_LARGEFILE``. This property
  * is maintained.
  *
+ * Seals
+ *     File seals set on the memfd are preserved and re-applied on restore.
+ *     Only seals known to this LUO version (see ``MEMFD_LUO_ALL_SEALS``) may
+ *     be present; preservation fails with ``-EOPNOTSUPP`` otherwise.
+ *
  * Non-Preserved Properties
  * ========================
  *
@@ -61,10 +66,6 @@
  *     A memfd can be created with the ``MFD_CLOEXEC`` flag that sets the
  *     ``FD_CLOEXEC`` on the file. This flag is not preserved and must be set
  *     again after restore via ``fcntl()``.
- *
- * Seals
- *     File seals are not preserved. The file is unsealed on restore and if
- *     needed, must be sealed again via ``fcntl()``.
  */

 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt