mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2026-05-16 11:21:26 -04:00
ee628d9cc8d5b96fdceeb270cf662efc4f85f2b6
1413580 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
ee628d9cc8 |
mm: add basic tests for lazy_mmu
Add basic KUnit tests for the generic aspects of the lazy MMU mode: ensure that it appears active when it should, depending on how enable/disable and pause/resume pairs are nested. [akpm@linux-foundation.org: export ppc64_tlb_batch and __flush_tlb_pending to modules] [ritesh.list@gmail.com: use EXPORT_SYMBOL_IF_KUNIT()] Link: https://lkml.kernel.org/r/87a4zhkt6h.ritesh.list@gmail.com [kevin.brodsky@arm.com: move MODULE_IMPORT_NS(), add comment] Link: https://lkml.kernel.org/r/20251217163812.2633648-2-kevin.brodsky@arm.com Link: https://lkml.kernel.org/r/20251215150323.2218608-15-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Signed-off-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David S. Miller <davem@davemloft.net> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Juegren Gross <jgross@suse.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
291b3abed6 |
x86/xen: use lazy_mmu_state when context-switching
We currently set a TIF flag when scheduling out a task that is in lazy MMU mode, in order to restore it when the task is scheduled again. The generic lazy_mmu layer now tracks whether a task is in lazy MMU mode in task_struct::lazy_mmu_state. We can therefore check that state when switching to the new task, instead of using a separate TIF flag. Link: https://lkml.kernel.org/r/20251215150323.2218608-14-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Hildenbrand <david@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
dacd24ec49 |
sparc/mm: replace batch->active with is_lazy_mmu_mode_active()
A per-CPU batch struct is activated when entering lazy MMU mode; its lifetime is the same as the lazy MMU section (it is deactivated when leaving the mode). Preemption is disabled in that interval to ensure that the per-CPU reference remains valid. The generic lazy_mmu layer now tracks whether a task is in lazy MMU mode. We can therefore use the generic helper is_lazy_mmu_mode_active() to tell whether a batch struct is active instead of tracking it explicitly. Link: https://lkml.kernel.org/r/20251215150323.2218608-13-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Acked-by: Andreas Larsson <andreas@gaisler.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Hildenbrand <david@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Juegren Gross <jgross@suse.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
313a05a15a |
powerpc/mm: replace batch->active with is_lazy_mmu_mode_active()
A per-CPU batch struct is activated when entering lazy MMU mode; its lifetime is the same as the lazy MMU section (it is deactivated when leaving the mode). Preemption is disabled in that interval to ensure that the per-CPU reference remains valid. The generic lazy_mmu layer now tracks whether a task is in lazy MMU mode. We can therefore use the generic helper is_lazy_mmu_mode_active() to tell whether a batch struct is active instead of tracking it explicitly. Link: https://lkml.kernel.org/r/20251215150323.2218608-12-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Hildenbrand (Red Hat) <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Juegren Gross <jgross@suse.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
4dd9b4d7a8 |
arm64: mm: replace TIF_LAZY_MMU with is_lazy_mmu_mode_active()
The generic lazy_mmu layer now tracks whether a task is in lazy MMU mode. As a result we no longer need a TIF flag for that purpose - let's use the new is_lazy_mmu_mode_active() helper instead. The explicit check for in_interrupt() is no longer necessary either as is_lazy_mmu_mode_active() always returns false in interrupt context. Link: https://lkml.kernel.org/r/20251215150323.2218608-11-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Hildenbrand <david@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Juegren Gross <jgross@suse.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
5ab2467495 |
mm: enable lazy_mmu sections to nest
Despite recent efforts to prevent lazy_mmu sections from nesting, it
remains difficult to ensure that it never occurs - and in fact it does
occur on arm64 in certain situations (CONFIG_DEBUG_PAGEALLOC). Commit
|
||
|
|
9273dfaeac |
mm: bail out of lazy_mmu_mode_* in interrupt context
The lazy MMU mode cannot be used in interrupt context. This is documented
in <linux/pgtable.h>, but isn't consistently handled across architectures.
arm64 ensures that calls to lazy_mmu_mode_* have no effect in interrupt
context, because such calls do occur in certain configurations - see
commit
|
||
|
|
0a096ab7a3 |
mm: introduce generic lazy_mmu helpers
The implementation of the lazy MMU mode is currently entirely
arch-specific; core code directly calls arch helpers:
arch_{enter,leave}_lazy_mmu_mode().
We are about to introduce support for nested lazy MMU sections. As things
stand we'd have to duplicate that logic in every arch implementing
lazy_mmu - adding to a fair amount of logic already duplicated across
lazy_mmu implementations.
This patch therefore introduces a new generic layer that calls the
existing arch_* helpers. Two pair of calls are introduced:
* lazy_mmu_mode_enable() ... lazy_mmu_mode_disable()
This is the standard case where the mode is enabled for a given
block of code by surrounding it with enable() and disable()
calls.
* lazy_mmu_mode_pause() ... lazy_mmu_mode_resume()
This is for situations where the mode is temporarily disabled
by first calling pause() and then resume() (e.g. to prevent any
batching from occurring in a critical section).
The documentation in <linux/pgtable.h> will be updated in a subsequent
patch.
No functional change should be introduced at this stage. The
implementation of enable()/resume() and disable()/pause() is currently
identical, but nesting support will change that.
Most of the call sites have been updated using the following Coccinelle
script:
@@
@@
{
...
- arch_enter_lazy_mmu_mode();
+ lazy_mmu_mode_enable();
...
- arch_leave_lazy_mmu_mode();
+ lazy_mmu_mode_disable();
...
}
@@
@@
{
...
- arch_leave_lazy_mmu_mode();
+ lazy_mmu_mode_pause();
...
- arch_enter_lazy_mmu_mode();
+ lazy_mmu_mode_resume();
...
}
A couple of notes regarding x86:
* Xen is currently the only case where explicit handling is required
for lazy MMU when context-switching. This is purely an
implementation detail and using the generic lazy_mmu_mode_*
functions would cause trouble when nesting support is introduced,
because the generic functions must be called from the current task.
For that reason we still use arch_leave() and arch_enter() there.
* x86 calls arch_flush_lazy_mmu_mode() unconditionally in a few
places, but only defines it if PARAVIRT_XXL is selected, and we
are removing the fallback in <linux/pgtable.h>. Add a new fallback
definition to <asm/pgtable.h> to keep things building.
Link: https://lkml.kernel.org/r/20251215150323.2218608-8-kevin.brodsky@arm.com
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: David Hildenbrand (Red Hat) <david@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jann Horn <jannh@google.com>
Cc: Juegren Gross <jgross@suse.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
||
|
|
7303ecbfe4 |
mm: introduce CONFIG_ARCH_HAS_LAZY_MMU_MODE
Architectures currently opt in for implementing lazy_mmu helpers by defining __HAVE_ARCH_ENTER_LAZY_MMU_MODE. In preparation for introducing a generic lazy_mmu layer that will require storage in task_struct, let's switch to a cleaner approach: instead of defining a macro, select a CONFIG option. This patch introduces CONFIG_ARCH_HAS_LAZY_MMU_MODE and has each arch select it when it implements lazy_mmu helpers. __HAVE_ARCH_ENTER_LAZY_MMU_MODE is removed and <linux/pgtable.h> relies on the new CONFIG instead. On x86, lazy_mmu helpers are only implemented if PARAVIRT_XXL is selected. This creates some complications in arch/x86/boot/, because a few files manually undefine PARAVIRT* options. As a result <asm/paravirt.h> does not define the lazy_mmu helpers, but this breaks the build as <linux/pgtable.h> only defines them if !CONFIG_ARCH_HAS_LAZY_MMU_MODE. There does not seem to be a clean way out of this - let's just undefine that new CONFIG too. Link: https://lkml.kernel.org/r/20251215150323.2218608-7-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Acked-by: Andreas Larsson <andreas@gaisler.com> [sparc] Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Hildenbrand (Red Hat) <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Juegren Gross <jgross@suse.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
f2be745071 |
mm: clarify lazy_mmu sleeping constraints
The lazy MMU mode documentation makes clear that an implementation should not assume that preemption is disabled or any lock is held upon entry to the mode; however it says nothing about what code using the lazy MMU interface should expect. In practice sleeping is forbidden (for generic code) while the lazy MMU mode is active: say it explicitly. Link: https://lkml.kernel.org/r/20251215150323.2218608-6-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David S. Miller <davem@davemloft.net> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Juegren Gross <jgross@suse.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
442bf488b9 |
sparc/mm: implement arch_flush_lazy_mmu_mode()
Upcoming changes to the lazy_mmu API will cause arch_flush_lazy_mmu_mode() to be called when leaving a nested lazy_mmu section. Move the relevant logic from arch_leave_lazy_mmu_mode() to arch_flush_lazy_mmu_mode() and have the former call the latter. Note: the additional this_cpu_ptr() call on the arch_leave_lazy_mmu_mode() path will be removed in a subsequent patch. Link: https://lkml.kernel.org/r/20251215150323.2218608-5-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Acked-by: Andreas Larsson <andreas@gaisler.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Hildenbrand (Red Hat) <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Juegren Gross <jgross@suse.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
c3f0778ffe |
powerpc/mm: implement arch_flush_lazy_mmu_mode()
Upcoming changes to the lazy_mmu API will cause arch_flush_lazy_mmu_mode() to be called when leaving a nested lazy_mmu section. Move the relevant logic from arch_leave_lazy_mmu_mode() to arch_flush_lazy_mmu_mode() and have the former call the latter. The radix_enabled() check is required in both as arch_flush_lazy_mmu_mode() will be called directly from the generic layer in a subsequent patch. Note: the additional this_cpu_ptr() and radix_enabled() calls on the arch_leave_lazy_mmu_mode() path will be removed in a subsequent patch. Link: https://lkml.kernel.org/r/20251215150323.2218608-4-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Acked-by: David Hildenbrand <david@redhat.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Hildenbrand (Red Hat) <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Juegren Gross <jgross@suse.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
66bdd779d3 |
x86/xen: simplify flush_lazy_mmu()
arch_flush_lazy_mmu_mode() is called when outstanding batched pgtable operations must be completed immediately. There should however be no need to leave and re-enter lazy MMU completely. The only part of that sequence that we really need is xen_mc_flush(); call it directly. Link: https://lkml.kernel.org/r/20251215150323.2218608-3-kevin.brodsky@arm.com Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Reviewed-by: Juergen Gross <jgross@suse.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: David Hildenbrand (Red Hat) <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: David Woodhouse <dwmw2@infradead.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
58852f24f9 |
powerpc/64s: do not re-activate batched TLB flush
Patch series "Nesting support for lazy MMU mode", v6.
When the lazy MMU mode was introduced eons ago, it wasn't made clear
whether such a sequence was legal:
arch_enter_lazy_mmu_mode()
...
arch_enter_lazy_mmu_mode()
...
arch_leave_lazy_mmu_mode()
...
arch_leave_lazy_mmu_mode()
It seems fair to say that nested calls to
arch_{enter,leave}_lazy_mmu_mode() were not expected, and most
architectures never explicitly supported it.
Nesting does in fact occur in certain configurations, and avoiding it has
proved difficult. This series therefore enables lazy_mmu sections to
nest, on all architectures.
Nesting is handled using a counter in task_struct (patch 8), like other
stateless APIs such as pagefault_{disable,enable}(). This is fully
handled in a new generic layer in <linux/pgtable.h>; the arch_* API
remains unchanged. A new pair of calls, lazy_mmu_mode_{pause,resume}(),
is also introduced to allow functions that are called with the lazy MMU
mode enabled to temporarily pause it, regardless of nesting.
An arch now opts in to using the lazy MMU mode by selecting
CONFIG_ARCH_LAZY_MMU; this is more appropriate now that we have a generic
API, especially with state conditionally added to task_struct.
This patch (of 14):
Since commit
|
||
|
|
2a912d440c |
alloc_tag: move memory_allocation_profiling_sysctls into .rodata
Remove the change in file mode permissions done before initializing the sysctl. It is not necessary as the writing of the kernel variable will be blocked by the proc_mem_profiling_handler when writing is disallowed (also controlled by mem_profiling_support). Link: https://lkml.kernel.org/r/20251215-jag-alloc_tag_const-v1-1-35ea56a1ce13@kernel.org Signed-off-by: Joel Granados <joel.granados@kernel.org> Acked-by: Suren Baghdasaryan <surenb@google.com> Cc: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
817383b34d |
mm/damon/core: fix memory leak of repeat mode damon_call_control objects
A memory leak exists in the handling of repeat mode damon_call_control
objects by kdamond_call(). While damon_call() correctly allows multiple
repeat mode objects (with ->repeat set to true) to be added to the
per-context list, kdamond_call() incorrectly processes them.
The function moves all repeat mode objects from the context's list to a
temporary list (repeat_controls). However, it only moves the first object
back to the context's list for future calls, leaving the remaining objects
on the temporary list where they are abandoned and leaked.
This patch fixes the leak by ensuring all repeat mode objects are properly
re-added to the context's list.
Note that the leak is not in the real world, and therefore no user is
impacted. It is only potential for imaginaray damon_call() use cases that
do not exist in the tree for now. In more detail, the leak happens only
when the multiple repeat mode objects are assumed to be deallocated by
kdamond_call() (damon_call_control->dealloc_on_cancel is set). There is
no such damon_call() use cases at the moment.
Link: https://lkml.kernel.org/r/20251202082340.34178-1-lienze@kylinos.cn
Fixes:
|
||
|
|
a03ed8f144 |
mm/vmalloc: clarify why vmap_range_noflush() might sleep
The only reason vmap_range_noflush() can sleep is because of pagetable allocations. The actual allocation mechanism is arch-specific so might_alloc() doesn't work here (what GFP flags would be used?). Hence, just add a comment. Also note that this might do a TLB shootdown. This is not actually sleeping but it requires IRQs on for x86, and might_sleep() incidentally serves to detect violations of that too. Link: https://lkml.kernel.org/r/20251215-b4-vmalloc-might_alloc-v3-1-92dd8e406868@google.com Signed-off-by: Brendan Jackman <jackmanb@google.com> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
6c790212c5 |
Merge tag 'devicetree-fixes-for-6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring: - Fix a refcount leak in of_alias_scan() - Support descending into child nodes when populating nodes in /firmware * tag 'devicetree-fixes-for-6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: fix reference count leak in of_alias_scan() of: platform: Use default match table for /firmware |
||
|
|
c25f2fb1f4 |
Merge tag 'mm-hotfixes-stable-2026-01-20-13-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton: - A patch series from David Hildenbrand which fixes a few things related to hugetlb PMD sharing - The remainder are singletons, please see their changelogs for details * tag 'mm-hotfixes-stable-2026-01-20-13-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm: restore per-memcg proactive reclaim with !CONFIG_NUMA mm/kfence: fix potential deadlock in reboot notifier Docs/mm/allocation-profiling: describe sysctrl limitations in debug mode mm: do not copy page tables unnecessarily for VM_UFFD_WP mm/hugetlb: fix excessive IPI broadcasts when unsharing PMD tables using mmu_gather mm/rmap: fix two comments related to huge_pmd_unshare() mm/hugetlb: fix two comments related to huge_pmd_unshare() mm/hugetlb: fix hugetlb_pmd_shared() mm: remove unnecessary and incorrect mmap lock assert x86/kfence: avoid writing L1TF-vulnerable PTEs mm/vma: do not leak memory when .mmap_prepare swaps the file migrate: correct lock ordering for hugetlb file folios panic: only warn about deprecated panic_print on write access fs/writeback: skip AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes() mm: take into account mm_cid size for mm_struct static definitions mm: rename cpu_bitmap field to flexible_array mm: add missing static initializer for init_mm::mm_cid.lock |
||
|
|
c03e9c42ae |
Merge tag 'dma-mapping-6.19-2026-01-20' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fixes from Marek Szyprowski: - minor fixes for the corner cases of the SWIOTLB pool management (Robin Murphy) * tag 'dma-mapping-6.19-2026-01-20' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: dma/pool: Avoid allocating redundant pools mm_zone: Generalise has_managed_dma() dma/pool: Improve pool lookup |
||
|
|
8f7537efbe |
Merge tag 'pwm/for-6.19-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ukleinek/linux
Pull pwm fixes and a maintainer update from Uwe Kleine-König:
- pwm: Ensure ioctl() returns a negative errno on error
This affects two ioctls on /dev/pwmchipX where the return value of
copy_to_user() was passed to userspace. This is fixed to return
-EFAULT now instead.
- pwm: max7360: Populate missing .sizeof_wfhw in max7360_pwm_ops
This fixes an oversight in the original commit that added support for
the max7360 driver (
|
||
|
|
16aca2c98a |
mm: restore per-memcg proactive reclaim with !CONFIG_NUMA
Commit |
||
|
|
9bc9ccbf4c |
mm/kfence: fix potential deadlock in reboot notifier
The reboot notifier callback can deadlock when calling
cancel_delayed_work_sync() if toggle_allocation_gate() is blocked in
wait_event_idle() waiting for allocations, that might not happen on
shutdown path.
The issue is that cancel_delayed_work_sync() waits for the work to
complete, but the work is waiting for kfence_allocation_gate > 0 which
requires allocations to happen (each allocation is increased by 1) -
allocations that may have stopped during shutdown.
Fix this by:
1. Using cancel_delayed_work() (non-sync) to avoid blocking. Now the
callback succeeds and return.
2. Adding wake_up() to unblock any waiting toggle_allocation_gate()
3. Adding !kfence_enabled to the wait condition so the wake succeeds
The static_branch_disable() IPI will still execute after the wake, but at
this early point in shutdown (reboot notifier runs with INT_MAX priority),
the system is still functional and CPUs can respond to IPIs.
Link: https://lkml.kernel.org/r/20260116-kfence_fix-v1-1-4165a055933f@debian.org
Fixes:
|
||
|
|
cb7d761bf5 |
Docs/mm/allocation-profiling: describe sysctrl limitations in debug mode
When CONFIG_MEM_ALLOC_PROFILING_DEBUG=y, /proc/sys/vm/mem_profiling is
read-only to avoid debug warnings in a scenario when an allocation is
made while profiling is disabled (allocation does not get an allocation
tag), then profiling gets enabled and allocation gets freed (warning due
to the allocation missing allocation tag).
Link: https://lkml.kernel.org/r/20260116184423.2708363-1-surenb@google.com
Fixes:
|
||
|
|
35e2470326 |
mm: do not copy page tables unnecessarily for VM_UFFD_WP
Commit |
||
|
|
8ce720d5bd |
mm/hugetlb: fix excessive IPI broadcasts when unsharing PMD tables using mmu_gather
As reported, ever since commit |
||
|
|
a8682d500f |
mm/rmap: fix two comments related to huge_pmd_unshare()
PMD page table unsharing no longer touches the refcount of a PMD page
table. Also, it is not about dropping the refcount of a "PMD page" but
the "PMD page table".
Let's just simplify by saying that the PMD page table was unmapped,
consequently also unmapping the folio that was mapped into this page.
This code should be deduplicated in the future.
Link: https://lkml.kernel.org/r/20251223214037.580860-4-david@kernel.org
Fixes:
|
||
|
|
3937027cae |
mm/hugetlb: fix two comments related to huge_pmd_unshare()
Ever since we stopped using the page count to detect shared PMD page tables, these comments are outdated. The only reason we have to flush the TLB early is because once we drop the i_mmap_rwsem, the previously shared page table could get freed (to then get reallocated and used for other purpose). So we really have to flush the TLB before that could happen. So let's simplify the comments a bit. The "If we unshared PMDs, the TLB flush was not recorded in mmu_gather." part introduced as in commit |
||
|
|
ca1a47cd3f |
mm/hugetlb: fix hugetlb_pmd_shared()
Patch series "mm/hugetlb: fixes for PMD table sharing (incl. using
mmu_gather)", v3.
One functional fix, one performance regression fix, and two related
comment fixes.
I cleaned up my prototype I recently shared [1] for the performance fix,
deferring most of the cleanups I had in the prototype to a later point.
While doing that I identified the other things.
The goal of this patch set is to be backported to stable trees "fairly"
easily. At least patch #1 and #4.
Patch #1 fixes hugetlb_pmd_shared() not detecting any sharing
Patch #2 + #3 are simple comment fixes that patch #4 interacts with.
Patch #4 is a fix for the reported performance regression due to excessive
IPI broadcasts during fork()+exit().
The last patch is all about TLB flushes, IPIs and mmu_gather.
Read: complicated
There are plenty of cleanups in the future to be had + one reasonable
optimization on x86. But that's all out of scope for this series.
Runtime tested, with a focus on fixing the performance regression using
the original reproducer [2] on x86.
This patch (of 4):
We switched from (wrongly) using the page count to an independent shared
count. Now, shared page tables have a refcount of 1 (excluding
speculative references) and instead use ptdesc->pt_share_count to identify
sharing.
We didn't convert hugetlb_pmd_shared(), so right now, we would never
detect a shared PMD table as such, because sharing/unsharing no longer
touches the refcount of a PMD table.
Page migration, like mbind() or migrate_pages() would allow for migrating
folios mapped into such shared PMD tables, even though the folios are not
exclusive. In smaps we would account them as "private" although they are
"shared", and we would be wrongly setting the PM_MMAP_EXCLUSIVE in the
pagemap interface.
Fix it by properly using ptdesc_pmd_is_shared() in hugetlb_pmd_shared().
Link: https://lkml.kernel.org/r/20251223214037.580860-1-david@kernel.org
Link: https://lkml.kernel.org/r/20251223214037.580860-2-david@kernel.org
Link: https://lore.kernel.org/all/8cab934d-4a56-44aa-b641-bfd7e23bd673@kernel.org/ [1]
Link: https://lore.kernel.org/all/8cab934d-4a56-44aa-b641-bfd7e23bd673@kernel.org/ [2]
Fixes:
|
||
|
|
90888b4ae1 |
mm: remove unnecessary and incorrect mmap lock assert
This check was introduced by commit |
||
|
|
0a155a8a24 |
MAINTAINERS: Add myself as reviewer for PWM rust drivers
I would like to help with reviewing the Rust part of the PWM drivers. While I maintain the Rust bindings, adding this separate entry ensures I am automatically CC-ed on the driver implementations (drivers/pwm/*.rs) Signed-off-by: Michal Wilczynski <m.wilczynski@samsung.com> Link: https://patch.msgid.link/20260119-maintain_rust_drivers-v1-1-88711afc559e@samsung.com Signed-off-by: Uwe Kleine-König <ukleinek@kernel.org> |
||
|
|
b505f19445 |
x86/kfence: avoid writing L1TF-vulnerable PTEs
For native, the choice of PTE is fine. There's real memory backing the
non-present PTE. However, for XenPV, Xen complains:
(XEN) d1 L1TF-vulnerable L1e 8010000018200066 - Shadowing
To explain, some background on XenPV pagetables:
Xen PV guests are control their own pagetables; they choose the new
PTE value, and use hypercalls to make changes so Xen can audit for
safety.
In addition to a regular reference count, Xen also maintains a type
reference count. e.g. SegDesc (referenced by vGDT/vLDT), Writable
(referenced with _PAGE_RW) or L{1..4} (referenced by vCR3 or a lower
pagetable level). This is in order to prevent e.g. a page being
inserted into the pagetables for which the guest has a writable mapping.
For non-present mappings, all other bits become software accessible,
and typically contain metadata rather a real frame address. There is
nothing that a reference count could sensibly be tied to. As such, even
if Xen could recognise the address as currently safe, nothing would
prevent that frame from changing owner to another VM in the future.
When Xen detects a PV guest writing a L1TF-PTE, it responds by
activating shadow paging. This is normally only used for the live phase
of migration, and comes with a reasonable overhead.
KFENCE only cares about getting #PF to catch wild accesses; it doesn't
care about the value for non-present mappings. Use a fully inverted PTE,
to avoid hitting the slow path when running under Xen.
While adjusting the logic, take the opportunity to skip all actions if the
PTE is already in the right state, half the number PVOps callouts, and
skip TLB maintenance on a !P -> P transition which benefits non-Xen cases
too.
Link: https://lkml.kernel.org/r/20260106180426.710013-1-andrew.cooper3@citrix.com
Fixes:
|
||
|
|
605f6586ec |
mm/vma: do not leak memory when .mmap_prepare swaps the file
The current implementation of mmap() is set up such that a struct file object is obtained for the input fd in ksys_mmap_pgoff() via fget(), and its reference count decremented at the end of the function via. fput(). If a merge can be achieved, we are fine to simply decrement the refcount on the file. Otherwise, in __mmap_new_file_vma(), we increment the reference count on the file via get_file() such that the fput() in ksys_mmap_pgoff() does not free the now-referenced file object. The introduction of the f_op->mmap_prepare hook changes things, as it becomes possible for a driver to replace the file object right at the beginning of the mmap operation. The current implementation is buggy if this happens because it unconditionally calls get_file() on the mapping's file whether or not it was replaced (and thus whether or not its reference count will be decremented at the end of ksys_mmap_pgoff()). This results in a memory leak, and was exposed in commit |
||
|
|
b7880cb166 |
migrate: correct lock ordering for hugetlb file folios
Syzbot has found a deadlock (analyzed by Lance Yang):
1) Task (5749): Holds folio_lock, then tries to acquire i_mmap_rwsem(read lock).
2) Task (5754): Holds i_mmap_rwsem(write lock), then tries to acquire
folio_lock.
migrate_pages()
-> migrate_hugetlbs()
-> unmap_and_move_huge_page() <- Takes folio_lock!
-> remove_migration_ptes()
-> __rmap_walk_file()
-> i_mmap_lock_read() <- Waits for i_mmap_rwsem(read lock)!
hugetlbfs_fallocate()
-> hugetlbfs_punch_hole() <- Takes i_mmap_rwsem(write lock)!
-> hugetlbfs_zero_partial_page()
-> filemap_lock_hugetlb_folio()
-> filemap_lock_folio()
-> __filemap_get_folio <- Waits for folio_lock!
The migration path is the one taking locks in the wrong order according to
the documentation at the top of mm/rmap.c. So expand the scope of the
existing i_mmap_lock to cover the calls to remove_migration_ptes() too.
This is (mostly) how it used to be after commit
|
||
|
|
90f3c12324 |
panic: only warn about deprecated panic_print on write access
The panic_print_deprecated() warning is being triggered on both read and write operations to the panic_print parameter. This causes spurious warnings when users run 'sysctl -a' to list all sysctl values, since that command reads /proc/sys/kernel/panic_print and triggers the deprecation notice. Modify the handlers to only emit the deprecation warning when the parameter is actually being set: - sysctl_panic_print_handler(): check 'write' flag before warning. - panic_print_get(): remove the deprecation call entirely. This way, users are only warned when they actively try to use the deprecated parameter, not when passively querying system state. Link: https://lkml.kernel.org/r/20260106163321.83586-1-gal@nvidia.com Fixes: |
||
|
|
f9a49aa302 |
fs/writeback: skip AS_NO_DATA_INTEGRITY mappings in wait_sb_inodes()
Above the while() loop in wait_sb_inodes(), we document that we must wait
for all pages under writeback for data integrity. Consequently, if a
mapping, like fuse, traditionally does not have data integrity semantics,
there is no need to wait at all; we can simply skip these inodes.
This restores fuse back to prior behavior where syncs are no-ops. This
fixes a user regression where if a system is running a faulty fuse server
that does not reply to issued write requests, this causes wait_sb_inodes()
to wait forever.
Link: https://lkml.kernel.org/r/20260105211737.4105620-2-joannelkoong@gmail.com
Fixes:
|
||
|
|
be31340a4c |
mm: take into account mm_cid size for mm_struct static definitions
Both init_mm and efi_mm static definitions need to make room for the 2
mm_cid cpumasks.
This fixes possible out-of-bounds accesses to init_mm and efi_mm.
Add a space between # and define for the mm_alloc_cid() definition to make
it consistent with the coding style used in the rest of this header file.
Link: https://lkml.kernel.org/r/20251224173358.647691-4-mathieu.desnoyers@efficios.com
Fixes:
|
||
|
|
6ac433f8b2 |
mm: rename cpu_bitmap field to flexible_array
The cpu_bitmap flexible array now contains more than just the cpu_bitmap.
In preparation for changing the static mm_struct definitions to cover for
the additional space required, change the cpu_bitmap type from "unsigned
long" to "char", require an unsigned long alignment of the flexible array,
and rename the field from "cpu_bitmap" to "flexible_array".
Introduce the MM_STRUCT_FLEXIBLE_ARRAY_INIT macro to statically initialize
the flexible array. This covers the init_mm and efi_mm static
definitions.
This is a preparation step for fixing the missing mm_cid size for static
mm_struct definitions.
Link: https://lkml.kernel.org/r/20251224173358.647691-3-mathieu.desnoyers@efficios.com
Fixes:
|
||
|
|
12a6ddfc76 |
mm: add missing static initializer for init_mm::mm_cid.lock
Initialize the mm_cid.lock struct member of init_mm.
Link: https://lkml.kernel.org/r/20251224173358.647691-2-mathieu.desnoyers@efficios.com
Fixes:
|
||
|
|
73c9007d9b |
Merge tag 'ata-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux
Pull ata fixes from Damien Le Moal: - A set of fixes for link power management as the recent changes/fixes introduced regressions with ATAPI devices and with adapters that have DUMMY ports, preventing an adapter to fully reach a low power state and thus preventing the system CPU from reaching low power C-states (from Niklas) * tag 'ata-6.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/libata/linux: ata: libata: Print features also for ATAPI devices ata: libata: Add DIPM and HIPM to ata_dev_print_features() early return ata: libata: Add cpr_log to ata_dev_print_features() early return ata: libata-sata: Improve link_power_management_supported sysfs attribute ata: libata: Call ata_dev_config_lpm() for ATAPI devices ata: ahci: Do not read the per port area for unimplemented ports |
||
|
|
63faf32666 |
pwm: max7360: Populate missing .sizeof_wfhw in max7360_pwm_ops
The sizeof_wfhw field wasn't populated in max7360_pwm_ops so it was set
to 0 by default.
While this is ok for now because:
sizeof(struct max7360_pwm_waveform) < PWM_WFHWSIZE
in the future, if struct max7360_pwm_waveform grows, it could lead to
stack corruption.
Fixes:
|
||
|
|
c198b7773c |
pwm: Ensure ioctl() returns a negative errno on error
copy_to_user() returns the number of bytes not copied, thus if there is
a problem a positive number. However the ioctl callback is supposed to
return a negative error code on error.
This error is a unfortunate as strictly speaking it became ABI with the
introduction of pwm character devices. However I never saw the issue in
real life -- I found this by code inspection -- and it only affects an
error case where readonly memory is passed to the ioctls or the address
mapping changes while the ioctl is active. Also there are already error
cases returning negative values, so the calling code must be prepared to
see such values already.
Fixes:
|
||
|
|
24d479d26b | Linux 6.19-rc6 v6.19-rc6 | ||
|
|
90a855e75a |
Merge tag 'landlock-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock fixes from Mickaël Salaün: "This fixes TCP handling, tests, documentation, non-audit elided code, and minor cosmetic changes" * tag 'landlock-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: landlock: Clarify documentation for the IOCTL access right selftests/landlock: Properly close a file descriptor landlock: Improve the comment for domain_is_scoped selftests/landlock: Use scoped_base_variants.h for ptrace_test selftests/landlock: Fix missing semicolon selftests/landlock: Fix typo in fs_test landlock: Optimize stack usage when !CONFIG_AUDIT landlock: Fix spelling landlock: Clean up hook_ptrace_access_check() landlock: Improve erratum documentation landlock: Remove useless include landlock: Fix wrong type usage selftests/landlock: NULL-terminate unix pathname addresses selftests/landlock: Remove invalid unix socket bind() selftests/landlock: Add missing connect(minimal AF_UNSPEC) test selftests/landlock: Fix TCP bind(AF_UNSPEC) test case landlock: Fix TCP handling of short AF_UNSPEC addresses landlock: Fix formatting |
||
|
|
6f32aa9161 |
Merge tag 'cgroup-for-6.19-rc5-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup
Pull cgroup fixes from Tejun Heo: - Add Chen Ridong as cpuset reviewer - Add SPDX license identifiers to cgroup files that were missing them * tag 'cgroup-for-6.19-rc5-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: kernel: cgroup: Add LGPL-2.1 SPDX license ID to legacy_freezer.c kernel: cgroup: Add SPDX-License-Identifier lines MAINTAINERS: Add Chen Ridong as cpuset reviewer |
||
|
|
f8907398a6 |
Merge tag 'ext4_for_linus-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
Pull ext4 fixes from Ted Ts'o: - Fix an inconsistency in structure size on 32-bit platforms caused by padding differences for the new EXT4_IOC_[GS]ET_TUNE_SB_PARAM ioctls - Fix a buffer leak on the error path when dropping the refcount an xattr value stored in an inode - Fix missing locking on the error path for the file defragmentation ioctl leading to a BUG * tag 'ext4_for_linus-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix iloc.bh leak in ext4_xattr_inode_update_ref ext4: add missing down_write_data_sem in mext_move_extent(). ext4: fix ext4_tune_sb_params padding |
||
|
|
e90b81e8ff |
Merge tag 'dmaengine-fix-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine fixes from Vinod Koul:
"A bunch of driver fixes for:
- dma mask fix for mmp pdma driver
- Xilinx regmap max register, uninitialized addr_width fix
- device leak fix for bunch of drivers in the subsystem
- stm32 dmamux, TI crossbar driver fixes for device & of node leak
and route allocation cleanup
- Tegra use afer free fix
- Memory leak fix in Qualcomm gpi and omap-dma driver
- compatible fix for apple driver"
* tag 'dmaengine-fix-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (25 commits)
dmaengine: apple-admac: Add "apple,t8103-admac" compatible
dmaengine: omap-dma: fix dma_pool resource leak in error paths
dmaengine: qcom: gpi: Fix memory leak in gpi_peripheral_config()
dmaengine: sh: rz-dmac: Fix rz_dmac_terminate_all()
dmaengine: xilinx_dma: Fix uninitialized addr_width when "xlnx,addrwidth" property is missing
dmaengine: tegra-adma: Fix use-after-free
dmaengine: fsl-edma: Fix clk leak on alloc_chan_resources failure
dmaengine: mmp_pdma: Fix race condition in mmp_pdma_residue()
dmaengine: ti: k3-udma: fix device leak on udma lookup
dmaengine: ti: dma-crossbar: clean up dra7x route allocation error paths
dmaengine: ti: dma-crossbar: fix device leak on am335x route allocation
dmaengine: ti: dma-crossbar: fix device leak on dra7x route allocation
dmaengine: stm32: dmamux: clean up route allocation error labels
dmaengine: stm32: dmamux: fix OF node leak on route allocation failure
dmaengine: stm32: dmamux: fix device leak on route allocation
dmaengine: sh: rz-dmac: fix device leak on probe failure
dmaengine: lpc32xx-dmamux: fix device leak on route allocation
dmaengine: lpc18xx-dmamux: fix device leak on route allocation
dmaengine: idxd: fix device leaks on compat bind and unbind
dmaengine: dw: dmamux: fix OF node leak on route allocation failure
...
|
||
|
|
3271b25e3d |
Merge tag 'phy-fixes-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy fixes from Vinod Koul:
"A bunch of driver fixes:
- Freescale typec orientation switch fix, clearing register fix,
assertion of phy reset during power on
- Qualcomm pcs register clear before using
- stm one off fix
- TI runtimepm error handling, regmap leak fixes
- Rockchip gadget mode disconnection and disruption fixes
- Tegra register level fix
- Broadcom pointer cast warning fix"
* tag 'phy-fixes-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy:
phy: freescale: imx8m-pcie: assert phy reset during power on
phy: rockchip: inno-usb2: Fix a double free bug in rockchip_usb2phy_probe()
phy: broadcom: ns-usb3: Fix Wvoid-pointer-to-enum-cast warning (again)
phy: tegra: xusb: Explicitly configure HS_DISCON_LEVEL to 0x7
phy: rockchip: inno-usb2: fix communication disruption in gadget mode
phy: rockchip: inno-usb2: fix disconnection in gadget mode
phy: ti: gmii-sel: fix regmap leak on probe failure
phy: sparx5-serdes: make it selectable for ARCH_LAN969X
phy: ti: da8xx-usb: Handle devm_pm_runtime_enable() errors
phy: stm32-usphyc: Fix off by one in probe()
phy: qcom-qusb2: Fix NULL pointer dereference on early suspend
phy: fsl-imx8mq-usb: Clear the PCS_TX_SWING_FULL field before using it
dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Update pcie phy bindings for qcs8300
phy: fsl-imx8mq-usb: fix typec orientation switch when built as module
|
||
|
|
56bc8a18aa |
Merge tag 'soundwire-6.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire fix from Vinod Koul: - Single off-by-one fix for allocating slave id * tag 'soundwire-6.19-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: soundwire: bus: fix off-by-one when allocating slave IDs |
||
|
|
27983960f0 |
Merge tag 'usb-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are some small USB fixes and new device ids for 6.19-rc6
Included in here are:
- new usb-serial device ids
- dwc3-apple driver fixes to get things working properly on that
hardware platform
- ohci/uhci platfrom driver module soft-deps with ehci to remove a
runtime warning that sometimes shows up on some platforms.
- quirk for broken devices that can not handle reading the BOS
descriptor from them without going crazy.
- usb-serial driver fixes
- xhci driver fixes
- usb gadget driver fixes
All of these except for the last xhci fix has been in linux-next for a
while. The xhci fix has been reported by others to solve the issue for
them, so should be ok"
* tag 'usb-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
xhci: sideband: don't dereference freed ring when removing sideband endpoint
usb: gadget: uvc: retry vb2_reqbufs() with vb_vmalloc_memops if use_sg fail
usb: gadget: uvc: return error from uvcg_queue_init()
usb: gadget: uvc: fix interval_duration calculation
usb: gadget: uvc: fix req_payload_size calculation
usb: dwc3: apple: Ignore USB role switches to the active role
usb: host: xhci-tegra: Use platform_get_irq_optional() for wake IRQs
USB: OHCI/UHCI: Add soft dependencies on ehci_platform
usb: dwc3: apple: Set USB2 PHY mode before dwc3 init
USB: serial: f81232: fix incomplete serial port generation
USB: serial: ftdi_sio: add support for PICAXE AXE027 cable
USB: serial: option: add Telit LE910 MBIM composition
usb: core: add USB_QUIRK_NO_BOS for devices that hang on BOS descriptor
dt-bindings: usb: qcom,dwc3: Correct MSM8994 interrupts
dt-bindings: usb: qcom,dwc3: Correct IPQ5018 interrupts
tcpm: allow looking for role_sw device in the main node
usb: dwc3: Check for USB4 IP_NAME
|