mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2026-05-07 11:33:58 -04:00
663d0c19f3acc0d697f623d34b8eb3b438bf2bda
1351227 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
b481341e4c |
selftest: test system mappings are sealed
Add sysmap_is_sealed.c to test system mappings are sealed. Note: CONFIG_MSEAL_SYSTEM_MAPPINGS must be set, as indicated in config file. Link: https://lkml.kernel.org/r/20250305021711.3867874-8-jeffxu@google.com Signed-off-by: Jeff Xu <jeffxu@chromium.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Kees Cook <kees@kernel.org> Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org> Cc: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Berg <benjamin@sipsolutions.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Elliot Hughes <enh@google.com> Cc: Florian Faineli <f.fainelli@gmail.com> Cc: Greg Ungerer <gerg@kernel.org> Cc: Guenter Roeck <groeck@chromium.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: Jason A. Donenfeld <jason@zx2c4.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jorge Lucangeli Obes <jorgelo@chromium.org> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Linus Waleij <linus.walleij@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <mike.rapoport@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pedro Falcato <pedro.falcato@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stephen Röttger <sroettger@google.com> Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
a8c15bb400 |
mseal sysmap: update mseal.rst
Update memory sealing documentation to include details about system mappings. Link: https://lkml.kernel.org/r/20250305021711.3867874-7-jeffxu@google.com Signed-off-by: Jeff Xu <jeffxu@chromium.org> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org> Cc: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Berg <benjamin@sipsolutions.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Elliot Hughes <enh@google.com> Cc: Florian Faineli <f.fainelli@gmail.com> Cc: Greg Ungerer <gerg@kernel.org> Cc: Guenter Roeck <groeck@chromium.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: Jason A. Donenfeld <jason@zx2c4.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jorge Lucangeli Obes <jorgelo@chromium.org> Cc: Linus Waleij <linus.walleij@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <mike.rapoport@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pedro Falcato <pedro.falcato@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stephen Röttger <sroettger@google.com> Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
3d38922abf |
mseal sysmap: uprobe mapping
Provide support to mseal the uprobe mapping. Unlike other system mappings, the uprobe mapping is not established during program startup. However, its lifetime is the same as the process's lifetime. It could be sealed from creation. Test was done with perf tool, and observe the uprobe mapping is sealed. Link: https://lkml.kernel.org/r/20250305021711.3867874-6-jeffxu@google.com Signed-off-by: Jeff Xu <jeffxu@chromium.org> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Kees Cook <kees@kernel.org> Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org> Cc: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Berg <benjamin@sipsolutions.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Elliot Hughes <enh@google.com> Cc: Florian Faineli <f.fainelli@gmail.com> Cc: Greg Ungerer <gerg@kernel.org> Cc: Guenter Roeck <groeck@chromium.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: Jason A. Donenfeld <jason@zx2c4.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jorge Lucangeli Obes <jorgelo@chromium.org> Cc: Linus Waleij <linus.walleij@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <mike.rapoport@gmail.com> Cc: Pedro Falcato <pedro.falcato@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stephen Röttger <sroettger@google.com> Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
0061b6e162 |
mseal sysmap: enable arm64
Provide support for CONFIG_MSEAL_SYSTEM_MAPPINGS on arm64, covering the vdso, vvar, and compat-mode vectors and sigpage mappings. Production release testing passes on Android and Chrome OS. Link: https://lkml.kernel.org/r/20250305021711.3867874-5-jeffxu@google.com Signed-off-by: Jeff Xu <jeffxu@chromium.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Kees Cook <kees@kernel.org> Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org> Cc: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Berg <benjamin@sipsolutions.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Elliot Hughes <enh@google.com> Cc: Florian Faineli <f.fainelli@gmail.com> Cc: Greg Ungerer <gerg@kernel.org> Cc: Guenter Roeck <groeck@chromium.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: Jason A. Donenfeld <jason@zx2c4.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jorge Lucangeli Obes <jorgelo@chromium.org> Cc: Linus Waleij <linus.walleij@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <mike.rapoport@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pedro Falcato <pedro.falcato@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stephen Röttger <sroettger@google.com> Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
3049def198 |
mseal sysmap: enable x86-64
Provide support for CONFIG_MSEAL_SYSTEM_MAPPINGS on x86-64, covering the vdso, vvar, vvar_vclock. Production release testing passes on Android and Chrome OS. Link: https://lkml.kernel.org/r/20250305021711.3867874-4-jeffxu@google.com Signed-off-by: Jeff Xu <jeffxu@chromium.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Kees Cook <kees@kernel.org> Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org> Cc: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Berg <benjamin@sipsolutions.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Elliot Hughes <enh@google.com> Cc: Florian Faineli <f.fainelli@gmail.com> Cc: Greg Ungerer <gerg@kernel.org> Cc: Guenter Roeck <groeck@chromium.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: Jason A. Donenfeld <jason@zx2c4.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jorge Lucangeli Obes <jorgelo@chromium.org> Cc: Linus Waleij <linus.walleij@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <mike.rapoport@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pedro Falcato <pedro.falcato@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stephen Röttger <sroettger@google.com> Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
1d6fad7b84 |
mseal sysmap: generic vdso vvar mapping
With the introduction of the generic vdso data storage the VM_SEALED_SYSMAP vm flag must be moved from the architecture specific _install_special_mapping() call [1] [2] which maps the vvar mapping to generic code. [1] https://lkml.kernel.org/r/20250305021711.3867874-4-jeffxu@google.com [2] https://lkml.kernel.org/r/20250305021711.3867874-5-jeffxu@google.com Link: https://lkml.kernel.org/r/20250311123326.2686682-2-hca@linux.ibm.com Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Jeff Xu <jeffxu@chromium.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
7b0141daf3 |
selftests: x86: test_mremap_vdso: skip if vdso is msealed
Add code to detect if the vdso is memory sealed, skip the test if it is. Link: https://lkml.kernel.org/r/20250305021711.3867874-3-jeffxu@google.com Signed-off-by: Jeff Xu <jeffxu@chromium.org> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org> Cc: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Berg <benjamin@sipsolutions.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Elliot Hughes <enh@google.com> Cc: Florian Faineli <f.fainelli@gmail.com> Cc: Greg Ungerer <gerg@kernel.org> Cc: Guenter Roeck <groeck@chromium.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: Jason A. Donenfeld <jason@zx2c4.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jorge Lucangeli Obes <jorgelo@chromium.org> Cc: Linus Waleij <linus.walleij@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <mike.rapoport@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pedro Falcato <pedro.falcato@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stephen Röttger <sroettger@google.com> Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
5796d3967c |
mseal sysmap: kernel config and header change
Patch series "mseal system mappings", v9. As discussed during mseal() upstream process [1], mseal() protects the VMAs of a given virtual memory range against modifications, such as the read/write (RW) and no-execute (NX) bits. For complete descriptions of memory sealing, please see mseal.rst [2]. The mseal() is useful to mitigate memory corruption issues where a corrupted pointer is passed to a memory management system. For example, such an attacker primitive can break control-flow integrity guarantees since read-only memory that is supposed to be trusted can become writable or .text pages can get remapped. The system mappings are readonly only, memory sealing can protect them from ever changing to writable or unmmap/remapped as different attributes. System mappings such as vdso, vvar, vvar_vclock, vectors (arm compat-mode), sigpage (arm compat-mode), are created by the kernel during program initialization, and could be sealed after creation. Unlike the aforementioned mappings, the uprobe mapping is not established during program startup. However, its lifetime is the same as the process's lifetime [3]. It could be sealed from creation. The vsyscall on x86-64 uses a special address (0xffffffffff600000), which is outside the mm managed range. This means mprotect, munmap, and mremap won't work on the vsyscall. Since sealing doesn't enhance the vsyscall's security, it is skipped in this patch. If we ever seal the vsyscall, it is probably only for decorative purpose, i.e. showing the 'sl' flag in the /proc/pid/smaps. For this patch, it is ignored. It is important to note that the CHECKPOINT_RESTORE feature (CRIU) may alter the system mappings during restore operations. UML(User Mode Linux) and gVisor, rr are also known to change the vdso/vvar mappings. Consequently, this feature cannot be universally enabled across all systems. As such, CONFIG_MSEAL_SYSTEM_MAPPINGS is disabled by default. To support mseal of system mappings, architectures must define CONFIG_ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS and update their special mappings calls to pass mseal flag. Additionally, architectures must confirm they do not unmap/remap system mappings during the process lifetime. The existence of this flag for an architecture implies that it does not require the remapping of thest system mappings during process lifetime, so sealing these mappings is safe from a kernel perspective. This version covers x86-64 and arm64 archiecture as minimum viable feature. While no specific CPU hardware features are required for enable this feature on an archiecture, memory sealing requires a 64-bit kernel. Other architectures can choose whether or not to adopt this feature. Currently, I'm not aware of any instances in the kernel code that actively munmap/mremap a system mapping without a request from userspace. The PPC does call munmap when _install_special_mapping fails for vdso; however, it's uncertain if this will ever fail for PPC - this needs to be investigated by PPC in the future [4]. The UML kernel can add this support when KUnit tests require it [5]. In this version, we've improved the handling of system mapping sealing from previous versions, instead of modifying the _install_special_mapping function itself, which would affect all architectures, we now call _install_special_mapping with a sealing flag only within the specific architecture that requires it. This targeted approach offers two key advantages: 1) It limits the code change's impact to the necessary architectures, and 2) It aligns with the software architecture by keeping the core memory management within the mm layer, while delegating the decision of sealing system mappings to the individual architecture, which is particularly relevant since 32-bit architectures never require sealing. Prior to this patch series, we explored sealing special mappings from userspace using glibc's dynamic linker. This approach revealed several issues: - The PT_LOAD header may report an incorrect length for vdso, (smaller than its actual size). The dynamic linker, which relies on PT_LOAD information to determine mapping size, would then split and partially seal the vdso mapping. Since each architecture has its own vdso/vvar code, fixing this in the kernel would require going through each archiecture. Our initial goal was to enable sealing readonly mappings, e.g. .text, across all architectures, sealing vdso from kernel since creation appears to be simpler than sealing vdso at glibc. - The [vvar] mapping header only contains address information, not length information. Similar issues might exist for other special mappings. - Mappings like uprobe are not covered by the dynamic linker, and there is no effective solution for them. This feature's security enhancements will benefit ChromeOS, Android, and other high security systems. Testing: This feature was tested on ChromeOS and Android for both x86-64 and ARM64. - Enable sealing and verify vdso/vvar, sigpage, vector are sealed properly, i.e. "sl" shown in the smaps for those mappings, and mremap is blocked. - Passing various automation tests (e.g. pre-checkin) on ChromeOS and Android to ensure the sealing doesn't affect the functionality of Chromebook and Android phone. I also tested the feature on Ubuntu on x86-64: - With config disabled, vdso/vvar is not sealed, - with config enabled, vdso/vvar is sealed, and booting up Ubuntu is OK, normal operations such as browsing the web, open/edit doc are OK. Link: https://lore.kernel.org/all/20240415163527.626541-1-jeffxu@chromium.org/ [1] Link: Documentation/userspace-api/mseal.rst [2] Link: https://lore.kernel.org/all/CABi2SkU9BRUnqf70-nksuMCQ+yyiWjo3fM4XkRkL-NrCZxYAyg@mail.gmail.com/ [3] Link: https://lore.kernel.org/all/CABi2SkV6JJwJeviDLsq9N4ONvQ=EFANsiWkgiEOjyT9TQSt+HA@mail.gmail.com/ [4] Link: https://lore.kernel.org/all/202502251035.239B85A93@keescook/ [5] This patch (of 7): Provide infrastructure to mseal system mappings. Establish two kernel configs (CONFIG_MSEAL_SYSTEM_MAPPINGS, ARCH_SUPPORTS_MSEAL_SYSTEM_MAPPINGS) and VM_SEALED_SYSMAP macro for future patches. Link: https://lkml.kernel.org/r/20250305021711.3867874-1-jeffxu@google.com Link: https://lkml.kernel.org/r/20250305021711.3867874-2-jeffxu@google.com Signed-off-by: Jeff Xu <jeffxu@chromium.org> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Adhemerval Zanella <adhemerval.zanella@linaro.org> Cc: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Benjamin Berg <benjamin@sipsolutions.net> Cc: Christoph Hellwig <hch@lst.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Rientjes <rientjes@google.com> Cc: David S. Miller <davem@davemloft.net> Cc: Elliot Hughes <enh@google.com> Cc: Florian Faineli <f.fainelli@gmail.com> Cc: Greg Ungerer <gerg@kernel.org> Cc: Guenter Roeck <groeck@chromium.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Hyeonggon Yoo <42.hyeyoo@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: Jason A. Donenfeld <jason@zx2c4.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Jorge Lucangeli Obes <jorgelo@chromium.org> Cc: Linus Waleij <linus.walleij@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <mike.rapoport@gmail.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Pedro Falcato <pedro.falcato@gmail.com> Cc: Peter Xu <peterx@redhat.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Stephen Röttger <sroettger@google.com> Cc: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
02d9e1a204 |
mm: pgtable: remove tlb_remove_page_ptdesc()
The tlb_remove_ptdesc()/tlb_remove_table() is specially designed for page table pages, and now all architectures have been converted to use it to remove page table pages. So let's remove tlb_remove_page_ptdesc(), it currently has no users and should not be used for page table pages. Link: https://lkml.kernel.org/r/3df04c8494339073b71be4acb2d92e108ecd1b60.1740454179.git.zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickens <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Rik van Riel <riel@surriel.com> Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
f1fdec956f |
x86: pgtable: convert to use tlb_remove_ptdesc()
The x86 has already been converted to use struct ptdesc, so convert it to use tlb_remove_ptdesc() instead of tlb_remove_table(). Link: https://lkml.kernel.org/r/36ad56b7e06fa4b17fb23c4fc650e8e0d72bb3cd.1740454179.git.zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickens <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
4239c198e8 |
riscv: pgtable: unconditionally use tlb_remove_ptdesc()
To support fast gup, the commit
|
||
|
|
e3ecf7c7d0 |
mm: pgtable: convert some architectures to use tlb_remove_ptdesc()
Now, the nine architectures of csky, hexagon, loongarch, m68k, mips,
nios2, openrisc, sh and um do not select CONFIG_MMU_GATHER_RCU_TABLE_FREE,
and just call pagetable_dtor() + tlb_remove_page_ptdesc() (the wrapper of
tlb_remove_page()). This is the same as the implementation of
tlb_remove_{ptdesc|table}() under !CONFIG_MMU_GATHER_TABLE_FREE, so
convert these architectures to use tlb_remove_ptdesc().
The ultimate goal is to make the architecture only use tlb_remove_ptdesc()
or tlb_remove_table() for page table pages.
[zhengqi.arch@bytedance.com: v2]
Link: https://lkml.kernel.org/r/20250303072603.45423-1-zhengqi.arch@bytedance.com
[akpm@linux-foundation.org: remove trailing semi in arch/loongarch/include/asm/pgalloc.h]
Link: https://lkml.kernel.org/r/19db3e8673b67bad2f1df1ab37f1c89d99eacfea.1740454179.git.zhengqi.arch@bytedance.com
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k]
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickens <hughd@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Matthew Wilcow (Oracle) <willy@infradead.org>
Cc: "Mike Rapoport (IBM)" <rppt@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yu Zhao <yuzhao@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
||
|
|
1a03c275a3 |
mm: pgtable: change pt parameter of tlb_remove_ptdesc() to struct ptdesc*
All callers of tlb_remove_ptdesc() pass it a pointer of struct ptdesc, so let's change the pt parameter from void * to struct ptdesc * to perform a type safety check. Link: https://lkml.kernel.org/r/60bb44299cf2d731df6592e446e7f694054d0dbe.1740454179.git.zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickens <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Rik van Riel <riel@surriel.com> Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
f21bb37afb |
mm: pgtable: make generic tlb_remove_table() use struct ptdesc
Patch series "remove tlb_remove_page_ptdesc()", v2. As suggested by Peter Zijlstra below [1], this series aims to remove tlb_remove_page_ptdesc(). : Fundamentally tlb_remove_page() is about removing *pages* as from a PTE, : there should not be a page-table anywhere near here *ever*. : : Yes, some architectures use tlb_remove_page() for page-tables too, but : that is more or less an implementation detail that can be fixed. After this series, all architectures use tlb_remove_table() or tlb_remove_ptdesc() to remove the page table pages. In the future, once all architectures using tlb_remove_table() have also converted to using struct ptdesc (eg. powerpc), it may be possible to use only tlb_remove_ptdesc(). [1] https://lore.kernel.org/linux-mm/20250103111457.GC22934@noisy.programming.kicks-ass.net/ This patch (of 6): Now only arm will call tlb_remove_ptdesc()/tlb_remove_table() when CONFIG_MMU_GATHER_TABLE_FREE is disabled. In this case, the type of the table parameter is actually struct ptdesc * instead of struct page *. Since struct ptdesc still overlaps with struct page and has not been separated from it, forcing the table parameter to struct page * will not cause any problems at this time. But this is definitely incorrect and needs to be fixed. So just like the generic __tlb_remove_table(), let generic tlb_remove_table() use struct ptdesc by default when CONFIG_MMU_GATHER_TABLE_FREE is disabled. Link: https://lkml.kernel.org/r/cover.1740454179.git.zhengqi.arch@bytedance.com Link: https://lkml.kernel.org/r/5be8c3ab7bd68510bf0db4cf84010f4dfe372917.1740454179.git.zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Reviewed-by: Kevin Brodsky <kevin.brodsky@arm.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Hugh Dickens <hughd@google.com> Cc: Jann Horn <jannh@google.com> Cc: Matthew Wilcow (Oracle) <willy@infradead.org> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Vishal Moola (Oracle) <vishal.moola@gmail.com> Cc: Will Deacon <will@kernel.org> Cc: Yu Zhao <yuzhao@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
1b3d3e9f4a |
microblaze/mm: put mm_cmdline_setup() in .init.text section
As reported by lkp, there is a section mismatch of mm_cmdline_setup() and memblock. The reason is we don't specify the section of mm_cmdline_setup() and gcc put it into .text.unlikely. As mm_cmdline_setup() is only used in mmu_init(), which is in .init.text section, put mm_cmdline_setup() into it too. Link: https://lkml.kernel.org/r/20250328010136.13139-1-richard.weiyang@gmail.com Signed-off-by: Wei Yang <richard.weiyang@gmail.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202503241259.kJV3U7Xj-lkp@intel.com/ Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Michal Simek <monstr@monstr.eu> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
9342bc134a |
mm/memory_hotplug: fix call folio_test_large with tail page in do_migrate_range
We triggered the below BUG: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x2 pfn:0x240402 head: order:9 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 flags: 0x1ffffe0000000040(head|node=1|zone=3|lastcpupid=0x1ffff) page_type: f4(hugetlb) page dumped because: VM_BUG_ON_PAGE(page->compound_head & 1) ------------[ cut here ]------------ kernel BUG at ./include/linux/page-flags.h:310! Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP Modules linked in: CPU: 7 UID: 0 PID: 166 Comm: sh Not tainted 6.14.0-rc7-dirty #374 Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015 pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : const_folio_flags+0x3c/0x58 lr : const_folio_flags+0x3c/0x58 Call trace: const_folio_flags+0x3c/0x58 (P) do_migrate_range+0x164/0x720 offline_pages+0x63c/0x6fc memory_subsys_offline+0x190/0x1f4 device_offline+0xc0/0x13c state_store+0x90/0xd8 dev_attr_store+0x18/0x2c sysfs_kf_write+0x44/0x54 kernfs_fop_write_iter+0x120/0x1cc vfs_write+0x240/0x378 ksys_write+0x70/0x108 __arm64_sys_write+0x1c/0x28 invoke_syscall+0x48/0x10c el0_svc_common.constprop.0+0x40/0xe0 When allocating a hugetlb folio, between the folio is taken from buddy and prep_compound_page() is called, start_isolate_page_range() and do_migrate_range() is called. When do_migrate_range() scans the head page of the hugetlb folio, the compound_head field isn't set, so scans the tail page next. And at this time, the compound_head field of tail page is set, folio_test_large() is called by tail page, thus triggers VM_BUG_ON(). To fix it, get folio refcount before calling folio_test_large(). Link: https://lkml.kernel.org/r/20250324131750.1551884-1-tujinjiang@huawei.com Fixes: |
||
|
|
38c5ecaadd |
MAINTAINERS: mm: add entry for secretmem
Link: https://lkml.kernel.org/r/20250326215541.1809379-5-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Oscar Salvador <osalvador@suse.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Dan Willaims <dan.j.williams@intel.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
6985850f3e |
MAINTAINERS: mm: add entry for numa memblocks and numa emulation
Link: https://lkml.kernel.org/r/20250326215541.1809379-4-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Oscar Salvador <osalvador@suse.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Dan Willaims <dan.j.williams@intel.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
8871b533ef |
MAINTAINERS: mm: add entry for execmem
Link: https://lkml.kernel.org/r/20250326215541.1809379-3-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Oscar Salvador <osalvador@suse.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Dan Willaims <dan.j.williams@intel.com> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
59aa44d1ee |
MAINTAINERS: fixup USERFAULTFD entry
Patch series "MAINTAINERS: add my isub-entries to MM part." Following discussion at LSF/MM/BPF I'm adding execmem, secretmem and numa memblocks sub-entries for MEMORY MANAGEMENT in MAINTAINERS. This patch (of 4): Change title to "MEMORY MANAGEMENT - USERFAULTFD" and make it sub-topic in memory management and add missing include/linux/userfaultfd_k.h and mailing list Link: https://lkml.kernel.org/r/20250326215541.1809379-1-rppt@kernel.org Link: https://lkml.kernel.org/r/20250326215541.1809379-2-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Michal Hocko <mhocko@suse.com> Acked-by: Oscar Salvador <osalvador@suse.de> Acked-by: Vlastimil Babka <vbabka@suse.cz> Cc: Dan Willaims <dan.j.williams@intel.com> Cc: "Mike Rapoport (IBM)" <rppt@kernel.org> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
983e760bcd |
selftest/mm: va_high_addr_switch: add ppc64 support check
Add PPC64 Radix MMU support to the va_high_addr_switch.sh by introducing check_supported_ppc64(). The function verifies: - 5-level paging (PGTABLE_LEVELS >= 5) enable in kernel config - Radix MMU (required for PPC64 5-level translation) - HugePages availability (needed for some tests) If any check fails, the test is skipped (ksft_skip). This ensures compatibility with Power9/Power10 systems running in Radix MMU mode. Avoid failures on 4-level paging system: # mmap(NULL, MAP_HUGETLB): 0xffffffffffffffff - FAILED # mmap(LOW_ADDR, MAP_HUGETLB): 0xffffffffffffffff - FAILED # mmap(HIGH_ADDR, MAP_HUGETLB): 0xffffffffffffffff - FAILED # mmap(HIGH_ADDR, MAP_HUGETLB) again: 0xffffffffffffffff - FAILED # mmap(HIGH_ADDR, MAP_FIXED | MAP_HUGETLB): 0xffffffffffffffff - FAILED # mmap(-1, MAP_HUGETLB): 0xffffffffffffffff - FAILED # mmap(-1, MAP_HUGETLB) again: 0xffffffffffffffff - FAILED # mmap(ADDR_SWITCH_HINT - PAGE_SIZE, 2*HUGETLB_SIZE, MAP_HUGETLB): 0xffffffffffffffff - FAILED # mmap(ADDR_SWITCH_HINT , 2*HUGETLB_SIZE, MAP_FIXED | MAP_HUGETLB): 0xffffffffffffffff - FAILED Link: https://lkml.kernel.org/r/20250327114813.25980-1-liwang@redhat.com Signed-off-by: Li Wang <liwang@redhat.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
7790c9c926 |
memblock: don't release high memory to page allocator when HIGHMEM is off
Nathan Chancellor reports the following crash on a MIPS system with CONFIG_HIGHMEM=n: Linux version 6.14.0-rc6-00359-g6faea3422e3b (nathan@ax162) (mips-linux-gcc (GCC) 14.2.0, GNU ld (GNU Binutils) 2.42) #1 SMP Fri Mar 21 08:12:02 MST 2025 earlycon: uart8250 at I/O port 0x3f8 (options '38400n8') printk: legacy bootconsole [uart8250] enabled Config serial console: console=ttyS0,38400n8r CPU0 revision is: 00019300 (MIPS 24Kc) FPU revision is: 00739300 MIPS: machine is mti,malta Software DMA cache coherency enabled Initial ramdisk at: 0x8fad0000 (5360128 bytes) OF: reserved mem: Reserved memory: No reserved-memory node in the DT Primary instruction cache 2kB, VIPT, 2-way, linesize 16 bytes. Primary data cache 2kB, 2-way, VIPT, no aliases, linesize 16 bytes Zone ranges: DMA [mem 0x0000000000000000-0x0000000000ffffff] Normal [mem 0x0000000001000000-0x000000001fffffff] Movable zone start for each node Early memory node ranges node 0: [mem 0x0000000000000000-0x000000000fffffff] node 0: [mem 0x0000000090000000-0x000000009fffffff] Initmem setup node 0 [mem 0x0000000000000000-0x000000009fffffff] On node 0, zone Normal: 16384 pages in unavailable ranges random: crng init done percpu: Embedded 3 pages/cpu s18832 r8192 d22128 u49152 Kernel command line: rd_start=0xffffffff8fad0000 rd_size=5360128 console=ttyS0,38400n8r printk: log buffer data + meta data: 32768 + 102400 = 135168 bytes Dentry cache hash table entries: 65536 (order: 4, 262144 bytes, linear) Inode-cache hash table entries: 32768 (order: 3, 131072 bytes, linear) Writing ErrCtl register=00000000 Readback ErrCtl register=00000000 Built 1 zonelists, mobility grouping on. Total pages: 16384 mem auto-init: stack:all(zero), heap alloc:off, heap free:off Unhandled kernel unaligned access[#1]: CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.14.0-rc6-00359-g6faea3422e3b #1 Hardware name: mti,malta $ 0 : 00000000 00000001 81cb0880 00129027 $ 4 : 00000001 0000000a 00000002 00129026 $ 8 : ffffdfff 80101e00 00000002 00000000 $12 : 81c9c224 81c63e68 00000002 00000000 $16 : 805b1e00 00025800 81cb0880 00000002 $20 : 00000000 81c63e64 0000000a 81f10000 $24 : 81c63e64 81c63e60 $28 : 81c60000 81c63de0 00000001 81cc9d20 Hi : 00000000 Lo : 00000000 epc : 814a227c __free_pages_ok+0x144/0x3c0 ra : 81cc9d20 memblock_free_all+0x1d4/0x27c Status: 10000002 KERNEL EXL Cause : 00800410 (ExcCode 04) BadVA : 00129026 PrId : 00019300 (MIPS 24Kc) Modules linked in: Process swapper (pid: 0, threadinfo=(ptrval), task=(ptrval), tls=00000000) Stack : 81f10000 805a9e00 81c80000 00000000 00000002 814aa240 000003ff 00000400 00000000 81f10000 81c9c224 00003b1f 81c80000 81c63e60 81ca0000 81c63e64 81f10000 0000000a 0000001f 81cc9d20 81f10000 81cc96d8 00000000 81c80000 81c9c224 81c63e60 81c63e64 00000000 81f10000 00024000 00028000 00025c00 90000000 a0000000 00000002 00000017 00000000 00000000 81f10000 81f10000 ... Call Trace: [<814a227c>] __free_pages_ok+0x144/0x3c0 [<81cc9d20>] memblock_free_all+0x1d4/0x27c [<81cc6764>] mm_core_init+0x100/0x138 [<81cb4ba4>] start_kernel+0x4a0/0x6e4 Code: 1080ffd5 02003825 2467ffff <8ce30000> 7c630500 1060ffd4 00000000 8ce30000 7c630180 The crash happens because commit |
||
|
|
2ebc3b68ac |
mm/mm_init: init holes in the end of the memory map for FLATMEM
Patch series "mm: fixes for fallouts from mem_init() cleanup". These are the fixes for fallouts from mem_init() cleanup reported by Nathan Chancellor and kbuild. The details are in the commit messages. This patch (of 2): Kernel test robot reports the following crash on 32-bit system with FLATMEM and DEBUG_VM_PGFLAGS enabled: [ 0.478822][ T0] kernel BUG at include/linux/page-flags.h:536! [ 0.479312][ T0] Oops: invalid opcode: 0000 [#1] PREEMPT SMP [ 0.479768][ T0] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.14.0-rc6-00357-g8268af309d07 #1 [ 0.480470][ T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 [ 0.481260][ T0] EIP: reserve_bootmem_region (include/linux/page-flags.h:536) [ 0.481683][ T0] Code: 5d c3 01 f1 89 c8 ba e1 38 f4 c3 e8 1e 37 8e fc 0f 0b b8 90 e2 62 c4 e8 e2 05 5e fc 01 f1 89 c8 ba be 85 f7 c3 e8 04 37 8e fc <0f> 0b b8 80 e2 62 c4 e8 c8 05 5e fc 55 89 e5 53 57 56 83 ec 10 89 [ 0.483177][ T0] EAX: 00000000 EBX: c425df50 ECX: 00000000 EDX: 00000000 [ 0.483712][ T0] ESI: 017ffc00 EDI: ffffffff EBP: c425df34 ESP: c425df2c [ 0.484248][ T0] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00210046 [ 0.484846][ T0] CR0: 80050033 CR2: 00000000 CR3: 04b48000 CR4: 00000090 [ 0.485376][ T0] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 [ 0.485907][ T0] DR6: fffe0ff0 DR7: 00000400 [ 0.486253][ T0] Call Trace: [ 0.486494][ T0] ? __die_body (arch/x86/kernel/dumpstack.c:478) [ 0.486822][ T0] ? die (arch/x86/kernel/dumpstack.c:?) [ 0.487099][ T0] ? do_trap (arch/x86/kernel/traps.c:? arch/x86/kernel/traps.c:197) [ 0.487409][ T0] ? do_error_trap (arch/x86/kernel/traps.c:217) [ 0.487752][ T0] ? reserve_bootmem_region (include/linux/page-flags.h:536) [ 0.488153][ T0] ? exc_overflow (arch/x86/kernel/traps.c:301) [ 0.488490][ T0] ? handle_invalid_op (arch/x86/kernel/traps.c:254) [ 0.488869][ T0] ? reserve_bootmem_region (include/linux/page-flags.h:536) [ 0.489271][ T0] ? exc_invalid_op (arch/x86/kernel/traps.c:316) [ 0.489619][ T0] ? handle_exception (arch/x86/entry/entry_32.S:1055) [ 0.489996][ T0] ? exc_overflow (arch/x86/kernel/traps.c:301) [ 0.490332][ T0] ? reserve_bootmem_region (include/linux/page-flags.h:536) [ 0.490733][ T0] ? exc_overflow (arch/x86/kernel/traps.c:301) [ 0.491068][ T0] ? reserve_bootmem_region (include/linux/page-flags.h:536) [ 0.491470][ T0] memmap_init_reserved_pages (mm/memblock.c:2203) [ 0.491887][ T0] free_low_memory_core_early (mm/memblock.c:?) [ 0.492302][ T0] memblock_free_all (mm/memblock.c:2272 include/linux/atomic/atomic-arch-fallback.h:546 include/linux/atomic/atomic-long.h:123 include/linux/atomic/atomic-instrumented.h:3261 include/linux/mm.h:67 mm/memblock.c:2273) [ 0.492659][ T0] mem_init (arch/x86/mm/init_32.c:735) [ 0.492952][ T0] mm_core_init (mm/mm_init.c:2730) [ 0.493271][ T0] start_kernel (init/main.c:958) [ 0.493604][ T0] i386_start_kernel (arch/x86/kernel/head32.c:79) [ 0.493969][ T0] startup_32_smp (arch/x86/kernel/head_32.S:292) The crash happens because after commit |
||
|
|
4a0cb63144 |
MAINTAINERS: add peterx as userfaultfd reviewer
Add an entry for userfaultfd and make myself a reviewer of it, just in case it helps people manage the cc list. I named it MEMORY USERFAULTFD, could be a bad name, but then it can be together with the MEMORY* entries when everything is in alphabetic order, which is definitely a benefit. The line may not change much on how I'd work with userfaultfd; I think I'll do the same as before.. But maybe it still, more or less, adds some responsibility on top, indeed. [akpm@linux-foundation.org: add include/linux/userfaultfd_k.h, per Mike] [akpm@linux-foundation.org: fix misordering] Link: https://lkml.kernel.org/r/20250322002124.131736-1-peterx@redhat.com Signed-off-by: Peter Xu <peterx@redhat.com> Cc: Andrea Arcangeli <aarcange@redhat.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com> Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mike Rapoport <rppt@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
bd145bdd26 |
mm/page_alloc: replace flag check with PageHWPoison() in check_new_page_bad()
This patch replaces the direct check for the __PG_HWPOISON flag with the PageHWPoison() macro, improving code readability and maintaining consistency with other parts of the memory management code. Link: https://lkml.kernel.org/r/20250320063346.489030-1-ye.liu@linux.dev Signed-off-by: Ye Liu <liuye@kylinos.cn> Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
7f29070f4c |
mm/damon/core: simplify control flow in damon_register_ops()
The function logic is not complex, so using goto is unnecessary. Replace it with a straightforward if-else to simplify control flow and improve readability. Link: https://lkml.kernel.org/r/Z9vxcPCw8tDsjKw1@OneApple Signed-off-by: Taotao Chen <chentaotao@didiglobal.com> Reviewed-by: SeongJae Park <sj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
7fa46cdfff |
mm/kasan: use SLAB_NO_MERGE flag instead of an empty constructor
Use SLAB_NO_MERGE flag to prevent merging instead of providing an empty constructor. Using an empty constructor in this manner is an abuse of slab interface. The SLAB_NO_MERGE flag should be used with caution, but in this case, it is acceptable as the cache is intended solely for debugging purposes. No functional changes intended. Link: https://lkml.kernel.org/r/20250318015926.1629748-1-harry.yoo@oracle.com Signed-off-by: Harry Yoo <harry.yoo@oracle.com> Reviewed-by: Alexander Potapenko <glider@google.com> Reviewed-by: Andrey Konovalov <andreyknvl@gmail.com> Acked-by: Andrey Ryabinin <ryabinin.a.a@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> |
||
|
|
7a95a05f15 |
mm: page_alloc: fix defrag_mode's retry & OOM path
Brendan points out that defrag_mode doesn't properly clear
ALLOC_NOFRAGMENT on its last-ditch attempt to allocate. But looking
closer, the problem is actually more severe: it doesn't actually *check*
whether it's already retried, and keeps looping. This means the OOM path
is never taken, and the thread can loop indefinitely.
This is verified with an intentional OOM test on defrag_mode=1, which
results in the machine hanging. After this patch, it triggers the OOM
kill reliably and recovers.
Clear ALLOC_NOFRAGMENT properly, and only retry once.
Link: https://lkml.kernel.org/r/20250401041231.GA2117727@cmpxchg.org
Fixes:
|
||
|
|
36eed54008 |
mm/mremap: do not set vrm->vma NULL immediately prior to checking it
This seems rather unwise. If we cannot merge, extend, then we need to
recall the original VMA to see if we need to uncharge.
If we do need to, do so.
Link: https://lkml.kernel.org/r/b2fb6b9c-376d-4e9b-905e-26d847fd3865@lucifer.local
Fixes:
|
||
|
|
c11bcbc0a5 |
mm: zswap: fix crypto_free_acomp() deadlock in zswap_cpu_comp_dead()
Currently, zswap_cpu_comp_dead() calls crypto_free_acomp() while holding
the per-CPU acomp_ctx mutex. crypto_free_acomp() then holds scomp_lock
(through crypto_exit_scomp_ops_async()).
On the other hand, crypto_alloc_acomp_node() holds the scomp_lock (through
crypto_scomp_init_tfm()), and then allocates memory. If the allocation
results in reclaim, we may attempt to hold the per-CPU acomp_ctx mutex.
The above dependencies can cause an ABBA deadlock. For example in the
following scenario:
(1) Task A running on CPU #1:
crypto_alloc_acomp_node()
Holds scomp_lock
Enters reclaim
Reads per_cpu_ptr(pool->acomp_ctx, 1)
(2) Task A is descheduled
(3) CPU #1 goes offline
zswap_cpu_comp_dead(CPU #1)
Holds per_cpu_ptr(pool->acomp_ctx, 1))
Calls crypto_free_acomp()
Waits for scomp_lock
(4) Task A running on CPU #2:
Waits for per_cpu_ptr(pool->acomp_ctx, 1) // Read on CPU #1
DEADLOCK
Since there is no requirement to call crypto_free_acomp() with the per-CPU
acomp_ctx mutex held in zswap_cpu_comp_dead(), move it after the mutex is
unlocked. Also move the acomp_request_free() and kfree() calls for
consistency and to avoid any potential sublte locking dependencies in the
future.
With this, only setting acomp_ctx fields to NULL occurs with the mutex
held. This is similar to how zswap_cpu_comp_prepare() only initializes
acomp_ctx fields with the mutex held, after performing all allocations
before holding the mutex.
Opportunistically, move the NULL check on acomp_ctx so that it takes place
before the mutex dereference.
Link: https://lkml.kernel.org/r/20250226185625.2672936-1-yosry.ahmed@linux.dev
Fixes:
|
||
|
|
1ca77ff183 |
mm/hugetlb: move hugetlb_sysctl_init() to the __init section
hugetlb_sysctl_init() is only invoked once by an __init function and is
merely a wrapper around another __init function so there is not reason to
keep it.
Fixes the following warning when toning down some GCC inline options:
WARNING: modpost: vmlinux: section mismatch in reference:
hugetlb_sysctl_init+0x1b (section: .text) ->
__register_sysctl_init (section: .init.text)
Link: https://lkml.kernel.org/r/20250319060041.2737320-1-marc.herbert@linux.intel.com
Signed-off-by: Marc Herbert <Marc.Herbert@linux.intel.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
||
|
|
a0a9f2180b |
mm: page_isolation: avoid calling folio_hstate() without hugetlb_lock
I found a NULL pointer dereference as followed:
BUG: kernel NULL pointer dereference, address: 0000000000000028
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP PTI
CPU: 5 UID: 0 PID: 5964 Comm: sh Kdump: loaded Not tainted 6.13.0-dirty #20
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.
RIP: 0010:has_unmovable_pages+0x184/0x360
...
Call Trace:
<TASK>
set_migratetype_isolate+0xd1/0x180
start_isolate_page_range+0xd2/0x170
alloc_contig_range_noprof+0x101/0x660
alloc_contig_pages_noprof+0x238/0x290
alloc_gigantic_folio.isra.0+0xb6/0x1f0
only_alloc_fresh_hugetlb_folio.isra.0+0xf/0x60
alloc_pool_huge_folio+0x80/0xf0
set_max_huge_pages+0x211/0x490
__nr_hugepages_store_common+0x5f/0xe0
nr_hugepages_store+0x77/0x80
kernfs_fop_write_iter+0x118/0x200
vfs_write+0x23c/0x3f0
ksys_write+0x62/0xe0
do_syscall_64+0x5b/0x170
entry_SYSCALL_64_after_hwframe+0x76/0x7e
As has_unmovable_pages() call folio_hstate() without hugetlb_lock, there
is a race to free the HugeTLB page between PageHuge() and folio_hstate().
There is no need to add hugetlb_lock here as the HugeTLB page can be freed
in lot of places. So it's enough to unfold folio_hstate() and add a check
to avoid NULL pointer dereference for hugepage_migration_supported().
Link: https://lkml.kernel.org/r/20250122061151.578768-1-liushixin2@huawei.com
Fixes:
|
||
|
|
b98072af60 |
mm/hugetlb_vmemmap: fix memory loads ordering
Using x86_64 as an example, for a 32KB struct page[] area describing a 2MB
hugeTLB, HVO reduces the area to 4KB by the following steps:
1. Split the (r/w vmemmap) PMD mapping the area into 512 (r/w) PTEs;
2. For the 8 PTEs mapping the area, remap PTE 1-7 to the page mapped
by PTE 0, and at the same time change the permission from r/w to
r/o;
3. Free the pages PTE 1-7 used to map, hence the reduction from 32KB
to 4KB.
However, the following race can happen due to improperly memory loads
ordering:
CPU 1 (HVO) CPU 2 (speculative PFN walker)
page_ref_freeze()
synchronize_rcu()
rcu_read_lock()
page_is_fake_head() is false
vmemmap_remap_pte()
XXX: struct page[] becomes r/o
page_ref_unfreeze()
page_ref_count() is not zero
atomic_add_unless(&page->_refcount)
XXX: try to modify r/o struct page[]
Specifically, page_is_fake_head() must be ordered after page_ref_count()
on CPU 2 so that it can only return true for this case, to avoid the later
attempt to modify r/o struct page[].
This patch adds the missing memory barrier and makes the tests on
page_is_fake_head() and page_ref_count() done in the proper order.
Link: https://lkml.kernel.org/r/20250108074822.722696-1-yuzhao@google.com
Fixes:
|
||
|
|
fe4cdc2c4e |
mm/userfaultfd: fix release hang over concurrent GUP
This patch should fix a possible userfaultfd release() hang during
concurrent GUP.
This problem was initially reported by Dimitris Siakavaras in July 2023
[1] in a firecracker use case. Firecracker has a separate process
handling page faults remotely, and when the process releases the
userfaultfd it can race with a concurrent GUP from KVM trying to fault in
a guest page during the secondary MMU page fault process.
A similar problem was reported recently again by Jinjiang Tu in March 2025
[2], even though the race happened this time with a mlockall() operation,
which does GUP in a similar fashion.
In 2017, commit
|
||
|
|
28a1b05678 |
Merge tag 'i2c-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c updates from Wolfram Sang:
"i2c-core updates (collected by Wolfram):
- remove last user and unexport i2c_of_match_device()
- irq usage cleanup from Jiri
i2c-host updates (collected by Andi):
Refactoring and cleanups:
- octeon, cadence, i801, pasemi, mlxbf, bcm-iproc: general
refactorings
- octeon: remove 10-bit address support
Improvements:
- amd-asf: improved error handling
- designware: use guard(mutex)
- amd-asf, designware: update naming to follow latest specs
- cadence: fix cleanup path in probe
- i801: use MMIO and I/O mapping helpers to access registers
- pxa: handle error after clk_prepare_enable
New features:
- added i2c_10bit_addr_*_from_msg() and updated multiple drivers
- omap: added multiplexer state handling
- qcom-geni: update frequency configuration
- qup: introduce DMA usage policy
New hardware support:
- exynos: add support for Samsung exynos7870
- k1: add support for spacemit k1 (new driver)
- imx: add support for i.mx94 lpi2c
- rk3x: add support for rk3562
- designware: add support for Renesas RZ/N1D
Multiplexers:
- ltc4306, reg: fix assignment in platform_driver structure
at24 eeprom updates (collected by Bartosz):
- add two new compatible entries to the DT binding document
- drop of_match_ptr() and ACPI_PTR() macros"
* tag 'i2c-for-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: (50 commits)
dt-bindings: i2c: snps,designware-i2c: describe Renesas RZ/N1D variant
irqdomain: i2c: Switch to irq_find_mapping()
i2c: iproc: Refactor prototype and remove redundant error checks
i2c: qcom-geni: Update i2c frequency table to match hardware guidance
i2c: mlxbf: Use readl_poll_timeout_atomic() for polling
i2c: pasemi: Add registers bits and switch to BIT()
i2c: k1: Initialize variable before use
i2c: spacemit: add support for SpacemiT K1 SoC
dt-bindings: i2c: spacemit: add support for K1 SoC
i2c: omap: Add support for setting mux
dt-bindings: i2c: omap: Add mux-states property
i2c: octeon: remove 10-bit addressing support
i2c: octeon: fix return commenting
i2c: i801: Use MMIO if available
i2c: i801: Switch to iomapped register access
i2c: i801: Improve too small kill wait time in i801_check_post
i2c: i801: Move i801_wait_intr and i801_wait_byte_done in the code
i2c: i801: Cosmetic improvements
i2c: cadence: Move reset_control_assert after pm_runtime_set_suspended in probe error path
i2c: cadence: Simplify using devm_clk_get_enabled()
...
|
||
|
|
f8554f512b |
selftests: ublk: kublk: fix an error log line
When doing io_uring operations using liburing, errno is not used to indicate errors, so the %m format specifier does not provide any relevant information for failed io_uring commands. Fix a log line emitted on get_params failure to translate the error code returned in the cqe->res field instead. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250401-ublk_selftests-v1-2-98129c9bc8bb@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
53c959295b |
selftests: ublk: kublk: use ioctl-encoded opcodes
There are a couple of places in the kublk selftests ublk server which use the legacy ublk opcodes. These operations fail (with -EOPNOTSUPP) on a kernel compiled without CONFIG_BLKDEV_UBLK_LEGACY_OPCODES set. We could easily require it to be set as a prerequisite for these selftests, but since new applications should not be using the legacy opcodes, use the ioctl-encoded opcodes everywhere in kublk. Signed-off-by: Uday Shankar <ushankar@purestorage.com> Reviewed-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250401-ublk_selftests-v1-1-98129c9bc8bb@purestorage.com Signed-off-by: Jens Axboe <axboe@kernel.dk> |
||
|
|
e5f1e8af9c |
x86/fred: Fix system hang during S4 resume with FRED enabled
Upon a wakeup from S4, the restore kernel starts and initializes the FRED MSRs as needed from its perspective. It then loads a hibernation image, including the image kernel, and attempts to load image pages directly into their original page frames used before hibernation unless those frames are currently in use. Once all pages are moved to their original locations, it jumps to a "trampoline" page in the image kernel. At this point, the image kernel takes control, but the FRED MSRs still contain values set by the restore kernel, which may differ from those set by the image kernel before hibernation. Therefore, the image kernel must ensure the FRED MSRs have the same values as before hibernation. Since these values depend only on the location of the kernel text and data, they can be recomputed from scratch. Reported-by: Xi Pardee <xi.pardee@intel.com> Reported-by: Todd Brandt <todd.e.brandt@intel.com> Tested-by: Todd Brandt <todd.e.brandt@intel.com> Suggested-by: H. Peter Anvin (Intel) <hpa@zytor.com> Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250401075728.3626147-1-xin@zytor.com |
||
|
|
212120a164 |
Documentation/EDAC: Fix warning document isn't included in any toctree
Fix the build (htmldocs) warning:
Documentation/edac/index.rst: WARNING: document isn't included in any toctree.
Fixes:
|
||
|
|
fcfd94d696 |
io_uring/zcrx: return early from io_zcrx_recv_skb if readlen is 0
When readlen is set for a recvzc request, tcp_read_sock() will call
io_zcrx_recv_skb() one final time with len == desc->count == 0. This is
caused by the !desc->count check happening too late. The offset + 1 !=
skb->len happens earlier and causes the while loop to continue.
Fix this in io_zcrx_recv_skb() instead of tcp_read_sock(). Return early
if len is 0 i.e. the read is done.
Fixes:
|
||
|
|
91e5bfe317 |
Merge tag 'dmaengine-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine
Pull dmaengine updates from Vinod Koul:
"The dmaengine subsystem updates for this cycle consist of a new driver
(Microchip) along with couple of yaml binding conversions, core api
updates and bunch of driver updates etc.
New HW support:
- Microchip sama7d65 dma controller
- Yaml conversion of atmel dma binding and Freescale Elo DMA
Controller binding
Core:
- Remove device_prep_dma_imm_data() API as users are removed
- Reduce scope of some less frequently used DMA request channel APIs
with aim to cleanup these in future
Updates:
- Drop Fenghua Yu from idxd maintainers, as he changed jobs
- AMD ptdma support for multiqueue and ae4dma deprecated PCI IDs
removal"
* tag 'dmaengine-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/dmaengine: (29 commits)
dmaengine: ptdma: Utilize the AE4DMA engine's multi-queue functionality
dmaengine: ae4dma: Use the MSI count and its corresponding IRQ number
dmaengine: ae4dma: Remove deprecated PCI IDs
dmaengine: Remove device_prep_dma_imm_data from struct dma_device
dmaengine: ti: edma: support sw triggered chans in of_edma_xlate()
dmaengine: ti: k3-udma: Enable second resource range for BCDMA and PKTDMA
dmaengine: fsl-edma: free irq correctly in remove path
dmaengine: fsl-edma: cleanup chan after dma_async_device_unregister
dt-bindings: dma: snps,dw-axi-dmac: Allow devices to be marked as noncoherent
dmaengine: dmatest: Fix dmatest waiting less when interrupted
dt-bindings: dma: Convert fsl,elo*-dma to YAML
dt-bindings: dma: fsl-mxs-dma: Add compatible string for i.MX8 chips
dmaengine: Fix typo in comment
dmaengine: ti: k3-udma-glue: Drop skip_fdq argument from k3_udma_glue_reset_rx_chn
dmaengine: bcm2835-dma: fix warning when CONFIG_PM=n
dt-bindings: dma: fsl,edma: Add i.MX94 support
dt-bindings: dma: atmel: add microchip,sama7d65-dma
dmaengine: img-mdc: remove incorrect of_match_ptr annotation
dmaengine: idxd: Delete unnecessary NULL check
dmaengine: pxa: Enable compile test
...
|
||
|
|
e63a165308 |
Merge tag 'phy-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy
Pull phy updates from Vinod Koul:
"A fairly moderate sized request for the generic phy subsystem with
some new device and driver support along with driver updates with
Samsung and Qualcomm ones being major ones.
New HW Support:
- Qualcomm X1P42100 PCIe Gen4x4, QCS615 qmp usbc, PCIe UNIPHY 28LP
driver, SM8750 QMP UFS PHY
- Rockchip rk3576 hdptx, rk3562 naneng-combo support
- Samsung MIPI D-/C-PHY driver, ExynosAutov920 ufs phy driver
Updates:
- Samsung USB3 Type-C lane orientation detection and configuration
for Google gs101
- Qualcomm support for dual lane PHY support for QCS8300 SoC"
* tag 'phy-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (47 commits)
phy: rockchip-naneng-combo: Support rk3562
dt-bindings: phy: rockchip: Add rk3562 naneng-combophy compatible
phy: rockchip: Add Samsung MIPI D-/C-PHY driver
dt-bindings: phy: Add Rockchip MIPI C-/D-PHY schema
phy: qcom: uniphy-28lp: add COMMON_CLK dependency
phy: rockchip: usbdp: Remove unnecessary bool conversion
phy: rockchip: usbdp: Avoid call hpd_event_trigger in dp_phy_init
phy: rockchip: usbdp: Only verify link rates/lanes/voltage when the corresponding set flags are set
phy: qcom-qmp-pcie: add dual lane PHY support for QCS8300
dt-bindings: phy: qcom,sc8280xp-qmp-pcie-phy: Document the QCS8300 QMP PCIe PHY Gen4 x2
phy: qcom-qmp-ufs: Add PHY Configuration support for sm8750
dt-bindings: phy: qcom,sc8280xp-qmp-ufs-phy: document the SM8750 QMP UFS PHY
phy: qcom: Introduce PCIe UNIPHY 28LP driver
dt-bindings: phy: qcom,uniphy-pcie: Document PCIe uniphy
phy: qcom: qmp-usbc: Add qmp configuration for QCS615
phy: freescale: imx8m-pcie: assert phy reset and perst in power off
phy: freescale: imx8m-pcie: cleanup reset logic
phy: core: Remove unused phy_pm_runtime_(allow|forbid)
dt-bindings: phy: document Allwinner A523 USB-2.0 PHY
phy: phy-rockchip-samsung-hdptx: Add support for RK3576
...
|
||
|
|
4d31167e84 |
Merge tag 'soundwire-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire
Pull soundwire updates from Vinod Koul: - Support for SoundWire Bulk Register Access (BRA) protocol in core along with Intel driver support and ASoC bits required - AMD driver updates and support for ACP 7.0 and 7.1 platforms * tag 'soundwire-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vkoul/soundwire: (28 commits) soundwire: take in count the bandwidth of a prepared stream ASoC: rt711-sdca: add DP0 support soundwire: debugfs: add interface for BPT/BRA transfers ASoC: SOF: Intel: hda-sdw-bpt: add CHAIN_DMA support soundwire: intel_ace2x: add BPT send_async/wait callbacks soundwire: intel: add BPT context definition ASoC: SOF: Intel: hda-sdw-bpt: add helpers for SoundWire BPT DMA soundwire: intel_auxdevice: add indirection for BPT send_async/wait soundwire: cadence: add BTP/BRA helpers to format data soundwire: bus: add bpt_stream pointer soundwire: bus: add send_async/wait APIs for BPT protocol soundwire: stream: reuse existing code for BPT stream soundwire: stream: special-case the bus compute_params() routine soundwire: stream: extend sdw_alloc_stream() to take 'type' parameter soundwire: extend sdw_stream_type to BPT soundwire: cadence: add BTP support for DP0 Documentation: driver: add SoundWire BRA description soundwire: amd: change the log level for command response log soundwire: slave: fix an OF node reference leak in soundwire slave device soundwire: Use str_enable_disable-like helpers ... |
||
|
|
d0ebf4c7eb |
x86/platform/iosf_mbi: Remove unused iosf_mbi_unregister_pmic_bus_access_notifier()
The last use of iosf_mbi_unregister_pmic_bus_access_notifier() was
removed in 2017 by:
|
||
|
|
25601e8544 |
Merge tag 'char-misc-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc / IIO driver updates from Greg KH:
"Here is the big set of char, misc, iio, and other smaller driver
subsystems for 6.15-rc1. Lots of stuff in here, including:
- loads of IIO changes and driver updates
- counter driver updates
- w1 driver updates
- faux conversions for some drivers that were abusing the platform
bus interface
- coresight driver updates
- rust miscdevice binding updates based on real-world-use
- other minor driver updates
All of these have been in linux-next with no reported issues for quite
a while"
* tag 'char-misc-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (292 commits)
samples: rust_misc_device: fix markup in top-level docs
Coresight: Fix a NULL vs IS_ERR() bug in probe
misc: lis3lv02d: convert to use faux_device
tlclk: convert to use faux_device
regulator: dummy: convert to use the faux device interface
bus: mhi: host: Fix race between unprepare and queue_buf
coresight: configfs: Constify struct config_item_type
doc: iio: ad7380: describe offload support
iio: ad7380: add support for SPI offload
iio: light: Add check for array bounds in veml6075_read_int_time_ms
iio: adc: ti-ads7924 Drop unnecessary function parameters
staging: iio: ad9834: Use devm_regulator_get_enable()
staging: iio: ad9832: Use devm_regulator_get_enable()
iio: gyro: bmg160_spi: add of_match_table
dt-bindings: iio: adc: Add i.MX94 and i.MX95 support
iio: adc: ad7768-1: remove unnecessary locking
Documentation: ABI: add wideband filter type to sysfs-bus-iio
iio: adc: ad7768-1: set MOSI idle state to prevent accidental reset
iio: adc: ad7768-1: Fix conversion result sign
iio: adc: ad7124: Benefit of dev = indio_dev->dev.parent in ad7124_parse_channel_config()
...
|
||
|
|
2cd5769fb0 |
Merge tag 'driver-core-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updatesk from Greg KH:
"Here is the big set of driver core updates for 6.15-rc1. Lots of stuff
happened this development cycle, including:
- kernfs scaling changes to make it even faster thanks to rcu
- bin_attribute constify work in many subsystems
- faux bus minor tweaks for the rust bindings
- rust binding updates for driver core, pci, and platform busses,
making more functionaliy available to rust drivers. These are all
due to people actually trying to use the bindings that were in
6.14.
- make Rafael and Danilo full co-maintainers of the driver core
codebase
- other minor fixes and updates"
* tag 'driver-core-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (52 commits)
rust: platform: require Send for Driver trait implementers
rust: pci: require Send for Driver trait implementers
rust: platform: impl Send + Sync for platform::Device
rust: pci: impl Send + Sync for pci::Device
rust: platform: fix unrestricted &mut platform::Device
rust: pci: fix unrestricted &mut pci::Device
rust: device: implement device context marker
rust: pci: use to_result() in enable_device_mem()
MAINTAINERS: driver core: mark Rafael and Danilo as co-maintainers
rust/kernel/faux: mark Registration methods inline
driver core: faux: only create the device if probe() succeeds
rust/faux: Add missing parent argument to Registration::new()
rust/faux: Drop #[repr(transparent)] from faux::Registration
rust: io: fix devres test with new io accessor functions
rust: io: rename `io::Io` accessors
kernfs: Move dput() outside of the RCU section.
efi: rci2: mark bin_attribute as __ro_after_init
rapidio: constify 'struct bin_attribute'
firmware: qemu_fw_cfg: constify 'struct bin_attribute'
powerpc/perf/hv-24x7: Constify 'struct bin_attribute'
...
|
||
|
|
d6b02199cd |
Merge tag 'mm-nonmm-stable-2025-03-30-18-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton: - The series "powerpc/crash: use generic crashkernel reservation" from Sourabh Jain changes powerpc's kexec code to use more of the generic layers. - The series "get_maintainer: report subsystem status separately" from Vlastimil Babka makes some long-requested improvements to the get_maintainer output. - The series "ucount: Simplify refcounting with rcuref_t" from Sebastian Siewior cleans up and optimizing the refcounting in the ucount code. - The series "reboot: support runtime configuration of emergency hw_protection action" from Ahmad Fatoum improves the ability for a driver to perform an emergency system shutdown or reboot. - The series "Converge on using secs_to_jiffies() part two" from Easwar Hariharan performs further migrations from msecs_to_jiffies() to secs_to_jiffies(). - The series "lib/interval_tree: add some test cases and cleanup" from Wei Yang permits more userspace testing of kernel library code, adds some more tests and performs some cleanups. - The series "hung_task: Dump the blocking task stacktrace" from Masami Hiramatsu arranges for the hung_task detector to dump the stack of the blocking task and not just that of the blocked task. - The series "resource: Split and use DEFINE_RES*() macros" from Andy Shevchenko provides some cleanups to the resource definition macros. - Plus the usual shower of singleton patches - please see the individual changelogs for details. * tag 'mm-nonmm-stable-2025-03-30-18-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (77 commits) mailmap: consolidate email addresses of Alexander Sverdlin fs/procfs: fix the comment above proc_pid_wchan() relay: use kasprintf() instead of fixed buffer formatting resource: replace open coded variant of DEFINE_RES() resource: replace open coded variants of DEFINE_RES_*_NAMED() resource: replace open coded variant of DEFINE_RES_NAMED_DESC() resource: split DEFINE_RES_NAMED_DESC() out of DEFINE_RES_NAMED() samples: add hung_task detector mutex blocking sample hung_task: show the blocker task if the task is hung on mutex kexec_core: accept unaccepted kexec segments' destination addresses watchdog/perf: optimize bytes copied and remove manual NUL-termination lib/interval_tree: fix the comment of interval_tree_span_iter_next_gap() lib/interval_tree: skip the check before go to the right subtree lib/interval_tree: add test case for span iteration lib/interval_tree: add test case for interval_tree_iter_xxx() helpers lib/rbtree: add random seed lib/rbtree: split tests lib/rbtree: enable userland test suite for rbtree related data structure checkpatch: describe --min-conf-desc-length scripts/gdb/symbols: determine KASLR offset on s390 ... |
||
|
|
9e4e249018 |
cpufreq: Reference count policy in cpufreq_update_limits()
Since acpi_processor_notify() can be called before registering a cpufreq
driver or even in cases when a cpufreq driver is not registered at all,
cpufreq_update_limits() needs to check if a cpufreq driver is present
and prevent it from being unregistered.
For this purpose, make it call cpufreq_cpu_get() to obtain a cpufreq
policy pointer for the given CPU and reference count the corresponding
policy object, if present.
Fixes:
|
||
|
|
eb0ece1602 |
Merge tag 'mm-stable-2025-03-30-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton: - The series "Enable strict percpu address space checks" from Uros Bizjak uses x86 named address space qualifiers to provide compile-time checking of percpu area accesses. This has caused a small amount of fallout - two or three issues were reported. In all cases the calling code was found to be incorrect. - The series "Some cleanup for memcg" from Chen Ridong implements some relatively monir cleanups for the memcontrol code. - The series "mm: fixes for device-exclusive entries (hmm)" from David Hildenbrand fixes a boatload of issues which David found then using device-exclusive PTE entries when THP is enabled. More work is needed, but this makes thins better - our own HMM selftests now succeed. - The series "mm: zswap: remove z3fold and zbud" from Yosry Ahmed remove the z3fold and zbud implementations. They have been deprecated for half a year and nobody has complained. - The series "mm: further simplify VMA merge operation" from Lorenzo Stoakes implements numerous simplifications in this area. No runtime effects are anticipated. - The series "mm/madvise: remove redundant mmap_lock operations from process_madvise()" from SeongJae Park rationalizes the locking in the madvise() implementation. Performance gains of 20-25% were observed in one MADV_DONTNEED microbenchmark. - The series "Tiny cleanup and improvements about SWAP code" from Baoquan He contains a number of touchups to issues which Baoquan noticed when working on the swap code. - The series "mm: kmemleak: Usability improvements" from Catalin Marinas implements a couple of improvements to the kmemleak user-visible output. - The series "mm/damon/paddr: fix large folios access and schemes handling" from Usama Arif provides a couple of fixes for DAMON's handling of large folios. - The series "mm/damon/core: fix wrong and/or useless damos_walk() behaviors" from SeongJae Park fixes a few issues with the accuracy of kdamond's walking of DAMON regions. - The series "expose mapping wrprotect, fix fb_defio use" from Lorenzo Stoakes changes the interaction between framebuffer deferred-io and core MM. No functional changes are anticipated - this is preparatory work for the future removal of page structure fields. - The series "mm/damon: add support for hugepage_size DAMOS filter" from Usama Arif adds a DAMOS filter which permits the filtering by huge page sizes. - The series "mm: permit guard regions for file-backed/shmem mappings" from Lorenzo Stoakes extends the guard region feature from its present "anon mappings only" state. The feature now covers shmem and file-backed mappings. - The series "mm: batched unmap lazyfree large folios during reclamation" from Barry Song cleans up and speeds up the unmapping for pte-mapped large folios. - The series "reimplement per-vma lock as a refcount" from Suren Baghdasaryan puts the vm_lock back into the vma. Our reasons for pulling it out were largely bogus and that change made the code more messy. This patchset provides small (0-10%) improvements on one microbenchmark. - The series "Docs/mm/damon: misc DAMOS filters documentation fixes and improves" from SeongJae Park does some maintenance work on the DAMON docs. - The series "hugetlb/CMA improvements for large systems" from Frank van der Linden addresses a pile of issues which have been observed when using CMA on large machines. - The series "mm/damon: introduce DAMOS filter type for unmapped pages" from SeongJae Park enables users of DMAON/DAMOS to filter my the page's mapped/unmapped status. - The series "zsmalloc/zram: there be preemption" from Sergey Senozhatsky teaches zram to run its compression and decompression operations preemptibly. - The series "selftests/mm: Some cleanups from trying to run them" from Brendan Jackman fixes a pile of unrelated issues which Brendan encountered while runnimg our selftests. - The series "fs/proc/task_mmu: add guard region bit to pagemap" from Lorenzo Stoakes permits userspace to use /proc/pid/pagemap to determine whether a particular page is a guard page. - The series "mm, swap: remove swap slot cache" from Kairui Song removes the swap slot cache from the allocation path - it simply wasn't being effective. - The series "mm: cleanups for device-exclusive entries (hmm)" from David Hildenbrand implements a number of unrelated cleanups in this code. - The series "mm: Rework generic PTDUMP configs" from Anshuman Khandual implements a number of preparatoty cleanups to the GENERIC_PTDUMP Kconfig logic. - The series "mm/damon: auto-tune aggregation interval" from SeongJae Park implements a feedback-driven automatic tuning feature for DAMON's aggregation interval tuning. - The series "Fix lazy mmu mode" from Ryan Roberts fixes some issues in powerpc, sparc and x86 lazy MMU implementations. Ryan did this in preparation for implementing lazy mmu mode for arm64 to optimize vmalloc. - The series "mm/page_alloc: Some clarifications for migratetype fallback" from Brendan Jackman reworks some commentary to make the code easier to follow. - The series "page_counter cleanup and size reduction" from Shakeel Butt cleans up the page_counter code and fixes a size increase which we accidentally added late last year. - The series "Add a command line option that enables control of how many threads should be used to allocate huge pages" from Thomas Prescher does that. It allows the careful operator to significantly reduce boot time by tuning the parallalization of huge page initialization. - The series "Fix calculations in trace_balance_dirty_pages() for cgwb" from Tang Yizhou fixes the tracing output from the dirty page balancing code. - The series "mm/damon: make allow filters after reject filters useful and intuitive" from SeongJae Park improves the handling of allow and reject filters. Behaviour is made more consistent and the documention is updated accordingly. - The series "Switch zswap to object read/write APIs" from Yosry Ahmed updates zswap to the new object read/write APIs and thus permits the removal of some legacy code from zpool and zsmalloc. - The series "Some trivial cleanups for shmem" from Baolin Wang does as it claims. - The series "fs/dax: Fix ZONE_DEVICE page reference counts" from Alistair Popple regularizes the weird ZONE_DEVICE page refcount handling in DAX, permittig the removal of a number of special-case checks. - The series "refactor mremap and fix bug" from Lorenzo Stoakes is a preparatoty refactoring and cleanup of the mremap() code. - The series "mm: MM owner tracking for large folios (!hugetlb) + CONFIG_NO_PAGE_MAPCOUNT" from David Hildenbrand reworks the manner in which we determine whether a large folio is known to be mapped exclusively into a single MM. - The series "mm/damon: add sysfs dirs for managing DAMOS filters based on handling layers" from SeongJae Park adds a couple of new sysfs directories to ease the management of DAMON/DAMOS filters. - The series "arch, mm: reduce code duplication in mem_init()" from Mike Rapoport consolidates many per-arch implementations of mem_init() into code generic code, where that is practical. - The series "mm/damon/sysfs: commit parameters online via damon_call()" from SeongJae Park continues the cleaning up of sysfs access to DAMON internal data. - The series "mm: page_ext: Introduce new iteration API" from Luiz Capitulino reworks the page_ext initialization to fix a boot-time crash which was observed with an unusual combination of compile and cmdline options. - The series "Buddy allocator like (or non-uniform) folio split" from Zi Yan reworks the code to split a folio into smaller folios. The main benefit is lessened memory consumption: fewer post-split folios are generated. - The series "Minimize xa_node allocation during xarry split" from Zi Yan reduces the number of xarray xa_nodes which are generated during an xarray split. - The series "drivers/base/memory: Two cleanups" from Gavin Shan performs some maintenance work on the drivers/base/memory code. - The series "Add tracepoints for lowmem reserves, watermarks and totalreserve_pages" from Martin Liu adds some more tracepoints to the page allocator code. - The series "mm/madvise: cleanup requests validations and classifications" from SeongJae Park cleans up some warts which SeongJae observed during his earlier madvise work. - The series "mm/hwpoison: Fix regressions in memory failure handling" from Shuai Xue addresses two quite serious regressions which Shuai has observed in the memory-failure implementation. - The series "mm: reliable huge page allocator" from Johannes Weiner makes huge page allocations cheaper and more reliable by reducing fragmentation. - The series "Minor memcg cleanups & prep for memdescs" from Matthew Wilcox is preparatory work for the future implementation of memdescs. - The series "track memory used by balloon drivers" from Nico Pache introduces a way to track memory used by our various balloon drivers. - The series "mm/damon: introduce DAMOS filter type for active pages" from Nhat Pham permits users to filter for active/inactive pages, separately for file and anon pages. - The series "Adding Proactive Memory Reclaim Statistics" from Hao Jia separates the proactive reclaim statistics from the direct reclaim statistics. - The series "mm/vmscan: don't try to reclaim hwpoison folio" from Jinjiang Tu fixes our handling of hwpoisoned pages within the reclaim code. * tag 'mm-stable-2025-03-30-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (431 commits) mm/page_alloc: remove unnecessary __maybe_unused in order_to_pindex() x86/mm: restore early initialization of high_memory for 32-bits mm/vmscan: don't try to reclaim hwpoison folio mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper cgroup: docs: add pswpin and pswpout items in cgroup v2 doc mm: vmscan: split proactive reclaim statistics from direct reclaim statistics selftests/mm: speed up split_huge_page_test selftests/mm: uffd-unit-tests support for hugepages > 2M docs/mm/damon/design: document active DAMOS filter type mm/damon: implement a new DAMOS filter type for active pages fs/dax: don't disassociate zero page entries MM documentation: add "Unaccepted" meminfo entry selftests/mm: add commentary about 9pfs bugs fork: use __vmalloc_node() for stack allocation docs/mm: Physical Memory: Populate the "Zones" section xen: balloon: update the NR_BALLOON_PAGES state hv_balloon: update the NR_BALLOON_PAGES state balloon_compaction: update the NR_BALLOON_PAGES state meminfo: add a per node counter for balloon drivers mm: remove references to folio in __memcg_kmem_uncharge_page() ... |
||
|
|
93d34608fd |
ASoC: imx-card: Add NULL check in imx_card_probe()
devm_kasprintf() returns NULL when memory allocation fails. Currently,
imx_card_probe() does not check for this case, which results in a NULL
pointer dereference.
Add NULL check after devm_kasprintf() to prevent this issue.
Fixes:
|