linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-21 14:56:34 -04:00

Author	SHA1	Message	Date
Eric Biggers	98066f2f89	crypto: lib/chacha - strongly type the ChaCha state The ChaCha state matrix is 16 32-bit words. Currently it is represented in the code as a raw u32 array, or even just a pointer to u32. This weak typing is error-prone. Instead, introduce struct chacha_state: struct chacha_state { u32 x[16]; }; Convert all ChaCha and HChaCha functions to use struct chacha_state. No functional changes. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2025-05-12 13:32:53 +08:00
Chelsy Ratnawat	f11c1efe46	selftests: fix some typos in tools/testing/selftests Fix multiple spelling errors: - "rougly" -> "roughly" - "fielesystems" -> "filesystems" - "Can'" -> "Can't" Link: https://lkml.kernel.org/r/20250503211959.507815-1-chelsyratnawat2001@gmail.com Signed-off-by: Chelsy Ratnawat <chelsyratnawat2001@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-11 17:54:13 -07:00
Herton R. Krzesinski	92f3c5a005	lib/test_kmod: do not hardcode/depend on any filesystem Right now test_kmod has hardcoded dependencies on btrfs/xfs. That is not optimal since you end up needing to select/build them, but it is not really required since other fs could be selected for the testing. Also, we can't change the default/driver module used for testing on initialization. Thus make it more generic: introduce two module parameters (start_driver and start_test_fs), which allow to select which modules/fs to use for the testing on test_kmod initialization. Then it's up to the user to select which modules/fs to use for testing based on his config. However, keep test_module as required default. This way, config/modules becomes selectable as when the testing is done from selftests (userspace). While at it, also change trigger_config_run_type, since at module initialization we already set the defaults at __kmod_config_init and should not need to do it again in test_kmod_init(), thus we can avoid to again set test_driver/test_fs. Link: https://lkml.kernel.org/r/20250418165047.702487-1-herton@redhat.com Signed-off-by: Herton R. Krzesinski <herton@redhat.com> Reviewed-by: Luis Chambelrain <mcgrof@kernel.org> Cc: Daniel Gomez <da.gomez@samsung.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Petr Pavlu <petr.pavlu@suse.com> Cc: Sami Tolvanen <samitolvanen@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-11 17:54:09 -07:00
Enze Li	f736953e2b	selftests/damon: remove the remaining test scripts for DAMON debugfs interface DAMON has dropped debugfs support; therefore, remove these unused scripts. Link: https://lkml.kernel.org/r/20250411024332.1373861-1-enze.li@linux.dev Fixes: `5ec4333b19` ("mm/damon: remove DAMON debugfs interface") Signed-off-by: Enze Li <lienze@kylinos.cn> Reviewed-by: SeongJae Park <sj@kernel.org> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-11 17:48:30 -07:00
Donet Tom	585a914588	selftests/mm: restore default nr_hugepages value during cleanup in hugetlb_reparenting_test.sh During cleanup, the value of /proc/sys/vm/nr_hugepages is currently being set to 0. At the end of the test, if all tests pass, the original nr_hugepages value is restored. However, if any test fails, it remains set to 0. With this patch, we ensure that the original nr_hugepages value is restored during cleanup, regardless of whether the test passes or fails. Link: https://lkml.kernel.org/r/20250410100748.2310-1-donettom@linux.ibm.com Fixes: `29750f71a9` ("hugetlb_cgroup: add hugetlb_cgroup reservation tests") Signed-off-by: Donet Tom <donettom@linux.ibm.com> Cc: Li Wang <liwang@redhat.com> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Waiman Long <longman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-11 17:48:29 -07:00
Sidhartha Kumar	271152a973	maple_tree: add sufficient height In order to support rebalancing and spanning stores using less than the worst case number of nodes, we need to track more than just the vacant height. Using only vacant height to reduce the worst case maple node allocation count can lead to a shortcoming of nodes in the following scenarios. For rebalancing writes, when a leaf node becomes insufficient, it may be combined with a sibling into a single node. This means that the parent node which has entries for this children will lose one entry. If this parent node was just meeting the minimum entries, losing one entry will now cause this parent node to be insufficient. This leads to a cascading operation of rebalancing at different levels and can lead to more node allocations than simply using vacant height can return. For spanning writes, a similar situation occurs. At the location at which a spanning write is detected, the number of ancestor nodes may similarly need to rebalanced into a smaller number of nodes and the same cascading situation could occur. To use less than the full height of the tree for the number of allocations, we also need to track the height at which a non-leaf node cannot become insufficient. This means even if a rebalance occurs to a child of this node, it currently has enough entries that it can lose one without any further action. This field is stored in the maple write state as sufficient height. In mas_prealloc_calc() when figuring out how many nodes to allocate, we check if the vacant node is lower in the tree than a sufficient node (has a larger value). If it is, we cannot use the vacant height and must use the difference in the height and sufficient height as the basis for the number of nodes needed. An off by one bug was also discovered in mast_overflow() where it is using >= rather than >. This caused extra iterations of the mas_spanning_rebalance() loop and lead to unneeded allocations. A test is also added to check the number of allocations is correct. Link: https://lkml.kernel.org/r/20250410191446.2474640-6-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-11 17:48:29 -07:00
Sidhartha Kumar	ad88fc17d2	maple_tree: use vacant nodes to reduce worst case allocations In order to determine the store type for a maple tree operation, a walk of the tree is done through mas_wr_walk(). This function descends the tree until a spanning write is detected or we reach a leaf node. While descending, keep track of the height at which we encounter a node with available space. This is done by checking if mas->end is less than the number of slots a given node type can fit. Now that the height of the vacant node is tracked, we can use the difference between the height of the tree and the height of the vacant node to know how many levels we will have to propagate creating new nodes. Update mas_prealloc_calc() to consider the vacant height and reduce the number of worst-case allocations. Rebalancing and spanning stores are not supported and fall back to using the full height of the tree for allocations. Update preallocation testing assertions to take into account vacant height. Link: https://lkml.kernel.org/r/20250410191446.2474640-4-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-11 17:48:28 -07:00
Sidhartha Kumar	f9d3a963fe	maple_tree: use height and depth consistently For the maple tree, the root node is defined to have a depth of 0 with a height of 1. Each level down from the node, these values are incremented by 1. Various code paths define a root with depth 1 which is inconsisent with the definition. Modify the code to be consistent with this definition. In mas_spanning_rebalance(), l_mas.depth was being used to track the height based on the number of iterations done in the main loop. This information was then used in mas_put_in_tree() to set the height. Rather than overload the l_mas.depth field to track height, simply keep track of height in the local variable new_height and directly pass this to mas_wmb_replace() which will be passed into mas_put_in_tree(). This allows up to remove writes to l_mas.depth. Link: https://lkml.kernel.org/r/20250410191446.2474640-3-sidhartha.kumar@oracle.com Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-11 17:48:28 -07:00
Lorenzo Stoakes	10d288964d	tools/testing/selftests: assert that anon merge cases behave as expected Prior to the recently applied commit that permits this merge, mprotect()'ing a faulted VMA, adjacent to an unfaulted VMA, such that the two share characteristics would fail to merge due to what appear to be unintended consequences of commit `965f55dea0` ("mmap: avoid merging cloned VMAs"). Now we have fixed this bug, assert that we can indeed merge anonymous VMAs this way. Also assert that forked source/target VMAs are equally rejected. Previously, all empty target anon merges with one VMA faulted and the other unfaulted would be rejected incorrectly, now we ensure that unforked merge, but forked do not. Additionally, add the new test file to the MEMORY MAPPING section in MAINTAINERS, as these tests are explicitly memory mapping related. Link: https://lkml.kernel.org/r/2b69330274a3b71721f7042c5eabe91143934415.1744104124.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-11 17:48:26 -07:00
Lorenzo Stoakes	bd23f293a0	tools/testing: add PROCMAP_QUERY helper functions in mm self tests The PROCMAP_QUERY ioctl() is very useful - it allows for binary access to /proc/$pid/[s]maps data and thus convenient lookup of data contained there. This patch exposes this for convenient use by mm self tests so the state of VMAs can easily be queried. Link: https://lkml.kernel.org/r/ce83d877093d1fc594762cf4b82f0c27963030ee.1744104124.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-11 17:48:26 -07:00
Lorenzo Stoakes	879bca0a2c	mm/vma: fix incorrectly disallowed anonymous VMA merges Patch series "fix incorrectly disallowed anonymous VMA merges", v2. It appears that we have been incorrectly rejecting merge cases for 15 years, apparently by mistake. Imagine a range of anonymous mapped momemory divided into two VMAs like this, with incompatible protection bits: RW RWX unfaulted faulted \|-----------\|-----------\| \| prev \| vma \| \|-----------\|-----------\| mprotect(RW) Now imagine mprotect()'ing vma so it is RW. This appears as if it should merge, it does not. Neither does this case, again mprotect()'ing vma RW: RWX RW faulted unfaulted \|-----------\|-----------\| \| vma \| next \| \|-----------\|-----------\| mprotect(RW) Nor: RW RWX RW unfaulted faulted unfaulted \|-----------\|-----------\|-----------\| \| prev \| vma \| next \| \|-----------\|-----------\|-----------\| mprotect(RW) What's going on here? In commit `5beb493052` ("mm: change anon_vma linking to fix multi-process server scalability issue"), from 2010, Rik von Riel took careful care to account for these cases - commenting that '[this is] easily overlooked: when mprotect shifts the boundary, make sure the expanding vma has anon_vma set if the shrinking vma had, to cover any anon pages imported.' However, commit `965f55dea0` ("mmap: avoid merging cloned VMAs") introduced a little over a year later, appears to have accidentally disallowed this. By adjusting the is_mergeable_anon_vma() function to avoid lock contention across large trees of forked anon_vma's, this commit wrongly assumed the VMA being checked (the ostensible merge 'target') should be faulted, that is, have an anon_vma, and thus an anon_vma_chain list established, but only of length 1. This appears to have been unintentional, as disallowing empty target VMAs like this across the board makes no sense. We already have logic that accounts for this case, the same logic Rik introduced in 2010, now via dup_anon_vma() (and ultimately anon_vma_clone()), so there is no problem permitting this. This series fixes this mistake and also ensures that scalability concerns remain addressed by explicitly checking that whatever VMA is being merged has not been forked. A full set of self tests which reproduce the issue are provided, as well as updating userland VMA tests to assert this behaviour. The self tests additionally assert scalability concerns are addressed. This patch (of 3): anon_vma_chain's were introduced by Rik von Riel in commit `5beb493052` ("mm: change anon_vma linking to fix multi-process server scalability issue"). This patch was introduced in March 2010. As part of this change, careful attention was made to the instance of mprotect() causing a VMA merge, with one faulted (i.e. having anon_vma set) and another not: /* * Easily overlooked: when mprotect shifts the boundary, * make sure the expanding vma has anon_vma set if the * shrinking vma had, to cover any anon pages imported. / In the modern VMA code, this is handled in dup_anon_vma() (and ultimately anon_vma_clone()). This case is one of the three configurations of adjacent VMA anon_vma state that we might encounter on merge (where dst is the VMA which will be merged into and src the one being merged into dst): 1. dst->anon_vma, src->anon_vma - These must be equal, no-op. 2. dst->anon_vma, !src->anon_vma - We simply use dst->anon_vma, no-op. 3. !dst->anon_vma, src->anon_vma - The case in question here. In case 3, the instance addressed here - we duplicate the AVC connections from src and place into dst. However, in practice, we very often do NOT do this. This appears to be due to an inadvertent consequence of the change introduced by commit `965f55dea0` ("mmap: avoid merging cloned VMAs"), introduced in May 2011. This implies that this merge case was functional only for a little over a year, and has since been broken for ~15 years. Here, lock scalability concerns lead to us restricting anonymous merges only to those VMAs with 1 entry in their vma->anon_vma_chain, that is, a VMA that is not connected to any parent process's anon_vma. The mergeability test looks like this: static inline bool is_mergeable_anon_vma(struct anon_vma anon_vma1, struct anon_vma anon_vma2, struct vm_area_struct vma) { if ((!anon_vma1 \|\| !anon_vma2) && (!vma \|\| !vma->anon_vma \|\| list_is_singular(&vma->anon_vma_chain))) return true; return anon_vma1 == anon_vma2; } However, we have a problem here - typically the vma passed here is the destination VMA. For instance in vma_merge_existing_range() we invoke: can_vma_merge_left() -> [ check that there is an immediately adjacent prior VMA ] -> can_vma_merge_after() -> is_mergeable_vma() for general attribute check -> is_mergeable_anon_vma([ proposed anon_vma ], prev->anon_vma, prev) So if we were considering a target unfaulted 'prev': unfaulted faulted \|-----------\|-----------\| \| prev \| vma \| \|-----------\|-----------\| This would call is_mergeable_anon_vma(NULL, vma->anon_vma, prev). The list_is_singular() check for vma->anon_vma_chain, an empty list on fault, would cause this merge to _fail_ even though all else indicates a merge. Equally a simple merge into a next VMA would hit the same problem: faulted unfaulted \|-----------\|-----------\| \| vma \| next \| \|-----------\|-----------\| can_vma_merge_right() -> [ check that there is an immediately adjacent succeeding VMA ] -> can_vma_merge_before() -> is_mergeable_vma() for general attribute check -> is_mergeable_anon_vma([ proposed anon_vma ], next->anon_vma, next) For a 3-way merge, we'd also hit the same problem if it was configured like this for instance: unfaulted faulted unfaulted \|-----------\|-----------\|-----------\| \| prev \| vma \| next \| \|-----------\|-----------\|-----------\| As we'd call can_vma_merge_left() for prev, and can_vma_merge_right() for next, both of which would fail. vma_merge_new_range() (and relatedly, vma_expand()) are not impacted, as the new VMA would never already be faulted (it is a proposed new range). Because we already handle each of the aforementioned merge cases, and can absolutely therefore deal with an existing VMA merge with !dst->anon_vma, src->anon_vma, there is absolutely no reason to disallow this kind of merge. It seems that the intention of this patch is to ensure that, in the instance of merging unfaulted VMAs with faulted ones, we never wish to do so with those with multiple AVCs due to the fact that anon_vma lock's are held across both parent and child anon_vma's (actually, the 'root' parent anon_vma's lock is used). In fact, the original commit alludes to this - "find_mergeable_anon_vma() already considers this case". In find_mergeable_anon_vma() however, we check the anon_vma which will be merged from, if it is set, then we check list_is_singular(vma->anon_vma_chain). So to match this logic, update is_mergeable_anon_vma() to perform this scalability check on the VMA whose anon_vma we ultimately merge into. This matches existing behaviour with forked VMAs, only we no longer wrongly disallow ALL empty target merges. So we both allow merge cases and ensure the scalability check is correctly applied. We may wish to revisit these lock scalability concerns at a later date and ensure they are still valid. Additionally, correct userland VMA tests which were mistakenly not asserting these cases correctly previously to now correctly assert this, and to ensure vmg->anon_vma state is always consistent to account for newly introduced asserts. Link: https://lkml.kernel.org/r/cover.1744104124.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/18c756fc9eaf7ad082a710c91133b8346f8cd9a8.1744104124.git.lorenzo.stoakes@oracle.com Fixes: `965f55dea0` ("mmap: avoid merging cloned VMAs") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Rik van Riel <riel@surriel.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Yang <richard.weiyang@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-11 17:48:26 -07:00
Li Wang	e487a5d513	selftest/mm: make hugetlb_reparenting_test tolerant to async reparenting In cgroup v2, memory and hugetlb usage reparenting is asynchronous. This can cause test flakiness when immediately asserting usage after deleting a child cgroup. To address this, add a helper function `assert_with_retry()` that checks usage values with a timeout-based retry. This improves test stability without relying on fixed sleep delays. Also bump up the tolerance size to 7MB. To avoid False Positives: ... # Assert memory charged correctly for child only use. # actual a = 11 MB # expected a = 0 MB # fail # cleanup # [FAIL] not ok 11 hugetlb_reparenting_test.sh -cgroup-v2 # exit=1 # 0 # SUMMARY: PASS=10 SKIP=0 FAIL=1 Link: https://lkml.kernel.org/r/20250407084201.74492-1-liwang@redhat.com Signed-off-by: Li Wang <liwang@redhat.com> Tested-by: Donet Tom <donettom@linux.ibm.com> Cc: Waiman Long <longman@redhat.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Kirill A. Shuemov <kirill.shutemov@linux.intel.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-11 17:48:18 -07:00
Andrei Vagin	a9562fd03a	selftests/mm: add PAGEMAP_SCAN guard region test Add a selftest to verify the PAGEMAP_SCAN ioctl correctly reports guard regions using the newly introduced PAGE_IS_GUARD flag. Link: https://lkml.kernel.org/r/20250324065328.107678-4-avagin@google.com Signed-off-by: Andrei Vagin <avagin@gmail.com> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: David Hildenbrand <david@redhat.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-11 17:48:17 -07:00
Dmitry V. Levin	bc6fa71195	selftests/ptrace: add a test case for PTRACE_SET_SYSCALL_INFO Check whether PTRACE_SET_SYSCALL_INFO semantics implemented in the kernel matches userspace expectations. Link: https://lkml.kernel.org/r/20250303112052.GG24170@strace.io Signed-off-by: Dmitry V. Levin <ldv@strace.io> Reviewed-by: Oleg Nesterov <oleg@redhat.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexey Gladkov (Intel) <legion@kernel.org> Cc: Andreas Larsson <andreas@gaisler.com> Cc: anton ivanov <anton.ivanov@cambridgegreys.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Borislav Betkov <bp@alien8.de> Cc: Brian Cain <bcain@quicinc.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Christian Zankel <chris@zankel.net> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Davide Berardi <berardi.dav@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Eugene Syromiatnikov <esyr@redhat.com> Cc: Eugene Syromyatnikov <evgsyr@gmail.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonas Bonn <jonas@southpole.se> Cc: Maciej W. Rozycki <macro@orcam.me.uk> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Mike Frysinger <vapier@gentoo.org> Cc: Naveen N Rao <naveen@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Renzo Davoi <renzo@cs.unibo.it> Cc: Richard Weinberger <richard@nod.at> Cc: Rich Felker <dalias@libc.org> Cc: Russel King <linux@armlinux.org.uk> Cc: Shuah Khan <shuah@kernel.org> Cc: Stafford Horne <shorne@gmail.com> Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Will Deacon <will@kernel.org> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-11 17:48:16 -07:00
Siddarth G	0bf19a357e	selftests/mm: convert page_size to unsigned long Cppcheck warning: int result is assigned to long long variable. If the variable is long long to avoid loss of information, then you have loss of information. This patch changes the type of page_size from 'unsigned int' to 'unsigned long' instead of using ULL suffixes. Changing hpage_size to 'unsigned long' was considered, but since gethugepage() expects an int, this change was avoided. Link: https://lkml.kernel.org/r/20250403101345.29226-1-siddarthsgml@gmail.com Signed-off-by: Siddarth G <siddarthsgml@gmail.com> Reported-by: David Binderman <dcb314@hotmail.com> Closes: https://lore.kernel.org/all/AS8PR02MB10217315060BBFDB21F19643E9CA62@AS8PR02MB10217.eurprd02.prod.outlook.com/ Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-11 17:48:09 -07:00
Linus Torvalds	6f5bf947ba	Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 ITS mitigation from Dave Hansen: "Mitigate Indirect Target Selection (ITS) issue. I'd describe this one as a good old CPU bug where the behavior is _obviously_ wrong, but since it just results in bad predictions it wasn't wrong enough to notice. Well, the researchers noticed and also realized that thus bug undermined a bunch of existing indirect branch mitigations. Thus the unusually wide impact on this one. Details: ITS is a bug in some Intel CPUs that affects indirect branches including RETs in the first half of a cacheline. Due to ITS such branches may get wrongly predicted to a target of (direct or indirect) branch that is located in the second half of a cacheline. Researchers at VUSec found this behavior and reported to Intel. Affected processors: - Cascade Lake, Cooper Lake, Whiskey Lake V, Coffee Lake R, Comet Lake, Ice Lake, Tiger Lake and Rocket Lake. Scope of impact: - Guest/host isolation: When eIBRS is used for guest/host isolation, the indirect branches in the VMM may still be predicted with targets corresponding to direct branches in the guest. - Intra-mode using cBPF: cBPF can be used to poison the branch history to exploit ITS. Realigning the indirect branches and RETs mitigates this attack vector. - User/kernel: With eIBRS enabled user/kernel isolation is not impacted by ITS. - Indirect Branch Prediction Barrier (IBPB): Due to this bug indirect branches may be predicted with targets corresponding to direct branches which were executed prior to IBPB. This will be fixed in the microcode. Mitigation: As indirect branches in the first half of cacheline are affected, the mitigation is to replace those indirect branches with a call to thunk that is aligned to the second half of the cacheline. RETs that take prediction from RSB are not affected, but they may be affected by RSB-underflow condition. So, RETs in the first half of cacheline are also patched to a return thunk that executes the RET aligned to second half of cacheline" * tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: selftest/x86/bugs: Add selftests for ITS x86/its: FineIBT-paranoid vs ITS x86/its: Use dynamic thunks for indirect branches x86/ibt: Keep IBT disabled during alternative patching mm/execmem: Unify early execmem_cache behaviour x86/its: Align RETs in BHB clear sequence to avoid thunking x86/its: Add support for RSB stuffing mitigation x86/its: Add "vmexit" option to skip mitigation on some CPUs x86/its: Enable Indirect Target Selection mitigation x86/its: Add support for ITS-safe return thunk x86/its: Add support for ITS-safe indirect thunk x86/its: Enumerate Indirect Target Selection (ITS) bug Documentation: x86/bugs/its: Add ITS documentation	2025-05-11 17:23:03 -07:00
Linus Torvalds	cd802e7e5f	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull KVM fixes from Paolo Bonzini: "ARM: - Avoid use of uninitialized memcache pointer in user_mem_abort() - Always set HCR_EL2.xMO bits when running in VHE, allowing interrupts to be taken while TGE=0 and fixing an ugly bug on AmpereOne that occurs when taking an interrupt while clearing the xMO bits (AC03_CPU_36) - Prevent VMMs from hiding support for AArch64 at any EL virtualized by KVM - Save/restore the host value for HCRX_EL2 instead of restoring an incorrect fixed value - Make host_stage2_set_owner_locked() check that the entire requested range is memory rather than just the first page RISC-V: - Add missing reset of smstateen CSRs x86: - Forcibly leave SMM on SHUTDOWN interception on AMD CPUs to avoid causing problems due to KVM stuffing INIT on SHUTDOWN (KVM needs to sanitize the VMCB as its state is undefined after SHUTDOWN, emulating INIT is the least awful choice). - Track the valid sync/dirty fields in kvm_run as a u64 to ensure KVM KVM doesn't goof a sanity check in the future. - Free obsolete roots when (re)loading the MMU to fix a bug where pre-faulting memory can get stuck due to always encountering a stale root. - When dumping GHCB state, use KVM's snapshot instead of the raw GHCB page to print state, so that KVM doesn't print stale/wrong information. - When changing memory attributes (e.g. shared <=> private), add potential hugepage ranges to the mmu_invalidate_range_{start,end} set so that KVM doesn't create a shared/private hugepage when the the corresponding attributes will become mixed (the attributes are commited after KVM finishes the invalidation). - Rework the SRSO mitigation to enable BP_SPEC_REDUCE only when KVM has at least one active VM. Effectively BP_SPEC_REDUCE when KVM is loaded led to very measurable performance regressions for non-KVM workloads" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: SVM: Set/clear SRSO's BP_SPEC_REDUCE on 0 <=> 1 VM count transitions KVM: arm64: Fix memory check in host_stage2_set_owner_locked() KVM: arm64: Kill HCRX_HOST_FLAGS KVM: arm64: Properly save/restore HCRX_EL2 KVM: arm64: selftest: Don't try to disable AArch64 support KVM: arm64: Prevent userspace from disabling AArch64 support at any virtualisable EL KVM: arm64: Force HCR_EL2.xMO to 1 at all times in VHE mode KVM: arm64: Fix uninitialized memcache pointer in user_mem_abort() KVM: x86/mmu: Prevent installing hugepages when mem attributes are changing KVM: SVM: Update dump_ghcb() to use the GHCB snapshot fields KVM: RISC-V: reset smstateen CSRs KVM: x86/mmu: Check and free obsolete roots in kvm_mmu_reload() KVM: x86: Check that the high 32bits are clear in kvm_arch_vcpu_ioctl_run() KVM: SVM: Forcibly leave SMM mode on SHUTDOWN interception	2025-05-11 11:30:13 -07:00
Linus Torvalds	3ce9925823	Merge tag 'mm-hotfixes-stable-2025-05-10-14-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc hotfixes from Andrew Morton: "22 hotfixes. 13 are cc:stable and the remainder address post-6.14 issues or aren't considered necessary for -stable kernels. About half are for MM. Five OCFS2 fixes and a few MAINTAINERS updates" * tag 'mm-hotfixes-stable-2025-05-10-14-23' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits) mm: fix folio_pte_batch() on XEN PV nilfs2: fix deadlock warnings caused by lock dependency in init_nilfs() mm/hugetlb: copy the CMA flag when demoting mm, swap: fix false warning for large allocation with !THP_SWAP selftests/mm: fix a build failure on powerpc selftests/mm: fix build break when compiling pkey_util.c mm: vmalloc: support more granular vrealloc() sizing tools/testing/selftests: fix guard region test tmpfs assumption ocfs2: stop quota recovery before disabling quotas ocfs2: implement handshaking with ocfs2 recovery thread ocfs2: switch osb->disable_recovery to enum mailmap: map Uwe's BayLibre addresses to a single one MAINTAINERS: add mm THP section mm/userfaultfd: fix uninitialized output field for -EAGAIN race selftests/mm: compaction_test: support platform with huge mount of memory MAINTAINERS: add core mm section ocfs2: fix panic in failed foilio allocation mm/huge_memory: fix dereferencing invalid pmd migration entry MAINTAINERS: add reverse mapping section x86: disable image size check for test builds ...	2025-05-10 15:50:56 -07:00
Paolo Bonzini	36867c0e94	Merge tag 'kvmarm-fixes-6.15-3' of https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 fixes for 6.15, round #3 - Avoid use of uninitialized memcache pointer in user_mem_abort() - Always set HCR_EL2.xMO bits when running in VHE, allowing interrupts to be taken while TGE=0 and fixing an ugly bug on AmpereOne that occurs when taking an interrupt while clearing the xMO bits (AC03_CPU_36) - Prevent VMMs from hiding support for AArch64 at any EL virtualized by KVM - Save/restore the host value for HCRX_EL2 instead of restoring an incorrect fixed value - Make host_stage2_set_owner_locked() check that the entire requested range is memory rather than just the first page	2025-05-10 11:10:02 -04:00
Jiayuan Chen	be48b8790a	selftests/bpf: Add test to cover sockmap with ktls The selftest can reproduce an issue where we miss the uncharge operation when freeing msg, which will cause the following warning. We fixed the issue and added this reproducer to selftest to ensure it will not happen again. ------------[ cut here ]------------ WARNING: CPU: 1 PID: 40 at net/ipv4/af_inet.c inet_sock_destruct+0x173/0x1d5 Tainted: [W]=WARN Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 Workqueue: events sk_psock_destroy RIP: 0010:inet_sock_destruct+0x173/0x1d5 RSP: 0018:ffff8880085cfc18 EFLAGS: 00010202 RAX: 1ffff11003dbfc00 RBX: ffff88801edfe3e8 RCX: ffffffff822f5af4 RDX: 0000000000000007 RSI: dffffc0000000000 RDI: ffff88801edfe16c RBP: ffff88801edfe184 R08: ffffed1003dbfc31 R09: 0000000000000000 R10: ffffffff822f5ab7 R11: ffff88801edfe187 R12: ffff88801edfdec0 R13: ffff888020376ac0 R14: ffff888020376ac0 R15: ffff888020376a60 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000556365155830 CR3: 000000001d6aa000 CR4: 0000000000350ef0 Call Trace: <TASK> __sk_destruct+0x46/0x222 sk_psock_destroy+0x22f/0x242 process_one_work+0x504/0x8a8 ? process_one_work+0x39d/0x8a8 ? __pfx_process_one_work+0x10/0x10 ? worker_thread+0x44/0x2ae ? __list_add_valid_or_report+0x83/0xea ? srso_return_thunk+0x5/0x5f ? __list_add+0x45/0x52 process_scheduled_works+0x73/0x82 worker_thread+0x1ce/0x2ae Acked-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://lore.kernel.org/r/20250425060015.6968-3-jiayuan.chen@linux.dev	2025-05-09 18:10:13 -07:00
Cosmin Ratiu	97c4e094a4	tests/ncdevmem: Fix double-free of queue array netdev_bind_rx takes ownership of the queue array passed as parameter and frees it, so a queue array buffer cannot be reused across multiple netdev_bind_rx calls. This commit fixes that by always passing in a newly created queue array to all netdev_bind_rx calls in ncdevmem. Fixes: `85585b4bc8` ("selftests: add ncdevmem, netcat for devmem TCP") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: Joe Damato <jdamato@fastly.com> Reviewed-by: Mina Almasry <almasrymina@google.com> Link: https://patch.msgid.link/20250508084434.1933069-1-cratiu@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-05-09 15:05:07 -07:00
Pawan Gupta	7a9b709e7c	selftest/x86/bugs: Add selftests for ITS Below are the tests added for Indirect Target Selection (ITS): - its_sysfs.py - Check if sysfs reflects the correct mitigation status for the mitigation selected via the kernel cmdline. - its_permutations.py - tests mitigation selection with cmdline permutations with other bugs like spectre_v2 and retbleed. - its_indirect_alignment.py - verifies that for addresses in .retpoline_sites section that belong to lower half of cacheline are patched to ITS-safe thunk. Typical output looks like below: Site 49: function symbol: __x64_sys_restart_syscall+0x1f <0xffffffffbb1509af> # vmlinux: 0xffffffff813509af: jmp 0xffffffff81f5a8e0 # kcore: 0xffffffffbb1509af: jmpq %rax # ITS thunk NOT expected for site 49 # PASSED: Found %rax # Site 50: function symbol: __resched_curr+0xb0 <0xffffffffbb181910> # vmlinux: 0xffffffff81381910: jmp 0xffffffff81f5a8e0 # kcore: 0xffffffffbb181910: jmp 0xffffffffc02000fc # ITS thunk expected for site 50 # PASSED: Found 0xffffffffc02000fc -> jmpq *%rax <scattered-thunk?> - its_ret_alignment.py - verifies that for addresses in .return_sites section that belong to lower half of cacheline are patched to its_return_thunk. Typical output looks like below: Site 97: function symbol: collect_event+0x48 <0xffffffffbb007f18> # vmlinux: 0xffffffff81207f18: jmp 0xffffffff81f5b500 # kcore: 0xffffffffbb007f18: jmp 0xffffffffbbd5b560 # PASSED: Found jmp 0xffffffffbbd5b560 <its_return_thunk> # Site 98: function symbol: collect_event+0xa4 <0xffffffffbb007f74> # vmlinux: 0xffffffff81207f74: jmp 0xffffffff81f5b500 # kcore: 0xffffffffbb007f74: retq # PASSED: Found retq Some of these tests have dependency on tools like virtme-ng[1] and drgn[2]. When the dependencies are not met, the test will be skipped. [1] https://github.com/arighi/virtme-ng [2] https://github.com/osandov/drgn Co-developed-by: Tao Zhang <tao1.zhang@linux.intel.com> Signed-off-by: Tao Zhang <tao1.zhang@linux.intel.com> Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>	2025-05-09 13:39:45 -07:00
Jiri Olsa	d57293db64	selftests/bpf: Add link info test for ref_ctr_offset retrieval Adding link info test for ref_ctr_offset retrieval for both uprobe and uretprobe probes. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yafang Shao <laoar.shao@gmail.com> Link: https://lore.kernel.org/bpf/20250509153539.779599-3-jolsa@kernel.org	2025-05-09 13:01:08 -07:00
Terry Tritton	73989c9988	selftests/seccomp: fix negative_ENOSYS tracer tests on arm32 TRACE_syscall.ptrace.negative_ENOSYS and TRACE_syscall.seccomp.negative_ENOSYS on arm32 are being reported as failures instead of skipping. The teardown_trace_fixture function sets the test to KSFT_FAIL in case of a non 0 return value from the tracer process. Due to _metadata now being shared between the forked processes the tracer is returning the KSFT_SKIP value set by the tracee which is non 0. Remove the setting of the _metadata.exit_code in teardown_trace_fixture. Fixes: `24cf65a622` ("selftests/harness: Share _metadata between forked processes") Signed-off-by: Terry Tritton <terry.tritton@linaro.org> Link: https://lore.kernel.org/r/20250509115622.64775-1-terry.tritton@linaro.org Signed-off-by: Kees Cook <kees@kernel.org>	2025-05-09 12:15:43 -07:00
Thomas Weißschuh	1efe202228	selftests/timens: timerfd: Use correct clockid type in tclock_gettime() tclock_gettime() is a wrapper around clock_gettime(). The first parameter of clock_gettime() is of type "clockid_t", not "clock_t". Use the correct type instead. Link: https://lore.kernel.org/r/20250502-selftests-timens-fixes-v1-3-fb517c76f04d@linutronix.de Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-05-09 13:12:57 -06:00
Thomas Weißschuh	261639fa51	selftests/timens: Make run_tests() functions static These functions are never used outside their defining compilation unit and can be made static. Link: https://lore.kernel.org/r/20250502-selftests-timens-fixes-v1-2-fb517c76f04d@linutronix.de Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-05-09 13:12:48 -06:00
Thomas Weißschuh	84b8d6c908	selftests/timens: Print TAP headers The TAP specification requires that the output begins with a header line. These headers lines are missing in the timens tests. Print such a line. Link: https://lore.kernel.org/r/20250502-selftests-timens-fixes-v1-1-fb517c76f04d@linutronix.de Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-05-09 13:12:43 -06:00
Peter Seiderer	11f6dcf784	selftests: pid_namespace: add missing sys/mount.h include in pid_max.c Fix compile on openSUSE Tumbleweed (gcc-14.2.1, glibc-2.40): - add missing sys/mount.h include Fixes: pid_max.c: In function ‘pid_max_cb’: pid_max.c:42:15: error: implicit declaration of function ‘mount’ [-Wimplicit-function-declaration] 42 \| ret = mount("", "/", NULL, MS_PRIVATE \| MS_REC, 0); \| ^~~~~ Link: https://lore.kernel.org/r/20250115105211.390370-3-ps.report@gmx.net Signed-off-by: Peter Seiderer <ps.report@gmx.net> Reviewed-by: T.J. Mercier <tjmercier@google.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-05-09 13:12:33 -06:00
Nícolas F. R. A. Prado	23b88515a3	kselftest: cpufreq: Get rid of double suspend in rtcwake case Commit `0b631ed3ce` ("kselftest: cpufreq: Add RTC wakeup alarm") added support for automatic wakeup in the suspend routine of the cpufreq kselftest by using rtcwake, however it left the manual power state change in the common path. The end result is that when running the cpufreq kselftest with '-t suspend_rtc' or '-t hibernate_rtc', the system will go to sleep and be woken up by the RTC, but then immediately go to sleep again with no wakeup programmed, so it will sleep forever in an automated testing setup. Fix this by moving the manual power state change so that it only happens when not using rtcwake. Link: https://lore.kernel.org/r/20250430-ksft-cpufreq-suspend-rtc-double-fix-v1-1-dc17a729c5a7@collabora.com Fixes: `0b631ed3ce` ("kselftest: cpufreq: Add RTC wakeup alarm") Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-05-09 12:43:39 -06:00
Swapnil Sapkal	8ffe772076	selftests/cpufreq: Fix cpufreq basic read and update testcases In cpufreq basic selftests, one of the testcases is to read all cpufreq sysfs files and print the values. This testcase assumes all the cpufreq sysfs files have read permissions. However certain cpufreq sysfs files (eg. stats/reset) are write only files and this testcase errors out when it is not able to read the file. Similarily, there is one more testcase which reads the cpufreq sysfs file data and write it back to same file. This testcase also errors out for sysfs files without read permission. Fix these testcases by adding proper read permission checks. Link: https://lore.kernel.org/r/20250430171433.10866-1-swapnil.sapkal@amd.com Reported-by: Narasimhan V <narasimhan.v@amd.com> Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-05-09 12:43:31 -06:00
Ayush Jain	ab4b00407d	selftests/ftrace: Convert poll to a gen_file Poll program is a helper to ftracetest, thus make it a generic file and remove it from being run as a test. Currently when executing tests using $ make run_tests CC poll TAP version 13 1..2 # timeout set to 0 # selftests: ftrace: poll # Error: Polling file is not specified not ok 1 selftests: ftrace: poll # exit=255 Fix this by using TEST_GEN_FILES to build the 'poll' binary as a helper rather than as a test. Fixes: `80c3e28528` ("selftests/tracing: Add hist poll() support test") Link: https://lore.kernel.org/r/20250409044632.363285-1-Ayush.jain3@amd.com Signed-off-by: Ayush Jain <Ayush.jain3@amd.com> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>	2025-05-09 12:43:10 -06:00
Luis Gerhorst	cf15cdc0f0	selftests/bpf: Fix caps for __xlated/jited_unpriv Currently, __xlated_unpriv and __jited_unpriv do not work because the BPF syscall will overwrite info.jited_prog_len and info.xlated_prog_len with 0 if the process is not bpf_capable(). This bug was not noticed before, because there is no test that actually uses __xlated_unpriv/__jited_unpriv. To resolve this, simply restore the capabilities earlier (but still after loading the program). Adding this here unconditionally is fine because the function first checks that the capabilities were initialized before attempting to restore them. This will be important later when we add tests that check whether a speculation barrier was inserted in the correct location. Signed-off-by: Luis Gerhorst <luis.gerhorst@fau.de> Fixes: `9c9f733913` ("selftests/bpf: allow checking xlated programs in verifier_* tests") Fixes: `7d743e4c75` ("selftests/bpf: __jited test tag to check disassembly after jit") Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Tested-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250501073603.1402960-2-luis.gerhorst@fau.de Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-09 11:29:11 -07:00
Peilin Ye	d3131466b4	selftests/bpf: Enable non-arena load-acquire/store-release selftests for riscv64 For riscv64, enable all BPF_{LOAD_ACQ,STORE_REL} selftests except the arena_atomics/* ones (not guarded behind CAN_USE_LOAD_ACQ_STORE_REL), since arena access is not yet supported. Acked-by: Björn Töpel <bjorn@kernel.org> Reviewed-by: Pu Lehui <pulehui@huawei.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> # QEMU/RVA23 Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/9d878fa99a72626208a8eed3c04c4140caf77fda.1746588351.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-09 10:05:27 -07:00
Peilin Ye	0357f29de8	selftests/bpf: Verify zero-extension behavior in load-acquire tests Verify that 8-, 16- and 32-bit load-acquires are zero-extending by using immediate values with their highest bit set. Do the same for the 64-bit variant to keep the style consistent. Acked-by: Björn Töpel <bjorn@kernel.org> Reviewed-by: Pu Lehui <pulehui@huawei.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> # QEMU/RVA23 Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/11097fd515f10308b3941469ee4c86cb8872db3f.1746588351.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-09 10:05:27 -07:00
Peilin Ye	6e492ffcab	selftests/bpf: Avoid passing out-of-range values to __retval() Currently, we pass 0x1234567890abcdef to __retval() for the following two tests: verifier_load_acquire/load_acquire_64 verifier_store_release/store_release_64 However, the upper 32 bits of that value are being ignored, since __retval() expects an int. Actually, the tests would still pass even if I change '__retval(0x1234567890abcdef)' to e.g. '__retval(0x90abcdef)'. Restructure the tests a bit to test the entire 64-bit values properly. Do the same to their 8-, 16- and 32-bit variants as well to keep the style consistent. Fixes: `ff3afe5da9` ("selftests/bpf: Add selftests for load-acquire and store-release instructions") Acked-by: Björn Töpel <bjorn@kernel.org> Reviewed-by: Pu Lehui <pulehui@huawei.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> # QEMU/RVA23 Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/d67f4c6f6ee0d0388cbce1f4892ec4176ee2d604.1746588351.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-09 10:05:27 -07:00
Peilin Ye	13fdecf345	selftests/bpf: Use CAN_USE_LOAD_ACQ_STORE_REL when appropriate Instead of open-coding the conditions, use '#ifdef CAN_USE_LOAD_ACQ_STORE_REL' to guard the following tests: verifier_precision/bpf_load_acquire verifier_precision/bpf_store_release verifier_store_release/* Note that, for the first two tests in verifier_precision.c, switching to '#ifdef CAN_USE_LOAD_ACQ_STORE_REL' means also checking if '__clang_major__ >= 18', which has already been guaranteed by the outer '#if' check. Acked-by: Björn Töpel <bjorn@kernel.org> Reviewed-by: Pu Lehui <pulehui@huawei.com> Tested-by: Björn Töpel <bjorn@rivosinc.com> # QEMU/RVA23 Signed-off-by: Peilin Ye <yepeilin@google.com> Link: https://lore.kernel.org/r/45d7e025f6e390a8ff36f08fc51e31705ac896bd.1746588351.git.yepeilin@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2025-05-09 10:05:27 -07:00
Cong Wang	16ce349b15	selftests/tc-testing: Add qdisc limit trimming tests Added new test cases for FQ, FQ_CODEL, FQ_PIE, and HHF qdiscs to verify queue trimming behavior when the qdisc limit is dynamically reduced. Each test injects packets, reduces the qdisc limit, and checks that the new limit is enforced. This is still best effort since timing qdisc backlog is not easy. Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2025-05-09 12:34:38 +01:00
Jakub Kicinski	d97e2634fb	selftests: net-drv: remove the nic_performance and nic_link_layer tests Revert `fbbf93556f` ("selftests: nic_performance: Add selftest for performance of NIC driver") Revert `c087dc5439` ("selftests: nic_link_layer: Add selftest case for speed and duplex states") Revert `6116075e18` ("selftests: nic_link_layer: Add link layer selftest for NIC driver") These tests don't clean up after themselves, don't use the disruptive annotations, don't get included in make install etc. etc. The tests were added before we have any "HW" runner, so the issues were missed. Our CI doesn't have any way of excluding broken tests, remove these for now to stop the random pollution of results due to broken env. We can always add them back once / if fixed. Acked-by: Stanislav Fomichev <sdf@fomichev.me> Reviewed-by: David Wei <dw@davidwei.uk> Link: https://patch.msgid.link/20250507140109.929801-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-05-08 18:58:30 -07:00
Florian Westphal	1f389a648a	selftests: netfilter: fix conntrack stress test failures on debug kernels Jakub reports test failures on debug kernel: FAIL: proc inconsistency after uniq filter for ... This is because entries are expiring while validation is happening. Increase the timeout of ctnetlink injected entries and the icmp (ping) timeout to 1h to avoid this. To reduce run-time, add less entries via ctnetlink when KSFT_MACHINE_SLOW is set. also log of a failed run had: PASS: dump in netns had same entry count (-C 0, -L 0, -p 0, /proc 0) ... i.e. all entries already expired: add a check and set failure if this happens. While at it, include a diff when there were duplicate entries and add netns name to error messages (it tells if icmp or ctnetlink failed). Fixes: `d33f889fd8` ("selftests: netfilter: add conntrack stress test") Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20250506061125.1a244d12@kernel.org/ Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://patch.msgid.link/20250507075000.5819-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-05-08 18:57:24 -07:00
Jakub Kicinski	6b02fd7799	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR (net-6.15-rc6). No conflicts. Adjacent changes: net/core/dev.c: `08e9f2d584` ("net: Lock netdevices during dev_shutdown") `a82dc19db1` ("net: avoid potential race between netdev_get_by_index_lock() and netns switch") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2025-05-08 08:59:02 -07:00
Linus Torvalds	2c89c1b655	Merge tag 'net-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from CAN, WiFi and netfilter. We have still a comple of regressions open due to the recent drivers locking refactor. The patches are in-flight, but not ready yet. Current release - regressions: - core: lock netdevices during dev_shutdown - sch_htb: make htb_deactivate() idempotent - eth: virtio-net: don't re-enable refill work too early Current release - new code bugs: - eth: icssg-prueth: fix kernel panic during concurrent Tx queue access Previous releases - regressions: - gre: fix again IPv6 link-local address generation. - eth: b53: fix learning on VLAN unaware bridges Previous releases - always broken: - wifi: fix out-of-bounds access during multi-link element defragmentation - can: - initialize spin lock on device probe - fix order of unregistration calls - openvswitch: fix unsafe attribute parsing in output_userspace() - eth: - virtio-net: fix total qstat values - mtk_eth_soc: reset all TX queues on DMA free - fbnic: firmware IPC mailbox fixes" * tag 'net-6.15-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (55 commits) virtio-net: fix total qstat values net: export a helper for adding up queue stats fbnic: Do not allow mailbox to toggle to ready outside fbnic_mbx_poll_tx_ready fbnic: Pull fbnic_fw_xmit_cap_msg use out of interrupt context fbnic: Improve responsiveness of fbnic_mbx_poll_tx_ready fbnic: Cleanup handling of completions fbnic: Actually flush_tx instead of stalling out fbnic: Add additional handling of IRQs fbnic: Gate AXI read/write enabling on FW mailbox fbnic: Fix initialization of mailbox descriptor rings net: dsa: b53: do not set learning and unicast/multicast on up net: dsa: b53: fix learning on VLAN unaware bridges net: dsa: b53: fix toggling vlan_filtering net: dsa: b53: do not program vlans when vlan filtering is off net: dsa: b53: do not allow to configure VLAN 0 net: dsa: b53: always rejoin default untagged VLAN on bridge leave net: dsa: b53: fix VLAN ID for untagged vlan on bridge leave net: dsa: b53: fix flushing old pvid VLAN on pvid change net: dsa: b53: fix clearing PVID of a port net: dsa: b53: keep CPU port always tagged again ...	2025-05-08 08:33:56 -07:00
Mark Rutland	864f3ddcd7	kselftest/arm64: fp-ptrace: Adjust to new inactive mode behaviour In order to fix an ABI problem, we recently changed the way that reads of the NT_ARM_SVE and NT_ARM_SSVE regsets behave when their corresponding vector state is inactive. Update the fp-ptrace test for the new behaviour. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Spickett <david.spickett@arm.com> Cc: Luis Machado <luis.machado@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-25-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2025-05-08 15:45:24 +01:00
Mark Rutland	031a2acaa1	kselftest/arm64: fp-ptrace: Adjust to new VL change behaviour In order to fix an ABI problem, we recently changed the way that changing the SVE/SME vector length affects PSTATE.SM. Historically, changing the SME vector length would clear PSTATE.SM. Now, changing the SME vector length preserves PSTATE.SM. Update the fp-ptrace test for the new behaviour. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Spickett <david.spickett@arm.com> Cc: Luis Machado <luis.machado@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-24-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2025-05-08 15:45:24 +01:00
Mark Rutland	be45e63f79	kselftest/arm64: tpidr2: Adjust to new clone() behaviour In order to fix an ABI problem, we recently changed the way that a clone() syscall manipulates TPIDR2 and PSTATE.ZA. Historically the child would inherit the parent's TPIDR2 value unless CLONE_SETTLS was set, and now the child will inherit the parent's TPIDR2 value unless CLONE_VM is set. Update the tpidr2 test for the new behaviour. Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Daniel Kiss <daniel.kiss@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Richard Sandiford <richard.sandiford@arm.com> Cc: Sander De Smalen <sander.desmalen@arm.com> Cc: Tamas Petz <tamas.petz@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yury Khrustalev <yury.khrustalev@arm.com> Link: https://lore.kernel.org/r/20250508132644.1395904-23-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2025-05-08 15:45:24 +01:00
Mark Rutland	78b23877db	kselftest/arm64: fp-ptrace: Fix expected FPMR value when PSTATE.SM is changed The fp-ptrace test suite expects that FPMR is set to zero when PSTATE.SM is changed via ptrace, but ptrace has never altered FPMR in this way, and the test logic erroneously relies upon (and has concealed) a bug where task_fpsimd_load() would unexpectedly and non-deterministically clobber FPMR. Using ptrace, FPMR can only be altered by writing to the NT_ARM_FPMR regset. The value of PSTATE.SM can be altered by writing to the NT_ARM_SVE or NT_ARM_SSVE regsets, and/or by changing the SME vector length (when writing to the NT_ARM_SVE, NT_ARM_SSVE, or NT_ARM_ZA regsets), but none of these writes will change the value of FPMR. The task_fpsimd_load() bug was introduced with the initial FPMR support in commit: `203f2b95a8` ("arm64/fpsimd: Support FEAT_FPMR") The incorrect FPMR test code was introduced in commit: `7dbd26d0b2` ("kselftest/arm64: Add FPMR coverage to fp-ptrace") Subsequently, the task_fpsimd_load() bug was fixed in commit: `e5fa85fce0` ("arm64/fpsimd: Don't corrupt FPMR when streaming mode changes") ... whereupon the fp-ptrace FPMR tests started failing reliably, e.g. \| # # Mismatch in saved FPMR: 915058000 != 0 \| # not ok 25 SVE write, SVE 64->64, SME 64/0->64/1 Fix this by changing the test to expect that FPMR is NOT changed when PSTATE.SM is changed via ptrace, matching the extant behaviour. I've chosen to update the test code rather than modifying ptrace to zero FPMR when PSTATE.SM changes. Not zeroing FPMR is simpler overall, and allows the NT_ARM_FPMR regset to be handled independently from other regsets, leaving less scope for error. Fixes: `7dbd26d0b2` ("kselftest/arm64: Add FPMR coverage to fp-ptrace") Signed-off-by: Mark Rutland <mark.rutland@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Spickett <david.spickett@arm.com> Cc: Luis Machado <luis.machado@arm.com> Cc: Marc Zyngier <maz@kernel.org> Cc: Mark Brown <broonie@kernel.org> Cc: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250508132644.1395904-22-mark.rutland@arm.com Signed-off-by: Will Deacon <will@kernel.org>	2025-05-08 15:45:24 +01:00
Sean Christopherson	5e9ac644c4	KVM: selftests: Add a test for x86's fastops emulation Add a test to verify KVM's fastops emulation via forced emulation. KVM's so called "fastop" infrastructure executes the to-be-emulated instruction directly on hardware instead of manually emulating the instruction in software, using various shenanigans to glue together the emulator context and CPU state, e.g. to get RFLAGS fed into the instruction and back out for the emulator. Add testcases for all instructions that are low hanging fruit. While the primary goal of the selftest is to validate the glue code, a secondary goal is to ensure "emulation" matches hardware exactly, including for arithmetic flags that are architecturally undefined. While arithmetic flags may be architecturally undefined, their behavior is deterministic for a given CPU (likely a given uarch, and possibly even an entire family or class of CPUs). I.e. KVM has effectively been emulating underlying hardware behavior for years. Link: https://lore.kernel.org/r/20250506011250.1089254-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2025-05-08 07:16:44 -07:00
Nysal Jan K.A.	8cf6ecb18b	selftests/mm: fix a build failure on powerpc The compiler is unaware of the size of code generated by the ".rept" assembler directive. This results in the compiler emitting branch instructions where the offset to branch to exceeds the maximum allowed value, resulting in build failures like the following: CC protection_keys /tmp/ccypKWAE.s: Assembler messages: /tmp/ccypKWAE.s:2073: Error: operand out of range (0x0000000000020158 is not between 0xffffffffffff8000 and 0x0000000000007ffc) /tmp/ccypKWAE.s:2509: Error: operand out of range (0x0000000000020130 is not between 0xffffffffffff8000 and 0x0000000000007ffc) Fix the issue by manually adding nop instructions using the preprocessor. Link: https://lkml.kernel.org/r/20250428131937.641989-2-nysal@linux.ibm.com Fixes: `46036188ea` ("selftests/mm: build with -O2") Reported-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Nysal Jan K.A. <nysal@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Reviewed-by: Donet Tom <donettom@linux.ibm.com> Tested-by: Donet Tom <donettom@linux.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-07 23:39:41 -07:00
Madhavan Srinivasan	22adb52862	selftests/mm: fix build break when compiling pkey_util.c Commit `50910acd6f` ("selftests/mm: use sys_pkey helpers consistently") added a pkey_util.c to refactor some of the protection_keys functions accessible by other tests. But this broken the build in powerpc in two ways, pkey-powerpc.h: In function `arch_is_powervm': pkey-powerpc.h:73:21: error: storage size of `buf' isn't known 73 \| struct stat buf; \| ^~~ pkey-powerpc.h:75:14: error: implicit declaration of function `stat'; did you mean `strcat'? [-Wimplicit-function-declaration] 75 \| if ((stat("/sys/firmware/devicetree/base/ibm,partition-name", &buf) == 0) && \| ^~~~ \| strcat Since pkey_util.c includes pkeys-helper.h, which in turn includes pkeys-powerpc.h, stat.h including is missing for "struct stat". This is fixed by adding "sys/stat.h" in pkeys-powerpc.h Secondly, pkey-powerpc.h:55:18: warning: format `%llx' expects argument of type `long long unsigned int', but argument 3 has type `u64' {aka `long unsigned int'} [-Wformat=] 55 \| dprintf4("%s() changing %016llx to %016llx\n", \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 56 \| __func__, __read_pkey_reg(), pkey_reg); \| ~~~~~~~~~~~~~~~~~ \| \| \| u64 {aka long unsigned int} pkey-helpers.h:63:32: note: in definition of macro `dprintf_level' 63 \| sigsafe_printf(args); \ \| ^~~~ These format specifier related warning are removed by adding "__SANE_USERSPACE_TYPES__" to pkeys_utils.c. Link: https://lkml.kernel.org/r/20250428131937.641989-1-nysal@linux.ibm.com Fixes: `50910acd6f` ("selftests/mm: use sys_pkey helpers consistently") Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Nysal Jan K.A. <nysal@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-07 23:39:41 -07:00
Lorenzo Stoakes	a8efadda86	tools/testing/selftests: fix guard region test tmpfs assumption The current implementation of the guard region tests assume that /tmp is mounted as tmpfs, that is shmem. This isn't always the case, and at least one instance of a spurious test failure has been reported as a result. This assumption is unsafe, rushed and silly - and easily remedied by simply using memfd, so do so. We also have to fixup the readonly_file test to explicitly only be applicable to file-backed cases. Link: https://lkml.kernel.org/r/20250425162436.564002-1-lorenzo.stoakes@oracle.com Fixes: `272f37d3e9` ("tools/selftests: expand all guard region tests to file-backed") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reported-by: Ryan Roberts <ryan.roberts@arm.com> Closes: https://lore.kernel.org/linux-mm/a2d2766b-0ab4-437b-951a-8595a7506fe9@arm.com/ Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-07 23:39:40 -07:00
Feng Tang	ab00ddd802	selftests/mm: compaction_test: support platform with huge mount of memory When running mm selftest to verify mm patches, 'compaction_test' case failed on an x86 server with 1TB memory. And the root cause is that it has too much free memory than what the test supports. The test case tries to allocate 100000 huge pages, which is about 200 GB for that x86 server, and when it succeeds, it expects it's large than 1/3 of 80% of the free memory in system. This logic only works for platform with 750 GB ( 200 / (1/3) / 80% ) or less free memory, and may raise false alarm for others. Fix it by changing the fixed page number to self-adjustable number according to the real number of free memory. Link: https://lkml.kernel.org/r/20250423103645.2758-1-feng.tang@linux.alibaba.com Fixes: `bd67d5c15c` ("Test compaction of mlocked memory") Signed-off-by: Feng Tang <feng.tang@linux.alibaba.com> Acked-by: Dev Jain <dev.jain@arm.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Tested-by: Baolin Wang <baolin.wang@inux.alibaba.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Sri Jayaramappa <sjayaram@akamai.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-07 23:39:39 -07:00

... 30 31 32 33 34 ...

20495 Commits