Commit Graph

21684 Commits

Author SHA1 Message Date
Christian Brauner
b8f7622aa6 selftests/open_tree: add OPEN_TREE_NAMESPACE tests
Add tests for OPEN_TREE_NAMESPACE.

Link: https://patch.msgid.link/20251229-work-empty-namespace-v1-2-bfb24c7b061f@kernel.org
Tested-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-01-16 19:21:40 +01:00
Puranjay Mohan
086c99fbe4 selftests: bpf: Add test for multiple syncs from linked register
Before the last commit, sync_linked_regs() corrupted the register whose
bounds are being updated by copying known_reg's id to it. The ids are
the same in value but known_reg has the BPF_ADD_CONST flag which is
wrongly copied to reg.

This later causes issues when creating new links to this reg.
assign_scalar_id_before_mov() sees this BPF_ADD_CONST and gives a new id
to this register and breaks the old links. This is exposed by the added
selftest.

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Tested-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/r/20260115151143.1344724-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-16 10:08:59 -08:00
Sean Christopherson
a91cc48246 KVM: selftests: Test READ=>WRITE dirty logging behavior for shadow MMU
Update the nested dirty log test to validate KVM's handling of READ faults
when dirty logging is enabled.  Specifically, set the Dirty bit in the
guest PTEs used to map L2 GPAs, so that KVM will create writable SPTEs
when handling L2 read faults.  When handling read faults in the shadow MMU,
KVM opportunistically creates a writable SPTE if the mapping can be
writable *and* the gPTE is dirty (or doesn't support the Dirty bit), i.e.
if KVM doesn't need to intercept writes in order to emulate Dirty-bit
updates.

To actually test the L2 READ=>WRITE sequence, e.g. without masking a false
pass by other test activity, route the READ=>WRITE and WRITE=>WRITE
sequences to separate L1 pages, and differentiate between "marked dirty
due to a WRITE access/fault" and "marked dirty due to creating a writable
SPTE for a READ access/fault".  The updated sequence exposes the bug fixed
by KVM commit 1f4e5fc83a ("KVM: x86: fix nested guest live migration
with PML") when the guest performs a READ=>WRITE sequence with dirty guest
PTEs.

Opportunistically tweak and rename the address macros, and add comments,
to make it more obvious what the test is doing.  E.g. NESTED_TEST_MEM1
vs. GUEST_TEST_MEM doesn't make it all that obvious that the test is
creating aliases in both the L2 GPA and GVA address spaces, but only when
L1 is using TDP to run L2.

Cc: Yosry Ahmed <yosry.ahmed@linux.dev>
Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20260115172154.709024-1-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-16 07:48:54 -08:00
Ricardo B. Marlière
4f5f148dd7 selftests: net: fib-onlink-tests: Convert to use namespaces by default
Currently, the test breaks if the SUT already has a default route
configured for IPv6. Fix by avoiding the use of the default namespace.

Fixes: 4ed591c8ab ("net/ipv6: Allow onlink routes to have a device mismatch if it is the default route")
Suggested-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Ricardo B. Marlière <rbm@suse.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de>
Link: https://patch.msgid.link/20260113-selftests-net-fib-onlink-v2-1-89de2b931389@suse.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-15 19:58:21 -08:00
Michal Luczaj
a63e5fe095 vsock/test: Add test for a linear and non-linear skb getting coalesced
Loopback transport can mangle data in rx queue when a linear skb is
followed by a small MSG_ZEROCOPY packet.

To exercise the logic, send out two packets: a weirdly sized one (to ensure
some spare tail room in the skb) and a zerocopy one that's small enough to
fit in the spare room of its predecessor. Then, wait for both to land in
the rx queue, and check the data received. Faulty packets merger manifests
itself by corrupting payload of the later packet.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://patch.msgid.link/20260113-vsock-recv-coalescence-v2-2-552b17837cf4@rbox.co
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-15 19:44:44 -08:00
Jakub Kicinski
c27022497d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.19-rc6).

No conflicts, or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-15 18:02:48 -08:00
Jiri Olsa
934d9746ed selftests/bpf: Add test for bpf_override_return helper
We do not actually test the bpf_override_return helper functionality
itself at the moment, only the bpf program being able to attach it.

Adding test that override prctl syscall return value on top of
kprobe and kprobe.multi.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/bpf/20260112121157.854473-2-jolsa@kernel.org
2026-01-15 16:15:26 -08:00
Linus Torvalds
13b2d15d99 Merge tag 'mm-hotfixes-stable-2026-01-15-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:

 - kerneldoc fixes from Bagas Sanjaya

 - DAMON fixes from SeongJae

 - mremap VMA-related fixes from Lorenzo

 - various singletons - please see the changelogs for details

* tag 'mm-hotfixes-stable-2026-01-15-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (30 commits)
  drivers/dax: add some missing kerneldoc comment fields for struct dev_dax
  mm: numa,memblock: include <asm/numa.h> for 'numa_nodes_parsed'
  mailmap: add entry for Daniel Thompson
  tools/testing/selftests: fix gup_longterm for unknown fs
  mm/page_alloc: prevent pcp corruption with SMP=n
  iommu/sva: include mmu_notifier.h header
  mm: kmsan: fix poisoning of high-order non-compound pages
  tools/testing/selftests: add forked (un)/faulted VMA merge tests
  mm/vma: enforce VMA fork limit on unfaulted,faulted mremap merge too
  tools/testing/selftests: add tests for !tgt, src mremap() merges
  mm/vma: fix anon_vma UAF on mremap() faulted, unfaulted merge
  mm/zswap: fix error pointer free in zswap_cpu_comp_prepare()
  mm/damon/sysfs-scheme: cleanup access_pattern subdirs on scheme dir setup failure
  mm/damon/sysfs-scheme: cleanup quotas subdirs on scheme dir setup failure
  mm/damon/sysfs: cleanup attrs subdirs on context dir setup failure
  mm/damon/sysfs: cleanup intervals subdirs on attrs dir setup failure
  mm/damon/core: remove call_control in inactive contexts
  powerpc/watchdog: add support for hardlockup_sys_info sysctl
  mips: fix HIGHMEM initialization
  mm/hugetlb: ignore hugepage kernel args if hugepages are unsupported
  ...
2026-01-15 10:47:14 -08:00
Linus Torvalds
9e995c573b Merge tag 'net-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
 "Including fixes from bluetooth, can and IPsec.

  Current release - regressions:

   - net: add net.core.qdisc_max_burst

   - can: propagate CAN device capabilities via ml_priv

  Previous releases - regressions:

   - dst: fix races in rt6_uncached_list_del() and
     rt_del_uncached_list()

   - ipv6: fix use-after-free in inet6_addr_del().

   - xfrm: fix inner mode lookup in tunnel mode GSO segmentation

   - ip_tunnel: spread netdev_lockdep_set_classes()

   - ip6_tunnel: use skb_vlan_inet_prepare() in __ip6_tnl_rcv()

   - bluetooth: hci_sync: enable PA sync lost event

   - eth: virtio-net:
      - fix the deadlock when disabling rx NAPI
      - fix misalignment bug in struct virtnet_info

  Previous releases - always broken:

   - ipv4: ip_gre: make ipgre_header() robust

   - can: fix SSP_SRC in cases when bit-rate is higher than 1 MBit.

   - eth:
      - mlx5e: profile change fix
      - octeon_ep_vf: fix free_irq dev_id mismatch in IRQ rollback
      - macvlan: fix possible UAF in macvlan_forward_source()"

* tag 'net-6.19-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits)
  virtio_net: Fix misalignment bug in struct virtnet_info
  net: can: j1939: j1939_xtp_rx_rts_session_active(): deactivate session upon receiving the second rts
  can: raw: instantly reject disabled CAN frames
  can: propagate CAN device capabilities via ml_priv
  Revert "can: raw: instantly reject unsupported CAN frames"
  net/sched: sch_qfq: do not free existing class in qfq_change_class()
  selftests: drv-net: fix RPS mask handling for high CPU numbers
  selftests: drv-net: fix RPS mask handling in toeplitz test
  ipv6: Fix use-after-free in inet6_addr_del().
  dst: fix races in rt6_uncached_list_del() and rt_del_uncached_list()
  net: hv_netvsc: reject RSS hash key programming without RX indirection table
  tools: ynl: render event op docs correctly
  net: add net.core.qdisc_max_burst
  net: airoha: Fix typo in airoha_ppe_setup_tc_block_cb definition
  net: phy: motorcomm: fix duplex setting error for phy leds
  net: octeon_ep_vf: fix free_irq dev_id mismatch in IRQ rollback
  net/mlx5e: Restore destroying state bit after profile cleanup
  net/mlx5e: Pass netdev to mlx5e_destroy_netdev instead of priv
  net/mlx5e: Don't store mlx5e_priv in mlx5e_dev devlink priv
  net/mlx5e: Fix crash on profile change rollback failure
  ...
2026-01-15 10:11:11 -08:00
Fuad Tabba
e0a99a2b72 KVM: selftests: Fix typos and stale comments in kvm_util
Fix minor documentation errors in `kvm_util.h` and `kvm_util.c`.

- Correct the argument description for `vcpu_args_set` in `kvm_util.h`,
  which incorrectly listed `vm` instead of `vcpu`.
- Fix a typo in the comment for `kvm_selftest_arch_init` ("exeucting" ->
  "executing").
- Correct the return value description for `vm_vaddr_unused_gap` in
  `kvm_util.c` to match the implementation, which returns an address "at
  or above" `vaddr_min`, not "at or below".

No functional change intended.

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-6-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-15 13:39:53 +00:00
Fuad Tabba
de00d07321 KVM: selftests: Move page_align() to shared header
To avoid code duplication, move page_align() to the shared `kvm_util.h`
header file. Rename it to vm_page_align(), to make it clear that the
alignment is done with respect to the guest's base page size.

No functional change intended.

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-5-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-15 13:39:53 +00:00
Fuad Tabba
582b39463f KVM: riscv: selftests: Fix incorrect rounding in page_align()
The implementation of `page_align()` in `processor.c` calculates
alignment incorrectly for values that are already aligned. Specifically,
`(v + vm->page_size) & ~(vm->page_size - 1)` aligns to the *next* page
boundary even if `v` is already page-aligned, potentially wasting a page
of memory.

Fix the calculation to use standard alignment logic: `(v + vm->page_size
- 1) & ~(vm->page_size - 1)`.

Fixes: 3e06cdf105 ("KVM: selftests: Add initial support for RISC-V 64-bit")
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-4-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-15 13:39:53 +00:00
Fuad Tabba
dd0c5d04d1 KVM: arm64: selftests: Fix incorrect rounding in page_align()
The implementation of `page_align()` in `processor.c` calculates
alignment incorrectly for values that are already aligned. Specifically,
`(v + vm->page_size) & ~(vm->page_size - 1)` aligns to the *next* page
boundary even if `v` is already page-aligned, potentially wasting a page
of memory.

Fix the calculation to use standard alignment logic: `(v + vm->page_size
- 1) & ~(vm->page_size - 1)`.

Fixes: 7a6629ef74 ("kvm: selftests: add virt mem support for aarch64")
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-3-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-15 13:39:53 +00:00
Fuad Tabba
7e03d07d03 KVM: arm64: selftests: Disable unused TTBR1_EL1 translations
KVM selftests map all guest code and data into the lower virtual address
range (0x0000...) managed by TTBR0_EL1. The upper range (0xFFFF...)
managed by TTBR1_EL1 is unused and uninitialized.

If a guest accesses the upper range, the MMU attempts a translation
table walk using uninitialized registers, leading to unpredictable
behavior.

Set `TCR_EL1.EPD1` to disable translation table walks for TTBR1_EL1,
ensuring that any access to the upper range generates an immediate
Translation Fault. Additionally, set `TCR_EL1.TBI1` (Top Byte Ignore) to
ensure that tagged pointers in the upper range also deterministically
trigger a Translation Fault via EPD1.

Define `TCR_EPD1_MASK`, `TCR_EPD1_SHIFT`, and `TCR_TBI1` in
`processor.h` to support this configuration. These are based on their
definitions in `arch/arm64/include/asm/pgtable-hwdef.h`.

Suggested-by: Will Deacon <will@kernel.org>
Reviewed-by: Itaru Kitayama <itaru.kitayama@fujitsu.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
Link: https://patch.msgid.link/20260109082218.3236580-2-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-15 13:39:53 +00:00
Marc Zyngier
b638a9d0f8 KVM: arm64: selftests: Add a test for FEAT_IDST
Add a very basic test checking that FEAT_IDST actually works for
the {GMID,SMIDR,CSSIDR2}_EL1 registers.

Link: https://patch.msgid.link/20260108173233.2911955-10-maz@kernel.org
Signed-off-by: Marc Zyngier <maz@kernel.org>
2026-01-15 11:58:57 +00:00
Lorenzo Stoakes
21c68ad1d9 tools/testing/selftests: fix gup_longterm for unknown fs
Commit 66bce7afba ("selftests/mm: fix test result reporting in
gup_longterm") introduced a small bug causing unknown filesystems to
always result in a test failure.

This is because do_test() was updated to use a common reporting path, but
this case appears to have been missed.

This is problematic for e.g.  virtme-ng which uses an overlayfs file
system, causing gup_longterm to appear to fail each time due to a test
count mismatch:

	# Planned tests != run tests (50 != 46)
	# Totals: pass:24 fail:0 xfail:0 xpass:0 skip:22 error:0

The fix is to simply change the return into a break.

Link: https://lkml.kernel.org/r/20260106154547.214907-1-lorenzo.stoakes@oracle.com
Fixes: 66bce7afba ("selftests/mm: fix test result reporting in gup_longterm")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-14 22:16:26 -08:00
Lorenzo Stoakes
fb39444732 tools/testing/selftests: add forked (un)/faulted VMA merge tests
Now we correctly handle forked faulted/unfaulted merge on mremap(),
exhaustively assert that we handle this correctly.

Do this in the less duplicative way by adding a new merge_with_fork
fixture and forked/unforked variants, and abstract the forking logic as
necessary to avoid code duplication with this also.

Link: https://lkml.kernel.org/r/1daf76d89fdb9d96f38a6a0152d8f3c2e9e30ac7.1767638272.git.lorenzo.stoakes@oracle.com
Fixes: 879bca0a2c ("mm/vma: fix incorrectly disallowed anonymous VMA merges")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: David Hildenbrand (Red Hat) <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jeongjun Park <aha310510@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yeoreum Yun <yeoreum.yun@arm.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-14 22:16:25 -08:00
Lorenzo Stoakes
0ace8f2db6 tools/testing/selftests: add tests for !tgt, src mremap() merges
Test that mremap()'ing a VMA into a position such that the target VMA on
merge is unfaulted and the source faulted is correctly performed.

We cover 4 cases:

    1. Previous VMA unfaulted:

                  copied -----|
                              v
            |-----------|.............|
            | unfaulted |(faulted VMA)|
            |-----------|.............|
                 prev

    target = prev, expand prev to cover.

    2. Next VMA unfaulted:

                  copied -----|
                              v
                        |.............|-----------|
                        |(faulted VMA)| unfaulted |
                        |.............|-----------|
                                          next

    target = next, expand next to cover.

    3. Both adjacent VMAs unfaulted:

                  copied -----|
                              v
            |-----------|.............|-----------|
            | unfaulted |(faulted VMA)| unfaulted |
            |-----------|.............|-----------|
                 prev                      next

    target = prev, expand prev to cover.

    4. prev unfaulted, next faulted:

                  copied -----|
                              v
            |-----------|.............|-----------|
            | unfaulted |(faulted VMA)|  faulted  |
            |-----------|.............|-----------|
                 prev                      next

    target = prev, expand prev to cover. Essentially equivalent to 3, but
    with additional requirement that next's anon_vma is the same as the
    copied VMA's.

Each of these are performed with MREMAP_DONTUNMAP set, which will cause a
KASAN assert for UAF or an assert on zero refcount anon_vma if a bug
exists with correctly propagating anon_vma state in each scenario.

Link: https://lkml.kernel.org/r/f903af2930c7c2c6e0948c886b58d0f42d8e8ba3.1767638272.git.lorenzo.stoakes@oracle.com
Fixes: 879bca0a2c ("mm/vma: fix incorrectly disallowed anonymous VMA merges")
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: David Hildenbrand (Red Hat) <david@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Jeongjun Park <aha310510@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Rik van Riel <riel@surriel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yeoreum Yun <yeoreum.yun@arm.com>
Cc: Harry Yoo <harry.yoo@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-14 22:16:24 -08:00
Anton Protopopov
7c8e817e44 selftests/bpf: Extend live regs tests with a test for gotox
Add a test which checks that the destination register of a gotox
instruction is marked as used and that the union of jump targets
is considered as live.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Link: https://lore.kernel.org/r/20260114162544.83253-3-a.s.protopopov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-14 19:08:09 -08:00
Alexei Starovoitov
e3d0dbb3b5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc5
Cross-merge BPF and other fixes after downstream PR.

No conflicts.

Adjacent:
Auto-merging MAINTAINERS
Auto-merging Makefile
Auto-merging kernel/bpf/verifier.c
Auto-merging kernel/sched/ext.c
Auto-merging mm/memcontrol.c

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-14 15:22:01 -08:00
Yosry Ahmed
55058e3215 KVM: selftests: Add a selftests for nested VMLOAD/VMSAVE
Add a test for VMLOAD/VMSAVE in an L2 guest. The test verifies that L1
intercepts for VMSAVE/VMLOAD always work regardless of
VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK.

Then, more interestingly, it makes sure that when L1 does not intercept
VMLOAD/VMSAVE, they work as intended in L2. When
VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is enabled by L1, VMSAVE/VMLOAD from
L2 should interpret the GPA as an L2 GPA and translate it through the
NPT. When VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is disabled by L1,
VMSAVE/VMLOAD from L2 should interpret the GPA as an L1 GPA.

To test this, put two VMCBs (0 and 1) in L1's physical address space,
and have a single L2 GPA where:
- L2 VMCB GPA == L1 VMCB(0) GPA
- L2 VMCB GPA maps to L1 VMCB(1) via the NPT in L1.

This setup allows detecting how the GPA is interpreted based on which L1
VMCB is actually accessed.

In both cases, L2 sets KERNEL_GS_BASE (one of the fields handled by
VMSAVE/VMLOAD), and executes VMSAVE to write its value to the VMCB. The
test userspace code then checks that the write was made to the correct
VMCB (based on whether VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK is set by L1),
and writes a new value to that VMCB. L2 then executes VMLOAD to load the
new value and makes sure it's reflected correctly in KERNERL_GS_BASE.

Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20260110004821.3411245-4-yosry.ahmed@linux.dev
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-14 14:09:10 -08:00
Yosry Ahmed
f756ed82c6 KVM: selftests: Slightly simplify memstress_setup_nested()
Instead of calling memstress_setup_ept_mappings() only in the first
iteration in the loop, move it before the loop.

The call needed to happen within the loop before commit e40e72fec0
("KVM: selftests: Stop passing VMX metadata to TDP mapping functions"),
as memstress_setup_ept_mappings() used to take in a pointer to vmx_pages
and pass it into tdp_identity_map_1g() (to get the EPT root GPA). This
is no longer the case, as tdp_identity_map_1g() gets the EPT root
through stage2 MMU.

Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20260113171456.2097312-1-yosry.ahmed@linux.dev
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-14 09:06:45 -08:00
Thomas Weißschuh
7158fc54b2 vdso: Remove struct getcpu_cache
The cache parameter of getcpu() is useless nowadays for various reasons.

  * It is never passed by userspace for either the vDSO or syscalls.
  * It is never used by the kernel.
  * It could not be made to work on the current vDSO architecture.
  * The structure definition is not part of the UAPI headers.
  * vdso_getcpu() is superseded by restartable sequences in any case.

Remove the struct and its header.

As a side-effect this gets rid of an unwanted inclusion of the linux/
header namespace from vDSO code.

[ tglx: Adapt to s390 upstream changes */

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # s390
Link: https://patch.msgid.link/20251230-getcpu_cache-v3-1-fb9c5f880ebe@linutronix.de
2026-01-14 08:56:40 +01:00
Linus Torvalds
c537e12dae Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Pull bpf fixes from Alexei Starovoitov:

 - Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK in riscv JIT (Menglong
   Dong)

 - Fix reference count leak in bpf_prog_test_run_xdp() (Tetsuo Handa)

 - Fix metadata size check in bpf_test_run() (Toke Høiland-Jørgensen)

 - Check that BPF insn array is not allowed as a map for const strings
   (Deepanshu Kartikey)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Fix reference count leak in bpf_prog_test_run_xdp()
  bpf: Reject BPF_MAP_TYPE_INSN_ARRAY in check_reg_const_str()
  selftests/bpf: Update xdp_context_test_run test to check maximum metadata size
  bpf, test_run: Subtract size of xdp_frame from allowed metadata size
  riscv, bpf: Fix incorrect usage of BPF_TRAMP_F_ORIG_STACK
2026-01-13 21:21:13 -08:00
Anton Protopopov
c656807675 selftests/bpf: Add tests for loading insn array values with offsets
The ldimm64 instruction for map value supports an offset.
For insn array maps it wasn't tested before, as normally
such instructions aren't generated. However, this is still
possible to pass such instructions, so add a few tests to
check that correct offsets work properly and incorrect
offsets are rejected.

Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
Link: https://lore.kernel.org/r/20260111153047.8388-4-a.s.protopopov@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-13 19:36:47 -08:00
Matt Bobrowski
bbdbed193b selftests/bpf: assert BPF kfunc default trusted pointer semantics
The BPF verifier was recently updated to treat pointers to struct types
returned from BPF kfuncs as implicitly trusted by default. Add a new
test case to exercise this new implicit trust semantic.

The KF_ACQUIRE flag was dropped from the bpf_get_root_mem_cgroup()
kfunc because it returns a global pointer to root_mem_cgroup without
performing any explicit reference counting. This makes it an ideal
candidate to verify the new implicit trusted pointer semantics.

Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260113083949.2502978-3-mattbobrowski@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-13 19:19:13 -08:00
Matt Bobrowski
f8ade2342e bpf: return PTR_TO_BTF_ID | PTR_TRUSTED from BPF kfuncs by default
Teach the BPF verifier to treat pointers to struct types returned from
BPF kfuncs as implicitly trusted (PTR_TO_BTF_ID | PTR_TRUSTED) by
default. Returning untrusted pointers to struct types from BPF kfuncs
should be considered an exception only, and certainly not the norm.

Update existing selftests to reflect the change in register type
printing (e.g. `ptr_` becoming `trusted_ptr_` in verifier error
messages).

Link: https://lore.kernel.org/bpf/aV4nbCaMfIoM0awM@google.com/
Signed-off-by: Matt Bobrowski <mattbobrowski@google.com>
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20260113083949.2502978-1-mattbobrowski@google.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-13 19:19:13 -08:00
Gal Pressman
cf055f8c00 selftests: drv-net: fix RPS mask handling for high CPU numbers
The RPS bitmask bounds check uses ~(RPS_MAX_CPUS - 1) which equals ~15 =
0xfff0, only allowing CPUs 0-3.

Change the mask to ~((1UL << RPS_MAX_CPUS) - 1) = ~0xffff to allow CPUs
0-15.

Fixes: 5ebfb4cc30 ("selftests/net: toeplitz test")
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260112173715.384843-3-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 19:13:08 -08:00
Gal Pressman
9d48c62f6b selftests: drv-net: fix RPS mask handling in toeplitz test
The toeplitz.py test passed the hex mask without "0x" prefix (e.g.,
"300" for CPUs 8,9). The toeplitz.c strtoul() call wrongly parsed this
as decimal 300 (0x12c) instead of hex 0x300.

Pass the prefixed mask to toeplitz.c, and the unprefixed one to sysfs.

Fixes: 9cf9aa77a1 ("selftests: drv-net: hw: convert the Toeplitz test to Python")
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260112173715.384843-2-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 19:13:08 -08:00
Ido Schimmel
f8f9ee9d8b selftests: fib-onlink: Add test cases for nexthop device mismatch
Add test cases that verify that when the "onlink" keyword is specified,
both address families (with and without VRF) accept routes with a
gateway address that is reachable via a different interface than the one
specified.

Output without "ipv6: Allow for nexthop device mismatch with "onlink"":

 # ./fib-onlink-tests.sh | grep mismatch
 TEST: nexthop device mismatch                             [ OK ]
 TEST: nexthop device mismatch                             [ OK ]
 TEST: nexthop device mismatch                             [FAIL]
 TEST: nexthop device mismatch                             [FAIL]

Output with "ipv6: Allow for nexthop device mismatch with "onlink"":

 # ./fib-onlink-tests.sh | grep mismatch
 TEST: nexthop device mismatch                             [ OK ]
 TEST: nexthop device mismatch                             [ OK ]
 TEST: nexthop device mismatch                             [ OK ]
 TEST: nexthop device mismatch                             [ OK ]

That is, the IPv4 tests were always passing, but the IPv6 ones only pass
after the specified patch.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260111120813.159799-6-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 18:57:35 -08:00
Ido Schimmel
9bf8345fb3 selftests: fib-onlink: Add a test case for IPv4 multicast gateway
A multicast gateway address should be rejected when "onlink" is
specified, but it is only tested as part of the IPv6 tests. Add an
equivalent IPv4 test.

 # ./fib-onlink-tests.sh -v
 [...]
 COMMAND: ip ro add table 254 169.254.101.12/32 via 233.252.0.1 dev veth1 onlink
 Error: Nexthop has invalid gateway.

 TEST: Invalid gw - multicast address                      [ OK ]
 [...]
 COMMAND: ip ro add table 1101 169.254.102.12/32 via 233.252.0.1 dev veth5 onlink
 Error: Nexthop has invalid gateway.

 TEST: Invalid gw - multicast address, VRF                 [ OK ]
 [...]
 Tests passed:  37
 Tests failed:   0

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260111120813.159799-4-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 18:57:34 -08:00
Ido Schimmel
0a3419f4ba selftests: fib-onlink: Remove "wrong nexthop device" IPv6 tests
The command in the test fails as expected because IPv6 forbids a nexthop
device mismatch:

 # ./fib-onlink-tests.sh -v
 [...]
 COMMAND: ip -6 ro add table 1101 2001:db8:102::103/128 via 2001:db8:701::64 dev veth5 onlink
 Error: Nexthop has invalid gateway or device mismatch.

 TEST: Gateway resolves to wrong nexthop device - VRF      [ OK ]
 [...]

Where:

 # ip route get 2001:db8:701::64 vrf lisa
 2001:db8:701::64 dev veth7 table 1101 proto kernel src 2001:db8:701::1 metric 256 pref medium

This is in contrast to IPv4 where a nexthop device mismatch is allowed
when "onlink" is specified:

 # ip route get 169.254.7.2 vrf lisa
 169.254.7.2 dev veth7 table 1101 src 169.254.7.1 uid 0
 # ip ro add table 1101 169.254.102.103/32 via 169.254.7.2 dev veth5 onlink
 # echo $?
 0

Remove these tests in preparation for aligning IPv6 with IPv4 and
allowing nexthop device mismatch when "onlink" is specified.

A subsequent patch will add tests that verify that both address families
allow a nexthop device mismatch with "onlink".

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260111120813.159799-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 18:57:34 -08:00
Ido Schimmel
e5566f6b1d selftests: fib-onlink: Remove "wrong nexthop device" IPv4 tests
According to the test description, these tests fail because of a wrong
nexthop device:

 # ./fib-onlink-tests.sh -v
 [...]
 COMMAND: ip ro add table 254 169.254.101.102/32 via 169.254.3.1 dev veth1 onlink
 Error: Nexthop has invalid gateway.

 TEST: Gateway resolves to wrong nexthop device            [ OK ]
 COMMAND: ip ro add table 1101 169.254.102.103/32 via 169.254.7.1 dev veth5 onlink
 Error: Nexthop has invalid gateway.

 TEST: Gateway resolves to wrong nexthop device - VRF      [ OK ]
 [...]

But this is incorrect. They fail because the gateway addresses are local
addresses:

 # ip -4 address show
 [...]
 28: veth3@if27: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link-netns peer_ns-Urqh3o
     inet 169.254.3.1/24 scope global veth3
 [...]
 32: veth7@if31: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master lisa state UP group default qlen 1000 link-netns peer_ns-Urqh3o
     inet 169.254.7.1/24 scope global veth7

Therefore, using a local address that matches the nexthop device fails
as well:

 # ip ro add table 254 169.254.101.102/32 via 169.254.3.1 dev veth3 onlink
 Error: Nexthop has invalid gateway.

Using a gateway address with a "wrong" nexthop device is actually valid
and allowed:

 # ip route get 169.254.1.2
 169.254.1.2 dev veth1 src 169.254.1.1 uid 0
 # ip ro add table 254 169.254.101.102/32 via 169.254.1.2 dev veth3 onlink
 # echo $?
 0

Remove these tests given that their output is confusing and that the
scenario that they are testing is already covered by other tests.

A subsequent patch will add tests for the nexthop device mismatch
scenario.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20260111120813.159799-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 18:57:34 -08:00
Pavel Begunkov
a32bb32d01 selftests: iou-zcrx: test large chunk sizes
Add a test using large chunks for zcrx memory area.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
2026-01-14 02:13:37 +00:00
Jakub Kicinski
fe074aaa53 selftests: drv-net: gro: break out all individual test cases
GRO test groups the cases into categories, e.g. "tcp" case
checks coalescing in presence of:
 - packets with bad csum,
 - sequence number mismatch,
 - timestamp option value mismatch,
 - different TCP options.

Since we now have TAP support grouping the cases like that
lowers our reporting granularity. This matters even more for
NICs performing HW GRO and LRO since it appears that most
implementation have _some_ bugs. Flagging the whole group
of tests as failed prevents us from catching regressions
in the things that work today.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260113000740.255360-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 17:54:01 -08:00
Jakub Kicinski
d3b35898de selftests: drv-net: gro: run the test against HW GRO and LRO
Run the test against HW GRO and LRO. NICs I have pass the base cases.
Interestingly all are happy to build GROs larger than 64k.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260113000740.255360-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 17:54:01 -08:00
Jakub Kicinski
8171f6a76b selftests: drv-net: gro: improve feature config
We'll need to do a lot more feature handling to test HW-GRO and LRO.
Clean up the feature handling for SW GRO a bit to let the next commit
focus on the new test cases, only.

Make sure HW GRO-like features are not enabled for the SW tests.
Be more careful about changing features as "nothing changed"
situations may result in non-zero error code from ethtool.

Don't disable TSO on the local interface (receiver) when running over
netdevsim, we just want GSO to break up the segments on the sender.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260113000740.255360-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 17:54:01 -08:00
Jakub Kicinski
d131da6d72 selftests: drv-net: gro: use cmd print
Now that cmd() can be printed directly remove the old formatting.

Before:

  # fragmented ip6 doesn't coalesce:
  # Expected {200 100 100 }, Total 3 packets
  # Received {200 100 }, Total 2 packets.
  # /root/ksft-net-drv/drivers/net/gro: incorrect number of packets

Now:

  # CMD: drivers/net/gro --ipv6 --dmac 9e:[...]
  #   EXIT: 1
  #   STDOUT: fragmented ip6 doesn't coalesce:
  #   STDERR: Expected {200 100 100 }, Total 3 packets
  #           Received {200 100 }, Total 2 packets.
  #           /root/ksft-net-drv/drivers/net/gro: incorrect number of packets

Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260113000740.255360-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 17:54:00 -08:00
Jakub Kicinski
ce0f92dc73 selftests: net: py: teach cmd() how to print itself
Teach cmd() how to print itself, to make debug prints easier.
Example output (leading # due to ksft_pr()):

  # CMD: /root/ksft-net-drv/drivers/net/gro
  #   EXIT: 1
  #   STDOUT: ipv6 with ext header does coalesce:
  #   STDERR: Expected {200 }, Total 1 packets
  #           Received {100 [!=200]100 [!=0]}, Total 2 packets.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260113000740.255360-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 17:54:00 -08:00
Jakub Kicinski
b324192e36 selftests: net: py: teach ksft_pr() multi-line safety
Make printing multi-line logs easier by automatically prefixing
each line in ksft_pr(). Make use of this when formatting exceptions.

Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20260113000740.255360-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 17:54:00 -08:00
Sean Christopherson
d7507a94a0 KVM: SVM: Treat exit_code as an unsigned 64-bit value through all of KVM
Fix KVM's long-standing buggy handling of SVM's exit_code as a 32-bit
value.  Per the APM and Xen commit d1bd157fbc ("Big merge the HVM
full-virtualisation abstractions.") (which is arguably more trustworthy
than KVM), offset 0x70 is a single 64-bit value:

  070h 63:0 EXITCODE

Track exit_code as a single u64 to prevent reintroducing bugs where KVM
neglects to correctly set bits 63:32.

Fixes: 6aa8b732ca ("[PATCH] kvm: userspace interface")
Cc: Jim Mattson <jmattson@google.com>
Cc: Yosry Ahmed <yosry.ahmed@linux.dev>
Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
Link: https://patch.msgid.link/20251230211347.4099600-6-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-13 17:37:03 -08:00
Sean Christopherson
c3a9a27c79 KVM: selftests: Add a test to verify APICv updates (while L2 is active)
Add a test to verify KVM correctly handles a variety of edge cases related
to APICv updates, and in particular updates that are triggered while L2 is
actively running.

Reviewed-by: Chao Gao <chao.gao@intel.com>
Link: https://patch.msgid.link/20260109034532.1012993-2-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-01-13 17:35:31 -08:00
Donglin Peng
a3acd7d434 selftests/bpf: Add test cases for btf__permute functionality
This patch introduces test cases for the btf__permute function to ensure
it works correctly with both base BTF and split BTF scenarios.

The test suite includes:
- test_permute_base: Validates permutation on base BTF
- test_permute_split: Tests permutation on split BTF

Signed-off-by: Donglin Peng <pengdonglin@xiaomi.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Eduard Zingerman <eddyz87@gmail.com>
Link: https://lore.kernel.org/bpf/20260109130003.3313716-3-dolinux.peng@gmail.com
2026-01-13 16:10:39 -08:00
Willem de Bruijn
c65182ef9d selftests: net: reduce txtimestamp deschedule flakes
This test occasionally fails due to exceeding timing bounds, as
run in continuous testing on netdev.bots:

  https://netdev.bots.linux.dev/contest.html?test=txtimestamp-sh

A common pattern is a single elevated delay between USR and SND.

    # 8.36 [+0.00] test SND
    # 8.36 [+0.00]     USR: 1767864384 s 240994 us (seq=0, len=0)
    # 8.44 [+0.08] ERROR: 18461 us expected between 10000 and 18000
    # 8.44 [+0.00]     SND: 1767864384 s 259455 us (seq=42, len=10)  (USR +18460 us)
    # 8.52 [+0.07]     SND: 1767864384 s 339523 us (seq=42, len=10)  (USR +10005 us)
    # 8.52 [+0.00]     USR: 1767864384 s 409580 us (seq=0, len=0)
    # 8.60 [+0.08]     SND: 1767864384 s 419586 us (seq=42, len=10)  (USR +10005 us)
    # 8.60 [+0.00]     USR: 1767864384 s 489645 us (seq=0, len=0)
    # 8.68 [+0.08]     SND: 1767864384 s 499651 us (seq=42, len=10)  (USR +10005 us)
    # 8.68 [+0.00]     USR-SND: count=4, avg=12119 us, min=10005 us, max=18460 us

(Note that other delays are nowhere near the large 8ms tolerance.)

One hypothesis is that the task is descheduled between taking the USR
timestamp and sending the packet. Possibly in printing.

Delay taking the timestamp closer to sendmsg, and delay printing until
after sendmsg.

With this change, failure rate is significantly lower in current runs.

Link: https://lore.kernel.org/netdev/20260107110521.1aab55e9@kernel.org/
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20260112163355.3510150-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13 11:46:39 -08:00
Linus Torvalds
0bb933a9fc Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull x86 kvm fixes from Paolo Bonzini:

 - Avoid freeing stack-allocated node in kvm_async_pf_queue_task

 - Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  selftests: kvm: Verify TILELOADD actually #NM faults when XFD[18]=1
  selftests: kvm: try getting XFD and XSAVE state out of sync
  selftests: kvm: replace numbered sync points with actions
  x86/fpu: Clear XSTATE_BV[i] in guest XSAVE state whenever XFD[i]=1
  x86/kvm: Avoid freeing stack-allocated node in kvm_async_pf_queue_task
2026-01-13 09:50:07 -08:00
Alexei Starovoitov
9160335317 selftests/bpf: Add tests for s>>=31 and s>>=63
Add tests for special arithmetic shift right.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Co-developed-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://lore.kernel.org/r/20260112201424.816836-3-puranjay@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-13 09:33:38 -08:00
Yonghong Song
951d79017e selftests/bpf: Fix verifier_arena_globals1 failure with 64K page
With 64K page on arm64, verifier_arena_globals1 failed like below:
  ...
  libbpf: map 'arena': failed to create: -E2BIG
  ...
  #509/1   verifier_arena_globals1/check_reserve1:FAIL
  ...

For 64K page, if the number of arena pages is (1UL << 20), the total
memory will exceed 4G and this will cause map creation failure.
Adjusting ARENA_PAGES based on the actual page size fixed the problem.

Cc: Emil Tsalapatis <emil@etsalapatis.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com>
Link: https://lore.kernel.org/r/20260113061033.3798549-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-13 09:32:16 -08:00
Yonghong Song
d2f7cd20a7 selftests/bpf: Fix sk_bypass_prot_mem failure with 64K page
The current selftest sk_bypass_prot_mem only supports 4K page.
When running with 64K page on arm64, the following failure happens:
  ...
  check_bypass:FAIL:no bypass unexpected no bypass: actual 3 <= expected 32
  ...
  #385/1   sk_bypass_prot_mem/TCP  :FAIL
  ...
  check_bypass:FAIL:no bypass unexpected no bypass: actual 4 <= expected 32
  ...
  #385/2   sk_bypass_prot_mem/UDP  :FAIL
  ...

Adding support to 64K page as well fixed the failure.

Cc: Kuniyuki Iwashima <kuniyu@google.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20260113061028.3798326-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-13 09:32:16 -08:00
Yonghong Song
2465a08d43 selftests/bpf: Fix dmabuf_iter/lots_of_buffers failure with 64K page
On arm64 with 64K page , I observed the following test failure:
  ...
  subtest_dmabuf_iter_check_lots_of_buffers:FAIL:total_bytes_read unexpected total_bytes_read:
      actual 4696 <= expected 65536
  #97/3    dmabuf_iter/lots_of_buffers:FAIL

With 4K page on x86, the total_bytes_read is 4593.
With 64K page on arm64, the total_byte_read is 4696.

In progs/dmabuf_iter.c, for each iteration, the output is
  BPF_SEQ_PRINTF(seq, "%lu\n%llu\n%s\n%s\n", inode, size, name, exporter);

The only difference between 4K and 64K page is 'size' in
the above BPF_SEQ_PRINTF. The 4K page will output '4096' and
the 64K page will output '65536'. So the total_bytes_read with 64K page
is slighter greater than 4K page.

Adjusting the total_bytes_read from 65536 to 4096 fixed the issue.

Cc: T.J. Mercier <tjmercier@google.com>
Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20260113061023.3798085-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-13 09:32:16 -08:00
Robert Richter
8441c7d3bd cxl: Check for invalid addresses returned from translation functions on errors
Translation functions may return an invalid address in case of errors.
If the address is not checked the further use of the invalid value
will cause an address corruption.

Consistently check for a valid address returned by translation
functions. Use RESOURCE_SIZE_MAX to indicate an invalid address for
type resource_size_t. Depending on the type either RESOURCE_SIZE_MAX
or ULLONG_MAX is used to indicate an address error.

Propagating an invalid address from a failed translation may cause
userspace to think it has received a valid SPA, when in fact it is
wrong. The CXL userspace API, using trace events, expects ULLONG_MAX
to indicate a translation failure. If ULLONG_MAX is not returned
immediately, subsequent calculations can transform that bad address
into a different value (!ULLONG_MAX), and an invalid SPA may be
returned to userspace. This can lead to incorrect diagnostics and
erroneous corrective actions.

[ dj: Added user impact statement from Alison. ]
[ dj: Fixed checkpatch tab alignment issue. ]

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Fixes: c3dd67681c ("cxl/region: Add inject and clear poison by region offset")
Fixes: b78b9e7b79 ("cxl/region: Refactor address translation funcs for testing")
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20260107120544.410993-1-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2026-01-13 08:30:40 -07:00