Commit Graph

49385 Commits

Author SHA1 Message Date
Paolo Abeni
f5b60d6a57 Merge tag 'nf-next-25-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following batch contains Netfilter updates for net-next,
specifically 26 patches: 5 patches adding/updating selftests,
4 fixes, 3 PREEMPT_RT fixes, and 14 patches to enhance nf_tables):

1) Improve selftest coverage for pipapo 4 bit group format, from
   Florian Westphal.

2) Fix incorrect dependencies when compiling a kernel without
   legacy ip{6}tables support, also from Florian.

3) Two patches to fix nft_fib vrf issues, including selftest updates
   to improve coverage, also from Florian Westphal.

4) Fix incorrect nesting in nft_tunnel's GENEVE support, from
   Fernando F. Mancera.

5) Three patches to fix PREEMPT_RT issues with nf_dup infrastructure
   and nft_inner to match in inner headers, from Sebastian Andrzej Siewior.

6) Integrate conntrack information into nft trace infrastructure,
   from Florian Westphal.

7) A series of 13 patches to allow to specify wildcard netdevice in
   netdev basechain and flowtables, eg.

   table netdev filter {
       chain ingress {
           type filter hook ingress devices = { eth0, eth1, vlan* } priority 0; policy accept;
       }
   }

   This also allows for runtime hook registration on NETDEV_{UN}REGISTER
   event, from Phil Sutter.

netfilter pull request 25-05-23

* tag 'nf-next-25-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: (26 commits)
  selftests: netfilter: Torture nftables netdev hooks
  netfilter: nf_tables: Add notifications for hook changes
  netfilter: nf_tables: Support wildcard netdev hook specs
  netfilter: nf_tables: Sort labels in nft_netdev_hook_alloc()
  netfilter: nf_tables: Handle NETDEV_CHANGENAME events
  netfilter: nf_tables: Wrap netdev notifiers
  netfilter: nf_tables: Respect NETDEV_REGISTER events
  netfilter: nf_tables: Prepare for handling NETDEV_REGISTER events
  netfilter: nf_tables: Have a list of nf_hook_ops in nft_hook
  netfilter: nf_tables: Pass nf_hook_ops to nft_unregister_flowtable_hook()
  netfilter: nf_tables: Introduce nft_register_flowtable_ops()
  netfilter: nf_tables: Introduce nft_hook_find_ops{,_rcu}()
  netfilter: nf_tables: Introduce functions freeing nft_hook objects
  netfilter: nf_tables: add packets conntrack state to debug trace info
  netfilter: conntrack: make nf_conntrack_id callable without a module dependency
  netfilter: nf_dup_netdev: Move the recursion counter struct netdev_xmit
  netfilter: nft_inner: Use nested-BH locking for nft_pcpu_tun_ctx
  netfilter: nf_dup{4, 6}: Move duplication check to task_struct
  netfilter: nft_tunnel: fix geneve_opt dump
  selftests: netfilter: nft_fib.sh: add type and oif tests with and without VRFs
  ...
====================

Link: https://patch.msgid.link/20250523132712.458507-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-26 18:53:41 +02:00
Rafael J. Wysocki
57356d98c0 Merge branch 'acpica'
Merge ACPICA updates, including two upstream releases 20241212 and
20250404, for 6.16-rc1:

 - Fix two ACPICA SLAB cache leaks (Seunghun Han).

 - Add EINJv2 get error type action and define Error Injection Actions
   in hex values to avoid inconsistencies between the specification and
   the code (Zaid Alali).

 - Fix typo in comments for SRAT structures (Adam Lackorzynski).

 - Prevent possible loss of data in ACPICA because of u32 to u8
   conversions (Saket Dumbre).

 - Fix reading FFixedHW operation regions in ACPICA (Daniil Tatianin).

 - Add support for printing AML arguments when the ACPICA debug level is
   ACPI_LV_TRACE_POINT (Mario Limonciello).

 - Drop a stale comment about the file content from actbl2.h (Sudeep
   Holla).

 - Apply pack(1) to union aml_resource (Tamir Duberstein).

 - Fix overflow check in the ACPICA version of vsnprintf() (gldrk).

 - Interpret SIDP structures in DMAR added revision 3.4 of the VT-d
   specification (Alexey Neyman).

 - Add typedef and other definitions related to MRRM to ACPICA (Tony
   Luck).

 - Add definitions for RIMT to ACPICA (Sunil V L).

 - Fix spelling mistake "Incremement" -> "Increment" in the ACPICA
   utilities code (Colin Ian King).

 - Add typedef and other definitions for ERDT to ACPICA (Tony Luck).

 - Introduce ACPI_NONSTRING and use it (Kees Cook, Ahmed Salem).

 - Rename structure and field names of the RAS2 table in actbl2.h (Shiju
   Jose).

 - Fix up whitespace in acpica/utcache.c (Zhe Qiao).

 - Avoid sequence overread in a call to strncmp() in ap_get_table_length()
   and replace strncpy() with memcpy() in ACPICA in some places (Ahmed
   Salem).

 - Update copyright year in all ACPICA files (Saket Dumbre).

* acpica: (30 commits)
  ACPICA: Update copyright year
  ACPICA: Logfile: Changes for version 20250404
  ACPICA: Replace strncpy() with memcpy()
  ACPICA: Apply ACPI_NONSTRING in more places
  ACPICA: Avoid sequence overread in call to strncmp()
  ACPICA: Adjust the position of code lines
  ACPICA: actbl2.h: ACPI 6.5: RAS2: Rename structure and field names of the RAS2 table
  ACPICA: Apply ACPI_NONSTRING
  ACPICA: Introduce ACPI_NONSTRING
  ACPICA: actbl2.h: ERDT: Add typedef and other definitions
  ACPICA: infrastructure: Add new DMT_BUF types and shorten a long name
  ACPICA: Utilities: Fix spelling mistake "Incremement" -> "Increment"
  ACPICA: MRRM: Some cleanups
  ACPICA: actbl2: Add definitions for RIMT
  ACPICA: actbl2.h: MRRM: Add typedef and other definitions
  ACPICA: infrastructure: Add new header and ACPI_DMT_BUF26 types
  ACPICA: Interpret SIDP structures in DMAR
  ACPICA: utilities: Fix overflow check in vsnprintf()
  ACPICA: Apply pack(1) to union aml_resource
  ACPICA: Drop stale comment about the header file content
  ...
2025-05-26 18:35:01 +02:00
Paolo Abeni
34d26315db Merge tag 'linux-can-next-for-6.16-20250522' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:

====================
pull-request: can-next 2025-05-22

this is a pull request of 22 patches for net-next/main.

The series by Biju Das contains 19 patches and adds RZ/G3E CANFD
support to the rcar_canfd driver.

The patch by Vincent Mailhol adds a struct data_bittiming_params to
group FD parameters as a preparation patch for CAN-XL support.

Felix Maurer's patch imports tst-filter from can-tests into the kernel
self tests and Vincent Mailhol adds support for physical CAN
interfaces.

linux-can-next-for-6.16-20250522

* tag 'linux-can-next-for-6.16-20250522' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: (22 commits)
  selftests: can: test_raw_filter.sh: add support of physical interfaces
  selftests: can: Import tst-filter from can-tests
  can: dev: add struct data_bittiming_params to group FD parameters
  can: rcar_canfd: Add RZ/G3E support
  can: rcar_canfd: Enhance multi_channel_irqs handling
  can: rcar_canfd: Add external_clk variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Add sh variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Add struct rcanfd_regs variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Add shared_can_regs variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Add ch_interface_mode variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Add {nom,data}_bittiming variables to struct rcar_canfd_hw_info
  can: rcar_canfd: Add max_cftml variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Add max_aflpn variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Add rnc_field_width variable to struct rcar_canfd_hw_info
  can: rcar_canfd: Update RCANFD_GAFLCFG macro
  can: rcar_canfd: Add rcar_canfd_setrnc()
  can: rcar_canfd: Drop the mask operation in RCANFD_GAFLCFG_SETRNC macro
  can: rcar_canfd: Update RCANFD_GERFL_ERR macro
  can: rcar_canfd: Drop RCANFD_GAFLCFG_GETRNC macro
  can: rcar_canfd: Use of_get_available_child_by_name()
  ...
====================

Link: https://patch.msgid.link/20250522084128.501049-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-26 18:11:24 +02:00
Linus Torvalds
181d8e399f Merge tag 'vfs-6.16-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull misc vfs updates from Christian Brauner:
 "This contains the usual selections of misc updates for this cycle.

  Features:

   - Use folios for symlinks in the page cache

     FUSE already uses folios for its symlinks. Mirror that conversion
     in the generic code and the NFS code. That lets us get rid of a few
     folio->page->folio conversions in this path, and some of the few
     remaining users of read_cache_page() / read_mapping_page()

   - Try and make a few filesystem operations killable on the VFS
     inode->i_mutex level

   - Add sysctl vfs_cache_pressure_denom for bulk file operations

     Some workloads need to preserve more dentries than we currently
     allow through out sysctl interface

     A HDFS servers with 12 HDDs per server, on a HDFS datanode startup
     involves scanning all files and caching their metadata (including
     dentries and inodes) in memory. Each HDD contains approximately 2
     million files, resulting in a total of ~20 million cached dentries
     after initialization

     To minimize dentry reclamation, they set vfs_cache_pressure to 1.
     Despite this configuration, memory pressure conditions can still
     trigger reclamation of up to 50% of cached dentries, reducing the
     cache from 20 million to approximately 10 million entries. During
     the subsequent cache rebuild period, any HDFS datanode restart
     operation incurs substantial latency penalties until full cache
     recovery completes

     To maintain service stability, more dentries need to be preserved
     during memory reclamation. The current minimum reclaim ratio (1/100
     of total dentries) remains too aggressive for such workload. This
     patch introduces vfs_cache_pressure_denom for more granular cache
     pressure control

     The configuration [vfs_cache_pressure=1,
     vfs_cache_pressure_denom=10000] effectively maintains the full 20
     million dentry cache under memory pressure, preventing datanode
     restart performance degradation

   - Avoid some jumps in inode_permission() using likely()/unlikely()

   - Avid a memory access which is most likely a cache miss when
     descending into devcgroup_inode_permission()

   - Add fastpath predicts for stat() and fdput()

   - Anonymous inodes currently don't come with a proper mode causing
     issues in the kernel when we want to add useful VFS debug assert.
     Fix that by giving them a proper mode and masking it off when we
     report it to userspace which relies on them not having any mode

   - Anonymous inodes currently allow to change inode attributes because
     the VFS falls back to simple_setattr() if i_op->setattr isn't
     implemented. This means the ownership and mode for every single
     user of anon_inode_inode can be changed. Block that as it's either
     useless or actively harmful. If specific ownership is needed the
     respective subsystem should allocate anonymous inodes from their
     own private superblock

   - Raise SB_I_NODEV and SB_I_NOEXEC on the anonymous inode superblock

   - Add proper tests for anonymous inode behavior

   - Make it easy to detect proper anonymous inodes and to ensure that
     we can detect them in codepaths such as readahead()

  Cleanups:

   - Port pidfs to the new anon_inode_{g,s}etattr() helpers

   - Try to remove the uselib() system call

   - Add unlikely branch hint return path for poll

   - Add unlikely branch hint on return path for core_sys_select

   - Don't allow signals to interrupt getdents copying for fuse

   - Provide a size hint to dir_context for during readdir()

   - Use writeback_iter directly in mpage_writepages

   - Update compression and mtime descriptions in initramfs
     documentation

   - Update main netfs API document

   - Remove useless plus one in super_cache_scan()

   - Remove unnecessary NULL-check guards during setns()

   - Add separate separate {get,put}_cgroup_ns no-op cases

  Fixes:

   - Fix typo in root= kernel parameter description

   - Use KERN_INFO for infof()|info_plog()|infofc()

   - Correct comments of fs_validate_description()

   - Mark an unlikely if condition with unlikely() in
     vfs_parse_monolithic_sep()

   - Delete macro fsparam_u32hex()

   - Remove unused and problematic validate_constant_table()

   - Fix potential unsigned integer underflow in fs_name()

   - Make file-nr output the total allocated file handles"

* tag 'vfs-6.16-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (43 commits)
  fs: Pass a folio to page_put_link()
  nfs: Use a folio in nfs_get_link()
  fs: Convert __page_get_link() to use a folio
  fs/read_write: make default_llseek() killable
  fs/open: make do_truncate() killable
  fs/open: make chmod_common() and chown_common() killable
  include/linux/fs.h: add inode_lock_killable()
  readdir: supply dir_context.count as readdir buffer size hint
  vfs: Add sysctl vfs_cache_pressure_denom for bulk file operations
  fuse: don't allow signals to interrupt getdents copying
  Documentation: fix typo in root= kernel parameter description
  include/cgroup: separate {get,put}_cgroup_ns no-op case
  kernel/nsproxy: remove unnecessary guards
  fs: use writeback_iter directly in mpage_writepages
  fs: remove useless plus one in super_cache_scan()
  fs: add S_ANON_INODE
  fs: remove uselib() system call
  device_cgroup: avoid access to ->i_rdev in the common case in devcgroup_inode_permission()
  fs/fs_parse: Remove unused and problematic validate_constant_table()
  fs: touch up predicts in inode_permission()
  ...
2025-05-26 09:02:39 -07:00
Stanislav Fomichev
8ceeef23a3 selftests: ncdevmem: add tx test with multiple IOVs
Use prime 3 for length to make offset slowly drift away.

Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com>
Acked-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-26 10:00:48 +01:00
Stanislav Fomichev
61f24c6885 selftests: ncdevmem: make chunking optional
Add new -z argument to specify max IOV size. By default, use
single large IOV.

Signed-off-by: Stanislav Fomichev <stfomichev@gmail.com>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-26 10:00:48 +01:00
Ingo Molnar
94ec70880f Merge branch 'locking/futex' into locking/core, to pick up pending futex changes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-25 10:10:08 +02:00
Leo Yan
628e124404 perf tests switch-tracking: Fix timestamp comparison
The test might fail on the Arm64 platform with the error:

  # perf test -vvv "Track with sched_switch"
  Missing sched_switch events
  #

The issue is caused by incorrect handling of timestamp comparisons. The
comparison result, a signed 64-bit value, was being directly cast to an
int, leading to incorrect sorting for sched events.

The case does not fail everytime, usually I can trigger the failure
after run 20 ~ 30 times:

  # while true; do perf test "Track with sched_switch"; done
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : FAILED!
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : FAILED!
  106: Track with sched_switch                                         : Ok
  106: Track with sched_switch                                         : Ok

I used cross compiler to build Perf tool on my host machine and tested on
Debian / Juno board.  Generally, I think this issue is not very specific
to GCC versions.  As both internal CI and my local env can reproduce the
issue.

My Host Build compiler:

  # aarch64-linux-gnu-gcc --version
  aarch64-linux-gnu-gcc (Ubuntu 13.3.0-6ubuntu2~24.04) 13.3.0

Juno Board:

  # lsb_release -a
  No LSB modules are available.
  Distributor ID: Debian
  Description:    Debian GNU/Linux 12 (bookworm)
  Release:        12
  Codename:       bookworm

Fix this by explicitly returning 0, 1, or -1 based on whether the result
is zero, positive, or negative.

Fixes: d44bc55829 ("perf tests: Add a test for tracking with sched_switch")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250331172759.115604-1-leo.yan@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-24 11:40:47 -03:00
Dave Jiang
9f153b7fb5 Merge branch 'for-6.16/cxl-features-ras' into cxl-for-next
Add CXL RAS Features support. Features include "patrol scrub control",
"error check scrub", "perform maintenance", and "memory sparing". This
support connects the RAS Featurs to EDAC.
2025-05-23 13:26:24 -07:00
Shiju Jose
0c6e6f1357 cxl/edac: Add CXL memory device patrol scrub control feature
CXL spec 3.2 section 8.2.10.9.11.1 describes the device patrol scrub
control feature. The device patrol scrub proactively locates and makes
corrections to errors in regular cycle.

Allow specifying the number of hours within which the patrol scrub must be
completed, subject to minimum and maximum limits reported by the device.
Also allow disabling scrub allowing trade-off error rates against
performance.

Add support for patrol scrub control on CXL memory devices.
Register with the EDAC device driver, which retrieves the scrub attribute
descriptors from EDAC scrub and exposes the sysfs scrub control attributes
to userspace. For example, scrub control for the CXL memory device
"cxl_mem0" is exposed in /sys/bus/edac/devices/cxl_mem0/scrubX/.

Additionally, add support for region-based CXL memory patrol scrub control.
CXL memory regions may be interleaved across one or more CXL memory
devices. For example, region-based scrub control for "cxl_region1" is
exposed in /sys/bus/edac/devices/cxl_region1/scrubX/.

[dj: A few formatting fixes from Jonathan]

Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Co-developed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250521124749.817-4-shiju.jose@huawei.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-23 13:24:09 -07:00
Lorenz Bauer
3c0421c93c libbpf: Use mmap to parse vmlinux BTF from sysfs
Teach libbpf to use mmap when parsing vmlinux BTF from /sys. We don't
apply this to fall-back paths on the regular file system because there
is no way to ensure that modifications underlying the MAP_PRIVATE
mapping are not visible to the process.

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250520-vmlinux-mmap-v5-3-e8c941acc414@isovalent.com
2025-05-23 10:06:28 -07:00
Lorenz Bauer
828226b69f selftests: bpf: Add a test for mmapable vmlinux BTF
Add a basic test for the ability to mmap /sys/kernel/btf/vmlinux.
Ensure that the data is valid BTF and that it is padded with zero.

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20250520-vmlinux-mmap-v5-2-e8c941acc414@isovalent.com
2025-05-23 10:06:28 -07:00
Shradha Gupta
a9c0b33ef2 tools: hv: Enable debug logs for hv_kvp_daemon
Allow the KVP daemon to log the KVP updates triggered in the VM
with a new debug flag(-d).
When the daemon is started with this flag, it logs updates and debug
information in syslog with loglevel LOG_DEBUG. This information comes
in handy for debugging issues where the key-value pairs for certain
pools show mismatch/incorrect values.
The distro-vendors can further consume these changes and modify the
respective service files to redirect the logs to specific files as
needed.

Signed-off-by: Shradha Gupta <shradhagupta@linux.microsoft.com>
Reviewed-by: Naman Jain <namjain@linux.microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/1744715978-8185-1-git-send-email-shradhagupta@linux.microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Message-ID: <1744715978-8185-1-git-send-email-shradhagupta@linux.microsoft.com>
2025-05-23 16:30:55 +00:00
Ming Lei
533c87e2ed selftests: ublk: add test for UBLK_F_QUIESCE
Add test generic_11 for covering new control command of
UBLK_U_CMD_QUIESCE_DEV.

Add 'quiesce -n dev_id' sub-command on ublk utility for transitioning
device state to quiesce states, then verify the feature via generic_10
by doing quiesce and recovery.

Cc: Yoav Cohen <yoav@nvidia.com>
Link: https://lore.kernel.org/linux-block/DM4PR12MB632807AB7CDCE77D1E5AB7D0A9B92@DM4PR12MB6328.namprd12.prod.outlook.com/
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250522163523.406289-4-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-23 09:42:12 -06:00
Ming Lei
f40b1f2670 selftests: ublk: add test case for UBLK_U_CMD_UPDATE_SIZE
Add test generic_10 for covering new control command of UBLK_U_CMD_UPDATE_SIZE.

Add 'update_size -s|--size size_in_bytes' sub-command on ublk utility for
supporting this feature, then verify the feature via generic_10.

Cc: Omri Mann <omri@nvidia.com>
Cc: Jared Holzman <jholzman@nvidia.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Link: https://lore.kernel.org/r/20250522163523.406289-2-ming.lei@redhat.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-05-23 09:42:12 -06:00
Phil Sutter
73db1b5dab selftests: netfilter: Torture nftables netdev hooks
Add a ruleset which binds to various interface names via netdev-family
chains and flowtables and massage the notifiers by frequently renaming
interfaces to match these names. While doing so:
- Keep an 'nft monitor' running in background to receive the notifications
- Loop over 'nft list ruleset' to exercise ruleset dump codepath
- Have iperf running so the involved chains/flowtables see traffic

If supported, also test interface wildcard support separately by
creating a flowtable with 'wild*' interface spec and quickly add/remove
matching dummy interfaces.

Signed-off-by: Phil Sutter <phil@nwl.cc>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-05-23 13:57:14 +02:00
Florian Westphal
996d62ece0 selftests: netfilter: nft_fib.sh: add type and oif tests with and without VRFs
Replace the existing VRF test with a more comprehensive one.

It tests following combinations:
 - fib type (returns address type, e.g. unicast)
 - fib oif (route output interface index
 - both with and without 'iif' keyword (changes result, e.g.
  'fib daddr type local' will be true when the destination address
  is configured on the local machine, but
  'fib daddr . iif type local' will only be true when the destination
  address is configured on the incoming interface.

Add all types of addresses to test with for both ipv4 and ipv6:
- local address on the incoming interface
- local address on another interface
- local address on another interface thats part of a vrf
- address on another host

The ruleset stores obtained results from 'fib' in nftables sets and
then queries the sets to check that it has the expected results.

Perform one pass while packets are coming in on interface NOT part of
a VRF and then again when it was added and make sure fib returns the
expected routes and address types for the various addresses in the
setup.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-05-23 13:57:12 +02:00
Marc Zyngier
1b85d923ba Merge branch kvm-arm64/misc-6.16 into kvmarm-master/next
* kvm-arm64/misc-6.16:
  : .
  : Misc changes and improvements for 6.16:
  :
  : - Add a new selftest for the SVE host state being corrupted by a guest
  :
  : - Keep HCR_EL2.xMO set at all times for systems running with the kernel at EL2,
  :   ensuring that the window for interrupts is slightly bigger, and avoiding
  :   a pretty bad erratum on the AmpereOne HW
  :
  : - Replace a couple of open-coded on/off strings with str_on_off()
  :
  : - Get rid of the pKVM memblock sorting, which now appears to be superflous
  :
  : - Drop superflous clearing of ICH_LR_EOI in the LR when nesting
  :
  : - Add workaround for AmpereOne's erratum AC04_CPU_23, which suffers from
  :   a pretty bad case of TLB corruption unless accesses to HCR_EL2 are
  :   heavily synchronised
  :
  : - Add a per-VM, per-ITS debugfs entry to dump the state of the ITS tables
  :   in a human-friendly fashion
  : .
  KVM: arm64: Fix documentation for vgic_its_iter_next()
  KVM: arm64: vgic-its: Add debugfs interface to expose ITS tables
  arm64: errata: Work around AmpereOne's erratum AC04_CPU_23
  KVM: arm64: nv: Remove clearing of ICH_LR<n>.EOI if ICH_LR<n>.HW == 1
  KVM: arm64: Drop sort_memblock_regions()
  KVM: arm64: selftests: Add test for SVE host corruption
  KVM: arm64: Force HCR_EL2.xMO to 1 at all times in VHE mode
  KVM: arm64: Replace ternary flags with str_on_off() helper

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-23 10:59:43 +01:00
Marc Zyngier
fef3acf5ae Merge branch kvm-arm64/fgt-masks into kvmarm-master/next
* kvm-arm64/fgt-masks: (43 commits)
  : .
  : Large rework of the way KVM deals with trap bits in conjunction with
  : the CPU feature registers. It now draws a direct link between which
  : the feature set, the system registers that need to UNDEF to match
  : the configuration and bits that need to behave as RES0 or RES1 in
  : the trap registers that are visible to the guest.
  :
  : Best of all, these definitions are mostly automatically generated
  : from the JSON description published by ARM under a permissive
  : license.
  : .
  KVM: arm64: Handle TSB CSYNC traps
  KVM: arm64: Add FGT descriptors for FEAT_FGT2
  KVM: arm64: Allow sysreg ranges for FGT descriptors
  KVM: arm64: Add context-switch for FEAT_FGT2 registers
  KVM: arm64: Add trap routing for FEAT_FGT2 registers
  KVM: arm64: Add sanitisation for FEAT_FGT2 registers
  KVM: arm64: Add FEAT_FGT2 registers to the VNCR page
  KVM: arm64: Use HCR_EL2 feature map to drive fixed-value bits
  KVM: arm64: Use HCRX_EL2 feature map to drive fixed-value bits
  KVM: arm64: Allow kvm_has_feat() to take variable arguments
  KVM: arm64: Use FGT feature maps to drive RES0 bits
  KVM: arm64: Validate FGT register descriptions against RES0 masks
  KVM: arm64: Switch to table-driven FGU configuration
  KVM: arm64: Handle PSB CSYNC traps
  KVM: arm64: Use KVM-specific HCRX_EL2 RES0 mask
  KVM: arm64: Remove hand-crafted masks for FGT registers
  KVM: arm64: Use computed FGT masks to setup FGT registers
  KVM: arm64: Propagate FGT masks to the nVHE hypervisor
  KVM: arm64: Unconditionally configure fine-grain traps
  KVM: arm64: Use computed masks as sanitisers for FGT registers
  ...

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-23 10:58:15 +01:00
Marc Zyngier
6eb0ed9629 Merge branch kvm-arm64/mte-frac into kvmarm-master/next
* kvm-arm64/mte-frac:
  : .
  : Prevent FEAT_MTE_ASYNC from being accidently exposed to a guest,
  : courtesy of Ben Horgan. From the cover letter:
  :
  : "The ID_AA64PFR1_EL1.MTE_frac field is currently hidden from KVM.
  : However, when ID_AA64PFR1_EL1.MTE==2, ID_AA64PFR1_EL1.MTE_frac==0
  : indicates that MTE_ASYNC is supported. On a host with
  : ID_AA64PFR1_EL1.MTE==2 but without MTE_ASYNC support a guest with the
  : MTE capability enabled will incorrectly see MTE_ASYNC advertised as
  : supported. This series fixes that."
  : .
  KVM: selftests: Confirm exposing MTE_frac does not break migration
  KVM: arm64: Make MTE_frac masking conditional on MTE capability
  arm64/sysreg: Expose MTE_frac so that it is visible to KVM

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-05-23 10:57:44 +01:00
Kuniyuki Iwashima
431e2b874e selftest: af_unix: Test SO_PASSRIGHTS.
scm_rights.c has various patterns of tests to exercise GC.

Let's add cases where SO_PASSRIGHTS is disabled.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-23 10:24:19 +01:00
Kuniyuki Iwashima
77cbe1a6d8 af_unix: Introduce SO_PASSRIGHTS.
As long as recvmsg() or recvmmsg() is used with cmsg, it is not
possible to avoid receiving file descriptors via SCM_RIGHTS.

This behaviour has occasionally been flagged as problematic, as
it can be (ab)used to trigger DoS during close(), for example, by
passing a FUSE-controlled fd or a hung NFS fd.

For instance, as noted on the uAPI Group page [0], an untrusted peer
could send a file descriptor pointing to a hung NFS mount and then
close it.  Once the receiver calls recvmsg() with msg_control, the
descriptor is automatically installed, and then the responsibility
for the final close() now falls on the receiver, which may result
in blocking the process for a long time.

Regarding this, systemd calls cmsg_close_all() [1] after each
recvmsg() to close() unwanted file descriptors sent via SCM_RIGHTS.

However, this cannot work around the issue at all, because the final
fput() may still occur on the receiver's side once sendmsg() with
SCM_RIGHTS succeeds.  Also, even filtering by LSM at recvmsg() does
not work for the same reason.

Thus, we need a better way to refuse SCM_RIGHTS at sendmsg().

Let's introduce SO_PASSRIGHTS to disable SCM_RIGHTS.

Note that this option is enabled by default for backward
compatibility.

Link: https://uapi-group.org/kernel-features/#disabling-reception-of-scm_rights-for-af_unix-sockets #[0]
Link: https://github.com/systemd/systemd/blob/v257.5/src/basic/fd-util.c#L612-L628 #[1]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-23 10:24:18 +01:00
Kuniyuki Iwashima
ae4f2f59e1 tcp: Restrict SO_TXREHASH to TCP socket.
sk->sk_txrehash is only used for TCP.

Let's restrict SO_TXREHASH to TCP to reflect this.

Later, we will make sk_txrehash a part of the union for other
protocol families.

Note that we need to modify BPF selftest not to get/set
SO_TEREHASH for non-TCP sockets.

Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-05-23 10:24:18 +01:00
Ian Rogers
0ffca606e9 perf pmu intel: Adjust cpumaks for sub-NUMA clusters on graniterapids
On graniterapids the cache home agent (CHA) and memory controller
(IMC) PMUs all have their cpumask set to per-socket information. In
order for per NUMA node aggregation to work correctly the PMUs cpumask
needs to be set to CPUs for the relevant sub-NUMA grouping.

For example, on a 2 socket graniterapids machine with sub NUMA
clustering of 3, for uncore_cha and uncore_imc PMUs the cpumask is
"0,120" leading to aggregation only on NUMA nodes 0 and 3:
```
$ perf stat --per-node -e 'UNC_CHA_CLOCKTICKS,UNC_M_CLOCKTICKS' -a sleep 1

 Performance counter stats for 'system wide':

N0        1    277,835,681,344      UNC_CHA_CLOCKTICKS
N0        1     19,242,894,228      UNC_M_CLOCKTICKS
N3        1    277,803,448,124      UNC_CHA_CLOCKTICKS
N3        1     19,240,741,498      UNC_M_CLOCKTICKS

       1.002113847 seconds time elapsed
```

By updating the PMUs cpumasks to "0,120", "40,160" and "80,200" then
the correctly 6 NUMA node aggregations are achieved:
```
$ perf stat --per-node -e 'UNC_CHA_CLOCKTICKS,UNC_M_CLOCKTICKS' -a sleep 1

 Performance counter stats for 'system wide':

N0        1     92,748,667,796      UNC_CHA_CLOCKTICKS
N0        0      6,424,021,142      UNC_M_CLOCKTICKS
N1        0     92,753,504,424      UNC_CHA_CLOCKTICKS
N1        1      6,424,308,338      UNC_M_CLOCKTICKS
N2        0     92,751,170,084      UNC_CHA_CLOCKTICKS
N2        0      6,424,227,402      UNC_M_CLOCKTICKS
N3        1     92,745,944,144      UNC_CHA_CLOCKTICKS
N3        0      6,423,752,086      UNC_M_CLOCKTICKS
N4        0     92,725,793,788      UNC_CHA_CLOCKTICKS
N4        1      6,422,393,266      UNC_M_CLOCKTICKS
N5        0     92,717,504,388      UNC_CHA_CLOCKTICKS
N5        0      6,421,842,618      UNC_M_CLOCKTICKS

       1.003406645 seconds time elapsed
```

In general, having the perf tool adjust cpumasks isn't desirable as
ideally the PMU driver would be advertising the correct cpumask.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250515181417.491401-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-22 23:15:48 -03:00
Arnaldo Carvalho de Melo
9e893dab82 perf tests trace_summary.sh: Run in exclusive mode
And it is being successfull only when running alone, probably because
there are some tests that add the vfs_getname probe that gets used by
'perf trace' and alter how it does syscall arg pathname resolution.

This should be removed or made a fallback to the preferred BPF mode of
getting syscall parameters, but till then, run this in exclusive mode.

For reference, here are some of the tests that run close to this one:

  127: perf record offcpu profiling tests                              : Ok
  128: perf all PMU test                                               : Ok
  129: perf stat --bpf-counters test                                   : Ok
  130: Check Arm CoreSight trace data recording and synthesized samples: Skip
  131: Check Arm CoreSight disassembly script completes without errors : Skip
  132: Check Arm SPE trace data recording and synthesized samples      : Skip
  133: Test data symbol                                                : Ok
  134: Miscellaneous Intel PT testing                                  : Skip
  135: test Intel TPEBS counting mode                                  : Skip
  136: perf script task-analyzer tests                                 : Ok
  137: Check open filename arg using perf trace + vfs_getname          : Ok
  138: perf trace summary                                              : Ok

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/aC-hHTgArwlF_zu9@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-22 22:24:58 -03:00
Namhyung Kim
dd8633bd09 perf test: Add cgroup summary test case for 'perf trace'
$ sudo ./perf test -vv 112
  112: perf trace summary:
  --- start ---
  test child forked, pid 1018940
  testing: perf trace -s -- true
  testing: perf trace -S -- true
  testing: perf trace -s --summary-mode=thread -- true
  testing: perf trace -S --summary-mode=total -- true
  testing: perf trace -as --summary-mode=thread --no-bpf-summary -- true
  testing: perf trace -as --summary-mode=total --no-bpf-summary -- true
  testing: perf trace -as --summary-mode=thread --bpf-summary -- true
  testing: perf trace -as --summary-mode=total --bpf-summary -- true
  testing: perf trace -aS --summary-mode=total --bpf-summary -- true
  testing: perf trace -as --summary-mode=cgroup --bpf-summary -- true
  testing: perf trace -aS --summary-mode=cgroup --bpf-summary -- true
  ---- end(0) ----
  112: perf trace summary                                              : Ok

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250522142551.1062417-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-22 22:24:58 -03:00
Gautam Menghani
59df607bf8 perf python: Add counting.py as example for counting perf events
Add counting.py - a python version of counting.c to demonstrate
measuring and reading of counts for given perf events.

Committer testing:

Build perf and make the generated python binding somewhere you can point
to to avoid using the one in the distro python3-perf (fedora, may be
different in other distros):

  $ make -k O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin

Copy /tmp/build/perf-tools-next/python/perf.cpython-313-x86_64-linux-gnu.so to
somewhere outside this toolbox container and then use it with root:

  # export PYTHONPATH=/root/python/
  # ls -la /root/python/
  total 10640
  drwxr-xr-x. 1 root root       72 May 21 11:40 .
  dr-xr-x---. 1 root root      574 May 21 11:40 ..
  -rwxr-xr-x. 1 acme acme 10894360 May 21 11:40 perf.cpython-313-x86_64-linux-gnu.so
  # tools/perf/python/counting.py | head -5
  For evsel(software/cpu-clock/) val: 2930946 enable: 2932479 run: 2932479
  For evsel(software/cpu-clock/) val: 2924975 enable: 2926267 run: 2926267
  For evsel(software/cpu-clock/) val: 2921017 enable: 2922430 run: 2922430
  For evsel(software/cpu-clock/) val: 2914966 enable: 2916549 run: 2916549
  For evsel(software/cpu-clock/) val: 2910027 enable: 2911589 run: 2911589
  #

Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
[ make the API take a CPU and thread then compute from these the appropriate indices. ]
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/linux-perf-users/CAP-5=fWb-=hCYmpg7U5N9C94EucQGTOS7YwR2-fo4ptOexzxyg@mail.gmail.com/
Link: https://lore.kernel.org/r/20250519195148.1708988-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-22 22:24:58 -03:00
Gautam Menghani
aa68483740 perf python: Add evlist close support
Add support for the evlist close function.

Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250519195148.1708988-7-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-22 22:24:58 -03:00
Gautam Menghani
739621f657 perf python: Add evsel read method
Add the evsel read method to enable python to read counter data for the
given evsel.

Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/linux-perf-users/20250512055748.479786-1-gautam@linux.ibm.com/
Link: https://lore.kernel.org/r/20250519195148.1708988-6-irogers@google.com
[ make the API take a CPU and thread then compute from these the appropriate indices. ]
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-22 22:24:58 -03:00
Gautam Menghani
3b4991dcb4 perf python: Add support for 'struct perf_counts_values' to return counter data
Add support for the perf_counts_values struct to enable the python
bindings to read and return the counter data.

Committer notes:

Use T_ULONG instead of Py_T_ULONG, as all the other PyMemberDef arrays,
fixing the build with older python3 versions.

Use { .name = NULL, } to finish the new PyMemberDef
pyrf_counts_values_members array, again as the other arrays to please
some clang versions, ditto for PyGetSetDef.

Signed-off-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250519195148.1708988-5-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-22 22:23:52 -03:00
Ryan Chung
19e0713bbe selftests/eventfd: correct test name and improve messages
- Rename test from eventfd_chek_flag_cloexec_and_nonblock to
eventfd_check_flag_cloexec_and_nonblock.

- Make the RDWR‐flag comment declarative:
  “The kernel automatically adds the O_RDWR flag.”
- Update semaphore‐flag failure message to:
  “eventfd semaphore flag check failed: …”

Link: https://lkml.kernel.org/r/20250513074411.6965-1-seokwoo.chung130@gmail.com
Signed-off-by: Ryan Chung <seokwoo.chung130@gmail.com>
Reviewed-by: Wen Yang <wen.yang@linux.dev>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-22 14:55:38 -07:00
SeongJae Park
03f83209e8 selftests/damon/_damon_sysfs: read tried regions directories in order
Kdamond.update_schemes_tried_regions() reads and stores tried regions
information out of address order.  It makes debugging a test failure
difficult.  Change the behavior to do the reading and writing in the
address order.

Link: https://lkml.kernel.org/r/20250513002715.40126-6-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendan.higgins@linux.dev>
Cc: David Gow <davidgow@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-22 14:55:38 -07:00
Mark Brown
5fc4b770fc selftests/mm: deduplicate second mmap() of 5*PAGE_SIZE at base
The map_fixed_noreplace test does two blocks of test starting from a
mapping of 5 pages at the base address, logging a test result for each
initial mapping.  These are logged with the same test name, causing test
automation software to see two reports for the same test in a single run. 
Tweak the log message for the second one to deduplicate.

Link: https://lkml.kernel.org/r/20250518-selftests-mm-map-fixed-noreplace-dup-v1-1-1a11a62c5e9f@kernel.org
Signed-off-by: Mark Brown <broonie@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-22 14:55:37 -07:00
David Hildenbrand
2616b37032 selftests/mm: add simple VM_PFNMAP tests based on mmap'ing /dev/mem
Let's test some basic functionality using /dev/mem.  These tests will
implicitly cover some PAT (Page Attribute Handling) handling on x86.

These tests will only run when /dev/mem access to the first two pages in
physical address space is possible and allowed; otherwise, the tests are
skipped.

On current x86-64 with PAT inside a VM, all tests pass:

	TAP version 13
	1..6
	# Starting 6 tests from 1 test cases.
	#  RUN           pfnmap.madvise_disallowed ...
	#            OK  pfnmap.madvise_disallowed
	ok 1 pfnmap.madvise_disallowed
	#  RUN           pfnmap.munmap_split ...
	#            OK  pfnmap.munmap_split
	ok 2 pfnmap.munmap_split
	#  RUN           pfnmap.mremap_fixed ...
	#            OK  pfnmap.mremap_fixed
	ok 3 pfnmap.mremap_fixed
	#  RUN           pfnmap.mremap_shrink ...
	#            OK  pfnmap.mremap_shrink
	ok 4 pfnmap.mremap_shrink
	#  RUN           pfnmap.mremap_expand ...
	#            OK  pfnmap.mremap_expand
	ok 5 pfnmap.mremap_expand
	#  RUN           pfnmap.fork ...
	#            OK  pfnmap.fork
	ok 6 pfnmap.fork
	# PASSED: 6 / 6 tests passed.
	# Totals: pass:6 fail:0 xfail:0 xpass:0 skip:0 error:0

However, we are able to trigger:

[   27.888251] x86/PAT: pfnmap:1790 freeing invalid memtype [mem 0x00000000-0x00000fff]

There are probably more things worth testing in the future, such as
MAP_PRIVATE handling.  But this set of tests is sufficient to cover most
of the things we will rework regarding PAT handling.

Link: https://lkml.kernel.org/r/20250509153033.952746-1-david@redhat.com
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Dev Jain <dev.jain@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-05-22 14:55:36 -07:00
Michal Luczaj
c04eeeb2af selftests/bpf: sockmap_listen cleanup: Drop af_inet SOCK_DGRAM redir tests
Remove tests covered by sockmap_redir.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-8-a1ea723f7e7e@rbox.co
2025-05-22 14:26:59 -07:00
Michal Luczaj
f3de1cf621 selftests/bpf: sockmap_listen cleanup: Drop af_unix redir tests
Remove tests covered by sockmap_redir.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-7-a1ea723f7e7e@rbox.co
2025-05-22 14:26:58 -07:00
Michal Luczaj
9266e49d60 selftests/bpf: sockmap_listen cleanup: Drop af_vsock redir tests
Remove tests covered by sockmap_redir.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-6-a1ea723f7e7e@rbox.co
2025-05-22 14:26:58 -07:00
Michal Luczaj
f0709263a0 selftests/bpf: Add selftest for sockmap/hashmap redirection
Test redirection logic. All supported and unsupported redirect combinations
are tested for success and failure respectively.

BPF_MAP_TYPE_SOCKMAP
BPF_MAP_TYPE_SOCKHASH
	x
sk_msg-to-egress
sk_msg-to-ingress
sk_skb-to-egress
sk_skb-to-ingress
	x
AF_INET, SOCK_STREAM
AF_INET6, SOCK_STREAM
AF_INET, SOCK_DGRAM
AF_INET6, SOCK_DGRAM
AF_UNIX, SOCK_STREAM
AF_UNIX, SOCK_DGRAM
AF_VSOCK, SOCK_STREAM
AF_VSOCK, SOCK_SEQPACKET

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-5-a1ea723f7e7e@rbox.co
2025-05-22 14:26:58 -07:00
Michal Luczaj
f266905bb3 selftests/bpf: Introduce verdict programs for sockmap_redir
Instead of piggybacking on test_sockmap_listen, introduce
test_sockmap_redir especially for sockmap redirection tests.

Suggested-by: Jiayuan Chen <mrpre@163.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-4-a1ea723f7e7e@rbox.co
2025-05-22 14:26:58 -07:00
Michal Luczaj
b57482b0fe selftests/bpf: Add u32()/u64() to sockmap_helpers
Add integer wrappers for convenient sockmap usage.

While there, fix misaligned trailing slashes.

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-3-a1ea723f7e7e@rbox.co
2025-05-22 14:26:58 -07:00
Michal Luczaj
d87857946d selftests/bpf: Add socket_kind_to_str() to socket_helpers
Add function that returns string representation of socket's domain/type.

Suggested-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-2-a1ea723f7e7e@rbox.co
2025-05-22 14:26:58 -07:00
Michal Luczaj
fb1131d5e1 selftests/bpf: Support af_unix SOCK_DGRAM socket pair creation
Handle af_unix in init_addr_loopback(). For pair creation, bind() the peer
socket to make SOCK_DGRAM connect() happy.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/r/20250515-selftests-sockmap-redir-v3-1-a1ea723f7e7e@rbox.co
2025-05-22 14:26:58 -07:00
Jakub Kicinski
33e1b1b399 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.15-rc8).

Conflicts:
  80f2ab46c2 ("irdma: free iwdev->rf after removing MSI-X")
  4bcc063939 ("ice, irdma: fix an off by one in error handling code")
  c24a65b6a2 ("iidc/ice/irdma: Update IDC to support multiple consumers")
https://lore.kernel.org/20250513130630.280ee6c5@canb.auug.org.au

No extra adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-05-22 09:42:41 -07:00
Mykyta Yatsenko
5ead949920 selftests/bpf: Add SKIP_LLVM makefile variable
Introduce SKIP_LLVM makefile variable that allows to avoid using llvm
dependencies when building BPF selftests. This is different from
existing feature-llvm, as the latter is a result of automatic detection
and should not be set by user explicitly.
Avoiding llvm dependencies could be useful for environments that do not
have them, given that as of now llvm dependencies are required only by
jit_disasm_helpers.c.

Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20250522013813.125428-1-mykyta.yatsenko5@gmail.com
2025-05-22 09:31:03 -07:00
Florian Westphal
98287045c9 selftests: netfilter: move fib vrf test to nft_fib.sh
It was located in conntrack_vrf.sh because that already had the VRF bits.
Lets not add to this and move it to nft_fib.sh where this belongs.

No functional changes for the subtest intended.
The subtest is limited, it only covered 'fib oif'
(route output interface query) when the incoming interface is part
of a VRF.

Next we can extend it to cover 'fib type' for VRFs and also check fib
results when there is an unrelated VRF in same netns.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-05-22 17:16:02 +02:00
Florian Westphal
839340f7c7 selftests: netfilter: nft_fib.sh: add 'type' mode tests
fib can either lookup the interface id/name of the output interface that
would be used for the given address, or it can check for the type of the
address according to the fib, e.g. local, unicast, multicast and so on.

This can be used to e.g. make a locally configured address only reachable
through its interface.

Example: given eth0:10.1.1.1 and eth1:10.1.2.1 then 'fib daddr type' for
10.1.1.1 arriving on eth1 will be 'local', but 'fib daddr . iif type' is
expected to return 'unicast', whereas 'fib daddr' and 'fib daddr . iif'
are expected to indicate 'local' if such a packet arrives on eth0.

So far nft_fib.sh only covered oif/oifname, not type.

Repeat tests both with default and a policy (ip rule) based setup.

Also try to run all remaining tests even if a subtest has failed.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-05-22 17:16:02 +02:00
Florian Westphal
d31c1cafc4 selftests: netfilter: nft_concat_range.sh: add coverage for 4bit group representation
Pipapo supports a more compact '4 bit group' format that is chosen when
the memory needed for the default exceeds a threshold (2mb).

Add coverage for those code paths, the existing tests use small sets that
are handled by the default representation.

This comes with a test script run-time increase, but I think its ok:

 normal: 2m35s -> 3m9s
 debug:  3m24s -> 5m29s (with KSFT_MACHINE_SLOW=yes).

Cc: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-05-22 17:16:01 +02:00
Greg Kroah-Hartman
0ca7cb7089 Merge tag 'iio-for-6.16a-take2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next
Jonathan writes:

IIO: New device support, features and cleanup for 6.16 - take 2

Note - last minute rebase was to drop a typo patch that I'd accidentally
picked up (in the microblaze arch Kconfig)
Take 2 is due to that rebase messing up some fixes tags that were
referring to patches after that point.

There is a known merge conflict due to changes in neighbouring lines.

Stephen's resolution in linux-next is:
https://lore.kernel.org/linux-next/20250506155728.65605bae@canb.auug.org.au/

Added 3 named IIO reviewers to MAINTAINERS. This is a reflection of those
who have been doing much of this work for some time. Lars-Peter is
removed from the entry having moved on to other topics.  Thanks
Nuno, David and Andy for stepping up and Lars-Peter for all your
hard work in the past!

Includes the usual mix of new device support, features and general
cleanup.

This time we also have some tree wide changes.

- Rip out the iio_device_claim_direct_scoped() as it proved hard to work
  with.  This series includes quite a few related cleanups such as use
  of guard or factoring code out to allow direct returns.
- Switch from iio_device_claim/release_direct_mode() to new
  iio_device_claim/release_direct() which is structured so that sparse
  can warn on failed releases. There were a few false positives but
  those were mostly in code that benefited from being cleaned up as part
  of this process.
- Introduce iio_push_to_buffers_with_ts() to replace the _timestamp()
  version over time. This version takes the size of the supplied buffer
  which the core checks is at least as big as expected by calculation
  from channel descriptions of those channels enabled. Use this in
  an initial set of drivers.
- Add macros for IIO_DECLARE_BUFFER_WITH_TS() and
  IIO_DECLARE_DMA_BUFFER_WITH_TS() to avoid lots of fiddly code to ensure
  correctly aligned buffers for timestamps being added onto the end of
  channel data.

New device support
------------------

adi,ad3530r
- New driver for AD3530, AD3530R, AD3531 and AD3531R DACs with
  programmable gain controls. R variants have internal references.
adi,ad7476
- Add support (dt compatible only) for the Rohm BU79100G ADC which is
  fully compatible with the ti,ads7866.
adi,ad7606
- Support ad7606c-16 and ad7606c-18 devices. Includes switch to dynamic
  channel information allocation.
adi,ad7380
- Add support for the AD7389-4
dfrobot,sen0322
- New driver for this oxygen sensor.
mediatek,mt2701-auxadc
- Add binding for MT6893 which is fully compatible with already supported
  MT8173.
meson-saradc
- Support the GXLX SoCs.  Mostly this is a workaround for some unrelated
  clock control bits found in the ADC register map.
nuvoton,nct7201
- New driver for NCT7201 and NCT7202 I2C ADCs.
rohm,bd79124
- New driver for this 12-bit, 8-channel SAR ADC.
- Switch to new set_rv etc gpio callbacks that were added in 6.15.
rohm,bd79703
- Add support for BD79700, BD79701 and BD79702 DACs that have subsets of
  functionality of the already supported bd79703.  Included making this
  driver suitable for support device variants.
st,stm32-lptimer
- Add support for stm32pm25 to this trigger.

Features
--------

Beyond IIO
- Property iterator for named children.
core
- Enable writes for 64 bit integers used for standard IIO ABI elements.
  Previously these could be read only.
- Helper library that should avoid code duplication for simpler ADC
  bindings that have a child node per channel.
- Enforce that IIO_DMA_MINALIGN is always at least 8 (almost always true
  and simplifies code on all significant architectures)
core/backend
- Add support to control source of data - useful when the HDL includes
  things like generated ramps for testing purposes. Enable this for
  adi-axi-dac
adi,ad3552-hs
- Add debugfs related callbacks to allow debug access to register contents.
adi,ad4000
- Support SPI offload with appropriate FPGA firmware along with improving
  documentation.
adi,ad7293
- Add support for external reference voltage.
adi,ad7606
- Support SPI offload.
adi,ad7768-1
- Support reset GPIO.
adi,admv8818
- Support filter frequencies beyond 2^32.
adi,adxl345
- Add single and double tap events.
hid-sensor-prox
- Support 16-bit report sizes as seen on some Intel platforms.
invensense,icm42600
- Enable use of named interrupts to avoid problems with some wiring choices.
  Get the interrupt by name, but fallback to previous assumption on the first
  being INT1 if no names are supplied.
microchip,mcp3911
- Add reset gpio support.
rohm,bh7150
- Add reset gpio support.
st,stm32
- Add support to control oversampling.
ti,adc128s052
- Add support for ROHM BD79104 which is early compatible with the TI
  parts already supported by this driver. Includes some general driver
  cleanup and a separate dt binding.
- Simplify reference voltage handling by assuming it is fixed after enabling
  the supply.
winsen,mhz19b
- New driver for this C02 sensor.

Cleanup and minor fixes
-----------------------

dt-bindings
- Correct indentation and style for DTS examples.
- Use unevalutateProperties for SPI devices instead of additionalProperties
  to allow generic SPI properties from spi-peripheral-props.yaml
ABI Docs
- Add missing docs for sampling_frequency when it applies only to events.
Treewide
- Various minor tweaks, comment fixes and similar.
- Sort TI ADCs in Kconfig that had gotten out of order.
- Switch various drives that provide GPIO chip functionality to the new
  callbacks with return values.
- Standardize on { } formatting for all array sentinels.
- Make use of aligned_s64 in a few places to replace either wrong types
  or manually defined equivalents.
- Drop places where spi bits_per_word is set to 8 because that is the
  default anyway.

adi,ad_sigma_delta library
- Avoid a potential use of uninitialized data if reg_size has a value
  that is not supported (no drivers hit this but it is reasonable hardening)
adi,ad4030
- Add error checking for scan types and no longer store it in state.
- Rework code to reduce duplication.
- Move setting the mode from buffer preenable() to update_scan_mode(),
  better matching expected semantics of the two different callbacks.
- Improve data marshalling comments.
adi,ad4695
- Use u16 for buffer elements as oversampling is not yet supported except
  with SPI offload (which doesn't use this path).
adi,ad5592r
- Clean up destruction of mutexes.
- Use lock guards to simplify code (later patch fixes a missed unlock)
adi,ad5933
- Correct some incorrect settling times.
adi,ad7091
- Deduplicate handling of writable vs volatile registers as they are the
  inverse of each other for this device.
adi,ad7124
- Fix 3db Filter frequency.
- Remove ability to directly write the filter frequency (which was broken)
- Register naming improvements.
adi,ad7606
- Add a missing return value check.
- Fill in max sampling rates for all chips.
- Use devm_mutex_init()
- Fix up some kernel-doc formatting issues.
- Remove some camel case that snuck in.
- Drop setting address field in channels as easily established from other
  fields.
- Drop unnecessary parameter to ad76060_scale_setup_cb_t.
adi,ad7768-1
- Convert to regmap.
- Factor out buffer allocation.
- Tidy up headers.
adi,ad7944
- Stop setting bits_per_word in SPI xfers with no data.
adi,ad9832
- Add of_device_id table rather than just relying on fallbacks.
- Use FIELD_PREP() to set values of fields.
adi,admv1013
- Cleanup a pointless ternary.
adi,admv8818
- Fix up LPF Band 5 frequency which was slightly wrong.
- Fix an integer overflow.
- Fix range calculation
adi,adt7316
- Replace irqd_get_trigger_type(irq_get_irq_data()) with simpler
  irq_get_trigger_type()
adi,adxl345
- Use regmap cache instead of various state variables that were there to
  reduce bus accesses.
- Make regmap return value checking consistent across all call sites.
adi,axi-dac
- Add a check on number of channels (0 to 15 valid)
allwinner,sun20i
- Use new adc-helpers to replace local parsing code for channel nodes.
bosch,bmp290
- Move to local variables for sensor data marshalling removing the need
  for a messy definition that has to work for all supported parts.
  Follow up fix adds a missing initialization.
dynaimage,al3010 and dynaimage,al3320a
- Various minor cleanup to bring these drivers inline with reviewed feedback
  given on a new driver.
- Fix an error path in which power down is not called when it should be.
- Switch to regmap.
google,cros_ec
- Fix up a flexible array in middle of structure warning.
- Flush fifo when changing the timeout to avoid potential long wait
  for samples.
hid-sensor-rotation
- Remove an __aligned(16) marking that doesn't seem to be justified.
kionix,kxcjk-1013
- Deduplicate code for setting up interrupts.
microchip,mcp3911
- Fix handling of conversion results register which differs across supported
  devices.
idt,zopt2201
- Avoid duplicating register lists as all volatile registers are the
  inverse of writeable registers on this device.
renesas,rzg2l
- Use new adc-helpers to replace local parsing code for channel nodes.
ti,ads1298
- Fix a missing Kconfig dependency.

* tag 'iio-for-6.16a-take2' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (260 commits)
  dt-bindings: iio: adc: Add ROHM BD79100G
  iio: adc: add support for Nuvoton NCT7201
  dt-bindings: iio: adc: add NCT7201 ADCs
  iio: chemical: Add driver for SEN0322
  dt-bindings: trivial-devices: Document SEN0322
  iio: adc: ad7768-1: reorganize driver headers
  iio: bmp280: zero-init buffer
  iio: ssp_sensors: optimalize -> optimize
  HID: sensor-hub: Fix typo and improve documentation
  iio: admv1013: replace redundant ternary operator with just len
  iio: chemical: mhz19b: Fix error code in probe()
  iio: adc: at91-sama5d2: use IIO_DECLARE_BUFFER_WITH_TS
  iio: accel: sca3300: use IIO_DECLARE_BUFFER_WITH_TS
  iio: adc: ad7380: use IIO_DECLARE_DMA_BUFFER_WITH_TS
  iio: adc: ad4695: rename AD4695_MAX_VIN_CHANNELS
  iio: adc: ad4695: use IIO_DECLARE_DMA_BUFFER_WITH_TS
  iio: introduce IIO_DECLARE_BUFFER_WITH_TS macros
  iio: make IIO_DMA_MINALIGN minimum of 8 bytes
  iio: pressure: zpa2326_spi: remove bits_per_word = 8
  iio: pressure: ms5611_spi: remove bits_per_word = 8
  ...
2025-05-22 15:54:52 +02:00
Miguel Ojeda
cbeaa41dfe objtool/rust: relax slice condition to cover more noreturn Rust functions
Developers are indeed hitting other of the `noreturn` slice symbols in
Nova [1], thus relax the last check in the list so that we catch all of
them, i.e.

    *_4core5slice5index22slice_index_order_fail
    *_4core5slice5index24slice_end_index_len_fail
    *_4core5slice5index26slice_start_index_len_fail
    *_4core5slice5index29slice_end_index_overflow_fail
    *_4core5slice5index31slice_start_index_overflow_fail

These all exist since at least Rust 1.78.0, thus backport it too.

See commit 56d680dd23 ("objtool/rust: list `noreturn` Rust functions")
for more details.

Cc: stable@vger.kernel.org # Needed in 6.12.y and later.
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Timur Tabi <ttabi@nvidia.com>
Cc: Kane York <kanepyork@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Reported-by: Joel Fernandes <joelagnelf@nvidia.com>
Fixes: 56d680dd23 ("objtool/rust: list `noreturn` Rust functions")
Closes: https://lore.kernel.org/rust-for-linux/20250513180757.GA1295002@joelnvbox/ [1]
Tested-by: Joel Fernandes <joelagnelf@nvidia.com>
Link: https://lore.kernel.org/r/20250520185555.825242-1-ojeda@kernel.org
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2025-05-22 12:00:58 +02:00
Cong Wang
c3572acffb selftests/tc-testing: Add an HFSC qlen accounting test
This test reproduces a scenario where HFSC queue length and backlog accounting
can become inconsistent when a peek operation triggers a dequeue and possible
drop before the parent qdisc updates its counters. The test sets up a DRR root
qdisc with an HFSC class, netem, and blackhole children, and uses Scapy to
inject a packet. It helps to verify that HFSC correctly tracks qlen and backlog
even when packets are dropped during peek-induced dequeue.

Cc: Mingi Cho <mincho@theori.io>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250518222038.58538-3-xiyou.wangcong@gmail.com
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-05-22 11:16:51 +02:00