Commit Graph

49385 Commits

Author SHA1 Message Date
Howard Chu
d796c51ee5 perf test trace: Remove set -e for BTF general tests
Remove set -e and print error messages in BTF general tests.

Before:
    $ sudo /tmp/perf test btf -vv
    108: perf trace BTF general tests:
    108: perf trace BTF general tests                                    : Running
    --- start ---
    test child forked, pid 889299
    Checking if vmlinux BTF exists
    Testing perf trace's string augmentation
    String augmentation test failed
    ---- end(-1) ----
    108: perf trace BTF general tests                                    : FAILED!

After:
    $ sudo /tmp/perf test btf -vv
    108: perf trace BTF general tests:
    108: perf trace BTF general tests                                    : Running
    --- start ---
    test child forked, pid 886551
    Checking if vmlinux BTF exists
    Testing perf trace's string augmentation
    String augmentation test failed, output:
    :886566/886566 renameat2(CWD, "/tmp/file1_RcMa", CWD, "/tmp/file2_RcMa", NOREPLACE) = 0---- end(-1) ----
    108: perf trace BTF general tests                                    : FAILED!

Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250528191148.89118-5-howardchu95@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:19 -07:00
Howard Chu
fc4a0ae7e1 perf test trace: Stop tracing hrtimer_setup event in trace enum test
The event 'timer:hrtimer_setup' is relatively new, for older kernels,
perf trace enum tests won't run as the event 'timer:hrtimer_setup'
cannot be found.

It was originally called 'timer:hrtimer_init', before being renamed in:

commit 244132c4e5 ("tracing/timers: Rename the hrtimer_init event to hrtimer_setup")

Using timer:hrtimer_start should be enough for current testing, and
hopefully 'start' won't be renamed in the future.

Before:
    $ sudo /tmp/perf test enum -vv
    107: perf trace enum augmentation tests:
    107: perf trace enum augmentation tests                              : Running
    --- start ---
    test child forked, pid 786187
    Checking if vmlinux exists
    Tracing syscall landlock_add_rule
    Tracing non-syscall tracepoint timer:hrtimer_setup,timer:hrtimer_start
    [tracepoint failure] Failed to trace timer:hrtimer_setup,timer:hrtimer_start tracepoint, output:
    event syntax error: 'timer:hrtimer_setup,timer:hrtimer_start'
			 \___ unknown tracepoint

    Error:  File /sys/kernel/tracing//events/timer/hrtimer_setup not found.
    Hint:   Perhaps this kernel misses some CONFIG_ setting to enable this feature?.

    Run 'perf list' for a list of valid events

     Usage: perf trace [<options>] [<command>]
	or: perf trace [<options>] -- <command> [<options>]
	or: perf trace record [<options>] [<command>]
	or: perf trace record [<options>] -- <command> [<options>]

	-e, --event <event>   event/syscall selector. use 'perf list' to list available events
    ---- end(-1) ----
    107: perf trace enum augmentation tests                              : FAILED!

After:
    $ sudo /tmp/perf test enum -vv
    107: perf trace enum augmentation tests:
    107: perf trace enum augmentation tests                              : Running
    --- start ---
    test child forked, pid 808547
    Checking if vmlinux exists
    Tracing syscall landlock_add_rule
    Tracing non-syscall tracepoint timer:hrtimer_start
    ---- end(0) ----
    107: perf trace enum augmentation tests                              : Ok

Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250528191148.89118-4-howardchu95@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:19 -07:00
Howard Chu
78fc8bfe44 perf test trace: Remove set -e and print trace test's error messages
Currently perf test utilizes the set -e option in shell that exit
immediately if a command exits with a non-zero status, this prevents
further error handling and introduces ambiguity. This patch removes set
-e and prints the error message after invoking perf trace during perf
tests.

In my case, the command that exits with a non-zero status is perf
trace instead of grep, because it can't find the 'timer:hrtimer_setup'
tracepoint, see below.

Before:
    $ sudo /tmp/perf test enum -vv
    107: perf trace enum augmentation tests:
    107: perf trace enum augmentation tests                              : Running
    --- start ---
    test child forked, pid 783533
    Checking if vmlinux exists
    Tracing syscall landlock_add_rule
    Tracing non-syscall tracepoint syscall
    ---- end(-1) ----
    107: perf trace enum augmentation tests                              : FAILED!

After:
    $ sudo /tmp/perf test enum -vv
    107: perf trace enum augmentation tests:
    107: perf trace enum augmentation tests                              : Running
    --- start ---
    test child forked, pid 851658
    Checking if vmlinux exists
    Tracing syscall landlock_add_rule
    Tracing non-syscall tracepoint timer:hrtimer_setup,timer:hrtimer_start
    [tracepoint failure] Failed to trace tracepoint timer:hrtimer_setup,timer:hrtimer_start, output:
    event syntax error: 'timer:hrtimer_setup,timer:hrtimer_start'
			 \___ unknown tracepoint

    Error:  File /sys/kernel/tracing//events/timer/hrtimer_setup not found.
    Hint:   Perhaps this kernel misses some CONFIG_ setting to enable this feature?.

    Run 'perf list' for a list of valid events

     Usage: perf trace [<options>] [<command>]
	or: perf trace [<options>] -- <command> [<options>]
	or: perf trace record [<options>] [<command>]
	or: perf trace record [<options>] -- <command> [<options>]

	-e, --event <event>   event/syscall selector. use 'perf list' to list available events---- end(-1) ----
    107: perf trace enum augmentation tests                              : FAILED!

Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250528191148.89118-3-howardchu95@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:19 -07:00
Howard Chu
6612d4d491 perf test trace: Use shell's -f flag to check if vmlinux exists
To match the style of the existing codebase, no functional changes
were applied.

Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250528191148.89118-2-howardchu95@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:19 -07:00
Howard Chu
46e34646ae perf trace: Remove --map-dump documentation
The --map-dump option was removed in 5e6da6be30 ("perf trace: Migrate
BPF augmentation to use a skeleton"), this patch removes its remaining
documentation.

Fixes: 5e6da6be30 ("perf trace: Migrate BPF augmentation to use a skeleton")
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Link: https://lore.kernel.org/r/20250601173252.717780-1-howardchu95@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:19 -07:00
Ian Rogers
5ae6a303c2 tools/build: Remove some unused libbpf pre-1.0 feature test logic
Commit 76a97cf2e1 ("perf build: Remove libbpf pre-1.0 feature
tests") removed the libbpf feature test logic used by perf in favor of
using LIBBPF_MAJOR_VERSION. Remove some build targets that should have
been removed as part of that clean up.

Fixes: 76a97cf2e1 ("perf build: Remove libbpf pre-1.0 feature tests")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250603221358.2562167-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:19 -07:00
Ian Rogers
5128492b2b perf thread_map: Remove uid options
Now the target doesn't have a uid, it is handled through BPF filters,
remove the uid options to thread_map creation. Tidy up the functions
used in tests to avoid passing unused arguments.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:18 -07:00
Ian Rogers
b4c658d4d6 perf target: Remove uid from target
Gathering threads with a uid by scanning /proc is inherently racy
leading to perf_event_open failures that quit perf. All users of the
functionality now use BPF filters, so remove uid and uid_str from
target.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:18 -07:00
Ian Rogers
278538ddf1 perf bench evlist-open-close: Switch user option to use BPF filter
Finding user processes by scanning /proc is inherently racy and
results in perf_event_open failures. Use a BPF filter to drop samples
where the uid doesn't match.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:18 -07:00
Ian Rogers
bf1976dd28 perf trace: Switch user option to use BPF filter
Finding user processes by scanning /proc is inherently racy and
results in perf_event_open failures. Use a BPF filter to drop samples
where the uid doesn't match. Ensure adding the BPF filter forces
system-wide.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:18 -07:00
Ian Rogers
38f83cc9ab perf top: Switch user option to use BPF filter
Finding user processes by scanning /proc is inherently racy and
results in perf_event_open failures. Use a BPF filter to drop samples
where the uid doesn't match.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:18 -07:00
Ian Rogers
c54e2f8272 perf tests record: Add basic uid filtering test
Based on the system-wide test with changes around how failure is
handled as BPF permissions are a bigger issue than perf event
paranoia.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:18 -07:00
Ian Rogers
1151208e70 perf record: Switch user option to use BPF filter
Finding user processes by scanning /proc is inherently racy and
results in perf_event_open failures. Use a BPF filter to drop samples
where the uid doesn't match. Ensure adding the BPF filter forces
system-wide.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:17 -07:00
Ian Rogers
466db4275e perf parse-events: Add parse_uid_filter helper
Add parse_uid_filter filter as a helper to parse_filter, that
constructs a uid filter string. As uid filters don't work with
tracepoint filters, add a is_possible_tp_filter function so the
tracepoint filter isn't attempted for tracepoint evsels.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:17 -07:00
Ian Rogers
5ddf4c3a17 perf target: Separate parse_uid into its own function
Allow parse_uid to be called without a struct target. Rather than have
two errors, remove TARGET_ERRNO__USER_NOT_FOUND and use
TARGET_ERRNO__INVALID_UID as the handling is identical.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:17 -07:00
Ian Rogers
8b99e2f7a9 perf parse-events filter: Use evsel__find_pmu
Rather than manually scanning PMUs, use evsel__find_pmu that can use
the PMU set during event parsing.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174545.2853620-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:17 -07:00
Namhyung Kim
189a977e4d perf bpf-filter: Improve error messages
The BPF filter needs libbpf/BPF-skeleton support and root privilege.
Add error messages to help users understand the problem easily.

When it's not build with BPF support (make BUILD_BPF_SKEL=0).

  $ sudo perf record -e cycles --filter "pid != 0" true
  Error: BPF filter is requested but perf is not built with BPF.
  	Please make sure to build with libbpf and BPF skeleton.

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

          --filter <filter>
                            event filter

When it supports BPF but runs without root or CAP_BPF.  Note that it
also checks pinned BPF filters.

  $ perf record -e cycles --filter "pid != 0" -o /dev/null true
  Error: BPF filter only works for users with the CAP_BPF capability!
  	Please run 'perf record --setup-filter pin' as root first.

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

          --filter <filter>
                            event filter

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250604174835.1852481-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-09 11:18:17 -07:00
Francesco Poli (wintermute)
e044b8a954 cpupower: split unitdir from libdir in Makefile
Improve the installation procedure for the systemd service unit
'cpupower.service', to be more flexible. Some distros install libraries
to /usr/lib64/, but systemd service units have to be installed to
/usr/lib/systemd/system: as a consequence, the installation procedure
should not assume that systemd service units can be installed to
${libdir}/systemd/system ...
Define a dedicated variable ("unitdir") in the Makefile.

Link: https://lore.kernel.org/linux-pm/260b6d79-ab61-43b7-a0eb-813e257bc028@leemhuis.info/T/#m0601940ab439d5cbd288819d2af190ce59e810e6

Fixes: 9c70b779ad ("cpupower: add a systemd service to run cpupower")

Link: https://lore.kernel.org/r/20250521211656.65646-1-invernomuto@paranoici.org
Signed-off-by: Francesco Poli (wintermute) <invernomuto@paranoici.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2025-06-09 10:17:46 -06:00
Xin Li (Intel)
f287822688 selftests/x86: Add a test to detect infinite SIGTRAP handler loop
When FRED is enabled, if the Trap Flag (TF) is set without an external
debugger attached, it can lead to an infinite loop in the SIGTRAP
handler.  To avoid this, the software event flag in the augmented SS
must be cleared, ensuring that no single-step trap remains pending when
ERETU completes.

This test checks for that specific scenario—verifying whether the kernel
correctly prevents an infinite SIGTRAP loop in this edge case when FRED
is enabled.

The test should _always_ pass with IDT event delivery, thus no need to
disable the test even when FRED is not enabled.

Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Tested-by: Sohil Mehta <sohil.mehta@intel.com>
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250609084054.2083189-3-xin%40zytor.com
2025-06-09 08:52:06 -07:00
Linus Torvalds
939f15e640 Merge tag 'turbostat-2025.06.08' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat updates from Len Brown:

 - Add initial DMR support, which required smarter RAPL probe

 - Fix AMD MSR RAPL energy reporting

 - Add RAPL power limit configuration output

 - Minor fixes

* tag 'turbostat-2025.06.08' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: version 2025.06.08
  tools/power turbostat: Add initial support for BartlettLake
  tools/power turbostat: Add initial support for DMR
  tools/power turbostat: Dump RAPL sysfs info
  tools/power turbostat: Avoid probing the same perf counters
  tools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs cleared
  tools/power turbostat: Clean up add perf/msr counter logic
  tools/power turbostat: Introduce add_msr_counter()
  tools/power turbostat: Remove add_msr_perf_counter_()
  tools/power turbostat: Remove add_cstate_perf_counter_()
  tools/power turbostat: Remove add_rapl_perf_counter_()
  tools/power turbostat: Quit early for unsupported RAPL counters
  tools/power turbostat: Always check rapl_joules flag
  tools/power turbostat: Fix AMD package-energy reporting
  tools/power turbostat: Fix RAPL_GFX_ALL typo
  tools/power turbostat: Add Android support for MSR device handling
  tools/power turbostat.8: pm_domain wording fix
  tools/power turbostat.8: fix typo: idle_pct should be pct_idle
2025-06-08 11:44:41 -07:00
Len Brown
42fd37dcc4 tools/power turbostat: version 2025.06.08
Add initial DMR support, which required smarter RAPL probe
Fix AMD MSR RAPL energy reporting
Add RAPL power limit configuration output
Minor fixes

Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:17 -04:00
Zhang Rui
d8c0f5d973 tools/power turbostat: Add initial support for BartlettLake
Add initial support for BartlettLake.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:17 -04:00
Zhang Rui
83075bd59d tools/power turbostat: Add initial support for DMR
Add initial support for DMR.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
2a535d6cc3 tools/power turbostat: Dump RAPL sysfs info
for example:

intel-rapl:1: psys 28.0s:100W 976.0us:100W
intel-rapl:0: package-0 28.0s:57W,max:15W 2.4ms:57W
intel-rapl:0/intel-rapl:0:0: core disabled
intel-rapl:0/intel-rapl:0:1: uncore disabled
intel-rapl-mmio:0: package-0 28.0s:28W,max:15W 2.4ms:57W

[lenb: simplified format]

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>

squish me

Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
69078520fd tools/power turbostat: Avoid probing the same perf counters
For the RAPL package energy status counter, Intel and AMD share the same
perf_subsys and perf_name, but with different MSR addresses.

Both rapl_counter_arch_infos[0] and rapl_counter_arch_infos[1] are
introduced to describe this counter for different Vendors.

As a result, the perf counter is probed twice, and causes a failure in
in get_rapl_counters() because expected_read_size and actual_read_size
don't match.

Fix the problem by skipping the already probed counter.

Note, this is not a perfect fix. For example, if different
vendors/platforms use the same MSR value for different purpose, the code
can be fooled when it probes a rapl_counter_arch_infos[] entry that does
not belong to the running Vendor/Platform.

In a long run, better to put rapl_counter_arch_infos[] into the
platform_features so that this becomes Vendor/Platform specific.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
ff3d019e98 tools/power turbostat: Allow probing RAPL with platform_features->rapl_msrs cleared
platform_features->rapl_msrs describes the RAPL MSRs supported. While
RAPL Perf counters can be exposed from different kernel backend drivers,
e.g. RAPL MSR I/F driver, or RAPL TPMI I/F driver.

Thus, turbostat should first blindly probe all the available RAPL Perf
counters, and falls back to the RAPL MSR counters if they are listed in
platform_features->rapl_msrs.

With this, platforms that don't have RAPL MSRs can clear the
platform_features->rapl_msrs bits and use RAPL Perf counters only.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
0362337968 tools/power turbostat: Clean up add perf/msr counter logic
Increase the code readability by moving the no_perf/no_msr flag and the
cai->perf_name/cai->msr sanity checks into the counter probe functions.

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
1ab2e19b4c tools/power turbostat: Introduce add_msr_counter()
probe_rapl_msr() is reused for probing RAPL MSR counters, cstate MSR
counters and MPERF/APERF/SMI MSR counters, thus its name is misleading.

Similar to add_perf_counter(), introduce add_msr_counter() to probe a
counter via MSR. Introduce wrapper function add_rapl_msr_counter() at
the same time to add extra check for Zero return value for specified
RAPL counters.

No functional change intended.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
3403e89f97 tools/power turbostat: Remove add_msr_perf_counter_()
As the only caller of add_msr_perf_counter_(), add_msr_perf_counter()
just gives extra debug output on top. There is no need to keep both
functions.

Remove add_msr_perf_counter_() and move all the logic to
add_msr_perf_counter().

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
4d6ced7bef tools/power turbostat: Remove add_cstate_perf_counter_()
As the only caller of add_cstate_perf_counter_(),
add_cstate_perf_counter() just gives extra debug output on top. There is
no need to keep both functions.

Remove add_cstate_perf_counter_() and move all the logic to
add_cstate_perf_counter().

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
c8bca955da tools/power turbostat: Remove add_rapl_perf_counter_()
As the only caller of add_rapl_perf_counter_(), add_rapl_perf_counter()
just gives extra debug output on top. There is no need to keep both
functions.

Remove add_rapl_perf_counter_() and move all the logic to
add_rapl_perf_counter().

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
57b53787f0 tools/power turbostat: Quit early for unsupported RAPL counters
Quit early for unsupported RAPL counters.

No functional change.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Zhang Rui
fdea6b883b tools/power turbostat: Always check rapl_joules flag
rapl_joules bit should always be checked even if
platform_features->rapl_msrs is not set or no_msr flag is used.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Gautham R. Shenoy
adb49732c8 tools/power turbostat: Fix AMD package-energy reporting
commit 05a2f07db8 ("tools/power turbostat: read RAPL counters via
perf") that adds support to read RAPL counters via perf defines the
notion of a RAPL domain_id which is set to physical_core_id on
platforms which support per_core_rapl counters (Eg: AMD processors
Family 17h onwards) and is set to the physical_package_id on all the
other platforms.

However, the physical_core_id is only unique within a package and on
platforms with multiple packages more than one core can have the same
physical_core_id and thus the same domain_id. (For eg, the first cores
of each package have the physical_core_id = 0). This results in all
these cores with the same physical_core_id using the same entry in the
rapl_counter_info_perdomain[]. Since rapl_perf_init() skips the
perf-initialization for cores whose domain_ids have already been
visited, cores that have the same physical_core_id always read the
perf file corresponding to the physical_core_id of the first package
and thus the package-energy is incorrectly reported to be the same
value for different packages.

Note: This issue only arises when RAPL counters are read via perf and
not when they are read via MSRs since in the latter case the MSRs are
read separately on each core.

Fix this issue by associating each CPU with rapl_core_id which is
unique across all the packages in the system.

Fixes: 05a2f07db8 ("tools/power turbostat: read RAPL counters via perf")
Signed-off-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Kaushlendra Kumar
b4a734d383 tools/power turbostat: Fix RAPL_GFX_ALL typo
Fix typo in the currently unused RAPL_GFX_ALL macro definition.

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Kaushlendra Kumar
5663785ae0 tools/power turbostat: Add Android support for MSR device handling
It uses /dev/msrN device paths on Android instead of /dev/cpu/N/msr,
updates error messages and permission checks to reflect the Android
device path, and wraps platform-specific code with #if defined(ANDROID)
to ensure correct behavior on both Android and non-Android systems.
These changes improve compatibility and usability of turbostat on
Android devices.

Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Len Brown
c967900fcb tools/power turbostat.8: pm_domain wording fix
turbostat.8: clarify that uncore "domains" are Power Management domains,
aka pm_domains.

Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:16 -04:00
Len Brown
394c1127ab tools/power turbostat.8: fix typo: idle_pct should be pct_idle
idle_pct should be pct_idle

Signed-off-by: Len Brown <len.brown@intel.com>
2025-06-08 14:10:01 -04:00
Linus Torvalds
35b574a6c2 Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull mount fixes from Al Viro:
 "Various mount-related bugfixes:

   - split the do_move_mount() checks in subtree-of-our-ns and
     entire-anon cases and adapt detached mount propagation selftest for
     mount_setattr

   - allow clone_private_mount() for a path on real rootfs

   - fix a race in call of has_locked_children()

   - fix move_mount propagation graph breakage by MOVE_MOUNT_SET_GROUP

   - make sure clone_private_mnt() caller has CAP_SYS_ADMIN in the right
     userns

   - avoid false negatives in path_overmount()

   - don't leak MNT_LOCKED from parent to child in finish_automount()

   - do_change_type(): refuse to operate on unmounted/not ours mounts"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  do_change_type(): refuse to operate on unmounted/not ours mounts
  clone_private_mnt(): make sure that caller has CAP_SYS_ADMIN in the right userns
  selftests/mount_setattr: adapt detached mount propagation test
  do_move_mount(): split the checks in subtree-of-our-ns and entire-anon cases
  fs: allow clone_private_mount() for a path on real rootfs
  fix propagation graph breakage by MOVE_MOUNT_SET_GROUP move_mount(2)
  finish_automount(): don't leak MNT_LOCKED from parent to child
  path_overmount(): avoid false negatives
  fs/fhandle.c: fix a race in call of has_locked_children()
2025-06-08 10:35:12 -07:00
Linus Torvalds
bdc7f8c5ad Merge tag 'mm-stable-2025-06-06-16-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more MM updates from Andrew Morton:
 "The series 'Fix uprobe pte be overwritten when expanding vma' fixes a
  longstanding and quite obscure bug related to the vma merging of the
  uprobe mmap page"

* tag 'mm-stable-2025-06-06-16-09' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  selftests/mm: add test about uprobe pte be orphan during vma merge
  selftests/mm: extract read_sysfs and write_sysfs into vm_util
  mm: expose abnormal new_pte during move_ptes
  mm: fix uprobe pte be overwritten when expanding vma
  mm/damon: s/primitives/code/ on comments
2025-06-06 22:06:57 -07:00
Linus Torvalds
d3c82f618a Merge tag 'mm-hotfixes-stable-2025-06-06-16-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
 "13 hotfixes.

  6 are cc:stable and the remainder address post-6.15 issues or aren't
  considered necessary for -stable kernels. 11 are for MM"

* tag 'mm-hotfixes-stable-2025-06-06-16-02' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  kernel/rcu/tree_stall: add /sys/kernel/rcu_stall_count
  MAINTAINERS: add mm swap section
  kmsan: test: add module description
  MAINTAINERS: add tlb trace events to MMU GATHER AND TLB INVALIDATION
  mm/hugetlb: fix huge_pmd_unshare() vs GUP-fast race
  mm/hugetlb: unshare page tables during VMA split, not before
  MAINTAINERS: add Alistair as reviewer of mm memory policy
  iov_iter: use iov_offset for length calculation in iov_iter_aligned_bvec
  mm/mempolicy: fix incorrect freeing of wi_kobj
  alloc_tag: handle module codetag load errors as module load failures
  mm/madvise: handle madvise_lock() failure during race unwinding
  mm: fix vmstat after removing NR_BOUNCE
  KVM: s390: rename PROT_NONE to PROT_TYPE_DUMMY
2025-06-06 21:45:45 -07:00
Christian Brauner
7054674ee9 selftests/mount_setattr: adapt detached mount propagation test
Make sure that detached trees don't receive mount propagation.

Signed-off-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-06-07 00:41:29 -04:00
Yonghong Song
bbc7bd658d selftests/bpf: Fix a user_ringbuf failure with arm64 64KB page size
The ringbuf max_entries must be PAGE_ALIGNED. See kernel function
ringbuf_map_alloc(). So for arm64 64KB page size, adjust max_entries
properly.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250607013626.1553001-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-06-06 19:21:43 -07:00
Yonghong Song
8c8c5e3c85 selftests/bpf: Fix ringbuf/ringbuf_write test failure with arm64 64KB page size
The ringbuf max_entries must be PAGE_ALIGNED. See kernel function
ringbuf_map_alloc(). So for arm64 64KB page size, adjust max_entries
and other related metrics properly.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250607013621.1552332-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-06-06 19:21:43 -07:00
Yonghong Song
377d371590 selftests/bpf: Fix bpf_mod_race test failure with arm64 64KB page size
Currently, uffd_register.range.len is set to 4096 for command
'ioctl(uffd, UFFDIO_REGISTER, &uffd_register)'. For arm64 64KB page size,
the len must be 64KB size aligned as page size alignment is required.
See fs/userfaultfd.c:validate_unaligned_range().

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250607013615.1551783-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-06-06 19:21:43 -07:00
Yonghong Song
ae8824037a selftests/bpf: Reduce test_xdp_adjust_frags_tail_grow logs
For selftest xdp_adjust_tail/xdp_adjust_frags_tail_grow, if tested failure,
I see a long list of log output like

    ...
    test_xdp_adjust_frags_tail_grow:PASS:9Kb+10b-untouched 0 nsec
    test_xdp_adjust_frags_tail_grow:PASS:9Kb+10b-untouched 0 nsec
    test_xdp_adjust_frags_tail_grow:PASS:9Kb+10b-untouched 0 nsec
    test_xdp_adjust_frags_tail_grow:PASS:9Kb+10b-untouched 0 nsec
    ...

There are total 7374 lines of the above which is too much. Let us
only issue such logs when it is an assert failure.

Signed-off-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20250607013610.1551399-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-06-06 19:21:43 -07:00
Linus Torvalds
119b1e61a7 Merge tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:

 - Support for the FWFT SBI extension, which is part of SBI 3.0 and a
   dependency for many new SBI and ISA extensions

 - Support for getrandom() in the VDSO

 - Support for mseal

 - Optimized routines for raid6 syndrome and recovery calculations

 - kexec_file() supports loading Image-formatted kernel binaries

 - Improvements to the instruction patching framework to allow for
   atomic instruction patching, along with rules as to how systems need
   to behave in order to function correctly

 - Support for a handful of new ISA extensions: Svinval, Zicbop, Zabha,
   some SiFive vendor extensions

 - Various fixes and cleanups, including: misaligned access handling,
   perf symbol mangling, module loading, PUD THPs, and improved uaccess
   routines

* tag 'riscv-for-linus-6.16-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (69 commits)
  riscv: uaccess: Only restore the CSR_STATUS SUM bit
  RISC-V: vDSO: Wire up getrandom() vDSO implementation
  riscv: enable mseal sysmap for RV64
  raid6: Add RISC-V SIMD syndrome and recovery calculations
  riscv: mm: Add support for Svinval extension
  RISC-V: Documentation: Add enough title underlines to CMODX
  riscv: Improve Kconfig help for RISCV_ISA_V_PREEMPTIVE
  MAINTAINERS: Update Atish's email address
  riscv: uaccess: do not do misaligned accesses in get/put_user()
  riscv: process: use unsigned int instead of unsigned long for put_user()
  riscv: make unsafe user copy routines use existing assembly routines
  riscv: hwprobe: export Zabha extension
  riscv: Make regs_irqs_disabled() more clear
  perf symbols: Ignore mapping symbols on riscv
  RISC-V: Kconfig: Fix help text of CMDLINE_EXTEND
  riscv: module: Optimize PLT/GOT entry counting
  riscv: Add support for PUD THP
  riscv: xchg: Prefetch the destination word for sc.w
  riscv: Add ARCH_HAS_PREFETCH[W] support with Zicbop
  riscv: Add support for Zicbop
  ...
2025-06-06 18:05:18 -07:00
Andrii Nakryiko
02670deede libbpf: Handle unsupported mmap-based /sys/kernel/btf/vmlinux correctly
libbpf_err_ptr() helpers are meant to return NULL and set errno, if
there is an error. But btf_parse_raw_mmap() is meant to be used
internally and is expected to return ERR_PTR() values. Because of this
mismatch, when libbpf tries to mmap /sys/kernel/btf/vmlinux, we don't
detect the error correctly with IS_ERR() check, and never fallback to
old non-mmap-based way of loading vmlinux BTF.

Fix this by using proper ERR_PTR() returns internally.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes: 3c0421c93c ("libbpf: Use mmap to parse vmlinux BTF from sysfs")
Cc: Lorenz Bauer <lmb@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250606202134.2738910-1-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-06-06 14:07:07 -07:00
Linus Torvalds
6d8854216e Merge tag 'block-6.16-20250606' of git://git.kernel.dk/linux
Pull more block updates from Jens Axboe:

 - NVMe pull request via Christoph:
      - TCP error handling fix (Shin'ichiro Kawasaki)
      - TCP I/O stall handling fixes (Hannes Reinecke)
      - fix command limits status code (Keith Busch)
      - support vectored buffers also for passthrough (Pavel Begunkov)
      - spelling fixes (Yi Zhang)

 - MD pull request via Yu:
      - fix REQ_RAHEAD and REQ_NOWAIT IO err handling for raid1/10
      - fix max_write_behind setting for dm-raid
      - some minor cleanups

 - Integrity data direction fix and cleanup

 - bcache NULL pointer fix

 - Fix for loop missing write start/end handling

 - Decouple hardware queues and IO threads in ublk

 - Slew of ublk selftests additions and updates

* tag 'block-6.16-20250606' of git://git.kernel.dk/linux: (29 commits)
  nvme: spelling fixes
  nvme-tcp: fix I/O stalls on congested sockets
  nvme-tcp: sanitize request list handling
  nvme-tcp: remove tag set when second admin queue config fails
  nvme: enable vectored registered bufs for passthrough cmds
  nvme: fix implicit bool to flags conversion
  nvme: fix command limits status code
  selftests: ublk: kublk: improve behavior on init failure
  block: flip iter directions in blk_rq_integrity_map_user()
  block: drop direction param from bio_integrity_copy_user()
  selftests: ublk: cover PER_IO_DAEMON in more stress tests
  Documentation: ublk: document UBLK_F_PER_IO_DAEMON
  selftests: ublk: add stress test for per io daemons
  selftests: ublk: add functional test for per io daemons
  selftests: ublk: kublk: decouple ublk_queues from ublk server threads
  selftests: ublk: kublk: move per-thread data out of ublk_queue
  selftests: ublk: kublk: lift queue initialization out of thread
  selftests: ublk: kublk: tie sqe allocation to io instead of queue
  selftests: ublk: kublk: plumb q_id in io_uring user_data
  ublk: have a per-io daemon instead of a per-queue daemon
  ...
2025-06-06 13:12:50 -07:00
Linus Torvalds
c26f4fbd58 Merge tag 'char-misc-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char / misc / iio driver updates from Greg KH:
 "Here is the big char/misc/iio and other small driver subsystem pull
  request for 6.16-rc1.

  Overall, a lot of individual changes, but nothing major, just the
  normal constant forward progress of new device support and cleanups to
  existing subsystems. Highlights in here are:

   - Large IIO driver updates and additions and device tree changes

   - Android binder bugfixes and logfile fixes

   - mhi driver updates

   - comedi driver updates

   - counter driver updates and additions

   - coresight driver updates and additions

   - echo driver removal as there are no in-kernel users of it

   - nvmem driver updates

   - spmi driver updates

   - new amd-sbi driver "subsystem" and drivers added

   - rust miscdriver binding documentation fix

   - other small driver fixes and updates (uio, w1, acrn, hpet,
     xillybus, cardreader drivers, fastrpc and others)

  All of these have been in linux-next for quite a while with no
  reported problems"

* tag 'char-misc-6.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (390 commits)
  binder: fix yet another UAF in binder_devices
  counter: microchip-tcb-capture: Add watch validation support
  dt-bindings: iio: adc: Add ROHM BD79100G
  iio: adc: add support for Nuvoton NCT7201
  dt-bindings: iio: adc: add NCT7201 ADCs
  iio: chemical: Add driver for SEN0322
  dt-bindings: trivial-devices: Document SEN0322
  iio: adc: ad7768-1: reorganize driver headers
  iio: bmp280: zero-init buffer
  iio: ssp_sensors: optimalize -> optimize
  HID: sensor-hub: Fix typo and improve documentation
  iio: admv1013: replace redundant ternary operator with just len
  iio: chemical: mhz19b: Fix error code in probe()
  iio: adc: at91-sama5d2: use IIO_DECLARE_BUFFER_WITH_TS
  iio: accel: sca3300: use IIO_DECLARE_BUFFER_WITH_TS
  iio: adc: ad7380: use IIO_DECLARE_DMA_BUFFER_WITH_TS
  iio: adc: ad4695: rename AD4695_MAX_VIN_CHANNELS
  iio: adc: ad4695: use IIO_DECLARE_DMA_BUFFER_WITH_TS
  iio: introduce IIO_DECLARE_BUFFER_WITH_TS macros
  iio: make IIO_DMA_MINALIGN minimum of 8 bytes
  ...
2025-06-06 11:50:47 -07:00