mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2026-05-10 09:09:55 -04:00
aea3496bbc7c20037ecc35411264de109d700fc7
1352792 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
aea3496bbc |
perf tools: Fix arm64 source package build
Syscall tables are generated from rules in the kernel tree. Add the
related files to the MANIFEST to fix the Perf source package build.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes:
|
||
|
|
3eb5c49f71 |
perf test: Hybrid improvements for metric value validation test
On my alderlake I currently see for the "perf metrics value validation" test:
```
Total Test Count: 142
Passed Test Count: 139
[
Metric Relationship Error: The collected value of metric ['tma_fetch_latency', 'tma_fetch_bandwidth', 'tma_frontend_bound']
is [31.137028] in workload(s): ['perf bench futex hash -r 2 -s']
but expected value range is [tma_frontend_bound, tma_frontend_bound]
Relationship rule description: 'Sum of the level 2 children should equal level 1 parent',
Metric Relationship Error: The collected value of metric ['tma_memory_bound', 'tma_core_bound', 'tma_backend_bound']
is [6.564442] in workload(s): ['perf bench futex hash -r 2 -s']
but expected value range is [tma_backend_bound, tma_backend_bound]
Relationship rule description: 'Sum of the level 2 children should equal level 1 parent',
Metric Relationship Error: The collected value of metric ['tma_light_operations', 'tma_heavy_operations', 'tma_retiring']
is [57.806179] in workload(s): ['perf bench futex hash -r 2 -s']
but expected value range is [tma_retiring, tma_retiring]
Relationship rule description: 'Sum of the level 2 children should equal level 1 parent']
Metric validation return with erros. Please check metrics reported with errors.
```
I suspect it is due to two metrics for different CPU types being
enabled. Add a -cputype option to avoid this. The test still fails with:
```
Total Test Count: 115
Passed Test Count: 114
[
Wrong Metric Value Error: The collected value of metric ['tma_l2_hit_latency']
is [117.947088] in workload(s): ['perf bench futex hash -r 2 -s']
but expected value range is [0, 100]]
Metric validation return with errors. Please check metrics reported with errors.
```
which is a reproducible genuine error and likely requires a metric fix.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250512184700.11691-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
7f84f67418 |
perf list: Display the PMU name associated with a perf metric in JSON
The 'perf stat --cputype' option can be used to filter which metrics
will be applied, for this reason the JSON metrics have an associated
PMU.
List this PMU name in the 'perf list' output in JSON mode so that
tooling may access it.
An example of the new field is:
```
{
"MetricGroup": "Backend",
"MetricName": "tma_core_bound",
"MetricExpr": "max(0, tma_backend_bound - tma_memory_bound)",
"MetricThreshold": "tma_core_bound > 0.1 & tma_backend_bound > 0.2",
"ScaleUnit": "100%",
"BriefDescription": "This metric represents fraction of slots where ...
"PublicDescription": "This metric represents fraction of slots where ...
"Unit": "cpu_core"
},
```
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250512184700.11691-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
4102ff8b1f |
perf metricgroup: Binary search when resolving referred to metrics
Unlike with events, metrics can be matched by name or a list of metric groups. However, when a metric refers to another metric it isn't referring to a group but the singular metric in question. Prior to this change every "id" in a metric expression is checked to see if it is a metric by scanning all the metrics in the metrics table. As the table is sorted my metric name we can speed the search in the resolution case by binary searching for the metric. Rename some of the metricgroup functions to make it clearer whether they match a metric by name or by both name and group. Before: ``` $ time perf test -v 10 10: PMU JSON event tests : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 10.5: Parsing of metric thresholds with fake PMUs : Ok real 0m15.972s user 0m13.176s sys 0m3.001s ``` After: ``` $ time perf test -v 10 10: PMU JSON event tests : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 10.5: Parsing of metric thresholds with fake PMUs : Ok real 0m5.343s user 0m1.871s sys 0m2.128s ``` Committer testing: root@number:~# grep -m1 'model name' /proc/cpuinfo model name : AMD Ryzen 9 9950X3D 16-Core Processor root@number:~# Before: root@number:~# time perf test "Parsing of PMU event table metrics" 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok real 0m9.286s user 0m9.354s sys 0m0.062s root@number:~# After: root@number:~# time perf test "Parsing of PMU event table metrics" 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok real 0m0.689s user 0m0.766s sys 0m0.042s root@number:~# time perf test 10 10: PMU JSON event tests : 10.1: PMU event table sanity : Ok 10.2: PMU event map aliases : Ok 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok 10.5: Parsing of metric thresholds with fake PMUs : Ok real 0m0.696s user 0m0.807s sys 0m0.064s root@number:~# Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20250512194622.33258-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
754baf426e |
perf pmu: Change aliases from list to hashmap
Finding an alias for things like perf_pmu__have_event() would need to search the aliases list, whilst this happens relatively infrequently it can be a significant overhead in testing. Switch to using a hashmap. Move common initialization code to perf_pmu__init(). Refactor the test 'struct perf_pmu_test_pmu' to not have perf pmu within it to better support the perf_pmu__init() function. Before: ``` $ time perf test "Parsing of PMU event table metrics" 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok real 0m13.287s user 0m13.026s sys 0m0.532s ``` After: ``` $ time perf test "Parsing of PMU event table metrics" 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok real 0m13.011s user 0m12.885s sys 0m0.485s ``` Committer testing: root@number:~# grep -m1 'model name' /proc/cpuinfo model name : AMD Ryzen 9 9950X3D 16-Core Processor root@number:~# Before: root@number:~# time perf test "Parsing of PMU event table metrics" 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok real 0m9.296s user 0m9.361s sys 0m0.063s root@number:~# After: root@number:~# time perf test "Parsing of PMU event table metrics" 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok real 0m9.286s user 0m9.354s sys 0m0.062s root@number:~# Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20250512194622.33258-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
375368a961 |
perf fncache: Switch to using hashmap
The existing fncache can get large in testing situations. As the bucket array is a fixed size this leads to it degrading to O(n) performance. Use a regular hashmap that can dynamically reallocate its array. Before: ``` $ time perf test "Parsing of PMU event table metrics" 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok real 0m14.132s user 0m17.806s sys 0m0.557s ``` After: ``` $ time perf test "Parsing of PMU event table metrics" 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok real 0m13.287s user 0m13.026s sys 0m0.532s ``` Committer notes: root@number:~# grep -m1 'model name' /proc/cpuinfo model name : AMD Ryzen 9 9950X3D 16-Core Processor root@number:~# Before: root@number:~# time perf test "Parsing of PMU event table metrics" 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok real 0m9.277s user 0m9.979s sys 0m0.055s root@number:~# After: root@number:~# time perf test "Parsing of PMU event table metrics" 10.3: Parsing of PMU event table metrics : Ok 10.4: Parsing of PMU event table metrics with fake PMUs : Ok real 0m9.296s user 0m9.361s sys 0m0.063s root@number:~# Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Xu Yang <xu.yang_2@nxp.com> Link: https://lore.kernel.org/r/20250512194622.33258-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
f3061d5267 |
perf tests: Harden branch stack sampling test
On continuous testing the perf script output can be empty, or nearly empty, causing tr/grep to exit and due to "set -e" the test traps and fails. Add some empty file handling that sets the test to skip and make grep and other text rewriting failures non-fatal by adding "|| true". Committer testing: root@number:~# grep -m1 "model name" /proc/cpuinfo model name : AMD Ryzen 9 9950X3D 16-Core Processor root@number:~# perf test "Check branch stack sampling" 104: Check branch stack sampling : Ok root@number:~# root@number:~# perf test -vvvvvvv "Check branch stack sampling" 104: Check branch stack sampling: --- start --- test child forked, pid 396047 142d22-142da0 l brstack_bench perf does have symbol 'brstack_bench' Testing user branch stack sampling Testing branch stack filtering permutation (any_call,CALL|IND_CALL|COND_CALL|SYSCALL|IRQ) Testing branch stack filtering permutation (call,CALL|SYSCALL) Testing branch stack filtering permutation (cond,COND) Testing branch stack filtering permutation (any_ret,RET|COND_RET|SYSRET|ERET) Testing branch stack filtering permutation (call,cond,CALL|SYSCALL|COND) Testing branch stack filtering permutation (any_call,cond,CALL|IND_CALL|COND_CALL|IRQ|SYSCALL|COND) Testing branch stack filtering permutation (cond,any_call,any_ret,COND|CALL|IND_CALL|COND_CALL|SYSCALL|IRQ|RET|COND_RET|SYSRET|ERET) ---- end(0) ---- 104: Check branch stack sampling : Ok root@number:~# Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250318161639.34446-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
255f5b6d06 |
perf parse-events: Add "cpu" term to set the CPU an event is recorded on
The -C option allows the CPUs for a list of events to be specified but
its not possible to set the CPU for a single event. Add a term to
allow this. The term isn't a general CPU list due to ',' already being
a special character in event parsing instead multiple cpu= terms may
be provided and they will be merged/unioned together.
An example of mixing different types of events counted on different CPUs:
```
$ perf stat -A -C 0,4-5,8 -e "instructions/cpu=0/,l1d-misses/cpu=4,cpu=5/,inst_retired.any/cpu=8/,cycles" -a sleep 0.1
Performance counter stats for 'system wide':
CPU0 6,979,225 instructions/cpu=0/ # 0.89 insn per cycle
CPU4 75,138 cpu/l1d-misses/
CPU5 1,418,939 cpu/l1d-misses/
CPU8 797,553 cpu/inst_retired.any,cpu=8/
CPU0 7,845,302 cycles
CPU4 6,546,859 cycles
CPU5 185,915,438 cycles
CPU8 2,065,668 cycles
0.112449242 seconds time elapsed
```
Committer testing:
root@number:~# grep -m1 "model name" /proc/cpuinfo
model name : AMD Ryzen 9 9950X3D 16-Core Processor
root@number:~# perf stat -A -e "instructions/cpu=0/,instructions,l1d-misses/cpu=4,cpu=5/,cycles" -a sleep 0.1
Performance counter stats for 'system wide':
CPU0 2,398,351 instructions/cpu=0/ # 0.44 insn per cycle
CPU0 2,398,152 instructions # 0.44 insn per cycle
CPU1 1,265,634 instructions # 0.49 insn per cycle
CPU2 606,087 instructions # 0.50 insn per cycle
CPU3 4,025,752 instructions # 0.52 insn per cycle
CPU4 4,236,810 instructions # 0.53 insn per cycle
CPU5 3,984,832 instructions # 0.66 insn per cycle
CPU6 434,132 instructions # 0.44 insn per cycle
CPU7 65,752 instructions # 0.41 insn per cycle
CPU8 459,083 instructions # 0.48 insn per cycle
CPU9 6,464,161 instructions # 1.31 insn per cycle
<SNIP>
root@number:~# perf stat -e "instructions/cpu=0/,instructions,l1d-misses/cpu=4,cpu=5/,cycles" -a sleep 0.
Performance counter stats for 'system wide':
144,822 instructions/cpu=0/ # 0.03 insn per cycle
4,666,114 instructions # 0.93 insn per cycle
2,583 l1d-misses
4,993,633 cycles
0.000868512 seconds time elapsed
root@number:~#
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Link: https://lore.kernel.org/r/20250403194337.40202-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
168c7b5091 |
perf parse-events: Set is_pmu_core for legacy hardware events
Also set the CPU map to all online CPU maps. This is done so the behavior of legacy hardware and hardware cache events better matches that of sysfs and JSON events during __perf_evlist__propagate_maps(). Fix missing cpumap put in "Synthesize attr update" test. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250403194337.40202-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
f60c3f4468 |
perf stat: Use counter cpumask to skip zero values
When a counter is 0 it may or may not be skipped. For uncore counters it is common they are only valid on 1 logical CPU and all other CPUs should be skipped. The PMU's cpumask was used for the skip calculation, but that cpumask may not reflect user overrides. Similarly a counter on a core PMU may explicitly not request a CPU be gathered. If the counter on this CPU's value is 0 then the counter should be skipped as it wasn't requested. Switch from using the PMU cpumask to that associated with the evsel to support these cases. Avoid potential crash with --per-thread mode where config->aggr_get_id is NULL. Add some examples for the tool event 0 counter skipping. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250403194337.40202-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
2e7a2f7f3c |
libperf cpumap: Add ability to create CPU from a single CPU number
Add perf_cpu_map__new_int() so that a CPU map can be created from a single integer. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Leo Yan <leo.yan@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20250403194337.40202-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
365e02ddb6 |
perf tests metrics: Permission related fixes
When permissions are limited running sleep without system wide isn't a good benchmark to run to achieve samples, switch to running noploop. Remove indent for non-success cases. Allow skip for the not counted case. Minor debug changes. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250412004704.2297939-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
f0869f3156 |
perf evsel: Add per-thread warning for EOPNOTSUPP open failues
The mrvl_ddr_pmu will return EOPNOTSUPP if opened in per-thread mode. Give a warning for this similar to EINVAL. Doing this better supports metric testing with limited permissions when the mrvl_ddr_pmu is present, as the failure to open causes the test to skip and not fail. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20250412004704.2297939-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
17e548405a |
perf scripts python: exported-sql-viewer.py: Fix pattern matching with Python 3
The script allows the user to enter patterns to find symbols.
The pattern matching characters are converted for use in SQL.
For PostgreSQL the conversion involves using the Python maketrans()
method which is slightly different in Python 3 compared with Python 2.
Fix to work in Python 3.
Fixes:
|
||
|
|
352b088164 |
perf intel-pt: Do not default to recording all switch events
On systems with many CPUs, recording extra context switch events can be
excessive and unnecessary. Add perf config intel-pt.all-switch-events=false
to control the behaviour.
Example:
# perf config intel-pt.all-switch-events=false
# perf record -eintel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.082 MB perf.data ]
# perf script -D | grep PERF_RECORD_SWITCH | awk '{print $5}' | uniq -c
5 PERF_RECORD_SWITCH
# perf config intel-pt.all-switch-events=true
# perf record -eintel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.102 MB perf.data ]
# perf script -D | grep PERF_RECORD_SWITCH | awk '{print $5}' | uniq -c
180 PERF_RECORD_SWITCH_CPU_WIDE
Committer testing:
While doing a make -j28 allmodconfig:
root@five:~# grep "model name" -m1 /proc/cpuinfo
model name : Intel(R) Core(TM) i7-14700K
root@five:~#
root@five:~# perf config intel-pt.all-switch-events=false
root@five:~# perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.019 MB perf.data ]
root@five:~# perf report --stats | grep SWITCH_CPU_WIDE
root@five:~#
root@five:~# perf config intel-pt.all-switch-events=true
root@five:~# perf record -e intel_pt//u uname
Linux
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.047 MB perf.data ]
root@five:~# perf report --stats | grep SWITCH_CPU_WIDE
SWITCH_CPU_WIDE events: 542 (96.4%)
root@five:~#
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250512093932.79854-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
e00eac6b5b |
perf intel-pt: Fix PEBS-via-PT data_src
The Fixes commit did not add support for decoding PEBS-via-PT data_src.
Fix by adding support.
PEBS-via-PT is a feature of some E-core processors, starting with
processors based on Tremont microarchitecture. Because the kernel only
supports Intel PT features that are on all processors, there is no support
for PEBS-via-PT on hybrids.
Currently that leaves processors based on Tremont, Gracemont and Crestmont,
however there are no events on Tremont that produce data_src information,
and for Gracemont and Crestmont there are only:
mem-loads event=0xd0,umask=0x5,ldlat=3
mem-stores event=0xd0,umask=0x6
Affected processors include Alder Lake N (Gracemont), Sierra Forest
(Crestmont) and Grand Ridge (Crestmont).
Example:
# perf record -d -e intel_pt/branch=0/ -e mem-loads/aux-output/pp uname
Before:
# perf.before script --itrace=o -Fdata_src
0 |OP No|LVL N/A|SNP N/A|TLB N/A|LCK No|BLK N/A
0 |OP No|LVL N/A|SNP N/A|TLB N/A|LCK No|BLK N/A
After:
# perf script --itrace=o -Fdata_src
10268100142 |OP LOAD|LVL L1 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A
10450100442 |OP LOAD|LVL L2 hit|SNP None|TLB L2 miss|LCK No|BLK N/A
Fixes:
|
||
|
|
cd17a9b1a7 |
perf test demangle-ocaml: Switch to using dso__demangle_sym()
The use of the demangle-ocaml APIs means we don't detect if a different demangler is used before the OCaml one for the case that matters to perf. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Ariel Ben-Yehuda <ariel.byd@gmail.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Bill Wendling <morbo@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Daniel Xu <dxu@dxuuu.xyz> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Link: https://lore.kernel.org/r/20250430004128.474388-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
41fcc9a343 |
perf test demangle-java: Switch to using dso__demangle_sym()
The use of the demangle-java APIs means we don't detect if a different demangler is used before the Java one for the case that matters to perf. Remove the return types from the demangled names as dso__demangle_sym() removes those. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Ariel Ben-Yehuda <ariel.byd@gmail.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Bill Wendling <morbo@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Daniel Xu <dxu@dxuuu.xyz> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Link: https://lore.kernel.org/r/20250430004128.474388-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
bdf05ccd18 |
perf test demangle-rust: Add Rust demangling test
The test cases are listed examples in: https://doc.rust-lang.org/rustc/symbol-mangling/v0.html This test was previously part of a different Rust v0 demangler: https://lore.kernel.org/lkml/20250129193037.573431-1-irogers@google.com/ Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Ariel Ben-Yehuda <ariel.byd@gmail.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Bill Wendling <morbo@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Daniel Xu <dxu@dxuuu.xyz> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Link: https://lore.kernel.org/r/20250430004128.474388-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
ac292ea7c3 |
perf demangle-rust: Remove previous legacy rust decoder
Code is unused since the introduction of rustc-demangle demangler. Reviewed-by: Miguel Ojeda <ojeda@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Ariel Ben-Yehuda <ariel.byd@gmail.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Bill Wendling <morbo@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Daniel Xu <dxu@dxuuu.xyz> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Link: https://lore.kernel.org/r/20250430004128.474388-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
e20848c317 |
perf symbol-elf: Integrate rust-v0 demangling
Use the demangle-rust-v0 APIs to see if symbol is Rust mangled and demangle if so. The API requires a pre-allocated output buffer, some estimation and retrying are added for this. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Ariel Ben-Yehuda <ariel.byd@gmail.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Bill Wendling <morbo@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Daniel Xu <dxu@dxuuu.xyz> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Link: https://lore.kernel.org/r/20250430004128.474388-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
60869b22af |
perf demangle-rust: Add rustc-demangle C demangler
Imported at commit 80e40f57d99f ("add comment about finding latest
version of code") from:
https://github.com/rust-lang/rustc-demangle/blob/main/crates/native-c/src/demangle.c
https://github.com/rust-lang/rustc-demangle/blob/main/crates/native-c/include/demangle.h
There is discussion of this issue motivating the import in:
https://github.com/rust-lang/rust/issues/60705
https://lore.kernel.org/lkml/20250129193037.573431-1-irogers@google.com/
The SPDX lines reflect the dual license Apache-2 or MIT in:
https://github.com/rust-lang/rustc-demangle/blob/main/README.md
Following Migual Ojeda's suggestion comments were added on copyright and
keeping the code in sync with upstream.
The files are renamed as perf supports multiple demanglers and so
demangle as a name would be overloaded.
The work here was done by Ariel Ben-Yehuda <ariel.byd@gmail.com> and I
am merely importing it as discussed in the rust-lang issue.
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andreas Hindborg <a.hindborg@kernel.org>
Cc: Ariel Ben-Yehuda <ariel.byd@gmail.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Bill Wendling <morbo@google.com>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Daniel Xu <dxu@dxuuu.xyz>
Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Gary Guo <gary@garyguo.net>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Trevor Gross <tmgross@umich.edu>
Link: https://lore.kernel.org/r/20250430004128.474388-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
4f1a19b8bc |
perf test amd ibs: Fix spelling mistake "Asssuming" -> "Assuming"
There is a spelling mistake ina pr_debug message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250507082421.188848-1-colin.i.king@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
c60b7d6f50 |
perf pmu: Use available core PMU for raw events
When it finds a matching PMU for a legacy event, it should look for core PMUs. The raw events also refers to core events so it should be handled similarly. On x86, PERF_TYPE_RAW should match with the existing cpu PMU. But on ARM, there's no PMU with the matching type so it'll pick the first core PMU for it. Suggested-by: Ian Rogers <irogers@google.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250507215939.54399-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
c42e219942 |
perf lock contention: Add -J/--inject-delay option
This is to slow down lock acquistion (on contention locks) deliberately.
A possible use case is to estimate impact on application performance by
optimization of kernel locking behavior. By delaying the lock it can
simulate the worse condition as a control group, and then compare with
the current behavior as a optimized condition.
The syntax is 'time@function' and the time can have unit suffix like
"us" and "ms". For example, I ran a simple test like below.
$ sudo perf lock con -abl -L tasklist_lock -- \
sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
contended total wait max wait avg wait address symbol
92 1.18 ms 199.54 us 12.79 us ffffffff8a806080 tasklist_lock (rwlock)
The contention count was 92 and the average wait time was around 10 us.
But if I add 100 usec of delay to the tasklist_lock,
$ sudo perf lock con -abl -L tasklist_lock -J 100us@tasklist_lock -- \
sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
contended total wait max wait avg wait address symbol
190 15.67 ms 230.10 us 82.46 us ffffffff8a806080 tasklist_lock (rwlock)
The contention count increased and the average wait time was up closed
to 100 usec. If I increase the delay even more,
$ sudo perf lock con -abl -L tasklist_lock -J 1ms@tasklist_lock -- \
sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
contended total wait max wait avg wait address symbol
1002 2.80 s 3.01 ms 2.80 ms ffffffff8a806080 tasklist_lock (rwlock)
Now every sleep process had contention and the wait time was more than 1
msec. This is on my 4 CPU laptop so I guess one CPU has the lock while
other 3 are waiting for it mostly.
For simplicity, it only supports global locks for now.
Committer testing:
root@number:~# grep -m1 'model name' /proc/cpuinfo
model name : AMD Ryzen 9 9950X3D 16-Core Processor
root@number:~# perf lock con -abl -L tasklist_lock -- sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
contended total wait max wait avg wait address symbol
142 453.85 us 25.39 us 3.20 us ffffffffae808080 tasklist_lock (rwlock)
root@number:~# perf lock con -abl -L tasklist_lock -J 100us@tasklist_lock -- sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
contended total wait max wait avg wait address symbol
1040 2.39 s 3.11 ms 2.30 ms ffffffffae808080 tasklist_lock (rwlock)
root@number:~# perf lock con -abl -L tasklist_lock -J 1ms@tasklist_lock -- sh -c 'for i in $(seq 1000); do sleep 1 & done; wait'
contended total wait max wait avg wait address symbol
1025 24.72 s 31.01 ms 24.12 ms ffffffffae808080 tasklist_lock (rwlock)
root@number:~#
Suggested-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250509171950.183591-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
4bfe27140e |
perf tests: Fix 'perf report' tests installation
There was a copy-paste mistake in the installation commands.
Also, we need to install stderr-whitelist.txt file, which contains
allowed messages that are printed on stderr and should not cause test
fail.
Fixes:
|
||
|
|
30d20fb1f8 |
perf trace: Fix leaks of 'struct thread' in set_filter_loop_pids()
I've found some leaks from 'perf trace -a'.
It seems there are more leaks but this is what I can find for now.
Fixes:
|
||
|
|
bb3de7fa98 |
perf trace: Fix leaks of 'struct thread' in fprintf_sys_enter()
I've found some leaks from 'perf trace -a'.
It seems there are more leaks but this is what I can find for now.
Fixes:
|
||
|
|
70e21ac8b0 |
perf parse-events: Add debug dump of evlist if reordered
Add debug verbose output to show how evsels were reordered by
parse_events__sort_events_and_fix_groups(). For example:
```
$ perf record -v -e '{instructions,cycles}' true
Using CPUID GenuineIntel-6-B7-1
WARNING: events were regrouped to match PMUs
evlist after sorting/fixing: '{cpu_atom/instructions/,cpu_atom/cycles/},{cpu_core/instructions/,cpu_core/cycles/}'
```
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
583dc500d1 |
perf evlist: Make groups visible in evlist__format_evsels() output
Make groups visible in output:
Before:
{cycles,instructions} ->
cpu_atom/cycles/,cpu_atom/instructions/,cpu_core/cycles/,cpu_core/instructions/
After:
{cycles,instructions} ->
{cpu_atom/cycles/,cpu_atom/instructions/},{cpu_core/cycles/,cpu_core/instructions/}
Committer testing:
Before:
root@number:~# perf record -e '{cycles,instructions,cache-misses}' /tmp/bla
Failed to collect 'cycles,instructions,cache-misses' for the '/tmp/bla' workload: Permission denied
root@number:~#
After:
root@number:~# perf record -e '{cycles,instructions,cache-misses}' /tmp/bla
Failed to collect '{cycles,instructions,cache-misses}' for the '/tmp/bla' workload: Permission denied
root@number:~#
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
f0f245eaa2 |
perf evlist: Refactor evlist__scnprintf_evsels()
Switch output to using a strbuf so the storage can be resized.
Add a maximum size argument to avoid too much output that may happen for
uncore events.
Rename as scnprintf is no longer used.
Committer testing:
With the patch applied:
root@number:~# perf probe -x ~/bin/perf evlist__format_evsels
Added new event:
probe_perf:evlist_format_evsels (on evlist__format_evsels in /home/acme/bin/perf)
You can now use it in all perf tools, such as:
perf record -e probe_perf:evlist_format_evsels -aR sleep 1
root@number:~# perf probe -l
probe_perf:evlist_format_evsels (on evlist__format_evsels@util/evlist.c in /home/acme/bin/perf)
root@number:~# perf trace -e probe_perf:*/max-stack=10/ perf record -e cycles,instructions,cache-misses /tmp/bla
Failed to collect 'cycles,instructions,cache-misses' for the '/tmp/bla' workload: Permission denied
0.000 perf/3893011 probe_perf:evlist_format_evsels(__probe_ip: 6183397)
evlist__format_evsels (/home/acme/bin/perf)
__cmd_record (/home/acme/bin/perf)
cmd_record (/home/acme/bin/perf)
run_builtin (/home/acme/bin/perf)
handle_internal_command (/home/acme/bin/perf)
run_argv (/home/acme/bin/perf)
main (/home/acme/bin/perf)
__libc_start_call_main (/usr/lib64/libc.so.6)
__libc_start_main@@GLIBC_2.34 (/usr/lib64/libc.so.6)
_start (/home/acme/bin/perf)
root@number:~#
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
a5efaf9008 |
perf stat: Remove print_mixed_hw_group_error
print_mixed_hw_group_error will print a warning when a group of events
uses different PMUs.
This isn't possible to happen as parse_events__sort_events_and_fix_groups()
will break groups when this happens, adding the warning at the start
of perf of:
WARNING: events were regrouped to match PMUs
As the previous mixed group warning can never happen, remove the
associated code.
Committer testing:
Before/after:
acme@five:~$ perf stat -e '{cpu_core/cycles/,cpu_atom/cycles/}' sleep 1
WARNING: events were regrouped to match PMUs
Performance counter stats for 'sleep 1':
424,895 cpu_atom/cycles/u
<not counted> cpu_core/cycles/u (0.00%)
1.011862314 seconds time elapsed
0.000000000 seconds user
0.003166000 seconds sys
acme@five:~$
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
4b53137721 |
perf stat: Better hybrid support for the NMI watchdog warning
Prior to this patch evlist__has_hybrid would return false if the
processor wasn't hybrid or the evlist didn't contain any core
events. If the only PMU used by events was cpu_core then it would
true even though there are no cpu_atom events. For example:
```
$ perf stat --cputype=cpu_core -e '{cycles,cycles,cycles,cycles,cycles,cycles,cycles,cycles,cycles}' true
Performance counter stats for 'true':
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
<not counted> cpu_core/cycles/ (0.00%)
0.001981900 seconds time elapsed
0.002311000 seconds user
0.000000000 seconds sys
```
This patch changes evlist__has_hybrid to return true only if the
evlist contains events from >1 core PMU. This means the NMI watchdog
warning is shown for the case above.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250402201549.4090305-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
7900938850 |
perf trace: Add missing thread__put() in thread__e_machine()
Add missing thread__put() of the found parent thread in thread__e_machine(). Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250401202715.3493567-1-irogers@google.com [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
8830091383 |
perf trace: Free the files.max entry in files->table
The files.max is the maximum valid fd in the files array and so freeing the values needs to be inclusive of the max value. Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250401202715.3493567-1-irogers@google.com [ split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
8330d092f7 |
Merge remote-tracking branch 'torvalds/master' into perf-tools-next
To pick up fixes from the latest perf-tools pull request from Namhyung and get perf-tools-next in line with thinngs in other areas it uses, like tools/lib/bpf, etc. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
707df33751 |
Merge tag 'media/v6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull media fixes from Mauro Carvalho Chehab: "Some Kconfig dependency fixes" * tag 'media/v6.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: media: cec: tda9950: add back i2c dependency media: i2c: lt6911uxe: add two selects to Kconfig media: platform: synopsys: VIDEO_SYNOPSYS_HDMIRX should depend on ARCH_ROCKCHIP media: i2c: lt6911uxe: Fix Kconfig dependencies: media: vivid: fix FB dependency |
||
|
|
0d8d44db29 |
Merge tag 'for-6.15-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- revert device path canonicalization, this does not work as intended
with namespaces and is not reliable in all setups
- fix crash in scrub when checksum tree is not valid, e.g. when mounted
with rescue=ignoredatacsums
- fix crash when tracepoint btrfs_prelim_ref_insert is enabled
- other minor fixups:
- open code folio_index(), meant to be used in MM code
- use matching type for sizeof in compression allocation
* tag 'for-6.15-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: open code folio_index() in btree_clear_folio_dirty_tag()
Revert "btrfs: canonicalize the device path before adding it"
btrfs: avoid NULL pointer dereference if no valid csum tree
btrfs: handle empty eb->folios in num_extent_folios()
btrfs: correct the order of prelim_ref arguments in btrfs__prelim_ref
btrfs: compression: adjust cb->compressed_folios allocation type
|
||
|
|
cccd033714 |
Merge tag 'for-6.15/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm
Pull device mapper fixes from Mikulas Patocka: - fix reading past the end of allocated memory - fix missing dm_put_live_table() in dm_keyslot_evict() * tag 'for-6.15/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm: fix copying after src array boundaries dm: add missing unlock on in dm_keyslot_evict() |
||
|
|
f1aff4bc19 |
dm: fix copying after src array boundaries
The blammed commit copied to argv the size of the reallocated argv,
instead of the size of the old_argv, thus reading and copying from
past the old_argv allocated memory.
Following BUG_ON was hit:
[ 3.038929][ T1] kernel BUG at lib/string_helpers.c:1040!
[ 3.039147][ T1] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
...
[ 3.056489][ T1] Call trace:
[ 3.056591][ T1] __fortify_panic+0x10/0x18 (P)
[ 3.056773][ T1] dm_split_args+0x20c/0x210
[ 3.056942][ T1] dm_table_add_target+0x13c/0x360
[ 3.057132][ T1] table_load+0x110/0x3ac
[ 3.057292][ T1] dm_ctl_ioctl+0x424/0x56c
[ 3.057457][ T1] __arm64_sys_ioctl+0xa8/0xec
[ 3.057634][ T1] invoke_syscall+0x58/0x10c
[ 3.057804][ T1] el0_svc_common+0xa8/0xdc
[ 3.057970][ T1] do_el0_svc+0x1c/0x28
[ 3.058123][ T1] el0_svc+0x50/0xac
[ 3.058266][ T1] el0t_64_sync_handler+0x60/0xc4
[ 3.058452][ T1] el0t_64_sync+0x1b0/0x1b4
[ 3.058620][ T1] Code: f800865e a9bf7bfd 910003fd 941f48aa (d4210000)
[ 3.058897][ T1] ---[ end trace 0000000000000000 ]---
[ 3.059083][ T1] Kernel panic - not syncing: Oops - BUG: Fatal exception
Fix it by copying the size of src, and not the size of dst, as it was.
Fixes:
|
||
|
|
8feafba59c |
perf test: Add direct off-cpu tests
Since we added --off-cpu-thresh, add tests for when a sample's off-cpu time is above the threshold, and when it's below the threshold. Note that the basic test performed in test_offcpu_basic() collects a direct sample now, since sleep 1 has duration of 1000ms, higher than the default value of --off-cpu-thresh of 500ms, resulting in a direct sample. An example: $ sudo perf test offcpu 124: perf record offcpu profiling tests : Ok $ Committer testing: root@number:~# perf test offcpu 126: perf record offcpu profiling tests : Ok root@number:~# perf test -v offcpu 126: perf record offcpu profiling tests : Ok root@number:~# perf test -vv offcpu 126: perf record offcpu profiling tests: --- start --- test child forked, pid 1410791 Checking off-cpu privilege Basic off-cpu test Basic off-cpu test [Success] Child task off-cpu test Child task off-cpu test [Success] Threshold test (above threshold) Threshold test (above threshold) [Success] Threshold test (below threshold) Threshold test (below threshold) [Success] ---- end(0) ---- 126: perf record offcpu profiling tests : Ok root@number:~# Suggested-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20250501022809.449767-11-howardchu95@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
9557c00076 |
perf record --off-cpu: Add --off-cpu-thresh option
Specify the threshold for dumping offcpu samples with --off-cpu-thresh,
the unit is milliseconds. Default value is 500ms.
Example:
perf record --off-cpu --off-cpu-thresh 824
The example above collects direct off-cpu samples where the off-cpu time
is longer than 824ms.
Committer testing:
After commenting out the end off-cpu dump to have just the ones that are
added right after the task is scheduled back, and using a threshould of
1000ms, we see some periods (the 5th column, just before "offcpu-time"
in the 'perf script' output) that are over 1000.000.000 nanoseconds:
root@number:~# perf record --off-cpu --off-cpu-thresh 10000
^C[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 3.902 MB perf.data (34335 samples) ]
root@number:~# perf script
<SNIP>
Isolated Web Co 59932 [028] 63839.594437: 1000049427 offcpu-time:
7fe63c7976c2 __syscall_cancel_arch_end+0x0 (/usr/lib64/libc.so.6)
7fe63c78c04c __futex_abstimed_wait_common+0x7c (/usr/lib64/libc.so.6)
7fe63c78e928 pthread_cond_timedwait@@GLIBC_2.3.2+0x178 (/usr/lib64/libc.so.6)
5599974a9fe7 mozilla::detail::ConditionVariableImpl::wait_for(mozilla::detail::MutexImpl&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&)+0xe7 (/usr/lib64/fir>
100000000 [unknown] ([unknown])
swapper 0 [025] 63839.594459: 195724 cycles:P: ffffffffac328270 read_tsc+0x0 ([kernel.kallsyms])
Isolated Web Co 59932 [010] 63839.594466: 1000055278 offcpu-time:
7fe63c7976c2 __syscall_cancel_arch_end+0x0 (/usr/lib64/libc.so.6)
7fe63c78ba24 __syscall_cancel+0x14 (/usr/lib64/libc.so.6)
7fe63c804c4e __poll+0x1e (/usr/lib64/libc.so.6)
7fe633b0d1b8 PollWrapper(_GPollFD*, unsigned int, int) [clone .lto_priv.0]+0xf8 (/usr/lib64/firefox/libxul.so)
10000002c [unknown] ([unknown])
swapper 0 [027] 63839.594475: 134433 cycles:P: ffffffffad4c45d9 irqentry_enter+0x19 ([kernel.kallsyms])
swapper 0 [028] 63839.594499: 215838 cycles:P: ffffffffac39199a switch_mm_irqs_off+0x10a ([kernel.kallsyms])
MediaPD~oder #1 1407676 [027] 63839.594514: 134433 cycles:P: 7f982ef5e69f dct_IV(int*, int, int*)+0x24f (/usr/lib64/libfdk-aac.so.2.0.0)
swapper 0 [024] 63839.594524: 267411 cycles:P: ffffffffad4c6ee6 poll_idle+0x56 ([kernel.kallsyms])
MediaSu~sor #75 1093827 [026] 63839.594555: 332652 cycles:P: 55be753ad030 moz_xmalloc+0x200 (/usr/lib64/firefox/firefox)
swapper 0 [027] 63839.594616: 160548 cycles:P: ffffffffad144840 menu_select+0x570 ([kernel.kallsyms])
Isolated Web Co 14019 [027] 63839.595120: 1000050178 offcpu-time:
7fc9537cc6c2 __syscall_cancel_arch_end+0x0 (/usr/lib64/libc.so.6)
7fc9537c104c __futex_abstimed_wait_common+0x7c (/usr/lib64/libc.so.6)
7fc9537c3928 pthread_cond_timedwait@@GLIBC_2.3.2+0x178 (/usr/lib64/libc.so.6)
7fc95372a3c8 pt_TimedWait+0xb8 (/usr/lib64/libnspr4.so)
7fc95372a8d8 PR_WaitCondVar+0x68 (/usr/lib64/libnspr4.so)
7fc94afb1f7c WatchdogMain(void*)+0xac (/usr/lib64/firefox/libxul.so)
7fc947498660 [unknown] ([unknown])
7fc9535fce88 [unknown] ([unknown])
7fc94b620e60 WatchdogManager::~WatchdogManager()+0x0 (/usr/lib64/firefox/libxul.so)
fff8548387f8b48 [unknown] ([unknown])
swapper 0 [003] 63839.595712: 212948 cycles:P: ffffffffacd5b865 acpi_os_read_port+0x55 ([kernel.kallsyms])
<SNIP>
Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Suggested-by: Ian Rogers <irogers@google.com>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-2-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-10-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
74069a0160 |
perf record --off-cpu: Dump the remaining PERF_SAMPLE_ in sample_type from BPF's stack trace map
Dump the remaining PERF_SAMPLE_ data, as if it is dumping a direct sample. Put the stack trace, tid, off-cpu time and cgroup id into the raw_data section, just like a direct off-cpu sample coming from BPF's bpf_perf_event_output(). This ensures that evsel__parse_sample() correctly parses both direct samples and accumulated samples. Suggested-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241108204137.2444151-10-howardchu95@gmail.com Link: https://lore.kernel.org/r/20250501022809.449767-9-howardchu95@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
8ae7a5769b |
perf script: Display off-cpu samples correctly
No PERF_SAMPLE_CALLCHAIN in sample_type, but 'perf script' needs to
display a callchain, have to specify manually.
Also, prefer displaying a callchain:
gvfs-afc-volume 2267 [001] 3829232.955656: 1001115340 offcpu-time:
77f05292603f __pselect+0xbf (/usr/lib/x86_64-linux-gnu/libc.so.6)
77f052a1801c [unknown] (/usr/lib/x86_64-linux-gnu/libusbmuxd-2.0.so.6.0.0)
77f052a18d45 [unknown] (/usr/lib/x86_64-linux-gnu/libusbmuxd-2.0.so.6.0.0)
77f05289ca94 start_thread+0x384 (/usr/lib/x86_64-linux-gnu/libc.so.6)
77f052929c3c clone3+0x2c (/usr/lib/x86_64-linux-gnu/libc.so.6)
to a raw binary BPF output:
BPF output: 0000: dd 08 00 00 db 08 00 00 <DD>...<DB>...
0008: cc ce ab 3b 00 00 00 00 <CC>Ϋ;....
0010: 06 00 00 00 00 00 00 00 ........
0018: 00 fe ff ff ff ff ff ff .<FE><FF><FF><FF><FF><FF><FF>
0020: 3f 60 92 52 f0 77 00 00 ?`.R<F0>w..
0028: 1c 80 a1 52 f0 77 00 00 ..<A1>R<F0>w..
0030: 45 8d a1 52 f0 77 00 00 E.<A1>R<F0>w..
0038: 94 ca 89 52 f0 77 00 00 .<CA>.R<F0>w..
0040: 3c 9c 92 52 f0 77 00 00 <..R<F0>w..
0048: 00 00 00 00 00 00 00 00 ........
0050: 00 00 00 00 ....
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-9-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-8-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
7de1a87f1e |
perf record --off-cpu: Disable perf_event's callchain collection
There is a check in evsel.c that does this: if (evsel__is_offcpu_event(evsel)) evsel->core.attr.sample_type &= OFFCPU_SAMPLE_TYPES; This along with: #define OFFCPU_SAMPLE_TYPES (PERF_SAMPLE_IDENTIFIER | PERF_SAMPLE_IP | \ PERF_SAMPLE_TID | PERF_SAMPLE_TIME | \ PERF_SAMPLE_ID | PERF_SAMPLE_CPU | \ PERF_SAMPLE_PERIOD | PERF_SAMPLE_CALLCHAIN | \ PERF_SAMPLE_CGROUP) will tell perf_event to collect callchain. We don't need the callchain from perf_event when collecting off-cpu samples, because it's prev's callchain, not next's callchain. (perf_event) (task_storage) (needed) prev next | | ---sched_switch----> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241108204137.2444151-8-howardchu95@gmail.com Link: https://lore.kernel.org/r/20250501022809.449767-7-howardchu95@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
7f8f56475d |
perf evsel: Assemble off-cpu samples
Use the data in bpf-output samples, to assemble off-cpu samples.
In evsel__is_offcpu_event(), check if sample_type is PERF_SAMPLE_RAW to
support off-cpu sample data created by an older version of perf.
Testing compatibility on off-cpu samples collected by perf before this patch series:
See below, the sample_type still uses PERF_SAMPLE_CALLCHAIN
$ perf script --header -i ./perf.data.ptn | grep "event : name = offcpu-time"
# event : name = offcpu-time, , id = { 237917, 237918, 237919, 237920 }, type = 1 (software), size = 136, config = 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|CALLCHAIN|CPU|PERIOD|IDENTIFIER, read_format = ID|LOST, disabled = 1, freq = 1, sample_id_all = 1
The output is correct.
$ perf script -i ./perf.data.ptn | grep offcpu-time
gmain 2173 [000] 18446744069.414584: 100102015 offcpu-time:
NetworkManager 901 [000] 18446744069.414584: 5603579 offcpu-time:
Web Content 1183550 [000] 18446744069.414584: 46278 offcpu-time:
gnome-control-c 2200559 [000] 18446744069.414584: 11998247014 offcpu-time:
<SNIP>
$
And after this patch series:
$ perf script --header -i ./perf.data.off-cpu-v9 | grep "event : name = offcpu-time"
# event : name = offcpu-time, , id = { 237959, 237960, 237961, 237962 }, type = 1 (software), size = 136, config = 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq } = 1, sample_type = IP|TID|TIME|CPU|PERIOD|RAW|IDENTIFIER, read_format = ID|LOST, disabled = 1, freq = 1, sample_id_all = 1
$ ./perf script -i ./perf.data.off-cpu-v9 | grep offcpu-time
gnome-shell 1875 [001] 4789616.361225: 100097057 offcpu-time:
gnome-shell 1875 [001] 4789616.461419: 100107463 offcpu-time:
firefox 2206821 [002] 4789616.475690: 255257245 offcpu-time:
$
Committer testing:
The command to record those samples:
root@number:~# perf record --off-cpu -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.092 MB perf.data (1552 samples) ]
root@number:~#
Then, before this patch series, the sample_type for the "offcpu-time" event is:
root@number:~# perf evlist -v | grep offcpu-time
offcpu-time: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CALLCHAIN|CPU|PERIOD|IDENTIFIER, read_format: ID|LOST, disabled: 1, freq: 1, sample_id_all: 1
root@number:~#
And after it, after recording it again:
root@number:~# perf record --off-cpu -a sleep 1 ; perf evlist -v | grep offcpu-time
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.151 MB perf.data (2843 samples) ]
offcpu-time: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0xa (PERF_COUNT_SW_BPF_OUTPUT), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER, read_format: ID|LOST, disabled: 1, sample_id_all: 1
root@number:~#
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gautam Menghani <gautam@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241108204137.2444151-7-howardchu95@gmail.com
Link: https://lore.kernel.org/r/20250501022809.449767-6-howardchu95@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
d6948f2af2 |
perf record --off-cpu: Dump off-cpu samples in BPF
Collect tid, period, callchain, and cgroup id and dump them when off-cpu time threshold is reached. We don't collect the off-cpu time twice (the delta), it's either in direct samples, or accumulated samples that are dumped at the end of perf.data. Suggested-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241108204137.2444151-6-howardchu95@gmail.com Link: https://lore.kernel.org/r/20250501022809.449767-5-howardchu95@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
282c195906 |
perf record --off-cpu: Preparation of off-cpu BPF program
Set the perf_event map in BPF for dumping off-cpu samples, and set the offcpu_thresh to specify the threshold. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241108204137.2444151-5-howardchu95@gmail.com Link: https://lore.kernel.org/r/20250501022809.449767-4-howardchu95@gmail.com [ Added some missing iteration variables to off_cpu_config() and fixed up a manually edited patch hunk line boundary line ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
0f72027bb9 |
perf record --off-cpu: Parse off-cpu event
Parse the off-cpu event using parse_event(), as bpf-output. Call evlist__enable_evsel() on off-cpu event. This fixes the inability to collect direct off-cpu samples on a workload, as reported by Arnaldo Carvalho de Melo <acme@redhat.com>. The reason being, workload sets enable_on_exec instead of calling evlist__enable(), but off-cpu event does not attach to an executable and execve won't be called, so the fds from perf_event_open() are not enabled. no-inherit should be set to 1, here's the reason: We update the BPF perf_event map for direct off-cpu sample dumping (in following patches), it executes as follows: bpf_map_update_value() bpf_fd_array_map_update_elem() perf_event_fd_array_get_ptr() perf_event_read_local() In perf_event_read_local(), there is: int perf_event_read_local(struct perf_event *event, u64 *value, u64 *enabled, u64 *running) { ... /* * It must not be an event with inherit set, we cannot read * all child counters from atomic context. */ if (event->attr.inherit) { ret = -EOPNOTSUPP; goto out; } Which means no-inherit has to be true for updating the BPF perf_event map. Moreover, for bpf-output events, we primarily want a system-wide event instead of a per-task event. The reason is that in BPF's bpf_perf_event_output(), BPF uses the CPU index to retrieve the perf_event file descriptor it outputs to. Making a bpf-output event system-wide naturally satisfies this requirement by mapping CPU appropriately. Suggested-by: Namhyung Kim <namhyung@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241108204137.2444151-4-howardchu95@gmail.com Link: https://lore.kernel.org/r/20250501022809.449767-3-howardchu95@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
671e943452 |
perf evsel: Expose evsel__is_offcpu_event() for future use
Expose evsel__is_offcpu_event() so it can be used in off_cpu_config(), evsel__parse_sample() and 'perf script'. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241108204137.2444151-3-howardchu95@gmail.com Link: https://lore.kernel.org/r/20250501022809.449767-2-howardchu95@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |