The ACC (automatic C-state conversion) feature was available on Sky Lake and
Cascade Lake Xeons (SKX and CLX), but it is not available on Ice Lake and
Sapphire Rapids Xeons (ICX and SPR). Therefore, stop decoding it for ICX and
SPR.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Sapphire Rapids Xeon (SPR) supports 2 flavors of PC6 - PC6N (non-retention) and
PC6R (retention). Before this patch we used ICX package C-state limits, which
was wrong, because ICX has only one PC6 flavor. With this patch, we use SKX PC6
limits for SPR, because they are the same.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
The 'automatic_cstate_conversion_probe()' function has a too long 'if'
statement, convert it to a 'switch' statement in order to improve code
readability a bit.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Before this patch, SPR platform was considered identical to ICX platform. This
patch separates SPR support from ICX.
This patch is a preparation for adding SPR-specific package C-state limits
support.
Signed-off-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Reviewed-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Intel Performance Hybrid processors have a 2nd MSR
describing the turbo limits enforced on the Ecores.
Note, TRL and Secondary-TRL are usually R/O information,
but on overclock-capable parts, they can be written.
Signed-off-by: Len Brown <len.brown@intel.com>
When CONFIG_INTEL_UNCORE_FREQ_CONTROL is effective,
(Linux 5.9 and later), print the current (and default)
min and max uncore frequency limits.
When that driver provides the current uncore frequency
(Linux 5.18 and later), print a UncMHz column
reflecting the current uncore frequency.
Note that UncMHz is an instantaneous sample, not an average.
eg.
$ sudo ./turbostat -S --show frequency
...
Uncore Frequency pkg0 die0: 800 - 3900 MHz (800 - 3900 MHz)
...
Avg_MHz Busy% Bzy_MHz TSC_MHz UncMHz
28 0.70 4049 3095 3900
Signed-off-by: Len Brown <len.brown@intel.com>
Currently if a fscanf fails then an early return leaks an open
file pointer. Fix this by fclosing the file before the return.
Detected using static analysis with cppcheck:
tools/power/x86/turbostat/turbostat.c:2039:3: error: Resource leak: fp [resourceLeak]
Fixes: eae97e053f ("tools/power turbostat: Support thermal throttle count print")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-by: Tom Rix <trix@redhat.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Using strncmp for a single character comparison is overly complicated,
just use a simpler single character comparison instead. Also stops
static analyzers (such as cppcheck) from complaining about strncmp on
non-null terminated strings.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
It would be handy to have cmdline in turbostat output. For example,
according to the turbostat output, there are no C-states requested.
In this case the user is very curious if something like
intel_idle.max_cstate=0 was used, or may be idle=none too. It is
also curious whether things like intel_pstate=nohwp were used.
Print the boot command line accordingly:
turbostat version 21.05.04 - Len Brown <lenb@kernel.org>
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.16.0+ root=UUID=
b42359ed-1e05-42eb-8757-6bf2a1c19070 ro quiet splash vt.handoff=7
Suggested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
Change > MAX_DIE_PER_PACKAGE to >= MAX_DIE_PER_PACKAGE to prevent
accessing one element beyond the end of the array.
Fixes: 7fd786dfbd ("tools/power/x86/intel-speed-select: OOB daemon mode")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Add a command line option to dirty_log_perf_test to run vCPUs for the
entire duration of disabling dirty logging. By default, the test stops
running runs vCPUs before disabling dirty logging, which is faster but
less interesting as it doesn't stress KVM's handling of contention
between page faults and the zapping of collapsible SPTEs. Enabling the
flag also lets the user verify that KVM is indeed rebuilding zapped SPTEs
as huge pages by checking KVM's pages_{1g,2m,4k} stats. Without vCPUs to
fault in the zapped SPTEs, the stats will show that KVM is zapping pages,
but they never show whether or not KVM actually allows huge pages to be
recreated.
Note! Enabling the flag can _significantly_ increase runtime, especially
if the thread that's disabling dirty logging doesn't have a dedicated
pCPU, e.g. if all pCPUs are used to run vCPUs.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220715232107.3775620-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Include sys/time.h and pthread.h in tmon.h, so that types
"pthread_mutex_t" and "struct timeval tv" are known when tmon.h
references them.
Without these headers, compiling tmon against musl-libc will fail with
these errors:
In file included from sysfs.c:31:0:
tmon.h:47:8: error: unknown type name 'pthread_mutex_t'
extern pthread_mutex_t input_lock;
^~~~~~~~~~~~~~~
make[3]: *** [<builtin>: sysfs.o] Error 1
make[3]: *** Waiting for unfinished jobs....
In file included from tui.c:31:0:
tmon.h:54:17: error: field 'tv' has incomplete type
struct timeval tv;
^~
make[3]: *** [<builtin>: tui.o] Error 1
make[2]: *** [Makefile:83: tmon] Error 2
Signed-off-by: Markus Mayer <mmayer@broadcom.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Acked-by: Alejandro González <alejandro.gonzalez.correo@gmail.com>
Tested-by: Alejandro González <alejandro.gonzalez.correo@gmail.com>
Fixes: 94f69966fa ("tools/thermal: Introduce tmon, a tool for thermal subsystem")
Link: https://lore.kernel.org/r/20220718031040.44714-1-f.fainelli@gmail.com
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
The ISA states: "when ACC[i] contains defined data, the contents of VSRs
4×i to 4×i+3 are undefined until either a VSX Move From ACC instruction
is used to copy the contents of ACC[i] to VSRs 4×i to 4×i+3 or some other
instruction directly writes to one of these VSRs." We aren't doing this.
This test only works on Power10 because the hardware implementation
happens to map ACC0 to VSRs 0-3, but will fail on any other implementation
that doesn't do this. So add xxmfacc between writing to the accumulator
and accessing the VSRs.
Fixes: 3527e1ab9a ("selftests/powerpc: Add matrix multiply assist (MMA) test")
Signed-off-by: Rashmica Gupta <rashmica@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20220617043935.428083-1-rashmica@linux.ibm.com
clang has -Wconstant-conversion by default, and the constant 0xAAAAAAAAA
(9 As) being converted to an int, which is generally 32 bits, results
in the compile warning:
clang -Wl,-no-as-needed -Wall -isystem ../../../../usr/include/ -lpthread seccomp_bpf.c -lcap -o seccomp_bpf
seccomp_bpf.c:812:67: warning: implicit conversion from 'long' to 'int' changes value from 45812984490 to -1431655766 [-Wconstant-conversion]
int kill = kill_how == KILL_PROCESS ? SECCOMP_RET_KILL_PROCESS : 0xAAAAAAAAA;
~~~~ ^~~~~~~~~~~
1 warning generated.
-1431655766 is the expected truncation, 0xAAAAAAAA (8 As), so use
this directly in the code to avoid the warning.
Fixes: 3932fcecd9 ("selftests/seccomp: Add test for unknown SECCOMP_RET kill behavior")
Signed-off-by: YiFei Zhu <zhuyifei@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220526223407.1686936-1-zhuyifei@google.com
Pull asm-generic fixes from Arnd Bergmann:
"Two more bug fixes for asm-generic, one addressing an incorrect
Kconfig symbol reference and another one fixing a build failure for
the perf tool on mips and possibly others"
* tag 'asm-generic-fixes-5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
asm-generic: remove a broken and needless ifdef conditional
tools: Fixed MIPS builds due to struct flock re-definition
So far the vmtest.sh script, which can be used as a convenient way to
run bpf selftests, has obtained the kernel config safe to use for
testing from the libbpf/libbpf GitHub repository [0].
Given that we now have included this configuration into this very
repository, we can just consume it from here as well, eliminating the
necessity of remote accesses.
With this change we adjust the logic in the script to use the
configuration from below tools/testing/selftests/bpf/configs/ instead
of pulling it over the network.
[0] https://github.com/libbpf/libbpf
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Mykola Lysenko <mykolal@fb.com>
Link: https://lore.kernel.org/bpf/20220727001156.3553701-4-deso@posteo.net
This change makes sure to sort the existing minimal kernel configuration
containing options required for running BPF selftests alphabetically.
Doing so will make it easier to diff it against other configurations,
which in turn helps with maintaining disjunct config files that build on
top of each other. It also helped identify the CONFIG_IPV6_GRE being set
twice and removes one of the occurrences.
Lastly, we change NET_CLS_BPF from 'm' to 'y'. Having this option as 'm'
will cause failures of the btf_skc_cls_ingress selftest.
Signed-off-by: Daniel Müller <deso@posteo.net>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Mykola Lysenko <mykolal@fb.com>
Link: https://lore.kernel.org/bpf/20220727001156.3553701-2-deso@posteo.net
When using 'perf mem' and 'perf c2c', an issue is observed that tool
reports the wrong offset for global data symbols. This is a common
issue on both x86 and Arm64 platforms.
Let's see an example, for a test program, below is the disassembly for
its .bss section which is dumped with objdump:
...
Disassembly of section .bss:
0000000000004040 <completed.0>:
...
0000000000004080 <buf1>:
...
00000000000040c0 <buf2>:
...
0000000000004100 <thread>:
...
First we used 'perf mem record' to run the test program and then used
'perf --debug verbose=4 mem report' to observe what's the symbol info
for 'buf1' and 'buf2' structures.
# ./perf mem record -e ldlat-loads,ldlat-stores -- false_sharing.exe 8
# ./perf --debug verbose=4 mem report
...
dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 sh_addr: 0x4040 sh_offset: 0x3028
symbol__new: buf2 0x30a8-0x30e8
...
dso__load_sym_internal: adjusting symbol: st_value: 0x4080 sh_addr: 0x4040 sh_offset: 0x3028
symbol__new: buf1 0x3068-0x30a8
...
The perf tool relies on libelf to parse symbols, in executable and
shared object files, 'st_value' holds a virtual address; 'sh_addr' is
the address at which section's first byte should reside in memory, and
'sh_offset' is the byte offset from the beginning of the file to the
first byte in the section. The perf tool uses below formula to convert
a symbol's memory address to a file address:
file_address = st_value - sh_addr + sh_offset
^
` Memory address
We can see the final adjusted address ranges for buf1 and buf2 are
[0x30a8-0x30e8) and [0x3068-0x30a8) respectively, apparently this is
incorrect, in the code, the structure for 'buf1' and 'buf2' specifies
compiler attribute with 64-byte alignment.
The problem happens for 'sh_offset', libelf returns it as 0x3028 which
is not 64-byte aligned, combining with disassembly, it's likely libelf
doesn't respect the alignment for .bss section, therefore, it doesn't
return the aligned value for 'sh_offset'.
Suggested by Fangrui Song, ELF file contains program header which
contains PT_LOAD segments, the fields p_vaddr and p_offset in PT_LOAD
segments contain the execution info. A better choice for converting
memory address to file address is using the formula:
file_address = st_value - p_vaddr + p_offset
This patch introduces elf_read_program_header() which returns the
program header based on the passed 'st_value', then it uses the formula
above to calculate the symbol file address; and the debugging log is
updated respectively.
After applying the change:
# ./perf --debug verbose=4 mem report
...
dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 p_vaddr: 0x3d28 p_offset: 0x2d28
symbol__new: buf2 0x30c0-0x3100
...
dso__load_sym_internal: adjusting symbol: st_value: 0x4080 p_vaddr: 0x3d28 p_offset: 0x2d28
symbol__new: buf1 0x3080-0x30c0
...
Fixes: f17e04afaf ("perf report: Fix ELF symbol parsing")
Reported-by: Chang Rui <changruinj@gmail.com>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220724060013.171050-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes from:
28a99e95f5 ("x86/amd: Use IBPB for firmware calls")
This only causes these perf files to be rebuilt:
CC /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
CC /tmp/build/perf/bench/mem-memset-x86-64-asm.o
And addresses this perf build warning:
Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org
Link: https://lore.kernel.org/lkml/Yt6oWce9UDAmBAtX@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Once line card is activated, check the FW version and PSID are exposed.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Once line card is provisioned, check if HW revision and INI version
are exposed on associated nested auxiliary device.
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Implements workqueue trace bpf function.
Test cases:
# perf kwork -k workqueue lat -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(w)addrconf_verify_work | 0002 | 5.856 ms | 1 | 5.856 ms | 111994.634313 s | 111994.640169 s |
(w)vmstat_update | 0001 | 1.247 ms | 1 | 1.247 ms | 111996.462651 s | 111996.463899 s |
(w)neigh_periodic_work | 0001 | 1.183 ms | 1 | 1.183 ms | 111996.462789 s | 111996.463973 s |
(w)neigh_managed_work | 0001 | 0.989 ms | 2 | 1.635 ms | 111996.462820 s | 111996.464455 s |
(w)wb_workfn | 0000 | 0.667 ms | 1 | 0.667 ms | 111996.384273 s | 111996.384940 s |
(w)bpf_prog_free_deferred | 0001 | 0.495 ms | 1 | 0.495 ms | 111986.314201 s | 111986.314696 s |
(w)mix_interrupt_randomness | 0002 | 0.421 ms | 6 | 0.749 ms | 111995.927750 s | 111995.928499 s |
(w)vmstat_shepherd | 0000 | 0.374 ms | 2 | 0.385 ms | 111991.265242 s | 111991.265627 s |
(w)e1000_watchdog | 0002 | 0.356 ms | 5 | 0.390 ms | 111994.528380 s | 111994.528770 s |
(w)vmstat_update | 0000 | 0.231 ms | 2 | 0.365 ms | 111996.384407 s | 111996.384772 s |
(w)flush_to_ldisc | 0006 | 0.165 ms | 1 | 0.165 ms | 111995.930606 s | 111995.930771 s |
(w)flush_to_ldisc | 0000 | 0.094 ms | 2 | 0.095 ms | 111996.460453 s | 111996.460548 s |
--------------------------------------------------------------------------------------------------------------------------------
# perf kwork -k workqueue rep -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
(w)e1000_watchdog | 0002 | 0.627 ms | 2 | 0.324 ms | 112002.720665 s | 112002.720989 s |
(w)flush_to_ldisc | 0007 | 0.598 ms | 2 | 0.534 ms | 112000.875226 s | 112000.875761 s |
(w)wq_barrier_func | 0007 | 0.492 ms | 1 | 0.492 ms | 112000.876981 s | 112000.877473 s |
(w)flush_to_ldisc | 0007 | 0.281 ms | 1 | 0.281 ms | 112005.826882 s | 112005.827163 s |
(w)mix_interrupt_randomness | 0002 | 0.229 ms | 3 | 0.102 ms | 112005.825671 s | 112005.825774 s |
(w)vmstat_shepherd | 0000 | 0.202 ms | 1 | 0.202 ms | 112001.504511 s | 112001.504713 s |
(w)bpf_prog_free_deferred | 0001 | 0.181 ms | 1 | 0.181 ms | 112000.883251 s | 112000.883432 s |
(w)wb_workfn | 0007 | 0.130 ms | 1 | 0.130 ms | 112001.505195 s | 112001.505325 s |
(w)vmstat_update | 0000 | 0.053 ms | 1 | 0.053 ms | 112001.504763 s | 112001.504815 s |
--------------------------------------------------------------------------------------------------------------------------------
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-18-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implements softirq trace bpf function.
Test cases:
Trace softirq latency without filter:
# perf kwork -k softirq lat -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(s)RCU:9 | 0005 | 0.281 ms | 3 | 0.338 ms | 111295.752222 s | 111295.752560 s |
(s)RCU:9 | 0002 | 0.262 ms | 24 | 1.400 ms | 111301.335986 s | 111301.337386 s |
(s)SCHED:7 | 0005 | 0.177 ms | 14 | 0.212 ms | 111295.752270 s | 111295.752481 s |
(s)RCU:9 | 0007 | 0.161 ms | 47 | 2.022 ms | 111295.402159 s | 111295.404181 s |
(s)NET_RX:3 | 0003 | 0.149 ms | 12 | 1.261 ms | 111301.192964 s | 111301.194225 s |
(s)TIMER:1 | 0001 | 0.105 ms | 9 | 0.198 ms | 111301.180191 s | 111301.180389 s |
... <SNIP> ...
(s)NET_RX:3 | 0002 | 0.098 ms | 6 | 0.124 ms | 111295.403760 s | 111295.403884 s |
(s)SCHED:7 | 0001 | 0.093 ms | 19 | 0.242 ms | 111301.180256 s | 111301.180498 s |
(s)SCHED:7 | 0007 | 0.078 ms | 15 | 0.188 ms | 111300.064226 s | 111300.064415 s |
(s)SCHED:7 | 0004 | 0.077 ms | 11 | 0.213 ms | 111301.361759 s | 111301.361973 s |
(s)SCHED:7 | 0000 | 0.063 ms | 33 | 0.805 ms | 111295.401811 s | 111295.402616 s |
(s)SCHED:7 | 0003 | 0.063 ms | 14 | 0.085 ms | 111301.192255 s | 111301.192340 s |
--------------------------------------------------------------------------------------------------------------------------------
Trace softirq latency with cpu filter:
# perf kwork -k softirq lat -b -C 1
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(s)RCU:9 | 0001 | 0.178 ms | 5 | 0.572 ms | 111435.534135 s | 111435.534707 s |
--------------------------------------------------------------------------------------------------------------------------------
Trace softirq latency with name filter:
# perf kwork -k softirq lat -b -n SCHED
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Avg delay | Count | Max delay | Max delay start | Max delay end |
--------------------------------------------------------------------------------------------------------------------------------
(s)SCHED:7 | 0001 | 0.295 ms | 15 | 2.183 ms | 111452.534950 s | 111452.537133 s |
(s)SCHED:7 | 0002 | 0.215 ms | 10 | 0.315 ms | 111460.000238 s | 111460.000553 s |
(s)SCHED:7 | 0005 | 0.190 ms | 29 | 0.338 ms | 111457.032538 s | 111457.032876 s |
(s)SCHED:7 | 0003 | 0.097 ms | 10 | 0.319 ms | 111452.434351 s | 111452.434670 s |
(s)SCHED:7 | 0006 | 0.089 ms | 1 | 0.089 ms | 111450.737450 s | 111450.737539 s |
(s)SCHED:7 | 0007 | 0.085 ms | 17 | 0.169 ms | 111452.471333 s | 111452.471502 s |
(s)SCHED:7 | 0004 | 0.071 ms | 15 | 0.221 ms | 111452.535252 s | 111452.535473 s |
(s)SCHED:7 | 0000 | 0.044 ms | 32 | 0.130 ms | 111460.001982 s | 111460.002112 s |
--------------------------------------------------------------------------------------------------------------------------------
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-17-yangjihong1@huawei.com
[ Add {} for multiline if blocks ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implements irq trace bpf function.
Test cases:
Trace irq without filter:
# perf kwork -k irq rep -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
virtio0-requests:25 | 0000 | 31.026 ms | 285 | 1.493 ms | 110326.049963 s | 110326.051456 s |
eth0:10 | 0002 | 7.875 ms | 96 | 1.429 ms | 110313.916835 s | 110313.918264 s |
ata_piix:14 | 0002 | 2.510 ms | 28 | 0.396 ms | 110331.367987 s | 110331.368383 s |
--------------------------------------------------------------------------------------------------------------------------------
Trace irq with cpu filter:
# perf kwork -k irq rep -b -C 0
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
virtio0-requests:25 | 0000 | 34.288 ms | 282 | 2.061 ms | 110358.078968 s | 110358.081029 s |
--------------------------------------------------------------------------------------------------------------------------------
Trace irq with name filter:
# perf kwork -k irq rep -b -n eth0
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
eth0:10 | 0002 | 2.184 ms | 21 | 0.572 ms | 110386.541699 s | 110386.542271 s |
--------------------------------------------------------------------------------------------------------------------------------
Trace irq with summary:
# perf kwork -k irq rep -b -S
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
virtio0-requests:25 | 0000 | 42.923 ms | 285 | 1.181 ms | 110418.128867 s | 110418.130049 s |
eth0:10 | 0002 | 2.085 ms | 20 | 0.668 ms | 110416.002935 s | 110416.003603 s |
ata_piix:14 | 0002 | 0.970 ms | 4 | 0.656 ms | 110424.034482 s | 110424.035138 s |
--------------------------------------------------------------------------------------------------------------------------------
Total count : 309
Total runtime (msec) : 45.977 (0.003% load average)
Total time span (msec) : 17017.655
--------------------------------------------------------------------------------------------------------------------------------
Committer testing:
# perf kwork -k irq rep -b
Starting trace, Hit <Ctrl+C> to stop and report
^C
Kwork Name | Cpu | Total Runtime | Count | Max runtime | Max runtime start | Max runtime end |
--------------------------------------------------------------------------------------------------------------------------------
nvme0q20:145 | 0019 | 0.570 ms | 28 | 0.064 ms | 26966.635102 s | 26966.635167 s |
amdgpu:162 | 0002 | 0.568 ms | 29 | 0.068 ms | 26966.644346 s | 26966.644414 s |
nvme0q4:129 | 0003 | 0.565 ms | 31 | 0.037 ms | 26966.614830 s | 26966.614866 s |
nvme0q16:141 | 0015 | 0.205 ms | 66 | 0.012 ms | 26967.145161 s | 26967.145174 s |
nvme0q29:154 | 0028 | 0.154 ms | 44 | 0.014 ms | 26967.078970 s | 26967.078984 s |
nvme0q10:135 | 0009 | 0.134 ms | 43 | 0.011 ms | 26967.132093 s | 26967.132104 s |
nvme0q2:127 | 0001 | 0.132 ms | 26 | 0.011 ms | 26966.883584 s | 26966.883595 s |
nvme0q25:150 | 0024 | 0.127 ms | 32 | 0.014 ms | 26966.631419 s | 26966.631433 s |
nvme0q14:139 | 0013 | 0.110 ms | 21 | 0.017 ms | 26966.760843 s | 26966.760861 s |
nvme0q30:155 | 0029 | 0.102 ms | 30 | 0.022 ms | 26966.677171 s | 26966.677193 s |
nvme0q13:138 | 0012 | 0.088 ms | 20 | 0.015 ms | 26966.738733 s | 26966.738748 s |
nvme0q6:131 | 0005 | 0.087 ms | 13 | 0.020 ms | 26966.648445 s | 26966.648465 s |
nvme0q28:153 | 0027 | 0.066 ms | 12 | 0.015 ms | 26966.771431 s | 26966.771447 s |
nvme0q26:151 | 0025 | 0.060 ms | 13 | 0.012 ms | 26966.704266 s | 26966.704278 s |
nvme0q21:146 | 0020 | 0.054 ms | 20 | 0.011 ms | 26967.322082 s | 26967.322094 s |
nvme0q1:126 | 0000 | 0.046 ms | 11 | 0.013 ms | 26966.859754 s | 26966.859767 s |
nvme0q17:142 | 0016 | 0.046 ms | 10 | 0.011 ms | 26967.114513 s | 26967.114524 s |
xhci_hcd:74 | 0015 | 0.041 ms | 3 | 0.016 ms | 26967.086004 s | 26967.086020 s |
nvme0q8:133 | 0007 | 0.039 ms | 12 | 0.008 ms | 26966.712056 s | 26966.712063 s |
nvme0q32:157 | 0031 | 0.036 ms | 10 | 0.014 ms | 26966.627054 s | 26966.627068 s |
nvme0q9:134 | 0008 | 0.036 ms | 11 | 0.011 ms | 26967.258452 s | 26967.258462 s |
nvme0q7:132 | 0006 | 0.024 ms | 3 | 0.014 ms | 26966.767404 s | 26966.767418 s |
nvme0q11:136 | 0010 | 0.023 ms | 5 | 0.006 ms | 26966.935455 s | 26966.935461 s |
nvme0q31:156 | 0030 | 0.018 ms | 5 | 0.006 ms | 26966.627517 s | 26966.627524 s |
nvme0q12:137 | 0011 | 0.015 ms | 2 | 0.014 ms | 26966.799588 s | 26966.799602 s |
enp5s0-rx-0:164 | 0006 | 0.009 ms | 2 | 0.005 ms | 26966.742024 s | 26966.742028 s |
enp5s0-rx-1:165 | 0007 | 0.006 ms | 2 | 0.004 ms | 26966.939486 s | 26966.939490 s |
enp5s0-tx-0:166 | 0008 | 0.005 ms | 1 | 0.005 ms | 26966.939484 s | 26966.939489 s |
enp5s0-tx-1:167 | 0009 | 0.005 ms | 1 | 0.005 ms | 26966.939484 s | 26966.939489 s |
--------------------------------------------------------------------------------------------------------------------------------
#t
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-16-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'perf record' generates perf.data, which generates extra interrupts
for hard disk, amount of data to be collected increases with time.
Using eBPF trace can process the data in kernel, which solves the
preceding two problems.
Add -b/--use-bpf option for latency and report to support
tracing kwork events using eBPF:
1. Create bpf prog and attach to tracepoints,
2. Start tracing after command is entered,
3. After user hit "ctrl+c", stop tracing and report,
4. Support CPU and name filtering.
This commit implements the framework code and
does not add specific event support.
Test cases:
# perf kwork rep -h
Usage: perf kwork report [<options>]
-b, --use-bpf Use BPF to measure kwork runtime
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): runtime, max, count
-S, --with-summary Show summary with statistics
--time <str> Time span for analysis (start,stop)
# perf kwork lat -h
Usage: perf kwork latency [<options>]
-b, --use-bpf Use BPF to measure kwork latency
-C, --cpu <cpu> list of cpus to profile
-i, --input <file> input file name
-n, --name <name> event name to profile
-s, --sort <key[,key2...]>
sort by key(s): avg, max, count
--time <str> Time span for analysis (start,stop)
# perf kwork lat -b
Unsupported bpf trace class irq
# perf kwork rep -b
Unsupported bpf trace class irq
Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220709015033.38326-15-yangjihong1@huawei.com
[ Simplify work_findnew() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>