Ended up only being useful when filtering multiple irq_vectors
tracepoints, as we end up having a tracepoint for each of the entries,
i.e.:
This will always come with the "RESCHEDULE_VECTOR" in the 'vector' arg:
# perf trace --max-events 8 -e irq_vectors:reschedule*
0.000 cc1/29067 irq_vectors:reschedule_entry(vector: RESCHEDULE)
0.004 cc1/29067 irq_vectors:reschedule_exit(vector: RESCHEDULE)
0.553 cc1/29067 irq_vectors:reschedule_entry(vector: RESCHEDULE)
0.556 cc1/29067 irq_vectors:reschedule_exit(vector: RESCHEDULE)
1.182 cc1/29067 irq_vectors:reschedule_entry(vector: RESCHEDULE)
1.185 cc1/29067 irq_vectors:reschedule_exit(vector: RESCHEDULE)
1.203 :29052/29052 irq_vectors:reschedule_entry(vector: RESCHEDULE)
1.206 :29052/29052 irq_vectors:reschedule_exit(vector: RESCHEDULE)
#
While filtering that value will produce nothing:
# perf trace --max-events 8 -e irq_vectors:reschedule* --filter="vector != RESCHEDULE"
^C#
Maybe it'll be useful for those other tracepoints:
# perf list irq_vectors:vector_*
List of pre-defined events (to be used in -e):
irq_vectors:vector_activate [Tracepoint event]
irq_vectors:vector_alloc [Tracepoint event]
irq_vectors:vector_alloc_managed [Tracepoint event]
irq_vectors:vector_clear [Tracepoint event]
irq_vectors:vector_config [Tracepoint event]
irq_vectors:vector_deactivate [Tracepoint event]
irq_vectors:vector_free_moved [Tracepoint event]
irq_vectors:vector_reserve [Tracepoint event]
irq_vectors:vector_reserve_managed [Tracepoint event]
irq_vectors:vector_setup [Tracepoint event]
irq_vectors:vector_teardown [Tracepoint event]
irq_vectors:vector_update [Tracepoint event]
#
But since we have it done, keep it.
This at least served to teach me that all those irq vectors have a entry
and an exit tracepoint that I can then use just like with
raw_syscalls:sys_{enter,exit}, i.e. pair them, use just a
trace__irq_vectors_entry() + trace__irq_vectors_exit() and use the
'vector' arg as I use the 'syscall id' one for syscalls.
Then the default for 'perf trace' will include irq_vectors in addition
to syscalls.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-wer4cwbbqub3o7sa8h1j3uzb@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
I.e. after running:
$ make -C tools/perf O=/tmp/build/perf
We end up with:
$ cat /tmp/build/perf/trace/beauty/generated/x86_arch_irq_vectors_array.c
static const char *x86_irq_vectors[] = {
[0x02] = "NMI",
[0x12] = "MCE",
[0x20] = "IRQ_MOVE_CLEANUP",
[0x80] = "IA32_SYSCALL",
[0xec] = "LOCAL_TIMER",
[0xed] = "HYPERV_STIMER0",
[0xee] = "HYPERV_REENLIGHTENMENT",
[0xef] = "MANAGED_IRQ_SHUTDOWN",
[0xf0] = "POSTED_INTR_NESTED",
[0xf1] = "POSTED_INTR_WAKEUP",
[0xf2] = "POSTED_INTR",
[0xf3] = "HYPERVISOR_CALLBACK",
[0xf4] = "DEFERRED_ERROR",
[0xf6] = "IRQ_WORK",
[0xf7] = "X86_PLATFORM_IPI",
[0xf8] = "REBOOT",
[0xf9] = "THRESHOLD_APIC",
[0xfa] = "THERMAL_APIC",
[0xfb] = "CALL_FUNCTION_SINGLE",
[0xfc] = "CALL_FUNCTION",
[0xfd] = "RESCHEDULE",
[0xfe] = "ERROR_APIC",
[0xff] = "SPURIOUS_APIC",
};
$
Now its just a matter of using it, associating it to tracepoint arguments named
'vector', all of which can be correctly used with this table, for int args.
At some point we should move tools/perf/trace/beauty to tools/beauty/,
so that it can be used more generally and even made available externally
like libbpf, libperf, libtraceevent, etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-0p2df4kq1afrxbck4e4ct34r@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Things like:
# grep __data_loc /sys/kernel/debug/tracing/events/sched/sched_process_exec/format
field:__data_loc char[] filename; offset:8; size:4; signed:1;
#
That, at that offset (8) and with that size(8) have an integer that
contains the real length and offset for the contents of that array.
Now this works:
# perf trace --max-events 1 -e sched:*exec -a
0.000 sed/19441 sched:sched_process_exec(filename: "/usr/bin/sync", pid: 19441 (sync), old_pid: 19441 (sync))
#
As when using the libtraceevent based beautifier:
# perf trace --libtraceevent --max-events 1 -e sched:*exec -a
0.000 sync/19463 sched:sched_process_exec(filename=/usr/bin/sync pid=19463 old_pid=19463)
#
I.e. that 'filename' is implemented as a dynamic char array.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-950p0m842fe6n7sxsdwqj5i2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The build of file libperf-jvmti.so succeeds but the resulting
object fails to load:
# ~/linux/tools/perf/perf record -k mono -- java \
-XX:+PreserveFramePointer \
-agentpath:/root/linux/tools/perf/libperf-jvmti.so \
hog 100000 123450
Error occurred during initialization of VM
Could not find agent library /root/linux/tools/perf/libperf-jvmti.so
in absolute path, with error:
/root/linux/tools/perf/libperf-jvmti.so: undefined symbol: _ctype
Add the missing _ctype symbol into the build script.
Fixes: 79743bc927 ("perf jvmti: Link against tools/lib/string.o to have weak strlcpy()")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20191008093841.59387-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When executing the task exit testing case, perf gets stuck in an endless
loop this case and doesn't return back on Arm64 Juno board.
After digging into this issue, since Juno board has Arm's big.LITTLE
CPUs, thus the PMUs are not compatible between the big CPUs and little
CPUs. This leads to a PMU event that cannot be enabled properly when
the traced task is migrated from one variant's CPU to another variant.
Finally, the test case runs into infinite loop for cannot read out any
event data after return from polling.
Eventually, we need to work out formal solution to allow PMU events can
be freely migrated from one CPU variant to another, but this is a
difficult task and a different topic. This patch tries to fix the Perf
test case to avoid infinite loop, when the testing detects 1000 times
retrying for reading empty events, it will directly bail out and return
failure. This allows the Perf tool can continue its other test cases.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20191011091942.29841-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In the earlier fix for the memory overrun of id arrays I managed to typo
the wrong event in the fix.
Of course we need to close the current event in the loop, not the
original failing event.
The same test case as in the original patch still passes.
Fixes: 7834fa948b ("perf evlist: Fix access of freed id arrays")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191011182140.8353-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
My earlier patch to just enable --reltime with --time was a little too
optimistic. The --time parsing would accept absolute time, which is
very confusing to the user.
Support relative time in --time parsing too. This only works with recent
perf record that records the first sample time. Otherwise we error out.
Fixes: 3714437d3f ("perf script: Allow --time with --reltime")
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191011182140.8353-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf trace:
Arnaldo Carvalho de Melo:
- Reuse the strace-like syscall_arg_fmt->scnprintf() beautification routines
(convert integer arguments into strings, like open flags, etc) in tracepoint
arguments.
For now the type based scnprintf routines (pid_t, umode_t, etc) and the
ones based in well known arg name based ("fd", etc) gets associated with
tracepoint args of that type.
A tracepoint only arg, "msr", for the msr:{write,read}_msr gets added as
an initial step.
- Introduce syscall_arg_fmt->strtoul() methods to be the reverse operation
of ->scnprintf(), i.e. to go from a string to an integer.
- Implement --filter, just like in 'perf record', that affects the tracepoint
events specied thus far in the command line, use the ->strtoul() methods
to allow strings in tables associated with beautifiers to the integers
the in-kernel tracepoint (eBPF later) filters expect, e.g.:
# perf trace --max-events 1 -e sched:*ipi --filter="cpu==1 || cpu==2"
0.000 as/24630 sched:sched_wake_idle_without_ipi(cpu: 1)
#
# perf trace --max-events 1 --max-stack=32 -e msr:* --filter="msr==IA32_TSC_DEADLINE"
207.000 cc1/19963 msr:write_msr(msr: IA32_TSC_DEADLINE, val: 5442316760822)
do_trace_write_msr ([kernel.kallsyms])
do_trace_write_msr ([kernel.kallsyms])
lapic_next_deadline ([kernel.kallsyms])
clockevents_program_event ([kernel.kallsyms])
hrtimer_interrupt ([kernel.kallsyms])
smp_apic_timer_interrupt ([kernel.kallsyms])
apic_timer_interrupt ([kernel.kallsyms])
[0x6ff66c] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x7047c3] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x707708] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
execute_one_pass (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x4f3d37] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x4f3d49] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
execute_pass_list (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
cgraph_node::expand (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x2625b4] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
symbol_table::finalize_compilation_unit (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x5ae8b9] (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
toplev::main (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
main (/usr/lib/gcc-cross/alpha-linux-gnu/8/cc1)
[0x26b6a] (/usr/lib/x86_64-linux-gnu/libc-2.29.so)
#
# perf trace --max-events 8 -e msr:* --filter="msr==IA32_SPEC_CTRL"
0.000 :13281/13281 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
0.063 migration/3/25 msr:write_msr(msr: IA32_SPEC_CTRL)
0.217 kworker/u16:1-/4826 msr:write_msr(msr: IA32_SPEC_CTRL)
0.687 rcu_sched/11 msr:write_msr(msr: IA32_SPEC_CTRL)
0.696 :13280/13280 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
0.305 :13281/13281 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
0.355 :13274/13274 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
2.743 kworker/u16:0-/6711 msr:write_msr(msr: IA32_SPEC_CTRL)
#
# perf trace --max-events 8 --cpu 1 -e msr:* --filter="msr!=IA32_SPEC_CTRL && msr!=IA32_TSC_DEADLINE && msr != FS_BASE"
0.000 mtr-packet/30819 msr:write_msr(msr: 0x830, val: 68719479037)
0.096 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
238.925 mtr-packet/30819 msr:write_msr(msr: 0x830, val: 8589936893)
511.010 :0/0 msr:write_msr(msr: 0x830, val: 68719479037)
1005.052 :0/0 msr:read_msr(msr: IA32_TSC_ADJUST)
1235.131 CPU 0/KVM/3750 msr:write_msr(msr: 0x830, val: 4294969595)
1235.195 CPU 0/KVM/3750 msr:read_msr(msr: IA32_SYSENTER_ESP, val: -2199023037952)
1235.201 CPU 0/KVM/3750 msr:read_msr(msr: IA32_APICBASE, val: 4276096000)
#
- Default to not using libtraceevent and its plugins for beautifying
tracepoint arguments, since now we're reusing the strace-like beatufiers.
Use --libtraceevent_print (using just --libtrace is unambiguous and can
be used as a short hand) to go back to those beautifiers.
This will help in the transition, as can be seen in some of the sched tracepoints
that still need some work in the libbeauty based mode:
# trace --no-inherit -e msr:*,*sleep,sched:* sleep 1
0.000 ( ): sched:sched_waking(comm: "trace", pid: 3319 (trace), prio: 120, success: 1)
0.006 ( ): sched:sched_wakeup(comm: "trace", pid: 3319 (trace), prio: 120, success: 1)
0.348 ( ): sched:sched_process_exec(filename: 140212596720100, pid: 3319 (sleep), old_pid: 3319 (sleep))
0.490 ( ): msr:write_msr(msr: FS_BASE, val: 139631189321088)
0.670 ( ): nanosleep(rqtp: 0x7ffc52c23bc0) ...
0.674 ( ): sched:sched_stat_runtime(comm: "sleep", pid: 3319 (sleep), runtime: 659259, vruntime: 78942418342)
0.675 ( ): sched:sched_switch(prev_comm: "sleep", prev_pid: 3319 (sleep), prev_prio: 120, prev_state: 1, next_comm: "swapper/0", next_prio: 120)
1001.059 ( ): sched:sched_waking(comm: "sleep", pid: 3319 (sleep), prio: 120, success: 1)
1001.098 ( ): sched:sched_wakeup(comm: "sleep", pid: 3319 (sleep), prio: 120, success: 1)
0.670 (1000.504 ms): ... [continued]: nanosleep()) = 0
1001.456 ( ): sched:sched_process_exit(comm: "sleep", pid: 3319 (sleep), prio: 120)
# trace --libtrace --no-inherit -e msr:*,*sleep,sched:* sleep 1
# trace --libtrace --no-inherit -e msr:*,*sleep,sched:* sleep 1
0.000 ( ): sched:sched_waking(comm=trace pid=3323 prio=120 target_cpu=000)
0.007 ( ): sched:sched_wakeup(comm=trace pid=3323 prio=120 target_cpu=000)
0.382 ( ): sched:sched_process_exec(filename=/usr/bin/sleep pid=3323 old_pid=3323)
0.525 ( ): msr:write_msr(c0000100, value 7f5d508a0580)
0.713 ( ): nanosleep(rqtp: 0x7fff487fb4a0) ...
0.717 ( ): sched:sched_stat_runtime(comm=sleep pid=3323 runtime=617722 [ns] vruntime=78957731636 [ns])
0.719 ( ): sched:sched_switch(prev_comm=sleep prev_pid=3323 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120)
1001.117 ( ): sched:sched_waking(comm=sleep pid=3323 prio=120 target_cpu=000)
1001.157 ( ): sched:sched_wakeup(comm=sleep pid=3323 prio=120 target_cpu=000)
0.713 (1000.522 ms): ... [continued]: nanosleep()) = 0
1001.538 ( ): sched:sched_process_exit(comm=sleep pid=3323 prio=120)
#
- Make -v (verbose) mode be honoured for .perfconfig based trace.add_events,
to help in diagnosing problems with building eBPF events (-e source.c).
- When using eBPF syscall payload augmentation do not show strace-like
syscalls when all the user specified was some tracepoint event, bringing
the behaviour in line with that of when not using eBPF augmentation.
Intel PT:
exported-sql-viewer GUI:
Adrian Hunter:
- Add LookupModel, HBoxLayout, VBoxLayout, global time range calculations
so as to add a time chart by CPU.
perf script:
Andi Kleen:
- Allow --time (to specify a time span of interest) with --reltime
perf diff:
Jin Yao:
- Report noise for cycles diff, i.e. a histogram + stddev.
(timestamps relative to start).
perf annotate:
Arnaldo Carvalho de Melo:
- Initialize env->cpuid when running in live mode (perf top), as it
is used in some of the per arch annotation init routines.
samples bpf:
Björn Töpel:
- Fixup fallout of using tools/perf/perf-sys. from outside tools/perf.
Core:
Ian Rogers:
- Avoid 'sample_reg_masks' being const + weak, as this breaks with some
compilers that constant-propagate from the weak symbol.
libperf:
- First part of moving the perf_mmap class from tools/perf to libperf.
- Propagate CFLAGS to libperf from the tools/perf Makefile.
Vendor events:
John Garry:
- Add entry in MAINTAINERS with reviewers for the for perf tool arm64
pmu-events files.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull perf fixes from Ingo Molnar:
"Mostly tooling fixes, but also a couple of updates for new Intel
models (which are technically hw-enablement, but to users it's a fix
to perf behavior on those new CPUs - hope this is fine), an AUX
inheritance fix, event time-sharing fix, and a fix for lost non-perf
NMI events on AMD systems"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
perf/x86/cstate: Add Tiger Lake CPU support
perf/x86/msr: Add Tiger Lake CPU support
perf/x86/intel: Add Tiger Lake CPU support
perf/x86/cstate: Update C-state counters for Ice Lake
perf/x86/msr: Add new CPU model numbers for Ice Lake
perf/x86/cstate: Add Comet Lake CPU support
perf/x86/msr: Add Comet Lake CPU support
perf/x86/intel: Add Comet Lake CPU support
perf/x86/amd: Change/fix NMI latency mitigation to use a timestamp
perf/core: Fix corner case in perf_rotate_context()
perf/core: Rework memory accounting in perf_mmap()
perf/core: Fix inheritance of aux_output groups
perf annotate: Don't return -1 for error when doing BPF disassembly
perf annotate: Return appropriate error code for allocation failures
perf annotate: Fix arch specific ->init() failure errors
perf annotate: Propagate the symbol__annotate() error return
perf annotate: Fix the signedness of failure returns
perf annotate: Propagate perf_env__arch() error
perf evsel: Fall back to global 'perf_env' in perf_evsel__env()
perf tools: Propagate get_cpuid() error
...
Pull powerpc fixes from Michael Ellerman:
"Fix a kernel crash in spufs_create_root() on Cell machines, since the
new mount API went in.
Fix a regression in our KVM code caused by our recent PCR changes.
Avoid a warning message about a failing hypervisor API on systems that
don't have that API.
A couple of minor build fixes.
Thanks to: Alexey Kardashevskiy, Alistair Popple, Desnes A. Nunes do
Rosario, Emmanuel Nicolet, Jordan Niethe, Laurent Dufour, Stephen
Rothwell"
* tag 'powerpc-5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
spufs: fix a crash in spufs_create_root()
powerpc/kvm: Fix kvmppc_vcore->in_guest value in kvmhv_switch_to_host
selftests/powerpc: Fix compile error on tlbie_test due to newer gcc
powerpc/pseries: Remove confusing warning message.
powerpc/64s/radix: Fix build failure with RADIX_MMU=n
Currently when a new map is mmapped we set its refcnt to 2 in the
perf_evlist_mmap_ops::mmap callback.
Every mmap gets its refcnt set to 2 when it's first mmaped:
- 1 for the current user, which will be taken out by a call to
perf_evlist__munmap_filtered(), where we find out there's
no more data comming from kernel to this mmap.
- 1 for the drain code where in perf_mmap__consume() the mmap
is released if it is empty.
Move this common setup into libperf's generic code before the mmap
callback is called.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-23-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>