'perf stat' accepts some config terms but doesn't apply them. For
example:
# perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
# ls
# exit
Performance counter stats for 'bash':
266258061 instructions/no-inherit/
266258061 instructions/inherit/
1.402183915 seconds time elapsed
The result is confusing, because user may expect the first
'instructions' event exclude the 'ls' command.
This patch forbid most of these config terms for 'perf stat'.
Result:
# ./perf stat -e 'instructions/no-inherit/' -e 'instructions/inherit/' bash
event syntax error: 'instructions/no-inherit/'
\___ 'no-inherit' is not usable in 'perf stat'
...
We can add blocked config terms back when 'perf stat' really supports them.
This patch also removes unavailable config term from error message:
# ./perf stat -e 'instructions/badterm/' ls
event syntax error: 'instructions/badterm/'
\___ unknown term
valid terms: config,config1,config2,name
# ./perf stat -e 'cpu/badterm/' ls
event syntax error: 'cpu/badterm/'
\___ unknown term
valid terms: pc,any,inv,edge,cmask,event,in_tx,ldlat,umask,in_tx_cp,offcore_rsp,config,config1,config2,name
Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1455882283-79592-11-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Using 4 kHz is not necessary and sometimes is more than what was
auto-tuned:
# dmesg | grep max_sample_rate | tail -2
[ 2499.144373] perf interrupt took too long (2501 > 2500), lowering kernel.perf_event_max_sample_rate to 50000
[ 3592.413606] perf interrupt took too long (5069 > 5000), lowering kernel.perf_event_max_sample_rate to 25000
Simulating a auto-tune of 2000 we make the test fail, as reported
by Steven Noonan for one of his machines, so reduce it to 500 HZ,
it is enough to get a good number of samples for this test:
# perf test -v 21 2>&1 | grep '^Reading object code for memory address' | tee /tmp/out | tail -5
Reading object code for memory address: 0x479f40
Reading object code for memory address: 0x7f29b7eea80d
Reading object code for memory address: 0x7f29b7eea80d
Reading object code for memory address: 0x7f29b7eea800
Reading object code for memory address: 0xffffffff813b2f23
[root@jouet ~]# wc -l /tmp/out
40 /tmp/out
[root@jouet ~]#
For systems that auto-tune below that, the previous patches will tell the
user what is happening so that he may either ignore the result of this test or
bump /proc/sys/kernel/perf_event_max_sample_rate.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Noonan <steven@uplinklabs.net>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-6kufyy1iprdfzrbtuqgxir70@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Before:
# perf test -v "code reading" 2>&1 | tail -4
perf_evlist__open failed
test child finished with -1
---- end ----
Test object code reading: FAILED!
#
After:
# perf test -v "code reading" 2>&1 | tail -7
perf_evlist__open() failed!
Error: Invalid argument.
Hint: Check /proc/sys/kernel/perf_event_max_sample_rate.
Hint: The current value is 1000 and 4000 is being requested.
test child finished with -1
---- end ----
Test object code reading: FAILED!
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Noonan <steven@uplinklabs.net>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-ifbx7vmrc38loe6317owz2jx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When running the "code reading" test we get:
# perf test -v "code reading" 2>&1 | tail -5
Parsing event 'cycles:u'
perf_evlist__open failed
test child finished with -1
---- end ----
Test object code reading: FAILED!
#
And with -vv we get the errno value, -22, i.e. -EINVAL, but we can do
better and handle the case at hand, with this patch it becomes:
# perf test -v "code reading" 2>&1 | tail -7
perf_evlist__open() failed!
Error: Invalid argument.
Hint: Check /proc/sys/kernel/perf_event_max_sample_rate.
Hint: The current value is 1000 and 4000 is being requested.
test child finished with -1
---- end ----
Test object code reading: FAILED!
#
Next patch will make this 'perf test' entry to use perf_evlist__strerror()
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Noonan <steven@uplinklabs.net>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i31ai6kfefn75eapejjokfhc@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Allow user to easily switch all events to user or kernel space with simple
--all-user or --all-kernel options.
This will be handy within perf mem/c2c wrappers to switch easily monitoring
modes.
Committer note:
Testing it:
# perf record --all-kernel --all-user -a sleep 2
Error: option `all-user' cannot be used with all-kernel
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
--all-user Configure all used events to run in user space.
--all-kernel Configure all used events to run in kernel space.
# perf record --all-user --all-kernel -a sleep 2
Error: option `all-kernel' cannot be used with all-user
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
--all-kernel Configure all used events to run in kernel space.
--all-user Configure all used events to run in user space.
# perf record --all-user -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.416 MB perf.data (162 samples) ]
# perf report | grep '\[k\]'
# perf record --all-kernel -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.423 MB perf.data (296 samples) ]
# perf report | grep '\[\.\]'
#
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1455525293-8671-2-git-send-email-jolsa@kernel.org
[ Made those options to be mutually exclusive ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We were dropping the reference we possibly held but not obtaining one
for the new maps, which we will drop at perf_evlist__delete(), fix it.
This was caught by Steven Noonan in some of the machines which would
produce this output when caught by glibc debug mechanisms:
$ sudo perf test 21
21: Test object code reading :***
Error in `perf': corrupted double-linked list: 0x00000000023ffcd0 ***
======= Backtrace: =========
/usr/lib/libc.so.6(+0x72055)[0x7f25be0f3055]
/usr/lib/libc.so.6(+0x779b6)[0x7f25be0f89b6]
/usr/lib/libc.so.6(+0x7a0ed)[0x7f25be0fb0ed]
/usr/lib/libc.so.6(__libc_calloc+0xba)[0x7f25be0fceda]
perf(parse_events_lex_init_extra+0x38)[0x4cfff8]
perf(parse_events+0x55)[0x4a0615]
perf(perf_evlist__config+0xcf)[0x4eeb2f]
perf[0x479f82]
perf(test__code_reading+0x1e)[0x47ad4e]
perf(cmd_test+0x5dd)[0x46452d]
perf[0x47f4e3]
perf(main+0x603)[0x42c723]
/usr/lib/libc.so.6(__libc_start_main+0xf0)[0x7f25be0a1610]
perf(_start+0x29)[0x42c859]
Further investigation using valgrind led to the reference count imbalance fixed
in this patch.
Reported-and-Tested-by: Steven Noonan <steven@uplinklabs.net>
Report-Link: http://lkml.kernel.org/r/CAKbGBLjC2Dx5vshxyGmQkcD+VwiAQLbHoXA9i7kvRB2-2opHZQ@mail.gmail.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: f30a79b012 ("perf tools: Add reference counting for cpu_map object")
Link: http://lkml.kernel.org/n/tip-j0u1bdhr47sa511sgg76kb8h@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If CPU_DOWN_PREPARE fails the perf hotplug notifier is called for
CPU_DOWN_FAILED and calls perf_event_init_cpu(), which checks whether the
swhash is referenced. If yes it allocates a new hash and stores the pointer in
the per cpu data structure.
But at this point the cpu is still online, so there must be a valid hash
already. By overwriting the pointer the existing hash is not longer
accessible.
Remove the CPU_DOWN_FAILED state, as there is nothing to (re)allocate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/20160209201007.763417379@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
User visible changes:
- Make 'perf record' collect CPU cache info in the perf.data file header:
$ perf record usleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
$ perf report --header-only -I | tail -10 | head -8
# CPU cache info:
# L1 Data 32K [0-1]
# L1 Instruction 32K [0-1]
# L1 Data 32K [2-3]
# L1 Instruction 32K [2-3]
# L2 Unified 256K [0-1]
# L2 Unified 256K [2-3]
# L3 Unified 4096K [0-3]
$
Will be used in 'perf c2c' and eventually in 'perf diff' to allow, for instance
running the same workload in multiple machines and then when using 'diff' show
the hardware difference. (Jiri Olsa)
- 'perf stat' now shows shadow metrics (insn per cycle, etc) in
interval mode too. E.g:
# perf stat -I 1000 -e instructions,cycles sleep 1
# time counts unit events
1.000215928 519,620 instructions # 0.69 insn per cycle
1.000215928 752,003 cycles
<SNIP>
Infrastructure changes:
- libapi now can also use pr_{warning,info,debug}() and that can be
set by tools using it (Jiri Olsa)
- libapi adopts filename__read_str() from perf, adds sysfs__read_str() (Jiri Olsa)
- Add check for java alternatives cmd in jvmti Makefile, so that it manages
to automatically find the right path for the JDK devel files in Ubuntu like
systems in addition to Fedora like ones (Stephane Eranian)
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Now that we can modify the metrics printout functions easily, it's
straight forward to support metric printing for interval mode. All that
is needed is to print the time stamp on every new line. Pass the prefix
into the context and print it out.
v2: Move wrong hunk to here.
Committer note:
Before:
[root@jouet ~]# perf stat -I 1000 -e instructions,cycles sleep 1
# time counts unit events
1.000168216 538,913 instructions
1.000168216 748,765 cycles
1.000660048 153,741 instructions
1.000660048 214,066 cycles
After:
# perf stat -I 1000 -e instructions,cycles sleep 1
# time counts unit events
1.000215928 519,620 instructions # 0.69 insn per cycle
1.000215928 752,003 cycles
1.000946033 148,502 instructions # 0.33 insn per cycle
1.000946033 160,104 cycles
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454173616-17710-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Abstract the printing of shadow metrics. Instead of every metric calling
fprintf directly and taking care of indentation, use two call backs: one
to print metrics and another to start a new line.
This will allow adding metrics to CSV mode and also using them for other
purposes.
The computation of padding is now done in the central callback, instead
of every metric doing it manually. This makes it easier to add new
metrics.
v2: Refactor functions, printout now does more. Move
shadow printing. Improve fallback callbacks. Don't
use void * callback data.
v3: Remove unnecessary hunk. Add typedef for new_line
v4: Remove unnecessary hunk. Don't print metrics for CSV/interval
mode yet. Move printout change to separate patch.
v5: Fix bisect bugs. Avoid bogus frontend cycles printing.
Fix indentation in different aggregation modes.
v6: Delay newline handling
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454173616-17710-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Adding support for warning/info/debug output within libapi code. Adding
following macros:
pr_warning(fmt, ...)
pr_info(fmt, ...)
pr_debug(fmt, ...)
Also adding libapi_set_print function to set above functions. This will
be used in perf to set standard debug handlers for libapi.
Adding 2 header files:
debug.h
- to be used outside libapi, contains
libapi_set_print interface
debug-internal.h
- to be used within libapi, contains
pr_warning/pr_info/pr_debug definitions
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1455465826-8426-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull cifs fixes from Steve French:
"A small set of cifs fixes.
I am still reviewing some more, recently submitted SMB3 fixes, but
these three are small and safe and ready now"
* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
cifs: fix erroneous return value
cifs: fix potential overflow in cifs_compose_mount_options
cifs: remove redundant check for null string pointer
Pull ARM KVM fixes from Paolo Bonzini:
- Fix for an unpleasant crash when the VM is created without a timer
- Allow HYP mode to access the full PA space, and not only 40bit
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
arm64: KVM: Configure TCR_EL2.PS at runtime
KVM: arm/arm64: Fix reference to uninitialised VGIC
KVM/ARM fixes for 4.5-rc4
- Fix for an unpleasant crash when the VM is created without a timer
- Allow HYP mode to access the full PA space, and not only 40bit
Pull IOMMU SVM fixes from David Woodhouse:
"Minor register size and interrupt acknowledgement fixes which only
showed up in testing on newer hardware, but mostly a fix to the MM
refcount handling to prevent a recursive refcount issue when mmap() is
used on the file descriptor associated with a bound PASID"
* tag 'for-linus-20160216' of git://git.infradead.org/intel-iommu:
iommu/vt-d: Clear PPR bit to ensure we get more page request interrupts
iommu/vt-d: Fix 64-bit accesses to 32-bit DMAR_GSTS_REG
iommu/vt-d: Fix mm refcounting to hold mm_count not mm_users
Pull spi fixes from Mark Brown:
"A small clutch of driver specific fixes.
The OMAP one is a bit worrying since it seems to be triggered by some
changes in the runtime PM core code and I suspect there's other
drivers across that are going to be using the same pattern outside of
OMAP but nothing seems to be coming up in the testing people are
doing"
* tag 'spi-fix-v4.5-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi:
spi: omap2-mcspi: Fix PM regression with deferred probe for pm_runtime_reinit
spi: bcm2835aux: fix bitmask defines
spi: atmel: fix gpio chip-select in case of non-DT platform
spi/fsl-espi: Correct the maximum transaction length
spi: imx: fix spi resource leak with dma transfer
spi: fix counting in spi-loopback-test code