Linus reported that 'perf report -s symbol' sometimes exits without any
message in the TUI. David and Jiri found that this happens because it
fails to add a hist entry due to an invalid symbol length.
It turns out that sorting by symbol (address) was broken because it only
compared symbol addresses. A symbol address is relative to its dso, so
comparing addresses alone can merge unrelated symbols from different
dsos. Fix it by comparing the dso before comparing the symbol address.
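The idea can be illustrated with a minimal sketch. The types and the name sym_sketch_cmp below are hypothetical, simplified stand-ins and not the actual perf structures; ordering dsos by pointer value is an assumption made only to get a consistent order within one run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for perf's dso and symbol types. */
struct dso_sketch { const char *name; };
struct sym_sketch {
	const struct dso_sketch *dso;	/* object the symbol belongs to */
	uint64_t start;			/* address relative to that dso */
};

/* Three-way compare that cannot overflow, unlike a subtraction. */
static int cmp_u64(uint64_t a, uint64_t b)
{
	return (a > b) - (a < b);
}

/* Order by dso identity first; compare addresses only within one dso. */
int sym_sketch_cmp(const struct sym_sketch *l, const struct sym_sketch *r)
{
	assert(l != NULL && r != NULL);
	if (l->dso != r->dso)
		return cmp_u64((uintptr_t)l->dso, (uintptr_t)r->dso);
	return cmp_u64(l->start, r->start);
}
```

With this ordering, two symbols that share a relative address but live in different dsos never compare equal, so they are not merged.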
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1381802517-18812-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The current collapse stage has a scalability problem that can be
reproduced easily with a parallel kernel build.
This is because it needs to traverse all children of a callchain node
linearly during the collapse/merge stage.
Converting the children list to an rbtree reduced the overhead significantly.
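To show why a tree helps, here is a minimal sketch of a keyed child lookup. It uses a plain unbalanced binary search tree rather than the kernel's rbtree, and struct chain_child and chain_child_get are hypothetical names; a real rbtree additionally rebalances on insert so that the height stays logarithmic.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical callchain child node, keyed by instruction pointer. */
struct chain_child {
	uint64_t ip;
	uint64_t hits;
	struct chain_child *left, *right;
};

/*
 * Find the child for 'ip', creating it if it is missing. Each step
 * discards one subtree, so a lookup costs O(height) rather than
 * O(number of children). Returns NULL only on allocation failure.
 */
struct chain_child *chain_child_get(struct chain_child **root, uint64_t ip)
{
	struct chain_child **link = root;

	assert(root != NULL);
	while (*link) {
		if (ip == (*link)->ip)
			return *link;
		link = ip < (*link)->ip ? &(*link)->left : &(*link)->right;
	}
	*link = calloc(1, sizeof(**link));
	if (*link)
		(*link)->ip = ip;
	return *link;
}
```

A linear list would have to walk every sibling for each merged entry, which is what made the collapse stage expensive for wide callchains.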
On my 400MB perf.data file, which was recorded with a 'make -j32' kernel build:
$ time perf --no-pager report --stdio > /dev/null
before:
real 6m22.073s
user 6m18.683s
sys 0m0.706s
after:
real 0m20.780s
user 0m19.962s
sys 0m0.689s
During the perf report the overhead on append_chain_children went down
from 96.69% to 18.16%:
- 18.16% perf perf [.] append_chain_children
- append_chain_children
- 77.48% append_chain_children
+ 69.79% merge_chain_branch
- 22.96% append_chain_children
+ 67.44% merge_chain_branch
+ 30.15% append_chain_children
+ 2.41% callchain_append
+ 7.25% callchain_append
+ 12.26% callchain_append
+ 10.22% merge_chain_branch
+ 11.58% perf perf [.] dso__find_symbol
+ 8.02% perf perf [.] sort__comm_cmp
+ 5.48% perf libc-2.17.so [.] malloc_consolidate
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1381468543-25334-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Initially it tries to find a probe:vfs_getname probe, which should be set up
with:
perf probe 'vfs_getname=getname_flags:65 pathname=result->name:string'
or with slight changes to cope with code flux in the getname_flags code.
In the future, if a "vfs:getname" tracepoint becomes available, then it
will be preferred.
The probe is not strictly required: the more expensive method of reading
the /proc/pid/fd/ symlink will be used when the fd->path array entry is
not populated by a previous vfs_getname + open syscall ret sequence.
As with any other 'perf probe' probe, the setup must be done just once
and the probe will be left inactive, waiting for users, be it 'perf
trace' or any other tool.
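The fallback path can be sketched as follows. This is a minimal illustration that assumes a Linux /proc filesystem; fd_path_from_proc is a hypothetical name and not the function used in 'perf trace'.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Resolve the path behind 'fd' of process 'pid' by reading the
 * /proc/<pid>/fd/<fd> symlink. Returns 0 on success, -1 on failure.
 */
int fd_path_from_proc(pid_t pid, int fd, char *buf, size_t size)
{
	char link[64];
	ssize_t n;

	assert(buf != NULL && size > 0);
	snprintf(link, sizeof(link), "/proc/%d/fd/%d", (int)pid, fd);
	n = readlink(link, buf, size - 1);	/* readlink() does not NUL-terminate */
	if (n < 0)
		return -1;
	buf[n] = '\0';
	return 0;
}
```

Each call costs a syscall and a path lookup, which is why a cached fd->path entry filled from the probe is cheaper when it is available.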
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-ujg8se8glq5izmu8cdkq15po@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
* kcore annotation improvements, including build-id cache support,
multi map 'call' instruction navigation fixes, kcore address
validation, objdump workarounds. From Adrian Hunter.
* 'trace' beautifiers for lots of syscall arguments, from Arnaldo Carvalho de Melo.
* More compact 'trace' output by suppressing zeroed args, from Arnaldo Carvalho de Melo.
* Show thread COMM by default in 'trace', from Arnaldo Carvalho de Melo.
* Show path associated with fd in live sessions, using a 'vfs_getname'
'perf probe' created dynamic tracepoint or by looking at /proc/pid/fd, from Arnaldo Carvalho de Melo.
* Memory and mmap leak fixes from Chenggang Qin.
* Add option to show full timestamp in 'trace', from David Ahern.
* Add 'record' command in 'trace', to record raw_syscalls:*, from David Ahern.
* Add summary option to dump syscall statistics in 'trace', from David Ahern.
* Fix comm resolution in 'trace' when reading events from file, from David Ahern.
* Improved messages when doing profiling in all or a subset of CPUs
using a workload as the session delimiter, as in:
'perf stat --cpu 0,2 sleep 10s'
from Arnaldo Carvalho de Melo.
* Add units to nanosec-based counters in 'perf stat', from David Ahern.
* Assorted build fixes from David Ahern and Jiri Olsa.
* 'perf lock' fixes and cleanups, from Davidlohr Bueso.
* Memory leak fixes in 'perf test', from Felipe Pena.
* Build system super speedups, from Ingo Molnar.
* Fix mmap_read event overflow, from Jiri Olsa.
* Code cleanups from Jiri Olsa.
* Allow specifying B/K/M/G unit to the --mmap-pages arguments, from Jiri Olsa.
* Split the GTK support into a separate libperf-gtk.so DSO, that is
only loaded when --gtk is specified, from Namhyung Kim.
* Fixes for some memory leaks, from Namhyung Kim.
* Fix srcline sort key behavior, from Namhyung Kim.
* Fix failing assertions in numa bench, from Petr Holasek.
* perf bash completion fixes and improvements from Ramkumar Ramachandra.
* Improve error messages in 'trace', providing hints about system configuration
steps needed for using it, from Ramkumar Ramachandra.
* Remove bogus info when using 'perf stat' -e cycles/instructions, from
Ramkumar Ramachandra.
* Support for Openembedded/Yocto -dbg packages, from Ricardo Ribalda Delgado.
* Implement addr2line directly using libbfd, from Roberto Vitillo.
* Add new option --ignore-vmlinux for perf top, from Willy Tarreau.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
kcore can be used to view the running kernel object code. However,
kcore changes as modules are loaded and unloaded, and when the kernel
decides to modify its own code. Consequently it is useful to create a
copy of kcore at a particular time. Unlike vmlinux, kcore is not unique
for a given build-id. In addition, the kallsyms and modules files
are needed. The tool therefore creates a directory:
~/.debug/[kernel.kcore]/<build-id>/<YYYYmmddHHMMSShh>
which contains: kcore, kallsyms and modules.
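A minimal sketch of how such a directory name could be assembled is shown below. The function name kcore_cache_dir and its parameters are hypothetical, and reading the trailing "hh" field as hundredths of a second is an assumption of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/*
 * Format "<debugdir>/[kernel.kcore]/<build-id>/<YYYYmmddHHMMSShh>".
 * 'hundredths' fills the trailing "hh" field (assumed here to be
 * hundredths of a second). Returns 0 on success, -1 if the result
 * does not fit in 'buf'.
 */
int kcore_cache_dir(char *buf, size_t size, const char *debugdir,
		    const char *build_id, const struct tm *tm, int hundredths)
{
	char stamp[32];
	int n;

	assert(hundredths >= 0 && hundredths < 100);
	if (strftime(stamp, sizeof(stamp), "%Y%m%d%H%M%S", tm) == 0)
		return -1;
	n = snprintf(buf, size, "%s/[kernel.kcore]/%s/%s%02d",
		     debugdir, build_id, stamp, hundredths);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Because the timestamp is part of the path, several snapshots of kcore can coexist under the same build-id.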
Note that the copied kcore contains only code sections. See the
kcore_copy() function for how that is determined.
The tool will not make additional copies of kcore if there is already
one with the same modules at the same addresses.
Currently, perf tools will not look for kcore in the cache. That is
addressed in another patch.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/525BF849.5030405@intel.com
[ renamed 'index' to 'idx' to avoid shadowing string.h symbol in f12,
use at least one member initializer when initializing a struct to
zeros, also to fix the build on f12 ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim noticed that the autodep .d file inclusion rule was
unnecessarily complicated:
> > +-include *.d */*.d
>
> Hmm.. this */*.d part is really needed?
Only include *.d files.
Reported-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim noticed that the stackprotector testcase was incomplete:
> The flag being checked should be -"W"stack-protector instead of
> -"f"stack-protector. And the gcc manpage says that -Wstack-protector is
> only active when -fstack-protector is active. So the end result should
> look like
>
> $(BUILD) -Werror -fstack-protector -Wstack-protector
Add -Wstack-protector.
Reported-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim noticed that the volatile-register-var feature check
is superfluous:
> The gcc manpage says this warning is enabled by -Wall, and we add -Wall
> to CFLAGS before doing feature checks. So all gcc versions that support
> -Wvolatile-register-var enables it by default without this check and
> older gcc versions will always fail the feature check.
Remove it - this will further speed up feature checks.
Reported-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Namhyung Kim reported these duplicate DPACKAGE definitions:
test-libbfd:
$(BUILD) -DPACKAGE='perf' -DPACKAGE=perf -lbfd -ldl
Fix all affected places and use Namhyung's suggestion that the
definition should look like a normal C string: -DPACKAGE='"perf"'.
Reported-by: Namhyung Kim <namhyung@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Arnaldo reported that 'make DEBUG=1' does not work anymore.
The reason is that 'Makefile' only passes it through to
'Makefile.perf' via the environment, but 'Makefile.perf'
checks that it's a command line option:
ifeq ("$(origin DEBUG)", "command line")
PERF_DEBUG = $(DEBUG)
endif
So pass it through properly, and also clean up DEBUG parameter
handling while at it and fix a couple of annoyances:
- DEBUG=0 used to be interpreted as 'debugging on'. Turn it
into 'debugging off' instead.
- Same was the case for 'DEBUG=' - turn that into debug-off
as well.
- Pass in just a clean, sanitized 'DEBUG' value and get rid of
the intermediate, unnecessary PERF_DEBUG variable.
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Old GCC (4.4.2) does not see through the code flow of get_srcline() and
gets confused about the status of 'file' and 'line':
CC /tmp/build/perf/util/srcline.o
cc1: warnings being treated as errors
util/srcline.c: In function 'get_srcline':
util/srcline.c:226: error: 'file' may be used uninitialized in this function
util/srcline.c:227: error: 'line' may be used uninitialized in this function
make[1]: *** [/tmp/build/perf/util/srcline.o] Error 1
make: *** [install] Error 2
make: Leaving directory `/home/acme/git/linux/tools/perf'
[acme@fedora12 linux]$
Help out GCC by initializing 'file' and 'line'.
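The pattern looks roughly like the sketch below. lookup_sketch and srcline_sketch are hypothetical stand-ins, not the code in util/srcline.c; they only show outputs that are written on the success path and therefore get explicit initial values in the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lookup that sets its outputs only on success. */
static bool lookup_sketch(unsigned long addr, const char **file, unsigned *line)
{
	if (addr == 0)
		return false;	/* outputs left untouched on failure */
	*file = "example.c";
	*line = 42;
	return true;
}

/* Returns the line number for 'addr', or 0 if it cannot be resolved. */
unsigned srcline_sketch(unsigned long addr, const char **file_out)
{
	const char *file = NULL;	/* explicit initial values: an older  */
	unsigned line = 0;		/* compiler may not prove assignment  */

	assert(file_out != NULL);
	if (!lookup_sketch(addr, &file, &line))
		return 0;
	*file_out = file;
	return line;
}
```

The initializers cost nothing at run time in practice and keep the build clean when warnings are treated as errors.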
Signed-off-by: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Link: http://lkml.kernel.org/n/tip-h8k7h49z3cndqgjdftkmm9f8@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Currently, execution of 'perf trace' reports the following cryptic
message to the user:
$ perf trace
Couldn't read the raw_syscalls tracepoints information!
Typically this happens because the user does not have permissions to
read the debugfs filesystem. Also handle the case when the kernel was
not compiled with debugfs support or when it isn't mounted.
Now, the tool prints detailed error messages:
$ perf trace
Error: Unable to find debugfs
Hint: Was your kernel was compiled with debugfs support?
Hint: Is the debugfs filesystem mounted?
Hint: Try 'sudo mount -t debugfs nodev /sys/kernel/debug'
$ perf trace
Error: No permissions to read /sys/kernel/debug//tracing/events/raw_syscalls
Hint: Try 'sudo mount -o remount,mode=755 /sys/kernel/debug/'
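The mapping from failure cause to hints can be sketched as below. This is an illustration only: trace_open_error is a hypothetical helper, the message wording is abbreviated, and the default branch is an assumption about how other errors might be reported.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Turn the errno from a failed tracepoint-directory open into a
 * message with hints. Returns the number of characters snprintf()
 * would have written.
 */
int trace_open_error(int err, const char *debugfs_mnt, char *buf, size_t size)
{
	assert(buf != NULL && debugfs_mnt != NULL);
	switch (err) {
	case ENOENT:
		return snprintf(buf, size,
			"Error: Unable to find debugfs\n"
			"Hint: Is the debugfs filesystem mounted?\n"
			"Hint: Try 'sudo mount -t debugfs nodev /sys/kernel/debug'\n");
	case EACCES:
		return snprintf(buf, size,
			"Error: No permissions to read %s/tracing/events/raw_syscalls\n"
			"Hint: Try 'sudo mount -o remount,mode=755 %s'\n",
			debugfs_mnt, debugfs_mnt);
	default:
		return snprintf(buf, size, "Error: failed with errno %d\n", err);
	}
}
```

Passing the current mount point in, instead of hard-coding it, keeps the permission hint accurate when debugfs is mounted somewhere else.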
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/1380863851-14460-1-git-send-email-artagnon@gmail.com
[ Added ready-to-use commands to fix the issues as extra hints, use the
current debugfs mount point when reporting permission error, use
strerror_r instead of the deprecated sys_errlist, as reported by David Ahern ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>