Pull perf/core improvements and fixes from Jiri Olsa:
* Factor hists statistics counts processing which in turn also
fixes several bugs in TUI report command (Namhyung Kim)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The hist_browser__reset() is only called right after a filter is
applied so it needs to udpate browser->nr_entries properly. We cannot
use hists->nr_non_filtered_entreis directly since it's possible that
such entries are also filtered out by minimum percentage limit.
In addition when a filter is used for perf top, hist browser's
nr_entries field was not updated after applying the filter. But it
needs to be updated as new samples are coming.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-11-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
The nr_entries variable is increased inside the loop in the function
but it always count the first entry regardless of it's filtered or
not; caused an off-by-one error.
It'd become a problem especially there's no entry at all - it'd get a
segfault during referencing a NULL pointer.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-9-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Currently, accounting each sample is done in multiple places - once
when adding them to the input tree, other when adding them to the
output tree. It's not only confusing but also can cause a subtle
problem since concurrent processing like in perf top might see the
updated stats before adding entries into the output tree - like seeing
more (blank) lines at the end and/or slight inaccurate percentage.
To fix this, only account the entries when it's moved into the output
tree so that they cannot be seen prematurely. There're some
exceptional cases here and there - they should be addressed separately
with comments.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-7-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
The hists->nr_entries is counted in multiple places so that they can
confuse readers of the code. This is a preparation of later change
and do not intend any functional difference.
Note that report__collapse_hists() now changed to return nothing since
its return value (nr_samples) is only for checking if there's any data
in the input file and this can be acheived by checking ->nr_entries.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1398327843-31845-2-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Pull perf/urgent fixes from Jiri Olsa:
* Fix memory leak and backward compatibility macros for pevent
filter enums in traceevent library (Steven Rostedt)
* Disable libdw unwind for all but x86 arch (Jiri Olsa)
* Fix memory leak in sample_ustack (Masanari Iida)
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit 12e55569a2 "tools lib traceevent: Use helper trace-seq in print
functions like kernel does" added a extra trace_seq helper to process
string arguments like the kernel does it. But the difference between the
kernel and the userspace library is that the kernel's trace_seq structure
has a static allocated buffer. The userspace one has a dynamically
allocated one. It requires a trace_seq_destroy(), otherwise it produces
a nasty memory leak.
Cc: stable@vger.kernel.org # 3.14+
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20140422192330.6bb09bf8@gandalf.local.home
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
The return value for pevent_filter_match() is suppose to return FILTER_NONE
if the event doesn't have a filter, and FILTER_NOEXIST if there is no filter
at all. But the change 41e12e580a "tools lib traceevent: Refactor
pevent_filter_match() to get rid of die()" replaced the return value
with PEVENT_ERRNO__* values and added "backward compatibility" macros
that used the old names. Unfortunately, the NOEXIST and NONE macros were
swapped, and this broke users that use the old return names.
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Link: http://lkml.kernel.org/r/20140421222346.0351ced4@gandalf.local.home
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
This patch figures out the max number of cpus and nodes that are on the
system and creates a map of cpu to node. This allows us to provide a cpu
and quickly get the node associated with it.
It was mostly copied from builtin-kmem.c and tweaked slightly to use less memory
(use possible cpus instead of max). It also calculates the max number of nodes.
Signed-off-by: Don Zickus <dzickus@redhat.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1396896924-129847-2-git-send-email-dzickus@redhat.com
[ Removing out label code in init_cpunode_map ]
[ Adding check for snprintf error ]
[ Removing unneeded returns ]
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
In the current version, when using perf record, if something goes
wrong in tools/perf/builtin-record.c:375
session = perf_session__new(file, false, NULL);
The error message:
"Not enough memory for reading per file header"
is issued. This error message seems to be outdated and is not very
helpful. This patch proposes to replace this error message by
"Perf session creation failed"
I believe this issue has been brought to lkml:
https://lkml.org/lkml/2014/2/24/458
although this patch only tackles a (small) part of the issue.
Additionnaly, this patch improves error reporting in
tools/perf/util/data.c open_file_write.
Currently, if the call to open fails, the user is unaware of it.
This patch logs the error, before returning the error code to
the caller.
Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Adrien BAK <adrien.bak@metascale.org>
Link: http://lkml.kernel.org/r/1397786443.3093.4.camel@beast
[ Reorganize the changelog into paragraphs ]
[ Added empty line after fd declaration in open_file_write ]
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
pert-report doesn't resolve function names in VDSO:
$ perf report --stdio -g flat,0.0,15,callee --sort pid
...
8.76%
0x7fff6b1fe861
__gettimeofday
ACE_OS::gettimeofday()
...
In this case symbol values should be adjusted the same way as for executables,
relocatable objects and prelinked libraries.
After fix:
$ perf report --stdio -g flat,0.0,15,callee --sort pid
...
8.76%
__vdso_gettimeofday
__gettimeofday
ACE_OS::gettimeofday()
Signed-off-by: Vladimir Nikulichev <nvs@tbricks.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Link: http://lkml.kernel.org/r/969812.163009436-sendEmail@nvs
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Every event in the perf-kvm has a 'stats' structure, which contains
max/min/average/etc times of handling this event.
The problem is that the 'perf-kvm stat report' command always shows
that 'min time' is 0us for every event. Example:
# perf kvm stat report
Analyze events for all VCPUs:
VM-EXIT Samples Samples% Time% Min Time Max Time Avg time
[..]
0xB2 MSCH 12 0.07% 0.00% 0us 8us 7.31us ( +- 2.11% )
0xB2 CHSC 12 0.07% 0.00% 0us 18us 9.39us ( +- 9.49% )
0xB2 STPX 8 0.05% 0.00% 0us 2us 1.88us ( +- 7.18% )
0xB2 STSI 7 0.04% 0.00% 0us 44us 16.49us ( +- 38.20% )
[..]
This happens because the 'stats' structure is not initialized and
stats->min equals to 0. Lets initialize the structure for every
event after its allocation using init_stats() function. This initializes
stats->min to -1 and makes 'Min time' statistics counting work:
# perf kvm stat report
Analyze events for all VCPUs:
VM-EXIT Samples Samples% Time% Min Time Max Time Avg time
[..]
0xB2 MSCH 12 0.07% 0.00% 6us 8us 7.31us ( +- 2.11% )
0xB2 CHSC 12 0.07% 0.00% 7us 18us 9.39us ( +- 9.49% )
0xB2 STPX 8 0.05% 0.00% 1us 2us 1.88us ( +- 7.18% )
0xB2 STSI 7 0.04% 0.00% 1us 44us 16.49us ( +- 38.20% )
[..]
Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Link: http://lkml.kernel.org/r/1397053319-2130-3-git-send-email-borntraeger@de.ibm.com
[ Fixing the perf examples changelog output ]
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Pull char/misc driver fixes from Greg KH:
"Here are a few driver fixes for char/misc drivers that resolve
reported issues.
All have been in linux-next successfully for a few days"
* tag 'char-misc-3.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Drivers: hv: vmbus: Negotiate version 3.0 when running on ws2012r2 hosts
Tools: hv: Handle the case when the target file exists correctly
vme_tsi148: Utilize to_pci_dev() macro
vme_tsi148: Fix PCI address mapping assumption
vme_tsi148: Fix typo in tsi148_slave_get()
w1: avoid recursive device_add
w1: fix netlink refcnt leak on error path
misc: Grammar s/addition/additional/
drivers: mcb: fix memory leak in chameleon_parse_cells() error path
mei: ignore client writing state during cb completion
mei: me: do not load the driver if the FW doesn't support MEI interface
GenWQE: Increase driver version number
GenWQE: Fix multithreading problems
GenWQE: Ensure rc is not returning an uninitialized value
GenWQE: Add wmb before DDCB is started
GenWQE: Enable access to VPD flash area
Pull perf fixes from Ingo Molnar:
"Tooling fixes, plus a simple hardware-enablement patch for the Intel
RAPL PMU (energy use measurement) on Haswell CPUs, which I hope is
still fine at this stage"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf tools: Instead of redirecting flex output, use -o
perf tools: Fix double free in perf test 21 (code-reading.c)
perf stat: Initialize statistics correctly
perf bench: Set more defaults in the 'numa' suite
perf bench: Fix segfault at the end of an 'all' execution
perf bench: Update manpage to mention numa and futex
perf probe: Use dwarf_getcfi_elf() instead of dwarf_getcfi()
perf probe: Fix to handle errors in line_range searching
perf probe: Fix --line option behavior
perf tools: Pick up libdw without explicit LIBDW_DIR
MAINTAINERS: Change e-mail to kernel.org one
perf callchains: Disable unwind libraries when libelf isn't found
tools lib traceevent: Do not call warning() directly
tools lib traceevent: Print event name when show warning if possible
perf top: Fix documentation of invalid -s option
perf/x86: Enable DRAM RAPL support on Intel Haswell
Return the appropriate error code and handle the case when the target
file exists correctly. This fixes a bug.
Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: <stable@vger.kernel.org> [3.14]
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Add hist.percentage option for setting default value of the
symbol_conf.filter_relative. It affects the output of various perf
commands (like perf report, top and diff) only if filter(s) applied.
An user can write .perfconfig file like below to show absolute
percentage of filtered entries by default:
$ cat ~/.perfconfig
[hist]
percentage = absolute
And it can be changed through command line:
$ perf report --percentage relative
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-6-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
The --percentage option is for controlling overhead percentage
displayed. It can only receive either of "relative" or "absolute".
Move the parser callback function into a common location since it's
used by multiple commands now.
For more information, please see previous commit same thing done to
"perf report".
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-4-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
The --percentage option is for controlling overhead percentage
displayed. It can only receive either of "relative" or "absolute".
"relative" means it's relative to filtered entries only so that the
sum of shown entries will be always 100%. "absolute" means it retains
the original value before and after the filter is applied.
$ perf report -s comm
# Overhead Command
# ........ ............
#
74.19% cc1
7.61% gcc
6.11% as
4.35% sh
4.14% make
1.13% fixdep
...
$ perf report -s comm -c cc1,gcc --percentage absolute
# Overhead Command
# ........ ............
#
74.19% cc1
7.61% gcc
$ perf report -s comm -c cc1,gcc --percentage relative
# Overhead Command
# ........ ............
#
90.69% cc1
9.31% gcc
Note that it has zero effect if no filter was applied.
Suggested-by: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-3-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
When filtering by thread, dso or symbol on TUI it also update total
period so that the output shows different result than no filter - the
percentage changed to relative to filtered entries only. Sometimes
this is not desired since users might expect same results with filter.
So new filtered_* fields to hists->stats to count them separately.
They'll be controlled/used by user later.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1397145720-8063-2-git-send-email-namhyung@kernel.org
Signed-off-by: Jiri Olsa <jolsa@redhat.com>