Commit Graph

11645 Commits

Author SHA1 Message Date
Ingo Molnar
1ccb2f4e8e Merge branch 'perf/urgent' into perf/core, to pick up fixes
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-08-10 12:02:26 +02:00
Thomas Richter
f9ea3225dd bpf: fix selftest/bpf/test_pkt_md_access on s390x
Commit 18f3d6be6b ("selftests/bpf: Add test cases to test narrower ctx field loads")
introduced new eBPF test cases. One of them (test_pkt_md_access.c)
fails on s390x. The BPF verifier error message is:

[root@s8360046 bpf]# ./test_progs
test_pkt_access:PASS:ipv4 349 nsec
test_pkt_access:PASS:ipv6 212 nsec
[....]
libbpf: load bpf program failed: Permission denied
libbpf: -- BEGIN DUMP LOG ---
libbpf:
0: (71) r2 = *(u8 *)(r1 +0)
invalid bpf_context access off=0 size=1

libbpf: -- END LOG --
libbpf: failed to load program 'test1'
libbpf: failed to load object './test_pkt_md_access.o'
Summary: 29 PASSED, 1 FAILED
[root@s8360046 bpf]#

This is caused by a byte endianness issue. S390x is a big endian
architecture.  Pointer access to the lowest byte or halfword of a
four byte value needs to add an offset.
On little endian architectures this offset is not needed.
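
The offset rule described above can be sketched as follows. This is a minimal
illustration, not the code from the fix: the helper names
narrow_load_offset(), host_is_big_endian() and load_low_byte() are
hypothetical, and the sketch assumes a four byte field as in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Byte offset of the least significant `size` bytes inside a four byte
 * field. Little endian stores the low byte first, so no offset is needed;
 * big endian stores it last, so the offset is 4 - size.
 */
static size_t narrow_load_offset(size_t size, int big_endian)
{
	assert(size == 1 || size == 2 || size == 4);
	return big_endian ? 4 - size : 0;
}

/* Detect the byte order of the machine this code runs on. */
static int host_is_big_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, sizeof(first));
	return first == 0;
}

/* Read the lowest byte of a four byte value through a byte pointer. */
static uint8_t load_low_byte(const uint32_t *field)
{
	size_t off = narrow_load_offset(1, host_is_big_endian());
	uint8_t byte;

	memcpy(&byte, (const uint8_t *)field + off, sizeof(byte));
	return byte;
}
```

With this rule a one byte load of the low byte uses offset 0 on little endian
and offset 3 on big endian such as s390x.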

Fix this and use the same approach as the originator used for other files
(for example test_verifier.c) in his original commit.

With this fix the test program test_progs succeeds on s390x:
[root@s8360046 bpf]# ./test_progs
test_pkt_access:PASS:ipv4 236 nsec
test_pkt_access:PASS:ipv6 217 nsec
test_xdp:PASS:ipv4 3624 nsec
test_xdp:PASS:ipv6 1722 nsec
test_l4lb:PASS:ipv4 926 nsec
test_l4lb:PASS:ipv6 1322 nsec
test_tcp_estats:PASS: 0 nsec
test_bpf_obj_id:PASS:get-fd-by-notexist-prog-id 0 nsec
test_bpf_obj_id:PASS:get-fd-by-notexist-map-id 0 nsec
test_bpf_obj_id:PASS:get-prog-info(fd) 0 nsec
test_bpf_obj_id:PASS:get-map-info(fd) 0 nsec
test_bpf_obj_id:PASS:get-prog-info(fd) 0 nsec
test_bpf_obj_id:PASS:get-map-info(fd) 0 nsec
test_bpf_obj_id:PASS:get-prog-fd(next_id) 0 nsec
test_bpf_obj_id:PASS:get-prog-info(next_id->fd) 0 nsec
test_bpf_obj_id:PASS:get-prog-fd(next_id) 0 nsec
test_bpf_obj_id:PASS:get-prog-info(next_id->fd) 0 nsec
test_bpf_obj_id:PASS:check total prog id found by get_next_id 0 nsec
test_bpf_obj_id:PASS:get-map-fd(next_id) 0 nsec
test_bpf_obj_id:PASS:get-map-fd(next_id) 0 nsec
test_bpf_obj_id:PASS:get-map-fd(next_id) 0 nsec
test_bpf_obj_id:PASS:get-map-fd(next_id) 0 nsec
test_bpf_obj_id:PASS:get-map-fd(next_id) 0 nsec
test_bpf_obj_id:PASS:get-map-fd(next_id) 0 nsec
test_bpf_obj_id:PASS:get-map-fd(next_id) 0 nsec
test_bpf_obj_id:PASS:check get-map-info(next_id->fd) 0 nsec
test_bpf_obj_id:PASS:get-map-fd(next_id) 0 nsec
test_bpf_obj_id:PASS:check get-map-info(next_id->fd) 0 nsec
test_bpf_obj_id:PASS:check total map id found by get_next_id 0 nsec
test_pkt_md_access:PASS: 277 nsec
Summary: 30 PASSED, 0 FAILED
[root@s8360046 bpf]#

Fixes: 18f3d6be6b ("selftests/bpf: Add test cases to test narrower ctx field loads")
Signed-off-by: Thomas Richter <tmricht@linux.vnet.ibm.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-07 10:06:27 -07:00
Daniel Borkmann
2c460621bb bpf: fix byte order test in test_verifier
We really must check with #if __BYTE_ORDER == XYZ instead of
just presence of #ifdef __LITTLE_ENDIAN. I noticed that when
actually running this on big endian machine, the latter test
resolves to true for user space, same for #ifdef __BIG_ENDIAN.

E.g., looking at endian.h from libc, both are also defined
there, so we really must test this against __BYTE_ORDER instead
for proper insns selection. For the kernel, such checks are
fine though e.g. see 13da9e200f ("Revert "endian: #define
__BYTE_ORDER"") and 415586c9e6 ("UAPI: fix endianness conditionals
in M32R's asm/stat.h") for some more context, but not for
user space. Let's also make sure to properly include endian.h.
After that, suite passes for me:

./test_verifier: ELF 64-bit MSB executable, [...]

Linux foo 4.13.0-rc3+ #4 SMP Fri Aug 4 06:59:30 EDT 2017 s390x s390x s390x GNU/Linux

Before fix: Summary: 505 PASSED, 11 FAILED
After  fix: Summary: 516 PASSED,  0 FAILED
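
The difference between the two kinds of check can be sketched in user space
as follows. This is an illustration under the assumption of a libc whose
endian.h defines both __LITTLE_ENDIAN and __BIG_ENDIAN (as described above);
the function names are hypothetical and not taken from test_verifier.

```c
#include <assert.h>
#include <endian.h>
#include <stdint.h>
#include <string.h>

/*
 * Returns 1 when both macros are defined. With such a libc this holds on
 * every architecture, which is why #ifdef cannot select the byte order.
 */
static int both_endian_macros_defined(void)
{
#if defined(__LITTLE_ENDIAN) && defined(__BIG_ENDIAN)
	return 1;
#else
	return 0;
#endif
}

/* The reliable user space check: compare __BYTE_ORDER against a value. */
static int built_for_little_endian(void)
{
#if __BYTE_ORDER == __LITTLE_ENDIAN
	return 1;
#else
	return 0;
#endif
}

/* Runtime probe used only to cross-check the compile time selection. */
static int running_on_little_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, sizeof(first));
	return first == 1;
}

/* Aborts if the preprocessor picked the wrong byte order. */
static void check_byte_order_selection(void)
{
	assert(built_for_little_endian() == running_on_little_endian());
}
```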

Fixes: 18f3d6be6b ("selftests/bpf: Add test cases to test narrower ctx field loads")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong <yhs@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04 16:09:06 -07:00
Daniel Borkmann
bad1926dd2 bpf, s390: fix build for libbpf and selftest suite
The BPF feature test as well as libbpf is missing the __NR_bpf
define for s390 and currently refuses to compile (selftest suite
depends on libbpf as well). Similar issue was fixed some time
ago via b0c47807d3 ("bpf: Add sparc support to tools and
samples."), just do the same and add definitions.
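
A fallback definition of this kind can be sketched as below. The syscall
numbers are quoted from memory of the per-architecture syscall tables and
should be checked against the kernel headers before use; the wrapper
bpf_syscall_number() is a hypothetical name added for illustration only.

```c
#include <assert.h>
#include <sys/syscall.h>

/* Only provide a number when the installed headers do not define one. */
#ifndef __NR_bpf
# if defined(__i386__)
#  define __NR_bpf 357
# elif defined(__x86_64__)
#  define __NR_bpf 321
# elif defined(__aarch64__)
#  define __NR_bpf 280
# elif defined(__sparc__)
#  define __NR_bpf 349
# elif defined(__s390__)
#  define __NR_bpf 351
# else
#  error __NR_bpf not defined for this architecture
# endif
#endif

/* Fail the build early if the definition is obviously bogus. */
static_assert(__NR_bpf > 0, "__NR_bpf must be a positive syscall number");

/* Hypothetical accessor so callers do not depend on the macro directly. */
static long bpf_syscall_number(void)
{
	return __NR_bpf;
}
```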

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-08-04 11:18:01 -07:00
Linus Torvalds
bc78d646e7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Handle notifier registry failures properly in tun/tap driver, from
    Tonghao Zhang.

 2) Fix bpf verifier handling of subtraction bounds and add a testcase
    for this, from Edward Cree.

 3) Increase reset timeout in ftgmac100 driver, from Ben Herrenschmidt.

 4) Fix use after free in prb_retire_rx_blk_timer_expired() in AF_PACKET,
    from Cong Wang.

 5) Fix SELinux regression due to recent UDP optimizations, from Paolo
    Abeni.

 6) We accidentally increment IPSTATS_MIB_FRAGFAILS in the ipv6 code
    paths, fix from Stefano Brivio.

 7) Fix some mem leaks in dccp, from Xin Long.

 8) Adjust MDIO_BUS kconfig deps to avoid build errors, from Arnd
    Bergmann.

 9) MAC address length check and buffer size fixes from Cong Wang.

10) Don't leak sockets in ipv6 udp early demux, from Paolo Abeni.

11) Fix return value when copy_from_user() fails in
    bpf_prog_get_info_by_fd(), from Daniel Borkmann.

12) Handle PHY_HALTED properly in phy library state machine, from
    Florian Fainelli.

13) Fix OOPS in fib_sync_down_dev(), from Ido Schimmel.

14) Fix truesize calculation in virtio_net which led to performance
    regressions, from Michael S Tsirkin.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (76 commits)
  samples/bpf: fix bpf tunnel cleanup
  udp6: fix jumbogram reception
  ppp: Fix a scheduling-while-atomic bug in del_chan
  Revert "net: bcmgenet: Remove init parameter from bcmgenet_mii_config"
  virtio_net: fix truesize for mergeable buffers
  mv643xx_eth: fix of_irq_to_resource() error check
  MAINTAINERS: Add more files to the PHY LIBRARY section
  ipv4: fib: Fix NULL pointer deref during fib_sync_down_dev()
  net: phy: Correctly process PHY_HALTED in phy_stop_machine()
  sunhme: fix up GREG_STAT and GREG_IMASK register offsets
  bpf: fix bpf_prog_get_info_by_fd to dump correct xlated_prog_len
  tcp: avoid bogus gcc-7 array-bounds warning
  net: tc35815: fix spelling mistake: "Intterrupt" -> "Interrupt"
  bpf: don't indicate success when copy_from_user fails
  udp6: fix socket leak on early demux
  net: thunderx: Fix BGX transmit stall due to underflow
  Revert "vhost: cache used event for better performance"
  team: use a larger struct for mac address
  net: check dev->addr_len for dev_set_mac_address()
  phy: bcm-ns-usb3: fix MDIO_BUS dependency
  ...
2017-07-31 22:36:42 -07:00
Ingo Molnar
c3a3800fe4 Merge tag 'perf-core-for-mingo-4.14-20170728' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes for 4.14 from Arnaldo Carvalho de Melo:

New features:

 - Add PERF_SAMPLE_CALLCHAIN and PERF_RECORD_MMAP[2] to 'perf data' CTF
   conversion, allowing CTF trace visualization tools to show callchains
   and to resolve symbols (Geneviève Bastien)

Improvements:

 - Use group read for event groups in 'perf stat', reducing overhead when
   groups are defined in the event specification, i.e. when using {} to
   enclose a list of events, asking them to be read at the same time,
   e.g.: "perf stat -e '{cycles,instructions}'" (Jiri Olsa)

Fixes:

 - Do not overwrite perf_sample->weight in 'perf annotate' when
   processing samples, use whatever came from the kernel when
   perf_event_attr.sample_type has PERF_SAMPLE_WEIGHT set or just handle
   its default value, 0, when that is not set and "weight" is one of the
   sort orders chosen (Arnaldo Carvalho de Melo)

 - 'perf annotate --show-total-period' fixes:
    - TUI should show period, not nr_samples
    - Set appropriate column width for period/percent
    - Fix the column header to show "Period" when that is what
      is being asked for
   (Taeung Song, Arnaldo Carvalho de Melo)

 - Use default sort if evlist is empty, fixing pipe mode (David Carrillo-Cisneros)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-30 11:15:37 +02:00
Ingo Molnar
f5db340f19 Merge branch 'perf/urgent' into perf/core, to pick up latest fixes and refresh the tree
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-07-30 11:15:13 +02:00
Geneviève Bastien
6b7007af72 perf data: Add doc when no conversion support compiled
This adds documentation of the required environment variables to the
message reporting that no conversion support is compiled in.

Committer testing:

  $ make -C tools/perf install
  $ perf data convert --all --to-ctf myctftrace
  No conversion support compiled in. perf should be compiled with environment variables LIBBABELTRACE=1 and LIBBABELTRACE_DIR=/path/to/libbabeltrace/
  $

Signed-off-by: Geneviève Bastien <gbastien@versatic.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
Cc: Julien Desfossez <jdesfossez@efficios.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170727181205.24843-3-gbastien@versatic.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-28 16:30:45 -03:00
Geneviève Bastien
f9f6f2a903 perf data: Add mmap[2] events to CTF conversion
This adds the mmap and mmap2 events to the CTF trace obtained from perf
data.

These events will allow CTF trace visualization tools like Trace Compass
to automatically resolve the symbols of the callchain to the
corresponding function or origin library.

To include those events, one needs to convert with the --all option.
Here follows an output of babeltrace:

  $ sudo perf data convert --all --to-ctf myctftrace
  $ babeltrace ./myctftrace
  [19:00:00.000000000] (+0.000000000) perf_mmap2: { cpu_id = 0 },
 { pid = 638, tid = 638, start = 0x7F54AE39E000, filename =
 "/usr/lib/ld-2.25.so" }
  [19:00:00.000000000] (+0.000000000) perf_mmap2: { cpu_id = 0 }, { pid =
 638, tid = 638, start = 0x7F54AE565000, filename =
 "/usr/lib/libudev.so.1.6.6" }
  [19:00:00.000000000] (+0.000000000) perf_mmap2: { cpu_id = 0 }, { pid =
 638, tid = 638, start = 0x7FFC093EA000, filename = "[vdso]" }

Signed-off-by: Geneviève Bastien <gbastien@versatic.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
Cc: Julien Desfossez <jdesfossez@efficios.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170727181205.24843-2-gbastien@versatic.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-28 16:26:06 -03:00
Geneviève Bastien
a3073c8e59 perf data: Add callchain to CTF conversion
The field perf_callchain, if available, is added to the sampling events
during the CTF conversion. It is an array of u64 values.  The
perf_callchain_size field contains the size of the array.

It will allow the analysis of sampling data in trace visualization tools
like Trace Compass. Possible analyses with those data: dynamic
flamegraphs, correlation with other tracing data like a userspace trace.

Here follows a babeltrace CTF output of a trace with callchain:

  $ babeltrace ./myctftrace
  [17:38:45.672760285] (+?.?????????) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81063EE4, perf_tid = 25841, perf_pid = 25774, perf_period = 1, perf_callchain_size = 7, perf_callchain = [ [0] = 0xFFFFFFFFFFFFFF80, [1] = 0xFFFFFFFF81063EE4, [2] = 0xFFFFFFFF8100C770, [3] = 0xFFFFFFFF81006EC6, [4] = 0xFFFFFFFF8118245E, [5] = 0xFFFFFFFF810A9224, [6] = 0xFFFFFFFF8164A4C6 ] }
  [17:38:45.672777672] (+0.000017387) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81063EE4, perf_tid = 25841, perf_pid = 25774, perf_period = 1, perf_callchain_size = 8, perf_callchain = [ [0] = 0xFFFFFFFFFFFFFF80, [1] = 0xFFFFFFFF81063EE4, [2] = 0xFFFFFFFF8100C770, [3] = 0xFFFFFFFF81006EC6, [4] = 0xFFFFFFFF8118245E, [5] = 0xFFFFFFFF810A9224, [6] = 0xFFFFFFFF8164A4C6, [7] = 0xFFFFFFFF8164ABAD ] }
  [17:38:45.672786700] (+0.000009028) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81063EE4, perf_tid = 25841, perf_pid = 25774, perf_period = 70, perf_callchain_size = 3, perf_callchain = [ [0] = 0xFFFFFFFFFFFFFF80, [1] = 0xFFFFFFFF81063EE4, [2] = 0xFFFFFFFF8100C770 ] }

Signed-off-by: Geneviève Bastien <gbastien@versatic.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
Cc: Julien Desfossez <jdesfossez@efficios.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170727181205.24843-1-gbastien@versatic.net
[ Removed PERF_SAMPLE_CALLCHAIN from the TODO list, jolsa ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-28 16:25:07 -03:00
Arnaldo Carvalho de Melo
3861c4a49b perf annotate TUI: Set appropriate column width for period/percent
Either when we start 'perf annotate' or 'perf report' with
--show-total-period or when we, in the annotate browser, press 't' to
toggle period/percent for the first column, we need to adjust the width
for the 'period' case.

Based-on-a-patch-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-n2np5qcs20u6qjdr9orygne6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-28 13:19:32 -03:00
Taeung Song
f67d395c6e perf annotate TUI: Fix column header when toggling period/percent
We have the 't' hotkey to toggle showing either the total period or the
percentage of samples for a given line, but we forgot to toggle as well
the column header, always showing "Percent", even when showing the
period, fix it.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1501172169-6761-1-git-send-email-treeze.taeung@gmail.com
[ Extracted from a larger patch, s/Event count/Period/g ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-28 12:53:08 -03:00
Arnaldo Carvalho de Melo
bc1e5d60ce perf annotate TUI: Clarify calculation of column header widths
In commit f8f4aaead5 ("perf annotate: Finally display IPC and cycle
accounting") the 'pcnt_width' variable was abused in a few places to
also include the optional width of the "IPC" and "cycles" columns, while
in other places we stopped using 'pcnt_width' and instead its previous
equation...

Now we need to tap into annotate_browser__pcnt_width() to consider
whether --show-total-period is being used, so that instead of the
hardcoded 7 (strlen("Percent")) we use either it or
strlen("Event count"). This needs to be properly clarified to avoid
having to touch all the (7 * nr_events) places.

Clarify this by introducing a separate annotate_browser__cycles_width(),
leaving pcnt_width to calculate just what its name implies.
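
The split between the two width calculations can be sketched as follows.
This is not the perf code: the demo_ names and the widths chosen for the
"IPC" and "cycles" columns are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical widths for the optional "IPC" and "cycles" columns. */
enum { DEMO_IPC_WIDTH = 6, DEMO_CYCLES_WIDTH = 6 };

/* Width of the per-event columns only: one header-sized column per event. */
static int demo_pcnt_width(int nr_events, bool show_total_period)
{
	const char *header = show_total_period ? "Event count" : "Percent";

	assert(nr_events >= 0);
	return (int)strlen(header) * nr_events;
}

/* Width of the optional cycle accounting columns, kept separate. */
static int demo_cycles_width(bool have_cycles)
{
	return have_cycles ? DEMO_IPC_WIDTH + DEMO_CYCLES_WIDTH : 0;
}
```

Callers that need the full left-hand width would add the two results, so
neither helper has to know about the other's columns.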

Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/n/tip-szgb07t4k5wtvks8nzwkg710@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-28 12:53:07 -03:00
Taeung Song
29dc267f27 perf annotate TUI: Fix --show-total-period
We were showing the number of samples, not the total period, fix it.

Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: 0c4a5bcea4 ("perf annotate: Display total number of samples with --show-total-period")
Link: http://lkml.kernel.org/r/1500500223-16753-1-git-send-email-treeze.taeung@gmail.com
[ extracted from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-28 12:53:06 -03:00
Arnaldo Carvalho de Melo
bb79a232b0 perf annotate TUI: Use sym_hist_entry in disasm_line_samples
Just paving the way to fix --show-total-period in the TUI, i.e. now
we save in struct disasm_line_samples not just the number of samples,
but also the total period.

Based-on-a-patch-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-1sup5hkwrxocjvrmrmhs732o@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-28 12:53:06 -03:00
Arnaldo Carvalho de Melo
48cc330852 perf annotate: Fix storing per line sym_hist_entry
The existing loop incremented the offset while using it as the array
index. When we moved to an array of sym_hist_entry instances, we should
have moved the increment outside of the array element reference.
Oops, fix it.
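
The corrected loop shape can be sketched as below. The struct is modeled on
the sym_hist_entry named above and demo_sum_range() is a hypothetical helper,
not the function touched by the fix. Writing addr[offset++] in both
statements would advance the index twice per iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Modeled on the per-address histogram entry: sample count plus period. */
struct demo_hist_entry {
	uint64_t nr_samples;
	uint64_t period;
};

/* Sum both members over [start, end), incrementing the index once. */
static void demo_sum_range(const struct demo_hist_entry *addr,
			   size_t start, size_t end,
			   uint64_t *hits, uint64_t *period)
{
	size_t offset = start;

	assert(start <= end);
	*hits = 0;
	*period = 0;
	while (offset < end) {
		*hits += addr[offset].nr_samples;
		*period += addr[offset].period;
		++offset; /* the increment lives outside the element reference */
	}
}
```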

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 461c17f00f ("perf annotate: Store the sample period in each histogram bucket")
Link: http://lkml.kernel.org/n/tip-s3dm6uyrazlpag3f0psfia07@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-28 12:53:05 -03:00
Jakub Kicinski
d777b2ddbe bpf: don't zero out the info struct in bpf_obj_get_info_by_fd()
The buffer passed to bpf_obj_get_info_by_fd() should be initialized
to zeros.  Kernel will enforce that to guarantee we can safely extend
info structures in the future.

Making the bpf_obj_get_info_by_fd() call in libbpf perform the zeroing
is problematic, however, since some members of the info structures
may need to be initialized by the callers (for instance pointers
to buffers to which kernel is to dump translated and jited images).

Remove the zeroing and fix up the in-tree callers before any kernel
has been released with this code.

As Daniel points out this seems to be the intended operation anyway,
since commit 95b9afd398 ("bpf: Test for bpf ID") is itself setting
the buffer pointers before calling bpf_obj_get_info_by_fd().
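
The calling convention argued for above can be sketched with a stand-in.
Everything here is hypothetical: struct demo_prog_info and demo_get_info()
only imitate the shape of the real interface, and demo_get_info() plays the
role of a library call that no longer zeroes the struct.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical info struct with one caller-initialized pointer member. */
struct demo_prog_info {
	uint32_t id;
	uint32_t xlated_len; /* in: buffer capacity, out: image size */
	uint64_t xlated_buf; /* in: address of the caller's dump buffer */
};

/* Stand-in for the library call. It must not memset(info): doing so would
 * wipe the buffer pointer the caller has just filled in. */
static int demo_get_info(struct demo_prog_info *info)
{
	static const uint8_t image[] = { 0xb7, 0x00, 0x00, 0x00 };
	uint32_t cap = info->xlated_len;

	info->id = 1;
	info->xlated_len = (uint32_t)sizeof(image);
	if (info->xlated_buf) {
		uint32_t n = cap < sizeof(image) ? cap : (uint32_t)sizeof(image);

		memcpy((void *)(uintptr_t)info->xlated_buf, image, n);
	}
	return 0;
}

/* Caller side: zero the struct first, then set the input members. */
static int demo_dump_image(uint8_t *buf, uint32_t cap, uint32_t *len)
{
	struct demo_prog_info info;
	int err;

	assert(buf != NULL && len != NULL);
	memset(&info, 0, sizeof(info));
	info.xlated_buf = (uint64_t)(uintptr_t)buf;
	info.xlated_len = cap;
	err = demo_get_info(&info);
	*len = info.xlated_len;
	return err;
}
```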

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-26 17:02:52 -07:00
Arnaldo Carvalho de Melo
ce9ee4a2de perf annotate stdio: Set enough columns for --show-total-period
Now that we set the first column header according to whether
--show-total-period is being used, we need to size it accordingly.

Based-on-a-patch-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-pu504ffnit4m334k09hxcbs3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26 17:16:46 -03:00
David Carrillo-Cisneros
64831a21db perf sort: Use default sort if evlist is empty
Fixes bug noted by Jiri in https://lkml.org/lkml/2017/6/13/755 and
caused by commit d49dadea78 ("perf tools: Make 'trace' or
'trace_fields' sort key default for tracepoint events") not taking into
account that evlist is empty in pipe-mode.

Before this commit, pipe mode only showed a bogus "100.00%  N/A" line
instead of the correct output:

  $ perf record -o - sleep 1 | perf report -i -
  # To display the perf.data header info, please use --header/--header-only options.
  #
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.000 MB - ]
  #
  # Total Lost Samples: 0
  #
  # Samples: 8  of event 'cycles:ppH'
  # Event count (approx.): 145658
  #
  # Overhead  Trace output
  # ........  ............
  #
     100.00%  N/A

Correct output, after patch:

  $ perf record -o - sleep 1 | perf report -i -
  # To display the perf.data header info, please use --header/--header-only options.
  #
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.000 MB - ]
  #
  # Total Lost Samples: 0
  #
  # Samples: 8  of event 'cycles:ppH'
  # Event count (approx.): 191331
  #
  # Overhead  Command  Shared Object      Symbol
  # ........  .......  .................  .................................
  #
      81.63%  sleep    libc-2.19.so       [.] _exit
      13.58%  sleep    ld-2.19.so         [.] do_lookup_x
       2.34%  sleep    [kernel.kallsyms]  [k] context_switch
       2.34%  sleep    libc-2.19.so       [.] __GI___libc_nanosleep
       0.11%  perf     [kernel.kallsyms]  [k] __intel_pmu_enable_a
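
The underlying logic problem can be sketched as follows: "all events are
tracepoints" is vacuously true for an empty list, so the empty case has to be
handled before the loop. The enum and function names are hypothetical and
only model the decision, not the perf sort code.

```c
#include <assert.h>
#include <stddef.h>

enum demo_event_type { DEMO_HARDWARE, DEMO_TRACEPOINT };
enum demo_sort_mode { DEMO_SORT_NORMAL, DEMO_SORT_TRACEPOINT };

/* Pick the default sort mode for a list of event types. */
static enum demo_sort_mode
demo_default_sort_mode(const enum demo_event_type *events, size_t nr)
{
	size_t i;

	assert(events != NULL || nr == 0);

	/* Empty list (as in pipe mode): nothing is known about the events
	 * yet, so fall back to the normal default sort order. */
	if (nr == 0)
		return DEMO_SORT_NORMAL;

	for (i = 0; i < nr; i++)
		if (events[i] != DEMO_TRACEPOINT)
			return DEMO_SORT_NORMAL;

	return DEMO_SORT_TRACEPOINT;
}
```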

Reported-by: Jiri Olsa <jolsa@kernel.org>
Report-Link: https://lkml.kernel.org/r/20170613185422.GA6092@krava
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Turner <pjt@google.com>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: d49dadea78 ("perf tools: Make 'trace' or 'trace_fields' sort key default for tracepoint events")
Link: https://lkml.kernel.org/r/20170721051157.47331-1-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26 17:00:07 -03:00
Arnaldo Carvalho de Melo
c6c13be76c perf annotate: Do not overwrite perf_sample->weight
When we parse an event we may get a value from the kernel in response to
PERF_SAMPLE_WEIGHT being set in perf_event_attr->sample_type, and if it
is not set, then perf_sample->weight will be set to zero, which should
be ok according to a discussion with Andi Kleen [1]:

1: https://lkml.kernel.org/r/20170724174637.GS3044@two.firstfloor.org

Acked-by: Andi Kleen <andi@firstfloor.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-8ev8ufk3lzmvgz37yg9nv3qz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26 16:52:25 -03:00
Jiri Olsa
82bf311e15 perf stat: Use group read for event groups
Make perf stat use group read if there are groups defined. The group
read will get the values for all members of a group within a single
syscall instead of calling the read syscall for every event.

We can see considerably fewer kernel cycles spent on a single group
read than on reading each event separately, as with the following perf
stat command:

  # perf stat -e {cycles,instructions} -I 10 -a sleep 1

Monitored with "perf stat -r 5 -e '{cycles:u,cycles:k}'"

Before:

        24,325,676      cycles:u
       297,040,775      cycles:k

       1.038554134 seconds time elapsed

After:
        25,034,418      cycles:u
       158,256,395      cycles:k

       1.036864497 seconds time elapsed

The perf_evsel__open fallback changes contributed by Andi Kleen.
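
What a single group read returns can be sketched by decoding its buffer.
The sketch assumes the leader was opened with the group, id, time-enabled and
time-running read_format bits, giving the layout nr, time_enabled,
time_running, then one (value, id) pair per member. The demo_ names are
hypothetical and this is not the perf stat implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One counter of the group as delivered by the group read. */
struct demo_group_value {
	uint64_t value;
	uint64_t id;
};

/*
 * Decode a group read buffer of `words` 64-bit words.
 * Returns the number of counters decoded, or 0 if the buffer is too short
 * or `out` cannot hold all members.
 */
static size_t demo_decode_group_read(const uint64_t *buf, size_t words,
				     struct demo_group_value *out, size_t cap,
				     uint64_t *enabled, uint64_t *running)
{
	size_t nr, i;

	assert(buf != NULL && out != NULL);
	assert(enabled != NULL && running != NULL);

	if (words < 3)
		return 0;
	nr = (size_t)buf[0];
	if (nr > cap || (words - 3) / 2 < nr)
		return 0;

	*enabled = buf[1];
	*running = buf[2];
	for (i = 0; i < nr; i++) {
		out[i].value = buf[3 + 2 * i];
		out[i].id = buf[3 + 2 * i + 1];
	}
	return nr;
}
```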

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170726120206.9099-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26 14:25:44 -03:00
Jiri Olsa
f7794d5254 perf evsel: Add read_counter()
Add perf_evsel__read_counter() to read a single or group counter. After
calling this function the counter's evsel::counts struct is filled with
values for the counter and for the members of its group, if there are any.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170726120206.9099-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26 14:21:59 -03:00
Jiri Olsa
de63403bfd perf tools: Add perf_evsel__read_size function
Currently we use the size of struct perf_counts_values to read the
event, which prevents us from adding any new member to the struct.

Add perf_evsel__read_size() to return the size of the buffer needed for
an event read.
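
How such a size can be derived from the read format is sketched below. This
is a standalone illustration, not the function added by the patch: the
DEMO_FORMAT_ constants are local copies of the read_format bits and
demo_read_size() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copies of the perf_event_attr.read_format bits. */
enum {
	DEMO_FORMAT_TOTAL_TIME_ENABLED = 1U << 0,
	DEMO_FORMAT_TOTAL_TIME_RUNNING = 1U << 1,
	DEMO_FORMAT_ID                 = 1U << 2,
	DEMO_FORMAT_GROUP              = 1U << 3,
};

/* Bytes needed to read an event (or a whole group) in the given format. */
static size_t demo_read_size(uint64_t read_format, size_t nr_members)
{
	size_t entry = sizeof(uint64_t); /* the counter value itself */
	size_t size = 0;
	size_t nr = 1;

	assert(nr_members >= 1);

	if (read_format & DEMO_FORMAT_TOTAL_TIME_ENABLED)
		size += sizeof(uint64_t);
	if (read_format & DEMO_FORMAT_TOTAL_TIME_RUNNING)
		size += sizeof(uint64_t);
	if (read_format & DEMO_FORMAT_ID)
		entry += sizeof(uint64_t); /* an id follows every value */
	if (read_format & DEMO_FORMAT_GROUP) {
		nr = nr_members;
		size += sizeof(uint64_t); /* leading member count */
	}

	return size + entry * nr;
}
```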

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170726120206.9099-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26 14:20:28 -03:00
Lin Ma
67fbcd62f5 tools/kvm_stat: add '-f help' to get the available event list
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26 19:04:53 +02:00
Lin Ma
efcb521943 tools/kvm_stat: use variables instead of hard paths in help output
Using variables instead of hard paths makes the requirements information
more accurate.

Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-26 19:04:52 +02:00
Arnaldo Carvalho de Melo
62e6039f02 perf tools: Add tools/include/uapi/asm-generic/fcntl.h to the MANIFEST
This file was copied from the kernel so that we could build tools/perf/
on older systems where some newer defines, such as these, are not
available:

    CC       trace/beauty/fcntl.o
  trace/beauty/fcntl.c: In function ‘syscall_arg__scnprintf_fcntl_arg’:
  trace/beauty/fcntl.c:93:13: error: ‘F_OFD_SETLK’ undeclared (first use in this function)
        cmd == F_OFD_SETLK || cmd == F_OFD_SETLKW || cmd == F_OFD_GETLK ||
               ^
  trace/beauty/fcntl.c:93:13: note: each undeclared identifier is reported only once for each function it appears in
  trace/beauty/fcntl.c:93:35: error: ‘F_OFD_SETLKW’ undeclared (first use in this function)
        cmd == F_OFD_SETLK || cmd == F_OFD_SETLKW || cmd == F_OFD_GETLK ||
                                     ^
  trace/beauty/fcntl.c:93:58: error: ‘F_OFD_GETLK’ undeclared (first use in this function)
        cmd == F_OFD_SETLK || cmd == F_OFD_SETLKW || cmd == F_OFD_GETLK ||
                                                            ^
  mv: cannot stat ‘trace/beauty/.fcntl.o.tmp’: No such file or directory
  make[4]: *** [trace/beauty/fcntl.o] Error 1
  make[3]: *** [trace/beauty] Error 2
  make[3]: *** Waiting for unfinished jobs....
    CC       tests/llvm.o

But we need to make sure that it is also in the tools/perf/MANIFEST file,
which is used to build a tarball for detached (from the kernel sources)
compilation. That build was failing with the above message on a RHEL7.4
system; fix it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 84d1d8a12d ("tools include uapi asm-generic: Grab a copy of fcntl.h")
Link: http://lkml.kernel.org/n/tip-2d5px7aq5stbwi24pgirwtlm@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:46:37 -03:00
Taeung Song
38d2dcd0cc perf annotate stdio: Fix column header when using --show-total-period
Currently the first column header is always "Percent", fix it to show
the correct column name based on the given options, i.e. if using
--show-total-period, show "Event count" as the first column.

Reported-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/c3c902e7-95bc-16d4-366f-12eb034c5c8d@gmail.com
[ Extracted from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:46:37 -03:00
Andi Kleen
3f056b6664 perf jevents: Make build fail on JSON parse error
Today, when a JSON file fails to parse, the build continues but no
JSON files are built in, which is difficult to debug later.  Make the
build stop on a parse error instead.

v2: Add fixes from Sukadev. Now we handle architectures
    with no JSON events correctly. And fix some stale comments.

Committer note:

Tested by running the cross build container tests, that were all failing
for v1.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20170725001638.19990-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:46:36 -03:00
Jin Yao
a1a8bed32d perf report: Tag branch type/flag on "to" and tag cycles on "from"
Current --branch-history LBR annotation displays confusing data. For
example, each cycles report is duplicated on both "from" and "to"
entries.

For example:

  perf report --branch-history --no-children --stdio

  --2.32%--main div.c:39 (COND_BWD CROSS_2M predicted:49.7% cycles:1)
            main div.c:44 (predicted:49.7% cycles:1)
            main div.c:42 (RET CROSS_2M cycles:2)
            compute_flag div.c:28 (cycles:2)
            compute_flag div.c:27 (RET CROSS_2M cycles:1)
            rand rand.c:28 (cycles:1)
            rand rand.c:28 (RET CROSS_2M cycles:1)
            __random random.c:298 (cycles:1)
            __random random.c:297 (COND_BWD CROSS_2M cycles:1)
            __random random.c:295 (cycles:1)
            __random random.c:295 (COND_BWD CROSS_2M cycles:1)
            __random random.c:295 (cycles:1)
            __random random.c:295 (RET CROSS_2M cycles:9)

The cycles should be tagged only on the "from". It's for the code block
that ends with "from", not for "to".

Another issue is the "predicted:49.7%" is duplicated too (tag on both
"from" and "to").

This patch tags the branch type/flag on "to" and tags the cycles on
"from".

For example:

  --2.32%--main div.c:39 (COND_BWD CROSS_2M predicted:49.7%)
            main div.c:44 (cycles:1)
            main div.c:42 (RET CROSS_2M)
            compute_flag div.c:28 (cycles:2)
            compute_flag div.c:27 (RET CROSS_2M)
            rand rand.c:28 (cycles:1)
            rand rand.c:28 (RET CROSS_2M)
            __random random.c:298 (cycles:1)
            __random random.c:297 (COND_BWD CROSS_2M)
            __random random.c:295 (cycles:1)
            __random random.c:295 (COND_BWD CROSS_2M)
            __random random.c:295 (cycles:1)
            __random random.c:295 (RET CROSS_2M)
            |
             --2.23%--__random_r random_r.c:392 (cycles:9)

In this example, "main div.c:39 (COND_BWD CROSS_2M predicted:49.7%)"
is the "to" of the branch and "main div.c:44 (cycles:1)" is the "from"
of the branch. This should be easier to understand than before.

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1500894547-18411-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:46:35 -03:00
Jin Yao
b49a821ed9 perf report: Make --branch-history work without callgraphs(-g) option in perf record
  perf record -b -g <command>
  perf report --branch-history

This merges the LBRs with the callgraphs.

However, it would be nice if it also worked without callgraphs (-g) set
in perf record, so that only the LBRs are displayed. Currently perf
report errors out in this case. For example,

  perf record -b <command>
  perf report --branch-history

  Error:
  Selected -g or --branch-history but no callchain data. Did
  you call 'perf record' without -g?

With this patch, perf report displays only the LBRs when callgraphs (-g)
are not enabled in perf record.

Change log:

v2: According to Milian Wolff's comment, change the obsolete error
message. Now the error message is:

                 ┌─Error:─────────────────────────────────────┐
                 │Selected -g or --branch-history.            │
                 │But no callchain or branch data.            │
                 │Did you call 'perf record' without -g or -b?│
                 │                                            │
                 │                                            │
                 │Press any key...                            │
                 └────────────────────────────────────────────┘

When passing the last parameter to hists__fprintf,
change "|" to "||":

  hists__fprintf(hists, !quiet, 0, 0, rep->min_percent, stdout,
                 symbol_conf.use_callchain || symbol_conf.show_branchflag_count);
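
As a rough illustration of the relaxed check behind the new error
message, the sketch below accepts a recording when its sample type
carries callchain data, branch data, or both. The function name and the
SKETCH_* macros are hypothetical stand-ins made up for this example and
are not perf's actual code; real code would use PERF_SAMPLE_CALLCHAIN
and PERF_SAMPLE_BRANCH_STACK from <linux/perf_event.h>.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the PERF_SAMPLE_* bits (assumption). */
#define SKETCH_SAMPLE_CALLCHAIN    (1U << 5)
#define SKETCH_SAMPLE_BRANCH_STACK (1U << 11)

/*
 * Return true when the recorded sample type carries callchain data,
 * branch data, or both. Only when neither bit is set would the
 * "no callchain or branch data" error be appropriate.
 */
bool has_callchain_or_branch_data(uint64_t sample_type)
{
	return (sample_type &
		(SKETCH_SAMPLE_CALLCHAIN | SKETCH_SAMPLE_BRANCH_STACK)) != 0;
}
```

Under these assumptions, a recording made with only -b passes the check,
while one made with neither -g nor -b does not.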

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1494240182-28899-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:46:03 -03:00
Arun Kalyanasundaram
a641860550 perf script python: Generate hooks with additional argument
Modify the signature of the tracepoint-specific and trace_unhandled
hooks to add the perf_sample dict as a new argument.
Create a Python helper function to print a dictionary.

Signed-off-by: Arun Kalyanasundaram <arunkaly@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Seongjae Park <sj38.park@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20170721220422.63962-6-arunkaly@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:43:21 -03:00
Arun Kalyanasundaram
f38d281663 perf script python: Add perf_sample dict to tracepoint handlers
The process_event python hook receives a dict with all perf_sample
entries, but the tracepoint specific and trace_unhandled hooks predate
the introduction of this dict, and do not receive it.

Add the aforementioned dict as an additional argument to the affected
handlers. To keep backwards compatibility (and avoid unnecessary work),
do not pass the dict if the number of arguments signals that the handler
version predates this change.

Signed-off-by: Arun Kalyanasundaram <arunkaly@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Seongjae Park <sj38.park@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20170721220422.63962-5-arunkaly@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:43:20 -03:00
Arun Kalyanasundaram
74ec14f389 perf script python: Add sample_read to dict
Provide time_enabled, time_running and counter value in the perf_sample
dict.

Signed-off-by: Arun Kalyanasundaram <arunkaly@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Seongjae Park <sj38.park@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20170721220422.63962-4-arunkaly@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:43:19 -03:00
Arun Kalyanasundaram
892e76b2e8 perf script python: Refactor creation of perf sample dict
Move the creation of the dict containing perf_sample entries into a
helper function to enable its reuse in other sample processing routines.

Signed-off-by: Arun Kalyanasundaram <arunkaly@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Seongjae Park <sj38.park@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20170721220422.63962-3-arunkaly@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:43:19 -03:00
Arun Kalyanasundaram
e9f9a9ca85 perf script python: Allocate memory only if handler exists
Avoid allocating memory if the hook handler is not available. This
avoids an unused memory allocation and simplifies the error path.

Let handler in python_process_tracepoint point to either the
tracepoint-specific or the trace_unhandled hook. Use dict to check if
handler points to trace_unhandled.

Remove the exit label in python_process_general_event and return when no
handler is available.
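
The "look up the handler first, allocate afterwards" pattern described
above can be sketched in plain C as follows. This is a minimal
illustration, not the code from the patch: handler_fn, handler_entry,
find_handler, process_event and counting_handler are hypothetical names,
and the 64-byte scratch buffer merely stands in for whatever per-event
state the real code allocates.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef void (*handler_fn)(const char *event, void *scratch);

/* Hypothetical handler table entry: a hook name and its callback. */
struct handler_entry {
	const char *name;
	handler_fn fn;
};

/* Example handler that only counts how often it was called. */
int nr_handler_calls;

void counting_handler(const char *event, void *scratch)
{
	(void)event;
	(void)scratch;
	nr_handler_calls++;
}

/* Return the handler registered under name, or NULL if there is none. */
handler_fn find_handler(const struct handler_entry *table, size_t n,
			const char *name)
{
	for (size_t i = 0; i < n; i++)
		if (strcmp(table[i].name, name) == 0)
			return table[i].fn;
	return NULL;
}

/*
 * Returns 1 if a handler ran, 0 if no handler exists, -1 on allocation
 * failure. *nr_allocs counts successful allocations so the early return
 * can be observed from outside.
 */
int process_event(const struct handler_entry *table, size_t n,
		  const char *event, size_t *nr_allocs)
{
	handler_fn fn = find_handler(table, n, event);
	void *scratch;

	/* Fall back to the catch-all hook, as the text describes. */
	if (!fn)
		fn = find_handler(table, n, "trace_unhandled");
	/* No handler at all: return before allocating anything. */
	if (!fn)
		return 0;

	scratch = malloc(64);
	if (!scratch)
		return -1;
	(*nr_allocs)++;

	fn(event, scratch);
	free(scratch);
	return 1;
}
```

Because the lookup happens before the allocation, the no-handler path
needs no cleanup label and can simply return.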

Signed-off-by: Arun Kalyanasundaram <arunkaly@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Seongjae Park <sj38.park@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20170721220422.63962-2-arunkaly@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:43:18 -03:00
Dan Carpenter
2ec5cab604 perf script: Remove some bogus error handling
If script_desc__new() fails then the current code has a NULL
dereference.  We don't actually need to do any cleanup, we can just
return NULL.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20170722073610.nnsyiwdcfl6bhn4t@mwanda
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:43:17 -03:00
Krister Johansen
868a832918 perf top: Support lookup of symbols in other mount namespaces.
The perf top command needs to unshare its fs from the helper threads in
order to successfully setns(2) during its symbol lookup.  It also needs
to implement a force flag to ignore ownership of perf-<pid>.map files.

Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1499305693-1599-6-git-send-email-kjlx@templeofstupid.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:43:16 -03:00
Jiri Olsa
2b04e0f882 perf evsel: Add verbose output for sys_perf_event_open fallback
Add info about what is being switched off in the sys_perf_event_open
fallback.

New output (notice the 'switching off' lines):

  $ perf stat -e '{cycles,instructions}' -vvv ls
  Using CPUID GenuineIntel-6-3D
  intel_pt default config: tsc
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 3591  cpu -1  group_fd -1  flags 0x8
  sys_perf_event_open failed, error -22
  switching off cloexec flag
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
    disabled                         1
    inherit                          1
    enable_on_exec                   1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 3591  cpu -1  group_fd -1  flags 0
  sys_perf_event_open failed, error -22
  switching off exclude_guest, exclude_host
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
    disabled                         1
    inherit                          1
    enable_on_exec                   1
  ------------------------------------------------------------
  sys_perf_event_open: pid 3591  cpu -1  group_fd -1  flags 0
  sys_perf_event_open failed, error -22
  switching off sample_id_all
  ------------------------------------------------------------
  perf_event_attr:
    size                             112
    sample_type                      IDENTIFIER
    read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING|ID|GROUP
    disabled                         1
    inherit                          1
    enable_on_exec                   1
  ...

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170721121212.21414-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 11:23:53 -03:00
Sudeep Holla
5d90faf454 perf jvmti: Fix linker error when libelf config is disabled
When libelf is disabled in the configuration, we get the following
linker error:

  LINK     libperf-jvmti.so
  ld: cannot find -lelf
  Makefile.perf:515: recipe for target 'libperf-jvmti.so' failed

Jiri pointed out that both librt and libelf are not really required. So
this patch fixes the linker error by getting rid of unwanted libraries
in the linker stage.

Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Acked-by: David Carrillo-Cisneros <davidcc@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 209045adc2 ("perf tools: add JVMTI agent library")
Link: http://lkml.kernel.org/r/20170719011839.99399-5-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 11:23:53 -03:00
David Carrillo-Cisneros
f484959908 perf annotate: Process tracing data in pipe mode
'perf annotate' was missing the handler for tracing data records.

Prior to this patch we obtained "unhandled" records when piping trace
events to perf annotate (using -D option to show the dump_printf
messages in process_event_synth_tracing_data_stub):

  $ perf record -o - -e block:bio_free sleep 2 | perf annotate -D --stdio
  ...
  0x78 [0xc]: PERF_RECORD_TRACING_DATA: unhandled!
  ...

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Paul Turner <pjt@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170719011839.99399-4-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 11:23:52 -03:00
David Carrillo-Cisneros
cb281fea4b perf tools: Add EXCLUDE_EXTLIBS and EXTRA_PERFLIBS to makefile
The goal is to allow users to override linking of libraries that
were automatically added to PERFLIBS.

EXCLUDE_EXTLIBS contains linker flags to be removed from LIBS
while EXTRA_PERFLIBS contains linker flags to be added.

My use case is to force a certain library to be built statically,
e.g. for libelf:

  EXCLUDE_EXTLIBS=-lelf EXTRA_PERFLIBS=path/libelf.a

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Paul Turner <pjt@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170719011839.99399-3-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 11:23:51 -03:00
Arnaldo Carvalho de Melo
cd8dd032f6 perf cgroup: Fix refcount usage
When converting from atomic_t to refcount_t we didn't follow the usual
step of initializing it to one before taking any new reference, which
trips the check against taking a reference on a freed (zero)
refcount_t. Fix it.
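
The rule described above can be illustrated with a deliberately
simplified, non-atomic counter. This is only a sketch: the *_sketch
names are hypothetical, and the real refcount_t in
tools/include/linux/refcount.h is atomic and more involved.

```c
#include <assert.h>

/* Simplified, non-atomic stand-in for refcount_t (assumption). */
struct refcount_sketch {
	unsigned int refs;
};

void refcount_set_sketch(struct refcount_sketch *r, unsigned int n)
{
	r->refs = n;
}

/* Take a reference only if the object is still alive (count != 0). */
int refcount_inc_not_zero_sketch(struct refcount_sketch *r)
{
	if (r->refs == 0)
		return 0;
	r->refs++;
	return 1;
}

/* Taking a reference on a zero count is treated as a bug. */
void refcount_inc_sketch(struct refcount_sketch *r)
{
	int ok = refcount_inc_not_zero_sketch(r);

	assert(ok);
	(void)ok;
}
```

With this model, a freshly zeroed object must first be set to one with
refcount_set_sketch(); calling refcount_inc_sketch() on it directly
would hit the assertion, which is the failure mode reported below.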

Brendan's report:

 ---
It's 4.12-rc7, with node v4.4.1. I'm building 4.13-rc1 now, as I hit
what I think is another unrelated perf bug and I'm starting to wonder
what else is broken on that version:

(root) /mnt/src/linux-4.12-rc7/tools/perf # ./perf record -F 99 -a -e
cpu-clock --cgroup=docker/f9e9d5df065b14646e8a11edc837a13877fd90c171137b2ba3feb67a0201cb65
-g
perf: /mnt/src/linux-4.12-rc7/tools/include/linux/refcount.h:108:
refcount_inc: Assertion `!(!refcount_inc_not_zero(r))' failed.
Aborted

that used to work...
 ---

Testing it:

Before:

  # perf stat -e cycles -C 0 --cgroup /
  perf: /home/acme/git/linux/tools/include/linux/refcount.h:108: refcount_inc: Assertion `!(!refcount_inc_not_zero(r))' failed.
  Aborted (core dumped)
  #

After:

  # perf stat -e cycles -C 0 --cgroup /
^C
  Performance counter stats for 'CPU(s) 0':

       132,081,393      cycles                    /

       2.492942763 seconds time elapsed

  #

Reported-by: Brendan Gregg <brendan.d.gregg@gmail.com>
Acked-by: Elena Reshetova <elena.reshetova@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: Kees Kook <keescook@chromium.org>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Cc: Paul Turner <pjt@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sudeep Holla <Sudeep.Holla@arm.com>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 79c5fe6db8 ("perf cgroup: Convert cgroup_sel.refcnt from atomic_t to refcount_t")
Link: http://lkml.kernel.org/n/tip-l7ovfblq14ip2i08m1g0fkhv@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 11:23:50 -03:00
Thomas Richter
cf6383f73c perf report: Fix kernel symbol adjustment for s390x
On s390x the kernel text segment starts at address 0x0.  When perf
report reads kernel symbols from vmlinux file it adds an offset of
0x1000.

For example see symbol set_reset_devices:

  [root@s8360047 linux-devel]# nm -A vmlinux| fgrep set_reset_devices
  vmlinux:0000000001379000 t set_reset_devices
  [root@s8360047 linux-devel]#

  [root@s8360047 linux-devel]# fgrep set_reset_devices /proc/kallsyms
  0000000001379000 t set_reset_devices
  [root@s8360047 linux-devel]#

The kernel symbol table and the vmlinux file have the same address for
symbol set_reset_devices namely 1379000.

When perf report reads this symbol it displays it with the address:

  symbol__new: set_reset_devices 0x137a000-0x137a018

There is a difference between perf report and vmlinux of 0x1000.

The reason for the difference is at kernel symbol load time in function
dso__load_sym(). The vmlinux file is investigated with its ELF header.
Command readelf shows this:

  Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .text             PROGBITS         0000000000000000  00001000
       0000000000b0e0c2  0000000000000000  AX       0     0     128

This leads to an invalid calculation of the symbol start address, see
file util/symbol-elf.c line 974:

        /* Adjust symbol to map to file offset */
        if (adjust_kernel_syms)
                sym.st_value -= shdr.sh_addr - shdr.sh_offset;

With shdr.sh_addr set to 0x0 and shdr.sh_offset set to 0x1000, as read
from the ELF .text section, 0x1000 is added to the symbol address.
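
The arithmetic can be checked with a small standalone sketch.
adjust_symbol() is a hypothetical helper written for this example; it
only mirrors the quoted adjustment and uses the values shown above for
set_reset_devices and the .text section.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring the adjustment quoted above:
 * when adjust_kernel_syms is set, subtract (sh_addr - sh_offset)
 * from the symbol value. The subtraction is done in unsigned 64-bit
 * arithmetic, so sh_addr < sh_offset effectively adds the difference.
 */
uint64_t adjust_symbol(uint64_t st_value, uint64_t sh_addr,
		       uint64_t sh_offset, int adjust_kernel_syms)
{
	if (adjust_kernel_syms)
		st_value -= sh_addr - sh_offset;
	return st_value;
}
```

Calling adjust_symbol(0x1379000, 0x0, 0x1000, 1) yields 0x137a000, the
shifted address seen in the symbol__new output, while passing 0 for
adjust_kernel_syms leaves the value at 0x1379000.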

I would like to fix this by introducing an architecture-specific
function named elf__needs_adjust_symbols(). This is the same approach
as done by PowerPC.  The function currently does not exist for s390x and
the default weak one is used.  The s390x-specific one returns false when
symsrc_init() is invoked for kernel symbols and results in variable
adjust_kernel_syms being false.  This omits the adjustment and the
correct address is displayed (when symbol resolution does not work).

The s390x specific function returns false for kernel symbol adjustment
and returns true for kernel modules, processes and shared libraries.

Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
LPU-Reference: 20170713130252.6167-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 11:23:50 -03:00
Taeung Song
585d93c5ff perf annotate stdio: Fix --show-total-period
We were showing the total number of samples, not the total period as
asked by the user, fix it.

Reported-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Milian Wolff <milian.wolff@kdab.com>
Link: http://lkml.kernel.org/n/tip-lh2nh89rtqn5x5vbfthw6qml@git.kernel.org
Fixes: 0c4a5bcea4 ("perf annotate: Display total number of samples with --show-total-period")
[ split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 11:23:36 -03:00
Edward Cree
545722cb0f selftests/bpf: subtraction bounds test
There is a bug in the verifier's handling of BPF_SUB: [a,b] - [c,d]
yielded [a-c, b-d] rather than the correct [a-d, b-c].  So here is a
test which, with the bogus handling, will produce ranges of [0,0] and
thus allowed accesses; whereas the correct handling will give a range of
[-255, 255] (and hence the right-shift will give a range of [0, 255])
and the accesses will be rejected.
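
The two subtraction rules can be compared with a short sketch. The
struct and function names are hypothetical, and the operand range
[0, 255] for both sides is an assumption chosen to reproduce the ranges
quoted above; this is not the verifier's code.

```c
#include <stdint.h>

/* Closed interval [min, max] of possible values. */
struct range {
	int64_t min;
	int64_t max;
};

/* Bogus rule: [a,b] - [c,d] = [a-c, b-d]. */
struct range range_sub_bogus(struct range x, struct range y)
{
	struct range r = { x.min - y.min, x.max - y.max };

	return r;
}

/* Correct rule: [a,b] - [c,d] = [a-d, b-c]. */
struct range range_sub(struct range x, struct range y)
{
	struct range r = { x.min - y.max, x.max - y.min };

	return r;
}
```

For x = y = [0, 255], the bogus rule gives [0, 0] while the correct rule
gives [-255, 255], matching the two outcomes the test distinguishes.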

Signed-off-by: Edward Cree <ecree@solarflare.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-24 14:02:55 -07:00
Linus Torvalds
bbcdea658f Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
 "Two hw-enablement patches, two race fixes, three fixes for regressions
  of semantics, plus a number of tooling fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Add proper condition to run sched_task callbacks
  perf/core: Fix locking for children siblings group read
  perf/core: Fix scheduling regression of pinned groups
  perf/x86/intel: Fix debug_store reset field for freq events
  perf/x86/intel: Add Goldmont Plus CPU PMU support
  perf/x86/intel: Enable C-state residency events for Apollo Lake
  perf symbols: Accept zero as the kernel base address
  Revert "perf/core: Drop kernel samples even though :u is specified"
  perf annotate: Fix broken arrow at row 0 connecting jmp instruction to its target
  perf evsel: State in the default event name if attr.exclude_kernel is set
  perf evsel: Fix attr.exclude_kernel setting for default cycles:p
2017-07-21 11:12:48 -07:00
Taeung Song
ecd5f9959d perf annotate: Do not overwrite sample->period
In fixing the --show-total-period option it was noticed that the value
of sample->period was being overwritten, fix it.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: fd36f3dd79 ("perf hist: Pass struct sample to __hists__add_entry()")
Link: http://lkml.kernel.org/r/1500500215-16646-1-git-send-email-treeze.taeung@gmail.com
[ split from a larger patch, added the Fixes tag ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-21 12:02:52 -03:00
Taeung Song
461c17f00f perf annotate: Store the sample period in each histogram bucket
We'll use it soon, when fixing --show-total-period.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1500500215-16646-1-git-send-email-treeze.taeung@gmail.com
[ split from a larger patch, do the math in __symbol__inc_addr_samples() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-21 12:02:38 -03:00
Taeung Song
bab89f6aed perf hists: Pass perf_sample to __symbol__inc_addr_samples()
This paves the way to use perf_sample fields in the annotate code,
storing sample->period in sym_hist->addr->period and its sum in
sym_hist->period.
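
A minimal sketch of that bookkeeping, assuming a fixed-size addr[] array
and hypothetical *_sketch names (the real struct sym_hist sizes addr[]
from the symbol and is not reproduced here):

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NR_ADDRS 8 /* assumption: tiny fixed-size symbol */

/* Per-address bucket: how many samples hit it and their summed period. */
struct addr_hist_sketch {
	uint64_t nr_samples;
	uint64_t period;
};

/* Per-symbol histogram with running totals over all buckets. */
struct sym_hist_sketch {
	uint64_t nr_samples;
	uint64_t period;
	struct addr_hist_sketch addr[SKETCH_NR_ADDRS];
};

/*
 * Account one sample at the given offset: bump the bucket and the
 * per-symbol totals together so they always stay consistent.
 * Returns 0 on success, -1 if the offset is out of range.
 */
int inc_addr_samples_sketch(struct sym_hist_sketch *h, size_t offset,
			    uint64_t sample_period)
{
	if (offset >= SKETCH_NR_ADDRS)
		return -1;

	h->addr[offset].nr_samples++;
	h->addr[offset].period += sample_period;
	h->nr_samples++;
	h->period += sample_period;
	return 0;
}
```

Keeping both counters side by side is what lets a report choose between
showing the number of samples and showing the total period.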

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1500500215-16646-1-git-send-email-treeze.taeung@gmail.com
[ split and adjusted from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-21 08:23:50 -03:00
Taeung Song
8158683da3 perf annotate: Rename 'sum' to 'nr_samples' in struct sym_hist
To make it more clear that it is the sum of all the nr_samples fields in the
addr[] entries, i.e.:

  sym_hist->nr_samples = sum(sym_hist->addr[0 ..  symbol__size(sym)]->nr_samples)

Committer notes:

Taeung had renamed it to total_samples, but using nr_samples, as in the
added explanation above, looks clearer and establishes the direct
connection, making clear it is about the _number_ of samples.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1500500211-16599-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-21 08:23:49 -03:00