Commit Graph

1324139 Commits

Author SHA1 Message Date
Leo Yan
9a7b618ef6 perf test record+probe_libc_inet_pton: Make test resilient
The test failed back and forth due to the call chain being heavily
impacted by the libc, which varies across different architectures and
distros.

The libc contains the symbols for "gaih_inet" and "getaddrinfo" in some
cases, but not always.  Moreover, these symbols can be either normal
symbols or dynamic symbols, making it difficult to decide the call chain
entries due to the symbols are inconsistent.

To fix the issue, this commit identifies three call chain entries are
always present.  These entries are matched by iterating through the
lines in the "perf script" result.  The recording attribute max-stack is
set to 4 for the possible maximum call chain depth.

After:

  # perf test -vF pton
  --- start ---
  Pattern: ping[][0-9 \.:]+probe_libc:inet_pton: \([[:xdigit:]]+\)
    Matching: ping  285058 [025] 1219802.466939: probe_libc:inet_pton: (ffffa14b7cf0)
  Pattern: .*inet_pton\+0x[[:xdigit:]]+[[:space:]]\(/usr/lib/aarch64-linux-gnu/libc-2.31.so|inlined\)$
    Matching: ping  285058 [025] 1219802.466939: probe_libc:inet_pton: (ffffa14b7cf0)
    Matching: ffffa14b7cf0 __GI___inet_pton+0x0 (/usr/lib/aarch64-linux-gnu/libc-2.31.so)
  Pattern: .*(\+0x[[:xdigit:]]+|\[unknown\])[[:space:]]\(.*/bin/ping.*\)$
    Matching: ping  285058 [025] 1219802.466939: probe_libc:inet_pton: (ffffa14b7cf0)
    Matching: ffffa14b7cf0 __GI___inet_pton+0x0 (/usr/lib/aarch64-linux-gnu/libc-2.31.so)
    Matching: ffffa1488040 getaddrinfo+0xe8 (/usr/lib/aarch64-linux-gnu/libc-2.31.so)
    Matching: aaaab8672da4 [unknown] (/usr/bin/ping)
  ---- end ----
   82: probe libc's inet_pton & backtrace it with ping                 : Ok

Closes: https://lore.kernel.org/linux-perf-users/1728978807-81116-1-git-send-email-renyu.zj@linux.alibaba.com/
Closes: https://lore.kernel.org/linux-perf-users/Z0X3AYUWkAgfPpWj@x1/T/#m57327e135b156047e37d214a0d453af6ae1e02be
Reported-by: Guilherme Amadio <amadio@gentoo.org>
Reported-by: Jing Zhang <renyu.zj@linux.alibaba.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241202111958.553403-1-leo.yan@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14 14:57:20 -03:00
Ian Rogers
8e246a1b2a perf inject: Fix use without initialization of local variables
Local variables were missing initialization and command line
processing didn't provide default values.

Fixes: 64eed019f3 ("perf inject: Lazy build-id mmap2 event insertion")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241211060831.806539-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14 14:57:19 -03:00
James Clark
6804a7192a perf probe: Rename err label
Rename err to out to avoid confusion because buf is still supposed to be
freed in non error cases.

Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241211085525.519458-3-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14 14:57:19 -03:00
Ian Rogers
f9c506fb69 perf test stat: Avoid hybrid assumption when virtualized
The cycles event will fallback to task-clock in the hybrid test when
running virtualized. Change the test to not fail for this.

Fixes: 65d1182191 ("perf test: Add a test for default perf stat command")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241212173354.9860-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14 14:57:19 -03:00
Athira Rajeev
2adbf5349a perf record: Fix segfault with --off-cpu when debuginfo is not enabled
When kernel is built without debuginfo, running 'perf record' with
--off-cpu results in segfault as below:

   ./perf record --off-cpu -e dummy sleep 1
   libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
   libbpf: failed to find '.BTF' ELF section in /lib/modules/6.13.0-rc3+/build/vmlinux
   libbpf: failed to find valid kernel BTF
   Segmentation fault (core dumped)

The backtrace pointed to:

   #0  0x00000000100fb17c in btf.type_cnt ()
   #1  0x00000000100fc1a8 in btf_find_by_name_kind ()
   #2  0x00000000100fc38c in btf.find_by_name_kind ()
   #3  0x00000000102ee3ac in off_cpu_prepare ()
   #4  0x000000001002f78c in cmd_record ()
   #5  0x00000000100aee78 in run_builtin ()
   #6  0x00000000100af3e4 in handle_internal_command ()
   #7  0x000000001001004c in main ()

Code sequence is:

   static void check_sched_switch_args(void)
   {
        struct btf *btf = btf__load_vmlinux_btf();
        const struct btf_type *t1, *t2, *t3;
        u32 type_id;

        type_id = btf__find_by_name_kind(btf, "btf_trace_sched_switch",
                                         BTF_KIND_TYPEDEF);

btf__load_vmlinux_btf() fails when CONFIG_DEBUG_INFO_BTF is not enabled.

Here bpf__find_by_name_kind() calls btf__type_cnt() with NULL btf value
and results in segfault.

To fix this, add a check to see if btf is not NULL before invoking
bpf__find_by_name_kind().

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://lore.kernel.org/r/20241223135813.8175-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14 14:57:19 -03:00
Athira Rajeev
8c1a106635 perf tests base_probe: Fix check for the count of existing probes in test_adding_kernel
perftool-testsuite_probe fails in test_adding_kernel as below:

	Regexp not found: "probe:inode_permission_11"
	-- [ FAIL ] -- perf_probe :: test_adding_kernel :: force-adding probes ::
	second probe adding (with force) (output regexp parsing)
	event syntax error: 'probe:inode_permission_11'
	  \___ unknown tracepoint

	Error:  File /sys/kernel/tracing//events/probe/inode_permission_11
	not found.
	Hint:   Perhaps this kernel misses some CONFIG_ setting to
	enable this feature?.

The test does the following:

1) Adds a probe point first using:

    $CMD_PERF probe --add $TEST_PROBE

2) Then tries to add same probe again without —force and expects it to
   fail. Next tries to add same probe again with —force. In this case,
   perf probe succeeds and adds the probe with a suffix number. Example:

  ./perf probe --add inode_permission
  Added new event:
   probe:inode_permission (on inode_permission)

  ./perf probe --add inode_permission --force
  Added new event:
   probe:inode_permission_1 (on inode_permission)

   ./perf probe --add inode_permission --force
  Added new event:
   probe:inode_permission_2 (on inode_permission)

Each time, suffix is added to existing probe name.

To get the suffix number, test cases uses:

  NO_OF_PROBES=`$CMD_PERF probe -l | wc -l`

This will work if there is no other probe existing in the system. If
there are any other probes other than kernel probes or inode_permission,
( example: any probe), "perf probe -l" will include count for other
probes too.

Example, in the system where this failed, already some probes were
default added. So count became 10

  ./perf probe -l | wc -l
  10

So to be specific for "inode_permission", restrict the probe count check
to that probe point alone using:

  NO_OF_PROBES=`$CMD_PERF probe -l $TEST_PROBE| wc -l`

Similarly while removing the probe using "probe --del *", (removing all
probes), check uses:

  ../common/check_all_lines_matched.pl "Removed event: probe:$TEST_PROBE"

But if there are other probes in the system, the log will contain
reference to other existing probe too. Hence change usage of
check_all_lines_matched.pl to check_all_patterns_found.pl This will make
sure expecting string comes in the result

Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250110094324.94604-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14 14:57:19 -03:00
Michel Lind
8bf18c5cef perf MANIFEST: Add license files
The standalone tarballs should include the license files - both the
COPYING declaration as well as the text of GPLv2.

Signed-off-by: Michel Lind <michel@michel-slm.name>
Link: https://lore.kernel.org/r/Z0Zcx0WRqtlUYpgw@hyperscale.parallels
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14 14:57:19 -03:00
James Clark
3178155d29 perf test brstack: Speed up running test by using tr -s instead of xargs
The brstack test runs quite slowly in software models. Part of the reason
is "xargs -n1" is quite inefficient in replacing spaces with newlines.

While that's not noticeable on normal machines, it is on software models.

Use "tr -s ' ' '\n'" instead which can do the same transformation, but is
much faster. For comparison on an M1 Macbook Pro:

  $ time seq -s ' ' 10000 | xargs -n1 > /dev/null

  real    0m2.729s
  user    0m2.009s
  sys     0m0.914s
  $ time seq -s ' ' 10000 | tr -s ' ' '\n' | grep '.' > /dev/null

  real    0m0.002s
  user    0m0.001s
  sys     0m0.001s

The "grep '.'" is also needed to remove any remaining blank lines.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241213231312.2640687-2-robh@kernel.org
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
[robh: Drop changing loop iterations on arm64. Squash blank line fix and redo commit msg]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-14 14:57:19 -03:00
Charlie Jenkins
b1bb6fc06b perf tools mips: Fix mips syscall generation
The mips syscall generation was still based on the old method. Delete
the Makefile since it is no longer needed with the new method of
generation.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Fixes: 619ffe6694 ("perf tools mips: Use generic syscall scripts")
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250110-perf_fix_mips-v1-1-4e661c3b710a@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-13 11:46:41 -03:00
James Clark
05cd60e4d0 perf tests arm_spe: Add test for discard mode
Add a test that checks that there were no AUX or AUXTRACE events
recorded when discard mode is used.

Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Graham Woodward <graham.woodward@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108142904.401139-6-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-13 11:45:05 -03:00
James Clark
9c3164ea7e perf tools arm-spe: Don't allocate buffer or tracking event in discard mode
The buffer will never be written to so don't bother allocating it. The
tracking event is also not required.

Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Graham Woodward <graham.woodward@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108142904.401139-5-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-13 11:45:03 -03:00
James Clark
23a65c5e8b perf tools arm-spe: Pull out functions for aux buffer and tracking setup
These won't be used in the next commit in discard mode, so put them in
their own functions. No functional changes intended.

Reviewed-by: Yeoreum Yun <yeoreum.yun@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Graham Woodward <graham.woodward@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108142904.401139-4-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-13 11:43:15 -03:00
Jiachen Zhang
ac0ac75189 perf report: Fix misleading help message about --demangle
The wrong help message may mislead users. This commit fixes it.

Fixes: 328ccdace8 ("perf report: Add --no-demangle option")
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Jiachen Zhang <me@jcix.top>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250109152220.1869581-1-me@jcix.top
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 14:46:09 -03:00
Namhyung Kim
510f0247cd perf ftrace: Fix display for range of the first bucket
When min_latency is not given, it prints 0 - 0.  It should be 0 - 1.

Before:

  $ sudo ./perf ftrace latency -a -T do_futex sleep 1
  #   DURATION     |      COUNT | GRAPH                                          |
       0 -    0 us |        321 | ###########                                    |
  ...

After:

  $ sudo ./perf ftrace latency -a -T do_futex sleep 1
  #   DURATION     |      COUNT | GRAPH                                          |
       0 -    1 us |        699 | ############                                   |
  ...

Fixes: 08b875b6bf ("perf ftrace latency: Introduce --min-latency to narrow down into a latency range")
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gabriele Monaco <gmonaco@redhat.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250108210015.1188531-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 14:45:43 -03:00
Namhyung Kim
dd01b985c5 perf ftrace: Check min/max latency only with bucket range
It's an optional feature and remains 0 when bucket range is not given.
And it makes the histogram goes to the last entry always because any
latency (num) is greater than or equal to 0.

Before:

  $ sudo ./perf ftrace latency -a -T do_futex sleep 1
  #   DURATION     |      COUNT | GRAPH                                          |
       0 -    0 us |          0 |                                                |
       1 -    2 us |          0 |                                                |
       2 -    4 us |          0 |                                                |
       4 -    8 us |          0 |                                                |
       8 -   16 us |          0 |                                                |
      16 -   32 us |          0 |                                                |
      32 -   64 us |          0 |                                                |
      64 -  128 us |          0 |                                                |
     128 -  256 us |          0 |                                                |
     256 -  512 us |          0 |                                                |
     512 - 1024 us |          0 |                                                |
       1 -    2 ms |          0 |                                                |
       2 -    4 ms |          0 |                                                |
       4 -    8 ms |          0 |                                                |
       8 -   16 ms |          0 |                                                |
      16 -   32 ms |          0 |                                                |
      32 -   64 ms |          0 |                                                |
      64 -  128 ms |          0 |                                                |
     128 -  256 ms |          0 |                                                |
     256 -  512 ms |          0 |                                                |
     512 - 1024 ms |          0 |                                                |
       1 - ...  s  |       1353 | ############################################## |

After:

  $ sudo ./perf ftrace latency -a -T do_futex sleep 1
  #   DURATION     |      COUNT | GRAPH                                          |
       0 -    0 us |        321 | ###########                                    |
       1 -    2 us |        132 | ####                                           |
       2 -    4 us |        202 | #######                                        |
       4 -    8 us |        188 | ######                                         |
       8 -   16 us |         16 |                                                |
      16 -   32 us |         12 |                                                |
      32 -   64 us |         30 | #                                              |
      64 -  128 us |         98 | ###                                            |
     128 -  256 us |         53 | #                                              |
     256 -  512 us |         57 | ##                                             |
     512 - 1024 us |          9 |                                                |
       1 -    2 ms |          9 |                                                |
       2 -    4 ms |          1 |                                                |
       4 -    8 ms |         98 | ###                                            |
       8 -   16 ms |          5 |                                                |
      16 -   32 ms |          7 |                                                |
      32 -   64 ms |         32 | #                                              |
      64 -  128 ms |         10 |                                                |
     128 -  256 ms |         10 |                                                |
     256 -  512 ms |          2 |                                                |
     512 - 1024 ms |          0 |                                                |
       1 - ...  s  |          0 |                                                |

Fixes: 690a052a6d ("perf ftrace latency: Add --max-latency option")
Reviewed-by: Gabriele Monaco <gmonaco@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Gabriele Monaco <gmonaco@redhat.com
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250108210015.1188531-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 14:43:55 -03:00
Arnaldo Carvalho de Melo
74c033b6aa perf MANIFEST: Add arch/*/include/uapi/asm/bpf_perf_event.h to the perf tarball
Needed to build tools/lib/bpf/ on various arches other than x86_64,
notably arm64 when using the perf tarballs generated by:

  $ make help | grep perf-
    perf-tar-src-pkg    - Build the perf source tarball with no compression
    perf-targz-src-pkg  - Build the perf source tarball with gzip compression
    perf-tarbz2-src-pkg - Build the perf source tarball with bz2 compression
    perf-tarxz-src-pkg  - Build the perf source tarball with xz compression
    perf-tarzst-src-pkg - Build the perf source tarball with zst compression
  $

Building with BPF support was opt-in in perf for a long time, and
testing it via the tarball main kernel Makefile targets in an
architecture other than x86_64 was an odd case.

I had noticed this at some point earlier this year while cross building
perf to some arches, including arm64, but it fell thru the cracks, see
the Link tag below.

Fix it now by adding those arch/*/include/uapi/asm/bpf_perf_event.h
files to the MANIFEST file used in building the perf source tarball.

Tested with:

  perfbuilder@number:~$ time dm debian:experimental-x-arm64
     1    21.60 debian:experimental-x-arm64   : Ok   aarch64-linux-gnu-gcc (Debian 14.1.0-5) 14.1.0  flex 2.6.4
  BUILD_TARBALL_HEAD=d31a974f6edc576f84c35be9526fec549a3b3520
  $
  $ git log --oneline -1 d31a974f6edc576f84c35be9526fec549a3b3520
  d31a974f6edc576f (HEAD -> perf-tools-next) perf MANIFEST: Add arch/*/include/uapi/asm/bpf_perf_event.h to the perf tarball
  $

That was previously failing:

  perfbuilder@number:~$ grep debian:experimental-x-arm64 dm.log.old/summary
  19     4.80 debian:experimental-x-arm64   : FAIL gcc version 14.1.0 (Debian 14.1.0-5)
  $
  perfbuilder@number:~$ grep -B6 'Error 1' dm.log.old/debian:experimental-x-arm64
  In file included from /git/perf-6.12.0-rc6/tools/include/uapi/linux/bpf_perf_event.h:11,
                   from libbpf.c:36:
  /git/perf-6.12.0-rc6/tools/include/uapi/asm/bpf_perf_event.h:2:10: fatal error: ../../arch/arm64/include/uapi/asm/bpf_perf_event.h: No such file or directory
      2 | #include "../../arch/arm64/include/uapi/asm/bpf_perf_event.h"
        |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  compilation terminated.
  make[4]: *** [/git/perf-6.12.0-rc6/tools/build/Makefile.build:105: /tmp/build/perf/libbpf/staticobjs/libbpf.o] Error 1
  perfbuilder@number:~$

Closes: https://lore.kernel.org/all/Z0UNRCRYKunbDYxP@hyperscale.parallels
Fixes: 9eea8fafe3 ("libbpf: fix __arg_ctx type enforcement for perf_event programs")
Reported-by: Michel Lind <michel@michel-slm.name>
Tested-by: Michel Lind <michel@michel-slm.name>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: 317c11923cf676437456e44a7f408d4ce589a9c0.camel@michel-slm.name
Link: https://lore.kernel.org/bpf/ZfyEgoG3JFiOs2Fs@x1/
Link: https://lore.kernel.org/r/Z0Yy5u42Q1hWoEzz@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 10:59:42 -03:00
Yoshihiro Furudera
e5e34e9995 perf vendor events arm64: Add FUJITSU-MONAKA PMU event
Add PMU events for FUJITSU-MONAKA.

And, also updated common-and-microarch.json and recommended.json.

FUJITSU-MONAKA Specification URL:

  https://github.com/fujitsu/FUJITSU-MONAKA

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Akio Kakuno <fj3333bs@aa.jp.fujitsu.com>
Signed-off-by: Yoshihiro Furudera <fj5100bi@fujitsu.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Lucas Stach <l.stach@pengutronix.de>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Xu Yang <xu.yang_2@nxp.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20241217065751.1448755-1-fj5100bi@fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 10:59:42 -03:00
Namhyung Kim
876e80cf83 perf tools: Fixup end address of modules
In machine__create_module(), it reads /proc/modules to get a list of
modules in the system.  The file shows the start address (of text) and
the size of the module so it uses the info to reconstruct system memory
maps for symbol resolution.

But module memory consists of multiple segments and they can be
scaterred.  Currently perf tools assume they are contiguous and see some
overlaps.  This can confuse the tool when it finds a map containing a
given address.

As we mostly care about the function symbols in the text segment, it can
fixup the size or end address of modules when there's an overlap.  We
can use maps__fixup_end() which updates the end address using the start
address of the next map.

Ideally it should be able to track other segments (like data/rodata),
but that would require some changes in /proc/modules IMHO.

Reported-by: Blake Jones <blakejones@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Gomez <da.gomez@samsung.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Pavlu <petr.pavlu@suse.com>
Cc: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20241218220453.203069-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 10:59:42 -03:00
Namhyung Kim
8c2eafbbfd perf symbol: Prefer non-label symbols with same address
When there are more than one symbols at the same address, it needs to
choose which one is better.  In choose_best_symbol() it didn't check the
type of symbols.  It's possible to have labels in other symbols and in
that case, it would be better to pick the actual symbol over the labels.
To minimize the possible impact on other symbols, I only check NOTYPE
symbols specifically.

  $ readelf -sW vmlinux | grep -e __do_softirq -e __softirqentry_text_start
  105089: ffffffff82000000   814 FUNC    GLOBAL DEFAULT    1 __do_softirq
  111954: ffffffff82000000     0 NOTYPE  GLOBAL DEFAULT    1 __softirqentry_text_start

The commit 77b004f4c5 tried to do the same by not giving the size
to the label symbols but it seems there's some label-only symbols in asm
code.  Let's restore the original code and choose the right symbol using
type of the symbols.

Fixes: 77b004f4c5 ("perf symbol: Do not fixup end address of labels")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: http://lore.kernel.org/lkml/Z3b-DqBMnNb4ucEm@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 10:59:42 -03:00
Ian Rogers
368781025a perf symbol-elf: Avoid a weak cxx_demangle_sym function
cxx_demangle_sym is weak in case demangle-cxx.c replaces the
definition in symbol-elf.c. When demangle-cxx.c is built
HAVE_CXA_DEMANGLE_SUPPORT is defined, as such the define can be used
to avoid a weak symbol.

As weak symbols are outside of the C standard their use can lead to
strange behaviors, in particular with LTO, as well as causing issues to
be hidden at link time.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20241119031754.1021858-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 10:59:42 -03:00
Namhyung Kim
4f90ed0ae3 perf trace: Fix unaligned access for augmented args
Some version of compilers reported unaligned accesses in perf trace when
undefined-behavior sanitizer is on.  I found that it uses raw data in
the sample directly and assuming it's properly aligned.

Unlike other sample fields, the raw data is not 8-byte aligned because
there's a size field (u32) before the actual data.  So I added a static
buffer in syscall__augmented_args() and return it instead.  This is not
ideal but should work well as perf trace is single-threaded.

A better approach would be aligning the raw data by adding a 4-byte data
before the augmented args but I'm afraid it'd break the backward
compatibility.

Committer testing:

To build with the undefined behaviour sanitizer:

 $ make CC=clang EXTRA_CFLAGS=-fsanitize=undefined -C tools/perf

Checking if the resulting binary is instrumented:

  root@number:~# nm ~/bin/perf | grep ubsan | wc -l
  113
  root@number:~# nm ~/bin/perf | grep ubsan | tail -5
  000000000043d5b0 t _ZN7__ubsanL19UBsanOnDeadlySignalEiPvS0_
  000000000043ce50 T _ZNK7__ubsan5Value12getSIntValueEv
  000000000043cf40 T _ZNK7__ubsan5Value12getUIntValueEv
  000000000043d140 T _ZNK7__ubsan5Value13getFloatValueEv
  000000000043cfd0 T _ZNK7__ubsan5Value19getPositiveIntValueEv
  root@number:~#

Now running something that will access timespec, as reported in the
Closes URL:

  root@number:~# perf trace --max-events=1 -e *nano* sleep 1.1
  trace/beauty/timespec.c:10:64: runtime error: member access within misaligned address 0x7fc583cfb2a4 for type 'struct augmented_arg', which requires 8 byte alignment
  0x7fc583cfb2a4: note: pointer points here
    99 99 11 00 10 00 00 00  00 00 00 00 01 00 00 00  00 00 00 00 01 e1 f5 05  00 00 00 00 00 00 00 00
                ^
  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior trace/beauty/timespec.c:10:64
  <SNIP>

As Namhyung said we need to make the raw_data to be 64-bit aligned,
probably we need to add a PERF_SAMPLE_ALIGNED_RAW with a 64-bit raw_size
instead of the current u32 done at kernel/events/core.c,
perf_output_sample(), that perf_output_put(handle, raw->size) where
raw->size is an u32 and then the raw_data is always 64-bit unaligned...

After the patch:

  root@number:~# perf trace -e *nano* sleep 1.1
       0.000 (1100.064 ms): sleep/1984224 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 100000001 }, rmtp: 0x7fff5b3fe970) = 0
  root@number:~#

Closes: https://lore.kernel.org/r/Z2STgyD1p456Qqhg@google.com
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250102201248.790841-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 10:59:42 -03:00
James Clark
0ba2022410 perf test: Mark remaining probe tests as exclusive
Probes are global and other probe tests are already exclusive. These
two tests can throw warnings when run at the same time so mark them as
exclusive too:

  $ perf test -vvv 81 79

  79: perftool-testsuite_probe:
  --- start ---
  test child forked, pid 46419
  ../common/init.sh: line 137: /sys/kernel/debug/tracing/uprobe_events: Device or resource busy

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Link: https://lore.kernel.org/r/20250107165933.292225-1-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 10:59:42 -03:00
Charlie Jenkins
3cc550f5bb perf tools: Remove dependency on libaudit
All architectures now support HAVE_SYSCALL_TABLE_SUPPORT, so the flag is
no longer needed. With the removal of the flag, the related
GENERIC_SYSCALL_TABLE can also be removed.

libaudit was only used as a fallback for when HAVE_SYSCALL_TABLE_SUPPORT
was not defined, so libaudit is also no longer needed for any
architecture.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-16-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 10:59:42 -03:00
Charlie Jenkins
00d1bfae1b perf tools s390: Use generic syscall table scripts
Use the generic scripts to generate headers from the syscall table
instead of the custom ones for s390.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-15-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 10:59:42 -03:00
Charlie Jenkins
4c02c7e0a2 perf tools powerpc: Use generic syscall table scripts
Use the generic scripts to generate headers from the syscall table
instead of the custom ones for powerpc.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-14-7543b5293098@rivosinc.com
Link: https://lore.kernel.org/lkml/20250110100505.78d81450@canb.auug.org.au
[ Stephen Rothwell noticed on linux-next that the powerpc build for perf was broken and ...]
Link: https://lore.kernel.org/lkml/20250109-perf_powerpc_spu-v1-1-c097fc43737e@rivosinc.com
[ ... Charlie fixed it up and asked for it to be squashed to avoid breaking bisection. ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-10 10:57:01 -03:00
Charlie Jenkins
619ffe6694 perf tools mips: Use generic syscall scripts
Use the generic scripts to generate headers from the syscall table for
mips.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-13-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09 12:56:20 -03:00
Charlie Jenkins
fa70857a27 perf tools loongarch: Use syscall table
loongarch uses a syscall table, use that in perf instead of using unistd.h.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-12-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09 12:56:00 -03:00
Charlie Jenkins
cb8197db8c perf tools arm64: Use syscall table
arm64 uses a syscall table, use that in perf instead of using unistd.h.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-11-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09 12:55:36 -03:00
Charlie Jenkins
02f2d58f23 perf tools parisc: Support syscall header
parisc uses a syscall table, use that in perf instead of requiring
libaudit.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-10-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09 12:55:13 -03:00
Charlie Jenkins
bb4f842891 perf tools alpha: Support syscall header
alpha uses a syscall table, use that in perf instead of requiring
libaudit.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-9-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09 12:54:54 -03:00
Charlie Jenkins
a874d1f6f1 perf tools x86: Use generic syscall scripts
Use the generic scripts to generate headers from the syscall table for
both 32- and 64-bit x86.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-8-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09 12:53:49 -03:00
Charlie Jenkins
24f122dc09 perf tools xtensa: Support syscall header
xtensa uses a syscall table, use that in perf instead of requiring
libaudit.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-7-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09 12:53:28 -03:00
Charlie Jenkins
1f44829e5e perf tools sparc: Support syscall headers
sparc uses a syscall table, use that in perf instead of requiring
libaudit.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-6-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09 12:52:51 -03:00
Charlie Jenkins
430a6dfe41 perf tools sh: Support syscall headers
sh uses a syscall table, use that in perf instead of requiring libaudit.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-5-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09 12:52:28 -03:00
Charlie Jenkins
9605665a64 perf tools arm: Support syscall headers
arm uses a syscall table, use that in perf instead of requiring
libaudit.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-4-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09 12:51:30 -03:00
Charlie Jenkins
c68825eed9 perf tools csky: Support generic syscall headers
csky uses the generic syscall table, use that in perf instead of
requiring libaudit.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Acked-by: Guo Ren <guoren@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-3-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09 12:51:18 -03:00
Charlie Jenkins
26db672256 perf tools arc: Support generic syscall headers
Arc uses the generic syscall table, use that in perf instead of
requiring libaudit.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-2-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09 12:50:56 -03:00
Charlie Jenkins
4a73aff8c5 perf tools: Create generic syscall table support
Currently each architecture in perf independently generates syscall
headers.

Adapt the work that has gone into unifying syscall header
implementations in the kernel to work with perf tools.

Introduce this framework with riscv at first. riscv previously relied on
libaudit, but with this change, perf tools for riscv no longer needs
this external dependency.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mickaël Salaün <mic@digikod.net>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250108-perf_syscalltbl-v6-1-7543b5293098@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-09 12:49:49 -03:00
Ian Rogers
6bfb4c571b perf test cpumap: Avoid use-after-free following merge
Previously cpu maps in the test weren't modified by calls to the cpu map
API, however, perf_cpu_map__merge was modified so the left hand argument
was updated.

In the test this meant the maps copy of the "two" map was put/deleted in
the merge meaning when accessed via maps, the pointer was stale and to
the put/deleted memory.

To fix this add an extra layer of indirection to the maps array, so the
updated value of two is accessed.

Fixes: a9d2217556 ("libperf cpumap: Refactor perf_cpu_map__merge()")
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250108051511.1720369-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-08 17:40:05 -03:00
Dmitry Vyukov
9c64c7c658 perf llvm-add2line: Remove unused symbol_conf.h include
Remove unused symbol_conf.h include.

First, it's just unused. Second, it's problematic since this is a C++
file, and most perf headers don't compile as C++. So if any other
includes are added to symbol_conf.h, it may break the build.

Signed-off-by: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250108070248.237943-1-dvyukov@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-08 17:38:32 -03:00
James Clark
58f4f294b3 perf test trace_btf_general: Fix shellcheck warning
Shellcheck versions < v0.7.2 can't follow this path so add the helper to
fix the following warning:

  tests/shell/trace_btf_general.sh line 8:
  . "$(dirname $0)"/lib/probe.sh
  ^--------------------------^ SC1090: Can't follow non-constant source.
  Use a directive to specify location.

Fixes: 0255338d69 ("perf trace: Add tests for BTF general augmentation")
Signed-off-by: James Clark <james.clark@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250106164300.734202-1-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-08 17:36:04 -03:00
Arnaldo Carvalho de Melo
64a7617efd perf namespaces: Fixup the nsinfo__in_pidns() return type, its bool
When adding support for refconunt checking a cut'n'paste made this
function, that is just an accessor to a bool member of 'struct nsinfo',
return a pid_t, when that member is a boolean, fix it.

Fixes: bcaf0a9785 ("perf namespaces: Add functions to access nsinfo")
Reported-by: Francesco Nigro <fnigro@redhat.com>
Reported-by: Ilan Green <igreen@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Link: https://lore.kernel.org/r/20241206204828.507527-6-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-08 17:31:06 -03:00
Arnaldo Carvalho de Melo
74833e37df perf jitdump: Fixup in_pidns member when java agent and 'perf record' are not in the same pidns
When running 'perf record' outside a container and the java agent inside
a container the jit_repipe_code_load() and friends will emit
PERF_RECORD_MMAP2 entries for the jitdump records and will check if we
need to fixup the pid/tid:

        nspid = jr->load.pid;
        pid   = jr_entry_pid(jd, jr);
        tid   = jr_entry_tid(jd, jr);

The jr_entry_pid() function looks if we're in the same pidns:

  static pid_t jr_entry_pid(struct jit_buf_desc *jd, union jr_entry *jr)
  {
          if (jd->nsi && nsinfo__in_pidns(jd->nsi))
                  return nsinfo__tgid(jd->nsi);
          return jr->load.pid;
  }

But since the thread, populated from perf.data records, try to figure
out if in the same pidns by actually trying, on the system where 'perf
inject' is running to open a procfs file (a bug that remains to be
fixed), assuming that if it is not possible that is because that thread
terminated and thus we can't get its namespace info and tolerates
nsinfo__init() failing, noting only that that namespace can't be
entered, so don't even try.

But we can kinda get at least that info (thread->nsinfo->in_pidns) from
the data in the perf.data file, namely the pid and tid in the
PERF_RECORD_MMAP2 for the jit-<PID>.dump file generated from the java
agent, if the PERF_RECORD_MMAP2->pid is the same as what is in the
jitdump file, then we're in the same namespace, otherwise we need to use
the PERF_RECORD_MMAP2->pid.

This all has to be revamped for this jitdump + running perf from
outside, as the meaning of in_pidns is being abused, the initialization
of nsinfo->pid with the value coming from the PERF_RECORD_MMAP2 data is
wrong as it is the pid _outside_ the container since perf was running
there.

The hack in this patch at least produces the expected result in this
scenario by following the assumptions in the current codebase for
finding maps and for generating the PERF_RECORD_MMAP2 for the ELF files
synthesized from the jitdump records in jit_repipe_code_load(), etc.s

Reported-by: Francesco Nigro <fnigro@redhat.com>
Reported-by: Ilan Green <igreen@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Link: https://lore.kernel.org/r/20241206204828.507527-5-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-08 17:30:50 -03:00
Arnaldo Carvalho de Melo
9c6a585d25 perf namespaces: Introduce nsinfo__set_in_pidns()
When we're processing a perf.data file we will, for every thread in that
file do a machine__findnew_thread(machine, pid, tid) that when that pid
is seen for the first time will create a 'struct thread' representing
it.

That in turn will call nsinfo__new() -> nsinfo__init() and there it will
assume we're running live, which is wrong and will need to be addressed
in a followup patch.

The nsinfo__new() assumes that if we can't access that thread it has
already finished and will ignore the -1 return from nsinfo__init(), just
taking notes to avoid trying to enter in that namespace, since it isn't
there anymore, a race.

When doing this from 'perf inject', tho, we can fill in parts of that
nsinfo from what we get from the PERF_RECORD_MMAP2 (pid, tid) and in the
jitdump file name, that has the form of jit-<PID>.dump.

So if the pid in the jitdump file name is not the one in the
PERF_RECORD_MMAP2, we can assume that its the pid of the process
_inside_ the namespace, and that perf was runing outside that namespace.

This will be done in the following patch.

Reported-by: Francesco Nigro <fnigro@redhat.com>
Reported-by: Ilan Green <igreen@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Link: https://lore.kernel.org/r/20241206204828.507527-4-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-08 17:30:24 -03:00
Arnaldo Carvalho de Melo
f523347ba6 perf jitdump: Accept jitdump mmaps emitted from inside containers
When the java agent is running inside a container it will emit mmaps
with the format:

  ⬢ [acme@toolbox a]$ perf report -D | grep PERF_RECORD_MMAP | grep \.dump
  0 0x15c400 [0x90]: PERF_RECORD_MMAP2 3308868/3308868: [0x7fb8de6cb000(0x1000) @ 0 08:14 3222905945 0]: r-xp /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jit-1.dump
  ⬢ [acme@toolbox a]$

Since perf is running from outside the container it sees the pid 3308868
in PERF_RECORD_MMAP2, while the agent saw the pid of the profiled app
inside the container, 1.

The previous validation was:

  if (pid && pid2 != nsinfo__nstgid(nsi))

pid2 at this point is '1' (/jit-1.dump), so it considers this as a
malformed jitdump mmap and refuses to process it.

The test ends up as:

  if (3308868 && 1 != 3308868)

which is true and the jitdump is not processed.

Since 1 in the container namespace is really 3308868 in the namespace
that perf is running, consider this a valid mmap.

We need to make perf realize this and behave accordingly, for now
checking instead:

  if (pid && pid2 && pid != nsinfo__nstgid(nsi))

Translating to:

  if (3308868 && 1 && 3308868 != 3308868)

Will make the jitdump mmap to be considered valid and processed.

The jitdump is described in:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/jitdump-specification.txt

Now we end up with the expected flurry of MMAPs, one per jitted function
transformed into a little ELF file that should then be processable by
the other perf features, like code annotation:

  [acme@toolbox a]$ echo $JITDUMPDIR
  /tmp/.debug/jit
  [acme@toolbox a]$

First use 'perf inject':

  ⬢ [acme@toolbox a]$ time perf inject -i perf.data -o acme-perf-injected.data -j

Then look at the PERF_RECORD_MMAP events in the result file, that went
thru the JIT map file:

  ⬢ [acme@toolbox a]$ ls -la /tmp/*.map
  -rw-r--r--. 1 acme acme 2989559 Nov 27 16:11 /tmp/perf-3308868.map
  [acme@toolbox a]$

It is a symbol table:

  ⬢ [acme@toolbox a]$ head /tmp/*.map
  0x00007fb8bda5c1a0 0x00000000000000d0 java.lang.String java.lang.module.ModuleDescriptor.name()
  0x00007fb8bda5c4a0 0x0000000000000178 int java.lang.StringLatin1.hashCode(byte[])
  0x00007fb8bda5c9a0 0x00000000000000d0 java.lang.String org.springframework.boot.context.config.ConfigDataLocation.getValue()
  0x00007fb8bda5cca0 0x00000000000000d0 java.lang.module.ModuleDescriptor java.lang.module.ModuleReference.descriptor()
  0x00007fb8bda5cfa0 0x00000000000000d0 java.lang.Object java.util.KeyValueHolder.getKey()
  0x00007fb8bda5d2a0 0x00000000000000d0 java.lang.Object java.util.KeyValueHolder.getValue()
  0x00007fb8bda5d5a0 0x0000000000000218 boolean jdk.internal.misc.Unsafe.compareAndSetReference(java.lang.Object, long, java.lang.Object, java.lang.Object)
  0x00007fb8bda5d9a0 0x00000000000001f0 boolean jdk.internal.misc.Unsafe.compareAndSetLong(java.lang.Object, long, long, long)
  0x00007fb8bda5dda0 0x00000000000001f8 void java.lang.System.arraycopy(java.lang.Object, int, java.lang.Object, int, int)
  0x00007fb8bda5e1a0 0x00000000000001e8 int java.lang.Object.hashCode()
  ⬢ [acme@toolbox a]$

As specified in:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/Documentation/jit-interface.txt

This was collected from inside the container, so came as
/tmp/perf-1.map.

To make perf, running outside the container to use it we need to copy it
to /tmp/perf-3308868.map.

This is another logic that has to be added to perf to work on this
scenario of running outside the container but processing things created
by the hava agent running inside the container.

With all this in place we get to:

  ⬢ [acme@toolbox a]$ perf report -D -i acme-perf-injected.data | \
  			grep PERF_RECORD_MMAP > /tmp/acme-perf-injected.data.mmaps ; \
  			wc -l /tmp/acme-perf-injected.data.mmaps
  44182 /tmp/acme-perf-injected.data.mmaps
  ⬢ [acme@toolbox a]$ tail /tmp/acme-perf-injected.data.mmaps
  1030266786574466 0x7bc9e0 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0ceb1c0(0x8d0) @ 0x80 00:2c 238715 1]: --xs /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43989.so
  1030266795288774 0x7bca78 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0cecc00(0x7e8) @ 0x80 00:2c 238716 1]: --xs /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43990.so
  1030266895967339 0x7bcb10 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0cee500(0x3328) @ 0x80 00:2c 238717 1]: --xs /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43991.so
  1030266915748306 0x7bcba8 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0aae0a0(0x138) @ 0x80 00:2c 238718 1]: --xs /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43992.so
  1030267185851220 0x7bcc40 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0cf61e0(0x3b50) @ 0x80 00:2c 238719 1]: --xs /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43993.so
  1030267231364524 0x7bccd8 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0cfea80(0x14a0) @ 0x80 00:2c 238720 1]: --xs /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43994.so
  1030267425498831 0x7bcd70 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c054b4a0(0x338) @ 0x80 00:2c 238721 1]: --xs /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43995.so
  1030267506147888 0x7bce08 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0a995c0(0x1e8) @ 0x80 00:2c 238722 1]: --xs /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43996.so
  1030268112586116 0x7bcea0 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0d02520(0x258) @ 0x80 00:2c 238723 1]: --xs /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43997.so
  1030269435398150 0x7bcf38 [0x98]: PERF_RECORD_MMAP2 1/78: [0x7fb8c0d02dc0(0x278) @ 0x80 00:2c 238724 1]: --xs /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43998.so
  ⬢ [acme@toolbox a]$

And if we look at those tiny ELF files generated by the jitdump code
used by 'perf inject' we see:

  ⬢ [acme@toolbox a]$ file /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43989.so
  /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43989.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=790591db95a77d644657dfe5058658b200000000, with debug_info, not stripped
  ⬢ [acme@toolbox a]$ file /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43990.so
  /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43990.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=762f932acbee53a22638bf4c2b86780200000000, with debug_info, not stripped
  ⬢ [acme@toolbox a]$

  ⬢ [acme@toolbox a]$ ls -la /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43989.so /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43990.so
  -rw-r--r--. 1 acme acme 9432 Nov 29 10:56 /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43989.so
  -rw-r--r--. 1 acme acme 7504 Nov 29 10:56 /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43990.so
  ⬢ [acme@toolbox a]$

And:

  ⬢ [acme@toolbox a]$ objdump -dS /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43990.so | head -20

  /tmp/.debug/jit/java-jit-20241126.XXTxEIOn/jitted-1-43990.so:     file format elf64-x86-64

  Disassembly of section .text:

  0000000000000080 <Lredacted/REDACTED/REDACTED/logging/RedactedRedacted;Redacted(Lredacted/REDACTED/REDACTED/redactedRedacted/Redacted;)V>:
    80:	44 8b 56 08          	mov    0x8(%rsi),%r10d
    84:	49 c1 e2 03          	shl    $0x3,%r10
    88:	49 3b c2             	cmp    %r10,%rax
    8b:	0f 85 6f 15 83 fc    	jne    fffffffffc831600 <Lredacted/REDACTED/REDACTED/redacted/RedactedRedactedRedacted;Redacted(Lredacted/Redacted/Redacted/redactedRedacted/Redacted;)V+0xfffffffffc831580>
    91:	66 66 90             	data16 xchg %ax,%ax
    94:	0f 1f 84 00 00 00 00 	nopl   0x0(%rax,%rax,1)
    9b:	00
    9c:	66 66 66 90          	data16 data16 xchg %ax,%ax
    a0:	89 84 24 00 c0 fe ff 	mov    %eax,-0x14000(%rsp)
    a7:	55                   	push   %rbp
    a8:	48 8b ec             	mov    %rsp,%rbp
    ab:	48 83 ec 40          	sub    $0x40,%rsp
    af:	48 89 34 24          	mov    %rsi,(%rsp)
  ⬢ [acme@toolbox a]$

The thing now being investigated is why we can't annotate anything here,
maybe that JITDUMPDIR is getting in the way:

  ⬢ [acme@toolbox a]$ perf annotate --stdio2 -i acme-perf-injected.data 'java.lang.String com.fasterxml.jackson.core.sym.CharsToNameCanonicalizer.findSymbol(char[], int, int, int)'
  Error:
  Couldn't annotate java.lang.String com.fasterxml.jackson.core.sym.CharsToNameCanonicalizer.findSymbol(char[], int, int, int):
  Internal error: Invalid -1 error code
  ⬢ [acme@toolbox a]$

In the tests I performed while merging this patch:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6d518ac7be6223811ab947897273b1bbef846180

It works, but then there was no JITDUMPDIR involved:

  /home/acme/.debug/jit/java-jit-20241127.XXF1SRgN/jitted-3912413-4191.so

⬢ [acme@toolbox perf-tools-next]$ perf report --call-graph none --no-child -i perf-injected.data | grep jitted- | head
     1.36%  java             jitted-3912413-54.so    [.] Interpreter
     0.30%  C1 CompilerThre  jitted-3912413-1.so     [.] flush_icache_stub
     0.18%  java             jitted-3912413-4184.so  [.] org.apache.fop.fo.properties.PropertyMaker.get(int, org.apache.fop.fo.PropertyList, boolean, boolean)
     0.18%  java             jitted-3912413-4177.so  [.] org.apache.fop.layoutmgr.inline.TextLayoutManager.getNextKnuthElements(org.apache.fop.layoutmgr.LayoutContext, int)
     0.13%  java             jitted-3912413-3845.so  [.] java.text.DecimalFormat.subformatNumber(java.lang.StringBuffer, java.text.Format$FieldDelegate, boolean, boolean, int, int, int, int)
     0.11%  java             jitted-3912413-4191.so  [.] org.apache.fop.fo.FObj.addChildNode(org.apache.fop.fo.FONode)
     0.09%  java             jitted-3912413-2418.so  [.] org.apache.fop.fo.XMLWhiteSpaceHandler.handleWhiteSpace()
     0.08%  Reference Handl  jitted-3912413-54.so    [.] Interpreter
     0.08%  java             jitted-3912413-3326.so  [.] org.apache.xmlgraphics.fonts.Glyphs.stringToGlyph(java.lang.String)
     0.08%  java             jitted-3912413-3953.so  [.] org.apache.fop.layoutmgr.BreakingAlgorithm.considerLegalBreak(org.apache.fop.layoutmgr.KnuthElement, int)
⬢ [acme@toolbox perf-tools-next]$

And then:

  ⬢ [acme@toolbox perf-tools-next]$ perf annotate --stdio2 -i perf-injected.data 'org.apache.fop.layoutmgr.inline.TextLayoutManager.getNextKnuthElements(org.apache.fop.layoutmgr.LayoutContext, int)' | head -20
  Samples: 8  of event 'cpu_atom/cycles/Pu', 4000 Hz, Event count (approx.): 8112794, [percent: local period]
  org.apache.fop.layoutmgr.inline.TextLayoutManager.getNextKnuthElements(org.apache.fop.layoutmgr.LayoutContext, int)() /home/acme/.debug/jit/java-jit-20241127.XXF1SRgN/jitted-3912413-4177.so
  Percent        0x80 <org.apache.fop.layoutmgr.inline.TextLayoutManager.getNextKnuthElements(org.apache.fop.layoutmgr.LayoutContext, int)>:
                   nop
                   movl       0x8(%rsi),%r10d
                   cmpl       0x8(%rax),%r10d
                 → jne        0
                   movl       %eax,-0x14000(%rsp)
                   pushq      %rbp
                   subq       $0xb0,%rsp
                   nop
                   cmpl       $0x3,0x20(%r15)
                 ↓ jne        7037
             2e:   movl       %ecx,0x28(%rsp)
                   movq       %rdx,%rbp
                   movl       0x64(%rdx),%ebx
                   cmpb       $0x0,0x38(%r15)
                 ↓ jne        3a44
                   movq       %rsi,0x30(%rsp)
             48:   movq       0x30(%rsp),%r10
  ⬢ [acme@toolbox perf-tools-next]$

No source code nor line numbers, that I saw in another build of perf for
RHEL9, for the same workload described in the cset above (a publicly
available java benchmark), so something to investigate on perf upstream
running on fedora, maybe some quirk with the jdk used when building perf
for RHEL 9 and for Fedora 40.

A related patch that should have make this all work is:

  "perf inject jit: Add namespaces support"
  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=67dec926931448d688efb5fe34f7b5a22470fc0a

But we still need to polish this some more, maybe there are differences
in the agent used in NodeJS with --perf-prof and the jvmti one we're
using.

Hopefully describing all the steps while we investigate this case will
help us improve perf support for profiling JITed environments running in
containers while profiling from inside and outside it.

Reported-by: Francesco Nigro <fnigro@redhat.com>
Reported-by: Ilan Green <igreen@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Clark Williams <williams@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Link: https://lore.kernel.org/r/20241206204828.507527-3-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-08 17:29:51 -03:00
Christophe Leroy
7a93786c30 perf machine: Don't ignore _etext when not a text symbol
Depending on how vmlinux.lds is written, _etext might be the very first
data symbol instead of the very last text symbol.

Don't require it to be a text symbol, accept any symbol type.

Comitter notes:

See the first Link for further discussion, but it all boils down to
this:

 ---
  # grep -e _stext -e _etext -e _edata /proc/kallsyms
  c0000000 T _stext
  c08b8000 D _etext

  So there is no _edata and _etext is not text

  $ ppc-linux-objdump -x vmlinux | grep -e _stext -e _etext -e _edata
  c0000000 g       .head.text	00000000 _stext
  c08b8000 g       .rodata	00000000 _etext
  c1378000 g       .sbss	00000000 _edata
 ---

Fixes: ed9adb2035 ("perf machine: Read also the end of the kernel")
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/r/b3ee1994d95257cb7f2de037c5030ba7d1bed404.1736327613.git.christophe.leroy@csgroup.eu
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-08 17:20:42 -03:00
Christophe Leroy
dae29277fd perf maps: Fix display of kernel symbols
Since commit 659ad3492b ("perf maps: Switch from rbtree to lazily
sorted array for addresses"), perf doesn't display anymore kernel
symbols on powerpc, allthough it still detects them as kernel addresses.

	# Overhead  Command     Shared Object  Symbol
	# ........  ..........  ............. ......................................
	#
	    80.49%  Coeur main  [unknown]      [k] 0xc005f0f8
	     3.91%  Coeur main  gau            [.] engine_loop.constprop.0.isra.0
	     1.72%  Coeur main  [unknown]      [k] 0xc005f11c
	     1.09%  Coeur main  [unknown]      [k] 0xc01f82c8
	     0.44%  Coeur main  libc.so.6      [.] epoll_wait
	     0.38%  Coeur main  [unknown]      [k] 0xc0011718
	     0.36%  Coeur main  [unknown]      [k] 0xc01f45c0

This is because function maps__find_next_entry() now returns current
entry instead of next entry, leading to kernel map end address getting
mis-configured with its own start address instead of the start address
of the following map.

Fix it by really taking the next entry, also make sure that entry
follows current one by making sure entries are sorted.

Fixes: 659ad3492b ("perf maps: Switch from rbtree to lazily sorted array for addresses")
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/2ea4501209d5363bac71a6757fe91c0747558a42.1736329923.git.christophe.leroy@csgroup.eu
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-08 17:20:42 -03:00
Namhyung Kim
c738a34417 perf test: Update ftrace test to use --graph-opts
I found it failed on machines with limited memory because 16M byte
per-cpu buffer is too big.  The reason it added the option is not to
miss tracing data.  Thus we can limit the data size by reducing the
function call depth instead of increasing the buffer size to handle the
whole data.

As it used the same option in the test_ftrace_trace() and it was able
to find the sleep function, it should work with the profile subcommand.

Get rid of other grep commands which might be affected by the depth
change.

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20250107224352.1128669-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-08 17:20:42 -03:00
Namhyung Kim
e5f2024cb9 perf ftrace profile: Add --graph-opts option
Like trace subcommand, it should be able to pass some options to control
the tracing behavior for the function graph tracer.

But some options are limited in order to maintain the internal behavior.

For example, it can limit the function call depth like below:

  # perf ftrace profile --graph-opts depth=5 -- myprog

Committer testing:

  root@number:~# perf ftrace profile --graph-opts thresh=1000 -- sleep 1
  # Total (us)   Avg (us)   Max (us)      Count   Function
   1001419.301 500709.650 1000032.000          2   x64_sys_call
   1000032.000 1000032.000 1000032.000          1   __x64_sys_clock_nanosleep
   1000032.000 1000032.000 1000032.000          1   common_nsleep
   1000031.000 1000031.000 1000031.000          1   do_nanosleep
   1000031.000 1000031.000 1000031.000          1   hrtimer_nanosleep
   1000024.000 1000024.000 1000024.000          1   schedule
      1387.208   1387.208   1387.208          1   __x64_sys_execve
      1386.691   1386.691   1386.691          1   do_execveat_common.isra.0
      1334.170   1334.170   1334.170          1   bprm_execve
      1258.413   1258.413   1258.413          1   load_elf_binary
      1123.068   1123.068   1123.068          1   begin_new_exec
      1113.550   1113.550   1113.550          1   mmput
      1109.237   1109.237   1109.237          1   exit_mmap
  root@number:~# perf ftrace profile --graph-opts thresh=1200 -- sleep 1
  # Total (us)   Avg (us)   Max (us)      Count   Function
   1001448.204 500724.102 1000018.000          2   x64_sys_call
   1000017.000 1000017.000 1000017.000          1   __x64_sys_clock_nanosleep
   1000017.000 1000017.000 1000017.000          1   common_nsleep
   1000017.000 1000017.000 1000017.000          1   hrtimer_nanosleep
   1000016.000 1000016.000 1000016.000          1   do_nanosleep
   1000012.000 1000012.000 1000012.000          1   schedule
      1430.112   1430.112   1430.112          1   __x64_sys_execve
      1429.581   1429.581   1429.581          1   do_execveat_common.isra.0
      1376.289   1376.289   1376.289          1   bprm_execve
      1301.743   1301.743   1301.743          1   load_elf_binary
  root@number:~#

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250107224352.1128669-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-08 17:20:42 -03:00
Namhyung Kim
86a12b92a9 perf ftrace: Display latency statistics at the end
Sometimes users also want to see average latency as well as histogram.

Display latency statistics like avg, max, min at the end.

  $ sudo ./perf ftrace latency -ab -T synchronize_rcu -- ...
  #   DURATION     |      COUNT | GRAPH                                          |
       0 - 1    us |          0 |                                                |
       1 - 2    us |          0 |                                                |
       2 - 4    us |          0 |                                                |
       4 - 8    us |          0 |                                                |
       8 - 16   us |          0 |                                                |
      16 - 32   us |          0 |                                                |
      32 - 64   us |          0 |                                                |
      64 - 128  us |          0 |                                                |
     128 - 256  us |          0 |                                                |
     256 - 512  us |          0 |                                                |
     512 - 1024 us |          0 |                                                |
       1 - 2    ms |          0 |                                                |
       2 - 4    ms |          0 |                                                |
       4 - 8    ms |          0 |                                                |
       8 - 16   ms |          1 | #####                                          |
      16 - 32   ms |          7 | ########################################       |
      32 - 64   ms |          0 |                                                |
      64 - 128  ms |          0 |                                                |
     128 - 256  ms |          0 |                                                |
     256 - 512  ms |          0 |                                                |
     512 - 1024 ms |          0 |                                                |
       1 - ...   s |          0 |                                                |

  # statistics  (in usec)
    total time:               171832
      avg time:                21479
      max time:                30906
      min time:                15869
         count:                    8

Committer testing:

  root@number:~# perf ftrace latency -nab --bucket-range 100 --max-latency 512 -T switch_mm_irqs_off sleep 1
  #   DURATION     |      COUNT | GRAPH                                          |
       0 -  100 ns |        314 | ##                                             |
     100 -  200 ns |       1843 | #############                                  |
     200 -  300 ns |       1390 | ##########                                     |
     300 -  400 ns |        844 | ######                                         |
     400 -  500 ns |        480 | ###                                            |
     500 -  512 ns |        315 | ##                                             |
     512 -  ... ns |         16 |                                                |

  # statistics  (in nsec)
    total time:              2448936
      avg time:                  387
      max time:                 3285
      min time:                   82
         count:                 6328
  root@number:~#

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250107224352.1128669-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-01-08 17:20:42 -03:00