When building perf with LIBBPF_DYNAMIC=1 on a fedora system with
libbpf-devel 1.5 I it was breaking with:
util/bpf-event.c: In function ‘format_btf_variable’:
util/bpf-event.c:291:18: error: ‘const struct btf_dump_type_data_opts’ has no member named ‘emit_strings’
291 | .emit_strings = 1,
| ^~~~~~~~~~~~
util/bpf-event.c:291:33: error: initialized field overwritten [-Werror=override-init]
291 | .emit_strings = 1,
| ^
util/bpf-event.c:291:33: note: (near initialization for ‘opts.skip_names’)
Check the version before using that feature.
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The detection of uncore_imc may happen for free running PMUs and the
clockticks event may be present on uncore_clock. Rewrite the test to
detect duplicated/deduplicated events from perf list, not hardcoded to
uncore_imc.
If perf stat fails then assume it is permissions and skip the test.
Committer testing:
Before:
root@x1:~# perf test -vv uniquifyi
96: perf stat events uniquifying:
--- start ---
test child forked, pid 220851
stat event uniquifying test
grep: Unmatched [, [^, [:, [., or [=
Event is not uniquified [Failed]
perf stat -e clockticks -A -o /tmp/__perf_test.stat_output.X7ChD -- true
# started on Fri Sep 19 16:48:38 2025
Performance counter stats for 'system wide':
CPU0 2,310,956 uncore_clock/clockticks/
0.001746771 seconds time elapsed
---- end(-1) ----
96: perf stat events uniquifying : FAILED!
root@x1:~#
After:
root@x1:~# perf test -vv uniquifyi
96: perf stat events uniquifying:
--- start ---
test child forked, pid 222366
Uniquification of PMU sysfs events test
Testing event uncore_imc_free_running/data_read/ is uniquified to uncore_imc_free_running_0/data_read/
Testing event uncore_imc_free_running/data_total/ is uniquified to uncore_imc_free_running_0/data_total/
Testing event uncore_imc_free_running/data_write/ is uniquified to uncore_imc_free_running_0/data_write/
Testing event uncore_imc_free_running/data_read/ is uniquified to uncore_imc_free_running_1/data_read/
Testing event uncore_imc_free_running/data_total/ is uniquified to uncore_imc_free_running_1/data_total/
Testing event uncore_imc_free_running/data_write/ is uniquified to uncore_imc_free_running_1/data_write/
---- end(0) ----
96: perf stat events uniquifying : Ok
root@x1:~#
Fixes: 070b315333 ("perf test: Restrict uniquifying test to machines with 'uncore_imc'")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The PMU name is appearing twice in:
```
$ perf stat -e uncore_imc_free_running/data_total/ -A true
Performance counter stats for 'system wide':
CPU0 1.57 MiB uncore_imc_free_running_0/uncore_imc_free_running,data_total/
CPU0 1.58 MiB uncore_imc_free_running_1/uncore_imc_free_running,data_total/
0.000892376 seconds time elapsed
```
Use the pmu_name_len_no_suffix to avoid this problem.
Committer testing:
After this patch:
root@x1:~# perf stat -e uncore_imc_free_running/data_total/ -A true
Performance counter stats for 'system wide':
CPU0 1.69 MiB uncore_imc_free_running_0/data_total/
CPU0 1.68 MiB uncore_imc_free_running_1/data_total/
0.002141605 seconds time elapsed
root@x1:~#
Fixes: 7d45f402d3 ("perf evlist: Make uniquifying counter names consistent")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The test starts a workload and then opens events. If the events fail
to open, for example because of perf_event_paranoid, the gopipe of the
workload is leaked and the file descriptor leak check fails when the
test exits. To avoid this cancel the workload when opening the events
fails.
Before:
```
$ perf test -vv 7
7: PERF_RECORD_* events & perf_sample fields:
--- start ---
test child forked, pid 1189568
Using CPUID GenuineIntel-6-B7-1
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8
sys_perf_event_open failed, error -13
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
exclude_kernel 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8
sys_perf_event_open failed, error -13
------------------------------------------------------------
perf_event_attr:
type 0 (PERF_TYPE_HARDWARE)
config 0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/)
disabled 1
exclude_kernel 1
------------------------------------------------------------
sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3
Attempt to add: software/cpu-clock/
..after resolving event: software/config=0/
cpu-clock -> software/cpu-clock/
------------------------------------------------------------
perf_event_attr:
type 1 (PERF_TYPE_SOFTWARE)
size 136
config 0x9 (PERF_COUNT_SW_DUMMY)
sample_type IP|TID|TIME|CPU
read_format ID|LOST
disabled 1
inherit 1
mmap 1
comm 1
enable_on_exec 1
task 1
sample_id_all 1
mmap2 1
comm_exec 1
ksymbol 1
bpf_event 1
{ wakeup_events, wakeup_watermark } 1
------------------------------------------------------------
sys_perf_event_open: pid 1189569 cpu 0 group_fd -1 flags 0x8
sys_perf_event_open failed, error -13
perf_evlist__open: Permission denied
---- end(-2) ----
Leak of file descriptor 6 that opened: 'pipe:[14200347]'
---- unexpected signal (6) ----
iFailed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
Failed to read build ID for //anon
#0 0x565358f6666e in child_test_sig_handler builtin-test.c:311
#1 0x7f29ce849df0 in __restore_rt libc_sigaction.c:0
#2 0x7f29ce89e95c in __pthread_kill_implementation pthread_kill.c:44
#3 0x7f29ce849cc2 in raise raise.c:27
#4 0x7f29ce8324ac in abort abort.c:81
#5 0x565358f662d4 in check_leaks builtin-test.c:226
#6 0x565358f6682e in run_test_child builtin-test.c:344
#7 0x565358ef7121 in start_command run-command.c:128
#8 0x565358f67273 in start_test builtin-test.c:545
#9 0x565358f6771d in __cmd_test builtin-test.c:647
#10 0x565358f682bd in cmd_test builtin-test.c:849
#11 0x565358ee5ded in run_builtin perf.c:349
#12 0x565358ee6085 in handle_internal_command perf.c:401
#13 0x565358ee61de in run_argv perf.c:448
#14 0x565358ee6527 in main perf.c:555
#15 0x7f29ce833ca8 in __libc_start_call_main libc_start_call_main.h:74
#16 0x7f29ce833d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128
#17 0x565358e391c1 in _start perf[851c1]
7: PERF_RECORD_* events & perf_sample fields : FAILED!
```
After:
```
$ perf test 7
7: PERF_RECORD_* events & perf_sample fields : Skip (permissions)
```
Fixes: 16d00fee70 ("perf tests: Move test__PERF_RECORD into separate object")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If libperl is installed then the perf tool build will build against
it. There appears to be limited interest in the scripting support for
perl so let's make it opt-in and deprecate it.
With this patch applied you need to add LIBPERL=1 to get libperl
support in perf - there is no warning if libperl is missing, but
building will fail if libperl is missing and the build has LIBPERL=1.
The perf version output is changed to:
```
$ perf version --build-options
perf version 6.17.rc3.g8eca69269947
aio: [ on ] # HAVE_AIO_SUPPORT
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
bpf_skeletons: [ on ] # HAVE_BPF_SKEL
debuginfod: [ on ] # HAVE_DEBUGINFOD_SUPPORT
dwarf: [ on ] # HAVE_LIBDW_SUPPORT
dwarf_getlocations: [ on ] # HAVE_LIBDW_SUPPORT
dwarf-unwind: [ on ] # HAVE_DWARF_UNWIND_SUPPORT
auxtrace: [ on ] # HAVE_AUXTRACE_SUPPORT
libbfd: [ OFF ] # HAVE_LIBBFD_SUPPORT ( tip: Deprecated, license incompatibility, use BUILD_NONDISTRO=1 and install binutils-dev[el] )
libbpf-strings: [ on ] # HAVE_LIBBPF_STRINGS_SUPPORT
libcapstone: [ on ] # HAVE_LIBCAPSTONE_SUPPORT
libdw-dwarf-unwind: [ on ] # HAVE_LIBDW_SUPPORT
libelf: [ on ] # HAVE_LIBELF_SUPPORT
libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT
libopencsd: [ OFF ] # HAVE_CSTRACE_SUPPORT
libperl: [ OFF ] # HAVE_LIBPERL_SUPPORT ( tip:
Deprecated, use LIBPERL=1 and install libperl-dev to build with it )
libpfm4: [ on ] # HAVE_LIBPFM
libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT
libslang: [ on ] # HAVE_SLANG_SUPPORT
libtraceevent: [ on ] # HAVE_LIBTRACEEVENT
libunwind: [ OFF ] # HAVE_LIBUNWIND_SUPPORT ( tip:
Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build
with it )
lzma: [ on ] # HAVE_LZMA_SUPPORT
numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT
zlib: [ on ] # HAVE_ZLIB_SUPPORT
zstd: [ on ] # HAVE_ZSTD_SUPPORT
```
i.e. there is a tip saying about deprecation and how to get support
back.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <qmo@kernel.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tomas Glozar <tglozar@redhat.com>
Cc: Yuzhuo Jing <yuzhuo@google.com>
Link: https://lore.kernel.org/lkml/aMrk03gigBlGcYLK@x1/
Link: https://lore.kernel.org/lkml/CAP-5=fVX+bLBRJCiziDi_hBySgv2NFtDoghtpheSSxVAvvETGw@mail.gmail.com
[ Keep the pre-existing perl-ExtUtils-Embed hint for Fedora/RHEL systems ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If a user specifies an AUX buffer larger than 2 GiB, the returned size
may exceed 0x80000000. Since the err variable is defined as a signed
32-bit integer, such a value overflows and becomes negative.
As a result, the perf record command reports an error:
0x146e8 [0x30]: failed to process type: 71 [Unknown error 183711232]
Change the type of the err variable to a signed 64-bit integer to
accommodate large buffer sizes correctly.
Fixes: d5652d865e ("perf session: Add ability to skip 4GiB or more")
Reported-by: Tamas Zsoldos <tamas.zsoldos@arm.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20250808-perf_fix_big_buffer_size-v1-1-45f45444a9a4@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There can be a significant gap in memset/memcpy performance depending
on the size of the region being operated on.
With chunk-size=4kb:
$ echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
$ perf bench mem memset -p 4kb -k 4kb -s 4gb -l 10 -f x86-64-stosq
# Running 'mem/memset' benchmark:
# function 'x86-64-stosq' (movsq-based memset() in arch/x86/lib/memset_64.S)
# Copying 4gb bytes ...
13.011655 GB/sec
With chunk-size=1gb:
$ echo madvise > /sys/kernel/mm/transparent_hugepage/enabled
$ perf bench mem memset -p 4kb -k 1gb -s 4gb -l 10 -f x86-64-stosq
# Running 'mem/memset' benchmark:
# function 'x86-64-stosq' (movsq-based memset() in arch/x86/lib/memset_64.S)
# Copying 4gb bytes ...
21.936355 GB/sec
So, allow the user to specify the chunk-size.
The default value is identical to the total size of the region, which
preserves current behaviour.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Page sizes that can be selected: 4KB, 2MB, 1GB.
Both the reservation and node from which hugepages are allocated
from are expected to be addressed by the user.
An example of page-size selection:
$ perf bench mem memset -s 4gb -p 2mb
# Running 'mem/memset' benchmark:
# function 'default' (Default memset() provided by glibc)
# Copying 4gb bytes ...
14.919194 GB/sec
# function 'x86-64-unrolled' (unrolled memset() in arch/x86/lib/memset_64.S)
# Copying 4gb bytes ...
11.514503 GB/sec
# function 'x86-64-stosq' (movsq-based memset() in arch/x86/lib/memset_64.S)
# Copying 4gb bytes ...
12.600568 GB/sec
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Raghavendra K T <raghavendra.kt@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Pull sound fixes from Takashi Iwai:
"A collection of small fixes. The volume became higher than wished, but
nothing really stands out -- all small, nice and smooth.
A slightly large change is found in qcom USB-audio offload stuff, but
this is a regression fix specific to this device, hence it should be
safe to apply at this late stage.
- Various small fixes for ASoC Cirrus, Realtek, lpass, Intel and
Qualcomm drivers
- ASoC SoundWire fixes
- A few TAS2781 HD-audio side-codec driver fixes
- A fix for Qualcomm USB-audio offload breakage
- Usual a few HD-audio quirks"
* tag 'sound-6.17-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (35 commits)
ALSA: hda/realtek: Fix mute led for HP Laptop 15-dw4xx
ALSA: hda: intel-dsp-config: Prevent SEGFAULT if ACPI_HANDLE() is NULL
ALSA: usb: qcom: Fix false-positive address space check
ASoC: rt5682s: Adjust SAR ADC button mode to fix noise issue
ASoC: Intel: PTL: Add entry for HDMI-In capture support to non-I2S codec boards.
ASoC: amd: acp: Fix incorrect retrival of acp_chip_info
ASoC: Intel: sof_sdw: use PRODUCT_FAMILY for Fatcat series
ASoC: qcom: sc8280xp: Fix sound card driver name match data for QCS8275
ALSA: hda/realtek: Fix volume control on Lenovo Thinkbook 13x Gen 4
ALSA: hda/realtek: Support Lenovo Thinkbook 13x Gen 5
ALSA: hda: cs35l41: Support Lenovo Thinkbook 13x Gen 5
ALSA: hda/realtek: Add ALC295 Dell TAS2781 I2C fixup
ALSA: hda/tas2781: Fix a potential race condition that causes a NULL pointer in case no efi.get_variable exsits
ASoC: qcom: sc8280xp: Enable DAI format configuration for MI2S interfaces
ASoC: qcom: q6apm-lpass-dais: Fix missing set_fmt DAI op for I2S
ASoC: qcom: audioreach: Fix lpaif_type configuration for the I2S interface
ASoC: Intel: catpt: Expose correct bit depth to userspace
ALSA: hda/tas2781: Fix the order of TAS2781 calibrated-data
ASoC: codecs: lpass-wsa-macro: Fix speaker quality distortion
ASoC: codecs: lpass-rx-macro: Fix playback quality distortion
...
When not running as root and with higher perf event paranoia values
the perf record LBR tests could fail rather than skipping the
problematic tests.
Add the sensitivity to the test and confirm it passes with paranoia
values from -1 to 2.
Committer testing:
Testing with '$ perf test -vv lbr', i.e. as non root, and then comparing
the output shows the mentioned errors before this patch:
acme@x1:~$ grep -m1 "model name" /proc/cpuinfo
model name : 13th Gen Intel(R) Core(TM) i7-1365U
acme@x1:~$
Before:
132: perf record LBR tests : Skip
After:
132: perf record LBR tests : Ok
Fixes: 32559b99e0 ("perf test: Add set of perf record LBR tests")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf record testcase fails on systems with more than 1K CPUs.
Testcase: perf test -vv "PERF_RECORD_* events & perf_sample fields"
PERF_RECORD_* events & perf_sample fields :
--- start ---
test child forked, pid 272482
sched_getaffinity: Invalid argument
sched__get_first_possible_cpu: Invalid argument
test child finished with -1
---- end ----
PERF_RECORD_* events & perf_sample fields: FAILED!
sched__get_first_possible_cpu uses "sched_getaffinity" to get the
cpumask and this call is returning EINVAL (Invalid argument).
This happens because the default mask size in glibc is 1024.
To overcome this 1024 CPUs mask size limitation of cpu_set_t, change the
mask size using the CPU_*_S macros ie, use CPU_ALLOC to allocate
cpumask, CPU_ALLOC_SIZE for size.
Same fix needed for mask which is used to setaffinity so that mask size
is large enough to represent number of possible CPU's in the system.
Reported-by: Tejas Manhas <tejas05@linux.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Tejas Manhas <tejas05@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Utilizes the previous is_breg_access_indirect function to determine if
the register + offset stores the variable itself or the struct it points
to, save the information in die_var_type.is_reg_var_addr.
Since we are storing the real types in the stack state, we need to do a
type dereference when is_reg_var_addr is set to false for stack/frame
registers.
For other gp registers, skip the variable when the register is a pointer
to the type. If we want to accept these variables, we might also utilize
is_reg_var_addr in a different way, we need to mark that register as a
pointer to the type.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Zecheng Li <zecheng@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xu Liu <xliuprof@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduces the function is_breg_access_indirect to determine whether a
memory access involving a DW_OP_breg* operation refers to the variable's
value directly or requires dereferencing the variable's type as a
pointer based on the DWARF expression.
Previously, all breg based accesses were assumed to directly access the
variable's value (is_pointer = false).
The is_breg_access_indirect function handles three cases:
1. Base register + offset only: (e.g., DW_OP_breg7 RSP+88) The
calculated address is the location of the variable. The access is
direct, so no type dereference is needed. Returns false.
2. Base register + offset, followed by other operations ending in
DW_OP_stack_value, including DW_OP_deref: (e.g., DW_OP_breg*,
DW_OP_deref, DW_OP_stack_value) The DWARF expression computes the
variable's value, but that value requires a dereference. The memory
access is fetching that value, so no type dereference is needed.
Returns false.
3. Base register + offset, followed only by DW_OP_stack_value: (e.g.,
DW_OP_breg13 R13+256, DW_OP_stack_value) This indicates the value at
the base + offset is the variable's value. Since this value is being
used as an address in the memory access, the variable's type is
treated as a pointer and requires a type dereference. Returns true.
The is_pointer argument passed to match_var_offset is now set by
is_breg_access_indirect for breg accesses.
There are more complex expressions that includes multiple operations and
may require additional handling, such as DW_OP_deref without a
DW_OP_stack_value, or including multiple base registers. They are less
common in the Linux kernel dwarf and are skipped in check_allowed_ops.
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Zecheng Li <zecheng@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xu Liu <xliuprof@google.com>
Link: https://lore.kernel.org/r/CAJUgMyK2wTiEZB__dtgCELmaNGFWhG1j0g9rv_C=cLD6Zq4_5w@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Current code skips to parse events after generating data source. The
reason is the data source packets have cache and snooping related info,
the afterwards event packets might contain duplicate info.
This commit changes to continue parsing the events after data source
analysis. If data source does not give out memory level and snooping
types, then the event info is used to synthesize the related fields.
As a result, both the peer snoop option ('-d peer') and hitm options
('-d tot/lcl/rmt') are supported by Arm SPE in 'perf c2c'.
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since FEAT_SPEv1p4, Arm SPE provides two extra events: "Cache data
modified" and "Data snooped".
Set the snoop mode as:
- If both the "Cache data modified" event and the "Data snooped" event
are set, which indicates a load operation that snooped from a outside
cache and hit a modified copy, set the HITM flag to inspect false
sharing.
- If the snooped event bit is not set, and the snooped event has been
supported by the hardware, set as NONE mode (no snoop operation).
- If the snooped event bit is not set, and the event is not supported or
absent the events info in the meta data, set as NA mode (not
available).
Don't set any mode for only "Cache data modified" event, as it hits a
local modified copy.
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>