linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-31 10:19:22 -04:00

Author	SHA1	Message	Date
Ian Rogers	1bcd627165	perf stat: Remove "unit" workarounds for metric-only Remove code that tested the "unit" as in KB/sec for certain hard coded metric values and did workarounds. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-11 16:48:35 -08:00
Ian Rogers	19df87d9ed	perf stat: Fix default metricgroup display on hybrid The logic to skip output of a default metric line was firing on Alderlake and not displaying 'TopdownL1 (cpu_atom)'. Remove the need_full_name check as it is equivalent to the different PMU test in the cases we care about, merge the 'if's and flip the evsel of the PMU test. The 'if' is now basically saying, if the output matches the last printed output then skip the output. Before: ``` TopdownL1 (cpu_core) # 11.3 % tma_bad_speculation # 24.3 % tma_frontend_bound TopdownL1 (cpu_core) # 33.9 % tma_backend_bound # 30.6 % tma_retiring # 42.2 % tma_backend_bound # 25.0 % tma_frontend_bound (49.81%) # 12.8 % tma_bad_speculation # 20.0 % tma_retiring (59.46%) ``` After: ``` TopdownL1 (cpu_core) # 8.3 % tma_bad_speculation # 43.7 % tma_frontend_bound # 30.7 % tma_backend_bound # 17.2 % tma_retiring TopdownL1 (cpu_atom) # 31.9 % tma_backend_bound # 37.6 % tma_frontend_bound (49.66%) # 18.0 % tma_bad_speculation # 12.6 % tma_retiring (59.58%) ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-11 16:48:35 -08:00
Ian Rogers	b71f46a6a7	perf stat: Remove hard coded shadow metrics Now that the metrics are encoded in common json the hard coded printing means the metrics are shown twice. Remove the hard coded version. This means that when specifying events, and those events correspond to a hard coded metric, the metric will no longer be displayed. The metric will be displayed if the metric is requested. Due to the adhoc printing in the previous approach it was often found frustrating, the new approach avoids this. The default perf stat output on an alderlake now looks like: ``` $ perf stat -a -- sleep 1 Performance counter stats for 'system wide': 19,697 context-switches # nan cs/sec cs_per_second TopdownL1 (cpu_core) # 10.7 % tma_bad_speculation # 24.9 % tma_frontend_bound TopdownL1 (cpu_core) # 34.3 % tma_backend_bound # 30.1 % tma_retiring 6,593 page-faults # nan faults/sec page_faults_per_second 729,065,658 cpu_atom/cpu-cycles/ # nan GHz cycles_frequency (49.79%) 1,605,131,101 cpu_core/cpu-cycles/ # nan GHz cycles_frequency # 19.7 % tma_bad_speculation # 14.2 % tma_retiring (50.14%) # 37.3 % tma_frontend_bound (50.31%) 87,302,268 cpu_atom/branches/ # nan M/sec branch_frequency (60.27%) 512,046,956 cpu_core/branches/ # nan M/sec branch_frequency 1,111 cpu-migrations # nan migrations/sec migrations_per_second # 28.8 % tma_backend_bound (60.26%) 0.00 msec cpu-clock # 0.0 CPUs CPUs_utilized 392,509,323 cpu_atom/instructions/ # 0.6 instructions insn_per_cycle (60.19%) 2,990,369,310 cpu_core/instructions/ # 1.9 instructions insn_per_cycle 3,493,478 cpu_atom/branch-misses/ # 5.9 % branch_miss_rate (49.69%) 7,297,531 cpu_core/branch-misses/ # 1.4 % branch_miss_rate 1.006621701 seconds time elapsed ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-11 16:48:35 -08:00
Ian Rogers	a3248b5b54	perf jevents: Add metric DefaultShowEvents Some Default group metrics require their events showing for consistency with perf's previous behavior. Add a flag to indicate when this is the case and use it in stat-display. As events are coming from Default metrics remove that default hardware and software events from perf stat. Following this change the default perf stat output on an alderlake looks like: ``` $ perf stat -a -- sleep 1 Performance counter stats for 'system wide': 20,550 context-switches # nan cs/sec cs_per_second TopdownL1 (cpu_core) # 9.0 % tma_bad_speculation # 28.1 % tma_frontend_bound TopdownL1 (cpu_core) # 29.2 % tma_backend_bound # 33.7 % tma_retiring 6,685 page-faults # nan faults/sec page_faults_per_second 790,091,064 cpu_atom/cpu-cycles/ # nan GHz cycles_frequency (49.83%) 2,563,918,366 cpu_core/cpu-cycles/ # nan GHz cycles_frequency # 12.3 % tma_bad_speculation # 14.5 % tma_retiring (50.20%) # 33.8 % tma_frontend_bound (50.24%) 76,390,322 cpu_atom/branches/ # nan M/sec branch_frequency (60.20%) 1,015,173,047 cpu_core/branches/ # nan M/sec branch_frequency 1,325 cpu-migrations # nan migrations/sec migrations_per_second # 39.3 % tma_backend_bound (60.17%) 0.00 msec cpu-clock # 0.000 CPUs utilized # 0.0 CPUs CPUs_utilized 554,347,072 cpu_atom/instructions/ # 0.64 insn per cycle # 0.6 instructions insn_per_cycle (60.14%) 5,228,931,991 cpu_core/instructions/ # 2.04 insn per cycle # 2.0 instructions insn_per_cycle 4,308,874 cpu_atom/branch-misses/ # 5.65% of all branches # 5.6 % branch_miss_rate (49.76%) 9,890,606 cpu_core/branch-misses/ # 0.97% of all branches # 1.0 % branch_miss_rate 1.005477803 seconds time elapsed ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-11 16:48:35 -08:00
Ian Rogers	c7adeb0974	perf jevents: Add set of common metrics based on default ones Add support to getting a common set of metrics from a default table. It simplifies the generation to add json metrics at the same time. The metrics added are CPUs_utilized, cs_per_second, migrations_per_second, page_faults_per_second, insn_per_cycle, stalled_cycles_per_instruction, frontend_cycles_idle, backend_cycles_idle, cycles_frequency, branch_frequency and branch_miss_rate based on the shadow metric definitions. Following this change the default perf stat output on an alderlake looks like: ``` $ perf stat -a -- sleep 2 Performance counter stats for 'system wide': 0.00 msec cpu-clock # 0.000 CPUs utilized 77,739 context-switches 15,033 cpu-migrations 321,313 page-faults 14,355,634,225 cpu_atom/instructions/ # 1.40 insn per cycle (35.37%) 134,561,560,583 cpu_core/instructions/ # 3.44 insn per cycle (57.85%) 10,263,836,145 cpu_atom/cycles/ (35.42%) 39,138,632,894 cpu_core/cycles/ (57.60%) 2,989,658,777 cpu_atom/branches/ (42.60%) 32,170,570,388 cpu_core/branches/ (57.39%) 29,789,870 cpu_atom/branch-misses/ # 1.00% of all branches (42.69%) 165,991,152 cpu_core/branch-misses/ # 0.52% of all branches (57.19%) (software) # nan cs/sec cs_per_second TopdownL1 (cpu_core) # 11.9 % tma_bad_speculation # 19.6 % tma_frontend_bound (63.97%) TopdownL1 (cpu_core) # 18.8 % tma_backend_bound # 49.7 % tma_retiring (63.97%) (software) # nan faults/sec page_faults_per_second # nan GHz cycles_frequency (42.88%) # nan GHz cycles_frequency (69.88%) TopdownL1 (cpu_atom) # 11.7 % tma_bad_speculation # 29.9 % tma_retiring (50.07%) TopdownL1 (cpu_atom) # 31.3 % tma_frontend_bound (43.09%) (cpu_atom) # nan M/sec branch_frequency (43.09%) # nan M/sec branch_frequency (70.07%) # nan migrations/sec migrations_per_second TopdownL1 (cpu_atom) # 27.1 % tma_backend_bound (43.08%) (software) # 0.0 CPUs CPUs_utilized # 1.4 instructions insn_per_cycle (43.04%) # 3.5 instructions insn_per_cycle (69.99%) # 1.0 % 
branch_miss_rate (35.46%) # 0.5 % branch_miss_rate (65.02%) 2.005626564 seconds time elapsed ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-11 16:48:34 -08:00
Ian Rogers	2e5140849b	perf expr: Add #target_cpu literal For CPU nanoseconds a lot of the stat-shadow metrics use either task-clock or cpu-clock, the latter being used when target__has_cpu. Add a #target_cpu literal so that json metrics can perform the same test. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-11 16:48:34 -08:00
Ian Rogers	c8035a4961	perf metricgroup: Add care to picking the evsel for displaying a metric Rather than using the first evsel in the matched events, try to find the least shared non-tool evsel. The aim is to pick the first evsel that typifies the metric within the list of metrics. This addresses an issue where Default metric group metrics may lose their counter value due to how the stat displaying hides counters for default event/metric output. For a metricgroup like TopdownL1 on an Intel Alderlake the change is, before there are 4 events with metrics: ``` $ perf stat -M topdownL1 -a sleep 1 Performance counter stats for 'system wide': 7,782,334,296 cpu_core/TOPDOWN.SLOTS/ # 10.4 % tma_bad_speculation # 19.7 % tma_frontend_bound 2,668,927,977 cpu_core/topdown-retiring/ # 35.7 % tma_backend_bound # 34.1 % tma_retiring 803,623,987 cpu_core/topdown-bad-spec/ 167,514,386 cpu_core/topdown-heavy-ops/ 1,555,265,776 cpu_core/topdown-fe-bound/ 2,792,733,013 cpu_core/topdown-be-bound/ 279,769,310 cpu_atom/TOPDOWN_RETIRING.ALL/ # 12.2 % tma_retiring # 15.1 % tma_bad_speculation 457,917,232 cpu_atom/CPU_CLK_UNHALTED.CORE/ # 38.4 % tma_backend_bound # 34.2 % tma_frontend_bound 783,519,226 cpu_atom/TOPDOWN_FE_BOUND.ALL/ 10,790,192 cpu_core/INT_MISC.UOP_DROPPING/ 879,845,633 cpu_atom/TOPDOWN_BE_BOUND.ALL/ ``` After there are 6 events with metrics: ``` $ perf stat -M topdownL1 -a sleep 1 Performance counter stats for 'system wide': 2,377,551,258 cpu_core/TOPDOWN.SLOTS/ # 7.9 % tma_bad_speculation # 36.4 % tma_frontend_bound 480,791,142 cpu_core/topdown-retiring/ # 35.5 % tma_backend_bound 186,323,991 cpu_core/topdown-bad-spec/ 65,070,590 cpu_core/topdown-heavy-ops/ # 20.1 % tma_retiring 871,733,444 cpu_core/topdown-fe-bound/ 848,286,598 cpu_core/topdown-be-bound/ 260,936,456 cpu_atom/TOPDOWN_RETIRING.ALL/ # 12.4 % tma_retiring # 17.6 % tma_bad_speculation 419,576,513 cpu_atom/CPU_CLK_UNHALTED.CORE/ 797,132,597 cpu_atom/TOPDOWN_FE_BOUND.ALL/ # 38.0 % tma_frontend_bound 3,055,447 
cpu_core/INT_MISC.UOP_DROPPING/ 671,014,164 cpu_atom/TOPDOWN_BE_BOUND.ALL/ # 32.0 % tma_backend_bound ``` Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-11 16:48:34 -08:00
Namhyung Kim	367377f45c	perf tools: Fix missing feature check for inherit + SAMPLE_READ It should also have PERF_SAMPLE_TID to enable inherit and PERF_SAMPLE_READ on recent kernels. Not having _TID makes the feature check wrongly detect the inherit and _READ support. It was reported that the following command failed due to the error in the missing feature check on Intel SPR machines. $ perf record -e '{cpu/mem-loads-aux/S,cpu/mem-loads,ldlat=3/PS}' -- ls Error: Failure to open event 'cpu/mem-loads,ldlat=3/PS' on PMU 'cpu' which will be removed. Invalid event (cpu/mem-loads,ldlat=3/PS) in per-thread mode, enable system wide with '-a'. Reviewed-by: Ian Rogers <irogers@google.com> Fixes: `3b193a57ba` ("perf tools: Detect missing kernel features properly") Reported-and-tested-by: Chen, Zide <zide.chen@intel.com> Closes: https://lore.kernel.org/lkml/20251022220802.1335131-1-zide.chen@intel.com/ Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-11 16:43:37 -08:00
Chen Ni	e279039c3e	perf symbol: Remove unneeded semicolon Remove unnecessary semicolons reported by Coccinelle/coccicheck and the semantic patch at scripts/coccinelle/misc/semicolon.cocci. Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-10 22:22:16 -08:00
Ian Rogers	0e9b51a432	perf pmu: Make pmu_alias_terms weak again The terms for a json event should be weak so they don't override command line options. Before: ``` $ perf record -vv -c 1000 -e uops_issued.any -o /dev/null true 2>&1 |grep "{ sample_period, sample_freq }" { sample_period, sample_freq } 200003 { sample_period, sample_freq } 2000003 { sample_period, sample_freq } 1000 ``` After: ``` $ perf record -vv -c 1000 -e uops_issued.any -o /dev/null true 2>&1 |grep "{ sample_period, sample_freq }" { sample_period, sample_freq } 1000 { sample_period, sample_freq } 1000 { sample_period, sample_freq } 1000 ``` Fixes: `84bae3af20` ("perf pmu: Don't eagerly parse event terms") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-09 23:07:57 -08:00
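The "weak" behavior described in 0e9b51a432 can be illustrated with a minimal sketch. It assumes a simplified term structure; `struct demo_term` and `demo_apply_period()` are hypothetical names and do not reflect perf's real term handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified config term; perf's real term structures differ. */
struct demo_term {
	unsigned long sample_period;
	bool weak;	/* weak terms yield to values the user set explicitly */
};

/*
 * Return the period to use after considering one term. A weak term (for
 * example one that comes from a json event alias) only takes effect when
 * the user gave no explicit value on the command line.
 */
static unsigned long demo_apply_period(unsigned long current, bool user_set,
				       const struct demo_term *term)
{
	assert(term != NULL);
	if (term->weak && user_set)
		return current;
	return term->sample_period;
}
```

With a weak alias term of 200003 and a user-supplied `-c 1000`, the sketch keeps 1000, which matches the "After" output in the commit message.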
Ian Rogers	6331b26693	perf tool: Add a delegate_tool that just delegates actions to another tool Add the ability to compose perf_tools by having one perform an action and then call a delegate. Currently the perf_tools have if-then-elses setting the callback and then if-then-elses within the callback. Understanding the behavior is complex as the logic lives in two places and the logic for numerous operations, within things like perf inject, is interwoven. By chaining perf_tools together based on command line options this kind of code can be avoided. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-07 13:25:05 -08:00
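The composition idea in 6331b26693 can be sketched as follows. This is only an illustration under assumed types: `struct demo_tool`, `struct demo_delegate` and `struct demo_sink` are hypothetical and far simpler than perf's `struct perf_tool`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback table with a single operation. */
struct demo_tool {
	int (*sample)(struct demo_tool *tool, int value);
};

/* A wrapper that does its own work and then forwards to another tool. */
struct demo_delegate {
	struct demo_tool tool;	/* first member, so a cast recovers the wrapper */
	struct demo_tool *next;	/* the tool being delegated to */
	int seen;		/* work done by this layer: count the samples */
};

static int demo_delegate_sample(struct demo_tool *tool, int value)
{
	struct demo_delegate *d = (struct demo_delegate *)tool;

	assert(d->next != NULL);
	d->seen++;
	return d->next->sample(d->next, value);
}

/* The final tool in the chain: accumulates the values it receives. */
struct demo_sink {
	struct demo_tool tool;
	int total;
};

static int demo_sink_sample(struct demo_tool *tool, int value)
{
	struct demo_sink *s = (struct demo_sink *)tool;

	s->total += value;
	return 0;
}
```

Each layer keeps its own logic in one callback, so which layers are chained can be decided once, for example from command line options, instead of branching inside every callback.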
Ian Rogers	71062e282d	perf tool: Add the perf_tool argument to all callbacks Getting context for what a tool is doing, such as the perf_inject instance, using container_of the tool is a common pattern in the code. This isn't possible for the event_op2, event_op3 and event_op4 callbacks as the tool isn't passed. Add the argument and then fix function signatures to match. As tools may be reading a tool from somewhere else, change that code to use the passed in tool. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-07 13:25:05 -08:00
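The container_of pattern mentioned in 71062e282d only works when the callback receives the tool pointer. A minimal sketch, assuming hypothetical types (`struct demo_cb_tool`, `struct demo_inject`) and a locally defined `demo_container_of` macro rather than the kernel's own:

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for the kernel's container_of(); name is hypothetical. */
#define demo_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct demo_cb_tool {
	/* The tool itself is passed, so the callback can find its context. */
	int (*event_op)(struct demo_cb_tool *tool, int event);
};

/* Hypothetical command state that embeds the tool. */
struct demo_inject {
	int events_seen;
	struct demo_cb_tool tool;
};

static int demo_inject_event(struct demo_cb_tool *tool, int event)
{
	/* Recover the enclosing structure from the embedded member. */
	struct demo_inject *inj =
		demo_container_of(tool, struct demo_inject, tool);

	assert(tool != NULL);
	inj->events_seen += event;
	return 0;
}
```

Without the `tool` argument, the callback would have to reach its state through a global or some other side channel.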
Namhyung Kim	6bd89ae7d1	perf record: Make sure to update build-ID cache A recent change enabling --buildid-mmap by default introduced an issue with build-ID handling. With build-ID in MMAP2 records, we don't need to save the build-ID table in the header of a perf data file. But the actual file contents still need to be cached in the debug directory for annotation etc. Split the build-ID header processing and caching, and make sure perf record saves hit DSOs in the build-ID cache by moving perf_session__cache_build_ids() to the end of record__finish_output(). Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-06 17:29:16 -08:00
Ian Rogers	3f02cebe13	perf metricgroup: When copy metrics copy default information When copying metrics into a group, also copy the default information from the original metrics. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-03 20:57:21 -08:00
Ian Rogers	3bae9228a5	perf metricgroup: Missed free on error path If an out-of-memory occurs the expr also needs freeing. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-03 20:57:21 -08:00
Ian Rogers	5faa23cdab	perf metricgroup: Update comment on location of metric_event list Update comment as the stat_config no longer holds all metrics. Signed-off-by: Ian Rogers <irogers@google.com> Fixes: `faebee18d7` ("perf stat: Move metric list from config to evlist") Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-03 20:57:21 -08:00
Ian Rogers	371d32394e	perf evsel: Remove unused metric_events variable The metric_events exist in the metric_expr list and so this variable has been unused for a while. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-11-03 20:57:21 -08:00
Ian Rogers	01bc5d2f0d	perf tools: Cache counter names for raw samples on s390 Searching all event names is slower now that legacy names are included. Add a cache to avoid long iterative searches. Note, the cache isn't cleaned up and is as such a memory leak, however, globally reachable leaks like this aren't treated as leaks by leak sanitizer. Reported-by: Thomas Richter <tmricht@linux.ibm.com> Closes: https://lore.kernel.org/linux-perf-users/09943f4f-516c-4b93-877c-e4a64ed61d38@linux.ibm.com/ Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-31 12:46:19 -07:00
Namhyung Kim	915c31f0e6	perf trace: Increase syscall handler map size to 1024 The syscalls_sys_{enter,exit} map in augmented_raw_syscalls.bpf.c has max entries of 512. Usually syscall numbers are smaller than this but x86 has x32 ABI where syscalls start from 512. That makes trace__init_syscalls_bpf_prog_array_maps() fail in the middle of the loop when it accesses those keys. As the loop iteration is not ordered by syscall numbers anymore, the failure can affect non-x32 syscalls. Let's increase the map size to 1024 so that it can handle those ABIs too. While most systems won't need this, increasing the size will be safer for potential future changes. Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-31 12:29:44 -07:00
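The failure mode described in 915c31f0e6 can be sketched with a fixed-size table standing in for the BPF map. `struct demo_prog_array` and `demo_prog_array_update()` are hypothetical, and the `-E2BIG` return value is chosen for illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_CAPACITY 1024

/* Hypothetical stand-in for a map with a fixed number of entries. */
struct demo_prog_array {
	size_t max_entries;
	int slots[DEMO_CAPACITY];
};

/* Reject keys at or beyond max_entries, as a fixed-size map would. */
static int demo_prog_array_update(struct demo_prog_array *map, size_t key,
				  int value)
{
	assert(map != NULL && map->max_entries <= DEMO_CAPACITY);
	if (key >= map->max_entries)
		return -E2BIG;
	map->slots[key] = value;
	return 0;
}
```

With `max_entries` set to 512, a key of 512 (where the commit says x32 syscalls start) is rejected. With 1024 the same key is accepted.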
Namhyung Kim	553d18c98a	perf lock contention: Load kernel map before lookup On some machines, it caused troubles when it tried to find kernel symbols. I think it's because kernel modules and kallsyms are messed up during load and split. Basically we want to make sure the kernel map is loaded and the code has it in the lock_contention_read(). But recently we added more lookups in the lock_contention_prepare() which is called before _read(). Also the kernel map (kallsyms) may not be the first one in the group like on ARM. Let's use machine__kernel_map() rather than just loading the first map. Reviewed-by: Ian Rogers <irogers@google.com> Fixes: `688d2e8de2` ("perf lock contention: Add -l/--lock-addr option") Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-31 11:32:51 -07:00
Michal Suchanek	2fee899c06	perf hwmon_pmu: Fix uninitialized variable warning The line_len is only set on success. Check the return value instead. util/hwmon_pmu.c: In function ‘perf_pmus__read_hwmon_pmus’: util/hwmon_pmu.c:742:20: warning: ‘line_len’ may be used uninitialized [-Wmaybe-uninitialized] 742 | if (line_len > 0 && line[line_len - 1] == '\n') | ^ util/hwmon_pmu.c:719:24: note: ‘line_len’ was declared here 719 | size_t line_len; Fixes: `53cc0b351e` ("perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs") Signed-off-by: Michal Suchanek <msuchanek@suse.de> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-27 11:30:15 -07:00
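The approach in 2fee899c06 (rely on the return value, not on an output parameter that is only set on success) can be sketched with POSIX `getline()`. This is an assumption for illustration: the reader used in `util/hwmon_pmu.c` may be a different function, and `demo_read_line()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/*
 * Read one line and strip a trailing newline. The length comes from the
 * return value, which is always defined, so no length variable is read
 * uninitialized on failure. Returns NULL on EOF or error.
 */
static char *demo_read_line(FILE *fp)
{
	char *line = NULL;
	size_t cap = 0;
	ssize_t len;

	assert(fp != NULL);
	len = getline(&line, &cap, fp);
	if (len < 0) {
		free(line);	/* getline may allocate even when it fails */
		return NULL;
	}
	if (len > 0 && line[len - 1] == '\n')
		line[len - 1] = '\0';
	return line;
}
```

The caller owns the returned buffer and releases it with `free()`.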
tanze	ab29ff9f6f	perf auxtrace: Add auxtrace_synth_id_range_start() helper To avoid hardcoding the offset value for synthetic event IDs in multiple auxtrace modules (arm-spe, cs-etm, intel-pt, etc.), and to improve code reusability, this patch unifies the handling of the ID offset via a dedicated helper function. Signed-off-by: tanze <tanze@kylinos.cn> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Leo Yan <leo.yan@arm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-25 17:44:57 -07:00
Ian Rogers	be806f06ad	perf stat: Add/fix bperf cgroup max events workarounds Commit `b8308511f6` bumped the max events to 1024 but this results in BPF verifier issues if the number of command line events is too large. Work around this by: 1) moving the constants to a header file to share between BPF and perf C code, 2) testing that the maximum number of events doesn't cause BPF verifier issues in debug builds, 3) lowering the max events from 1024 to 128, 4) in perf stat, disabling BPF counter usage if there are more events than the BPF counters can support. The rodata setup is factored into its own function to avoid duplicating it in the testing code. Signed-off-by: Ian Rogers <irogers@google.com> Fixes: `b8308511f6` ("perf stat bperf cgroup: Increase MAX_EVENTS from 32 to 1024") Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-25 16:44:21 -07:00
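Step 4 of be806f06ad amounts to a capacity check before choosing the BPF path. A minimal sketch, assuming a hypothetical constant name and helper; only the limit of 128 is taken from the commit message.

```c
#include <stdbool.h>

/* Limit from the commit message; the macro name here is hypothetical. */
#define DEMO_BPERF_MAX_EVENTS 128

/*
 * Decide whether BPF counters can be used. When more events are requested
 * than the BPF side supports, the caller falls back to regular counters.
 */
static bool demo_can_use_bpf_counters(unsigned int nr_events)
{
	return nr_events <= DEMO_BPERF_MAX_EVENTS;
}
```

Sharing one constant between the BPF program and the user-space check, as step 1 describes, keeps the two sides from drifting apart.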
Leo Yan	3e98f0203e	perf cs-etm: Mute enumeration value warning When the OpenCSD library introduces a new enumeration value (for example, in the v1.7.1 release), the perf build fails with an error: util/cs-etm-decoder/cs-etm-decoder.c:600:10: error: enumeration value 'OCSD_GEN_TRC_ELEM_ITMTRACE' not explicitly handled in switch [-Werror, -Wswitch-enum] 600 | switch (elem->elem_type) { | ^~~~~~~~~~~~~~~ 1 error generated. Convert to if-else statements to mute the enumeration value warning, which avoids build failures whenever the library is updated. No functional change. Suggested-by: Suzuki K Poulose <suzuki.poulose@arm.com> Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-23 18:59:13 -07:00
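The conversion in 3e98f0203e can be sketched with a small enum. The enumerators and `demo_handle_elem()` are hypothetical; the point is that `-Wswitch-enum` warns about any enumerator missing from a `switch` even when a `default` label exists, while an if-else chain is not subject to that check.

```c
/* Hypothetical element types; the last one models a value added later
 * by a newer library release. */
enum demo_elem_type {
	DEMO_ELEM_RANGE,
	DEMO_ELEM_EXCEPTION,
	DEMO_ELEM_ADDED_BY_NEW_LIB,
};

/*
 * Handle only the element types of interest. New enumerators fall through
 * to the final return without triggering -Wswitch-enum, because no switch
 * statement is involved.
 */
static int demo_handle_elem(enum demo_elem_type type)
{
	if (type == DEMO_ELEM_RANGE)
		return 1;
	else if (type == DEMO_ELEM_EXCEPTION)
		return 2;
	return 0;	/* everything else is ignored */
}
```

The trade-off is that the compiler no longer flags enumerators that were forgotten by accident, so this form suits enums owned by an external library.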
Kuninori Morimoto	9960889b32	tools: arm64: Add Cortex-A720AE definitions Add cputype definitions for Cortex-A720AE. These will be used for errata detection in subsequent patches. These values can be found in the Cortex-A720AE TRM: https://developer.arm.com/documentation/102828/0001/ ... in Table A-187 Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Reviewed-by: Leo Yan <leo.yan@arm.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-23 14:48:15 -07:00
Zecheng Li	109218718d	perf annotate: Save pointer offset in stack state The tracked pointer offset was not being preserved in the stack state, which could lead to incorrect type analysis. This change adds a ptr_offset field to the type_state_stack struct and passes it to set_stack_state and findnew_stack_state to ensure the offset is preserved after the pointer is loaded from a stack location. It improves the type annotation coverage and quality. Signed-off-by: Zecheng Li <zecheng@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-21 06:02:49 -07:00
Zecheng Li	1f4cc4ae3f	perf annotate: Track arithmetic instructions on pointers Track the arithmetic operations on registers with pointer types. We handle only add, sub and lea instructions. The original pointer information needs to be preserved for getting outermost struct types. For example, reg0 points to a struct cfs_rq, when we add 0x10 to reg0, it should preserve the information of struct cfs_rq + 0x10 in the register instead of a pointer type to the child field at 0x10. Details: 1. struct type_state_reg now includes an offset, indicating if the register points to the start or an internal part of its associated type. This offset is used in mem to reg and reg to stack mem transfers, and also applied to the final type offset. 2. lea offset(%sp/%fp), reg is now treated as taking the address of a stack variable. It worked fine in most cases, but an issue with this approach is the pointer type may not exist. 3. lea offset(%base), reg is handled by moving the type from %base and adding an offset, similar to an add operation followed by a mov reg to reg. 4. Non-stack variables from DWARF with non-zero offsets in their location expressions are now accepted with register offset tracking. Multi-register addressing modes in LEA are not supported. Signed-off-by: Zecheng Li <zecheng@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-21 06:02:49 -07:00
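The register offset tracking described in 1f4cc4ae3f can be sketched as follows. `struct demo_reg_state` and `demo_track_add()` are hypothetical simplifications and not the `type_state_reg` layout used by perf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified per-register type state. */
struct demo_reg_state {
	const char *type_name;	/* outermost type the register points into */
	int offset;		/* byte offset from the start of that type */
	bool is_pointer;
};

/*
 * Model "add $imm, reg" on a tracked pointer: keep the outermost type and
 * accumulate the offset, and do not replace the type with that of the
 * field found at the new address.
 */
static void demo_track_add(struct demo_reg_state *reg, int imm)
{
	assert(reg != NULL);
	if (reg->is_pointer)
		reg->offset += imm;
}
```

For the example in the commit message, a register pointing to `struct cfs_rq` stays typed as `struct cfs_rq` with an offset of 0x10 after adding 0x10.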
Zecheng Li	24a30ce9b1	perf annotate: Track address registers via TSR_KIND_POINTER Introduce TSR_KIND_POINTER to improve the data type profiler's ability to track pointer-based memory accesses and address register variables. TSR_KIND_POINTER represents that the location holds a pointer type to the type in the type state. The semantics match the `breg` registers that describe a memory location. This change implements handling for this new kind in mov instructions and in the check_matching_type() function. When a TSR_KIND_POINTER is moved to the stack, the stack state size is set to the architecture's pointer size. Signed-off-by: Zecheng Li <zecheng@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-21 06:02:49 -07:00
Zecheng Li	068b6a4524	perf annotate: Skip annotating data types to lea instructions Introduce a helper function is_address_gen_insn() to check arch-dependent address generation instructions like lea in x86. Remove type annotation on these instructions since they are not accessing memory. It should be counted as `no_mem_ops`. Signed-off-by: Zecheng Li <zecheng@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-21 06:02:49 -07:00
Tianyou Li	f1204e5846	perf annotate: Check return value of evsel__get_arch() properly Check the error code of evsel__get_arch() in symbol__annotate(). Previously it checked for a non-zero value, but after the refactoring it checks only for negative values. Fixes: `0669729eb0` ("perf annotate: Factor out evsel__get_arch()") Suggested-by: James Clark <james.clark@linaro.org> Acked-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Tianyou Li <tianyou.li@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-21 05:42:34 -07:00
Tianyou Li	262c61435c	perf annotate: fix a crash when annotate the same symbol with 's' and 'T' When perf report with annotation for a symbol, press 's' and 'T', then exit the annotate browser. Once annotate the same symbol, the annotate browser will crash. The browser.arch was required to be correctly updated when data type feature was enabled by 'T'. Usually it was initialized by symbol__annotate2 function. If a symbol has already been correctly annotated at the first time, it should not call the symbol__annotate2 function again, thus the browser.arch will not get initialized. Then at the second time to show the annotate browser, the data type needs to be displayed but the browser.arch is empty. Stack trace as below: Perf: Segmentation fault -------- backtrace -------- #0 0x55d365 in ui__signal_backtrace setup.c:0 #1 0x7f5ff1a3e930 in __restore_rt libc.so.6[3e930] #2 0x570f08 in arch__is perf[570f08] #3 0x562186 in annotate_get_insn_location perf[562186] #4 0x562626 in __hist_entry__get_data_type annotate.c:0 #5 0x56476d in annotation_line__write perf[56476d] #6 0x54e2db in annotate_browser__write annotate.c:0 #7 0x54d061 in ui_browser__list_head_refresh perf[54d061] #8 0x54dc9e in annotate_browser__refresh annotate.c:0 #9 0x54c03d in __ui_browser__refresh browser.c:0 #10 0x54ccf8 in ui_browser__run perf[54ccf8] #11 0x54eb92 in __hist_entry__tui_annotate perf[54eb92] #12 0x552293 in do_annotate hists.c:0 #13 0x55941c in evsel__hists_browse hists.c:0 #14 0x55b00f in evlist__tui_browse_hists perf[55b00f] #15 0x42ff02 in cmd_report perf[42ff02] #16 0x494008 in run_builtin perf.c:0 #17 0x494305 in handle_internal_command perf.c:0 #18 0x410547 in main perf[410547] #19 0x7f5ff1a295d0 in __libc_start_call_main libc.so.6[295d0] #20 0x7f5ff1a29680 in __libc_start_main@@GLIBC_2.34 libc.so.6[29680] #21 0x410b75 in _start perf[410b75] Fixes: `1d4374afd0` ("perf annotate: Add 'T' hot key to toggle data type display") Reviewed-by: James Clark <james.clark@linaro.org> 
Tested-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Tianyou Li <tianyou.li@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-21 05:42:01 -07:00
Namhyung Kim	0e6c07a3c3	perf annotate: Fix build with NO_SLANG=1 The recent change for perf c2c annotate broke the build without slang support, as shown below. builtin-annotate.c: In function 'hists__find_annotations': builtin-annotate.c:522:73: error: 'NO_ADDR' undeclared (first use in this function); did you mean 'NR_ADDR'? 522 | key = hist_entry__tui_annotate(he, evsel, NULL, NO_ADDR); | ^~~~~~~ | NR_ADDR builtin-annotate.c:522:73: note: each undeclared identifier is reported only once for each function it appears in builtin-annotate.c:522:31: error: too many arguments to function 'hist_entry__tui_annotate' 522 | key = hist_entry__tui_annotate(he, evsel, NULL, NO_ADDR); | ^~~~~~~~~~~~~~~~~~~~~~~~ In file included from util/sort.h:6, from builtin-annotate.c:28: util/hist.h:756:19: note: declared here 756 | static inline int hist_entry__tui_annotate(struct hist_entry *he __maybe_unused, | ^~~~~~~~~~~~~~~~~~~~~~~~ And I noticed that it failed to update the other side of #ifdef HAVE_SLANG_SUPPORT. Let's fix it. Cc: Tianyou Li <tianyou.li@intel.com> Fixes: `cd3466cd26` ("perf c2c: Add annotation support to perf c2c report") Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-21 05:36:02 -07:00
Ian Rogers	800201997a	perf parse-events: Make X modifier more respectful of groups Events with an X modifier were reordered within a group, for example slots was made the leader in: ``` $ perf record -e '{cpu/mem-stores/ppu,cpu/slots/uX}' -- sleep 1 ``` Fix by making `dont_regroup` evsels always use their index for sorting. Make the cur_leader, when fixing the groups, be that of `dont_regroup` evsel so that the `dont_regroup` evsel doesn't become a leader. On a tigerlake this patch corrects this and meets expectations in: ``` $ perf stat -e '{cpu/mem-stores/,cpu/slots/uX}' -a -- sleep 0.1 Performance counter stats for 'system wide': 83,458,652 cpu/mem-stores/ 2,720,854,880 cpu/slots/uX 0.103780587 seconds time elapsed $ perf stat -e 'slots,slots:X' -a -- sleep 0.1 Performance counter stats for 'system wide': 732,042,247 slots (48.96%) 643,288,155 slots:X (51.04%) 0.102731018 seconds time elapsed ``` Closes: https://lore.kernel.org/lkml/18f20d38-070c-4e17-bc90-cf7102e1e53d@linux.intel.com/ Fixes: `035c178930` ("perf parse-events: Add 'X' modifier to exclude an event from being regrouped") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-20 13:05:37 +09:00
Tianyou Li	ad83f3b715	perf c2c annotate: Start from the contention line Add support to highlight the contention line in the annotate browser, use 'TAB'/'UNTAB' to refocus to the contention line. Signed-off-by: Tianyou Li <tianyou.li@intel.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Reviewed-by: Jiebin Sun <jiebin.sun@intel.com> Reviewed-by: Pan Deng <pan.deng@intel.com> Reviewed-by: Zhiguo Zhou <zhiguo.zhou@intel.com> Reviewed-by: Wangyang Guo <wangyang.guo@intel.com> Tested-by: Ravi Bangoria <ravi.bangoria@amd.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-19 12:50:07 +09:00
Ian Rogers	b8308511f6	perf stat bperf cgroup: Increase MAX_EVENTS from 32 to 1024 The MAX_EVENTS value ensured a counted loop presumably to satisfy the BPF verifier. It is possible to go past 32 events when gathering uncore events. Increase the amount to 1024 as that should provide some amount of headroom. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-19 12:35:40 +09:00
Ian Rogers	5960aab556	perf python: Add PMU argument to parse_metrics Add an optional PMU argument to parse_metrics to allow restriction of the particular metrics to be opened. If no argument is provided then all metrics with the given name/group are opened. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Gautam Menghani <gautam@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-19 11:59:44 +09:00
Ian Rogers	787bd57817	perf evsel: Improvements to __evsel__match Ensure both the perf_event_attr and alternate_hw_config are checked in the match. Don't mask the config if the perf_event_attr isn't a HARDWARE or HW_CACHE event. Add common early exit cases. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-15 23:59:11 +09:00
Ian Rogers	5bf6291113	perf evlist: Avoid scanning all PMUs for evlist__new_default Rather than wildcard matching the cycles event specify only the core PMUs. This avoids potentially loading unnecessary uncore PMUs. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-15 23:59:11 +09:00
Ian Rogers	b1c5efbfd9	perf parse-events: Remove hard coded legacy hardware and cache parsing Now that legacy hardware and cache events are in json, having the lexer match the specific event is no longer necessary and generic PMU parsing can take place. Because of this remove the specific term parsing, event adding, and passing of alternate_hw_config which was now always PERF_COUNT_HW_MAX. This mirrors a similar change for software events in commit `6e9fa4131a` ("perf parse-events: Remove non-json software events"). With no hard coded legacy hardware or cache events the wild card, case insensitivity, etc. is consistent for events. This does, however, mean events like cycles will wild card against all PMUs. A change that does the same was originally posted and merged from: https://lore.kernel.org/r/20240416061533.921723-10-irogers@google.com and reverted by Linus in commit `4f1b067359` ("Revert "perf parse-events: Prefer sysfs/JSON hardware events over legacy"") due to his dislike for the cycles behavior on ARM. Earlier patches in this series make perf record event opening failures non-fatal and hide the cycles event's failure to open on ARM in perf record, so it is expected the behavior will now be transparent in perf record. perf stat with a cycles event will wildcard open the event on all PMUs. As cycles is a "default event", the perf stat behavior for default events was updated to only open them on core/software PMUs. The change to support legacy events with PMUs was done to clean up Intel's hybrid PMU implementation. Having sysfs/json events with increased priority to legacy was requested by Mark Rutland <mark.rutland@arm.com> to fix Apple-M PMU issues wrt broken legacy events on that PMU. It was requested that RISC-V be able to add events to the perf tool json so the PMU driver didn't need to map legacy events to config encodings: https://lore.kernel.org/lkml/20240217005738.3744121-1-atishp@rivosinc.com/ A previous series of patches decreasing legacy hardware event priorities was posted in: https://lore.kernel.org/lkml/20250416045117.876775-1-irogers@google.com/ Namhyung Kim <namhyung@kernel.org> mentioned that hardware and software events can be implemented similarly: https://lore.kernel.org/lkml/aIJmJns2lopxf3EK@google.com/ Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-15 23:59:11 +09:00
Ian Rogers	50062baa53	perf print-events: Remove print_symbol_events Now legacy hardware events are in json there's no need for a specific printing routine that previously served for both hardware and software events. The associated event_symbols_hw is also removed. To support the previous filtered version use an event glob of "legacy hardware" which matches the topic of the json events. Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-15 23:59:11 +09:00
Ian Rogers	b12b5b531a	perf print-events: Remove print_hwcache_events Now legacy cache events are in json there's no need for a specific printing routine. To support the previous filtered version use an event glob of "legacy cache" which matches the topic of the json events. Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-15 23:59:11 +09:00
Ian Rogers	249a4c6d01	perf pmu: Add and use legacy_terms in alias information Add support to finding/adding events from the default_core event table. If an event already exists from sysfs/json then the default_core configuration is saved in the legacy_terms string. Lazily use the legacy_terms string to set a legacy hardware or cache event as deprecated if the core PMU doesn't support it. Use the legacy terms string to set the alternate_hw_config, avoiding the value needing to be passed from the parse_events parser. Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-15 23:59:11 +09:00
Ian Rogers	abcff00014	perf parse-events: Add terms for legacy hardware and cache config values Add the PMU terms legacy-hardware-config and legacy-cache-config. These terms are similar to the config term in that their values are assigned to the perf_event_attr config value. They differ in that the PMU type is switched to be either PERF_TYPE_HARDWARE or PERF_TYPE_HW_CACHE, and the PMU type is moved into the extended type information of the config value. This will allow later patches to add legacy events to json. An example use of the terms is in the following: ``` $ perf stat -vv -e 'cpu/legacy-hardware-config=1/,cpu/legacy-cache-config=0x10001/' true Using CPUID GenuineIntel-6-8D-1 Attempt to add: cpu/legacy-hardware-config=0x1/ ..after resolving event: cpu/legacy-hardware-config=0x1/ Attempt to add: cpu/legacy-cache-config=0x10001/ ..after resolving event: cpu/legacy-cache-config=0x10001/ Control descriptor is not initialized ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0x1 (PERF_COUNT_HW_INSTRUCTIONS) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 ------------------------------------------------------------ sys_perf_event_open: pid 994937 cpu -1 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ perf_event_attr: type 3 (PERF_TYPE_HW_CACHE) size 136 config 0x10001 (PERF_COUNT_HW_CACHE_RESULT_MISS | PERF_COUNT_HW_CACHE_OP_READ | PERF_COUNT_HW_CACHE_L1I) sample_type IDENTIFIER read_format TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING disabled 1 inherit 1 enable_on_exec 1 ------------------------------------------------------------ sys_perf_event_open: pid 994937 cpu -1 group_fd -1 flags 0x8 = 4 cpu/legacy-hardware-config=1/: -1: 1364046 414756 414756 cpu/legacy-cache-config=0x10001/: -1: 57453 414756 414756 cpu/legacy-hardware-config=1/: 1364046 414756 414756 cpu/legacy-cache-config=0x10001/: 57453 414756 414756 Performance counter stats for 'true': 1,364,046 cpu/legacy-hardware-config=1/ 57,453 cpu/legacy-cache-config=0x10001/ 0.001988593 seconds time elapsed 0.002194000 seconds user 0.000000000 seconds sys ``` Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-15 23:59:10 +09:00
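The encoding described in the commit above (abcff00014) can be sketched in C. This is a minimal illustration, not the perf tool's code: the `ex_`/`EX_` names are hypothetical, the constants are copies of values from the uapi header `linux/perf_event.h` so the sketch builds without kernel headers, and the `use_extended_type` flag is an assumption made to account for the example output, where `config` is plain `0x1` with no PMU type in the upper bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Copies of uapi values from linux/perf_event.h (EX_ prefix is ours). */
#define EX_PERF_TYPE_HARDWARE  0u
#define EX_PERF_TYPE_HW_CACHE  3u
#define EX_PERF_PMU_TYPE_SHIFT 32

/* Hypothetical stand-in for the two perf_event_attr fields of interest. */
struct ex_attr {
	uint32_t type;
	uint64_t config;
};

/* Legacy cache config layout: cache id | (op << 8) | (result << 16). */
static uint64_t ex_cache_config(uint8_t cache_id, uint8_t op, uint8_t result)
{
	return (uint64_t)cache_id | ((uint64_t)op << 8) | ((uint64_t)result << 16);
}

/*
 * Switch the attr type to a legacy type and, when requested, move the
 * PMU's own type into the upper 32 bits of config (the extended type).
 * use_extended_type is an assumption of this sketch.
 */
static struct ex_attr ex_legacy_attr(uint32_t legacy_type, uint32_t pmu_type,
				     uint64_t legacy_config, bool use_extended_type)
{
	struct ex_attr attr = { .type = legacy_type, .config = legacy_config };

	/* The legacy value must fit in the low 32 bits. */
	assert(legacy_config <= UINT32_MAX);
	if (use_extended_type)
		attr.config |= (uint64_t)pmu_type << EX_PERF_PMU_TYPE_SHIFT;
	return attr;
}
```

With these helpers, `ex_cache_config(1, 0, 1)` (L1I, read, miss) yields the `0x10001` seen in the output above, and `ex_legacy_attr(EX_PERF_TYPE_HARDWARE, 4, 1, true)` yields config `0x400000001` for a PMU whose type happens to be 4 (an illustrative value).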
Ian Rogers	70424bb5ff	perf pmu: Factor term parsing into a perf_event_attr into a helper Factor existing functionality in perf_pmu__name_from_config into a helper that will be used in later patches. Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-15 23:59:10 +09:00
Ian Rogers	7f20b3dd93	perf pmu: Use fd rather than FILE from new_alias The FILE argument was necessary for the scanner but now that functionality is not being used we can switch to just using io__getline which should cut down on stdio buffer usage. Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-15 23:59:10 +09:00
Ian Rogers	5f68451a93	perf parse-events: Remove unused FILE input argument to scanner Now the events file isn't directly parsed from a FILE but stored in a string prior to parsing, remove the FILE argument to the associated scanner functions as they only ever pass NULL. Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-15 23:59:10 +09:00
Ian Rogers	84bae3af20	perf pmu: Don't eagerly parse event terms When an event/alias is created for a PMU the terms are eagerly parsed using parse_events_terms. For a command like perf stat or perf record, the particular event/alias will be found, the terms parsed, the terms cloned for use in the event parsing, and then the terms used to configure the perf_event_attr. Events/aliases may be eagerly loaded, such as from sysfs or in perf list, in which case the alias's terms will be used little or never. To avoid redundant work, to avoid cloning, and to reduce memory overhead, hold the terms for an event as a string until they need handling as a term list. This may introduce duplicate parsing if an event is repeated in a list, but this situation is expected to be uncommon. Measuring the number of instructions before and after with a sysfs event and perf stat, there is a minor reduction of 0.3% in the number of instructions executed. Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-15 23:59:10 +09:00
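The trade-off described in the commit above (84bae3af20) can be sketched as follows. This is a simplified illustration under stated assumptions, not the perf implementation: every `ex_` name is hypothetical, a "term" is reduced to a comma-separated substring, and the term string used in the usage note is illustrative. The alias keeps only the unparsed string, and parsing happens each time the alias is actually used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EX_MAX_TERMS    8
#define EX_MAX_TERM_LEN 32

/* An alias holds its terms as an unparsed string (hypothetical type). */
struct ex_alias {
	const char *name;
	const char *terms_str;
};

/* Parsed form, produced only when an alias is actually used. */
struct ex_term_list {
	size_t nr;
	char terms[EX_MAX_TERMS][EX_MAX_TERM_LEN];
};

/* Counts parses so the cost of each approach is observable. */
static unsigned int ex_parse_calls;

/* Split a comma-separated term string; returns 0 on success, -1 on overflow. */
static int ex_parse_terms(const char *str, struct ex_term_list *out)
{
	ex_parse_calls++;
	out->nr = 0;
	while (*str) {
		size_t len = strcspn(str, ",");

		if (out->nr == EX_MAX_TERMS || len >= EX_MAX_TERM_LEN)
			return -1;
		memcpy(out->terms[out->nr], str, len);
		out->terms[out->nr][len] = '\0';
		out->nr++;
		str += len;
		if (*str == ',')
			str++;
	}
	return 0;
}

/* Using an alias parses its string on demand; nothing is cached. */
static int ex_alias_use(const struct ex_alias *alias, struct ex_term_list *out)
{
	assert(alias && alias->terms_str && out);
	return ex_parse_terms(alias->terms_str, out);
}
```

Creating any number of `struct ex_alias` values costs no parsing at all, which matches the case of aliases loaded for perf list and never used. Calling `ex_alias_use` twice on the same alias, for example one with `terms_str` set to `"event=0x3c,umask=0x0"`, parses twice, which is the duplicate parsing the commit message accepts as uncommon.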
Ian Rogers	7c0135e4d7	perf perf_api_probe: Avoid scanning all PMUs, try software PMU first Scan the software PMU first rather than last as it is the least likely to fail the probe. Specifying the software PMU by name was enabled by commit `9957d8c801` ("perf jevents: Add common software event json"). For hardware events, add core PMU names when getting events to probe so that not all PMUs are scanned. For example, when legacy events support wildcards and for the event "cycles:u" on x86, we want to only scan the "cpu" PMU and not all uncore PMUs for the event too. Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-15 23:59:10 +09:00
Ian Rogers	b7b76f607a	perf parse-events: Fix legacy cache events if event is duplicated in a PMU The term list when adding an event to a PMU is expected to have the event name for the alias lookup. Also, set found_supported so that -EINVAL isn't returned. Fixes: `62593394f6` ("perf parse-events: Legacy cache names on all PMUs and lower priority") Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-15 23:59:10 +09:00
Ian Rogers	2a67955de1	perf bpf_counter: Fix opening of "any"(-1) CPU events The bperf BPF counter code doesn't handle "any"(-1) CPU events, always wanting to aggregate a count against a CPU, which avoids the need for atomics so let's not change that. Force evsels used for BPF counters to require a CPU when not in system-wide mode so that the "any"(-1) value isn't used during map propagation and evsel's CPU map matches that of the PMU. Fixes: `b91917c0c6` ("perf bpf_counter: Fix handling of cpumap fixing hybrid") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-10-13 01:58:51 -07:00

9988 Commits