linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-21 19:35:23 -04:00

Author	SHA1	Message	Date
Ian Rogers	d9f2ecbc5e	perf dso: Move build_id to dso_id The dso_id previously contained the major, minor, inode and inode generation information from a mmap2 event - the inode generation would be zero when reading from /proc/pid/maps. The build_id was in the dso. With build ID mmap2 events these fields wouldn't be initialized which would largely mean the special empty case where any dso would match for equality. This isn't desirable as if a dso is replaced we want the comparison to yield a difference. To support detecting the difference between DSOs based on build_id, move the build_id out of the DSO and into the dso_id. The dso_id is also stored in the DSO so nothing is lost. Capture in the dso_id what parts have been initialized and rename dso_id__inject to dso_id__improve_id so that it is clear the dso_id is being improved upon with additional information. With the build_id in the dso_id, use memcmp to compare for equality. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-25 10:37:56 -07:00
Ian Rogers	eee4b66105	perf build-id: Ensure struct build_id is empty before use If a build ID is read then not all code paths may ensure it is empty before use. Initialize the build_id to be zero-ed unless there is clear initialization such as a call to build_id__init. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-25 10:37:55 -07:00
Ian Rogers	29be60c93d	perf build-id: Mark DSO in sample callchains Previously only the sample IP's map DSO would be marked hit for the purposes of populating the build ID cache. Walk the call chain to mark all IPs and DSOs. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-25 10:37:55 -07:00
Ian Rogers	fccaaf6fbb	perf build-id: Change sprintf functions to snprintf Pass in a size argument rather than implying all build id strings must be SBUILD_ID_SIZE. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-4-irogers@google.com [ fixed some build errors ] Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-25 10:37:13 -07:00
Ian Rogers	5a2ceebd81	perf build-id: Truncate to avoid overflowing the build_id data Merely warning when the build_id data would overflow still leads to memory corruption; switch to truncation. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:50:17 -07:00
Ian Rogers	f3982385bc	perf build-id: Reduce size of "size" variable Later clean up of the dso_id to include a build_id will suffer from alignment and size issues. The size can only hold up to a value of BUILD_ID_SIZE (20) and the mmap2 event uses a byte for the value. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:50:17 -07:00
Ian Rogers	fcc7cc3123	perf metricgroups: Add NO_THRESHOLD_AND_NMI constraint Thresholds can increase the number of counters a metric needs. The NMI watchdog can take away a counter (hopefully the buddy watchdog will become the default and this will no longer be true). Add a new constraint for the case that a metric and its thresholds would fit in counters but only if the NMI watchdog isn't enabled. Either the threshold or the NMI watchdog should be disabled to make the metric fit. Wire this up into the metric__group_events logic. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250719030517.1990983-16-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:41:36 -07:00
Ian Rogers	8dcd27b1b8	perf parse-events: Fix missing slots for Intel topdown metric events Topdown metric events require grouping with a slots event. In perf metrics this is currently achieved by metrics adding an unnecessary "0 * tma_info_thread_slots". New TMA metrics trigger optimizations of the metric expression that remove the event and break the metric due to the missing but required event. Add a pass, immediately before sorting and fixing parsed events, that inserts a slots event if one is missing. Update test expectations to match this. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250719030517.1990983-15-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:41:35 -07:00
Ian Rogers	5b546de9cc	perf topdown: Use attribute to see if an event is a topdown metric or slots The string comparisons were overly broad and could fire for the incorrect PMU and events. Switch to using the config in the attribute then add a perf test to confirm the attribute config values match those of parsed events of that name and don't match others. This exposed matches for slots events that shouldn't have matched as the slots fixed counter event, such as topdown.slots_p. Fixes: `fbc798316b` ("perf x86/topdown: Refine helper arch_is_topdown_metrics()") Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250719030517.1990983-14-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:41:35 -07:00
Ian Rogers	811082e4b6	perf parse-events: Support user CPUs mixed with threads/processes Counting events system-wide with a specified CPU prior to this change worked: ``` $ perf stat -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' -a sleep 1 Performance counter stats for 'system wide': 59,393,419,099 msr/tsc/ 33,927,965,927 msr/tsc,cpu=cpu_core/ 25,465,608,044 msr/tsc,cpu=cpu_atom/ ``` However, when counting with process the counts became system wide: ``` $ perf stat -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' perf test -F 10 10.1: Basic parsing test : Ok 10.2: Parsing without PMU name : Ok 10.3: Parsing with PMU name : Ok Performance counter stats for 'perf test -F 10': 59,233,549 msr/tsc/ 59,227,556 msr/tsc,cpu=cpu_core/ 59,224,053 msr/tsc,cpu=cpu_atom/ ``` Make the handling of CPU maps with event parsing clearer. When an event is parsed creating an evsel the cpus should be either the PMU's cpumask or user specified CPUs. Update perf_evlist__propagate_maps so that it doesn't clobber the user specified CPUs. Try to make the behavior clearer, firstly fix up missing cpumasks. Next, perform sanity checks and adjustments from the global evlist CPU requests and for the PMU including simplifying to the "any CPU"(-1) value. Finally remove the event if the cpumask is empty. So that events are opened with a CPU and a thread change stat's create_perf_stat_counter to give both. With the change things are fixed: ``` $ perf stat --no-scale -e 'msr/tsc/,msr/tsc,cpu=cpu_core/,msr/tsc,cpu=cpu_atom/' perf test -F 10 10.1: Basic parsing test : Ok 10.2: Parsing without PMU name : Ok 10.3: Parsing with PMU name : Ok Performance counter stats for 'perf test -F 10': 63,704,975 msr/tsc/ 47,060,704 msr/tsc,cpu=cpu_core/ (4.62%) 16,640,591 msr/tsc,cpu=cpu_atom/ (2.18%) ``` However, note the "--no-scale" option is used. 
This is necessary as the running time for the event on the counter isn't the same as the enabled time because the thread doesn't necessarily run on the CPUs specified for the counter. All counter values are scaled with: scaled_value = value * time_enabled / time_running and so without --no-scale the scaled_value becomes very large. This problem already exists on hybrid systems for the same reason. Here are 2 runs of the same code with an instructions event that counts the same on both types of core, there is no real multiplexing happening on the event: ``` $ perf stat -e instructions perf test -F 10 ... Performance counter stats for 'perf test -F 10': 87,896,447 cpu_atom/instructions/ (14.37%) 98,171,964 cpu_core/instructions/ (85.63%) ... $ perf stat --no-scale -e instructions perf test -F 10 ... Performance counter stats for 'perf test -F 10': 13,069,890 cpu_atom/instructions/ (19.32%) 83,460,274 cpu_core/instructions/ (80.68%) ... ``` The scaling has inflated per-PMU instruction counts and the overall count by 2x. To fix this the kernel needs changing when a task+CPU event (or just task event on hybrid) is scheduled out. A fix could be that the state isn't inactive but off for such events, so that time_enabled counts don't accumulate on them. Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250719030517.1990983-13-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:41:35 -07:00
Ian Rogers	e9387ba569	perf evsel: Add evsel__open_per_cpu_and_thread Add evsel__open_per_cpu_and_thread that combines the operation of evsel__open_per_cpu and evsel__open_per_thread so that an event without the "any" cpumask can be opened with its cpumask and with threads it specifies. Change the implementation of evsel__open_per_cpu and evsel__open_per_thread to use evsel__open_per_cpu_and_thread to make the implementation of those functions clearer. Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250719030517.1990983-12-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:41:35 -07:00
Ian Rogers	cd63c22168	perf parse-events: Minor __add_event refactoring Rename cpu_list to user_cpus. If a PMU isn't given, find it early from the perf_event_attr. Make the pmu_cpus more explicitly a copy from the PMU (except when user_cpus are given). Derive the cpus from pmu_cpus and user_cpus as appropriate. Handle strdup errors on name and metric_id. Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250719030517.1990983-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:41:35 -07:00
Ian Rogers	3cb614a261	perf pmus: Factor perf_pmus__find_by_attr out of evsel__find_pmu Allow a PMU to be found by a perf_event_attr, useful when creating evsels. Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250719030517.1990983-10-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:41:35 -07:00
Ian Rogers	f958537f18	perf evsel: Use libperf perf_evsel__exit Avoid the duplicated code and better enable perf_evsel to change. Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250719030517.1990983-9-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:41:35 -07:00
Ian Rogers	6d765f5f7e	libperf evsel: Rename own_cpus to pmu_cpus own_cpus is generally the cpumask from the PMU. Rename to pmu_cpus to try to make this clearer. Variable rename with no other changes. Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250719030517.1990983-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:41:35 -07:00
Ian Rogers	175c852325	perf tool_pmu: Allow num_cpus(_online) to be specific to a cpumask For hybrid metrics it is useful to know the number of p-core or e-core CPUs. If a cpumask is specified for the num_cpus or num_cpus_online tool events, compute the value relative to the given mask rather than for the full system. ``` $ sudo /tmp/perf/perf stat -e 'tool/num_cpus/,tool/num_cpus,cpu=cpu_core/, tool/num_cpus,cpu=cpu_atom/,tool/num_cpus_online/,tool/num_cpus_online, cpu=cpu_core/,tool/num_cpus_online,cpu=cpu_atom/' true Performance counter stats for 'true': 28 tool/num_cpus/ 16 tool/num_cpus,cpu=cpu_core/ 12 tool/num_cpus,cpu=cpu_atom/ 28 tool/num_cpus_online/ 16 tool/num_cpus_online,cpu=cpu_core/ 12 tool/num_cpus_online,cpu=cpu_atom/ 0.000767205 seconds time elapsed 0.000938000 seconds user 0.000000000 seconds sys ``` Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250719030517.1990983-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:41:35 -07:00
Ian Rogers	bd741d80dc	perf parse-events: Allow the cpu term to be a PMU or CPU range On hybrid systems, events like msr/tsc/ will aggregate counts across all CPUs. Often metrics only want a value like msr/tsc/ for the cores on which the metric is being computed. Listing each CPU with terms cpu=0,cpu=1.. is laborious and would need to be encoded for all variations of a CPU model. Allow the cpumask from a PMU to be an argument to the cpu term. For example in the following the cpumask of the cstate_pkg PMU selects the CPUs to count msr/tsc/ counter upon: ``` $ cat /sys/bus/event_source/devices/cstate_pkg/cpumask 0 $ perf stat -A -e 'msr/tsc,cpu=cstate_pkg/' -a sleep 0.1 Performance counter stats for 'system wide': CPU0 252,621,253 msr/tsc,cpu=cstate_pkg/ 0.101184092 seconds time elapsed ``` As the cpu term is now also allowed to be a string, allow it to encode a range of CPUs (a list can't be supported as ',' is already a special token). The "event qualifiers" section of the `perf list` man page is updated to detail the additional behavior. The man page formatting is tidied up in this section, as it was incorrectly appearing within the "parameterized events" section. Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250719030517.1990983-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:41:35 -07:00
Ian Rogers	ced4c24956	perf stat: Don't size aggregation ids from user_requested_cpus As evsels may have additional CPU terms, the user_requested_cpus may not reflect all the CPUs requested. Use evlist->all_cpus to size the array as that reflects all the CPUs potentially needed by the evlist. Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250719030517.1990983-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:41:34 -07:00
Ian Rogers	848e7a06fe	perf stat: Avoid buffer overflow to the aggregation map CPUs may be created and passed to perf_stat__get_aggr (via config->aggr_get_id), such as in the stat display should_skip_zero_counter. There may be no such aggr_id, for example, if running with a thread. Add a missing bound check and just create IDs for these cases. Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250719030517.1990983-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:41:34 -07:00
Ian Rogers	62f4512238	perf parse-events: Warn if a cpu term is unsupported by a CPU Factor requested CPU warning out of evlist and into evsel. At the end of adding an event, perform the warning check. To avoid repeatedly testing if the cpu_list is empty, add a local variable. ``` $ perf stat -e cpu_atom/cycles,cpu=1/ -a true WARNING: A requested CPU in '1' is not supported by PMU 'cpu_atom' (CPUs 16-27) for event 'cpu_atom/cycles/' Performance counter stats for 'system wide': <not supported> cpu_atom/cycles/ 0.000781511 seconds time elapsed ``` Reviewed-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: James Clark <james.clark@linaro.org> Link: https://lore.kernel.org/r/20250719030517.1990983-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:41:34 -07:00
Ian Rogers	12d30725bf	perf pfm: Don't force loading of all PMUs Force loading all PMUs adds significant cost because DRM and other PMUs are loaded; it should also not be required if the pmus__ functions are used. Tested by running perf test, in particular the pfm-related tests. Also, `perf list` output is identical before and after. Before: $ time ./perf test pfm 54: Test libpfm4 support : 54.1: test of individual --pfm-events : Ok 54.2: test groups of --pfm-events : Ok 103: perf all libpfm4 events test : Ok real 0m8.933s user 0m1.824s sys 0m7.122s After: $ time ./perf test pfm 54: Test libpfm4 support : 54.1: test of individual --pfm-events : Ok 54.2: test groups of --pfm-events : Ok 103: perf all libpfm4 events test : Ok real 0m5.259s user 0m1.793s sys 0m3.570s Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250722013449.146233-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-24 13:28:31 -07:00
Yang Li	db12d7ec6b	perf stat: Remove duplicated include in stat-shadow.c The header file rblist.h is included twice in stat-shadow.c, so one inclusion can be removed. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=22933 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> Link: https://lore.kernel.org/r/20250723070418.2195172-1-yang.lee@linux.alibaba.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-23 10:48:29 -07:00
Ian Rogers	008b75759e	perf ui scripts: Switch FILENAME_MAX to NAME_MAX FILENAME_MAX is the same as PATH_MAX (4kb) in glibc rather than NAME_MAX's 255. Switch to using NAME_MAX and ensure the '\0' is accounted for in the path's buffer size. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250717150855.1032526-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-22 18:17:53 -07:00
Ian Rogers	82aac55337	perf pmu: Switch FILENAME_MAX to NAME_MAX FILENAME_MAX is the same as PATH_MAX (4kb) in glibc rather than NAME_MAX's 255. Switch to using NAME_MAX and ensure the '\0' is accounted for in the path's buffer size. Fixes: `754baf426e` ("perf pmu: Change aliases from list to hashmap") Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250717150855.1032526-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-22 18:17:53 -07:00
Changbin Du	129f70bd60	perf: ftrace: add graph tracer options args/retval/retval-hex/retaddr This change adds support for new funcgraph tracer options funcgraph-args, funcgraph-retval, funcgraph-retval-hex and funcgraph-retaddr. The newly added options are: - args : Show function arguments. - retval : Show function return value. - retval-hex : Show function return value in hexadecimal format. - retaddr : Show function return address. # ./perf ftrace -G vfs_write --graph-opts retval,retaddr # tracer: function_graph # # CPU DURATION FUNCTION CALLS # | | | | | | | 5) | mutex_unlock() { /* <-rb_simple_write+0xda/0x150 */ 5) 0.188 us | local_clock(); /* <-lock_release+0x2ad/0x440 ret=0x3bf2a3cf90e */ 5) | rt_mutex_slowunlock() { /* <-rb_simple_write+0xda/0x150 */ 5) | _raw_spin_lock_irqsave() { /* <-rt_mutex_slowunlock+0x4f/0x200 */ 5) 0.123 us | preempt_count_add(); /* <-_raw_spin_lock_irqsave+0x23/0x90 ret=0x0 */ 5) 0.128 us | local_clock(); /* <-__lock_acquire.isra.0+0x17a/0x740 ret=0x3bf2a3cfc8b */ 5) 0.086 us | do_raw_spin_trylock(); /* <-_raw_spin_lock_irqsave+0x4a/0x90 ret=0x1 */ 5) 0.845 us | } /* _raw_spin_lock_irqsave ret=0x292 */ 5) | _raw_spin_unlock_irqrestore() { /* <-rt_mutex_slowunlock+0x191/0x200 */ 5) 0.097 us | local_clock(); /* <-lock_release+0x2ad/0x440 ret=0x3bf2a3cff1f */ 5) 0.086 us | do_raw_spin_unlock(); /* <-_raw_spin_unlock_irqrestore+0x23/0x60 ret=0x1 */ 5) 0.104 us | preempt_count_sub(); /* <-_raw_spin_unlock_irqrestore+0x35/0x60 ret=0x0 */ 5) 0.726 us | } /* _raw_spin_unlock_irqrestore ret=0x80000000 */ 5) 1.881 us | } /* rt_mutex_slowunlock ret=0x0 */ 5) 2.931 us | } /* mutex_unlock ret=0x0 */ Signed-off-by: Changbin Du <changbin.du@huawei.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250613114048.132336-1-changbin.du@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-22 17:47:22 -07:00
Federico Pellegrin	e9fdf0d2ec	perf build: Always disable stack protection for BPF skeleton objects When the clang toolchain has stack protection enabled, the bpf skeletons build fails with: error: A call to built-in function '__stack_chk_fail' is not supported. Since stack-protector makes no sense for the BPF bits, just unconditionally disable it. See also similar case at `878625e1c7` Signed-off-by: Federico Pellegrin <fede@evolware.org> Link: https://lore.kernel.org/r/20250718041224.12389-1-fede@evolware.org [ rearrange long lines ] Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-20 20:49:35 -07:00
Anubhav Shelat	39f473f6d0	perf sched timehist: decode process names of processes in zombie state Previously, when running perf sched timehist --state, the name of a process recorded in the zombie state was not decoded properly and appeared as just the PID: 1140057.412177 [0006] Mutter Input Th[3139/3104] 0.956 0.019 0.041 S 1140057.412222 [0012] :1248612[1248612] 0.000 0.000 0.332 Z 1140057.412275 [0004] <idle> 0.052 0.052 0.953 I 1140057.412284 [0008] <idle> 0.070 0.070 0.932 I 1140057.412333 [0004] KMS thread[3126/3104] 0.953 0.112 0.058 S Now some extra processing has been added to decode the process name: 1140057.412177 [0006] Mutter Input Th[3139/3104] 0.956 0.019 0.041 S 1140057.412222 [0012] sleep[1248612] 0.000 0.000 0.332 Z 1140057.412275 [0004] <idle> 0.052 0.052 0.953 I 1140057.412284 [0008] <idle> 0.070 0.070 0.932 I 1140057.412333 [0004] KMS thread[3126/3104] 0.953 0.112 0.058 S Signed-off-by: Anubhav Shelat <ashelat@redhat.com> Link: https://lore.kernel.org/r/20250716203914.45772-2-ashelat@redhat.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-17 17:05:07 -07:00
Ian Rogers	95d692f9ab	perf flamegraph: Fix minor pylint/type hint issues Switch to assuming python3. Fix minor pylint issues on line length, repeated compares, not using f-strings and variable case. Add type hints and check with mypy. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20250716004635.31161-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-16 10:43:27 -07:00
Namhyung Kim	8db1d77248	perf ftrace latency: Add -e option to measure time between two events In addition to the function latency, it can measure event latencies. Some kernel tracepoints are paired and it's meaningful to measure how long it takes between the two events. The latency is tracked for the same thread. Currently it only uses BPF to do the work but it can be lifted later. Instead of having a separate BPF program for each tracepoint, it only uses generic 'event_begin' and 'event_end' programs to attach to any (raw) tracepoints. $ sudo perf ftrace latency -a -b --hide-empty -e i915_request_wait_begin,i915_request_wait_end -- sleep 1 # DURATION | COUNT | GRAPH | 256 - 512 us | 4 | ###### | 2 - 4 ms | 2 | ### | 4 - 8 ms | 12 | ################### | 8 - 16 ms | 10 | ################ | # statistics (in usec) total time: 194915 avg time: 6961 max time: 12855 min time: 373 count: 28 Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250714052143.342851-1-namhyung@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-14 22:51:58 -07:00
Ian Rogers	b4aff7ed7a	perf python: Set index error for invalid thread/cpu map items Returning NULL for out of bound CPU or thread map items causes internal errors. Fix by correctly setting the error to be an index error. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-14-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-11 12:36:40 -07:00
Ian Rogers	421c5f39ad	perf python: Improve leader copying from evlist The struct pyrf_evlist embeds the evlist, requiring copying from things like parsed events. The copying logic handles the leader being the event itself, but if the group leader is a different event in the list, an evsel ends up pointing to the evsel in the list that was copied from, which is bad. Fix this by adding another pass over the evlist rewriting leaders, simplified by the introduction of two evlist helpers. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-13-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-11 12:36:40 -07:00
Ian Rogers	6183afcba9	perf python: Correct pyrf_evsel__read for tool PMUs Tool PMUs assume that stat's process_counter_values is being used to read the counters. Specifically they hold onto old values in evsel->prev_raw_counts and give the cumulative count based off of this value. Update pyrf_evsel__read to allocate counts and prev_raw_counts, use evsel__read_counter rather than perf_evsel__read so tool PMUs are read from not just perf_event_open events, make the returned pyrf_counts_values contain the delta value rather than the cumulative value. Fixes: `739621f657` ("perf python: Add evsel read method") Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-12-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-11 12:36:40 -07:00
Ian Rogers	64ec9b997f	perf python: Fix thread check in pyrf_evsel__read The CPU index is incorrectly checked rather than the thread index. Fixes: `739621f657` ("perf python: Add evsel read method") Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-11 12:36:40 -07:00
Ian Rogers	7d5b635d9f	perf python: In str(evsel) use the evsel__pmu_name helper The evsel__pmu_name helper will internally use evsel__find_pmu that handles legacy events, extended types, etc. in determining a PMU and will provide a better value than just trying to access the PMU's name directly as the PMU may not have been computed. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-10-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-11 12:36:40 -07:00
Ian Rogers	5c255832de	perf jevents: If the long_desc and desc are identical then drop the long_desc If the short and long descriptions are the same then save space and don't store both of them. When storing the desc in the perf_pmu_alias, don't duplicate the desc into the long_desc. By avoiding storing the duplicate the size of the events string in the binary on x86 is reduced by 29,840 bytes. Fix tests that expect a duplicated description. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-9-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-11 12:36:40 -07:00
Ian Rogers	3787cdaf38	perf expr: Accumulate rather than replace in the context counts Metrics will fill in the context to have mappings from an event to a count. When counts are added they replace existing mappings which generally shouldn't exist with aggregation. Switch to accumulating to better support cases where perf stat's aggregation isn't used and we may see a counter more than once. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-11 12:36:40 -07:00
Ian Rogers	faebee18d7	perf stat: Move metric list from config to evlist The rblist of metric_event that then have a list of associated metric_expr is moved out of the stat_config and into the evlist. This is done as part of refactoring things for python, having the state split in two places complicates that implementation. The evlist is doing the harder work of enabling and disabling events, the metrics are needed to compute a value and it doesn't seem unreasonable to hang them from the evlist. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-11 12:36:40 -07:00
Ian Rogers	cb336b6aae	perf metricgroup: Factor out for-each function and move out printing Factor metricgroup__for_each_metric into its own function handling regular and sys metrics. Make the metric adding and printing code use it, move the printing code into print-events files. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-11 12:36:40 -07:00
Ian Rogers	8c75dc7420	perf pmu: Tolerate failure to read the type for well-known PMUs If sysfs isn't mounted then we may fail to read a PMU's type. In this situation resort to a lookup of well-known types. This only applies to software, tracepoint and breakpoint PMUs. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-11 12:36:40 -07:00
Ian Rogers	bcc7693ad1	perf spark: Fix includes and add SPDX scnprintf is declared in linux/kernel.h, directly depend upon it. Add missing SPDX comments. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-11 12:36:40 -07:00
Ian Rogers	679c098cd2	perf parse-events: Minor tidy up of event_type helper Add missing breakpoint and raw types. Avoid a switch, just use a lookup array. Switch the type to unsigned to avoid checking negative values. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-11 12:36:40 -07:00
Ian Rogers	28f5aa8184	perf hwmon_pmu: Avoid shortening hwmon PMU name Long names like ucsi_source_psy_USBC000:001 when prefixed with hwmon_ exceed the buffer size and the last digit is lost. This causes confusion with similar names like ucsi_source_psy_USBC000:002. Extend the buffer size to avoid this. Fixes: `53cc0b351e` ("perf hwmon_pmu: Add a tool PMU exposing events from hwmon in sysfs") Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-2-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-11 12:36:39 -07:00
Leo Yan	4a6cdecaa1	perf tests bp_account: Fix leaked file descriptor Since the commit `e9846f5ead` ("perf test: In forked mode add check that fds aren't leaked"), the test "Breakpoint accounting" reports the error: # perf test -vvv "Breakpoint accounting" 20: Breakpoint accounting: --- start --- test child forked, pid 373 failed opening event 0 failed opening event 0 watchpoints count 4, breakpoints count 6, has_ioctl 1, share 0 wp 0 created wp 1 created wp 2 created wp 3 created wp 0 modified to bp wp max created ---- end(0) ---- Leak of file descriptor 7 that opened: 'anon_inode:[perf_event]' A watchpoint's file descriptor was not properly released. This patch fixes the leak. Fixes: `032db28e5f` ("perf tests: Add breakpoint accounting/modify test") Reported-by: Aishwarya TCV <aishwarya.tcv@arm.com> Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250711-perf_fix_breakpoint_accounting-v1-1-b314393023f9@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-11 12:05:02 -07:00
Sebastian Andrzej Siewior	7497e947bc	perf bench futex: Remove support for IMMUTABLE It has been decided to remove support for the IMMUTABLE futex. perf bench was one of the early users for testing purposes. Now that the API is removed before it could be used in an official release, remove the bits from perf, too. Remove support for IMMUTABLE futex. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250710110011.384614-7-bigeasy@linutronix.de	2025-07-11 16:02:01 +02:00
Thomas Richter	a12a23720c	perf list: Remove trailing A in PAI crypto event 4210 According to the z16 and z17 Principles of Operation documents SA22-7832-13 and SA22-7832-14, the event 4210 is named PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_AES_256 without a trailing 'A'. Adjust the json definition files for this event and remove the trailing 'A' character. PCC_COMPUTE_LAST_BLOCK_CMAC_USING_ENCRYPTED_AES_256A Also remove a blank ' ' between the dash '-' and the number: xxx-AES- 192 ----> xxx-AES-192 Suggested-by: Ingo Franzki <ifranzki@linux.ibm.com> Signed-off-by: Thomas Richter <tmricht@linux.ibm.com> Reviewed-by: Ian Rogers <irogers@google.com> Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com> Link: https://lore.kernel.org/r/20250709072452.1595257-1-tmricht@linux.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-09 09:19:22 -07:00
Ian Rogers	585189332a	perf vendor events: Update TigerLake events Update events from v1.17 to v1.18. Bring in the event updates v1.18: `943fea37d0` Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-16-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-07 16:45:06 -07:00
Ian Rogers	80c6b82226	perf vendor events: Update SkylakeX events Update events from v1.36 to v1.37. Bring in the event updates v1.37: `6ee8e4cadd` Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-15-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-07 16:45:06 -07:00
Ian Rogers	336473ad07	perf vendor events: Update SierraForest events Update events from v1.09 to v1.11. Bring in the event updates v1.11: `6b824df1db` `4b0346fbee` Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-14-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-07 16:45:06 -07:00
Ian Rogers	8704418511	perf vendor events: Update SapphireRapids events Update events from v1.25 to v1.28. Bring in the event updates v1.28: `990bfdff27` `b7b4d7f18c` Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-13-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-07 16:45:05 -07:00
Ian Rogers	1f9e24e4df	perf vendor events: Add PantherLake events Bring in the events at v1.00: `d90a6737d0` Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250630163101.1920170-12-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2025-07-07 16:45:05 -07:00
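
The scaling rule quoted in the 811082e4b6 entry above (scaled_value = value * time_enabled / time_running) can be illustrated with a minimal Python sketch. The function name `scale_count` and the timing numbers are assumptions made up for illustration; this is not perf's code, only the arithmetic the commit message describes. It shows why a counter that was running for a small fraction of its enabled time reports a much larger scaled value.

```python
def scale_count(value: int, time_enabled: int, time_running: int) -> float:
    """Extrapolate a raw count to the full enabled time.

    Implements: scaled_value = value * time_enabled / time_running
    """
    if time_running == 0:
        # The event never ran, so there is nothing to extrapolate from.
        # Returning 0.0 is a convention chosen for this sketch.
        return 0.0
    return value * time_enabled / time_running


# Raw cpu_atom/instructions/ count taken from the --no-scale run in the log.
raw = 13_069_890

# Hypothetical times: the event was enabled for 10,000 ns but the thread
# only ran on a matching CPU for 1,932 ns of that.
enabled_ns = 10_000
running_ns = 1_932

scaled = scale_count(raw, enabled_ns, running_ns)
print(f"raw={raw} scaled={scaled:.0f}")
```

With --no-scale, perf stat reports the raw value instead, which is why that option is used in the commit's "fixed" example.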


17931 Commits
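
The build-id entries above (fccaaf6fbb, 5a2ceebd81) describe two ideas: formatting a build ID into a buffer whose size is passed in explicitly, and truncating oversized build ID data instead of overflowing. A rough Python sketch of both follows. The helper names are hypothetical and the behaviour only imitates snprintf-style bounded output; the value 20 for BUILD_ID_SIZE is the one given in the f3982385bc entry.

```python
BUILD_ID_SIZE = 20  # maximum build ID length in bytes, per the log above


def build_id_truncate(data: bytes, max_size: int = BUILD_ID_SIZE) -> bytes:
    """Keep at most max_size bytes rather than writing past the buffer."""
    return data[:max_size]


def build_id_to_hex(data: bytes, buf_size: int) -> str:
    """Format data as hex, bounded like snprintf.

    At most buf_size - 1 characters are produced, leaving room for the
    terminating NUL that a C buffer of buf_size bytes would need.
    """
    if buf_size <= 0:
        return ""
    return data.hex()[: buf_size - 1]


oversized = bytes(range(32))            # 32 bytes, larger than BUILD_ID_SIZE
stored = build_id_truncate(oversized)   # only the first 20 bytes are kept
text = build_id_to_hex(stored, 2 * BUILD_ID_SIZE + 1)
print(len(stored), text)
```

Passing the buffer size explicitly lets a caller request a shorter string without assuming every buffer is large enough for a full-length build ID.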