linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-04-29 01:09:33 -04:00

Author	SHA1	Message	Date
Ian Rogers	176e66715d	perf vendor events intel: Update sandybridge TMA metrics to 4.7 Top-Down Microarchitecture Analysis (TMA) metrics simplify cycle-accounting using microarchitecture-abstracted metrics organized in one hierarchy. This update is from version 4.5 to 4.7. The update includes: - Add metrics tma_fp_vector_128b, tma_fp_vector_256b and tma_info_system_cpus_utilized. - Remove metrics tma_info_system_mem_parallel_requests, tma_info_system_core_frequency and tma_info_system_mem_request_latency. - Swapped tma_info_core_ilp (becomes per SMT thread) and tma_info_pipeline_execute (per physical core). - Tuned thresholds for tma_fetch_bandwidth. The update came from: https://github.com/intel/perfmon/pull/140 https://github.com/intel/perfmon/pull/138 Running the script: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-27-irogers@google.com	2024-02-16 15:28:24 -08:00
Ian Rogers	74f76c3ba7	perf vendor events intel: Update rocketlake TMA metrics to 4.7 Top-Down Microarchitecture Analysis (TMA) metrics simplify cycle-accounting using microarchitecture-abstracted metrics organized in one hierarchy. This update is from version 4.5 to 4.7. The update includes: - tma_info_bottleneck* metrics, an abstraction or summarization of the 100+ TMA tree nodes into 12-entry familiar performance metrics. - Reduce number of events (multiplexing) for tma_info_system_gflops, tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0. - Fixes for tma_info_bottleneck_mispredictions and tma_info_bad_spec_branch_misprediction_cost. - New tma_info_inst_mix_ippause metric. - tma_serializing_operation is raised to level 3. - Swapped tma_info_core_ilp (becomes per SMT thread) and tma_info_pipeline_execute (per physical core). - tma_nop_instructions and tma_shuffles_256b are lowered to level 4 under tma_other_light_ops_group. - Reduced number of events when SMT is off. - Tuned thresholds for tma_info_bottleneck_branching_overhead, tma_fetch_bandwidth and tma_ports_utilized_3m. The update came from: https://github.com/intel/perfmon/pull/140 https://github.com/intel/perfmon/pull/138 Running the script: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-26-irogers@google.com	2024-02-16 15:28:12 -08:00
Ian Rogers	5f9a13bee0	perf vendor events intel: Update jaketown TMA metrics to 4.7 Top-Down Microarchitecture Analysis (TMA) metrics simplify cycle-accounting using microarchitecture-abstracted metrics organized in one hierarchy. This update is from version 4.5 to 4.7. The update includes: - Swapped tma_info_core_ilp (becomes per SMT thread) and tma_info_pipeline_execute (per physical core). - Tuned thresholds for tma_fetch_bandwidth. The update came from: https://github.com/intel/perfmon/pull/140 https://github.com/intel/perfmon/pull/138 Running the script: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-25-irogers@google.com	2024-02-16 15:27:59 -08:00
Ian Rogers	14bc1a59f2	perf vendor events intel: Update ivytown TMA metrics to 4.7 Top-Down Microarchitecture Analysis (TMA) metrics simplify cycle-accounting using microarchitecture-abstracted metrics organized in one hierarchy. This update is from version 4.5 to 4.7. The update includes: - Swapped tma_info_core_ilp (becomes per SMT thread) and tma_info_pipeline_execute (per physical core). - Reduced number of events when SMT is off. - Tuned thresholds for tma_fetch_bandwidth and tma_ports_utilized_3m. The update came from: https://github.com/intel/perfmon/pull/140 https://github.com/intel/perfmon/pull/138 Running the script: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-24-irogers@google.com	2024-02-16 15:27:47 -08:00
Ian Rogers	8cf54fa844	perf vendor events intel: Update ivybridge TMA metrics to 4.7 Top-Down Microarchitecture Analysis (TMA) metrics simplify cycle-accounting using microarchitecture-abstracted metrics organized in one hierarchy. This update is from version 4.5 to 4.7. The update includes: - Swapped tma_info_core_ilp (becomes per SMT thread) and tma_info_pipeline_execute (per physical core). - Reduced number of events when SMT is off. - Tuned thresholds for tma_fetch_bandwidth and tma_ports_utilized_3m. The update came from: https://github.com/intel/perfmon/pull/140 https://github.com/intel/perfmon/pull/138 Running the script: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-23-irogers@google.com	2024-02-16 15:27:34 -08:00
Ian Rogers	b15cae3f69	perf vendor events intel: Update icelakex TMA metrics to 4.7 Top-Down Microarchitecture Analysis (TMA) metrics simplify cycle-accounting using microarchitecture-abstracted metrics organized in one hierarchy. This update is from version 4.5 to 4.7. The update includes: - tma_info_bottleneck* metrics, an abstraction or summarization of the 100+ TMA tree nodes into 12-entry familiar performance metrics. - Reduce number of events (multiplexing) for tma_info_system_gflops, tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0. - Fixes for tma_info_bottleneck_mispredictions and tma_info_bad_spec_branch_misprediction_cost. - New tma_info_inst_mix_ippause metric. - tma_serializing_operation is raised to level 3. - Swapped tma_info_core_ilp (becomes per SMT thread) and tma_info_pipeline_execute (per physical core). - tma_nop_instructions and tma_shuffles_256b are lowered to level 4 under tma_other_light_ops_group. - Reduced number of events when SMT is off. - Tuned thresholds for tma_info_bottleneck_branching_overhead, tma_fetch_bandwidth and tma_ports_utilized_3m. The update came from: https://github.com/intel/perfmon/pull/140 https://github.com/intel/perfmon/pull/138 Running the script: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-22-irogers@google.com	2024-02-16 15:27:22 -08:00
Ian Rogers	70bfdad63f	perf vendor events intel: Update icelake TMA metrics to 4.7 Top-Down Microarchitecture Analysis (TMA) metrics simplify cycle-accounting using microarchitecture-abstracted metrics organized in one hierarchy. This update is from version 4.5 to 4.7. The update includes: - tma_info_bottleneck* metrics, an abstraction or summarization of the 100+ TMA tree nodes into 12-entry familiar performance metrics. - Reduce number of events (multiplexing) for tma_info_system_gflops, tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0. - Fixes for tma_info_bottleneck_mispredictions and tma_info_bad_spec_branch_misprediction_cost. - New tma_info_inst_mix_ippause metric. - tma_serializing_operation is raised to level 3. - Swapped tma_info_core_ilp (becomes per SMT thread) and tma_info_pipeline_execute (per physical core). - tma_nop_instructions and tma_shuffles_256b are lowered to level 4 under tma_other_light_ops_group. - Reduced number of events when SMT is off. - Tuned thresholds for tma_info_bottleneck_branching_overhead, tma_fetch_bandwidth and tma_ports_utilized_3m. The update came from: https://github.com/intel/perfmon/pull/140 https://github.com/intel/perfmon/pull/138 Running the script: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-21-irogers@google.com	2024-02-16 15:27:07 -08:00
Ian Rogers	2a264a1946	perf vendor events intel: Update haswellx TMA metrics to 4.7 Top-Down Microarchitecture Analysis (TMA) metrics simplify cycle-accounting using microarchitecture-abstracted metrics organized in one hierarchy. This update is from version 4.5 to 4.7. The update includes: - Swapped tma_info_core_ilp (becomes per SMT thread) and tma_info_pipeline_execute (per physical core). - Tuned thresholds for tma_fetch_bandwidth and tma_ports_utilized_3m. The update came from: https://github.com/intel/perfmon/pull/140 https://github.com/intel/perfmon/pull/138 Running the script: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-20-irogers@google.com	2024-02-16 15:26:54 -08:00
Ian Rogers	89b66259a7	perf vendor events intel: Update haswell TMA metrics to 4.7 Top-Down Microarchitecture Analysis (TMA) metrics simplify cycle-accounting using microarchitecture-abstracted metrics organized in one hierarchy. This update is from version 4.5 to 4.7. The update includes: - Swapped tma_info_core_ilp (becomes per SMT thread) and tma_info_pipeline_execute (per physical core). - Tuned thresholds for tma_fetch_bandwidth and tma_ports_utilized_3m. The update came from: https://github.com/intel/perfmon/pull/140 https://github.com/intel/perfmon/pull/138 Running the script: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-19-irogers@google.com	2024-02-16 15:26:42 -08:00
Ian Rogers	c72a20435a	perf vendor events intel: Update cascadelakex TMA metrics to 4.7 Top-Down Microarchitecture Analysis (TMA) metrics simplify cycle-accounting using microarchitecture-abstracted metrics organized in one hierarchy. This update is from version 4.5 to 4.7. The update includes: - tma_info_bottleneck* metrics, an abstraction or summarization of the 100+ TMA tree nodes into 12-entry familiar performance metrics. - Reduce number of events (multiplexing) for tma_info_system_gflops, tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0. - Fixes for tma_info_bottleneck_mispredictions and tma_info_bad_spec_branch_misprediction_cost. - New tma_info_inst_mix_ippause metric. - tma_serializing_operation is raised to level 3. - Swapped tma_info_core_ilp (becomes per SMT thread) and tma_info_pipeline_execute (per physical core). - tma_nop_instructions and tma_shuffles_256b are lowered to level 4 under tma_other_light_ops_group. - Reduced number of events when SMT is off. - Tuned thresholds for tma_info_bottleneck_branching_overhead, tma_fetch_bandwidth and tma_ports_utilized_3m. The update came from: https://github.com/intel/perfmon/pull/140 https://github.com/intel/perfmon/pull/138 Running the script: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-18-irogers@google.com	2024-02-16 15:26:28 -08:00
Ian Rogers	8792e8f89d	perf vendor events intel: Update broadwellx TMA metrics to 4.7 Top-Down Microarchitecture Analysis (TMA) metrics simplify cycle-accounting using microarchitecture-abstracted metrics organized in one hierarchy. This update is from version 4.5 to 4.7. The update includes: - Reduce number of events (multiplexing) for tma_info_system_gflops, tma_info_core_flopc and tma_info_inst_mix_ipflop. - Removal of tma_info_bad_spec_branch_misprediction_cost. - Swapped tma_info_core_ilp (becomes per SMT thread) and tma_info_pipeline_execute (per physical core). - Tuned thresholds for tma_fetch_bandwidth and tma_ports_utilized_3m. The update came from: https://github.com/intel/perfmon/pull/140 https://github.com/intel/perfmon/pull/138 Running the script: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-17-irogers@google.com	2024-02-16 15:26:15 -08:00
Ian Rogers	4018680df9	perf vendor events intel: Update broadwellde TMA metrics to 4.7 Top-Down Microarchitecture Analysis (TMA) metrics simplify cycle-accounting using microarchitecture-abstracted metrics organized in one hierarchy. This update is from version 4.5 to 4.7. The update includes: - Reduce number of events (multiplexing) for tma_info_system_gflops, tma_info_core_flopc and tma_info_inst_mix_ipflop. - Removal of tma_info_bad_spec_branch_misprediction_cost. - Swapped tma_info_core_ilp (becomes per SMT thread) and tma_info_pipeline_execute (per physical core). - Tuned thresholds for tma_fetch_bandwidth and tma_ports_utilized_3m. The update came from: https://github.com/intel/perfmon/pull/140 https://github.com/intel/perfmon/pull/138 Running the script: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-16-irogers@google.com	2024-02-16 15:26:03 -08:00
Ian Rogers	eedd6d0a72	perf vendor events intel: Update broadwell TMA metrics to 4.7 Top-Down Microarchitecture Analysis (TMA) metrics simplify cycle-accounting using microarchitecture-abstracted metrics organized in one hierarchy. This update is from version 4.5 to 4.7. The update includes: - Reduce number of events (multiplexing) for tma_info_system_gflops, tma_info_core_flopc and tma_info_inst_mix_ipflop. - Removal of tma_info_bad_spec_branch_misprediction_cost. - Swapped tma_info_core_ilp (becomes per SMT thread) and tma_info_pipeline_execute (per physical core). - Tuned thresholds for tma_fetch_bandwidth and tma_ports_utilized_3m. The update came from: https://github.com/intel/perfmon/pull/140 https://github.com/intel/perfmon/pull/138 Running the script: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-15-irogers@google.com	2024-02-16 15:25:51 -08:00
Ian Rogers	52530942ba	perf vendor events intel: Update alderlake TMA metrics to 4.7 Top-Down Microarchitecture Analysis (TMA) metrics simplify cycle-accounting using microarchitecture-abstracted metrics organized in one hierarchy. This update is from version 4.5 to 4.7. The update includes: - tma_info_bottleneck* metrics, an abstraction or summarization of the 100+ TMA tree nodes into 12-entry familiar performance metrics. - tma_c01_wait and tma_c02_wait metrics measure power-performance states. - Reduce number of events (multiplexing) for tma_info_system_gflops, tma_info_core_flopc, tma_info_inst_mix_ipflop and tma_ports_utilized_0. - Fixes for tma_info_bottleneck_mispredictions and tma_info_bad_spec_branch_misprediction_cost. - New tma_info_inst_mix_ippause metric. - tma_serializing_operation is raised to level 3. - Swapped tma_info_core_ilp (becomes per SMT thread) and tma_info_pipeline_execute (per physical core). - tma_nop_instructions and tma_shuffles_256b are lowered to level 4 under tma_other_light_ops_group. - Reduced number of events when SMT is off. - Tuned thresholds for tma_info_bottleneck_branching_overhead, tma_fetch_bandwidth and tma_ports_utilized_3m. The update came from: https://github.com/intel/perfmon/pull/140 https://github.com/intel/perfmon/pull/138 Running the script: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-14-irogers@google.com	2024-02-16 15:25:40 -08:00
Ian Rogers	c4bb31c7b0	perf vendor events intel: Update tigerlake events to v1.15 Update tigerlake events to v1.15 released in: `282a6951fd` Documentation fixes, removal of TOPDOWN.BR_MISPREDICT_SLOTS, deprecation of UNC_ARB_DAT_REQUESTS.RD and UNC_ARB_IFA_OCCUPANCY.ALL. Event json automatically generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-13-irogers@google.com	2024-02-16 15:25:28 -08:00
Ian Rogers	c31d718ca2	perf vendor events intel: Update skylake events to v58 Update skylake events to v58 released in: `625fb75073` Improves documentation. Event json automatically generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-12-irogers@google.com	2024-02-16 15:25:17 -08:00
Ian Rogers	9626368d42	perf vendor events intel: Update sierraforest events to v1.01 Update sierraforest events to v1.01 released in: `582bca24aa` Adds the majority of core and uncore events. Event json automatically generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-11-irogers@google.com	2024-02-16 15:25:06 -08:00
Ian Rogers	8972c03353	perf vendor events intel: Update rocketlake events to v1.02 Update rocketlake events to v1.02 released in: `4931178d1e` Improves documentation and removes TOPDOWN.BR_MISPREDICT_SLOTS. Event json automatically generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-10-irogers@google.com	2024-02-16 15:24:54 -08:00
Ian Rogers	1d262a85e2	perf vendor events intel: Update meteorlake events to v1.07 Update meteorlake events to v1.07 released in: `6251722308` Umask changed on atom mem_bound events. Adds atom events ARITH.FPDIV_ACTIVE, FP_FLOPS_RETIRED.ALL, FP_FLOPS_RETIRED.DP, FP_FLOPS_RETIRED.FP32, ARITH.DIV_ACTIVE, BR_INST_RETIRED.COND, BR_INST_RETIRED.COND_TAKEN, BR_INST_RETIRED.INDIRECT, BR_INST_RETIRED.INDIRECT_CALL, BR_INST_RETIRED.IND_CALL, BR_INST_RETIRED.NEAR_RETURN, DTLB_LOAD_MISSES.WALK_COMPLETED_4K, DTLB_STORE_MISSES.WALK_COMPLETED_2M_4M, DTLB_STORE_MISSES.WALK_COMPLETED_4K, ITLB_MISSES.WALK_COMPLETED_4K, and alias events. Event json automatically generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-9-irogers@google.com	2024-02-16 15:24:16 -08:00
Ian Rogers	e8866cdbe1	perf vendor events intel: Update icelake events to v1.21 Update icelake events to v1.21 released in: `54f1246b04` Improves descriptions, removes TOPDOWN.BR_MISPREDICT_SLOTS. Event json automatically generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-8-irogers@google.com	2024-02-16 15:24:04 -08:00
Ian Rogers	f9044d46b7	perf vendor events intel: Update haswell events to v35 Update haswell events to v35 released in: `c0f9b34d42` Updates "must be precise" on RTM_RETIRED.ABORTED. Event json automatically generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: linux-perf-users@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-7-irogers@google.com	2024-02-16 15:23:53 -08:00
Ian Rogers	24cda3081a	perf vendor events intel: Update grandridge events to v1.01 Update grandridge events to v1.01 released in: `211d607165` Adds the majority of core and uncore events. Event json automatically generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-6-irogers@google.com	2024-02-16 15:23:40 -08:00
Ian Rogers	ea518afc99	perf vendor events intel: Update emeraldrapids events to v1.03 Update emeraldrapids events to v1.03 released in: `c7c6f72dae` Adds uncore CHA events. Event json automatically generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-5-irogers@google.com	2024-02-16 15:23:24 -08:00
Ian Rogers	7163acea30	perf vendor events intel: Update broadwell events to v29 Update broadwell events to v29 released in: `47117146c6` Updates "must be precise" on RTM_RETIRED.ABORTED. Event json automatically generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-4-irogers@google.com	2024-02-16 15:23:07 -08:00
Ian Rogers	5dcc2abaa5	perf vendor events intel: Update alderlaken events to v1.24 Update alderlaken events to v1.24 released in: `e627dd8d89` Adds LBR_INSERTS.ANY/MISC_RETIRED.LBR_INSERTS event. Event json automatically generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-3-irogers@google.com	2024-02-16 15:22:48 -08:00
Ian Rogers	2252ddf434	perf vendor events intel: Update alderlake events to v1.24 Update alderlake events to v1.24 released in: `e627dd8d89` Adds aliased events, improves documentation and fixes some event fields. Event json automatically generated by: https://github.com/intel/perfmon/blob/main/scripts/create_perf_json.py Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: Stephane Eranian <eranian@google.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Edward Baker <edward.baker@intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Samantha Alt <samantha.alt@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214011820.644458-2-irogers@google.com	2024-02-16 15:22:26 -08:00
Arnaldo Carvalho de Melo	29d16de26d	perf augmented_raw_syscalls.bpf: Move 'struct timespec64' to vmlinux.h If we instead decide to generate vmlinux.h from BTF info, it will be there: $ pahole timespec64 struct timespec64 { time64_t tv_sec; /* 0 8 */ long int tv_nsec; /* 8 8 */ /* size: 16, cachelines: 1, members: 2 */ /* last cacheline: 16 bytes */ }; $ pahole manages to find it from /sys/kernel/btf/vmlinux, which is generated from the kernel types. With this linux/bpf.h doesn't need to be included, as it's already in the minimalistic tools/perf/util/bpf_skel/vmlinux/vmlinux.h file or what we need comes when generating a vmlinux.h file from BTF info, i.e. when using GEN_VMLINUX_H=1, as noticed by Namhyung in a build break before removing linux/bpf.h. Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/Zc_fp6CgDClPhS_O@x1	2024-02-16 15:19:57 -08:00
Michael Petlan	f512e08fd0	perf testsuite: Install kprobe tests and common files Signed-off-by: Michael Petlan <mpetlan@redhat.com> Cc: kjain@linux.ibm.com Cc: atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240215110231.15385-8-mpetlan@redhat.com	2024-02-16 11:50:02 -08:00
Veronika Molnarova	e7d759f31c	perf testsuite: Add test for kprobe handling Test perf interface to kprobes: listing, adding and removing probes. It is run as a part of perftool-testsuite_probe test case. Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com> Signed-off-by: Michael Petlan <mpetlan@redhat.com> Cc: kjain@linux.ibm.com Cc: atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240215110231.15385-7-mpetlan@redhat.com	2024-02-16 11:49:47 -08:00
Veronika Molnarova	61d348f1e9	perf testsuite: Add common output checking helpers As a form of validation, it is a common practice to check the outputs of commands whether they contain expected patterns or match a certain regex. Add helpers for verifying that all regexes are found in the output, that all lines match any pattern from a set and that a certain expression is not present in the output. In verbose mode these helpers log mismatches for easier failure investigation. Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com> Signed-off-by: Michael Petlan <mpetlan@redhat.com> Cc: kjain@linux.ibm.com Cc: atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240215110231.15385-6-mpetlan@redhat.com	2024-02-16 11:49:36 -08:00
Veronika Molnarova	c8eb2a9ff8	perf testsuite: Add test case for perf probe Add new perf probe test case that acts as an entry element in perf test list. Runs multiple subtests from directory "base_probe", which will be added in upcoming patches and can be expanded without further editing. Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com> Signed-off-by: Michael Petlan <mpetlan@redhat.com> Cc: kjain@linux.ibm.com Cc: atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240215110231.15385-5-mpetlan@redhat.com	2024-02-16 11:49:22 -08:00
Veronika Molnarova	e3425864a9	perf testsuite: Add initialization script for shell tests Initialize reporting and logging functions that unify the formatting of the test output used for shell tests. Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com> Signed-off-by: Michael Petlan <mpetlan@redhat.com> Cc: kjain@linux.ibm.com Cc: atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240215110231.15385-4-mpetlan@redhat.com	2024-02-16 11:48:58 -08:00
Veronika Molnarova	451af6a790	perf testsuite: Add common setting for shell tests Add settings defining sample commands later shared by shell tests. This adds the possibility to globally adjust the default values for the whole testsuite. Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com> Signed-off-by: Michael Petlan <mpetlan@redhat.com> Cc: kjain@linux.ibm.com Cc: atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240215110231.15385-3-mpetlan@redhat.com	2024-02-16 11:48:40 -08:00
Veronika Molnarova	0aa8142871	perf testsuite: Add common regex patterns Unify perf regexes for checking testing output into a single file to reduce duplicates and prevent errors when editing. This will be used in upcoming patches in shell tests. Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com> Signed-off-by: Michael Petlan <mpetlan@redhat.com> Cc: kjain@linux.ibm.com Cc: atrajeev@linux.vnet.ibm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240215110231.15385-2-mpetlan@redhat.com	2024-02-16 11:48:18 -08:00
Adrian Hunter	6f04d664a9	perf test: Enable Symbols test to work with a current module dso The test needs a struct machine and creates one for the current host, but a side-effect is that struct machine has set up kernel maps including module maps. If the 'Symbols' test --dso option specifies a current kernel module, it will already be present as a kernel dso, and a map with kmaps needs to be used otherwise there will be a segfault - see below. For that case, find the existing map and use that. In that case also, the dso is split by section into multiple dsos, so test those dsos also. That in turn, shows up that those dsos have not had overlapping symbols removed, so the test fails. Example: Before: $ perf test -F -v Symbols --dso /lib/modules/$(uname -r)/kernel/arch/x86/kvm/kvm-intel.ko 70: Symbols : --- start --- Testing /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko Segmentation fault (core dumped) After: $ perf test -F -v Symbols --dso /lib/modules/$(uname -r)/kernel/arch/x86/kvm/kvm-intel.ko 70: Symbols : --- start --- Testing /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko Overlapping symbols: 41d30-41fbb l vmx_init 41d30-41fbb g init_module ---- end ---- Symbols: FAILED! Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240131192416.16387-1-adrian.hunter@intel.com	2024-02-16 11:44:04 -08:00
Leo Yan	81901fc064	perf build: Cleanup perf register configuration The target is to allow the tool to always enable the perf register feature for native parsing and cross parsing, and current code doesn't depend on the macro 'HAVE_PERF_REGS_SUPPORT'. This patch removes the variable 'NO_PERF_REGS' and the defined macro 'HAVE_PERF_REGS_SUPPORT' from the Makefile. Signed-off-by: Leo Yan <leo.yan@linux.dev> Reviewed-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Ming Wang <wangming01@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: linux-csky@vger.kernel.org Cc: linux-riscv@lists.infradead.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214113947.240957-5-leo.yan@linux.dev	2024-02-15 13:48:55 -08:00
Leo Yan	9a4e47ef98	perf parse-regs: Introduce a weak function arch__sample_reg_masks() Every architecture can provide a register list for sampling. If an architecture doesn't support register sampling, it won't define the data structure 'sample_reg_masks'. Consequently, any code using this structure must be protected by the macro 'HAVE_PERF_REGS_SUPPORT'. This patch defines a weak function, arch__sample_reg_masks(), which will be replaced by an architecture-defined function for returning the architecture's register list. With this refactoring, the function always exists, the condition checking for 'HAVE_PERF_REGS_SUPPORT' is not needed anymore, so remove it. Signed-off-by: Leo Yan <leo.yan@linux.dev> Reviewed-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Ming Wang <wangming01@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: linux-csky@vger.kernel.org Cc: linux-riscv@lists.infradead.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214113947.240957-4-leo.yan@linux.dev	2024-02-15 13:48:36 -08:00
Leo Yan	ec87c99de4	perf parse-regs: Always build perf register functions Currently, the macro HAVE_PERF_REGS_SUPPORT is used as a switch to turn on or turn off the code of perf registers. If any architecture cannot support perf register, it disables the perf register parsing, for both the native parsing and cross parsing for other architectures. To support both the native parsing and cross parsing, the tool should always build the perf regs functions. Thus, this patch removes HAVE_PERF_REGS_SUPPORT from the perf regs files. Signed-off-by: Leo Yan <leo.yan@linux.dev> Reviewed-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Ming Wang <wangming01@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: linux-csky@vger.kernel.org Cc: linux-riscv@lists.infradead.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214113947.240957-3-leo.yan@linux.dev	2024-02-15 13:48:20 -08:00
Leo Yan	fca6af7be2	perf build: Remove unused CONFIG_PERF_REGS CONFIG_PERF_REGS is not used, remove it. Signed-off-by: Leo Yan <leo.yan@linux.dev> Reviewed-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Ming Wang <wangming01@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: linux-csky@vger.kernel.org Cc: linux-riscv@lists.infradead.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240214113947.240957-2-leo.yan@linux.dev	2024-02-15 13:47:36 -08:00
Ian Rogers	6d6be5eb45	perf metric: Don't remove scale from counts Counts were switched from the scaled saved value form to the aggregated count to avoid double accounting. When this happened the removing of scaling for a count should have been removed, however, it wasn't and this wasn't observed as it normally doesn't matter because a counter's scale is 1. A problem was observed with RAPL events that are scaled. Fixes: `37cc8ad77c` ("perf metric: Directly use counts rather than saved_value") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Kaige Ye <ye@kaige.org> Cc: John Garry <john.g.garry@oracle.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240209204947.3873294-5-irogers@google.com	2024-02-13 13:48:09 -08:00
Ian Rogers	2543947c77	perf stat: Avoid metric-only segv Cycles is recognized as part of a hard coded metric in stat-shadow.c, it may call print_metric_only with a NULL fmt string leading to a segfault. Handle the NULL fmt explicitly. Fixes: `088519f318` ("perf stat: Move the display functions to stat-display.c") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Kaige Ye <ye@kaige.org> Cc: John Garry <john.g.garry@oracle.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240209204947.3873294-4-irogers@google.com	2024-02-13 13:48:09 -08:00
Ian Rogers	6dd76680b9	perf expr: Fix "has_event" function for metric style events Events in metrics cannot use '/' as a separator, it would be recognized as a divide, so they use '@'. The '@' is recognized in the metricgroups code and changed to '/', do the same in the has_event function so that the parsing is only tried without the @s. Fixes: `4a4a9bf907` ("perf expr: Add has_event function") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Kaige Ye <ye@kaige.org> Cc: John Garry <john.g.garry@oracle.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240209204947.3873294-3-irogers@google.com	2024-02-13 13:48:06 -08:00
Ian Rogers	4ea7d94407	perf expr: Allow NaN to be a valid number Currently only floating point numbers can be parsed, add a special case for NaN. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Kaige Ye <ye@kaige.org> Cc: John Garry <john.g.garry@oracle.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240209204947.3873294-2-irogers@google.com	2024-02-13 13:47:08 -08:00
Ian Rogers	923e4616ec	perf maps: Locking tidy up of nr_maps After this change maps__nr_maps is only used by tests, existing users are migrated to maps__empty. Compute maps__empty under the read lock. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Artem Savkov <asavkov@redhat.com> Cc: bpf@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240210031746.4057262-7-irogers@google.com	2024-02-12 12:35:41 -08:00
Ian Rogers	ff0bd79980	perf maps: Hide maps internals Move the struct into the C file. Add maps__equal to work around exposing the struct for reference count checking. Add accessors for the unwind_libunwind_ops. Move maps_list_node to its only use in symbol.c. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Artem Savkov <asavkov@redhat.com> Cc: bpf@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240210031746.4057262-6-irogers@google.com	2024-02-12 12:35:41 -08:00
Ian Rogers	39a27325e6	perf maps: Get map before returning in maps__find_next_entry Finding a map is done under a lock, returning the map without a reference count means it can be removed without notice and causing uses after free. Grab a reference count to the map within the lock region and return this. Fix up locations that need a map__put following this. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Artem Savkov <asavkov@redhat.com> Cc: bpf@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240210031746.4057262-5-irogers@google.com	2024-02-12 12:35:41 -08:00
Ian Rogers	107ef66cb0	perf maps: Get map before returning in maps__find_by_name Finding a map is done under a lock, returning the map without a reference count means it can be removed without notice and causing uses after free. Grab a reference count to the map within the lock region and return this. Fix up locations that need a map__put following this. Also fix some reference counted pointer comparisons. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Artem Savkov <asavkov@redhat.com> Cc: bpf@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240210031746.4057262-4-irogers@google.com	2024-02-12 12:35:33 -08:00
Ian Rogers	42fd623b58	perf maps: Get map before returning in maps__find Finding a map is done under a lock, returning the map without a reference count means it can be removed without notice and causing uses after free. Grab a reference count to the map within the lock region and return this. Fix up locations that need a map__put following this. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Artem Savkov <asavkov@redhat.com> Cc: bpf@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240210031746.4057262-3-irogers@google.com	2024-02-12 12:35:26 -08:00
Ian Rogers	659ad3492b	perf maps: Switch from rbtree to lazily sorted array for addresses Maps is a collection of maps primarily sorted by the starting address of the map. Prior to this change the maps were held in an rbtree requiring 4 pointers per node. Prior to reference count checking, the rbnode was embedded in the map so 3 pointers per node were necessary. This change switches the rbtree to an array lazily sorted by address, much as the array sorting nodes by name. 1 pointer is needed per node, but to avoid excessive resizing the backing array may be twice the number of used elements. Meaning the memory overhead is roughly half that of the rbtree. For a perf record with "--no-bpf-event -g -a" of true, the memory overhead of perf inject is reduced from 3.3MB to 3MB, so 10% or 300KB is saved. Map inserts always happen at the end of the array. The code tracks whether the insertion violates the sorting property. O(log n) rb-tree complexity is switched to O(1). Remove slides the array, so O(log n) rb-tree complexity is degraded to O(n). A find may need to sort the array using qsort which is O(n*log n), but in general the maps should be sorted and so average performance should be O(log n) as with the rbtree. An rbtree node consumes a cache line, but with the array 4 nodes fit on a cache line. Iteration is simplified to scanning an array rather than pointer chasing. Overall it is expected the performance after the change should be comparable to before, but with half of the memory consumed. To avoid a list and repeated logic around splitting maps, maps__merge_in is rewritten in terms of maps__fixup_overlap_and_insert. maps__merge_in splits the given mapping inserting remaining gaps. maps__fixup_overlap_and_insert splits the existing mappings, then adds the incoming mapping. By adding the new mapping first, then re-inserting the existing mappings the splitting behavior matches.
Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: James Clark <james.clark@arm.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Alexey Dobriyan <adobriyan@gmail.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Song Liu <song@kernel.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Artem Savkov <asavkov@redhat.com> Cc: bpf@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240210031746.4057262-2-irogers@google.com	2024-02-12 12:35:14 -08:00
Namhyung Kim	39d14c0dd6	Merge branch 'perf-tools' into perf-tools-next To get some fixes in the perf test and JSON metrics into the development branch. Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2024-02-12 12:19:21 -08:00

1248997 Commits