Commit Graph

1215158 Commits

Author SHA1 Message Date
Zhang Rui
24d16bec37 tools/power/turbostat: Adjust cstate for is_skx()/is_icx()/is_spr() models
Disable CC3/CC7/PC3/PC7 for is_skx()/is_icx()/is_spr() models.

Delete is_skx() CPU model check.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
8e20ced057 tools/power/turbostat: Adjust cstate for is_dnv() models
Enable CC1 and disable CC3/CC7/PC3/PC7 for is_dnv() models.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
3d982ac0da tools/power/turbostat: Adjust cstate for is_jvl() models
Disable CC3/CC7/PC2/PC3/PC6/PC7 for is_jvl() models.

Delete is_jvl() CPU model check.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
ff20614955 tools/power/turbostat: Adjust cstate for has_slv_msrs() models
Disable PC2/PC3/PC7 and enable PC6 for has_slv_msrs() models.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
192cbf0468 tools/power/turbostat: Adjust cstate for has_snb_msrs() models
Enable PC7 for has_snb_msrs() models.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
6f1935c036 tools/power/turbostat: Adjust cstate for models with .cst_limit set
Enable PC3/PC6 for platforms with .cst_limit set because package cstates
are guarded by pkg_cstate_limit.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
942c854d8d tools/power/turbostat: Adjust cstate for has_snb_msrs() models
Enable CC7 and PC2 for has_snb_msrs() models.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
ce7ddf8af2 tools/power/turbostat: Adjust cstate for models with .has_nhm_msrs set
Enable CC1/CC3/CC6 for platforms with .has_nhm_msrs set.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
3c6a17b8ae tools/power/turbostat: Add skeleton support for cstate enumeration
Add skeleton support for cstate enumeration.

Note that the previous logic may override the cstate setting for
multiple times for different reasons. The conversion to new cstate
enumeration must be done step by step following the previous code
order strictly.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
485a017c45 tools/power/turbostat: Abstract TSC tweak support
On some models, the CPU base frequency is different from the TSC
frequency, and the aperf/mperf counters are running at CPU base
frequency instead of TSC frequency.

Abstract support for TSC tweak.

Given that tsc_tweak depends on base_hz, move the code to probe_bclk()
after base_hz is available.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
bf1ad57c3f tools/power/turbostat: Remove unused family/model parameters for RAPL functions
RAPL probing can be done without family/model checking. Remove these
parameters in rapl probe functions.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
7c60409382 tools/power/turbostat: Abstract hardcoded TDP value
Different hardcoded TDP values are used when TDP can not be retrieved
from the hardware.

Abstract hardcoded TDP value.

Delete CPU model checks in get_tdp_intel().

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
9e6f35159c tools/power/turbostat: Abstract fixed DRAM Energy unit support
Abstract the support for fixed Dram domain energy unit.

Delete rapl_dram_energy_units_probe() CPU model check.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
6d35b8c4a6 tools/power/turbostat: Abstract RAPL divisor support
INTEL_FAM6_ATOM_SILVERMONT model needs a divisor to convert the raw
Energy Units value from MSR_RAPL_POWER_UNIT.

Abstract the support for RAPL divisor.

Delete CPU model check in rapl_probe_intel().

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
e338831b14 tools/power/turbostat: Abstract Per Core RAPL support
Abstract the support for Per Core RAPL.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
86ba263d9b tools/power/turbostat: Abstract RAPL MSRs support
Abstract the support for RAPL MSRs.

Delete CPU model checks in rapl_probe_intel().

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
a98f886035 tools/power/turbostat: Simplify the logic for RAPL enumeration
The support for each RAPL domains, as well as the support for the perf
status of each RAPL domains, can be detected by checking the
availabilities of the corresponding RAPL MSRs.

Change the code accordingly and remove the hardcoded logic for each
model.

Note that this also fixes the INTEL_FAM6_ATOM_TREMONT model, which has
RAPL_PKG_PERF_STATUS and MSR_DRAM_PERF_STATUS but doesn't have BIC_PKG__
and BIC_RAM__ set.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
b9cd66833d tools/power/turbostat: Redefine RAPL macros
Redefine RAPL macros to make the code more readable.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:19 +08:00
Zhang Rui
a5d1ab93a0 tools/power/turbostat: Abstract hardcoded Crystal Clock frequency
Abstract the support for hardcoded Crystal Clock frequency, which is
used when crystal clock is not available from CPUID.15.

Delete CPU model checks in process_cpuid().

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
d90120bf9f tools/power/turbostat: Abstract Automatic Cstate Conversion support
Abstract the support for AUTOMATIC_CSTATE_CONVERSION bit in
MSR_PKG_CST_CONFIG_CONTROL.

Delete automatic_cstate_conversion_probe() CPU model check.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
0c057cf7a0 tools/power/turbostat: Abstract Perf Limit Reasons MSRs support
Abstract the support for MSR_CORE/GFX/RING_PERF_LIMIT_REASONS MSRs.

Delete perf_limit_reasons_probe() CPU model check.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
d8e1623baa tools/power/turbostat: Abstract TCC Offset bits support
Abstract the support for different TCC Offset bits in
MSR_IA32_TEMPERATURE_TARGET.

Delete check_tcc_offset() CPU model check.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
a61c9cb478 tools/power/turbostat: Abstract Config TDP MSRs support
Abstract the support for MSR_CONFIG_TDP_NOMINAL/LEVEL_1/LEVEL_2/CONTROL
and MSR_TURBO_ACTIVATION_RATIO.

Delete has_config_tdp() CPU model check.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
a3943deaf9 tools/power/turbostat: Rename some TRL functions
Rename dump_hsw_turbo_ratio_limits() and dump_ivt_turbo_ratio_limits()
to dump_turbo_ratio_limit2() and dump_turbo_ratio_limit1() because they
dump MSR_TURBO_RATIO_LIMIT1/LIMIT2, and the MSRs' behavior is
consistent when they are available.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
10d85d85ab tools/power/turbostat: Abstract Turbo Ratio Limit MSRs support
Abstract the support for MSR_TURBO_RATIO_LIMIT, MSR_TRUBO_RATIO_LIMIT1,
MSR_TURBO_RATIO_LIMIT2, MSR_SECONDARY_TURBO_RATIO_LIMIT,
MSR_ATOM_CORE_RATIOS and MSR_ATOM_CORE_TURBO_RATIOS.

Delete has_turbo_ratio_group_limits(), has_turbo_ratio_limit(),
has_atom_turbo_ratio_limit(), has_ivt_turbo_ratio_limit(),
has_hsw_turbo_ratio_limit(), has_knl_turbo_ratio_limit() and
has_glm_turbo_ratio_limit() CPU model checks.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
8b7199c085 tools/power/turbostat: Rename some functions
Rename dump_nhm_platform_info() and dump_nhm_cst_cfg() to
dump_platform_info() and dump_cst_cfg() because these MSRs' behavior is
consistent when they're available.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
c2c25e85df tools/power/turbostat: Remove a redundant check
Platforms with has_msr_misc_pwr_mgmt set is a subset of platforms with
has_nhm_msrs set.

Thus remove the redudant check for platform->has_nhm_msrs.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
fcfa1ce074 tools/power/turbostat: Abstract Nehalem MSRs support
MSR_PLATFORM_INFO, MSR_IA32_TEMPERATURE_TARGET, MSR_SMI_COUNT,
MSR_PKG_CST_CONFIG_CONTROL, and the TRL MSRs are always available for
platforms since Nehalem. Support for these msrs can be described
altogether.

Abstract the support for these MSRs.

Delete probe_nhm_msrs() CPU model check.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
3989fc8907 tools/power/turbostat: Abstract Package cstate limit decoding support
Abstract the support for decoding package cstate limit from
MSR_PKG_CST_CONFIG_CONTROL.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
71e841293c tools/power/turbostat: Abstract BCLK frequency support
Abstract CPU base clock frequency support.

Note that bclk is used by
1. calculate base_hz using MSR_PLATFORM_INFO, which is guarded by
   probe_nhm_msrs().
2. dump MSR_PLATFORM_INFO and Turbo Ratio Limit MSRs, which are also
   guarded by probe_nhm_msrs().
Thus probe_bclk() works for probe_nhm_msrs() models only.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
3dd0e7547d tools/power/turbostat: Abstract MSR_MISC_PWR_MGMT support
Abstract MSR_MISC_PWR_MGMT support.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
778fc34a7a tools/power/turbostat: Abstract MSR_MISC_FEATURE_CONTROL support
Abstract MSR_MISC_FEATURE_CONTROL support.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
45232ab168 tools/power/turbostat: Add skeleton support for table driven feature enumeration
Turbostat supports a series of features that may diverge among different
CPU models.

Current code uses various of CPU model checks in different places to
handle this, which makes the code hard to maintain.

Add skeleton support for table driven feature enumeration to replace the
current error-prone CPU model checks and global variables.

Note: by comparing the CPU models with intel-family.h, it is found that
turbostat support for below four Models are missing, including
INTEL_FAM6_ICELAKE, INTEL_FAM6_ATOM_SILVERMONT_MID,
INTEL_FAM6_ATOM_AIRMONT_MID and INTEL_FAM6_ATOM_AIRMONT_NP. Adding
support for these models is a different work, thus it is not covered in
this patch set.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
48674c1bb6 tools/power/turbostat: Remove pseudo check for two models
INTEL_FAM6_ATOM_SILVERMONT_MID/INTEL_FAM6_ATOM_AIRMONT_MID are not
listed in probe_nhm_msrs(). This means that most of the turbostat
features are not available on these two platforms.

Further more, checking for these two models in has_slv_msrs() is
dead code. Because has_slv_msrs() is called by the code guarded by
probe_nhm_msrs().

For these two reasons, remove pseudo check for
INTEL_FAM6_ATOM_SILVERMONT_MID and INTEL_FAM6_ATOM_AIRMONT_MID.

Will add back the support when we can access these two platforms.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
bbfc33b1e4 tools/power/turbostat: Remove redundant duplicates
Remove redundant duplicates in intel_model_duplicates().

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
6d306d6ec7 tools/power/turbostat: Replace raw value cpu model with Macro
Kernel already has
 #define INTEL_FAM6_NEHALEM_G	0x1F /* Auburndale / Havendale */

Use standard Macro for CPU Model instead of raw value.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
2c019d6579 tools/power/turbostat: Support alternative graphics sysfs knobs
/sys/class/graphics/fb0/device/drm/card0/ and /sys/class/drm/card0/
point to the same device node.
But in some cases, one exists and the other one does not.

Prefer to use /sys/class/drm/card0/, and fall back to
/sys/class/graphics/fb0/device/drm/card0/.

This recovers the "GFXMHz" and "GFXAMHz" columns on some platforms like
a SPR server.

Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
b98a6d7876 tools/power/turbostat: Enable TCC Offset on more models
All Models that duplicate INTEL_FAM6_CANNONLAKE_L support TCC Offset.
Enable this feature on all these models.

Delete obsolete model_orig.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Chen Yu
b61b7d8c4c tools/power/turbostat: Enable the C-state Pre-wake printing
Currently the C-state Pre-wake will not be printed due to the
probe has not been invoked. Invoke the probe function accordingly.

Fixes: aeb01e6d71 ("tools/power turbostat: Print the C-state Pre-wake settings")
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
137f01b352 tools/power/turbostat: Fix a knl bug
MSR_KNL_CORE_C6_RESIDENCY should be evaluated only if
1. this is KNL platform
AND
2. need to get C6 residency or need to calculate C1 residency

Fix the broken logic introduced by commit 1e9042b9c8 ("tools/power
turbostat: Fix CPU%C1 display value").

Fixes: 1e9042b9c8 ("tools/power turbostat: Fix CPU%C1 display value")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:18 +08:00
Zhang Rui
4d1827485a tools/power/turbostat: Fix failure with new uncore sysfs
On some platforms, turbostat fails during launch time like below,

turbostat version 2023.03.17 - Len Brown <lenb@kernel.org>
...
cpu40: MSR_IA32_PACKAGE_THERM_STATUS: 0x884c0000 (24 C)
cpu40: MSR_IA32_PACKAGE_THERM_INTERRUPT: 0x00000003 (100 C, 100 C)
turbostat: snapshot_sysfs_counter(/sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00/current_freq_khz): No data available

This is because new uncore sysfs is used on these platforms as
introduced by commit 9b8dea80e3 ("platform/x86/intel-uncore-freq:
Support for cluster level controls").

With the new uncore sysfs interface,
/sys/devices/system/cpu/intel_uncore_frequency/package_00_die_00/current_freq_khz
is still available, but reading it fails.

How to support the fabric cluster level uncore sysfs is not settled yet,
as a short term fix, clear the BIC_UNCORE_MHZ bit when new sysfs I/F is
detected.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
2023-09-27 22:14:17 +08:00
Linus Torvalds
0bb80ecc33 Linux 6.6-rc1 v6.6-rc1 2023-09-10 16:28:41 -07:00
Linus Torvalds
1548b060d6 Merge tag 'topic/drm-ci-2023-08-31-1' of git://anongit.freedesktop.org/drm/drm
Pull drm ci scripts from Dave Airlie:
 "This is a bunch of ci integration for the freedesktop gitlab instance
  where we currently do upstream userspace testing on diverse sets of
  GPU hardware. From my perspective I think it's an experiment worth
  going with and seeing how the benefits/noise playout keeping these
  files useful.

  Ideally I'd like to get this so we can do pre-merge testing on PRs
  eventually.

  Below is some info from danvet on why we've ended up making the
  decision and how we can roll it back if we decide it was a bad plan.

  Why in upstream?

   - like documentation, testcases, tools CI integration is one of these
     things where you can waste endless amounts of time if you
     accidentally have a version that doesn't match your source code

   - but also like the above, there's a balance, this is the initial cut
     of what we think makes sense to keep in sync vs out-of-tree,
     probably needs adjustment

   - gitlab supports out-of-repo gitlab integration and that's what's
     been used for the kernel in drm, but it results in per-driver
     fragmentation and lots of duplicated effort. the simple act of
     smashing an arbitrary winner into a topic branch already started
     surfacing patches on dri-devel and sparking good cross driver team
     discussions

  Why gitlab?

   - it's not any more shit than any of the other CI

   - drm userspace uses it extensively for everything in userspace, we
     have a lot of people and experience with this, including
     integration of hw testing labs

   - media userspace like gstreamer is also on gitlab.fd.o, and there's
     discussion to extend this to the media subsystem in some fashion

  Can this be shared?

   - there's definitely a pile of code that could move to scripts/ if
     other subsystem adopt ci integration in upstream kernel git. other
     bits are more drm/gpu specific like the igt-gpu-tests/tools
     integration

   - docker images can be run locally or in other CI runners

  Will we regret this?

   - it's all in one directory, intentionally, for easy deletion

   - probably 1-2 years in upstream to see whether this is worth it or a
     Big Mistake. that's roughly what it took to _really_ roll out solid
     CI in the bigger userspace projects we have on gitlab.fd.o like
     mesa3d"

* tag 'topic/drm-ci-2023-08-31-1' of git://anongit.freedesktop.org/drm/drm:
  drm: ci: docs: fix build warning - add missing escape
  drm: Add initial ci/ subdirectory
2023-09-10 11:55:26 -07:00
Linus Torvalds
e56b2b6057 Merge tag 'x86-urgent-2023-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Fix preemption delays in the SGX code, remove unnecessarily
  UAPI-exported code, fix a ld.lld linker (in)compatibility quirk and
  make the x86 SMP init code a bit more conservative to fix kexec()
  lockups"

* tag 'x86-urgent-2023-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/sgx: Break up long non-preemptible delays in sgx_vepc_release()
  x86: Remove the arch_calc_vm_prot_bits() macro from the UAPI
  x86/build: Fix linker fill bytes quirk/incompatibility for ld.lld
  x86/smp: Don't send INIT to non-present and non-booted CPUs
2023-09-10 10:39:31 -07:00
Linus Torvalds
e79dbf03d8 Merge tag 'perf-urgent-2023-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf event fix from Ingo Molnar:
 "Work around a firmware bug in the uncore PMU driver, affecting certain
  Intel systems"

* tag 'perf-urgent-2023-09-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/uncore: Correct the number of CHAs on EMR
2023-09-10 10:34:46 -07:00
Linus Torvalds
535a265d7f Merge tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
 "perf tools maintainership:

   - Add git information for perf-tools and perf-tools-next trees and
     branches to the MAINTAINERS file. That is where development now
     takes place and myself and Namhyung Kim have write access, more
     people to come as we emulate other maintainer groups.

  perf record:

   - Record kernel data maps when 'perf record --data' is used, so that
     global variables can be resolved and used in tools that do data
     profiling.

  perf trace:

   - Remove the old, experimental support for BPF events in which a .c
     file was passed as an event: "perf trace -e hello.c" to then get
     compiled and loaded.

     The only known usage for that, that shipped with the kernel as an
     example for such events, augmented the raw_syscalls tracepoints and
     was converted to a libbpf skeleton, reusing all the user space
     components and the BPF code connected to the syscalls.

     In the end just the way to glue the BPF part and the user space
     type beautifiers changed, now being performed by libbpf skeletons.

     The next step is to use BTF to do pretty printing of all syscall
     types, as discussed with Alan Maguire and others.

     Now, on a perf built with BUILD_BPF_SKEL=1 we get most if not all
     path/filenames/strings, some of the networking data structures,
     perf_event_attr, etc, i.e. systemwide tracing of nanosleep calls
     and perf_event_open syscalls while 'perf stat' runs 'sleep' for 5
     seconds:

      # perf trace -a -e *nanosleep,perf* perf stat -e cycles,instructions sleep 5
         0.000 (   9.034 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
         9.039 (   0.006 ms): perf/327641 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 327642 (perf-exec), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
             ? (           ): gpm/991  ... [continued]: clock_nanosleep())               = 0
        10.133 (           ): sleep/327642 clock_nanosleep(rqtp: { .tv_sec: 5, .tv_nsec: 0 }, rmtp: 0x7ffd36f83ed0) ...
             ? (           ): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
        30.276 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
       223.215 (1000.430 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
        30.276 (2000.394 ms): gpm/991  ... [continued]: clock_nanosleep())               = 0
      1230.814 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
      1230.814 (1000.404 ms): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
      2030.886 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
      2237.709 (1000.153 ms): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) = 0
             ? (           ): crond/1172  ... [continued]: clock_nanosleep())            = 0
      3242.699 (           ): pool-gsd-smart/3051 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f6e7fffec90) ...
      2030.886 (2000.385 ms): gpm/991  ... [continued]: clock_nanosleep())               = 0
      3728.078 (           ): crond/1172 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffe0971dcf0) ...
      3242.699 (1000.158 ms): pool-gsd-smart/3051  ... [continued]: clock_nanosleep())   = 0
      4031.409 (           ): gpm/991 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffcc6f73710) ...
        10.133 (5000.375 ms): sleep/327642  ... [continued]: clock_nanosleep())          = 0

      Performance counter stats for 'sleep 5':

             2,617,347      cycles
             1,855,997      instructions                     #    0.71  insn per cycle

           5.002282128 seconds time elapsed

           0.000855000 seconds user
           0.000852000 seconds sys

  perf annotate:

   - Building with binutils' libopcode now is opt-in (BUILD_NONDISTRO=1)
     for licensing reasons, and we missed a build test on
     tools/perf/tests makefile.

     Since we now default to NDEBUG=1, we ended up segfaulting when
     building with BUILD_NONDISTRO=1 because a needed initialization
     routine was being "error checked" via an assert.

     Fix it by explicitly checking the result and aborting instead if it
     fails.

     We better back propagate the error, but at least 'perf annotate' on
     samples collected for a BPF program is back working when perf is
     built with BUILD_NONDISTRO=1.

  perf report/top:

   - Add back TUI hierarchy mode header, that is seen when using 'perf
     report/top --hierarchy'.

   - Fix the number of entries for 'e' key in the TUI that was
     preventing navigation of lines when expanding an entry.

  perf report/script:

   - Support cross platform register handling, allowing a perf.data file
     collected on one architecture to have registers sampled correctly
     displayed when analysis tools such as 'perf report' and 'perf
     script' are used on a different architecture.

   - Fix handling of event attributes in pipe mode, i.e. when one uses:

  	perf record -o - | perf report -i -

     When no perf.data files are used.

   - Handle files generated via pipe mode with a version of perf and
     then read also via pipe mode with a different version of perf,
     where the event attr record may have changed, use the record size
     field to properly support this version mismatch.

  perf probe:

   - Accessing global variables from uprobes isn't supported, make the
     error message state that instead of stating that some minimal
     kernel version is needed to have that feature. This seems just a
     tool limitation, the kernel probably has all that is needed.

  perf tests:

   - Fix a reference count related leak in the dlfilter v0 API where the
     result of a thread__find_symbol_fb() is not matched with an
     addr_location__exit() to drop the reference counts of the resolved
     components (machine, thread, map, symbol, etc). Add a dlfilter test
     to make sure that doesn't regresses.

   - Lots of fixes for the 'perf test' written in shell script related
     to problems found with the shellcheck utility.

   - Fixes for 'perf test' shell scripts testing features enabled when
     perf is built with BUILD_BPF_SKEL=1, such as 'perf stat' bpf
     counters.

   - Add perf record sample filtering test, things like the following
     example, that gets implemented as a BPF filter attached to the
     event:

       # perf record -e task-clock -c 10000 --filter 'ip < 0xffffffff00000000'

   - Improve the way the task_analyzer test checks if libtraceevent is
     linked, using 'perf version --build-options' instead of the more
     expensinve 'perf record -e "sched:sched_switch"'.

   - Add support for riscv in the mmap-basic test. (This went as well
     via the RiscV tree, same contents).

  libperf:

   - Implement riscv mmap support (This went as well via the RiscV tree,
     same contents).

  perf script:

   - New tool that converts perf.data files to the firefox profiler
     format so that one can use the visualizer at
     https://profiler.firefox.com/. Done by Anup Sharma as part of this
     year's Google Summer of Code.

     One can generate the output and upload it to the web interface but
     Anup also automated everything:

       perf script gecko -F 99 -a sleep 60

   - Support syscall name parsing on arm64.

   - Print "cgroup" field on the same line as "comm".

  perf bench:

   - Add new 'uprobe' benchmark to measure the overhead of uprobes
     with/without BPF programs attached to it.

   - breakpoints are not available on power9, skip that test.

  perf stat:

   - Add #num_cpus_online literal to be used in 'perf stat' metrics, and
     add this extra 'perf test' check that exemplifies its purpose:

  	TEST_ASSERT_VAL("#num_cpus_online",
                         expr__parse(&num_cpus_online, ctx, "#num_cpus_online") == 0);
  	TEST_ASSERT_VAL("#num_cpus", expr__parse(&num_cpus, ctx, "#num_cpus") == 0);
  	TEST_ASSERT_VAL("#num_cpus >= #num_cpus_online", num_cpus >= num_cpus_online);

  Miscellaneous:

   - Improve tool startup time by lazily reading PMU, JSON, sysfs data.

   - Improve error reporting in the parsing of events, passing YYLTYPE
     to error routines, so that the output can show were the parsing
     error was found.

   - Add 'perf test' entries to check the parsing of events
     improvements.

   - Fix various leak for things detected by -fsanitize=address, mostly
     things that would be freed at tool exit, including:

       - Free evsel->filter on the destructor.

       - Allow tools to register a thread->priv destructor and use it in
         'perf trace'.

       - Free evsel->priv in 'perf trace'.

       - Free string returned by synthesize_perf_probe_point() when the
         caller fails to do all it needs.

   - Adjust various compiler options to not consider errors some
     warnings when building with broken headers found in things like
     python, flex, bison, as we otherwise build with -Werror. Some for
     gcc, some for clang, some for some specific version of those, some
     for some specific version of flex or bison, or some specific
     combination of these components, bah.

   - Allow customization of clang options for BPF target, this helps
     building on gentoo where there are other oddities where BPF targets
     gets passed some compiler options intended for the native build, so
     building with WERROR=0 helps while these oddities are fixed.

   - Dont pass ERR_PTR() values to perf_session__delete() in 'perf top'
     and 'perf lock', fixing some segfaults when handling some odd
     failures.

   - Add LTO build option.

   - Fix format of unordered lists in the perf docs
     (tools/perf/Documentation)

   - Overhaul the bison files, using constructs such as YYNOMEM.

   - Remove unused tokens from the bison .y files.

   - Add more comments to various structs.

   - A few LoongArch enablement patches.

  Vendor events (JSON):

   - Add JSON metrics for Yitian 710 DDR (aarch64). Things like:

  	EventName, BriefDescription
  	visible_window_limit_reached_rd, "At least one entry in read queue reaches the visible window limit.",
  	visible_window_limit_reached_wr, "At least one entry in write queue reaches the visible window limit.",
  	op_is_dqsosc_mpc	       , "A DQS Oscillator MPC command to DRAM.",
  	op_is_dqsosc_mrr	       , "A DQS Oscillator MRR command to DRAM.",
  	op_is_tcr_mrr		       , "A Temperature Compensated Refresh(TCR) MRR command to DRAM.",

   - Add AmpereOne metrics (aarch64).

   - Update N2 and V2 metrics (aarch64) and events using Arm telemetry
     repo.

   - Update scale units and descriptions of common topdown metrics on
     aarch64. Things like:
       - "MetricExpr": "stall_slot_frontend / (#slots * cpu_cycles)",
       - "BriefDescription": "Frontend bound L1 topdown metric",
       + "MetricExpr": "100 * (stall_slot_frontend / (#slots * cpu_cycles))",
       + "BriefDescription": "This metric is the percentage of total slots that were stalled due to resource constraints in the frontend of the processor.",

   - Update events for intel: meteorlake to 1.04, sapphirerapids to
     1.15, Icelake+ metric constraints.

   - Update files for the power10 platform"

* tag 'perf-tools-for-v6.6-1-2023-09-05' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (217 commits)
  perf parse-events: Fix driver config term
  perf parse-events: Fixes relating to no_value terms
  perf parse-events: Fix propagation of term's no_value when cloning
  perf parse-events: Name the two term enums
  perf list: Don't print Unit for "default_core"
  perf vendor events intel: Fix modifier in tma_info_system_mem_parallel_reads for skylake
  perf dlfilter: Avoid leak in v0 API test use of resolve_address()
  perf metric: Add #num_cpus_online literal
  perf pmu: Remove str from perf_pmu_alias
  perf parse-events: Make common term list to strbuf helper
  perf parse-events: Minor help message improvements
  perf pmu: Avoid uninitialized use of alias->str
  perf jevents: Use "default_core" for events with no Unit
  perf test stat_bpf_counters_cgrp: Enhance perf stat cgroup BPF counter test
  perf test shell stat_bpf_counters: Fix test on Intel
  perf test shell record_bpf_filter: Skip 6.2 kernel
  libperf: Get rid of attr.id field
  perf tools: Convert to perf_record_header_attr_id()
  libperf: Add perf_record_header_attr_id()
  perf tools: Handle old data in PERF_RECORD_ATTR
  ...
2023-09-09 20:06:17 -07:00
Linus Torvalds
fd3a5940e6 Merge tag '6.6-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull smb client fixes from Steve French:

 - six smb3 client fixes including ones to allow controlling smb3
   directory caching timeout and limits, and one debugging improvement

 - one fix for nls Kconfig (don't need to expose NLS_UCS2_UTILS option)

 - one minor spnego registry update

* tag '6.6-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  spnego: add missing OID to oid registry
  smb3: fix minor typo in SMB2_GLOBAL_CAP_LARGE_MTU
  cifs: update internal module version number for cifs.ko
  smb3: allow controlling maximum number of cached directories
  smb3: add trace point for queryfs (statfs)
  nls: Hide new NLS_UCS2_UTILS
  smb3: allow controlling length of time directory entries are cached with dir leases
  smb: propagate error code of extract_sharename()
2023-09-09 19:56:23 -07:00
David Howells
a3c57ab79a iov_iter: Kunit tests for page extraction
Add some kunit tests for page extraction for ITER_BVEC, ITER_KVEC and
ITER_XARRAY type iterators.  ITER_UBUF and ITER_IOVEC aren't dealt with
as they require userspace VM interaction.  ITER_DISCARD isn't dealt with
either as that can't be extracted.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-09-09 15:11:49 -07:00
David Howells
2d71340ff1 iov_iter: Kunit tests for copying to/from an iterator
Add some kunit tests for page extraction for ITER_BVEC, ITER_KVEC and
ITER_XARRAY type iterators.  ITER_UBUF and ITER_IOVEC aren't dealt with
as they require userspace VM interaction.  ITER_DISCARD isn't dealt with
either as that does nothing.

Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-09-09 15:11:49 -07:00
David Howells
f741bd7178 iov_iter: Fix iov_iter_extract_pages() with zero-sized entries
iov_iter_extract_pages() doesn't correctly handle skipping over initial
zero-length entries in ITER_KVEC and ITER_BVEC-type iterators.

The problem is that it accidentally reduces maxsize to 0 when it
skipping and thus runs to the end of the array and returns 0.

Fix this by sticking the calculated size-to-copy in a new variable
rather than back in maxsize.

Fixes: 7d58fe7310 ("iov_iter: Add a function to extract a page list from an iterator")
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Hildenbrand <david@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-09-09 15:11:49 -07:00