Merge tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull performance event updates from Ingo Molnar:
 "x86 PMU driver updates:

   - Add support for the core PMU for Intel Diamond Rapids (DMR) CPUs
     (Dapeng Mi)

      Compared to previous iterations of the Intel PMU code, there have
      been a lot of changes, which center around three main areas:

       - Introduce the Off-Module Response (OMR) facility to replace the
         Off-Core Response (OCR) facility

      - New PEBS data source encoding layout

      - Support the new "RDPMC user disable" feature

   - Likewise, a large series adds uncore PMU support for Intel Diamond
     Rapids (DMR) CPUs (Zide Chen)

     This centers around these four main areas:

      - DMR may have two Integrated I/O and Memory Hub (IMH) dies,
        separate from the compute tile (CBB) dies. Each CBB and each IMH
        die has its own discovery domain.

      - Unlike prior CPUs that retrieve the global discovery table
        portal exclusively via PCI or MSR, DMR uses PCI for IMH PMON
        discovery and MSR for CBB PMON discovery.

      - DMR introduces several new PMON types: SCA, HAMVF, D2D_ULA, UBR,
        PCIE4, CRS, CPC, ITC, OTC, CMS, and PCIE6.

      - IIO free-running counters in DMR are MMIO-based, unlike SPR.

    - Also add missing PMON units for Intel Panther Lake, and add uncore
      PMU support for Nova Lake (NVL), which largely maps to Panther Lake
      (Zide Chen)

   - KVM integration: Add support for mediated vPMUs (by Kan Liang and
     Sean Christopherson, with fixes and cleanups by Peter Zijlstra,
     Sandipan Das and Mingwei Zhang)

   - Add Intel cstate driver support for Wildcat Lake (WCL) CPUs,
     which are a low-power variant of Panther Lake (Zide Chen)

   - Add core, cstate and MSR PMU support for the Intel Airmont NP CPU
     (aka MaxLinear Lightning Mountain), which maps to the existing
     Airmont code (Martin Schiller)

  Performance enhancements:

   - Speed up kexec shutdown by avoiding unnecessary cross CPU calls
     (Jan H. Schönherr)

   - Fix slow perf_event_task_exit() with LBR callstacks (Namhyung Kim)

  User-space stack unwinding support:

   - Various cleanups and refactorings in preparation to generalize the
     unwinding code for other architectures (Jens Remus)

  Uprobes updates:

   - Transition from kmap_atomic to kmap_local_page (Keke Ming)

   - Fix incorrect lockdep condition in filter_chain() (Breno Leitao)

   - Fix XOL allocation failure for 32-bit tasks (Oleg Nesterov)

  Misc fixes and cleanups:

   - s390: Remove kvm_types.h from Kbuild (Randy Dunlap)

   - x86/intel/uncore: Convert comma to semicolon (Chen Ni)

   - x86/uncore: Clean up const mismatch (Greg Kroah-Hartman)

   - x86/ibs: Fix typo in dc_l2tlb_miss comment (Xiang-Bin Shi)"

* tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
  s390: remove kvm_types.h from Kbuild
  uprobes: Fix incorrect lockdep condition in filter_chain()
  x86/ibs: Fix typo in dc_l2tlb_miss comment
  x86/uprobes: Fix XOL allocation failure for 32-bit tasks
  perf/x86/intel/uncore: Convert comma to semicolon
  perf/x86/intel: Add support for rdpmc user disable feature
  perf/x86: Use macros to replace magic numbers in attr_rdpmc
  perf/x86/intel: Add core PMU support for Novalake
  perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL
  perf/x86/intel: Add core PMU support for DMR
  perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
  perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL
  perf/core: Fix slow perf_event_task_exit() with LBR callstacks
  perf/core: Speed up kexec shutdown by avoiding unnecessary cross CPU calls
  uprobes: use kmap_local_page() for temporary page mappings
  arm/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
  mips/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
  arm64/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
  riscv/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
  perf/x86/intel/uncore: Add Nova Lake support
  ...
Linus Torvalds
2026-02-10 12:00:46 -08:00
45 changed files with 2155 additions and 523 deletions

View File

@@ -0,0 +1,44 @@
What: /sys/bus/event_source/devices/cpu.../rdpmc
Date: November 2011
KernelVersion: 3.10
Contact: Linux kernel mailing list linux-kernel@vger.kernel.org
Description: The /sys/bus/event_source/devices/cpu.../rdpmc attribute
shows and controls whether the rdpmc instruction can be
executed in user space. This attribute supports 3 values.
- rdpmc = 0
user space rdpmc is globally disabled for all PMU
counters.
- rdpmc = 1
user space rdpmc is globally enabled only while the
event's mmap region is mapped. If the mmap region is
unmapped, user space rdpmc is disabled again.
- rdpmc = 2
user space rdpmc is globally enabled for all PMU
counters.
On Intel platforms supporting counter level's user
space rdpmc disable feature (CPUID.23H.EBX[2] = 1), the
meaning of the 3 values is extended to
- rdpmc = 0
global user space rdpmc and counter level's user space
rdpmc of all counters are both disabled.
- rdpmc = 1
No change to the behavior of global user space rdpmc.
Counter level's rdpmc of system-wide events is disabled,
but counter level's rdpmc of non-system-wide events is
enabled.
- rdpmc = 2
global user space rdpmc and counter level's user space
rdpmc of all counters are both enabled unconditionally.
The default value of rdpmc is 1.
Please note:
- the behavior of global user space rdpmc changes
immediately when the rdpmc value is changed, but a new
rdpmc value does not affect counter level's user space
rdpmc until the event is reactivated or recreated.
- The rdpmc attribute is global, even for x86 hybrid
platforms. For example, changing cpu_core/rdpmc will
also change cpu_atom/rdpmc.
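
As a reference for the rdpmc = 1 window described above, a minimal user-space sketch (not part of this series): the counter can be read with RDPMC only while the event's first mmap page stays mapped. Error handling and the perf_event_mmap_page seqlock/counter-width protocol are omitted for brevity.

#include <linux/perf_event.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <stdio.h>
#include <x86intrin.h>

int main(void)
{
	struct perf_event_attr attr = {
		.type		= PERF_TYPE_HARDWARE,
		.size		= sizeof(attr),
		.config		= PERF_COUNT_HW_INSTRUCTIONS,
		.exclude_kernel	= 1,
	};
	long psize = sysconf(_SC_PAGESIZE);
	int fd = syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
	struct perf_event_mmap_page *pc;

	pc = mmap(NULL, psize, PROT_READ, MAP_SHARED, fd, 0);

	/* pc->index is the hardware counter number + 1; 0 means RDPMC is unusable. */
	if (pc != MAP_FAILED && pc->cap_user_rdpmc && pc->index)
		printf("instructions: %lld\n",
		       (long long)(__rdpmc(pc->index - 1) + pc->offset));

	/* With rdpmc = 1, unmapping the page ends the user-space RDPMC window. */
	munmap(pc, psize);
	close(fd);
	return 0;
}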

View File

@@ -113,7 +113,7 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm,
void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
void *src, unsigned long len)
{
void *xol_page_kaddr = kmap_atomic(page);
void *xol_page_kaddr = kmap_local_page(page);
void *dst = xol_page_kaddr + (vaddr & ~PAGE_MASK);
preempt_disable();
@@ -126,7 +126,7 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
preempt_enable();
kunmap_atomic(xol_page_kaddr);
kunmap_local(xol_page_kaddr);
}

View File

@@ -15,7 +15,7 @@
void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
void *src, unsigned long len)
{
void *xol_page_kaddr = kmap_atomic(page);
void *xol_page_kaddr = kmap_local_page(page);
void *dst = xol_page_kaddr + (vaddr & ~PAGE_MASK);
/*
@@ -32,7 +32,7 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
sync_icache_aliases((unsigned long)dst, (unsigned long)dst + len);
done:
kunmap_atomic(xol_page_kaddr);
kunmap_local(xol_page_kaddr);
}
unsigned long uprobe_get_swbp_addr(struct pt_regs *regs)

View File

@@ -214,11 +214,11 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
unsigned long kaddr, kstart;
/* Initialize the slot */
kaddr = (unsigned long)kmap_atomic(page);
kaddr = (unsigned long)kmap_local_page(page);
kstart = kaddr + (vaddr & ~PAGE_MASK);
memcpy((void *)kstart, src, len);
flush_icache_range(kstart, kstart + len);
kunmap_atomic((void *)kaddr);
kunmap_local((void *)kaddr);
}
/**

View File

@@ -165,7 +165,7 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
void *src, unsigned long len)
{
/* Initialize the slot */
void *kaddr = kmap_atomic(page);
void *kaddr = kmap_local_page(page);
void *dst = kaddr + (vaddr & ~PAGE_MASK);
unsigned long start = (unsigned long)dst;
@@ -178,5 +178,5 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr,
}
flush_icache_range(start, start + len);
kunmap_atomic(kaddr);
kunmap_local(kaddr);
}
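
The conversions above all follow the same shape; a condensed sketch of the kmap_local_page() pattern (hypothetical helper, not taken from these patches) is shown below. Unlike kmap_atomic(), kmap_local_page() does not implicitly disable preemption or pagefaults, it only keeps the mapping valid on the local CPU, which is why the arm variant keeps its explicit preempt_disable() around the cache flush.

/* Hypothetical helper showing the pattern used in the conversions above. */
static void copy_to_xol_page(struct page *page, unsigned long vaddr,
			     const void *src, unsigned long len)
{
	void *kaddr = kmap_local_page(page);	/* CPU-local, nestable mapping */

	memcpy(kaddr + (vaddr & ~PAGE_MASK), src, len);
	kunmap_local(kaddr);			/* nested mappings unmap in LIFO order */
}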

View File

@@ -5,6 +5,5 @@ generated-y += syscall_table.h
generated-y += unistd_nr.h
generic-y += asm-offsets.h
generic-y += kvm_types.h
generic-y += mcs_spinlock.h
generic-y += mmzone.h

View File

@@ -114,6 +114,7 @@ static idtentry_t sysvec_table[NR_SYSTEM_VECTORS] __ro_after_init = {
SYSVEC(IRQ_WORK_VECTOR, irq_work),
SYSVEC(PERF_GUEST_MEDIATED_PMI_VECTOR, perf_guest_mediated_pmi_handler),
SYSVEC(POSTED_INTR_VECTOR, kvm_posted_intr_ipi),
SYSVEC(POSTED_INTR_WAKEUP_VECTOR, kvm_posted_intr_wakeup_ipi),
SYSVEC(POSTED_INTR_NESTED_VECTOR, kvm_posted_intr_nested_ipi),

View File

@@ -1439,6 +1439,8 @@ static int __init amd_core_pmu_init(void)
amd_pmu_global_cntr_mask = x86_pmu.cntr_mask64;
x86_get_pmu(smp_processor_id())->capabilities |= PERF_PMU_CAP_MEDIATED_VPMU;
/* Update PMC handling functions */
x86_pmu.enable_all = amd_pmu_v2_enable_all;
x86_pmu.disable_all = amd_pmu_v2_disable_all;

View File

@@ -30,6 +30,7 @@
#include <linux/device.h>
#include <linux/nospec.h>
#include <linux/static_call.h>
#include <linux/kvm_types.h>
#include <asm/apic.h>
#include <asm/stacktrace.h>
@@ -56,6 +57,8 @@ DEFINE_PER_CPU(struct cpu_hw_events, cpu_hw_events) = {
.pmu = &pmu,
};
static DEFINE_PER_CPU(bool, guest_lvtpc_loaded);
DEFINE_STATIC_KEY_FALSE(rdpmc_never_available_key);
DEFINE_STATIC_KEY_FALSE(rdpmc_always_available_key);
DEFINE_STATIC_KEY_FALSE(perf_is_hybrid);
@@ -1760,6 +1763,25 @@ void perf_events_lapic_init(void)
apic_write(APIC_LVTPC, APIC_DM_NMI);
}
#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU
void perf_load_guest_lvtpc(u32 guest_lvtpc)
{
u32 masked = guest_lvtpc & APIC_LVT_MASKED;
apic_write(APIC_LVTPC,
APIC_DM_FIXED | PERF_GUEST_MEDIATED_PMI_VECTOR | masked);
this_cpu_write(guest_lvtpc_loaded, true);
}
EXPORT_SYMBOL_FOR_KVM(perf_load_guest_lvtpc);
void perf_put_guest_lvtpc(void)
{
this_cpu_write(guest_lvtpc_loaded, false);
apic_write(APIC_LVTPC, APIC_DM_NMI);
}
EXPORT_SYMBOL_FOR_KVM(perf_put_guest_lvtpc);
#endif /* CONFIG_PERF_GUEST_MEDIATED_PMU */
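/*
 * Usage sketch (simplified; the KVM-side call sites are hypothetical
 * here): before entering a guest that owns the PMU via the mediated
 * vPMU, host events are quiesced and the LVTPC is redirected, then
 * restored on the way out:
 *
 *	perf_load_guest_lvtpc(guest_lvtpc);	// PMIs -> mediated PMI vector
 *	...run the guest...
 *	perf_put_guest_lvtpc();			// PMIs -> NMI delivery again
 */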
static int
perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
{
@@ -1767,6 +1789,17 @@ perf_event_nmi_handler(unsigned int cmd, struct pt_regs *regs)
u64 finish_clock;
int ret;
/*
* Ignore all NMIs when the CPU's LVTPC is configured to route PMIs to
* PERF_GUEST_MEDIATED_PMI_VECTOR, i.e. when an NMI can't be due
* to a PMI. Attempting to handle a PMI while the guest's context is
* loaded will generate false positives and clobber guest state. Note,
* the LVTPC is switched to/from the dedicated mediated PMI IRQ vector
* while host events are quiesced.
*/
if (this_cpu_read(guest_lvtpc_loaded))
return NMI_DONE;
/*
* All PMUs/events that share this PMI handler should make sure to
* increment active_events for their events.
@@ -2130,7 +2163,8 @@ static int __init init_hw_perf_events(void)
pr_cont("%s PMU driver.\n", x86_pmu.name);
x86_pmu.attr_rdpmc = 1; /* enable userspace RDPMC usage by default */
/* enable userspace RDPMC usage by default */
x86_pmu.attr_rdpmc = X86_USER_RDPMC_CONDITIONAL_ENABLE;
for (quirk = x86_pmu.quirks; quirk; quirk = quirk->next)
quirk->func();
@@ -2582,6 +2616,27 @@ static ssize_t get_attr_rdpmc(struct device *cdev,
return snprintf(buf, 40, "%d\n", x86_pmu.attr_rdpmc);
}
/*
* Behaviors of rdpmc value:
* - rdpmc = 0
* global user space rdpmc and counter level's user space rdpmc of all
* counters are both disabled.
* - rdpmc = 1
* global user space rdpmc is enabled in mmap enabled time window and
* counter level's user space rdpmc is enabled for only non system-wide
* events. Counter level's user space rdpmc of system-wide events is
* still disabled by default. This won't introduce counter data leak for
* non system-wide events since their count data would be cleared when
* context switches.
* - rdpmc = 2
* global user space rdpmc and counter level's user space rdpmc of all
* counters are enabled unconditionally.
*
* Assume the rdpmc value won't be changed frequently; don't dynamically
* reschedule events to make the new rdpmc value take effect on active perf
* events immediately. The new rdpmc value only impacts newly activated
* perf events. This keeps the code simpler and cleaner.
*/
static ssize_t set_attr_rdpmc(struct device *cdev,
struct device_attribute *attr,
const char *buf, size_t count)
@@ -2610,12 +2665,12 @@ static ssize_t set_attr_rdpmc(struct device *cdev,
*/
if (val == 0)
static_branch_inc(&rdpmc_never_available_key);
else if (x86_pmu.attr_rdpmc == 0)
else if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_NEVER_ENABLE)
static_branch_dec(&rdpmc_never_available_key);
if (val == 2)
static_branch_inc(&rdpmc_always_available_key);
else if (x86_pmu.attr_rdpmc == 2)
else if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_ALWAYS_ENABLE)
static_branch_dec(&rdpmc_always_available_key);
on_each_cpu(cr4_update_pce, NULL, 1);
@@ -3073,11 +3128,12 @@ void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
cap->version = x86_pmu.version;
cap->num_counters_gp = x86_pmu_num_counters(NULL);
cap->num_counters_fixed = x86_pmu_num_counters_fixed(NULL);
cap->bit_width_gp = x86_pmu.cntval_bits;
cap->bit_width_fixed = x86_pmu.cntval_bits;
cap->bit_width_gp = cap->num_counters_gp ? x86_pmu.cntval_bits : 0;
cap->bit_width_fixed = cap->num_counters_fixed ? x86_pmu.cntval_bits : 0;
cap->events_mask = (unsigned int)x86_pmu.events_maskl;
cap->events_mask_len = x86_pmu.events_mask_len;
cap->pebs_ept = x86_pmu.pebs_ept;
cap->mediated = !!(pmu.capabilities & PERF_PMU_CAP_MEDIATED_VPMU);
}
EXPORT_SYMBOL_FOR_KVM(perf_get_x86_pmu_capability);

View File

@@ -232,6 +232,29 @@ static struct event_constraint intel_skt_event_constraints[] __read_mostly = {
EVENT_CONSTRAINT_END
};
static struct event_constraint intel_arw_event_constraints[] __read_mostly = {
FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */
FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */
FIXED_EVENT_CONSTRAINT(0x0300, 2), /* pseudo CPU_CLK_UNHALTED.REF */
FIXED_EVENT_CONSTRAINT(0x013c, 2), /* CPU_CLK_UNHALTED.REF_TSC_P */
FIXED_EVENT_CONSTRAINT(0x0073, 4), /* TOPDOWN_BAD_SPECULATION.ALL */
FIXED_EVENT_CONSTRAINT(0x019c, 5), /* TOPDOWN_FE_BOUND.ALL */
FIXED_EVENT_CONSTRAINT(0x02c2, 6), /* TOPDOWN_RETIRING.ALL */
INTEL_UEVENT_CONSTRAINT(0x01b7, 0x1),
INTEL_UEVENT_CONSTRAINT(0x02b7, 0x2),
INTEL_UEVENT_CONSTRAINT(0x04b7, 0x4),
INTEL_UEVENT_CONSTRAINT(0x08b7, 0x8),
INTEL_UEVENT_CONSTRAINT(0x01d4, 0x1),
INTEL_UEVENT_CONSTRAINT(0x02d4, 0x2),
INTEL_UEVENT_CONSTRAINT(0x04d4, 0x4),
INTEL_UEVENT_CONSTRAINT(0x08d4, 0x8),
INTEL_UEVENT_CONSTRAINT(0x0175, 0x1),
INTEL_UEVENT_CONSTRAINT(0x0275, 0x2),
INTEL_UEVENT_CONSTRAINT(0x21d3, 0x1),
INTEL_UEVENT_CONSTRAINT(0x22d3, 0x1),
EVENT_CONSTRAINT_END
};
static struct event_constraint intel_skl_event_constraints[] = {
FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */
FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */
@@ -435,6 +458,62 @@ static struct extra_reg intel_lnc_extra_regs[] __read_mostly = {
EVENT_EXTRA_END
};
static struct event_constraint intel_pnc_event_constraints[] = {
FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* INST_RETIRED.ANY */
FIXED_EVENT_CONSTRAINT(0x0100, 0), /* INST_RETIRED.PREC_DIST */
FIXED_EVENT_CONSTRAINT(0x003c, 1), /* CPU_CLK_UNHALTED.CORE */
FIXED_EVENT_CONSTRAINT(0x0300, 2), /* CPU_CLK_UNHALTED.REF */
FIXED_EVENT_CONSTRAINT(0x013c, 2), /* CPU_CLK_UNHALTED.REF_TSC_P */
FIXED_EVENT_CONSTRAINT(0x0400, 3), /* SLOTS */
METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_RETIRING, 0),
METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_BAD_SPEC, 1),
METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_FE_BOUND, 2),
METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_BE_BOUND, 3),
METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_HEAVY_OPS, 4),
METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_BR_MISPREDICT, 5),
METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_FETCH_LAT, 6),
METRIC_EVENT_CONSTRAINT(INTEL_TD_METRIC_MEM_BOUND, 7),
INTEL_EVENT_CONSTRAINT(0x20, 0xf),
INTEL_EVENT_CONSTRAINT(0x79, 0xf),
INTEL_UEVENT_CONSTRAINT(0x0275, 0xf),
INTEL_UEVENT_CONSTRAINT(0x0176, 0xf),
INTEL_UEVENT_CONSTRAINT(0x04a4, 0x1),
INTEL_UEVENT_CONSTRAINT(0x08a4, 0x1),
INTEL_UEVENT_CONSTRAINT(0x01cd, 0xfc),
INTEL_UEVENT_CONSTRAINT(0x02cd, 0x3),
INTEL_EVENT_CONSTRAINT(0xd0, 0xf),
INTEL_EVENT_CONSTRAINT(0xd1, 0xf),
INTEL_EVENT_CONSTRAINT(0xd4, 0xf),
INTEL_EVENT_CONSTRAINT(0xd6, 0xf),
INTEL_EVENT_CONSTRAINT(0xdf, 0xf),
INTEL_EVENT_CONSTRAINT(0xce, 0x1),
INTEL_UEVENT_CONSTRAINT(0x01b1, 0x8),
INTEL_UEVENT_CONSTRAINT(0x0847, 0xf),
INTEL_UEVENT_CONSTRAINT(0x0446, 0xf),
INTEL_UEVENT_CONSTRAINT(0x0846, 0xf),
INTEL_UEVENT_CONSTRAINT(0x0148, 0xf),
EVENT_CONSTRAINT_END
};
static struct extra_reg intel_pnc_extra_regs[] __read_mostly = {
/* must define OMR_X first, see intel_alt_er() */
INTEL_UEVENT_EXTRA_REG(0x012a, MSR_OMR_0, 0x40ffffff0000ffffull, OMR_0),
INTEL_UEVENT_EXTRA_REG(0x022a, MSR_OMR_1, 0x40ffffff0000ffffull, OMR_1),
INTEL_UEVENT_EXTRA_REG(0x042a, MSR_OMR_2, 0x40ffffff0000ffffull, OMR_2),
INTEL_UEVENT_EXTRA_REG(0x082a, MSR_OMR_3, 0x40ffffff0000ffffull, OMR_3),
INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x01cd),
INTEL_UEVENT_EXTRA_REG(0x02c6, MSR_PEBS_FRONTEND, 0x9, FE),
INTEL_UEVENT_EXTRA_REG(0x03c6, MSR_PEBS_FRONTEND, 0x7fff1f, FE),
INTEL_UEVENT_EXTRA_REG(0x40ad, MSR_PEBS_FRONTEND, 0xf, FE),
INTEL_UEVENT_EXTRA_REG(0x04c2, MSR_PEBS_FRONTEND, 0x8, FE),
EVENT_EXTRA_END
};
EVENT_ATTR_STR(mem-loads, mem_ld_nhm, "event=0x0b,umask=0x10,ldlat=3");
EVENT_ATTR_STR(mem-loads, mem_ld_snb, "event=0xcd,umask=0x1,ldlat=3");
EVENT_ATTR_STR(mem-stores, mem_st_snb, "event=0xcd,umask=0x2");
@@ -650,6 +729,102 @@ static __initconst const u64 glc_hw_cache_extra_regs
},
};
static __initconst const u64 pnc_hw_cache_event_ids
[PERF_COUNT_HW_CACHE_MAX]
[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX] =
{
[ C(L1D ) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0x81d0,
[ C(RESULT_MISS) ] = 0xe124,
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = 0x82d0,
},
},
[ C(L1I ) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_MISS) ] = 0xe424,
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
},
[ C(LL ) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0x12a,
[ C(RESULT_MISS) ] = 0x12a,
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = 0x12a,
[ C(RESULT_MISS) ] = 0x12a,
},
},
[ C(DTLB) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0x81d0,
[ C(RESULT_MISS) ] = 0xe12,
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = 0x82d0,
[ C(RESULT_MISS) ] = 0xe13,
},
},
[ C(ITLB) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = 0xe11,
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
[ C(OP_PREFETCH) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
},
[ C(BPU ) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0x4c4,
[ C(RESULT_MISS) ] = 0x4c5,
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
[ C(OP_PREFETCH) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
},
[ C(NODE) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = -1,
[ C(RESULT_MISS) ] = -1,
},
},
};
static __initconst const u64 pnc_hw_cache_extra_regs
[PERF_COUNT_HW_CACHE_MAX]
[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX] =
{
[ C(LL ) ] = {
[ C(OP_READ) ] = {
[ C(RESULT_ACCESS) ] = 0x4000000000000001,
[ C(RESULT_MISS) ] = 0xFFFFF000000001,
},
[ C(OP_WRITE) ] = {
[ C(RESULT_ACCESS) ] = 0x4000000000000002,
[ C(RESULT_MISS) ] = 0xFFFFF000000002,
},
},
};
/*
* Notes on the events:
* - data reads do not include code reads (comparable to earlier tables)
@@ -2167,6 +2342,26 @@ static __initconst const u64 tnt_hw_cache_extra_regs
},
};
static __initconst const u64 arw_hw_cache_extra_regs
[PERF_COUNT_HW_CACHE_MAX]
[PERF_COUNT_HW_CACHE_OP_MAX]
[PERF_COUNT_HW_CACHE_RESULT_MAX] = {
[C(LL)] = {
[C(OP_READ)] = {
[C(RESULT_ACCESS)] = 0x4000000000000001,
[C(RESULT_MISS)] = 0xFFFFF000000001,
},
[C(OP_WRITE)] = {
[C(RESULT_ACCESS)] = 0x4000000000000002,
[C(RESULT_MISS)] = 0xFFFFF000000002,
},
[C(OP_PREFETCH)] = {
[C(RESULT_ACCESS)] = 0x0,
[C(RESULT_MISS)] = 0x0,
},
},
};
EVENT_ATTR_STR(topdown-fe-bound, td_fe_bound_tnt, "event=0x71,umask=0x0");
EVENT_ATTR_STR(topdown-retiring, td_retiring_tnt, "event=0xc2,umask=0x0");
EVENT_ATTR_STR(topdown-bad-spec, td_bad_spec_tnt, "event=0x73,umask=0x6");
@@ -2225,6 +2420,22 @@ static struct extra_reg intel_cmt_extra_regs[] __read_mostly = {
EVENT_EXTRA_END
};
static struct extra_reg intel_arw_extra_regs[] __read_mostly = {
/* must define OMR_X first, see intel_alt_er() */
INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OMR_0, 0xc0ffffffffffffffull, OMR_0),
INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OMR_1, 0xc0ffffffffffffffull, OMR_1),
INTEL_UEVENT_EXTRA_REG(0x04b7, MSR_OMR_2, 0xc0ffffffffffffffull, OMR_2),
INTEL_UEVENT_EXTRA_REG(0x08b7, MSR_OMR_3, 0xc0ffffffffffffffull, OMR_3),
INTEL_UEVENT_EXTRA_REG(0x01d4, MSR_OMR_0, 0xc0ffffffffffffffull, OMR_0),
INTEL_UEVENT_EXTRA_REG(0x02d4, MSR_OMR_1, 0xc0ffffffffffffffull, OMR_1),
INTEL_UEVENT_EXTRA_REG(0x04d4, MSR_OMR_2, 0xc0ffffffffffffffull, OMR_2),
INTEL_UEVENT_EXTRA_REG(0x08d4, MSR_OMR_3, 0xc0ffffffffffffffull, OMR_3),
INTEL_UEVENT_PEBS_LDLAT_EXTRA_REG(0x5d0),
INTEL_UEVENT_EXTRA_REG(0x0127, MSR_SNOOP_RSP_0, 0xffffffffffffffffull, SNOOP_0),
INTEL_UEVENT_EXTRA_REG(0x0227, MSR_SNOOP_RSP_1, 0xffffffffffffffffull, SNOOP_1),
EVENT_EXTRA_END
};
EVENT_ATTR_STR(topdown-fe-bound, td_fe_bound_skt, "event=0x9c,umask=0x01");
EVENT_ATTR_STR(topdown-retiring, td_retiring_skt, "event=0xc2,umask=0x02");
EVENT_ATTR_STR(topdown-be-bound, td_be_bound_skt, "event=0xa4,umask=0x02");
@@ -2917,6 +3128,8 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
bits |= INTEL_FIXED_0_USER;
if (hwc->config & ARCH_PERFMON_EVENTSEL_OS)
bits |= INTEL_FIXED_0_KERNEL;
if (hwc->config & ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE)
bits |= INTEL_FIXED_0_RDPMC_USER_DISABLE;
/*
* ANY bit is supported in v3 and up
@@ -3052,6 +3265,27 @@ static void intel_pmu_enable_event_ext(struct perf_event *event)
__intel_pmu_update_event_ext(hwc->idx, ext);
}
static void intel_pmu_update_rdpmc_user_disable(struct perf_event *event)
{
if (!x86_pmu_has_rdpmc_user_disable(event->pmu))
return;
/*
* Counter scope's user-space rdpmc is disabled by default
* except in two cases:
* a. rdpmc = 2 (user space rdpmc enabled unconditionally)
* b. rdpmc = 1 and the event is not a system-wide event.
* The count of non-system-wide events would be cleared when
* context switches, so no count data is leaked.
*/
if (x86_pmu.attr_rdpmc == X86_USER_RDPMC_ALWAYS_ENABLE ||
(x86_pmu.attr_rdpmc == X86_USER_RDPMC_CONDITIONAL_ENABLE &&
event->ctx->task))
event->hw.config &= ~ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE;
else
event->hw.config |= ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE;
}
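/*
 * Concrete example of the policy above: with the default rdpmc = 1
 * (X86_USER_RDPMC_CONDITIONAL_ENABLE), a per-task event
 * (event->ctx->task != NULL) keeps user-space RDPMC enabled on its
 * counter, while a system-wide event gets
 * ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE set in its config.
 */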
DEFINE_STATIC_CALL_NULL(intel_pmu_enable_event_ext, intel_pmu_enable_event_ext);
static void intel_pmu_enable_event(struct perf_event *event)
@@ -3060,6 +3294,8 @@ static void intel_pmu_enable_event(struct perf_event *event)
struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx;
intel_pmu_update_rdpmc_user_disable(event);
if (unlikely(event->attr.precise_ip))
static_call(x86_pmu_pebs_enable)(event);
@@ -3532,17 +3768,32 @@ static int intel_alt_er(struct cpu_hw_events *cpuc,
struct extra_reg *extra_regs = hybrid(cpuc->pmu, extra_regs);
int alt_idx = idx;
if (!(x86_pmu.flags & PMU_FL_HAS_RSP_1))
return idx;
switch (idx) {
case EXTRA_REG_RSP_0 ... EXTRA_REG_RSP_1:
if (!(x86_pmu.flags & PMU_FL_HAS_RSP_1))
return idx;
if (++alt_idx > EXTRA_REG_RSP_1)
alt_idx = EXTRA_REG_RSP_0;
if (config & ~extra_regs[alt_idx].valid_mask)
return idx;
break;
if (idx == EXTRA_REG_RSP_0)
alt_idx = EXTRA_REG_RSP_1;
case EXTRA_REG_OMR_0 ... EXTRA_REG_OMR_3:
if (!(x86_pmu.flags & PMU_FL_HAS_OMR))
return idx;
if (++alt_idx > EXTRA_REG_OMR_3)
alt_idx = EXTRA_REG_OMR_0;
/*
* Subtract EXTRA_REG_OMR_0 to index the OMR extra_reg
* entries correctly, since they start from 0.
*/
if (config & ~extra_regs[alt_idx - EXTRA_REG_OMR_0].valid_mask)
return idx;
break;
if (idx == EXTRA_REG_RSP_1)
alt_idx = EXTRA_REG_RSP_0;
if (config & ~extra_regs[alt_idx].valid_mask)
return idx;
default:
break;
}
return alt_idx;
}
@@ -3550,16 +3801,26 @@ static int intel_alt_er(struct cpu_hw_events *cpuc,
static void intel_fixup_er(struct perf_event *event, int idx)
{
struct extra_reg *extra_regs = hybrid(event->pmu, extra_regs);
event->hw.extra_reg.idx = idx;
int er_idx;
if (idx == EXTRA_REG_RSP_0) {
event->hw.extra_reg.idx = idx;
switch (idx) {
case EXTRA_REG_RSP_0 ... EXTRA_REG_RSP_1:
er_idx = idx - EXTRA_REG_RSP_0;
event->hw.config &= ~INTEL_ARCH_EVENT_MASK;
event->hw.config |= extra_regs[EXTRA_REG_RSP_0].event;
event->hw.extra_reg.reg = MSR_OFFCORE_RSP_0;
} else if (idx == EXTRA_REG_RSP_1) {
event->hw.config &= ~INTEL_ARCH_EVENT_MASK;
event->hw.config |= extra_regs[EXTRA_REG_RSP_1].event;
event->hw.extra_reg.reg = MSR_OFFCORE_RSP_1;
event->hw.config |= extra_regs[er_idx].event;
event->hw.extra_reg.reg = MSR_OFFCORE_RSP_0 + er_idx;
break;
case EXTRA_REG_OMR_0 ... EXTRA_REG_OMR_3:
er_idx = idx - EXTRA_REG_OMR_0;
event->hw.config &= ~ARCH_PERFMON_EVENTSEL_UMASK;
event->hw.config |= 1ULL << (8 + er_idx);
event->hw.extra_reg.reg = MSR_OMR_0 + er_idx;
break;
default:
pr_warn("The extra reg idx %d is not supported.\n", idx);
}
}
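/*
 * Worked example of the OMR rotation above (hypothetical collision):
 * an event programmed as 0x01b7 (umask bit 0 -> EXTRA_REG_OMR_0,
 * MSR_OMR_0) whose MSR_OMR_0 is already claimed with a different value
 * is retried by intel_alt_er() on EXTRA_REG_OMR_1; intel_fixup_er()
 * then rewrites the umask to bit 1 (0x02) and points the extra reg at
 * MSR_OMR_0 + 1, i.e. MSR_OMR_1.
 */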
@@ -5633,6 +5894,8 @@ static void update_pmu_cap(struct pmu *pmu)
hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_UMASK2;
if (ebx_0.split.eq)
hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_EQ;
if (ebx_0.split.rdpmc_user_disable)
hybrid(pmu, config_mask) |= ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE;
if (eax_0.split.cntr_subleaf) {
cpuid_count(ARCH_PERFMON_EXT_LEAF, ARCH_PERFMON_NUM_COUNTER_LEAF,
@@ -5695,6 +5958,8 @@ static void intel_pmu_check_hybrid_pmus(struct x86_hybrid_pmu *pmu)
else
pmu->intel_ctrl &= ~GLOBAL_CTRL_EN_PERF_METRICS;
pmu->pmu.capabilities |= PERF_PMU_CAP_MEDIATED_VPMU;
intel_pmu_check_event_constraints_all(&pmu->pmu);
intel_pmu_check_extra_regs(pmu->extra_regs);
@@ -7209,6 +7474,20 @@ static __always_inline void intel_pmu_init_lnc(struct pmu *pmu)
hybrid(pmu, extra_regs) = intel_lnc_extra_regs;
}
static __always_inline void intel_pmu_init_pnc(struct pmu *pmu)
{
intel_pmu_init_glc(pmu);
x86_pmu.flags &= ~PMU_FL_HAS_RSP_1;
x86_pmu.flags |= PMU_FL_HAS_OMR;
memcpy(hybrid_var(pmu, hw_cache_event_ids),
pnc_hw_cache_event_ids, sizeof(hw_cache_event_ids));
memcpy(hybrid_var(pmu, hw_cache_extra_regs),
pnc_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
hybrid(pmu, event_constraints) = intel_pnc_event_constraints;
hybrid(pmu, pebs_constraints) = intel_pnc_pebs_event_constraints;
hybrid(pmu, extra_regs) = intel_pnc_extra_regs;
}
static __always_inline void intel_pmu_init_skt(struct pmu *pmu)
{
intel_pmu_init_grt(pmu);
@@ -7217,6 +7496,19 @@ static __always_inline void intel_pmu_init_skt(struct pmu *pmu)
static_call_update(intel_pmu_enable_acr_event, intel_pmu_enable_acr);
}
static __always_inline void intel_pmu_init_arw(struct pmu *pmu)
{
intel_pmu_init_grt(pmu);
x86_pmu.flags &= ~PMU_FL_HAS_RSP_1;
x86_pmu.flags |= PMU_FL_HAS_OMR;
memcpy(hybrid_var(pmu, hw_cache_extra_regs),
arw_hw_cache_extra_regs, sizeof(hw_cache_extra_regs));
hybrid(pmu, event_constraints) = intel_arw_event_constraints;
hybrid(pmu, pebs_constraints) = intel_arw_pebs_event_constraints;
hybrid(pmu, extra_regs) = intel_arw_extra_regs;
static_call_update(intel_pmu_enable_acr_event, intel_pmu_enable_acr);
}
__init int intel_pmu_init(void)
{
struct attribute **extra_skl_attr = &empty_attrs;
@@ -7314,6 +7606,9 @@ __init int intel_pmu_init(void)
pr_cont(" AnyThread deprecated, ");
}
/* The perf side of core PMU is ready to support the mediated vPMU. */
x86_get_pmu(smp_processor_id())->capabilities |= PERF_PMU_CAP_MEDIATED_VPMU;
/*
* Many features on and after V6 require dynamic constraint,
* e.g., Arch PEBS, ACR.
@@ -7405,6 +7700,7 @@ __init int intel_pmu_init(void)
case INTEL_ATOM_SILVERMONT_D:
case INTEL_ATOM_SILVERMONT_MID:
case INTEL_ATOM_AIRMONT:
case INTEL_ATOM_AIRMONT_NP:
case INTEL_ATOM_SILVERMONT_MID2:
memcpy(hw_cache_event_ids, slm_hw_cache_event_ids,
sizeof(hw_cache_event_ids));
@@ -7866,9 +8162,21 @@ __init int intel_pmu_init(void)
x86_pmu.extra_regs = intel_rwc_extra_regs;
pr_cont("Granite Rapids events, ");
name = "granite_rapids";
goto glc_common;
case INTEL_DIAMONDRAPIDS_X:
intel_pmu_init_pnc(NULL);
x86_pmu.pebs_latency_data = pnc_latency_data;
pr_cont("Panthercove events, ");
name = "panthercove";
goto glc_base;
glc_common:
intel_pmu_init_glc(NULL);
intel_pmu_pebs_data_source_skl(true);
glc_base:
x86_pmu.pebs_ept = 1;
x86_pmu.hw_config = hsw_hw_config;
x86_pmu.get_event_constraints = glc_get_event_constraints;
@@ -7878,7 +8186,6 @@ __init int intel_pmu_init(void)
mem_attr = glc_events_attrs;
td_attr = glc_td_events_attrs;
tsx_attr = glc_tsx_events_attrs;
intel_pmu_pebs_data_source_skl(true);
break;
case INTEL_ALDERLAKE:
@@ -8042,6 +8349,33 @@ __init int intel_pmu_init(void)
name = "arrowlake_h_hybrid";
break;
case INTEL_NOVALAKE:
case INTEL_NOVALAKE_L:
pr_cont("Novalake Hybrid events, ");
name = "novalake_hybrid";
intel_pmu_init_hybrid(hybrid_big_small);
x86_pmu.pebs_latency_data = nvl_latency_data;
x86_pmu.get_event_constraints = mtl_get_event_constraints;
x86_pmu.hw_config = adl_hw_config;
td_attr = lnl_hybrid_events_attrs;
mem_attr = mtl_hybrid_mem_attrs;
tsx_attr = adl_hybrid_tsx_attrs;
extra_attr = boot_cpu_has(X86_FEATURE_RTM) ?
mtl_hybrid_extra_attr_rtm : mtl_hybrid_extra_attr;
/* Initialize big core specific PerfMon capabilities.*/
pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_CORE_IDX];
intel_pmu_init_pnc(&pmu->pmu);
/* Initialize Atom core specific PerfMon capabilities.*/
pmu = &x86_pmu.hybrid_pmu[X86_HYBRID_PMU_ATOM_IDX];
intel_pmu_init_arw(&pmu->pmu);
intel_pmu_pebs_data_source_lnl();
break;
default:
switch (x86_pmu.version) {
case 1:

View File

@@ -41,7 +41,7 @@
* MSR_CORE_C1_RES: CORE C1 Residency Counter
* perf code: 0x00
* Available model: SLM,AMT,GLM,CNL,ICX,TNT,ADL,RPL
* MTL,SRF,GRR,ARL,LNL,PTL
* MTL,SRF,GRR,ARL,LNL,PTL,WCL,NVL
* Scope: Core (each processor core has a MSR)
* MSR_CORE_C3_RESIDENCY: CORE C3 Residency Counter
* perf code: 0x01
@@ -53,19 +53,20 @@
* Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,
* SKL,KNL,GLM,CNL,KBL,CML,ICL,ICX,
* TGL,TNT,RKL,ADL,RPL,SPR,MTL,SRF,
* GRR,ARL,LNL,PTL
* GRR,ARL,LNL,PTL,WCL,NVL
* Scope: Core
* MSR_CORE_C7_RESIDENCY: CORE C7 Residency Counter
* perf code: 0x03
* Available model: SNB,IVB,HSW,BDW,SKL,CNL,KBL,CML,
* ICL,TGL,RKL,ADL,RPL,MTL,ARL,LNL,
* PTL
* PTL,WCL,NVL
* Scope: Core
* MSR_PKG_C2_RESIDENCY: Package C2 Residency Counter.
* perf code: 0x00
* Available model: SNB,IVB,HSW,BDW,SKL,KNL,GLM,CNL,
* KBL,CML,ICL,ICX,TGL,TNT,RKL,ADL,
* RPL,SPR,MTL,ARL,LNL,SRF,PTL
* RPL,SPR,MTL,ARL,LNL,SRF,PTL,WCL,
* NVL
* Scope: Package (physical package)
* MSR_PKG_C3_RESIDENCY: Package C3 Residency Counter.
* perf code: 0x01
@@ -78,7 +79,7 @@
* Available model: SLM,AMT,NHM,WSM,SNB,IVB,HSW,BDW,
* SKL,KNL,GLM,CNL,KBL,CML,ICL,ICX,
* TGL,TNT,RKL,ADL,RPL,SPR,MTL,SRF,
* ARL,LNL,PTL
* ARL,LNL,PTL,WCL,NVL
* Scope: Package (physical package)
* MSR_PKG_C7_RESIDENCY: Package C7 Residency Counter.
* perf code: 0x03
@@ -97,11 +98,12 @@
* MSR_PKG_C10_RESIDENCY: Package C10 Residency Counter.
* perf code: 0x06
* Available model: HSW ULT,KBL,GLM,CNL,CML,ICL,TGL,
* TNT,RKL,ADL,RPL,MTL,ARL,LNL,PTL
* TNT,RKL,ADL,RPL,MTL,ARL,LNL,PTL,
* WCL,NVL
* Scope: Package (physical package)
* MSR_MODULE_C6_RES_MS: Module C6 Residency Counter.
* perf code: 0x00
* Available model: SRF,GRR
* Available model: SRF,GRR,NVL
* Scope: A cluster of cores shared L2 cache
*
*/
@@ -527,6 +529,18 @@ static const struct cstate_model lnl_cstates __initconst = {
BIT(PERF_CSTATE_PKG_C10_RES),
};
static const struct cstate_model nvl_cstates __initconst = {
.core_events = BIT(PERF_CSTATE_CORE_C1_RES) |
BIT(PERF_CSTATE_CORE_C6_RES) |
BIT(PERF_CSTATE_CORE_C7_RES),
.module_events = BIT(PERF_CSTATE_MODULE_C6_RES),
.pkg_events = BIT(PERF_CSTATE_PKG_C2_RES) |
BIT(PERF_CSTATE_PKG_C6_RES) |
BIT(PERF_CSTATE_PKG_C10_RES),
};
static const struct cstate_model slm_cstates __initconst = {
.core_events = BIT(PERF_CSTATE_CORE_C1_RES) |
BIT(PERF_CSTATE_CORE_C6_RES),
@@ -599,6 +613,7 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
X86_MATCH_VFM(INTEL_ATOM_SILVERMONT, &slm_cstates),
X86_MATCH_VFM(INTEL_ATOM_SILVERMONT_D, &slm_cstates),
X86_MATCH_VFM(INTEL_ATOM_AIRMONT, &slm_cstates),
X86_MATCH_VFM(INTEL_ATOM_AIRMONT_NP, &slm_cstates),
X86_MATCH_VFM(INTEL_BROADWELL, &snb_cstates),
X86_MATCH_VFM(INTEL_BROADWELL_D, &snb_cstates),
@@ -638,6 +653,7 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
X86_MATCH_VFM(INTEL_EMERALDRAPIDS_X, &icx_cstates),
X86_MATCH_VFM(INTEL_GRANITERAPIDS_X, &icx_cstates),
X86_MATCH_VFM(INTEL_GRANITERAPIDS_D, &icx_cstates),
X86_MATCH_VFM(INTEL_DIAMONDRAPIDS_X, &srf_cstates),
X86_MATCH_VFM(INTEL_TIGERLAKE_L, &icl_cstates),
X86_MATCH_VFM(INTEL_TIGERLAKE, &icl_cstates),
@@ -654,6 +670,9 @@ static const struct x86_cpu_id intel_cstates_match[] __initconst = {
X86_MATCH_VFM(INTEL_ARROWLAKE_U, &adl_cstates),
X86_MATCH_VFM(INTEL_LUNARLAKE_M, &lnl_cstates),
X86_MATCH_VFM(INTEL_PANTHERLAKE_L, &lnl_cstates),
X86_MATCH_VFM(INTEL_WILDCATLAKE_L, &lnl_cstates),
X86_MATCH_VFM(INTEL_NOVALAKE, &nvl_cstates),
X86_MATCH_VFM(INTEL_NOVALAKE_L, &nvl_cstates),
{ },
};
MODULE_DEVICE_TABLE(x86cpu, intel_cstates_match);

View File

@@ -34,6 +34,17 @@ struct pebs_record_32 {
*/
union omr_encoding {
struct {
u8 omr_source : 4;
u8 omr_remote : 1;
u8 omr_hitm : 1;
u8 omr_snoop : 1;
u8 omr_promoted : 1;
};
u8 omr_full;
};
union intel_x86_pebs_dse {
u64 val;
struct {
@@ -73,6 +84,30 @@ union intel_x86_pebs_dse {
unsigned int lnc_addr_blk:1;
unsigned int ld_reserved6:18;
};
struct {
unsigned int pnc_dse: 8;
unsigned int pnc_l2_miss:1;
unsigned int pnc_stlb_clean_hit:1;
unsigned int pnc_stlb_any_hit:1;
unsigned int pnc_stlb_miss:1;
unsigned int pnc_locked:1;
unsigned int pnc_data_blk:1;
unsigned int pnc_addr_blk:1;
unsigned int pnc_fb_full:1;
unsigned int ld_reserved8:16;
};
struct {
unsigned int arw_dse:8;
unsigned int arw_l2_miss:1;
unsigned int arw_xq_promotion:1;
unsigned int arw_reissue:1;
unsigned int arw_stlb_miss:1;
unsigned int arw_locked:1;
unsigned int arw_data_blk:1;
unsigned int arw_addr_blk:1;
unsigned int arw_fb_full:1;
unsigned int ld_reserved9:16;
};
};
@@ -228,6 +263,108 @@ void __init intel_pmu_pebs_data_source_lnl(void)
__intel_pmu_pebs_data_source_cmt(data_source);
}
/* Version for Panthercove and later */
/* L2 hit */
#define PNC_PEBS_DATA_SOURCE_MAX 16
static u64 pnc_pebs_l2_hit_data_source[PNC_PEBS_DATA_SOURCE_MAX] = {
P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA), /* 0x00: non-cache access */
OP_LH | LEVEL(L0) | P(SNOOP, NONE), /* 0x01: L0 hit */
OP_LH | P(LVL, L1) | LEVEL(L1) | P(SNOOP, NONE), /* 0x02: L1 hit */
OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE), /* 0x03: L1 Miss Handling Buffer hit */
OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, NONE), /* 0x04: L2 Hit Clean */
0, /* 0x05: Reserved */
0, /* 0x06: Reserved */
OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, HIT), /* 0x07: L2 Hit Snoop HIT */
OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, HITM), /* 0x08: L2 Hit Snoop Hit Modified */
OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, MISS), /* 0x09: Prefetch Promotion */
OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, MISS), /* 0x0a: Cross Core Prefetch Promotion */
0, /* 0x0b: Reserved */
0, /* 0x0c: Reserved */
0, /* 0x0d: Reserved */
0, /* 0x0e: Reserved */
OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE), /* 0x0f: uncached */
};
/* Version for Arctic Wolf and later */
/* L2 hit */
#define ARW_PEBS_DATA_SOURCE_MAX 16
static u64 arw_pebs_l2_hit_data_source[ARW_PEBS_DATA_SOURCE_MAX] = {
P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA), /* 0x00: non-cache access */
OP_LH | P(LVL, L1) | LEVEL(L1) | P(SNOOP, NONE), /* 0x01: L1 hit */
OP_LH | P(LVL, LFB) | LEVEL(LFB) | P(SNOOP, NONE), /* 0x02: WCB Hit */
OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, NONE), /* 0x03: L2 Hit Clean */
OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, HIT), /* 0x04: L2 Hit Snoop HIT */
OP_LH | P(LVL, L2) | LEVEL(L2) | P(SNOOP, HITM), /* 0x05: L2 Hit Snoop Hit Modified */
OP_LH | P(LVL, UNC) | LEVEL(NA) | P(SNOOP, NONE), /* 0x06: uncached */
0, /* 0x07: Reserved */
0, /* 0x08: Reserved */
0, /* 0x09: Reserved */
0, /* 0x0a: Reserved */
0, /* 0x0b: Reserved */
0, /* 0x0c: Reserved */
0, /* 0x0d: Reserved */
0, /* 0x0e: Reserved */
0, /* 0x0f: Reserved */
};
/* L2 miss */
#define OMR_DATA_SOURCE_MAX 16
static u64 omr_data_source[OMR_DATA_SOURCE_MAX] = {
P(OP, LOAD) | P(LVL, NA) | LEVEL(NA) | P(SNOOP, NA), /* 0x00: invalid */
0, /* 0x01: Reserved */
OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_SHARE), /* 0x02: local CA shared cache */
OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, L_NON_SHARE),/* 0x03: local CA non-shared cache */
OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_IO), /* 0x04: other CA IO agent */
OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_SHARE), /* 0x05: other CA shared cache */
OP_LH | P(LVL, L3) | LEVEL(L3) | P(REGION, O_NON_SHARE),/* 0x06: other CA non-shared cache */
OP_LH | LEVEL(RAM) | P(REGION, MMIO), /* 0x07: MMIO */
OP_LH | LEVEL(RAM) | P(REGION, MEM0), /* 0x08: Memory region 0 */
OP_LH | LEVEL(RAM) | P(REGION, MEM1), /* 0x09: Memory region 1 */
OP_LH | LEVEL(RAM) | P(REGION, MEM2), /* 0x0a: Memory region 2 */
OP_LH | LEVEL(RAM) | P(REGION, MEM3), /* 0x0b: Memory region 3 */
OP_LH | LEVEL(RAM) | P(REGION, MEM4), /* 0x0c: Memory region 4 */
OP_LH | LEVEL(RAM) | P(REGION, MEM5), /* 0x0d: Memory region 5 */
OP_LH | LEVEL(RAM) | P(REGION, MEM6), /* 0x0e: Memory region 6 */
OP_LH | LEVEL(RAM) | P(REGION, MEM7), /* 0x0f: Memory region 7 */
};
static u64 parse_omr_data_source(u8 dse)
{
union omr_encoding omr;
u64 val = 0;
omr.omr_full = dse;
val = omr_data_source[omr.omr_source];
if (omr.omr_source > 0x1 && omr.omr_source < 0x7)
val |= omr.omr_remote ? P(LVL, REM_CCE1) : 0;
else if (omr.omr_source > 0x7)
val |= omr.omr_remote ? P(LVL, REM_RAM1) : P(LVL, LOC_RAM);
if (omr.omr_remote)
val |= REM;
val |= omr.omr_hitm ? P(SNOOP, HITM) : P(SNOOP, HIT);
if (omr.omr_source == 0x2) {
u8 snoop = omr.omr_snoop | omr.omr_promoted;
if (snoop == 0x0)
val |= P(SNOOP, NA);
else if (snoop == 0x1)
val |= P(SNOOP, MISS);
else if (snoop == 0x2)
val |= P(SNOOP, HIT);
else if (snoop == 0x3)
val |= P(SNOOP, NONE);
} else if (omr.omr_source > 0x2 && omr.omr_source < 0x7) {
val |= omr.omr_snoop ? P(SNOOPX, FWD) : 0;
}
return val;
}
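/*
 * Worked example of the omr_encoding layout above (hypothetical dse
 * byte): dse = 0x25 decodes as omr_source = 0x5 (other CA shared
 * cache), omr_remote = 0, omr_hitm = 1, omr_snoop = 0 and
 * omr_promoted = 0, so parse_omr_data_source() reports a load that
 * hit another core agent's shared cache at the L3 level with a HITM
 * snoop response and no remote flag.
 */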
static u64 precise_store_data(u64 status)
{
union intel_x86_pebs_dse dse;
@@ -356,6 +493,44 @@ u64 cmt_latency_data(struct perf_event *event, u64 status)
dse.mtl_fwd_blk);
}
static u64 arw_latency_data(struct perf_event *event, u64 status)
{
union intel_x86_pebs_dse dse;
union perf_mem_data_src src;
u64 val;
dse.val = status;
if (!dse.arw_l2_miss)
val = arw_pebs_l2_hit_data_source[dse.arw_dse & 0xf];
else
val = parse_omr_data_source(dse.arw_dse);
if (!val)
val = P(OP, LOAD) | LEVEL(NA) | P(SNOOP, NA);
if (dse.arw_stlb_miss)
val |= P(TLB, MISS) | P(TLB, L2);
else
val |= P(TLB, HIT) | P(TLB, L1) | P(TLB, L2);
if (dse.arw_locked)
val |= P(LOCK, LOCKED);
if (dse.arw_data_blk)
val |= P(BLK, DATA);
if (dse.arw_addr_blk)
val |= P(BLK, ADDR);
if (!dse.arw_data_blk && !dse.arw_addr_blk)
val |= P(BLK, NA);
src.val = val;
if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW)
src.mem_op = P(OP, STORE);
return src.val;
}
static u64 lnc_latency_data(struct perf_event *event, u64 status)
{
union intel_x86_pebs_dse dse;
@@ -411,6 +586,54 @@ u64 arl_h_latency_data(struct perf_event *event, u64 status)
return lnl_latency_data(event, status);
}
u64 pnc_latency_data(struct perf_event *event, u64 status)
{
union intel_x86_pebs_dse dse;
union perf_mem_data_src src;
u64 val;
dse.val = status;
if (!dse.pnc_l2_miss)
val = pnc_pebs_l2_hit_data_source[dse.pnc_dse & 0xf];
else
val = parse_omr_data_source(dse.pnc_dse);
if (!val)
val = P(OP, LOAD) | LEVEL(NA) | P(SNOOP, NA);
if (dse.pnc_stlb_miss)
val |= P(TLB, MISS) | P(TLB, L2);
else
val |= P(TLB, HIT) | P(TLB, L1) | P(TLB, L2);
if (dse.pnc_locked)
val |= P(LOCK, LOCKED);
if (dse.pnc_data_blk)
val |= P(BLK, DATA);
if (dse.pnc_addr_blk)
val |= P(BLK, ADDR);
if (!dse.pnc_data_blk && !dse.pnc_addr_blk)
val |= P(BLK, NA);
src.val = val;
if (event->hw.flags & PERF_X86_EVENT_PEBS_ST_HSW)
src.mem_op = P(OP, STORE);
return src.val;
}
u64 nvl_latency_data(struct perf_event *event, u64 status)
{
struct x86_hybrid_pmu *pmu = hybrid_pmu(event->pmu);
if (pmu->pmu_type == hybrid_small)
return arw_latency_data(event, status);
return pnc_latency_data(event, status);
}
static u64 load_latency_data(struct perf_event *event, u64 status)
{
union intel_x86_pebs_dse dse;
@@ -1070,6 +1293,17 @@ struct event_constraint intel_grt_pebs_event_constraints[] = {
EVENT_CONSTRAINT_END
};
struct event_constraint intel_arw_pebs_event_constraints[] = {
/* Allow all events as PEBS with no flags */
INTEL_HYBRID_LAT_CONSTRAINT(0x5d0, 0xff),
INTEL_HYBRID_LAT_CONSTRAINT(0x6d0, 0xff),
INTEL_FLAGS_UEVENT_CONSTRAINT(0x01d4, 0x1),
INTEL_FLAGS_UEVENT_CONSTRAINT(0x02d4, 0x2),
INTEL_FLAGS_UEVENT_CONSTRAINT(0x04d4, 0x4),
INTEL_FLAGS_UEVENT_CONSTRAINT(0x08d4, 0x8),
EVENT_CONSTRAINT_END
};
struct event_constraint intel_nehalem_pebs_event_constraints[] = {
INTEL_PLD_CONSTRAINT(0x100b, 0xf), /* MEM_INST_RETIRED.* */
INTEL_FLAGS_EVENT_CONSTRAINT(0x0f, 0xf), /* MEM_UNCORE_RETIRED.* */
@@ -1285,6 +1519,33 @@ struct event_constraint intel_lnc_pebs_event_constraints[] = {
EVENT_CONSTRAINT_END
};
struct event_constraint intel_pnc_pebs_event_constraints[] = {
INTEL_FLAGS_UEVENT_CONSTRAINT(0x100, 0x100000000ULL), /* INST_RETIRED.PREC_DIST */
INTEL_FLAGS_UEVENT_CONSTRAINT(0x0400, 0x800000000ULL),
INTEL_HYBRID_LDLAT_CONSTRAINT(0x1cd, 0xfc),
INTEL_HYBRID_STLAT_CONSTRAINT(0x2cd, 0x3),
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x11d0, 0xf), /* MEM_INST_RETIRED.STLB_MISS_LOADS */
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x12d0, 0xf), /* MEM_INST_RETIRED.STLB_MISS_STORES */
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x21d0, 0xf), /* MEM_INST_RETIRED.LOCK_LOADS */
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x41d0, 0xf), /* MEM_INST_RETIRED.SPLIT_LOADS */
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x42d0, 0xf), /* MEM_INST_RETIRED.SPLIT_STORES */
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_LD(0x81d0, 0xf), /* MEM_INST_RETIRED.ALL_LOADS */
INTEL_FLAGS_UEVENT_CONSTRAINT_DATALA_ST(0x82d0, 0xf), /* MEM_INST_RETIRED.ALL_STORES */
INTEL_FLAGS_EVENT_CONSTRAINT_DATALA_LD_RANGE(0xd1, 0xd4, 0xf),
INTEL_FLAGS_EVENT_CONSTRAINT(0xd0, 0xf),
INTEL_FLAGS_EVENT_CONSTRAINT(0xd6, 0xf),
/*
* Everything else is handled by PMU_FL_PEBS_ALL, because we
* need the full constraints from the main table.
*/
EVENT_CONSTRAINT_END
};
struct event_constraint *intel_pebs_constraints(struct perf_event *event)
{
struct event_constraint *pebs_constraints = hybrid(event->pmu, pebs_constraints);

View File

@@ -243,7 +243,7 @@ static __init void p6_pmu_rdpmc_quirk(void)
*/
pr_warn("Userspace RDPMC support disabled due to a CPU erratum\n");
x86_pmu.attr_rdpmc_broken = 1;
x86_pmu.attr_rdpmc = 0;
x86_pmu.attr_rdpmc = X86_USER_RDPMC_NEVER_ENABLE;
}
}

View File

@@ -436,7 +436,7 @@ uncore_get_event_constraint(struct intel_uncore_box *box, struct perf_event *eve
if (type->constraints) {
for_each_event_constraint(c, type->constraints) {
if ((event->hw.config & c->cmask) == c->code)
if (constraint_match(c, event->hw.config))
return c;
}
}
@@ -1697,152 +1697,181 @@ static int __init uncore_mmio_init(void)
return ret;
}
struct intel_uncore_init_fun {
void (*cpu_init)(void);
int (*pci_init)(void);
void (*mmio_init)(void);
/* Discovery table is required */
bool use_discovery;
/* The units in the discovery table should be ignored. */
int *uncore_units_ignore;
};
static int uncore_mmio_global_init(u64 ctl)
{
void __iomem *io_addr;
static const struct intel_uncore_init_fun nhm_uncore_init __initconst = {
io_addr = ioremap(ctl, sizeof(ctl));
if (!io_addr)
return -ENOMEM;
/* Clear freeze bit (0) to enable all counters. */
writel(0, io_addr);
iounmap(io_addr);
return 0;
}
static const struct uncore_plat_init nhm_uncore_init __initconst = {
.cpu_init = nhm_uncore_cpu_init,
};
static const struct intel_uncore_init_fun snb_uncore_init __initconst = {
static const struct uncore_plat_init snb_uncore_init __initconst = {
.cpu_init = snb_uncore_cpu_init,
.pci_init = snb_uncore_pci_init,
};
static const struct intel_uncore_init_fun ivb_uncore_init __initconst = {
static const struct uncore_plat_init ivb_uncore_init __initconst = {
.cpu_init = snb_uncore_cpu_init,
.pci_init = ivb_uncore_pci_init,
};
static const struct intel_uncore_init_fun hsw_uncore_init __initconst = {
static const struct uncore_plat_init hsw_uncore_init __initconst = {
.cpu_init = snb_uncore_cpu_init,
.pci_init = hsw_uncore_pci_init,
};
static const struct intel_uncore_init_fun bdw_uncore_init __initconst = {
static const struct uncore_plat_init bdw_uncore_init __initconst = {
.cpu_init = snb_uncore_cpu_init,
.pci_init = bdw_uncore_pci_init,
};
static const struct intel_uncore_init_fun snbep_uncore_init __initconst = {
static const struct uncore_plat_init snbep_uncore_init __initconst = {
.cpu_init = snbep_uncore_cpu_init,
.pci_init = snbep_uncore_pci_init,
};
static const struct intel_uncore_init_fun nhmex_uncore_init __initconst = {
static const struct uncore_plat_init nhmex_uncore_init __initconst = {
.cpu_init = nhmex_uncore_cpu_init,
};
static const struct intel_uncore_init_fun ivbep_uncore_init __initconst = {
static const struct uncore_plat_init ivbep_uncore_init __initconst = {
.cpu_init = ivbep_uncore_cpu_init,
.pci_init = ivbep_uncore_pci_init,
};
static const struct intel_uncore_init_fun hswep_uncore_init __initconst = {
static const struct uncore_plat_init hswep_uncore_init __initconst = {
.cpu_init = hswep_uncore_cpu_init,
.pci_init = hswep_uncore_pci_init,
};
static const struct intel_uncore_init_fun bdx_uncore_init __initconst = {
static const struct uncore_plat_init bdx_uncore_init __initconst = {
.cpu_init = bdx_uncore_cpu_init,
.pci_init = bdx_uncore_pci_init,
};
static const struct intel_uncore_init_fun knl_uncore_init __initconst = {
static const struct uncore_plat_init knl_uncore_init __initconst = {
.cpu_init = knl_uncore_cpu_init,
.pci_init = knl_uncore_pci_init,
};
static const struct intel_uncore_init_fun skl_uncore_init __initconst = {
static const struct uncore_plat_init skl_uncore_init __initconst = {
.cpu_init = skl_uncore_cpu_init,
.pci_init = skl_uncore_pci_init,
};
static const struct intel_uncore_init_fun skx_uncore_init __initconst = {
static const struct uncore_plat_init skx_uncore_init __initconst = {
.cpu_init = skx_uncore_cpu_init,
.pci_init = skx_uncore_pci_init,
};
static const struct intel_uncore_init_fun icl_uncore_init __initconst = {
static const struct uncore_plat_init icl_uncore_init __initconst = {
.cpu_init = icl_uncore_cpu_init,
.pci_init = skl_uncore_pci_init,
};
static const struct intel_uncore_init_fun tgl_uncore_init __initconst = {
static const struct uncore_plat_init tgl_uncore_init __initconst = {
.cpu_init = tgl_uncore_cpu_init,
.mmio_init = tgl_uncore_mmio_init,
};
static const struct intel_uncore_init_fun tgl_l_uncore_init __initconst = {
static const struct uncore_plat_init tgl_l_uncore_init __initconst = {
.cpu_init = tgl_uncore_cpu_init,
.mmio_init = tgl_l_uncore_mmio_init,
};
static const struct intel_uncore_init_fun rkl_uncore_init __initconst = {
static const struct uncore_plat_init rkl_uncore_init __initconst = {
.cpu_init = tgl_uncore_cpu_init,
.pci_init = skl_uncore_pci_init,
};
static const struct intel_uncore_init_fun adl_uncore_init __initconst = {
static const struct uncore_plat_init adl_uncore_init __initconst = {
.cpu_init = adl_uncore_cpu_init,
.mmio_init = adl_uncore_mmio_init,
};
static const struct intel_uncore_init_fun mtl_uncore_init __initconst = {
static const struct uncore_plat_init mtl_uncore_init __initconst = {
.cpu_init = mtl_uncore_cpu_init,
.mmio_init = adl_uncore_mmio_init,
};
static const struct intel_uncore_init_fun lnl_uncore_init __initconst = {
static const struct uncore_plat_init lnl_uncore_init __initconst = {
.cpu_init = lnl_uncore_cpu_init,
.mmio_init = lnl_uncore_mmio_init,
};
static const struct intel_uncore_init_fun ptl_uncore_init __initconst = {
static const struct uncore_plat_init ptl_uncore_init __initconst = {
.cpu_init = ptl_uncore_cpu_init,
.mmio_init = ptl_uncore_mmio_init,
.use_discovery = true,
.domain[0].discovery_base = UNCORE_DISCOVERY_MSR,
.domain[0].global_init = uncore_mmio_global_init,
};
static const struct intel_uncore_init_fun icx_uncore_init __initconst = {
static const struct uncore_plat_init nvl_uncore_init __initconst = {
.cpu_init = nvl_uncore_cpu_init,
.mmio_init = ptl_uncore_mmio_init,
.domain[0].discovery_base = PACKAGE_UNCORE_DISCOVERY_MSR,
.domain[0].global_init = uncore_mmio_global_init,
};
static const struct uncore_plat_init icx_uncore_init __initconst = {
.cpu_init = icx_uncore_cpu_init,
.pci_init = icx_uncore_pci_init,
.mmio_init = icx_uncore_mmio_init,
};
static const struct intel_uncore_init_fun snr_uncore_init __initconst = {
static const struct uncore_plat_init snr_uncore_init __initconst = {
.cpu_init = snr_uncore_cpu_init,
.pci_init = snr_uncore_pci_init,
.mmio_init = snr_uncore_mmio_init,
};
static const struct intel_uncore_init_fun spr_uncore_init __initconst = {
static const struct uncore_plat_init spr_uncore_init __initconst = {
.cpu_init = spr_uncore_cpu_init,
.pci_init = spr_uncore_pci_init,
.mmio_init = spr_uncore_mmio_init,
.use_discovery = true,
.uncore_units_ignore = spr_uncore_units_ignore,
.domain[0].base_is_pci = true,
.domain[0].discovery_base = UNCORE_DISCOVERY_TABLE_DEVICE,
.domain[0].units_ignore = spr_uncore_units_ignore,
};
static const struct intel_uncore_init_fun gnr_uncore_init __initconst = {
static const struct uncore_plat_init gnr_uncore_init __initconst = {
.cpu_init = gnr_uncore_cpu_init,
.pci_init = gnr_uncore_pci_init,
.mmio_init = gnr_uncore_mmio_init,
.use_discovery = true,
.uncore_units_ignore = gnr_uncore_units_ignore,
.domain[0].base_is_pci = true,
.domain[0].discovery_base = UNCORE_DISCOVERY_TABLE_DEVICE,
.domain[0].units_ignore = gnr_uncore_units_ignore,
};
static const struct intel_uncore_init_fun generic_uncore_init __initconst = {
static const struct uncore_plat_init dmr_uncore_init __initconst = {
.pci_init = dmr_uncore_pci_init,
.mmio_init = dmr_uncore_mmio_init,
.domain[0].base_is_pci = true,
.domain[0].discovery_base = DMR_UNCORE_DISCOVERY_TABLE_DEVICE,
.domain[0].units_ignore = dmr_uncore_imh_units_ignore,
.domain[1].discovery_base = CBB_UNCORE_DISCOVERY_MSR,
.domain[1].units_ignore = dmr_uncore_cbb_units_ignore,
.domain[1].global_init = uncore_mmio_global_init,
};
static const struct uncore_plat_init generic_uncore_init __initconst = {
.cpu_init = intel_uncore_generic_uncore_cpu_init,
.pci_init = intel_uncore_generic_uncore_pci_init,
.mmio_init = intel_uncore_generic_uncore_mmio_init,
.domain[0].base_is_pci = true,
.domain[0].discovery_base = PCI_ANY_ID,
.domain[1].discovery_base = UNCORE_DISCOVERY_MSR,
};
static const struct x86_cpu_id intel_uncore_match[] __initconst = {
@@ -1894,6 +1923,8 @@ static const struct x86_cpu_id intel_uncore_match[] __initconst = {
X86_MATCH_VFM(INTEL_LUNARLAKE_M, &lnl_uncore_init),
X86_MATCH_VFM(INTEL_PANTHERLAKE_L, &ptl_uncore_init),
X86_MATCH_VFM(INTEL_WILDCATLAKE_L, &ptl_uncore_init),
X86_MATCH_VFM(INTEL_NOVALAKE, &nvl_uncore_init),
X86_MATCH_VFM(INTEL_NOVALAKE_L, &nvl_uncore_init),
X86_MATCH_VFM(INTEL_SAPPHIRERAPIDS_X, &spr_uncore_init),
X86_MATCH_VFM(INTEL_EMERALDRAPIDS_X, &spr_uncore_init),
X86_MATCH_VFM(INTEL_GRANITERAPIDS_X, &gnr_uncore_init),
@@ -1903,14 +1934,25 @@ static const struct x86_cpu_id intel_uncore_match[] __initconst = {
X86_MATCH_VFM(INTEL_ATOM_CRESTMONT_X, &gnr_uncore_init),
X86_MATCH_VFM(INTEL_ATOM_CRESTMONT, &gnr_uncore_init),
X86_MATCH_VFM(INTEL_ATOM_DARKMONT_X, &gnr_uncore_init),
X86_MATCH_VFM(INTEL_DIAMONDRAPIDS_X, &dmr_uncore_init),
{},
};
MODULE_DEVICE_TABLE(x86cpu, intel_uncore_match);
static bool uncore_use_discovery(struct uncore_plat_init *config)
{
for (int i = 0; i < UNCORE_DISCOVERY_DOMAINS; i++) {
if (config->domain[i].discovery_base)
return true;
}
return false;
}
static int __init intel_uncore_init(void)
{
const struct x86_cpu_id *id;
struct intel_uncore_init_fun *uncore_init;
struct uncore_plat_init *uncore_init;
int pret = 0, cret = 0, mret = 0, ret;
if (boot_cpu_has(X86_FEATURE_HYPERVISOR))
@@ -1921,16 +1963,15 @@ static int __init intel_uncore_init(void)
id = x86_match_cpu(intel_uncore_match);
if (!id) {
if (!uncore_no_discover && intel_uncore_has_discovery_tables(NULL))
uncore_init = (struct intel_uncore_init_fun *)&generic_uncore_init;
else
uncore_init = (struct uncore_plat_init *)&generic_uncore_init;
if (uncore_no_discover || !uncore_discovery(uncore_init))
return -ENODEV;
} else {
uncore_init = (struct intel_uncore_init_fun *)id->driver_data;
if (uncore_no_discover && uncore_init->use_discovery)
uncore_init = (struct uncore_plat_init *)id->driver_data;
if (uncore_no_discover && uncore_use_discovery(uncore_init))
return -ENODEV;
if (uncore_init->use_discovery &&
!intel_uncore_has_discovery_tables(uncore_init->uncore_units_ignore))
if (uncore_use_discovery(uncore_init) &&
!uncore_discovery(uncore_init))
return -ENODEV;
}

View File

@@ -33,6 +33,8 @@
#define UNCORE_EXTRA_PCI_DEV_MAX 4
#define UNCORE_EVENT_CONSTRAINT(c, n) EVENT_CONSTRAINT(c, n, 0xff)
#define UNCORE_EVENT_CONSTRAINT_RANGE(c, e, n) \
EVENT_CONSTRAINT_RANGE(c, e, n, 0xff)
#define UNCORE_IGNORE_END -1
@@ -47,6 +49,25 @@ struct uncore_event_desc;
struct freerunning_counters;
struct intel_uncore_topology;
struct uncore_discovery_domain {
/* MSR address or PCI device used as the discovery base */
u32 discovery_base;
bool base_is_pci;
int (*global_init)(u64 ctl);
/* The units in the discovery table should be ignored. */
int *units_ignore;
};
#define UNCORE_DISCOVERY_DOMAINS 2
struct uncore_plat_init {
void (*cpu_init)(void);
int (*pci_init)(void);
void (*mmio_init)(void);
struct uncore_discovery_domain domain[UNCORE_DISCOVERY_DOMAINS];
};
struct intel_uncore_type {
const char *name;
int num_counters;
@@ -597,6 +618,8 @@ extern struct pci_extra_dev *uncore_extra_pci_dev;
extern struct event_constraint uncore_constraint_empty;
extern int spr_uncore_units_ignore[];
extern int gnr_uncore_units_ignore[];
extern int dmr_uncore_imh_units_ignore[];
extern int dmr_uncore_cbb_units_ignore[];
/* uncore_snb.c */
int snb_uncore_pci_init(void);
@@ -613,6 +636,7 @@ void adl_uncore_cpu_init(void);
void lnl_uncore_cpu_init(void);
void mtl_uncore_cpu_init(void);
void ptl_uncore_cpu_init(void);
void nvl_uncore_cpu_init(void);
void tgl_uncore_mmio_init(void);
void tgl_l_uncore_mmio_init(void);
void adl_uncore_mmio_init(void);
@@ -645,6 +669,8 @@ void spr_uncore_mmio_init(void);
int gnr_uncore_pci_init(void);
void gnr_uncore_cpu_init(void);
void gnr_uncore_mmio_init(void);
int dmr_uncore_pci_init(void);
void dmr_uncore_mmio_init(void);
/* uncore_nhmex.c */
void nhmex_uncore_cpu_init(void);
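
As an illustration of the new per-domain discovery description (the platform and all example_* names below are hypothetical), a part that discovers one PMON domain through a PCI device and a second one through an MSR would be wired up roughly like this:

static const struct uncore_plat_init example_uncore_init __initconst = {
	.cpu_init			= example_uncore_cpu_init,	/* hypothetical */
	.mmio_init			= example_uncore_mmio_init,	/* hypothetical */
	.domain[0].base_is_pci		= true,
	.domain[0].discovery_base	= EXAMPLE_DISCOVERY_TABLE_DEVICE, /* hypothetical PCI device ID */
	.domain[0].units_ignore		= example_units_ignore,		/* hypothetical */
	.domain[1].discovery_base	= EXAMPLE_DISCOVERY_MSR,	/* hypothetical MSR address */
	.domain[1].global_init		= uncore_mmio_global_init,
};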

View File

@@ -12,24 +12,6 @@
static struct rb_root discovery_tables = RB_ROOT;
static int num_discovered_types[UNCORE_ACCESS_MAX];
static bool has_generic_discovery_table(void)
{
struct pci_dev *dev;
int dvsec;
dev = pci_get_device(PCI_VENDOR_ID_INTEL, UNCORE_DISCOVERY_TABLE_DEVICE, NULL);
if (!dev)
return false;
/* A discovery table device has the unique capability ID. */
dvsec = pci_find_next_ext_capability(dev, 0, UNCORE_EXT_CAP_ID_DISCOVERY);
pci_dev_put(dev);
if (dvsec)
return true;
return false;
}
static int logical_die_id;
static int get_device_die_id(struct pci_dev *dev)
@@ -52,7 +34,7 @@ static int get_device_die_id(struct pci_dev *dev)
static inline int __type_cmp(const void *key, const struct rb_node *b)
{
struct intel_uncore_discovery_type *type_b = __node_2_type(b);
const struct intel_uncore_discovery_type *type_b = __node_2_type(b);
const u16 *type_id = key;
if (type_b->type > *type_id)
@@ -115,7 +97,7 @@ get_uncore_discovery_type(struct uncore_unit_discovery *unit)
static inline int pmu_idx_cmp(const void *key, const struct rb_node *b)
{
struct intel_uncore_discovery_unit *unit;
const struct intel_uncore_discovery_unit *unit;
const unsigned int *id = key;
unit = rb_entry(b, struct intel_uncore_discovery_unit, node);
@@ -173,7 +155,7 @@ int intel_uncore_find_discovery_unit_id(struct rb_root *units, int die,
static inline bool unit_less(struct rb_node *a, const struct rb_node *b)
{
struct intel_uncore_discovery_unit *a_node, *b_node;
const struct intel_uncore_discovery_unit *a_node, *b_node;
a_node = rb_entry(a, struct intel_uncore_discovery_unit, node);
b_node = rb_entry(b, struct intel_uncore_discovery_unit, node);
@@ -259,23 +241,24 @@ uncore_insert_box_info(struct uncore_unit_discovery *unit,
}
static bool
uncore_ignore_unit(struct uncore_unit_discovery *unit, int *ignore)
uncore_ignore_unit(struct uncore_unit_discovery *unit,
struct uncore_discovery_domain *domain)
{
int i;
if (!ignore)
if (!domain || !domain->units_ignore)
return false;
for (i = 0; ignore[i] != UNCORE_IGNORE_END ; i++) {
if (unit->box_type == ignore[i])
for (i = 0; domain->units_ignore[i] != UNCORE_IGNORE_END ; i++) {
if (unit->box_type == domain->units_ignore[i])
return true;
}
return false;
}
static int __parse_discovery_table(resource_size_t addr, int die,
bool *parsed, int *ignore)
static int __parse_discovery_table(struct uncore_discovery_domain *domain,
resource_size_t addr, int die, bool *parsed)
{
struct uncore_global_discovery global;
struct uncore_unit_discovery unit;
@@ -303,6 +286,9 @@ static int __parse_discovery_table(resource_size_t addr, int die,
if (!io_addr)
return -ENOMEM;
if (domain->global_init && domain->global_init(global.ctl))
return -ENODEV;
/* Parsing Unit Discovery State */
for (i = 0; i < global.max_units; i++) {
memcpy_fromio(&unit, io_addr + (i + 1) * (global.stride * 8),
@@ -314,7 +300,7 @@ static int __parse_discovery_table(resource_size_t addr, int die,
if (unit.access_type >= UNCORE_ACCESS_MAX)
continue;
if (uncore_ignore_unit(&unit, ignore))
if (uncore_ignore_unit(&unit, domain))
continue;
uncore_insert_box_info(&unit, die);
@@ -325,9 +311,9 @@ static int __parse_discovery_table(resource_size_t addr, int die,
return 0;
}
static int parse_discovery_table(struct pci_dev *dev, int die,
u32 bar_offset, bool *parsed,
int *ignore)
static int parse_discovery_table(struct uncore_discovery_domain *domain,
struct pci_dev *dev, int die,
u32 bar_offset, bool *parsed)
{
resource_size_t addr;
u32 val;
@@ -347,20 +333,17 @@ static int parse_discovery_table(struct pci_dev *dev, int die,
}
#endif
return __parse_discovery_table(addr, die, parsed, ignore);
return __parse_discovery_table(domain, addr, die, parsed);
}
static bool intel_uncore_has_discovery_tables_pci(int *ignore)
static bool uncore_discovery_pci(struct uncore_discovery_domain *domain)
{
u32 device, val, entry_id, bar_offset;
int die, dvsec = 0, ret = true;
struct pci_dev *dev = NULL;
bool parsed = false;
if (has_generic_discovery_table())
device = UNCORE_DISCOVERY_TABLE_DEVICE;
else
device = PCI_ANY_ID;
device = domain->discovery_base;
/*
* Start a new search and iterates through the list of
@@ -386,7 +369,7 @@ static bool intel_uncore_has_discovery_tables_pci(int *ignore)
if (die < 0)
continue;
parse_discovery_table(dev, die, bar_offset, &parsed, ignore);
parse_discovery_table(domain, dev, die, bar_offset, &parsed);
}
}
@@ -399,7 +382,7 @@ static bool intel_uncore_has_discovery_tables_pci(int *ignore)
return ret;
}
static bool intel_uncore_has_discovery_tables_msr(int *ignore)
static bool uncore_discovery_msr(struct uncore_discovery_domain *domain)
{
unsigned long *die_mask;
bool parsed = false;
@@ -417,13 +400,13 @@ static bool intel_uncore_has_discovery_tables_msr(int *ignore)
if (__test_and_set_bit(die, die_mask))
continue;
if (rdmsrq_safe_on_cpu(cpu, UNCORE_DISCOVERY_MSR, &base))
if (rdmsrq_safe_on_cpu(cpu, domain->discovery_base, &base))
continue;
if (!base)
continue;
__parse_discovery_table(base, die, &parsed, ignore);
__parse_discovery_table(domain, base, die, &parsed);
}
cpus_read_unlock();
@@ -432,10 +415,23 @@ static bool intel_uncore_has_discovery_tables_msr(int *ignore)
return parsed;
}
bool intel_uncore_has_discovery_tables(int *ignore)
bool uncore_discovery(struct uncore_plat_init *init)
{
return intel_uncore_has_discovery_tables_msr(ignore) ||
intel_uncore_has_discovery_tables_pci(ignore);
struct uncore_discovery_domain *domain;
bool ret = false;
int i;
for (i = 0; i < UNCORE_DISCOVERY_DOMAINS; i++) {
domain = &init->domain[i];
if (domain->discovery_base) {
if (!domain->base_is_pci)
ret |= uncore_discovery_msr(domain);
else
ret |= uncore_discovery_pci(domain);
}
}
return ret;
}
void intel_uncore_clear_discovery_tables(void)
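The loop above walks UNCORE_DISCOVERY_DOMAINS entries and picks the MSR or PCI parser per domain. As a hedged illustration only (the DMR-side init table is not part of this excerpt, and the name, slot order and absence of a global_init hook are assumptions), a two-domain platform such as DMR could be described like this, reusing the constants and ignore lists that appear elsewhere in this diff:

/*
 * Hypothetical sketch, not taken from this series: one way a two-domain
 * platform such as DMR could be handed to uncore_discovery().  The slot
 * order and the missing global_init hook are assumptions; the constants
 * and ignore lists are the ones visible elsewhere in this diff.
 */
static struct uncore_plat_init dmr_plat_init_example = {
        .pci_init  = dmr_uncore_pci_init,
        .mmio_init = dmr_uncore_mmio_init,
        .domain = {
                {       /* CBB PMON discovery via MSR */
                        .discovery_base = CBB_UNCORE_DISCOVERY_MSR,
                        .base_is_pci    = false,
                        .units_ignore   = dmr_uncore_cbb_units_ignore,
                },
                {       /* IMH PMON discovery via PCI */
                        .discovery_base = DMR_UNCORE_DISCOVERY_TABLE_DEVICE,
                        .base_is_pci    = true,
                        .units_ignore   = dmr_uncore_imh_units_ignore,
                },
        },
};

uncore_discovery(&dmr_plat_init_example) would then parse the CBB table through rdmsrq_safe_on_cpu() and the IMH table through the PCI DVSEC path.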

@@ -2,9 +2,15 @@
/* Store the full address of the global discovery table */
#define UNCORE_DISCOVERY_MSR 0x201e
/* Base address of uncore perfmon discovery table for CBB domain */
#define CBB_UNCORE_DISCOVERY_MSR 0x710
/* Base address of uncore perfmon discovery table for the package */
#define PACKAGE_UNCORE_DISCOVERY_MSR 0x711
/* Generic device ID of a discovery table device */
#define UNCORE_DISCOVERY_TABLE_DEVICE 0x09a7
/* Device ID used on DMR */
#define DMR_UNCORE_DISCOVERY_TABLE_DEVICE 0x09a1
/* Capability ID for a discovery table device */
#define UNCORE_EXT_CAP_ID_DISCOVERY 0x23
/* First DVSEC offset */
@@ -136,7 +142,7 @@ struct intel_uncore_discovery_type {
u16 num_units; /* number of units */
};
bool intel_uncore_has_discovery_tables(int *ignore);
bool uncore_discovery(struct uncore_plat_init *init);
void intel_uncore_clear_discovery_tables(void);
void intel_uncore_generic_uncore_cpu_init(void);
int intel_uncore_generic_uncore_pci_init(void);

@@ -245,6 +245,30 @@
#define MTL_UNC_HBO_CTR 0x2048
#define MTL_UNC_HBO_CTRL 0x2042
/* PTL Low Power Bridge register */
#define PTL_UNC_IA_CORE_BRIDGE_PER_CTR0 0x2028
#define PTL_UNC_IA_CORE_BRIDGE_PERFEVTSEL0 0x2022
/* PTL Santa register */
#define PTL_UNC_SANTA_CTR0 0x2418
#define PTL_UNC_SANTA_CTRL0 0x2412
/* PTL cNCU register */
#define PTL_UNC_CNCU_MSR_OFFSET 0x140
/* NVL cNCU register */
#define NVL_UNC_CNCU_BOX_CTL 0x202e
#define NVL_UNC_CNCU_FIXED_CTR 0x2028
#define NVL_UNC_CNCU_FIXED_CTRL 0x2022
/* NVL SANTA register */
#define NVL_UNC_SANTA_CTR0 0x2048
#define NVL_UNC_SANTA_CTRL0 0x2042
/* NVL CBOX register */
#define NVL_UNC_CBOX_PER_CTR0 0x2108
#define NVL_UNC_CBOX_PERFEVTSEL0 0x2102
DEFINE_UNCORE_FORMAT_ATTR(event, event, "config:0-7");
DEFINE_UNCORE_FORMAT_ATTR(umask, umask, "config:8-15");
DEFINE_UNCORE_FORMAT_ATTR(chmask, chmask, "config:8-11");
@@ -1921,8 +1945,36 @@ void ptl_uncore_mmio_init(void)
ptl_uncores);
}
static struct intel_uncore_type ptl_uncore_ia_core_bridge = {
.name = "ia_core_bridge",
.num_counters = 2,
.num_boxes = 1,
.perf_ctr_bits = 48,
.perf_ctr = PTL_UNC_IA_CORE_BRIDGE_PER_CTR0,
.event_ctl = PTL_UNC_IA_CORE_BRIDGE_PERFEVTSEL0,
.event_mask = ADL_UNC_RAW_EVENT_MASK,
.ops = &icl_uncore_msr_ops,
.format_group = &adl_uncore_format_group,
};
static struct intel_uncore_type ptl_uncore_santa = {
.name = "santa",
.num_counters = 2,
.num_boxes = 2,
.perf_ctr_bits = 48,
.perf_ctr = PTL_UNC_SANTA_CTR0,
.event_ctl = PTL_UNC_SANTA_CTRL0,
.event_mask = ADL_UNC_RAW_EVENT_MASK,
.msr_offset = SNB_UNC_CBO_MSR_OFFSET,
.ops = &icl_uncore_msr_ops,
.format_group = &adl_uncore_format_group,
};
static struct intel_uncore_type *ptl_msr_uncores[] = {
&mtl_uncore_cbox,
&ptl_uncore_ia_core_bridge,
&ptl_uncore_santa,
&mtl_uncore_cncu,
NULL
};
@@ -1930,7 +1982,40 @@ void ptl_uncore_cpu_init(void)
{
mtl_uncore_cbox.num_boxes = 6;
mtl_uncore_cbox.ops = &lnl_uncore_msr_ops;
mtl_uncore_cncu.num_counters = 2;
mtl_uncore_cncu.num_boxes = 2;
mtl_uncore_cncu.msr_offset = PTL_UNC_CNCU_MSR_OFFSET;
mtl_uncore_cncu.single_fixed = 0;
uncore_msr_uncores = ptl_msr_uncores;
}
/* end of Panther Lake uncore support */
/* Nova Lake uncore support */
static struct intel_uncore_type *nvl_msr_uncores[] = {
&mtl_uncore_cbox,
&ptl_uncore_santa,
&mtl_uncore_cncu,
NULL
};
void nvl_uncore_cpu_init(void)
{
mtl_uncore_cbox.num_boxes = 12;
mtl_uncore_cbox.perf_ctr = NVL_UNC_CBOX_PER_CTR0;
mtl_uncore_cbox.event_ctl = NVL_UNC_CBOX_PERFEVTSEL0;
ptl_uncore_santa.perf_ctr = NVL_UNC_SANTA_CTR0;
ptl_uncore_santa.event_ctl = NVL_UNC_SANTA_CTRL0;
mtl_uncore_cncu.box_ctl = NVL_UNC_CNCU_BOX_CTL;
mtl_uncore_cncu.fixed_ctr = NVL_UNC_CNCU_FIXED_CTR;
mtl_uncore_cncu.fixed_ctl = NVL_UNC_CNCU_FIXED_CTRL;
uncore_msr_uncores = nvl_msr_uncores;
}
/* end of Nova Lake uncore support */

@@ -471,6 +471,18 @@
#define SPR_C0_MSR_PMON_BOX_FILTER0 0x200e
/* DMR */
#define DMR_IMH1_HIOP_MMIO_BASE 0x1ffff6ae7000
#define DMR_HIOP_MMIO_SIZE 0x8000
#define DMR_CXLCM_EVENT_MASK_EXT 0xf
#define DMR_HAMVF_EVENT_MASK_EXT 0xffffffff
#define DMR_PCIE4_EVENT_MASK_EXT 0xffffff
#define UNCORE_DMR_ITC 0x30
#define DMR_IMC_PMON_FIXED_CTR 0x18
#define DMR_IMC_PMON_FIXED_CTL 0x10
DEFINE_UNCORE_FORMAT_ATTR(event, event, "config:0-7");
DEFINE_UNCORE_FORMAT_ATTR(event2, event, "config:0-6");
DEFINE_UNCORE_FORMAT_ATTR(event_ext, event, "config:0-7,21");
@@ -486,6 +498,10 @@ DEFINE_UNCORE_FORMAT_ATTR(edge, edge, "config:18");
DEFINE_UNCORE_FORMAT_ATTR(tid_en, tid_en, "config:19");
DEFINE_UNCORE_FORMAT_ATTR(tid_en2, tid_en, "config:16");
DEFINE_UNCORE_FORMAT_ATTR(inv, inv, "config:23");
DEFINE_UNCORE_FORMAT_ATTR(inv2, inv, "config:21");
DEFINE_UNCORE_FORMAT_ATTR(thresh_ext, thresh_ext, "config:32-35");
DEFINE_UNCORE_FORMAT_ATTR(thresh10, thresh, "config:23-32");
DEFINE_UNCORE_FORMAT_ATTR(thresh9_2, thresh, "config:23-31");
DEFINE_UNCORE_FORMAT_ATTR(thresh9, thresh, "config:24-35");
DEFINE_UNCORE_FORMAT_ATTR(thresh8, thresh, "config:24-31");
DEFINE_UNCORE_FORMAT_ATTR(thresh6, thresh, "config:24-29");
@@ -494,6 +510,13 @@ DEFINE_UNCORE_FORMAT_ATTR(occ_sel, occ_sel, "config:14-15");
DEFINE_UNCORE_FORMAT_ATTR(occ_invert, occ_invert, "config:30");
DEFINE_UNCORE_FORMAT_ATTR(occ_edge, occ_edge, "config:14-51");
DEFINE_UNCORE_FORMAT_ATTR(occ_edge_det, occ_edge_det, "config:31");
DEFINE_UNCORE_FORMAT_ATTR(port_en, port_en, "config:32-35");
DEFINE_UNCORE_FORMAT_ATTR(rs3_sel, rs3_sel, "config:36");
DEFINE_UNCORE_FORMAT_ATTR(rx_sel, rx_sel, "config:37");
DEFINE_UNCORE_FORMAT_ATTR(tx_sel, tx_sel, "config:38");
DEFINE_UNCORE_FORMAT_ATTR(iep_sel, iep_sel, "config:39");
DEFINE_UNCORE_FORMAT_ATTR(vc_sel, vc_sel, "config:40-47");
DEFINE_UNCORE_FORMAT_ATTR(port_sel, port_sel, "config:48-55");
DEFINE_UNCORE_FORMAT_ATTR(ch_mask, ch_mask, "config:36-43");
DEFINE_UNCORE_FORMAT_ATTR(ch_mask2, ch_mask, "config:36-47");
DEFINE_UNCORE_FORMAT_ATTR(fc_mask, fc_mask, "config:44-46");
@@ -813,76 +836,37 @@ static struct intel_uncore_ops snbep_uncore_pci_ops = {
static struct event_constraint snbep_uncore_cbox_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x01, 0x1),
UNCORE_EVENT_CONSTRAINT(0x02, 0x3),
UNCORE_EVENT_CONSTRAINT(0x04, 0x3),
UNCORE_EVENT_CONSTRAINT(0x05, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x04, 0x5, 0x3),
UNCORE_EVENT_CONSTRAINT(0x07, 0x3),
UNCORE_EVENT_CONSTRAINT(0x09, 0x3),
UNCORE_EVENT_CONSTRAINT(0x11, 0x1),
UNCORE_EVENT_CONSTRAINT(0x12, 0x3),
UNCORE_EVENT_CONSTRAINT(0x13, 0x3),
UNCORE_EVENT_CONSTRAINT(0x1b, 0xc),
UNCORE_EVENT_CONSTRAINT(0x1c, 0xc),
UNCORE_EVENT_CONSTRAINT(0x1d, 0xc),
UNCORE_EVENT_CONSTRAINT(0x1e, 0xc),
UNCORE_EVENT_CONSTRAINT_RANGE(0x12, 0x13, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x1b, 0x1e, 0xc),
UNCORE_EVENT_CONSTRAINT(0x1f, 0xe),
UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
UNCORE_EVENT_CONSTRAINT(0x31, 0x3),
UNCORE_EVENT_CONSTRAINT(0x32, 0x3),
UNCORE_EVENT_CONSTRAINT(0x33, 0x3),
UNCORE_EVENT_CONSTRAINT(0x34, 0x3),
UNCORE_EVENT_CONSTRAINT(0x35, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x31, 0x35, 0x3),
UNCORE_EVENT_CONSTRAINT(0x36, 0x1),
UNCORE_EVENT_CONSTRAINT(0x37, 0x3),
UNCORE_EVENT_CONSTRAINT(0x38, 0x3),
UNCORE_EVENT_CONSTRAINT(0x39, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x37, 0x39, 0x3),
UNCORE_EVENT_CONSTRAINT(0x3b, 0x1),
EVENT_CONSTRAINT_END
};
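The conversions in this file (and further below) fold runs of identical per-event constraints into UNCORE_EVENT_CONSTRAINT_RANGE() entries. The macro's definition is not shown in this excerpt; the stand-alone sketch below only encodes the assumed semantics, that every event code in the inclusive range gets the same counter mask, so the table shrinkage above can be checked by eye:

/*
 * Illustration only: UNCORE_EVENT_CONSTRAINT_RANGE() itself is not defined
 * in this excerpt.  The assumption encoded here is that a range entry
 * (first, last, cmask) behaves like the run of single-event constraints it
 * replaces: every event code in [first, last] gets the same counter mask.
 */
#include <assert.h>

struct range_constraint { unsigned int first, last, cmask; };

static unsigned int cmask_for(const struct range_constraint *r, unsigned int code)
{
        return (code >= r->first && code <= r->last) ? r->cmask : 0;
}

int main(void)
{
        /* Stand-in for UNCORE_EVENT_CONSTRAINT_RANGE(0x1b, 0x1e, 0xc) above. */
        const struct range_constraint r = { 0x1b, 0x1e, 0xc };

        assert(cmask_for(&r, 0x1b) == 0xc); /* was UNCORE_EVENT_CONSTRAINT(0x1b, 0xc) */
        assert(cmask_for(&r, 0x1e) == 0xc); /* was UNCORE_EVENT_CONSTRAINT(0x1e, 0xc) */
        assert(cmask_for(&r, 0x1f) == 0);   /* 0x1f keeps its own entry with mask 0xe */
        return 0;
}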
static struct event_constraint snbep_uncore_r2pcie_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x10, 0x3),
UNCORE_EVENT_CONSTRAINT(0x11, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x10, 0x11, 0x3),
UNCORE_EVENT_CONSTRAINT(0x12, 0x1),
UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
UNCORE_EVENT_CONSTRAINT(0x24, 0x3),
UNCORE_EVENT_CONSTRAINT(0x25, 0x3),
UNCORE_EVENT_CONSTRAINT(0x26, 0x3),
UNCORE_EVENT_CONSTRAINT(0x32, 0x3),
UNCORE_EVENT_CONSTRAINT(0x33, 0x3),
UNCORE_EVENT_CONSTRAINT(0x34, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x24, 0x26, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x32, 0x34, 0x3),
EVENT_CONSTRAINT_END
};
static struct event_constraint snbep_uncore_r3qpi_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x10, 0x3),
UNCORE_EVENT_CONSTRAINT(0x11, 0x3),
UNCORE_EVENT_CONSTRAINT(0x12, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x10, 0x12, 0x3),
UNCORE_EVENT_CONSTRAINT(0x13, 0x1),
UNCORE_EVENT_CONSTRAINT(0x20, 0x3),
UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
UNCORE_EVENT_CONSTRAINT(0x22, 0x3),
UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
UNCORE_EVENT_CONSTRAINT(0x24, 0x3),
UNCORE_EVENT_CONSTRAINT(0x25, 0x3),
UNCORE_EVENT_CONSTRAINT(0x26, 0x3),
UNCORE_EVENT_CONSTRAINT(0x28, 0x3),
UNCORE_EVENT_CONSTRAINT(0x29, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2a, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2b, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2c, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2d, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2e, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2f, 0x3),
UNCORE_EVENT_CONSTRAINT(0x30, 0x3),
UNCORE_EVENT_CONSTRAINT(0x31, 0x3),
UNCORE_EVENT_CONSTRAINT(0x32, 0x3),
UNCORE_EVENT_CONSTRAINT(0x33, 0x3),
UNCORE_EVENT_CONSTRAINT(0x34, 0x3),
UNCORE_EVENT_CONSTRAINT(0x36, 0x3),
UNCORE_EVENT_CONSTRAINT(0x37, 0x3),
UNCORE_EVENT_CONSTRAINT(0x38, 0x3),
UNCORE_EVENT_CONSTRAINT(0x39, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x20, 0x26, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x28, 0x34, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x36, 0x39, 0x3),
EVENT_CONSTRAINT_END
};
@@ -3011,24 +2995,15 @@ static struct intel_uncore_type hswep_uncore_qpi = {
};
static struct event_constraint hswep_uncore_r2pcie_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x10, 0x3),
UNCORE_EVENT_CONSTRAINT(0x11, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x10, 0x11, 0x3),
UNCORE_EVENT_CONSTRAINT(0x13, 0x1),
UNCORE_EVENT_CONSTRAINT(0x23, 0x1),
UNCORE_EVENT_CONSTRAINT(0x24, 0x1),
UNCORE_EVENT_CONSTRAINT(0x25, 0x1),
UNCORE_EVENT_CONSTRAINT_RANGE(0x23, 0x25, 0x1),
UNCORE_EVENT_CONSTRAINT(0x26, 0x3),
UNCORE_EVENT_CONSTRAINT(0x27, 0x1),
UNCORE_EVENT_CONSTRAINT(0x28, 0x3),
UNCORE_EVENT_CONSTRAINT(0x29, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x28, 0x29, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2a, 0x1),
UNCORE_EVENT_CONSTRAINT(0x2b, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2c, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2d, 0x3),
UNCORE_EVENT_CONSTRAINT(0x32, 0x3),
UNCORE_EVENT_CONSTRAINT(0x33, 0x3),
UNCORE_EVENT_CONSTRAINT(0x34, 0x3),
UNCORE_EVENT_CONSTRAINT(0x35, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x2b, 0x2d, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x32, 0x35, 0x3),
EVENT_CONSTRAINT_END
};
@@ -3043,38 +3018,17 @@ static struct intel_uncore_type hswep_uncore_r2pcie = {
static struct event_constraint hswep_uncore_r3qpi_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x01, 0x3),
UNCORE_EVENT_CONSTRAINT(0x07, 0x7),
UNCORE_EVENT_CONSTRAINT(0x08, 0x7),
UNCORE_EVENT_CONSTRAINT(0x09, 0x7),
UNCORE_EVENT_CONSTRAINT(0x0a, 0x7),
UNCORE_EVENT_CONSTRAINT_RANGE(0x7, 0x0a, 0x7),
UNCORE_EVENT_CONSTRAINT(0x0e, 0x7),
UNCORE_EVENT_CONSTRAINT(0x10, 0x3),
UNCORE_EVENT_CONSTRAINT(0x11, 0x3),
UNCORE_EVENT_CONSTRAINT(0x12, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x10, 0x12, 0x3),
UNCORE_EVENT_CONSTRAINT(0x13, 0x1),
UNCORE_EVENT_CONSTRAINT(0x14, 0x3),
UNCORE_EVENT_CONSTRAINT(0x15, 0x3),
UNCORE_EVENT_CONSTRAINT(0x1f, 0x3),
UNCORE_EVENT_CONSTRAINT(0x20, 0x3),
UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
UNCORE_EVENT_CONSTRAINT(0x22, 0x3),
UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
UNCORE_EVENT_CONSTRAINT(0x25, 0x3),
UNCORE_EVENT_CONSTRAINT(0x26, 0x3),
UNCORE_EVENT_CONSTRAINT(0x28, 0x3),
UNCORE_EVENT_CONSTRAINT(0x29, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2c, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2d, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2e, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2f, 0x3),
UNCORE_EVENT_CONSTRAINT(0x31, 0x3),
UNCORE_EVENT_CONSTRAINT(0x32, 0x3),
UNCORE_EVENT_CONSTRAINT(0x33, 0x3),
UNCORE_EVENT_CONSTRAINT(0x34, 0x3),
UNCORE_EVENT_CONSTRAINT(0x36, 0x3),
UNCORE_EVENT_CONSTRAINT(0x37, 0x3),
UNCORE_EVENT_CONSTRAINT(0x38, 0x3),
UNCORE_EVENT_CONSTRAINT(0x39, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x14, 0x15, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x1f, 0x23, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x25, 0x26, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x28, 0x29, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x2c, 0x2f, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x31, 0x34, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x36, 0x39, 0x3),
EVENT_CONSTRAINT_END
};
@@ -3348,8 +3302,7 @@ static struct event_constraint bdx_uncore_r2pcie_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x25, 0x1),
UNCORE_EVENT_CONSTRAINT(0x26, 0x3),
UNCORE_EVENT_CONSTRAINT(0x28, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2c, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2d, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x2c, 0x2d, 0x3),
EVENT_CONSTRAINT_END
};
@@ -3364,35 +3317,18 @@ static struct intel_uncore_type bdx_uncore_r2pcie = {
static struct event_constraint bdx_uncore_r3qpi_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x01, 0x7),
UNCORE_EVENT_CONSTRAINT(0x07, 0x7),
UNCORE_EVENT_CONSTRAINT(0x08, 0x7),
UNCORE_EVENT_CONSTRAINT(0x09, 0x7),
UNCORE_EVENT_CONSTRAINT(0x0a, 0x7),
UNCORE_EVENT_CONSTRAINT_RANGE(0x07, 0x0a, 0x7),
UNCORE_EVENT_CONSTRAINT(0x0e, 0x7),
UNCORE_EVENT_CONSTRAINT(0x10, 0x3),
UNCORE_EVENT_CONSTRAINT(0x11, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x10, 0x11, 0x3),
UNCORE_EVENT_CONSTRAINT(0x13, 0x1),
UNCORE_EVENT_CONSTRAINT(0x14, 0x3),
UNCORE_EVENT_CONSTRAINT(0x15, 0x3),
UNCORE_EVENT_CONSTRAINT(0x1f, 0x3),
UNCORE_EVENT_CONSTRAINT(0x20, 0x3),
UNCORE_EVENT_CONSTRAINT(0x21, 0x3),
UNCORE_EVENT_CONSTRAINT(0x22, 0x3),
UNCORE_EVENT_CONSTRAINT(0x23, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x14, 0x15, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x1f, 0x23, 0x3),
UNCORE_EVENT_CONSTRAINT(0x25, 0x3),
UNCORE_EVENT_CONSTRAINT(0x26, 0x3),
UNCORE_EVENT_CONSTRAINT(0x28, 0x3),
UNCORE_EVENT_CONSTRAINT(0x29, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2c, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2d, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2e, 0x3),
UNCORE_EVENT_CONSTRAINT(0x2f, 0x3),
UNCORE_EVENT_CONSTRAINT(0x33, 0x3),
UNCORE_EVENT_CONSTRAINT(0x34, 0x3),
UNCORE_EVENT_CONSTRAINT(0x36, 0x3),
UNCORE_EVENT_CONSTRAINT(0x37, 0x3),
UNCORE_EVENT_CONSTRAINT(0x38, 0x3),
UNCORE_EVENT_CONSTRAINT(0x39, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x28, 0x29, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x2c, 0x2f, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x33, 0x34, 0x3),
UNCORE_EVENT_CONSTRAINT_RANGE(0x36, 0x39, 0x3),
EVENT_CONSTRAINT_END
};
@@ -3699,8 +3635,7 @@ static struct event_constraint skx_uncore_iio_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x95, 0xc),
UNCORE_EVENT_CONSTRAINT(0xc0, 0xc),
UNCORE_EVENT_CONSTRAINT(0xc5, 0xc),
UNCORE_EVENT_CONSTRAINT(0xd4, 0xc),
UNCORE_EVENT_CONSTRAINT(0xd5, 0xc),
UNCORE_EVENT_CONSTRAINT_RANGE(0xd4, 0xd5, 0xc),
EVENT_CONSTRAINT_END
};
@@ -4049,34 +3984,24 @@ static struct freerunning_counters skx_iio_freerunning[] = {
[SKX_IIO_MSR_UTIL] = { 0xb08, 0x1, 0x10, 8, 36 },
};
#define INTEL_UNCORE_FR_EVENT_DESC(name, umask, scl) \
INTEL_UNCORE_EVENT_DESC(name, \
"event=0xff,umask=" __stringify(umask)),\
INTEL_UNCORE_EVENT_DESC(name.scale, __stringify(scl)), \
INTEL_UNCORE_EVENT_DESC(name.unit, "MiB")
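The new INTEL_UNCORE_FR_EVENT_DESC() helper collapses the three descriptors that were previously spelled out per port. For reference, one invocation expands as follows (derived directly from the macro definition above):

/*
 * For reference, one invocation of the new helper,
 *
 *      INTEL_UNCORE_FR_EVENT_DESC(bw_in_port0, 0x20, 3.814697266e-6),
 *
 * expands, per the definition above, to exactly the three descriptors the
 * hunk below removes:
 *
 *      INTEL_UNCORE_EVENT_DESC(bw_in_port0,       "event=0xff,umask=0x20"),
 *      INTEL_UNCORE_EVENT_DESC(bw_in_port0.scale, "3.814697266e-6"),
 *      INTEL_UNCORE_EVENT_DESC(bw_in_port0.unit,  "MiB"),
 */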
static struct uncore_event_desc skx_uncore_iio_freerunning_events[] = {
/* Free-Running IO CLOCKS Counter */
INTEL_UNCORE_EVENT_DESC(ioclk, "event=0xff,umask=0x10"),
/* Free-Running IIO BANDWIDTH Counters */
INTEL_UNCORE_EVENT_DESC(bw_in_port0, "event=0xff,umask=0x20"),
INTEL_UNCORE_EVENT_DESC(bw_in_port0.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_in_port0.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port1, "event=0xff,umask=0x21"),
INTEL_UNCORE_EVENT_DESC(bw_in_port1.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_in_port1.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port2, "event=0xff,umask=0x22"),
INTEL_UNCORE_EVENT_DESC(bw_in_port2.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_in_port2.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port3, "event=0xff,umask=0x23"),
INTEL_UNCORE_EVENT_DESC(bw_in_port3.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_in_port3.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_out_port0, "event=0xff,umask=0x24"),
INTEL_UNCORE_EVENT_DESC(bw_out_port0.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_out_port0.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_out_port1, "event=0xff,umask=0x25"),
INTEL_UNCORE_EVENT_DESC(bw_out_port1.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_out_port1.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_out_port2, "event=0xff,umask=0x26"),
INTEL_UNCORE_EVENT_DESC(bw_out_port2.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_out_port2.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_out_port3, "event=0xff,umask=0x27"),
INTEL_UNCORE_EVENT_DESC(bw_out_port3.scale, "3.814697266e-6"),
INTEL_UNCORE_EVENT_DESC(bw_out_port3.unit, "MiB"),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port0, 0x20, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port1, 0x21, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port2, 0x22, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port3, 0x23, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_out_port0, 0x24, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_out_port1, 0x25, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_out_port2, 0x26, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_out_port3, 0x27, 3.814697266e-6),
/* Free-running IIO UTILIZATION Counters */
INTEL_UNCORE_EVENT_DESC(util_in_port0, "event=0xff,umask=0x30"),
INTEL_UNCORE_EVENT_DESC(util_out_port0, "event=0xff,umask=0x31"),
@@ -4466,14 +4391,9 @@ static struct intel_uncore_type skx_uncore_m2pcie = {
};
static struct event_constraint skx_uncore_m3upi_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x1d, 0x1),
UNCORE_EVENT_CONSTRAINT(0x1e, 0x1),
UNCORE_EVENT_CONSTRAINT_RANGE(0x1d, 0x1e, 0x1),
UNCORE_EVENT_CONSTRAINT(0x40, 0x7),
UNCORE_EVENT_CONSTRAINT(0x4e, 0x7),
UNCORE_EVENT_CONSTRAINT(0x4f, 0x7),
UNCORE_EVENT_CONSTRAINT(0x50, 0x7),
UNCORE_EVENT_CONSTRAINT(0x51, 0x7),
UNCORE_EVENT_CONSTRAINT(0x52, 0x7),
UNCORE_EVENT_CONSTRAINT_RANGE(0x4e, 0x52, 0x7),
EVENT_CONSTRAINT_END
};
@@ -4891,30 +4811,14 @@ static struct uncore_event_desc snr_uncore_iio_freerunning_events[] = {
/* Free-Running IIO CLOCKS Counter */
INTEL_UNCORE_EVENT_DESC(ioclk, "event=0xff,umask=0x10"),
/* Free-Running IIO BANDWIDTH IN Counters */
INTEL_UNCORE_EVENT_DESC(bw_in_port0, "event=0xff,umask=0x20"),
INTEL_UNCORE_EVENT_DESC(bw_in_port0.scale, "3.0517578125e-5"),
INTEL_UNCORE_EVENT_DESC(bw_in_port0.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port1, "event=0xff,umask=0x21"),
INTEL_UNCORE_EVENT_DESC(bw_in_port1.scale, "3.0517578125e-5"),
INTEL_UNCORE_EVENT_DESC(bw_in_port1.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port2, "event=0xff,umask=0x22"),
INTEL_UNCORE_EVENT_DESC(bw_in_port2.scale, "3.0517578125e-5"),
INTEL_UNCORE_EVENT_DESC(bw_in_port2.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port3, "event=0xff,umask=0x23"),
INTEL_UNCORE_EVENT_DESC(bw_in_port3.scale, "3.0517578125e-5"),
INTEL_UNCORE_EVENT_DESC(bw_in_port3.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port4, "event=0xff,umask=0x24"),
INTEL_UNCORE_EVENT_DESC(bw_in_port4.scale, "3.0517578125e-5"),
INTEL_UNCORE_EVENT_DESC(bw_in_port4.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port5, "event=0xff,umask=0x25"),
INTEL_UNCORE_EVENT_DESC(bw_in_port5.scale, "3.0517578125e-5"),
INTEL_UNCORE_EVENT_DESC(bw_in_port5.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port6, "event=0xff,umask=0x26"),
INTEL_UNCORE_EVENT_DESC(bw_in_port6.scale, "3.0517578125e-5"),
INTEL_UNCORE_EVENT_DESC(bw_in_port6.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(bw_in_port7, "event=0xff,umask=0x27"),
INTEL_UNCORE_EVENT_DESC(bw_in_port7.scale, "3.0517578125e-5"),
INTEL_UNCORE_EVENT_DESC(bw_in_port7.unit, "MiB"),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port0, 0x20, 3.0517578125e-5),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port1, 0x21, 3.0517578125e-5),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port2, 0x22, 3.0517578125e-5),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port3, 0x23, 3.0517578125e-5),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port4, 0x24, 3.0517578125e-5),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port5, 0x25, 3.0517578125e-5),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port6, 0x26, 3.0517578125e-5),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port7, 0x27, 3.0517578125e-5),
{ /* end: all zeroes */ },
};
@@ -5247,12 +5151,8 @@ static struct freerunning_counters snr_imc_freerunning[] = {
static struct uncore_event_desc snr_uncore_imc_freerunning_events[] = {
INTEL_UNCORE_EVENT_DESC(dclk, "event=0xff,umask=0x10"),
INTEL_UNCORE_EVENT_DESC(read, "event=0xff,umask=0x20"),
INTEL_UNCORE_EVENT_DESC(read.scale, "6.103515625e-5"),
INTEL_UNCORE_EVENT_DESC(read.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(write, "event=0xff,umask=0x21"),
INTEL_UNCORE_EVENT_DESC(write.scale, "6.103515625e-5"),
INTEL_UNCORE_EVENT_DESC(write.unit, "MiB"),
INTEL_UNCORE_FR_EVENT_DESC(read, 0x20, 6.103515625e-5),
INTEL_UNCORE_FR_EVENT_DESC(write, 0x21, 6.103515625e-5),
{ /* end: all zeroes */ },
};
@@ -5659,14 +5559,9 @@ static struct intel_uncore_type icx_uncore_upi = {
};
static struct event_constraint icx_uncore_m3upi_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x1c, 0x1),
UNCORE_EVENT_CONSTRAINT(0x1d, 0x1),
UNCORE_EVENT_CONSTRAINT(0x1e, 0x1),
UNCORE_EVENT_CONSTRAINT(0x1f, 0x1),
UNCORE_EVENT_CONSTRAINT_RANGE(0x1c, 0x1f, 0x1),
UNCORE_EVENT_CONSTRAINT(0x40, 0x7),
UNCORE_EVENT_CONSTRAINT(0x4e, 0x7),
UNCORE_EVENT_CONSTRAINT(0x4f, 0x7),
UNCORE_EVENT_CONSTRAINT(0x50, 0x7),
UNCORE_EVENT_CONSTRAINT_RANGE(0x4e, 0x50, 0x7),
EVENT_CONSTRAINT_END
};
@@ -5817,19 +5712,10 @@ static struct freerunning_counters icx_imc_freerunning[] = {
static struct uncore_event_desc icx_uncore_imc_freerunning_events[] = {
INTEL_UNCORE_EVENT_DESC(dclk, "event=0xff,umask=0x10"),
INTEL_UNCORE_EVENT_DESC(read, "event=0xff,umask=0x20"),
INTEL_UNCORE_EVENT_DESC(read.scale, "6.103515625e-5"),
INTEL_UNCORE_EVENT_DESC(read.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(write, "event=0xff,umask=0x21"),
INTEL_UNCORE_EVENT_DESC(write.scale, "6.103515625e-5"),
INTEL_UNCORE_EVENT_DESC(write.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(ddrt_read, "event=0xff,umask=0x30"),
INTEL_UNCORE_EVENT_DESC(ddrt_read.scale, "6.103515625e-5"),
INTEL_UNCORE_EVENT_DESC(ddrt_read.unit, "MiB"),
INTEL_UNCORE_EVENT_DESC(ddrt_write, "event=0xff,umask=0x31"),
INTEL_UNCORE_EVENT_DESC(ddrt_write.scale, "6.103515625e-5"),
INTEL_UNCORE_EVENT_DESC(ddrt_write.unit, "MiB"),
INTEL_UNCORE_FR_EVENT_DESC(read, 0x20, 6.103515625e-5),
INTEL_UNCORE_FR_EVENT_DESC(write, 0x21, 6.103515625e-5),
INTEL_UNCORE_FR_EVENT_DESC(ddrt_read, 0x30, 6.103515625e-5),
INTEL_UNCORE_FR_EVENT_DESC(ddrt_write, 0x31, 6.103515625e-5),
{ /* end: all zeroes */ },
};
@@ -6158,10 +6044,7 @@ static struct intel_uncore_ops spr_uncore_mmio_offs8_ops = {
static struct event_constraint spr_uncore_cxlcm_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x02, 0x0f),
UNCORE_EVENT_CONSTRAINT(0x05, 0x0f),
UNCORE_EVENT_CONSTRAINT(0x40, 0xf0),
UNCORE_EVENT_CONSTRAINT(0x41, 0xf0),
UNCORE_EVENT_CONSTRAINT(0x42, 0xf0),
UNCORE_EVENT_CONSTRAINT(0x43, 0xf0),
UNCORE_EVENT_CONSTRAINT_RANGE(0x40, 0x43, 0xf0),
UNCORE_EVENT_CONSTRAINT(0x4b, 0xf0),
UNCORE_EVENT_CONSTRAINT(0x52, 0xf0),
EVENT_CONSTRAINT_END
@@ -6462,7 +6345,11 @@ static int uncore_type_max_boxes(struct intel_uncore_type **types,
for (node = rb_first(type->boxes); node; node = rb_next(node)) {
unit = rb_entry(node, struct intel_uncore_discovery_unit, node);
if (unit->id > max)
/*
* On DMR IMH2, unit IDs start from 0x8000 and must not be
* counted towards the number of boxes.
*/
if ((unit->id > max) && (unit->id < 0x8000))
max = unit->id;
}
return max + 1;
@@ -6709,3 +6596,386 @@ void gnr_uncore_mmio_init(void)
}
/* end of GNR uncore support */
/* DMR uncore support */
#define UNCORE_DMR_NUM_UNCORE_TYPES 52
static struct attribute *dmr_imc_uncore_formats_attr[] = {
&format_attr_event.attr,
&format_attr_umask.attr,
&format_attr_edge.attr,
&format_attr_inv.attr,
&format_attr_thresh10.attr,
NULL,
};
static const struct attribute_group dmr_imc_uncore_format_group = {
.name = "format",
.attrs = dmr_imc_uncore_formats_attr,
};
static struct intel_uncore_type dmr_uncore_imc = {
.name = "imc",
.fixed_ctr_bits = 48,
.fixed_ctr = DMR_IMC_PMON_FIXED_CTR,
.fixed_ctl = DMR_IMC_PMON_FIXED_CTL,
.ops = &spr_uncore_mmio_ops,
.format_group = &dmr_imc_uncore_format_group,
.attr_update = uncore_alias_groups,
};
static struct attribute *dmr_sca_uncore_formats_attr[] = {
&format_attr_event.attr,
&format_attr_umask_ext5.attr,
&format_attr_edge.attr,
&format_attr_inv.attr,
&format_attr_thresh8.attr,
NULL,
};
static const struct attribute_group dmr_sca_uncore_format_group = {
.name = "format",
.attrs = dmr_sca_uncore_formats_attr,
};
static struct intel_uncore_type dmr_uncore_sca = {
.name = "sca",
.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
.format_group = &dmr_sca_uncore_format_group,
.attr_update = uncore_alias_groups,
};
static struct attribute *dmr_cxlcm_uncore_formats_attr[] = {
&format_attr_event.attr,
&format_attr_umask.attr,
&format_attr_edge.attr,
&format_attr_inv2.attr,
&format_attr_thresh9_2.attr,
&format_attr_port_en.attr,
NULL,
};
static const struct attribute_group dmr_cxlcm_uncore_format_group = {
.name = "format",
.attrs = dmr_cxlcm_uncore_formats_attr,
};
static struct event_constraint dmr_uncore_cxlcm_constraints[] = {
UNCORE_EVENT_CONSTRAINT_RANGE(0x1, 0x24, 0x0f),
UNCORE_EVENT_CONSTRAINT_RANGE(0x41, 0x41, 0xf0),
UNCORE_EVENT_CONSTRAINT_RANGE(0x50, 0x5e, 0xf0),
UNCORE_EVENT_CONSTRAINT_RANGE(0x60, 0x61, 0xf0),
EVENT_CONSTRAINT_END
};
static struct intel_uncore_type dmr_uncore_cxlcm = {
.name = "cxlcm",
.event_mask = GENERIC_PMON_RAW_EVENT_MASK,
.event_mask_ext = DMR_CXLCM_EVENT_MASK_EXT,
.constraints = dmr_uncore_cxlcm_constraints,
.format_group = &dmr_cxlcm_uncore_format_group,
.attr_update = uncore_alias_groups,
};
static struct intel_uncore_type dmr_uncore_hamvf = {
.name = "hamvf",
.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
.format_group = &dmr_sca_uncore_format_group,
.attr_update = uncore_alias_groups,
};
static struct event_constraint dmr_uncore_cbo_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x11, 0x1),
UNCORE_EVENT_CONSTRAINT_RANGE(0x19, 0x1a, 0x1),
UNCORE_EVENT_CONSTRAINT(0x1f, 0x1),
UNCORE_EVENT_CONSTRAINT(0x21, 0x1),
UNCORE_EVENT_CONSTRAINT(0x25, 0x1),
UNCORE_EVENT_CONSTRAINT(0x36, 0x1),
EVENT_CONSTRAINT_END
};
static struct intel_uncore_type dmr_uncore_cbo = {
.name = "cbo",
.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
.constraints = dmr_uncore_cbo_constraints,
.format_group = &dmr_sca_uncore_format_group,
.attr_update = uncore_alias_groups,
};
static struct intel_uncore_type dmr_uncore_santa = {
.name = "santa",
.attr_update = uncore_alias_groups,
};
static struct intel_uncore_type dmr_uncore_cncu = {
.name = "cncu",
.attr_update = uncore_alias_groups,
};
static struct intel_uncore_type dmr_uncore_sncu = {
.name = "sncu",
.attr_update = uncore_alias_groups,
};
static struct intel_uncore_type dmr_uncore_ula = {
.name = "ula",
.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
.format_group = &dmr_sca_uncore_format_group,
.attr_update = uncore_alias_groups,
};
static struct intel_uncore_type dmr_uncore_dda = {
.name = "dda",
.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
.format_group = &dmr_sca_uncore_format_group,
.attr_update = uncore_alias_groups,
};
static struct event_constraint dmr_uncore_sbo_constraints[] = {
UNCORE_EVENT_CONSTRAINT(0x1f, 0x01),
UNCORE_EVENT_CONSTRAINT(0x25, 0x01),
EVENT_CONSTRAINT_END
};
static struct intel_uncore_type dmr_uncore_sbo = {
.name = "sbo",
.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
.constraints = dmr_uncore_sbo_constraints,
.format_group = &dmr_sca_uncore_format_group,
.attr_update = uncore_alias_groups,
};
static struct intel_uncore_type dmr_uncore_ubr = {
.name = "ubr",
.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
.format_group = &dmr_sca_uncore_format_group,
.attr_update = uncore_alias_groups,
};
static struct attribute *dmr_pcie4_uncore_formats_attr[] = {
&format_attr_event.attr,
&format_attr_umask.attr,
&format_attr_edge.attr,
&format_attr_inv.attr,
&format_attr_thresh8.attr,
&format_attr_thresh_ext.attr,
&format_attr_rs3_sel.attr,
&format_attr_rx_sel.attr,
&format_attr_tx_sel.attr,
&format_attr_iep_sel.attr,
&format_attr_vc_sel.attr,
&format_attr_port_sel.attr,
NULL,
};
static const struct attribute_group dmr_pcie4_uncore_format_group = {
.name = "format",
.attrs = dmr_pcie4_uncore_formats_attr,
};
static struct intel_uncore_type dmr_uncore_pcie4 = {
.name = "pcie4",
.event_mask_ext = DMR_PCIE4_EVENT_MASK_EXT,
.format_group = &dmr_pcie4_uncore_format_group,
.attr_update = uncore_alias_groups,
};
static struct intel_uncore_type dmr_uncore_crs = {
.name = "crs",
.attr_update = uncore_alias_groups,
};
static struct intel_uncore_type dmr_uncore_cpc = {
.name = "cpc",
.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
.format_group = &dmr_sca_uncore_format_group,
.attr_update = uncore_alias_groups,
};
static struct intel_uncore_type dmr_uncore_itc = {
.name = "itc",
.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
.format_group = &dmr_sca_uncore_format_group,
.attr_update = uncore_alias_groups,
};
static struct intel_uncore_type dmr_uncore_otc = {
.name = "otc",
.event_mask_ext = DMR_HAMVF_EVENT_MASK_EXT,
.format_group = &dmr_sca_uncore_format_group,
.attr_update = uncore_alias_groups,
};
static struct intel_uncore_type dmr_uncore_cms = {
.name = "cms",
.attr_update = uncore_alias_groups,
};
static struct intel_uncore_type dmr_uncore_pcie6 = {
.name = "pcie6",
.event_mask_ext = DMR_PCIE4_EVENT_MASK_EXT,
.format_group = &dmr_pcie4_uncore_format_group,
.attr_update = uncore_alias_groups,
};
static struct intel_uncore_type *dmr_uncores[UNCORE_DMR_NUM_UNCORE_TYPES] = {
NULL, NULL, NULL, NULL,
&spr_uncore_pcu,
&gnr_uncore_ubox,
&dmr_uncore_imc,
NULL,
NULL, NULL, NULL, NULL,
NULL, NULL, NULL, NULL,
NULL, NULL, NULL, NULL,
NULL, NULL, NULL,
&dmr_uncore_sca,
&dmr_uncore_cxlcm,
NULL, NULL, NULL,
NULL, NULL,
&dmr_uncore_hamvf,
&dmr_uncore_cbo,
&dmr_uncore_santa,
&dmr_uncore_cncu,
&dmr_uncore_sncu,
&dmr_uncore_ula,
&dmr_uncore_dda,
NULL,
&dmr_uncore_sbo,
NULL,
NULL, NULL, NULL,
&dmr_uncore_ubr,
NULL,
&dmr_uncore_pcie4,
&dmr_uncore_crs,
&dmr_uncore_cpc,
&dmr_uncore_itc,
&dmr_uncore_otc,
&dmr_uncore_cms,
&dmr_uncore_pcie6,
};
int dmr_uncore_imh_units_ignore[] = {
0x13, /* MSE */
UNCORE_IGNORE_END
};
int dmr_uncore_cbb_units_ignore[] = {
0x25, /* SB2UCIE */
UNCORE_IGNORE_END
};
static unsigned int dmr_iio_freerunning_box_offsets[] = {
0x0, 0x8000, 0x18000, 0x20000
};
static void dmr_uncore_freerunning_init_box(struct intel_uncore_box *box)
{
struct intel_uncore_type *type = box->pmu->type;
u64 mmio_base;
if (box->pmu->pmu_idx >= type->num_boxes)
return;
mmio_base = DMR_IMH1_HIOP_MMIO_BASE;
mmio_base += dmr_iio_freerunning_box_offsets[box->pmu->pmu_idx];
box->io_addr = ioremap(mmio_base, type->mmio_map_size);
if (!box->io_addr)
pr_warn("perf uncore: Failed to ioremap for %s.\n", type->name);
}
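Since the DMR IIO free-running counters are MMIO based, each pmu_idx maps a fixed physical window derived from the IMH1 HIOP base plus a per-box offset. The small user-space sketch below (illustrative only, not kernel code) just prints the windows implied by the constants above:

/*
 * Illustration only (user space): the physical windows that
 * dmr_uncore_freerunning_init_box() ends up ioremap()ing, derived from
 * DMR_IMH1_HIOP_MMIO_BASE, dmr_iio_freerunning_box_offsets[] and
 * DMR_HIOP_MMIO_SIZE above.
 */
#include <stdio.h>

int main(void)
{
        const unsigned long long base = 0x1ffff6ae7000ULL; /* DMR_IMH1_HIOP_MMIO_BASE */
        const unsigned long long size = 0x8000;             /* DMR_HIOP_MMIO_SIZE      */
        const unsigned long long offs[] = { 0x0, 0x8000, 0x18000, 0x20000 };

        for (int idx = 0; idx < 4; idx++)
                printf("pmu_idx %d -> [%#llx, %#llx)\n",
                       idx, base + offs[idx], base + offs[idx] + size);
        return 0;
}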
static struct intel_uncore_ops dmr_uncore_freerunning_ops = {
.init_box = dmr_uncore_freerunning_init_box,
.exit_box = uncore_mmio_exit_box,
.read_counter = uncore_mmio_read_counter,
.hw_config = uncore_freerunning_hw_config,
};
enum perf_uncore_dmr_iio_freerunning_type_id {
DMR_ITC_INB_DATA_BW,
DMR_ITC_BW_IN,
DMR_OTC_BW_OUT,
DMR_OTC_CLOCK_TICKS,
DMR_IIO_FREERUNNING_TYPE_MAX,
};
static struct freerunning_counters dmr_iio_freerunning[] = {
[DMR_ITC_INB_DATA_BW] = { 0x4d40, 0x8, 0, 8, 48},
[DMR_ITC_BW_IN] = { 0x6b00, 0x8, 0, 8, 48},
[DMR_OTC_BW_OUT] = { 0x6b60, 0x8, 0, 8, 48},
[DMR_OTC_CLOCK_TICKS] = { 0x6bb0, 0x8, 0, 1, 48},
};
static struct uncore_event_desc dmr_uncore_iio_freerunning_events[] = {
/* ITC Free Running Data BW counter for inbound traffic */
INTEL_UNCORE_FR_EVENT_DESC(inb_data_port0, 0x10, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(inb_data_port1, 0x11, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(inb_data_port2, 0x12, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(inb_data_port3, 0x13, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(inb_data_port4, 0x14, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(inb_data_port5, 0x15, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(inb_data_port6, 0x16, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(inb_data_port7, 0x17, 3.814697266e-6),
/* ITC Free Running BW IN counters */
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port0, 0x20, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port1, 0x21, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port2, 0x22, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port3, 0x23, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port4, 0x24, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port5, 0x25, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port6, 0x26, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_in_port7, 0x27, 3.814697266e-6),
/* ITC Free Running BW OUT counters */
INTEL_UNCORE_FR_EVENT_DESC(bw_out_port0, 0x30, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_out_port1, 0x31, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_out_port2, 0x32, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_out_port3, 0x33, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_out_port4, 0x34, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_out_port5, 0x35, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_out_port6, 0x36, 3.814697266e-6),
INTEL_UNCORE_FR_EVENT_DESC(bw_out_port7, 0x37, 3.814697266e-6),
/* Free Running Clock Counter */
INTEL_UNCORE_EVENT_DESC(clockticks, "event=0xff,umask=0x40"),
{ /* end: all zeroes */ },
};
static struct intel_uncore_type dmr_uncore_iio_free_running = {
.name = "iio_free_running",
.num_counters = 25,
.mmio_map_size = DMR_HIOP_MMIO_SIZE,
.num_freerunning_types = DMR_IIO_FREERUNNING_TYPE_MAX,
.freerunning = dmr_iio_freerunning,
.ops = &dmr_uncore_freerunning_ops,
.event_descs = dmr_uncore_iio_freerunning_events,
.format_group = &skx_uncore_iio_freerunning_format_group,
};
#define UNCORE_DMR_MMIO_EXTRA_UNCORES 1
static struct intel_uncore_type *dmr_mmio_uncores[UNCORE_DMR_MMIO_EXTRA_UNCORES] = {
&dmr_uncore_iio_free_running,
};
int dmr_uncore_pci_init(void)
{
uncore_pci_uncores = uncore_get_uncores(UNCORE_ACCESS_PCI, 0, NULL,
UNCORE_DMR_NUM_UNCORE_TYPES,
dmr_uncores);
return 0;
}
void dmr_uncore_mmio_init(void)
{
uncore_mmio_uncores = uncore_get_uncores(UNCORE_ACCESS_MMIO,
UNCORE_DMR_MMIO_EXTRA_UNCORES,
dmr_mmio_uncores,
UNCORE_DMR_NUM_UNCORE_TYPES,
dmr_uncores);
dmr_uncore_iio_free_running.num_boxes =
uncore_type_max_boxes(uncore_mmio_uncores, UNCORE_DMR_ITC);
}
/* end of DMR uncore support */

@@ -78,6 +78,7 @@ static bool test_intel(int idx, void *data)
case INTEL_ATOM_SILVERMONT:
case INTEL_ATOM_SILVERMONT_D:
case INTEL_ATOM_AIRMONT:
case INTEL_ATOM_AIRMONT_NP:
case INTEL_ATOM_GOLDMONT:
case INTEL_ATOM_GOLDMONT_D:

@@ -45,6 +45,10 @@ enum extra_reg_type {
EXTRA_REG_FE = 4, /* fe_* */
EXTRA_REG_SNOOP_0 = 5, /* snoop response 0 */
EXTRA_REG_SNOOP_1 = 6, /* snoop response 1 */
EXTRA_REG_OMR_0 = 7, /* OMR 0 */
EXTRA_REG_OMR_1 = 8, /* OMR 1 */
EXTRA_REG_OMR_2 = 9, /* OMR 2 */
EXTRA_REG_OMR_3 = 10, /* OMR 3 */
EXTRA_REG_MAX /* number of entries needed */
};
@@ -183,6 +187,13 @@ struct amd_nb {
(1ULL << PERF_REG_X86_R14) | \
(1ULL << PERF_REG_X86_R15))
/* user space rdpmc control values */
enum {
X86_USER_RDPMC_NEVER_ENABLE = 0,
X86_USER_RDPMC_CONDITIONAL_ENABLE = 1,
X86_USER_RDPMC_ALWAYS_ENABLE = 2,
};
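These named values correspond to what the x86 "rdpmc" sysfs attribute accepts from user space (0: never, 1: only for tasks with an active mmap'ed event, 2: always). A minimal user-space sketch, assuming the non-hybrid "cpu" PMU path (hybrid parts expose cpu_core/cpu_atom instead):

/*
 * Minimal user-space sketch, not part of the patch: the values above are
 * what the existing "rdpmc" sysfs attribute accepts.  The path below
 * assumes the non-hybrid "cpu" PMU; hybrid parts expose cpu_core/cpu_atom.
 */
#include <stdio.h>

int main(void)
{
        FILE *f = fopen("/sys/bus/event_source/devices/cpu/rdpmc", "w");

        if (!f) {
                perror("rdpmc");
                return 1;
        }
        /* 2 == X86_USER_RDPMC_ALWAYS_ENABLE: allow RDPMC from any task. */
        fprintf(f, "2\n");
        fclose(f);
        return 0;
}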
/*
* Per register state.
*/
@@ -1099,6 +1110,7 @@ do { \
#define PMU_FL_RETIRE_LATENCY 0x200 /* Support Retire Latency in PEBS */
#define PMU_FL_BR_CNTR 0x400 /* Support branch counter logging */
#define PMU_FL_DYN_CONSTRAINT 0x800 /* Needs dynamic constraint */
#define PMU_FL_HAS_OMR 0x1000 /* has 4 equivalent OMR regs */
#define EVENT_VAR(_id) event_attr_##_id
#define EVENT_PTR(_id) &event_attr_##_id.attr.attr
@@ -1321,6 +1333,12 @@ static inline u64 x86_pmu_get_event_config(struct perf_event *event)
return event->attr.config & hybrid(event->pmu, config_mask);
}
static inline bool x86_pmu_has_rdpmc_user_disable(struct pmu *pmu)
{
return !!(hybrid(pmu, config_mask) &
ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE);
}
extern struct event_constraint emptyconstraint;
extern struct event_constraint unconstrained;
@@ -1668,6 +1686,10 @@ u64 lnl_latency_data(struct perf_event *event, u64 status);
u64 arl_h_latency_data(struct perf_event *event, u64 status);
u64 pnc_latency_data(struct perf_event *event, u64 status);
u64 nvl_latency_data(struct perf_event *event, u64 status);
extern struct event_constraint intel_core2_pebs_event_constraints[];
extern struct event_constraint intel_atom_pebs_event_constraints[];
@@ -1680,6 +1702,8 @@ extern struct event_constraint intel_glp_pebs_event_constraints[];
extern struct event_constraint intel_grt_pebs_event_constraints[];
extern struct event_constraint intel_arw_pebs_event_constraints[];
extern struct event_constraint intel_nehalem_pebs_event_constraints[];
extern struct event_constraint intel_westmere_pebs_event_constraints[];
@@ -1700,6 +1724,8 @@ extern struct event_constraint intel_glc_pebs_event_constraints[];
extern struct event_constraint intel_lnc_pebs_event_constraints[];
extern struct event_constraint intel_pnc_pebs_event_constraints[];
struct event_constraint *intel_pebs_constraints(struct perf_event *event);
void intel_pmu_pebs_add(struct perf_event *event);

@@ -110,7 +110,7 @@ union ibs_op_data3 {
__u64 ld_op:1, /* 0: load op */
st_op:1, /* 1: store op */
dc_l1tlb_miss:1, /* 2: data cache L1TLB miss */
dc_l2tlb_miss:1, /* 3: data cache L2TLB hit in 2M page */
dc_l2tlb_miss:1, /* 3: data cache L2TLB miss in 2M page */
dc_l1tlb_hit_2m:1, /* 4: data cache L1TLB hit in 2M page */
dc_l1tlb_hit_1g:1, /* 5: data cache L1TLB hit in 1G page */
dc_l2tlb_hit_2m:1, /* 6: data cache L2TLB hit in 2M page */

@@ -18,6 +18,9 @@ typedef struct {
unsigned int kvm_posted_intr_ipis;
unsigned int kvm_posted_intr_wakeup_ipis;
unsigned int kvm_posted_intr_nested_ipis;
#endif
#ifdef CONFIG_GUEST_PERF_EVENTS
unsigned int perf_guest_mediated_pmis;
#endif
unsigned int x86_platform_ipis; /* arch dependent */
unsigned int apic_perf_irqs;

@@ -746,6 +746,12 @@ DECLARE_IDTENTRY_SYSVEC(POSTED_INTR_NESTED_VECTOR, sysvec_kvm_posted_intr_nested
# define fred_sysvec_kvm_posted_intr_nested_ipi NULL
#endif
# ifdef CONFIG_GUEST_PERF_EVENTS
DECLARE_IDTENTRY_SYSVEC(PERF_GUEST_MEDIATED_PMI_VECTOR, sysvec_perf_guest_mediated_pmi_handler);
#else
# define fred_sysvec_perf_guest_mediated_pmi_handler NULL
#endif
# ifdef CONFIG_X86_POSTED_MSI
DECLARE_IDTENTRY_SYSVEC(POSTED_MSI_NOTIFICATION_VECTOR, sysvec_posted_msi_notification);
#else

@@ -77,7 +77,9 @@
*/
#define IRQ_WORK_VECTOR 0xf6
/* 0xf5 - unused, was UV_BAU_MESSAGE */
/* IRQ vector for PMIs when running a guest with a mediated PMU. */
#define PERF_GUEST_MEDIATED_PMI_VECTOR 0xf5
#define DEFERRED_ERROR_VECTOR 0xf4
/* Vector on which hypervisor callbacks will be delivered */

@@ -263,6 +263,11 @@
#define MSR_SNOOP_RSP_0 0x00001328
#define MSR_SNOOP_RSP_1 0x00001329
#define MSR_OMR_0 0x000003e0
#define MSR_OMR_1 0x000003e1
#define MSR_OMR_2 0x000003e2
#define MSR_OMR_3 0x000003e3
#define MSR_LBR_SELECT 0x000001c8
#define MSR_LBR_TOS 0x000001c9

@@ -33,6 +33,7 @@
#define ARCH_PERFMON_EVENTSEL_CMASK 0xFF000000ULL
#define ARCH_PERFMON_EVENTSEL_BR_CNTR (1ULL << 35)
#define ARCH_PERFMON_EVENTSEL_EQ (1ULL << 36)
#define ARCH_PERFMON_EVENTSEL_RDPMC_USER_DISABLE (1ULL << 37)
#define ARCH_PERFMON_EVENTSEL_UMASK2 (0xFFULL << 40)
#define INTEL_FIXED_BITS_STRIDE 4
@@ -40,6 +41,7 @@
#define INTEL_FIXED_0_USER (1ULL << 1)
#define INTEL_FIXED_0_ANYTHREAD (1ULL << 2)
#define INTEL_FIXED_0_ENABLE_PMI (1ULL << 3)
#define INTEL_FIXED_0_RDPMC_USER_DISABLE (1ULL << 33)
#define INTEL_FIXED_3_METRICS_CLEAR (1ULL << 2)
#define HSW_IN_TX (1ULL << 32)
@@ -50,7 +52,7 @@
#define INTEL_FIXED_BITS_MASK \
(INTEL_FIXED_0_KERNEL | INTEL_FIXED_0_USER | \
INTEL_FIXED_0_ANYTHREAD | INTEL_FIXED_0_ENABLE_PMI | \
ICL_FIXED_0_ADAPTIVE)
ICL_FIXED_0_ADAPTIVE | INTEL_FIXED_0_RDPMC_USER_DISABLE)
#define intel_fixed_bits_by_idx(_idx, _bits) \
((_bits) << ((_idx) * INTEL_FIXED_BITS_STRIDE))
@@ -226,7 +228,9 @@ union cpuid35_ebx {
unsigned int umask2:1;
/* EQ-bit Supported */
unsigned int eq:1;
unsigned int reserved:30;
/* rdpmc user disable Supported */
unsigned int rdpmc_user_disable:1;
unsigned int reserved:29;
} split;
unsigned int full;
};
@@ -301,6 +305,7 @@ struct x86_pmu_capability {
unsigned int events_mask;
int events_mask_len;
unsigned int pebs_ept :1;
unsigned int mediated :1;
};
/*
@@ -759,6 +764,11 @@ static inline void perf_events_lapic_init(void) { }
static inline void perf_check_microcode(void) { }
#endif
#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU
extern void perf_load_guest_lvtpc(u32 guest_lvtpc);
extern void perf_put_guest_lvtpc(void);
#endif
#if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr, void *data);
extern void x86_perf_get_lbr(struct x86_pmu_lbr *lbr);

@@ -2,11 +2,23 @@
#ifndef _ASM_X86_UNWIND_USER_H
#define _ASM_X86_UNWIND_USER_H
#ifdef CONFIG_HAVE_UNWIND_USER_FP
#ifdef CONFIG_UNWIND_USER
#include <asm/ptrace.h>
#include <asm/uprobes.h>
static inline int unwind_user_word_size(struct pt_regs *regs)
{
/* We can't unwind VM86 stacks */
if (regs->flags & X86_VM_MASK)
return 0;
return user_64bit_mode(regs) ? 8 : 4;
}
#endif /* CONFIG_UNWIND_USER */
#ifdef CONFIG_HAVE_UNWIND_USER_FP
#define ARCH_INIT_USER_FP_FRAME(ws) \
.cfa_off = 2*(ws), \
.ra_off = -1*(ws), \
@@ -19,22 +31,11 @@
.fp_off = 0, \
.use_fp = false,
static inline int unwind_user_word_size(struct pt_regs *regs)
{
/* We can't unwind VM86 stacks */
if (regs->flags & X86_VM_MASK)
return 0;
#ifdef CONFIG_X86_64
if (!user_64bit_mode(regs))
return sizeof(int);
#endif
return sizeof(long);
}
static inline bool unwind_user_at_function_start(struct pt_regs *regs)
{
return is_uprobe_at_func_entry(regs);
}
#define unwind_user_at_function_start unwind_user_at_function_start
#endif /* CONFIG_HAVE_UNWIND_USER_FP */

@@ -158,6 +158,9 @@ static const __initconst struct idt_data apic_idts[] = {
INTG(POSTED_INTR_WAKEUP_VECTOR, asm_sysvec_kvm_posted_intr_wakeup_ipi),
INTG(POSTED_INTR_NESTED_VECTOR, asm_sysvec_kvm_posted_intr_nested_ipi),
# endif
#ifdef CONFIG_GUEST_PERF_EVENTS
INTG(PERF_GUEST_MEDIATED_PMI_VECTOR, asm_sysvec_perf_guest_mediated_pmi_handler),
#endif
# ifdef CONFIG_IRQ_WORK
INTG(IRQ_WORK_VECTOR, asm_sysvec_irq_work),
# endif

@@ -192,6 +192,13 @@ int arch_show_interrupts(struct seq_file *p, int prec)
irq_stats(j)->kvm_posted_intr_wakeup_ipis);
seq_puts(p, " Posted-interrupt wakeup event\n");
#endif
#ifdef CONFIG_GUEST_PERF_EVENTS
seq_printf(p, "%*s: ", prec, "VPMI");
for_each_online_cpu(j)
seq_printf(p, "%10u ",
irq_stats(j)->perf_guest_mediated_pmis);
seq_puts(p, " Perf Guest Mediated PMI\n");
#endif
#ifdef CONFIG_X86_POSTED_MSI
seq_printf(p, "%*s: ", prec, "PMN");
for_each_online_cpu(j)
@@ -349,6 +356,18 @@ DEFINE_IDTENTRY_SYSVEC(sysvec_x86_platform_ipi)
}
#endif
#ifdef CONFIG_GUEST_PERF_EVENTS
/*
* Handler for PERF_GUEST_MEDIATED_PMI_VECTOR.
*/
DEFINE_IDTENTRY_SYSVEC(sysvec_perf_guest_mediated_pmi_handler)
{
apic_eoi();
inc_irq_stat(perf_guest_mediated_pmis);
perf_guest_handle_mediated_pmi();
}
#endif
#if IS_ENABLED(CONFIG_KVM)
static void dummy_handler(void) {}
static void (*kvm_posted_intr_wakeup_handler)(void) = dummy_handler;

@@ -1823,3 +1823,27 @@ bool is_uprobe_at_func_entry(struct pt_regs *regs)
return false;
}
#ifdef CONFIG_IA32_EMULATION
unsigned long arch_uprobe_get_xol_area(void)
{
struct thread_info *ti = current_thread_info();
unsigned long vaddr;
/*
* HACK: we are not in a syscall, but x86 get_unmapped_area() paths
* ignore TIF_ADDR32 and rely on in_32bit_syscall() to calculate
* vm_unmapped_area_info.high_limit.
*
* The #ifdef above doesn't cover the CONFIG_X86_X32_ABI=y case,
* but in this case in_32bit_syscall() -> in_x32_syscall() always
* (falsely) returns true because ->orig_ax == -1.
*/
if (test_thread_flag(TIF_ADDR32))
ti->status |= TS_COMPAT;
vaddr = get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE, PAGE_SIZE, 0, 0);
ti->status &= ~TS_COMPAT;
return vaddr;
}
#endif

@@ -37,6 +37,7 @@ config KVM_X86
select SCHED_INFO
select PERF_EVENTS
select GUEST_PERF_EVENTS
select PERF_GUEST_MEDIATED_PMU
select HAVE_KVM_MSI
select HAVE_KVM_CPU_RELAX_INTERCEPT
select HAVE_KVM_NO_POLL

@@ -32,6 +32,7 @@ mandatory-y += irq_work.h
mandatory-y += kdebug.h
mandatory-y += kmap_size.h
mandatory-y += kprobes.h
mandatory-y += kvm_types.h
mandatory-y += linkage.h
mandatory-y += local.h
mandatory-y += local64.h

@@ -305,6 +305,7 @@ struct perf_event_pmu_context;
#define PERF_PMU_CAP_EXTENDED_HW_TYPE 0x0100
#define PERF_PMU_CAP_AUX_PAUSE 0x0200
#define PERF_PMU_CAP_AUX_PREFER_LARGE 0x0400
#define PERF_PMU_CAP_MEDIATED_VPMU 0x0800
/**
* pmu::scope
@@ -998,6 +999,11 @@ struct perf_event_groups {
u64 index;
};
struct perf_time_ctx {
u64 time;
u64 stamp;
u64 offset;
};
/**
* struct perf_event_context - event context structure
@@ -1036,9 +1042,12 @@ struct perf_event_context {
/*
* Context clock, runs when context enabled.
*/
u64 time;
u64 timestamp;
u64 timeoffset;
struct perf_time_ctx time;
/*
* Context clock, runs when in the guest mode.
*/
struct perf_time_ctx timeguest;
/*
* These fields let us detect when two contexts have both
@@ -1171,9 +1180,8 @@ struct bpf_perf_event_data_kern {
* This is a per-cpu dynamically allocated data structure.
*/
struct perf_cgroup_info {
u64 time;
u64 timestamp;
u64 timeoffset;
struct perf_time_ctx time;
struct perf_time_ctx timeguest;
int active;
};
@@ -1669,6 +1677,8 @@ struct perf_guest_info_callbacks {
unsigned int (*state)(void);
unsigned long (*get_ip)(void);
unsigned int (*handle_intel_pt_intr)(void);
void (*handle_mediated_pmi)(void);
};
#ifdef CONFIG_GUEST_PERF_EVENTS
@@ -1678,6 +1688,7 @@ extern struct perf_guest_info_callbacks __rcu *perf_guest_cbs;
DECLARE_STATIC_CALL(__perf_guest_state, *perf_guest_cbs->state);
DECLARE_STATIC_CALL(__perf_guest_get_ip, *perf_guest_cbs->get_ip);
DECLARE_STATIC_CALL(__perf_guest_handle_intel_pt_intr, *perf_guest_cbs->handle_intel_pt_intr);
DECLARE_STATIC_CALL(__perf_guest_handle_mediated_pmi, *perf_guest_cbs->handle_mediated_pmi);
static inline unsigned int perf_guest_state(void)
{
@@ -1694,6 +1705,11 @@ static inline unsigned int perf_guest_handle_intel_pt_intr(void)
return static_call(__perf_guest_handle_intel_pt_intr)();
}
static inline void perf_guest_handle_mediated_pmi(void)
{
static_call(__perf_guest_handle_mediated_pmi)();
}
extern void perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *cbs);
extern void perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *cbs);
@@ -1914,6 +1930,13 @@ extern int perf_event_account_interrupt(struct perf_event *event);
extern int perf_event_period(struct perf_event *event, u64 value);
extern u64 perf_event_pause(struct perf_event *event, bool reset);
#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU
int perf_create_mediated_pmu(void);
void perf_release_mediated_pmu(void);
void perf_load_guest_context(void);
void perf_put_guest_context(void);
#endif
#else /* !CONFIG_PERF_EVENTS: */
static inline void *

@@ -5,8 +5,22 @@
#include <linux/unwind_user_types.h>
#include <asm/unwind_user.h>
#ifndef ARCH_INIT_USER_FP_FRAME
#define ARCH_INIT_USER_FP_FRAME
#ifndef CONFIG_HAVE_UNWIND_USER_FP
#define ARCH_INIT_USER_FP_FRAME(ws)
#endif
#ifndef ARCH_INIT_USER_FP_ENTRY_FRAME
#define ARCH_INIT_USER_FP_ENTRY_FRAME(ws)
#endif
#ifndef unwind_user_at_function_start
static inline bool unwind_user_at_function_start(struct pt_regs *regs)
{
return false;
}
#define unwind_user_at_function_start unwind_user_at_function_start
#endif
int unwind_user(struct unwind_stacktrace *trace, unsigned int max_entries);

@@ -242,6 +242,7 @@ extern void arch_uprobe_clear_state(struct mm_struct *mm);
extern void arch_uprobe_init_state(struct mm_struct *mm);
extern void handle_syscall_uprobe(struct pt_regs *regs, unsigned long bp_vaddr);
extern void arch_uprobe_optimize(struct arch_uprobe *auprobe, unsigned long vaddr);
extern unsigned long arch_uprobe_get_xol_area(void);
#else /* !CONFIG_UPROBES */
struct uprobes_state {
};

@@ -1330,14 +1330,16 @@ union perf_mem_data_src {
mem_snoopx : 2, /* Snoop mode, ext */
mem_blk : 3, /* Access blocked */
mem_hops : 3, /* Hop level */
mem_rsvd : 18;
mem_region : 5, /* cache/memory regions */
mem_rsvd : 13;
};
};
#elif defined(__BIG_ENDIAN_BITFIELD)
union perf_mem_data_src {
__u64 val;
struct {
__u64 mem_rsvd : 18,
__u64 mem_rsvd : 13,
mem_region : 5, /* cache/memory regions */
mem_hops : 3, /* Hop level */
mem_blk : 3, /* Access blocked */
mem_snoopx : 2, /* Snoop mode, ext */
@@ -1394,7 +1396,7 @@ union perf_mem_data_src {
#define PERF_MEM_LVLNUM_L4 0x0004 /* L4 */
#define PERF_MEM_LVLNUM_L2_MHB 0x0005 /* L2 Miss Handling Buffer */
#define PERF_MEM_LVLNUM_MSC 0x0006 /* Memory-side Cache */
/* 0x007 available */
#define PERF_MEM_LVLNUM_L0 0x0007 /* L0 */
#define PERF_MEM_LVLNUM_UNC 0x0008 /* Uncached */
#define PERF_MEM_LVLNUM_CXL 0x0009 /* CXL */
#define PERF_MEM_LVLNUM_IO 0x000a /* I/O */
@@ -1447,6 +1449,25 @@ union perf_mem_data_src {
/* 5-7 available */
#define PERF_MEM_HOPS_SHIFT 43
/* Cache/Memory region */
#define PERF_MEM_REGION_NA 0x0 /* Invalid */
#define PERF_MEM_REGION_RSVD 0x01 /* Reserved */
#define PERF_MEM_REGION_L_SHARE 0x02 /* Local CA shared cache */
#define PERF_MEM_REGION_L_NON_SHARE 0x03 /* Local CA non-shared cache */
#define PERF_MEM_REGION_O_IO 0x04 /* Other CA IO agent */
#define PERF_MEM_REGION_O_SHARE 0x05 /* Other CA shared cache */
#define PERF_MEM_REGION_O_NON_SHARE 0x06 /* Other CA non-shared cache */
#define PERF_MEM_REGION_MMIO 0x07 /* MMIO */
#define PERF_MEM_REGION_MEM0 0x08 /* Memory region 0 */
#define PERF_MEM_REGION_MEM1 0x09 /* Memory region 1 */
#define PERF_MEM_REGION_MEM2 0x0a /* Memory region 2 */
#define PERF_MEM_REGION_MEM3 0x0b /* Memory region 3 */
#define PERF_MEM_REGION_MEM4 0x0c /* Memory region 4 */
#define PERF_MEM_REGION_MEM5 0x0d /* Memory region 5 */
#define PERF_MEM_REGION_MEM6 0x0e /* Memory region 6 */
#define PERF_MEM_REGION_MEM7 0x0f /* Memory region 7 */
#define PERF_MEM_REGION_SHIFT 46
#define PERF_MEM_S(a, s) \
(((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
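The new mem_region field occupies bits 46-50 of perf_mem_data_src and is composed with the same PERF_MEM_S() pattern as the other fields, e.g. PERF_MEM_S(REGION, MMIO). A stand-alone sketch of encoding and decoding it; the two local constants simply mirror PERF_MEM_REGION_SHIFT and PERF_MEM_REGION_MMIO above so the example builds against an older header:

/*
 * Stand-alone sketch; the two local constants mirror PERF_MEM_REGION_SHIFT
 * and PERF_MEM_REGION_MMIO above so this builds against an older header.
 */
#include <assert.h>
#include <stdint.h>

#define MEM_REGION_SHIFT 46
#define MEM_REGION_MMIO  0x07ULL

int main(void)
{
        /* What PERF_MEM_S(REGION, MMIO) would produce. */
        uint64_t data_src = MEM_REGION_MMIO << MEM_REGION_SHIFT;

        /* Decode the 5-bit mem_region field starting at bit 46. */
        assert(((data_src >> MEM_REGION_SHIFT) & 0x1f) == MEM_REGION_MMIO);
        return 0;
}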

@@ -2072,6 +2072,10 @@ config GUEST_PERF_EVENTS
bool
depends on HAVE_PERF_EVENTS
config PERF_GUEST_MEDIATED_PMU
bool
depends on GUEST_PERF_EVENTS
config PERF_USE_VMALLOC
bool
help

@@ -57,6 +57,7 @@
#include <linux/task_work.h>
#include <linux/percpu-rwsem.h>
#include <linux/unwind_deferred.h>
#include <linux/kvm_types.h>
#include "internal.h"
@@ -166,6 +167,18 @@ enum event_type_t {
EVENT_CPU = 0x10,
EVENT_CGROUP = 0x20,
/*
* EVENT_GUEST is set when scheduling in/out events between the host
* and a guest with a mediated vPMU. Among other things, EVENT_GUEST
* is used:
*
* - In for_each_epc() to skip PMUs that don't support events in a
* MEDIATED_VPMU guest, i.e. don't need to be context switched.
* - To indicate the start/end point of the events in a guest. Guest
* running time is deducted for host-only (exclude_guest) events.
*/
EVENT_GUEST = 0x40,
EVENT_FLAGS = EVENT_CGROUP | EVENT_GUEST,
/* compound helpers */
EVENT_ALL = EVENT_FLEXIBLE | EVENT_PINNED,
EVENT_TIME_FROZEN = EVENT_TIME | EVENT_FROZEN,
@@ -458,6 +471,20 @@ static cpumask_var_t perf_online_pkg_mask;
static cpumask_var_t perf_online_sys_mask;
static struct kmem_cache *perf_event_cache;
#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU
static DEFINE_PER_CPU(bool, guest_ctx_loaded);
static __always_inline bool is_guest_mediated_pmu_loaded(void)
{
return __this_cpu_read(guest_ctx_loaded);
}
#else
static __always_inline bool is_guest_mediated_pmu_loaded(void)
{
return false;
}
#endif
/*
* perf event paranoia level:
* -1 - not paranoid at all
@@ -779,33 +806,97 @@ do { \
___p; \
})
#define for_each_epc(_epc, _ctx, _pmu, _cgroup) \
static bool perf_skip_pmu_ctx(struct perf_event_pmu_context *pmu_ctx,
enum event_type_t event_type)
{
if ((event_type & EVENT_CGROUP) && !pmu_ctx->nr_cgroups)
return true;
if ((event_type & EVENT_GUEST) &&
!(pmu_ctx->pmu->capabilities & PERF_PMU_CAP_MEDIATED_VPMU))
return true;
return false;
}
#define for_each_epc(_epc, _ctx, _pmu, _event_type) \
list_for_each_entry(_epc, &((_ctx)->pmu_ctx_list), pmu_ctx_entry) \
if (_cgroup && !_epc->nr_cgroups) \
if (perf_skip_pmu_ctx(_epc, _event_type)) \
continue; \
else if (_pmu && _epc->pmu != _pmu) \
continue; \
else
static void perf_ctx_disable(struct perf_event_context *ctx, bool cgroup)
static void perf_ctx_disable(struct perf_event_context *ctx,
enum event_type_t event_type)
{
struct perf_event_pmu_context *pmu_ctx;
for_each_epc(pmu_ctx, ctx, NULL, cgroup)
for_each_epc(pmu_ctx, ctx, NULL, event_type)
perf_pmu_disable(pmu_ctx->pmu);
}
static void perf_ctx_enable(struct perf_event_context *ctx, bool cgroup)
static void perf_ctx_enable(struct perf_event_context *ctx,
enum event_type_t event_type)
{
struct perf_event_pmu_context *pmu_ctx;
for_each_epc(pmu_ctx, ctx, NULL, cgroup)
for_each_epc(pmu_ctx, ctx, NULL, event_type)
perf_pmu_enable(pmu_ctx->pmu);
}
static void ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type);
static void ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type);
static inline void update_perf_time_ctx(struct perf_time_ctx *time, u64 now, bool adv)
{
if (adv)
time->time += now - time->stamp;
time->stamp = now;
/*
* The above: time' = time + (now - timestamp), can be re-arranged
* into: time` = now + (time - timestamp), which gives a single value
* offset to compute future time without locks on.
*
* See perf_event_time_now(), which can be used from NMI context where
* it's (obviously) not possible to acquire ctx->lock in order to read
* both the above values in a consistent manner.
*/
WRITE_ONCE(time->offset, time->time - time->stamp);
}
static_assert(offsetof(struct perf_event_context, timeguest) -
offsetof(struct perf_event_context, time) ==
sizeof(struct perf_time_ctx));
#define T_TOTAL 0
#define T_GUEST 1
static inline u64 __perf_event_time_ctx(struct perf_event *event,
struct perf_time_ctx *times)
{
u64 time = times[T_TOTAL].time;
if (event->attr.exclude_guest)
time -= times[T_GUEST].time;
return time;
}
static inline u64 __perf_event_time_ctx_now(struct perf_event *event,
struct perf_time_ctx *times,
u64 now)
{
if (is_guest_mediated_pmu_loaded() && event->attr.exclude_guest) {
/*
* (now + times[total].offset) - (now + times[guest].offset) ==
* times[total].offset - times[guest].offset
*/
return READ_ONCE(times[T_TOTAL].offset) - READ_ONCE(times[T_GUEST].offset);
}
return now + READ_ONCE(times[T_TOTAL].offset);
}
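
For clarity (not part of the patch): the offset trick used by update_perf_time_ctx() and __perf_event_time_ctx_now() above is just the identity time' = time + (now - stamp) = now + (time - stamp). A minimal, self-contained userspace sketch of it, with made-up timestamps:

	#include <assert.h>
	#include <stdint.h>

	struct time_ctx { uint64_t time, stamp, offset; };

	/* Mirrors update_perf_time_ctx(): advance, restamp, publish the offset. */
	static void update(struct time_ctx *t, uint64_t now)
	{
		t->time  += now - t->stamp;     /* time' = time + (now - stamp) */
		t->stamp  = now;
		t->offset = t->time - t->stamp; /* hence time' == now + offset  */
	}

	int main(void)
	{
		struct time_ctx total = { .stamp = 100 };
		struct time_ctx guest = { .stamp = 400 };
		uint64_t now = 650;

		update(&total, 400);            /* 300 units of context time    */
		update(&guest, 500);            /* 100 of them spent in a guest */
		update(&total, 500);

		assert(now + total.offset == total.time + (now - total.stamp));
		/* exclude_guest "time now": the two 'now' terms cancel */
		assert((now + total.offset) - (now + guest.offset) ==
		       total.offset - guest.offset);
		return 0;
	}
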
#ifdef CONFIG_CGROUP_PERF
static inline bool
@@ -842,12 +933,16 @@ static inline int is_cgroup_event(struct perf_event *event)
return event->cgrp != NULL;
}
static_assert(offsetof(struct perf_cgroup_info, timeguest) -
offsetof(struct perf_cgroup_info, time) ==
sizeof(struct perf_time_ctx));
static inline u64 perf_cgroup_event_time(struct perf_event *event)
{
struct perf_cgroup_info *t;
t = per_cpu_ptr(event->cgrp->info, event->cpu);
return t->time;
return __perf_event_time_ctx(event, &t->time);
}
static inline u64 perf_cgroup_event_time_now(struct perf_event *event, u64 now)
@@ -856,20 +951,21 @@ static inline u64 perf_cgroup_event_time_now(struct perf_event *event, u64 now)
t = per_cpu_ptr(event->cgrp->info, event->cpu);
if (!__load_acquire(&t->active))
return t->time;
now += READ_ONCE(t->timeoffset);
return now;
return __perf_event_time_ctx(event, &t->time);
return __perf_event_time_ctx_now(event, &t->time, now);
}
static inline void __update_cgrp_time(struct perf_cgroup_info *info, u64 now, bool adv)
static inline void __update_cgrp_guest_time(struct perf_cgroup_info *info, u64 now, bool adv)
{
if (adv)
info->time += now - info->timestamp;
info->timestamp = now;
/*
* see update_context_time()
*/
WRITE_ONCE(info->timeoffset, info->time - info->timestamp);
update_perf_time_ctx(&info->timeguest, now, adv);
}
static inline void update_cgrp_time(struct perf_cgroup_info *info, u64 now)
{
update_perf_time_ctx(&info->time, now, true);
if (is_guest_mediated_pmu_loaded())
__update_cgrp_guest_time(info, now, true);
}
static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx, bool final)
@@ -885,7 +981,7 @@ static inline void update_cgrp_time_from_cpuctx(struct perf_cpu_context *cpuctx,
cgrp = container_of(css, struct perf_cgroup, css);
info = this_cpu_ptr(cgrp->info);
__update_cgrp_time(info, now, true);
update_cgrp_time(info, now);
if (final)
__store_release(&info->active, 0);
}
@@ -908,11 +1004,11 @@ static inline void update_cgrp_time_from_event(struct perf_event *event)
* Do not update time when cgroup is not active
*/
if (info->active)
__update_cgrp_time(info, perf_clock(), true);
update_cgrp_time(info, perf_clock());
}
static inline void
perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx)
perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx, bool guest)
{
struct perf_event_context *ctx = &cpuctx->ctx;
struct perf_cgroup *cgrp = cpuctx->cgrp;
@@ -932,8 +1028,12 @@ perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx)
for (css = &cgrp->css; css; css = css->parent) {
cgrp = container_of(css, struct perf_cgroup, css);
info = this_cpu_ptr(cgrp->info);
__update_cgrp_time(info, ctx->timestamp, false);
__store_release(&info->active, 1);
if (guest) {
__update_cgrp_guest_time(info, ctx->time.stamp, false);
} else {
update_perf_time_ctx(&info->time, ctx->time.stamp, false);
__store_release(&info->active, 1);
}
}
}
@@ -964,8 +1064,7 @@ static void perf_cgroup_switch(struct task_struct *task)
return;
WARN_ON_ONCE(cpuctx->ctx.nr_cgroups == 0);
perf_ctx_disable(&cpuctx->ctx, true);
perf_ctx_disable(&cpuctx->ctx, EVENT_CGROUP);
ctx_sched_out(&cpuctx->ctx, NULL, EVENT_ALL|EVENT_CGROUP);
/*
@@ -981,7 +1080,7 @@ static void perf_cgroup_switch(struct task_struct *task)
*/
ctx_sched_in(&cpuctx->ctx, NULL, EVENT_ALL|EVENT_CGROUP);
perf_ctx_enable(&cpuctx->ctx, true);
perf_ctx_enable(&cpuctx->ctx, EVENT_CGROUP);
}
static int perf_cgroup_ensure_storage(struct perf_event *event,
@@ -1138,7 +1237,7 @@ static inline int perf_cgroup_connect(pid_t pid, struct perf_event *event,
}
static inline void
perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx)
perf_cgroup_set_timestamp(struct perf_cpu_context *cpuctx, bool guest)
{
}
@@ -1550,29 +1649,24 @@ static void perf_unpin_context(struct perf_event_context *ctx)
*/
static void __update_context_time(struct perf_event_context *ctx, bool adv)
{
u64 now = perf_clock();
lockdep_assert_held(&ctx->lock);
if (adv)
ctx->time += now - ctx->timestamp;
ctx->timestamp = now;
update_perf_time_ctx(&ctx->time, perf_clock(), adv);
}
/*
* The above: time' = time + (now - timestamp), can be re-arranged
* into: time` = now + (time - timestamp), which gives a single value
* offset to compute future time without locks on.
*
* See perf_event_time_now(), which can be used from NMI context where
* it's (obviously) not possible to acquire ctx->lock in order to read
* both the above values in a consistent manner.
*/
WRITE_ONCE(ctx->timeoffset, ctx->time - ctx->timestamp);
static void __update_context_guest_time(struct perf_event_context *ctx, bool adv)
{
lockdep_assert_held(&ctx->lock);
/* must be called after __update_context_time(); */
update_perf_time_ctx(&ctx->timeguest, ctx->time.stamp, adv);
}
static void update_context_time(struct perf_event_context *ctx)
{
__update_context_time(ctx, true);
if (is_guest_mediated_pmu_loaded())
__update_context_guest_time(ctx, true);
}
static u64 perf_event_time(struct perf_event *event)
@@ -1585,7 +1679,7 @@ static u64 perf_event_time(struct perf_event *event)
if (is_cgroup_event(event))
return perf_cgroup_event_time(event);
return ctx->time;
return __perf_event_time_ctx(event, &ctx->time);
}
static u64 perf_event_time_now(struct perf_event *event, u64 now)
@@ -1599,10 +1693,9 @@ static u64 perf_event_time_now(struct perf_event *event, u64 now)
return perf_cgroup_event_time_now(event, now);
if (!(__load_acquire(&ctx->is_active) & EVENT_TIME))
return ctx->time;
return __perf_event_time_ctx(event, &ctx->time);
now += READ_ONCE(ctx->timeoffset);
return now;
return __perf_event_time_ctx_now(event, &ctx->time, now);
}
static enum event_type_t get_event_type(struct perf_event *event)
@@ -2422,20 +2515,23 @@ group_sched_out(struct perf_event *group_event, struct perf_event_context *ctx)
}
static inline void
__ctx_time_update(struct perf_cpu_context *cpuctx, struct perf_event_context *ctx, bool final)
__ctx_time_update(struct perf_cpu_context *cpuctx, struct perf_event_context *ctx,
bool final, enum event_type_t event_type)
{
if (ctx->is_active & EVENT_TIME) {
if (ctx->is_active & EVENT_FROZEN)
return;
update_context_time(ctx);
update_cgrp_time_from_cpuctx(cpuctx, final);
/* vPMU should not stop time */
update_cgrp_time_from_cpuctx(cpuctx, !(event_type & EVENT_GUEST) && final);
}
}
static inline void
ctx_time_update(struct perf_cpu_context *cpuctx, struct perf_event_context *ctx)
{
__ctx_time_update(cpuctx, ctx, false);
__ctx_time_update(cpuctx, ctx, false, 0);
}
/*
@@ -2861,14 +2957,15 @@ static void task_ctx_sched_out(struct perf_event_context *ctx,
static void perf_event_sched_in(struct perf_cpu_context *cpuctx,
struct perf_event_context *ctx,
struct pmu *pmu)
struct pmu *pmu,
enum event_type_t event_type)
{
ctx_sched_in(&cpuctx->ctx, pmu, EVENT_PINNED);
ctx_sched_in(&cpuctx->ctx, pmu, EVENT_PINNED | event_type);
if (ctx)
ctx_sched_in(ctx, pmu, EVENT_PINNED);
ctx_sched_in(&cpuctx->ctx, pmu, EVENT_FLEXIBLE);
ctx_sched_in(ctx, pmu, EVENT_PINNED | event_type);
ctx_sched_in(&cpuctx->ctx, pmu, EVENT_FLEXIBLE | event_type);
if (ctx)
ctx_sched_in(ctx, pmu, EVENT_FLEXIBLE);
ctx_sched_in(ctx, pmu, EVENT_FLEXIBLE | event_type);
}
/*
@@ -2902,11 +2999,11 @@ static void ctx_resched(struct perf_cpu_context *cpuctx,
event_type &= EVENT_ALL;
for_each_epc(epc, &cpuctx->ctx, pmu, false)
for_each_epc(epc, &cpuctx->ctx, pmu, 0)
perf_pmu_disable(epc->pmu);
if (task_ctx) {
for_each_epc(epc, task_ctx, pmu, false)
for_each_epc(epc, task_ctx, pmu, 0)
perf_pmu_disable(epc->pmu);
task_ctx_sched_out(task_ctx, pmu, event_type);
@@ -2924,13 +3021,13 @@ static void ctx_resched(struct perf_cpu_context *cpuctx,
else if (event_type & EVENT_PINNED)
ctx_sched_out(&cpuctx->ctx, pmu, EVENT_FLEXIBLE);
perf_event_sched_in(cpuctx, task_ctx, pmu);
perf_event_sched_in(cpuctx, task_ctx, pmu, 0);
for_each_epc(epc, &cpuctx->ctx, pmu, false)
for_each_epc(epc, &cpuctx->ctx, pmu, 0)
perf_pmu_enable(epc->pmu);
if (task_ctx) {
for_each_epc(epc, task_ctx, pmu, false)
for_each_epc(epc, task_ctx, pmu, 0)
perf_pmu_enable(epc->pmu);
}
}
@@ -3479,11 +3576,10 @@ static void
ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type)
{
struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context);
enum event_type_t active_type = event_type & ~EVENT_FLAGS;
struct perf_event_pmu_context *pmu_ctx;
int is_active = ctx->is_active;
bool cgroup = event_type & EVENT_CGROUP;
event_type &= ~EVENT_CGROUP;
lockdep_assert_held(&ctx->lock);
@@ -3507,14 +3603,14 @@ ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t
*
* would only update time for the pinned events.
*/
__ctx_time_update(cpuctx, ctx, ctx == &cpuctx->ctx);
__ctx_time_update(cpuctx, ctx, ctx == &cpuctx->ctx, event_type);
/*
* CPU-release for the below ->is_active store,
* see __load_acquire() in perf_event_time_now()
*/
barrier();
ctx->is_active &= ~event_type;
ctx->is_active &= ~active_type;
if (!(ctx->is_active & EVENT_ALL)) {
/*
@@ -3533,9 +3629,20 @@ ctx_sched_out(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t
cpuctx->task_ctx = NULL;
}
is_active ^= ctx->is_active; /* changed bits */
if (event_type & EVENT_GUEST) {
/*
* Schedule out all exclude_guest events of PMUs
* with PERF_PMU_CAP_MEDIATED_VPMU.
*/
is_active = EVENT_ALL;
__update_context_guest_time(ctx, false);
perf_cgroup_set_timestamp(cpuctx, true);
barrier();
} else {
is_active ^= ctx->is_active; /* changed bits */
}
for_each_epc(pmu_ctx, ctx, pmu, cgroup)
for_each_epc(pmu_ctx, ctx, pmu, event_type)
__pmu_ctx_sched_out(pmu_ctx, is_active);
}
@@ -3691,7 +3798,7 @@ perf_event_context_sched_out(struct task_struct *task, struct task_struct *next)
raw_spin_lock_nested(&next_ctx->lock, SINGLE_DEPTH_NESTING);
if (context_equiv(ctx, next_ctx)) {
perf_ctx_disable(ctx, false);
perf_ctx_disable(ctx, 0);
/* PMIs are disabled; ctx->nr_no_switch_fast is stable. */
if (local_read(&ctx->nr_no_switch_fast) ||
@@ -3715,7 +3822,7 @@ perf_event_context_sched_out(struct task_struct *task, struct task_struct *next)
perf_ctx_sched_task_cb(ctx, task, false);
perf_ctx_enable(ctx, false);
perf_ctx_enable(ctx, 0);
/*
* RCU_INIT_POINTER here is safe because we've not
@@ -3739,13 +3846,13 @@ perf_event_context_sched_out(struct task_struct *task, struct task_struct *next)
if (do_switch) {
raw_spin_lock(&ctx->lock);
perf_ctx_disable(ctx, false);
perf_ctx_disable(ctx, 0);
inside_switch:
perf_ctx_sched_task_cb(ctx, task, false);
task_ctx_sched_out(ctx, NULL, EVENT_ALL);
perf_ctx_enable(ctx, false);
perf_ctx_enable(ctx, 0);
raw_spin_unlock(&ctx->lock);
}
}
@@ -3992,10 +4099,15 @@ static inline void group_update_userpage(struct perf_event *group_event)
event_update_userpage(event);
}
struct merge_sched_data {
int can_add_hw;
enum event_type_t event_type;
};
static int merge_sched_in(struct perf_event *event, void *data)
{
struct perf_event_context *ctx = event->ctx;
int *can_add_hw = data;
struct merge_sched_data *msd = data;
if (event->state <= PERF_EVENT_STATE_OFF)
return 0;
@@ -4003,13 +4115,22 @@ static int merge_sched_in(struct perf_event *event, void *data)
if (!event_filter_match(event))
return 0;
if (group_can_go_on(event, *can_add_hw)) {
/*
* Don't schedule in any host events from a PMU with
* PERF_PMU_CAP_MEDIATED_VPMU while a guest is running.
*/
if (is_guest_mediated_pmu_loaded() &&
event->pmu_ctx->pmu->capabilities & PERF_PMU_CAP_MEDIATED_VPMU &&
!(msd->event_type & EVENT_GUEST))
return 0;
if (group_can_go_on(event, msd->can_add_hw)) {
if (!group_sched_in(event, ctx))
list_add_tail(&event->active_list, get_event_list(event));
}
if (event->state == PERF_EVENT_STATE_INACTIVE) {
*can_add_hw = 0;
msd->can_add_hw = 0;
if (event->attr.pinned) {
perf_cgroup_event_disable(event, ctx);
perf_event_set_state(event, PERF_EVENT_STATE_ERROR);
@@ -4032,11 +4153,15 @@ static int merge_sched_in(struct perf_event *event, void *data)
static void pmu_groups_sched_in(struct perf_event_context *ctx,
struct perf_event_groups *groups,
struct pmu *pmu)
struct pmu *pmu,
enum event_type_t event_type)
{
int can_add_hw = 1;
struct merge_sched_data msd = {
.can_add_hw = 1,
.event_type = event_type,
};
visit_groups_merge(ctx, groups, smp_processor_id(), pmu,
merge_sched_in, &can_add_hw);
merge_sched_in, &msd);
}
static void __pmu_ctx_sched_in(struct perf_event_pmu_context *pmu_ctx,
@@ -4045,20 +4170,18 @@ static void __pmu_ctx_sched_in(struct perf_event_pmu_context *pmu_ctx,
struct perf_event_context *ctx = pmu_ctx->ctx;
if (event_type & EVENT_PINNED)
pmu_groups_sched_in(ctx, &ctx->pinned_groups, pmu_ctx->pmu);
pmu_groups_sched_in(ctx, &ctx->pinned_groups, pmu_ctx->pmu, event_type);
if (event_type & EVENT_FLEXIBLE)
pmu_groups_sched_in(ctx, &ctx->flexible_groups, pmu_ctx->pmu);
pmu_groups_sched_in(ctx, &ctx->flexible_groups, pmu_ctx->pmu, event_type);
}
static void
ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t event_type)
{
struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context);
enum event_type_t active_type = event_type & ~EVENT_FLAGS;
struct perf_event_pmu_context *pmu_ctx;
int is_active = ctx->is_active;
bool cgroup = event_type & EVENT_CGROUP;
event_type &= ~EVENT_CGROUP;
lockdep_assert_held(&ctx->lock);
@@ -4066,9 +4189,11 @@ ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t
return;
if (!(is_active & EVENT_TIME)) {
/* EVENT_TIME should be active while the guest runs */
WARN_ON_ONCE(event_type & EVENT_GUEST);
/* start ctx time */
__update_context_time(ctx, false);
perf_cgroup_set_timestamp(cpuctx);
perf_cgroup_set_timestamp(cpuctx, false);
/*
* CPU-release for the below ->is_active store,
* see __load_acquire() in perf_event_time_now()
@@ -4076,7 +4201,7 @@ ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t
barrier();
}
ctx->is_active |= (event_type | EVENT_TIME);
ctx->is_active |= active_type | EVENT_TIME;
if (ctx->task) {
if (!(is_active & EVENT_ALL))
cpuctx->task_ctx = ctx;
@@ -4084,21 +4209,37 @@ ctx_sched_in(struct perf_event_context *ctx, struct pmu *pmu, enum event_type_t
WARN_ON_ONCE(cpuctx->task_ctx != ctx);
}
is_active ^= ctx->is_active; /* changed bits */
if (event_type & EVENT_GUEST) {
/*
* Schedule in the required exclude_guest events of PMUs
* with PERF_PMU_CAP_MEDIATED_VPMU.
*/
is_active = event_type & EVENT_ALL;
/*
* Update ctx time to set the new start time for
* the exclude_guest events.
*/
update_context_time(ctx);
update_cgrp_time_from_cpuctx(cpuctx, false);
barrier();
} else {
is_active ^= ctx->is_active; /* changed bits */
}
/*
* First go through the list and put on any pinned groups
* in order to give them the best chance of going on.
*/
if (is_active & EVENT_PINNED) {
for_each_epc(pmu_ctx, ctx, pmu, cgroup)
__pmu_ctx_sched_in(pmu_ctx, EVENT_PINNED);
for_each_epc(pmu_ctx, ctx, pmu, event_type)
__pmu_ctx_sched_in(pmu_ctx, EVENT_PINNED | (event_type & EVENT_GUEST));
}
/* Then walk through the lower prio flexible groups */
if (is_active & EVENT_FLEXIBLE) {
for_each_epc(pmu_ctx, ctx, pmu, cgroup)
__pmu_ctx_sched_in(pmu_ctx, EVENT_FLEXIBLE);
for_each_epc(pmu_ctx, ctx, pmu, event_type)
__pmu_ctx_sched_in(pmu_ctx, EVENT_FLEXIBLE | (event_type & EVENT_GUEST));
}
}
@@ -4114,11 +4255,11 @@ static void perf_event_context_sched_in(struct task_struct *task)
if (cpuctx->task_ctx == ctx) {
perf_ctx_lock(cpuctx, ctx);
perf_ctx_disable(ctx, false);
perf_ctx_disable(ctx, 0);
perf_ctx_sched_task_cb(ctx, task, true);
perf_ctx_enable(ctx, false);
perf_ctx_enable(ctx, 0);
perf_ctx_unlock(cpuctx, ctx);
goto rcu_unlock;
}
@@ -4131,7 +4272,7 @@ static void perf_event_context_sched_in(struct task_struct *task)
if (!ctx->nr_events)
goto unlock;
perf_ctx_disable(ctx, false);
perf_ctx_disable(ctx, 0);
/*
* We want to keep the following priority order:
* cpu pinned (that don't need to move), task pinned,
@@ -4141,18 +4282,18 @@ static void perf_event_context_sched_in(struct task_struct *task)
* events, no need to flip the cpuctx's events around.
*/
if (!RB_EMPTY_ROOT(&ctx->pinned_groups.tree)) {
perf_ctx_disable(&cpuctx->ctx, false);
perf_ctx_disable(&cpuctx->ctx, 0);
ctx_sched_out(&cpuctx->ctx, NULL, EVENT_FLEXIBLE);
}
perf_event_sched_in(cpuctx, ctx, NULL);
perf_event_sched_in(cpuctx, ctx, NULL, 0);
perf_ctx_sched_task_cb(cpuctx->task_ctx, task, true);
if (!RB_EMPTY_ROOT(&ctx->pinned_groups.tree))
perf_ctx_enable(&cpuctx->ctx, false);
perf_ctx_enable(&cpuctx->ctx, 0);
perf_ctx_enable(ctx, false);
perf_ctx_enable(ctx, 0);
unlock:
perf_ctx_unlock(cpuctx, ctx);
@@ -5280,9 +5421,20 @@ attach_task_ctx_data(struct task_struct *task, struct kmem_cache *ctx_cache,
return -ENOMEM;
for (;;) {
if (try_cmpxchg((struct perf_ctx_data **)&task->perf_ctx_data, &old, cd)) {
if (try_cmpxchg(&task->perf_ctx_data, &old, cd)) {
if (old)
perf_free_ctx_data_rcu(old);
/*
* Above try_cmpxchg() pairs with try_cmpxchg() from
* detach_task_ctx_data() such that
* if we race with perf_event_exit_task(), we must
* observe PF_EXITING.
*/
if (task->flags & PF_EXITING) {
/* detach_task_ctx_data() may free it already */
if (try_cmpxchg(&task->perf_ctx_data, &cd, NULL))
perf_free_ctx_data_rcu(cd);
}
return 0;
}
@@ -5328,6 +5480,8 @@ attach_global_ctx_data(struct kmem_cache *ctx_cache)
/* Allocate everything */
scoped_guard (rcu) {
for_each_process_thread(g, p) {
if (p->flags & PF_EXITING)
continue;
cd = rcu_dereference(p->perf_ctx_data);
if (cd && !cd->global) {
cd->global = 1;
@@ -5594,6 +5748,8 @@ static void __free_event(struct perf_event *event)
{
struct pmu *pmu = event->pmu;
security_perf_event_free(event);
if (event->attach_state & PERF_ATTACH_CALLCHAIN)
put_callchain_buffers();
@@ -5647,6 +5803,8 @@ static void __free_event(struct perf_event *event)
call_rcu(&event->rcu_head, free_event_rcu);
}
static void mediated_pmu_unaccount_event(struct perf_event *event);
DEFINE_FREE(__free_event, struct perf_event *, if (_T) __free_event(_T))
/* vs perf_event_alloc() success */
@@ -5656,8 +5814,7 @@ static void _free_event(struct perf_event *event)
irq_work_sync(&event->pending_disable_irq);
unaccount_event(event);
security_perf_event_free(event);
mediated_pmu_unaccount_event(event);
if (event->rb) {
/*
@@ -6180,6 +6337,138 @@ u64 perf_event_pause(struct perf_event *event, bool reset)
}
EXPORT_SYMBOL_GPL(perf_event_pause);
#ifdef CONFIG_PERF_GUEST_MEDIATED_PMU
static atomic_t nr_include_guest_events __read_mostly;
static atomic_t nr_mediated_pmu_vms __read_mostly;
static DEFINE_MUTEX(perf_mediated_pmu_mutex);
/* !exclude_guest event of PMU with PERF_PMU_CAP_MEDIATED_VPMU */
static inline bool is_include_guest_event(struct perf_event *event)
{
if ((event->pmu->capabilities & PERF_PMU_CAP_MEDIATED_VPMU) &&
!event->attr.exclude_guest)
return true;
return false;
}
static int mediated_pmu_account_event(struct perf_event *event)
{
if (!is_include_guest_event(event))
return 0;
if (atomic_inc_not_zero(&nr_include_guest_events))
return 0;
guard(mutex)(&perf_mediated_pmu_mutex);
if (atomic_read(&nr_mediated_pmu_vms))
return -EOPNOTSUPP;
atomic_inc(&nr_include_guest_events);
return 0;
}
static void mediated_pmu_unaccount_event(struct perf_event *event)
{
if (!is_include_guest_event(event))
return;
if (WARN_ON_ONCE(!atomic_read(&nr_include_guest_events)))
return;
atomic_dec(&nr_include_guest_events);
}
/*
* Currently invoked at VM creation to
* - Check whether there are existing !exclude_guest events of PMUs with
* PERF_PMU_CAP_MEDIATED_VPMU
* - Set nr_mediated_pmu_vms to prevent !exclude_guest event creation on
* PMUs with PERF_PMU_CAP_MEDIATED_VPMU
*
* No impact on PMUs without PERF_PMU_CAP_MEDIATED_VPMU; perf
* still owns all of their resources.
*/
int perf_create_mediated_pmu(void)
{
if (atomic_inc_not_zero(&nr_mediated_pmu_vms))
return 0;
guard(mutex)(&perf_mediated_pmu_mutex);
if (atomic_read(&nr_include_guest_events))
return -EBUSY;
atomic_inc(&nr_mediated_pmu_vms);
return 0;
}
EXPORT_SYMBOL_FOR_KVM(perf_create_mediated_pmu);
void perf_release_mediated_pmu(void)
{
if (WARN_ON_ONCE(!atomic_read(&nr_mediated_pmu_vms)))
return;
atomic_dec(&nr_mediated_pmu_vms);
}
EXPORT_SYMBOL_FOR_KVM(perf_release_mediated_pmu);
/* When loading a guest's mediated PMU, schedule out all exclude_guest events. */
void perf_load_guest_context(void)
{
struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context);
lockdep_assert_irqs_disabled();
guard(perf_ctx_lock)(cpuctx, cpuctx->task_ctx);
if (WARN_ON_ONCE(__this_cpu_read(guest_ctx_loaded)))
return;
perf_ctx_disable(&cpuctx->ctx, EVENT_GUEST);
ctx_sched_out(&cpuctx->ctx, NULL, EVENT_GUEST);
if (cpuctx->task_ctx) {
perf_ctx_disable(cpuctx->task_ctx, EVENT_GUEST);
task_ctx_sched_out(cpuctx->task_ctx, NULL, EVENT_GUEST);
}
perf_ctx_enable(&cpuctx->ctx, EVENT_GUEST);
if (cpuctx->task_ctx)
perf_ctx_enable(cpuctx->task_ctx, EVENT_GUEST);
__this_cpu_write(guest_ctx_loaded, true);
}
EXPORT_SYMBOL_GPL(perf_load_guest_context);
void perf_put_guest_context(void)
{
struct perf_cpu_context *cpuctx = this_cpu_ptr(&perf_cpu_context);
lockdep_assert_irqs_disabled();
guard(perf_ctx_lock)(cpuctx, cpuctx->task_ctx);
if (WARN_ON_ONCE(!__this_cpu_read(guest_ctx_loaded)))
return;
perf_ctx_disable(&cpuctx->ctx, EVENT_GUEST);
if (cpuctx->task_ctx)
perf_ctx_disable(cpuctx->task_ctx, EVENT_GUEST);
perf_event_sched_in(cpuctx, cpuctx->task_ctx, NULL, EVENT_GUEST);
if (cpuctx->task_ctx)
perf_ctx_enable(cpuctx->task_ctx, EVENT_GUEST);
perf_ctx_enable(&cpuctx->ctx, EVENT_GUEST);
__this_cpu_write(guest_ctx_loaded, false);
}
EXPORT_SYMBOL_GPL(perf_put_guest_context);
#else
static int mediated_pmu_account_event(struct perf_event *event) { return 0; }
static void mediated_pmu_unaccount_event(struct perf_event *event) {}
#endif
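
For context (illustrative only, not taken from this patch set): a hypervisor is expected to pair these calls roughly as follows, where run_vcpu() is a placeholder for the arch VM-entry loop and error handling is abbreviated:

	int ret;

	/* VM creation: fails with -EBUSY if !exclude_guest events exist. */
	ret = perf_create_mediated_pmu();
	if (ret)
		return ret;

	/* Around every guest run, with interrupts disabled: */
	local_irq_disable();
	perf_load_guest_context();      /* schedule out exclude_guest events */
	run_vcpu();                     /* placeholder for the VM-entry code */
	perf_put_guest_context();       /* schedule host events back in      */
	local_irq_enable();

	/* VM teardown: */
	perf_release_mediated_pmu();

In the KVM integration the load/put calls are expected to sit on the VM-entry path, where interrupts are already disabled, which is what the lockdep_assert_irqs_disabled() checks above enforce.
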
/*
* Holding the top-level event's child_mutex means that any
* descendant process that has inherited this event will block
@@ -6547,23 +6836,23 @@ void perf_event_update_userpage(struct perf_event *event)
if (!rb)
goto unlock;
/*
* compute total_time_enabled, total_time_running
* based on snapshot values taken when the event
* was last scheduled in.
*
* we cannot simply called update_context_time()
* because of locking issue as we can be called in
* NMI context
*/
calc_timer_values(event, &now, &enabled, &running);
userpg = rb->user_page;
/*
* Disable preemption to guarantee consistent time stamps are stored to
* the user page.
*/
preempt_disable();
/*
* Compute total_time_enabled, total_time_running based on snapshot
* values taken when the event was last scheduled in.
*
* We cannot simply call update_context_time() because doing so would
* lead to deadlock when called from NMI context.
*/
calc_timer_values(event, &now, &enabled, &running);
userpg = rb->user_page;
++userpg->lock;
barrier();
userpg->index = perf_event_index(event);
@@ -7383,6 +7672,7 @@ struct perf_guest_info_callbacks __rcu *perf_guest_cbs;
DEFINE_STATIC_CALL_RET0(__perf_guest_state, *perf_guest_cbs->state);
DEFINE_STATIC_CALL_RET0(__perf_guest_get_ip, *perf_guest_cbs->get_ip);
DEFINE_STATIC_CALL_RET0(__perf_guest_handle_intel_pt_intr, *perf_guest_cbs->handle_intel_pt_intr);
DEFINE_STATIC_CALL_RET0(__perf_guest_handle_mediated_pmi, *perf_guest_cbs->handle_mediated_pmi);
void perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *cbs)
{
@@ -7397,6 +7687,10 @@ void perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *cbs)
if (cbs->handle_intel_pt_intr)
static_call_update(__perf_guest_handle_intel_pt_intr,
cbs->handle_intel_pt_intr);
if (cbs->handle_mediated_pmi)
static_call_update(__perf_guest_handle_mediated_pmi,
cbs->handle_mediated_pmi);
}
EXPORT_SYMBOL_GPL(perf_register_guest_info_callbacks);
@@ -7408,8 +7702,8 @@ void perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *cbs)
rcu_assign_pointer(perf_guest_cbs, NULL);
static_call_update(__perf_guest_state, (void *)&__static_call_return0);
static_call_update(__perf_guest_get_ip, (void *)&__static_call_return0);
static_call_update(__perf_guest_handle_intel_pt_intr,
(void *)&__static_call_return0);
static_call_update(__perf_guest_handle_intel_pt_intr, (void *)&__static_call_return0);
static_call_update(__perf_guest_handle_mediated_pmi, (void *)&__static_call_return0);
synchronize_rcu();
}
EXPORT_SYMBOL_GPL(perf_unregister_guest_info_callbacks);
@@ -7869,13 +8163,11 @@ static void perf_output_read(struct perf_output_handle *handle,
u64 read_format = event->attr.read_format;
/*
* compute total_time_enabled, total_time_running
* based on snapshot values taken when the event
* was last scheduled in.
* Compute total_time_enabled, total_time_running based on snapshot
* values taken when the event was last scheduled in.
*
* we cannot simply called update_context_time()
* because of locking issue as we are called in
* NMI context
* We cannot simply call update_context_time() because doing so would
* lead to deadlock when called from NMI context.
*/
if (read_format & PERF_FORMAT_TOTAL_TIMES)
calc_timer_values(event, &now, &enabled, &running);
@@ -12043,7 +12335,7 @@ static void task_clock_event_update(struct perf_event *event, u64 now)
static void task_clock_event_start(struct perf_event *event, int flags)
{
event->hw.state = 0;
local64_set(&event->hw.prev_count, event->ctx->time);
local64_set(&event->hw.prev_count, event->ctx->time.time);
perf_swevent_start_hrtimer(event);
}
@@ -12052,7 +12344,7 @@ static void task_clock_event_stop(struct perf_event *event, int flags)
event->hw.state = PERF_HES_STOPPED;
perf_swevent_cancel_hrtimer(event);
if (flags & PERF_EF_UPDATE)
task_clock_event_update(event, event->ctx->time);
task_clock_event_update(event, event->ctx->time.time);
}
static int task_clock_event_add(struct perf_event *event, int flags)
@@ -12072,8 +12364,8 @@ static void task_clock_event_del(struct perf_event *event, int flags)
static void task_clock_event_read(struct perf_event *event)
{
u64 now = perf_clock();
u64 delta = now - event->ctx->timestamp;
u64 time = event->ctx->time + delta;
u64 delta = now - event->ctx->time.stamp;
u64 time = event->ctx->time.time + delta;
task_clock_event_update(event, time);
}
@@ -13155,6 +13447,10 @@ perf_event_alloc(struct perf_event_attr *attr, int cpu,
if (err)
return ERR_PTR(err);
err = mediated_pmu_account_event(event);
if (err)
return ERR_PTR(err);
/* symmetric to unaccount_event() in _free_event() */
account_event(event);
@@ -14294,8 +14590,11 @@ void perf_event_exit_task(struct task_struct *task)
/*
* Detach the perf_ctx_data for the system-wide event.
*
* Done without holding global_ctx_data_rwsem; typically
* attach_global_ctx_data() will skip over this task, but otherwise
* attach_task_ctx_data() will observe PF_EXITING.
*/
guard(percpu_read)(&global_ctx_data_rwsem);
detach_task_ctx_data(task);
}
@@ -14798,7 +15097,8 @@ static void perf_event_exit_cpu_context(int cpu)
ctx = &cpuctx->ctx;
mutex_lock(&ctx->mutex);
smp_call_function_single(cpu, __perf_event_exit_context, ctx, 1);
if (ctx->nr_events)
smp_call_function_single(cpu, __perf_event_exit_context, ctx, 1);
cpuctx->online = 0;
mutex_unlock(&ctx->mutex);
mutex_unlock(&pmus_lock);

@@ -179,16 +179,16 @@ bool __weak is_trap_insn(uprobe_opcode_t *insn)
void uprobe_copy_from_page(struct page *page, unsigned long vaddr, void *dst, int len)
{
void *kaddr = kmap_atomic(page);
void *kaddr = kmap_local_page(page);
memcpy(dst, kaddr + (vaddr & ~PAGE_MASK), len);
kunmap_atomic(kaddr);
kunmap_local(kaddr);
}
static void copy_to_page(struct page *page, unsigned long vaddr, const void *src, int len)
{
void *kaddr = kmap_atomic(page);
void *kaddr = kmap_local_page(page);
memcpy(kaddr + (vaddr & ~PAGE_MASK), src, len);
kunmap_atomic(kaddr);
kunmap_local(kaddr);
}
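
Aside: the converted helpers are essentially what the generic memcpy_from_page()/memcpy_to_page() wrappers from <linux/highmem.h> do internally (memcpy_to_page() additionally flushes the dcache); the patch simply keeps the open-coded form. An equivalent formulation would be:

	void uprobe_copy_from_page(struct page *page, unsigned long vaddr,
				   void *dst, int len)
	{
		memcpy_from_page(dst, page, vaddr & ~PAGE_MASK, len);
	}

	static void copy_to_page(struct page *page, unsigned long vaddr,
				 const void *src, int len)
	{
		memcpy_to_page(page, vaddr & ~PAGE_MASK, src, len);
	}
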
static int verify_opcode(struct page *page, unsigned long vaddr, uprobe_opcode_t *insn,
@@ -323,7 +323,7 @@ __update_ref_ctr(struct mm_struct *mm, unsigned long vaddr, short d)
return ret == 0 ? -EBUSY : ret;
}
kaddr = kmap_atomic(page);
kaddr = kmap_local_page(page);
ptr = kaddr + (vaddr & ~PAGE_MASK);
if (unlikely(*ptr + d < 0)) {
@@ -336,7 +336,7 @@ __update_ref_ctr(struct mm_struct *mm, unsigned long vaddr, short d)
*ptr += d;
ret = 0;
out:
kunmap_atomic(kaddr);
kunmap_local(kaddr);
put_page(page);
return ret;
}
@@ -1138,7 +1138,7 @@ static bool filter_chain(struct uprobe *uprobe, struct mm_struct *mm)
bool ret = false;
down_read(&uprobe->consumer_rwsem);
list_for_each_entry_rcu(uc, &uprobe->consumers, cons_node, rcu_read_lock_trace_held()) {
list_for_each_entry(uc, &uprobe->consumers, cons_node) {
ret = consumer_filter(uc, mm);
if (ret)
break;
@@ -1694,6 +1694,12 @@ static const struct vm_special_mapping xol_mapping = {
.mremap = xol_mremap,
};
unsigned long __weak arch_uprobe_get_xol_area(void)
{
/* Try to map as high as possible, this is only a hint. */
return get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE, PAGE_SIZE, 0, 0);
}
/* Slot allocation for XOL */
static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
{
@@ -1709,9 +1715,7 @@ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
}
if (!area->vaddr) {
/* Try to map as high as possible, this is only a hint. */
area->vaddr = get_unmapped_area(NULL, TASK_SIZE - PAGE_SIZE,
PAGE_SIZE, 0, 0);
area->vaddr = arch_uprobe_get_xol_area();
if (IS_ERR_VALUE(area->vaddr)) {
ret = area->vaddr;
goto fail;

@@ -31,6 +31,7 @@ static int unwind_user_next_common(struct unwind_user_state *state,
{
unsigned long cfa, fp, ra;
/* Get the Canonical Frame Address (CFA) */
if (frame->use_fp) {
if (state->fp < state->sp)
return -EINVAL;
@@ -38,11 +39,9 @@ static int unwind_user_next_common(struct unwind_user_state *state,
} else {
cfa = state->sp;
}
/* Get the Canonical Frame Address (CFA) */
cfa += frame->cfa_off;
/* stack going in wrong direction? */
/* Make sure the stack is not going in the wrong direction */
if (cfa <= state->sp)
return -EINVAL;
@@ -50,10 +49,11 @@ static int unwind_user_next_common(struct unwind_user_state *state,
if (cfa & (state->ws - 1))
return -EINVAL;
/* Find the Return Address (RA) */
/* Get the Return Address (RA) */
if (get_user_word(&ra, cfa, frame->ra_off, state->ws))
return -EINVAL;
/* Get the Frame Pointer (FP) */
if (frame->fp_off && get_user_word(&fp, cfa, frame->fp_off, state->ws))
return -EINVAL;
@@ -67,7 +67,6 @@ static int unwind_user_next_common(struct unwind_user_state *state,
static int unwind_user_next_fp(struct unwind_user_state *state)
{
#ifdef CONFIG_HAVE_UNWIND_USER_FP
struct pt_regs *regs = task_pt_regs(current);
if (state->topmost && unwind_user_at_function_start(regs)) {
@@ -81,9 +80,6 @@ static int unwind_user_next_fp(struct unwind_user_state *state)
ARCH_INIT_USER_FP_FRAME(state->ws)
};
return unwind_user_next_common(state, &fp_frame);
#else
return -EINVAL;
#endif
}
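
For reference, struct unwind_user_frame gives the CFA as an offset from the FP (or SP) and the RA/previous-FP slots as offsets from the CFA. For a typical frame-pointer ABI such as x86-64 (CFA two words above the FP, return address one word below the CFA, previous FP two words below it), the per-arch macro is expected to look roughly like the illustrative sketch below; the authoritative definitions live in the arch headers:

	#define ARCH_INIT_USER_FP_FRAME(ws)		\
		.cfa_off	=  2 * (s32)(ws),	\
		.ra_off		= -1 * (s32)(ws),	\
		.fp_off		= -2 * (s32)(ws),	\
		.use_fp		= true,
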
static int unwind_user_next(struct unwind_user_state *state)

@@ -1330,14 +1330,16 @@ union perf_mem_data_src {
mem_snoopx : 2, /* Snoop mode, ext */
mem_blk : 3, /* Access blocked */
mem_hops : 3, /* Hop level */
mem_rsvd : 18;
mem_region : 5, /* cache/memory regions */
mem_rsvd : 13;
};
};
#elif defined(__BIG_ENDIAN_BITFIELD)
union perf_mem_data_src {
__u64 val;
struct {
__u64 mem_rsvd : 18,
__u64 mem_rsvd : 13,
mem_region : 5, /* cache/memory regions */
mem_hops : 3, /* Hop level */
mem_blk : 3, /* Access blocked */
mem_snoopx : 2, /* Snoop mode, ext */
@@ -1394,7 +1396,7 @@ union perf_mem_data_src {
#define PERF_MEM_LVLNUM_L4 0x0004 /* L4 */
#define PERF_MEM_LVLNUM_L2_MHB 0x0005 /* L2 Miss Handling Buffer */
#define PERF_MEM_LVLNUM_MSC 0x0006 /* Memory-side Cache */
/* 0x007 available */
#define PERF_MEM_LVLNUM_L0 0x0007 /* L0 */
#define PERF_MEM_LVLNUM_UNC 0x0008 /* Uncached */
#define PERF_MEM_LVLNUM_CXL 0x0009 /* CXL */
#define PERF_MEM_LVLNUM_IO 0x000a /* I/O */
@@ -1447,6 +1449,25 @@ union perf_mem_data_src {
/* 5-7 available */
#define PERF_MEM_HOPS_SHIFT 43
/* Cache/Memory region */
#define PERF_MEM_REGION_NA 0x0 /* Invalid */
#define PERF_MEM_REGION_RSVD 0x01 /* Reserved */
#define PERF_MEM_REGION_L_SHARE 0x02 /* Local CA shared cache */
#define PERF_MEM_REGION_L_NON_SHARE 0x03 /* Local CA non-shared cache */
#define PERF_MEM_REGION_O_IO 0x04 /* Other CA IO agent */
#define PERF_MEM_REGION_O_SHARE 0x05 /* Other CA shared cache */
#define PERF_MEM_REGION_O_NON_SHARE 0x06 /* Other CA non-shared cache */
#define PERF_MEM_REGION_MMIO 0x07 /* MMIO */
#define PERF_MEM_REGION_MEM0 0x08 /* Memory region 0 */
#define PERF_MEM_REGION_MEM1 0x09 /* Memory region 1 */
#define PERF_MEM_REGION_MEM2 0x0a /* Memory region 2 */
#define PERF_MEM_REGION_MEM3 0x0b /* Memory region 3 */
#define PERF_MEM_REGION_MEM4 0x0c /* Memory region 4 */
#define PERF_MEM_REGION_MEM5 0x0d /* Memory region 5 */
#define PERF_MEM_REGION_MEM6 0x0e /* Memory region 6 */
#define PERF_MEM_REGION_MEM7 0x0f /* Memory region 7 */
#define PERF_MEM_REGION_SHIFT 46
#define PERF_MEM_S(a, s) \
(((__u64)PERF_MEM_##a##_##s) << PERF_MEM_##a##_SHIFT)
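
As with the existing fields, the new region bits are meant to be composed into data_src via PERF_MEM_S(). A purely illustrative encoding (the field combination is made up, not taken from the DMR documentation) for a load that hit the new L0 level in a local shared cache region:

	__u64 data_src = PERF_MEM_S(OP, LOAD)	|
			 PERF_MEM_S(LVLNUM, L0)	|
			 PERF_MEM_S(REGION, L_SHARE);
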

@@ -77,7 +77,8 @@
*/
#define IRQ_WORK_VECTOR 0xf6
/* 0xf5 - unused, was UV_BAU_MESSAGE */
#define PERF_GUEST_MEDIATED_PMI_VECTOR 0xf5
#define DEFERRED_ERROR_VECTOR 0xf4
/* Vector on which hypervisor callbacks will be delivered */

@@ -939,6 +939,7 @@ static bool perf_pmu__match_wildcard(const char *pmu_name, const char *tok)
{
const char *p, *suffix;
bool has_hex = false;
bool has_underscore = false;
size_t tok_len = strlen(tok);
/* Check start of pmu_name for equality. */
@@ -949,13 +950,14 @@ static bool perf_pmu__match_wildcard(const char *pmu_name, const char *tok)
if (*p == 0)
return true;
if (*p == '_') {
++p;
++suffix;
}
/* Ensure we end in a number */
/* Ensure we end in a number, or a mix of numbers and '_'. */
while (1) {
if (!has_underscore && (*p == '_')) {
has_underscore = true;
++p;
++suffix;
}
if (!isxdigit(*p))
return false;
if (!has_hex)
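
The practical effect, with hypothetical PMU names for illustration:

	perf_pmu__match_wildcard("uncore_imc_0",   "uncore_imc");	/* true, as before                      */
	perf_pmu__match_wildcard("uncore_imc_0_1", "uncore_imc");	/* now also true: one '_' in the suffix */
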

@@ -6482,11 +6482,14 @@ static struct perf_guest_info_callbacks kvm_guest_cbs = {
.state = kvm_guest_state,
.get_ip = kvm_guest_get_ip,
.handle_intel_pt_intr = NULL,
.handle_mediated_pmi = NULL,
};
void kvm_register_perf_callbacks(unsigned int (*pt_intr_handler)(void))
{
kvm_guest_cbs.handle_intel_pt_intr = pt_intr_handler;
kvm_guest_cbs.handle_mediated_pmi = NULL;
perf_register_guest_info_callbacks(&kvm_guest_cbs);
}
void kvm_unregister_perf_callbacks(void)