From 010e3e8fc12b1c13ce19821a11d8930226ebb4b6 Mon Sep 17 00:00:00 2001 From: Raphael Gault Date: Tue, 11 Jun 2019 13:53:09 +0100 Subject: [PATCH 01/25] perf tests arm64: Compile tests unconditionally In order to subsequently add more tests for the arm64 architecture we compile the tests target for arm64 systematically. Further explanation provided by Mark Rutland: Given prior questions regarding this commit, it's probably worth spelling things out more explicitly, e.g. Currently we only build the arm64/tests directory if CONFIG_DWARF_UNWIND is selected, which is fine as the only test we have is arm64/tests/dwarf-unwind.o. So that we can add more tests to the test directory, let's unconditionally build the directory, but conditionally build dwarf-unwind.o depending on CONFIG_DWARF_UNWIND. There should be no functional change as a result of this patch. Signed-off-by: Raphael Gault Acked-by: Mark Rutland Cc: Catalin Marinas Cc: Peter Zijlstra Cc: Will Deacon Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190611125315.18736-2-raphael.gault@arm.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/arch/arm64/Build | 2 +- tools/perf/arch/arm64/tests/Build | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/perf/arch/arm64/Build b/tools/perf/arch/arm64/Build index 36222e64bbf7..a7dd46a5b678 100644 --- a/tools/perf/arch/arm64/Build +++ b/tools/perf/arch/arm64/Build @@ -1,2 +1,2 @@ perf-y += util/ -perf-$(CONFIG_DWARF_UNWIND) += tests/ +perf-y += tests/ diff --git a/tools/perf/arch/arm64/tests/Build b/tools/perf/arch/arm64/tests/Build index 41707fea74b3..a61c06bdb757 100644 --- a/tools/perf/arch/arm64/tests/Build +++ b/tools/perf/arch/arm64/tests/Build @@ -1,4 +1,4 @@ perf-y += regs_load.o -perf-y += dwarf-unwind.o +perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o perf-y += arch-tests.o From 374d910f87b87283df2b1e8a60a6a546d4a14c90 Mon Sep 17 00:00:00 2001 From: Mathieu Poirier Date: Tue, 11 Jun 2019 14:45:28 -0600 Subject: [PATCH 02/25] perf: cs-etm: Optimize option setup for CPU-wide sessions Call function cs_etm_set_option() once with all relevant options set rather than multiple times to avoid going through the list of CPU more than once. Suggested-by: Suzuki K Poulose Signed-off-by: Mathieu Poirier Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Namhyung Kim Cc: Peter Zijlstra Cc: linux-arm-kernel@lists.infradead.org Link: http://lkml.kernel.org/r/20190611204528.20093-1-mathieu.poirier@linaro.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/arch/arm/util/cs-etm.c | 20 ++++++++------------ 1 file changed, 8 insertions(+), 12 deletions(-) diff --git a/tools/perf/arch/arm/util/cs-etm.c b/tools/perf/arch/arm/util/cs-etm.c index 279c69caef91..c6f1ab5499b5 100644 --- a/tools/perf/arch/arm/util/cs-etm.c +++ b/tools/perf/arch/arm/util/cs-etm.c @@ -162,20 +162,19 @@ static int cs_etm_set_option(struct auxtrace_record *itr, !cpu_map__has(online_cpus, i)) continue; - switch (option) { - case ETM_OPT_CTXTID: + if (option & ETM_OPT_CTXTID) { err = cs_etm_set_context_id(itr, evsel, i); if (err) goto out; - break; - case ETM_OPT_TS: + } + if (option & ETM_OPT_TS) { err = cs_etm_set_timestamp(itr, evsel, i); if (err) goto out; - break; - default: - goto out; } + if (option & ~(ETM_OPT_CTXTID | ETM_OPT_TS)) + /* Nothing else is currently supported */ + goto out; } err = 0; @@ -398,11 +397,8 @@ static int cs_etm_recording_options(struct auxtrace_record *itr, if (!cpu_map__empty(cpus)) { perf_evsel__set_sample_bit(cs_etm_evsel, CPU); - err = cs_etm_set_option(itr, cs_etm_evsel, ETM_OPT_CTXTID); - if (err) - goto out; - - err = cs_etm_set_option(itr, cs_etm_evsel, ETM_OPT_TS); + err = cs_etm_set_option(itr, cs_etm_evsel, + ETM_OPT_CTXTID | ETM_OPT_TS); if (err) goto out; } From edff7809c80f09398783d602c33a507309c23e24 Mon Sep 17 00:00:00 2001 From: Adrian Hunter Date: Mon, 10 Jun 2019 10:27:53 +0300 Subject: [PATCH 03/25] perf intel-pt: Add new packets for PEBS via PT MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add 3 new packets to supports PEBS via PT, namely Block Begin Packet (BBP), Block Item Packet (BIP) and Block End Packet (BEP). PEBS data is encoded into multiple BIP packets that come between BBP and BEP. The BEP packet might be associated with a FUP packet. That is indicated by using a separate packet type (INTEL_PT_BEP_IP) similar to other packets types with the _IP suffix. Refer to the Intel SDM for more information about PEBS via PT: https://software.intel.com/en-us/articles/intel-sdm May 2019 version: Vol. 3B 18.5.5.2 PEBS output to Intel® Processor Trace Decoding of BIP packets conflicts with single-byte TNT packets. Since BIP packets only occur in the context of a block (i.e. between BBP and BEP), that context must be recorded and passed to the packet decoder. Signed-off-by: Adrian Hunter Cc: Jiri Olsa Link: http://lkml.kernel.org/r/20190610072803.10456-2-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo --- .../util/intel-pt-decoder/intel-pt-decoder.c | 38 ++++- .../intel-pt-decoder/intel-pt-pkt-decoder.c | 140 +++++++++++++++++- .../intel-pt-decoder/intel-pt-pkt-decoder.h | 21 ++- tools/perf/util/intel-pt.c | 3 +- 4 files changed, 193 insertions(+), 9 deletions(-) diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c index f001f4ec4ddf..2f7791d4034f 100644 --- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c +++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c @@ -133,6 +133,7 @@ struct intel_pt_decoder { int mtc_shift; struct intel_pt_stack stack; enum intel_pt_pkt_state pkt_state; + enum intel_pt_pkt_ctx pkt_ctx; struct intel_pt_pkt packet; struct intel_pt_pkt tnt; int pkt_step; @@ -559,7 +560,7 @@ static int intel_pt_get_split_packet(struct intel_pt_decoder *decoder) memcpy(buf + len, decoder->buf, n); len += n; - ret = intel_pt_get_packet(buf, len, &decoder->packet); + ret = intel_pt_get_packet(buf, len, &decoder->packet, &decoder->pkt_ctx); if (ret < (int)old_len) { decoder->next_buf = decoder->buf; decoder->next_len = decoder->len; @@ -594,6 +595,7 @@ static int intel_pt_pkt_lookahead(struct intel_pt_decoder *decoder, { struct intel_pt_pkt_info pkt_info; const unsigned char *buf = decoder->buf; + enum intel_pt_pkt_ctx pkt_ctx = decoder->pkt_ctx; size_t len = decoder->len; int ret; @@ -612,7 +614,8 @@ static int intel_pt_pkt_lookahead(struct intel_pt_decoder *decoder, if (!len) return INTEL_PT_NEED_MORE_BYTES; - ret = intel_pt_get_packet(buf, len, &pkt_info.packet); + ret = intel_pt_get_packet(buf, len, &pkt_info.packet, + &pkt_ctx); if (!ret) return INTEL_PT_NEED_MORE_BYTES; if (ret < 0) @@ -687,6 +690,10 @@ static int intel_pt_calc_cyc_cb(struct intel_pt_pkt_info *pkt_info) case INTEL_PT_MNT: case INTEL_PT_PTWRITE: case INTEL_PT_PTWRITE_IP: + case INTEL_PT_BBP: + case INTEL_PT_BIP: + case INTEL_PT_BEP: + case INTEL_PT_BEP_IP: return 0; case INTEL_PT_MTC: @@ -879,7 +886,7 @@ static int intel_pt_get_next_packet(struct intel_pt_decoder *decoder) } ret = intel_pt_get_packet(decoder->buf, decoder->len, - &decoder->packet); + &decoder->packet, &decoder->pkt_ctx); if (ret == INTEL_PT_NEED_MORE_BYTES && BITS_PER_LONG == 32 && decoder->len < INTEL_PT_PKT_MAX_SZ && !decoder->next_buf) { ret = intel_pt_get_split_packet(decoder); @@ -1633,6 +1640,10 @@ static int intel_pt_walk_psbend(struct intel_pt_decoder *decoder) case INTEL_PT_MWAIT: case INTEL_PT_PWRE: case INTEL_PT_PWRX: + case INTEL_PT_BBP: + case INTEL_PT_BIP: + case INTEL_PT_BEP: + case INTEL_PT_BEP_IP: decoder->have_tma = false; intel_pt_log("ERROR: Unexpected packet\n"); err = -EAGAIN; @@ -1726,6 +1737,10 @@ static int intel_pt_walk_fup_tip(struct intel_pt_decoder *decoder) case INTEL_PT_MWAIT: case INTEL_PT_PWRE: case INTEL_PT_PWRX: + case INTEL_PT_BBP: + case INTEL_PT_BIP: + case INTEL_PT_BEP: + case INTEL_PT_BEP_IP: intel_pt_log("ERROR: Missing TIP after FUP\n"); decoder->pkt_state = INTEL_PT_STATE_ERR3; decoder->pkt_step = 0; @@ -2047,6 +2062,12 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder) decoder->state.pwrx_payload = decoder->packet.payload; return 0; + case INTEL_PT_BBP: + case INTEL_PT_BIP: + case INTEL_PT_BEP: + case INTEL_PT_BEP_IP: + break; + default: return intel_pt_bug(decoder); } @@ -2085,6 +2106,10 @@ static int intel_pt_walk_psb(struct intel_pt_decoder *decoder) case INTEL_PT_MWAIT: case INTEL_PT_PWRE: case INTEL_PT_PWRX: + case INTEL_PT_BBP: + case INTEL_PT_BIP: + case INTEL_PT_BEP: + case INTEL_PT_BEP_IP: intel_pt_log("ERROR: Unexpected packet\n"); err = -ENOENT; goto out; @@ -2291,6 +2316,10 @@ static int intel_pt_walk_to_ip(struct intel_pt_decoder *decoder) case INTEL_PT_MWAIT: case INTEL_PT_PWRE: case INTEL_PT_PWRX: + case INTEL_PT_BBP: + case INTEL_PT_BIP: + case INTEL_PT_BEP: + case INTEL_PT_BEP_IP: default: break; } @@ -2641,11 +2670,12 @@ static unsigned char *intel_pt_last_psb(unsigned char *buf, size_t len) static bool intel_pt_next_tsc(unsigned char *buf, size_t len, uint64_t *tsc, size_t *rem) { + enum intel_pt_pkt_ctx ctx = INTEL_PT_NO_CTX; struct intel_pt_pkt packet; int ret; while (len) { - ret = intel_pt_get_packet(buf, len, &packet); + ret = intel_pt_get_packet(buf, len, &packet, &ctx); if (ret <= 0) return false; if (packet.type == INTEL_PT_TSC) { diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c index 605fce537d80..0ccf10a0bf44 100644 --- a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c +++ b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.c @@ -62,6 +62,10 @@ static const char * const packet_name[] = { [INTEL_PT_MWAIT] = "MWAIT", [INTEL_PT_PWRE] = "PWRE", [INTEL_PT_PWRX] = "PWRX", + [INTEL_PT_BBP] = "BBP", + [INTEL_PT_BIP] = "BIP", + [INTEL_PT_BEP] = "BEP", + [INTEL_PT_BEP_IP] = "BEP", }; const char *intel_pt_pkt_name(enum intel_pt_pkt_type type) @@ -280,6 +284,55 @@ static int intel_pt_get_pwrx(const unsigned char *buf, size_t len, return 7; } +static int intel_pt_get_bbp(const unsigned char *buf, size_t len, + struct intel_pt_pkt *packet) +{ + if (len < 3) + return INTEL_PT_NEED_MORE_BYTES; + packet->type = INTEL_PT_BBP; + packet->count = buf[2] >> 7; + packet->payload = buf[2] & 0x1f; + return 3; +} + +static int intel_pt_get_bip_4(const unsigned char *buf, size_t len, + struct intel_pt_pkt *packet) +{ + if (len < 5) + return INTEL_PT_NEED_MORE_BYTES; + packet->type = INTEL_PT_BIP; + packet->count = buf[0] >> 3; + memcpy_le64(&packet->payload, buf + 1, 4); + return 5; +} + +static int intel_pt_get_bip_8(const unsigned char *buf, size_t len, + struct intel_pt_pkt *packet) +{ + if (len < 9) + return INTEL_PT_NEED_MORE_BYTES; + packet->type = INTEL_PT_BIP; + packet->count = buf[0] >> 3; + memcpy_le64(&packet->payload, buf + 1, 8); + return 9; +} + +static int intel_pt_get_bep(size_t len, struct intel_pt_pkt *packet) +{ + if (len < 2) + return INTEL_PT_NEED_MORE_BYTES; + packet->type = INTEL_PT_BEP; + return 2; +} + +static int intel_pt_get_bep_ip(size_t len, struct intel_pt_pkt *packet) +{ + if (len < 2) + return INTEL_PT_NEED_MORE_BYTES; + packet->type = INTEL_PT_BEP_IP; + return 2; +} + static int intel_pt_get_ext(const unsigned char *buf, size_t len, struct intel_pt_pkt *packet) { @@ -320,6 +373,12 @@ static int intel_pt_get_ext(const unsigned char *buf, size_t len, return intel_pt_get_pwre(buf, len, packet); case 0xA2: /* PWRX */ return intel_pt_get_pwrx(buf, len, packet); + case 0x63: /* BBP */ + return intel_pt_get_bbp(buf, len, packet); + case 0x33: /* BEP no IP */ + return intel_pt_get_bep(len, packet); + case 0xb3: /* BEP with IP */ + return intel_pt_get_bep_ip(len, packet); default: return INTEL_PT_BAD_PACKET; } @@ -468,7 +527,8 @@ static int intel_pt_get_mtc(const unsigned char *buf, size_t len, } static int intel_pt_do_get_packet(const unsigned char *buf, size_t len, - struct intel_pt_pkt *packet) + struct intel_pt_pkt *packet, + enum intel_pt_pkt_ctx ctx) { unsigned int byte; @@ -478,6 +538,22 @@ static int intel_pt_do_get_packet(const unsigned char *buf, size_t len, return INTEL_PT_NEED_MORE_BYTES; byte = buf[0]; + + switch (ctx) { + case INTEL_PT_NO_CTX: + break; + case INTEL_PT_BLK_4_CTX: + if ((byte & 0x7) == 4) + return intel_pt_get_bip_4(buf, len, packet); + break; + case INTEL_PT_BLK_8_CTX: + if ((byte & 0x7) == 4) + return intel_pt_get_bip_8(buf, len, packet); + break; + default: + break; + }; + if (!(byte & BIT(0))) { if (byte == 0) return intel_pt_get_pad(packet); @@ -516,15 +592,65 @@ static int intel_pt_do_get_packet(const unsigned char *buf, size_t len, } } +void intel_pt_upd_pkt_ctx(const struct intel_pt_pkt *packet, + enum intel_pt_pkt_ctx *ctx) +{ + switch (packet->type) { + case INTEL_PT_BAD: + case INTEL_PT_PAD: + case INTEL_PT_TSC: + case INTEL_PT_TMA: + case INTEL_PT_MTC: + case INTEL_PT_FUP: + case INTEL_PT_CYC: + case INTEL_PT_CBR: + case INTEL_PT_MNT: + case INTEL_PT_EXSTOP: + case INTEL_PT_EXSTOP_IP: + case INTEL_PT_PWRE: + case INTEL_PT_PWRX: + case INTEL_PT_BIP: + break; + case INTEL_PT_TNT: + case INTEL_PT_TIP: + case INTEL_PT_TIP_PGD: + case INTEL_PT_TIP_PGE: + case INTEL_PT_MODE_EXEC: + case INTEL_PT_MODE_TSX: + case INTEL_PT_PIP: + case INTEL_PT_OVF: + case INTEL_PT_VMCS: + case INTEL_PT_TRACESTOP: + case INTEL_PT_PSB: + case INTEL_PT_PSBEND: + case INTEL_PT_PTWRITE: + case INTEL_PT_PTWRITE_IP: + case INTEL_PT_MWAIT: + case INTEL_PT_BEP: + case INTEL_PT_BEP_IP: + *ctx = INTEL_PT_NO_CTX; + break; + case INTEL_PT_BBP: + if (packet->count) + *ctx = INTEL_PT_BLK_4_CTX; + else + *ctx = INTEL_PT_BLK_8_CTX; + break; + default: + break; + } +} + int intel_pt_get_packet(const unsigned char *buf, size_t len, - struct intel_pt_pkt *packet) + struct intel_pt_pkt *packet, enum intel_pt_pkt_ctx *ctx) { int ret; - ret = intel_pt_do_get_packet(buf, len, packet); + ret = intel_pt_do_get_packet(buf, len, packet, *ctx); if (ret > 0) { while (ret < 8 && len > (size_t)ret && !buf[ret]) ret += 1; + intel_pt_upd_pkt_ctx(packet, ctx); } return ret; } @@ -602,8 +728,10 @@ int intel_pt_pkt_desc(const struct intel_pt_pkt *packet, char *buf, return snprintf(buf, buf_len, "%s 0x%llx IP:0", name, payload); case INTEL_PT_PTWRITE_IP: return snprintf(buf, buf_len, "%s 0x%llx IP:1", name, payload); + case INTEL_PT_BEP: case INTEL_PT_EXSTOP: return snprintf(buf, buf_len, "%s IP:0", name); + case INTEL_PT_BEP_IP: case INTEL_PT_EXSTOP_IP: return snprintf(buf, buf_len, "%s IP:1", name); case INTEL_PT_MWAIT: @@ -621,6 +749,12 @@ int intel_pt_pkt_desc(const struct intel_pt_pkt *packet, char *buf, (unsigned int)((payload >> 4) & 0xf), (unsigned int)(payload & 0xf), (unsigned int)((payload >> 8) & 0xf)); + case INTEL_PT_BBP: + return snprintf(buf, buf_len, "%s SZ %s-byte Type 0x%llx", + name, packet->count ? "4" : "8", payload); + case INTEL_PT_BIP: + return snprintf(buf, buf_len, "%s ID 0x%02x Value 0x%llx", + name, packet->count, payload); default: break; } diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.h index a7aefaa08588..17ca9b56d72f 100644 --- a/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.h +++ b/tools/perf/util/intel-pt-decoder/intel-pt-pkt-decoder.h @@ -50,6 +50,10 @@ enum intel_pt_pkt_type { INTEL_PT_MWAIT, INTEL_PT_PWRE, INTEL_PT_PWRX, + INTEL_PT_BBP, + INTEL_PT_BIP, + INTEL_PT_BEP, + INTEL_PT_BEP_IP, }; struct intel_pt_pkt { @@ -58,10 +62,25 @@ struct intel_pt_pkt { uint64_t payload; }; +/* + * Decoding of BIP packets conflicts with single-byte TNT packets. Since BIP + * packets only occur in the context of a block (i.e. between BBP and BEP), that + * context must be recorded and passed to the packet decoder. + */ +enum intel_pt_pkt_ctx { + INTEL_PT_NO_CTX, /* BIP packets are invalid */ + INTEL_PT_BLK_4_CTX, /* 4-byte BIP packets */ + INTEL_PT_BLK_8_CTX, /* 8-byte BIP packets */ +}; + const char *intel_pt_pkt_name(enum intel_pt_pkt_type); int intel_pt_get_packet(const unsigned char *buf, size_t len, - struct intel_pt_pkt *packet); + struct intel_pt_pkt *packet, + enum intel_pt_pkt_ctx *ctx); + +void intel_pt_upd_pkt_ctx(const struct intel_pt_pkt *packet, + enum intel_pt_pkt_ctx *ctx); int intel_pt_pkt_desc(const struct intel_pt_pkt *packet, char *buf, size_t len); diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c index 8ed51f4e9e30..893cef494a43 100644 --- a/tools/perf/util/intel-pt.c +++ b/tools/perf/util/intel-pt.c @@ -177,13 +177,14 @@ static void intel_pt_dump(struct intel_pt *pt __maybe_unused, int ret, pkt_len, i; char desc[INTEL_PT_PKT_DESC_MAX]; const char *color = PERF_COLOR_BLUE; + enum intel_pt_pkt_ctx ctx = INTEL_PT_NO_CTX; color_fprintf(stdout, color, ". ... Intel Processor Trace data: size %zu bytes\n", len); while (len) { - ret = intel_pt_get_packet(buf, len, &packet); + ret = intel_pt_get_packet(buf, len, &packet, &ctx); if (ret > 0) pkt_len = ret; else From a0db77bf880b8badd2f9ce4da708c69b0b865853 Mon Sep 17 00:00:00 2001 From: Adrian Hunter Date: Mon, 10 Jun 2019 10:27:54 +0300 Subject: [PATCH 04/25] perf intel-pt: Add Intel PT packet decoder test Add Intel PT packet decoder test. This test feeds byte sequences to the Intel PT packet decoder and checks the results. Changes to the packet context are also checked. Committer testing: # perf test "Intel PT" 65: Intel PT packet decoder : Ok # perf test -v "Intel PT" 65: Intel PT packet decoder : --- start --- test child forked, pid 6360 Decoded ok: 00 PAD Decoded ok: 04 TNT N (1) Decoded ok: 06 TNT T (1) Decoded ok: 80 TNT NNNNNN (6) Decoded ok: fe TNT TTTTTT (6) Decoded ok: 02 a3 02 00 00 00 00 00 TNT N (1) Decoded ok: 02 a3 03 00 00 00 00 00 TNT T (1) Decoded ok: 02 a3 00 00 00 00 00 80 TNT NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN (47) Decoded ok: 02 a3 ff ff ff ff ff ff TNT TTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT (47) Decoded ok: 0d TIP no ip Decoded ok: 2d 01 02 TIP 0x201 Decoded ok: 4d 01 02 03 04 TIP 0x4030201 Decoded ok: 6d 01 02 03 04 05 06 TIP 0x60504030201 Decoded ok: 8d 01 02 03 04 05 06 TIP 0x60504030201 Decoded ok: cd 01 02 03 04 05 06 07 08 TIP 0x807060504030201 Decoded ok: 11 TIP.PGE no ip Decoded ok: 31 01 02 TIP.PGE 0x201 Decoded ok: 51 01 02 03 04 TIP.PGE 0x4030201 Decoded ok: 71 01 02 03 04 05 06 TIP.PGE 0x60504030201 Decoded ok: 91 01 02 03 04 05 06 TIP.PGE 0x60504030201 Decoded ok: d1 01 02 03 04 05 06 07 08 TIP.PGE 0x807060504030201 Decoded ok: 01 TIP.PGD no ip Decoded ok: 21 01 02 TIP.PGD 0x201 Decoded ok: 41 01 02 03 04 TIP.PGD 0x4030201 Decoded ok: 61 01 02 03 04 05 06 TIP.PGD 0x60504030201 Decoded ok: 81 01 02 03 04 05 06 TIP.PGD 0x60504030201 Decoded ok: c1 01 02 03 04 05 06 07 08 TIP.PGD 0x807060504030201 Decoded ok: 1d FUP no ip Decoded ok: 3d 01 02 FUP 0x201 Decoded ok: 5d 01 02 03 04 FUP 0x4030201 Decoded ok: 7d 01 02 03 04 05 06 FUP 0x60504030201 Decoded ok: 9d 01 02 03 04 05 06 FUP 0x60504030201 Decoded ok: dd 01 02 03 04 05 06 07 08 FUP 0x807060504030201 Decoded ok: 02 43 02 04 06 08 0a 0c PIP 0x60504030201 (NR=0) Decoded ok: 02 43 03 04 06 08 0a 0c PIP 0x60504030201 (NR=1) Decoded ok: 99 00 MODE.Exec 16 Decoded ok: 99 01 MODE.Exec 64 Decoded ok: 99 02 MODE.Exec 32 Decoded ok: 99 20 MODE.TSX TXAbort:0 InTX:0 Decoded ok: 99 21 MODE.TSX TXAbort:0 InTX:1 Decoded ok: 99 22 MODE.TSX TXAbort:1 InTX:0 Decoded ok: 02 83 TraceSTOP Decoded ok: 02 03 12 00 CBR 0x12 Decoded ok: 19 01 02 03 04 05 06 07 TSC 0x7060504030201 Decoded ok: 59 12 MTC 0x12 Decoded ok: 02 73 00 00 00 00 00 TMA CTC 0x0 FC 0x0 Decoded ok: 02 73 01 02 00 00 00 TMA CTC 0x201 FC 0x0 Decoded ok: 02 73 00 00 00 ff 01 TMA CTC 0x0 FC 0x1ff Decoded ok: 02 73 80 c0 00 ff 01 TMA CTC 0xc080 FC 0x1ff Decoded ok: 03 CYC 0x0 Decoded ok: 0b CYC 0x1 Decoded ok: fb CYC 0x1f Decoded ok: 07 02 CYC 0x20 Decoded ok: ff fe CYC 0xfff Decoded ok: 07 01 02 CYC 0x1000 Decoded ok: ff ff fe CYC 0x7ffff Decoded ok: 07 01 01 02 CYC 0x80000 Decoded ok: ff ff ff fe CYC 0x3ffffff Decoded ok: 07 01 01 01 02 CYC 0x4000000 Decoded ok: ff ff ff ff fe CYC 0x1ffffffff Decoded ok: 07 01 01 01 01 02 CYC 0x200000000 Decoded ok: ff ff ff ff ff fe CYC 0xffffffffff Decoded ok: 07 01 01 01 01 01 02 CYC 0x10000000000 Decoded ok: ff ff ff ff ff ff fe CYC 0x7fffffffffff Decoded ok: 07 01 01 01 01 01 01 02 CYC 0x800000000000 Decoded ok: ff ff ff ff ff ff ff fe CYC 0x3fffffffffffff Decoded ok: 07 01 01 01 01 01 01 01 02 CYC 0x40000000000000 Decoded ok: ff ff ff ff ff ff ff ff fe CYC 0x1fffffffffffffff Decoded ok: 07 01 01 01 01 01 01 01 01 02 CYC 0x2000000000000000 Decoded ok: ff ff ff ff ff ff ff ff ff 0e CYC 0xffffffffffffffff Decoded ok: 02 c8 01 02 03 04 05 VMCS 0x504030201 Decoded ok: 02 f3 OVF Decoded ok: 02 f3 OVF Decoded ok: 02 f3 OVF Decoded ok: 02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB Decoded ok: 02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB Decoded ok: 02 82 02 82 02 82 02 82 02 82 02 82 02 82 02 82 PSB Decoded ok: 02 23 PSBEND Decoded ok: 02 c3 88 01 02 03 04 05 06 07 00 MNT 0x7060504030201 Decoded ok: 02 12 01 02 03 04 PTWRITE 0x4030201 IP:0 Decoded ok: 02 32 01 02 03 04 05 06 07 08 PTWRITE 0x807060504030201 IP:0 Decoded ok: 02 92 01 02 03 04 PTWRITE 0x4030201 IP:1 Decoded ok: 02 b2 01 02 03 04 05 06 07 08 PTWRITE 0x807060504030201 IP:1 Decoded ok: 02 62 EXSTOP IP:0 Decoded ok: 02 e2 EXSTOP IP:1 Decoded ok: 02 c2 00 00 00 00 00 00 00 00 MWAIT 0x0 Hints 0x0 Extensions 0x0 Decoded ok: 02 c2 01 02 03 04 05 06 07 08 MWAIT 0x807060504030201 Hints 0x1 Extensions 0x1 Decoded ok: 02 c2 ff 02 03 04 07 06 07 08 MWAIT 0x8070607040302ff Hints 0xff Extensions 0x3 Decoded ok: 02 22 00 00 PWRE 0x0 HW:0 CState:0 Sub-CState:0 Decoded ok: 02 22 01 02 PWRE 0x201 HW:0 CState:0 Sub-CState:2 Decoded ok: 02 22 80 34 PWRE 0x3480 HW:1 CState:3 Sub-CState:4 Decoded ok: 02 22 00 56 PWRE 0x5600 HW:0 CState:5 Sub-CState:6 Decoded ok: 02 a2 00 00 00 00 00 PWRX 0x0 Last CState:0 Deepest CState:0 Wake Reason 0x0 Decoded ok: 02 a2 01 02 03 04 05 PWRX 0x504030201 Last CState:0 Deepest CState:1 Wake Reason 0x2 Decoded ok: 02 a2 ff ff ff ff ff PWRX 0xffffffffff Last CState:15 Deepest CState:15 Wake Reason 0xf Decoded ok: 02 63 00 BBP SZ 8-byte Type 0x0 Decoded ok: 02 63 80 BBP SZ 4-byte Type 0x0 Decoded ok: 02 63 1f BBP SZ 8-byte Type 0x1f Decoded ok: 02 63 9f BBP SZ 4-byte Type 0x1f Decoded ok: 04 00 00 00 00 BIP ID 0x00 Value 0x0 Decoded ok: fc 00 00 00 00 BIP ID 0x1f Value 0x0 Decoded ok: 04 01 02 03 04 BIP ID 0x00 Value 0x4030201 Decoded ok: fc 01 02 03 04 BIP ID 0x1f Value 0x4030201 Decoded ok: 04 00 00 00 00 00 00 00 00 BIP ID 0x00 Value 0x0 Decoded ok: fc 00 00 00 00 00 00 00 00 BIP ID 0x1f Value 0x0 Decoded ok: 04 01 02 03 04 05 06 07 08 BIP ID 0x00 Value 0x807060504030201 Decoded ok: fc 01 02 03 04 05 06 07 08 BIP ID 0x1f Value 0x807060504030201 Decoded ok: 02 33 BEP IP:0 Decoded ok: 02 b3 BEP IP:1 Decoded ok: 02 33 BEP IP:0 Decoded ok: 02 b3 BEP IP:1 test child finished with 0 ---- end ---- Intel PT packet decoder: Ok # Signed-off-by: Adrian Hunter Tested-by: Arnaldo Carvalho de Melo Cc: Jiri Olsa Link: http://lkml.kernel.org/r/20190610072803.10456-3-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/arch/x86/include/arch-tests.h | 1 + tools/perf/arch/x86/tests/Build | 2 +- tools/perf/arch/x86/tests/arch-tests.c | 4 + .../x86/tests/intel-pt-pkt-decoder-test.c | 304 ++++++++++++++++++ 4 files changed, 310 insertions(+), 1 deletion(-) create mode 100644 tools/perf/arch/x86/tests/intel-pt-pkt-decoder-test.c diff --git a/tools/perf/arch/x86/include/arch-tests.h b/tools/perf/arch/x86/include/arch-tests.h index 613709cfbbd0..c41c5affe4be 100644 --- a/tools/perf/arch/x86/include/arch-tests.h +++ b/tools/perf/arch/x86/include/arch-tests.h @@ -9,6 +9,7 @@ struct test; int test__rdpmc(struct test *test __maybe_unused, int subtest); int test__perf_time_to_tsc(struct test *test __maybe_unused, int subtest); int test__insn_x86(struct test *test __maybe_unused, int subtest); +int test__intel_pt_pkt_decoder(struct test *test, int subtest); int test__bp_modify(struct test *test, int subtest); #ifdef HAVE_DWARF_UNWIND_SUPPORT diff --git a/tools/perf/arch/x86/tests/Build b/tools/perf/arch/x86/tests/Build index 3d83d0c6982d..2997c506550c 100644 --- a/tools/perf/arch/x86/tests/Build +++ b/tools/perf/arch/x86/tests/Build @@ -4,5 +4,5 @@ perf-$(CONFIG_DWARF_UNWIND) += dwarf-unwind.o perf-y += arch-tests.o perf-y += rdpmc.o perf-y += perf-time-to-tsc.o -perf-$(CONFIG_AUXTRACE) += insn-x86.o +perf-$(CONFIG_AUXTRACE) += insn-x86.o intel-pt-pkt-decoder-test.o perf-$(CONFIG_X86_64) += bp-modify.o diff --git a/tools/perf/arch/x86/tests/arch-tests.c b/tools/perf/arch/x86/tests/arch-tests.c index d47d3f8e3c8e..6763135aec17 100644 --- a/tools/perf/arch/x86/tests/arch-tests.c +++ b/tools/perf/arch/x86/tests/arch-tests.c @@ -23,6 +23,10 @@ struct test arch_tests[] = { .desc = "x86 instruction decoder - new instructions", .func = test__insn_x86, }, + { + .desc = "Intel PT packet decoder", + .func = test__intel_pt_pkt_decoder, + }, #endif #if defined(__x86_64__) { diff --git a/tools/perf/arch/x86/tests/intel-pt-pkt-decoder-test.c b/tools/perf/arch/x86/tests/intel-pt-pkt-decoder-test.c new file mode 100644 index 000000000000..901bf1f449c4 --- /dev/null +++ b/tools/perf/arch/x86/tests/intel-pt-pkt-decoder-test.c @@ -0,0 +1,304 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include + +#include "intel-pt-decoder/intel-pt-pkt-decoder.h" + +#include "debug.h" +#include "tests/tests.h" +#include "arch-tests.h" + +/** + * struct test_data - Test data. + * @len: number of bytes to decode + * @bytes: bytes to decode + * @ctx: packet context to decode + * @packet: expected packet + * @new_ctx: expected new packet context + * @ctx_unchanged: the packet context must not change + */ +struct test_data { + int len; + u8 bytes[INTEL_PT_PKT_MAX_SZ]; + enum intel_pt_pkt_ctx ctx; + struct intel_pt_pkt packet; + enum intel_pt_pkt_ctx new_ctx; + int ctx_unchanged; +} data[] = { + /* Padding Packet */ + {1, {0}, 0, {INTEL_PT_PAD, 0, 0}, 0, 1 }, + /* Short Taken/Not Taken Packet */ + {1, {4}, 0, {INTEL_PT_TNT, 1, 0}, 0, 0 }, + {1, {6}, 0, {INTEL_PT_TNT, 1, 0x20ULL << 58}, 0, 0 }, + {1, {0x80}, 0, {INTEL_PT_TNT, 6, 0}, 0, 0 }, + {1, {0xfe}, 0, {INTEL_PT_TNT, 6, 0x3fULL << 58}, 0, 0 }, + /* Long Taken/Not Taken Packet */ + {8, {0x02, 0xa3, 2}, 0, {INTEL_PT_TNT, 1, 0xa302ULL << 47}, 0, 0 }, + {8, {0x02, 0xa3, 3}, 0, {INTEL_PT_TNT, 1, 0x1a302ULL << 47}, 0, 0 }, + {8, {0x02, 0xa3, 0, 0, 0, 0, 0, 0x80}, 0, {INTEL_PT_TNT, 47, 0xa302ULL << 1}, 0, 0 }, + {8, {0x02, 0xa3, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff}, 0, {INTEL_PT_TNT, 47, 0xffffffffffffa302ULL << 1}, 0, 0 }, + /* Target IP Packet */ + {1, {0x0d}, 0, {INTEL_PT_TIP, 0, 0}, 0, 0 }, + {3, {0x2d, 1, 2}, 0, {INTEL_PT_TIP, 1, 0x201}, 0, 0 }, + {5, {0x4d, 1, 2, 3, 4}, 0, {INTEL_PT_TIP, 2, 0x4030201}, 0, 0 }, + {7, {0x6d, 1, 2, 3, 4, 5, 6}, 0, {INTEL_PT_TIP, 3, 0x60504030201}, 0, 0 }, + {7, {0x8d, 1, 2, 3, 4, 5, 6}, 0, {INTEL_PT_TIP, 4, 0x60504030201}, 0, 0 }, + {9, {0xcd, 1, 2, 3, 4, 5, 6, 7, 8}, 0, {INTEL_PT_TIP, 6, 0x807060504030201}, 0, 0 }, + /* Packet Generation Enable */ + {1, {0x11}, 0, {INTEL_PT_TIP_PGE, 0, 0}, 0, 0 }, + {3, {0x31, 1, 2}, 0, {INTEL_PT_TIP_PGE, 1, 0x201}, 0, 0 }, + {5, {0x51, 1, 2, 3, 4}, 0, {INTEL_PT_TIP_PGE, 2, 0x4030201}, 0, 0 }, + {7, {0x71, 1, 2, 3, 4, 5, 6}, 0, {INTEL_PT_TIP_PGE, 3, 0x60504030201}, 0, 0 }, + {7, {0x91, 1, 2, 3, 4, 5, 6}, 0, {INTEL_PT_TIP_PGE, 4, 0x60504030201}, 0, 0 }, + {9, {0xd1, 1, 2, 3, 4, 5, 6, 7, 8}, 0, {INTEL_PT_TIP_PGE, 6, 0x807060504030201}, 0, 0 }, + /* Packet Generation Disable */ + {1, {0x01}, 0, {INTEL_PT_TIP_PGD, 0, 0}, 0, 0 }, + {3, {0x21, 1, 2}, 0, {INTEL_PT_TIP_PGD, 1, 0x201}, 0, 0 }, + {5, {0x41, 1, 2, 3, 4}, 0, {INTEL_PT_TIP_PGD, 2, 0x4030201}, 0, 0 }, + {7, {0x61, 1, 2, 3, 4, 5, 6}, 0, {INTEL_PT_TIP_PGD, 3, 0x60504030201}, 0, 0 }, + {7, {0x81, 1, 2, 3, 4, 5, 6}, 0, {INTEL_PT_TIP_PGD, 4, 0x60504030201}, 0, 0 }, + {9, {0xc1, 1, 2, 3, 4, 5, 6, 7, 8}, 0, {INTEL_PT_TIP_PGD, 6, 0x807060504030201}, 0, 0 }, + /* Flow Update Packet */ + {1, {0x1d}, 0, {INTEL_PT_FUP, 0, 0}, 0, 0 }, + {3, {0x3d, 1, 2}, 0, {INTEL_PT_FUP, 1, 0x201}, 0, 0 }, + {5, {0x5d, 1, 2, 3, 4}, 0, {INTEL_PT_FUP, 2, 0x4030201}, 0, 0 }, + {7, {0x7d, 1, 2, 3, 4, 5, 6}, 0, {INTEL_PT_FUP, 3, 0x60504030201}, 0, 0 }, + {7, {0x9d, 1, 2, 3, 4, 5, 6}, 0, {INTEL_PT_FUP, 4, 0x60504030201}, 0, 0 }, + {9, {0xdd, 1, 2, 3, 4, 5, 6, 7, 8}, 0, {INTEL_PT_FUP, 6, 0x807060504030201}, 0, 0 }, + /* Paging Information Packet */ + {8, {0x02, 0x43, 2, 4, 6, 8, 10, 12}, 0, {INTEL_PT_PIP, 0, 0x60504030201}, 0, 0 }, + {8, {0x02, 0x43, 3, 4, 6, 8, 10, 12}, 0, {INTEL_PT_PIP, 0, 0x60504030201 | (1ULL << 63)}, 0, 0 }, + /* Mode Exec Packet */ + {2, {0x99, 0x00}, 0, {INTEL_PT_MODE_EXEC, 0, 16}, 0, 0 }, + {2, {0x99, 0x01}, 0, {INTEL_PT_MODE_EXEC, 0, 64}, 0, 0 }, + {2, {0x99, 0x02}, 0, {INTEL_PT_MODE_EXEC, 0, 32}, 0, 0 }, + /* Mode TSX Packet */ + {2, {0x99, 0x20}, 0, {INTEL_PT_MODE_TSX, 0, 0}, 0, 0 }, + {2, {0x99, 0x21}, 0, {INTEL_PT_MODE_TSX, 0, 1}, 0, 0 }, + {2, {0x99, 0x22}, 0, {INTEL_PT_MODE_TSX, 0, 2}, 0, 0 }, + /* Trace Stop Packet */ + {2, {0x02, 0x83}, 0, {INTEL_PT_TRACESTOP, 0, 0}, 0, 0 }, + /* Core:Bus Ratio Packet */ + {4, {0x02, 0x03, 0x12, 0}, 0, {INTEL_PT_CBR, 0, 0x12}, 0, 1 }, + /* Timestamp Counter Packet */ + {8, {0x19, 1, 2, 3, 4, 5, 6, 7}, 0, {INTEL_PT_TSC, 0, 0x7060504030201}, 0, 1 }, + /* Mini Time Counter Packet */ + {2, {0x59, 0x12}, 0, {INTEL_PT_MTC, 0, 0x12}, 0, 1 }, + /* TSC / MTC Alignment Packet */ + {7, {0x02, 0x73}, 0, {INTEL_PT_TMA, 0, 0}, 0, 1 }, + {7, {0x02, 0x73, 1, 2}, 0, {INTEL_PT_TMA, 0, 0x201}, 0, 1 }, + {7, {0x02, 0x73, 0, 0, 0, 0xff, 1}, 0, {INTEL_PT_TMA, 0x1ff, 0}, 0, 1 }, + {7, {0x02, 0x73, 0x80, 0xc0, 0, 0xff, 1}, 0, {INTEL_PT_TMA, 0x1ff, 0xc080}, 0, 1 }, + /* Cycle Count Packet */ + {1, {0x03}, 0, {INTEL_PT_CYC, 0, 0}, 0, 1 }, + {1, {0x0b}, 0, {INTEL_PT_CYC, 0, 1}, 0, 1 }, + {1, {0xfb}, 0, {INTEL_PT_CYC, 0, 0x1f}, 0, 1 }, + {2, {0x07, 2}, 0, {INTEL_PT_CYC, 0, 0x20}, 0, 1 }, + {2, {0xff, 0xfe}, 0, {INTEL_PT_CYC, 0, 0xfff}, 0, 1 }, + {3, {0x07, 1, 2}, 0, {INTEL_PT_CYC, 0, 0x1000}, 0, 1 }, + {3, {0xff, 0xff, 0xfe}, 0, {INTEL_PT_CYC, 0, 0x7ffff}, 0, 1 }, + {4, {0x07, 1, 1, 2}, 0, {INTEL_PT_CYC, 0, 0x80000}, 0, 1 }, + {4, {0xff, 0xff, 0xff, 0xfe}, 0, {INTEL_PT_CYC, 0, 0x3ffffff}, 0, 1 }, + {5, {0x07, 1, 1, 1, 2}, 0, {INTEL_PT_CYC, 0, 0x4000000}, 0, 1 }, + {5, {0xff, 0xff, 0xff, 0xff, 0xfe}, 0, {INTEL_PT_CYC, 0, 0x1ffffffff}, 0, 1 }, + {6, {0x07, 1, 1, 1, 1, 2}, 0, {INTEL_PT_CYC, 0, 0x200000000}, 0, 1 }, + {6, {0xff, 0xff, 0xff, 0xff, 0xff, 0xfe}, 0, {INTEL_PT_CYC, 0, 0xffffffffff}, 0, 1 }, + {7, {0x07, 1, 1, 1, 1, 1, 2}, 0, {INTEL_PT_CYC, 0, 0x10000000000}, 0, 1 }, + {7, {0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfe}, 0, {INTEL_PT_CYC, 0, 0x7fffffffffff}, 0, 1 }, + {8, {0x07, 1, 1, 1, 1, 1, 1, 2}, 0, {INTEL_PT_CYC, 0, 0x800000000000}, 0, 1 }, + {8, {0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfe}, 0, {INTEL_PT_CYC, 0, 0x3fffffffffffff}, 0, 1 }, + {9, {0x07, 1, 1, 1, 1, 1, 1, 1, 2}, 0, {INTEL_PT_CYC, 0, 0x40000000000000}, 0, 1 }, + {9, {0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xfe}, 0, {INTEL_PT_CYC, 0, 0x1fffffffffffffff}, 0, 1 }, + {10, {0x07, 1, 1, 1, 1, 1, 1, 1, 1, 2}, 0, {INTEL_PT_CYC, 0, 0x2000000000000000}, 0, 1 }, + {10, {0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xe}, 0, {INTEL_PT_CYC, 0, 0xffffffffffffffff}, 0, 1 }, + /* Virtual-Machine Control Structure Packet */ + {7, {0x02, 0xc8, 1, 2, 3, 4, 5}, 0, {INTEL_PT_VMCS, 5, 0x504030201}, 0, 0 }, + /* Overflow Packet */ + {2, {0x02, 0xf3}, 0, {INTEL_PT_OVF, 0, 0}, 0, 0 }, + {2, {0x02, 0xf3}, INTEL_PT_BLK_4_CTX, {INTEL_PT_OVF, 0, 0}, 0, 0 }, + {2, {0x02, 0xf3}, INTEL_PT_BLK_8_CTX, {INTEL_PT_OVF, 0, 0}, 0, 0 }, + /* Packet Stream Boundary*/ + {16, {0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82}, 0, {INTEL_PT_PSB, 0, 0}, 0, 0 }, + {16, {0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82}, INTEL_PT_BLK_4_CTX, {INTEL_PT_PSB, 0, 0}, 0, 0 }, + {16, {0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82, 0x02, 0x82}, INTEL_PT_BLK_8_CTX, {INTEL_PT_PSB, 0, 0}, 0, 0 }, + /* PSB End Packet */ + {2, {0x02, 0x23}, 0, {INTEL_PT_PSBEND, 0, 0}, 0, 0 }, + /* Maintenance Packet */ + {11, {0x02, 0xc3, 0x88, 1, 2, 3, 4, 5, 6, 7}, 0, {INTEL_PT_MNT, 0, 0x7060504030201}, 0, 1 }, + /* Write Data to PT Packet */ + {6, {0x02, 0x12, 1, 2, 3, 4}, 0, {INTEL_PT_PTWRITE, 0, 0x4030201}, 0, 0 }, + {10, {0x02, 0x32, 1, 2, 3, 4, 5, 6, 7, 8}, 0, {INTEL_PT_PTWRITE, 1, 0x807060504030201}, 0, 0 }, + {6, {0x02, 0x92, 1, 2, 3, 4}, 0, {INTEL_PT_PTWRITE_IP, 0, 0x4030201}, 0, 0 }, + {10, {0x02, 0xb2, 1, 2, 3, 4, 5, 6, 7, 8}, 0, {INTEL_PT_PTWRITE_IP, 1, 0x807060504030201}, 0, 0 }, + /* Execution Stop Packet */ + {2, {0x02, 0x62}, 0, {INTEL_PT_EXSTOP, 0, 0}, 0, 1 }, + {2, {0x02, 0xe2}, 0, {INTEL_PT_EXSTOP_IP, 0, 0}, 0, 1 }, + /* Monitor Wait Packet */ + {10, {0x02, 0xc2}, 0, {INTEL_PT_MWAIT, 0, 0}, 0, 0 }, + {10, {0x02, 0xc2, 1, 2, 3, 4, 5, 6, 7, 8}, 0, {INTEL_PT_MWAIT, 0, 0x807060504030201}, 0, 0 }, + {10, {0x02, 0xc2, 0xff, 2, 3, 4, 7, 6, 7, 8}, 0, {INTEL_PT_MWAIT, 0, 0x8070607040302ff}, 0, 0 }, + /* Power Entry Packet */ + {4, {0x02, 0x22}, 0, {INTEL_PT_PWRE, 0, 0}, 0, 1 }, + {4, {0x02, 0x22, 1, 2}, 0, {INTEL_PT_PWRE, 0, 0x0201}, 0, 1 }, + {4, {0x02, 0x22, 0x80, 0x34}, 0, {INTEL_PT_PWRE, 0, 0x3480}, 0, 1 }, + {4, {0x02, 0x22, 0x00, 0x56}, 0, {INTEL_PT_PWRE, 0, 0x5600}, 0, 1 }, + /* Power Exit Packet */ + {7, {0x02, 0xa2}, 0, {INTEL_PT_PWRX, 0, 0}, 0, 1 }, + {7, {0x02, 0xa2, 1, 2, 3, 4, 5}, 0, {INTEL_PT_PWRX, 0, 0x504030201}, 0, 1 }, + {7, {0x02, 0xa2, 0xff, 0xff, 0xff, 0xff, 0xff}, 0, {INTEL_PT_PWRX, 0, 0xffffffffff}, 0, 1 }, + /* Block Begin Packet */ + {3, {0x02, 0x63, 0x00}, 0, {INTEL_PT_BBP, 0, 0}, INTEL_PT_BLK_8_CTX, 0 }, + {3, {0x02, 0x63, 0x80}, 0, {INTEL_PT_BBP, 1, 0}, INTEL_PT_BLK_4_CTX, 0 }, + {3, {0x02, 0x63, 0x1f}, 0, {INTEL_PT_BBP, 0, 0x1f}, INTEL_PT_BLK_8_CTX, 0 }, + {3, {0x02, 0x63, 0x9f}, 0, {INTEL_PT_BBP, 1, 0x1f}, INTEL_PT_BLK_4_CTX, 0 }, + /* 4-byte Block Item Packet */ + {5, {0x04}, INTEL_PT_BLK_4_CTX, {INTEL_PT_BIP, 0, 0}, INTEL_PT_BLK_4_CTX, 0 }, + {5, {0xfc}, INTEL_PT_BLK_4_CTX, {INTEL_PT_BIP, 0x1f, 0}, INTEL_PT_BLK_4_CTX, 0 }, + {5, {0x04, 1, 2, 3, 4}, INTEL_PT_BLK_4_CTX, {INTEL_PT_BIP, 0, 0x04030201}, INTEL_PT_BLK_4_CTX, 0 }, + {5, {0xfc, 1, 2, 3, 4}, INTEL_PT_BLK_4_CTX, {INTEL_PT_BIP, 0x1f, 0x04030201}, INTEL_PT_BLK_4_CTX, 0 }, + /* 8-byte Block Item Packet */ + {9, {0x04}, INTEL_PT_BLK_8_CTX, {INTEL_PT_BIP, 0, 0}, INTEL_PT_BLK_8_CTX, 0 }, + {9, {0xfc}, INTEL_PT_BLK_8_CTX, {INTEL_PT_BIP, 0x1f, 0}, INTEL_PT_BLK_8_CTX, 0 }, + {9, {0x04, 1, 2, 3, 4, 5, 6, 7, 8}, INTEL_PT_BLK_8_CTX, {INTEL_PT_BIP, 0, 0x0807060504030201}, INTEL_PT_BLK_8_CTX, 0 }, + {9, {0xfc, 1, 2, 3, 4, 5, 6, 7, 8}, INTEL_PT_BLK_8_CTX, {INTEL_PT_BIP, 0x1f, 0x0807060504030201}, INTEL_PT_BLK_8_CTX, 0 }, + /* Block End Packet */ + {2, {0x02, 0x33}, INTEL_PT_BLK_4_CTX, {INTEL_PT_BEP, 0, 0}, 0, 0 }, + {2, {0x02, 0xb3}, INTEL_PT_BLK_4_CTX, {INTEL_PT_BEP_IP, 0, 0}, 0, 0 }, + {2, {0x02, 0x33}, INTEL_PT_BLK_8_CTX, {INTEL_PT_BEP, 0, 0}, 0, 0 }, + {2, {0x02, 0xb3}, INTEL_PT_BLK_8_CTX, {INTEL_PT_BEP_IP, 0, 0}, 0, 0 }, + /* Terminator */ + {0, {0}, 0, {0, 0, 0}, 0, 0 }, +}; + +static int dump_packet(struct intel_pt_pkt *packet, u8 *bytes, int len) +{ + char desc[INTEL_PT_PKT_DESC_MAX]; + int ret, i; + + for (i = 0; i < len; i++) + pr_debug(" %02x", bytes[i]); + for (; i < INTEL_PT_PKT_MAX_SZ; i++) + pr_debug(" "); + pr_debug(" "); + ret = intel_pt_pkt_desc(packet, desc, INTEL_PT_PKT_DESC_MAX); + if (ret < 0) { + pr_debug("intel_pt_pkt_desc failed!\n"); + return TEST_FAIL; + } + pr_debug("%s\n", desc); + + return TEST_OK; +} + +static void decoding_failed(struct test_data *d) +{ + pr_debug("Decoding failed!\n"); + pr_debug("Decoding: "); + dump_packet(&d->packet, d->bytes, d->len); +} + +static int fail(struct test_data *d, struct intel_pt_pkt *packet, int len, + enum intel_pt_pkt_ctx new_ctx) +{ + decoding_failed(d); + + if (len != d->len) + pr_debug("Expected length: %d Decoded length %d\n", + d->len, len); + + if (packet->type != d->packet.type) + pr_debug("Expected type: %d Decoded type %d\n", + d->packet.type, packet->type); + + if (packet->count != d->packet.count) + pr_debug("Expected count: %d Decoded count %d\n", + d->packet.count, packet->count); + + if (packet->payload != d->packet.payload) + pr_debug("Expected payload: 0x%llx Decoded payload 0x%llx\n", + (unsigned long long)d->packet.payload, + (unsigned long long)packet->payload); + + if (new_ctx != d->new_ctx) + pr_debug("Expected packet context: %d Decoded packet context %d\n", + d->new_ctx, new_ctx); + + return TEST_FAIL; +} + +static int test_ctx_unchanged(struct test_data *d, struct intel_pt_pkt *packet, + enum intel_pt_pkt_ctx ctx) +{ + enum intel_pt_pkt_ctx old_ctx = ctx; + + intel_pt_upd_pkt_ctx(packet, &ctx); + + if (ctx != old_ctx) { + decoding_failed(d); + pr_debug("Packet context changed!\n"); + return TEST_FAIL; + } + + return TEST_OK; +} + +static int test_one(struct test_data *d) +{ + struct intel_pt_pkt packet; + enum intel_pt_pkt_ctx ctx = d->ctx; + int ret; + + memset(&packet, 0xff, sizeof(packet)); + + /* Decode a packet */ + ret = intel_pt_get_packet(d->bytes, d->len, &packet, &ctx); + if (ret < 0 || ret > INTEL_PT_PKT_MAX_SZ) { + decoding_failed(d); + pr_debug("intel_pt_get_packet returned %d\n", ret); + return TEST_FAIL; + } + + /* Some packets must always leave the packet context unchanged */ + if (d->ctx_unchanged) { + int err; + + err = test_ctx_unchanged(d, &packet, INTEL_PT_NO_CTX); + if (err) + return err; + err = test_ctx_unchanged(d, &packet, INTEL_PT_BLK_4_CTX); + if (err) + return err; + err = test_ctx_unchanged(d, &packet, INTEL_PT_BLK_8_CTX); + if (err) + return err; + } + + /* Compare to the expected values */ + if (ret != d->len || packet.type != d->packet.type || + packet.count != d->packet.count || + packet.payload != d->packet.payload || ctx != d->new_ctx) + return fail(d, &packet, ret, ctx); + + pr_debug("Decoded ok:"); + ret = dump_packet(&d->packet, d->bytes, d->len); + + return ret; +} + +/* + * This test feeds byte sequences to the Intel PT packet decoder and checks the + * results. Changes to the packet context are also checked. + */ +int test__intel_pt_pkt_decoder(struct test *test __maybe_unused, int subtest __maybe_unused) +{ + struct test_data *d = data; + int ret; + + for (d = data; d->len; d++) { + ret = test_one(d); + if (ret) + return ret; + } + + return TEST_OK; +} From 4c35595e1ea7585d09eb80096f47af237061e795 Mon Sep 17 00:00:00 2001 From: Adrian Hunter Date: Mon, 10 Jun 2019 10:27:55 +0300 Subject: [PATCH 05/25] perf intel-pt: Add decoder support for PEBS via PT PEBS data is encoded in Block Item Packets (BIP). Populate a new structure intel_pt_blk_items with the values and, upon a Block End Packet (BEP), report them as a new Intel PT sample type INTEL_PT_BLK_ITEMS. Signed-off-by: Adrian Hunter Cc: Jiri Olsa Link: http://lkml.kernel.org/r/20190610072803.10456-4-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo --- .../util/intel-pt-decoder/intel-pt-decoder.c | 82 ++++++++++- .../util/intel-pt-decoder/intel-pt-decoder.h | 137 ++++++++++++++++++ 2 files changed, 216 insertions(+), 3 deletions(-) diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c index 2f7791d4034f..f8b71bf2bb4c 100644 --- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c +++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.c @@ -134,6 +134,9 @@ struct intel_pt_decoder { struct intel_pt_stack stack; enum intel_pt_pkt_state pkt_state; enum intel_pt_pkt_ctx pkt_ctx; + enum intel_pt_pkt_ctx prev_pkt_ctx; + enum intel_pt_blk_type blk_type; + int blk_type_pos; struct intel_pt_pkt packet; struct intel_pt_pkt tnt; int pkt_step; @@ -167,6 +170,7 @@ struct intel_pt_decoder { bool set_fup_mwait; bool set_fup_pwre; bool set_fup_exstop; + bool set_fup_bep; bool sample_cyc; unsigned int fup_tx_flags; unsigned int tx_flags; @@ -560,6 +564,7 @@ static int intel_pt_get_split_packet(struct intel_pt_decoder *decoder) memcpy(buf + len, decoder->buf, n); len += n; + decoder->prev_pkt_ctx = decoder->pkt_ctx; ret = intel_pt_get_packet(buf, len, &decoder->packet, &decoder->pkt_ctx); if (ret < (int)old_len) { decoder->next_buf = decoder->buf; @@ -885,6 +890,7 @@ static int intel_pt_get_next_packet(struct intel_pt_decoder *decoder) return ret; } + decoder->prev_pkt_ctx = decoder->pkt_ctx; ret = intel_pt_get_packet(decoder->buf, decoder->len, &decoder->packet, &decoder->pkt_ctx); if (ret == INTEL_PT_NEED_MORE_BYTES && BITS_PER_LONG == 32 && @@ -1124,6 +1130,14 @@ static bool intel_pt_fup_event(struct intel_pt_decoder *decoder) decoder->state.to_ip = 0; ret = true; } + if (decoder->set_fup_bep) { + decoder->set_fup_bep = false; + decoder->state.type |= INTEL_PT_BLK_ITEMS; + decoder->state.type &= ~INTEL_PT_BRANCH; + decoder->state.from_ip = decoder->ip; + decoder->state.to_ip = 0; + ret = true; + } return ret; } @@ -1609,6 +1623,46 @@ static void intel_pt_calc_cyc_timestamp(struct intel_pt_decoder *decoder) intel_pt_log_to("Setting timestamp", decoder->timestamp); } +static void intel_pt_bbp(struct intel_pt_decoder *decoder) +{ + if (decoder->prev_pkt_ctx == INTEL_PT_NO_CTX) { + memset(decoder->state.items.mask, 0, sizeof(decoder->state.items.mask)); + decoder->state.items.is_32_bit = false; + } + decoder->blk_type = decoder->packet.payload; + decoder->blk_type_pos = intel_pt_blk_type_pos(decoder->blk_type); + if (decoder->blk_type == INTEL_PT_GP_REGS) + decoder->state.items.is_32_bit = decoder->packet.count; + if (decoder->blk_type_pos < 0) { + intel_pt_log("WARNING: Unknown block type %u\n", + decoder->blk_type); + } else if (decoder->state.items.mask[decoder->blk_type_pos]) { + intel_pt_log("WARNING: Duplicate block type %u\n", + decoder->blk_type); + } +} + +static void intel_pt_bip(struct intel_pt_decoder *decoder) +{ + uint32_t id = decoder->packet.count; + uint32_t bit = 1 << id; + int pos = decoder->blk_type_pos; + + if (pos < 0 || id >= INTEL_PT_BLK_ITEM_ID_CNT) { + intel_pt_log("WARNING: Unknown block item %u type %d\n", + id, decoder->blk_type); + return; + } + + if (decoder->state.items.mask[pos] & bit) { + intel_pt_log("WARNING: Duplicate block item %u type %d\n", + id, decoder->blk_type); + } + + decoder->state.items.mask[pos] |= bit; + decoder->state.items.val[pos][id] = decoder->packet.payload; +} + /* Walk PSB+ packets when already in sync. */ static int intel_pt_walk_psbend(struct intel_pt_decoder *decoder) { @@ -2063,11 +2117,32 @@ static int intel_pt_walk_trace(struct intel_pt_decoder *decoder) return 0; case INTEL_PT_BBP: - case INTEL_PT_BIP: - case INTEL_PT_BEP: - case INTEL_PT_BEP_IP: + intel_pt_bbp(decoder); break; + case INTEL_PT_BIP: + intel_pt_bip(decoder); + break; + + case INTEL_PT_BEP: + decoder->state.type = INTEL_PT_BLK_ITEMS; + decoder->state.from_ip = decoder->ip; + decoder->state.to_ip = 0; + return 0; + + case INTEL_PT_BEP_IP: + err = intel_pt_get_next_packet(decoder); + if (err) + return err; + if (decoder->packet.type == INTEL_PT_FUP) { + decoder->set_fup_bep = true; + no_tip = true; + } else { + intel_pt_log_at("ERROR: Missing FUP after BEP", + decoder->pos); + } + goto next; + default: return intel_pt_bug(decoder); } @@ -2335,6 +2410,7 @@ static int intel_pt_sync_ip(struct intel_pt_decoder *decoder) decoder->set_fup_mwait = false; decoder->set_fup_pwre = false; decoder->set_fup_exstop = false; + decoder->set_fup_bep = false; if (!decoder->branch_enable) { decoder->pkt_state = INTEL_PT_STATE_IN_SYNC; diff --git a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h index 754efa8b501f..9957f2ccdca8 100644 --- a/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h +++ b/tools/perf/util/intel-pt-decoder/intel-pt-decoder.h @@ -30,6 +30,7 @@ enum intel_pt_sample_type { INTEL_PT_CBR_CHG = 1 << 8, INTEL_PT_TRACE_BEGIN = 1 << 9, INTEL_PT_TRACE_END = 1 << 10, + INTEL_PT_BLK_ITEMS = 1 << 11, }; enum intel_pt_period_type { @@ -61,6 +62,141 @@ enum intel_pt_param_flags { INTEL_PT_FUP_WITH_NLIP = 1 << 0, }; +enum intel_pt_blk_type { + INTEL_PT_GP_REGS = 1, + INTEL_PT_PEBS_BASIC = 4, + INTEL_PT_PEBS_MEM = 5, + INTEL_PT_LBR_0 = 8, + INTEL_PT_LBR_1 = 9, + INTEL_PT_LBR_2 = 10, + INTEL_PT_XMM = 16, + INTEL_PT_BLK_TYPE_MAX +}; + +/* + * The block type numbers are not sequential but here they are given sequential + * positions to avoid wasting space for array placement. + */ +enum intel_pt_blk_type_pos { + INTEL_PT_GP_REGS_POS, + INTEL_PT_PEBS_BASIC_POS, + INTEL_PT_PEBS_MEM_POS, + INTEL_PT_LBR_0_POS, + INTEL_PT_LBR_1_POS, + INTEL_PT_LBR_2_POS, + INTEL_PT_XMM_POS, + INTEL_PT_BLK_TYPE_CNT +}; + +/* Get the array position for a block type */ +static inline int intel_pt_blk_type_pos(enum intel_pt_blk_type blk_type) +{ +#define BLK_TYPE(bt) [INTEL_PT_##bt] = INTEL_PT_##bt##_POS + 1 + const int map[INTEL_PT_BLK_TYPE_MAX] = { + BLK_TYPE(GP_REGS), + BLK_TYPE(PEBS_BASIC), + BLK_TYPE(PEBS_MEM), + BLK_TYPE(LBR_0), + BLK_TYPE(LBR_1), + BLK_TYPE(LBR_2), + BLK_TYPE(XMM), + }; +#undef BLK_TYPE + + return blk_type < INTEL_PT_BLK_TYPE_MAX ? map[blk_type] - 1 : -1; +} + +#define INTEL_PT_BLK_ITEM_ID_CNT 32 + +/* + * Use unions so that the block items can be accessed by name or by array index. + * There is an array of 32-bit masks for each block type, which indicate which + * values are present. Then arrays of 32 64-bit values for each block type. + */ +struct intel_pt_blk_items { + union { + uint32_t mask[INTEL_PT_BLK_TYPE_CNT]; + struct { + uint32_t has_rflags:1; + uint32_t has_rip:1; + uint32_t has_rax:1; + uint32_t has_rcx:1; + uint32_t has_rdx:1; + uint32_t has_rbx:1; + uint32_t has_rsp:1; + uint32_t has_rbp:1; + uint32_t has_rsi:1; + uint32_t has_rdi:1; + uint32_t has_r8:1; + uint32_t has_r9:1; + uint32_t has_r10:1; + uint32_t has_r11:1; + uint32_t has_r12:1; + uint32_t has_r13:1; + uint32_t has_r14:1; + uint32_t has_r15:1; + uint32_t has_unused_0:14; + uint32_t has_ip:1; + uint32_t has_applicable_counters:1; + uint32_t has_timestamp:1; + uint32_t has_unused_1:29; + uint32_t has_mem_access_address:1; + uint32_t has_mem_aux_info:1; + uint32_t has_mem_access_latency:1; + uint32_t has_tsx_aux_info:1; + uint32_t has_unused_2:28; + uint32_t has_lbr_0; + uint32_t has_lbr_1; + uint32_t has_lbr_2; + uint32_t has_xmm; + }; + }; + union { + uint64_t val[INTEL_PT_BLK_TYPE_CNT][INTEL_PT_BLK_ITEM_ID_CNT]; + struct { + struct { + uint64_t rflags; + uint64_t rip; + uint64_t rax; + uint64_t rcx; + uint64_t rdx; + uint64_t rbx; + uint64_t rsp; + uint64_t rbp; + uint64_t rsi; + uint64_t rdi; + uint64_t r8; + uint64_t r9; + uint64_t r10; + uint64_t r11; + uint64_t r12; + uint64_t r13; + uint64_t r14; + uint64_t r15; + uint64_t unused_0[INTEL_PT_BLK_ITEM_ID_CNT - 18]; + }; + struct { + uint64_t ip; + uint64_t applicable_counters; + uint64_t timestamp; + uint64_t unused_1[INTEL_PT_BLK_ITEM_ID_CNT - 3]; + }; + struct { + uint64_t mem_access_address; + uint64_t mem_aux_info; + uint64_t mem_access_latency; + uint64_t tsx_aux_info; + uint64_t unused_2[INTEL_PT_BLK_ITEM_ID_CNT - 4]; + }; + uint64_t lbr_0[INTEL_PT_BLK_ITEM_ID_CNT]; + uint64_t lbr_1[INTEL_PT_BLK_ITEM_ID_CNT]; + uint64_t lbr_2[INTEL_PT_BLK_ITEM_ID_CNT]; + uint64_t xmm[INTEL_PT_BLK_ITEM_ID_CNT]; + }; + }; + bool is_32_bit; +}; + struct intel_pt_state { enum intel_pt_sample_type type; int err; @@ -81,6 +217,7 @@ struct intel_pt_state { enum intel_pt_insn_op insn_op; int insn_len; char insn[INTEL_PT_INSN_BUF_SZ]; + struct intel_pt_blk_items items; }; struct intel_pt_insn; From e62ca655eea7ad4956929f647c2d9fb36aeff90e Mon Sep 17 00:00:00 2001 From: Adrian Hunter Date: Mon, 10 Jun 2019 10:27:56 +0300 Subject: [PATCH 06/25] perf intel-pt: Prepare to synthesize PEBS samples Add infrastructure to prepare for synthesizing PEBS samples but leave the actual synthesis to later patches. Signed-off-by: Adrian Hunter Cc: Jiri Olsa Link: http://lkml.kernel.org/r/20190610072803.10456-5-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/intel-pt.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c index 893cef494a43..cc91c1413c22 100644 --- a/tools/perf/util/intel-pt.c +++ b/tools/perf/util/intel-pt.c @@ -101,6 +101,9 @@ struct intel_pt { u64 pwrx_id; u64 cbr_id; + bool sample_pebs; + struct perf_evsel *pebs_evsel; + u64 tsc_bit; u64 mtc_bit; u64 mtc_freq_bits; @@ -1535,6 +1538,11 @@ static int intel_pt_synth_pwrx_sample(struct intel_pt_queue *ptq) pt->pwr_events_sample_type); } +static int intel_pt_synth_pebs_sample(struct intel_pt_queue *ptq __maybe_unused) +{ + return 0; +} + static int intel_pt_synth_error(struct intel_pt *pt, int code, int cpu, pid_t pid, pid_t tid, u64 ip, u64 timestamp) { @@ -1622,6 +1630,16 @@ static int intel_pt_sample(struct intel_pt_queue *ptq) ptq->ipc_cyc_cnt = ptq->state->tot_cyc_cnt; } + /* + * Do PEBS first to allow for the possibility that the PEBS timestamp + * precedes the current timestamp. + */ + if (pt->sample_pebs && state->type & INTEL_PT_BLK_ITEMS) { + err = intel_pt_synth_pebs_sample(ptq); + if (err) + return err; + } + if (pt->sample_pwr_events && (state->type & INTEL_PT_PWR_EVT)) { if (state->type & INTEL_PT_CBR_CHG) { err = intel_pt_synth_cbr_sample(ptq); From 0dfded34a2e3b517c149ee9c7d1e5173025017b7 Mon Sep 17 00:00:00 2001 From: Adrian Hunter Date: Mon, 10 Jun 2019 10:27:57 +0300 Subject: [PATCH 07/25] perf intel-pt: Factor out common sample preparation for re-use Factor out common sample preparation for re-use when synthesizing PEBS samples. Signed-off-by: Adrian Hunter Cc: Jiri Olsa Link: http://lkml.kernel.org/r/20190610072803.10456-6-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/intel-pt.c | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c index cc91c1413c22..a2d90b2f1f11 100644 --- a/tools/perf/util/intel-pt.c +++ b/tools/perf/util/intel-pt.c @@ -1182,28 +1182,37 @@ static inline bool intel_pt_skip_event(struct intel_pt *pt) pt->num_events++ < pt->synth_opts.initial_skip; } +static void intel_pt_prep_a_sample(struct intel_pt_queue *ptq, + union perf_event *event, + struct perf_sample *sample) +{ + event->sample.header.type = PERF_RECORD_SAMPLE; + event->sample.header.size = sizeof(struct perf_event_header); + + sample->pid = ptq->pid; + sample->tid = ptq->tid; + sample->cpu = ptq->cpu; + sample->insn_len = ptq->insn_len; + memcpy(sample->insn, ptq->insn, INTEL_PT_INSN_BUF_SZ); +} + static void intel_pt_prep_b_sample(struct intel_pt *pt, struct intel_pt_queue *ptq, union perf_event *event, struct perf_sample *sample) { + intel_pt_prep_a_sample(ptq, event, sample); + if (!pt->timeless_decoding) sample->time = tsc_to_perf_time(ptq->timestamp, &pt->tc); sample->ip = ptq->state->from_ip; sample->cpumode = intel_pt_cpumode(pt, sample->ip); - sample->pid = ptq->pid; - sample->tid = ptq->tid; sample->addr = ptq->state->to_ip; sample->period = 1; - sample->cpu = ptq->cpu; sample->flags = ptq->flags; - sample->insn_len = ptq->insn_len; - memcpy(sample->insn, ptq->insn, INTEL_PT_INSN_BUF_SZ); - event->sample.header.type = PERF_RECORD_SAMPLE; event->sample.header.misc = sample->cpumode; - event->sample.header.size = sizeof(struct perf_event_header); } static int intel_pt_inject_event(union perf_event *event, From 9d0bc53e35b82e429ab698d112f7af4336578735 Mon Sep 17 00:00:00 2001 From: Adrian Hunter Date: Mon, 10 Jun 2019 10:27:58 +0300 Subject: [PATCH 08/25] perf intel-pt: Synthesize PEBS sample basic information Synthesize a PEBS sample using basic information (ip, timestamp) only. Other PEBS information will be added in later patches. Signed-off-by: Adrian Hunter Cc: Jiri Olsa Link: http://lkml.kernel.org/r/20190610072803.10456-7-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/intel-pt.c | 52 ++++++++++++++++++++++++++++++++++++-- 1 file changed, 50 insertions(+), 2 deletions(-) diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c index a2d90b2f1f11..979519b00a74 100644 --- a/tools/perf/util/intel-pt.c +++ b/tools/perf/util/intel-pt.c @@ -1547,9 +1547,57 @@ static int intel_pt_synth_pwrx_sample(struct intel_pt_queue *ptq) pt->pwr_events_sample_type); } -static int intel_pt_synth_pebs_sample(struct intel_pt_queue *ptq __maybe_unused) +static int intel_pt_synth_pebs_sample(struct intel_pt_queue *ptq) { - return 0; + const struct intel_pt_blk_items *items = &ptq->state->items; + struct perf_sample sample = { .ip = 0, }; + union perf_event *event = ptq->event_buf; + struct intel_pt *pt = ptq->pt; + struct perf_evsel *evsel = pt->pebs_evsel; + u64 sample_type = evsel->attr.sample_type; + u64 id = evsel->id[0]; + u8 cpumode; + + if (intel_pt_skip_event(pt)) + return 0; + + intel_pt_prep_a_sample(ptq, event, &sample); + + sample.id = id; + sample.stream_id = id; + + if (!evsel->attr.freq) + sample.period = evsel->attr.sample_period; + + /* No support for non-zero CS base */ + if (items->has_ip) + sample.ip = items->ip; + else if (items->has_rip) + sample.ip = items->rip; + else + sample.ip = ptq->state->from_ip; + + /* No support for guest mode at this time */ + cpumode = sample.ip < ptq->pt->kernel_start ? + PERF_RECORD_MISC_USER : + PERF_RECORD_MISC_KERNEL; + + event->sample.header.misc = cpumode | PERF_RECORD_MISC_EXACT_IP; + + sample.cpumode = cpumode; + + if (sample_type & PERF_SAMPLE_TIME) { + u64 timestamp = 0; + + if (items->has_timestamp) + timestamp = items->timestamp; + else if (!pt->timeless_decoding) + timestamp = ptq->timestamp; + if (timestamp) + sample.time = tsc_to_perf_time(timestamp, &pt->tc); + } + + return intel_pt_deliver_synth_event(pt, ptq, event, &sample, sample_type); } static int intel_pt_synth_error(struct intel_pt *pt, int code, int cpu, From 9e9a618afc178e747cc449464ba54d9c932f7af2 Mon Sep 17 00:00:00 2001 From: Adrian Hunter Date: Mon, 10 Jun 2019 10:27:59 +0300 Subject: [PATCH 09/25] perf intel-pt: Add gp registers to synthesized PEBS sample Add general purpose register information from PEBS data in the Intel PT trace to the synthesized PEBS sample. Signed-off-by: Adrian Hunter Cc: Jiri Olsa Link: http://lkml.kernel.org/r/20190610072803.10456-8-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/intel-pt.c | 69 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 69 insertions(+) diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c index 979519b00a74..00c2c96bb805 100644 --- a/tools/perf/util/intel-pt.c +++ b/tools/perf/util/intel-pt.c @@ -35,6 +35,8 @@ #include "config.h" #include "time-utils.h" +#include "../arch/x86/include/uapi/asm/perf_regs.h" + #include "intel-pt-decoder/intel-pt-log.h" #include "intel-pt-decoder/intel-pt-decoder.h" #include "intel-pt-decoder/intel-pt-insn-decoder.h" @@ -1547,6 +1549,60 @@ static int intel_pt_synth_pwrx_sample(struct intel_pt_queue *ptq) pt->pwr_events_sample_type); } +/* + * PEBS gp_regs array indexes plus 1 so that 0 means not present. Refer + * intel_pt_add_gp_regs(). + */ +static const int pebs_gp_regs[] = { + [PERF_REG_X86_FLAGS] = 1, + [PERF_REG_X86_IP] = 2, + [PERF_REG_X86_AX] = 3, + [PERF_REG_X86_CX] = 4, + [PERF_REG_X86_DX] = 5, + [PERF_REG_X86_BX] = 6, + [PERF_REG_X86_SP] = 7, + [PERF_REG_X86_BP] = 8, + [PERF_REG_X86_SI] = 9, + [PERF_REG_X86_DI] = 10, + [PERF_REG_X86_R8] = 11, + [PERF_REG_X86_R9] = 12, + [PERF_REG_X86_R10] = 13, + [PERF_REG_X86_R11] = 14, + [PERF_REG_X86_R12] = 15, + [PERF_REG_X86_R13] = 16, + [PERF_REG_X86_R14] = 17, + [PERF_REG_X86_R15] = 18, +}; + +static u64 *intel_pt_add_gp_regs(struct regs_dump *intr_regs, u64 *pos, + const struct intel_pt_blk_items *items, + u64 regs_mask) +{ + const u64 *gp_regs = items->val[INTEL_PT_GP_REGS_POS]; + u32 mask = items->mask[INTEL_PT_GP_REGS_POS]; + u32 bit; + int i; + + for (i = 0, bit = 1; i < PERF_REG_X86_64_MAX; i++, bit <<= 1) { + /* Get the PEBS gp_regs array index */ + int n = pebs_gp_regs[i] - 1; + + if (n < 0) + continue; + /* + * Add only registers that were requested (i.e. 'regs_mask') and + * that were provided (i.e. 'mask'), and update the resulting + * mask (i.e. 'intr_regs->mask') accordingly. + */ + if (mask & 1 << n && regs_mask & bit) { + intr_regs->mask |= bit; + *pos++ = gp_regs[n]; + } + } + + return pos; +} + static int intel_pt_synth_pebs_sample(struct intel_pt_queue *ptq) { const struct intel_pt_blk_items *items = &ptq->state->items; @@ -1597,6 +1653,19 @@ static int intel_pt_synth_pebs_sample(struct intel_pt_queue *ptq) sample.time = tsc_to_perf_time(timestamp, &pt->tc); } + if (sample_type & PERF_SAMPLE_REGS_INTR && + items->mask[INTEL_PT_GP_REGS_POS]) { + u64 regs[sizeof(sample.intr_regs.mask)]; + u64 regs_mask = evsel->attr.sample_regs_intr; + + sample.intr_regs.abi = items->is_32_bit ? + PERF_SAMPLE_REGS_ABI_32 : + PERF_SAMPLE_REGS_ABI_64; + sample.intr_regs.regs = regs; + + intel_pt_add_gp_regs(&sample.intr_regs, regs, items, regs_mask); + } + return intel_pt_deliver_synth_event(pt, ptq, event, &sample, sample_type); } From 143d34a6b387b96aba42c49cb76d18ad3e3863e5 Mon Sep 17 00:00:00 2001 From: Adrian Hunter Date: Mon, 10 Jun 2019 10:28:00 +0300 Subject: [PATCH 10/25] perf intel-pt: Add XMM registers to synthesized PEBS sample Add XMM register information from PEBS data in the Intel PT trace to the synthesized PEBS sample. Signed-off-by: Adrian Hunter Cc: Jiri Olsa Link: http://lkml.kernel.org/r/20190610072803.10456-9-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/intel-pt.c | 30 +++++++++++++++++++++++++++++- 1 file changed, 29 insertions(+), 1 deletion(-) diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c index 00c2c96bb805..f83dd10bb7d0 100644 --- a/tools/perf/util/intel-pt.c +++ b/tools/perf/util/intel-pt.c @@ -1603,6 +1603,31 @@ static u64 *intel_pt_add_gp_regs(struct regs_dump *intr_regs, u64 *pos, return pos; } +#ifndef PERF_REG_X86_XMM0 +#define PERF_REG_X86_XMM0 32 +#endif + +static void intel_pt_add_xmm(struct regs_dump *intr_regs, u64 *pos, + const struct intel_pt_blk_items *items, + u64 regs_mask) +{ + u32 mask = items->has_xmm & (regs_mask >> PERF_REG_X86_XMM0); + const u64 *xmm = items->xmm; + + /* + * If there are any XMM registers, then there should be all of them. + * Nevertheless, follow the logic to add only registers that were + * requested (i.e. 'regs_mask') and that were provided (i.e. 'mask'), + * and update the resulting mask (i.e. 'intr_regs->mask') accordingly. + */ + intr_regs->mask |= (u64)mask << PERF_REG_X86_XMM0; + + for (; mask; mask >>= 1, xmm++) { + if (mask & 1) + *pos++ = *xmm; + } +} + static int intel_pt_synth_pebs_sample(struct intel_pt_queue *ptq) { const struct intel_pt_blk_items *items = &ptq->state->items; @@ -1657,13 +1682,16 @@ static int intel_pt_synth_pebs_sample(struct intel_pt_queue *ptq) items->mask[INTEL_PT_GP_REGS_POS]) { u64 regs[sizeof(sample.intr_regs.mask)]; u64 regs_mask = evsel->attr.sample_regs_intr; + u64 *pos; sample.intr_regs.abi = items->is_32_bit ? PERF_SAMPLE_REGS_ABI_32 : PERF_SAMPLE_REGS_ABI_64; sample.intr_regs.regs = regs; - intel_pt_add_gp_regs(&sample.intr_regs, regs, items, regs_mask); + pos = intel_pt_add_gp_regs(&sample.intr_regs, regs, items, regs_mask); + + intel_pt_add_xmm(&sample.intr_regs, pos, items, regs_mask); } return intel_pt_deliver_synth_event(pt, ptq, event, &sample, sample_type); From aa62afd7daac4b4cc95cd2454e3f43aa23f519c1 Mon Sep 17 00:00:00 2001 From: Adrian Hunter Date: Mon, 10 Jun 2019 10:28:01 +0300 Subject: [PATCH 11/25] perf intel-pt: Add LBR information to synthesized PEBS sample Add LBR information from PEBS data in the Intel PT trace to the synthesized PEBS sample. Signed-off-by: Adrian Hunter Cc: Jiri Olsa Link: http://lkml.kernel.org/r/20190610072803.10456-10-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/intel-pt.c | 72 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 72 insertions(+) diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c index f83dd10bb7d0..db00c13dc36f 100644 --- a/tools/perf/util/intel-pt.c +++ b/tools/perf/util/intel-pt.c @@ -1628,6 +1628,58 @@ static void intel_pt_add_xmm(struct regs_dump *intr_regs, u64 *pos, } } +#define LBR_INFO_MISPRED (1ULL << 63) +#define LBR_INFO_IN_TX (1ULL << 62) +#define LBR_INFO_ABORT (1ULL << 61) +#define LBR_INFO_CYCLES 0xffff + +/* Refer kernel's intel_pmu_store_pebs_lbrs() */ +static u64 intel_pt_lbr_flags(u64 info) +{ + union { + struct branch_flags flags; + u64 result; + } u = { + .flags = { + .mispred = !!(info & LBR_INFO_MISPRED), + .predicted = !(info & LBR_INFO_MISPRED), + .in_tx = !!(info & LBR_INFO_IN_TX), + .abort = !!(info & LBR_INFO_ABORT), + .cycles = info & LBR_INFO_CYCLES, + } + }; + + return u.result; +} + +static void intel_pt_add_lbrs(struct branch_stack *br_stack, + const struct intel_pt_blk_items *items) +{ + u64 *to; + int i; + + br_stack->nr = 0; + + to = &br_stack->entries[0].from; + + for (i = INTEL_PT_LBR_0_POS; i <= INTEL_PT_LBR_2_POS; i++) { + u32 mask = items->mask[i]; + const u64 *from = items->val[i]; + + for (; mask; mask >>= 3, from += 3) { + if ((mask & 7) == 7) { + *to++ = from[0]; + *to++ = from[1]; + *to++ = intel_pt_lbr_flags(from[2]); + br_stack->nr += 1; + } + } + } +} + +/* INTEL_PT_LBR_0, INTEL_PT_LBR_1 and INTEL_PT_LBR_2 */ +#define LBRS_MAX (INTEL_PT_BLK_ITEM_ID_CNT * 3) + static int intel_pt_synth_pebs_sample(struct intel_pt_queue *ptq) { const struct intel_pt_blk_items *items = &ptq->state->items; @@ -1694,6 +1746,26 @@ static int intel_pt_synth_pebs_sample(struct intel_pt_queue *ptq) intel_pt_add_xmm(&sample.intr_regs, pos, items, regs_mask); } + if (sample_type & PERF_SAMPLE_BRANCH_STACK) { + struct { + struct branch_stack br_stack; + struct branch_entry entries[LBRS_MAX]; + } br; + + if (items->mask[INTEL_PT_LBR_0_POS] || + items->mask[INTEL_PT_LBR_1_POS] || + items->mask[INTEL_PT_LBR_2_POS]) { + intel_pt_add_lbrs(&br.br_stack, items); + sample.branch_stack = &br.br_stack; + } else if (pt->synth_opts.last_branch) { + intel_pt_copy_last_branch_rb(ptq); + sample.branch_stack = ptq->last_branch; + } else { + br.br_stack.nr = 0; + sample.branch_stack = &br.br_stack; + } + } + return intel_pt_deliver_synth_event(pt, ptq, event, &sample, sample_type); } From 975846eddf907297aa036544545cd839c7c7dd31 Mon Sep 17 00:00:00 2001 From: Adrian Hunter Date: Mon, 10 Jun 2019 10:28:02 +0300 Subject: [PATCH 12/25] perf intel-pt: Add memory information to synthesized PEBS sample Add memory information from PEBS data in the Intel PT trace to the synthesized PEBS sample. This provides sample types PERF_SAMPLE_ADDR, PERF_SAMPLE_WEIGHT, and PERF_SAMPLE_TRANSACTION, but not PERF_SAMPLE_DATA_SRC. Signed-off-by: Adrian Hunter Cc: Jiri Olsa Link: http://lkml.kernel.org/r/20190610072803.10456-11-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/intel-pt.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c index db00c13dc36f..bf7647897e8a 100644 --- a/tools/perf/util/intel-pt.c +++ b/tools/perf/util/intel-pt.c @@ -1766,6 +1766,33 @@ static int intel_pt_synth_pebs_sample(struct intel_pt_queue *ptq) } } + if (sample_type & PERF_SAMPLE_ADDR && items->has_mem_access_address) + sample.addr = items->mem_access_address; + + if (sample_type & PERF_SAMPLE_WEIGHT) { + /* + * Refer kernel's setup_pebs_adaptive_sample_data() and + * intel_hsw_weight(). + */ + if (items->has_mem_access_latency) + sample.weight = items->mem_access_latency; + if (!sample.weight && items->has_tsx_aux_info) { + /* Cycles last block */ + sample.weight = (u32)items->tsx_aux_info; + } + } + + if (sample_type & PERF_SAMPLE_TRANSACTION && items->has_tsx_aux_info) { + u64 ax = items->has_rax ? items->rax : 0; + /* Refer kernel's intel_hsw_transaction() */ + u64 txn = (u8)(items->tsx_aux_info >> 32); + + /* For RTM XABORTs also log the abort code from AX */ + if (txn & PERF_TXN_TRANSACTION && ax & 1) + txn |= ((ax >> 24) & 0xff) << PERF_TXN_ABORT_SHIFT; + sample.transaction = txn; + } + return intel_pt_deliver_synth_event(pt, ptq, event, &sample, sample_type); } From e01f0ef509ea7e76929f24a074d241de52c6f82a Mon Sep 17 00:00:00 2001 From: Adrian Hunter Date: Mon, 10 Jun 2019 10:28:03 +0300 Subject: [PATCH 13/25] perf intel-pt: Add callchain to synthesized PEBS sample Like other synthesized events, if there is also an Intel PT branch trace, then a call stack can also be synthesized. Add that. Signed-off-by: Adrian Hunter Cc: Jiri Olsa Link: http://lkml.kernel.org/r/20190610072803.10456-12-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/intel-pt.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c index bf7647897e8a..550db6e77968 100644 --- a/tools/perf/util/intel-pt.c +++ b/tools/perf/util/intel-pt.c @@ -1730,6 +1730,14 @@ static int intel_pt_synth_pebs_sample(struct intel_pt_queue *ptq) sample.time = tsc_to_perf_time(timestamp, &pt->tc); } + if (sample_type & PERF_SAMPLE_CALLCHAIN && + pt->synth_opts.callchain) { + thread_stack__sample(ptq->thread, ptq->cpu, ptq->chain, + pt->synth_opts.callchain_sz, sample.ip, + pt->kernel_start); + sample.callchain = ptq->chain; + } + if (sample_type & PERF_SAMPLE_REGS_INTR && items->mask[INTEL_PT_GP_REGS_POS]) { u64 regs[sizeof(sample.intr_regs.mask)]; From 4541a8bb13a86e504416a13360c8dc64d2fd612a Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Thu, 13 Jun 2019 12:04:19 -0300 Subject: [PATCH 14/25] tools build: Check if gettid() is available before providing helper MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Laura reported that the perf build failed in fedora when we got a glibc that provides gettid(), which I reproduced using fedora rawhide with the glibc-devel-2.29.9000-26.fc31.x86_64 package. Add a feature check to avoid providing a gettid() helper in such systems. On a fedora rawhide system with this patch applied we now get: [root@7a5f55352234 perf]# grep gettid /tmp/build/perf/FEATURE-DUMP feature-gettid=1 [root@7a5f55352234 perf]# cat /tmp/build/perf/feature/test-gettid.make.output [root@7a5f55352234 perf]# ldd /tmp/build/perf/feature/test-gettid.bin linux-vdso.so.1 (0x00007ffc6b1f6000) libc.so.6 => /lib64/libc.so.6 (0x00007f04e0a74000) /lib64/ld-linux-x86-64.so.2 (0x00007f04e0c47000) [root@7a5f55352234 perf]# nm /tmp/build/perf/feature/test-gettid.bin | grep -w gettid U gettid@@GLIBC_2.30 [root@7a5f55352234 perf]# While on a fedora:29 system: [acme@quaco perf]$ grep gettid /tmp/build/perf/FEATURE-DUMP feature-gettid=0 [acme@quaco perf]$ cat /tmp/build/perf/feature/test-gettid.make.output test-gettid.c: In function ‘main’: test-gettid.c:8:9: error: implicit declaration of function ‘gettid’; did you mean ‘getgid’? [-Werror=implicit-function-declaration] return gettid(); ^~~~~~ getgid cc1: all warnings being treated as errors [acme@quaco perf]$ Reported-by: Laura Abbott Tested-by: Laura Abbott Acked-by: Jiri Olsa Cc: Adrian Hunter Cc: Florian Weimer Cc: Namhyung Kim Cc: Stephane Eranian Link: https://lkml.kernel.org/n/tip-yfy3ch53agmklwu9o7rlgf9c@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/build/Makefile.feature | 1 + tools/build/feature/Makefile | 4 ++++ tools/build/feature/test-all.c | 5 +++++ tools/build/feature/test-gettid.c | 11 +++++++++++ tools/perf/Makefile.config | 4 ++++ tools/perf/jvmti/jvmti_agent.c | 2 ++ 6 files changed, 27 insertions(+) create mode 100644 tools/build/feature/test-gettid.c diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature index 3b24231c58a2..50377cc2f5f9 100644 --- a/tools/build/Makefile.feature +++ b/tools/build/Makefile.feature @@ -36,6 +36,7 @@ FEATURE_TESTS_BASIC := \ fortify-source \ sync-compare-and-swap \ get_current_dir_name \ + gettid \ glibc \ gtk2 \ gtk2-infobar \ diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile index 4b8244ee65ce..523ee42db0c8 100644 --- a/tools/build/feature/Makefile +++ b/tools/build/feature/Makefile @@ -54,6 +54,7 @@ FILES= \ test-get_cpuid.bin \ test-sdt.bin \ test-cxx.bin \ + test-gettid.bin \ test-jvmti.bin \ test-jvmti-cmlr.bin \ test-sched_getcpu.bin \ @@ -267,6 +268,9 @@ $(OUTPUT)test-sdt.bin: $(OUTPUT)test-cxx.bin: $(BUILDXX) -std=gnu++11 +$(OUTPUT)test-gettid.bin: + $(BUILD) + $(OUTPUT)test-jvmti.bin: $(BUILD) diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c index a59c53705093..3b3d5d72124a 100644 --- a/tools/build/feature/test-all.c +++ b/tools/build/feature/test-all.c @@ -38,6 +38,10 @@ # include "test-get_current_dir_name.c" #undef main +#define main main_test_gettid +# include "test-gettid.c" +#undef main + #define main main_test_glibc # include "test-glibc.c" #undef main @@ -195,6 +199,7 @@ int main(int argc, char *argv[]) main_test_libelf(); main_test_libelf_mmap(); main_test_get_current_dir_name(); + main_test_gettid(); main_test_glibc(); main_test_dwarf(); main_test_dwarf_getlocations(); diff --git a/tools/build/feature/test-gettid.c b/tools/build/feature/test-gettid.c new file mode 100644 index 000000000000..ef24e42d3f1b --- /dev/null +++ b/tools/build/feature/test-gettid.c @@ -0,0 +1,11 @@ +// SPDX-License-Identifier: GPL-2.0 +// Copyright (C) 2019, Red Hat Inc, Arnaldo Carvalho de Melo +#define _GNU_SOURCE +#include + +int main(void) +{ + return gettid(); +} + +#undef _GNU_SOURCE diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index 51dd00f65709..5f16a20cae86 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -332,6 +332,10 @@ ifeq ($(feature-get_current_dir_name), 1) CFLAGS += -DHAVE_GET_CURRENT_DIR_NAME endif +ifeq ($(feature-gettid), 1) + CFLAGS += -DHAVE_GETTID +endif + ifdef NO_LIBELF NO_DWARF := 1 NO_DEMANGLE := 1 diff --git a/tools/perf/jvmti/jvmti_agent.c b/tools/perf/jvmti/jvmti_agent.c index f7eb63cbbc65..88108598d6e9 100644 --- a/tools/perf/jvmti/jvmti_agent.c +++ b/tools/perf/jvmti/jvmti_agent.c @@ -45,10 +45,12 @@ static char jit_path[PATH_MAX]; static void *marker_addr; +#ifndef HAVE_GETTID static inline pid_t gettid(void) { return (pid_t)syscall(__NR_gettid); } +#endif static int get_e_machine(struct jitheader *hdr) { From a4066d64d9391a734ee0e49c8d2757f5685013b4 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Thu, 13 Jun 2019 17:35:09 -0300 Subject: [PATCH 15/25] perf trace: Fix exclusion of not available syscall names from selector list We were just skipping the syscalls not available in a particular architecture without reflecting this in the number of entries in the ev_qualifier_ids.nr variable, fix it. This was done with the most minimalistic way, reusing the index variable 'i', a followup patch will further clean this by making 'i' renamed to 'nr_used' and using 'nr_allocated' in a few more places. Reported-by: Leo Yan Tested-by: Leo Yan Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Mathieu Poirier Cc: Mike Leach Cc: Namhyung Kim Cc: Suzuki K Poulose Fixes: 04c41bcb862b ("perf trace: Skip unknown syscalls when expanding strace like syscall groups") Link: https://lkml.kernel.org/r/20190613181514.GC1402@kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-trace.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index 12f5ad98f8c1..fa9eb467eb4c 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -1527,9 +1527,9 @@ static int trace__read_syscall_info(struct trace *trace, int id) static int trace__validate_ev_qualifier(struct trace *trace) { - int err = 0, i; + int err = 0; bool printed_invalid_prefix = false; - size_t nr_allocated; + size_t nr_allocated, i; struct str_node *pos; trace->ev_qualifier_ids.nr = strlist__nr_entries(trace->ev_qualifier); @@ -1574,7 +1574,7 @@ static int trace__validate_ev_qualifier(struct trace *trace) id = syscalltbl__strglobmatch_next(trace->sctbl, sc, &match_next); if (id < 0) break; - if (nr_allocated == trace->ev_qualifier_ids.nr) { + if (nr_allocated == i) { void *entries; nr_allocated += 8; @@ -1587,11 +1587,11 @@ static int trace__validate_ev_qualifier(struct trace *trace) } trace->ev_qualifier_ids.entries = entries; } - trace->ev_qualifier_ids.nr++; trace->ev_qualifier_ids.entries[i++] = id; } } + trace->ev_qualifier_ids.nr = i; out: if (printed_invalid_prefix) pr_debug("\n"); From 99f26f854867175ad7e157c7efc1e91ddc7eec44 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Thu, 13 Jun 2019 18:17:41 -0300 Subject: [PATCH 16/25] perf trace: Streamline validation of select syscall names list Rename the 'i' variable to 'nr_used' and use set 'nr_allocated' since the start of this function, leaving the final assignment of the longer named trace->ev_qualifier_ids.nr state to 'nr_used' at the end of the function. No change in behaviour intended. Cc: Leo Yan Cc: Adrian Hunter Cc: Alexander Shishkin Cc: Jiri Olsa Cc: Mathieu Poirier Cc: Mike Leach Cc: Namhyung Kim Cc: Suzuki K Poulose Link: https://lkml.kernel.org/n/tip-kpgyn8xjdjgt0timrrnniquv@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-trace.c | 16 ++++++---------- 1 file changed, 6 insertions(+), 10 deletions(-) diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index fa9eb467eb4c..d5b2e4d8bf41 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -1529,11 +1529,10 @@ static int trace__validate_ev_qualifier(struct trace *trace) { int err = 0; bool printed_invalid_prefix = false; - size_t nr_allocated, i; struct str_node *pos; + size_t nr_used = 0, nr_allocated = strlist__nr_entries(trace->ev_qualifier); - trace->ev_qualifier_ids.nr = strlist__nr_entries(trace->ev_qualifier); - trace->ev_qualifier_ids.entries = malloc(trace->ev_qualifier_ids.nr * + trace->ev_qualifier_ids.entries = malloc(nr_allocated * sizeof(trace->ev_qualifier_ids.entries[0])); if (trace->ev_qualifier_ids.entries == NULL) { @@ -1543,9 +1542,6 @@ static int trace__validate_ev_qualifier(struct trace *trace) goto out; } - nr_allocated = trace->ev_qualifier_ids.nr; - i = 0; - strlist__for_each_entry(pos, trace->ev_qualifier) { const char *sc = pos->s; int id = syscalltbl__id(trace->sctbl, sc), match_next = -1; @@ -1566,7 +1562,7 @@ static int trace__validate_ev_qualifier(struct trace *trace) continue; } matches: - trace->ev_qualifier_ids.entries[i++] = id; + trace->ev_qualifier_ids.entries[nr_used++] = id; if (match_next == -1) continue; @@ -1574,7 +1570,7 @@ static int trace__validate_ev_qualifier(struct trace *trace) id = syscalltbl__strglobmatch_next(trace->sctbl, sc, &match_next); if (id < 0) break; - if (nr_allocated == i) { + if (nr_allocated == nr_used) { void *entries; nr_allocated += 8; @@ -1587,11 +1583,11 @@ static int trace__validate_ev_qualifier(struct trace *trace) } trace->ev_qualifier_ids.entries = entries; } - trace->ev_qualifier_ids.entries[i++] = id; + trace->ev_qualifier_ids.entries[nr_used++] = id; } } - trace->ev_qualifier_ids.nr = i; + trace->ev_qualifier_ids.nr = nr_used; out: if (printed_invalid_prefix) pr_debug("\n"); From 5e2156d837e875c0277bbe9c5cd965ff56539e0b Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Thu, 13 Jun 2019 18:25:04 -0300 Subject: [PATCH 17/25] tools build feature tests: Add missing SPDX headers Cc: Adrian Hunter Cc: Jiri Olsa Cc: Namhyung Kim Link: https://lkml.kernel.org/n/tip-3h6fa866w6ao0wsbyqz9nrm8@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/build/feature/test-fortify-source.c | 1 + tools/build/feature/test-hello.c | 1 + tools/build/feature/test-setns.c | 1 + 3 files changed, 3 insertions(+) diff --git a/tools/build/feature/test-fortify-source.c b/tools/build/feature/test-fortify-source.c index c9f398d87868..c8a57194f9f2 100644 --- a/tools/build/feature/test-fortify-source.c +++ b/tools/build/feature/test-fortify-source.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0 #include int main(void) diff --git a/tools/build/feature/test-hello.c b/tools/build/feature/test-hello.c index c9f398d87868..c8a57194f9f2 100644 --- a/tools/build/feature/test-hello.c +++ b/tools/build/feature/test-hello.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0 #include int main(void) diff --git a/tools/build/feature/test-setns.c b/tools/build/feature/test-setns.c index 4a1581ae7a55..2757c201ed50 100644 --- a/tools/build/feature/test-setns.c +++ b/tools/build/feature/test-setns.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0 #define _GNU_SOURCE #include From 5875cf4cd32ea08d0d6abb82091f2d1f7cd6889f Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Thu, 13 Jun 2019 18:29:05 -0300 Subject: [PATCH 18/25] perf tests: Add missing SPDX headers Cc: Adrian Hunter Cc: Andi Kleen Cc: Jiri Olsa Cc: Namhyung Kim Cc: Wang Nan Link: https://lkml.kernel.org/n/tip-p0kg493z2m8qizjbdefzip1i@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/tests/Build | 2 ++ tools/perf/tests/bp_account.c | 1 + tools/perf/tests/bpf-script-example.c | 1 + tools/perf/tests/bpf-script-test-kbuild.c | 1 + tools/perf/tests/bpf-script-test-prologue.c | 1 + tools/perf/tests/bpf-script-test-relocation.c | 1 + tools/perf/tests/bpf.c | 1 + tools/perf/tests/map_groups.c | 1 + tools/perf/tests/mem.c | 1 + tools/perf/tests/mem2node.c | 1 + tools/perf/tests/shell/lib/probe.sh | 1 + tools/perf/tests/shell/probe_vfs_getname.sh | 3 ++- tools/perf/tests/shell/record+probe_libc_inet_pton.sh | 1 + tools/perf/tests/shell/record+script_probe_vfs_getname.sh | 1 + tools/perf/tests/shell/record+zstd_comp_decomp.sh | 2 ++ tools/perf/tests/shell/trace+probe_vfs_getname.sh | 1 + 16 files changed, 19 insertions(+), 1 deletion(-) diff --git a/tools/perf/tests/Build b/tools/perf/tests/Build index e3ba63cef01e..e72accefd669 100644 --- a/tools/perf/tests/Build +++ b/tools/perf/tests/Build @@ -1,3 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0 + perf-y += builtin-test.o perf-y += parse-events.o perf-y += dso-data.o diff --git a/tools/perf/tests/bp_account.c b/tools/perf/tests/bp_account.c index 57fc544aedb0..153624e2d0f5 100644 --- a/tools/perf/tests/bp_account.c +++ b/tools/perf/tests/bp_account.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0 /* * Powerpc needs __SANE_USERSPACE_TYPES__ before to select * 'int-ll64.h' and avoid compile warnings when printing __u64 with %llu. diff --git a/tools/perf/tests/bpf-script-example.c b/tools/perf/tests/bpf-script-example.c index 1ca5106df5f1..ab4b98b3165d 100644 --- a/tools/perf/tests/bpf-script-example.c +++ b/tools/perf/tests/bpf-script-example.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0 /* * bpf-script-example.c * Test basic LLVM building diff --git a/tools/perf/tests/bpf-script-test-kbuild.c b/tools/perf/tests/bpf-script-test-kbuild.c index ff3ec8337f0a..219673aa278f 100644 --- a/tools/perf/tests/bpf-script-test-kbuild.c +++ b/tools/perf/tests/bpf-script-test-kbuild.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0 /* * bpf-script-test-kbuild.c * Test include from kernel header diff --git a/tools/perf/tests/bpf-script-test-prologue.c b/tools/perf/tests/bpf-script-test-prologue.c index 43f1e16486f4..bd83d364cf30 100644 --- a/tools/perf/tests/bpf-script-test-prologue.c +++ b/tools/perf/tests/bpf-script-test-prologue.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0 /* * bpf-script-test-prologue.c * Test BPF prologue diff --git a/tools/perf/tests/bpf-script-test-relocation.c b/tools/perf/tests/bpf-script-test-relocation.c index 93af77421816..74006e4b2d24 100644 --- a/tools/perf/tests/bpf-script-test-relocation.c +++ b/tools/perf/tests/bpf-script-test-relocation.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0 /* * bpf-script-test-relocation.c * Test BPF loader checking relocation diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c index 79b54f8ddebf..c9e4cdc4c9c8 100644 --- a/tools/perf/tests/bpf.c +++ b/tools/perf/tests/bpf.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0 #include #include #include diff --git a/tools/perf/tests/map_groups.c b/tools/perf/tests/map_groups.c index 70d96acc6dcf..594fdaca4f71 100644 --- a/tools/perf/tests/map_groups.c +++ b/tools/perf/tests/map_groups.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0 #include #include #include "tests.h" diff --git a/tools/perf/tests/mem.c b/tools/perf/tests/mem.c index 0f82ee9fd3f7..efe3397824d2 100644 --- a/tools/perf/tests/mem.c +++ b/tools/perf/tests/mem.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0 #include "util/mem-events.h" #include "util/symbol.h" #include "linux/perf_event.h" diff --git a/tools/perf/tests/mem2node.c b/tools/perf/tests/mem2node.c index 9e9e4d37cc77..d23ff1b68eba 100644 --- a/tools/perf/tests/mem2node.c +++ b/tools/perf/tests/mem2node.c @@ -1,3 +1,4 @@ +// SPDX-License-Identifier: GPL-2.0 #include #include #include "cpumap.h" diff --git a/tools/perf/tests/shell/lib/probe.sh b/tools/perf/tests/shell/lib/probe.sh index e37787be672b..51e3f60baba0 100644 --- a/tools/perf/tests/shell/lib/probe.sh +++ b/tools/perf/tests/shell/lib/probe.sh @@ -1,3 +1,4 @@ +# SPDX-License-Identifier: GPL-2.0 # Arnaldo Carvalho de Melo , 2017 skip_if_no_perf_probe() { diff --git a/tools/perf/tests/shell/probe_vfs_getname.sh b/tools/perf/tests/shell/probe_vfs_getname.sh index 46e076e3c537..5d1b63d3f3e1 100755 --- a/tools/perf/tests/shell/probe_vfs_getname.sh +++ b/tools/perf/tests/shell/probe_vfs_getname.sh @@ -1,6 +1,7 @@ #!/bin/sh # Add vfs_getname probe to get syscall args filenames -# + +# SPDX-License-Identifier: GPL-2.0 # Arnaldo Carvalho de Melo , 2017 . $(dirname $0)/lib/probe.sh diff --git a/tools/perf/tests/shell/record+probe_libc_inet_pton.sh b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh index 61c9f8fc6fa1..9b7632ff70aa 100755 --- a/tools/perf/tests/shell/record+probe_libc_inet_pton.sh +++ b/tools/perf/tests/shell/record+probe_libc_inet_pton.sh @@ -7,6 +7,7 @@ # This needs no debuginfo package, all is done using the libc ELF symtab # and the CFI info in the binaries. +# SPDX-License-Identifier: GPL-2.0 # Arnaldo Carvalho de Melo , 2017 . $(dirname $0)/lib/probe.sh diff --git a/tools/perf/tests/shell/record+script_probe_vfs_getname.sh b/tools/perf/tests/shell/record+script_probe_vfs_getname.sh index 9b073e7fa88c..54030c18bfc2 100755 --- a/tools/perf/tests/shell/record+script_probe_vfs_getname.sh +++ b/tools/perf/tests/shell/record+script_probe_vfs_getname.sh @@ -6,6 +6,7 @@ # checks that that was captured by the vfs_getname probe in the generated # perf.data file, with the temp file name as the pathname argument. +# SPDX-License-Identifier: GPL-2.0 # Arnaldo Carvalho de Melo , 2017 . $(dirname $0)/lib/probe.sh diff --git a/tools/perf/tests/shell/record+zstd_comp_decomp.sh b/tools/perf/tests/shell/record+zstd_comp_decomp.sh index 5dcba800109f..899604d17b85 100755 --- a/tools/perf/tests/shell/record+zstd_comp_decomp.sh +++ b/tools/perf/tests/shell/record+zstd_comp_decomp.sh @@ -1,6 +1,8 @@ #!/bin/sh # Zstd perf.data compression/decompression +# SPDX-License-Identifier: GPL-2.0 + trace_file=$(mktemp /tmp/perf.data.XXX) perf_tool=perf diff --git a/tools/perf/tests/shell/trace+probe_vfs_getname.sh b/tools/perf/tests/shell/trace+probe_vfs_getname.sh index 147efeb6b195..45d269b0157e 100755 --- a/tools/perf/tests/shell/trace+probe_vfs_getname.sh +++ b/tools/perf/tests/shell/trace+probe_vfs_getname.sh @@ -7,6 +7,7 @@ # that already handles "probe:vfs_getname" if present, and used in the # "open" syscall "filename" argument beautifier. +# SPDX-License-Identifier: GPL-2.0 # Arnaldo Carvalho de Melo , 2017 . $(dirname $0)/lib/probe.sh From 599ee18f0740d7661b8711249096db94c09bc508 Mon Sep 17 00:00:00 2001 From: John Garry Date: Fri, 14 Jun 2019 22:07:59 +0800 Subject: [PATCH 19/25] perf pmu: Fix uncore PMU alias list for ARM64 In commit 292c34c10249 ("perf pmu: Fix core PMU alias list for X86 platform"), we fixed the issue of CPU events being aliased to uncore events. Fix this same issue for ARM64, since the said commit left the (broken) behaviour untouched for ARM64. Signed-off-by: John Garry Cc: Alexander Shishkin Cc: Ben Hutchings Cc: Hendrik Brueckner Cc: Jiri Olsa Cc: Kan Liang Cc: Mark Rutland Cc: Mathieu Poirier Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Shaokun Zhang Cc: Thomas Richter Cc: Will Deacon Cc: linux-arm-kernel@lists.infradead.org Cc: linuxarm@huawei.com Cc: stable@vger.kernel.org Fixes: 292c34c10249 ("perf pmu: Fix core PMU alias list for X86 platform") Link: http://lkml.kernel.org/r/1560521283-73314-2-git-send-email-john.garry@huawei.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/pmu.c | 28 ++++++++++++---------------- 1 file changed, 12 insertions(+), 16 deletions(-) diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c index e0429f4ef335..faa8eb231e1b 100644 --- a/tools/perf/util/pmu.c +++ b/tools/perf/util/pmu.c @@ -709,9 +709,7 @@ static void pmu_add_cpu_aliases(struct list_head *head, struct perf_pmu *pmu) { int i; struct pmu_events_map *map; - struct pmu_event *pe; const char *name = pmu->name; - const char *pname; map = perf_pmu__find_map(pmu); if (!map) @@ -722,28 +720,26 @@ static void pmu_add_cpu_aliases(struct list_head *head, struct perf_pmu *pmu) */ i = 0; while (1) { + const char *cpu_name = is_arm_pmu_core(name) ? name : "cpu"; + struct pmu_event *pe = &map->table[i++]; + const char *pname = pe->pmu ? pe->pmu : cpu_name; - pe = &map->table[i++]; if (!pe->name) { if (pe->metric_group || pe->metric_name) continue; break; } - if (!is_arm_pmu_core(name)) { - pname = pe->pmu ? pe->pmu : "cpu"; + /* + * uncore alias may be from different PMU + * with common prefix + */ + if (pmu_is_uncore(name) && + !strncmp(pname, name, strlen(pname))) + goto new_alias; - /* - * uncore alias may be from different PMU - * with common prefix - */ - if (pmu_is_uncore(name) && - !strncmp(pname, name, strlen(pname))) - goto new_alias; - - if (strcmp(pname, name)) - continue; - } + if (strcmp(pname, name)) + continue; new_alias: /* need type casts to override 'const' */ From 016f327ce48f9b0b1cdea729ba7080596113563f Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Fri, 14 Jun 2019 16:50:19 -0300 Subject: [PATCH 20/25] perf trace: Fixup pointer arithmetic when consuming augmented syscall args MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit We can't just add the consumed bytes to the arg->augmented.args member, as it is not void *, so it will access (consumed * sizeof(struct augmented_arg)) in the next augmented arg, totally wrong, cast the member to void pointe before adding the number of bytes consumed, duh. With this and hardcoding handling the 'renameat' and 'renameat2' syscalls in the tools/perf/examples/bpf/augmented_raw_syscalls.c eBPF proggie, we get: mv/24388 renameat2(AT_FDCWD, "/tmp/build/perf/util/.bpf-event.o.tmp", AT_FDCWD, "/tmp/build/perf/util/.bpf-event.o.cmd", RENAME_NOREPLACE) = 0 mv/24394 renameat2(AT_FDCWD, "/tmp/build/perf/util/.perf-hooks.o.tmp", AT_FDCWD, "/tmp/build/perf/util/.perf-hooks.o.cmd", RENAME_NOREPLACE) = 0 mv/24398 renameat2(AT_FDCWD, "/tmp/build/perf/util/.pmu-bison.o.tmp", AT_FDCWD, "/tmp/build/perf/util/.pmu-bison.o.cmd", RENAME_NOREPLACE) = 0 mv/24401 renameat2(AT_FDCWD, "/tmp/build/perf/util/.expr-bison.o.tmp", AT_FDCWD, "/tmp/build/perf/util/.expr-bison.o.cmd", RENAME_NOREPLACE) = 0 mv/24406 renameat2(AT_FDCWD, "/tmp/build/perf/util/.pmu.o.tmp", AT_FDCWD, "/tmp/build/perf/util/.pmu.o.cmd", RENAME_NOREPLACE) = 0 mv/24407 renameat2(AT_FDCWD, "/tmp/build/perf/util/.pmu-flex.o.tmp", AT_FDCWD, "/tmp/build/perf/util/.pmu-flex.o.cmd", RENAME_NOREPLACE) = 0 mv/24416 renameat2(AT_FDCWD, "/tmp/build/perf/util/.parse-events-flex.o.tmp", AT_FDCWD, "/tmp/build/perf/util/.parse-events-flex.o.cmd", RENAME_NOREPLACE) = 0 I.e. it works with two string args in the same syscall. Now back to taming the verifier... Cc: Adrian Hunter Cc: Brendan Gregg Cc: Jiri Olsa Cc: Luis Cláudio Gonçalves Cc: Namhyung Kim Fixes: 8195168e8779 ("perf trace: Consume the augmented_raw_syscalls payload") Link: https://lkml.kernel.org/n/tip-n1w59lpxks6m1le7fpo6rmyw@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/builtin-trace.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c index d5b2e4d8bf41..f3532b081b31 100644 --- a/tools/perf/builtin-trace.c +++ b/tools/perf/builtin-trace.c @@ -1239,7 +1239,7 @@ static size_t syscall_arg__scnprintf_augmented_string(struct syscall_arg *arg, c */ int consumed = sizeof(*augmented_arg) + augmented_arg->size; - arg->augmented.args += consumed; + arg->augmented.args = ((void *)arg->augmented.args) + consumed; arg->augmented.size -= consumed; return printed; From fdbdd7e8580eac9bdafa532746c865644d125e34 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Mon, 17 Jun 2019 14:32:53 -0300 Subject: [PATCH 21/25] perf evsel: Make perf_evsel__name() accept a NULL argument In which case it simply returns "unknown", like when it can't figure out the evsel->name value. This makes this code more robust and fixes a problem in 'perf trace' where a NULL evsel was being passed to a routine that only used the evsel for printing its name when a invalid syscall id was passed. Reported-by: Leo Yan Cc: Adrian Hunter Cc: Jiri Olsa Cc: Namhyung Kim Link: https://lkml.kernel.org/n/tip-f30ztaasku3z935cn3ak3h53@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/evsel.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index 0f506f10ecf0..04c4ed1573cb 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -589,6 +589,9 @@ const char *perf_evsel__name(struct perf_evsel *evsel) { char bf[128]; + if (!evsel) + goto out_unknown; + if (evsel->name) return evsel->name; @@ -628,7 +631,10 @@ const char *perf_evsel__name(struct perf_evsel *evsel) evsel->name = strdup(bf); - return evsel->name ?: "unknown"; + if (evsel->name) + return evsel->name; +out_unknown: + return "unknown"; } const char *perf_evsel__group_name(struct perf_evsel *evsel) From 1955c8cf5e26b1f70d674190ff9984dbfd531ee9 Mon Sep 17 00:00:00 2001 From: Florian Fainelli Date: Fri, 14 Jun 2019 11:39:47 -0700 Subject: [PATCH 22/25] perf tools: Don't hardcode host include path for libslang Hardcoding /usr/include/slang is fundamentally incompatible with cross compilation and will lead to the inability for a cross-compiled environment to properly detect whether slang is available or not. If /usr/include/slang is necessary that is a distribution specific knowledge that could be solved with either a standard pkg-config .pc file (which slang has) or simply overriding CFLAGS accordingly, but the default perf Makefile should be clean of all of that. Signed-off-by: Florian Fainelli Cc: Alexander Shishkin Cc: Alexey Budankov Cc: bcm-kernel-feedback-list@broadcom.com Cc: Jakub Kicinski Cc: Jiri Olsa Cc: Namhyung Kim Cc: Peter Zijlstra Cc: Quentin Monnet Cc: Stanislav Fomichev Fixes: ef7b93a11904 ("perf report: Librarize the annotation code and use it in the newt browser") Link: http://lkml.kernel.org/r/20190614183949.5588-1-f.fainelli@gmail.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/build/feature/Makefile | 2 +- tools/perf/Makefile.config | 1 - 2 files changed, 1 insertion(+), 2 deletions(-) diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile index 523ee42db0c8..7ef7cf04a292 100644 --- a/tools/build/feature/Makefile +++ b/tools/build/feature/Makefile @@ -182,7 +182,7 @@ $(OUTPUT)test-libaudit.bin: $(BUILD) -laudit $(OUTPUT)test-libslang.bin: - $(BUILD) -I/usr/include/slang -lslang + $(BUILD) -lslang $(OUTPUT)test-libcrypto.bin: $(BUILD) -lcrypto diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index 5f16a20cae86..e04b7a81d221 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -648,7 +648,6 @@ ifndef NO_SLANG NO_SLANG := 1 else # Fedora has /usr/include/slang/slang.h, but ubuntu /usr/include/slang.h - CFLAGS += -I/usr/include/slang CFLAGS += -DHAVE_SLANG_SUPPORT EXTLIBS += -lslang $(call detected,CONFIG_SLANG) From cbefd24f0aee3a5d787a013f207f6fd31d3c76d2 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Tue, 18 Jun 2019 17:43:35 -0300 Subject: [PATCH 23/25] tools build: Add test to check if slang.h is in /usr/include/slang/ A few odd old distros (rhel5, 6, yeah, lots of those out in use, in many cases we want to use upstream perf on it) have the slang header files in /usr/include/slang/, so add a test that will be performed only when test-all.c (the one with the most common sane settings) fails, either because we're in one of these odd distros with slang/slang.h or because something else failed (say libelf is not present). So for the common case nothing changes, no additional test is performed. Next step is to check in perf the result of these tests. Cc: Adrian Hunter Cc: Florian Fainelli Cc: Jiri Olsa Cc: Namhyung Kim Fixes: 1955c8cf5e26 ("perf tools: Don't hardcode host include path for libslang") Link: https://lkml.kernel.org/n/tip-2sy7hbwkx68jr6n97qxgg0c6@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/build/Makefile.feature | 2 +- tools/build/feature/Makefile | 4 ++++ tools/build/feature/test-libslang-include-subdir.c | 7 +++++++ 3 files changed, 12 insertions(+), 1 deletion(-) create mode 100644 tools/build/feature/test-libslang-include-subdir.c diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature index 50377cc2f5f9..86b793dffbc4 100644 --- a/tools/build/Makefile.feature +++ b/tools/build/Makefile.feature @@ -53,6 +53,7 @@ FEATURE_TESTS_BASIC := \ libpython \ libpython-version \ libslang \ + libslang-include-subdir \ libcrypto \ libunwind \ pthread-attr-setaffinity-np \ @@ -114,7 +115,6 @@ FEATURE_DISPLAY ?= \ numa_num_possible_cpus \ libperl \ libpython \ - libslang \ libcrypto \ libunwind \ libdw-dwarf-unwind \ diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile index 7ef7cf04a292..0658b8cd0e53 100644 --- a/tools/build/feature/Makefile +++ b/tools/build/feature/Makefile @@ -31,6 +31,7 @@ FILES= \ test-libpython.bin \ test-libpython-version.bin \ test-libslang.bin \ + test-libslang-include-subdir.bin \ test-libcrypto.bin \ test-libunwind.bin \ test-libunwind-debug-frame.bin \ @@ -184,6 +185,9 @@ $(OUTPUT)test-libaudit.bin: $(OUTPUT)test-libslang.bin: $(BUILD) -lslang +$(OUTPUT)test-libslang-include-subdir.bin: + $(BUILD) -lslang + $(OUTPUT)test-libcrypto.bin: $(BUILD) -lcrypto diff --git a/tools/build/feature/test-libslang-include-subdir.c b/tools/build/feature/test-libslang-include-subdir.c new file mode 100644 index 000000000000..3ea47ec7590e --- /dev/null +++ b/tools/build/feature/test-libslang-include-subdir.c @@ -0,0 +1,7 @@ +// SPDX-License-Identifier: GPL-2.0 +#include + +int main(void) +{ + return SLsmg_init_smg(); +} From 78d6ccce03e86de34c7000bcada493ed0679e350 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Tue, 18 Jun 2019 17:48:12 -0300 Subject: [PATCH 24/25] perf build: Handle slang being in /usr/include and in /usr/include/slang/ In some distros slang.h may be in a /usr/include 'slang' subdir, so use the if slang is not explicitely disabled (by using NO_SLANG=1) and its feature test for the common case (having /usr/include/slang.h) failed, use the results for the test that checks if it is in slang/slang.h. Change the only file in perf that includes slang.h to use HAVE_SLANG_INCLUDE_SUBDIR and forget about this for good. On a rhel6 system now we have: $ /tmp/build/perf/perf -vv | grep slang libslang: [ on ] # HAVE_SLANG_SUPPORT $ ldd /tmp/build/perf/perf | grep libslang libslang.so.2 => /usr/lib64/libslang.so.2 (0x00007fa2d5a8d000) $ grep slang /tmp/build/perf/FEATURE-DUMP feature-libslang=0 feature-libslang-include-subdir=1 $ cat /etc/redhat-release CentOS release 6.10 (Final) $ While on fedora:29: $ /tmp/build/perf/perf -vv | grep slang libslang: [ on ] # HAVE_SLANG_SUPPORT $ ldd /tmp/build/perf/perf | grep slang libslang.so.2 => /lib64/libslang.so.2 (0x00007f8eb11a7000) $ grep slang /tmp/build/perf/FEATURE-DUMP feature-libslang=1 feature-libslang-include-subdir=1 $ $ cat /etc/fedora-release Fedora release 29 (Twenty Nine) $ The feature-libslang-include-subdir=1 line is because the 'gettid()' test was added to test-all.c as the new glibc has an implementation for that, so we soon should have it not failing, i.e. should be the common case soon. Perhaps I should move it out till it becomes the norm... Cc: Adrian Hunter Cc: Florian Fainelli Cc: Jiri Olsa Cc: Namhyung Kim Fixes: 1955c8cf5e26 ("perf tools: Don't hardcode host include path for libslang") Link: https://lkml.kernel.org/n/tip-bkgtpsu3uit821fuwsdhj9gd@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/Makefile.config | 11 ++++++++--- tools/perf/ui/libslang.h | 5 +++++ 2 files changed, 13 insertions(+), 3 deletions(-) diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index e04b7a81d221..89ac5a1f1550 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -644,9 +644,14 @@ endif ifndef NO_SLANG ifneq ($(feature-libslang), 1) - msg := $(warning slang not found, disables TUI support. Please install slang-devel, libslang-dev or libslang2-dev); - NO_SLANG := 1 - else + ifneq ($(feature-libslang-include-subdir), 1) + msg := $(warning slang not found, disables TUI support. Please install slang-devel, libslang-dev or libslang2-dev); + NO_SLANG := 1 + else + CFLAGS += -DHAVE_SLANG_INCLUDE_SUBDIR + endif + endif + ifndef NO_SLANG # Fedora has /usr/include/slang/slang.h, but ubuntu /usr/include/slang.h CFLAGS += -DHAVE_SLANG_SUPPORT EXTLIBS += -lslang diff --git a/tools/perf/ui/libslang.h b/tools/perf/ui/libslang.h index c0686cda39a5..991e692b9b46 100644 --- a/tools/perf/ui/libslang.h +++ b/tools/perf/ui/libslang.h @@ -10,7 +10,12 @@ #ifndef HAVE_LONG_LONG #define HAVE_LONG_LONG __GLIBC_HAVE_LONG_LONG #endif + +#ifdef HAVE_SLANG_INCLUDE_SUBDIR +#include +#else #include +#endif #if SLANG_VERSION < 20104 #define slsmg_printf(msg, args...) \ From 3469fa84c1631face938efc42b3f488a2c2504e0 Mon Sep 17 00:00:00 2001 From: Arnaldo Carvalho de Melo Date: Tue, 18 Jun 2019 17:59:16 -0300 Subject: [PATCH 25/25] tools build: Fix the zstd test in the test-all.c common case feature test MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit We were renanimg 'main' to 'main_zstd' but then using 'main_libzstd();' in the main() for test-all.c, causing this: $ cat /tmp/build/perf/feature/test-all.make.output test-all.c: In function ‘main’: test-all.c:236:2: error: implicit declaration of function ‘main_test_libzstd’; did you mean ‘main_test_zstd’? [-Werror=implicit-function-declaration] main_test_libzstd(); ^~~~~~~~~~~~~~~~~ main_test_zstd cc1: all warnings being treated as errors $ I.e. what was supposed to be the fast path feature test was _always_ failing, duh, fix it. Cc: Adrian Hunter Cc: Jiri Olsa Cc: Namhyung Kim Cc: Alexey Budankov Fixes: 3b1c5d965971 ("tools build: Implement libzstd feature check, LIBZSTD_DIR and NO_LIBZSTD defines") Link: https://lkml.kernel.org/n/tip-ma4abk0utroiw4mwpmvnjlru@git.kernel.org Signed-off-by: Arnaldo Carvalho de Melo --- tools/build/feature/test-all.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c index 3b3d5d72124a..88145e8cde1a 100644 --- a/tools/build/feature/test-all.c +++ b/tools/build/feature/test-all.c @@ -186,7 +186,7 @@ # include "test-disassembler-four-args.c" #undef main -#define main main_test_zstd +#define main main_test_libzstd # include "test-libzstd.c" #undef main