Commit Graph

30530 Commits

Author SHA1 Message Date
Adrian Hunter
a5367ecb53 perf tools: Automatically use guest kcore_dir if present
When registering a guest machine using machine_pid from the id index,
check perf.data for a matching kcore_dir subdirectory and set the
kallsyms file name accordingly. If set, use it to find the machine's
kernel symbols and object code (from kcore).

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-23-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:37 -03:00
Adrian Hunter
65691e9ff0 perf tools: Make has_kcore_dir() work also for guest kcore_dir
Copies of /proc/kallsyms, /proc/modules and an extract of /proc/kcore can
be stored in the perf.data output directory under the subdirectory named
kcore_dir. Guest machines will have their files also under subdirectories
beginning kcore_dir__ followed by the machine pid. Make has_kcore_dir()
return true also if there is a guest machine kcore_dir.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-22-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:34 -03:00
Adrian Hunter
386e0d83d3 perf tools: Remove also guest kcore_dir with host kcore_dir
Copies of /proc/kallsyms, /proc/modules and an extract of /proc/kcore can
be stored in the perf.data output directory under the subdirectory named
kcore_dir. Guest machines will have their files also under subdirectories
beginning kcore_dir__ followed by the machine pid. Remove these also when
removing kcore_dir.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-21-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:30 -03:00
Adrian Hunter
13a133b255 perf script python: intel-pt-events: Add machine_pid and vcpu
Add machine_pid and vcpu to the intel-pt-events.py script.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-20-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:25 -03:00
Adrian Hunter
6de306b7a5 perf script python: Add machine_pid and vcpu
Add machine_pid and vcpu to python sample events and context switch events.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-19-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:21 -03:00
Adrian Hunter
7151c1d178 perf auxtrace: Add machine_pid and vcpu to auxtrace_error
Add machine_pid and vcpu to struct perf_record_auxtrace_error. The existing
fmt member is used to identify the new format.

The new members make it possible to easily differentiate errors from guest
machines.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-18-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:17 -03:00
Adrian Hunter
2273e46b98 perf dlfilter: Add machine_pid and vcpu
Add machine_pid and vcpu to struct perf_dlfilter_sample. The 'size' can be
used to determine if the values are present, however machine_pid is zero if
unused in any case. vcpu should be ignored if machine_pid is zero.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-17-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:13 -03:00
Adrian Hunter
e28fb159f1 perf script: Add machine_pid and vcpu
Add fields machine_pid and vcpu. These are displayed only if machine_pid is
non-zero.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:13 -03:00
Adrian Hunter
6350490995 perf session: Use sample->machine_pid to find guest machine
If machine_pid is set, use it to find the guest machine.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-15-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:08 -03:00
Adrian Hunter
3461b65da7 perf tools: Add machine_pid and vcpu to perf_sample
When parsing a sample with a sample ID, copy machine_pid and vcpu from
perf_sample_id to perf_sample.

Note, machine_pid will be zero when unused, so only a non-zero value
represents a guest machine. vcpu should be ignored if machine_pid is zero.

Note also, machine_pid is used with events that have come from injecting a
guest perf.data file, however guest events recorded on the host (i.e. using
perf kvm) have the (QEMU) hypervisor process pid to identify them - refer
machines__find_for_cpumode().

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:04 -03:00
Adrian Hunter
797efbc523 perf tools: Add guest_cpu to hypervisor threads
It is possible to know which guest machine was running at a point in time
based on the PID of the currently running host thread. That is, perf
identifies guest machines by the PID of the hypervisor.

To determine the guest CPU, put it on the hypervisor (QEMU) thread for
that VCPU.

This is done when processing the id_index which provides the necessary
information.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-13-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:08:04 -03:00
Adrian Hunter
ff7a78c210 perf session: Create guest machines from id_index
Now that id_index has machine_pid, use it to create guest machines.
Create the guest machines with an idle thread because guest events
for "swapper" will be possible.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:58 -03:00
Adrian Hunter
b47bb18661 perf tools: Add machine_pid and vcpu to id_index
When injecting events from a guest perf.data file, the events will have
separate sample ID numbers. These ID numbers can then be used to determine
which machine an event belongs to. To facilitate that, add machine_pid and
vcpu to id_index records. For backward compatibility, these are added at
the end of the record, and the length of the record is used to determine
if they are present or not.

Note, this is needed because the events from a guest perf.data file contain
the pid/tid of the process running at that time inside the VM not the
pid/tid of the (QEMU) hypervisor thread. So a way is needed to relate
guest events back to the guest machine and VCPU, and using sample ID
numbers for that is relatively simple and convenient.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:58 -03:00
Adrian Hunter
c1fd5b7d8a perf buildid-cache: Do not require purge files to also be in the file system
realname() returns NULL if the file is not in the file system, but we can
still remove it from the build ID cache in that case, so continue and
attempt the purge with the name provided.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:58 -03:00
Adrian Hunter
15fe03621d perf buildid-cache: Add guestmount'd files to the build ID cache
When the guestmount option is used, a guest machine's file system mount
point is recorded in machine->root_dir.

perf already iterates guest machines when adding files to the build ID
cache, but does not take machine->root_dir into account.

Use machine->root_dir to find files for guest build IDs, and add them to
the build ID cache using the "proper" name i.e. relative to the guest root
directory not the host root directory.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:53 -03:00
Adrian Hunter
57190e38b0 perf script: Add --dump-unsorted-raw-trace option
When reviewing the results of perf inject, it is useful to be able to see
the events in the order they appear in the file.

So add --dump-unsorted-raw-trace option to do an unsorted dump.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:48 -03:00
Adrian Hunter
1ee94463e9 perf tools: Add perf_event__synthesize_id_sample()
Add perf_event__synthesize_id_sample() to enable the synthesis of
ID samples.

This is needed by perf inject. When injecting events from a guest perf.data
file, there is a possibility that the sample ID numbers conflict. In that
case, perf_event__synthesize_id_sample() can be used to re-write the ID
sample.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:48 -03:00
Adrian Hunter
0a64de04c9 perf tools: Factor out evsel__id_hdr_size()
Factor out evsel__id_hdr_size() so it can be reused.

This is needed by perf inject. When injecting events from a guest perf.data
file, there is a possibility that the sample ID numbers conflict. To
re-write an ID sample, the old one needs to be removed first, which means
determining how big it is with evsel__id_hdr_size() and then subtracting
that from the event size.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:48 -03:00
Adrian Hunter
eddc6e3f66 perf tools: Export perf_event__process_finished_round()
Export perf_event__process_finished_round() so it can be used elsewhere.

This is needed in perf inject to obey finished-round ordering.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:37 -03:00
Adrian Hunter
f8bcf1e223 perf ordered_events: Add ordered_events__last_flush_time()
Allow callers to get the ordered_events last flush timestamp.

This is needed in perf inject to obey finished-round ordering when
injecting additional events (e.g. from a guest perf.data file) with
timestamps. Any additional events that have timestamps before the last
flush time must be injected before the corresponding FINISHED_ROUND event.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:30 -03:00
Adrian Hunter
163dac34d7 perf tools: Export dsos__for_each_with_build_id()
Export dsos__for_each_with_build_id() so it can be used elsewhere.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: kvm@vger.kernel.org
Link: https://lore.kernel.org/r/20220711093218.10967-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-20 11:07:02 -03:00
Adrian Hunter
68566a7cf5 perf tools: Fix dso_id inode generation comparison
Synthesized MMAP events have zero ino_generation, so do not compare
them to DSOs with a real ino_generation otherwise we end up with a DSO
without a build id.

Fixes: 0e3149f86b ("perf dso: Move dso_id from 'struct map' to 'struct dso'")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: kvm@vger.kernel.org
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220711093218.10967-2-adrian.hunter@intel.com
[ Added clarification to the comment from Ian + more detailed explanation from Adrian ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-19 16:19:00 -03:00
Blake Jones
a6bd98c45d perf buildid-list: Add a "-m" option to show kernel and modules build-ids
This new option displays all of the information needed to do external
BuildID-based symbolization of kernel stack traces, such as those collected
by bpf_get_stackid().

For each kernel module plus the main kernel, it displays the BuildID,
the start and end virtual addresses of that module's text range (rounded
out to page boundaries), and the pathname of the module.

When run as a non-privileged user, the actual addresses of the modules'
text ranges are not available, so the tools displays "0, <text length>" for
kernel modules and "0, 0xffffffffffffffff" for the kernel itself.

Sample output:

  root# perf buildid-list -m
  cf6df852fd4da122d616153353cc8f560fd12fe0 ffffffffa5400000 ffffffffa6001e27 [kernel.kallsyms]
  1aa7209aa2acb067d66ed6cf7676d65066384d61 ffffffffc0087000 ffffffffc008b000 /lib/modules/5.15.15-1rodete2-amd64/kernel/crypto/sha512_generic.ko
  3857815b5bf0183697b68f8fe0ea06121644041e ffffffffc008c000 ffffffffc0098000 /lib/modules/5.15.15-1rodete2-amd64/kernel/arch/x86/crypto/sha512-ssse3.ko
  4081fde0bca2bc097cb3e9d1efcb836047d485f1 ffffffffc0099000 ffffffffc009f000 /lib/modules/5.15.15-1rodete2-amd64/kernel/drivers/acpi/button.ko
  1ef81ba4890552ea6b0314f9635fc43fc8cef568 ffffffffc00a4000 ffffffffc00aa000 /lib/modules/5.15.15-1rodete2-amd64/kernel/crypto/cryptd.ko
  cc5c985506cb240d7d082b55ed260cbb851f983e ffffffffc00af000 ffffffffc00b6000 /lib/modules/5.15.15-1rodete2-amd64/kernel/drivers/i2c/busses/i2c-piix4.ko
  [...]

Committer notes:

u64 formatter should be PRIx64 for printing as hex numbers, fix this:

  28     5.28 debian:experimental-x-mips    : FAIL gcc version 11.2.0 (Debian 11.2.0-18)
    builtin-buildid-list.c: In function 'buildid__map_cb':
    builtin-buildid-list.c:32:24: error: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=]
       32 |         printf("%s %16lx %16lx", bid_buf, map->start, map->end);
          |                    ~~~~^                  ~~~~~~~~~~
          |                        |                     |
          |                        long unsigned int     u64 {aka long long unsigned int}
          |                    %16llx
    builtin-buildid-list.c:32:30: error: format '%lx' expects argument of type 'long unsigned int', but argument 4 has type 'u64' {aka 'long long unsigned int'} [-Werror=format=]
       32 |         printf("%s %16lx %16lx", bid_buf, map->start, map->end);
          |                          ~~~~^                        ~~~~~~~~
          |                              |                           |
          |                              long unsigned int           u64 {aka long long unsigned int}
          |                          %16llx
    cc1: all warnings being treated as errors

Signed-off-by: Blake Jones <blakejones@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220629213632.3899212-1-blakejones@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-18 16:35:34 -03:00
Arnaldo Carvalho de Melo
0698461ad2 Merge remote-tracking branch 'torvalds/master' into perf/core
To update the perf/core codebase.

Fix conflict by moving arch__post_evsel_config(evsel, attr) to the end
of evsel__config(), after what was added in:

  49c692b7df ("perf offcpu: Accept allowed sample types only")

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-18 10:36:11 -03:00
Naveen N. Rao
4b335e1e0d perf trace: Fix SIGSEGV when processing syscall args
On powerpc, 'perf trace' is crashing with a SIGSEGV when trying to
process a perf.data file created with 'perf trace record -p':

  #0  0x00000001225b8988 in syscall_arg__scnprintf_augmented_string <snip> at builtin-trace.c:1492
  #1  syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1492
  #2  syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1486
  #3  0x00000001225bdd9c in syscall_arg_fmt__scnprintf_val <snip> at builtin-trace.c:1973
  #4  syscall__scnprintf_args <snip> at builtin-trace.c:2041
  #5  0x00000001225bff04 in trace__sys_enter <snip> at builtin-trace.c:2319

That points to the below code in tools/perf/builtin-trace.c:
	/*
	 * If this is raw_syscalls.sys_enter, then it always comes with the 6 possible
	 * arguments, even if the syscall being handled, say "openat", uses only 4 arguments
	 * this breaks syscall__augmented_args() check for augmented args, as we calculate
	 * syscall->args_size using each syscalls:sys_enter_NAME tracefs format file,
	 * so when handling, say the openat syscall, we end up getting 6 args for the
	 * raw_syscalls:sys_enter event, when we expected just 4, we end up mistakenly
	 * thinking that the extra 2 u64 args are the augmented filename, so just check
	 * here and avoid using augmented syscalls when the evsel is the raw_syscalls one.
	 */
	if (evsel != trace->syscalls.events.sys_enter)
		augmented_args = syscall__augmented_args(sc, sample, &augmented_args_size, trace->raw_augmented_syscalls_args_size);

As the comment points out, we should not be trying to augment the args
for raw_syscalls. However, when processing a perf.data file, we are not
initializing those properly. Fix the same.

Reported-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220707090900.572584-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17 10:59:52 -03:00
Adrian Hunter
deb44a6249 perf tests: Fix Convert perf time to TSC test for hybrid
The test does not always correctly determine the number of events for
hybrids, nor allow for more than 1 evsel when parsing.

Fix by iterating the events actually created and getting the correct
evsel for the events processed.

Fixes: d9da6f70eb ("perf tests: Support 'Convert perf time to TSC' test for hybrid")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20220713123459.24145-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17 10:57:07 -03:00
Adrian Hunter
498c7a54f1 perf tests: Stop Convert perf time to TSC test opening events twice
Do not call evlist__open() twice.

Fixes: 5bb017d4b9 ("perf test: Fix error message for test case 71 on s390, where it is not supported")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20220713123459.24145-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17 10:56:58 -03:00
Arnaldo Carvalho de Melo
91d248c3b9 tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes from these csets:

  4ad3278df6 ("x86/speculation: Disable RRSBA behavior")
  d7caac991f ("x86/cpu/amd: Add Spectral Chicken")

That cause no changes to tooling:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  $

Just silences this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/YtQTm9wsB3hxQWvy@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17 10:50:50 -03:00
Arnaldo Carvalho de Melo
f098addbdb tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  f43b9876e8 ("x86/retbleed: Add fine grained Kconfig knobs")
  a149180fbc ("x86: Add magic AMD return-thunk")
  15e67227c4 ("x86: Undo return-thunk damage")
  369ae6ffc4 ("x86/retpoline: Cleanup some #ifdefery")
  4ad3278df6 x86/speculation: Disable RRSBA behavior
  26aae8ccbc x86/cpu/amd: Enumerate BTC_NO
  9756bba284 x86/speculation: Fill RSB on vmexit for IBRS
  3ebc170068 x86/bugs: Add retbleed=ibpb
  2dbb887e87 x86/entry: Add kernel IBRS implementation
  6b80b59b35 x86/bugs: Report AMD retbleed vulnerability
  a149180fbc x86: Add magic AMD return-thunk
  15e67227c4 x86: Undo return-thunk damage
  a883d624ae x86/cpufeatures: Move RETPOLINE flags to word 11
  5180218615 x86/speculation/mmio: Enumerate Processor MMIO Stale Data bug

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h
  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/disabled-features.h' differs from latest version at 'arch/x86/include/asm/disabled-features.h'
  diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org
Link: https://lore.kernel.org/lkml/YtQM40VmiLTkPND2@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17 10:49:14 -03:00
Arnaldo Carvalho de Melo
eee51fe38e tools headers UAPI: Sync linux/kvm.h with the kernel sources
To pick the changes in:

  1b870fa557 ("kvm: stats: tell userspace which values are boolean")

That just rebuilds perf, as these patches don't add any new KVM ioctl to
be harvested for the the 'perf trace' ioctl syscall argument
beautifiers.

This is also by now used by tools/testing/selftests/kvm/, a simple test
build succeeded.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lore.kernel.org/lkml/YtQLDvQrBhJNl3n5@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17 10:14:07 -03:00
Linus Torvalds
9bd572ec7a Merge tag 'net-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, bpf and wireless.

  Still no major regressions, the release continues to be calm. An
  uptick of fixes this time around due to trivial data race fixes and
  patches flowing down from subtrees.

  There has been a few driver fixes (particularly a few fixes for false
  positives due to 66e4c8d950 which went into -next in May!) that make
  me worry the wide testing is not exactly fully through.

  So "calm" but not "let's just cut the final ASAP" vibes over here.

  Current release - regressions:

   - wifi: rtw88: fix write to const table of channel parameters

  Current release - new code bugs:

   - mac80211: add gfp_t arg to ieeee80211_obss_color_collision_notify

   - mlx5:
      - TC, allow offload from uplink to other PF's VF
      - Lag, decouple FDB selection and shared FDB
      - Lag, correct get the port select mode str

   - bnxt_en: fix and simplify XDP transmit path

   - r8152: fix accessing unset transport header

  Previous releases - regressions:

   - conntrack: fix crash due to confirmed bit load reordering (after
     atomic -> refcount conversion)

   - stmmac: dwc-qos: disable split header for Tegra194

  Previous releases - always broken:

   - mlx5e: ring the TX doorbell on DMA errors

   - bpf: make sure mac_header was set before using it

   - mac80211: do not wake queues on a vif that is being stopped

   - mac80211: fix queue selection for mesh/OCB interfaces

   - ip: fix dflt addr selection for connected nexthop

   - seg6: fix skb checksums for SRH encapsulation/insertion

   - xdp: fix spurious packet loss in generic XDP TX path

   - bunch of sysctl data race fixes

   - nf_log: incorrect offset to network header

  Misc:

   - bpf: add flags arg to bpf_dynptr_read and bpf_dynptr_write APIs"

* tag 'net-5.19-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (87 commits)
  nfp: flower: configure tunnel neighbour on cmsg rx
  net/tls: Check for errors in tls_device_init
  MAINTAINERS: Add an additional maintainer to the AMD XGBE driver
  xen/netback: avoid entering xenvif_rx_next_skb() with an empty rx queue
  selftests/net: test nexthop without gw
  ip: fix dflt addr selection for connected nexthop
  net: atlantic: remove aq_nic_deinit() when resume
  net: atlantic: remove deep parameter on suspend/resume functions
  sfc: fix kernel panic when creating VF
  seg6: bpf: fix skb checksum in bpf_push_seg6_encap()
  seg6: fix skb checksum in SRv6 End.B6 and End.B6.Encaps behaviors
  seg6: fix skb checksum evaluation in SRH encapsulation/insertion
  sfc: fix use after free when disabling sriov
  net: sunhme: output link status with a single print.
  r8152: fix accessing unset transport header
  net: stmmac: fix leaks in probe
  net: ftgmac100: Hold reference returned by of_get_child_by_name()
  nexthop: Fix data-races around nexthop_compat_mode.
  ipv4: Fix data-races around sysctl_ip_dynaddr.
  tcp: Fix a data-race around sysctl_tcp_ecn_fallback.
  ...
2022-07-14 12:48:07 -07:00
Nicolas Dichtel
cd72e61bad selftests/net: test nexthop without gw
This test implement the scenario described in the commit
"ip: fix dflt addr selection for connected nexthop".
The test configures a nexthop object with an output device only (no gateway
address) and a route that uses this nexthop. The goal is to check if the
kernel selects a valid source address.

Link: https://lore.kernel.org/netdev/20220712095545.10947-1-nicolas.dichtel@6wind.com/
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://lore.kernel.org/r/20220713114853.29406-2-nicolas.dichtel@6wind.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-14 14:41:19 +02:00
Namhyung Kim
0d2997f750 perf lock: Look up callchain for the contended locks
The lock contention tracepoints don't provide lock names.  All we can
do is to get stack traces and show the caller instead.  To minimize
the overhead it's limited to up to 8 stack traces and display the
first non-lock function symbol name as a caller.

  $ perf lock report -F acquired,contended,avg_wait,wait_total

                  Name   acquired  contended     avg wait    total wait

   update_blocked_a...         40         40      3.61 us     144.45 us
   kernfs_fop_open+...          5          5      3.64 us      18.18 us
    _nohz_idle_balance          3          3      2.65 us       7.95 us
   tick_do_update_j...          1          1      6.04 us       6.04 us
    ep_scan_ready_list          1          1      3.93 us       3.93 us
  ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220615163222.1275500-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-12 09:56:18 -03:00
Namhyung Kim
7cb2a53f7f perf record: Allow to specify max stack depth of fp callchain
Currently it has no interface to specify the max stack depth for perf
record.  Extend the command line parameter to accept a number after
'fp' to specify the depth like '--call-graph fp,32'.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220615163222.1275500-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-12 09:56:05 -03:00
Namhyung Kim
3ae03f2650 perf lock: Handle lock contention tracepoints
When the lock contention events are used, there's no tracking of
acquire and release.  So the state machine is simplified to use
UNINITIALIZED -> CONTENDED -> ACQUIRED only.

Note that CONTENDED state is re-entrant since mutex locks can hit two
or more consecutive contention_begin events for optimistic spinning
and sleep.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220615163222.1275500-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-12 09:55:37 -03:00
Namhyung Kim
166a9764a3 perf lock: Add lock contention tracepoints record support
When LOCKDEP and LOCK_STAT events are not available, it falls back to
record the new lock contention tracepoints.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220615163222.1275500-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-12 09:55:22 -03:00
Namhyung Kim
9565c9186d perf lock: Skip print_bad_events() if nothing bad
The debug output is meaningful when there are bad lock sequences.
Skip it unless there's one or -v option is given.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220615163222.1275500-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-12 09:55:04 -03:00
Namhyung Kim
309e133dfe perf lock: Allow to use different kernel symbols
Add --vmlinux and --kallsyms options to support data file from
different kernels.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220615163222.1275500-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-12 09:54:42 -03:00
Namhyung Kim
ab0101768f perf lock: Print wait times with unit
Currently it only prints the time in nsec but it's a bit hard to read
and takes longer in the screen.  We can change it to use different units
and keep the number small to save the space.

Before:

  $ perf lock report

                Name   acquired  contended   avg wait (ns) total wait (ns)   max wait (ns)   min wait (ns)

        jiffies_lock        433         32            2778           88908           13570             692
   &lruvec->lru_lock        747          5           11254           56272           18317            1412
      slock-AF_INET6          7          1           23543           23543           23543           23543
    &newf->file_lock        706         15            1025           15388            2279             618
      slock-AF_INET6          8          1           10379           10379           10379           10379
         &rq->__lock       2143          5            2037           10185            3462             939

After:
                Name   acquired  contended     avg wait   total wait     max wait     min wait

        jiffies_lock        433         32      2.78 us     88.91 us     13.57 us       692 ns
   &lruvec->lru_lock        747          5     11.25 us     56.27 us     18.32 us      1.41 us
      slock-AF_INET6          7          1     23.54 us     23.54 us     23.54 us     23.54 us
    &newf->file_lock        706         15      1.02 us     15.39 us      2.28 us       618 ns
      slock-AF_INET6          8          1     10.38 us     10.38 us     10.38 us     10.38 us
         &rq->__lock       2143          5      2.04 us     10.19 us      3.46 us       939 ns

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220615163222.1275500-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-12 09:54:14 -03:00
German Gomez
b55878c90a perf test: Add test for branch stack sampling
Add a self test for branch stack sampling, to check that we get the
expected branch types, and filters behave as expected.

Suggested-by: Anshuman Khandual <anshuman.khandual@arm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20220705150511.473919-2-german.gomez@arm.com
Tested-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-12 08:55:36 -03:00
Ian Rogers
29d97deed6 perf test: Make all metrics test more tolerant
Metric names are truncated so don't try to match all of one.

Allow AMX metrics to skip as floating point ones do.

Metrics for optane memory can also skip rather than fail.

Add a system wide check for uncore metrics.

Restructure code to avoid extensive nesting.

Some impetus for this in:

https://lore.kernel.org/lkml/d32376b5-5538-ff00-6620-e74ad4b4abf2@huawei.com/

Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220707153449.202409-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-12 08:54:47 -03:00
Ian Rogers
c7a774d781 perf test: Add debug line to diagnose broken metrics
Printing out the metric name and architecture makes finding the source
of a failure easier.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220707153449.202409-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-12 08:48:18 -03:00
Jakub Kicinski
4a46de446d selftest: net: add tun to .gitignore
Add missing .gitignore entry.

Fixes: 839b92fede ("selftest: tun: add test for NAPI dismantle")
Link: https://lore.kernel.org/r/20220709024141.321683-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-11 19:48:02 -07:00
Linus Torvalds
ce114c8668 Merge tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 retbleed fixes from Borislav Petkov:
 "Just when you thought that all the speculation bugs were addressed and
  solved and the nightmare is complete, here's the next one: speculating
  after RET instructions and leaking privileged information using the
  now pretty much classical covert channels.

  It is called RETBleed and the mitigation effort and controlling
  functionality has been modelled similar to what already existing
  mitigations provide"

* tag 'x86_bugs_retbleed' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (54 commits)
  x86/speculation: Disable RRSBA behavior
  x86/kexec: Disable RET on kexec
  x86/bugs: Do not enable IBPB-on-entry when IBPB is not supported
  x86/entry: Move PUSH_AND_CLEAR_REGS() back into error_entry
  x86/bugs: Add Cannon lake to RETBleed affected CPU list
  x86/retbleed: Add fine grained Kconfig knobs
  x86/cpu/amd: Enumerate BTC_NO
  x86/common: Stamp out the stepping madness
  KVM: VMX: Prevent RSB underflow before vmenter
  x86/speculation: Fill RSB on vmexit for IBRS
  KVM: VMX: Fix IBRS handling after vmexit
  KVM: VMX: Prevent guest RSB poisoning attacks with eIBRS
  KVM: VMX: Convert launched argument to flags
  KVM: VMX: Flatten __vmx_vcpu_run()
  objtool: Re-add UNWIND_HINT_{SAVE_RESTORE}
  x86/speculation: Remove x86_spec_ctrl_mask
  x86/speculation: Use cached host SPEC_CTRL value for guest entry/exit
  x86/speculation: Fix SPEC_CTRL write on SMT state change
  x86/speculation: Fix firmware entry SPEC_CTRL handling
  x86/speculation: Fix RSB filling with CONFIG_RETPOLINE=n
  ...
2022-07-11 18:15:25 -07:00
Matthieu Baerts
3ddabc4336 selftests: mptcp: validate userspace PM tests by default
The new script was not listed in the programs to test.

By consequence, some CIs running MPTCP selftests were not validating
these new tests. Note that MPTCP CI was validating it as it executes all
.sh scripts from 'tools/testing/selftests/net/mptcp' directory.

Fixes: 259a834fad ("selftests: mptcp: functional tests for the userspace PM type")
Reported-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-11 11:31:38 +01:00
Linus Torvalds
74a0032b85 Merge tag 'x86_urgent_for_v5.19_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:

 - Prepare for and clear .brk early in order to address XenPV guests
   failures where the hypervisor verifies page tables and uninitialized
   data in that range leads to bogus failures in those checks

 - Add any potential setup_data entries supplied at boot to the identity
   pagetable mappings to prevent kexec kernel boot failures. Usually,
   this is not a problem for the normal kernel as those mappings are
   part of the initially mapped 2M pages but if kexec gets to allocate
   the second kernel somewhere else, those setup_data entries need to be
   mapped there too.

 - Fix objtool not to discard text references from the __tracepoints
   section so that ENDBR validation still works

 - Correct the setup_data types limit as it is user-visible, before 5.19
   releases

* tag 'x86_urgent_for_v5.19_rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Fix the setup data types max limit
  x86/ibt, objtool: Don't discard text references from tracepoint section
  x86/compressed/64: Add identity mappings for setup_data entries
  x86: Fix .brk attribute in linker script
  x86: Clear .brk area at early boot
  x86/xen: Use clear_bss() for Xen PV guests
2022-07-10 08:43:52 -07:00
Pawan Gupta
4ad3278df6 x86/speculation: Disable RRSBA behavior
Some Intel processors may use alternate predictors for RETs on
RSB-underflow. This condition may be vulnerable to Branch History
Injection (BHI) and intramode-BTI.

Kernel earlier added spectre_v2 mitigation modes (eIBRS+Retpolines,
eIBRS+LFENCE, Retpolines) which protect indirect CALLs and JMPs against
such attacks. However, on RSB-underflow, RET target prediction may
fallback to alternate predictors. As a result, RET's predicted target
may get influenced by branch history.

A new MSR_IA32_SPEC_CTRL bit (RRSBA_DIS_S) controls this fallback
behavior when in kernel mode. When set, RETs will not take predictions
from alternate predictors, hence mitigating RETs as well. Support for
this is enumerated by CPUID.7.2.EDX[RRSBA_CTRL] (bit2).

For spectre v2 mitigation, when a user selects a mitigation that
protects indirect CALLs and JMPs against BHI and intramode-BTI, set
RRSBA_DIS_S also to protect RETs for RSB-underflow case.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
2022-07-09 13:12:45 +02:00
Martin Blumenstingl
cfbba7b46a selftests: forwarding: Install no_forwarding.sh
When using the Makefile from tools/testing/selftests/net/forwarding/
all tests should be installed. Add no_forwarding.sh to the list of
"to be installed tests" where it has been missing so far.

Fixes: 476a4f05d9 ("selftests: forwarding: add a no_forwarding.sh test")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-08 20:30:57 -07:00
Martin Blumenstingl
437ac2592c selftests: forwarding: Install local_termination.sh
When using the Makefile from tools/testing/selftests/net/forwarding/
all tests should be installed. Add local_termination.sh to the list of
"to be installed tests" where it has been missing so far.

Fixes: 90b9566aa5 ("selftests: forwarding: add a test for local_termination.sh")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-08 20:30:57 -07:00
Jakub Kicinski
7c895ef884 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:

====================
bpf 2022-07-08

We've added 3 non-merge commits during the last 2 day(s) which contain
a total of 7 files changed, 40 insertions(+), 24 deletions(-).

The main changes are:

1) Fix cBPF splat triggered by skb not having a mac header, from Eric Dumazet.

2) Fix spurious packet loss in generic XDP when pushing packets out (note
   that native XDP is not affected by the issue), from Johan Almbladh.

3) Fix bpf_dynptr_{read,write}() helper signatures with flag argument before
   its set in stone as UAPI, from Joanne Koong.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Add flags arg to bpf_dynptr_read and bpf_dynptr_write APIs
  bpf: Make sure mac_header was set before using it
  xdp: Fix spurious packet loss in generic XDP TX path
====================

Link: https://lore.kernel.org/r/20220708213418.19626-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-08 15:24:16 -07:00