linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-04-13 10:50:40 -04:00

Author	SHA1	Message	Date
Vitaly Kuznetsov	225b7c1117	KVM: selftests: Fix vmxon_pa == vmcs12_pa == -1ull nVMX testcase for !eVMCS The "vmxon_pa == vmcs12_pa == -1ull" test happens to work by accident: as Enlightened VMCS is always supported, set_default_vmx_state() adds 'KVM_STATE_NESTED_EVMCS' to 'flags' and the following branch of vmx_set_nested_state() is executed: if ((kvm_state->flags & KVM_STATE_NESTED_EVMCS) && (!guest_can_use(vcpu, X86_FEATURE_VMX) \|\| !vmx->nested.enlightened_vmcs_enabled)) return -EINVAL; as 'enlightened_vmcs_enabled' is false. In fact, "vmxon_pa == vmcs12_pa == -1ull" is a valid state when not tainted by wrong flags so the test should aim for this branch: if (kvm_state->hdr.vmx.vmxon_pa == INVALID_GPA) return 0; Test all this properly: - Without KVM_STATE_NESTED_EVMCS in the flags, the expected return value is '0'. - With KVM_STATE_NESTED_EVMCS flag (when supported) set, the expected return value is '-EINVAL' prior to enabling eVMCS and '0' after. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Tested-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20231205103630.1391318-11-vkuznets@redhat.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2023-12-07 09:34:40 -08:00
Vitaly Kuznetsov	6dac119518	KVM: selftests: Make Hyper-V tests explicitly require KVM Hyper-V support In preparation for conditional Hyper-V emulation enablement in KVM, make Hyper-V specific tests skip gracefully instead of failing when KVM support for emulating Hyper-V is not there. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Tested-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20231205103630.1391318-10-vkuznets@redhat.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2023-12-07 09:34:36 -08:00
Benjamin Tissoires	da2c1b8610	selftests/hid: fix failing tablet button tests An overlook from commit `74452d6329` ("selftests/hid: tablets: add variants of states with buttons"), where I don't use the Enum... Fixes: `74452d6329` ("selftests/hid: tablets: add variants of states with buttons") Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231207-b4-wip-selftests-v1-1-c4e13fe04a70@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 14:55:00 +01:00
Benjamin Tissoires	f556aa957d	selftests/hid: fix ruff linter complains rename ambiguous variables l, r, and m, and ignore the return values of uhdev.get_evdev() and uhdev.get_slot() Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-15-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 09:52:04 +01:00
Benjamin Tissoires	ed5bc56ced	selftests/hid: fix mypy complains No code change, only typing information added/ignored Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-14-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 09:52:04 +01:00
Benjamin Tissoires	ab9b82909e	selftests/hid: tablets: be stricter for some transitions To accommodate for legacy devices, we rely on the last state of a transition to be valid: for example when we test PEN_IS_OUT_OF_RANGE to PEN_IS_IN_CONTACT, any "normal" device that reports an InRange bit would insert a PEN_IS_IN_RANGE state between the 2. This is of course valid, but this solution prevents to detect false releases emitted by some firmware: when pressing an "eraser mode" button, they might send an extra PEN_IS_OUT_OF_RANGE that we may want to filter. So define 2 sets of transitions: one that is the ideal behavior, and one that is OK, it won't break user space, but we have serious doubts if we are doing the right thing. And depending on the test, either ask only for valid transitions, or tolerate weird ones. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-13-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 09:52:04 +01:00
Benjamin Tissoires	76df1f72fb	selftests/hid: tablets: add a secondary barrel switch test Some tablets report 2 barrel switches. We better test those too. Use the same transistions description from the primary button tests. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-12-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 09:52:04 +01:00
Benjamin Tissoires	1f01537ef1	selftests/hid: tablets: convert the primary button tests We get more descriptive in what we are doing, and also get more information of what is actually being tested. Instead of having a non exhaustive button changes that are semi-randomly done, we can describe all the states we want to test. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-11-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 09:52:04 +01:00
Benjamin Tissoires	74452d6329	selftests/hid: tablets: add variants of states with buttons Turns out that there are transitions that are unlikely to happen: for example, having both the tip switch and a button being changed at the same time (in the same report) would require either a very talented and precise user or a very bad hardware with a very low sampling rate. So instead of manually building the button test by hand and forgetting about some cases, let's reuse the state machine and transitions we have. This patch only adds the states and the valid transitions. The actual tests will be replaced later. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-10-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 09:52:04 +01:00
Benjamin Tissoires	83912f83fa	selftests/hid: tablets: define the elements of PenState This introduces a little bit more readability by not using the raw values but a dedicated Enum Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-9-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 09:52:04 +01:00
Benjamin Tissoires	e08e493ff9	selftests/hid: tablets: set initial data for tilt/twist Avoids getting a null event when these usages are set Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-8-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 09:52:04 +01:00
Benjamin Tissoires	d8d7aa2266	selftests/hid: tablets: do not set invert when the eraser is used Turns out that the chart from Microsoft is not exactly what I got here: when the rubber is used, and is touching the surface, invert can (should) be set to 0... [0] https://learn.microsoft.com/en-us/windows-hardware/design/component-guidelines/windows-pen-states Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-7-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 09:52:03 +01:00
Benjamin Tissoires	881ccc36b4	selftests/hid: tablets: move move_to function to PenDigitizer We can easily subclass PenDigitizer for introducing firmware bugs when subclassing Pen is harder. Move move_to from Pen to PenDigitizer so we get that ability Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-6-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 09:52:03 +01:00
Benjamin Tissoires	d52f52069f	selftests/hid: tablets: move the transitions to PenState Those transitions have nothing to do with `Pen`, so migrate them to `PenState`. The hidden agenda is to remove `Pen` and integrate it into `PenDigitizer` so that we can tweak the events in each state to emulate firmware bugs. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-5-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 09:52:03 +01:00
Benjamin Tissoires	b5edacf79c	selftests/hid: tablets: remove unused class Looks like this is a leftover Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-4-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 09:52:03 +01:00
Benjamin Tissoires	110292a77f	selftests/hid: base: allow for multiple skip_if_uhdev We can actually have multiple occurences of `skip_if_uhdev` if we follow the information from the pytest doc[0]. This is not immediately used, but can be if we need multiple conditions on a given test. [0] https://docs.pytest.org/en/latest/historical-notes.html#update-marker-code Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-3-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 09:52:03 +01:00
Benjamin Tissoires	46bc0277c2	selftests/hid: vmtest.sh: allow finer control on the build steps vmtest.sh works great for a one shot test, but not so much for CI where I want to build (with different configs) the bzImage in a separate job than the one I am running it. Add a "build_only" option to specify whether we need to boot the currently built kernel in the vm. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-2-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 09:52:03 +01:00
Benjamin Tissoires	887f8094b3	selftests/hid: vmtest.sh: update vm2c and container boot2container is now on an official project, so let's use that. The container image is now the same I use for the CI, so let's keep to it. Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net> Acked-by: Jiri Kosina <jkosina@suse.com> Link: https://lore.kernel.org/r/20231206-wip-selftests-v2-1-c0350c2f5986@kernel.org Signed-off-by: Benjamin Tissoires <bentiss@kernel.org>	2023-12-07 09:52:03 +01:00
Andrew Morton	0c92218f4e	Merge branch 'master' into mm-hotfixes-stable	2023-12-06 17:03:50 -08:00
Nico Pache	f39fb633fe	selftests/mm: prevent duplicate runs caused by TEST_GEN_PROGS Commit `05f1edac80` ("selftests/mm: run all tests from run_vmtests.sh") fixed the inconsistency caused by tests being defined as TEST_GEN_PROGS. This issue was leading to tests not being executed via run_vmtests.sh and furthermore some tests running twice due to the kselftests wrapper also executing them. Fix the definition of two tests (soft-dirty and pagemap_ioctl) that are still incorrectly defined. Link: https://lkml.kernel.org/r/20231120222908.28559-1-npache@redhat.com Signed-off-by: Nico Pache <npache@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Joel Savitz <jsavitz@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-12-06 16:12:47 -08:00
Peter Xu	3f3cac5c0a	mm/selftests: fix pagemap_ioctl memory map test __FILE__ is not guaranteed to exist in current dir. Replace that with argv[0] for memory map test. Link: https://lkml.kernel.org/r/20231116201547.536857-4-peterx@redhat.com Fixes: `46fd75d4a3` ("selftests: mm: add pagemap ioctl tests") Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Cc: Andrei Vagin <avagin@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-12-06 16:12:45 -08:00
Andrii Nakryiko	7065eefb38	bpf: rename MAX_BPF_LINK_TYPE into __MAX_BPF_LINK_TYPE for consistency To stay consistent with the naming pattern used for similar cases in BPF UAPI (__MAX_BPF_ATTACH_TYPE, etc), rename MAX_BPF_LINK_TYPE into __MAX_BPF_LINK_TYPE. Also similar to MAX_BPF_ATTACH_TYPE and MAX_BPF_REG, add: #define MAX_BPF_LINK_TYPE __MAX_BPF_LINK_TYPE Not all __MAX_xxx enums have such #define, so I'm not sure if we should add it or not, but I figured I'll start with a completely backwards compatible way, and we can drop that, if necessary. Also adjust a selftest that used MAX_BPF_LINK_TYPE enum. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20231206190920.1651226-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-06 14:41:16 -08:00
Jiri Olsa	ffed24eff9	selftests/bpf: Add test for early update in prog_array_map_poke_run Adding test that tries to trigger the BUG_IN during early map update in prog_array_map_poke_run function. The idea is to share prog array map between thread that constantly updates it and another one loading a program that uses that prog array. Eventually we will hit a place where the program is ok to be updated (poke->tailcall_target_stable check) but the address is still not registered in kallsyms, so the bpf_arch_text_poke returns -EINVAL and cause imbalance for the next tail call update check, which will fail with -EBUSY in bpf_arch_text_poke as described in previous fix. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/bpf/20231206083041.1306660-3-jolsa@kernel.org	2023-12-06 22:40:43 +01:00
Ian Rogers	0713ab3bd1	perf stat: Exit perf stat if parse groups fails Metrics were added by a callback but commit `a4b8cfcabb` ("perf stat: Delay metric parsing") postponed this to allow optimizations based on the CPU configuration. In doing so it stopped errors in metric parsing from causing 'perf stat' termination. This change adds the termination for bad metric names back in. Fixes: `a4b8cfcabb` ("perf stat: Delay metric parsing") Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Closes: https://lore.kernel.org/lkml/ZXByT1K6enTh2EHT@kernel.org/ Link: https://lore.kernel.org/r/20231206183533.972028-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-12-06 17:23:50 -03:00
Andrii Nakryiko	dc5196fac4	selftests/bpf: add BPF token-enabled tests Add a selftest that attempts to conceptually replicate intended BPF token use cases inside user namespaced container. Child process is forked. It is then put into its own userns and mountns. Child creates BPF FS context object. This ensures child userns is captured as the owning userns for this instance of BPF FS. Given setting delegation mount options is privileged operation, we ensure that child cannot set them. This context is passed back to privileged parent process through Unix socket, where parent sets up delegation options, creates, and mounts it as a detached mount. This mount FD is passed back to the child to be used for BPF token creation, which allows otherwise privileged BPF operations to succeed inside userns. We validate that all of token-enabled privileged commands (BPF_BTF_LOAD, BPF_MAP_CREATE, and BPF_PROG_LOAD) work as intended. They should only succeed inside the userns if a) BPF token is provided with proper allowed sets of commands and types; and b) namespaces CAP_BPF and other privileges are set. Lacking a) or b) should lead to -EPERM failures. Based on suggested workflow by Christian Brauner ([0]). [0] https://lore.kernel.org/bpf/20230704-hochverdient-lehne-eeb9eeef785e@brauner/ Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-17-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-06 10:03:00 -08:00
Andrii Nakryiko	1571740a9b	libbpf: add BPF token support to bpf_prog_load() API Wire through token_fd into bpf_prog_load(). Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-16-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-06 10:03:00 -08:00
Andrii Nakryiko	1a8df7fa00	libbpf: add BPF token support to bpf_btf_load() API Allow user to specify token_fd for bpf_btf_load() API that wraps kernel's BPF_BTF_LOAD command. This allows loading BTF from unprivileged process as long as it has BPF token allowing BPF_BTF_LOAD command, which can be created and delegated by privileged process. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-15-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-06 10:03:00 -08:00
Andrii Nakryiko	37891cea66	libbpf: add BPF token support to bpf_map_create() API Add ability to provide token_fd for BPF_MAP_CREATE command through bpf_map_create() API. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-14-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-06 10:03:00 -08:00
Andrii Nakryiko	ecd435143e	libbpf: add bpf_token_create() API Add low-level wrapper API for BPF_TOKEN_CREATE command in bpf() syscall. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-13-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-06 10:03:00 -08:00
Andrii Nakryiko	e1cef620f5	bpf: add BPF token support to BPF_PROG_LOAD command Add basic support of BPF token to BPF_PROG_LOAD. Wire through a set of allowed BPF program types and attach types, derived from BPF FS at BPF token creation time. Then make sure we perform bpf_token_capable() checks everywhere where it's relevant. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-7-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-06 10:02:59 -08:00
Andrii Nakryiko	ee54b1a910	bpf: add BPF token support to BPF_BTF_LOAD command Accept BPF token FD in BPF_BTF_LOAD command to allow BTF data loading through delegated BPF token. BTF loading is a pretty straightforward operation, so as long as BPF token is created with allow_cmds granting BPF_BTF_LOAD command, kernel proceeds to parsing BTF data and creating BTF object. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-06 10:02:59 -08:00
Andrii Nakryiko	688b7270b3	bpf: add BPF token support to BPF_MAP_CREATE command Allow providing token_fd for BPF_MAP_CREATE command to allow controlled BPF map creation from unprivileged process through delegated BPF token. Wire through a set of allowed BPF map types to BPF token, derived from BPF FS at BPF token creation time. This, in combination with allowed_cmds allows to create a narrowly-focused BPF token (controlled by privileged agent) with a restrictive set of BPF maps that application can attempt to create. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-5-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-06 10:02:59 -08:00
Andrii Nakryiko	4527358b76	bpf: introduce BPF token object Add new kind of BPF kernel object, BPF token. BPF token is meant to allow delegating privileged BPF functionality, like loading a BPF program or creating a BPF map, from privileged process to a trusted unprivileged process, all while having a good amount of control over which privileged operations could be performed using provided BPF token. This is achieved through mounting BPF FS instance with extra delegation mount options, which determine what operations are delegatable, and also constraining it to the owning user namespace (as mentioned in the previous patch). BPF token itself is just a derivative from BPF FS and can be created through a new bpf() syscall command, BPF_TOKEN_CREATE, which accepts BPF FS FD, which can be attained through open() API by opening BPF FS mount point. Currently, BPF token "inherits" delegated command, map types, prog type, and attach type bit sets from BPF FS as is. In the future, having an BPF token as a separate object with its own FD, we can allow to further restrict BPF token's allowable set of things either at the creation time or after the fact, allowing the process to guard itself further from unintentionally trying to load undesired kind of BPF programs. But for now we keep things simple and just copy bit sets as is. When BPF token is created from BPF FS mount, we take reference to the BPF super block's owning user namespace, and then use that namespace for checking all the {CAP_BPF, CAP_PERFMON, CAP_NET_ADMIN, CAP_SYS_ADMIN} capabilities that are normally only checked against init userns (using capable()), but now we check them using ns_capable() instead (if BPF token is provided). See bpf_token_capable() for details. Such setup means that BPF token in itself is not sufficient to grant BPF functionality. User namespaced process has to also have necessary combination of capabilities inside that user namespace. So while previously CAP_BPF was useless when granted within user namespace, now it gains a meaning and allows container managers and sys admins to have a flexible control over which processes can and need to use BPF functionality within the user namespace (i.e., container in practice). And BPF FS delegation mount options and derived BPF tokens serve as a per-container "flag" to grant overall ability to use bpf() (plus further restrict on which parts of bpf() syscalls are treated as namespaced). Note also, BPF_TOKEN_CREATE command itself requires ns_capable(CAP_BPF) within the BPF FS owning user namespace, rounding up the ns_capable() story of BPF token. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231130185229.2688956-4-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-06 10:02:59 -08:00
Ian Rogers	01261d8a0f	perf thread: Add missing RC_CHK_EQUAL Comparing pointers without RC_CHK_ACCESS means the indirect object will be compared rather than the underlying maps when REFCNT_CHECKING is enabled. Fix by adding missing RC_CHK_EQUAL. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: German Gomez <german.gomez@arm.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Nick Terrell <terrelln@fb.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231127220902.1315692-15-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-12-06 13:01:50 -03:00
Ian Rogers	0f6ab6a3fb	perf maps: Move symbol maps functions to maps.c Move the find and certain other symbol maps__* functions to maps.c for better abstraction. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: German Gomez <german.gomez@arm.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Nick Terrell <terrelln@fb.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231127220902.1315692-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-12-06 13:01:46 -03:00
Ian Rogers	9fa688ea34	perf map: Simplify map_ip/unmap_ip and make 'struct map' smaller When mapping an IP it is either an identity mapping or a DSO relative mapping, so a single bit is required in the struct to identify this. The current code uses function pointers, adding 2 pointers per map and also pushing the size of a map beyond 1 cache line. Switch to using a byte to identify the mapping type (as well as priv and erange_warned), to avoid any masking. Change struct maps's layout to avoid holes. Before: ``` struct map { u64 start; /* 0 8 / u64 end; / 8 8 / _Bool erange_warned:1; / 16: 0 1 / _Bool priv:1; / 16: 1 1 / / XXX 6 bits hole, try to pack / / XXX 3 bytes hole, try to pack / u32 prot; / 20 4 / u64 pgoff; / 24 8 / u64 reloc; / 32 8 / u64 (map_ip)(const struct map , u64); / 40 8 / u64 (unmap_ip)(const struct map , u64); / 48 8 / struct dso dso; /* 56 8 / / --- cacheline 1 boundary (64 bytes) --- / refcount_t refcnt; / 64 4 / u32 flags; / 68 4 / / size: 72, cachelines: 2, members: 12 / / sum members: 68, holes: 1, sum holes: 3 / / sum bitfield members: 2 bits, bit holes: 1, sum bit holes: 6 bits / / last cacheline: 8 bytes / }; ``` After: ``` struct map { u64 start; / 0 8 / u64 end; / 8 8 / u64 pgoff; / 16 8 / u64 reloc; / 24 8 / struct dso dso; /* 32 8 / refcount_t refcnt; / 40 4 / u32 prot; / 44 4 / u32 flags; / 48 4 / enum mapping_type mapping_type:8; / 52: 0 4 / / Bitfield combined with next fields / _Bool erange_warned; / 53 1 / _Bool priv; / 54 1 / / size: 56, cachelines: 1, members: 11 / / padding: 1 / / last cacheline: 56 bytes */ }; ``` Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: German Gomez <german.gomez@arm.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Nick Terrell <terrelln@fb.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231127220902.1315692-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-12-06 13:01:42 -03:00
Ian Rogers	407a3898d7	perf test shell diff: Skip test if test_loop symbol is missing in the perf binary The diff test depends on finding the symbol test_loop in perf and will fail if perf has been stripped and no debug object is available. In that case, skip the test instead. Suggested-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231205164924.835682-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-12-06 13:01:36 -03:00
Chengen Du	d0acce6828	perf symbols: Parse NOTE segments until the build id is found In the ELF file, multiple NOTE segments may exist. To locate the build id, the process shall persist in parsing NOTE segments until the build id is found. Signed-off-by: Chengen Du <chengen.du@canonical.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20231130135723.17562-1-chengen.du@canonical.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-12-06 09:46:15 -03:00
Ian Rogers	030ac3cad2	perf record: Be lazier in allocating lost samples buffer Wait until a lost sample occurs to allocate the lost samples buffer, often the buffer isn't necessary. This saves a 64kb allocation and 5.3kb of peak memory consumption. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: German Gomez <german.gomez@arm.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Li Dong <lidong@vivo.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Ming Wang <wangming01@loongson.cn> Cc: Nick Terrell <terrelln@fb.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Vincent Whitchurch <vincent.whitchurch@axis.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231127220902.1315692-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-12-06 09:46:14 -03:00
Ian Rogers	eb2eac0c7b	perf evsel: Fallback to "task-clock" when not system wide When the "cycles" event isn't available evsel will fallback to the "cpu-clock" software event. "task-clock" is similar to "cpu-clock" but only runs when the process is running. Falling back to "cpu-clock" when not system wide leads to confusion, by falling back to "task-clock" it is hoped the confusion is less. Pass the target to determine if "task-clock" is more appropriate. Update a nearby comment and debug string for the change. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ajay Kaher <akaher@vmware.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexey Makhalov <amakhalov@vmware.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sandipan Das <sandipan.das@amd.com> Cc: Yang Jihong <yangjihong1@huawei.com> Link: https://lore.kernel.org/r/20231121000420.368075-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-12-06 09:45:58 -03:00
Jakub Kicinski	f3c928008a	tools: ynl: move private definitions to a separate header ynl.h has a growing amount of "internal" stuff, which may confuse users who try to take a look at the external API. Currently the internals are at the bottom of the file with a banner in between, but this arrangement makes it hard to add external APIs / inline helpers which need internal definitions. Move internals to a separate header. Link: https://lore.kernel.org/r/20231202211225.342466-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-05 20:08:33 -08:00
Jakub Kicinski	f2d4d9ad80	tools: ynl: use strerror() if no extack of note provided If kernel didn't give use any meaningful error - print a strerror() to the ynl error message. Reviewed-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Link: https://lore.kernel.org/r/20231202211310.342716-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-05 20:08:13 -08:00
Jakub Kicinski	e136735f0c	tools: pynl: make flags argument optional for do() Commit `1768d8a767` ("tools/net/ynl: Add support for create flags") added support for setting legacy netlink CRUD flags on netlink messages (NLM_F_REPLACE, _EXCL, _CREATE etc.). Most of genetlink won't need these, don't force callers to pass in an empty argument to each do() call. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Link: https://lore.kernel.org/r/20231202211005.341613-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-05 20:07:59 -08:00
Andrii Nakryiko	064e0bea19	selftests/bpf: validate precision logic in partial_stack_load_preserves_zeros Enhance partial_stack_load_preserves_zeros subtest with detailed precision propagation log checks. We know expect fp-16 to be spilled, initially imprecise, zero const register, which is later marked as precise even when partial stack slot load is performed, even if it's not a register fill (!). Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-10-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-05 13:40:21 -08:00
Andrii Nakryiko	add1cd7f22	selftests/bpf: validate zero preservation for sub-slot loads Validate that 1-, 2-, and 4-byte loads from stack slots not aligned on 8-byte boundary still preserve zero, when loading from all-STACK_ZERO sub-slots, or when stack sub-slots are covered by spilled register with known constant zero value. Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-8-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-05 13:40:21 -08:00
Andrii Nakryiko	b33ceb6a3d	selftests/bpf: validate STACK_ZERO is preserved on subreg spill Add tests validating that STACK_ZERO slots are preserved when slot is partially overwritten with subregister spill. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-6-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-05 13:40:20 -08:00
Andrii Nakryiko	876301881c	selftests/bpf: add stack access precision test Add a new selftests that validates precision tracking for stack access instruction, using both r10-based and non-r10-based accesses. For non-r10 ones we also make sure to have non-zero var_off to validate that final stack offset is tracked properly in instruction history information inside verifier. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-3-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-05 13:40:20 -08:00
Andrii Nakryiko	41f6f64e69	bpf: support non-r10 register spill/fill to/from stack in precision tracking Use instruction (jump) history to record instructions that performed register spill/fill to/from stack, regardless if this was done through read-only r10 register, or any other register after copying r10 into it and potentially adjusting offset. To make this work reliably, we push extra per-instruction flags into instruction history, encoding stack slot index (spi) and stack frame number in extra 10 bit flags we take away from prev_idx in instruction history. We don't touch idx field for maximum performance, as it's checked most frequently during backtracking. This change removes basically the last remaining practical limitation of precision backtracking logic in BPF verifier. It fixes known deficiencies, but also opens up new opportunities to reduce number of verified states, explored in the subsequent patches. There are only three differences in selftests' BPF object files according to veristat, all in the positive direction (less states). File Program Insns (A) Insns (B) Insns (DIFF) States (A) States (B) States (DIFF) -------------------------------------- ------------- --------- --------- ------------- ---------- ---------- ------------- test_cls_redirect_dynptr.bpf.linked3.o cls_redirect 2987 2864 -123 (-4.12%) 240 231 -9 (-3.75%) xdp_synproxy_kern.bpf.linked3.o syncookie_tc 82848 82661 -187 (-0.23%) 5107 5073 -34 (-0.67%) xdp_synproxy_kern.bpf.linked3.o syncookie_xdp 85116 84964 -152 (-0.18%) 5162 5130 -32 (-0.62%) Note, I avoided renaming jmp_history to more generic insn_hist to minimize number of lines changed and potential merge conflicts between bpf and bpf-next trees. Notice also cur_hist_entry pointer reset to NULL at the beginning of instruction verification loop. This pointer avoids the problem of relying on last jump history entry's insn_idx to determine whether we already have entry for current instruction or not. It can happen that we added jump history entry because current instruction is_jmp_point(), but also we need to add instruction flags for stack access. In this case, we don't want to entries, so we need to reuse last added entry, if it is present. Relying on insn_idx comparison has the same ambiguity problem as the one that was fixed recently in [0], so we avoid that. [0] https://patchwork.kernel.org/project/netdevbpf/patch/20231110002638.4168352-3-andrii@kernel.org/ Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reported-by: Tao Lyu <tao.lyu@epfl.ch> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20231205184248.1502704-2-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2023-12-05 13:40:20 -08:00
Ian Rogers	b169374748	perf list: Fix JSON segfault by setting the used skip_duplicate_pmus callback Json output didn't set the skip_duplicate_pmus callback yielding a segfault. Fixes: `cd4e1efbbc` ("perf pmus: Skip duplicate PMUs and don't print list suffix by default") Signed-off-by: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@arm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231129213428.2227448-2-irogers@google.com [namhyung: updated subject line according to Arnaldo] Signed-off-by: Namhyung Kim <namhyung@kernel.org>	2023-12-05 11:16:00 -08:00
Colin Ian King	018b042485	perf bench sched-seccomp-notify: Fix spelling mistake "synchronious" -> "synchronous" There is a spelling mistake in an option description. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-janitors@vger.kernel.org Link: https://lore.kernel.org/r/20230630080029.15614-1-colin.i.king@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>	2023-12-05 15:48:52 -03:00

... 9 10 11 12 13 ...

39278 Commits