Commit Graph

1135994 Commits

Author SHA1 Message Date
Nathan Chancellor
96d845a67b drm/fsl-dcu: Fix return type of fsl_dcu_drm_connector_mode_valid()
With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG),
indirect call targets are validated against the expected function
pointer prototype to make sure the call target is valid to help mitigate
ROP attacks. If they are not identical, there is a failure at run time,
which manifests as either a kernel panic or thread getting killed. A
proposed warning in clang aims to catch these at compile time, which
reveals:

  drivers/gpu/drm/fsl-dcu/fsl_dcu_drm_rgb.c:74:16: error: incompatible function pointer types initializing 'enum drm_mode_status (*)(struct drm_connector *, struct drm_display_mode *)' with an expression of type 'int (struct drm_connector *, struct drm_display_mode *)' [-Werror,-Wincompatible-function-pointer-types-strict]
          .mode_valid = fsl_dcu_drm_connector_mode_valid,
                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  1 error generated.

->mode_valid() in 'struct drm_connector_helper_funcs' expects a return
type of 'enum drm_mode_status', not 'int'. Adjust the return type of
fsl_dcu_drm_connector_mode_valid() to match the prototype's to resolve
the warning and CFI failure.

Link: https://github.com/ClangBuiltLinux/linux/issues/1750
Reported-by: Sami Tolvanen <samitolvanen@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221102154215.78059-1-nathan@kernel.org
2022-11-18 09:17:13 -08:00
Kees Cook
74c8e6bffb driver core: Add __alloc_size hint to devm allocators
Mark the devm_*alloc()-family of allocations with appropriate
__alloc_size()/__realloc_size() hints so the compiler can attempt to
reason about buffer lengths from allocations.

Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Nishanth Menon <nm@ti.com>
Cc: Michael Kelley <mikelley@microsoft.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Won Chung <wonchung@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20221029074734.gonna.276-kees@kernel.org
2022-11-18 09:17:13 -08:00
Kees Cook
8abfff25b3 Merge branch 'for-linus/hardening' into for-next/hardening 2022-11-08 10:58:57 -08:00
Nathan Chancellor
000f8870a4 vmlinux.lds.h: Fix placement of '.data..decrypted' section
Commit d4c6399900 ("vmlinux.lds.h: Avoid orphan section with !SMP")
fixed an orphan section warning by adding the '.data..decrypted' section
to the linker script under the PERCPU_DECRYPTED_SECTION define but that
placement introduced a panic with !SMP, as the percpu sections are not
instantiated with that configuration so attempting to access variables
defined with DEFINE_PER_CPU_DECRYPTED() will result in a page fault.

Move the '.data..decrypted' section to the DATA_MAIN define so that the
variables in it are properly instantiated at boot time with
CONFIG_SMP=n.

Cc: stable@vger.kernel.org
Fixes: d4c6399900 ("vmlinux.lds.h: Avoid orphan section with !SMP")
Link: https://lore.kernel.org/cbbd3548-880c-d2ca-1b67-5bb93b291d5f@huawei.com/
Debugged-by: Ard Biesheuvel <ardb@kernel.org>
Reported-by: Zhao Wenhui <zhaowenhui8@huawei.com>
Tested-by: xiafukun <xiafukun@huawei.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221108174934.3384275-1-nathan@kernel.org
2022-11-08 09:59:19 -08:00
Kees Cook
4b21d25bf5 overflow: Introduce overflows_type() and castable_to_type()
Implement a robust overflows_type() macro to test if a variable or
constant value would overflow another variable or type. This can be
used as a constant expression for static_assert() (which requires a
constant expression[1][2]) when used on constant values. This must be
constructed manually, since __builtin_add_overflow() does not produce
a constant expression[3].

Additionally adds castable_to_type(), similar to __same_type(), but for
checking if a constant value would overflow if cast to a given type.

Add unit tests for overflows_type(), __same_type(), and castable_to_type()
to the existing KUnit "overflow" test:

[16:03:33] ================== overflow (21 subtests) ==================
...
[16:03:33] [PASSED] overflows_type_test
[16:03:33] [PASSED] same_type_test
[16:03:33] [PASSED] castable_to_type_test
[16:03:33] ==================== [PASSED] overflow =====================
[16:03:33] ============================================================
[16:03:33] Testing complete. Ran 21 tests: passed: 21
[16:03:33] Elapsed time: 24.022s total, 0.002s configuring, 22.598s building, 0.767s running

[1] https://en.cppreference.com/w/c/language/_Static_assert
[2] C11 standard (ISO/IEC 9899:2011): 6.7.10 Static assertions
[3] https://gcc.gnu.org/onlinedocs/gcc/Integer-Overflow-Builtins.html
    6.56 Built-in Functions to Perform Arithmetic with Overflow Checking
    Built-in Function: bool __builtin_add_overflow (type1 a, type2 b,

Cc: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Tom Rix <trix@redhat.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: Vitor Massaru Iha <vitor@massaru.org>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-hardening@vger.kernel.org
Cc: llvm@lists.linux.dev
Co-developed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221024201125.1416422-1-gwan-gyeong.mun@intel.com
2022-11-02 12:39:27 -07:00
Kees Cook
6dd142d901 coredump: Proactively round up to kmalloc bucket size
Instead of discovering the kmalloc bucket size _after_ allocation, round
up proactively so the allocation is explicitly made for the full size,
allowing the compiler to correctly reason about the resulting size of
the buffer through the existing __alloc_size() hint.

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-11-01 10:04:52 -07:00
Kees Cook
905889bc6c btrfs: send: Proactively round up to kmalloc bucket size
Instead of discovering the kmalloc bucket size _after_ allocation, round
up proactively so the allocation is explicitly made for the full size,
allowing the compiler to correctly reason about the resulting size of
the buffer through the existing __alloc_size() hint.

Cc: Chris Mason <clm@fb.com>
Cc: Josef Bacik <josef@toxicpanda.com>
Cc: linux-btrfs@vger.kernel.org
Acked-by: David Sterba <dsterba@suse.com>
Link: https://lore.kernel.org/lkml/20220922133014.GI32411@suse.cz
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220923202822.2667581-8-keescook@chromium.org
2022-11-01 10:04:52 -07:00
Kees Cook
cd536db050 dma-buf: Proactively round up to kmalloc bucket size
Instead of discovering the kmalloc bucket size _after_ allocation, round
up proactively so the allocation is explicitly made for the full size,
allowing the compiler to correctly reason about the resulting size of
the buffer through the existing __alloc_size() hint.

Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linaro-mm-sig@lists.linaro.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221018090858.never.941-kees@kernel.org
2022-11-01 10:04:52 -07:00
Xin Li
e1789d7c75 kbuild: upgrade the orphan section warning to an error if CONFIG_WERROR is set
Andrew Cooper suggested upgrading the orphan section warning to a hard link
error. However Nathan Chancellor said outright turning the warning into an
error with no escape hatch might be too aggressive, as we have had these
warnings triggered by new compiler generated sections, and suggested turning
orphan sections into an error only if CONFIG_WERROR is set. Kees Cook echoed
and emphasized that the mandate from Linus is that we should avoid breaking
builds. It wrecks bisection, it causes problems across compiler versions, etc.

Thus upgrade the orphan section warning to a hard link error only if
CONFIG_WERROR is set.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Xin Li <xin3.li@intel.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221025073023.16137-2-xin3.li@intel.com
2022-11-01 10:04:52 -07:00
Kees Cook
5a17f040fa cred: Do not default to init_cred in prepare_kernel_cred()
A common exploit pattern for ROP attacks is to abuse prepare_kernel_cred()
in order to construct escalated privileges[1]. Instead of providing a
short-hand argument (NULL) to the "daemon" argument to indicate using
init_cred as the base cred, require that "daemon" is always set to
an actual task. Replace all existing callers that were passing NULL
with &init_task.

Future attacks will need to have sufficiently powerful read/write
primitives to have found an appropriately privileged task and written it
to the ROP stack as an argument to succeed, which is similarly difficult
to the prior effort needed to escalate privileges before struct cred
existed: locate the current cred and overwrite the uid member.

This has the added benefit of meaning that prepare_kernel_cred() can no
longer exceed the privileges of the init task, which may have changed from
the original init_cred (e.g. dropping capabilities from the bounding set).

[1] https://google.com/search?q=commit_creds(prepare_kernel_cred(0))

Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: David Howells <dhowells@redhat.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Steve French <sfrench@samba.org>
Cc: Ronnie Sahlberg <lsahlber@redhat.com>
Cc: Shyam Prasad N <sprasad@microsoft.com>
Cc: Tom Talpey <tom@talpey.com>
Cc: Namjae Jeon <linkinjeon@kernel.org>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Anna Schumaker <anna@kernel.org>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: "Michal Koutný" <mkoutny@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-cifs@vger.kernel.org
Cc: samba-technical@lists.samba.org
Cc: linux-nfs@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Reviewed-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Acked-by: Russ Weight <russell.h.weight@intel.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Link: https://lore.kernel.org/r/20221026232943.never.775-kees@kernel.org
2022-11-01 10:04:52 -07:00
Kees Cook
e9a40e1585 fortify: Do not cast to "unsigned char"
Do not cast to "unsigned char", as this needlessly creates type problems
when attempting builds without -Wno-pointer-sign[1]. The intent of the
cast is to drop possible "const" types.

[1] https://lore.kernel.org/lkml/CAHk-=wgz3Uba8w7kdXhsqR1qvfemYL+OFQdefJnkeqXG8qZ_pA@mail.gmail.com/

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: 3009f891bb ("fortify: Allow strlen() and strnlen() to pass compile-time known lengths")
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-11-01 10:04:52 -07:00
Kees Cook
fb3d88ab35 siphash: Convert selftest to KUnit
Convert the siphash self-test to KUnit so it will be included in "all
KUnit tests" coverage, and can be run individually still:

$ ./tools/testing/kunit/kunit.py run siphash
...
[02:58:45] Starting KUnit Kernel (1/1)...
[02:58:45] ============================================================
[02:58:45] =================== siphash (1 subtest) ====================
[02:58:45] [PASSED] siphash_test
[02:58:45] ===================== [PASSED] siphash =====================
[02:58:45] ============================================================
[02:58:45] Testing complete. Ran 1 tests: passed: 1
[02:58:45] Elapsed time: 21.421s total, 4.306s configuring, 16.947s building, 0.148s running

Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Steven Rostedt (Google)" <rostedt@goodmis.org>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Sander Vanheule <sander@svanheule.net>
Acked-by: "Jason A. Donenfeld" <Jason@zx2c4.com>
Link: https://lore.kernel.org/lkml/CAHmME9r+9MPH6zk3Vn=buEMSbQiWMFryqqzerKarmjYk+tHLJA@mail.gmail.com
Tested-by: David Gow <davidgow@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-11-01 10:04:52 -07:00
Kees Cook
62e1cbfc5d fortify: Short-circuit known-safe calls to strscpy()
Replacing compile-time safe calls of strcpy()-related functions with
strscpy() was always calling the full strscpy() logic when a builtin
would be better. For example:

	char buf[16];
	strcpy(buf, "yes");

would reduce to __builtin_memcpy(buf, "yes", 4), but not if it was:

	strscpy(buf, yes, sizeof(buf));

Fix this by checking if all sizes are known at compile-time.

Cc: linux-hardening@vger.kernel.org
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-11-01 10:04:52 -07:00
Kees Cook
41eefc46a3 string: Convert strscpy() self-test to KUnit
Convert the strscpy() self-test to a KUnit test.

Cc: David Gow <davidgow@google.com>
Cc: Tobin C. Harding <tobin@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/lkml/Y072ZMk/hNkfwqMv@dev-arch.thelio-3990X
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-11-01 10:04:52 -07:00
Kees Cook
9e4a617757 string: Add __realloc_size hint to kmemdup()
Add __realloc_size() hint to kmemdup() so the compiler can reason about
the length of the returned buffer. (These must not use __alloc_size,
since those include __malloc which says the contents aren't defined[1]).

[1] https://lore.kernel.org/linux-hardening/d199c2af-06af-8a50-a6a1-00eefa0b67b4@prevas.dk/

Cc: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-11-01 10:04:51 -07:00
Kees Cook
96fce387d5 kunit/memcpy: Add dynamic size and window tests
The "side effects" memmove() test accidentally found[1] a corner case in
the recent refactoring of the i386 assembly memmove(), but missed another
corner case. Instead of hoping to get lucky next time, implement much
more complete tests of memcpy() and memmove() -- especially the moving
window overlap for memmove() -- which catches all the issues encountered
and should catch anything new.

[1] https://lore.kernel.org/lkml/CAKwvOdkaKTa2aiA90VzFrChNQM6O_ro+b7VWs=op70jx-DKaXA@mail.gmail.com

Cc: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-10-28 16:07:57 -07:00
Kees Cook
03699f271d string: Rewrite and add more kern-doc for the str*() functions
While there were varying degrees of kern-doc for various str*()-family
functions, many needed updating and clarification, or to just be
entirely written. Update (and relocate) existing kern-doc and add missing
functions, sadly shaking my head at how many times I have written "Do
not use this function". Include the results in the core kernel API doc.

Cc: Bagas Sanjaya <bagasdotme@gmail.com>
Cc: Andy Shevchenko <andy@kernel.org>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-hardening@vger.kernel.org
Tested-by: Akira Yokosawa <akiyks@gmail.com>
Link: https://lore.kernel.org/lkml/9b0cf584-01b3-3013-b800-1ef59fe82476@gmail.com
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-10-28 16:07:57 -07:00
Kees Cook
6f7630b1b5 fortify: Capture __bos() results in const temp vars
In two recent run-time memcpy() bound checking bug reports (NFS[1] and
JFS[2]), the _detection_ was working correctly (in the sense that the
requested copy size was larger than the destination field size), but
the _warning text_ was showing the destination field size as SIZE_MAX
("unknown size"). This should be impossible, since the detection function
will explicitly give up if the destination field size is unknown. For
example, the JFS warning was:

  memcpy: detected field-spanning write (size 132) of single field "ip->i_link" at fs/jfs/namei.c:950 (size 18446744073709551615)

Other cases of this warning (e.g.[3]) have reported correctly,
and the reproducer only happens under GCC (at least 10.2 and 12.1),
so this currently appears to be a GCC bug. Explicitly capturing the
__builtin_object_size() results in const temporary variables fixes the
report. For example, the JFS reproducer now correctly reports the field
size (128):

  memcpy: detected field-spanning write (size 132) of single field "ip->i_link" at fs/jfs/namei.c:950 (size 128)

Examination of the .text delta (which is otherwise identical), shows
the literal value used in the report changing:

-     mov    $0xffffffffffffffff,%rcx
+     mov    $0x80,%ecx

[1] https://lore.kernel.org/lkml/Y0zEzZwhOxTDcBTB@codemonkey.org.uk/
[2] https://syzkaller.appspot.com/bug?id=23d613df5259b977dac1696bec77f61a85890e3d
[3] https://lore.kernel.org/all/202210110948.26b43120-yujie.liu@intel.com/

Cc: "Dr. David Alan Gilbert" <linux@treblig.org>
Cc: llvm@lists.linux.dev
Cc: linux-hardening@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-10-28 16:07:01 -07:00
Kees Cook
72c3ebea37 overflow: Refactor test skips for Clang-specific issues
Convert test exclusion into test skipping. This brings the logic for
why a test is being skipped into the test itself, instead of having to
spread ifdefs around the code. This will make cleanup easier as minimum
tests get raised. Drop __maybe_unused so missed tests will be noticed
again and clean up whitespace.

For example, clang-11 on i386:

[15:52:32] ================== overflow (18 subtests) ==================
[15:52:32] [PASSED] u8_u8__u8_overflow_test
[15:52:32] [PASSED] s8_s8__s8_overflow_test
[15:52:32] [PASSED] u16_u16__u16_overflow_test
[15:52:32] [PASSED] s16_s16__s16_overflow_test
[15:52:32] [PASSED] u32_u32__u32_overflow_test
[15:52:32] [PASSED] s32_s32__s32_overflow_test
[15:52:32] [SKIPPED] u64_u64__u64_overflow_test
[15:52:32] [SKIPPED] s64_s64__s64_overflow_test
[15:52:32] [SKIPPED] u32_u32__int_overflow_test
[15:52:32] [PASSED] u32_u32__u8_overflow_test
[15:52:32] [PASSED] u8_u8__int_overflow_test
[15:52:32] [PASSED] int_int__u8_overflow_test
[15:52:32] [PASSED] shift_sane_test
[15:52:32] [PASSED] shift_overflow_test
[15:52:32] [PASSED] shift_truncate_test
[15:52:32] [PASSED] shift_nonsense_test
[15:52:32] [PASSED] overflow_allocation_test
[15:52:32] [PASSED] overflow_size_helpers_test
[15:52:32] ==================== [PASSED] overflow =====================
[15:52:32] ============================================================
[15:52:32] Testing complete. Ran 18 tests: passed: 15, skipped: 3

Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Tom Rix <trix@redhat.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: "Gustavo A. R. Silva" <gustavoars@kernel.org>
Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Cc: llvm@lists.linux.dev
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20221006230017.1833458-1-keescook@chromium.org
2022-10-25 14:57:42 -07:00
Nick Desaulniers
0e5b9f25b2 overflow: disable failing tests for older clang versions
Building the overflow kunit tests with clang-11 fails with:

$ ./tools/testing/kunit/kunit.py run --arch=arm --make_options LLVM=1 \
overflow
...
ld.lld: error: undefined symbol: __mulodi4
...

Clang 11 and earlier generate unwanted libcalls for signed output,
unsigned input.

Disable these tests for now, but should these become used in the kernel
we might consider that as justification for dropping clang-11 support.
Keep the clang-11 build alive a little bit longer.

Avoid -Wunused-function warnings via __maybe_unused. To test W=1:

$ make LLVM=1 -j128 defconfig
$ ./scripts/config -e KUNIT -e KUNIT_ALL
$ make LLVM=1 -j128 olddefconfig lib/overflow_kunit.o W=1

Link: https://github.com/ClangBuiltLinux/linux/issues/1711
Link: 3203143f13
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221006171751.3444575-1-ndesaulniers@google.com
2022-10-25 14:57:42 -07:00
Kees Cook
31970608a6 overflow: Fix kern-doc markup for functions
Fix the kern-doc markings for several of the overflow helpers and move
their location into the core kernel API documentation, where it belongs
(it's not driver-specific).

Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: linux-hardening@vger.kernel.org
Reviewed-by: Akira Yokosawa <akiyks@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-10-25 14:57:42 -07:00
Linus Torvalds
9abf2313ad Linux 6.1-rc1 v6.1-rc1 2022-10-16 15:36:24 -07:00
Linus Torvalds
f1947d7c8a Merge tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull more random number generator updates from Jason Donenfeld:
 "This time with some large scale treewide cleanups.

  The intent of this pull is to clean up the way callers fetch random
  integers. The current rules for doing this right are:

   - If you want a secure or an insecure random u64, use get_random_u64()

   - If you want a secure or an insecure random u32, use get_random_u32()

     The old function prandom_u32() has been deprecated for a while
     now and is just a wrapper around get_random_u32(). Same for
     get_random_int().

   - If you want a secure or an insecure random u16, use get_random_u16()

   - If you want a secure or an insecure random u8, use get_random_u8()

   - If you want secure or insecure random bytes, use get_random_bytes().

     The old function prandom_bytes() has been deprecated for a while
     now and has long been a wrapper around get_random_bytes()

   - If you want a non-uniform random u32, u16, or u8 bounded by a
     certain open interval maximum, use prandom_u32_max()

     I say "non-uniform", because it doesn't do any rejection sampling
     or divisions. Hence, it stays within the prandom_*() namespace, not
     the get_random_*() namespace.

     I'm currently investigating a "uniform" function for 6.2. We'll see
     what comes of that.

  By applying these rules uniformly, we get several benefits:

   - By using prandom_u32_max() with an upper-bound that the compiler
     can prove at compile-time is ≤65536 or ≤256, internally
     get_random_u16() or get_random_u8() is used, which wastes fewer
     batched random bytes, and hence has higher throughput.

   - By using prandom_u32_max() instead of %, when the upper-bound is
     not a constant, division is still avoided, because
     prandom_u32_max() uses a faster multiplication-based trick instead.

   - By using get_random_u16() or get_random_u8() in cases where the
     return value is intended to indeed be a u16 or a u8, we waste fewer
     batched random bytes, and hence have higher throughput.

  This series was originally done by hand while I was on an airplane
  without Internet. Later, Kees and I worked on retroactively figuring
  out what could be done with Coccinelle and what had to be done
  manually, and then we split things up based on that.

  So while this touches a lot of files, the actual amount of code that's
  hand fiddled is comfortably small"

* tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  prandom: remove unused functions
  treewide: use get_random_bytes() when possible
  treewide: use get_random_u32() when possible
  treewide: use get_random_{u8,u16}() when possible, part 2
  treewide: use get_random_{u8,u16}() when possible, part 1
  treewide: use prandom_u32_max() when possible, part 2
  treewide: use prandom_u32_max() when possible, part 1
2022-10-16 15:27:07 -07:00
Linus Torvalds
8636df94ec Merge tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull more perf tools updates from Arnaldo Carvalho de Melo:

 - Use BPF CO-RE (Compile Once, Run Everywhere) to support old kernels
   when using bperf (perf BPF based counters) with cgroups.

 - Support HiSilicon PCIe Performance Monitoring Unit (PMU), that
   monitors bandwidth, latency, bus utilization and buffer occupancy.

   Documented in Documentation/admin-guide/perf/hisi-pcie-pmu.rst.

 - User space tasks can migrate between CPUs, so when tracing selected
   CPUs, system-wide sideband is still needed, fix it in the setup of
   Intel PT on hybrid systems.

 - Fix metricgroups title message in 'perf list', it should state that
   the metrics groups are to be used with the '-M' option, not '-e'.

 - Sync the msr-index.h copy with the kernel sources, adding support for
   using "AMD64_TSC_RATIO" in filter expressions in 'perf trace' as well
   as decoding it when printing the MSR tracepoint arguments.

 - Fix program header size and alignment when generating a JIT ELF in
   'perf inject'.

 - Add multiple new Intel PT 'perf test' entries, including a jitdump
   one.

 - Fix the 'perf test' entries for 'perf stat' CSV and JSON output when
   running on PowerPC due to an invalid topology number in that arch.

 - Fix the 'perf test' for arm_coresight failures on the ARM Juno
   system.

 - Fix the 'perf test' attr entry for PERF_FORMAT_LOST, adding this
   option to the or expression expected in the intercepted
   perf_event_open() syscall.

 - Add missing condition flags ('hs', 'lo', 'vc', 'vs') for arm64 in the
   'perf annotate' asm parser.

 - Fix 'perf mem record -C' option processing, it was being chopped up
   when preparing the underlying 'perf record -e mem-events' and thus
   being ignored, requiring using '-- -C CPUs' as a workaround.

 - Improvements and tidy ups for 'perf test' shell infra.

 - Fix Intel PT information printing segfault in uClibc, where a NULL
   format was being passed to fprintf.

* tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (23 commits)
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packet
  perf auxtrace arm64: Add support for HiSilicon PCIe Tune and Trace device driver
  perf auxtrace arm: Refactor event list iteration in auxtrace_record__init()
  perf tests stat+json_output: Include sanity check for topology
  perf tests stat+csv_output: Include sanity check for topology
  perf intel-pt: Fix system_wide dummy event for hybrid
  perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc
  perf test: Fix attr tests for PERF_FORMAT_LOST
  perf test: test_intel_pt.sh: Add 9 tests
  perf inject: Fix GEN_ELF_TEXT_OFFSET for jit
  perf test: test_intel_pt.sh: Add jitdump test
  perf test: test_intel_pt.sh: Tidy some alignment
  perf test: test_intel_pt.sh: Print a message when skipping kernel tracing
  perf test: test_intel_pt.sh: Tidy some perf record options
  perf test: test_intel_pt.sh: Fix return checking again
  perf: Skip and warn on unknown format 'configN' attrs
  perf list: Fix metricgroups title message
  perf mem: Fix -C option behavior for perf mem record
  perf annotate: Add missing condition flags for arm64
  ...
2022-10-16 15:14:29 -07:00
Linus Torvalds
2df76606db Merge tag 'kbuild-fixes-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:

 - Fix CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y compile error for the
   combination of Clang >= 14 and GAS <= 2.35.

 - Drop vmlinux.bz2 from the rpm package as it just annoyingly increased
   the package size.

 - Fix modpost error under build environments using musl.

 - Make *.ll files keep value names for easier debugging

 - Fix single directory build

 - Prevent RISC-V from selecting the broken DWARF5 support when Clang
   and GAS are used together.

* tag 'kbuild-fixes-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  lib/Kconfig.debug: Add check for non-constant .{s,u}leb128 support to DWARF5
  kbuild: fix single directory build
  kbuild: add -fno-discard-value-names to cmd_cc_ll_c
  scripts/clang-tools: Convert clang-tidy args to list
  modpost: put modpost options before argument
  kbuild: Stop including vmlinux.bz2 in the rpm's
  Kconfig.debug: add toolchain checks for DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
  Kconfig.debug: simplify the dependency of DEBUG_INFO_DWARF4/5
2022-10-16 11:12:22 -07:00
Linus Torvalds
2fcd8f108f Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull more clk updates from Stephen Boyd:
 "This is the final part of the clk patches for this merge window.

  The clk rate range series needed another week to fully bake. Maxime
  fixed the bug that broke clk notifiers and prevented this from being
  included in the first pull request. He also added a unit test on top
  to make sure it doesn't break so easily again. The majority of the
  series fixes up how the clk_set_rate_*() APIs work, particularly
  around when the rate constraints are dropped and how they move around
  when reparenting clks. Overall it's a much needed improvement to the
  clk rate range APIs that used to be pretty broken if you looked
  sideways.

  Beyond the core changes there are a few driver fixes for a compilation
  issue or improper data causing clks to fail to register or have the
  wrong parents. These are good to get in before the first -rc so that
  the system actually boots on the affected devices"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (31 commits)
  clk: tegra: Fix Tegra PWM parent clock
  clk: at91: fix the build with binutils 2.27
  clk: qcom: gcc-msm8660: Drop hardcoded fixed board clocks
  clk: mediatek: clk-mux: Add .determine_rate() callback
  clk: tests: Add tests for notifiers
  clk: Update req_rate on __clk_recalc_rates()
  clk: tests: Add missing test case for ranges
  clk: qcom: clk-rcg2: Take clock boundaries into consideration for gfx3d
  clk: Introduce the clk_hw_get_rate_range function
  clk: Zero the clk_rate_request structure
  clk: Stop forwarding clk_rate_requests to the parent
  clk: Constify clk_has_parent()
  clk: Introduce clk_core_has_parent()
  clk: Switch from __clk_determine_rate to clk_core_round_rate_nolock
  clk: Add our request boundaries in clk_core_init_rate_req
  clk: Introduce clk_hw_init_rate_request()
  clk: Move clk_core_init_rate_req() from clk_core_round_rate_nolock() to its caller
  clk: Change clk_core_init_rate_req prototype
  clk: Set req_rate on reparenting
  clk: Take into account uncached clocks in clk_set_rate_range()
  ...
2022-10-16 11:08:19 -07:00
Linus Torvalds
b08cd74448 Merge tag '6.1-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull more cifs updates from Steve French:

 - fix a regression in guest mounts to old servers

 - improvements to directory leasing (caching directory entries safely
   beyond the root directory)

 - symlink improvement (reducing roundtrips needed to process symlinks)

 - an lseek fix (to problem where some dir entries could be skipped)

 - improved ioctl for returning more detailed information on directory
   change notifications

 - clarify multichannel interface query warning

 - cleanup fix (for better aligning buffers using ALIGN and round_up)

 - a compounding fix

 - fix some uninitialized variable bugs found by Coverity and the kernel
   test robot

* tag '6.1-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: improve SMB3 change notification support
  cifs: lease key is uninitialized in two additional functions when smb1
  cifs: lease key is uninitialized in smb1 paths
  smb3: must initialize two ACL struct fields to zero
  cifs: fix double-fault crash during ntlmssp
  cifs: fix static checker warning
  cifs: use ALIGN() and round_up() macros
  cifs: find and use the dentry for cached non-root directories also
  cifs: enable caching of directories for which a lease is held
  cifs: prevent copying past input buffer boundaries
  cifs: fix uninitialised var in smb2_compound_op()
  cifs: improve symlink handling for smb2+
  smb3: clarify multichannel warning
  cifs: fix regression in very old smb1 mounts
  cifs: fix skipping to incorrect offset in emit_cached_dirents
2022-10-16 11:01:40 -07:00
Tetsuo Handa
80493877d7 Revert "cpumask: fix checking valid cpu range".
This reverts commit 78e5a33994 ("cpumask: fix checking valid cpu range").

syzbot is hitting WARN_ON_ONCE(cpu >= nr_cpumask_bits) warning at
cpu_max_bits_warn() [1], for commit 78e5a33994 ("cpumask: fix checking
valid cpu range") is broken.  Obviously that patch hits WARN_ON_ONCE()
when e.g.  reading /proc/cpuinfo because passing "cpu + 1" instead of
"cpu" will trivially hit cpu == nr_cpumask_bits condition.

Although syzbot found this problem in linux-next.git on 2022/09/27 [2],
this problem was not fixed immediately.  As a result, that patch was
sent to linux.git before the patch author recognizes this problem, and
syzbot started failing to test changes in linux.git since 2022/10/10
[3].

Andrew Jones proposed a fix for x86 and riscv architectures [4].  But
[2] and [5] indicate that affected locations are not limited to arch
code.  More delay before we find and fix affected locations, less tested
kernel (and more difficult to bisect and fix) before release.

We should have inspected and fixed basically all cpumask users before
applying that patch.  We should not crash kernels in order to ask
existing cpumask users to update their code, even if limited to
CONFIG_DEBUG_PER_CPU_MAPS=y case.

Link: https://syzkaller.appspot.com/bug?extid=d0fd2bf0dd6da72496dd [1]
Link: https://syzkaller.appspot.com/bug?extid=21da700f3c9f0bc40150 [2]
Link: https://syzkaller.appspot.com/bug?extid=51a652e2d24d53e75734 [3]
Link: https://lkml.kernel.org/r/20221014155845.1986223-1-ajones@ventanamicro.com [4]
Link: https://syzkaller.appspot.com/bug?extid=4d46c43d81c3bd155060 [5]
Reported-by: Andrew Jones <ajones@ventanamicro.com>
Reported-by: syzbot+d0fd2bf0dd6da72496dd@syzkaller.appspotmail.com
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-10-16 10:45:17 -07:00
Nathan Chancellor
0a6de78cff lib/Kconfig.debug: Add check for non-constant .{s,u}leb128 support to DWARF5
When building with a RISC-V kernel with DWARF5 debug info using clang
and the GNU assembler, several instances of the following error appear:

  /tmp/vgettimeofday-48aa35.s:2963: Error: non-constant .uleb128 is not supported

Dumping the .s file reveals these .uleb128 directives come from
.debug_loc and .debug_ranges:

  .Ldebug_loc0:
          .byte   4                               # DW_LLE_offset_pair
          .uleb128 .Lfunc_begin0-.Lfunc_begin0    #   starting offset
          .uleb128 .Ltmp1-.Lfunc_begin0           #   ending offset
          .byte   1                               # Loc expr size
          .byte   90                              # DW_OP_reg10
          .byte   0                               # DW_LLE_end_of_list

  .Ldebug_ranges0:
          .byte   4                               # DW_RLE_offset_pair
          .uleb128 .Ltmp6-.Lfunc_begin0           #   starting offset
          .uleb128 .Ltmp27-.Lfunc_begin0          #   ending offset
          .byte   4                               # DW_RLE_offset_pair
          .uleb128 .Ltmp28-.Lfunc_begin0          #   starting offset
          .uleb128 .Ltmp30-.Lfunc_begin0          #   ending offset
          .byte   0                               # DW_RLE_end_of_list

There is an outstanding binutils issue to support a non-constant operand
to .sleb128 and .uleb128 in GAS for RISC-V but there does not appear to
be any movement on it, due to concerns over how it would work with
linker relaxation.

To avoid these build errors, prevent DWARF5 from being selected when
using clang and an assembler that does not have support for these symbol
deltas, which can be easily checked in Kconfig with as-instr plus the
small test program from the dwz test suite from the binutils issue.

Link: https://sourceware.org/bugzilla/show_bug.cgi?id=27215
Link: https://github.com/ClangBuiltLinux/linux/issues/1719
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-10-17 02:06:47 +09:00
Masahiro Yamada
3753af778d kbuild: fix single directory build
Commit f110e5a250 ("kbuild: refactor single builds of *.ko") was wrong.

KBUILD_MODULES _is_ needed for single builds.

Otherwise, "make foo/bar/baz/" does not build module objects at all.

Fixes: f110e5a250 ("kbuild: refactor single builds of *.ko")
Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: David Sterba <dsterba@suse.com>
2022-10-17 02:03:52 +09:00
Linus Torvalds
1501278bb7 Merge tag 'slab-for-6.1-rc1-hotfix' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab hotfix from Vlastimil Babka:
 "A single fix for the common-kmalloc series, for warnings on mips and
  sparc64 reported by Guenter Roeck"

* tag 'slab-for-6.1-rc1-hotfix' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/slab: use kmalloc_node() for off slab freelist_idx_t array allocation
2022-10-15 17:05:07 -07:00
Linus Torvalds
36d8a3edf8 Merge tag 'for-linus' of https://github.com/openrisc/linux
Pull OpenRISC updates from Stafford Horne:
 "I have relocated to London so not much work from me while I get
  settled.

  Still, OpenRISC picked up two patches in this window:

   - Fix for kernel page table walking from Jann Horn

   - MAINTAINER entry cleanup from Palmer Dabbelt"

* tag 'for-linus' of https://github.com/openrisc/linux:
  MAINTAINERS: git://github -> https://github.com for openrisc
  openrisc: Fix pagewalk usage in arch_dma_{clear, set}_uncached
2022-10-15 16:47:33 -07:00
Linus Torvalds
41410965c3 Merge tag 'pci-v6.1-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull pci fix from Bjorn Helgaas:
 "Revert the attempt to distribute spare resources to unconfigured
  hotplug bridges at boot time.

  This fixed some dock hot-add scenarios, but Jonathan Cameron reported
  that it broke a topology with a multi-function device where one
  function was a Switch Upstream Port and the other was an Endpoint"

* tag 'pci-v6.1-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  Revert "PCI: Distribute available resources for root buses, too"
2022-10-15 16:36:38 -07:00
Hyeonggon Yoo
e36ce448a0 mm/slab: use kmalloc_node() for off slab freelist_idx_t array allocation
After commit d6a71648db ("mm/slab: kmalloc: pass requests larger than
order-1 page to page allocator"), SLAB passes large ( > PAGE_SIZE * 2)
requests to buddy like SLUB does.

SLAB has been using kmalloc caches to allocate freelist_idx_t array for
off slab caches. But after the commit, freelist_size can be bigger than
KMALLOC_MAX_CACHE_SIZE.

Instead of using pointer to kmalloc cache, use kmalloc_node() and only
check if the kmalloc cache is off slab during calculate_slab_order().
If freelist_size > KMALLOC_MAX_CACHE_SIZE, no looping condition happens
as it allocates freelist_idx_t array directly from buddy.

Link: https://lore.kernel.org/all/20221014205818.GA1428667@roeck-us.net/
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Fixes: d6a71648db ("mm/slab: kmalloc: pass requests larger than order-1 page to page allocator")
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2022-10-15 21:42:05 +02:00
Palmer Dabbelt
34a0bac084 MAINTAINERS: git://github -> https://github.com for openrisc
Github deprecated the git:// links about a year ago, so let's move to
the https:// URLs instead.

Reported-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://github.blog/2021-09-01-improving-git-protocol-security-github/
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Stafford Horne <shorne@gmail.com>
2022-10-15 17:26:51 +01:00
Steve French
e3e9463414 smb3: improve SMB3 change notification support
Change notification is a commonly supported feature by most servers,
but the current ioctl to request notification when a directory is
changed does not return the information about what changed
(even though it is returned by the server in the SMB3 change
notify response), it simply returns when there is a change.

This ioctl improves upon CIFS_IOC_NOTIFY by returning the notify
information structure which includes the name of the file(s) that
changed and why. See MS-SMB2 2.2.35 for details on the individual
filter flags and the file_notify_information structure returned.

To use this simply pass in the following (with enough space
to fit at least one file_notify_information structure)

struct __attribute__((__packed__)) smb3_notify {
       uint32_t completion_filter;
       bool     watch_tree;
       uint32_t data_len;
       uint8_t  data[];
} __packed;

using CIFS_IOC_NOTIFY_INFO 0xc009cf0b
 or equivalently _IOWR(CIFS_IOCTL_MAGIC, 11, struct smb3_notify_info)

The ioctl will block until the server detects a change to that
directory or its subdirectories (if watch_tree is set).

Acked-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Acked-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-15 10:05:53 -05:00
Steve French
2bff065933 cifs: lease key is uninitialized in two additional functions when smb1
cifs_open and _cifsFileInfo_put also end up with lease_key uninitialized
in smb1 mounts.  It is cleaner to set lease key to zero in these
places where leases are not supported (smb1 can not return lease keys
so the field was uninitialized).

Addresses-Coverity: 1514207 ("Uninitialized scalar variable")
Addresses-Coverity: 1514331 ("Uninitialized scalar variable")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-15 10:05:53 -05:00
Steve French
625b60d4f9 cifs: lease key is uninitialized in smb1 paths
It is cleaner to set lease key to zero in the places where leases are not
supported (smb1 can not return lease keys so the field was uninitialized).

Addresses-Coverity: 1513994 ("Uninitialized scalar variable")
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-15 10:05:53 -05:00
Steve French
f09bd695af smb3: must initialize two ACL struct fields to zero
Coverity spotted that we were not initalizing Stbz1 and Stbz2 to
zero in create_sd_buf.

Addresses-Coverity: 1513848 ("Uninitialized scalar variable")
Cc: <stable@vger.kernel.org>
Reviewed-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-15 10:05:53 -05:00
Paulo Alcantara
b854b4ee66 cifs: fix double-fault crash during ntlmssp
The crash occurred because we were calling memzero_explicit() on an
already freed sess_data::iov[1] (ntlmsspblob) in sess_free_buffer().

Fix this by not calling memzero_explicit() on sess_data::iov[1] as
it's already by handled by callers.

Fixes: a4e430c8c8 ("cifs: replace kfree() with kfree_sensitive() for sensitive data")
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2022-10-15 10:04:38 -05:00
Arnaldo Carvalho de Melo
a3a365655a tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  b8d1d16360 ("x86/apic: Don't disable x2APIC if locked")
  ca5b7c0d96 ("perf/x86/amd/lbr: Add LbrExtV2 branch record support")

Addressing these tools/perf build warnings:

    diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
    Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'

That makes the beautification scripts to pick some new entries:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2022-10-14 18:06:34.294561729 -0300
  +++ after	2022-10-14 18:06:41.285744044 -0300
  @@ -264,6 +264,7 @@
   	[0xc0000102 - x86_64_specific_MSRs_offset] = "KERNEL_GS_BASE",
   	[0xc0000103 - x86_64_specific_MSRs_offset] = "TSC_AUX",
   	[0xc0000104 - x86_64_specific_MSRs_offset] = "AMD64_TSC_RATIO",
  +	[0xc000010e - x86_64_specific_MSRs_offset] = "AMD64_LBR_SELECT",
   	[0xc000010f - x86_64_specific_MSRs_offset] = "AMD_DBG_EXTN_CFG",
   	[0xc0000300 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_STATUS",
   	[0xc0000301 - x86_64_specific_MSRs_offset] = "AMD64_PERF_CNTR_GLOBAL_CTL",
  $

Now one can trace systemwide asking to see backtraces to where that MSR
is being read/written, see this example with a previous update:

  # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  ^C#

If we use -v (verbose mode) we can see what it does behind the scenes:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  Using CPUID AuthenticAMD-25-21-0
  0x6a0
  0x6a8
  New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  0x6a0
  0x6a8
  New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  mmap size 528384B
  ^C#

Example with a frequent msr:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
  Using CPUID AuthenticAMD-25-21-0
  0x48
  New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  0x48
  New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  mmap size 528384B
  Looking at the vmlinux_path (8 entries long)
  symsrc__init: build id mismatch for vmlinux.
  Using /proc/kcore for kernel data
  Using /proc/kallsyms for symbols
     0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule ([kernel.kallsyms])
                                       futex_wait_queue_me ([kernel.kallsyms])
                                       futex_wait ([kernel.kallsyms])
                                       do_futex ([kernel.kallsyms])
                                       __x64_sys_futex ([kernel.kallsyms])
                                       do_syscall_64 ([kernel.kallsyms])
                                       entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
                                       __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so)
     0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule_idle ([kernel.kallsyms])
                                       do_idle ([kernel.kallsyms])
                                       cpu_startup_entry ([kernel.kallsyms])
                                       secondary_startup_64_no_verify ([kernel.kallsyms])
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Daniel Sneddon <daniel.sneddon@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/lkml/Y0nQkz2TUJxwfXJd@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Qi Liu
5e91e57e68 perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packet
Add support for using 'perf report --dump-raw-trace' to parse PTT packet.

Example usage:

Output will contain raw PTT data and its textual representation, such
as (8DW format):

0 0 0x5810 [0x30]: PERF_RECORD_AUXTRACE size: 0x400000  offset: 0
ref: 0xa5d50c725  idx: 0  tid: -1  cpu: 0
.
. ... HISI PTT data: size 4194304 bytes
.  00000000: 00 00 00 00                                 Prefix
.  00000004: 08 20 00 60                                 Header DW0
.  00000008: ff 02 00 01                                 Header DW1
.  0000000c: 20 08 00 00                                 Header DW2
.  00000010: 10 e7 44 ab                                 Header DW3
.  00000014: 2a a8 1e 01                                 Time
.  00000020: 00 00 00 00                                 Prefix
.  00000024: 01 00 00 60                                 Header DW0
.  00000028: 0f 1e 00 01                                 Header DW1
.  0000002c: 04 00 00 00                                 Header DW2
.  00000030: 40 00 81 02                                 Header DW3
.  00000034: ee 02 00 00                                 Time
....

This patch only add basic parsing support according to the definition of
the PTT packet described in Documentation/trace/hisi-ptt.rst. And the
fields of each packet can be further decoded following the PCIe Spec's
definition of TLP packet.

Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi6124@gmail.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zeng Prime <prime.zeng@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/20220927081400.14364-4-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Qi Liu
057381a7ec perf auxtrace arm64: Add support for HiSilicon PCIe Tune and Trace device driver
HiSilicon PCIe tune and trace device (PTT) could dynamically tune the
PCIe link's events, and trace the TLP headers).

This patch add support for PTT device in perf tool, so users could use
'perf record' to get TLP headers trace data.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Acked-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi6124@gmail.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zeng Prime <prime.zeng@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/20220927081400.14364-3-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Qi Liu
45a3975f8e perf auxtrace arm: Refactor event list iteration in auxtrace_record__init()
Add find_pmu_for_event() and use to simplify logic in
auxtrace_record_init(). find_pmu_for_event() will be reused in
subsequent patches.

Reviewed-by: John Garry <john.garry@huawei.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bjorn Helgaas <helgaas@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi6124@gmail.com>
Cc: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zeng Prime <prime.zeng@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-pci@vger.kernel.org
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/20220927081400.14364-2-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Athira Rajeev
58d4802a5e perf tests stat+json_output: Include sanity check for topology
Testcase stat+json_output.sh fails in powerpc:

	86: perf stat JSON output linter : FAILED!

The testcase "stat+json_output.sh" verifies perf stat JSON output. The
test covers aggregation modes like per-socket, per-core, per-die, -A
(no_aggr mode) along with few other tests. It counts expected fields for
various commands. For example say -A (i.e, AGGR_NONE mode), expects 7
fields in the output having "CPU" as first field. Same way, for
per-socket, it expects the first field in result to point to socket id.
The testcases compares the result with expected count.

The values for socket, die, core and cpu are fetched from topology
directory:

  /sys/devices/system/cpu/cpu*/topology.

For example, socket value is fetched from "physical_package_id" file of
topology directory.  (cpu__get_topology_int() in util/cpumap.c)

If a platform fails to fetch the topology information, values will be
set to -1. For example, incase of pSeries platform of powerpc, value for
"physical_package_id" is restricted and not exposed. So, -1 will be
assigned.

Perf code has a checks for valid cpu id in "aggr_printout"
(stat-display.c), which displays the fields. So, in cases where topology
values not exposed, first field of the output displaying will be empty.
This cause the testcase to fail, as it counts  number of fields in the
output.

Incase of -A (AGGR_NONE mode,), testcase expects 7 fields in the output,
becos of -1 value obtained from topology files for some, only 6 fields
are printed. Hence a testcase failure reported due to mismatch in number
of fields in the output.

Patch here adds a sanity check in the testcase for topology.  Check will
help to skip the test if -1 value found.

Fixes: 0c343af2a2 ("perf test: JSON format checking")
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Suggested-by: Ian Rogers <irogers@google.com>
Suggested-by: James Clark <james.clark@arm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Claire Jensen <cjense@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Link: https://lore.kernel.org/r/20221006155149.67205-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Athira Rajeev
cd400f6f18 perf tests stat+csv_output: Include sanity check for topology
Testcase stat+csv_output.sh fails in powerpc:

	84: perf stat CSV output linter: FAILED!

The testcase "stat+csv_output.sh" verifies perf stat CSV output. The
test covers aggregation modes like per-socket, per-core, per-die, -A
(no_aggr mode) along with few other tests. It counts expected fields for
various commands. For example say -A (i.e, AGGR_NONE mode), expects 7
fields in the output having "CPU" as first field. Same way, for
per-socket, it expects the first field in result to point to socket id.
The testcases compares the result with expected count.

The values for socket, die, core and cpu are fetched from topology
directory:

  /sys/devices/system/cpu/cpu*/topology.

For example, socket value is fetched from "physical_package_id" file of
topology directory.  (cpu__get_topology_int() in util/cpumap.c)

If a platform fails to fetch the topology information, values will be
set to -1. For example, incase of pSeries platform of powerpc, value for
"physical_package_id" is restricted and not exposed. So, -1 will be
assigned.

Perf code has a checks for valid cpu id in "aggr_printout"
(stat-display.c), which displays the fields. So, in cases where topology
values not exposed, first field of the output displaying will be empty.
This cause the testcase to fail, as it counts  number of fields in the
output.

Incase of -A (AGGR_NONE mode,), testcase expects 7 fields in the output,
becos of -1 value obtained from topology files for some, only 6 fields
are printed. Hence a testcase failure reported due to mismatch in number
of fields in the output.

Patch here adds a sanity check in the testcase for topology.  Check will
help to skip the test if -1 value found.

Fixes: 7473ee56db ("perf test: Add checking for perf stat CSV output.")
Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Suggested-by: Ian Rogers <irogers@google.com>
Suggested-by: James Clark <james.clark@arm.com>
Signed-off-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Claire Jensen <cjense@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Link: https://lore.kernel.org/r/20221006155149.67205-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Adrian Hunter
6cef7dab3e perf intel-pt: Fix system_wide dummy event for hybrid
User space tasks can migrate between CPUs, so when tracing selected CPUs,
system-wide sideband is still needed, however evlist->core.has_user_cpus
is not set in the hybrid case, so check the target cpu_list instead.

Fixes: 7d189cadbe ("perf intel-pt: Track sideband system-wide when needed")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221012082259.22394-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Adrian Hunter
5a3d47071f perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc
uClibc segfaulted because NULL was passed as the format to fprintf().

That happened because one of the format strings was missing and
intel_pt_print_info() didn't check that before calling fprintf().

Add the missing format string, and check format is not NULL before calling
fprintf().

Fixes: 11fa7cb86b ("perf tools: Pass Intel PT information for decoding MTC and CYC")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221012082259.22394-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
James Clark
e28039667c perf test: Fix attr tests for PERF_FORMAT_LOST
Since PERF_FORMAT_LOST was added, the default read format has that bit
set, so add it to the tests. Keep the old value as well so that the test
still passes on older kernels.

This fixes the following failure:

  expected read_format=0|4, got 20
  FAILED './tests/attr/test-record-C0' - match failure

Fixes: 85b425f31c8866e0 ("perf record: Set PERF_FORMAT_LOST by default")
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221012094633.21669-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00
Ammy Yi
f77811a0f6 perf test: test_intel_pt.sh: Add 9 tests
Add tests:
	Test with MTC and TSC disabled
	Test with branches disabled
	Test with/without CYC
	Test recording with sample mode
	Test with kernel trace
	Test virtual LBR
	Test power events
	Test with TNT packets disabled
	Test with event_trace

These tests mostly check that perf record works with the corresponding
Intel PT config terms, sometimes also checking that certain packets do or
do not appear in the resulting trace as appropriate.

The "Test virtual LBR" is slightly trickier, using a Python script to
check that branch stacks are actually synthesized.

Signed-off-by: Ammy Yi <ammy.yi@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20221014170905.64069-8-adrian.hunter@intel.com
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-15 10:13:16 -03:00