cache_from_obj() was added by commit b9ce5ef49f ("sl[au]b: always get
the cache from its page in kmem_cache_free()") to support kmemcg, where
per-memcg cache can be different from the root one, so we can't use the
kmem_cache pointer given to kmem_cache_free().
Prior to that commit, SLUB already had debugging check+warning that could
be enabled to compare the given kmem_cache pointer to one referenced by
the slab page where the object-to-be-freed resides. This check was moved
to cache_from_obj(). Later the check was also enabled for
SLAB_FREELIST_HARDENED configs by commit 598a0717a8 ("mm/slab: validate
cache membership under freelist hardening").
These checks and warnings can be useful especially for the debugging,
which can be improved. Commit 598a0717a8 changed the pr_err() with
WARN_ON_ONCE() to WARN_ONCE() so only the first hit is now reported,
others are silent. This patch changes it to WARN() so that all errors are
reported.
It's also useful to print SLUB allocation/free tracking info for the
offending object, if tracking is enabled. Thus, export the SLUB
print_tracking() function and provide an empty one for SLAB.
For SLUB we can also benefit from the static key check in
kmem_cache_debug_flags(), but we need to move this function to slab.h and
declare the static key there.
[1] https://lore.kernel.org/r/20200608230654.828134-18-guro@fb.com
[vbabka@suse.cz: avoid bogus WARN()]
Link: https://lore.kernel.org/r/20200623090213.GW5535@shao2-debian
Link: http://lkml.kernel.org/r/b33e0fa7-cd28-4788-9e54-5927846329ef@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Matthew Garrett <mjg59@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: Vinayak Menon <vinmenon@codeaurora.org>
Link: http://lkml.kernel.org/r/afeda7ac-748b-33d8-a905-56b708148ad5@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The function cache_from_obj() was added by commit b9ce5ef49f ("sl[au]b:
always get the cache from its page in kmem_cache_free()") to support
kmemcg, where per-memcg cache can be different from the root one, so we
can't use the kmem_cache pointer given to kmem_cache_free().
Prior to that commit, SLUB already had debugging check+warning that could
be enabled to compare the given kmem_cache pointer to one referenced by
the slab page where the object-to-be-freed resides. This check was moved
to cache_from_obj(). Later the check was also enabled for
SLAB_FREELIST_HARDENED configs by commit 598a0717a8 ("mm/slab: validate
cache membership under freelist hardening").
These checks and warnings can be useful especially for the debugging,
which can be improved. Commit 598a0717a8 changed the pr_err() with
WARN_ON_ONCE() to WARN_ONCE() so only the first hit is now reported,
others are silent. This patch changes it to WARN() so that all errors are
reported.
It's also useful to print SLUB allocation/free tracking info for the
offending object, if tracking is enabled. We could export the SLUB
print_tracking() function and provide an empty one for SLAB, or realize
that both the debugging and hardening cases in cache_from_obj() are only
supported by SLUB anyway. So this patch moves cache_from_obj() from
slab.h to separate instances in slab.c and slub.c, where the SLAB version
only does the kmemcg lookup and even could be completely removed once the
kmemcg rework [1] is merged. The SLUB version can thus easily use the
print_tracking() function. It can also use the kmem_cache_debug_flags()
static key check for improved performance in kernels without the hardening
and with debugging not enabled on boot.
[1] https://lore.kernel.org/r/20200608230654.828134-18-guro@fb.com
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Jann Horn <jannh@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20200610163135.17364-10-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
One advantage of CONFIG_SLUB_DEBUG is that a generic distro kernel can be
built with the option enabled, but it's inactive until simply enabled on
boot, without rebuilding the kernel. With a static key, we can further
eliminate the overhead of checking whether a cache has a particular debug
flag enabled if we know that there are no such caches (slub_debug was not
enabled during boot). We use the same mechanism also for e.g.
page_owner, debug_pagealloc or kmemcg functionality.
This patch introduces the static key and makes the general check for
per-cache debug flags kmem_cache_debug() use it. This benefits several
call sites, including (slow path but still rather frequent) __slab_free().
The next patches will add more uses.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Roman Gushchin <guro@fb.com>
Acked-by: Christoph Lameter <cl@linux.com>
Cc: Jann Horn <jannh@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20200610163135.17364-7-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The attribute reflects the SLAB_RECLAIM_ACCOUNT cache flag. It's not
clear why this attribute was writable in the first place, as it's tied to
how the cache is used by its creator, it's not a user tunable.
Furthermore:
- it affects slab merging, but that's not being checked while toggled
- if affects whether __GFP_RECLAIMABLE flag is used to allocate page, but
the runtime toggle doesn't update allocflags
- it affects cache_vmstat_idx() so runtime toggling might lead to incosistency
of NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE
Thus make it read-only.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Jann Horn <jannh@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20200610163135.17364-6-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SLUB_DEBUG creates several files under /sys/kernel/slab/<cache>/ that can
be read to check if the respective debugging options are enabled for given
cache. Some options, namely sanity_checks, trace, and failslab can be
also enabled and disabled at runtime by writing into the files.
The runtime toggling is racy. Some options disable __CMPXCHG_DOUBLE when
enabled, which means that in case of concurrent allocations, some can
still use __CMPXCHG_DOUBLE and some not, leading to potential corruption.
The s->flags field is also not updated or checked atomically. The
simplest solution is to remove the runtime toggling. The extended
slub_debug boot parameter syntax introduced by earlier patch should allow
to fine-tune the debugging configuration during boot with same
granularity.
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Jann Horn <jannh@google.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20200610163135.17364-5-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SLUB allows runtime changing of page allocation order by writing into the
/sys/kernel/slab/<cache>/order file. Jann has reported [1] that this
interface allows the order to be set too small, leading to crashes.
While it's possible to fix the immediate issue, closer inspection reveals
potential races. Storing the new order calls calculate_sizes() which
non-atomically updates a lot of kmem_cache fields while the cache is still
in use. Unexpected behavior might occur even if the fields are set to the
same value as they were.
This could be fixed by splitting out the part of calculate_sizes() that
depends on forced_order, so that we only update kmem_cache.oo field. This
could still race with init_cache_random_seq(), shuffle_freelist(),
allocate_slab(). Perhaps it's possible to audit and e.g. add some
READ_ONCE/WRITE_ONCE accesses, it might be easier just to remove the
runtime order changes, which is what this patch does. If there are valid
usecases for per-cache order setting, we could e.g. extend the boot
parameters to do that.
[1] https://lore.kernel.org/r/CAG48ez31PP--h6_FzVyfJ4H86QYczAFPdxtJHUEEan+7VJETAQ@mail.gmail.com
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Christoph Lameter <cl@linux.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Vijayanand Jitta <vjitta@codeaurora.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20200610163135.17364-4-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
SLUB_DEBUG creates several files under /sys/kernel/slab/<cache>/ that can
be read to check if the respective debugging options are enabled for given
cache. The options can be also toggled at runtime by writing into the
files. Some of those, namely red_zone, poison, and store_user can be
toggled only when no objects yet exist in the cache.
Vijayanand reports [1] that there is a problem with freelist randomization
if changing the debugging option's state results in different number of
objects per page, and the random sequence cache needs thus needs to be
recomputed.
However, another problem is that the check for "no objects yet exist in
the cache" is racy, as noted by Jann [2] and fixing that would add
overhead or otherwise complicate the allocation/freeing paths. Thus it
would be much simpler just to remove the runtime toggling support. The
documentation describes it's "In case you forgot to enable debugging on
the kernel command line", but the neccessity of having no objects limits
its usefulness anyway for many caches.
Vijayanand describes an use case [3] where debugging is enabled for all
but zram caches for memory overhead reasons, and using the runtime toggles
was the only way to achieve such configuration. After the previous patch
it's now possible to do that directly from the kernel boot option, so we
can remove the dangerous runtime toggles by making the /sys attribute
files read-only.
While updating it, also improve the documentation of the debugging /sys files.
[1] https://lkml.kernel.org/r/1580379523-32272-1-git-send-email-vjitta@codeaurora.org
[2] https://lore.kernel.org/r/CAG48ez31PP--h6_FzVyfJ4H86QYczAFPdxtJHUEEan+7VJETAQ@mail.gmail.com
[3] https://lore.kernel.org/r/1383cd32-1ddc-4dac-b5f8-9c42282fa81c@codeaurora.org
Reported-by: Vijayanand Jitta <vjitta@codeaurora.org>
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Pekka Enberg <penberg@kernel.org>
Link: http://lkml.kernel.org/r/20200610163135.17364-3-vbabka@suse.cz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
As said by Linus:
A symmetric naming is only helpful if it implies symmetries in use.
Otherwise it's actively misleading.
In "kzalloc()", the z is meaningful and an important part of what the
caller wants.
In "kzfree()", the z is actively detrimental, because maybe in the
future we really _might_ want to use that "memfill(0xdeadbeef)" or
something. The "zero" part of the interface isn't even _relevant_.
The main reason that kzfree() exists is to clear sensitive information
that should not be leaked to other future users of the same memory
objects.
Rename kzfree() to kfree_sensitive() to follow the example of the recently
added kvfree_sensitive() and make the intention of the API more explicit.
In addition, memzero_explicit() is used to clear the memory to make sure
that it won't get optimized away by the compiler.
The renaming is done by using the command sequence:
git grep -w --name-only kzfree |\
xargs sed -i 's/kzfree/kfree_sensitive/'
followed by some editing of the kfree_sensitive() kerneldoc and adding
a kzfree backward compatibility macro in slab.h.
[akpm@linux-foundation.org: fs/crypto/inline_crypt.c needs linux/slab.h]
[akpm@linux-foundation.org: fix fs/crypto/inline_crypt.c some more]
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Joe Perches <joe@perches.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: "Jason A . Donenfeld" <Jason@zx2c4.com>
Link: http://lkml.kernel.org/r/20200616154311.12314-3-longman@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Clang's Control Flow Integrity (CFI) is a security mechanism that can help
prevent JOP chains, deployed extensively in downstream kernels used in
Android.
Its deployment is hindered by mismatches in function signatures. For this
case, we make callbacks match their intended function signature, and cast
parameters within them rather than casting the callback when passed as a
parameter.
When running `mount -t ntfs ...` we observe the following trace:
Call trace:
__cfi_check_fail+0x1c/0x24
name_to_dev_t+0x0/0x404
iget5_locked+0x594/0x5e8
ntfs_fill_super+0xbfc/0x43ec
mount_bdev+0x30c/0x3cc
ntfs_mount+0x18/0x24
mount_fs+0x1b0/0x380
vfs_kern_mount+0x90/0x398
do_mount+0x5d8/0x1a10
SyS_mount+0x108/0x144
el0_svc_naked+0x34/0x38
Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Tested-by: freak07 <michalechner92@googlemail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Acked-by: Anton Altaparmakov <anton@tuxera.com>
Link: http://lkml.kernel.org/r/20200718112513.533800-1-luca.stefani.ge1@gmail.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Try to find module in directory with vmlinux (for fresh build). Then try
standard paths where debuginfo are usually placed. Pick first file which
have elf section '.debug_line'.
Before:
$ echo 'tap_open+0x0/0x0 [tap]' |
./scripts/decode_stacktrace.sh /usr/lib/debug/boot/vmlinux-5.4.0-37-generic
WARNING! Modules path isn't set, but is needed to parse this symbol
tap_open+0x0/0x0 tap
After:
$ echo 'tap_open+0x0/0x0 [tap]' |
./scripts/decode_stacktrace.sh /usr/lib/debug/boot/vmlinux-5.4.0-37-generic
tap_open (drivers/net/tap.c:502) tap
Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Sasha Levin <sashal@kernel.org>
Link: http://lkml.kernel.org/r/159282923068.248444.5461337458421616083.stgit@buzz
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Library archives (.a) usually contain multiple object files so their
output of nm --size-sort contains lines like:
<omitted for brevity>
00000000000003a8 t run_test
extent-map-tests.o:
<omitted for brevity>
bloat-o-meter currently doesn't handle them which results in errors when
calling .split() on them. Fix this by simply ignoring them. This enables
diffing subsystems which generate built-in.a files.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200603103513.3712-1-nborisov@suse.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
For SMP systems using IPI based TLB invalidation, looking at
current->active_mm is entirely reasonable. This then presents the
following race condition:
CPU0 CPU1
flush_tlb_mm(mm) use_mm(mm)
<send-IPI>
tsk->active_mm = mm;
<IPI>
if (tsk->active_mm == mm)
// flush TLBs
</IPI>
switch_mm(old_mm,mm,tsk);
Where it is possible the IPI flushed the TLBs for @old_mm, not @mm,
because the IPI lands before we actually switched.
Avoid this by disabling IRQs across changing ->active_mm and
switch_mm().
Of the (SMP) architectures that have IPI based TLB invalidate:
Alpha - checks active_mm
ARC - ASID specific
IA64 - checks active_mm
MIPS - ASID specific flush
OpenRISC - shoots down world
PARISC - shoots down world
SH - ASID specific
SPARC - ASID specific
x86 - N/A
xtensa - checks active_mm
So at the very least Alpha, IA64 and Xtensa are suspect.
On top of this, for scheduler consistency we need at least preemption
disabled across changing tsk->mm and doing switch_mm(), which is
currently provided by task_lock(), but that's not sufficient for
PREEMPT_RT.
[akpm@linux-foundation.org: add comment]
Reported-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Kees Cook <keescook@chromium.org>
Cc: Jann Horn <jannh@google.com>
Cc: Will Deacon <will@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200721154106.GE10769@hirez.programming.kicks-ass.net
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Especially with memory hotplug, we can have offline sections (with a
garbage memmap) and overlapping zones. We have to make sure to only touch
initialized memmaps (online sections managed by the buddy) and that the
zone matches, to not move pages between zones.
To test if this can actually happen, I added a simple
BUG_ON(page_zone(page_i) != page_zone(page_j));
right before the swap. When hotplugging a 256M DIMM to a 4G x86-64 VM and
onlining the first memory block "online_movable" and the second memory
block "online_kernel", it will trigger the BUG, as both zones (NORMAL and
MOVABLE) overlap.
This might result in all kinds of weird situations (e.g., double
allocations, list corruptions, unmovable allocations ending up in the
movable zone).
Fixes: e900a918b0 ("mm: shuffle initial free memory to improve memory-side-cache utilization")
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Wei Yang <richard.weiyang@linux.alibaba.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org> [5.2+]
Link: http://lkml.kernel.org/r/20200624094741.9918-2-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull tty/serial updates from Greg KH:
"Here is the large set of TTY and Serial driver patches for 5.9-rc1.
Lots of bugfixes in here, thanks to syzbot fuzzing for serial and vt
and console code.
Other highlights include:
- much needed vt/vc code cleanup from Jiri Slaby
- 8250 driver fixes and additions
- various serial driver updates and feature enhancements
- locking cleanup for serial/console initializations
- other minor cleanups
All of these have been in linux-next with no reported issues"
* tag 'tty-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (90 commits)
MAINTAINERS: enlist Greg formally for console stuff
vgacon: Fix for missing check in scrollback handling
Revert "serial: 8250: Let serial core initialise spin lock"
serial: 8250: Let serial core initialise spin lock
tty: keyboard, do not speculate on func_table index
serial: stm32: Add RS485 RTS GPIO control
serial: 8250_dw: Fix common clocks usage race condition
serial: 8250_dw: Pass the same rate to the clk round and set rate methods
serial: 8250_dw: Simplify the ref clock rate setting procedure
serial: 8250: Add 8250 port clock update method
tty: serial: imx: add imx earlycon driver
tty: serial: imx: enable imx serial console port as module
tty/synclink: remove leftover bits of non-PCI card support
tty: Use the preferred form for passing the size of a structure type
tty: Fix identation issues in struct serial_struct32
tty: Avoid the use of one-element arrays
serial: msm_serial: add sparse context annotation
serial: pmac_zilog: add sparse context annotation
newport_con: vc_color is now in state
serial: imx: use hrtimers for rs485 delays
...
Pull staging/IIO driver updates from Greg KH:
"Here is the large set of Staging and IIO driver patches for 5.9-rc1.
Lots of churn here, but overall the size increase in lines added is
small, while adding a load of new IIO drivers.
Major things in here:
- lots and lots of IIO new drivers and frameworks added
- IIO driver fixes and updates
- lots of tiny coding style cleanups for staging drivers
- vc04_services major reworks and cleanups
We had 3 set of drivers move out of staging in this round as well:
- wilc1000 wireless driver moved out of staging
- speakup moved out of staging
- most USB driver moved out of staging
Full details are in the shortlog.
All of these have been in linux-next with no reported issues. The last
few changes here were to resolve reported linux-next issues, and they
seem to have resolved the problems"
* tag 'staging-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (428 commits)
staging: most: fix up movement of USB driver
staging: rts5208: clear alignment style issues
staging: r8188eu: replace rtw_netdev_priv define with inline function
staging: netlogic: clear alignment style issues
staging: android: ashmem: Fix lockdep warning for write operation
drivers: most: add USB adapter driver
staging: most: Use %pM format specifier for MAC addresses
staging: ks7010: Use %pM format specifier for MAC addresses
staging: qlge: qlge_dbg: removed comment repition
staging: wfx: Use flex_array_size() helper in memcpy()
staging: rtl8723bs: Align macro definitions
staging: rtl8723bs: Clean up function declations
staging: rtl8723bs: Fix coding style errors
drivers: staging: audio: Fix the missing header file for helper file
staging: greybus: audio: Enable GB codec, audio module compilation.
staging: greybus: audio: Add helper APIs for dynamic audio modules
staging: greybus: audio: Resolve compilation error in topology parser
staging: greybus: audio: Resolve compilation errors for GB codec module
staging: greybus: audio: Maintain jack list within GB Audio module
staging: greybus: audio: Update snd_jack FW usage as per new APIs
...
Pull sound updates from Takashi Iwai:
"This became wide and scattered updates all over the sound tree as
diffstat shows: lots of (still ongoing) refactoring works in ASoC,
fixes and cleanups caught by static analysis, inclusive term
conversions as well as lots of new drivers. Below are highlights:
ASoC core:
- API cleanups and conversions to the unified mute_stream() call
- Simplify I/O helper functions
- Use helper macros to retrieve RTD from substreams
ASoC drivers:
- Lots of fixes and cleanups in Intel ASoC drivers
- Lots of new stuff: Freescale MQS and i.MX6sx, Intel KeemBay I2S,
Maxim MAX98360A and MAX98373 SoundWire, various Mediatek boards,
nVidia Tegra 186 and 210, RealTek RL6231, Samsung Midas and Aries
boards, TI J721e EVM
ALSA core:
- Minor code refacotring for SG-buffer handling
HD-audio:
- Generalization of mute-LED handling with LED classdev
- Intel silent stream support for HDMI
- Device-specific fixes: CA0132, Loongson-3
Others:
- Usual USB- and HD-audio quirks for various devices
- Fixes for echoaudio DMA position handling
- Various documents and trivial fixes for sparse warnings
- Conversion to adopt inclusive terms"
* tag 'sound-5.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (479 commits)
ALSA: pci: delete repeated words in comments
ALSA: isa: delete repeated words in comments
ALSA: hda/tegra: Add 100us dma stop delay
ALSA: hda: Add dma stop delay variable
ASoC: hda/tegra: Set buffer alignment to 128 bytes
ALSA: seq: oss: Serialize ioctls
ALSA: hda/hdmi: Add quirk to force connectivity
ALSA: usb-audio: add startech usb audio dock name
ALSA: usb-audio: Add support for Lenovo ThinkStation P620
Revert "ALSA: hda: call runtime_allow() for all hda controllers"
ALSA: hda/ca0132 - Fix AE-5 microphone selection commands.
ALSA: hda/ca0132 - Add new quirk ID for Recon3D.
ALSA: hda/ca0132 - Fix ZxR Headphone gain control get value.
ALSA: hda/realtek: Add alc269/alc662 pin-tables for Loongson-3 laptops
ALSA: docs: fix typo
ALSA: doc: use correct config variable name
ASoC: core: Two step component registration
ASoC: core: Simplify snd_soc_component_initialize declaration
ASoC: core: Relocate and expose snd_soc_component_initialize
ASoC: sh: Replace 'select' DMADEVICES 'with depends on'
...
Pull KVM updates from Paolo Bonzini:
"s390:
- implement diag318
x86:
- Report last CPU for debugging
- Emulate smaller MAXPHYADDR in the guest than in the host
- .noinstr and tracing fixes from Thomas
- nested SVM page table switching optimization and fixes
Generic:
- Unify shadow MMU cache data structures across architectures"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (127 commits)
KVM: SVM: Fix sev_pin_memory() error handling
KVM: LAPIC: Set the TDCR settable bits
KVM: x86: Specify max TDP level via kvm_configure_mmu()
KVM: x86/mmu: Rename max_page_level to max_huge_page_level
KVM: x86: Dynamically calculate TDP level from max level and MAXPHYADDR
KVM: VXM: Remove temporary WARN on expected vs. actual EPTP level mismatch
KVM: x86: Pull the PGD's level from the MMU instead of recalculating it
KVM: VMX: Make vmx_load_mmu_pgd() static
KVM: x86/mmu: Add separate helper for shadow NPT root page role calc
KVM: VMX: Drop a duplicate declaration of construct_eptp()
KVM: nSVM: Correctly set the shadow NPT root level in its MMU role
KVM: Using macros instead of magic values
MIPS: KVM: Fix build error caused by 'kvm_run' cleanup
KVM: nSVM: remove nonsensical EXITINFO1 adjustment on nested NPF
KVM: x86: Add a capability for GUEST_MAXPHYADDR < HOST_MAXPHYADDR support
KVM: VMX: optimize #PF injection when MAXPHYADDR does not match
KVM: VMX: Add guest physical address check in EPT violation and misconfig
KVM: VMX: introduce vmx_need_pf_intercept
KVM: x86: update exception bitmap on CPUID changes
KVM: x86: rename update_bp_intercept to update_exception_bitmap
...
This reverts commit 8bb9bf242d.
It seems the vmalloc page tables aren't always preallocated in all
situations, because Jason Donenfeld reports an oops with this commit:
BUG: unable to handle page fault for address: ffffe8ffffd00608
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP
CPU: 2 PID: 22 Comm: kworker/2:0 Not tainted 5.8.0+ #154
RIP: process_one_work+0x2c/0x2d0
Code: 41 56 41 55 41 54 55 48 89 f5 53 48 89 fb 48 83 ec 08 48 8b 06 4c 8b 67 40 49 89 c6 45 30 f6 a8 04 b8 00 00 00 00 4c 0f 44 f0 <49> 8b 46 08 44 8b a8 00 01 05
Call Trace:
worker_thread+0x4b/0x3b0
? rescuer_thread+0x360/0x360
kthread+0x116/0x140
? __kthread_create_worker+0x110/0x110
ret_from_fork+0x1f/0x30
CR2: ffffe8ffffd00608
and that page fault address is right in that vmalloc space, and we
clearly don't have a PGD/P4D entry for it.
Looking at the "Code:" line, the actual fault seems to come from the
'pwq->wq' dereference at the top of the process_one_work() function:
struct pool_workqueue *pwq = get_work_pwq(work);
struct worker_pool *pool = worker->pool;
bool cpu_intensive = pwq->wq->flags & WQ_CPU_INTENSIVE;
so 'struct pool_workqueue *pwq' is the allocation that hasn't been
synchronized across CPUs.
Just revert for now, while Joerg figures out the cause.
Reported-and-bisected-by: Jason A. Donenfeld <Jason@zx2c4.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull sched/fifo updates from Ingo Molnar:
"This adds the sched_set_fifo*() encapsulation APIs to remove static
priority level knowledge from non-scheduler code.
The three APIs for non-scheduler code to set SCHED_FIFO are:
- sched_set_fifo()
- sched_set_fifo_low()
- sched_set_normal()
These are two FIFO priority levels: default (high), and a 'low'
priority level, plus sched_set_normal() to set the policy back to
non-SCHED_FIFO.
Since the changes affect a lot of non-scheduler code, we kept this in
a separate tree"
* tag 'sched-fifo-2020-08-04' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
sched,tracing: Convert to sched_set_fifo()
sched: Remove sched_set_*() return value
sched: Remove sched_setscheduler*() EXPORTs
sched,psi: Convert to sched_set_fifo_low()
sched,rcutorture: Convert to sched_set_fifo_low()
sched,rcuperf: Convert to sched_set_fifo_low()
sched,locktorture: Convert to sched_set_fifo()
sched,irq: Convert to sched_set_fifo()
sched,watchdog: Convert to sched_set_fifo()
sched,serial: Convert to sched_set_fifo()
sched,powerclamp: Convert to sched_set_fifo()
sched,ion: Convert to sched_set_normal()
sched,powercap: Convert to sched_set_fifo*()
sched,spi: Convert to sched_set_fifo*()
sched,mmc: Convert to sched_set_fifo*()
sched,ivtv: Convert to sched_set_fifo*()
sched,drm/scheduler: Convert to sched_set_fifo*()
sched,msm: Convert to sched_set_fifo*()
sched,psci: Convert to sched_set_fifo*()
sched,drbd: Convert to sched_set_fifo*()
...
Pull integrity updates from Mimi Zohar:
"The nicest change is the IMA policy rule checking. The other changes
include allowing the kexec boot cmdline line measure policy rules to
be defined in terms of the inode associated with the kexec kernel
image, making the IMA_APPRAISE_BOOTPARAM, which governs the IMA
appraise mode (log, fix, enforce), a runtime decision based on the
secure boot mode of the system, and including errno in the audit log"
* tag 'integrity-v5.9' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
integrity: remove redundant initialization of variable ret
ima: move APPRAISE_BOOTPARAM dependency on ARCH_POLICY to runtime
ima: AppArmor satisfies the audit rule requirements
ima: Rename internal filter rule functions
ima: Support additional conditionals in the KEXEC_CMDLINE hook function
ima: Use the common function to detect LSM conditionals in a rule
ima: Move comprehensive rule validation checks out of the token parser
ima: Use correct type for the args_p member of ima_rule_entry.lsm elements
ima: Shallow copy the args_p member of ima_rule_entry.lsm elements
ima: Fail rule parsing when appraise_flag=blacklist is unsupportable
ima: Fail rule parsing when the KEY_CHECK hook is combined with an invalid cond
ima: Fail rule parsing when the KEXEC_CMDLINE hook is combined with an invalid cond
ima: Fail rule parsing when buffer hook functions have an invalid action
ima: Free the entire rule if it fails to parse
ima: Free the entire rule when deleting a list of rules
ima: Have the LSM free its audit rule
IMA: Add audit log for failure conditions
integrity: Add errno field in audit message