Commit Graph

1122284 Commits

Author SHA1 Message Date
Nick Desaulniers
5aa4860eb5 ARM: 9262/1: remove lazy evaluation in Makefile
arch-y and tune-y used lazy evaluation since they used to contain

cc-option checks. They don't any longer, so just eagerly evaluate these
command line flags.

Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-11-07 14:19:04 +00:00
Ulf Hansson
3f712c7c78 ARM: 9261/1: amba: Drop redundant assignments of the system PM callbacks
If the system PM callbacks haven't been assigned, the PM core falls back to
invoke the corresponding the pm_generic_* helpers for the device. Let's
rely on this behaviour and drop the redundant assignments.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-11-07 14:19:04 +00:00
Nick Desaulniers
527d08631b ARM: 9260/1: lib/xor: use r10 rather than r7 in xor_arm4regs_{2|3}
kbuild test robot reports:
In file included from crypto/xor.c:17:
./arch/arm/include/asm/xor.h:61:3: error: write to reserved register 'R7'
                GET_BLOCK_4(p1);
                ^
./arch/arm/include/asm/xor.h:20:10: note: expanded from macro 'GET_BLOCK_4'
        __asm__("ldmia  %0, {%1, %2, %3, %4}"
                ^
./arch/arm/include/asm/xor.h:63:3: error: write to reserved register 'R7'
                PUT_BLOCK_4(p1);
                ^
./arch/arm/include/asm/xor.h:42:23: note: expanded from macro 'PUT_BLOCK_4'
        __asm__ __volatile__("stmia     %0!, {%2, %3, %4, %5}"
                             ^
./arch/arm/include/asm/xor.h:83:3: error: write to reserved register 'R7'
                GET_BLOCK_4(p1);
                ^
./arch/arm/include/asm/xor.h:20:10: note: expanded from macro 'GET_BLOCK_4'
        __asm__("ldmia  %0, {%1, %2, %3, %4}"
                ^
./arch/arm/include/asm/xor.h:86:3: error: write to reserved register 'R7'
                PUT_BLOCK_4(p1);
                ^
./arch/arm/include/asm/xor.h:42:23: note: expanded from macro 'PUT_BLOCK_4'
        __asm__ __volatile__("stmia     %0!, {%2, %3, %4, %5}"
                             ^
Thumb2 uses r7 rather than r11 as the frame pointer. Let's use r10
rather than r7 for these temporaries.

Link: https://github.com/ClangBuiltLinux/linux/issues/1732
Link: https://lore.kernel.org/llvm/202210072120.V1O2SuKY-lkp@intel.com/

Reported-by: kernel test robot <lkp@intel.com>
Suggested-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-11-07 14:19:03 +00:00
Guilherme G. Piccoli
8fc0b333a7 ARM: 9257/1: Disable FIQs (but not IRQs) on CPUs shutdown paths
Currently the regular CPU shutdown path for ARM disables IRQs/FIQs
in the secondary CPUs - smp_send_stop() calls ipi_cpu_stop(), which
is responsible for that. IRQs are architecturally masked when we
take an interrupt, but FIQs are high priority than IRQs, hence they
aren't masked. With that said, it makes sense to disable FIQs here,
but there's no need for (re-)disabling IRQs.

More than that: there is an alternative path for disabling CPUs,
in the form of function crash_smp_send_stop(), which is used for
kexec/panic path. This function relies on a SMP call that also
triggers a busy-wait loop [at machine_crash_nonpanic_core()], but
without disabling FIQs. This might lead to odd scenarios, like
early interrupts in the boot of kexec'd kernel or even interrupts
in secondary "disabled" CPUs while the main one still works in the
panic path and assumes all secondary CPUs are (really!) off.

So, let's disable FIQs in both paths and *not* disable IRQs a second
time, since they are already masked in both paths by the architecture.
This way, we keep both CPU quiesce paths consistent and safe.

Cc: Marc Zyngier <maz@kernel.org>
Cc: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-11-07 14:19:02 +00:00
Nick Desaulniers
3220022038 ARM: 9256/1: NWFPE: avoid compiler-generated __aeabi_uldivmod
clang-15's ability to elide loops completely became more aggressive when
it can deduce how a variable is being updated in a loop. Counting down
one variable by an increment of another can be replaced by a modulo
operation.

For 64b variables on 32b ARM EABI targets, this can result in the
compiler generating calls to __aeabi_uldivmod, which it does for a do
while loop in float64_rem().

For the kernel, we'd generally prefer that developers not open code 64b
division via binary / operators and instead use the more explicit
helpers from div64.h. On arm-linux-gnuabi targets, failure to do so can
result in linkage failures due to undefined references to
__aeabi_uldivmod().

While developers can avoid open coding divisions on 64b variables, the
compiler doesn't know that the Linux kernel has a partial implementation
of a compiler runtime (--rtlib) to enforce this convention.

It's also undecidable for the compiler whether the code in question
would be faster to execute the loop vs elide it and do the 64b division.

While I actively avoid using the internal -mllvm command line flags, I
think we get better code than using barrier() here, which will force
reloads+spills in the loop for all toolchains.

Link: https://github.com/ClangBuiltLinux/linux/issues/1666

Reported-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-11-07 14:19:02 +00:00
Wang Kefeng
e513ffd881 ARM: 9255/1: efi/dump UEFI runtime page tables for ARM
UEFI runtime page tables dump only for ARM64 at present,
but ARM support EFI and ARM_PTDUMP_DEBUGFS now. Since
ARM could potentially execute with a 1G/3G user/kernel
split, choosing 1G as the upper limit for UEFI runtime
end, with this, we could enable UEFI runtime page tables
on ARM.

Acked-by: Ard Biesheuvel <ardb@kernel.org>
Tested-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-11-07 14:19:01 +00:00
Wang Kefeng
b40b84b120 ARM: 9254/1: mm: Provide better message when kernel fault
If there is a kernel fault, see do_kernel_fault(), we only print
the generic "paging request" or "NULL pointer dereference" message
which don't show read, write or excute information, let's provide
better fault message for them.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-11-07 14:19:00 +00:00
Seung-Woo Kim
d539fee9f8 ARM: 9253/1: ubsan: select ARCH_HAS_UBSAN_SANITIZE_ALL
To enable UBSAN on ARM, this patch enables ARCH_HAS_UBSAN_SANITIZE_ALL
from arm confiuration. Basic kernel bootup test is passed on arm with
CONFIG_UBSAN_SANITIZE_ALL enabled.

[florian: rebased against v6.0-rc7]

Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-11-07 14:19:00 +00:00
Alex Sverdlin
4ab07fd3fb ARM: 9252/1: module: Teach unwinder about PLTs
"unwind: Index not found eef26358" warnings keep popping up on
CONFIG_ARM_MODULE_PLTS-enabled systems if the PC points to a PLT veneer.
Teach the unwinder how to deal with them, taking into account they don't
change state of the stack or register file except loading PC.

Link: https://lore.kernel.org/linux-arm-kernel/20200402153845.30985-1-kursad.oney@broadcom.com/

Tested-by: Kursad Oney <kursad.oney@broadcom.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-11-07 14:18:59 +00:00
Wang Kefeng
e66372ecb8 ARM: 9246/1: dump: show page table level name
ARM could have 3 page table level if ARM_LPAE enabled, or only 2 page
table level, let's show the page table level name when dump.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-10-04 11:09:48 +01:00
Wang Kefeng
afd1efa1d8 ARM: 9245/1: dump: show FDT region
Since commit 7a1be318f5 ("ARM: 9012/1: move device tree mapping out
of linear region"), FDT is placed between the end of the vmalloc region
and the start of the fixmap region, let's show it in dump.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-10-04 11:09:48 +01:00
Alex Sverdlin
823f606ab6 ARM: 9242/1: kasan: Only map modules if CONFIG_KASAN_VMALLOC=n
In case CONFIG_KASAN_VMALLOC=y kasan_populate_vmalloc() allocates the
shadow pages dynamically. But even worse is that kasan_release_vmalloc()
releases them, which is not compatible with create_mapping() of
MODULES_VADDR..MODULES_END range:

BUG: Bad page state in process kworker/9:1  pfn:2068b
page:e5e06160 refcount:0 mapcount:0 mapping:00000000 index:0x0
flags: 0x1000(reserved)
raw: 00001000 e5e06164 e5e06164 00000000 00000000 00000000 ffffffff 00000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
bad because of flags: 0x1000(reserved)
Modules linked in: ip_tables
CPU: 9 PID: 154 Comm: kworker/9:1 Not tainted 5.4.188-... #1
Hardware name: LSI Axxia AXM55XX
Workqueue: events do_free_init
unwind_backtrace
show_stack
dump_stack
bad_page
free_pcp_prepare
free_unref_page
kasan_depopulate_vmalloc_pte
__apply_to_page_range
apply_to_existing_page_range
kasan_release_vmalloc
__purge_vmap_area_lazy
_vm_unmap_aliases.part.0
__vunmap
do_free_init
process_one_work
worker_thread
kthread

Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Alexander Sverdlin <alexander.sverdlin@nokia.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-10-04 11:09:48 +01:00
Linus Walleij
8770b9e575 ARM: 9240/1: dma-mapping: Pass (void *) to virt_to_page()
Pointers to virtual memory functions are (void *) but the
__dma_update_pte() function is passing an unsigned long.
Fix this up by explicit cast.

Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-10-04 11:09:48 +01:00
Li Huafei
752ec621ef ARM: 9234/1: stacktrace: Avoid duplicate saving of exception PC value
Because an exception stack frame is not created in the exception entry,
save_trace() does special handling for the exception PC, but this is
only needed when CONFIG_FRAME_POINTER_UNWIND=y. When
CONFIG_ARM_UNWIND=y, unwind annotations have been added to the exception
entry and save_trace() will repeatedly save the exception PC:

    [0x7f000090] hrtimer_hander+0x8/0x10 [hrtimer]
    [0x8019ec50] __hrtimer_run_queues+0x18c/0x394
    [0x8019f760] hrtimer_run_queues+0xbc/0xd0
    [0x8019def0] update_process_times+0x34/0x80
    [0x801ad2a4] tick_periodic+0x48/0xd0
    [0x801ad3dc] tick_handle_periodic+0x1c/0x7c
    [0x8010f2e0] twd_handler+0x30/0x40
    [0x80177620] handle_percpu_devid_irq+0xa0/0x23c
    [0x801718d0] generic_handle_domain_irq+0x24/0x34
    [0x80502d28] gic_handle_irq+0x74/0x88
    [0x8085817c] generic_handle_arch_irq+0x58/0x78
    [0x80100ba8] __irq_svc+0x88/0xc8
    [0x80108114] arch_cpu_idle+0x38/0x3c
    [0x80108114] arch_cpu_idle+0x38/0x3c    <==== duplicate saved exception PC
    [0x80861bf8] default_idle_call+0x38/0x130
    [0x8015d5cc] do_idle+0x150/0x214
    [0x8015d978] cpu_startup_entry+0x18/0x1c
    [0x808589c0] rest_init+0xd8/0xdc
    [0x80c00a44] arch_post_acpi_subsys_init+0x0/0x8

We can move the special handling of the exception PC in save_trace() to
the unwind_frame() of the frame pointer unwinder.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Linus Waleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-10-04 11:09:47 +01:00
Li Huafei
5854e4d853 ARM: 9233/1: stacktrace: Skip frame pointer boundary check for call_with_stack()
When using the frame pointer unwinder, it was found that the stack trace
output of stack_trace_save() is incomplete if the stack contains
call_with_stack():

 [0x7f00002c] dump_stack_task+0x2c/0x90 [hrtimer]
 [0x7f0000a0] hrtimer_hander+0x10/0x18 [hrtimer]
 [0x801a67f0] __hrtimer_run_queues+0x1b0/0x3b4
 [0x801a7350] hrtimer_run_queues+0xc4/0xd8
 [0x801a597c] update_process_times+0x3c/0x88
 [0x801b5a98] tick_periodic+0x50/0xd8
 [0x801b5bf4] tick_handle_periodic+0x24/0x84
 [0x8010ffc4] twd_handler+0x38/0x48
 [0x8017d220] handle_percpu_devid_irq+0xa8/0x244
 [0x80176e9c] generic_handle_domain_irq+0x2c/0x3c
 [0x8052e3a8] gic_handle_irq+0x7c/0x90
 [0x808ab15c] generic_handle_arch_irq+0x60/0x80
 [0x8051191c] call_with_stack+0x1c/0x20

For the frame pointer unwinder, unwind_frame() checks stackframe::fp by
stackframe::sp. Since call_with_stack() switches the SP from one stack
to another, stackframe::fp and stackframe: :sp will point to different
stacks, so we can no longer check stackframe::fp by stackframe::sp. Skip
checking stackframe::fp at this point to avoid this problem.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: Linus Waleij <linus.walleij@linaro.org>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-10-04 11:09:47 +01:00
Zhen Lei
09cffecaa7 ARM: 9224/1: Dump the stack traces based on the parameter 'regs' of show_regs()
Function show_regs() is usually called in interrupt handler or exception
handler, it prints the registers specified by the parameter 'regs', then
dump the stack traces. Although not explicitly documented, dump the stack
traces based on'regs' seems to make the most sense. Although dump_stack()
can finally dump the desired content, because 'regs' are saved by the
entry of current interrupt or exception. In the following example we can
see: 1) The backtrace of interrupt or exception handler is not expected,
it causes confusion. 2) Something is printed repeatedly. The line with
the kernel version "CPU: 0 PID: 70 Comm: test0 Not tainted 5.19.0+ #8",
the registers saved in "Exception stack" which 'regs' actually point to.

For example:
rcu: INFO: rcu_sched self-detected stall on CPU
rcu:    0-....: (499 ticks this GP) idle=379/1/0x40000002 softirq=91/91 fqs=249
        (t=500 jiffies g=-911 q=13 ncpus=4)
CPU: 0 PID: 70 Comm: test0 Not tainted 5.19.0+ #8
Hardware name: ARM-Versatile Express
PC is at ktime_get+0x4c/0xe8
LR is at ktime_get+0x4c/0xe8
pc : 8019a474  lr : 8019a474  psr: 60000013
sp : cabd1f28  ip : 00000001  fp : 00000005
r10: 527bf1b8  r9 : 431bde82  r8 : d7b634db
r7 : 0000156e  r6 : 61f234f8  r5 : 00000001  r4 : 80ca86c0
r3 : ffffffff  r2 : fe5bce0b  r1 : 00000000  r0 : 01a431f4
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 6121406a  DAC: 00000051
CPU: 0 PID: 70 Comm: test0 Not tainted 5.19.0+ #8  <-----------start----------
Hardware name: ARM-Versatile Express                                          |
 unwind_backtrace from show_stack+0x10/0x14                                   |
 show_stack from dump_stack_lvl+0x40/0x4c                                     |
 dump_stack_lvl from rcu_dump_cpu_stacks+0x10c/0x134                          |
 rcu_dump_cpu_stacks from rcu_sched_clock_irq+0x780/0xaf4                     |
 rcu_sched_clock_irq from update_process_times+0x54/0x74                      |
 update_process_times from tick_periodic+0x3c/0xd4                            |
 tick_periodic from tick_handle_periodic+0x20/0x80                       worthless
 tick_handle_periodic from twd_handler+0x30/0x40                             or
 twd_handler from handle_percpu_devid_irq+0x8c/0x1c8                    duplicated
 handle_percpu_devid_irq from generic_handle_domain_irq+0x24/0x34             |
 generic_handle_domain_irq from gic_handle_irq+0x74/0x88                      |
 gic_handle_irq from generic_handle_arch_irq+0x34/0x44                        |
 generic_handle_arch_irq from call_with_stack+0x18/0x20                       |
 call_with_stack from __irq_svc+0x98/0xb0                                     |
Exception stack(0xcabd1ed8 to 0xcabd1f20)                                     |
1ec0:                                                       01a431f4 00000000 |
1ee0: fe5bce0b ffffffff 80ca86c0 00000001 61f234f8 0000156e d7b634db 431bde82 |
1f00: 527bf1b8 00000005 00000001 cabd1f28 8019a474 8019a474 60000013 ffffffff |
 __irq_svc from ktime_get+0x4c/0xe8                 <---------end--------------
 ktime_get from test_task+0x44/0x110
 test_task from kthread+0xd8/0xf4
 kthread from ret_from_fork+0x14/0x2c
Exception stack(0xcabd1fb0 to 0xcabd1ff8)
1fa0:                                     00000000 00000000 00000000 00000000
1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
1fe0: 00000000 00000000 00000000 00000000 00000013 00000000

After replacing dump_stack() with dump_backtrace():
rcu: INFO: rcu_sched self-detected stall on CPU
rcu:    0-....: (500 ticks this GP) idle=8f7/1/0x40000002 softirq=129/129 fqs=241
        (t=500 jiffies g=-915 q=13 ncpus=4)
CPU: 0 PID: 69 Comm: test0 Not tainted 5.19.0+ #9
Hardware name: ARM-Versatile Express
PC is at ktime_get+0x4c/0xe8
LR is at ktime_get+0x4c/0xe8
pc : 8019a494  lr : 8019a494  psr: 60000013
sp : cabddf28  ip : 00000001  fp : 00000002
r10: 0779cb48  r9 : 431bde82  r8 : d7b634db
r7 : 00000a66  r6 : e835ab70  r5 : 00000001  r4 : 80ca86c0
r3 : ffffffff  r2 : ff337d39  r1 : 00000000  r0 : 00cc82c6
Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
Control: 10c5387d  Table: 611d006a  DAC: 00000051
 ktime_get from test_task+0x44/0x110
 test_task from kthread+0xd8/0xf4
 kthread from ret_from_fork+0x14/0x2c
Exception stack(0xcabddfb0 to 0xcabddff8)
dfa0:                                     00000000 00000000 00000000 00000000
dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
dfe0: 00000000 00000000 00000000 00000000 00000013 00000000

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-09-22 08:21:30 +01:00
Zhen Lei
370d51c842 ARM: 9232/1: Replace this_cpu_* with raw_cpu_* in handle_bad_stack()
The hardware automatically disable the IRQ interrupt before jumping to the
interrupt or exception vector. Therefore, the preempt_disable() operation
in this_cpu_read() after macro expansion is unnecessary. In fact, function
this_cpu_read() may trigger scheduling, see pseudocode below.

Pseudocode of this_cpu_read(xx):
preempt_disable_notrace();
raw_cpu_read(xx);
if (unlikely(__preempt_count_dec_and_test()))
	__preempt_schedule_notrace();

Therefore, use raw_cpu_* instead of this_cpu_* to eliminate potential
hazards. At the very least, it reduces a few lines of assembly code.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-08-31 14:50:08 +01:00
Wang Kefeng
edd61fc1ca ARM: 9228/1: vfp: kill vfp_flush/release_thread()
Those functions are removed since 2006 commit d6551e884c
("[ARM] Add thread_notify infrastructure").

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-08-31 14:50:08 +01:00
Ben Wolsieffer
3d47ff2568 ARM: 9226/1: disable FDPIC ABI
When building with an arm-*-uclinuxfdpiceabi toolchain, the FDPIC ABI is
enabled by default but should not be used to build the kernel.
Therefore, pass -mno-fdpic if supported by the compiler.

Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-08-31 14:50:07 +01:00
Baruch Siach
ee50036b25 ARM: 9221/1: traps: print un-hashed user pc on undefined instruction
When user undefined instruction debug is enabled pc value is hashed like
kernel pointers for security reason. But the security benefit of this
hash is very limited because the code goes on to call __show_regs() that
prints the plain pointer value. pc is a user pointer anyway, so the
kernel does not leak anything. The only result is confusion about the
difference between the pc value on the first printed line, and the value
that __show_regs() prints.

Always print the plain value of pc.

Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
2022-08-30 11:02:43 +01:00
Linus Torvalds
b90cb10531 Linux 6.0-rc3 v6.0-rc3 2022-08-28 15:05:29 -07:00
Linus Torvalds
b467192ec7 Merge tag 'mm-hotfixes-stable-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull more hotfixes from Andrew Morton:
 "Seventeen hotfixes.  Mostly memory management things.

  Ten patches are cc:stable, addressing pre-6.0 issues"

* tag 'mm-hotfixes-stable-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  .mailmap: update Luca Ceresoli's e-mail address
  mm/mprotect: only reference swap pfn page if type match
  squashfs: don't call kmalloc in decompressors
  mm/damon/dbgfs: avoid duplicate context directory creation
  mailmap: update email address for Colin King
  asm-generic: sections: refactor memory_intersects
  bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem
  ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown
  Revert "memcg: cleanup racy sum avoidance code"
  mm/zsmalloc: do not attempt to free IS_ERR handle
  binder_alloc: add missing mmap_lock calls when using the VMA
  mm: re-allow pinning of zero pfns (again)
  vmcoreinfo: add kallsyms_num_syms symbol
  mailmap: update Guilherme G. Piccoli's email addresses
  writeback: avoid use-after-free after removing device
  shmem: update folio if shmem_replace_page() updates the page
  mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte
2022-08-28 14:49:59 -07:00
Linus Torvalds
373eff576e Merge tag 'bitmap-6.0-rc3' of github.com:/norov/linux
Pull bitmap fixes from Yury Norov:
 "Fix the reported issues, and implements the suggested improvements,
  for the version of the cpumask tests [1] that was merged with commit
  c41e8866c2 ("lib/test: introduce cpumask KUnit test suite").

  These changes include fixes for the tests, and better alignment with
  the KUnit style guidelines"

* tag 'bitmap-6.0-rc3' of github.com:/norov/linux:
  lib/cpumask_kunit: add tests file to MAINTAINERS
  lib/cpumask_kunit: log mask contents
  lib/test_cpumask: follow KUnit style guidelines
  lib/test_cpumask: fix cpu_possible_mask last test
  lib/test_cpumask: drop cpu_possible_mask full test
2022-08-28 14:36:27 -07:00
Luca Ceresoli
0ebafe2ea8 .mailmap: update Luca Ceresoli's e-mail address
My Bootlin address is preferred from now on.

Link: https://lkml.kernel.org/r/20220826130515.3011951-1-luca.ceresoli@bootlin.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Atish Patra <atishp@atishpatra.org>
Cc: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Cc: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:46 -07:00
Peter Xu
3d2f78f08c mm/mprotect: only reference swap pfn page if type match
Yu Zhao reported a bug after the commit "mm/swap: Add swp_offset_pfn() to
fetch PFN from swap entry" added a check in swp_offset_pfn() for swap type [1]:

  kernel BUG at include/linux/swapops.h:117!
  CPU: 46 PID: 5245 Comm: EventManager_De Tainted: G S         O L 6.0.0-dbg-DEV #2
  RIP: 0010:pfn_swap_entry_to_page+0x72/0xf0
  Code: c6 48 8b 36 48 83 fe ff 74 53 48 01 d1 48 83 c1 08 48 8b 09 f6
  c1 01 75 7b 66 90 48 89 c1 48 8b 09 f6 c1 01 74 74 5d c3 eb 9e <0f> 0b
  48 ba ff ff ff ff 03 00 00 00 eb ae a9 ff 0f 00 00 75 13 48
  RSP: 0018:ffffa59e73fabb80 EFLAGS: 00010282
  RAX: 00000000ffffffe8 RBX: 0c00000000000000 RCX: ffffcd5440000000
  RDX: 1ffffffffff7a80a RSI: 0000000000000000 RDI: 0c0000000000042b
  RBP: ffffa59e73fabb80 R08: ffff9965ca6e8bb8 R09: 0000000000000000
  R10: ffffffffa5a2f62d R11: 0000030b372e9fff R12: ffff997b79db5738
  R13: 000000000000042b R14: 0c0000000000042b R15: 1ffffffffff7a80a
  FS:  00007f549d1bb700(0000) GS:ffff99d3cf680000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000440d035b3180 CR3: 0000002243176004 CR4: 00000000003706e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   <TASK>
   change_pte_range+0x36e/0x880
   change_p4d_range+0x2e8/0x670
   change_protection_range+0x14e/0x2c0
   mprotect_fixup+0x1ee/0x330
   do_mprotect_pkey+0x34c/0x440
   __x64_sys_mprotect+0x1d/0x30

It triggers because pfn_swap_entry_to_page() could be called upon e.g. a
genuine swap entry.

Fix it by only calling it when it's a write migration entry where the page*
is used.

[1] https://lore.kernel.org/lkml/CAOUHufaVC2Za-p8m0aiHw6YkheDcrO-C3wRGixwDS32VTS+k1w@mail.gmail.com/

Link: https://lkml.kernel.org/r/20220823221138.45602-1-peterx@redhat.com
Fixes: 6c287605fd ("mm: remember exclusively mapped anonymous pages with PG_anon_exclusive")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reported-by: Yu Zhao <yuzhao@google.com>
Tested-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: "Huang, Ying" <ying.huang@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:46 -07:00
Phillip Lougher
1f13dff09f squashfs: don't call kmalloc in decompressors
The decompressors may be called while in an atomic section.  So move the
kmalloc() out of this path, and into the "page actor" init function.

This fixes a regression introduced by commit
f268eedddf ("squashfs: extend "page actor" to handle missing pages")

Link: https://lkml.kernel.org/r/20220822215430.15933-1-phillip@squashfs.org.uk
Fixes: f268eedddf ("squashfs: extend "page actor" to handle missing pages")
Reported-by: Chris Murphy <lists@colorremedies.com>
Signed-off-by: Phillip Lougher <phillip@squashfs.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:45 -07:00
Badari Pulavarty
d26f607036 mm/damon/dbgfs: avoid duplicate context directory creation
When user tries to create a DAMON context via the DAMON debugfs interface
with a name of an already existing context, the context directory creation
fails but a new context is created and added in the internal data
structure, due to absence of the directory creation success check.  As a
result, memory could leak and DAMON cannot be turned on.  An example test
case is as below:

    # cd /sys/kernel/debug/damon/
    # echo "off" >  monitor_on
    # echo paddr > target_ids
    # echo "abc" > mk_context
    # echo "abc" > mk_context
    # echo $$ > abc/target_ids
    # echo "on" > monitor_on  <<< fails

Return value of 'debugfs_create_dir()' is expected to be ignored in
general, but this is an exceptional case as DAMON feature is depending
on the debugfs functionality and it has the potential duplicate name
issue.  This commit therefore fixes the issue by checking the directory
creation failure and immediately return the error in the case.

Link: https://lkml.kernel.org/r/20220821180853.2400-1-sj@kernel.org
Fixes: 75c1c2b53c ("mm/damon/dbgfs: support multiple contexts")
Signed-off-by: Badari Pulavarty <badari.pulavarty@intel.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>	[ 5.15.x]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:45 -07:00
Colin Ian King
ac733f6558 mailmap: update email address for Colin King
Colin King is working on kernel janitorial fixes in his spare time and
using his Intel email is confusing.  Use his gmail account as the default
email address.

Link: https://lkml.kernel.org/r/20220817212753.101109-1-colin.i.king@gmail.com
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:45 -07:00
Quanyang Wang
0c7d7cc2b4 asm-generic: sections: refactor memory_intersects
There are two problems with the current code of memory_intersects:

First, it doesn't check whether the region (begin, end) falls inside the
region (virt, vend), that is (virt < begin && vend > end).

The second problem is if vend is equal to begin, it will return true but
this is wrong since vend (virt + size) is not the last address of the
memory region but (virt + size -1) is.  The wrong determination will
trigger the misreporting when the function check_for_illegal_area calls
memory_intersects to check if the dma region intersects with stext region.

The misreporting is as below (stext is at 0x80100000):
 WARNING: CPU: 0 PID: 77 at kernel/dma/debug.c:1073 check_for_illegal_area+0x130/0x168
 DMA-API: chipidea-usb2 e0002000.usb: device driver maps memory from kernel text or rodata [addr=800f0000] [len=65536]
 Modules linked in:
 CPU: 1 PID: 77 Comm: usb-storage Not tainted 5.19.0-yocto-standard #5
 Hardware name: Xilinx Zynq Platform
  unwind_backtrace from show_stack+0x18/0x1c
  show_stack from dump_stack_lvl+0x58/0x70
  dump_stack_lvl from __warn+0xb0/0x198
  __warn from warn_slowpath_fmt+0x80/0xb4
  warn_slowpath_fmt from check_for_illegal_area+0x130/0x168
  check_for_illegal_area from debug_dma_map_sg+0x94/0x368
  debug_dma_map_sg from __dma_map_sg_attrs+0x114/0x128
  __dma_map_sg_attrs from dma_map_sg_attrs+0x18/0x24
  dma_map_sg_attrs from usb_hcd_map_urb_for_dma+0x250/0x3b4
  usb_hcd_map_urb_for_dma from usb_hcd_submit_urb+0x194/0x214
  usb_hcd_submit_urb from usb_sg_wait+0xa4/0x118
  usb_sg_wait from usb_stor_bulk_transfer_sglist+0xa0/0xec
  usb_stor_bulk_transfer_sglist from usb_stor_bulk_srb+0x38/0x70
  usb_stor_bulk_srb from usb_stor_Bulk_transport+0x150/0x360
  usb_stor_Bulk_transport from usb_stor_invoke_transport+0x38/0x440
  usb_stor_invoke_transport from usb_stor_control_thread+0x1e0/0x238
  usb_stor_control_thread from kthread+0xf8/0x104
  kthread from ret_from_fork+0x14/0x2c

Refactor memory_intersects to fix the two problems above.

Before the 1d7db834a0 ("dma-debug: use memory_intersects()
directly"), memory_intersects is called only by printk_late_init:

printk_late_init -> init_section_intersects ->memory_intersects.

There were few places where memory_intersects was called.

When commit 1d7db834a0 ("dma-debug: use memory_intersects()
directly") was merged and CONFIG_DMA_API_DEBUG is enabled, the DMA
subsystem uses it to check for an illegal area and the calltrace above
is triggered.

[akpm@linux-foundation.org: fix nearby comment typo]
Link: https://lkml.kernel.org/r/20220819081145.948016-1-quanyang.wang@windriver.com
Fixes: 9795593625 ("asm/sections: add helpers to check for section data")
Signed-off-by: Quanyang Wang <quanyang.wang@windriver.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Thierry Reding <treding@nvidia.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:45 -07:00
Liu Shixin
dd0ff4d12d bootmem: remove the vmemmap pages from kmemleak in put_page_bootmem
The vmemmap pages is marked by kmemleak when allocated from memblock. 
Remove it from kmemleak when freeing the page.  Otherwise, when we reuse
the page, kmemleak may report such an error and then stop working.

 kmemleak: Cannot insert 0xffff98fb6eab3d40 into the object search tree (overlaps existing)
 kmemleak: Kernel memory leak detector disabled
 kmemleak: Object 0xffff98fb6be00000 (size 335544320):
 kmemleak:   comm "swapper", pid 0, jiffies 4294892296
 kmemleak:   min_count = 0
 kmemleak:   count = 0
 kmemleak:   flags = 0x1
 kmemleak:   checksum = 0
 kmemleak:   backtrace:

Link: https://lkml.kernel.org/r/20220819094005.2928241-1-liushixin2@huawei.com
Fixes: f41f2ed43c (mm: hugetlb: free the vmemmap pages associated with each HugeTLB page)
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:45 -07:00
Heming Zhao
550842cc60 ocfs2: fix freeing uninitialized resource on ocfs2_dlm_shutdown
After commit 0737e01de9 ("ocfs2: ocfs2_mount_volume does cleanup job
before return error"), any procedure after ocfs2_dlm_init() fails will
trigger crash when calling ocfs2_dlm_shutdown().

ie: On local mount mode, no dlm resource is initialized.  If
ocfs2_mount_volume() fails in ocfs2_find_slot(), error handling will call
ocfs2_dlm_shutdown(), then does dlm resource cleanup job, which will
trigger kernel crash.

This solution should bypass uninitialized resources in
ocfs2_dlm_shutdown().

Link: https://lkml.kernel.org/r/20220815085754.20417-1-heming.zhao@suse.com
Fixes: 0737e01de9 ("ocfs2: ocfs2_mount_volume does cleanup job before return error")
Signed-off-by: Heming Zhao <heming.zhao@suse.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:45 -07:00
Shakeel Butt
dbb16df644 Revert "memcg: cleanup racy sum avoidance code"
This reverts commit 96e51ccf1a.

Recently we started running the kernel with rstat infrastructure on
production traffic and begin to see negative memcg stats values. 
Particularly the 'sock' stat is the one which we observed having negative
value.

$ grep "sock " /mnt/memory/job/memory.stat
sock 253952
total_sock 18446744073708724224

Re-run after couple of seconds

$ grep "sock " /mnt/memory/job/memory.stat
sock 253952
total_sock 53248

For now we are only seeing this issue on large machines (256 CPUs) and
only with 'sock' stat.  I think the networking stack increase the stat on
one cpu and decrease it on another cpu much more often.  So, this negative
sock is due to rstat flusher flushing the stats on the CPU that has seen
the decrement of sock but missed the CPU that has increments.  A typical
race condition.

For easy stable backport, revert is the most simple solution.  For long
term solution, I am thinking of two directions.  First is just reduce the
race window by optimizing the rstat flusher.  Second is if the reader sees
a negative stat value, force flush and restart the stat collection. 
Basically retry but limited.

Link: https://lkml.kernel.org/r/20220817172139.3141101-1-shakeelb@google.com
Fixes: 96e51ccf1a ("memcg: cleanup racy sum avoidance code")
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Cc: "Michal Koutný" <mkoutny@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Greg Thelen <gthelen@google.com>
Cc: <stable@vger.kernel.org>	[5.15]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:44 -07:00
Sergey Senozhatsky
a5d2172180 mm/zsmalloc: do not attempt to free IS_ERR handle
zsmalloc() now returns ERR_PTR values as handles, which zram accidentally
can pass to zs_free().  Another bad scenario is when zcomp_compress()
fails - handle has default -ENOMEM value, and zs_free() will try to free
that "pointer value".

Add the missing check and make sure that zs_free() bails out when
ERR_PTR() is passed to it.

Link: https://lkml.kernel.org/r/20220816050906.2583956-1-senozhatsky@chromium.org
Fixes: c7e6f17b52 ("zsmalloc: zs_malloc: return ERR_PTR on failure")
Signed-off-by: Sergey Senozhatsky <senozhatsky@chromium.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Nitin Gupta <ngupta@vflare.org>,
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:44 -07:00
Liam Howlett
44e602b4e5 binder_alloc: add missing mmap_lock calls when using the VMA
Take the mmap_read_lock() when using the VMA in binder_alloc_print_pages()
and when checking for a VMA in binder_alloc_new_buf_locked().

It is worth noting binder_alloc_new_buf_locked() drops the VMA read lock
after it verifies a VMA exists, but may be taken again deeper in the call
stack, if necessary.

Link: https://lkml.kernel.org/r/20220810160209.1630707-1-Liam.Howlett@oracle.com
Fixes: a43cfc87ca (android: binder: stop saving a pointer to the VMA)
Signed-off-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Ondrej Mosnacek <omosnace@redhat.com>
Reported-by: <syzbot+a7b60a176ec13cafb793@syzkaller.appspotmail.com>
Acked-by: Carlos Llamas <cmllamas@google.com>
Tested-by: Ondrej Mosnacek <omosnace@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Christian Brauner (Microsoft) <brauner@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Joel Fernandes <joel@joelfernandes.org>
Cc: Martijn Coenen <maco@android.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Todd Kjos <tkjos@android.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: "Arve Hjønnevåg" <arve@android.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:44 -07:00
Alex Williamson
fcab34b433 mm: re-allow pinning of zero pfns (again)
The below referenced commit makes the same error as 1c56343258 ("mm: fix
is_pinnable_page against a cma page"), re-interpreting the logic to
exclude pinning of the zero page, which breaks device assignment with
vfio.

To avoid further subtle mistakes, split the logic into discrete tests.

[akpm@linux-foundation.org: simplify comment, per John]
Link: https://lkml.kernel.org/r/166015037385.760108.16881097713975517242.stgit@omen
Link: https://lore.kernel.org/all/165490039431.944052.12458624139225785964.stgit@omen
Fixes: f25cbb7a95 ("mm: add zone device coherent type memory support")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Suggested-by: Matthew Wilcox <willy@infradead.org>
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Tested-by: Slawomir Laba <slawomirx.laba@intel.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Cc: Alex Sierra <alex.sierra@amd.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Alistair Popple <apopple@nvidia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:44 -07:00
Stephen Brennan
f09bddbd86 vmcoreinfo: add kallsyms_num_syms symbol
The rest of the kallsyms symbols are useless without knowing the number of
symbols in the table.  In an earlier patch, I somehow dropped the
kallsyms_num_syms symbol, so add it back in.

Link: https://lkml.kernel.org/r/20220808205410.18590-1-stephen.s.brennan@oracle.com
Fixes: 5fd8fea935 ("vmcoreinfo: include kallsyms symbols")
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:44 -07:00
Guilherme G. Piccoli
6c26d17eea mailmap: update Guilherme G. Piccoli's email addresses
Both @canonical and @ibm email addresses are invalid now; use my personal
address instead.

Link: https://lkml.kernel.org/r/20220804202207.439427-1-gpiccoli@igalia.com
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:43 -07:00
Khazhismel Kumykov
f87904c075 writeback: avoid use-after-free after removing device
When a disk is removed, bdi_unregister gets called to stop further
writeback and wait for associated delayed work to complete.  However,
wb_inode_writeback_end() may schedule bandwidth estimation dwork after
this has completed, which can result in the timer attempting to access the
just freed bdi_writeback.

Fix this by checking if the bdi_writeback is alive, similar to when
scheduling writeback work.

Since this requires wb->work_lock, and wb_inode_writeback_end() may get
called from interrupt, switch wb->work_lock to an irqsafe lock.

Link: https://lkml.kernel.org/r/20220801155034.3772543-1-khazhy@google.com
Fixes: 45a2966fd6 ("writeback: fix bandwidth estimate for spiky workload")
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Cc: Michael Stapelberg <stapelberg+linux@google.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:43 -07:00
Matthew Wilcox (Oracle)
9dfb3b8d65 shmem: update folio if shmem_replace_page() updates the page
If we allocate a new page, we need to make sure that our folio matches
that new page.

If we do end up in this code path, we store the wrong page in the shmem
inode's page cache, and I would rather imagine that data corruption
ensues.

This will be solved by changing shmem_replace_page() to
shmem_replace_folio(), but this is the minimal fix.

Link: https://lkml.kernel.org/r/20220730042518.1264767-1-willy@infradead.org
Fixes: da08e9b793 ("mm/shmem: convert shmem_swapin_page() to shmem_swapin_folio()")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: William Kucharski <william.kucharski@oracle.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:43 -07:00
Miaohe Lin
ab74ef708d mm/hugetlb: avoid corrupting page->mapping in hugetlb_mcopy_atomic_pte
In MCOPY_ATOMIC_CONTINUE case with a non-shared VMA, pages in the page
cache are installed in the ptes.  But hugepage_add_new_anon_rmap is called
for them mistakenly because they're not vm_shared.  This will corrupt the
page->mapping used by page cache code.

Link: https://lkml.kernel.org/r/20220712130542.18836-1-linmiaohe@huawei.com
Fixes: f619147104 ("userfaultfd: add UFFDIO_CONTINUE ioctl")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2022-08-28 14:02:43 -07:00
Linus Torvalds
8379c0b31f Merge tag 'for-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
 "Fixes:

   - check that subvolume is writable when changing xattrs from security
     namespace

   - fix memory leak in device lookup helper

   - update generation of hole file extent item when merging holes

   - fix space cache corruption and potential double allocations; this
     is a rare bug but can be serious once it happens, stable backports
     and analysis tool will be provided

   - fix error handling when deleting root references

   - fix crash due to assert when attempting to cancel suspended device
     replace, add message what to do if mount fails due to missing
     replace item

  Regressions:

   - don't merge pages into bio if their page offset is not contiguous

   - don't allow large NOWAIT direct reads, this could lead to short
     reads eg. in io_uring"

* tag 'for-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: add info when mount fails due to stale replace target
  btrfs: replace: drop assert for suspended replace
  btrfs: fix silent failure when deleting root reference
  btrfs: fix space cache corruption and potential double allocations
  btrfs: don't allow large NOWAIT direct reads
  btrfs: don't merge pages into bio if their page offset is not contiguous
  btrfs: update generation of hole file extent item when merging holes
  btrfs: fix possible memory leak in btrfs_get_dev_args_from_path()
  btrfs: check if root is readonly while setting security xattr
2022-08-28 10:44:04 -07:00
Linus Torvalds
c7bb3fbc1b Merge tag '6.0-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cfis fixes from Steve French:

 - two locking fixes (zero range, punch hole)

 - DFS 9 fix (padding), affecting some servers

 - three minor cleanup changes

* tag '6.0-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Add helper function to check smb1+ server
  cifs: Use help macro to get the mid header size
  cifs: Use help macro to get the header preamble size
  cifs: skip extra NULL byte in filenames
  smb3: missing inode locks in punch hole
  smb3: missing inode locks in zero range
2022-08-28 10:35:16 -07:00
Linus Torvalds
2f23a7c914 Merge tag 'x86-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 fixes from Ingo Molnar:

 - Fix PAT on Xen, which caused i915 driver failures

 - Fix compat INT 80 entry crash on Xen PV guests

 - Fix 'MMIO Stale Data' mitigation status reporting on older Intel CPUs

 - Fix RSB stuffing regressions

 - Fix ORC unwinding on ftrace trampolines

 - Add Intel Raptor Lake CPU model number

 - Fix (work around) a SEV-SNP bootloader bug providing bogus values in
   boot_params->cc_blob_address, by ignoring the value on !SEV-SNP
   bootups.

 - Fix SEV-SNP early boot failure

 - Fix the objtool list of noreturn functions and annotate snp_abort(),
   which bug confused objtool on gcc-12.

 - Fix the documentation for retbleed

* tag 'x86-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation/ABI: Mention retbleed vulnerability info file for sysfs
  x86/sev: Mark snp_abort() noreturn
  x86/sev: Don't use cc_platform_has() for early SEV-SNP calls
  x86/boot: Don't propagate uninitialized boot_params->cc_blob_address
  x86/cpu: Add new Raptor Lake CPU model number
  x86/unwind/orc: Unwind ftrace trampolines with correct ORC entry
  x86/nospec: Fix i386 RSB stuffing
  x86/nospec: Unwreck the RSB stuffing
  x86/bugs: Add "unknown" reporting for MMIO Stale Data
  x86/entry: Fix entry_INT80_compat for Xen PV guests
  x86/PAT: Have pat_enabled() properly reflect state when running on Xen
2022-08-28 10:10:23 -07:00
Linus Torvalds
4459d800f7 Merge tag 'perf-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 perf fixes from Ingo Molnar:
 "Misc fixes: an Arch-LBR fix, a PEBS enumeration fix, an Intel DS fix,
  PEBS constraints fix on Alder Lake CPUs and an Intel uncore PMU fix"

* tag 'perf-urgent-2022-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/uncore: Fix broken read_counter() for SNB IMC PMU
  perf/x86/intel: Fix pebs event constraints for ADL
  perf/x86/intel/ds: Fix precise store latency handling
  perf/x86/core: Set pebs_capable and PMU_FL_PEBS_ALL for the Baseline
  perf/x86/lbr: Enable the branch type for the Arch LBR by default
2022-08-28 10:05:42 -07:00
Linus Torvalds
611875d510 Merge tag 'perf-tools-fixes-for-v6.0-2022-08-27' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fixup setup of weak groups when using 'perf stat --repeat', add a
   'perf test' for it.

 - Fix memory leaks in 'perf sched record' detected with
   -fsanitize=address.

 - Fix build when PYTHON_CONFIG is user supplied.

 - Capitalize topdown metrics' names in 'perf stat', so that the output,
   sometimes parsed, matches the Intel SDM docs.

 - Make sure the documentation for the save_type filter about Intel
   systems with Arch LBR support (12th-Gen+ client or 4th-Gen Xeon+
   server) reflects recent related kernel changes.

 - Fix 'perf record' man page formatting of description of support to
   hybrid systems.

 - Update arm64´s KVM header from the kernel sources.

* tag 'perf-tools-fixes-for-v6.0-2022-08-27' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf stat: Capitalize topdown metrics' names
  perf docs: Update the documentation for the save_type filter
  perf sched: Fix memory leaks in __cmd_record detected with -fsanitize=address
  perf record: Fix manpage formatting of description of support to hybrid systems
  perf test: Stat test for repeat with a weak group
  perf stat: Clear evsel->reset_group for each stat run
  tools kvm headers arm64: Update KVM header from the kernel sources
  perf python: Fix build when PYTHON_CONFIG is user supplied
2022-08-28 09:58:00 -07:00
Linus Torvalds
10d4879f9e Merge tag 'thermal-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull thermal control fixes from Rafael Wysocki:
 "Fix two issues introduced recently and one driver problem leading to a
  NULL pointer dereference in some cases.

  Specifics:

   - Add missing EXPORT_SYMBOL_GPL in the thermal core and add back the
     required 'trips' property to the thermal zone DT bindings (Daniel
     Lezcano)

   - Prevent the int340x_thermal driver from crashing when a package
     with a buffer of 0 length is returned by an ACPI control method
     evaluated by it (Lee, Chun-Yi)"

* tag 'thermal-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal/int340x_thermal: handle data_vault when the value is ZERO_SIZE_PTR
  dt-bindings: thermal: Fix missing required property
  thermal/core: Add missing EXPORT_SYMBOL_GPL
2022-08-27 15:58:38 -07:00
Linus Torvalds
b98f602df7 Merge tag 'pm-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fix from Rafael Wysocki:
 "Make __resolve_freq() check the presence of the frequency table
  instead of checking whether or not the ->target_index() callback is
  implemented by the driver, because that need not be the case when
  __resolve_freq() is used (Lukasz Luba)"

* tag 'pm-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpufreq: check only freq_table in __resolve_freq()
2022-08-27 15:53:49 -07:00
Linus Torvalds
2b1ddb5950 Merge tag 'acpi-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
 "These fix issues introduced by recent changes related to the handling
  of ACPI device properties and a coding mistake in the exit path of the
  ACPI processor driver.

  Specifics:

   - Prevent acpi_thermal_cpufreq_exit() from attempting to remove
     the same frequency QoS request multiple times (Riwen Lu)

   - Fix type detection for integer ACPI device properties (Stefan
     Binding)

   - Avoid emitting false-positive warnings when processing ACPI
     device properties and drop the useless default case from the
     acpi_copy_property_array_uint() macro (Sakari Ailus)"

* tag 'acpi-6.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: property: Remove default association from integer maximum values
  ACPI: property: Ignore already existing data node tags
  ACPI: property: Fix type detection of unified integer reading functions
  ACPI: processor: Remove freq Qos request for all CPUs
2022-08-27 15:47:02 -07:00
Linus Torvalds
dee1873796 Merge tag 's390-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Vasily Gorbik:

 - Fix double free of guarded storage and runtime instrumentation
   control blocks on fork() failure

 - Fix triggering write fault when VMA does not allow VM_WRITE

* tag 's390-6.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/mm: do not trigger write fault when vma does not allow VM_WRITE
  s390: fix double free of GS and RI CBs on fork() failure
2022-08-27 15:40:51 -07:00
Linus Torvalds
05519f2480 Merge tag 'for-linus-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:

 - two minor cleanups

 - a fix of the xen/privcmd driver avoiding a possible NULL dereference
   in an error case

* tag 'for-linus-6.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/privcmd: fix error exit of privcmd_ioctl_dm_op()
  xen: move from strlcpy with unused retval to strscpy
  xen: x86: remove setting the obsolete config XEN_MAX_DOMAIN_MEMORY
2022-08-27 15:38:00 -07:00