Michael Ellerman 20045f0155 powerpc/64s/radix: Don't warn on copros in radix__tlb_flush()
Sachin reported a warning when running the inject-ra-err selftest:

  # selftests: powerpc/mce: inject-ra-err
  Disabling lock debugging due to kernel taint
  MCE: CPU19: machine check (Severe)  Real address Load/Store (foreign/control memory) [Not recovered]
  MCE: CPU19: PID: 5254 Comm: inject-ra-err NIP: [0000000010000e48]
  MCE: CPU19: Initiator CPU
  MCE: CPU19: Unknown
  ------------[ cut here ]------------
  WARNING: CPU: 19 PID: 5254 at arch/powerpc/mm/book3s64/radix_tlb.c:1221 radix__tlb_flush+0x160/0x180
  CPU: 19 PID: 5254 Comm: inject-ra-err Kdump: loaded Tainted: G   M        E      6.6.0-rc3-00055-g9ed22ae6be81 #4
  Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
  ...
  NIP radix__tlb_flush+0x160/0x180
  LR  radix__tlb_flush+0x104/0x180
  Call Trace:
    radix__tlb_flush+0xf4/0x180 (unreliable)
    tlb_finish_mmu+0x15c/0x1e0
    exit_mmap+0x1a0/0x510
    __mmput+0x60/0x1e0
    exit_mm+0xdc/0x170
    do_exit+0x2bc/0x5a0
    do_group_exit+0x4c/0xc0
    sys_exit_group+0x28/0x30
    system_call_exception+0x138/0x330
    system_call_vectored_common+0x15c/0x2ec

And bisected it to commit e43c0a0c3c ("powerpc/64s/radix: combine
final TLB flush and lazy tlb mm shootdown IPIs"), which added a warning
in radix__tlb_flush() if mm->context.copros is still elevated.

However it's possible for the copros count to be elevated if a process
exits without first closing file descriptors that are associated with a
copro, eg. VAS.

If the process exits with a VAS file still open, the release callback
is queued up for exit_task_work() via:
  exit_files()
    put_files_struct()
      close_files()
        filp_close()
          fput()

And called via:
  exit_task_work()
    ____fput()
      __fput()
        file->f_op->release(inode, file)
          coproc_release()
            vas_user_win_ops->close_win()
              vas_deallocate_window()
                mm_context_remove_vas_window()
                  mm_context_remove_copro()

But that is after exit_mm() has been called from do_exit() and triggered
the warning.

Fix it by dropping the warning, and always calling __flush_all_mm().

In the normal case of no copros, that will result in a call to
_tlbiel_pid(mm->context.id, RIC_FLUSH_ALL) just as the current code
does.

If the copros count is elevated then it will cause a global flush, which
should flush translations from any copros. Note that the process table
entry was cleared in arch_exit_mmap(), so copros should not be able to
fetch any new translations.

Fixes: e43c0a0c3c ("powerpc/64s/radix: combine final TLB flush and lazy tlb mm shootdown IPIs")
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Closes: https://lore.kernel.org/all/A8E52547-4BF1-47CE-8AEA-BC5A9D7E3567@linux.ibm.com/
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Link: https://msgid.link/20231017121527.1574104-1-mpe@ellerman.id.au
2023-10-18 09:45:52 +11:00
2022-09-28 09:02:20 +02:00
2023-09-17 14:40:24 -07:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.
Description
No description provided
Readme 3.4 GiB
Languages
C 97%
Assembly 1%
Shell 0.6%
Rust 0.5%
Python 0.4%
Other 0.3%