New x86 FPU features will be very large, requiring ~10k of stack in
signal handlers. These new features require a new approach called
"dynamic features".
The kernel currently tries to ensure that altstacks are reasonably
sized. Right now, on x86, sys_sigaltstack() requires a size of >=2k.
However, that 2k is a constant. Simply raising that 2k requirement
to >10k for the new features would break existing apps which have a
compiled-in size of 2k.
Instead of universally enforcing a larger stack, prohibit a process from
using dynamic features without properly-sized altstacks. This must be
enforced in two places:
* A dynamic feature can not be enabled without an large-enough altstack
for each process thread.
* Once a dynamic feature is enabled, any request to install a too-small
altstack will be rejected
The dynamic feature enabling code must examine each thread in a
process to ensure that the altstacks are large enough. Add a new lock
(sigaltstack_lock()) to ensure that threads can not race and change
their altstack after being examined.
Add the infrastructure in form of a config option and provide empty
stubs for architectures which do not need dynamic altstack size checks.
This implementation will be fleshed out for x86 in a future patch called
x86/arch_prctl: Add controls for dynamic XSTATE components
[dhansen: commit message. ]
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20211021225527.10184-2-chang.seok.bae@intel.com
'make randconfig' can produce a .config file with
"CONFIG_MEMORY_RESERVE=" (no value) since it has no default.
When a subsequent 'make all' is done, kconfig restarts the config
and prompts for a value for MEMORY_RESERVE. This breaks
scripting/automation where there is no interactive user input.
Add a default value for MEMORY_RESERVE. (Any integer value will
work here for kconfig.)
Fixes a kconfig warning:
.config:214:warning: symbol value '' invalid for MEMORY_RESERVE
* Restart config...
Memory reservation (MiB) (MEMORY_RESERVE) [] (NEW)
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2") # from beginning of git history
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: linux-m68k@lists.linux-m68k.org
Signed-off-by: Greg Ungerer <gerg@linux-m68k.org>
Update save_v86_state to always complete all of it's work except
possibly some of the copies to userspace even if save_v86_state takes
a fault. This ensures that the kernel is always in a sane state, even
if userspace has done something silly.
When save_v86_state takes a fault update it to force userspace to take
a SIGSEGV and terminate the userspace application.
As Andy pointed out in review of the first version of this change
there are races between sigaction and the application terinating. Now
that the code has been modified to always perform all save_v86_state's
work (except possibly copying to userspace) those races do not matter
from a kernel perspective.
Forcing the userspace application to terminate (by resetting it's
handler to SIGDFL) is there to keep everything as close to the current
behavior as possible while removing the unique (and difficult to
maintain) use of do_exit.
If this new SIGSEGV happens during handle_signal the next time around
the exit_to_user_mode_loop, SIGSEGV will be delivered to userspace.
All of the callers of handle_vm86_trap and handle_vm86_fault run the
exit_to_user_mode_loop before they return to userspace any signal sent
to the current task during their execution will be delivered to the
current task before that tasks exits to usermode.
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: x86@kernel.org
Cc: H Peter Anvin <hpa@zytor.com>
v1: https://lkml.kernel.org/r/20211020174406.17889-10-ebiederm@xmission.com
Link: https://lkml.kernel.org/r/877de1xcr6.fsf_-_@disp2133
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
The function setup_tsb_params has exactly one caller tsb_grow. The
function tsb_grow passes in a tsb_bytes value that is between 8192 and
1048576 inclusive, and is guaranteed to be a power of 2. The function
setup_tsb_params verifies this property with a switch statement and
then prints an error and causes the task to exit if this is not true.
In practice that print statement can never be reached because tsb_grow
never passes in a bad tsb_size. So if tsb_size ever gets a bad value
that is a kernel bug.
So replace the do_exit which is effectively an open coded version of
BUG() with an actuall call to BUG(). Making it clearer that this
is a case that can never, and should never happen.
Cc: David Miller <davem@davemloft.net>
Cc: sparclinux@vger.kernel.org
Link: https://lkml.kernel.org/r/20211020174406.17889-8-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
If the register state may be partial and corrupted instead of calling
do_exit, call force_sigsegv(SIGSEGV). Which properly kills the
process with SIGSEGV and does not let any more userspace code execute,
instead of just killing one thread of the process and potentially
confusing everything.
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
History-tree: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Fixes: 756f1ae8a44e ("PPC32: Rework signal code and add a swapcontext system call.")
Fixes: 04879b04bf50 ("[PATCH] ppc64: VMX (Altivec) support & signal32 rework, from Ben Herrenschmidt")
Link: https://lkml.kernel.org/r/20211020174406.17889-7-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Today the sh code allocates memory the first time a process uses
the fpu. If that memory allocation fails, kill the affected task
with force_sig(SIGKILL) rather than do_group_exit(SIGKILL).
Calling do_group_exit from an exception handler can potentially lead
to dead locks as do_group_exit is not designed to be called from
interrupt context. Instead use force_sig(SIGKILL) to kill the
userspace process. Sending signals in general and force_sig in
particular has been tested from interrupt context so there should be
no problems.
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: linux-sh@vger.kernel.org
Fixes: 0ea820cf9b ("sh: Move over to dynamically allocated FPU context.")
Link: https://lkml.kernel.org/r/20211020174406.17889-6-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
When an instruction to save or restore a register from the stack fails
in _save_fp_context or _restore_fp_context return with -EFAULT. This
change was made to r2300_fpu.S[1] but it looks like it got lost with
the introduction of EX2[2]. This is also what the other implementation
of _save_fp_context and _restore_fp_context in r4k_fpu.S does, and
what is needed for the callers to be able to handle the error.
Furthermore calling do_exit(SIGSEGV) from bad_stack is wrong because
it does not terminate the entire process it just terminates a single
thread.
As the changed code was the only caller of arch/mips/kernel/syscall.c:bad_stack
remove the problematic and now unused helper function.
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Maciej Rozycki <macro@orcam.me.uk>
Cc: linux-mips@vger.kernel.org
[1] 35938a00ba ("MIPS: Fix ISA I FP sigcontext access violation handling")
[2] f92722dc45 ("MIPS: Correct MIPS I FP sigcontext layout")
Cc: stable@vger.kernel.org
Fixes: f92722dc45 ("MIPS: Correct MIPS I FP sigcontext layout")
Acked-by: Maciej W. Rozycki <macro@orcam.me.uk>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Link: https://lkml.kernel.org/r/20211020174406.17889-5-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
This reverts commit 001ce9785c.
This upstream commit broke AOSP (post Android 12 merge) build
on RB5. The device either silently crashes into USB crash mode
after android boot animation or we see a blank blue screen
with following dpu errors in dmesg:
[ T444] hw recovery is not complete for ctl:3
[ T444] [drm:dpu_encoder_phys_vid_prepare_for_kickoff:539] [dpu error]enc31 intf1 ctl 3 reset failure: -22
[ T444] [drm:dpu_encoder_phys_vid_wait_for_commit_done:513] [dpu error]vblank timeout
[ T444] [drm:dpu_kms_wait_for_commit_done:454] [dpu error]wait for commit done returned -110
[ C7] [drm:dpu_encoder_frame_done_timeout:2127] [dpu error]enc31 frame done timeout
[ T444] [drm:dpu_encoder_phys_vid_wait_for_commit_done:513] [dpu error]vblank timeout
[ T444] [drm:dpu_kms_wait_for_commit_done:454] [dpu error]wait for commit done returned -110
Fixes: 001ce9785c ("arm64: dts: qcom: sm8250: remove bus clock from the mdss node for sm8250 target")
Signed-off-by: Amit Pundir <amit.pundir@linaro.org>
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Link: https://lore.kernel.org/r/20211014135410.4136412-1-dmitry.baryshkov@linaro.org
Pull ARM fixes from Russell King:
- Fix clang-related relocation warning in futex code
- Fix incorrect use of get_kernel_nofault()
- Fix bad code generation in __get_user_check() when kasan is enabled
- Ensure TLB function table is correctly aligned
- Remove duplicated string function definitions in decompressor
- Fix link-time orphan section warnings
- Fix old-style function prototype for arch_init_kprobes()
- Only warn about XIP address when not compile testing
- Handle BE32 big endian for keystone2 remapping
* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
ARM: 9148/1: handle CONFIG_CPU_ENDIAN_BE32 in arch/arm/kernel/head.S
ARM: 9141/1: only warn about XIP address when not compile testing
ARM: 9139/1: kprobes: fix arch_init_kprobes() prototype
ARM: 9138/1: fix link warning with XIP + frame-pointer
ARM: 9134/1: remove duplicate memcpy() definition
ARM: 9133/1: mm: proc-macros: ensure *_tlb_fns are 4B aligned
ARM: 9132/1: Fix __get_user_check failure with ARM KASAN images
ARM: 9125/1: fix incorrect use of get_kernel_nofault()
ARM: 9122/1: select HAVE_FUTEX_CMPXCHG
Hyper-V needs to issue the GHCB HV call in order to read/write MSRs in
Isolation VMs. For that, expose sev_es_ghcb_hv_call().
The Hyper-V Isolation VMs are unenlightened guests and run a paravisor
at VMPL0 for communicating. GHCB pages are being allocated and set up
by that paravisor. Linux gets the GHCB page's physical address via
MSR_AMD64_SEV_ES_GHCB from the paravisor and should not change it.
Add a @set_ghcb_msr parameter to sev_es_ghcb_hv_call() to control
whether the function should set the GHCB's address prior to the call or
not and export that function for use by HyperV.
[ bp: - Massage commit message
- add a struct ghcb forward declaration to fix randconfig builds. ]
Signed-off-by: Tianyu Lan <Tianyu.Lan@microsoft.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20211025122116.264793-6-ltykernel@gmail.com
In kvm_vcpu_block, the current task is set to TASK_INTERRUPTIBLE before
making a final check whether the vCPU should be woken from HLT by any
incoming interrupt.
This is a problem for the get_user() in __kvm_xen_has_interrupt(), which
really shouldn't be sleeping when the task state has already been set.
I think it's actually harmless as it would just manifest itself as a
spurious wakeup, but it's causing a debug warning:
[ 230.963649] do not call blocking ops when !TASK_RUNNING; state=1 set at [<00000000b6bcdbc9>] prepare_to_swait_exclusive+0x30/0x80
Fix the warning by turning it into an *explicit* spurious wakeup. When
invoked with !task_is_running(current) (and we might as well add
in_atomic() there while we're at it), just return 1 to indicate that
an IRQ is pending, which will cause a wakeup and then something will
call it again in a context that *can* sleep so it can fault the page
back in.
Cc: stable@vger.kernel.org
Fixes: 40da8ccd72 ("KVM: x86/xen: Add event channel interrupt vector upcall")
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <168bf8c689561da904e48e2ff5ae4713eaef9e2d.camel@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qdio.ko no longer needs to care about how the QAOBs are allocated,
from its perspective they are merely another parameter to do_QDIO().
So for a start, shift the cache into the only qdio driver that uses
QAOBs (ie. qeth). Here there's further opportunity to optimize its
usage in the future - eg. make it per-{device, TX queue}, or only
compile it when the driver is built with CQ/QAOB support.
Signed-off-by: Julian Wiedmann <jwi@linux.ibm.com>
Reviewed-by: Benjamin Block <bblock@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Attempting to build mach-rpc with gcc-9 or higher, or with any version
of clang results in a build failure, like:
arm-linux-gnueabi-gcc-11.1.0: error: unrecognized -march target: armv3m
arm-linux-gnueabi-gcc-11.1.0: note: valid arguments are: armv4 armv4t armv5t armv5te armv5tej armv6 armv6j armv6k armv6z armv6kz armv6zk armv6t2 armv6-m armv6s-m armv7 armv7-a armv7ve armv7-r armv7-m armv7e-m armv8-a armv8.1-a armv8.2-a armv8.3-a armv8.4-a armv8.5-a armv8.6-a armv8-m.base armv8-m.main armv8-r armv8.1-m.main iwmmxt iwmmxt2; did you mean 'armv4'?
Building with gcc-5 also fails in at least one of these ways:
/tmp/cczZoCcv.s:68: Error: selected processor does not support `bx lr' in ARM mode
drivers/tty/vt/vt_ioctl.c:958:1: internal compiler error: Segmentation fault
Handle this in Kconfig so we don't run into this with randconfig
builds, allowing only gcc-6 through gcc-8.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
On BE32 kernels, the __opcode_to_mem_thumb32() interface is intentionally
not defined, but it is referenced whenever runtime patching is enabled
for the kernel, which may be for ftrace, jump label, kprobes or kgdb:
arch/arm/kernel/patch.c: In function '__patch_text_real':
arch/arm/kernel/patch.c:94:32: error: implicit declaration of function '__opcode_to_mem_thumb32' [-Werror=implicit-function-declaration]
94 | insn = __opcode_to_mem_thumb32(insn);
| ^~~~~~~~~~~~~~~~~~~~~~~
Since BE32 kernels never run Thumb2 code, we never end up using the
result of this call, so providing an extern declaration without
a definition makes it build correctly.
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
clang fails to build kernels with THUMB2 and FUNCTION_TRACER
enabled when there is any inline asm statement containing
the frame pointer register r7:
arch/arm/mach-exynos/mcpm-exynos.c:154:2: error: inline asm clobber list contains reserved registers: R7 [-Werror,-Winline-asm]
arch/arm/probes/kprobes/actions-thumb.c:449:3: error: inline asm clobber list contains reserved registers: R7 [-Werror,-Winline-asm]
Apparently gcc should also have warned about this, and the
configuration is actually invalid, though there is some
disagreement on the bug trackers about this.
Link: https://bugs.llvm.org/show_bug.cgi?id=45826
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94986
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
For platforms that are not yet converted to ARCH_MULTIPLATFORM,
we can disable CONFIG_ARM_PATCH_PHYS_VIRT, which in turn requires
setting a correct address here.
As we actualy know what all the values are supposed to be based
on the old mach/memory.h header file contents (from git history),
we can just add them here.
This also solves a problem in Kconfig where 'make randconfig'
fails to continue if no number is selected for a 'hex' option.
Users can still override the number at configuration time, e.g.
when the memory visible to the kernel starts at a nonstandard
address on some machine, but it should no longer be required
now.
I originally posted this back in 2016, but the problem still
persists. The patch has gotten much simpler though, as almost
all platforms rely on ARM_PATCH_PHYS_VIRT now.
Link: https://lore.kernel.org/linux-arm-kernel/1455804123-2526139-5-git-send-email-arnd@arndb.de/
Acked-by: Nicolas Pitre <nico@fluxnic.net>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
pgd_page_vaddr() returns an 'unsigned long' address, causing a warning
with the memcpy() call in kasan_init():
arch/arm/mm/kasan_init.c: In function 'kasan_init':
include/asm-generic/pgtable-nop4d.h:44:50: error: passing argument 2 of '__memcpy' makes pointer from integer without a cast [-Werror=int-conversion]
44 | #define pgd_page_vaddr(pgd) ((unsigned long)(p4d_pgtable((p4d_t){ pgd })))
| ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
| |
| long unsigned int
arch/arm/include/asm/string.h:58:45: note: in definition of macro 'memcpy'
58 | #define memcpy(dst, src, len) __memcpy(dst, src, len)
| ^~~
arch/arm/mm/kasan_init.c:229:16: note: in expansion of macro 'pgd_page_vaddr'
229 | pgd_page_vaddr(*pgd_offset_k(KASAN_SHADOW_START)),
| ^~~~~~~~~~~~~~
arch/arm/include/asm/string.h:21:47: note: expected 'const void *' but argument is of type 'long unsigned int'
21 | extern void *__memcpy(void *dest, const void *src, __kernel_size_t n);
| ~~~~~~~~~~~~^~~
Avoid this by adding an explicit typecast.
Link: https://lore.kernel.org/all/CACRpkdb3DMvof3-xdtss0Pc6KM36pJA-iy=WhvtNVnsDpeJ24Q@mail.gmail.com/
Fixes: 5615f69bc2 ("ARM: 9016/2: Initialize the mapping of KASan shadow memory")
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
A lot of randconfig builds end up not selecting any machine type at
all. This is generally fine for the purpose of compile testing, but
of course it means that the kernel is not usable on actual hardware,
and it causes a warning about this fact.
As most of the build bots now force-enable CONFIG_COMPILE_TEST for
randconfig builds, use that as a guard to control whether we warn
on this type of broken configuration.
We could do the same for the missing-cpu-type warning, but those
configurations fail to build much earlier.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
We can currently build a multi-cpu enabled kernel that allows both ARMv4
and ARMv5 CPUs, and also supports THUMB mode in user space.
However, returning to user space in this configuration with the usr_ret
macro requires the use of the 'bx' instruction, which is refused by
the assembler:
arch/arm/kernel/entry-armv.S: Assembler messages:
arch/arm/kernel/entry-armv.S:937: Error: selected processor does not support `bx lr' in ARM mode
arch/arm/kernel/entry-armv.S:960: Error: selected processor does not support `bx lr' in ARM mode
arch/arm/kernel/entry-armv.S:1003: Error: selected processor does not support `bx lr' in ARM mode
<instantiation>:2:2: note: instruction requires: armv4t
bx lr
While it would be possible to handle this correctly in principle, doing so
seems to not be worth it, if we can simply avoid the problem by enforcing
that a kernel supporting both ARMv4 and a later CPU architecture cannot
run THUMB binaries.
This turned up while build-testing with clang; for some reason,
gcc never triggered the problem.
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
When configuring the kernel for big-endian, we set either BE-8 or BE-32
based on the CPU architecture level. Until linux-4.4, we did not have
any ARMv7-M platform allowing big-endian builds, but now i.MX/Vybrid
is in that category, adn we get a build error because of this:
arch/arm/kernel/module-plts.c: In function 'get_module_plt':
arch/arm/kernel/module-plts.c:60:46: error: implicit declaration of function '__opcode_to_mem_thumb32' [-Werror=implicit-function-declaration]
This comes down to picking the wrong default, ARMv7-M uses BE8
like ARMv7-A does. Changing the default gets the kernel to compile
and presumably works.
https://lore.kernel.org/all/1455804123-2526139-2-git-send-email-arnd@arndb.de/
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Building with 'make W=1' shows a warning in some configurations
when 'verbose()' is defined to be empty.
arch/arm/probes/kprobes/test-core.c: In function 'kprobes_test_case_start':
arch/arm/probes/kprobes/test-core.c:1367:26: error: suggest braces around empty body in an 'else' statement [-Werror=empty-body]
1367 | current_instruction);
| ^
Change the definition of verbose() to use no_printk(), allowing format
string checking and avoiding the warning.
Link: https://lore.kernel.org/all/20210322114600.3528031-1-arnd@kernel.org/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
My intel-ixp42x-welltech-epbx100 no longer boot since 4.14.
This is due to commit 463dbba4d1 ("ARM: 9104/2: Fix Keystone 2 kernel
mapping regression")
which forgot to handle CONFIG_CPU_ENDIAN_BE32 as possible BE config.
Suggested-by: Krzysztof Hałasa <khalasa@piap.pl>
Fixes: 463dbba4d1 ("ARM: 9104/2: Fix Keystone 2 kernel mapping regression")
Signed-off-by: Corentin Labbe <clabbe.montjoie@gmail.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
In preparation for removing HANDLE_DOMAIN_IRQ_IRQENTRY, have arch/arm
perform all the irqentry accounting in its entry code.
For configurations with CONFIG_GENERIC_IRQ_MULTI_HANDLER, we can use
generic_handle_arch_irq(). Other than asm_do_IRQ(), all C calls to
handle_IRQ() are from irqchip handlers which will be called from
generic_handle_arch_irq(), so to avoid double accounting IRQ entry, the
entry logic is moved from handle_IRQ() into asm_do_IRQ().
For ARMv7M the entry assembly is tightly coupled with the NVIC irqchip, and
while the entry code should logically live under arch/arm/, moving the
entry logic there makes things more convoluted. So for now, place the
entry logic in the NVIC irqchip, but separated into a separate
function to make the split of responsibility clear.
For all other configurations without CONFIG_GENERIC_IRQ_MULTI_HANDLER,
IRQ entry is already handled in arch code, and requires no changes.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Tested-by: Vladimir Murzin <vladimir.murzin@arm.com> # ARMv7M
Cc: Russell King <linux@armlinux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Going forward we want architecture/entry code to perform all the
necessary work to enter/exit IRQ context, with irqchip code merely
handling the mapping of the interrupt to any handler(s). Among other
reasons, this is necessary to consistently fix some longstanding issues
with the ordering of lockdep/RCU/tracing instrumentation which many
architectures get wrong today in their entry code.
Importantly, rcu_irq_{enter,exit}() must be called precisely once per
IRQ exception, so that rcu_is_cpu_rrupt_from_idle() can correctly
identify when an interrupt was taken from an idle context which must be
explicitly preempted. Currently handle_domain_irq() calls
rcu_irq_{enter,exit}() via irq_{enter,exit}(), but entry code needs to
be able to call rcu_irq_{enter,exit}() earlier for correct ordering
across lockdep/RCU/tracing updates for sequences such as:
lockdep_hardirqs_off(CALLER_ADDR0);
rcu_irq_enter();
trace_hardirqs_off_finish();
To permit each architecture to be converted to the new style in turn,
this patch adds a new CONFIG_HANDLE_DOMAIN_IRQ_IRQENTRY selected by all
current users of HANDLE_DOMAIN_IRQ, which gates the existing behaviour.
When CONFIG_HANDLE_DOMAIN_IRQ_IRQENTRY is not selected,
handle_domain_irq() requires entry code to perform the
irq_{enter,exit}() work, with an explicit check for this matching the
style of handle_domain_nmi().
Subsequent patches will:
1) Add the necessary IRQ entry accounting to each architecture in turn,
dropping CONFIG_HANDLE_DOMAIN_IRQ_IRQENTRY from that architecture's
Kconfig.
2) Remove CONFIG_HANDLE_DOMAIN_IRQ_IRQENTRY once it is no longer
selected.
3) Convert irqchip drivers to consistently use
generic_handle_domain_irq() rather than handle_domain_irq().
4) Remove handle_domain_irq() and CONFIG_HANDLE_DOMAIN_IRQ.
... which should leave us with a clear split of responsiblity across the
entry and irqchip code, making it possible to perform additional
cleanups and fixes for the aforementioned longstanding issues with entry
code.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
In preparation for removing HANDLE_DOMAIN_IRQ, have arch/nds32 perform
all the necessary IRQ entry accounting in its entry code.
Currently arch/nds32 is tightly coupled with the ativic32 irqchip, and
while the entry code should logically live under arch/nds32/, moving the
entry logic there makes things more convoluted. So for now, place the
entry logic in the ativic32 irqchip, but separated into a separate
function to make the split of responsibility clear.
In future this should probably use GENERIC_IRQ_MULTI_HANDLER to cleanly
decouple this.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincent Chen <deanbo422@gmail.com>
In preparation for removing HANDLE_DOMAIN_IRQ, have arch/arc perform all
the necessary IRQ entry accounting in its entry code.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vineet Gupta <vgupta@kernel.org>
There's no need fpr arch/mips's do_domain_IRQ() to open-code the NULL
check performed by handle_irq_desc(), nor the resolution of the desc
performed by generic_handle_domain_irq().
Use generic_handle_domain_irq() directly, as this is functioanlly
equivalent and clearer.
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
On MIPS, the only user of handle_domain_irq() is octeon_irq_ciu3_ip2(),
which is called from the platform-specific plat_irq_dispatch() function
invoked from the early assembly code.
No other irqchip relevant to arch/mips uses handle_domain_irq():
* No other plat_irq_dispatch() function transitively calls
handle_domain_irq().
* No other vectored IRQ dispatch function registered with
set_vi_handler() calls handle_domain_irq().
* No chained irqchip handlers call handle_domain_irq(), which makes
sense as this is meant to only be used by root irqchip handlers.
Currently octeon_irq_ciu3_ip2() passes NULL as the `regs` argument to
handle_domain_irq(), and as handle_domain_irq() will pass this to
set_irq_regs(), any invoked IRQ handlers will erroneously see a NULL
pt_regs if they call get_pt_regs().
Fix this by calling generic_handle_domain_irq() directly, and performing
the necessary irq_{enter,exit}() logic directly in
octeon_irq_ciu3_ip2(). At the same time, deselect HANDLE_DOMAIN_IRQ,
which subsequent patches will remove.
Other than the corrected behaviour of get_pt_regs(), there should be no
functional change as a result of this patch.
Fixes: ce210d35bb ("MIPS: OCTEON: Add support for OCTEON III interrupt controller.")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Reviewed-by: Marc Zyngier <maz@kernel.org>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
The Principles of Operations describe the various reasons that
each individual SIGP orders might be rejected, and the status
bit that are set for each condition.
For example, for the Set Architecture order, it states:
"If it is not true that all other CPUs in the configu-
ration are in the stopped or check-stop state, ...
bit 54 (incorrect state) ... is set to one."
However, it also states:
"... if the CZAM facility is installed, ...
bit 55 (invalid parameter) ... is set to one."
Since the Configuration-z/Architecture-Architectural Mode (CZAM)
facility is unconditionally presented, there is no need to examine
each VCPU to determine if it is started/stopped. It can simply be
rejected outright with the Invalid Parameter bit.
Fixes: b697e435ae ("KVM: s390: Support Configuration z/Architecture Mode")
Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Link: https://lore.kernel.org/r/20211008203112.1979843-2-farman@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Improve make_secure_pte to avoid stalls when the system is heavily
overcommitted. This was especially problematic in kvm_s390_pv_unpack,
because of the loop over all pages that needed unpacking.
Due to the locks being held, it was not possible to simply replace
uv_call with uv_call_sched. A more complex approach was
needed, in which uv_call is replaced with __uv_call, which does not
loop. When the UVC needs to be executed again, -EAGAIN is returned, and
the caller (or its caller) will try again.
When -EAGAIN is returned, the path is the same as when the page is in
writeback (and the writeback check is also performed, which is
harmless).
Fixes: 214d9bbcd3 ("s390/mm: provide memory management functions for protected KVM guests")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Link: https://lore.kernel.org/r/20210920132502.36111-5-imbrenda@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
If kvm_s390_pv_destroy_cpu is called more than once, we risk calling
free_page on a random page, since the sidad field is aliased with the
gbea, which is not guaranteed to be zero.
This can happen, for example, if userspace calls the KVM_PV_DISABLE
IOCTL, and it fails, and then userspace calls the same IOCTL again.
This scenario is only possible if KVM has some serious bug or if the
hardware is broken.
The solution is to simply return successfully immediately if the vCPU
was already non secure.
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Fixes: 19e1227768 ("KVM: S390: protvirt: Introduce instruction data area bounce buffer")
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20210920132502.36111-3-imbrenda@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>