Convert #NM to IDTENTRY:
- Implement the C entry point with DEFINE_IDTENTRY
- Emit the ASM stub with DECLARE_IDTENTRY
- Remove the ASM idtentry in 64bit
- Remove the open coded ASM entry code in 32bit
- Fixup the XEN/PV code
- Remove the old prototypes
- Remove the RCU warning as the new entry macro ensures correctness
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134905.056243863@linutronix.de
Convert #UD to IDTENTRY:
- Implement the C entry point with DEFINE_IDTENTRY
- Emit the ASM stub with DECLARE_IDTENTRY
- Remove the ASM idtentry in 64bit
- Remove the open coded ASM entry code in 32bit
- Fixup the XEN/PV code
- Fixup the FOOF bug call in fault.c
- Remove the old prototypes
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134904.955511913@linutronix.de
Convert #BR to IDTENTRY:
- Implement the C entry point with DEFINE_IDTENTRY
- Emit the ASM stub with DECLARE_IDTENTRY
- Remove the ASM idtentry in 64bit
- Remove the open coded ASM entry code in 32bit
- Fixup the XEN/PV code
- Remove the old prototypes
- Remove the RCU warning as the new entry macro ensures correctness
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134904.863001309@linutronix.de
Prepare for using IDTENTRY to define the C exception/trap entry points. It
would be possible to glue this into the existing macro maze, but it's
simpler and better to read at the end to just make them distinct.
Provide a trivial inline helper to read the trap address and add a comment
explaining the logic behind it.
The existing macros will be removed once all instances are converted.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134904.556327833@linutronix.de
Provide DECLARE/DEFINE_IDTENTRY() macros.
DEFINE_IDTENTRY() provides a wrapper which acts as the function
definition. The exception handler body is just appended to it with curly
brackets. The entry point is marked noinstr so that irq tracing and the
enter_from_user_mode() can be moved into the C-entry point. As all
C-entries use the same macro (or a later variant) the necessary entry
handling can be implemented at one central place.
DECLARE_IDTENTRY() provides the function prototypes:
- The C entry point cfunc
- The ASM entry point asm_cfunc
- The XEN/PV entry point xen_asm_cfunc
They all follow the same naming convention.
When included from ASM code DECLARE_IDTENTRY() is a macro which emits the
low level entry point in assembly by instantiating idtentry.
IDTENTRY is the simplest variant which just has a pt_regs argument. It's
going to be used for all exceptions which have no error code.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134904.273363275@linutronix.de
Currently entry_64_compat is exempt from objtool, but with vmlinux
mode there is no hiding it.
Make the following changes to make it pass:
- change entry_SYSENTER_compat to STT_NOTYPE; it's not a function
and doesn't have function type stack setup.
- mark all STT_NOTYPE symbols with UNWIND_HINT_EMPTY; so we do
validate them and don't treat them as unreachable.
- don't abuse RSP as a temp register, this confuses objtool
mightily as it (rightfully) thinks we're doing unspeakable
things to the stack.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505134341.272248024@linutronix.de
This is another step towards more C-code and less convoluted ASM.
Similar to the entry path, invoke the tracer before context tracking which
might turn off RCU and invoke lockdep as the last step before going back to
user space. Annotate the code sections in exit_to_user_mode() accordingly
so objtool won't complain about the tracer invocation.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505134340.703783926@linutronix.de
Now that the C entry points are safe, move the irq flags tracing code into
the entry helper:
- Invoke lockdep before calling into context tracking
- Use the safe trace_hardirqs_on_prepare() trace function after context
tracking established state and RCU is watching.
enter_from_user_mode() is also still invoked from the exception/interrupt
entry code which still contains the ASM irq flags tracing. So this is just
a redundant and harmless invocation of tracing / lockdep until these are
removed as well.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134340.611961721@linutronix.de
Both the callers in the low level ASM code and __context_tracking_exit()
which is invoked from enter_from_user_mode() via user_exit_irqoff() are
marked NOKPROBE. Allowing enter_from_user_mode() to be probed is
inconsistent at best.
Aside of that while function tracing per se is safe the function trace
entry/exit points can be used via BPF as well which is not safe to use
before context tracking has reached CONTEXT_KERNEL and adjusted RCU.
Mark it noinstr which moves it into the instrumentation protected text
section and includes notrace.
Note, this needs further fixups in context tracking to ensure that the
full call chain is protected. Will be addressed in follow up changes.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134340.429059405@linutronix.de
That code is already not traceable. Move it into the noinstr section so the
objtool section validation does not trigger.
Annotate the warning code as "safe". While it might be not under all
circumstances, getting the information out is important enough.
Should this ever trigger from the sensitive code which is shielded against
instrumentation, e.g. low level entry, then the printk is the least of the
worries.
Addresses the objtool warnings:
vmlinux.o: warning: objtool: context_tracking_recursion_enter()+0x7: call to __this_cpu_preempt_check() leaves .noinstr.text section
vmlinux.o: warning: objtool: __context_tracking_exit()+0x17: call to __this_cpu_preempt_check() leaves .noinstr.text section
vmlinux.o: warning: objtool: __context_tracking_enter()+0x2a: call to __this_cpu_preempt_check() leaves .noinstr.text section
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134340.902709267@linutronix.de
context tracking lacks a few protection mechanisms against instrumentation:
- While the core functions are marked NOKPROBE they lack protection
against function tracing which is required as the function entry/exit
points can be utilized by BPF.
- static functions invoked from the protected functions need to be marked
as well as they can be instrumented otherwise.
- using plain inline allows the compiler to emit traceable and probable
functions.
Fix this by marking the functions noinstr and converting the plain inlines
to __always_inline.
The NOKPROBE_SYMBOL() annotations are removed as the .noinstr.text section
is already excluded from being probed.
Cures the following objtool warnings:
vmlinux.o: warning: objtool: enter_from_user_mode()+0x34: call to __context_tracking_exit() leaves .noinstr.text section
vmlinux.o: warning: objtool: prepare_exit_to_usermode()+0x29: call to __context_tracking_enter() leaves .noinstr.text section
vmlinux.o: warning: objtool: syscall_return_slowpath()+0x29: call to __context_tracking_enter() leaves .noinstr.text section
vmlinux.o: warning: objtool: do_syscall_64()+0x7f: call to __context_tracking_enter() leaves .noinstr.text section
vmlinux.o: warning: objtool: do_int80_syscall_32()+0x3d: call to __context_tracking_enter() leaves .noinstr.text section
vmlinux.o: warning: objtool: do_fast_syscall_32()+0x9c: call to __context_tracking_enter() leaves .noinstr.text section
and generates new ones...
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20200505134340.811520478@linutronix.de
Use of memmove() in #DF is problematic considered tracing and other
instrumentation.
Remove the memmove() call and simply write out what needs doing; this
even clarifies the code, win-win! The code copies from the espfix64
stack to the normal task stack, there is no possible way for that to
overlap.
Survives selftests/x86, specifically sigreturn_64.
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Acked-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20200505134058.863038566@linutronix.de
With commit dc20b2d526 ("x86/idt: Move interrupt gate initialization to
IDT code") non assigned system vectors are also marked as used in
'used_vectors' (now 'system_vectors') bitmap. This makes checks in
arch_show_interrupts() whether a particular system vector is allocated to
always pass and e.g. 'Hyper-V reenlightenment interrupts' entry always
shows up in /proc/interrupts.
Another side effect of having all unassigned system vectors marked as used
is that irq_matrix_debug_show() will wrongly count them among 'System'
vectors.
As it is now ensured that alloc_intr_gate() is not called after init, it is
possible to leave unused entries in 'system_vectors' unset to fix these
issues.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200428093824.1451532-4-vkuznets@redhat.com
There seems to be no reason to allocate interrupt gates after init. Mark
alloc_intr_gate() as __init and add WARN_ON() checks making sure it is
only used before idt_setup_apic_and_irq_gates() finalizes IDT setup and
maps all un-allocated entries to spurious entries.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200428093824.1451532-3-vkuznets@redhat.com
As a preparatory change for making alloc_intr_gate() __init split
xen_callback_vector() into callback vector setup via hypercall
(xen_setup_callback_vector()) and interrupt gate allocation
(xen_alloc_callback_vector()).
xen_setup_callback_vector() is being called twice: on init and upon
system resume from xen_hvm_post_suspend(). alloc_intr_gate() only
needs to be called once.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200428093824.1451532-2-vkuznets@redhat.com
Currently instrumentation of atomic primitives is done at the architecture
level, while composites or fallbacks are provided at the generic level.
The result is that there are no uninstrumented variants of the
fallbacks. Since there is now need of such variants to isolate text poke
from any form of instrumentation invert this ordering.
Doing this means moving the instrumentation into the generic code as
well as having (for now) two variants of the fallbacks.
Notes:
- the various *cond_read* primitives are not proper fallbacks
and got moved into linux/atomic.c. No arch_ variants are
generated because the base primitives smp_cond_load*()
are instrumented.
- once all architectures are moved over to arch_atomic_ one of the
fallback variants can be removed and some 2300 lines reclaimed.
- atomic_{read,set}*() are no longer double-instrumented
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lkml.kernel.org/r/20200505134058.769149955@linutronix.de
Use __always_inline for atomic fallback wrappers. When building for size
(CC_OPTIMIZE_FOR_SIZE), some compilers appear to be less inclined to
inline even relatively small static inline functions that are assumed to
be inlinable such as atomic ops. This can cause problems, for example in
UACCESS regions.
While the fallback wrappers aren't pure wrappers, they are trivial
nonetheless, and the function they wrap should determine the final
inlining policy.
For x86 tinyconfig we observe:
- vmlinux baseline: 1315988
- vmlinux with patch: 1315928 (-60 bytes)
[ tglx: Cherry-picked from KCSAN ]
Suggested-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Marco Elver <elver@google.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull epoll update from Al Viro:
"epoll conversion to read_iter from Jens; I thought there might be more
epoll stuff this cycle, but uaccess took too much time"
* 'work.epoll' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
eventfd: convert to f_op->read_iter()
Pull vfs fixes from Al Viro:
"A couple of trivial patches that fell through the cracks last cycle"
* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fs: fix indentation in deactivate_super()
vfs: Remove duplicated d_mountpoint check in __is_local_mountpoint
Pull sysctl fixes from Al Viro:
"Fixups to regressions in sysctl series"
* 'work.sysctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
sysctl: reject gigantic reads/write to sysctl files
cdrom: fix an incorrect __user annotation on cdrom_sysctl_info
trace: fix an incorrect __user annotation on stack_trace_sysctl
random: fix an incorrect __user annotation on proc_do_entropy
net/sysctl: remove leftover __user annotations on neigh_proc_dointvec*
net/sysctl: use cpumask_parse in flow_limit_cpu_sysctl
Pull i915 uaccess updates from Al Viro:
"Low-hanging fruit in i915; there are several trickier followups, but
that'll wait for the next cycle"
* 'uaccess.i915' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
i915:get_engines(): get rid of pointless access_ok()
i915: alloc_oa_regs(): get rid of pointless access_ok()
i915 compat ioctl(): just use drm_ioctl_kernel()
i915: switch copy_perf_config_registers_or_number() to unsafe_put_user()
i915: switch query_{topology,engine}_info() to copy_to_user()