The UEFI spec allows runtime services to be called with interrupts
masked or unmasked, and if a runtime service function needs to mask
interrupts, it must restore the mask to its original state before
returning (i.e. from the PoV of the OS, this does not change across a
call). Firmware should never unmask exceptions, as these may then be
taken by the OS unexpectedly.
Unfortunately, some firmware has been seen to unmask IRQs (and
potentially other maskable exceptions) across runtime services calls,
leaving IRQ flags corrupted after returning from a runtime services
function call. This may be detected by the IRQ tracing code, but often
goes unnoticed, leaving a potentially disastrous bug hidden.
This patch detects when the IRQ flags are corrupted by an EFI runtime
services call, logging the call and specific corruption to the console.
While restoring the expected value of the flags is insufficient to avoid
problems, we do so to avoid redundant warnings from elsewhere (e.g. IRQ
tracing).
The set of bits in flags which we want to check is architecture-specific
(e.g. we want to check FIQ on arm64, but not the zero flag on x86), so
each arch must provide ARCH_EFI_IRQ_FLAGS_MASK to describe those. In the
absence of this mask, the check is a no-op, and we redundantly save the
flags twice, but that will be short-lived as subsequent patches
will implement this and remove the scaffolding.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-37-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently each architecture must implement two macros, efi_call_virt() and
__efi_call_virt(), which only differ by the presence or absence of a
return type. Otherwise, the logic surrounding the call is identical.
As each architecture must define the entire body of each, we can't place
any generic manipulation (e.g. irq flag validation) in the middle.
This patch adds template implementations of these macros. With these,
arch code can implement three template macros, avoiding reptition for
the void/non-void return cases:
* arch_efi_call_virt_setup()
Sets up the environment for the call (e.g. switching page tables,
allowing kernel-mode use of floating point, if required).
* arch_efi_call_virt()
Performs the call. The last expression in the macro must be the call
itself, allowing the logic to be shared by the void and non-void
cases.
* arch_efi_call_virt_teardown()
Restores the usual kernel environment once the call has returned.
While the savings from repition are minimal, we additionally gain the
ability to add common code around the call with the call environment set
up. This can be used to detect common firmware issues (e.g. bad irq mask
management).
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-32-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The EFI capsule mechanism allows data blobs to be passed to the EFI
firmware. A common use case is performing firmware updates. This patch
just introduces the main infrastructure for interacting with the
firmware, and a driver that allows users to upload capsules will come
in a later patch.
Once a capsule has been passed to the firmware, the next reboot must
be performed using the ResetSystem() EFI runtime service, which may
involve overriding the reboot type specified by reboot=. This ensures
the reset value returned by QueryCapsuleCapabilities() is used to
reset the system, which is required for the capsule to be processed.
efi_capsule_pending() is provided for this purpose.
At the moment we only allow a single capsule blob to be sent to the
firmware despite the fact that UpdateCapsule() takes a 'CapsuleCount'
parameter. This simplifies the API and shouldn't result in any
downside since it is still possible to send multiple capsules by
repeatedly calling UpdateCapsule().
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Bryan O'Donoghue <pure.logic@nexus-software.ie>
Cc: Kweh Hock Leong <hock.leong.kweh@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: joeyli <jlee@suse.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-28-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This module installs a reboot callback, such that if reboot() is invoked
with a string argument NNN, "NNN" is copied to the "LoaderEntryOneShot"
EFI variable, to be read by the bootloader.
If the string matches one of the boot labels defined in its configuration,
the bootloader will boot once to that label. The "LoaderEntryRebootReason"
EFI variable is set with the reboot reason: "reboot", "shutdown".
The bootloader reads this reboot reason and takes particular action
according to its policy.
There are reboot implementations that do "reboot <reason>", such as
Android's reboot command and Upstart's reboot replacement, which pass
the reason as an argument to the reboot syscall. There is no
platform-agnostic way how those could be modified to pass the reason
to the bootloader, regardless of platform or bootloader.
Signed-off-by: Jeremy Compostella <jeremy.compostella@intel.com>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stefan Stanacar <stefan.stanacar@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-26-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Recent UEFI versions expose permission attributes for runtime services
memory regions, either in the UEFI memory map or in the separate memory
attributes table. This allows the kernel to map these regions with
stricter permissions, rather than the RWX permissions that are used by
default. So wire this up in our mapping routine.
Note that in the absence of permission attributes, we still only map
regions of type EFI_RUNTIME_SERVICE_CODE with the executable bit set.
Also, we base the mapping attributes of EFI_MEMORY_MAPPED_IO on the
type directly rather than on the absence of the EFI_MEMORY_WB attribute.
This is more correct, but is also required for compatibility with the
upcoming support for the Memory Attributes Table, which only carries
permission attributes, not memory type attributes.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Jones <pjones@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sai Praneeth Prakhya <sai.praneeth.prakhya@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-12-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Most of the users of for_each_efi_memory_desc() are equally happy
iterating over the EFI memory map in efi.memmap instead of 'memmap',
since the former is usually a pointer to the latter.
For those users that want to specify an EFI memory map other than
efi.memmap, that can be done using for_each_efi_memory_desc_in_map().
One such example is in the libstub code where the firmware is queried
directly for the memory map, it gets iterated over, and then freed.
This change goes part of the way toward deleting the global 'memmap'
variable, which is not universally available on all architectures
(notably IA64) and is rather poorly named.
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Mark Salter <msalter@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-7-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Commit:
2eec5dedf7 ("efi/arm-init: Use read-only early mappings")
updated the early ARM UEFI init code to create the temporary, early
mapping of the UEFI System table using read-only attributes, as a
hardening measure against inadvertent modification.
However, this still leaves the permanent, writable mapping of the UEFI
System table, which is only ever referenced during invocations of UEFI
Runtime Services, at which time the UEFI virtual mapping is available,
which also covers the system table. (This is guaranteed by the fact that
SetVirtualAddressMap(), which is a runtime service itself, converts
various entries in the table to their virtual equivalents, which implies
that the table must be covered by a RuntimeServicesData region that has
the EFI_MEMORY_RUNTIME attribute.)
So instead of creating this permanent mapping, record the virtual address
of the system table inside the UEFI virtual mapping, and dereference that
when accessing the table. This protects the contents of the system table
from inadvertent (or deliberate) modification when no UEFI Runtime
Services calls are in progress.
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Leif Lindholm <leif.lindholm@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-efi@vger.kernel.org
Link: http://lkml.kernel.org/r/1461614832-17633-3-git-send-email-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull EFI fix from Matt Fleming:
* Avoid out-of-bounds access in the efivars code when performing
string matching on converted EFI variable names (Laszlo Ersek)
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Pull thermal fixes from Eduardo Valentin:
"Specifics in this pull request:
- Fixes in mediatek and OF thermal drivers
- Fixes in power_allocator governor
- More fixes of unsigned to int type change in thermal_core.c.
These change have been CI tested using KernelCI bot. \o/"
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal:
thermal: fix Mediatek thermal controller build
thermal: consistently use int for trip temp
thermal: fix mtk_thermal build dependency
thermal: minor mtk_thermal.c cleanups
thermal: power_allocator: req_range multiplication should be a 64 bit type
thermal: of: add __init attribute
Pull asm-generic update from Arnd Bergmann:
"Here is one patch to wire up the preadv/pwritev system calls in the
generic system call table, which is required for all architectures
that were merged in the last few years, including arm64.
Usually these get merged along with the syscall implementation or one
of the architecture trees, but this time that did not happen.
Andre and Christoph both sent a version of this patch, I picked the
one I got first"
* tag 'asm-generic-4.6' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
generic syscalls: wire up preadv2 and pwritev2 syscalls
These new syscalls are implemented as generic code, so enable them for
architectures like arm64 which use the generic syscall table.
Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Pull x86 fixes from Ingo Molnar:
"Misc fixes: two EDAC driver fixes, a Xen crash fix, a HyperV log spam
fix and a documentation fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86 EDAC, sb_edac.c: Take account of channel hashing when needed
x86 EDAC, sb_edac.c: Repair damage introduced when "fixing" channel address
x86/mm/xen: Suppress hugetlbfs in PV guests
x86/doc: Correct limits in Documentation/x86/x86_64/mm.txt
x86/hyperv: Avoid reporting bogus NMI status for Gen2 instances
Pull perf, cpu hotplug and timer fixes from Ingo Molnar:
"perf:
- A single tooling fix for a user-triggerable segfault.
CPU hotplug:
- Fix a CPU hotplug corner case regression, introduced by the recent
hotplug rework
timers:
- Fix a boot hang in the ARM based Tango SoC clocksource driver"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf intel-pt: Fix segfault tracing transactions
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
cpu/hotplug: Fix rollback during error-out in __cpu_disable()
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
clocksource/drivers/tango-xtal: Fix boot hang due to incorrect test
Pull locking fixes from Ingo Molnar:
"Misc fixes:
pvqspinlocks:
- an instrumentation fix
futexes:
- preempt-count vs pagefault_disable decouple corner case fix
- futex requeue plist race window fix
- futex UNLOCK_PI transaction fix for a corner case"
* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
asm-generic/futex: Re-enable preemption in futex_atomic_cmpxchg_inatomic()
futex: Acknowledge a new waiter in counter before plist
futex: Handle unlock_pi race gracefully
locking/pvqspinlock: Fix division by zero in qstat_read()
Pull irq fixes from Ingo Molnar:
"A core irq affinity masks related fix and a MIPS irqchip driver fix"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
irqchip/mips-gic: Don't overrun pcpu_masks array
genirq: Dont allow affinity mask to be updated on IPIs
Pull objtool fixes from Ingo Molnar:
"A handful of objtool fixes: two improvements to how warnings are
printed plus a false positive warning fix, and build environment fix"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix Makefile to properly see if libelf is supported
objtool: Detect falling through to the next function
objtool: Add workaround for GCC switch jump table bug
Pull USB / PHY driver fixes from Greg KH:
"Here are two small sets of patches, both from subsystem trees, USB
gadget and PHY drivers.
Full details are in the shortlog, and they have all been in linux-next
for a while (before I merged them to the USB tree)"
* tag 'usb-4.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
usb: gadget: f_fs: Fix use-after-free
usb: dwc3: gadget: Fix suspend/resume during device mode
usb: dwc3: fix memory leak of dwc->regset
usb: dwc3: core: fix PHY handling during suspend
usb: dwc3: omap: fix up error path on probe()
usb: gadget: composite: Clear reserved fields of SSP Dev Cap
phy: rockchip-emmc: adapt binding to specifiy register offset and length
phy: rockchip-emmc: should be a child device of the GRF
phy: rockchip-dp: should be a child device of the GRF
Pull serial fixes from Greg KH:
"Here are 3 serial driver fixes for issues that have been reported.
Two are reverts, fixing problems that were in the big TTY/Serial
driver merge in 4.6-rc1, and the last one is a simple bugfix for a
regression that showed up in 4.6-rc1 as well.
All have been in linux-next with no reported issues"
* tag 'tty-4.6-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
Revert "serial: 8250: Add hardware dependency to RT288X option"
tty/serial/8250: fix RS485 half-duplex RX
Revert "serial-uartlite: Constify uartlite_be/uartlite_le"