On platforms with an arch timer erratum workaround, it's possible for
arch_timer_reg_read_stable() to recurse into itself when certain
tracing options are enabled, leading to stack overflows and related
problems.
For example, when PREEMPT_TRACER and FUNCTION_GRAPH_TRACER are
selected, it's possible to trigger this with:
$ mount -t debugfs nodev /sys/kernel/debug/
$ echo function_graph > /sys/kernel/debug/tracing/current_tracer
The problem is that in such cases, preempt_disable() instrumentation
attempts to acquire a timestamp via trace_clock(), resulting in a call
back to arch_timer_reg_read_stable(), and hence recursion.
This patch changes arch_timer_reg_read_stable() to use
preempt_{disable,enable}_notrace(), which avoids this.
This problem is similar to the fixed by upstream commit 96b3d28bf4
("sched/clock: Prevent tracing recursion in sched_clock_cpu()").
Fixes: 6acc71ccac ("arm64: arch_timer: Allows a CPU-specific erratum to only affect a subset of CPUs")
Signed-off-by: Ding Tianhong <dingtianhong@huawei.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
A Xen HVM guest running with KASLR enabled will die rather soon today
because the shared info page mapping is using va() too early. This was
introduced by commit a5d5f328b0 ("xen:
allocate page for shared info page from low memory").
In order to fix this use early_memremap() to get a temporary virtual
address for shared info until va() can be used safely.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Instead of calling xen_hvm_init_shared_info() on boot and resume split
it up into a boot time function searching for the pfn to use and a
mapping function doing the hypervisor mapping call.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Provide a hook in hypervisor_x86 called after setting up initial
memory mapping.
This is needed e.g. by Xen HVM guests to map the hypervisor shared
info page.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Conflicts:
include/linux/mm_types.h
mm/huge_memory.c
I removed the smp_mb__before_spinlock() like the following commit does:
8b1b436dd1 ("mm, locking: Rework {set,clear,mm}_tlb_flush_pending()")
and fixed up the affected commits.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
ARC cores on reset have all interrupt lines of built-in INTC enabled.
Which means once we globally enable interrupts (very early on boot)
faulty hardware blocks may trigger an interrupt that Linux kernel
cannot handle yet as corresponding handler is not yet installed.
In that case system falls in "interrupt storm" and basically never
does anything useful except entering and exiting generic IRQ handling
code.
One real example of that kind of problematic hardware is DW GMAC which
also has interrupts enabled on reset and if Ethernet PHY informs GMAC
about link state, GMAC immediately reports that upstream to ARC core
and here we are.
Now with that change we mask all individual IRQ lines making entire
system more fool-proof.
[This patch was motivated by Adaptrum platform support]
Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
Cc: Eugeniy Paltsev <paltsev@synopsys.com>
Tested-by: Alexandru Gagniuc <alex.g@adaptrum.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
Rename the ->enter_freeze cpuidle driver callback to ->enter_s2idle
to make it clear that it is used for entering suspend-to-idle and
rename the related functions, variables and so on accordingly.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
According to Intel 64 and IA-32 Architectures SDM, Volume 3,
Chapter 14.2, "Software needs to exercise care to avoid delays
between the two RDMSRs (for example interrupts)".
So, disable interrupts during reading MSRs IA32_APERF and IA32_MPERF.
See also: commit 4ab60c3f32 (cpufreq: intel_pstate: Disable
interrupts during MSRs reading).
Signed-off-by: Doug Smythies <dsmythies@telus.net>
Reviewed-by: Len Brown <len.brown@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Nadav reported parallel MADV_DONTNEED on same range has a stale TLB
problem and Mel fixed it[1] and found same problem on MADV_FREE[2].
Quote from Mel Gorman:
"The race in question is CPU 0 running madv_free and updating some PTEs
while CPU 1 is also running madv_free and looking at the same PTEs.
CPU 1 may have writable TLB entries for a page but fail the pte_dirty
check (because CPU 0 has updated it already) and potentially fail to
flush.
Hence, when madv_free on CPU 1 returns, there are still potentially
writable TLB entries and the underlying PTE is still present so that a
subsequent write does not necessarily propagate the dirty bit to the
underlying PTE any more. Reclaim at some unknown time at the future
may then see that the PTE is still clean and discard the page even
though a write has happened in the meantime. I think this is possible
but I could have missed some protection in madv_free that prevents it
happening."
This patch aims for solving both problems all at once and is ready for
other problem with KSM, MADV_FREE and soft-dirty story[3].
TLB batch API(tlb_[gather|finish]_mmu] uses [inc|dec]_tlb_flush_pending
and mmu_tlb_flush_pending so that when tlb_finish_mmu is called, we can
catch there are parallel threads going on. In that case, forcefully,
flush TLB to prevent for user to access memory via stale TLB entry
although it fail to gather page table entry.
I confirmed this patch works with [4] test program Nadav gave so this
patch supersedes "mm: Always flush VMA ranges affected by zap_page_range
v2" in current mmotm.
NOTE:
This patch modifies arch-specific TLB gathering interface(x86, ia64,
s390, sh, um). It seems most of architecture are straightforward but
s390 need to be careful because tlb_flush_mmu works only if
mm->context.flush_mm is set to non-zero which happens only a pte entry
really is cleared by ptep_get_and_clear and friends. However, this
problem never changes the pte entries but need to flush to prevent
memory access from stale tlb.
[1] http://lkml.kernel.org/r/20170725101230.5v7gvnjmcnkzzql3@techsingularity.net
[2] http://lkml.kernel.org/r/20170725100722.2dxnmgypmwnrfawp@suse.de
[3] http://lkml.kernel.org/r/BD3A0EBE-ECF4-41D4-87FA-C755EA9AB6BD@gmail.com
[4] https://patchwork.kernel.org/patch/9861621/
[minchan@kernel.org: decrease tlb flush pending count in tlb_finish_mmu]
Link: http://lkml.kernel.org/r/20170808080821.GA31730@bbox
Link: http://lkml.kernel.org/r/20170802000818.4760-7-namit@vmware.com
Signed-off-by: Minchan Kim <minchan@kernel.org>
Signed-off-by: Nadav Amit <namit@vmware.com>
Reported-by: Nadav Amit <namit@vmware.com>
Reported-by: Mel Gorman <mgorman@techsingularity.net>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Jeff Dike <jdike@addtoit.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
THP migration is added but only supports x86_64 at the moment. For all
other architectures, swp_entry_to_pmd() only returns a zero pmd_t.
Due to a GCC zero initializer bug #53119, the standard (pmd_t){0}
initializer is not accepted by all GCC versions. __pmd() is a feasible
workaround. In addition, sparc32's pmd_t is an array instead of a single
value, so we need (pmd_t){ {0}, } instead of (pmd_t){0}. Thus,
a different __pmd() definition is needed in sparc32.
Signed-off-by: Zi Yan <zi.yan@cs.rutgers.edu>
Signed-off-by: David S. Miller <davem@davemloft.net>
New algorithm that takes advantage of the M7/M8 block init store
ASI, ie, overlapping pipelines and miss buffer filling.
Full details in code comments.
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Rename exception handlers to memcpy_xxx as these
are going to be used by new memcpy routines and these
handlers are not exclusive to NG4memcpy anymore.
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Separate the exception handlers from NG4memcpy so that it can be
used with new memcpy routines. Make a separate file for all these handlers.
Signed-off-by: Babu Moger <babu.moger@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Update comments about the range the different
parts of the code copies, the original comments were wrong.
Introduce a few descriptive labels too.
No functional changes.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
01cf9d524f ("microblaze/PCI: Support generic Xilinx AXI PCIe Host Bridge
IP driver") removed pcibios calls to:
pcibios_setup_bus_self()
pcibios_setup_bus_devices()
Given that pcibios_fixup_bus() was the only caller of those functions they
have now become dead code (along with the functions they were calling in
turn), so they can be removed.
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
[bhelgaas: remove "Fixup resources of a PCI<->PCI bridge" comment]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michal Simek <michal.simek@xilinx.com>
Cc: Bharat Kumar Gogada <bharatku@xilinx.com>
Cc: Ravi Kiran Gummaluri <rgummal@xilinx.com>
Mainline had UFO fixes, but UFO is removed in net-next so we
take the HEAD hunks.
Minor context conflict in bcmsysport statistics bug fix.
Signed-off-by: David S. Miller <davem@davemloft.net>
TI's DP83867 phy is used on DRA72x EVM rev C and DRA71x
EVMs. Enable support for it in omap2plus_defconfig.
The driver is built into the kernel to help NFS booting.
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The DRA72 EVM Rev C straps the DP83867 GigaBit Ethernet phy's RX_DV/RX_CTRL
pin in mode 1. Unfortunately, the phy data manual disallows this.
Add "ti,dp83867-rxctrl-strap-quirk" property to the phy's device-tree node
to allow kernel to enable software workaround for this incorrect strap
setting. This is as suggested by the phy's datamanual and ensures proper
operation of this PHY.
This needs to be done for both instances of this PHY present on the board.
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The DRA71 EVM straps the DP83867 GigaBit Ethernet phy's RX_DV/RX_CTRL pin
in mode 1. Unfortunately, the phy data manual disallows this.
Add "ti,dp83867-rxctrl-strap-quirk" property to the phy's device-tree node
to allow kernel to enable software workaround for this incorrect strap
setting. This is as suggested by the phy's datamanual and ensures proper
operation of this PHY.
This needs to be done for both instances of this PHY present on the board.
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Commit a1d5ebaf8c ("arm64: big-endian: don't treat code as data when
copying sigret code") moved the 32-bit sigreturn trampoline code from
the aarch32_sigret_code array to kuser32.S. The commit removed the
array definition from signal32.c, but not its declaration in
signal32.h. Remove the leftover declaration.
Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Defining the two functions as 'static inline' and exporting them
leads to the interesting case where we can use the interface
from loadable modules, but not from built-in drivers, as shown
in this link failure:
vers/nvdimm/claim.o: In function `nsio_rw_bytes':
claim.c:(.text+0x1b8): undefined reference to `arch_invalidate_pmem'
drivers/nvdimm/pmem.o: In function `pmem_dax_flush':
pmem.c:(.text+0x11c): undefined reference to `arch_wb_cache_pmem'
drivers/nvdimm/pmem.o: In function `pmem_make_request':
pmem.c:(.text+0x5a4): undefined reference to `arch_invalidate_pmem'
pmem.c:(.text+0x650): undefined reference to `arch_invalidate_pmem'
pmem.c:(.text+0x6d4): undefined reference to `arch_invalidate_pmem'
This removes the bogus 'static inline'.
Fixes: d50e071fda ("arm64: Implement pmem API support")
Acked-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
PINCTRL_TI_IODELAY should be enabled so that "pinctrl_dev" can be created
for pinctrl entries populated with iodelay values in device tree data.
Select PINCTRL_TI_IODELAY for SOC_DRA7XX here.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Enable C_CAN/D_CAN driver supported by 66AK2G
Signed-off-by: Franklin S Cooper Jr <fcooper@ti.com>
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
Add nodes for the two DCAN instances included in 66AK2G
Signed-off-by: Lokesh Vutla <lokeshvutla@ti.com>
[d-gerlach@ti.com: add power-domains and clock information]
Signed-off-by: Dave Gerlach <d-gerlach@ti.com>
[fcooper@ti.com: update subject and commit message. Misc minor updates]
Signed-off-by: Franklin S Cooper Jr <fcooper@ti.com>
Signed-off-by: Santosh Shilimkar <ssantosh@kernel.org>
It overflows the amount of space available in the initial .text section
of trap handler assembler in some configurations, resulting in build
failures.
Signed-off-by: David S. Miller <davem@davemloft.net>
The Cortex-A35 uses some implementation defined perf events.
The Cortex-A35 derives from the Cortex-A53 core, using the same event mapings
based on Cortex-A35 TRM r0p2, section C2.3 - Performance monitoring events
(pages C2-562 to C2-565).
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
The Cortex-A73 uses some implementation defined perf events.
This patch sets up the necessary mapping for Cortex-A73.
Mappings are based on Cortex-A73 TRM r0p2, section 11.9 Events
(pages 11-457 to 11-460).
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Now that the event mapping code always looks into the PMUv3 events
before any extended mappings, the extended mappings can be reduced to
only those events that are not discoverable through the PMCEID registers.
Signed-off-by: Will Deacon <will.deacon@arm.com>
Last level caches and node events were almost never connected in current
supported cores.
We connect last level caches to the actual last level within the core and
node events are connected to bus accesses.
Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Pull sparc updates from David Miller:
1) Recognize M8 cpus, just basic chip ID matching, from Allen Pais.
2) Prevent crashes when bringing up sunvdc virtual block devices in
some environments. From Jim Quigley.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sunvdc: prevent sunvdc panic when mpgroup disk added to guest domain
sparc64: Increase max_phys_bits to 51 and VA bits to 53 for M8.
sparc64: recognize and support sparc M8 cpu type
sparc64: properly name the cpu constants
The interrupt for power button is static data that comes from the
datasheet, there is no reason to need to define this value on every
board so seams reasonable put this information into the common tps65217
file.
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
The interrupt specifiers for USB and AC charger input are static data that
comes from the datasheet, there is no reason to need to define these values
on every board so seem reasonable put this information into the common
tps65217 file.
Signed-off-by: Enric Balletbo i Serra <enric.balletbo@collabora.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Replace deprecated "vmmc_aux" with the generic "vqmmc" binding for
MMC IO supply.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
DRA74x EVM Rev H EVM comes with revision 2.0 silicon.
However, earlier versions of EVM can come with either
revision 1.1 or revision 1.0 of silicon.
The device-tree file is written to support rev 2.0 of
silicon. pdata quirks are used to then override the
settings needed for PG 1.1 silicon.
PG 1.1 silicon has limitations w.r.t frequencies at
which MMC1/2/3 can operate as well as different IOdelay
numbers.
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Sekhar Nori <nsekhar@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
For systems with X86_FEATURE_TOPOEXT, current logic uses the APIC ID
to calculate shared_cpu_map. However, APIC IDs are not guaranteed to
be contiguous for cores across different L3s (e.g. family17h system
w/ downcore configuration). This breaks the logic, and results in an
incorrect L3 shared_cpu_map.
Instead, always use the previously calculated cpu_llc_shared_mask of
each CPU to derive the L3 shared_cpu_map.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170731085159.9455-3-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Current cpu_core_id fixup causes downcored F17h configurations to be
incorrect:
NODE: 0
processor 0 core id : 0
processor 1 core id : 1
processor 2 core id : 2
processor 3 core id : 4
processor 4 core id : 5
processor 5 core id : 0
NODE: 1
processor 6 core id : 2
processor 7 core id : 3
processor 8 core id : 4
processor 9 core id : 0
processor 10 core id : 1
processor 11 core id : 2
Code that relies on the cpu_core_id, like match_smt(), for example,
which builds the thread siblings masks used by the scheduler, is
mislead.
So, limit the fixup to pre-F17h machines. The new value for cpu_core_id
for F17h and later will represent the CPUID_Fn8000001E_EBX[CoreId],
which is guaranteed to be unique for each core within a socket.
This way we have:
NODE: 0
processor 0 core id : 0
processor 1 core id : 1
processor 2 core id : 2
processor 3 core id : 4
processor 4 core id : 5
processor 5 core id : 6
NODE: 1
processor 6 core id : 8
processor 7 core id : 9
processor 8 core id : 10
processor 9 core id : 12
processor 10 core id : 13
processor 11 core id : 14
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
[ Heavily massaged. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yazen Ghannam <Yazen.Ghannam@amd.com>
Link: http://lkml.kernel.org/r/20170731085159.9455-2-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We are now booting all mach-omap2 in device tree only mode.
Any code that is only called in legacy boot mode where
of_have_populated_dt() is not set is safe to remove now.
Let's leave the dummy omap2_system_dma_init_dev() check
in place for now to avoid a pointless merge conflict with
tusb6010 dmaengine conversion as pointed out by Peter
Ujfalusi <peter.ujfalusi@ti.com>.
Signed-off-by: Tony Lindgren <tony@atomide.com>
Switching FS and GS is a mess, and the current code is still subtly
wrong: it assumes that "Loading a nonzero value into FS sets the
index and base", which is false on AMD CPUs if the value being
loaded is 1, 2, or 3.
(The current code came from commit 3e2b68d752 ("x86/asm,
sched/x86: Rewrite the FS and GS context switch code"), which made
it better but didn't fully fix it.)
Rewrite it to be much simpler and more obviously correct. This
should fix it fully on AMD CPUs and shouldn't adversely affect
performance.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Chang Seok <chang.seok.bae@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>