Decoding an Intel PT trace of the kernel requires an accurate kernel
object image. This is provided by making a copy of kcore. However the
copy needs to be made under the same conditions as the original
recording, and then it needs to be associated with the perf.data file.
The perf-with-kcore script does that.
The script also checks the permissions on the buildid cache and can be
used to fix them. That is needed for distributions where root does not
have a home directory and consequently writes to the same buildid cache
as the user, resulting in cached files that the user does not have
access to.
Example:
$ ./perf-with-kcore
Usage: perf-with-kcore <perf sub-command> <perf.data directory> [<sub-command options> [ -- <workload>]]
<perf sub-command> can be record, script, report or inject
or: perf-with-kcore fix_buildid_cache_permissions
$ ./perf-with-kcore record pt_uname -e intel_pt// -- uname
Recording
Using /home/ahunter/bin/perf
perf version 3.15.rc3.g4549ba
/home/ahunter/bin/perf record -o pt_uname/perf.data -e intel_pt// -- uname
Linux
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.023 MB pt_uname/perf.data ]
Copying kcore
[sudo] password for ahunter:
Done
$ tools/perf/perf-with-kcore.sh script pt_uname | head
Using /home/ahunter/bin/perf
perf version 3.15.rc3.g4549ba
/home/ahunter/bin/perf script -i pt_uname/perf.data --kallsyms=pt_uname/kcore_dir/kallsyms
swapper 0 [002] 161533.969666: sched:sched_switch: swapper/2:0 [120] R ==> perf:11316 [120]
:11315 11315 [003] 161533.969704: sched:sched_switch: perf:11315 [120] S ==> swapper/3:0 [120]
:11316 11316 [002] 161533.969783: sched:sched_switch: perf:11316 [120] R ==> migration/2:33 [0]
:33 33 [002] 161533.969791: sched:sched_switch: migration/2:33 [0] S ==> swapper/2:0 [120]
swapper 0 [003] 161533.969792: sched:sched_switch: swapper/3:0 [120] R ==> perf:11316 [120]
:11316 11316 [003] 161533.970062: branches: 0 [unknown] ([unknown]) => ffffffff810532fa native_write_msr_safe ([kernel.kallsyms])
:11316 11316 [003] 161533.970062: branches: ffffffff810532fd native_write_msr_safe ([kernel.kallsyms]) => ffffffff81035b31 pt_config_start ([kernel.kallsyms])
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1406786474-9306-30-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This enables a PMU event to be specified in the form:
pmu//
which is effectively the same as:
pmu/config=0/
This patch is a precursor to defining default config for a PMU.
Further explanation extracted from lkml thread:
Imagine that the 'tsc' term did not exist.
Intel PT trace data would not contain TSC packets, and the decoder would
not know how to decode them.
Then imagine that a new version of the hardware adds 'tsc'.
It is such a useful feature that we want it by default, but older
versions of the tools don't know how to decode it, so the kernel cannot
turn it on by default.
It is similar to why the kernel does not select perf_event_attr.mmap2 by
default.
The kernel doesn't know whether the tool supports it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1408129739-17368-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The new split Intel uncore driver code that recently went
into tip added a section mismatch, which the build process
complains about.
uncore_pmu_register() can be called from uncore_pci_probe,()
which is not __init and can be called from pci driver ->probe.
I'm not fully sure if it's actually possible to call the probe
function later, but it seems safer to mark uncore_pmu_register
not __init.
This also fixes the warning.
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1409332858-29039-1-git-send-email-andi@firstfloor.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The use of "rcu_assign_pointer()" is NULLing out the pointer.
According to RCU_INIT_POINTER()'s block comment:
"1. This use of RCU_INIT_POINTER() is NULLing out the pointer"
it is better to use it instead of rcu_assign_pointer() because it has a
smaller overhead.
The following Coccinelle semantic patch was used:
@@
@@
- rcu_assign_pointer
+ RCU_INIT_POINTER
(..., NULL)
Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: http://lkml.kernel.org/r/20140822132605.GA20130@ada
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The use of "rcu_assign_pointer()" is NULLing out the pointer.
According to RCU_INIT_POINTER()'s block comment:
"1. This use of RCU_INIT_POINTER() is NULLing out the pointer"
it is better to use it instead of rcu_assign_pointer() because it has a
smaller overhead.
The following Coccinelle semantic patch was used:
@@
@@
- rcu_assign_pointer
+ RCU_INIT_POINTER
(..., NULL)
Signed-off-by: Andreea-Cristina Bernat <bernat.ada@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: paulmck@linux.vnet.ibm.com
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: http://lkml.kernel.org/r/20140822141536.GA32051@ada
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The NFS/RDMA Kconfig symbol was split into separate options for client
and server in commit 2e8c12e1b7 ("xprtrdma: add separate Kconfig
options for NFSoRDMA client and server support").
Update the documentation to reflect this split.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "J. Bruce Fields" <bfields@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The example code provided with the i2c device interface documentation
won't compile since it uses the reserved word "register" to name a
variable.
The compiler fails with this error message:
error: expected identifier or '(' before '=' token
__u8 register = 0x20; /* Device register to access */
^
Rename the variable "register" to simply "reg" in the example code.
Another couple of typos has been fixed as well.
[Change "! =" to "!=".]
Signed-off-by: Jose Alarcon Roldan <jose.alarcon.roldan@gmail.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Wolfram Sang <wsa@the-dreams.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Despite the fact that these functions have been around for years, they
are little used (only 15 uses in 13 files at the preseht time) even
though many other files use work-arounds to achieve the same result.
By documenting them, hopefully they will become more widely used.
Signed-off-by: Rob Jones <rob.jones@codethink.co.uk>
Acked-by: Steven Whitehouse <swhiteho@redhat.com>
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull ACPI and power management fixes from Rafael Wysocki:
"These are regression fixes (ACPI sysfs, ACPI video, suspend test),
ACPI cpuidle deadlock fix, missing runtime validation of ACPI _DSD
output, a fix and a new CPU ID for the RAPL driver, new blacklist
entry for the ACPI EC driver and a couple of trivial cleanups
(intel_pstate and generic PM domains).
Specifics:
- Fix for recently broken test_suspend= command line argument (Rafael
Wysocki).
- Fixes for regressions related to the ACPI video driver caused by
switching the default to native backlight handling in 3.16 from
Hans de Goede.
- Fix for a sysfs attribute of ACPI device objects that returns stale
values sometimes due to the fact that they are cached instead of
executing the appropriate method (_SUN) every time (broken in
3.14). From Yasuaki Ishimatsu.
- Fix for a deadlock between cpuidle_lock and cpu_hotplug.lock in the
ACPI processor driver from Jiri Kosina.
- Runtime output validation for the ACPI _DSD device configuration
object missing from the support for it that has been introduced
recently. From Mika Westerberg.
- Fix for an unuseful and misleading RAPL (Running Average Power
Limit) domain detection message in the RAPL driver from Jacob Pan.
- New Intel Haswell CPU ID for the RAPL driver from Jason Baron.
- New Clevo W350etq blacklist entry for the ACPI EC driver from Lan
Tianyu.
- Cleanup for the intel_pstate driver and the core generic PM domains
code from Gabriele Mazzotta and Geert Uytterhoeven"
* tag 'pm+acpi-3.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI / cpuidle: fix deadlock between cpuidle_lock and cpu_hotplug.lock
ACPI / scan: not cache _SUN value in struct acpi_device_pnp
cpufreq: intel_pstate: Remove unneeded variable
powercap / RAPL: change domain detection message
powercap / RAPL: add support for CPU model 0x3f
PM / domains: Make generic_pm_domain.name const
PM / sleep: Fix test_suspend= command line option
ACPI / EC: Add msi quirk for Clevo W350etq
ACPI / video: Disable native_backlight on HP ENVY 15 Notebook PC
ACPI / video: Add a disable_native_backlight quirk
ACPI / video: Fix use_native_backlight selection logic
ACPICA: ACPI 5.1: Add support for runtime validation of _DSD package.
Pull filesystem fixes from Al Viro:
"Several bugfixes (all of them -stable fodder).
Alexey's one deals with double mutex_lock() in UFS (apparently, nobody
has tried to test "ufs: sb mutex merge + mutex_destroy" on something
like file creation/removal on ufs). Mine deal with two kinds of
umount bugs, in umount propagation and in handling of automounted
submounts, both resulting in bogus transient EBUSY from umount"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
ufs: fix deadlocks introduced by sb mutex merge
fix EBUSY on umount() from MNT_SHRINKABLE
get rid of propagate_umount() mistakenly treating slaves as busy.
Pull RCU fix from Ingo Molnar:
"A boot hang fix for the offloaded callback RCU model (RCU_NOCB_CPU=y
&& (TREE_CPU=y || TREE_PREEMPT_RC)) in certain bootup scenarios"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu: Make nocb leader kthreads process pending callbacks after spawning
Pull timer fixes from Thomas Gleixner:
"Three fixlets from the timer departement:
- Update the timekeeper before updating vsyscall and pvclock. This
fixes the kvm-clock regression reported by Chris and Paolo.
- Use the proper irq work interface from NMI. This fixes the
regression reported by Catalin and Dave.
- Clarify the compat_nanosleep error handling mechanism to avoid
future confusion"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
timekeeping: Update timekeeper before updating vsyscall and pvclock
compat: nanosleep: Clarify error handling
nohz: Restore NMI safe local irq work for local nohz kick
Commit 0244756edc ("ufs: sb mutex merge + mutex_destroy") introduces
deadlocks in ufs_new_inode() and ufs_free_inode().
Most callers of that functions acqure the mutex by themselves and
ufs_{new,free}_inode() do that via lock_ufs(),
i.e we have an unavoidable double lock.
The patch proposes to resolve the issue by making sure that
ufs_{new,free}_inode() are not called with the mutex held.
Found by Linux Driver Verification project (linuxtesting.org).
Cc: stable@vger.kernel.org # 3.16
Signed-off-by: Alexey Khoroshilov <khoroshilov@ispras.ru>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Pull kvm fixes from Paolo Bonzini:
"A smattering of bug fixes across most architectures"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
powerpc/kvm/cma: Fix panic introduces by signed shift operation
KVM: s390/mm: Fix guest storage key corruption in ptep_set_access_flags
KVM: s390/mm: Fix storage key corruption during swapping
arm/arm64: KVM: Complete WFI/WFE instructions
ARM/ARM64: KVM: Nuke Hyp-mode tlbs before enabling MMU
KVM: s390/mm: try a cow on read only pages for key ops
KVM: s390: Fix user triggerable bug in dead code
Pull xfs fixes from Dave Chinner:
"The fixes all address recently discovered data corruption issues.
The original Direct IO issue was discovered by Chris Mason @ Facebook
on a production workload which mixed buffered reads with direct reads
and writes IO to the same file. The fix for that exposed other issues
with page invalidation (exposed by millions of fsx operations) failing
due to dirty buffers beyond EOF.
Finally, the collapse_range code could also cause problems due to
racing writeback changing the extent map while it was being shifted
around. The commits for that problem are simple mitigation fixes that
prevent the problem from occuring. A more robust fix for 3.18 that
addresses the underlying problem is currently being worked on by
Brian.
Summary of fixes:
- a direct IO read/buffered read data corruption
- the associated fallout from the DIO data corruption fix
- collapse range bugs that are potential data corruption issues"
* tag 'xfs-for-linus-3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/dgc/linux-xfs:
xfs: trim eofblocks before collapse range
xfs: xfs_file_collapse_range is delalloc challenged
xfs: don't log inode unless extent shift makes extent modifications
xfs: use ranged writeback and invalidation for direct IO
xfs: don't zero partial page cache pages during O_DIRECT writes
xfs: don't zero partial page cache pages during O_DIRECT writes
xfs: don't dirty buffers beyond EOF
Pull mtd fixes from Brian Norris:
"Two trivial MTD updates for 3.17-rc4:
- a tiny comment tweak, to kill a bunch of DocBook warnings added
during the merge window
- a small fixup to the OTP routines' error handling"
* tag 'for-linus-20140905' of git://git.infradead.org/linux-mtd:
mtd: nand: fix DocBook warnings on nand_sdr_timings doc
mtd: cfi_cmdset_0002: check return code for get_chip()
The update_walltime() code works on the shadow timekeeper to make the
seqcount protected region as short as possible. But that update to the
shadow timekeeper does not update all timekeeper fields because it's
sufficient to do that once before it becomes life. One of these fields
is tkr.base_mono. That stays stale in the shadow timekeeper unless an
operation happens which copies the real timekeeper to the shadow.
The update function is called after the update calls to vsyscall and
pvclock. While not correct, it did not cause any problems because none
of the invoked update functions used base_mono.
commit cbcf2dd3b3 (x86: kvm: Make kvm_get_time_and_clockread()
nanoseconds based) changed that in the kvm pvclock update function, so
the stale mono_base value got used and caused kvm-clock to malfunction.
Put the update where it belongs and fix the issue.
Reported-by: Chris J Arges <chris.j.arges@canonical.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gleb Natapov <gleb@kernel.org>
Cc: John Stultz <john.stultz@linaro.org>
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1409050000570.3333@nanos
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
The error handling in compat_sys_nanosleep() is correct, but
completely non obvious. Document it and restrict it to the
-ERESTART_RESTARTBLOCK return value for clarity.
Reported-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Pull i2c bugfixes from Wolfram Sang:
"I2C driver bugfixes for the 3.17 release. Details can be found in the
commit messages, yet I think this is typical driver stuff"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
Revert "i2c: rcar: remove spinlock"
i2c: at91: add bound checking on SMBus block length bytes
i2c: rk3x: fix bug that cause transfer fails in master receive mode
i2c: at91: Fix a race condition during signal handling in at91_do_twi_xfer.
i2c: mv64xxx: continue probe when clock-frequency is missing
i2c: rcar: fix MNR interrupt handling
Merge "at91: fixes for 3.17 #1" from Nicols Ferre:
First AT91 fixes batch for 3.17:
- compatibility string precision
- clock registration and USB DT fix for at91rm9200
* tag 'at91-fixes' of git://github.com/at91linux/linux-at91:
ARM: at91/dt: rm9200: fix usb clock definition
ARM: at91: rm9200: fix clock registration
ARM: at91/dt: sam9g20: set at91sam9g20 pllb driver
Signed-off-by: Kevin Hilman <khilman@linaro.org>
The function cleaning up an initialized event
was called from the "event_del" handler, instead
of being used as the "destroy" callback. In case of
events group allocation this caused NULL pointer
dereference (as events are added and deleted
multiple times then). Fixed now.
Signed-off-by: Pawel Moll <mail@pawelmoll.com>
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Pull m68k updates from Geert Uytterhoeven:
"Wire up new syscalls getrandom and memfd_create"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Wire up memfd_create
m68k: Wire up getrandom
The atmel,clk-divisors property is taking 4 divisors, if less are
provided, the clock registration will fail.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Actually register clocks from device tree when using the common clock
framework.
Signed-off-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Boris Brezillon <boris.brezillon@free-electrons.com>
[nicolas.ferre@atmel.com: add at91 to function name]
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Dave Hansen reports a massive scalability regression in an uncontained
page fault benchmark with more than 30 concurrent threads, which he
bisected down to 05b8430123 ("mm: memcontrol: use root_mem_cgroup
res_counter") and pin-pointed on res_counter spinlock contention.
That change relied on the per-cpu charge caches to mostly swallow the
res_counter costs, but it's apparent that the caches don't scale yet.
Revert memcg back to bypassing res_counters on the root level in order
to restore performance for uncontained workloads.
Reported-by: Dave Hansen <dave@sr71.net>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Tested-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Vladimir Davydov <vdavydov@parallels.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
This patch changes sync_filesystem() to be EXPORT_SYMBOL().
The reason this is needed is that starting with 3.15 kernel, due to
Theodore Ts'o's commit 02b9984d64 ("fs: push sync_filesystem() down to
the file system's remount_fs()"), all file systems that have dirty data
to be written out need to call sync_filesystem() from their
->remount_fs() method when remounting read-only.
As this is now a generically required function rather than an internal
only function it should be EXPORT_SYMBOL() so that all file systems can
call it.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Acked-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull regulator documentation fixes from Mark Brown:
"All the fixes people have found for the regulator API have been
documentation fixes, avoiding warnings while building the kerneldoc,
fixing some errors in one of the DT bindings documents and fixing some
typos in the header"
* tag 'regulator-v3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
regulator: fix kernel-doc warnings in header files
regulator: Proofread documentation
regulator: tps65090: Fix tps65090 typos in example
Merge "omap fixes against v3.17-rc3" from Tony Lindgren:
Few fixes for omaps mostly for various devices to get them working
properly on the new am437x and dra7 hardware for several devices
such as I2C, NAND, DDR3 and USB. There's also a clock fix for omap3.
And also included are two minor cosmetic fixes that are not
stictly fixes for the new hardware support added recently to
downgrade a GPMC warning into a debug statement, and fix the
confusing comments for dra7-evm spi1 mux.
Note that these are all .dts changes except for a GPMC change.
* tag 'omap-fixes-against-v3.17-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: (255 commits)
ARM: dts: dra7-evm: Add vtt regulator support
ARM: dts: dra7-evm: Fix spi1 mux documentation
ARM: dts: am43x-epos-evm: Disable QSPI to prevent conflict with GPMC-NAND
ARM: OMAP2+: gpmc: Don't complain if wait pin is used without r/w monitoring
ARM: dts: am43xx-epos-evm: Don't use read/write wait monitoring
ARM: dts: am437x-gp-evm: Don't use read/write wait monitoring
ARM: dts: am437x-gp-evm: Use BCH16 ECC scheme instead of BCH8
ARM: dts: am43x-epos-evm: Use BCH16 ECC scheme instead of BCH8
ARM: dts: am4372: fix USB regs size
ARM: dts: am437x-gp: switch i2c0 to 100KHz
ARM: dts: dra7-evm: Fix 8th NAND partition's name
ARM: dts: dra7-evm: Fix i2c3 pinmux and frequency
Linux 3.17-rc3
...
Signed-off-by: Kevin Hilman <khilman@linaro.org>
Pull GPIO fixes from Linus Walleij:
- some documentation sync
- resource leak in the bt8xx driver
- again fix the way varargs are used to handle the optional flags on
the gpiod_* accessors. Now hopefully nailed the entire problem.
* tag 'gpio-v3.17-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: move varargs hack outside #ifdef GPIOLIB
gpio: bt8xx: fix release of managed resources
Documentation: gpio: documentation for optional getters functions
* acpica:
ACPICA: ACPI 5.1: Add support for runtime validation of _DSD package.
* acpi-processor:
ACPI / cpuidle: fix deadlock between cpuidle_lock and cpu_hotplug.lock
* acpi-scan:
ACPI / scan: not cache _SUN value in struct acpi_device_pnp