Commit Graph

933928 Commits

Author SHA1 Message Date
Frederic Weisbecker
1f8a4212dc timers: Expand clk forward logic beyond nohz
As for next_expiry, the base->clk catch up logic will be expanded beyond
NOHZ in order to avoid triggering useless softirqs.

If softirqs should only fire to expire pending timers, periodic base->clk
increments must be skippable for random amounts of time.  Therefore prepare
to catch-up with missing updates whenever an up-to-date base clock is
needed.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lkml.kernel.org/r/20200717140551.29076-10-frederic@kernel.org
2020-07-17 21:55:24 +02:00
Frederic Weisbecker
90d52f65f3 timers: Reuse next expiry cache after nohz exit
Now that the next expiry it tracked unconditionally when a timer is added,
this information can be reused on a tick firing after exiting nohz.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lkml.kernel.org/r/20200717140551.29076-9-frederic@kernel.org
2020-07-17 21:55:23 +02:00
Frederic Weisbecker
dc2a0f1fb2 timers: Always keep track of next expiry
So far next expiry was only tracked while the CPU was in nohz_idle mode
in order to cope with missing ticks that can't increment the base->clk
periodically anymore.

This logic is going to be expanded beyond nohz in order to spare timer
softirqs so do it unconditionally.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lkml.kernel.org/r/20200717140551.29076-8-frederic@kernel.org
2020-07-17 21:55:23 +02:00
Frederic Weisbecker
001ec1b392 timers: Optimize _next_timer_interrupt() level iteration
If a level has a timer that expires before reaching the next level, there
is no need to iterate further.

The next level is reached when the 3 lower bits of the current level are
cleared. If the next event happens before/during that, the next levels
won't provide an earlier expiration.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lkml.kernel.org/r/20200717140551.29076-7-frederic@kernel.org
2020-07-17 21:55:22 +02:00
Frederic Weisbecker
4468897211 timers: Add comments about calc_index() ceiling work
calc_index() adds 1 unit of the level granularity to the expiry passed
in parameter to ensure that the timer doesn't expire too early. Add a
comment to explain that and the resulting layout in the wheel.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lkml.kernel.org/r/20200717140551.29076-6-frederic@kernel.org
2020-07-17 21:55:22 +02:00
Frederic Weisbecker
9a2b764b06 timers: Move trigger_dyntick_cpu() to enqueue_timer()
Consolidate the code by calling trigger_dyntick_cpu() from
enqueue_timer() instead of calling it from all its callers.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Link: https://lkml.kernel.org/r/20200717140551.29076-5-frederic@kernel.org
2020-07-17 21:55:22 +02:00
Anna-Maria Behnsen
1f32cab0db timers: Use only bucket expiry for base->next_expiry value
The bucket expiry time is the effective expriy time of timers and is
greater than or equal to the requested timer expiry time. This is due
to the guarantee that timers never expire early and the reduced expiry
granularity in the secondary wheel levels.

When a timer is enqueued, trigger_dyntick_cpu() checks whether the
timer is the new first timer. This check compares next_expiry with
the requested timer expiry value and not with the effective expiry
value of the bucket into which the timer was queued.

Storing the requested timer expiry value in base->next_expiry can lead
to base->clk going backwards if the requested timer expiry value is
smaller than base->clk. Commit 30c66fc30e ("timer: Prevent base->clk
from moving backward") worked around this by preventing the store when
timer->expiry is before base->clk, but did not fix the underlying
problem.

Use the expiry value of the bucket into which the timer is queued to
do the new first timer check. This fixes the base->clk going backward
problem.

The workaround of commit 30c66fc30e ("timer: Prevent base->clk from
moving backward") in trigger_dyntick_cpu() is not longer necessary as the
timers bucket expiry is guaranteed to be greater than or equal base->clk.

Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200717140551.29076-4-frederic@kernel.org
2020-07-17 21:55:21 +02:00
Frederic Weisbecker
3d2e83a2a6 timers: Preserve higher bits of expiration on index calculation
The higher bits of the timer expiration are cropped while calling
calc_index() due to the implicit cast from unsigned long to unsigned int.

This loss shouldn't have consequences on the current code since all the
computation to calculate the index is done on the lower 32 bits.

However to prepare for returning the actual bucket expiration from
calc_index() in order to properly fix base->next_expiry updates, the higher
bits need to be preserved.

Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20200717140551.29076-3-frederic@kernel.org
2020-07-17 21:55:21 +02:00
Frederic Weisbecker
e2a71bdea8 timer: Fix wheel index calculation on last level
When an expiration delta falls into the last level of the wheel, that delta
has be compared against the maximum possible delay and reduced to fit in if
necessary.

However instead of comparing the delta against the maximum, the code
compares the actual expiry against the maximum. Then instead of fixing the
delta to fit in, it sets the maximum delta as the expiry value.

This can result in various undesired outcomes, the worst possible one
being a timer expiring 15 days ahead to fire immediately.

Fixes: 500462a9de ("timers: Switch to a non-cascading wheel")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200717140551.29076-2-frederic@kernel.org
2020-07-17 21:44:05 +02:00
Frederic Weisbecker
30c66fc30e timer: Prevent base->clk from moving backward
When a timer is enqueued with a negative delta (ie: expiry is below
base->clk), it gets added to the wheel as expiring now (base->clk).

Yet the value that gets stored in base->next_expiry, while calling
trigger_dyntick_cpu(), is the initial timer->expires value. The
resulting state becomes:

	base->next_expiry < base->clk

On the next timer enqueue, forward_timer_base() may accidentally
rewind base->clk. As a possible outcome, timers may expire way too
early, the worst case being that the highest wheel levels get spuriously
processed again.

To prevent from that, make sure that base->next_expiry doesn't get below
base->clk.

Fixes: a683f390b9 ("timers: Forward the wheel clock whenever possible")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
Tested-by: Juri Lelli <juri.lelli@redhat.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20200703010657.2302-1-frederic@kernel.org
2020-07-09 11:56:57 +02:00
Linus Torvalds
dcb7fd82c7 Linux 5.8-rc4 v5.8-rc4 2020-07-05 16:20:22 -07:00
Linus Torvalds
bb5a93aaf2 x86/ldt: use "pr_info_once()" instead of open-coding it badly
Using a mutex for "print this warning only once" is so overdesigned as
to be actively offensive to my sensitive stomach.

Just use "pr_info_once()" that already does this, although in a
(harmlessly) racy manner that can in theory cause the message to be
printed twice if more than one CPU races on that "is this the first
time" test.

[ If somebody really cares about that harmless data race (which sounds
  very unlikely indeed), that person can trivially fix printk_once() by
  using a simple atomic access, preferably with an optimistic non-atomic
  test first before even bothering to treat the pointless "make sure it
  is _really_ just once" case.

  A mutex is most definitely never the right primitive to use for
  something like this. ]

Yes, this is a small and meaningless detail in a code path that hardly
matters.  But let's keep some code quality standards here, and not
accept outrageously bad code.

Link: https://lore.kernel.org/lkml/CAHk-=wgV9toS7GU3KmNpj8hCS9SeF+A0voHS8F275_mgLhL4Lw@mail.gmail.com/
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-05 12:50:20 -07:00
Linus Torvalds
72674d4800 Merge tag 'x86-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A series of fixes for x86:

   - Reset MXCSR in kernel_fpu_begin() to prevent using a stale user
     space value.

   - Prevent writing MSR_TEST_CTRL on CPUs which are not explicitly
     whitelisted for split lock detection. Some CPUs which do not
     support it crash even when the MSR is written to 0 which is the
     default value.

   - Fix the XEN PV fallout of the entry code rework

   - Fix the 32bit fallout of the entry code rework

   - Add more selftests to ensure that these entry problems don't come
     back.

   - Disable 16 bit segments on XEN PV. It's not supported because XEN
     PV does not implement ESPFIX64"

* tag 'x86-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/ldt: Disable 16-bit segments on Xen PV
  x86/entry/32: Fix #MC and #DB wiring on x86_32
  x86/entry/xen: Route #DB correctly on Xen PV
  x86/entry, selftests: Further improve user entry sanity checks
  x86/entry/compat: Clear RAX high bits on Xen PV SYSENTER
  selftests/x86: Consolidate and fix get/set_eflags() helpers
  selftests/x86/syscall_nt: Clear weird flags after each test
  selftests/x86/syscall_nt: Add more flag combinations
  x86/entry/64/compat: Fix Xen PV SYSENTER frame setup
  x86/entry: Move SYSENTER's regs->sp and regs->flags fixups into C
  x86/entry: Assert that syscalls are on the right stack
  x86/split_lock: Don't write MSR_TEST_CTRL on CPUs that aren't whitelisted
  x86/fpu: Reset MXCSR to default in kernel_fpu_begin()
2020-07-05 12:23:49 -07:00
Linus Torvalds
f23dbe1893 Merge tag 'irq-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "A set of interrupt chip driver fixes:

   - Ensure the atomicity of affinity updates in the GIC driver

   - Don't try to sleep in atomic context when waiting for the GICv4.1
     to respond. Use polling instead.

   - Typo fixes in Kconfig and warnings"

* tag 'irq-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic: Atomically update affinity
  irqchip/riscv-intc: Fix a typo in a pr_warn()
  irqchip/gic-v4.1: Use readx_poll_timeout_atomic() to fix sleep in atomic
  irqchip/loongson-pci-msi: Fix a typo in Kconfig
2020-07-05 12:22:35 -07:00
Linus Torvalds
5465a324af Merge tag 'core-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull rcu fixlet from Thomas Gleixner:
 "A single fix for a printk format warning in RCU"

* tag 'core-urgent-2020-07-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  rcuperf: Fix printk format warning
2020-07-05 12:21:28 -07:00
Linus Torvalds
4bc927367d Merge tag 'kbuild-fixes-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes frin Masahiro Yamada:

 - fix various bugs in xconfig

 - fix some issues in cross-compilation using Clang

 - fix documentation

* tag 'kbuild-fixes-v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  .gitignore: Do not track `defconfig` from `make savedefconfig`
  kbuild: make Clang build userprogs for target architecture
  kbuild: fix CONFIG_CC_CAN_LINK(_STATIC) for cross-compilation with Clang
  kconfig: qconf: parse newer types at debug info
  kconfig: qconf: navigate menus on hyperlinks
  kconfig: qconf: don't show goback button on splitMode
  kconfig: qconf: simplify the goBack() logic
  kconfig: qconf: re-implement setSelected()
  kconfig: qconf: make debug links work again
  kconfig: qconf: make search fully work again on split mode
  kconfig: qconf: cleanup includes
  docs: kbuild: fix ReST formatting
  gcc-plugins: fix gcc-plugins directory path in documentation
2020-07-05 12:14:24 -07:00
Linus Torvalds
19a61a753d Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
 "Four small fixes in three drivers.

  The mptfusion one has actually caused user visible issues in certain
  kernel configurations"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: mptfusion: Don't use GFP_ATOMIC for larger DMA allocations
  scsi: libfc: Skip additional kref updating work event
  scsi: libfc: Handling of extra kref
  scsi: qla2xxx: Fix a condition in qla2x00_find_all_fabric_devs()
2020-07-05 10:56:44 -07:00
Linus Torvalds
29206c6314 Merge tag 'block-5.8-2020-07-05' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - NVMe fixes from Christoph:
    - Fix crash in multi-path disk add (Christoph)
    - Fix ignore of identify error (Sagi)

 - Fix a compiler complaint that a function should be static (Wei)

* tag 'block-5.8-2020-07-05' of git://git.kernel.dk/linux-block:
  block: make function __bio_integrity_free() static
  nvme: fix a crash in nvme_mpath_add_disk
  nvme: fix identify error status silent ignore
2020-07-05 10:45:31 -07:00
Linus Torvalds
9fbe565cb7 Merge tag 'io_uring-5.8-2020-07-05' of git://git.kernel.dk/linux-block
Pull io_uring fix from Jens Axboe:
 "Andres reported a regression with the fix that was merged earlier this
  week, where his setup of using signals to interrupt io_uring CQ waits
  no longer worked correctly.

  Fix this, and also limit our use of TWA_SIGNAL to the case where we
  need it, and continue using TWA_RESUME for task_work as before.

  Since the original is marked for 5.7 stable, let's flush this one out
  early"

* tag 'io_uring-5.8-2020-07-05' of git://git.kernel.dk/linux-block:
  io_uring: fix regression with always ignoring signals in io_cqring_wait()
2020-07-05 10:41:33 -07:00
Linus Torvalds
7783485401 Merge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
 "The usual driver fixes and documentation updates"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: mlxcpld: check correct size of maximum RECV_LEN packet
  i2c: add Kconfig help text for slave mode
  i2c: slave-eeprom: update documentation
  i2c: eg20t: Load module automatically if ID matches
  i2c: designware: platdrv: Set class based on DMI
  i2c: algo-pca: Add 0x78 as SCL stuck low status for PCA9665
2020-07-05 10:35:01 -07:00
Linus Torvalds
45a5ac7a5c Merge tag 'mips_fixes_5.8_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Thomas Bogendoerfer:

 - fix for missing hazard barrier

 - DT fix for ingenic

 - DT fix of GPHY names for lantiq

 - fix usage of smp_processor_id() while preemption is enabled

* tag 'mips_fixes_5.8_1' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: Do not use smp_processor_id() in preemptible code
  MIPS: Add missing EHB in mtc0 -> mfc0 sequence for DSPen
  MIPS: ingenic: gcw0: Fix HP detection GPIO.
  MIPS: lantiq: xway: sysctrl: fix the GPHY clock alias names
2020-07-05 10:29:32 -07:00
Xingxing Su
5868347a19 MIPS: Do not use smp_processor_id() in preemptible code
Use preempt_disable() to fix the following bug under CONFIG_DEBUG_PREEMPT.

[   21.915305] BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-mip/1056
[   21.923996] caller is do_ri+0x1d4/0x690
[   21.927921] CPU: 0 PID: 1056 Comm: qemu-system-mip Not tainted 5.8.0-rc2 #3
[   21.934913] Stack : 0000000000000001 ffffffff81370000 ffffffff8071cd60 a80f926d5ac95694
[   21.942984]         a80f926d5ac95694 0000000000000000 98000007f0043c88 ffffffff80f2fe40
[   21.951054]         0000000000000000 0000000000000000 0000000000000001 0000000000000000
[   21.959123]         ffffffff802d60cc 98000007f0043dd8 ffffffff81f4b1e8 ffffffff81f60000
[   21.967192]         ffffffff81f60000 ffffffff80fe0000 ffff000000000000 0000000000000000
[   21.975261]         fffffffff500cce1 0000000000000001 0000000000000002 0000000000000000
[   21.983331]         ffffffff80fe1a40 0000000000000006 ffffffff8077f940 0000000000000000
[   21.991401]         ffffffff81460000 98000007f0040000 98000007f0043c80 000000fffba8cf20
[   21.999471]         ffffffff8071cd60 0000000000000000 0000000000000000 0000000000000000
[   22.007541]         0000000000000000 0000000000000000 ffffffff80212ab4 a80f926d5ac95694
[   22.015610]         ...
[   22.018086] Call Trace:
[   22.020562] [<ffffffff80212ab4>] show_stack+0xa4/0x138
[   22.025732] [<ffffffff8071cd60>] dump_stack+0xf0/0x150
[   22.030903] [<ffffffff80c73f5c>] check_preemption_disabled+0xf4/0x100
[   22.037375] [<ffffffff80213b84>] do_ri+0x1d4/0x690
[   22.042198] [<ffffffff8020b828>] handle_ri_int+0x44/0x5c
[   24.359386] BUG: using smp_processor_id() in preemptible [00000000] code: qemu-system-mip/1072
[   24.368204] caller is do_ri+0x1a8/0x690
[   24.372169] CPU: 4 PID: 1072 Comm: qemu-system-mip Not tainted 5.8.0-rc2 #3
[   24.379170] Stack : 0000000000000001 ffffffff81370000 ffffffff8071cd60 a80f926d5ac95694
[   24.387246]         a80f926d5ac95694 0000000000000000 98001007ef06bc88 ffffffff80f2fe40
[   24.395318]         0000000000000000 0000000000000000 0000000000000001 0000000000000000
[   24.403389]         ffffffff802d60cc 98001007ef06bdd8 ffffffff81f4b818 ffffffff81f60000
[   24.411461]         ffffffff81f60000 ffffffff80fe0000 ffff000000000000 0000000000000000
[   24.419533]         fffffffff500cce1 0000000000000001 0000000000000002 0000000000000000
[   24.427603]         ffffffff80fe0000 0000000000000006 ffffffff8077f940 0000000000000020
[   24.435673]         ffffffff81460020 98001007ef068000 98001007ef06bc80 000000fffbbbb370
[   24.443745]         ffffffff8071cd60 0000000000000000 0000000000000000 0000000000000000
[   24.451816]         0000000000000000 0000000000000000 ffffffff80212ab4 a80f926d5ac95694
[   24.459887]         ...
[   24.462367] Call Trace:
[   24.464846] [<ffffffff80212ab4>] show_stack+0xa4/0x138
[   24.470029] [<ffffffff8071cd60>] dump_stack+0xf0/0x150
[   24.475208] [<ffffffff80c73f5c>] check_preemption_disabled+0xf4/0x100
[   24.481682] [<ffffffff80213b58>] do_ri+0x1a8/0x690
[   24.486509] [<ffffffff8020b828>] handle_ri_int+0x44/0x5c

Signed-off-by: Xingxing Su <suxingxing@loongson.cn>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-07-05 11:43:52 +02:00
Hauke Mehrtens
fcec538ef8 MIPS: Add missing EHB in mtc0 -> mfc0 sequence for DSPen
This resolves the hazard between the mtc0 in the change_c0_status() and
the mfc0 in configure_exception_vector(). Without resolving this hazard
configure_exception_vector() could read an old value and would restore
this old value again. This would revert the changes change_c0_status()
did. I checked this by printing out the read_c0_status() at the end of
per_cpu_trap_init() and the ST0_MX is not set without this patch.

The hazard is documented in the MIPS Architecture Reference Manual Vol.
III: MIPS32/microMIPS32 Privileged Resource Architecture (MD00088), rev
6.03 table 8.1 which includes:

   Producer | Consumer | Hazard
  ----------|----------|----------------------------
   mtc0     | mfc0     | any coprocessor 0 register

I saw this hazard on an Atheros AR9344 rev 2 SoC with a MIPS 74Kc CPU.
There the change_c0_status() function would activate the DSPen by
setting ST0_MX in the c0_status register. This was reverted and then the
system got a DSP exception when the DSP registers were saved in
save_dsp() in the first process switch. The crash looks like this:

[    0.089999] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.097796] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes, linear)
[    0.107070] Kernel panic - not syncing: Unexpected DSP exception
[    0.113470] Rebooting in 1 seconds..

We saw this problem in OpenWrt only on the MIPS 74Kc based Atheros SoCs,
not on the 24Kc based SoCs. We only saw it with kernel 5.4 not with
kernel 4.19, in addition we had to use GCC 8.4 or 9.X, with GCC 8.3 it
did not happen.

In the kernel I bisected this problem to commit 9012d01166 ("compiler:
allow all arches to enable CONFIG_OPTIMIZE_INLINING"), but when this was
reverted it also happened after commit 172dcd935c ("MIPS: Always
allocate exception vector for MIPSr2+").

Commit 0b24cae4d5 ("MIPS: Add missing EHB in mtc0 -> mfc0 sequence.")
does similar changes to a different file. I am not sure if there are
more places affected by this problem.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
2020-07-05 11:43:25 +02:00
Paul Menzel
ba77dca584 .gitignore: Do not track defconfig from make savedefconfig
Running `make savedefconfig` creates by default `defconfig`, which is,
currently, on git’s radar, for example, `git status` lists this file as
untracked.

So, add the file to `.gitignore`, so it’s ignored by git.

Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-07-05 16:15:46 +09:00
Linus Torvalds
9bc0b029a8 Merge tag 'powerpc-5.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
 "One fix for a regression in our pkey handling, which exhibits as
  PROT_EXEC mappings taking continuous page faults.

  Thanks to: Jan Stancek, Aneesh Kumar K.V"

* tag 'powerpc-5.8-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
  powerpc/mm/pkeys: Make pkey access check work on execute_only_key
2020-07-04 14:46:11 -07:00
Linus Torvalds
ec84c3f6ef Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
 "Nothing earth-shattering, really - some CPU errata workarounds (one
  day they'll get it right, ha!) and a fix for a boot failure with very
  large kernel images where the alternative patching gets confused when
  patching relative branches using veneers.

   - Fix alternative patching for very large kernel images and modules

   - Hook up existing CPU errata workarounds for Qualcomm Kryo CPUs"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Add KRYO4XX silver CPU cores to erratum list 1530923 and 1024718
  arm64: Add KRYO4XX gold CPU cores to erratum list 1463225 and 1418040
  arm64: Add MIDR value for KRYO4XX gold CPU cores
  arm64/alternatives: use subsections for replacement sequences
2020-07-04 14:43:26 -07:00
Jens Axboe
b7db41c9e0 io_uring: fix regression with always ignoring signals in io_cqring_wait()
When switching to TWA_SIGNAL for task_work notifications, we also made
any signal based condition in io_cqring_wait() return -ERESTARTSYS.
This breaks applications that rely on using signals to abort someone
waiting for events.

Check if we have a signal pending because of queued task_work, and
repeat the signal check once we've run the task_work. This provides a
reliable way of telling the two apart.

Additionally, only use TWA_SIGNAL if we are using an eventfd. If not,
we don't have the dependency situation described in the original commit,
and we can get by with just using TWA_RESUME like we previously did.

Fixes: ce593a6c48 ("io_uring: use signal based task_work running")
Cc: stable@vger.kernel.org # v5.7
Reported-by: Andres Freund <andres@anarazel.de>
Tested-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-04 13:44:45 -06:00
Andy Lutomirski
cc801833a1 x86/ldt: Disable 16-bit segments on Xen PV
Xen PV doesn't implement ESPFIX64, so they don't work right.  Disable
them.  Also print a warning the first time anyone tries to use a
16-bit segment on a Xen PV guest that would otherwise allow it
to help people diagnose this change in behavior.

This gets us closer to having all x86 selftests pass on Xen PV.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/92b2975459dfe5929ecf34c3896ad920bd9e3f2d.1593795633.git.luto@kernel.org
2020-07-04 19:47:26 +02:00
Andy Lutomirski
13cbc0cd4a x86/entry/32: Fix #MC and #DB wiring on x86_32
DEFINE_IDTENTRY_MCE and DEFINE_IDTENTRY_DEBUG were wired up as non-RAW
on x86_32, but the code expected them to be RAW.

Get rid of all the macro indirection for them on 32-bit and just use
DECLARE_IDTENTRY_RAW and DEFINE_IDTENTRY_RAW directly.

Also add a warning to make sure that we only hit the _kernel paths
in kernel mode.

Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/9e90a7ee8e72fd757db6d92e1e5ff16339c1ecf9.1593795633.git.luto@kernel.org
2020-07-04 19:47:26 +02:00
Andy Lutomirski
f41f082422 x86/entry/xen: Route #DB correctly on Xen PV
On Xen PV, #DB doesn't use IST. It still needs to be correctly routed
depending on whether it came from user or kernel mode.

Get rid of DECLARE/DEFINE_IDTENTRY_XEN -- it was too hard to follow the
logic.  Instead, route #DB and NMI through DECLARE/DEFINE_IDTENTRY_RAW on
Xen, and do the right thing for #DB.  Also add more warnings to the
exc_debug* handlers to make this type of failure more obvious.

This fixes various forms of corruption that happen when usermode
triggers #DB on Xen PV.

Fixes: 4c0dcd8350 ("x86/entry: Implement user mode C entry points for #DB and #MCE")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/4163e733cce0b41658e252c6c6b3464f33fdff17.1593795633.git.luto@kernel.org
2020-07-04 19:47:25 +02:00
Andy Lutomirski
3c73b81a91 x86/entry, selftests: Further improve user entry sanity checks
Chasing down a Xen bug caused me to realize that the new entry sanity
checks are still fairly weak.  Add some more checks.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/881de09e786ab93ce56ee4a2437ba2c308afe7a9.1593795633.git.luto@kernel.org
2020-07-04 19:47:25 +02:00
Andy Lutomirski
db5b2c5a90 x86/entry/compat: Clear RAX high bits on Xen PV SYSENTER
Move the clearing of the high bits of RAX after Xen PV joins the SYSENTER
path so that Xen PV doesn't skip it.

Arguably this code should be deleted instead, but that would belong in the
merge window.

Fixes: ffae641f57 ("x86/entry/64/compat: Fix Xen PV SYSENTER frame setup")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/9d33b3f3216dcab008070f1c28b6091ae7199969.1593795633.git.luto@kernel.org
2020-07-04 19:47:25 +02:00
Linus Torvalds
35e884f89d Merge tag 'for-linus-5.8b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
 "One small cleanup patch for ARM and two patches for the xenbus driver
  fixing latent problems (large stack allocations and bad return code
  settings)"

* tag 'for-linus-5.8b-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/xenbus: let xenbus_map_ring_valloc() return errno values only
  xen/xenbus: avoid large structs and arrays on the stack
  arm/xen: remove the unused macro GRANT_TABLE_PHYSADDR
2020-07-03 23:58:12 -07:00
Wolfram Sang
597911287f i2c: mlxcpld: check correct size of maximum RECV_LEN packet
I2C_SMBUS_BLOCK_MAX defines already the maximum number as defined in the
SMBus 2.0 specs. I don't see a reason to add 1 here. Also, fix the errno
to what is suggested for this error.

Fixes: c9bfdc7c16 ("i2c: mlxcpld: Add support for smbus block read transaction")
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Michael Shych <michaelsh@mellanox.com>
Tested-by: Michael Shych <michaelsh@mellanox.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-07-04 08:20:38 +02:00
Linus Torvalds
8b082a41da Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull sysctl fix from Al Viro:
 "Another regression fix for sysctl changes this cycle..."

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  Call sysctl_head_finish on error
2020-07-03 23:20:14 -07:00
Wolfram Sang
58e64b050d i2c: add Kconfig help text for slave mode
I can't recall why there was none, but we surely want to have it.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-07-04 08:17:53 +02:00
Wolfram Sang
59d3d6042d i2c: slave-eeprom: update documentation
Add more details which have either been missing ever since or describe
recent additions.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Luca Ceresoli <luca@lucaceresoli.net>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-07-04 08:17:53 +02:00
Andy Shevchenko
5f90786b31 i2c: eg20t: Load module automatically if ID matches
The driver can't be loaded automatically because it misses
module alias to be provided. Add corresponding MODULE_DEVICE_TABLE()
call to the driver.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-07-04 08:17:53 +02:00
Ricardo Ribalda
db2a8b6f1d i2c: designware: platdrv: Set class based on DMI
Current AMD's zen-based APUs use this core for some of its i2c-buses.

With this patch we re-enable autodetection of hwmon-alike devices, so
lm-sensors will be able to work automatically.

It does not affect the boot-time of embedded devices, as the class is
set based on the DMI information.

DMI is probed only on Qtechnology QT5222 Industrial Camera Platform.

DocLink: https://qtec.com/camera-technology-camera-platforms/
Fixes: 3eddad96c4 ("i2c: designware: reverts "i2c: designware: Add support for AMD I2C controller"")
Signed-off-by: Ricardo Ribalda <ribalda@kernel.org>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-07-04 08:17:53 +02:00
Chris Packham
cd217f2300 i2c: algo-pca: Add 0x78 as SCL stuck low status for PCA9665
The PCA9665 datasheet says that I2CSTA = 78h indicates that SCL is stuck
low, this differs to the PCA9564 which uses 90h for this indication.
Treat either 0x78 or 0x90 as an indication that the SCL line is stuck.

Based on looking through the PCA9564 and PCA9665 datasheets this should
be safe for both chips. The PCA9564 should not return 0x78 for any valid
state and the PCA9665 should not return 0x90.

Fixes: eff9ec95ef ("i2c-algo-pca: Add PCA9665 support")
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2020-07-04 08:17:47 +02:00
Linus Torvalds
b8e516b367 Merge tag '5.8-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
 "Eight cifs/smb3 fixes, most when specifying the multiuser mount flag.

  Five of the fixes are for stable"

* tag '5.8-rc3-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: prevent truncation from long to int in wait_for_free_credits
  cifs: Fix the target file was deleted when rename failed.
  SMB3: Honor 'posix' flag for multiuser mounts
  SMB3: Honor 'handletimeout' flag for multiuser mounts
  SMB3: Honor lease disabling for multiuser mounts
  SMB3: Honor persistent/resilient handle flags for multiuser mounts
  SMB3: Honor 'seal' flag for multiuser mounts
  cifs: Display local UID details for SMB sessions in DebugData
2020-07-03 23:03:45 -07:00
Linus Torvalds
6f216714a6 Merge tag 'hwmon-for-v5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging
Pull hwmon fixes from Guenter Roeck:

 - Fix typo in Kconfig SENSORS_IR35221 option

 - Fix potential memory leak in acpi_power_meter_add()

 - Make sure the OVERT mask is set correctly in max6697 driver

 - In PMBus core, fix page vs. register when accessing fans

 - Mark is_visible functions static in bt1-pvt driver

 - Define Temp- and Volt-to-N poly as maybe-unused in bt1-pvt driver

* tag 'hwmon-for-v5.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (pmbus) fix a typo in Kconfig SENSORS_IR35221 option
  hwmon: (acpi_power_meter) Fix potential memory leak in acpi_power_meter_add()
  hwmon: (max6697) Make sure the OVERT mask is set correctly
  hwmon: (pmbus) Fix page vs. register when accessing fans
  hwmon: (bt1-pvt) Mark is_visible functions static
  hwmon: (bt1-pvt) Define Temp- and Volt-to-N poly as maybe-unused
2020-07-03 17:28:16 -07:00
Linus Torvalds
bc2391e7bd Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "Subsystems affected by this patch series: mm/hugetlb, samples, mm/cma,
  mm/vmalloc, mm/pagealloc"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/page_alloc: fix documentation error
  vmalloc: fix the owner argument for the new __vmalloc_node_range callers
  mm/cma.c: use exact_nid true to fix possible per-numa cma leak
  samples/vfs: avoid warning in statx override
  mm/hugetlb.c: fix pages per hugetlb calculation
2020-07-03 17:23:50 -07:00
Joel Savitz
8beeae86b8 mm/page_alloc: fix documentation error
When I increased the upper bound of the min_free_kbytes value in
ee8eb9a5fe ("mm/page_alloc: increase default min_free_kbytes bound") I
forgot to tweak the above comment to reflect the new value.  This patch
fixes that mistake.

Signed-off-by: Joel Savitz <jsavitz@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Rafael Aquini <aquini@redhat.com>
Cc: Fabrizio D'Angelo <fdangelo@redhat.com>
Link: http://lkml.kernel.org/r/20200624221236.29560-1-jsavitz@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-03 16:15:25 -07:00
Christoph Hellwig
a3a66c3822 vmalloc: fix the owner argument for the new __vmalloc_node_range callers
Fix the recently added new __vmalloc_node_range callers to pass the
correct values as the owner for display in /proc/vmallocinfo.

Fixes: 800e26b813 ("x86/hyperv: allocate the hypercall page with only read and execute bits")
Fixes: 10d5e97c1b ("arm64: use PAGE_KERNEL_ROX directly in alloc_insn_page")
Fixes: 7a0e27b2a0 ("mm: remove vmalloc_exec")
Reported-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20200627075649.2455097-1-hch@lst.de
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-03 16:15:25 -07:00
Barry Song
40366bd70b mm/cma.c: use exact_nid true to fix possible per-numa cma leak
Calling cma_declare_contiguous_nid() with false exact_nid for per-numa
reservation can easily cause cma leak and various confusion.  For example,
mm/hugetlb.c is trying to reserve per-numa cma for gigantic pages.  But it
can easily leak cma and make users confused when system has memoryless
nodes.

In case the system has 4 numa nodes, and only numa node0 has memory.  if
we set hugetlb_cma=4G in bootargs, mm/hugetlb.c will get 4 cma areas for 4
different numa nodes.  since exact_nid=false in current code, all 4 numa
nodes will get cma successfully from node0, but hugetlb_cma[1 to 3] will
never be available to hugepage will only allocate memory from
hugetlb_cma[0].

In case the system has 4 numa nodes, both numa node0&2 has memory, other
nodes have no memory.  if we set hugetlb_cma=4G in bootargs, mm/hugetlb.c
will get 4 cma areas for 4 different numa nodes.  since exact_nid=false in
current code, all 4 numa nodes will get cma successfully from node0 or 2,
but hugetlb_cma[1] and [3] will never be available to hugepage as
mm/hugetlb.c will only allocate memory from hugetlb_cma[0] and
hugetlb_cma[2].  This causes permanent leak of the cma areas which are
supposed to be used by memoryless node.

Of cource we can workaround the issue by letting mm/hugetlb.c scan all cma
areas in alloc_gigantic_page() even node_mask includes node0 only.  that
means when node_mask includes node0 only, we can get page from
hugetlb_cma[1] to hugetlb_cma[3].  But this will cause kernel crash in
free_gigantic_page() while it wants to free page by:
cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order)

On the other hand, exact_nid=false won't consider numa distance, it might
be not that useful to leverage cma areas on remote nodes.  I feel it is
much simpler to make exact_nid true to make everything clear.  After that,
memoryless nodes won't be able to reserve per-numa CMA from other nodes
which have memory.

Fixes: cf11e85fc0 ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Aslan Bakirov <aslan@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andreas Schaufler <andreas.schaufler@gmx.de>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Joonsoo Kim <js1304@gmail.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200628074345.27228-1-song.bao.hua@hisilicon.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-03 16:15:25 -07:00
Kees Cook
c3eeaae9fd samples/vfs: avoid warning in statx override
Something changed recently to uncover this warning:

  samples/vfs/test-statx.c:24:15: warning: `struct foo' declared inside parameter list will not be visible outside of this definition or declaration
     24 | #define statx foo
        |               ^~~

Which is due the use of "struct statx" (here, "struct foo") in a function
prototype argument list before it has been defined:

 int
 # 56 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h"
    foo
 # 56 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h" 3 4
          (int __dirfd, const char *__restrict __path, int __flags,
            unsigned int __mask, struct
 # 57 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h"
                                       foo
 # 57 "/usr/include/x86_64-linux-gnu/bits/statx-generic.h" 3 4
                                             *__restrict __buf)
   __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (2, 5)));

Add explicit struct before #include to avoid warning.

Fixes: f1b5618e01 ("vfs: Add a sample program for the new mount API")
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: David Howells <dhowells@redhat.com>
Link: http://lkml.kernel.org/r/202006282213.C516EA6@keescook
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-03 16:15:25 -07:00
Mike Kravetz
1139d336ff mm/hugetlb.c: fix pages per hugetlb calculation
The routine hpage_nr_pages() was incorrectly used to calculate the number
of base pages in a hugetlb page.  hpage_nr_pages is designed to be called
for THP pages and will return HPAGE_PMD_NR for hugetlb pages of any size.

Due to the context in which hpage_nr_pages was called, it is unlikely to
produce a user visible error.  The routine with the incorrect call is only
exercised in the case of hugetlb memory error or migration.  In addition,
this would need to be on an architecture which supports huge page sizes
less than PMD_SIZE.  And, the vma containing the huge page would also need
to smaller than PMD_SIZE.

Fixes: c0d0381ade ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: <stable@vger.kernel.org>
Link: http://lkml.kernel.org/r/20200629185003.97202-1-mike.kravetz@oracle.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-07-03 16:15:25 -07:00
Linus Torvalds
0c7d7d1fad Merge tag 'xfs-5.8-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs fix from Darrick Wong:
 "Fix a use-after-free bug when the fs shuts down"

* tag 'xfs-5.8-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix use-after-free on CIL context on shutdown
2020-07-03 14:46:46 -07:00
Linus Torvalds
7fec3ce50a Merge tag 'pci-v5.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci
Pull PCI fix from Bjorn Helgaas:
 "Fix a pcie_find_root_port() simplification that broke power management
  because it didn't handle the edge case of finding the Root Port of a
  Root Port itself (Mika Westerberg)""

* tag 'pci-v5.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: Make pcie_find_root_port() work for Root Ports
2020-07-03 12:14:51 -07:00