Commit Graph

107407 Commits

Author SHA1 Message Date
Linus Torvalds
24b888d8d5 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
 "A few updates for x86:

   - Fix an unintended sign extension issue in the fault handling code

   - Rename the new resource control config switch so it's less
     confusing

   - Avoid setting up EFI info in kexec when the EFI runtime is
     disabled.

   - Fix the microcode version check in the AMD microcode loader so it
     only loads higher version numbers and never downgrades

   - Set EFER.LME in the 32bit trampoline before returning to long mode
     to handle older AMD/KVM behaviour properly.

   - Add Darren and Andy as x86/platform reviewers"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/resctrl: Avoid confusion over the new X86_RESCTRL config
  x86/kexec: Don't setup EFI info if EFI runtime is not enabled
  x86/microcode/amd: Don't falsely trick the late loading mechanism
  MAINTAINERS: Add Andy and Darren as arch/x86/platform/ reviewers
  x86/fault: Fix sign-extend unintended sign extension
  x86/boot/compressed/64: Set EFER.LME=1 in 32-bit trampoline before returning to long mode
  x86/cpu: Add Atom Tremont (Jacobsville)
2019-02-03 09:08:12 -08:00
Linus Torvalds
cc6810e36b Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull cpu hotplug fixes from Thomas Gleixner:
 "Two fixes for the cpu hotplug machinery:

   - Replace the overly clever 'SMT disabled by BIOS' detection logic as
     it breaks KVM scenarios and prevents speculation control updates
     when the Hyperthreads are brought online late after boot.

   - Remove a redundant invocation of the speculation control update
     function"

* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM
  x86/speculation: Remove redundant arch_smt_update() invocation
2019-02-03 09:02:03 -08:00
Linus Torvalds
c8864cb70f Merge tag 'for-linus-20190202' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A few fixes that should go into this release. This contains:

   - MD pull request from Song, fixing a recovery OOM issue (Alexei)

   - Fix for a sync related stall (Jianchao)

   - Dummy callback for timeouts (Tetsuo)

   - IDE atapi sense ordering fix (me)"

* tag 'for-linus-20190202' of git://git.kernel.dk/linux-block:
  ide: ensure atapi sense request aren't preempted
  blk-mq: fix a hung issue when fsync
  block: pass no-op callback to INIT_WORK().
  md/raid5: fix 'out of memory' during raid cache recovery
2019-02-02 10:16:28 -08:00
Johannes Weiner
e6d429313e x86/resctrl: Avoid confusion over the new X86_RESCTRL config
"Resource Control" is a very broad term for this CPU feature, and a term
that is also associated with containers, cgroups etc. This can easily
cause confusion.

Make the user prompt more specific. Match the config symbol name.

 [ bp: In the future, the corresponding ARM arch-specific code will be
   under ARM_CPU_RESCTRL and the arch-agnostic bits will be carved out
   under the CPU_RESCTRL umbrella symbol. ]

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Babu Moger <Babu.Moger@amd.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Morse <james.morse@arm.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: linux-doc@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pu Wen <puwen@hygon.cn>
Cc: Reinette Chatre <reinette.chatre@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: x86-ml <x86@kernel.org>
Link: https://lkml.kernel.org/r/20190130195621.GA30653@cmpxchg.org
2019-02-02 10:34:52 +01:00
Qian Cai
b13bc35193 mm/hotplug: invalid PFNs from pfn_to_online_page()
On an arm64 ThunderX2 server, the first kmemleak scan would crash [1]
with CONFIG_DEBUG_VM_PGFLAGS=y due to page_to_nid() found a pfn that is
not directly mapped (MEMBLOCK_NOMAP).  Hence, the page->flags is
uninitialized.

This is due to the commit 9f1eb38e0e ("mm, kmemleak: little
optimization while scanning") starts to use pfn_to_online_page() instead
of pfn_valid().  However, in the CONFIG_MEMORY_HOTPLUG=y case,
pfn_to_online_page() does not call memblock_is_map_memory() while
pfn_valid() does.

Historically, the commit 68709f4538 ("arm64: only consider memblocks
with NOMAP cleared for linear mapping") causes pages marked as nomap
being no long reassigned to the new zone in memmap_init_zone() by
calling __init_single_page().

Since the commit 2d070eab2e ("mm: consider zone which is not fully
populated to have holes") introduced pfn_to_online_page() and was
designed to return a valid pfn only, but it is clearly broken on arm64.

Therefore, let pfn_to_online_page() call pfn_valid_within(), so it can
handle nomap thanks to the commit f52bb98f5a ("arm64: mm: always
enable CONFIG_HOLES_IN_ZONE"), while it will be optimized away on
architectures where have no HOLES_IN_ZONE.

[1]
  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000006
  Mem abort info:
    ESR = 0x96000005
    Exception class = DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
  Data abort info:
    ISV = 0, ISS = 0x00000005
    CM = 0, WnR = 0
  Internal error: Oops: 96000005 [#1] SMP
  CPU: 60 PID: 1408 Comm: kmemleak Not tainted 5.0.0-rc2+ #8
  pstate: 60400009 (nZCv daif +PAN -UAO)
  pc : page_mapping+0x24/0x144
  lr : __dump_page+0x34/0x3dc
  sp : ffff00003a5cfd10
  x29: ffff00003a5cfd10 x28: 000000000000802f
  x27: 0000000000000000 x26: 0000000000277d00
  x25: ffff000010791f56 x24: ffff7fe000000000
  x23: ffff000010772f8b x22: ffff00001125f670
  x21: ffff000011311000 x20: ffff000010772f8b
  x19: fffffffffffffffe x18: 0000000000000000
  x17: 0000000000000000 x16: 0000000000000000
  x15: 0000000000000000 x14: ffff802698b19600
  x13: ffff802698b1a200 x12: ffff802698b16f00
  x11: ffff802698b1a400 x10: 0000000000001400
  x9 : 0000000000000001 x8 : ffff00001121a000
  x7 : 0000000000000000 x6 : ffff0000102c53b8
  x5 : 0000000000000000 x4 : 0000000000000003
  x3 : 0000000000000100 x2 : 0000000000000000
  x1 : ffff000010772f8b x0 : ffffffffffffffff
  Process kmemleak (pid: 1408, stack limit = 0x(____ptrval____))
  Call trace:
   page_mapping+0x24/0x144
   __dump_page+0x34/0x3dc
   dump_page+0x28/0x4c
   kmemleak_scan+0x4ac/0x680
   kmemleak_scan_thread+0xb4/0xdc
   kthread+0x12c/0x13c
   ret_from_fork+0x10/0x18
  Code: d503201f f9400660 36000040 d1000413 (f9400661)
  ---[ end trace 4d4bd7f573490c8e ]---
  Kernel panic - not syncing: Fatal exception
  SMP: stopping secondary CPUs
  Kernel Offset: disabled
  CPU features: 0x002,20000c38
  Memory Limit: none
  ---[ end Kernel panic - not syncing: Fatal exception ]---

Link: http://lkml.kernel.org/r/20190122132916.28360-1-cai@lca.pw
Fixes: 9f1eb38e0e ("mm, kmemleak: little optimization while scanning")
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01 15:46:23 -08:00
Tetsuo Handa
9bcdeb51bd oom, oom_reaper: do not enqueue same task twice
Arkadiusz reported that enabling memcg's group oom killing causes
strange memcg statistics where there is no task in a memcg despite the
number of tasks in that memcg is not 0.  It turned out that there is a
bug in wake_oom_reaper() which allows enqueuing same task twice which
makes impossible to decrease the number of tasks in that memcg due to a
refcount leak.

This bug existed since the OOM reaper became invokable from
task_will_free_mem(current) path in out_of_memory() in Linux 4.7,

  T1@P1     |T2@P1     |T3@P1     |OOM reaper
  ----------+----------+----------+------------
                                   # Processing an OOM victim in a different memcg domain.
                        try_charge()
                          mem_cgroup_out_of_memory()
                            mutex_lock(&oom_lock)
             try_charge()
               mem_cgroup_out_of_memory()
                 mutex_lock(&oom_lock)
  try_charge()
    mem_cgroup_out_of_memory()
      mutex_lock(&oom_lock)
                            out_of_memory()
                              oom_kill_process(P1)
                                do_send_sig_info(SIGKILL, @P1)
                                mark_oom_victim(T1@P1)
                                wake_oom_reaper(T1@P1) # T1@P1 is enqueued.
                            mutex_unlock(&oom_lock)
                 out_of_memory()
                   mark_oom_victim(T2@P1)
                   wake_oom_reaper(T2@P1) # T2@P1 is enqueued.
                 mutex_unlock(&oom_lock)
      out_of_memory()
        mark_oom_victim(T1@P1)
        wake_oom_reaper(T1@P1) # T1@P1 is enqueued again due to oom_reaper_list == T2@P1 && T1@P1->oom_reaper_list == NULL.
      mutex_unlock(&oom_lock)
                                   # Completed processing an OOM victim in a different memcg domain.
                                   spin_lock(&oom_reaper_lock)
                                   # T1P1 is dequeued.
                                   spin_unlock(&oom_reaper_lock)

but memcg's group oom killing made it easier to trigger this bug by
calling wake_oom_reaper() on the same task from one out_of_memory()
request.

Fix this bug using an approach used by commit 855b018325 ("oom,
oom_reaper: disable oom_reaper for oom_kill_allocating_task").  As a
side effect of this patch, this patch also avoids enqueuing multiple
threads sharing memory via task_will_free_mem(current) path.

Link: http://lkml.kernel.org/r/e865a044-2c10-9858-f4ef-254bc71d6cc2@i-love.sakura.ne.jp
Link: http://lkml.kernel.org/r/5ee34fc6-1485-34f8-8790-903ddabaa809@i-love.sakura.ne.jp
Fixes: af8e15cc85 ("oom, oom_reaper: do not enqueue task if it is on the oom_reaper_list head")
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl>
Tested-by: Arkadiusz Miskiewicz <arekm@maven.pl>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Roman Gushchin <guro@fb.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Aleksa Sarai <asarai@suse.de>
Cc: Jay Kamat <jgkamat@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-02-01 15:46:23 -08:00
Linus Torvalds
5eeb63359b Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma
Pull rdma fixes from Jason Gunthorpe:
 "Still not much going on, the usual set of oops and driver fixes this
  time:

   - Fix two uapi breakage regressions in mlx5 drivers

   - Various oops fixes in hfi1, mlx4, umem, uverbs, and ipoib

   - A protocol bug fix for hfi1 preventing it from implementing the
     verbs API properly, and a compatability fix for EXEC STACK user
     programs

   - Fix missed refcounting in the 'advise_mr' patches merged this
     cycle.

   - Fix wrong use of the uABI in the hns SRQ patches merged this cycle"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/uverbs: Fix OOPs in uverbs_user_mmap_disassociate
  IB/ipoib: Fix for use-after-free in ipoib_cm_tx_start
  IB/uverbs: Fix ioctl query port to consider device disassociation
  RDMA/mlx5: Fix flow creation on representors
  IB/uverbs: Fix OOPs upon device disassociation
  RDMA/umem: Add missing initialization of owning_mm
  RDMA/hns: Update the kernel header file of hns
  IB/mlx5: Fix how advise_mr() launches async work
  RDMA/device: Expose ib_device_try_get(()
  IB/hfi1: Add limit test for RC/UC send via loopback
  IB/hfi1: Remove overly conservative VM_EXEC flag check
  IB/{hfi1, qib}: Fix WC.byte_len calculation for UD_SEND_WITH_IMM
  IB/mlx4: Fix using wrong function to destroy sqp AHs under SRIOV
  RDMA/mlx5: Fix check for supported user flags when creating a QP
2019-02-01 10:39:24 -08:00
Linus Torvalds
3325254ca1 Merge tag 'pm-5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
 "These fix a PM-runtime framework regression introduced by the recent
  switch-over of device autosuspend to hrtimers and a mistake in the
  "poll idle state" code introduced by a recent change in it.

  Specifics:

   - Since ktime_get() turns out to be problematic for device
     autosuspend in the PM-runtime framework, make it use
     ktime_get_mono_fast_ns() instead (Vincent Guittot).

   - Fix an initial value of a local variable in the "poll idle state"
     code that makes it behave not exactly as expected when all idle
     states except for the "polling" one are disabled (Doug Smythies)"

* tag 'pm-5.0-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  cpuidle: poll_state: Fix default time limit
  PM-runtime: Fix deadlock with ktime_get()
2019-02-01 10:23:39 -08:00
Linus Torvalds
5b4746a031 Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
 "Mostly driver fixes, but there's a core framework fix in here too:

   - Revert the commits that introduce clk management for the SP clk on
     MMP2 SoCs (used for OLPC). Turns out it wasn't a good idea and
     there isn't any need to manage this clk, it just causes more
     headaches.

   - A performance regression that went unnoticed for many years where
     we would traverse the entire clk tree looking for a clk by name
     when we already have the pointer to said clk that we're looking for

   - A parent linkage fix for the qcom SDM845 clk driver

   - An i.MX clk driver rate miscalculation fix where order of
     operations were messed up

   - One error handling fix from the static checkers"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: qcom: gcc: Use active only source for CPUSS clocks
  clk: ti: Fix error handling in ti_clk_parse_divider_data()
  clk: imx: Fix fractional clock set rate computation
  clk: Remove global clk traversal on fetch parent index
  Revert "dt-bindings: marvell,mmp2: Add clock id for the SP clock"
  Revert "clk: mmp2: add SP clock"
  Revert "Input: olpc_apsp - enable the SP clock"
2019-01-31 23:22:57 -08:00
Jens Axboe
9a6d548800 ide: ensure atapi sense request aren't preempted
There's an issue with how sense requests are handled in IDE. If ide-cd
encounters an error, it queues a sense request. With how IDE request
handling is done, this is the next request we need to handle. But it's
impossible to guarantee this, as another request could come in between
the sense being queued, and ->queue_rq() being run and handling it. If
that request ALSO fails, then we attempt to doubly queue the single
sense request we have.

Since we only support one active request at the time, defer request
processing when a sense request is queued.

Fixes: 600335205b "ide: convert to blk-mq"
Reported-by: He Zhe <zhe.he@windriver.com>
Tested-by: He Zhe <zhe.he@windriver.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-31 08:25:09 -07:00
Vincent Guittot
15efb47dc5 PM-runtime: Fix deadlock with ktime_get()
A deadlock has been seen when swicthing clocksources which use
PM-runtime.  The call path is:

change_clocksource
    ...
    write_seqcount_begin
    ...
    timekeeping_update
        ...
        sh_cmt_clocksource_enable
            ...
            rpm_resume
                pm_runtime_mark_last_busy
                    ktime_get
                        do
                            read_seqcount_begin
                        while read_seqcount_retry
    ....
    write_seqcount_end

Although we should be safe because we haven't yet changed the
clocksource at that time, we can't do that because of seqcount
protection.

Use ktime_get_mono_fast_ns() instead which is lock safe for such
cases.

With ktime_get_mono_fast_ns, the timestamp is not guaranteed to be
monotonic across an update and as a result can goes backward.
According to update_fast_timekeeper() description: "In the worst
case, this can result is a slightly wrong timestamp (a few
nanoseconds)". For PM-runtime autosuspend, this means only that
the suspend decision may be slightly suboptimal.

Fixes: 8234f6734c ("PM-runtime: Switch autosuspend over to using hrtimers")
Reported-by: Biju Das <biju.das@bp.renesas.com>
Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2019-01-30 22:49:06 +01:00
Waiman Long
af0c9af1b3 fs/dcache: Track & report number of negative dentries
The current dentry number tracking code doesn't distinguish between
positive & negative dentries.  It just reports the total number of
dentries in the LRU lists.

As excessive number of negative dentries can have an impact on system
performance, it will be wise to track the number of positive and
negative dentries separately.

This patch adds tracking for the total number of negative dentries in
the system LRU lists and reports it in the 5th field in the
/proc/sys/fs/dentry-state file.  The number, however, does not include
negative dentries that are in flight but not in the LRU yet as well as
those in the shrinker lists which are on the way out anyway.

The number of positive dentries in the LRU lists can be roughly found by
subtracting the number of negative dentries from the unused count.

Matthew Wilcox had confirmed that since the introduction of the
dentry_stat structure in 2.1.60, the dummy array was there, probably for
future extension.  They were not replacements of pre-existing fields.
So no sane applications that read the value of /proc/sys/fs/dentry-state
will do dummy thing if the last 2 fields of the sysctl parameter are not
zero.  IOW, it will be safe to use one of the dummy array entry for
negative dentry count.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-30 11:02:11 -08:00
Waiman Long
7d10f70fc1 fs: Don't need to put list_lru into its own cacheline
The list_lru structure is essentially just a pointer to a table of
per-node LRU lists.  Even if CONFIG_MEMCG_KMEM is defined, the list
field is just used for LRU list registration and shrinker_id is set at
initialization.  Those fields won't need to be touched that often.

So there is no point to make the list_lru structures to sit in their own
cachelines.

Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-30 11:02:11 -08:00
Josh Poimboeuf
b284909aba cpu/hotplug: Fix "SMT disabled by BIOS" detection for KVM
With the following commit:

  73d5e2b472 ("cpu/hotplug: detect SMT disabled by BIOS")

... the hotplug code attempted to detect when SMT was disabled by BIOS,
in which case it reported SMT as permanently disabled.  However, that
code broke a virt hotplug scenario, where the guest is booted with only
primary CPU threads, and a sibling is brought online later.

The problem is that there doesn't seem to be a way to reliably
distinguish between the HW "SMT disabled by BIOS" case and the virt
"sibling not yet brought online" case.  So the above-mentioned commit
was a bit misguided, as it permanently disabled SMT for both cases,
preventing future virt sibling hotplugs.

Going back and reviewing the original problems which were attempted to
be solved by that commit, when SMT was disabled in BIOS:

  1) /sys/devices/system/cpu/smt/control showed "on" instead of
     "notsupported"; and

  2) vmx_vm_init() was incorrectly showing the L1TF_MSG_SMT warning.

I'd propose that we instead consider #1 above to not actually be a
problem.  Because, at least in the virt case, it's possible that SMT
wasn't disabled by BIOS and a sibling thread could be brought online
later.  So it makes sense to just always default the smt control to "on"
to allow for that possibility (assuming cpuid indicates that the CPU
supports SMT).

The real problem is #2, which has a simple fix: change vmx_vm_init() to
query the actual current SMT state -- i.e., whether any siblings are
currently online -- instead of looking at the SMT "control" sysfs value.

So fix it by:

  a) reverting the original "fix" and its followup fix:

     73d5e2b472 ("cpu/hotplug: detect SMT disabled by BIOS")
     bc2d8d262c ("cpu/hotplug: Fix SMT supported evaluation")

     and

  b) changing vmx_vm_init() to query the actual current SMT state --
     instead of the sysfs control value -- to determine whether the L1TF
     warning is needed.  This also requires the 'sched_smt_present'
     variable to exported, instead of 'cpu_smt_control'.

Fixes: 73d5e2b472 ("cpu/hotplug: detect SMT disabled by BIOS")
Reported-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Joe Mario <jmario@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kvm@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/e3a85d585da28cc333ecbc1e78ee9216e6da9396.1548794349.git.jpoimboe@redhat.com
2019-01-30 19:27:00 +01:00
Linus Torvalds
6296789878 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Need to save away the IV across tls async operations, from Dave
    Watson.

 2) Upon successful packet processing, we should liberate the SKB with
    dev_consume_skb{_irq}(). From Yang Wei.

 3) Only apply RX hang workaround on effected macb chips, from Harini
    Katakam.

 4) Dummy netdev need a proper namespace assigned to them, from Josh
    Elsasser.

 5) Some paths of nft_compat run lockless now, and thus we need to use a
    proper refcnt_t. From Florian Westphal.

 6) Avoid deadlock in mlx5 by doing IRQ locking, from Moni Shoua.

 7) netrom does not refcount sockets properly wrt. timers, fix that by
    using the sock timer API. From Cong Wang.

 8) Fix locking of inexact inserts of xfrm policies, from Florian
    Westphal.

 9) Missing xfrm hash generation bump, also from Florian.

10) Missing of_node_put() in hns driver, from Yonglong Liu.

11) Fix DN_IFREQ_SIZE, from Johannes Berg.

12) ip6mr notifier is invoked during traversal of wrong table, from Nir
    Dotan.

13) TX promisc settings not performed correctly in qed, from Manish
    Chopra.

14) Fix OOB access in vhost, from Jason Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (52 commits)
  MAINTAINERS: Add entry for XDP (eXpress Data Path)
  net: set default network namespace in init_dummy_netdev()
  net: b44: replace dev_kfree_skb_xxx by dev_consume_skb_xxx for drop profiles
  net: caif: call dev_consume_skb_any when skb xmit done
  net: 8139cp: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
  net: macb: Apply RXUBR workaround only to versions with errata
  net: ti: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
  net: apple: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
  net: amd8111e: replace dev_kfree_skb_irq by dev_consume_skb_irq
  net: alteon: replace dev_kfree_skb_irq by dev_consume_skb_irq
  net: tls: Fix deadlock in free_resources tx
  net: tls: Save iv in tls_rec for async crypto requests
  vhost: fix OOB in get_rx_bufs()
  qed: Fix stack out of bounds bug
  qed: Fix system crash in ll2 xmit
  qed: Fix VF probe failure while FLR
  qed: Fix LACP pdu drops for VFs
  qed: Fix bug in tx promiscuous mode settings
  net: i825xx: replace dev_kfree_skb_irq by dev_consume_skb_irq for drop profiles
  netfilter: ipt_CLUSTERIP: fix warning unused variable cn
  ...
2019-01-29 17:11:47 -08:00
Dave Watson
32eb67b93c net: tls: Save iv in tls_rec for async crypto requests
aead_request_set_crypt takes an iv pointer, and we change the iv
soon after setting it.  Some async crypto algorithms don't save the iv,
so we need to save it in the tls_rec for async requests.

Found by hardcoding x64 aesni to use async crypto manager (to test the async
codepath), however I don't think this combination can happen in the wild.
Presumably other hardware offloads will need this fix, but there have been
no user reports.

Fixes: a42055e8d2 ("Add support for async encryption of records...")
Signed-off-by: Dave Watson <davejwatson@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-28 23:05:55 -08:00
Linus Torvalds
9881051828 Merge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking fixes from Thomas Gleixner:
 "A small series of fixes which all address possible missed wakeups:

   - Document and fix the wakeup ordering of wake_q

   - Add the missing barrier in rcuwait_wake_up(), which was documented
     in the comment but missing in the code

   - Fix the possible missed wakeups in the rwsem and futex code"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/rwsem: Fix (possible) missed wakeup
  futex: Fix (possible) missed wakeup
  sched/wake_q: Fix wakeup ordering for wake_q
  sched/wake_q: Document wake_q_add()
  sched/wait: Fix rcuwait_wake_up() ordering
2019-01-27 11:52:50 -08:00
Linus Torvalds
0d484375d7 Merge branch 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull irq fixes from Thomas Gleixner:
 "A small set of fixes for the interrupt subsystem:

   - Fix a double increment in the irq descriptor allocator which
     resulted in a sanity check only being done for every second
     affinity mask

   - Add a missing device tree translation in the stm32-exti driver.
     Without that the interrupt association is completely wrong.

   - Initialize the mutex in the GIC-V3 MBI driver

   - Fix the alignment for aliasing devices in the GIC-V3-ITS driver so
     multi MSI allocations work correctly

   - Ensure that the initial affinity of a interrupt is not empty at
     startup time.

   - Drop bogus include in the madera irq chip driver

   - Fix KernelDoc regression"

* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/gic-v3-its: Align PCI Multi-MSI allocation on their size
  genirq/irqdesc: Fix double increment in alloc_descs()
  genirq: Fix the kerneldoc comment for struct irq_affinity_desc
  irqchip/madera: Drop GPIO includes
  irqchip/gic-v3-mbi: Fix uninitialized mbi_lock
  irqchip/stm32-exti: Add domain translate function
  genirq: Make sure the initial affinity is not empty
2019-01-27 11:25:38 -08:00
Linus Torvalds
c180f1b04b Merge tag 'dma-mapping-5.0-2' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:
 "Fix a xen-swiotlb regression on arm64"

* tag 'dma-mapping-5.0-2' of git://git.infradead.org/users/hch/dma-mapping:
  arm64/xen: fix xen-swiotlb cache flushing
2019-01-27 09:18:05 -08:00
Linus Torvalds
6a2651b55b Merge tag 'libnvdimm-fixes-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
 "A fix for namespace label support for non-Intel NVDIMMs that implement
  the ACPI standard label method.

  This has apparently never worked and could wait for v5.1. However it
  has enough visibility with hardware vendors [1] and distro bug
  trackers [2], and low enough risk that I decided it should go in for
  -rc4. The other fixups target the new, for v5.0, nvdimm security
  functionality. The larger init path fixup closes a memory leak and a
  potential userspace lockup due to missed notifications.

    [1] https://github.com/pmem/ndctl/issues/78
    [2] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1811785

  These have all soaked in -next for a week with no reported issues.

  Summary:

   - Fix support for NVDIMMs that implement the ACPI standard label
     methods.

   - Fix error handling for security overwrite (memory leak / userspace
     hang condition), and another one-line security cleanup"

* tag 'libnvdimm-fixes-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  acpi/nfit: Fix command-supported detection
  acpi/nfit: Block function zero DSMs
  libnvdimm/security: Require nvdimm_security_setup_events() to succeed
  nfit_test: fix security state pull for nvdimm security nfit_test
2019-01-27 09:11:51 -08:00
Linus Torvalds
78e372e650 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Pull input fixes from Dmitry Torokhov:
 "A fixup for the input_event fix for y2038 Sparc64, and couple other
  minor fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: input_event - fix the CONFIG_SPARC64 mixup
  Input: olpc_apsp - assign priv->dev earlier
  Input: uinput - fix undefined behavior in uinput_validate_absinfo()
  Input: raspberrypi-ts - fix link error
  Input: xpad - add support for SteelSeries Stratus Duo
  Input: input_event - provide override for sparc64
2019-01-27 09:07:03 -08:00
Linus Torvalds
037222ad3f Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Count ttl-dropped frames properly in mac80211, from Bob Copeland.

 2) Integer overflow in ktime handling of bcm can code, from Oliver
    Hartkopp.

 3) Fix RX desc handling wrt. hw checksumming in ravb, from Simon
    Horman.

 4) Various hash key fixes in hv_netvsc, from Haiyang Zhang.

 5) Use after free in ax25, from Eric Dumazet.

 6) Several fixes to the SSN support in SCTP, from Xin Long.

 7) Do not process frames after a NAPI reschedule in ibmveth, from
    Thomas Falcon.

 8) Fix NLA_POLICY_NESTED arguments, from Johannes Berg.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (42 commits)
  qed: Revert error handling changes.
  cfg80211: extend range deviation for DMG
  cfg80211: reg: remove warn_on for a normal case
  mac80211: Add attribute aligned(2) to struct 'action'
  mac80211: don't initiate TDLS connection if station is not associated to AP
  nl80211: fix NLA_POLICY_NESTED() arguments
  ibmveth: Do not process frames after calling napi_reschedule
  net: dev_is_mac_header_xmit() true for ARPHRD_RAWIP
  net: usb: asix: ax88772_bind return error when hw_reset fail
  MAINTAINERS: Update cavium networking drivers
  net/mlx4_core: Fix error handling when initializing CQ bufs in the driver
  net/mlx4_core: Add masking for a few queries on HCA caps
  sctp: set flow sport from saddr only when it's 0
  sctp: set chunk transport correctly when it's a new asoc
  sctp: improve the events for sctp stream adding
  sctp: improve the events for sctp stream reset
  ip_tunnel: Make none-tunnel-dst tunnel port work with lwtunnel
  ax25: fix possible use-after-free
  sfc: suppress duplicate nvmem partition types in efx_ef10_mtd_probe
  hv_netvsc: fix typos in code comments
  ...
2019-01-27 08:59:12 -08:00
Linus Torvalds
6b8f915916 Merge tag 'for-linus-20190125' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:
 "A collection of fixes for this release. This contains:

   - Silence sparse rightfully complaining about non-static wbt
     functions (Bart)

   - Fixes for the zoned comments/ioctl documentation (Damien)

   - direct-io fix that's been lingering for a while (Ernesto)

   - cgroup writeback fix (Tejun)

   - Set of NVMe patches for nvme-rdma/tcp (Sagi, Hannes, Raju)

   - Block recursion tracking fix (Ming)

   - Fix debugfs command flag naming for a few flags (Jianchao)"

* tag 'for-linus-20190125' of git://git.kernel.dk/linux-block:
  block: Fix comment typo
  uapi: fix ioctl documentation
  blk-wbt: Declare local functions static
  blk-mq: fix the cmd_flag_name array
  nvme-multipath: drop optimization for static ANA group IDs
  nvmet-rdma: fix null dereference under heavy load
  nvme-rdma: rework queue maps handling
  nvme-tcp: fix timeout handler
  nvme-rdma: fix timeout handler
  writeback: synchronize sync(2) against cgroup writeback membership switches
  block: cover another queue enter recursion via BIO_QUEUE_ENTERED
  direct-io: allow direct writes to empty inodes
2019-01-26 12:42:41 -08:00
David S. Miller
abfd04f738 qed: Revert error handling changes.
This is new code and not bug fixes.

This reverts all changes added by merge commit
8fb18be93e

Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-25 15:32:28 -08:00
Linus Torvalds
d488bd21a4 Merge tag 'char-misc-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
 "Here are some small char and misc driver fixes to resolve some
  reported issues, as well as a number of binderfs fixups that were
  found after auditing the filesystem code by Al Viro. As binderfs
  hasn't been in a previous release yet, it's good to get these in now
  before the first users show up.

  All of these have been in linux-next for a bit with no reported
  issues"

* tag 'char-misc-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (26 commits)
  i3c: master: Fix an error checking typo in 'cdns_i3c_master_probe()'
  binderfs: switch from d_add() to d_instantiate()
  binderfs: drop lock in binderfs_binder_ctl_create
  binderfs: kill_litter_super() before cleanup
  binderfs: rework binderfs_binder_device_create()
  binderfs: rework binderfs_fill_super()
  binderfs: prevent renaming the control dentry
  binderfs: remove outdated comment
  binderfs: use __u32 for device numbers
  binderfs: use correct include guards in header
  misc: pvpanic: fix warning implicit declaration
  char/mwave: fix potential Spectre v1 vulnerability
  misc: ibmvsm: Fix potential NULL pointer dereference
  binderfs: fix error return code in binderfs_fill_super()
  mei: me: add denverton innovation engine device IDs
  mei: me: mark LBG devices as having dma support
  mei: dma: silent the reject message
  binderfs: handle !CONFIG_IPC_NS builds
  binderfs: reserve devices for initial mount
  binderfs: rename header to binderfs.h
  ...
2019-01-25 13:03:34 -10:00
Lijun Ou
9d9d4ff788 RDMA/hns: Update the kernel header file of hns
The hns_roce_ib_create_srq_resp is used to interact with the user for
data, this was open coded to use a u32 directly, instead use a properly
sized structure.

Fixes: c7bcb13442 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-25 09:55:48 -07:00
Maciej Żenczykowski
3b707c3008 net: dev_is_mac_header_xmit() true for ARPHRD_RAWIP
__bpf_redirect() and act_mirred checks this boolean
to determine whether to prefix an ethernet header.

Signed-off-by: Maciej Żenczykowski <maze@google.com>
Acked-by: Daniel Borkmann <daniel@iogearbox.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-24 22:45:34 -08:00
Lubomir Rintel
401fbb34f5 Revert "dt-bindings: marvell,mmp2: Add clock id for the SP clock"
It seems that the kernel has no business managing this clock: once the SP
clock is disabled, it's not sufficient to just enable it in order to bring
the SP core back up.

Pretty sure nothing ever used this and it's safe to remove.

This reverts commit e8a2c77914.

Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
2019-01-24 10:55:33 -08:00
Damien Le Moal
8367de2c99 block: Fix comment typo
Fix typo in REQ_OP_ZONE_RESET description.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-24 11:11:45 -07:00
Damien Le Moal
745815f955 uapi: fix ioctl documentation
The description of the BLKGETNRZONES zoned block device ioctl was not
added as a comment together with this ioctl definition in commit
65e4e3eee8 ("block: Introduce BLKGETNRZONES ioctl"). Add its
description here.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-24 11:11:42 -07:00
Linus Torvalds
aa7b98459f Merge tag 'sound-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "A significant amount of fixes at this time, mostly for covering the
  recent ASoC issues.

   - Fixes for the missing ASoC driver initialization with non-deferred
     probes; these triggered other problems in chain, which resulted in
     yet more fix commits

   - DaVinci runtime PM fix; the diff looks large but it's just a code
     shuffling

   - Various fixes for ASoC Intel drivers: a regression in HD-A HDMI,
     Kconfig dependency, machine driver adjustments, PLL fix.

   - Other ASoC driver-specific stuff including the trivial fixes caught
     by static analysis

   - Usual HD-audio quirks"

* tag 'sound-5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (30 commits)
  ALSA: hda - Add mute LED support for HP ProBook 470 G5
  ASoC: amd: Fix potential NULL pointer dereference
  ASoC: imx-audmux: change snprintf to scnprintf for possible overflow
  ASoC: rt5514-spi: Fix potential NULL pointer dereference
  ASoC: dapm: change snprintf to scnprintf for possible overflow
  ASoC: rt5682: Fix PLL source register definitions
  ASoC: core: Don't defer probe on optional, NULL components
  ASoC: core: Make snd_soc_find_component() more robust
  ASoC: soc-core: fix init platform memory handling
  ASoC: intel: skl: Fix display power regression
  ALSA: hda/realtek - Fix typo for ALC225 model
  ASoC: soc-core: Hold client_mutex around soc_init_dai_link()
  ASoC: Intel: Boards: move the codec PLL configuration to _init
  ASoC: soc-core: defer card probe until all component is added to list
  ASoC: atom: fix a missing check of snd_pcm_lib_malloc_pages
  ASoC: tlv320aic32x4: Kernel OOPS while entering DAPM standby mode
  ASoC: ti: davinci-mcasp: Move context save/restore to runtime_pm callbacks
  ASoC: Variable "val" in function rt274_i2c_probe() could be uninitialized
  ASoC: rt5682: Fix recording no sound issue
  ASoC: Intel: atom: Make PCI dependency explicit
  ...
2019-01-25 05:55:26 +13:00
Deepa Dinamani
141e5dcaa7 Input: input_event - fix the CONFIG_SPARC64 mixup
Arnd Bergmann pointed out that CONFIG_* cannot be used in a uapi header.
Override with an equivalent conditional.

Fixes: 2e746942eb ("Input: input_event - provide override for sparc64")
Fixes: 152194fe9c ("Input: extend usable life of event timestamps to 2106 on 32 bit systems")
Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
2019-01-24 00:38:39 -08:00
Christoph Hellwig
60d8cd572f arm64/xen: fix xen-swiotlb cache flushing
Xen-swiotlb hooks into the arm/arm64 arch code through a copy of the DMA
DMA mapping operations stored in the struct device arch data.

Switching arm64 to use the direct calls for the merged DMA direct /
swiotlb code broke this scheme.  Replace the indirect calls with
direct-calls in xen-swiotlb as well to fix this problem.

Fixes: 356da6d0cd ("dma-mapping: bypass indirect calls for dma-direct")
Reported-by: Julien Grall <julien.grall@arm.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2019-01-23 22:14:56 +01:00
Eric Dumazet
63530aba78 ax25: fix possible use-after-free
syzbot found that ax25 routes where not properly protected
against concurrent use [1].

In this particular report the bug happened while
copying ax25->digipeat.

Fix this problem by making sure we call ax25_get_route()
while ax25_route_lock is held, so that no modification
could happen while using the route.

The current two ax25_get_route() callers do not sleep,
so this change should be fine.

Once we do that, ax25_get_route() no longer needs to
grab a reference on the found route.

[1]
ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de
BUG: KASAN: use-after-free in memcpy include/linux/string.h:352 [inline]
BUG: KASAN: use-after-free in kmemdup+0x42/0x60 mm/util.c:113
Read of size 66 at addr ffff888066641a80 by task syz-executor2/531

ax25_connect(): syz-executor0 uses autobind, please contact jreuter@yaina.de
CPU: 1 PID: 531 Comm: syz-executor2 Not tainted 5.0.0-rc2+ #10
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x1db/0x2d0 lib/dump_stack.c:113
 print_address_description.cold+0x7c/0x20d mm/kasan/report.c:187
 kasan_report.cold+0x1b/0x40 mm/kasan/report.c:317
 check_memory_region_inline mm/kasan/generic.c:185 [inline]
 check_memory_region+0x123/0x190 mm/kasan/generic.c:191
 memcpy+0x24/0x50 mm/kasan/common.c:130
 memcpy include/linux/string.h:352 [inline]
 kmemdup+0x42/0x60 mm/util.c:113
 kmemdup include/linux/string.h:425 [inline]
 ax25_rt_autobind+0x25d/0x750 net/ax25/ax25_route.c:424
 ax25_connect.cold+0x30/0xa4 net/ax25/af_ax25.c:1224
 __sys_connect+0x357/0x490 net/socket.c:1664
 __do_sys_connect net/socket.c:1675 [inline]
 __se_sys_connect net/socket.c:1672 [inline]
 __x64_sys_connect+0x73/0xb0 net/socket.c:1672
 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x458099
Code: 6d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f870ee22c78 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458099
RDX: 0000000000000048 RSI: 0000000020000080 RDI: 0000000000000005
RBP: 000000000073bf00 R08: 0000000000000000 R09: 0000000000000000
ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f870ee236d4
R13: 00000000004be48e R14: 00000000004ce9a8 R15: 00000000ffffffff

Allocated by task 526:
 save_stack+0x45/0xd0 mm/kasan/common.c:73
 set_track mm/kasan/common.c:85 [inline]
 __kasan_kmalloc mm/kasan/common.c:496 [inline]
 __kasan_kmalloc.constprop.0+0xcf/0xe0 mm/kasan/common.c:469
 kasan_kmalloc+0x9/0x10 mm/kasan/common.c:504
ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de
 kmem_cache_alloc_trace+0x151/0x760 mm/slab.c:3609
 kmalloc include/linux/slab.h:545 [inline]
 ax25_rt_add net/ax25/ax25_route.c:95 [inline]
 ax25_rt_ioctl+0x3b9/0x1270 net/ax25/ax25_route.c:233
 ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763
 sock_do_ioctl+0xe2/0x400 net/socket.c:950
 sock_ioctl+0x32f/0x6c0 net/socket.c:1074
 vfs_ioctl fs/ioctl.c:46 [inline]
 file_ioctl fs/ioctl.c:509 [inline]
 do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696
 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
 __do_sys_ioctl fs/ioctl.c:720 [inline]
 __se_sys_ioctl fs/ioctl.c:718 [inline]
 __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

ax25_connect(): syz-executor5 uses autobind, please contact jreuter@yaina.de
Freed by task 550:
 save_stack+0x45/0xd0 mm/kasan/common.c:73
 set_track mm/kasan/common.c:85 [inline]
 __kasan_slab_free+0x102/0x150 mm/kasan/common.c:458
 kasan_slab_free+0xe/0x10 mm/kasan/common.c:466
 __cache_free mm/slab.c:3487 [inline]
 kfree+0xcf/0x230 mm/slab.c:3806
 ax25_rt_add net/ax25/ax25_route.c:92 [inline]
 ax25_rt_ioctl+0x304/0x1270 net/ax25/ax25_route.c:233
 ax25_ioctl+0x322/0x10b0 net/ax25/af_ax25.c:1763
 sock_do_ioctl+0xe2/0x400 net/socket.c:950
 sock_ioctl+0x32f/0x6c0 net/socket.c:1074
 vfs_ioctl fs/ioctl.c:46 [inline]
 file_ioctl fs/ioctl.c:509 [inline]
 do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696
 ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
 __do_sys_ioctl fs/ioctl.c:720 [inline]
 __se_sys_ioctl fs/ioctl.c:718 [inline]
 __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
 do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

The buggy address belongs to the object at ffff888066641a80
 which belongs to the cache kmalloc-96 of size 96
The buggy address is located 0 bytes inside of
 96-byte region [ffff888066641a80, ffff888066641ae0)
The buggy address belongs to the page:
page:ffffea0001999040 count:1 mapcount:0 mapping:ffff88812c3f04c0 index:0x0
flags: 0x1fffc0000000200(slab)
ax25_connect(): syz-executor4 uses autobind, please contact jreuter@yaina.de
raw: 01fffc0000000200 ffffea0001817948 ffffea0002341dc8 ffff88812c3f04c0
raw: 0000000000000000 ffff888066641000 0000000100000020 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffff888066641980: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
 ffff888066641a00: 00 00 00 00 00 00 00 00 02 fc fc fc fc fc fc fc
>ffff888066641a80: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
                   ^
 ffff888066641b00: fb fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
 ffff888066641b80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-23 11:18:00 -08:00
Tomer Tayar
278396de78 qede: Error recovery process
This patch adds the error recovery process in the qede driver.
The process includes a partial/customized driver unload and load, which
allows it to look like a short suspend period to the kernel while
preserving the net devices' state.

Signed-off-by: Tomer Tayar <tomer.tayar@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-22 17:30:39 -08:00
Tomer Tayar
c75860e48a qed: Add infrastructure for error detection and recovery
This patch adds the detection and handling of a parity error ("process kill
event"), including the update of the protocol drivers, and the prevention
of any HW access that will lead to device access towards the host while
recovery is in progress.
It also provides the means for the protocol drivers to trigger a recovery
process on their decision.

Signed-off-by: Tomer Tayar <tomer.tayar@cavium.com>
Signed-off-by: Ariel Elior <ariel.elior@cavium.com>
Signed-off-by: Michal Kalderon <michal.kalderon@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-22 17:30:38 -08:00
Tejun Heo
7fc5854f8c writeback: synchronize sync(2) against cgroup writeback membership switches
sync_inodes_sb() can race against cgwb (cgroup writeback) membership
switches and fail to writeback some inodes.  For example, if an inode
switches to another wb while sync_inodes_sb() is in progress, the new
wb might not be visible to bdi_split_work_to_wbs() at all or the inode
might jump from a wb which hasn't issued writebacks yet to one which
already has.

This patch adds backing_dev_info->wb_switch_rwsem to synchronize cgwb
switch path against sync_inodes_sb() so that sync_inodes_sb() is
guaranteed to see all the target wbs and inodes can't jump wbs to
escape syncing.

v2: Fixed misplaced rwsem init.  Spotted by Jiufei.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Jiufei Xue <xuejiufei@gmail.com>
Link: http://lkml.kernel.org/r/dc694ae2-f07f-61e1-7097-7c8411cee12d@gmail.com
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-01-22 14:39:38 -07:00
Linus Torvalds
787a3b4322 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid
Pull HID fixes from Jiri Kosina:

 - descriptor parsing regression fix for devices that have more than 16
   collections, from Peter Hutterer (and followup cleanup from Philipp
   Zabel)

 - quirk for Goodix touchpad

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid:
  HID: core: simplify active collection tracking
  HID: i2c-hid: Disable runtime PM on Goodix touchpad
  HID: core: replace the collection tree pointers with indices
2019-01-23 07:16:05 +13:00
Christian Brauner
7d0174065f binderfs: use __u32 for device numbers
We allow more then 255 binderfs binder devices to be created since there
are workloads that require more than that. If we use __u8 we'll overflow
after 255. So let's use a __u32.
Note that there's no released kernel with binderfs out there so this is
not a regression.

Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-22 12:13:17 +01:00
Christian Brauner
6fc23b6ed8 binderfs: use correct include guards in header
When we switched over from binder_ctl.h to binderfs.h we forgot to change
the include guards. It's minor but it's obviously correct.

Signed-off-by: Christian Brauner <christian@brauner.io>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-01-22 12:13:17 +01:00
Linus Torvalds
48b161983a Merge tag 'xarray-5.0-rc3' of git://git.infradead.org/users/willy/linux-dax
Pull XArray fixes from Matthew Wilcox:
 "Fix some oversights in the XArray porcelain API:

   - support for m68k's two-byte aligned pointers

   - reserving entries using xa_insert()

   - missing xa_insert_bh() and xa_insert_irq() functions

   - simplify using xa_for_each()

   - use lockdep correctly

   - a few other minor fixes and improvements"

* tag 'xarray-5.0-rc3' of git://git.infradead.org/users/willy/linux-dax:
  XArray: Fix an arithmetic error in xa_is_err
  XArray tests: Check mark 2 gets squashed
  XArray: Fix typo in comment
  XArray: Honour reserved entries in xa_insert
  XArray: Permit storing 2-byte-aligned pointers
  XArray: Change xa_for_each iterator
  XArray: Turn xa_init_flags into a static inline
  XArray tests: Add RCU locking
2019-01-22 17:08:30 +13:00
Jason Gunthorpe
d79af7242b RDMA/device: Expose ib_device_try_get(()
It turns out future patches need this capability quite widely now, not
just for netlink, so provide two global functions to manage the
registration lock refcount.

This also moves the point the lock becomes 1 to within
ib_register_device() so that the semantics of the public API are very sane
and clear. Calling ib_device_try_get() will fail on devices that are only
allocated but not yet registered.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Steve Wise <swise@opengridcomputing.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
2019-01-21 14:33:08 -07:00
Dan Williams
1cd7386549 libnvdimm/security: Require nvdimm_security_setup_events() to succeed
The following warning:

    ACPI0012:00: security event setup failed: -19

...is meant to capture exceptional failures of sysfs_get_dirent(),
however it will also fail in the common case when security support is
disabled. A few issues:

1/ A dev_warn() report for a common case is too chatty
2/ The setup of this notifier is generic, no need for it to be driven
   from the nfit driver, it can exist completely in the core.
3/ If it fails for any reason besides security support being disabled,
   that's fatal and should abort DIMM activation. Userspace may hang if
   it never gets overwrite notifications.
4/ The dirent needs to be released.

Move the call to the core 'dimm' driver, make it conditional on security
support being active, make it fatal for the exceptional case, add the
missing sysfs_put() at device disable time.

Fixes: 7d988097c5 ("...Add security DSM overwrite support")
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2019-01-21 09:57:43 -08:00
Peter Zijlstra
e6018c0f5c sched/wake_q: Document wake_q_add()
The only guarantee provided by wake_q_add() is that a wakeup will
happen after it, it does _NOT_ guarantee the wakeup will be delayed
until the matching wake_up_q().

If wake_q_add() fails the cmpxchg() a concurrent wakeup is pending and
that can happen at any time after the cmpxchg(). This means we should
not rely on the wakeup happening at wake_q_up(), but should be ready
for wake_q_add() to issue the wakeup.

The delay; if provided (most likely); should only result in more efficient
behaviour.

Reported-by: Yongji Xie <elohimes@gmail.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Waiman Long <longman@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2019-01-21 11:15:36 +01:00
Linus Torvalds
7d0ae236ed Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:

 1) Fix endless loop in nf_tables, from Phil Sutter.

 2) Fix cross namespace ip6_gre tunnel hash list corruption, from
    Olivier Matz.

 3) Don't be too strict in phy_start_aneg() otherwise we might not allow
    restarting auto negotiation. From Heiner Kallweit.

 4) Fix various KMSAN uninitialized value cases in tipc, from Ying Xue.

 5) Memory leak in act_tunnel_key, from Davide Caratti.

 6) Handle chip errata of mv88e6390 PHY, from Andrew Lunn.

 7) Remove linear SKB assumption in fou/fou6, from Eric Dumazet.

 8) Missing udplite rehash callbacks, from Alexey Kodanev.

 9) Log dirty pages properly in vhost, from Jason Wang.

10) Use consume_skb() in neigh_probe() as this is a normal free not a
    drop, from Yang Wei. Likewise in macvlan_process_broadcast().

11) Missing device_del() in mdiobus_register() error paths, from Thomas
    Petazzoni.

12) Fix checksum handling of short packets in mlx5, from Cong Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (96 commits)
  bpf: in __bpf_redirect_no_mac pull mac only if present
  virtio_net: bulk free tx skbs
  net: phy: phy driver features are mandatory
  isdn: avm: Fix string plus integer warning from Clang
  net/mlx5e: Fix cb_ident duplicate in indirect block register
  net/mlx5e: Fix wrong (zero) TX drop counter indication for representor
  net/mlx5e: Fix wrong error code return on FEC query failure
  net/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet frames
  tools: bpftool: Cleanup license mess
  bpf: fix inner map masking to prevent oob under speculation
  bpf: pull in pkt_sched.h header for tooling to fix bpftool build
  selftests: forwarding: Add a test case for externally learned FDB entries
  selftests: mlxsw: Test FDB offload indication
  mlxsw: spectrum_switchdev: Do not treat static FDB entries as sticky
  net: bridge: Mark FDB entries that were added by user as such
  mlxsw: spectrum_fid: Update dummy FID index
  mlxsw: pci: Return error on PCI reset timeout
  mlxsw: pci: Increase PCI SW reset timeout
  mlxsw: pci: Ring CQ's doorbell before RDQ's
  MAINTAINERS: update email addresses of liquidio driver maintainers
  ...
2019-01-21 12:52:31 +13:00
Linus Torvalds
bb617b9b45 Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Pull virtio/vhost fixes and cleanups from Michael Tsirkin:
 "Fixes and cleanups all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost/scsi: Use copy_to_iter() to send control queue response
  vhost: return EINVAL if iovecs size does not match the message size
  virtio-balloon: tweak config_changed implementation
  virtio: don't allocate vqs when names[i] = NULL
  virtio_pci: use queue idx instead of array idx to set up the vq
  virtio: document virtio_config_ops restrictions
  virtio: fix virtio_config_ops description
2019-01-21 07:37:16 +13:00
Linus Torvalds
315a6d850a Merge tags 'compiler-attributes-for-linus-v5.0-rc3' and 'clang-format-for-linus-v5.0-rc3' of git://github.com/ojeda/linux
Pull misc clang fixes from Miguel Ojeda:

  - A fix for OPTIMIZER_HIDE_VAR from Michael S Tsirkin

  - Update clang-format with the latest for_each macro list from Jason
    Gunthorpe

* tag 'compiler-attributes-for-linus-v5.0-rc3' of git://github.com/ojeda/linux:
  include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR

* tag 'clang-format-for-linus-v5.0-rc3' of git://github.com/ojeda/linux:
  clang-format: Update .clang-format with the latest for_each macro list
2019-01-21 07:23:42 +13:00
Linus Torvalds
5d5c303ea0 Merge tag 'mips_fixes_5.0_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux
Pull MIPS fixes from Paul Burton:

 - Fix IPI handling for Lantiq SoCs, which was broken by changes made
   back in v4.12.

 - Enable OF/DT serial support in ath79_defconfig to give us working
   serial by default.

 - Fix 64b builds for the Jazz platform.

 - Set up a struct device for the BCM47xx SoC to allow BCM47xx drivers
   to perform DMA again following the major DMA mapping changes made in
   v4.19.

 - Disable MSI on Cavium Octeon systems when the pcie_disable command
   line parameter introduced in v3.3 is used, in order to avoid
   inadvetently accessing PCIe controller registers despite the command
   line.

 - Fix a build failure for Cavium Octeon kernels with kexec enabled,
   introduced in v4.20.

 - Fix a regression in the behaviour of semctl/shmctl/msgctl IPC
   syscalls for kernels including n32 support but not o32 support caused
   by some cleanup in v3.19.

* tag 'mips_fixes_5.0_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: OCTEON: fix kexec support
  mips: fix n32 compat_ipc_parse_version
  Disable MSI also when pcie-octeon.pcie_disable on
  MIPS: BCM47XX: Setup struct device for the SoC
  MIPS: jazz: fix 64bit build
  MIPS: ath79: Enable OF serial ports in the default config
  MIPS: lantiq: Use CP0_LEGACY_COMPARE_IRQ
  MIPS: lantiq: Fix IPI interrupt handling
2019-01-20 10:33:18 +12:00
Linus Torvalds
26caabbcd7 Merge tag 'libnvdimm-fixes-5.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull libnvdimm fixes from Dan Williams:
 "A crash fix, a build warning fix, a miscellaneous small cleanups.

  In case anyone is looking for them, there was a regression caught by
  testing that caused two patches to be dropped from this update.  Those
  patches have been reworked and will soak for another week / re-target
  5.0-rc4.

   - Fix driver initialization crash due to the inability to report an
     'error' state for a DIMM's security capability.

   - Build warning fix for little-endian ARM64 builds

   - Fix a potential race between the EDAC driver's usage of the NFIT
     SMBIOS id for a DIMM and the driver shutdown path.

   - A small collection of one-line benign cleanups for duplicate
     variable assignments, a duplicate header include and a mis-typed
     function argument"

* tag 'libnvdimm-fixes-5.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  libnvdimm/security: Fix nvdimm_security_state() state request selection
  acpi/nfit: Remove duplicate set nd_set in acpi_nfit_init_interleave_set()
  acpi/nfit: Fix race accessing memdev in nfit_get_smbios_id()
  libnvdimm/dimm: Fix security capability detection for non-Intel NVDIMMs
  nfit: Mark some functions as __maybe_unused
  ACPI/nfit: delete the function to_acpi_nfit_desc
  ACPI/nfit: delete the redundant header file
2019-01-20 10:24:30 +12:00
Camelia Groza
3e64cf7a43 net: phy: phy driver features are mandatory
Since phy driver features became a link_mode bitmap, phy drivers that
don't have a list of features configured will cause the kernel to crash
when probed.

Prevent the phy driver from registering if the features field is missing.

Fixes: 719655a149 ("net: phy: Replace phy driver features u32 with link_mode bitmap")
Reported-by: Scott Wood <oss@buserror.net>
Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-01-19 10:03:08 -08:00