Commit Graph

915607 Commits

Author SHA1 Message Date
Jens Axboe
b0beb28097 Revert "block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT"
This reverts commit c58c1f8343.

io_uring does do the right thing for this case, and we're still returning
-EAGAIN to userspace for the cases we don't support. Revert this change
to avoid doing endless spins of resubmits.

Cc: stable@vger.kernel.org # v5.6
Reported-by: Bijan Mottahedeh <bijan.mottahedeh@oracle.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-28 13:20:39 -06:00
Jens Axboe
15fede122b Merge branch 'nvme-5.7' of git://git.infradead.org/nvme into block-5.7
Pull NVMe poll fix from Christoph.

* 'nvme-5.7' of git://git.infradead.org/nvme:
  nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll()
2020-05-28 08:48:12 -06:00
Dongli Zhang
9210c075ce nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll()
There may be a race between nvme_reap_pending_cqes() and nvme_poll(), e.g.,
when doing live reset while polling the nvme device.

      CPU X                        CPU Y
                               nvme_poll()
nvme_dev_disable()
-> nvme_stop_queues()
-> nvme_suspend_io_queues()
-> nvme_suspend_queue()
                               -> spin_lock(&nvmeq->cq_poll_lock);
-> nvme_reap_pending_cqes()
   -> nvme_process_cq()        -> nvme_process_cq()

In the above scenario, the nvme_process_cq() for the same queue may be
running on both CPU X and CPU Y concurrently.

It is much more easier to reproduce the issue when CONFIG_PREEMPT is
enabled in kernel. When CONFIG_PREEMPT is disabled, it would take longer
time for nvme_stop_queues()-->blk_mq_quiesce_queue() to wait for grace
period.

This patch protects nvme_process_cq() with nvmeq->cq_poll_lock in
nvme_reap_pending_cqes().

Fixes: fa46c6fb5d ("nvme/pci: move cqe check after device shutdown")
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-05-27 20:32:56 +02:00
Chaitanya Kulkarni
1592cd15ee null_blk: don't allow discard for zoned mode
Zoned block device specification do not define the behavior of
discard/trim command as this command is generally replaced by the reset
write pointer (zone reset) command. Emulate this in null_blk by making
zoned and discard options mutually exclusive.

Suggested-by: Damien Le Moal <damien.lemoal@wdc.com>
Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-21 08:47:28 -06:00
Chaitanya Kulkarni
e274832590 null_blk: return error for invalid zone size
In null_init_zone_dev() check if the zone size is larger than device
capacity, return error if needed.

This also fixes the following oops :-

null_blk: changed the number of conventional zones to 4294967295
BUG: kernel NULL pointer dereference, address: 0000000000000010
PGD 7d76c5067 P4D 7d76c5067 PUD 7d240c067 PMD 0
Oops: 0002 [#1] SMP NOPTI
CPU: 4 PID: 5508 Comm: nullbtests.sh Tainted: G OE 5.7.0-rc4lblk-fnext0
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e4
RIP: 0010:null_init_zoned_dev+0x17a/0x27f [null_blk]
RSP: 0018:ffffc90007007e00 EFLAGS: 00010246
RAX: 0000000000000020 RBX: ffff8887fb3f3c00 RCX: 0000000000000007
RDX: 0000000000000000 RSI: ffff8887ca09d688 RDI: ffff888810fea510
RBP: 0000000000000010 R08: ffff8887ca09d688 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887c26e8000
R13: ffffffffa05e9390 R14: 0000000000000000 R15: 0000000000000001
FS:  00007fcb5256f740(0000) GS:ffff888810e00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000010 CR3: 000000081e8fe000 CR4: 00000000003406e0
Call Trace:
 null_add_dev+0x534/0x71b [null_blk]
 nullb_device_power_store.cold.41+0x8/0x2e [null_blk]
 configfs_write_file+0xe6/0x150
 vfs_write+0xba/0x1e0
 ksys_write+0x5f/0xe0
 do_syscall_64+0x60/0x250
 entry_SYSCALL_64_after_hwframe+0x49/0xb3
RIP: 0033:0x7fcb51c71840

Signed-off-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-21 08:47:28 -06:00
Jens Axboe
3948955397 Merge branch 'nvme-5.7' of git://git.infradead.org/nvme into block-5.7
Pull NVMe fix from Christoph.

* 'nvme-5.7' of git://git.infradead.org/nvme:
  nvme-pci: dma read memory barrier for completions
2020-05-16 09:25:34 -06:00
Keith Busch
b69e2ef24b nvme-pci: dma read memory barrier for completions
Control dependencies do not guarantee load order across the condition,
allowing a CPU to predict and speculate memory reads.

Commit 324b494c28 inlined verifying a new completion with its
handling. At least one architecture was observed to access the contents
out of order, resulting in the driver using stale data for the
completion.

Add a dma read barrier before reading the completion queue entry and
after the condition its contents depend on to ensure the read order is
determinsitic.

Reported-by: John Garry <john.garry@huawei.com>
Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Tested-by: John Garry <john.garry@huawei.com>
Acked-by: Will Deacon <will@kernel.org>
Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-05-12 18:02:24 +02:00
Sagi Grimberg
59c7c3caaa nvme: fix possible hang when ns scanning fails during error recovery
When the controller is reconnecting, the host fails I/O and admin
commands as the host cannot reach the controller. ns scanning may
revalidate namespaces during that period and it is wrong to remove
namespaces due to these failures as we may hang (see 205da24343).

One command that may fail is nvme_identify_ns_descs. Since we return
success due to having ns identify descriptor list optional, we continue
to compare ns identifiers in nvme_revalidate_disk, obviously fail and
return -ENODEV to nvme_validate_ns, which will remove the namespace.

Exactly what we don't want to happen.

Fixes: 22802bf742 ("nvme: Namepace identification descriptor list is optional")
Tested-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:07:58 -06:00
Alexey Dobriyan
a8de663916 nvme-pci: fix "slimmer CQ head update"
Pre-incrementing ->cq_head can't be done in memory because OOB value
can be observed by another context.

This devalues space savings compared to original code :-\

	$ ./scripts/bloat-o-meter ../vmlinux-000 ../obj/vmlinux
	add/remove: 0/0 grow/shrink: 0/4 up/down: 0/-32 (-32)
	Function                                     old     new   delta
	nvme_poll_irqdisable                         464     456      -8
	nvme_poll                                    455     447      -8
	nvme_irq                                     388     380      -8
	nvme_dev_disable                             955     947      -8

But the code is minimal now: one read for head, one read for q_depth,
one increment, one comparison, single instruction phase bit update and
one write for new head.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reported-by: John Garry <john.garry@huawei.com>
Tested-by: John Garry <john.garry@huawei.com>
Fixes: e2a366a4b0 ("nvme-pci: slimmer CQ head update")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:07:58 -06:00
Christoph Hellwig
6bd87eec23 bdi: add a ->dev_name field to struct backing_dev_info
Cache a copy of the name for the life time of the backing_dev_info
structure so that we can reference it even after unregistering.

Fixes: 68f23b8906 ("memcg: fix a crash in wb_workfn when a device disappears")
Reported-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:07:57 -06:00
Yufen Yu
d51cfc53ad bdi: use bdi_dev_name() to get device name
Use the common interface bdi_dev_name() to get device name.

Signed-off-by: Yufen Yu <yuyufen@huawei.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>

Add missing <linux/backing-dev.h> include BFQ

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-09 16:07:39 -06:00
Christoph Hellwig
eb7ae5e06b bdi: move bdi_dev_name out of line
bdi_dev_name is not a fast path function, move it out of line.  This
prepares for using it from modular callers without having to export
an implementation detail like bdi_unknown_name.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-07 08:45:47 -06:00
Christoph Hellwig
156c757372 vboxsf: don't use the source name in the bdi name
Simplify the bdi name to mirror what we are doing elsewhere, and
drop them name in favor of just using a number.  This avoids a
potentially very long bdi name.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-07 08:45:47 -06:00
Tejun Heo
0b80f9866e iocost: protect iocg->abs_vdebt with iocg->waitq.lock
abs_vdebt is an atomic_64 which tracks how much over budget a given cgroup
is and controls the activation of use_delay mechanism. Once a cgroup goes
over budget from forced IOs, it has to pay it back with its future budget.
The progress guarantee on debt paying comes from the iocg being active -
active iocgs are processed by the periodic timer, which ensures that as time
passes the debts dissipate and the iocg returns to normal operation.

However, both iocg activation and vdebt handling are asynchronous and a
sequence like the following may happen.

1. The iocg is in the process of being deactivated by the periodic timer.

2. A bio enters ioc_rqos_throttle(), calls iocg_activate() which returns
   without anything because it still sees that the iocg is already active.

3. The iocg is deactivated.

4. The bio from #2 is over budget but needs to be forced. It increases
   abs_vdebt and goes over the threshold and enables use_delay.

5. IO control is enabled for the iocg's subtree and now IOs are attributed
   to the descendant cgroups and the iocg itself no longer issues IOs.

This leaves the iocg with stuck abs_vdebt - it has debt but inactive and no
further IOs which can activate it. This can end up unduly punishing all the
descendants cgroups.

The usual throttling path has the same issue - the iocg must be active while
throttled to ensure that future event will wake it up - and solves the
problem by synchronizing the throttling path with a spinlock. abs_vdebt
handling is another form of overage handling and shares a lot of
characteristics including the fact that it isn't in the hottest path.

This patch fixes the above and other possible races by strictly
synchronizing abs_vdebt and use_delay handling with iocg->waitq.lock.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Vlad Dmitriev <vvd@fb.com>
Cc: stable@vger.kernel.org # v5.4+
Fixes: e1518f63f2 ("blk-iocost: Don't let merges push vtime into the future")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-05 09:23:18 -06:00
Christoph Hellwig
10c70d95c0 block: remove the bd_openers checks in blk_drop_partitions
When replacing the bd_super check with a bd_openers I followed a logical
conclusion, which turns out to be utterly wrong.  When a block device has
bd_super sets it has a mount file system on it (although not every
mounted file system sets bd_super), but that also implies it doesn't even
have partitions to start with.

So instead of trying to come up with a logical check for all openers,
just remove the check entirely.

Fixes: d3ef553627 ("block: fix busy device checking in blk_drop_partitions")
Fixes: cb6b771b05 ("block: fix busy device checking in blk_drop_partitions again")
Reported-by: Michal Koutný <mkoutny@suse.com>
Reported-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-30 10:25:43 -06:00
Jens Axboe
47ed39e062 Merge branch 'nvme-5.7' of git://git.infradead.org/nvme into block-5.7
Pull NVMe fix from Christoph.

* 'nvme-5.7' of git://git.infradead.org/nvme:
  nvme: prevent double free in nvme_alloc_ns() error handling
2020-04-30 09:32:48 -06:00
Niklas Cassel
132be62387 nvme: prevent double free in nvme_alloc_ns() error handling
When jumping to the out_put_disk label, we will call put_disk(), which will
trigger a call to disk_release(), which calls blk_put_queue().

Later in the cleanup code, we do blk_cleanup_queue(), which will also call
blk_put_queue().

Putting the queue twice is incorrect, and will generate a KASAN splat.

Set the disk->queue pointer to NULL, before calling put_disk(), so that the
first call to blk_put_queue() will not free the queue.

The second call to blk_put_queue() uses another pointer to the same queue,
so this call will still free the queue.

Fixes: 85136c0102 ("lightnvm: simplify geometry enumeration")
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2020-04-27 17:08:06 +02:00
Damien Le Moal
d205bde78f null_blk: Cleanup zoned device initialization
Move all zoned mode related code from null_blk_main.c to
null_blk_zoned.c, avoiding an ugly #ifdef in the process.
Rename null_zone_init() into null_init_zoned_dev(), null_zone_exit()
into null_free_zoned_dev() and add the new function
null_register_zoned_dev() to finalize the zoned dev setup before
add_disk().

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-23 09:35:09 -06:00
Damien Le Moal
9dd44c7e99 null_blk: Fix zoned command handling
For write operations issued to a null_blk device with zoned mode
enabled, the state and write pointer position of the zone targeted by
the command should be checked before badblocks and memory backing
are handled as the write may be first failed due to, for instance, a
sector position not aligned with the zone write pointer. This order of
checking for errors reflects more accuratly the behavior of physical
zoned devices.

Furthermore, the write pointer position of the target zone should be
incremented only and only if no errors are reported by badblocks and
memory backing handling.

To fix this, introduce the small helper function null_process_cmd()
which execute null_handle_badblocks() and null_handle_memory_backed()
and use this function in null_zone_write() to correctly handle write
requests to zoned null devices depending on the type and state of the
write target zone. Also call this function in null_handle_zoned() to
process read requests to zoned null devices.

null_process_cmd() is called directly from null_handle_cmd() for
regular null devices, resulting in no functional change for these type
of devices. To have symmetric names, the function null_handle_zoned()
is renamed to null_process_zoned_cmd().

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-23 09:35:09 -06:00
Ma, Jianpeng
d56deb1e4e block: remove unused header
Dax related code already removed from this file.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jianpeng Ma <jianpeng.ma@intel.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-21 09:51:10 -06:00
Waiman Long
d6c8e949a3 blk-iocost: Fix error on iocost_ioc_vrate_adj
Systemtap 4.2 is unable to correctly interpret the "u32 (*missed_ppm)[2]"
argument of the iocost_ioc_vrate_adj trace entry defined in
include/trace/events/iocost.h leading to the following error:

  /tmp/stapAcz0G0/stap_c89c58b83cea1724e26395efa9ed4939_6321_aux_6.c:78:8:
  error: expected ‘;’, ‘,’ or ‘)’ before ‘*’ token
   , u32[]* __tracepoint_arg_missed_ppm

That argument type is indeed rather complex and hard to read. Looking
at block/blk-iocost.c. It is just a 2-entry u32 array. By simplifying
the argument to a simple "u32 *missed_ppm" and adjusting the trace
entry accordingly, the compilation error was gone.

Fixes: 7caa47151a ("blkcg: implement blk-iocost")
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-21 09:49:36 -06:00
Douglas Anderson
b849dd84b6 bdev: Reduce time holding bd_mutex in sync in blkdev_close()
While trying to "dd" to the block device for a USB stick, I
encountered a hung task warning (blocked for > 120 seconds).  I
managed to come up with an easy way to reproduce this on my system
(where /dev/sdb is the block device for my USB stick) with:

  while true; do dd if=/dev/zero of=/dev/sdb bs=4M; done

With my reproduction here are the relevant bits from the hung task
detector:

 INFO: task udevd:294 blocked for more than 122 seconds.
 ...
 udevd           D    0   294      1 0x00400008
 Call trace:
  ...
  mutex_lock_nested+0x40/0x50
  __blkdev_get+0x7c/0x3d4
  blkdev_get+0x118/0x138
  blkdev_open+0x94/0xa8
  do_dentry_open+0x268/0x3a0
  vfs_open+0x34/0x40
  path_openat+0x39c/0xdf4
  do_filp_open+0x90/0x10c
  do_sys_open+0x150/0x3c8
  ...

 ...
 Showing all locks held in the system:
 ...
 1 lock held by dd/2798:
  #0: ffffff814ac1a3b8 (&bdev->bd_mutex){+.+.}, at: __blkdev_put+0x50/0x204
 ...
 dd              D    0  2798   2764 0x00400208
 Call trace:
  ...
  schedule+0x8c/0xbc
  io_schedule+0x1c/0x40
  wait_on_page_bit_common+0x238/0x338
  __lock_page+0x5c/0x68
  write_cache_pages+0x194/0x500
  generic_writepages+0x64/0xa4
  blkdev_writepages+0x24/0x30
  do_writepages+0x48/0xa8
  __filemap_fdatawrite_range+0xac/0xd8
  filemap_write_and_wait+0x30/0x84
  __blkdev_put+0x88/0x204
  blkdev_put+0xc4/0xe4
  blkdev_close+0x28/0x38
  __fput+0xe0/0x238
  ____fput+0x1c/0x28
  task_work_run+0xb0/0xe4
  do_notify_resume+0xfc0/0x14bc
  work_pending+0x8/0x14

The problem appears related to the fact that my USB disk is terribly
slow and that I have a lot of RAM in my system to cache things.
Specifically my writes seem to be happening at ~15 MB/s and I've got
~4 GB of RAM in my system that can be used for buffering.  To write 4
GB of buffer to disk thus takes ~4000 MB / ~15 MB/s = ~267 seconds.

The 267 second number is a problem because in __blkdev_put() we call
sync_blockdev() while holding the bd_mutex.  Any other callers who
want the bd_mutex will be blocked for the whole time.

The problem is made worse because I believe blkdev_put() specifically
tells other tasks (namely udev) to go try to access the device at right
around the same time we're going to hold the mutex for a long time.

Putting some traces around this (after disabling the hung task detector),
I could confirm:
 dd:    437.608600: __blkdev_put() right before sync_blockdev() for sdb
 udevd: 437.623901: blkdev_open() right before blkdev_get() for sdb
 dd:    661.468451: __blkdev_put() right after sync_blockdev() for sdb
 udevd: 663.820426: blkdev_open() right after blkdev_get() for sdb

A simple fix for this is to realize that sync_blockdev() works fine if
you're not holding the mutex.  Also, it's not the end of the world if
you sync a little early (though it can have performance impacts).
Thus we can make a guess that we're going to need to do the sync and
then do it without holding the mutex.  We still do one last sync with
the mutex but it should be much, much faster.

With this, my hung task warnings for my test case are gone.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Guenter Roeck <groeck@chromium.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-20 09:31:20 -06:00
Zhiqiang Liu
c4b4c2a78a buffer: remove useless comment and WB_REASON_FREE_MORE_MEM, reason.
free_more_memory func has been completely removed in commit bc48f001de
("buffer: eliminate the need to call free_more_memory() in __getblk_slow()")

So comment and `WB_REASON_FREE_MORE_MEM` reason about free_more_memory
are no longer needed.

Fixes: bc48f001de ("buffer: eliminate the need to call free_more_memory() in __getblk_slow()")
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-17 21:38:11 -06:00
Linus Torvalds
90280eaa88 Merge tag 'docs-fixes' of git://git.lwn.net/linux
Pull documentation fixes from Jonathan Corbet:
 "A handful of fixes for reasonably obnoxious documentation issues"

* tag 'docs-fixes' of git://git.lwn.net/linux:
  scripts: documentation-file-ref-check: Add line break before exit
  scripts/kernel-doc: Add missing close-paren in c:function directives
  docs: admin-guide: merge sections for the kernel.modprobe sysctl
  docs: timekeeping: Use correct prototype for deprecated functions
2020-04-17 13:10:50 -07:00
Linus Torvalds
5d286d5ebc Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull proc fix from Eric Biederman:
 "While running syzbot happened to spot one more oversight in my rework
  of proc_flush_task.

  The fields proc_self and proc_thread_self were not being reinitialized
  when proc was unmounted, which could cause problems if the mount of
  proc fails"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  proc: Handle umounts cleanly
2020-04-17 12:05:01 -07:00
Linus Torvalds
ceb1adbacb Merge tag 'mtd/fixes-for-5.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux
Pull mtd fix from Richard Weinberger:
 "spi-nor: fix for missing directory after code refactoring"

* tag 'mtd/fixes-for-5.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: spi-nor: Compile files in controllers/ directory
2020-04-17 11:52:20 -07:00
Linus Torvalds
1634615dc1 Merge tag 'linux-watchdog-5.7-rc2' of git://www.linux-watchdog.org/linux-watchdog
Pull watchdog fix from Wim Van Sebroeck:
 "Fix restart handler in sp805 driver"

* tag 'linux-watchdog-5.7-rc2' of git://www.linux-watchdog.org/linux-watchdog:
  watchdog: sp805: fix restart handler
2020-04-17 11:40:08 -07:00
Linus Torvalds
8fce9058ca Merge tag 'devicetree-fixes-for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux
Pull devicetree fixes from Rob Herring:

 - Fix warnings from enabling more dtc warnings which landed in the
   merge window and didn't get fixed in time.

 - Fix some document references from DT schema conversions

 - Fix kmemleak errors in DT unittests

* tag 'devicetree-fixes-for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: (23 commits)
  kbuild: check libyaml installation for 'make dt_binding_check'
  of: unittest: kmemleak in duplicate property update
  of: overlay: kmemleak in dup_and_fixup_symbol_prop()
  of: unittest: kmemleak in of_unittest_overlay_high_level()
  of: unittest: kmemleak in of_unittest_platform_populate()
  of: unittest: kmemleak on changeset destroy
  MAINTAINERS: dt: fix pointers for ARM Integrator, Versatile and RealView
  MAINTAINERS: dt: update display/allwinner file entry
  dt-bindings: iio: dac: AD5570R fix bindings errors
  dt-bindings: Fix misspellings of "Analog Devices"
  dt-bindings: pwm: Fix cros-ec-pwm example dtc 'reg' warning
  docs: dt: rockchip,dwc3.txt: fix a pointer to a renamed file
  docs: dt: fix a broken reference for a file converted to json
  docs: dt: qcom,dwc3.txt: fix cross-reference for a converted file
  docs: dt: fix broken reference to phy-cadence-torrent.yaml
  dt-bindings: interrupt-controller: Fix loongson,parent_int_map property schema
  dt-bindings: hwmon: Fix incorrect $id paths
  dt-bindings: Fix dtc warnings on reg and ranges in examples
  dt-bindings: BD718x7 - add missing I2C bus properties
  dt-bindings: clock: syscon-icst: Remove unneeded unit name
  ...
2020-04-17 11:35:20 -07:00
Linus Torvalds
95988fbc7c Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Catalin Marinas:

 - Remove vdso code trying to free unallocated pages.

 - Delete the space separator in the __emit_inst macro as it breaks the
   clang integrated assembler.

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: Delete the space separator in __emit_inst
  arm64: vdso: don't free unallocated pages
2020-04-17 10:39:43 -07:00
Linus Torvalds
d0a4ebe7d1 Merge tag 'for-linus-5.7-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen update from Juergen Gross:

 - a small cleanup patch

 - a security fix for a bug in the Xen hypervisor to avoid enabling Xen
   guests to crash dom0 on an unfixed hypervisor.

* tag 'for-linus-5.7-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  arm/xen: make _xen_start_info static
  xen/xenbus: ensure xenbus_map_ring_valloc() returns proper grant status
2020-04-17 10:35:17 -07:00
Linus Torvalds
a2286a449b Merge tag 'io_uring-5.7-2020-04-17' of git://git.kernel.dk/linux-block
Pull io_uring fixes from Jens Axboe:

 - wrap up the init/setup cleanup (Pavel)

 - fix some issues around deferral sequences (Pavel)

 - fix splice punt check using the wrong struct file member

 - apply poll re-arm logic for pollable retry too

 - pollable retry should honor cancelation

 - fix setup time error handling syzbot reported crash

 - restore work state when poll is canceled

* tag 'io_uring-5.7-2020-04-17' of git://git.kernel.dk/linux-block:
  io_uring: don't count rqs failed after current one
  io_uring: kill already cached timeout.seq_offset
  io_uring: fix cached_sq_head in io_timeout()
  io_uring: only post events in io_poll_remove_all() if we completed some
  io_uring: io_async_task_func() should check and honor cancelation
  io_uring: check for need to re-wait in polled async handling
  io_uring: correct O_NONBLOCK check for splice punt
  io_uring: restore req->work when canceling poll request
  io_uring: move all request init code in one place
  io_uring: keep all sqe->flags in req->flags
  io_uring: early submission req fail code
  io_uring: track mm through current->mm
  io_uring: remove obsolete @mm_fault
2020-04-17 10:12:26 -07:00
Linus Torvalds
bf9196d51f Merge tag 'block-5.7-2020-04-17' of git://git.kernel.dk/linux-block
Pull block fixes from Jens Axboe:

 - Fix for a driver tag leak in error handling (John)

 - Remove now defunct Kconfig selection from dasd (Stefan)

 - blk-wbt trace fiexs (Tommi)

* tag 'block-5.7-2020-04-17' of git://git.kernel.dk/linux-block:
  blk-wbt: Drop needless newlines from tracepoint format strings
  blk-wbt: Use tracepoint_string() for wbt_step tracepoint string literals
  s390/dasd: remove IOSCHED_DEADLINE from DASD Kconfig
  blk-mq: Put driver tag in blk_mq_dispatch_rq_list() when no budget
2020-04-17 10:08:08 -07:00
Linus Torvalds
2acbb9e670 Merge tag 'libata-5.7-2020-04-17' of git://git.kernel.dk/linux-block
Pull libata fixlet from Jens Axboe:
 "Add yet another Comet Lake PCI ID for ahci"

* tag 'libata-5.7-2020-04-17' of git://git.kernel.dk/linux-block:
  ahci: Add Intel Comet Lake PCH-U PCI ID
2020-04-17 10:06:16 -07:00
Linus Torvalds
c5304dd59b Merge tag 'for-5.7-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fix from David Sterba:
 "A regression fix for a warning caused by running balance and snapshot
  creation in parallel"

* tag 'for-5.7-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix setting last_trans for reloc roots
2020-04-17 10:00:33 -07:00
Linus Torvalds
5a32fe48bc Merge tag 'pm-5.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management update from Rafael Wysocki:
 "Allow the operating performance points (OPP) core to be used in the
  case when the same driver is used on different platforms, some of
  which have an OPP table and some of which have a clock node (Rajendra
  Nayak)"

* tag 'pm-5.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  opp: Manage empty OPP tables with clk handle
2020-04-17 09:56:28 -07:00
Linus Torvalds
c8a6552ff1 Merge tag 'sound-5.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
 "One significant regression fix is for HD-audio buffer preallocation.
  In 5.6 it was set to non-prompt for x86 and forced to 0, but this
  turned out to be problematic for some applications, hence it gets
  reverted. Distros would need to restore CONFIG_SND_HDA_PREALLOC_SIZE
  value to the earlier values they've used in the past.

  Other than that, we've received quite a few small fixes for HD-audio
  and USB-audio. Most of them are for dealing with the broken TRX40
  mobos and the runtime PM without HD-audio codecs"

* tag 'sound-5.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda: call runtime_allow() for all hda controllers
  ALSA: hda: Allow setting preallocation again for x86
  ALSA: hda: Explicitly permit using autosuspend if runtime PM is supported
  ALSA: hda: Skip controller resume if not needed
  ALSA: hda: Keep the controller initialization even if no codecs found
  ALSA: hda: Release resources at error in delayed probe
  ALSA: hda: Honor PM disablement in PM freeze and thaw_noirq ops
  ALSA: hda: Don't release card at firmware loading error
  ALSA: usb-audio: Check mapping at creating connector controls, too
  ALSA: usb-audio: Don't create jack controls for PCM terminals
  ALSA: usb-audio: Don't override ignore_ctl_error value from the map
  ALSA: usb-audio: Filter error from connector kctl ops, too
  ALSA: hda/realtek - Enable the headset mic on Asus FX505DT
  ALSA: ctxfi: Remove unnecessary cast in kfree
2020-04-17 09:48:50 -07:00
Masahiro Yamada
0903060fe5 kbuild: check libyaml installation for 'make dt_binding_check'
If you run 'make dtbs_check' without installing the libyaml package,
the error message "dtc needs libyaml ..." is shown.

This should be checked also for 'make dt_binding_check' because dtc
needs to validate *.example.dts extracted from *.yaml files.

It is missing since commit 4f0e3a57d6 ("kbuild: Add support for DT
binding schema checks"), but this fix-up is applicable only after commit
e10c4321dc ("kbuild: allow to run dt_binding_check and dtbs_check
in a single command").

I gave the Fixes tag to the latter in case somebody is interested in
back-porting this.

Fixes: e10c4321dc ("kbuild: allow to run dt_binding_check and dtbs_check in a single command")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-04-17 10:45:23 -05:00
Tommi Rantala
3f22037d38 blk-wbt: Drop needless newlines from tracepoint format strings
Drop needless newlines from tracepoint format strings, they only add
empty lines to perf tracing output.

Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-17 08:21:46 -06:00
Tommi Rantala
3a89c25d98 blk-wbt: Use tracepoint_string() for wbt_step tracepoint string literals
Use tracepoint_string() for string literals that are used in the
wbt_step tracepoint, so that userspace tools can display the string
content.

Signed-off-by: Tommi Rantala <tommi.t.rantala@nokia.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-17 08:21:44 -06:00
Stefan Haberland
3dceecfad6 s390/dasd: remove IOSCHED_DEADLINE from DASD Kconfig
CONFIG_IOSCHED_DEADLINE was removed with
commit f382fb0bce ("block: remove legacy IO schedulers")

and setting of the scheduler was removed with
commit a5fd8ddce2 ("s390/dasd: remove setting of scheduler from driver").

So get rid of the select.

Reported-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Stefan Haberland <sth@linux.ibm.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-04-17 08:05:27 -06:00
Frank Rowand
29acfb6559 of: unittest: kmemleak in duplicate property update
kmemleak reports several memory leaks from devicetree unittest.
This is the fix for problem 5 of 5.

When overlay 'overlay_bad_add_dup_prop' is applied, the apply code
properly detects that a memory leak will occur if the overlay is removed
since the duplicate property is located in a base devicetree node and
reports via printk():

  OF: overlay: WARNING: memory leak will occur if overlay removed, property: /testcase-data-2/substation@100/motor-1/rpm_avail
  OF: overlay: WARNING: memory leak will occur if overlay removed, property: /testcase-data-2/substation@100/motor-1/rpm_avail

The overlay is removed when the apply code detects multiple changesets
modifying the same property.  This is reported via printk():

  OF: overlay: ERROR: multiple fragments add, update, and/or delete property /testcase-data-2/substation@100/motor-1/rpm_avail

As a result of this error, the overlay is removed resulting in the
expected memory leak.

Add another device node level to the overlay so that the duplicate
property is located in a node added by the overlay, thus no memory
leak will occur when the overlay is removed.

Thus users of kmemleak will not have to debug this leak in the future.

Fixes: 2fe0e8769d ("of: overlay: check prevents multiple fragments touching same property")
Reported-by: Erhard F. <erhard_f@mailbox.org>
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-04-17 08:31:34 -05:00
Frank Rowand
478ff649b1 of: overlay: kmemleak in dup_and_fixup_symbol_prop()
kmemleak reports several memory leaks from devicetree unittest.
This is the fix for problem 4 of 5.

target_path was not freed in the non-error path.

Fixes: e0a58f3e08 ("of: overlay: remove a dependency on device node full_name")
Reported-by: Erhard F. <erhard_f@mailbox.org>
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-04-17 08:31:34 -05:00
Frank Rowand
145fc138f9 of: unittest: kmemleak in of_unittest_overlay_high_level()
kmemleak reports several memory leaks from devicetree unittest.
This is the fix for problem 3 of 5.

of_unittest_overlay_high_level() failed to kfree the newly created
property when the property named 'name' is skipped.

Fixes: 39a751a4cb ("of: change overlay apply input data from unflattened to FDT")
Reported-by: Erhard F. <erhard_f@mailbox.org>
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-04-17 08:31:34 -05:00
Frank Rowand
216830d241 of: unittest: kmemleak in of_unittest_platform_populate()
kmemleak reports several memory leaks from devicetree unittest.
This is the fix for problem 2 of 5.

of_unittest_platform_populate() left an elevated reference count for
grandchild nodes (which are platform devices).  Fix the platform
device reference counts so that the memory will be freed.

Fixes: fb2caa50fb ("of/selftest: add testcase for nodes with same name and address")
Reported-by: Erhard F. <erhard_f@mailbox.org>
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-04-17 08:31:34 -05:00
Frank Rowand
b3fb36ed69 of: unittest: kmemleak on changeset destroy
kmemleak reports several memory leaks from devicetree unittest.
This is the fix for problem 1 of 5.

of_unittest_changeset() reaches deeply into the dynamic devicetree
functions.  Several nodes were left with an elevated reference
count and thus were not properly cleaned up.  Fix the reference
counts so that the memory will be freed.

Fixes: 201c910bd6 ("of: Transactional DT support.")
Reported-by: Erhard F. <erhard_f@mailbox.org>
Signed-off-by: Frank Rowand <frank.rowand@sony.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-04-17 08:31:34 -05:00
Mauro Carvalho Chehab
21a431e627 MAINTAINERS: dt: fix pointers for ARM Integrator, Versatile and RealView
There's a conversion from a plain text binding file into 4 yaml ones.
The old file got removed, causing this new warning:

	Warning: MAINTAINERS references a file that doesn't exist: Documentation/devicetree/bindings/arm/arm-boards

Address it by replacing the old reference by the new ones

Fixes: 4b900070d5 ("dt-bindings: arm: Add Versatile YAML schema")
Fixes: 2d483550b6 ("dt-bindings: arm: Drop the non-YAML bindings")
Fixes: 7db625b9fa ("dt-bindings: arm: Add RealView YAML schema")
Fixes: 4fb00d9066 ("dt-bindings: arm: Add Versatile Express and Juno YAML schema")
Fixes: 33fbfb3eaf ("dt-bindings: arm: Add Integrator YAML schema")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-04-17 08:31:34 -05:00
Mauro Carvalho Chehab
f4d859b7f3 MAINTAINERS: dt: update display/allwinner file entry
Changeset f5a98bfe7b ("dt-bindings: display: Convert Allwinner display pipeline to schemas")
split Documentation/devicetree/bindings/display/sunxi/sun4i-drm.txt
into several files. Yet, it kept the old place at MAINTAINERS.

Update it to point to the new place.

Fixes: f5a98bfe7b ("dt-bindings: display: Convert Allwinner display pipeline to schemas")
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Rob Herring <robh@kernel.org>
2020-04-17 08:31:34 -05:00
Alexandru Tachici
2cf3818f18 dt-bindings: iio: dac: AD5570R fix bindings errors
Replaced num property with reg property, fixed errors
reported by dt-binding-check.

Fixes: ea52c21268 ("dt-bindings: iio: dac: Add docs for AD5770R DAC")
Signed-off-by: Alexandru Tachici <alexandru.tachici@analog.com>
[robh: Fix required property list, fix Fixes tag]
Signed-off-by: Rob Herring <robh@kernel.org>
2020-04-17 08:29:16 -05:00
Josef Bacik
aec7db3b13 btrfs: fix setting last_trans for reloc roots
I made a mistake with my previous fix, I assumed that we didn't need to
mess with the reloc roots once we were out of the part of relocation where
we are actually moving the extents.

The subtle thing that I missed is that btrfs_init_reloc_root() also
updates the last_trans for the reloc root when we do
btrfs_record_root_in_trans() for the corresponding fs_root.  I've added a
comment to make sure future me doesn't make this mistake again.

This showed up as a WARN_ON() in btrfs_copy_root() because our
last_trans didn't == the current transid.  This could happen if we
snapshotted a fs root with a reloc root after we set
rc->create_reloc_tree = 0, but before we actually merge the reloc root.

Worth mentioning that the regression produced the following warning
when running snapshot creation and balance in parallel:

  BTRFS info (device sdc): relocating block group 30408704 flags metadata|dup
  ------------[ cut here ]------------
  WARNING: CPU: 0 PID: 12823 at fs/btrfs/ctree.c:191 btrfs_copy_root+0x26f/0x430 [btrfs]
  CPU: 0 PID: 12823 Comm: btrfs Tainted: G        W 5.6.0-rc7-btrfs-next-58 #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
  RIP: 0010:btrfs_copy_root+0x26f/0x430 [btrfs]
  RSP: 0018:ffffb96e044279b8 EFLAGS: 00010202
  RAX: 0000000000000009 RBX: ffff9da70bf61000 RCX: ffffb96e04427a48
  RDX: ffff9da733a770c8 RSI: ffff9da70bf61000 RDI: ffff9da694163818
  RBP: ffff9da733a770c8 R08: fffffffffffffff8 R09: 0000000000000002
  R10: ffffb96e044279a0 R11: 0000000000000000 R12: ffff9da694163818
  R13: fffffffffffffff8 R14: ffff9da6d2512000 R15: ffff9da714cdac00
  FS:  00007fdeacf328c0(0000) GS:ffff9da735e00000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 000055a2a5b8a118 CR3: 00000001eed78002 CR4: 00000000003606f0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   ? create_reloc_root+0x49/0x2b0 [btrfs]
   ? kmem_cache_alloc_trace+0xe5/0x200
   create_reloc_root+0x8b/0x2b0 [btrfs]
   btrfs_reloc_post_snapshot+0x96/0x5b0 [btrfs]
   create_pending_snapshot+0x610/0x1010 [btrfs]
   create_pending_snapshots+0xa8/0xd0 [btrfs]
   btrfs_commit_transaction+0x4c7/0xc50 [btrfs]
   ? btrfs_mksubvol+0x3cd/0x560 [btrfs]
   btrfs_mksubvol+0x455/0x560 [btrfs]
   __btrfs_ioctl_snap_create+0x15f/0x190 [btrfs]
   btrfs_ioctl_snap_create_v2+0xa4/0xf0 [btrfs]
   ? mem_cgroup_commit_charge+0x6e/0x540
   btrfs_ioctl+0x12d8/0x3760 [btrfs]
   ? do_raw_spin_unlock+0x49/0xc0
   ? _raw_spin_unlock+0x29/0x40
   ? __handle_mm_fault+0x11b3/0x14b0
   ? ksys_ioctl+0x92/0xb0
   ksys_ioctl+0x92/0xb0
   ? trace_hardirqs_off_thunk+0x1a/0x1c
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x5c/0x280
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
  RIP: 0033:0x7fdeabd3bdd7

Fixes: 2abc726ab4 ("btrfs: do not init a reloc root if we aren't relocating")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-04-17 15:20:08 +02:00
Jason Yan
74f4c438f2 arm/xen: make _xen_start_info static
Fix the following sparse warning:

arch/arm64/xen/../../arm/xen/enlighten.c:39:19: warning: symbol
'_xen_start_info' was not declared. Should it be static?

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Jason Yan <yanaijie@huawei.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Link: https://lore.kernel.org/r/20200415084853.5808-1-yanaijie@huawei.com
Signed-off-by: Juergen Gross <jgross@suse.com>
2020-04-17 07:45:12 +02:00