Commit Graph

127961 Commits

Author SHA1 Message Date
Martin K. Petersen
556666bce1 Merge branch '5.12/scsi-fixes' into 5.13/scsi-staging
Pull 5.12/scsi-fixes into the 5.13 SCSI tree to provide a baseline for
some UFS changes that would otherwise cause conflicts during the
merge.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-05 22:57:29 -04:00
Arnd Bergmann
5b11c9d80b scsi: fcoe: Fix mismatched fcoe_wwn_from_mac declaration
An old cleanup changed the array size from MAX_ADDR_LEN to unspecified in
the declaration, but now gcc-11 warns about this:

drivers/scsi/fcoe/fcoe_ctlr.c:1972:37: error: argument 1 of type ‘unsigned char[32]’ with mismatched bound [-Werror=array-parameter=]
 1972 | u64 fcoe_wwn_from_mac(unsigned char mac[MAX_ADDR_LEN],
      |                       ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
In file included from /git/arm-soc/drivers/scsi/fcoe/fcoe_ctlr.c:33:
include/scsi/libfcoe.h:252:37: note: previously declared as ‘unsigned char[]’
  252 | u64 fcoe_wwn_from_mac(unsigned char mac[], unsigned int, unsigned int);
      |                       ~~~~~~~~~~~~~~^~~~~

Change the type back to what the function definition uses.

Link: https://lore.kernel.org/r/20210322164702.957810-1-arnd@kernel.org
Fixes: fdd78027fd ("[SCSI] fcoe: cleans up libfcoe.h and adds fcoe.h for fcoe module")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-04-01 22:59:43 -04:00
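The mismatch behind the gcc-11 diagnostic above can be illustrated outside the kernel. This is a minimal user-space sketch, not the kernel code: sketch_wwn_from_mac and SKETCH_ADDR_LEN are hypothetical names, and the body is a placeholder that only packs six bytes. The point is that the declaration and the definition spell the array parameter with the same bound.

```c
#include <stdint.h>

#define SKETCH_ADDR_LEN 32 /* hypothetical stand-in for MAX_ADDR_LEN */

/*
 * Declaration: the bound matches the definition below. Writing `mac[]`
 * here would declare the same parameter type, but it is the kind of
 * mismatch that the -Warray-parameter diagnostic quoted above reports.
 */
uint64_t sketch_wwn_from_mac(unsigned char mac[SKETCH_ADDR_LEN]);

/* Definition: placeholder body that packs the first six bytes, big-endian. */
uint64_t sketch_wwn_from_mac(unsigned char mac[SKETCH_ADDR_LEN])
{
	uint64_t wwn = 0;

	for (int i = 0; i < 6; i++)
		wwn = (wwn << 8) | mac[i];
	return wwn;
}
```

Keeping one spelling of the bound in both places documents the expected buffer size for callers, even though C passes the argument as a plain pointer either way.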
Wan Jiabing
6bfe9855da scsi: core: scsi_host_cmd_pool is declared twice
struct scsi_host_cmd_pool has already been declared. Remove the duplicate.

Link: https://lore.kernel.org/r/20210325064632.855002-1-wanjiabing@vivo.com
Signed-off-by: Wan Jiabing <wanjiabing@vivo.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-29 23:32:03 -04:00
Gulam Mohamed
9e67600ed6 scsi: iscsi: Fix race condition between login and sync thread
A kernel panic was observed due to a timing issue between the sync thread
and the initiator processing a login response from the target. The session
reopen can be invoked both from the session sync thread when iscsid
restarts and from iscsid through the error handler. Before the initiator
receives the response to a login, another reopen request can be sent from
the error handler/sync session. When the initial login response is
subsequently processed, the connection has been closed and the socket has
been released.

To fix this a new connection state, ISCSI_CONN_BOUND, is added:

 - Set the connection state value to ISCSI_CONN_DOWN upon
   iscsi_if_ep_disconnect() and iscsi_if_stop_conn()

 - Set the connection state to the newly created value ISCSI_CONN_BOUND
   after bind connection (transport->bind_conn())

 - In iscsi_set_param(), return -ENOTCONN if the connection state is
   neither ISCSI_CONN_BOUND nor ISCSI_CONN_UP

Link: https://lore.kernel.org/r/20210325093248.284678-1-gulam.mohamed@oracle.com
Reviewed-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Gulam Mohamed <gulam.mohamed@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

2021-03-29 21:17:45 -04:00
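The three state rules listed in the iscsi fix above can be sketched as a small state check. This is a user-space illustration under assumptions: the sketch_ names, the enum values and the struct are hypothetical and do not reproduce the kernel's definitions; only the rule "reject unless bound or up, with -ENOTCONN" comes from the commit message.

```c
#include <errno.h>

/* Hypothetical connection states; the numeric values are illustrative only. */
enum sketch_conn_state {
	SKETCH_CONN_UP,
	SKETCH_CONN_DOWN,
	SKETCH_CONN_FAILED,
	SKETCH_CONN_BOUND,
};

struct sketch_conn {
	enum sketch_conn_state state;
};

/* Models the transition on endpoint disconnect or connection stop. */
void sketch_conn_down(struct sketch_conn *conn)
{
	conn->state = SKETCH_CONN_DOWN;
}

/* Models the transition after the transport has bound the connection. */
void sketch_conn_bound(struct sketch_conn *conn)
{
	conn->state = SKETCH_CONN_BOUND;
}

/* Reject parameter updates unless the connection is bound or up. */
int sketch_set_param(const struct sketch_conn *conn)
{
	if (conn->state != SKETCH_CONN_BOUND && conn->state != SKETCH_CONN_UP)
		return -ENOTCONN;
	return 0;
}
```

With this check, a late parameter update that arrives after a disconnect fails cleanly instead of touching a connection whose socket has already been released.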
Bhaskar Chowdhury
ae98ddf05f scsi: scsi_dh: Fix a typo
s/infrastruture/infrastructure/

[mkp: combined .c and .h patches]

Link: https://lore.kernel.org/r/20210322064724.4108343-1-unixbhaskar@gmail.com
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

2021-03-24 23:03:43 -04:00
Bart Van Assche
035e9f4716 scsi: sbitmap: Silence a debug kernel warning triggered by sbitmap_put()
All sbitmap code uses implied preemption protection to update
sb->alloc_hint except sbitmap_put(). Using implied preemption protection is
safe since the value of sb->alloc_hint only affects performance of sbitmap
allocations but not their correctness. Change this_cpu_ptr() in
sbitmap_put() into raw_cpu_ptr() to suppress the following kernel warning
that appears with preemption debugging enabled (CONFIG_DEBUG_PREEMPT):

BUG: using smp_processor_id() in preemptible [00000000] code: scsi_eh_0/152
caller is debug_smp_processor_id+0x17/0x20
CPU: 1 PID: 152 Comm: scsi_eh_0 Tainted: G        W         5.12.0-rc1-dbg+ #6
Call Trace:
 show_stack+0x52/0x58
 dump_stack+0xaf/0xf3
 check_preemption_disabled+0xce/0xd0
 debug_smp_processor_id+0x17/0x20
 scsi_device_unbusy+0x13a/0x1c0 [scsi_mod]
 scsi_finish_command+0x4d/0x290 [scsi_mod]
 scsi_eh_flush_done_q+0x1e7/0x280 [scsi_mod]
 ata_scsi_port_error_handler+0x592/0x750 [libata]
 ata_scsi_error+0x1a0/0x1f0 [libata]
 scsi_error_handler+0x19e/0x330 [scsi_mod]
 kthread+0x222/0x250
 ret_from_fork+0x1f/0x30

Link: https://lore.kernel.org/r/20210317032648.9080-1-bvanassche@acm.org
Fixes: c548e62bcf ("scsi: sbitmap: Move allocation hint into sbitmap")
Cc: Hannes Reinecke <hare@suse.de>
Cc: Omar Sandoval <osandov@fb.com>
Reviewed-by: Ming Lei <ming.lei@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-18 22:39:30 -04:00
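The claim in the sbitmap_put() commit above, that the allocation hint affects only performance and not correctness, can be illustrated with a toy bitmap. This is a hedged user-space sketch, not the sbitmap implementation: the sketch_ names and the single 64-bit word are assumptions made for brevity.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_DEPTH 64u

/*
 * Find and claim a free bit, starting the search at `hint`.
 * Returns the bit index, or -1 if the map is full.
 */
int sketch_bitmap_get(uint64_t *map, unsigned int hint)
{
	for (unsigned int i = 0; i < SKETCH_DEPTH; i++) {
		unsigned int bit = (hint + i) % SKETCH_DEPTH;

		if (!(*map & (UINT64_C(1) << bit))) {
			*map |= UINT64_C(1) << bit;
			return (int)bit;
		}
	}
	return -1;
}

/* Release a previously claimed bit. */
void sketch_bitmap_put(uint64_t *map, unsigned int bit)
{
	assert(bit < SKETCH_DEPTH);
	*map &= ~(UINT64_C(1) << bit);
}
```

Whatever value the hint holds, the search still visits every bit, so a stale or foreign-CPU hint can only lengthen the search. That is why reading the hint without strict per-CPU protection is tolerable in this model.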
Michael Kelley
3d9c3dcc58 scsi: storvsc: Enable scatterlist entry lengths > 4Kbytes
storvsc currently sets .dma_boundary to limit scatterlist entries to 4
Kbytes, which is less efficient with huge pages that offer large chunks of
contiguous physical memory. Improve the algorithm for creating the Hyper-V
guest physical address PFN array so that scatterlist entries with lengths >
4Kbytes are handled.  As a result, remove the .dma_boundary setting.

The improved algorithm also adds support for scatterlist entries with
offsets >= 4Kbytes, which is supported by many other SCSI low-level
drivers.  And it retains support for architectures where possibly PAGE_SIZE
!= HV_HYP_PAGE_SIZE (such as ARM64).

Link: https://lore.kernel.org/r/1614120294-1930-1-git-send-email-mikelley@microsoft.com
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-17 00:04:40 -04:00
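The arithmetic behind a PFN array for a scatterlist entry longer than 4 Kbytes, as described in the storvsc commit above, can be sketched as follows. This is an illustration under assumptions, not the driver's algorithm: the sketch_ names are hypothetical, the hypervisor page size is taken to be 4 KiB, and the entry is assumed to be physically contiguous with a nonzero length.

```c
#include <stdint.h>

#define SKETCH_HV_PAGE_SHIFT 12
#define SKETCH_HV_PAGE_SIZE  (UINT64_C(1) << SKETCH_HV_PAGE_SHIFT)

/* First hypervisor page frame touched by a buffer at physical address `addr`. */
uint64_t sketch_first_pfn(uint64_t addr)
{
	return addr >> SKETCH_HV_PAGE_SHIFT;
}

/*
 * Number of 4 KiB hypervisor pages spanned by `len` bytes starting at
 * `addr`. Assumes len > 0.
 */
uint64_t sketch_pfn_count(uint64_t addr, uint64_t len)
{
	uint64_t offset_in_page = addr & (SKETCH_HV_PAGE_SIZE - 1);

	return (offset_in_page + len + SKETCH_HV_PAGE_SIZE - 1) >>
	       SKETCH_HV_PAGE_SHIFT;
}
```

For example, an entry of 0x3000 bytes starting at physical address 0x11800 begins in frame 0x11 and spans four hypervisor pages, because the 0x800 offset pushes the tail into a fourth page.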
Kashyap Desai
af1830956d scsi: core: Add mq_poll support to SCSI layer
Currently IOPOLL support is only available in the block layer. This patch
adds mq_poll support to the SCSI layer.

Link: https://lore.kernel.org/r/20210215074048.19424-2-kashyap.desai@broadcom.com
Cc: sumit.saxena@broadcom.com
Cc: chandrakanth.patil@broadcom.com
Cc: linux-block@vger.kernel.org
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:03 -05:00
Mike Christie
39ae3edda3 scsi: target: core: Make completion affinity configurable
It may not always be best to complete the IO on the same CPU as it was
submitted on. This commit allows userspace to configure it.

This has been useful for vhost-scsi where we have a single thread for
submissions and completions. If we force the completion on the submission
CPU we may be adding conflicts with what the user has set up in the lower
levels with settings like the block layer rq_affinity or the driver's IRQ
or softirq (the network's rps_cpus value) settings.

We may also want to set it up where the vhost thread runs on CPU N and does
its submissions/completions there, and then have LIO do its completion
booking on CPU M, but can't configure the lower levels due to issues like
using dm-multipath with lots of paths (the path selector can throw commands
all over the system because it's only taking into account latency/throughput
at its level).

The new setting is in:

    /sys/kernel/config/target/$fabric/$target/param/cmd_completion_affinity

Writing:

    -1 -> Gives the current default behavior of completing on the
          submission CPU.

    -2 -> Completes the cmd on the CPU the lower layers sent it to us from.

   > 0 -> Completes on the CPU userspace has specified.

Link: https://lore.kernel.org/r/20210227170006.5077-26-michael.christie@oracle.com
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:03 -05:00
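The three cmd_completion_affinity values described above map onto a simple selection rule. The sketch below is a user-space illustration, not the LIO code: the sketch_ names are hypothetical, and treating every value other than -1 and -2 as an already validated CPU number is an assumption, since the message only lists "> 0".

```c
/* Setting values quoted from the commit message. */
#define SKETCH_COMPL_SUBMIT_CPU      (-1)
#define SKETCH_COMPL_LOWER_LAYER_CPU (-2)

/*
 * Pick the CPU on which to complete a command.
 *   submit_cpu      - CPU the command was submitted on
 *   lower_layer_cpu - CPU on which the lower layers delivered the completion
 */
int sketch_completion_cpu(int setting, int submit_cpu, int lower_layer_cpu)
{
	if (setting == SKETCH_COMPL_SUBMIT_CPU)
		return submit_cpu;
	if (setting == SKETCH_COMPL_LOWER_LAYER_CPU)
		return lower_layer_cpu;
	/* Assumption: the value was validated earlier and names a CPU. */
	return setting;
}
```

For instance, with a command submitted on CPU 3 and a completion delivered on CPU 7, a setting of -1 selects CPU 3, -2 selects CPU 7, and 5 selects CPU 5.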
Mike Christie
302990ac3b scsi: target: core: Fix backend plugging
target_core_iblock is plugging and unplugging on every command and this is
causing perf issues for drivers that prefer batched cmds. With recent
patches we can now take multiple cmds from a fabric driver queue and then
pass them down the backend drivers in a batch. This patch adds this support
by adding 2 callouts to the backend for plugging and unplugging the
device. Subsequent commits will add support for iblock and tcmu device
plugging.

Link: https://lore.kernel.org/r/20210227170006.5077-22-michael.christie@oracle.com
Reviewed-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:02 -05:00
Mike Christie
802ec4f672 scsi: target: core: Cleanup cmd flag bits
We have a couple holes in the cmd flags definitions. This cleans up the
definitions to fix that and make it easier to read.

Link: https://lore.kernel.org/r/20210227170006.5077-21-michael.christie@oracle.com
Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:02 -05:00
Mike Christie
eb44ce8c8c scsi: target: core: Add workqueue based cmd submission
loop and vhost/scsi do their target cmd submission from driver
workqueues. This allows them to avoid an issue where the backend may block
waiting for resources like tags/requests, mem/locks, etc., which ends up
blocking their entire submission path and, in the case of vhost-scsi, both
the submission and completion paths.

This patch adds a helper drivers can use to submit from a LIO workqueue.
This code will then be extended in the next patches to fix the plugging of
backend devices.

We are only converting vhost/loop initially, but the workqueue based
submission will work for other drivers and have similar benefits where the
main target loops will not end up blocking on some backend resource.

Link: https://lore.kernel.org/r/20210227170006.5077-17-michael.christie@oracle.com
Tested-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:02 -05:00
Mike Christie
0869419947 scsi: target: core: Add gfp_t arg to target_cmd_init_cdb()
tcm_loop could be used like a normal block device, so we can't use
GFP_KERNEL and should use GFP_NOIO. This adds a gfp_t arg to
target_cmd_init_cdb() and converts the users. For every driver but loop
GFP_KERNEL is kept.

This will also be useful in subsequent patches where loop needs to do
target_submit_prep() from interrupt context to get a ref to the se_device,
and so it will need to use GFP_ATOMIC.

Link: https://lore.kernel.org/r/20210227170006.5077-16-michael.christie@oracle.com
Tested-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:02 -05:00
Mike Christie
0fa50a8b12 scsi: target: core: Remove target_submit_cmd_map_sgls()
Convert target_submit_cmd() to do its own calls and then remove
target_submit_cmd_map_sgls() since no one uses it.

Link: https://lore.kernel.org/r/20210227170006.5077-15-michael.christie@oracle.com
Tested-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Bodo Stroesser <bostroesser@gmail.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:01 -05:00
Mike Christie
750a1d93f9 scsi: target: core: Break up target_submit_cmd_map_sgls()
This breaks up target_submit_cmd_map_sgls() into 3 helpers:

 - target_init_cmd(): Do the basic general setup and get a refcount to the
   session to make sure the caller can execute the cmd.

 - target_submit_prep(): Do the mapping, cdb processing and get a ref to
   the LUN.

 - target_submit(): Pass the cmd to LIO core for execution.

The above functions must be used by drivers that either:

 1. Rely on LIO for session shutdown synchronization by calling
    target_stop_session().

 2. Need to map sgls.

When the next patches are applied then simple drivers that do not need the
extra functionality above can use target_submit_cmd() and not worry about
failures being returned and how to handle them, since many drivers were
getting this wrong and would have hit refcount bugs.

Also, by breaking target_submit_cmd_map_sgls() up into these 3 helper
functions, we can allow the later patches to do the init/prep from
interrupt context and then do the submission from a workqueue.

Link: https://lore.kernel.org/r/20210227170006.5077-5-michael.christie@oracle.com
Cc: Bart Van Assche <bvanassche@acm.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Nilesh Javali <njavali@marvell.com>
Cc: Michael Cyr <mikecyr@linux.ibm.com>
Cc: Chris Boot <bootc@bootc.net>
Cc: Felipe Balbi <balbi@kernel.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:00 -05:00
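The three-stage split described above (init, prep, submit) can be sketched as a small pipeline in which each stage takes the reference it is responsible for. This is a hedged user-space model only: every sketch_ name, struct and return convention here is hypothetical and does not reproduce the signatures of the real target_* helpers.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_session { int refs; bool stopping; };
struct sketch_lun     { int refs; };

struct sketch_cmd {
	struct sketch_session *sess;
	struct sketch_lun *lun;
	bool executed;
};

/* Stage 1: basic setup; take a session reference, or fail during shutdown. */
int sketch_init_cmd(struct sketch_cmd *cmd, struct sketch_session *sess)
{
	if (sess->stopping)
		return -1;
	sess->refs++;
	cmd->sess = sess;
	cmd->lun = NULL;
	cmd->executed = false;
	return 0;
}

/* Stage 2: mapping and CDB processing are elided; take a LUN reference. */
int sketch_submit_prep(struct sketch_cmd *cmd, struct sketch_lun *lun)
{
	lun->refs++;
	cmd->lun = lun;
	return 0;
}

/* Stage 3: hand the prepared command over for execution. */
void sketch_submit(struct sketch_cmd *cmd)
{
	cmd->executed = true;
}
```

Because the first two stages do not execute anything, a caller in this model could run them in one context and defer the third to a worker, which mirrors the motivation given in the message.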
Mike Christie
a78b713618 scsi: target: core: Rename transport_init_se_cmd()
Rename transport_init_se_cmd() to __target_init_cmd() to reflect that it is
more of an internal function that drivers should normally not use and
because we are going to add a new init function in the next patches.

Link: https://lore.kernel.org/r/20210227170006.5077-4-michael.christie@oracle.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com>
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:00 -05:00
Ming Lei
020b0f0a31 scsi: core: Replace sdev->device_busy with sbitmap
SCSI currently uses an atomic variable to track queue depth for each
attached device. The queue depth depends on many factors such as transport
type and device implementation. In addition, the SCSI device queue depth is
not a static entity but changes over time as a result of congestion
management.

While blk-mq currently tracks queue depth for each hctx, it can't easily be
changed to accommodate the SCSI per-device requirement.

The current approach of using an atomic variable doesn't scale well when
there are lots of CPU cores and the disk is very fast. IOPS can be
substantially impacted by the atomic in the hot path.

Replace the atomic variable sdev->device_busy with an sbitmap for tracking
the SCSI device queue depth.

It has been observed that IOPS is improved ~30% by this patchset in the
following test:

1) test machine(32 logical CPU cores)
	Thread(s) per core:  2
	Core(s) per socket:  8
	Socket(s):           2
	NUMA node(s):        2
	Model name:          Intel(R) Xeon(R) Silver 4110 CPU @ 2.10GHz

2) setup scsi_debug:
modprobe scsi_debug virtual_gb=128 max_luns=1 submit_queues=32 delay=0 max_queue=256

3) fio script:
fio --rw=randread --size=128G --direct=1 --ioengine=libaio --iodepth=2048 \
	--numjobs=32 --bs=4k --group_reporting=1 --runtime=60 \
	--loops=10000 --name=job1 --filename=/dev/sdN

[mkp: fix device_busy reference in mpt3sas]

Link: https://lore.kernel.org/r/20210122023317.687987-14-ming.lei@redhat.com
Link: https://lore.kernel.org/linux-block/20200119071432.18558-6-ming.lei@redhat.com/
Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:00 -05:00
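The idea of tracking per-device queue depth with a bitmap instead of a counter, as described above, can be sketched in a few lines. This is a user-space toy under assumptions: the sketch_ names are hypothetical, a single 64-bit word stands in for the bitmap (so depth is at most 64), and no atomics or per-CPU hints are modelled, which is where the real scaling benefit comes from.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_dev {
	uint64_t busy_map;  /* one bit per in-flight command */
	unsigned int depth; /* current queue depth, at most 64 in this sketch */
};

/* Claim a budget token (a bit index), or return -1 when the device is full. */
int sketch_get_budget(struct sketch_dev *dev)
{
	for (unsigned int bit = 0; bit < dev->depth; bit++) {
		if (!(dev->busy_map & (UINT64_C(1) << bit))) {
			dev->busy_map |= UINT64_C(1) << bit;
			return (int)bit;
		}
	}
	return -1;
}

/* Return a token obtained from sketch_get_budget(). */
void sketch_put_budget(struct sketch_dev *dev, int token)
{
	assert(token >= 0 && (unsigned int)token < dev->depth);
	dev->busy_map &= ~(UINT64_C(1) << token);
}

/* Number of in-flight commands: the bitmap weight replaces the counter. */
unsigned int sketch_device_busy(const struct sketch_dev *dev)
{
	unsigned int n = 0;

	for (uint64_t m = dev->busy_map; m; m &= m - 1)
		n++;
	return n;
}
```

The token returned on allocation has to travel with the command so that completion can clear exactly that bit; the busy count is then derived on demand rather than updated on every I/O.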
Ming Lei
8278807abd scsi: core: Add scsi_device_busy() wrapper
Add scsi_device_busy() helper to prepare drivers for tracking device queue
depth via sbitmap_queue.

Link: https://lore.kernel.org/r/20210122023317.687987-12-ming.lei@redhat.com
Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:00 -05:00
Ming Lei
9ebb4d70dc scsi: core: Put hot fields of scsi_host_template in one cacheline
The following three fields of scsi_host_template are referenced in the SCSI
I/O submission hot path. Put them together in one cacheline:

 - cmd_size

 - queuecommand

 - commit_rqs

Link: https://lore.kernel.org/r/20210122023317.687987-10-ming.lei@redhat.com
Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:37:00 -05:00
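Grouping hot fields so that they share a cache line, as the commit above does for cmd_size, queuecommand and commit_rqs, can be checked at compile time. The sketch below is illustrative only: the struct, the callback signatures and the 64-byte line size are assumptions, and the check also presumes the structure itself starts on a cache line boundary.

```c
#include <stddef.h>

#define SKETCH_CACHELINE 64 /* assumed cache line size in bytes */

struct sketch_host_template {
	/* Hot-path fields grouped at the start of the structure. */
	unsigned int cmd_size;
	int (*queuecommand)(void *host, void *cmd);
	void (*commit_rqs)(void *host, unsigned short hwq);

	/* Cold fields follow; these are placeholders. */
	const char *name;
	char cold_fields_placeholder[256];
};

/* Index of the cache line that holds the byte at `offset`. */
#define SKETCH_LINE_OF(offset) ((offset) / SKETCH_CACHELINE)

/* The first byte of cmd_size and the last byte of commit_rqs share a line. */
_Static_assert(
	SKETCH_LINE_OF(offsetof(struct sketch_host_template, cmd_size)) ==
	SKETCH_LINE_OF(offsetof(struct sketch_host_template, commit_rqs) +
		       sizeof(((struct sketch_host_template *)0)->commit_rqs) - 1),
	"hot fields must share one cache line");
```

A compile-time check like this keeps later additions to the structure from silently pushing a hot field onto a second line.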
Ming Lei
2a5a24aa83 scsi: blk-mq: Return budget token from .get_budget callback
SCSI uses a global atomic variable to track queue depth for each
LUN/request queue.

This doesn't scale well when there are lots of CPU cores and the disk is
very fast. It has been observed that IOPS is affected a lot by tracking
queue depth via sdev->device_busy in the I/O path.

Return budget token from .get_budget callback. The budget token can be
passed to driver so that we can replace the atomic variable with
sbitmap_queue and alleviate the scaling problems that way.

Link: https://lore.kernel.org/r/20210122023317.687987-9-ming.lei@redhat.com
Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:36:59 -05:00
Ming Lei
d022d18c04 scsi: blk-mq: Add callbacks for storing & retrieving budget token
Since SCSI is the only driver which requires dispatch budget, move the
token from struct request to struct scsi_cmnd.

Link: https://lore.kernel.org/r/20210122023317.687987-8-ming.lei@redhat.com
Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:36:59 -05:00
Ming Lei
2d13b1ea9f scsi: sbitmap: Add sbitmap_calculate_shift() helper
Move code for calculating default shift into a public helper which can be
used by SCSI.

Link: https://lore.kernel.org/r/20210122023317.687987-7-ming.lei@redhat.com
Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:36:59 -05:00
Ming Lei
cbb9950b41 scsi: sbitmap: Export sbitmap_weight
SCSI's .device_busy will be converted to sbitmap and sbitmap_weight is
needed. Export the helper.

The only existing user of sbitmap_weight() uses it to find out how many
bits are set and not cleared. Align sbitmap_weight() meaning with this
usage model.

Link: https://lore.kernel.org/r/20210122023317.687987-6-ming.lei@redhat.com
Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:36:59 -05:00
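The "set and not cleared" meaning of the weight mentioned above can be sketched with two masks per word. This is a hedged illustration, not the sbitmap code: the sketch_ names are hypothetical, and modelling each word as an allocated mask plus a pending-clear mask is an assumption made to show the counting rule.

```c
#include <stdint.h>

struct sketch_word {
	uint64_t word;    /* bits that have been allocated */
	uint64_t cleared; /* allocated bits whose release is still pending */
};

/* Count the set bits in `v` by clearing the lowest set bit each round. */
static unsigned int sketch_popcount(uint64_t v)
{
	unsigned int n = 0;

	for (; v; v &= v - 1)
		n++;
	return n;
}

/* Weight: bits that are set in `word` and not marked in `cleared`. */
unsigned int sketch_weight(const struct sketch_word *w)
{
	return sketch_popcount(w->word & ~w->cleared);
}
```

Under this rule, a word with four allocated bits of which two are pending release has a weight of two, which is the number of entries actually in use.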
Ming Lei
c548e62bcf scsi: sbitmap: Move allocation hint into sbitmap
Allocation hint should have belonged to sbitmap. Also, when sbitmap's depth
is high and there is no need to use multiple wakeup queues, user can
benefit from percpu allocation hint too.

Move allocation hint into sbitmap, then SCSI device queue can benefit from
allocation hint when converting to plain sbitmap.

Convert vhost/scsi.c to use sbitmap allocation with percpu alloc hint. This
is more efficient than the previous approach.

Link: https://lore.kernel.org/r/20210122023317.687987-5-ming.lei@redhat.com
Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Mike Christie <michael.christie@oracle.com>
Cc: virtualization@lists.linux-foundation.org
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:36:59 -05:00
Ming Lei
efe1f3a1d5 scsi: sbitmap: Maintain allocation round_robin in sbitmap
Currently the allocation round_robin info is maintained by sbitmap_queue.

However, bit allocation really belongs to sbitmap. Move it there.

Link: https://lore.kernel.org/r/20210122023317.687987-3-ming.lei@redhat.com
Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: virtualization@lists.linux-foundation.org
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:36:59 -05:00
Ming Lei
4ec5917903 scsi: sbitmap: Remove sbitmap_clear_bit_unlock
No one uses this helper any more, so kill it.

Link: https://lore.kernel.org/r/20210122023317.687987-2-ming.lei@redhat.com
Cc: Omar Sandoval <osandov@fb.com>
Cc: Kashyap Desai <kashyap.desai@broadcom.com>
Cc: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Cc: Ewan D. Milne <emilne@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Tested-by: Sumanesh Samanta <sumanesh.samanta@broadcom.com>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:36:59 -05:00
Adrian Hunter
f7733625ec scsi: ufs: Add exception event tracepoint
Currently, exception event status can be read from wExceptionEventStatus
attribute (sysfs file attributes/exception_event_status under the UFS host
controller device directory). Polling that attribute to track UFS exception
events is impractical, so add a tracepoint to track exception events for
testing and debugging purposes.

Note, by the time the exception event status is read, the exception event
may have cleared, so the value can be zero - see example below.

Note also, only enabled exception events can be reported. A subsequent
patch adds the ability for users to enable selected exception events via
debugfs.

Example with driver instrumented to enable all exception events:

  # echo 1 > /sys/kernel/debug/tracing/events/ufs/ufshcd_exception_event/enable

  ... do some I/O ...

  # cat /sys/kernel/debug/tracing/trace
  # tracer: nop
  #
  # entries-in-buffer/entries-written: 3/3   #P:5
  #
  #                                _-----=> irqs-off
  #                               / _----=> need-resched
  #                              | / _---=> hardirq/softirq
  #                              || / _--=> preempt-depth
  #                              ||| /     delay
  #           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
  #              | |         |   ||||      |         |
       kworker/2:2-173     [002] ....   731.486419: ufshcd_exception_event: 0000:00:12.5: status 0x0
       kworker/2:2-173     [002] ....   732.608918: ufshcd_exception_event: 0000:00:12.5: status 0x4
       kworker/2:2-173     [002] ....   732.609312: ufshcd_exception_event: 0000:00:12.5: status 0x4

Link: https://lore.kernel.org/r/20210209062437.6954-2-adrian.hunter@intel.com
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2021-03-04 17:36:58 -05:00
Jens Axboe
caf6912f3f swap: fix swapfile read/write offset
We're not factoring in the start of the file for where to write and
read the swapfile, which leads to very unfortunate side effects of
writing where we should not be...

Fixes: 48d15436fd ("mm: remove get_swap_bio")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-02 17:25:46 -07:00
Linus Torvalds
7a7fd0de4a Merge branch 'kmap-conversion-for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull kmap conversion updates from David Sterba:
 "This contains changes regarding kmap API use and, e.g., conversion from
  kmap_atomic to kmap_local_page.

  The API belongs to memory management but to save cross-tree
  dependency headaches we've agreed to take it through the btrfs tree
  because there are some trivial conversions possible, while the rest
  will need some time and getting the easy cases out of the way would be
  convenient.

  The changes can be grouped:

   - function exports, new helpers

   - new VM_BUG_ON for additional verification; it's been discussed if
     it should be VM_BUG_ON or BUG_ON, the former was chosen due to
     performance reasons

   - code replaced by relevant helpers"

[ This is an updated version of a request that originally came in during
  the merge window, but I asked for some updates:

    https://lore.kernel.org/lkml/cover.1614090658.git.dsterba@suse.com/

  which is why this got merged after the merge window closed.  - Linus ]

* 'kmap-conversion-for-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: use copy_highpage() instead of 2 kmaps()
  btrfs: use memcpy_[to|from]_page() and kmap_local_page()
  mm/highmem: Add VM_BUG_ON() to mem*_page() calls
  mm/highmem: Introduce memcpy_page(), memmove_page(), and memset_page()
  mm/highmem: Convert memcpy_[to|from]_page() to kmap_local_page()
  mm/highmem: Lift memcpy_[to|from]_page to core
2021-03-01 11:24:18 -08:00
Linus Torvalds
cd278456d4 Merge tag 'csky-for-linus-5.12-rc1' of git://github.com/c-sky/csky-linux
Pull arch/csky updates from Guo Ren:
 "Features:
   - add new memory layout 2.5G(user):1.5G(kernel)
   - add kmemleak support
   - reconstruct VDSO framework: add VDSO with GENERIC_GETTIMEOFDAY,
     GENERIC_TIME_VSYSCALL, HAVE_GENERIC_VDSO
   - add faulthandler_disabled() check
   - support (fix) swapon
   - add (fix) _PAGE_ACCESSED for default pgprot
   - abort uaccess retries upon fatal signal (from arm)

  Fixes and optimizations:
   - fix perf probe failure
   - fix show_regs doesn't contain regs->usp
   - remove custom asm/atomic.h implementation
   - fix barrier design
   - fix futex SMP implementation
   - fix asm/cmpxchg.h with correct ordering barrier
   - cleanup asm/spinlock.h
   - fix PTE global for 2.5:1.5 virtual memory
   - remove prologue of page fault handler in entry.S
   - fix TLB maintenance synchronization problem
   - add show_tlb for CPU_CK860 debug
   - fix FAULT_FLAG_XXX param for handle_mm_fault
   - fix update_mmu_cache called with user io mapping
   - fix do_page_fault parent irq status
   - fix a size determination in gpr_get()
   - pgtable.h: Coding convention
   - kprobe: Fix code in simulate without 'long'
   - fix pfn_valid error with wrong max_mapnr
   - use free_initmem_default() in free_initmem()
   - fix compile error"

* tag 'csky-for-linus-5.12-rc1' of git://github.com/c-sky/csky-linux: (30 commits)
  csky: Fixup compile error
  csky: use free_initmem_default() in free_initmem()
  csky: Fixup pfn_valid error with wrong max_mapnr
  csky: Add VDSO with GENERIC_GETTIMEOFDAY, GENERIC_TIME_VSYSCALL, HAVE_GENERIC_VDSO
  csky: kprobe: Fixup code in simulate without 'long'
  csky: Fixup swapon
  csky: pgtable.h: Coding convention
  csky: Fixup _PAGE_ACCESSED for default pgprot
  csky: remove unused including <linux/version.h>
  csky: Fix a size determination in gpr_get()
  csky: Reconstruct VDSO framework
  csky: mm: abort uaccess retries upon fatal signal
  csky: Sync riscv mm/fault.c for easy maintenance
  csky: Fixup do_page_fault parent irq status
  csky: Add faulthandler_disabled() check
  csky: Fixup update_mmu_cache called with user io mapping
  csky: Fixup FAULT_FLAG_XXX param for handle_mm_fault
  csky: Add show_tlb for CPU_CK860 debug
  csky: Fix TLB maintenance synchronization problem
  csky: Add kmemleak support
  ...
2021-02-28 12:06:45 -08:00
Linus Torvalds
0b311e34d5 Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
 "This is a few driver updates (iscsi, mpt3sas) that were still in the
  staging queue when the merge window opened (all committed on or before
  8 Feb) and some small bug fixes which came in during the merge window
  (all committed on 22 Feb)"

* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (30 commits)
  scsi: hpsa: Correct dev cmds outstanding for retried cmds
  scsi: sd: Fix Opal support
  scsi: target: tcmu: Fix memory leak caused by wrong uio usage
  scsi: target: tcmu: Move some functions without code change
  scsi: sd: sd_zbc: Don't pass GFP_NOIO to kvcalloc
  scsi: aic7xxx: Remove unused function pointer typedef ahc_bus_suspend/resume_t
  scsi: bnx2fc: Fix Kconfig warning & CNIC build errors
  scsi: ufs: Fix a duplicate dev quirk number
  scsi: aic79xx: Fix spelling of version
  scsi: target: core: Prevent underflow for service actions
  scsi: target: core: Add cmd length set before cmd complete
  scsi: iscsi: Drop session lock in iscsi_session_chkready()
  scsi: qla4xxx: Use iscsi_is_session_online()
  scsi: libiscsi: Reset max/exp cmdsn during recovery
  scsi: iscsi_tcp: Fix shost can_queue initialization
  scsi: libiscsi: Add helper to calculate max SCSI cmds per session
  scsi: libiscsi: Fix iSCSI host workq destruction
  scsi: libiscsi: Fix iscsi_task use after free()
  scsi: libiscsi: Drop taskqueuelock
  scsi: libiscsi: Fix iscsi_prep_scsi_cmd_pdu() error handling
  ...
2021-02-28 11:51:20 -08:00
Linus Torvalds
3ab6608e66 Merge tag 'block-5.12-2021-02-27' of git://git.kernel.dk/linux-block
Pull more block updates from Jens Axboe:
 "A few stragglers (and one due to me missing it originally), and fixes
  for changes in this merge window mostly. In particular:

   - blktrace cleanups (Chaitanya, Greg)

   - Kill dead blk_pm_* functions (Bart)

   - Fixes for the bio alloc changes (Christoph)

   - Fix for the partition changes (Christoph, Ming)

   - Fix for turning off iopoll with polled IO inflight (Jeffle)

   - nbd disconnect fix (Josef)

   - loop fsync error fix (Mauricio)

   - kyber update depth fix (Yang)

   - max_sectors alignment fix (Mikulas)

   - Add bio_max_segs helper (Matthew)"

* tag 'block-5.12-2021-02-27' of git://git.kernel.dk/linux-block: (21 commits)
  block: Add bio_max_segs
  blktrace: fix documentation for blk_fill_rw()
  block: memory allocations in bounce_clone_bio must not fail
  block: remove the gfp_mask argument to bounce_clone_bio
  block: fix bounce_clone_bio for passthrough bios
  block-crypto-fallback: use a bio_set for splitting bios
  block: fix logging on capacity change
  blk-settings: align max_sectors on "logical_block_size" boundary
  block: reopen the device in blkdev_reread_part
  block: don't skip empty device in in disk_uevent
  blktrace: remove debugfs file dentries from struct blk_trace
  nbd: handle device refs for DESTROY_ON_DISCONNECT properly
  kyber: introduce kyber_depth_updated()
  loop: fix I/O error on fsync() in detached loop devices
  block: fix potential IO hang when turning off io_poll
  block: get rid of the trace rq insert wrapper
  blktrace: fix blk_rq_merge documentation
  blktrace: fix blk_rq_issue documentation
  blktrace: add blk_fill_rwbs documentation comment
  block: remove superfluous param in blk_fill_rwbs()
  ...
2021-02-28 11:23:38 -08:00
Linus Torvalds
5695e51619 Merge tag 'io_uring-worker.v3-2021-02-25' of git://git.kernel.dk/linux-block
Pull io_uring thread rewrite from Jens Axboe:
 "This converts the io-wq workers to be forked off the tasks in question
  instead of being kernel threads that assume various bits of the
  original task identity.

  This kills > 400 lines of code from io_uring/io-wq, and it's the worst
  part of the code. We've had several bugs in this area, and the worry
  is always that we could be missing some pieces for file types doing
  unusual things (recent /dev/tty example comes to mind, userfaultfd
  reads installing file descriptors is another fun one... - both of
  which need special handling, and I bet it's not the last weird oddity
  we'll find).

  With these identical workers, we can have full confidence that we're
  never missing anything. That, in itself, is a huge win. Outside of
  that, it's also more efficient since we're not wasting space and code
  on tracking state, or switching between different states.

  I'm sure we're going to find little things to patch up after this
  series, but testing has been pretty thorough, from the usual
  regression suite to production. Any issue that may crop up should be
  manageable.

  There's also a nice series of further reductions we can do on top of
  this, but I wanted to get the meat of it out sooner rather than later.
  The general worry here isn't that it's fundamentally broken. Most of
  the little issues we've found over the last week have been related to
  just changes in how thread startup/exit is done, since that's the main
  difference between using kthreads and these kinds of threads. In fact,
  if all goes according to plan, I want to get this into the 5.10 and
  5.11 stable branches as well.

  That said, the changes outside of io_uring/io-wq are:

   - arch setup, simple one-liner to each arch copy_thread()
     implementation.

   - Removal of net and proc restrictions for io_uring, they are no
     longer needed or useful"

* tag 'io_uring-worker.v3-2021-02-25' of git://git.kernel.dk/linux-block: (30 commits)
  io-wq: remove now unused IO_WQ_BIT_ERROR
  io_uring: fix SQPOLL thread handling over exec
  io-wq: improve manager/worker handling over exec
  io_uring: ensure SQPOLL startup is triggered before error shutdown
  io-wq: make buffered file write hashed work map per-ctx
  io-wq: fix race around io_worker grabbing
  io-wq: fix races around manager/worker creation and task exit
  io_uring: ensure io-wq context is always destroyed for tasks
  arch: ensure parisc/powerpc handle PF_IO_WORKER in copy_thread()
  io_uring: cleanup ->user usage
  io-wq: remove nr_process accounting
  io_uring: flag new native workers with IORING_FEAT_NATIVE_WORKERS
  net: remove cmsg restriction from io_uring based send/recvmsg calls
  Revert "proc: don't allow async path resolution of /proc/self components"
  Revert "proc: don't allow async path resolution of /proc/thread-self components"
  io_uring: move SQPOLL thread io-wq forked worker
  io-wq: make io_wq_fork_thread() available to other users
  io-wq: only remove worker from free_list, if it was there
  io_uring: remove io_identity
  io_uring: remove any grabbing of context
  ...
2021-02-27 08:29:02 -08:00
Linus Torvalds
5ceabb6078 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull misc vfs updates from Al Viro:
 "Assorted stuff pile - no common topic here"

* 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  whack-a-mole: don't open-code iminor/imajor
  9p: fix misuse of sscanf() in v9fs_stat2inode()
  audit_alloc_mark(): don't open-code ERR_CAST()
  fs/inode.c: make inode_init_always() initialize i_ino to 0
  vfs: don't unnecessarily clone write access for writable fds
2021-02-27 08:07:12 -08:00
Matthew Wilcox (Oracle)
5f7136db82 block: Add bio_max_segs
It's often inconvenient to use BIO_MAX_PAGES due to min() requiring the
sign to be the same.  Introduce bio_max_segs() and change BIO_MAX_PAGES to
be unsigned to make it easier for the users.
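
The idea described above can be sketched in plain C. This is only an illustration, not the kernel's code: the value 256 for BIO_MAX_PAGES, the hypothetical min_uint() helper (standing in for the kernel's type-checked min() macro), and the body of bio_max_segs() are all assumptions made for this example.

```c
/* Assumed value, for illustration only. The U suffix makes the constant
 * unsigned, so it can be compared with unsigned counts without a cast. */
#define BIO_MAX_PAGES 256U

/* Hypothetical stand-in for the kernel's min(); both arguments share one
 * unsigned type, which is the property the commit message asks for. */
static inline unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/* Clamp a requested segment count to the per-bio maximum. */
static inline unsigned int bio_max_segs(unsigned int nr_segs)
{
	return min_uint(nr_segs, BIO_MAX_PAGES);
}
```

With a helper of this shape, a caller holding an unsigned page count can write bio_max_segs(nr_pages) instead of open-coding a min() against the constant with a cast.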

Reviewed-by: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-02-26 15:49:51 -07:00
Linus Torvalds
a3905af5be Merge tag 'for-linus' of git://github.com/openrisc/linux
Pull OpenRISC updates from Stafford Horne:

 - Update for Litex SoC controller to support wider width registers as
   well as reset.

 - Refactor SMP code to use device tree to define possible cpus.

 - Update build including generating vmlinux.bin

* tag 'for-linus' of git://github.com/openrisc/linux:
  openrisc: Use devicetree to determine present cpus
  drivers/soc/litex: Add restart handler
  openrisc: add arch/openrisc/Kbuild
  drivers/soc/litex: make 'litex_[set|get]_reg()' methods private
  drivers/soc/litex: support 32-bit subregisters, 64-bit CPUs
  drivers/soc/litex: s/LITEX_REG_SIZE/LITEX_SUBREG_ALIGN/g
  drivers/soc/litex: separate MMIO from subregister offset calculation
  drivers/soc/litex: move generic accessors to litex.h
  openrisc: restart: Call common handlers before hanging
  openrisc: Add vmlinux.bin target
2021-02-26 14:16:06 -08:00
Linus Torvalds
e7270e47a0 Merge tag 's390-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull more s390 updates from Vasily Gorbik:

 - Fix physical vs virtual confusion in some basic mm macros and
   routines. Caused by __pa == __va on s390 currently.

 - Get rid of on-stack cpu masks.

 - Add support for complete CPU counter set extraction.

 - Add arch_irq_work_raise implementation.

 - virtio-ccw revision and opcode fixes.

* tag 's390-5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/cpumf: Add support for complete counter set extraction
  virtio/s390: implement virtio-ccw revision 2 correctly
  s390/smp: implement arch_irq_work_raise()
  s390/topology: move cpumasks away from stack
  s390/smp: smp_emergency_stop() - move cpumask away from stack
  s390/smp: __smp_rescan_cpus() - move cpumask away from stack
  s390/smp: consolidate locking for smp_rescan()
  s390/mm: fix phys vs virt confusion in vmem_*() functions family
  s390/mm: fix phys vs virt confusion in pgtable allocation routines
  s390/mm: fix invalid __pa() usage in pfn_pXd() macros
  s390/mm: make pXd_deref() macros return a pointer
  s390/opcodes: rename selhhhr to selfhr
2021-02-26 14:12:32 -08:00
Linus Torvalds
ef9856a734 Merge branch 'stable/for-linus-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb
Pull swiotlb updates from Konrad Rzeszutek Wilk:
 "Two memory encryption related patches (SWIOTLB is enabled by default
  for AMD-SEV):

   - Add support for alignment so that NVME can properly work

   - Keep track of requested DMA buffers length, as underlying hardware
     devices can trip SWIOTLB to bounce too much and crash the kernel

  And a tiny fix to use proper APIs in drivers"

* 'stable/for-linus-5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
  swiotlb: Validate bounce size in the sync/unmap path
  nvme-pci: set min_align_mask
  swiotlb: respect min_align_mask
  swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single
  swiotlb: refactor swiotlb_tbl_map_single
  swiotlb: clean up swiotlb_tbl_unmap_single
  swiotlb: factor out a nr_slots helper
  swiotlb: factor out an io_tlb_offset helper
  swiotlb: add a IO_TLB_SIZE define
  driver core: add a min_align_mask field to struct device_dma_parameters
  sdhci: stop poking into swiotlb internals
2021-02-26 13:59:32 -08:00
Linus Torvalds
fecfd01539 Merge tag 'leds-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds
Pull LED updates from Pavel Machek:
 "Besides the usual fixes and new drivers, we are changing CLASS_FLASH
  to return success to make it easier to work with V4L2 stuff disabled,
  and we are getting rid of an enum that should have been a plain integer
  a long time ago. I'm slightly nervous about potential warnings, but it
  needed to be fixed at some point"

* tag 'leds-5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/pavel/linux-leds:
  leds: lp50xx: Get rid of redundant explicit casting
  leds: lp50xx: Update headers block to reflect reality
  leds: lp50xx: Get rid of redundant check in lp50xx_enable_disable()
  leds: lp50xx: Reduce level of dereferences
  leds: lp50xx: Switch to new style i2c-driver probe function
  leds: lp50xx: Don't spam logs when probe is deferred
  leds: apu: extend support for PC Engines APU1 with newer firmware
  leds: flash: Fix multicolor no-ops registration by return 0
  leds: flash: Add flash registration with undefined CONFIG_LEDS_CLASS_FLASH
  leds: lgm: Add LED controller driver for LGM SoC
  dt-bindings: leds: Add bindings for Intel LGM SoC
  leds: led-core: Get rid of enum led_brightness
  leds: gpio: Set max brightness to 1
  leds: lm3533: Switch to using the new API kobj_to_dev()
  leds: ss4200: simplify the return expression of register_nasgpio_led()
  leds: Use DEVICE_ATTR_{RW, RO, WO} macros
2021-02-26 13:56:40 -08:00
Linus Torvalds
8b83369ddc Merge tag 'riscv-for-linus-5.12-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V updates from Palmer Dabbelt:
 "A handful of new RISC-V related patches for this merge window:

   - A check to ensure drivers are properly using uaccess. This isn't
     manifesting with any of the drivers I'm currently using, but may
     catch errors in new drivers.

   - Some preliminary support for the FU740, along with the HiFive
     Unleashed it will appear on.

   - NUMA support for RISC-V, which involves making the arm64 code
     generic.

   - Support for kasan on the vmalloc region.

   - A handful of new drivers for the Kendryte K210, along with the DT
     plumbing required to boot on a handful of K210-based boards.

   - Support for allocating ASIDs.

   - Preliminary support for kernels larger than 128MiB.

   - Various other improvements to our KASAN support, including the
     utilization of huge pages when allocating the KASAN regions.

  We may have already found a bug with the KASAN_VMALLOC code, but it's
  passing my tests. There's a fix in the works, but that will probably
  miss the merge window."

* tag 'riscv-for-linus-5.12-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (75 commits)
  riscv: Improve kasan population by using hugepages when possible
  riscv: Improve kasan population function
  riscv: Use KASAN_SHADOW_INIT define for kasan memory initialization
  riscv: Improve kasan definitions
  riscv: Get rid of MAX_EARLY_MAPPING_SIZE
  soc: canaan: Sort the Makefile alphabetically
  riscv: Disable KSAN_SANITIZE for vDSO
  riscv: Remove unnecessary declaration
  riscv: Add Canaan Kendryte K210 SD card defconfig
  riscv: Update Canaan Kendryte K210 defconfig
  riscv: Add Kendryte KD233 board device tree
  riscv: Add SiPeed MAIXDUINO board device tree
  riscv: Add SiPeed MAIX GO board device tree
  riscv: Add SiPeed MAIX DOCK board device tree
  riscv: Add SiPeed MAIX BiT board device tree
  riscv: Update Canaan Kendryte K210 device tree
  dt-bindings: add resets property to dw-apb-timer
  dt-bindings: fix sifive gpio properties
  dt-bindings: update sifive uart compatible string
  dt-bindings: update sifive clint compatible string
  ...
2021-02-26 10:28:35 -08:00
Linus Torvalds
8f47d753d4 Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
 "The big one is a fix for the VHE enabling path during early boot,
  where the code enabling the MMU wasn't necessarily in the identity map
  of the new page-tables, resulting in a consistent crash with 64k
  pages. In fixing that, we noticed some missing barriers too, so we
  added those for the sake of architectural compliance.

  Other than that, just the usual merge window trickle. There'll be more
  to come, too.

  Summary:

   - Fix lockdep false alarm on resume-from-cpuidle path

   - Fix memory leak in kexec_file

   - Fix module linker script to work with GDB

   - Fix error code when trying to use uprobes with AArch32 instructions

   - Fix late VHE enabling with 64k pages

   - Add missing ISBs after TLB invalidation

   - Fix seccomp when tracing syscall -1

   - Fix stacktrace return code at end of stack

   - Fix inconsistent whitespace for pointer return values

   - Fix compiler warnings when building with W=1"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  arm64: stacktrace: Report when we reach the end of the stack
  arm64: ptrace: Fix seccomp of traced syscall -1 (NO_SYSCALL)
  arm64: Add missing ISB after invalidating TLB in enter_vhe
  arm64: Add missing ISB after invalidating TLB in __primary_switch
  arm64: VHE: Enable EL2 MMU from the idmap
  KVM: arm64: make the hyp vector table entries local
  arm64/mm: Fixed some coding style issues
  arm64: uprobe: Return EOPNOTSUPP for AARCH32 instruction probing
  kexec: move machine_kexec_post_load() to public interface
  arm64 module: set plt* section addresses to 0x0
  arm64: kexec_file: fix memory leakage in create_dtb() when fdt_open_into() fails
  arm64: spectre: Prevent lockdep splat on v4 mitigation enable path
2021-02-26 10:19:03 -08:00
Linus Torvalds
8b1e2c50bc Merge tag 'trace-v5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing fixes from Steven Rostedt:
 "Two fixes:

   - Fix an unsafe printf string usage in a kmem trace event

   - Fix spelling in output from the latency-collector tool"

* tag 'trace-v5.12-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/tools: fix a couple of spelling mistakes
  mm, tracing: Fix kmem_cache_free trace event to not print stale pointers
2021-02-26 10:14:18 -08:00
Linus Torvalds
2bd3f4eeb3 Merge tag 'orphan-handling-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull orphan handling fix from Kees Cook:
 "Another case of bogus .eh_frame emission was noticed under
  CONFIG_GCOV_KERNEL=y.

  Summary:

   - Define SANITIZER_DISCARDS with CONFIG_GCOV_KERNEL=y (Nathan
     Chancellor)"

* tag 'orphan-handling-v5.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  vmlinux.lds.h: Define SANITIZER_DISCARDS with CONFIG_GCOV_KERNEL=y
2021-02-26 10:12:19 -08:00
Linus Torvalds
5c2e7a0af2 Merge tag 'for-linus-5.12b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull more xen updates from Juergen Gross:

 - A small series for Xen event channels adding some sysfs nodes for per
   pv-device settings and statistics, and two fixes of theoretical
   problems.

 - Two minor fixes (one for an unlikely error path, one for a comment).

* tag 'for-linus-5.12b-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen-front-pgdir-shbuf: don't record wrong grant handle upon error
  xen: Replace lkml.org links with lore
  xen/evtchn: use READ/WRITE_ONCE() for accessing ring indices
  xen/evtchn: use smp barriers for user event ring
  xen/events: add per-xenbus device event statistics and settings
2021-02-26 10:04:45 -08:00
Linus Torvalds
d94d14008e Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull more KVM updates from Paolo Bonzini:
 "x86:

   - take into account HVA before retrying on MMU notifier race

   - fixes for nested AMD guests without NPT

   - allow INVPCID in guest without PCID

   - disable PML in hardware when not in use

   - MMU code cleanups:

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (28 commits)
  KVM: SVM: Fix nested VM-Exit on #GP interception handling
  KVM: vmx/pmu: Fix dummy check if lbr_desc->event is created
  KVM: x86/mmu: Consider the hva in mmu_notifier retry
  KVM: x86/mmu: Skip mmu_notifier check when handling MMIO page fault
  KVM: Documentation: rectify rst markup in KVM_GET_SUPPORTED_HV_CPUID
  KVM: nSVM: prepare guest save area while is_guest_mode is true
  KVM: x86/mmu: Remove a variety of unnecessary exports
  KVM: x86: Fold "write-protect large" use case into generic write-protect
  KVM: x86/mmu: Don't set dirty bits when disabling dirty logging w/ PML
  KVM: VMX: Dynamically enable/disable PML based on memslot dirty logging
  KVM: x86: Further clarify the logic and comments for toggling log dirty
  KVM: x86: Move MMU's PML logic to common code
  KVM: x86/mmu: Make dirty log size hook (PML) a value, not a function
  KVM: x86/mmu: Expand on the comment in kvm_vcpu_ad_need_write_protect()
  KVM: nVMX: Disable PML in hardware when running L2
  KVM: x86/mmu: Consult max mapping level when zapping collapsible SPTEs
  KVM: x86/mmu: Pass the memslot to the rmap callbacks
  KVM: x86/mmu: Split out max mapping level calculation to helper
  KVM: x86/mmu: Expand collapsible SPTE zap for TDP MMU to ZONE_DEVICE and HugeTLB pages
  KVM: nVMX: no need to undo inject_page_fault change on nested vmexit
  ...
2021-02-26 10:00:12 -08:00
Linus Torvalds
245137cdf0 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "118 patches:

   - The rest of MM.

     Includes kfence - another runtime memory validator. Not as thorough
     as KASAN, but it has unmeasurable overhead and is intended to be
     usable in production builds.

   - Everything else

  Subsystems affected by this patch series: alpha, procfs, sysctl,
  misc, core-kernel, MAINTAINERS, lib, bitops, checkpatch, init,
  coredump, seq_file, gdb, ubsan, initramfs, and mm (thp, cma,
  vmstat, memory-hotplug, mlock, rmap, zswap, zsmalloc, cleanups,
  kfence, kasan2, and pagemap2)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (118 commits)
  MIPS: make userspace mapping young by default
  initramfs: panic with memory information
  ubsan: remove overflow checks
  kgdb: fix to kill breakpoints on initmem after boot
  scripts/gdb: fix list_for_each
  x86: fix seq_file iteration for pat/memtype.c
  seq_file: document how per-entry resources are managed.
  fs/coredump: use kmap_local_page()
  init/Kconfig: fix a typo in CC_VERSION_TEXT help text
  init: clean up early_param_on_off() macro
  init/version.c: remove Version_<LINUX_VERSION_CODE> symbol
  checkpatch: do not apply "initialise globals to 0" check to BPF progs
  checkpatch: don't warn about colon termination in linker scripts
  checkpatch: add kmalloc_array_node to unnecessary OOM message check
  checkpatch: add warning for avoiding .L prefix symbols in assembly files
  checkpatch: improve TYPECAST_INT_CONSTANT test message
  checkpatch: prefer ftrace over function entry/exit printks
  checkpatch: trivial style fixes
  checkpatch: ignore warning designated initializers using NR_CPUS
  checkpatch: improve blank line after declaration test
  ...
2021-02-26 09:50:09 -08:00
Huang Pei
f685a533a7 MIPS: make userspace mapping young by default
MIPS page fault path (except huge page) takes 3 exceptions (1 TLB Miss + 2
TLB Invalid), but the second TLB Invalid exception is just triggered by
__update_tlb from do_page_fault writing tlb without _PAGE_VALID set.  With
this patch, user space mapping prot is made young by default (with both
_PAGE_VALID and _PAGE_YOUNG set), and it only takes 1 TLB Miss + 1 TLB
Invalid exception.

Remove pte_sw_mkyoung without polluting MM code and make page fault delay
of MIPS on par with other architectures.

Link: https://lkml.kernel.org/r/20210204013942.8398-1-huangpei@loongson.cn
Signed-off-by: Huang Pei <huangpei@loongson.cn>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Acked-by: <huangpei@loongson.cn>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: <ambrosehua@gmail.com>
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Paul Burton <paulburton@kernel.org>
Cc: Li Xuefeng <lixuefeng@loongson.cn>
Cc: Yang Tiezhu <yangtiezhu@loongson.cn>
Cc: Gao Juxin <gaojuxin@loongson.cn>
Cc: Fuxin Zhang <zhangfx@lemote.com>
Cc: Huacai Chen <chenhc@lemote.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26 09:41:05 -08:00
Sumit Garg
d54ce6158e kgdb: fix to kill breakpoints on initmem after boot
Currently, breakpoints in the kernel .init.text section are not handled
correctly: they can still be removed even after the corresponding pages
have been freed.

Fix this by killing .init.text section breakpoints just before the initmem
pages are freed.

Doug: "HW breakpoints aren't handled by this patch but it's probably
not such a big deal".

Link: https://lkml.kernel.org/r/20210224081652.587785-1-sumit.garg@linaro.org
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Suggested-by: Doug Anderson <dianders@chromium.org>
Acked-by: Doug Anderson <dianders@chromium.org>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Tested-by: Daniel Thompson <daniel.thompson@linaro.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26 09:41:05 -08:00
Masahiro Yamada
a5a673f731 init: clean up early_param_on_off() macro
Use early_param() to define early_param_on_off().

Link: https://lkml.kernel.org/r/20210201041532.4025025-1-masahiroy@kernel.org
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Johan Hovold <johan@kernel.org>
Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Joe Perches <joe@perches.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26 09:41:05 -08:00
Geert Uytterhoeven
4945cca232 include/linux/bitops.h: spelling s/synomyn/synonym/
Fix a misspelling of "synonym".

Link: https://lkml.kernel.org/r/20210108105305.2028120-1-geert+renesas@glider.be
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-02-26 09:41:04 -08:00