linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-08 00:29:36 -04:00

Author	SHA1	Message	Date
Gabriel Krisman Bertazi	7e7cd796f2	scsi: iscsi: Fix deadlock on recovery path during GFP_IO reclaim iSCSI suffers from a deadlock in case a management command submitted via the netlink socket sleeps on an allocation while holding the rx_queue_mutex if that allocation causes a memory reclaim that writebacks to a failed iSCSI device. The recovery procedure can never make progress to recover the failed disk or abort outstanding IO operations to complete the reclaim (since rx_queue_mutex is locked), thus locking the system. Nevertheless, just marking all allocations under rx_queue_mutex as GFP_NOIO (or locking the userspace process with something like PF_MEMALLOC_NOIO) is not enough, since the iSCSI command code relies on other subsystems that try to grab locked mutexes, whose threads are GFP_IO, leading to the same deadlock. One instance where this situation can be observed is in the backtraces below, stitched from multiple bugs reports, involving the kobj uevent sent when a session is created. The root of the problem is not the fact that iSCSI does GFP_IO allocations, that is acceptable. The actual problem is that rx_queue_mutex has a very large granularity, covering every unrelated netlink command execution at the same time as the error recovery path. The proposed fix leverages the recently added mechanism to stop failed connections from the kernel, by enabling it to execute even though a management command from the netlink socket is being run (rx_queue_mutex is held), provided that the command is known to be safe. It splits the rx_queue_mutex in two mutexes, one protecting from concurrent command execution from the netlink socket, and one protecting stop_conn from racing with other connection management operations that might conflict with it. It is not very pretty, but it is the simplest way to resolve the deadlock. I considered making it a lock per connection, but some external mutex would still be needed to deal with iscsi_if_destroy_conn. The patch was tested by forcing a memory shrinker (unrelated, but used bufio/dm-verity) to reclaim iSCSI pages every time ISCSI_UEVENT_CREATE_SESSION happens, which is reasonable to simulate reclaims that might happen with GFP_KERNEL on that path. Then, a faulty hung target causes a connection to fail during intensive IO, at the same time a new session is added by iscsid. The following stacktraces are stiches from several bug reports, showing a case where the deadlock can happen. iSCSI-write holding: rx_queue_mutex waiting: uevent_sock_mutex kobject_uevent_env+0x1bd/0x419 kobject_uevent+0xb/0xd device_add+0x48a/0x678 scsi_add_host_with_dma+0xc5/0x22d iscsi_host_add+0x53/0x55 iscsi_sw_tcp_session_create+0xa6/0x129 iscsi_if_rx+0x100/0x1247 netlink_unicast+0x213/0x4f0 netlink_sendmsg+0x230/0x3c0 iscsi_fail iscsi_conn_failure waiting: rx_queue_mutex schedule_preempt_disabled+0x325/0x734 __mutex_lock_slowpath+0x18b/0x230 mutex_lock+0x22/0x40 iscsi_conn_failure+0x42/0x149 worker_thread+0x24a/0xbc0 EventManager_ holding: uevent_sock_mutex waiting: dm_bufio_client->lock dm_bufio_lock+0xe/0x10 shrink+0x34/0xf7 shrink_slab+0x177/0x5d0 do_try_to_free_pages+0x129/0x470 try_to_free_mem_cgroup_pages+0x14f/0x210 memcg_kmem_newpage_charge+0xa6d/0x13b0 __alloc_pages_nodemask+0x4a3/0x1a70 fallback_alloc+0x1b2/0x36c __kmalloc_node_track_caller+0xb9/0x10d0 __alloc_skb+0x83/0x2f0 kobject_uevent_env+0x26b/0x419 dm_kobject_uevent+0x70/0x79 dev_suspend+0x1a9/0x1e7 ctl_ioctl+0x3e9/0x411 dm_ctl_ioctl+0x13/0x17 do_vfs_ioctl+0xb3/0x460 SyS_ioctl+0x5e/0x90 MemcgReclaimerD" holding: dm_bufio_client->lock waiting: stuck io to finish (needs iscsi_fail thread to progress) schedule at ffffffffbd603618 io_schedule at ffffffffbd603ba4 do_io_schedule at ffffffffbdaf0d94 __wait_on_bit at ffffffffbd6008a6 out_of_line_wait_on_bit at ffffffffbd600960 wait_on_bit.constprop.10 at ffffffffbdaf0f17 __make_buffer_clean at ffffffffbdaf18ba __cleanup_old_buffer at ffffffffbdaf192f shrink at ffffffffbdaf19fd do_shrink_slab at ffffffffbd6ec000 shrink_slab at ffffffffbd6ec24a do_try_to_free_pages at ffffffffbd6eda09 try_to_free_mem_cgroup_pages at ffffffffbd6ede7e mem_cgroup_resize_limit at ffffffffbd7024c0 mem_cgroup_write at ffffffffbd703149 cgroup_file_write at ffffffffbd6d9c6e sys_write at ffffffffbd6662ea system_call_fastpath at ffffffffbdbc34a2 Link: https://lore.kernel.org/r/20200520022959.1912856-1-krisman@collabora.com Reported-by: Khazhismel Kumykov <khazhy@google.com> Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-26 21:20:24 -04:00
Stanley Chu	51dd905bd2	scsi: ufs: Fix WriteBooster flush during runtime suspend Currently UFS host driver promises VCC supply if UFS device needs to do WriteBooster flush during runtime suspend. However the UFS specification mentions: "While the flushing operation is in progress, the device is in Active power mode." Therefore UFS host driver needs to promise more: Keep UFS device as "Active power mode", otherwise UFS device shall not do any flush if device enters Sleep or PowerDown power mode. Similarly, the same promises shall be applied if device needs urgent BKOP during runtime suspend. Fix this by not changing device power mode if WriteBooster flush or urgent BKOP is required in ufshcd_suspend(). Now, if device finishes its job but is not resumed for a very long time, system will have unnecessary power drain because VCC is still supplied. A method to re-check the threshold of keeping VCC supply is required to fix the power drain. However, the threshold re-check needs to re-activate the link first because the decision depends on the latest device status. Also introduce a delayed work to force runtime resume after a certain delay during runtime suspend. This makes threshold re-check happen natually in the entry of the next runtime-suspend. The device can continue its WriteBooster flush or urgent BKOP jobs soon after resumed if device has no upcoming requests and link enters hibern8 state either by Auto-Hibern8 or hibern8 during clk-gating scheme. This solution not only prevents power drain but also makes as much use of time as possible for device's background jobs. Link: https://lore.kernel.org/r/20200522083212.4008-5-stanley.chu@mediatek.com Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-26 21:14:08 -04:00
Stanley Chu	e31011ab37	scsi: ufs: Fix index of attributes query for WriteBooster feature For WriteBooster feature related attributes, the index used by query shall be LUN ID if LU Dedicated buffer mode is enabled. Link: https://lore.kernel.org/r/20200522083212.4008-4-stanley.chu@mediatek.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-26 21:14:07 -04:00
Stanley Chu	c7cee3e746	scsi: ufs: Allow WriteBooster on UFS 2.2 devices According to the UFS specification, WriteBooster is officially supported by UFS 2.2. Since UFS 2.2 specification has been finalized in JEDEC and such devices have also showed up in the market, modify the checking rule for ufshcd_wb_probe() to allow these devices to enable WriteBooster. Link: https://lore.kernel.org/r/20200522083212.4008-3-stanley.chu@mediatek.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-26 21:14:06 -04:00
Stanley Chu	3a66ae512b	scsi: ufs: Remove unnecessary memset for dev_info The whole UFS host instance has been zero-initialized by scsi_host_alloc(), thus UFS driver does not need to clear "dev_info" member specifically in ufshcd_device_params_init(). Simply remove the unnecessary code. Link: https://lore.kernel.org/r/20200522083212.4008-2-stanley.chu@mediatek.com Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-26 21:14:05 -04:00
Jeffrey Hugo	3be60b564d	scsi: ufs-qcom: Fix scheduling while atomic issue ufs_qcom_dump_dbg_regs() uses usleep_range, a sleeping function, but can be called from atomic context in the following flow: ufshcd_intr -> ufshcd_sl_intr -> ufshcd_check_errors -> ufshcd_print_host_regs -> ufshcd_vops_dbg_register_dump -> ufs_qcom_dump_dbg_regs This causes a boot crash on the Lenovo Miix 630 when the interrupt is handled on the idle thread. Fix the issue by switching to udelay(). Link: https://lore.kernel.org/r/20200525204125.46171-1-jeffrey.l.hugo@gmail.com Fixes: `9c46b86762` ("scsi: ufs-qcom: dump additional testbus registers") Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Signed-off-by: Jeffrey Hugo <jeffrey.l.hugo@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-26 20:05:02 -04:00
Suganath Prabu S	f56577e8c7	scsi: mpt3sas: Fix reply queue count in non RDPQ mode For non RDPQ mode, the driver allocates a single contiguous block of memory pool for all reply descriptor post queues and passes down a single address in the ReplyDescriptorPostQueueAddress field of the IOC Init Request Message to the firmware. So reply_post queue will have only one entry which holds the address of this single contiguous block of memory pool. While allocating the reply descriptor post queue pool, driver should loop only once in non-RDPQ mode. But the driver is looping for ioc->reply_queue_count number of times even though reply_post queue's queue depth is only one in non-RDPQ mode. This leads to 'BUG: KASAN: use-after-free in base_alloc_rdpq_dma_pool'. The fix is to loop only once while allocating memory for the reply descriptor post queue in non-RDPQ mode Fixes: `8012209eb2` ("scsi: mpt3sas: Handle RDPQ DMA allocation in same 4G region") Link: https://lore.kernel.org/r/20200522103558.5710-1-suganath-prabu.subramani@broadcom.com Reported-by: Tomas Henzl <thenzl@redhat.com> Reviewed-by: Tomas Henzl <thenzl@redhat.com> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-26 19:57:59 -04:00
Xiyu Yang	7217e6e694	scsi: lpfc: Fix lpfc_nodelist leak when processing unsolicited event In order to create or activate a new node, lpfc_els_unsol_buffer() invokes lpfc_nlp_init() or lpfc_enable_node() or lpfc_nlp_get(), all of them will return a reference of the specified lpfc_nodelist object to "ndlp" with increased refcnt. When lpfc_els_unsol_buffer() returns, local variable "ndlp" becomes invalid, so the refcount should be decreased to keep refcount balanced. The reference counting issue happens in one exception handling path of lpfc_els_unsol_buffer(). When "ndlp" in DEV_LOSS, the function forgets to decrease the refcnt increased by lpfc_nlp_init() or lpfc_enable_node() or lpfc_nlp_get(), causing a refcnt leak. Fix this issue by calling lpfc_nlp_put() when "ndlp" in DEV_LOSS. Link: https://lore.kernel.org/r/1590416184-52592-1-git-send-email-xiyuyang19@fudan.edu.cn Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: James Smart <james.smart@broadcom.com> Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-26 15:55:14 -04:00
Dan Carpenter	9d7464b188	scsi: target: tcmu: Fix a use after free in tcmu_check_expired_queue_cmd() The pr_debug() dereferences "cmd" after we already freed it by calling tcmu_free_cmd(cmd). The debug printk needs to be done earlier. Link: https://lore.kernel.org/r/20200523101129.GB98132@mwanda Fixes: `61fb248221` ("scsi: target: tcmu: Userspace must not complete queued commands") Reviewed-by: Mike Christie <mchristi@redhat.com> Reviewed-by: David Disseldorp <ddiss@suse.de> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-26 15:54:39 -04:00
Sudhakar Panneerselvam	5ae6a6a915	scsi: vhost: Notify TCM about the maximum sg entries supported per command vhost-scsi pre-allocates the maximum sg entries per command and if a command requires more than VHOST_SCSI_PREALLOC_SGLS entries, then that command is failed by it. This patch lets vhost communicate the max sg limit when it registers vhost_scsi_ops with TCM. With this change, TCM would report the max sg entries through "Block Limits" VPD page which will be typically queried by the SCSI initiator during device discovery. By knowing this limit, the initiator could ensure the maximum transfer length is less than or equal to what is reported by vhost-scsi. Link: https://lore.kernel.org/r/1590166317-953-1-git-send-email-sudhakar.panneerselvam@oracle.com Cc: Michael S. Tsirkin <mst@redhat.com> Cc: Jason Wang <jasowang@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Sudhakar Panneerselvam <sudhakar.panneerselvam@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-26 15:52:08 -04:00
Daniel Wagner	ac988c4936	scsi: qla2xxx: Remove return value from qla_nvme_ls() The function always returns QLA_SUCCESS and the caller qla2x00_start_sp() doesn't even evalute the return value. So there is no point in returning a status. Link: https://lore.kernel.org/r/20200520130819.90625-1-dwagner@suse.de Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-26 15:48:33 -04:00
Bart Van Assche	ce9a9321c1	scsi: qla2xxx: Remove an unused function This was detected by building the qla2xxx driver with clang. See also commit `a9083016a5` ("[SCSI] qla2xxx: Add ISP82XX support"). Link: https://lore.kernel.org/r/20200520040738.1017-1-bvanassche@acm.org Cc: Arun Easi <aeasi@marvell.com> Cc: Nilesh Javali <njavali@marvell.com> Cc: Himanshu Madhani <himanshu.madhani@oracle.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Daniel Wagner <dwagner@suse.de> Cc: Martin Wilck <mwilck@suse.com> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-26 15:47:18 -04:00
Bob Liu	3ce419662d	scsi: iscsi: Register sysfs for iscsi workqueue This patch enables setting cpu affinity through "cpumask" for iscsi workqueues (iscsi_q_xx and iscsi_eh), so as to get performance isolation. The max number of active worker was changed form 1 to 2, because "cpumask" of ordered workqueue isn't allowed to change. Link: https://lore.kernel.org/r/20200505011908.15538-1-bob.liu@oracle.com Reviewed-by: Lee Duncan <lduncan@suse.com> Signed-off-by: Bob Liu <bob.liu@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-26 15:40:30 -04:00
Douglas Gilbert	b6ff8ca733	scsi: scsi_debug: Parser tables and code interaction This patch is in response to a static analyser report from Dan Carpenter titled: "[bug report] scsi: scsi_debug: Add per_host_store option". This code may not clear the static analyzer reports, but may shed light on why they occur. Amongst other things this driver has a table driven SCSI command parser which also involves some C code. There are some invariants between the table entries and the corresponding C code (i.e. the resp_*() functions) that, if broken, may lead to a NULL dereference. And the report is valid, at least in the case of the PRE-FETCH command. Alas, that is not one of the cases that the static analyzer reported. In this particular corner case: when the fake_rw flag is set and the table entry for a "store"-accessing command does not have the required F_FAKE_RW flag set, do the following. Call BUG_ON() in the devip2sip() very close to a comment block explaining why it was called and how to fix it. checkpatch.pl complains about the BUG_ON() but there is no reasonable remedial action that can be taken at run time. This change allows the code reported by the static analyzer to be simplified. Comments were also added to the table flags (e.g. F_FAKE_RW) so developers who add commands might be more inclined to use them (properly). Link: https://lore.kernel.org/r/20200513013943.25285-1-dgilbert@interlog.com Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 22:25:52 -04:00
Ye Bin	840e1b55bb	scsi: core: Refactor scsi_mq_setup_tags function shost->tag_set is used too many times, introduce temporary parameter tag_set instead of &shost->tag_set. Link: https://lore.kernel.org/r/20200518074732.39679-1-yebin10@huawei.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:55:15 -04:00
Ye Bin	4dea170f4f	scsi: core: Fix incorrect usage of shost_for_each_device shost_for_each_device(sdev, shost) \ for ((sdev) = __scsi_iterate_devices((shost), NULL); \ (sdev); \ (sdev) = __scsi_iterate_devices((shost), (sdev))) When terminating shost_for_each_device() iteration with break or return, scsi_device_put() should be used to prevent stale scsi device references from being left behind. Link: https://lore.kernel.org/r/20200518074420.39275-1-yebin10@huawei.com Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Ye Bin <yebin10@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:54:17 -04:00
Bart Van Assche	7ffa5b9397	scsi: qla2xxx: Fix endianness annotations in source files Fix all endianness complaints reported by sparse (C=2) without affecting the behavior of the code on little endian CPUs. Link: https://lore.kernel.org/r/20200518211712.11395-16-bvanassche@acm.org Cc: Nilesh Javali <njavali@marvell.com> Cc: Quinn Tran <qutran@marvell.com> Cc: Martin Wilck <mwilck@suse.com> Cc: Daniel Wagner <dwagner@suse.de> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:43:28 -04:00
Bart Van Assche	21038b0900	scsi: qla2xxx: Fix endianness annotations in header files Annotate members of FC protocol and firmware dump data structures as big endian. Annotate members of RISC control structures as little endian. Annotate mailbox registers as little endian. Annotate the mb[] arrays as CPU-endian because communication of the mb[] values with the hardware happens through the readw() and writew() functions. readw() converts from __le16 to u16 and writew() converts from u16 to __le16. Annotate 'handles' as CPU-endian because for the firmware these are opaque values. Link: https://lore.kernel.org/r/20200518211712.11395-15-bvanassche@acm.org CC: Hannes Reinecke <hare@suse.de> Cc: Nilesh Javali <njavali@marvell.com> Cc: Quinn Tran <qutran@marvell.com> Cc: Martin Wilck <mwilck@suse.com> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:43:24 -04:00
Bart Van Assche	2a4b684ab0	scsi: qla2xxx: Use make_handle() instead of open-coding it Link: https://lore.kernel.org/r/20200518211712.11395-14-bvanassche@acm.org Cc: Arun Easi <aeasi@marvell.com> Cc: Nilesh Javali <njavali@marvell.com> Cc: Martin Wilck <mwilck@suse.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:43:20 -04:00
Bart Van Assche	ab053c09ee	scsi: qla2xxx: Cast explicitly to uint16_t / uint32_t Casting a pointer to void * and relying on an implicit cast from void * to uint16_t or uint32_t suppresses sparse warnings about endianness. Hence cast explicitly to uint16_t and uint32_t. Additionally, remove superfluous void * casts. Link: https://lore.kernel.org/r/20200518211712.11395-13-bvanassche@acm.org Cc: Arun Easi <aeasi@marvell.com> Cc: Nilesh Javali <njavali@marvell.com> Cc: Daniel Wagner <dwagner@suse.de> Cc: Himanshu Madhani <himanshu.madhani@oracle.com> Cc: Martin Wilck <mwilck@suse.com> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:43:19 -04:00
Bart Van Assche	04474d3a1c	scsi: qla2xxx: Change {RD,WRT}_REG_*() function names from upper case into lower case This was suggested by Daniel Wagner. Link: https://lore.kernel.org/r/20200518211712.11395-12-bvanassche@acm.org Cc: Nilesh Javali <njavali@marvell.com> Cc: Quinn Tran <qutran@marvell.com> Cc: Martin Wilck <mwilck@suse.com> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:43:17 -04:00
Bart Van Assche	37139da1b0	scsi: qla2xxx: Fix the code that reads from mailbox registers Make the MMIO accessors strongly typed such that the compiler checks whether the accessor function is used that matches the register width. Fix those MMIO accesses where another number of bits was read or written than the size of the register. Link: https://lore.kernel.org/r/20200518211712.11395-11-bvanassche@acm.org Cc: Nilesh Javali <njavali@marvell.com> Cc: Quinn Tran <qutran@marvell.com> Cc: Martin Wilck <mwilck@suse.com> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:43:10 -04:00
Bart Van Assche	c388841622	scsi: qla2xxx: Use register names instead of register offsets Make qla27xx_write_remote_reg() easier to read by using register names instead of register offsets. The 'pahole' tool has been used to convert register offsets into register names. See also commit `cbb01c2f2f` ("scsi: qla2xxx: Fix MPI failure AEN (8200) handling"). Link: https://lore.kernel.org/r/20200518211712.11395-10-bvanassche@acm.org Cc: Arun Easi <aeasi@marvell.com> Cc: Nilesh Javali <njavali@marvell.com> Cc: Martin Wilck <mwilck@suse.com> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:43:08 -04:00
Bart Van Assche	e544b720ef	scsi: qla2xxx: Change two hardcoded constants into offsetof() / sizeof() expressions This patch does not change any functionality. Link: https://lore.kernel.org/r/20200518211712.11395-9-bvanassche@acm.org Cc: Nilesh Javali <njavali@marvell.com> Cc: Quinn Tran <qutran@marvell.com> Cc: Martin Wilck <mwilck@suse.com> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:43:07 -04:00
Bart Van Assche	d9ab5f1f05	scsi: qla2xxx: Increase the size of struct qla_fcp_prio_cfg to FCP_PRIO_CFG_SIZE This patch fixes the following Coverity complaint without changing any functionality: CID 337793 (#1 of 1): Wrong size argument (SIZEOF_MISMATCH) suspicious_sizeof: Passing argument ha->fcp_prio_cfg of type struct qla_fcp_prio_cfg * and argument 32768UL to function memset is suspicious because a multiple of sizeof (struct qla_fcp_prio_cfg) /48/ is expected. memset(ha->fcp_prio_cfg, 0, FCP_PRIO_CFG_SIZE); Link: https://lore.kernel.org/r/20200518211712.11395-8-bvanassche@acm.org Cc: Nilesh Javali <njavali@marvell.com> Cc: Quinn Tran <qutran@marvell.com> Cc: Martin Wilck <mwilck@suse.com> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:43:07 -04:00
Bart Van Assche	66f8636777	scsi: qla2xxx: Make a gap in struct qla2xxx_offld_chain explicit This patch makes struct qla2xxx_offld_chain compatible with ARCH=i386. Link: https://lore.kernel.org/r/20200518211712.11395-7-bvanassche@acm.org Cc: Nilesh Javali <njavali@marvell.com> Cc: Quinn Tran <qutran@marvell.com> Cc: Martin Wilck <mwilck@suse.com> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Arun Easi <aeasi@marvell.com> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:43:06 -04:00
Bart Van Assche	8a73a0e002	scsi: qla2xxx: Add more BUILD_BUG_ON() statements Before fixing the endianness annotations in data structures, make the compiler verify the size of FC protocol and firmware data structures. Link: https://lore.kernel.org/r/20200518211712.11395-6-bvanassche@acm.org Cc: Nilesh Javali <njavali@marvell.com> Cc: Quinn Tran <qutran@marvell.com> Cc: Martin Wilck <mwilck@suse.com> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:43:05 -04:00
Bart Van Assche	59d23cf3f2	scsi: qla2xxx: Sort BUILD_BUG_ON() statements alphabetically Before adding more BUILD_BUG_ON() statements, sort the existing statements alphabetically. Link: https://lore.kernel.org/r/20200518211712.11395-5-bvanassche@acm.org Cc: Nilesh Javali <njavali@marvell.com> Cc: Quinn Tran <qutran@marvell.com> Cc: Martin Wilck <mwilck@suse.com> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:43:04 -04:00
Bart Van Assche	8ae178760b	scsi: qla2xxx: Simplify the functions for dumping firmware Instead of passing an argument to the firmware dumping functions that tells these functions whether or not to obtain the hardware lock, obtain that lock before calling these functions. This patch fixes the following recently introduced C=2 build error: CHECK drivers/scsi/qla2xxx/qla_tmpl.c drivers/scsi/qla2xxx/qla_tmpl.c:1133:1: error: Expected ; at end of statement drivers/scsi/qla2xxx/qla_tmpl.c:1133:1: error: got } drivers/scsi/qla2xxx/qla_tmpl.h:247:0: error: Expected } at end of function drivers/scsi/qla2xxx/qla_tmpl.h:247:0: error: got end-of-input Link: https://lore.kernel.org/r/20200518211712.11395-4-bvanassche@acm.org Fixes: `cbb01c2f2f` ("scsi: qla2xxx: Fix MPI failure AEN (8200) handling") Cc: Arun Easi <aeasi@marvell.com> Cc: Nilesh Javali <njavali@marvell.com> Cc: Himanshu Madhani <himanshu.madhani@oracle.com> Cc: Martin Wilck <mwilck@suse.com> Cc: Roman Bolshakov <r.bolshakov@yadro.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Daniel Wagner <dwagner@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:43:04 -04:00
Bart Van Assche	fbbc95a49d	scsi: qla2xxx: Suppress two recently introduced compiler warnings Suppress the following two compiler warnings because these are not useful: In file included from ./include/trace/define_trace.h:102, from ./include/trace/events/qla.h:39, from drivers/scsi/qla2xxx/qla_dbg.c:77: ./include/trace/events/qla.h: In function 'trace_event_raw_event_qla_log_event': ./include/trace/trace_events.h:691:9: warning: function 'trace_event_raw_event_qla_log_event' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format] 691 \| struct trace_event_raw_##call entry; \ \| ^~~~~~~~~~~~~~~~ ./include/trace/events/qla.h:12:1: note: in expansion of macro 'DECLARE_EVENT_CLASS' 12 \| DECLARE_EVENT_CLASS(qla_log_event, \| ^~~~~~~~~~~~~~~~~~~ In file included from ./include/trace/define_trace.h:103, from ./include/trace/events/qla.h:39, from drivers/scsi/qla2xxx/qla_dbg.c:77: ./include/trace/events/qla.h: In function 'perf_trace_qla_log_event': ./include/trace/perf.h:41:9: warning: function 'perf_trace_qla_log_event' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format] 41 \| struct hlist_head head; \ \| ^~~~~~~~~~ ./include/trace/events/qla.h:12:1: note: in expansion of macro 'DECLARE_EVENT_CLASS' Link: https://lore.kernel.org/r/20200518211712.11395-3-bvanassche@acm.org Fixes: `598a90f200` ("scsi: qla2xxx: add ring buffer for tracing debug logs") Cc: Rajan Shanmugavelu <rajan.shanmugavelu@oracle.com> Cc: Joe Jin <joe.jin@oracle.com> Cc: Nilesh Javali <njavali@marvell.com> Cc: Quinn Tran <qutran@marvell.com> Cc: Martin Wilck <mwilck@suse.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Arun Easi <aeasi@marvell.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:43:01 -04:00
Bart Van Assche	246ee22583	scsi: qla2xxx: Fix spelling of a variable name Change "offet" into "offset" in a variable name. Link: https://lore.kernel.org/r/20200518211712.11395-2-bvanassche@acm.org Cc: Nilesh Javali <njavali@marvell.com> Cc: Quinn Tran <qutran@marvell.com> Cc: Martin Wilck <mwilck@suse.com> Reviewed-by: Daniel Wagner <dwagner@suse.de> Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Arun Easi <aeasi@marvell.com> Reviewed-by: Roman Bolshakov <r.bolshakov@yadro.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:43:00 -04:00
Bodo Stroesser	61fb248221	scsi: target: tcmu: Userspace must not complete queued commands When tcmu queues a new command - no matter whether in command ring or in qfull_queue - a cmd_id from IDR udev->commands is assigned to the command. If userspace sends a wrong command completion containing the cmd_id of a command on the qfull_queue, tcmu_handle_completions() finds the command in the IDR and calls tcmu_handle_completion() for it. This might do some nasty things because commands in qfull_queue do not have a valid dbi list. To fix this bug, we no longer add queued commands to the idr. Instead the cmd_id is assign when a command is written to the command ring. Due to this change I had to adapt the source code at several places where up to now an idr_for_each had been done. [mkp: fix checkpatch warnings] Link: https://lore.kernel.org/r/20200518164833.12775-1-bstroesser@ts.fujitsu.com Acked-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Bodo Stroesser <bstroesser@ts.fujitsu.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:33:04 -04:00
Lance Digby	5482d56bfe	scsi: target: core: Add initiatorname to NON_EXISTENT_LUN error The NON_EXISTENT_LUN error can be written without an error condition on the initiator responsible. Adding the initiatorname to this message will reduce the effort required to fix this when many initiators are supported by a target. This version ensures the initiator name is also printed on the same message in transport_lookup_tmr_lun for consistency. Link: https://lore.kernel.org/r/9b13bb2e1f52f1792cd81850ee95bf3781bb5363.1589759816.git.lance.digby@gmail.com Reviewed-by: Mike Christie <mchristi@redhat.com> Signed-off-by: Lance Digby <lance.digby@gmail.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:27:10 -04:00
John Garry	1cdee00442	scsi: hisi_sas: Stop returning error code from slot_complete_vX_hw() The error codes are never checked, stop returning them. Link: https://lore.kernel.org/r/1589552025-165012-5-git-send-email-john.garry@huawei.com Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:23:59 -04:00
Luo Jiaxing	1a0efb55b2	scsi: hisi_sas: Add SAS_RAS_INTR0 to debugfs register name list Register SAS_RAS_INTR0 can help us to figure out which ECC error has occurred. This register is helpful to identify RAS issue, so we add it to the list of debugfs register name list for easier retrieval. Link: https://lore.kernel.org/r/1589552025-165012-4-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:23:58 -04:00
Luo Jiaxing	1e954d1f00	scsi: hisi_sas: Modify the commit information for DSM method Make it clear that BIOS may modify some register settings. Link: https://lore.kernel.org/r/1589552025-165012-3-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:23:58 -04:00
Luo Jiaxing	e16b9ed61e	scsi: hisi_sas: Do not reset phy timer to wait for stray phy up We found out that after phy up, the hardware reports another oob interrupt but did not follow a phy up interrupt: oob ready -> phy up -> DEV found -> oob read -> wait phy up -> timeout We run link reset when wait phy up timeout, and it send a normal disk into reset processing. So we made some circumvention action in the code, so that this abnormal oob interrupt will not start the timer to wait for phy up. Link: https://lore.kernel.org/r/1589552025-165012-2-git-send-email-john.garry@huawei.com Signed-off-by: Luo Jiaxing <luojiaxing@huawei.com> Signed-off-by: John Garry <john.garry@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:23:57 -04:00
Damien Le Moal	c5f8852273	scsi: sd: Add zoned capabilities device attribute Export through sysfs as a scsi_disk attribute the zoned capabilities of a disk ("zoned_cap" attribute file). This new attribute indicates in human readable form (i.e. a string) the zoned block capabilities implemented by the disk as found in the ZONED field of the disk block device characteristics VPD page. The possible values are: - "none": ZONED=00b (not reported), regular disk - "host-aware": ZONED=01b, host-aware ZBC disk - "drive-managed": ZONED=10b, drive-managed ZBC disk (regular disk interface) For completeness, also add the following value which is detected using the device type rather than the ZONED field: - "host-managed": device type = 0x14 (TYPE_ZBC), host-managed ZBC disk This new sysfs attribute is purely informational and complementary to the "zoned" device request queue sysfs attribute as it allows applications and user daemons (e.g. udev) to easily differentiate regular disks from drive-managed SMR disks without the need for direct access tools such as provided by sg3utils. Link: https://lore.kernel.org/r/20200515054856.1408575-1-damien.lemoal@wdc.com Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:18:26 -04:00
Bart Van Assche	5cac1095cf	scsi: ufs: Make ufshcd_wait_for_register() sleep instead of busy-waiting The ufshcd_wait_for_register() function either sleeps or spins until the specified register has reached the desired value. Busy-waiting is not only considered a bad practice but also has a bad impact on energy consumption. Always sleep instead of spinning by making sure that all ufshcd_wait_for_register() calls happen from a context where it is allowed to sleep. The only function call that has to be moved is the ufshcd_hba_stop() call in ufshcd_host_reset_and_restore(). Link: https://lore.kernel.org/r/20200507222750.19113-1-bvanassche@acm.org Cc: Can Guo <cang@codeaurora.org> Cc: Avri Altman <avri.altman@wdc.com> Cc: Bean Huo <beanhuo@micron.com> Cc: Alim Akhtar <alim.akhtar@samsung.com> Cc: Asutosh Das <asutoshd@codeaurora.org> Tested-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:03:07 -04:00
Wei Yongjun	d0b1e4a638	scsi: cxlflash: Fix error return code in cxlflash_probe() Fix to return negative error code -ENOMEM from create_afu error handling case instead of 0, as done elsewhere in this function. Link: https://lore.kernel.org/r/20200428141855.88704-1-weiyongjun1@huawei.com Acked-by: Matthew R. Ochs <mrochs@linux.ibm.com> Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-19 21:01:32 -04:00
Stanley Chu	f48b285ae6	scsi: ufs-mediatek: Customize WriteBooster flush policy Change the WriteBooster policy to keep VCC on during runtime suspend if available WriteBooster buffer is less than 80%. Link: https://lore.kernel.org/r/20200509093716.21010-5-stanley.chu@mediatek.com Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-14 21:02:22 -04:00
Stanley Chu	d14734ae3a	scsi: ufs: Customize flush threshold for WriteBooster Allow flush threshold for WriteBooster to be customizable by vendors. To achieve this, make the value a variable in struct ufs_hba_variant_params. Also introduce UFS_WB_BUF_REMAIN_PERCENT() macro to provide a more flexible way to specify WriteBooster available buffer values. Link: https://lore.kernel.org/r/20200509093716.21010-4-stanley.chu@mediatek.com Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-14 21:02:22 -04:00
Stanley Chu	90b8491c00	scsi: ufs: Introduce ufs_hba_variant_params to group customizable parameters The UFS driver is growing more and more customizable parameters. Collect them in one place. Link: https://lore.kernel.org/r/20200509093716.21010-2-stanley.chu@mediatek.com Reviewed-by: Asutosh Das <asutoshd@codeaurora.org> Signed-off-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-14 21:02:22 -04:00
Damien Le Moal	0bd735df76	scsi: sd: Signal drive managed SMR disks Print a message indicating that a disk is a drive-managed SMR model when such drive is found using the ZONED field of the Block Device Characteristics VPD page (IDENTIFY data on ATA side). [mkp: typo] Link: https://lore.kernel.org/r/20200514081953.1252087-1-damien.lemoal@wdc.com Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-14 20:48:53 -04:00
ChenTao	21d2b76831	scsi: ufs-mediatek: Make ufs_mtk_fixup_dev_quirks static Fix the following warning: drivers/scsi/ufs/ufs-mediatek.c:585:6: warning: symbol 'ufs_mtk_fixup_dev_quirks' was not declared. Should it be static? Link: https://lore.kernel.org/r/20200514012655.127202-1-chentao107@huawei.com Reported-by: Hulk Robot <hulkci@huawei.com> Reviewed-by: Stanley Chu <stanley.chu@mediatek.com> Signed-off-by: ChenTao <chentao107@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-14 20:47:23 -04:00
Dan Carpenter	25c21d20bc	scsi: aacraid: Fix an oops in error handling If the memdup_user() function fails then it results in an Oops in the error handling code when we try to kfree() and error pointer. Link: https://lore.kernel.org/r/20200513093703.GB347693@mwanda Fixes: `8d925b1f00` ("scsi: aacraid: Use memdup_user() as a cleanup") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-14 20:24:01 -04:00
Jason Yan	55ce24b3bf	scsi: hisi_sas: Display proc_name in sysfs The 'proc_name' entry in sysfs for hisi_sas is 'null' now because it is not initialized in scsi_host_template. It looks like: [root@localhost ~]# cat /sys/class/scsi_host/host2/proc_name (null) While the other driver's entry looks like: linux-vnMQMU:~ # cat /sys/class/scsi_host/host0/proc_name megaraid_sas Link: https://lore.kernel.org/r/20200512113258.30781-1-yanaijie@huawei.com Cc: John Garry <john.garry@huawei.com> Cc: Xiang Chen <chenxiang66@hisilicon.com> Acked-by: John Garry <john.garry@huawei.com> Signed-off-by: Jason Yan <yanaijie@huawei.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-12 20:20:07 -04:00
Dan Carpenter	47742bde28	scsi: scsi_debug: Fix an error handling bug in sdeb_zbc_model_str() This test is checking the wrong variable. It should be testing "res". The "sdeb_zbc_model" variable is an enum (unsigned in this situation) and we never assign negative values to it. [mkp: fixed commit desc issue reported by Doug] Link: https://lore.kernel.org/r/20200509100408.GA5555@mwanda Fixes: `9267e0eb41` ("scsi: scsi_debug: Add ZBC module parameter") Acked-by: Douglas Gilbert <dgilbert@interlog.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-11 23:21:27 -04:00
Benjamin Block	d0dff2ac98	scsi: zfcp: Move allocation of the shost object to after xconf- and xport-data At the moment we allocate and register the Scsi_Host object corresponding to a zfcp adapter (FCP device) very early in the life cycle of the adapter - even before we fully discover and initialize the underlying firmware/hardware. This had the advantage that we could already use the Scsi_Host object, and fill in all its information during said discover and initialize. Due to commit `737eb78e82` ("block: Delay default elevator initialization") (first released in v5.4), we noticed a regression that would prevent us from using any storage volume if zfcp is configured with support for DIF or DIX (zfcp.dif=1 \|\| zfcp.dix=1). Doing so would result in an illegal memory access as soon as the first request is sent with such an configuration. As example for a crash resulting from this: scsi host0: scsi_eh_0: sleeping scsi host0: zfcp qdio: 0.0.1900 ZFCP on SC 4bd using AI:1 QEBSM:0 PRI:1 TDD:1 SIGA: W AP scsi 0:0:0:0: scsi scan: INQUIRY pass 1 length 36 Unable to handle kernel pointer dereference in virtual kernel address space Failing address: 0000000000000000 TEID: 0000000000000483 Fault in home space mode while using kernel ASCE. AS:0000000035c7c007 R3:00000001effcc007 S:00000001effd1000 P:000000000000003d Oops: 0004 ilc:3 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: ... CPU: 1 PID: 783 Comm: kworker/u760:5 Kdump: loaded Not tainted 5.6.0-rc2-bb-next+ #1 Hardware name: ... Workqueue: scsi_wq_0 fc_scsi_scan_rport [scsi_transport_fc] Krnl PSW : 0704e00180000000 000003ff801fcdae (scsi_queue_rq+0x436/0x740 [scsi_mod]) R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3 Krnl GPRS: 0fffffffffffffff 0000000000000000 0000000187150120 0000000000000000 000003ff80223d20 000000000000018e 000000018adc6400 0000000187711000 000003e0062337e8 00000001ae719000 0000000187711000 0000000187150000 00000001ab808100 0000000187150120 000003ff801fcd74 000003e0062336a0 Krnl Code: 000003ff801fcd9e: e310a35c0012 lt %r1,860(%r10) 000003ff801fcda4: a7840010 brc 8,000003ff801fcdc4 #000003ff801fcda8: e310b2900004 lg %r1,656(%r11) >000003ff801fcdae: d71710001000 xc 0(24,%r1),0(%r1) 000003ff801fcdb4: e310b2900004 lg %r1,656(%r11) 000003ff801fcdba: 41201018 la %r2,24(%r1) 000003ff801fcdbe: e32010000024 stg %r2,0(%r1) 000003ff801fcdc4: b904002b lgr %r2,%r11 Call Trace: [<000003ff801fcdae>] scsi_queue_rq+0x436/0x740 [scsi_mod] ([<000003ff801fcd74>] scsi_queue_rq+0x3fc/0x740 [scsi_mod]) [<00000000349c9970>] blk_mq_dispatch_rq_list+0x390/0x680 [<00000000349d1596>] blk_mq_sched_dispatch_requests+0x196/0x1a8 [<00000000349c7a04>] __blk_mq_run_hw_queue+0x144/0x160 [<00000000349c7ab6>] __blk_mq_delay_run_hw_queue+0x96/0x228 [<00000000349c7d5a>] blk_mq_run_hw_queue+0xd2/0xe0 [<00000000349d194a>] blk_mq_sched_insert_request+0x192/0x1d8 [<00000000349c17b8>] blk_execute_rq_nowait+0x80/0x90 [<00000000349c1856>] blk_execute_rq+0x6e/0xb0 [<000003ff801f8ac2>] __scsi_execute+0xe2/0x1f0 [scsi_mod] [<000003ff801fef98>] scsi_probe_and_add_lun+0x358/0x840 [scsi_mod] [<000003ff8020001c>] __scsi_scan_target+0xc4/0x228 [scsi_mod] [<000003ff80200254>] scsi_scan_target+0xd4/0x100 [scsi_mod] [<000003ff802d8b96>] fc_scsi_scan_rport+0x96/0xc0 [scsi_transport_fc] [<0000000034245ce8>] process_one_work+0x458/0x7d0 [<00000000342462a2>] worker_thread+0x242/0x448 [<0000000034250994>] kthread+0x15c/0x170 [<0000000034e1979c>] ret_from_fork+0x30/0x38 INFO: lockdep is turned off. Last Breaking-Event-Address: [<000003ff801fbc36>] scsi_add_cmd_to_list+0x9e/0xa8 [scsi_mod] Kernel panic - not syncing: Fatal exception: panic_on_oops While this issue is exposed by the commit named above, this is only by accident. The real issue exists for longer already - basically since it's possible to use blk-mq via scsi-mq, and blk-mq pre-allocates all requests for a tag-set during initialization of the same. For a given Scsi_Host object this is done when adding the object to the midlayer (`scsi_add_host()` and such). In `scsi_mq_setup_tags()` the midlayer calculates how much memory is required for a single scsi_cmnd, and its additional data, which also might include space for additional protection data - depending on whether the Scsi_Host has any form of protection capabilities (`scsi_host_get_prot()`). The problem is now thus, because zfcp does this step before we actually know whether the firmware/hardware has these capabilities, we don't set any protection capabilities in the Scsi_Host object. And so, no space is allocated for additional protection data for requests in the Scsi_Host tag-set. Once we go through discover and initialize the FCP device firmware/hardware fully (this is done via the firmware commands "Exchange Config Data" and "Exchange Port Data") we find out whether it actually supports DIF and DIX, and we set the corresponding capabilities in the Scsi_Host object (in `zfcp_scsi_set_prot()`). Now the Scsi_Host potentially has protection capabilities, but the already allocated requests in the tag-set don't have any space allocated for that. When we then trigger target scanning or add scsi_devices manually, the midlayer will use requests from that tag-set, and before sending most requests, it will also call `scsi_mq_prep_fn()`. To prepare the scsi_cmnd this function will check again whether the used Scsi_Host has any protection capabilities - and now it potentially has - and if so, it will try to initialize the assumed to be preallocated structures and thus it causes the crash, like shown above. Before delaying the default elevator initialization with the commit named above, we always would also allocate an elevator for any scsi_device before ever sending any requests - in contrast to now, where we do it after device-probing. That elevator in turn would have its own tag-set, and that is initialized after we went through discovery and initialization of the underlying firmware/hardware. So requests from that tag-set can be allocated properly, and if used - unless the user changes/disabled the default elevator - this would hide the underlying issue. To fix this for any configuration - with or without an elevator - we move the allocation and registration of the Scsi_Host object for a given FCP device to after the first complete discovery and initialization of the underlying firmware/hardware. By doing that we can make all basic properties of the Scsi_Host known to the midlayer by the time we call `scsi_add_host()`, including whether we have any protection capabilities. To do that we have to delay all the accesses that we would have done in the past during discovery and initialization, and do them instead once we are finished with it. The previous patches ramp up to this by fencing and factoring out all these accesses, and make it possible to re-do them later on. In addition we make also use of the diagnostic buffers we recently added with commit `92953c6e0a` ("scsi: zfcp: signal incomplete or error for sync exchange config/port data") commit `7e418833e6` ("scsi: zfcp: diagnostics buffer caching and use for exchange port data") commit `088210233e` ("scsi: zfcp: add diagnostics buffer for exchange config data") (first released in v5.5), because these already cache all the information we need for that "re-do operation" - the information cached are always updated during xconf or xport data, so it won't be stale. In addition to the move and re-do, this patch also updates the function-documentation of `zfcp_scsi_adapter_register()` and changes how it reports if a Scsi_Host object already exists. In that case future recovery-operations can skip this step completely and behave much like they would do in the past - zfcp does not release a once allocated Scsi_Host object unless the corresponding FCP device is deconstructed completely. Link: https://lore.kernel.org/r/030dd6da318bbb529f0b5268ec65cebcd20fc0a3.1588956679.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-11 23:19:52 -04:00
Benjamin Block	71159b6ecb	scsi: zfcp: Fence early sysfs interfaces for accesses of shost objects When setting an adapter online for the first time, we also create a couple of entries for it in the sysfs device tree. This is also true even if the adapter has not yet ever gone successfully through exchange config and exchange port data. When moving the scsi host object allocation and registration to after the first exchange config and exchange port data, this make the `port_rescan` attribute susceptible to invalid pointer-dereferences of the shost field before the adapter is fully initialized. When written to, it schedules a `scan_work` item that will in turn make use of the associated fibre channel host object to check the topology used for this FCP device. Because scanning for remote ports can't be done successfully without completing exchange config and exchange port data first, we can simply fence `port_rescan`, and so prevent the illegal access. As with cases where we can't get a reference to the adapter, we also return -ENODEV here. Applications need to handle that errno today already. After a successful allocation of the scsi host object nothing changes in the work flow. Link: https://lore.kernel.org/r/ef65366d309993ca91b6917727590ca7ca166c8f.1588956679.git.bblock@linux.ibm.com Reviewed-by: Steffen Maier <maier@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2020-05-11 23:19:50 -04:00

1 2 3 4 5 ...

915542 Commits