Currently the driver marks the controller as unrecoverable if there is an
asynchronous reset or fault during the initialization, reinitialization
post reset, and OS resume.
Enhance driver to retry the initialization, re-initialization, and resume
sequences for a maximum of 3 times if the controller became faulty or
asynchronously reset due to a firmware activation during the initialization
sequence.
Link: https://lore.kernel.org/r/20211220141159.16117-15-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Save snapdump and fault the controller with the given reason code if it is
already not in the fault or not in asynchronous reset. This ensures that
soft reset is issued from the watchdog thread. This will also be used to
handle initialization time faults/resets/timeout as in those cases
immediate soft reset invocation is not required.
Link: https://lore.kernel.org/r/20211220141159.16117-12-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The following special handling is needed for UNMAP commands issued to NVMe
drives:
- On B0 boards, if the parameter list length is greater than 24 and not a
16-byte multiple, then truncate the parameter list length to a 16-byte
multiple.
- On A0 boards, if the parameter list length is greater than block
descriptor data length + 8, then truncate the parameter list length to
block descriptor data length + 8 value.
Link: https://lore.kernel.org/r/20211220141159.16117-10-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The SAS4 Controller firmware exposes the SES devices in Managed PCIe Switch
as a PCIe Device Type SCSI Device
(MPI3_DEVICE0_PCIE_DEVICE_INFO_TYPE_SCSI_DEVICE).
Driver is enhanced to handle this device type by:
- Exposing the device to the upper layers and
- Not updating any hardware sectors & virtual boundary settings as these
settings are needed only for NVMe devices.
Link: https://lore.kernel.org/r/20211220141159.16117-7-sreekanth.reddy@broadcom.com
Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Processing events such as PORTE_BROADCAST_RCVD may cause dependency issues
for runtime power management support. Such a problem would be that
handling a PORTE_BROADCAST_RCVD event requires that the host is resumed to
send SMP commands. However, in resuming the host, the phyup events
generated from re-enabling the phys are processed in the same workqueue as
the original PORTE_BROADCAST_RCVD event. As such, the host will never
finish resuming (as it waits for the phyup event processing), and then the
PORTE_BROADCAST_RCVD event can't be processed as the SMP commands are
blocked, and so we have a deadlock. Solve this problem by ensuring that
libsas keeps the host active until completely finished phy or port events,
such as PORTE_BYTES_DMAED. As such, we don't have to worry about resuming
the host for processing individual SMP commands in this example.
Link: https://lore.kernel.org/r/1639999298-244569-15-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It is possible that controller may become suspended between processing a
phyup interrupt and the event being processed by libsas. As such, we can't
ensure the controller is active when processing the phyup event - this may
cause the phyup event to be lost or other issues. To avoid any possible
issues, add pm_runtime_get_noresume() in phyup interrupt handler and
pm_runtime_put_sync() in the work handler exit to ensure that we stay
always active. Since we only want to call pm_runtime_get_noresume() for v3
hw, signal this will a new event, HISI_PHYE_PHY_UP_PM.
Link: https://lore.kernel.org/r/1639999298-244569-14-git-send-email-chenxiang66@hisilicon.com
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During the processing of event PORT_BYTES_DMAED, the driver queues work
DISCE_DISCOVER_DOMAIN and then flushes workqueue ha->disco_q. If a new
phyup event occurs during resuming the controller, the work
PORTE_BYTES_DMAED of new phy occurs before suspended phy's. The work
DISCE_DISCOVER_DOMAIN of new phy requires an active SAS controller (it
needs to resume SAS controller by function scsi_sysfs_add_sdev() and some
other functions such as function add_device_link()). However, the
activation of the SAS controller requires completion of work
PORTE_BYTES_DMAED of suspended phys while it is blocked by new phy's work
on ha->event_q. So there is a deadlock and it is released only after resume
timeout.
To solve the issue, defer works of new phys during suspend and queue those
defer works after SAS controller becomes active.
Link: https://lore.kernel.org/r/1639999298-244569-13-git-send-email-chenxiang66@hisilicon.com
Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Most places that use asd_sas_port->phy_list are protected by spinlock
asd_sas_port->phy_list_lock, however there are still some places which miss
grabbing the lock. Add it in function hisi_sas_refresh_port_id() when
accessing asd_sas_port->phy_list. This carries a risk that list mutates
while at the same time dropping the lock in function
hisi_sas_send_ata_reset_each_phy(). Read asd_sas_port->phy_mask instead of
accessing asd_sas_port->phy_list to avoid this risk.
Link: https://lore.kernel.org/r/1639999298-244569-6-git-send-email-chenxiang66@hisilicon.com
Acked-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
John Garry reported a deadlock that occurs when trying to access a
runtime-suspended SATA device. For obscure reasons, the rescan procedure
causes the link to be hard-reset, which disconnects the device.
The rescan tries to carry out a runtime resume when accessing the device.
scsi_rescan_device() holds the SCSI device lock and won't release it until
it can put commands onto the device's block queue. This can't happen until
the queue is successfully runtime-resumed or the device is unregistered.
But the runtime resume fails because the device is disconnected, and
__scsi_remove_device() can't do the unregistration because it can't get the
device lock.
The best way to resolve this deadlock appears to be to allow the block
queue to start running again even after an unsuccessful runtime resume.
The idea is that the driver or the SCSI error handler will need to be able
to use the queue to resolve the runtime resume failure.
This patch removes the err argument to blk_post_runtime_resume() and makes
the routine act as though the resume was successful always. This fixes the
deadlock.
Link: https://lore.kernel.org/r/1639999298-244569-4-git-send-email-chenxiang66@hisilicon.com
Fixes: e27829dc92 ("scsi: serialize ->rescan against ->remove")
Reported-and-tested-by: John Garry <john.garry@huawei.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
For the hisi_sas driver, if a directly attached disk is removed during
suspend, a hang will occur in the resume process:
The background is that in commit 16fd4a7c59 ("scsi: hisi_sas: Add device
link between SCSI devices and hisi_hba"), it is ensured that the HBA device
cannot be runtime suspended when any SCSI device associated is active.
Other drivers which use libsas don't worry about this as none support
runtime suspend.
The mentioned hang occurs when an disk is removed during suspend. In the
removal process - from PHYE_RESUME_TIMEOUT event processing - we call into
scsi_remove_device(), which is being processed in the HA event workqueue.
Here we wait for all suppliers of the SCSI device to resume, which includes
the HBA device (from the above commit). However the HBA device cannot
resume, as it is waiting for the PHYE_RESUME_TIMEOUT to be processed (from
calling sas_resume_ha() -> sas_drain_work()). This is the deadlock.
There does not appear to be any need for the sas_drain_work() to be called
at all in sas_resume_ha() as it is not syncing against anything, so allow
LLDDs to avoid this by providing a variant of sas_resume_ha() which does
"sync", i.e. doesn't drain the event workqueue.
Link: https://lore.kernel.org/r/1639999298-244569-2-git-send-email-chenxiang66@hisilicon.com
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Xiang Chen <chenxiang66@hisilicon.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The OOB interrupt and phyup interrupt handlers may run out-of-order in high
CPU usage scenarios. Since the hisi_sas_phy.timer is added in
hisi_sas_phy_oob_ready() and disarmed in phy_up_v3_hw(), this out-of-order
execution will cause hisi_sas_phy.timer timeout to trigger.
To solve, protect hisi_sas_phy.timer and .attached with a lock, and ensure
that the timer won't be added after phyup handler completes.
Link: https://lore.kernel.org/r/1639579061-179473-8-git-send-email-john.garry@huawei.com
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If we issue a controller reset command during executing a FLR a hung task
may be found:
Call trace:
__switch_to+0x158/0x1cc
__schedule+0x2e8/0x85c
schedule+0x7c/0x110
schedule_timeout+0x190/0x1cc
__down+0x7c/0xd4
down+0x5c/0x7c
hisi_sas_task_exec+0x510/0x680 [hisi_sas_main]
hisi_sas_queue_command+0x24/0x30 [hisi_sas_main]
smp_execute_task_sg+0xf4/0x23c [libsas]
sas_smp_phy_control+0x110/0x1e0 [libsas]
transport_sas_phy_reset+0xc8/0x190 [libsas]
phy_reset_work+0x2c/0x40 [libsas]
process_one_work+0x1dc/0x48c
worker_thread+0x15c/0x464
kthread+0x160/0x170
ret_from_fork+0x10/0x18
This is a race condition which occurs when the FLR completes first.
Here the host HISI_SAS_RESETTING_BIT flag out gets of sync as
HISI_SAS_RESETTING_BIT is not always cleared with the hisi_hba.sem held, so
now only set/unset HISI_SAS_RESETTING_BIT under hisi_hba.sem .
Link: https://lore.kernel.org/r/1639579061-179473-7-git-send-email-john.garry@huawei.com
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A user may issue a control phy command from sysfs at any time, even if the
controller is resetting.
If a phy is disabled by hardreset/linkreset command before calling
get_phys_state() in the reset path, the saved phy state may be incorrect.
To avoid incorrectly recording the phy state, use hisi_hba.sem to ensure
that the controller reset may not run at the same time as when the phy
control function is running.
Link: https://lore.kernel.org/r/1639579061-179473-6-git-send-email-john.garry@huawei.com
Signed-off-by: Qi Liu <liuqi115@huawei.com>
Signed-off-by: John Garry <john.garry@huawei.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Call shost_for_each_device() with holding host->host_lock will cause a
deadlock situation, which will cause the system to stall (the log as
follow). Fix this issue by using __shost_for_each_device() in
ufshcd_pending_cmds().
stalls on CPUs/tasks:
all trace:
__switch_to+0x120/0x170
0xffff800011643998
ask dump for CPU 5:
ask:kworker/u16:2 state:R running task stack: 0 pid: 80 ppid: 2 flags:0x0000000a
orkqueue: events_unbound async_run_entry_fn
all trace:
__switch_to+0x120/0x170
0x0
ask dump for CPU 6:
ask:kworker/u16:6 state:R running task stack: 0 pid: 164 ppid: 2 flags:0x0000000a
orkqueue: events_unbound async_run_entry_fn
all trace:
__switch_to+0x120/0x170
0xffff54e7c4429f80
ask dump for CPU 7:
ask:kworker/u16:4 state:R running task stack: 0 pid: 153 ppid: 2 flags:0x0000000a
orkqueue: events_unbound async_run_entry_fn
all trace:
__switch_to+0x120/0x170
blk_mq_run_hw_queue+0x34/0x110
blk_mq_sched_insert_request+0xb0/0x120
blk_execute_rq_nowait+0x68/0x88
blk_execute_rq+0x4c/0xd8
__scsi_execute+0xec/0x1d0
scsi_vpd_inquiry+0x84/0xf0
scsi_get_vpd_buf+0x34/0xb8
scsi_attach_vpd+0x34/0x140
scsi_probe_and_add_lun+0xa6c/0xab8
__scsi_scan_target+0x438/0x4f8
scsi_scan_channel+0x6c/0xa8
scsi_scan_host_selected+0xf0/0x150
do_scsi_scan_host+0x88/0x90
scsi_scan_host+0x1b4/0x1d0
ufshcd_async_scan+0x248/0x310
async_run_entry_fn+0x30/0x178
process_one_work+0x1e8/0x368
worker_thread+0x40/0x478
kthread+0x174/0x180
ret_from_fork+0x10/0x20
Link: https://lore.kernel.org/r/20211214120537.531628-1-huobean@gmail.com
Fixes: 8d077ede48 ("scsi: ufs: Optimize the command queueing code")
Reported-by: YongQin Liu <yongqin.liu@linaro.org>
Reported-by: Amit Pundir <amit.pundir@linaro.org>
Tested-by: John Stultz <john.stultz@linaro.org>
Tested-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Co-developed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Bean Huo <beanhuo@micron.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull in the 5.16 fixes branch to resolve a conflict in the UFS driver
core.
Conflicts:
drivers/scsi/ufs/ufshcd.c
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When building under -Warray-bounds, a warning is generated when casting a
u32 into MAILBOX_t (which is larger). This warning is conservative, but
it's not an unreasonable change to make to improve future robustness. Use a
tagged struct_group that can refer to either the specific fields or the
first u32 separately, silencing this warning:
drivers/scsi/lpfc/lpfc_sli.c: In function 'lpfc_reset_barrier':
drivers/scsi/lpfc/lpfc_sli.c:4787:29: error: array subscript 'MAILBOX_t[0]' is partly outside array bounds of 'volatile uint32_t[1]' {aka 'volatile unsigned int[1]'} [-Werror=array-bounds]
4787 | ((MAILBOX_t *)&mbox)->mbxCommand = MBX_KILL_BOARD;
| ^~
drivers/scsi/lpfc/lpfc_sli.c:4752:27: note: while referencing 'mbox'
4752 | volatile uint32_t mbox;
| ^~~~
There is no change to the resulting executable instruction code.
Link: https://lore.kernel.org/r/20211203223351.107323-1-keescook@chromium.org
Reviewed-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>