linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-09 14:56:54 -04:00

Author	SHA1	Message	Date
Damien Le Moal	82dbb57ac8	scsi: mpt3sas: Avoid IOMMU page faults on REPORT ZONES Some firmware versions of the 9600 series SAS HBA byte-swap the REPORT ZONES command reply buffer from ATA-ZAC devices by directly accessing the buffer in the host memory. This does not respect the default command DMA direction and causes IOMMU page faults on architectures with an IOMMU enforcing write-only mappings for DMA_FROM_DEVICE DMA driection (e.g. AMD hosts). scsi 18:0:0:0: Direct-Access-ZBC ATA WDC WSH722020AL W870 PQ: 0 ANSI: 6 scsi 18:0:0:0: SATA: handle(0x0027), sas_addr(0x300062b2083e7c40), phy(0), device_name(0x5000cca29dc35e11) scsi 18:0:0:0: enclosure logical id (0x300062b208097c40), slot(0) scsi 18:0:0:0: enclosure level(0x0000), connector name( C0.0) scsi 18:0:0:0: atapi(n), ncq(y), asyn_notify(n), smart(y), fua(y), sw_preserve(y) scsi 18:0:0:0: qdepth(32), tagged(1), scsi_level(7), cmd_que(1) sd 18:0:0:0: Attached scsi generic sg2 type 20 sd 18:0:0:0: [sdc] Host-managed zoned block device mpt3sas 0000:41:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0021 address=0xfff9b200 flags=0x0050] mpt3sas 0000:41:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0021 address=0xfff9b300 flags=0x0050] mpt3sas_cm0: mpt3sas_ctl_pre_reset_handler: Releasing the trace buffer due to adapter reset. mpt3sas_cm0 fault info from func: mpt3sas_base_make_ioc_ready mpt3sas_cm0: fault_state(0x2666)! mpt3sas_cm0: sending diag reset !! mpt3sas_cm0: diag reset: SUCCESS sd 18:0:0:0: [sdc] REPORT ZONES start lba 0 failed sd 18:0:0:0: [sdc] REPORT ZONES: Result: hostbyte=DID_RESET driverbyte=DRIVER_OK sd 18:0:0:0: [sdc] 0 4096-byte logical blocks: (0 B/0 B) Avoid such issue by always mapping the buffer of REPORT ZONES commands using DMA_BIDIRECTIONAL (read+write IOMMU mapping). This is done by introducing the helper function _base_scsi_dma_map() and using this helper in _base_build_sg_scmd() and _base_build_sg_scmd_ieee() instead of calling directly scsi_dma_map(). Fixes: `471ef9d4e4` ("mpt3sas: Build MPI SGL LIST on GEN2 HBAs and IEEE SGL LIST on GEN3 HBAs") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20240719073913.179559-3-dlemoal@kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-22 20:37:03 -04:00
Damien Le Moal	1abc900ddd	scsi: mpi3mr: Avoid IOMMU page faults on REPORT ZONES Some firmware versions of the 9600 series SAS HBA byte-swap the REPORT ZONES command reply buffer from ATA-ZAC devices by directly accessing the buffer in the host memory. This does not respect the default command DMA direction and causes IOMMU page faults on architectures with an IOMMU enforcing write-only mappings for DMA_FROM_DEVICE DMA direction (e.g. AMD hosts), leading to the device capacity to be dropped to 0: scsi 18:0:58:0: Direct-Access-ZBC ATA WDC WSH722626AL W930 PQ: 0 ANSI: 7 scsi 18:0:58:0: Power-on or device reset occurred sd 18:0:58:0: Attached scsi generic sg9 type 20 sd 18:0:58:0: [sdj] Host-managed zoned block device mpi3mr 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0001 address=0xfec0c400 flags=0x0050] mpi3mr 0000:c1:00.0: AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x0001 address=0xfec0c500 flags=0x0050] sd 18:0:58:0: [sdj] REPORT ZONES start lba 0 failed sd 18:0:58:0: [sdj] REPORT ZONES: Result: hostbyte=DID_SOFT_ERROR driverbyte=DRIVER_OK sd 18:0:58:0: [sdj] 0 4096-byte logical blocks: (0 B/0 B) sd 18:0:58:0: [sdj] Write Protect is off sd 18:0:58:0: [sdj] Mode Sense: 6b 00 10 08 sd 18:0:58:0: [sdj] Write cache: enabled, read cache: enabled, supports DPO and FUA sd 18:0:58:0: [sdj] Attached SCSI disk Avoid this issue by always mapping the buffer of REPORT ZONES commands using DMA_BIDIRECTIONAL, that is, using a read-write IOMMU mapping. Suggested-by: Christoph Hellwig <hch@lst.de> Fixes: `023ab2a9b4` ("scsi: mpi3mr: Add support for queue command processing") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal <dlemoal@kernel.org> Link: https://lore.kernel.org/r/20240719073913.179559-2-dlemoal@kernel.org Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-22 20:37:03 -04:00
Manivannan Sadhasivam	ac6efb12ca	scsi: ufs: core: Do not set link to OFF state while waking up from hibernation UFS link is just put into hibern8 state during the 'freeze' process of the hibernation. Afterwards, the system may get powered down. But that doesn't matter during wakeup. Because during wakeup from hibernation, UFS link is again put into hibern8 state by the restore kernel and then the control is handed over to the to image kernel. So in both the places, UFS link is never turned OFF. But ufshcd_system_restore() just assumes that the link will be in OFF state and sets the link state accordingly. And this breaks hibernation wakeup: [ 2445.371335] phy phy-1d87000.phy.3: phy_power_on was called before phy_init [ 2445.427883] ufshcd-qcom 1d84000.ufshc: Controller enable failed [ 2445.427890] ufshcd-qcom 1d84000.ufshc: ufshcd_host_reset_and_restore: Host init failed -5 [ 2445.427906] ufs_device_wlun 0:0:0:49488: ufshcd_wl_resume failed: -5 [ 2445.427918] ufs_device_wlun 0:0:0:49488: PM: dpm_run_callback(): scsi_bus_restore returns -5 [ 2445.427973] ufs_device_wlun 0:0:0:49488: PM: failed to restore async: error -5 So fix the issue by removing the code that sets the link to OFF state. Cc: Anjana Hari <quic_ahari@quicinc.com> Cc: stable@vger.kernel.org # 6.3 Fixes: `88441a8d35` ("scsi: ufs: core: Add hibernation callbacks") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Link: https://lore.kernel.org/r/20240718170659.201647-1-manivannan.sadhasivam@linaro.org Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-22 20:32:36 -04:00
Johan Hovold	da3e19ef0b	scsi: Revert "scsi: sd: Do not repeat the starting disk message" This reverts commit `7a6bbc2829`. The offending commit tried to suppress a double "Starting disk" message for some drivers, but instead started spamming the log with bogus messages every five seconds: [ 311.798956] sd 0:0:0:0: [sda] Starting disk [ 316.919103] sd 0:0:0:0: [sda] Starting disk [ 322.040775] sd 0:0:0:0: [sda] Starting disk [ 327.161140] sd 0:0:0:0: [sda] Starting disk [ 332.281352] sd 0:0:0:0: [sda] Starting disk [ 337.401878] sd 0:0:0:0: [sda] Starting disk [ 342.521527] sd 0:0:0:0: [sda] Starting disk [ 345.850401] sd 0:0:0:0: [sda] Starting disk [ 350.967132] sd 0:0:0:0: [sda] Starting disk [ 356.090454] sd 0:0:0:0: [sda] Starting disk ... on machines that do not actually stop the disk on runtime suspend (e.g. the Qualcomm sc8280xp CRD with UFS). Let's just revert for now to address the regression. Fixes: `7a6bbc2829` ("scsi: sd: Do not repeat the starting disk message") Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Link: https://lore.kernel.org/r/20240716161101.30692-1-johan+linaro@kernel.org Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-22 20:30:13 -04:00
Peter Wang	3911af778f	scsi: ufs: core: Fix deadlock during RTC update There is a deadlock when runtime suspend waits for the flush of RTC work, and the RTC work calls ufshcd_rpm_get_sync() to wait for runtime resume. Here is deadlock backtrace: kworker/0:1 D 4892.876354 10 10971 4859 0x4208060 0x8 10 0 120 670730152367 ptr f0ffff80c2e40000 0 1 0x00000001 0x000000ff 0x000000ff 0x000000ff <ffffffee5e71ddb0> __switch_to+0x1a8/0x2d4 <ffffffee5e71e604> __schedule+0x684/0xa98 <ffffffee5e71ea60> schedule+0x48/0xc8 <ffffffee5e725f78> schedule_timeout+0x48/0x170 <ffffffee5e71fb74> do_wait_for_common+0x108/0x1b0 <ffffffee5e71efe0> wait_for_completion+0x44/0x60 <ffffffee5d6de968> __flush_work+0x39c/0x424 <ffffffee5d6decc0> __cancel_work_sync+0xd8/0x208 <ffffffee5d6dee2c> cancel_delayed_work_sync+0x14/0x28 <ffffffee5e2551b8> __ufshcd_wl_suspend+0x19c/0x480 <ffffffee5e255fb8> ufshcd_wl_runtime_suspend+0x3c/0x1d4 <ffffffee5dffd80c> scsi_runtime_suspend+0x78/0xc8 <ffffffee5df93580> __rpm_callback+0x94/0x3e0 <ffffffee5df90b0c> rpm_suspend+0x2d4/0x65c <ffffffee5df91448> __pm_runtime_suspend+0x80/0x114 <ffffffee5dffd95c> scsi_runtime_idle+0x38/0x6c <ffffffee5df912f4> rpm_idle+0x264/0x338 <ffffffee5df90f14> __pm_runtime_idle+0x80/0x110 <ffffffee5e24ce44> ufshcd_rtc_work+0x128/0x1e4 <ffffffee5d6e3a40> process_one_work+0x26c/0x650 <ffffffee5d6e65c8> worker_thread+0x260/0x3d8 <ffffffee5d6edec8> kthread+0x110/0x134 <ffffffee5d616b18> ret_from_fork+0x10/0x20 Skip updating RTC if RPM state is not RPM_ACTIVE. Fixes: `6bf999e0eb` ("scsi: ufs: core: Add UFS RTC support") Cc: stable@vger.kernel.org # 6.9.x Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20240715063831.29792-1-peter.wang@mediatek.com Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-15 22:57:16 -04:00
Peter Wang	022587d8ae	scsi: ufs: core: Bypass quick recovery if force reset is needed If force_reset is true, bypass quick recovery. This will shorten error recovery time. Signed-off-by: Peter Wang <peter.wang@mediatek.com> Link: https://lore.kernel.org/r/20240712094506.11284-1-peter.wang@mediatek.com Reviewed-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-15 22:53:27 -04:00
Kyoungrul Kim	0c60eb0cc3	scsi: ufs: core: Check LSDBS cap when !mcq If the user sets use_mcq_mode to 0, the host will try to activate the LSDB mode unconditionally even when the LSDBS of device HCI cap is 1. This makes commands time out and causes device probing to fail. To prevent that problem, check the LSDBS cap when MCQ is not supported. Signed-off-by: Kyoungrul Kim <k831.kim@samsung.com> Link: https://lore.kernel.org/r/20240709232520epcms2p8ebdb5c4fccc30a6221390566589bf122@epcms2p8 Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-15 22:51:06 -04:00
Zhongqiu Han	23cef42d17	scsi: aha152x: Use DECLARE_COMPLETION_ONSTACK for non-constant completion The _ONSTACK variant should be used for on-stack completion, otherwise it will break lockdep. See also commit `6e9a4738c9` ("[PATCH] completions: lockdep annotate on stack completions"). Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com> Link: https://lore.kernel.org/r/20240705103614.3650637-1-quic_zhonhan@quicinc.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:51:44 -04:00
Chen Ni	6ca9fede7c	scsi: qla2xxx: Convert comma to semicolon Replace a comma between expression statements by a semicolon. Fixes: `d4523bd6fd` ("scsi: qla2xxx: Refactor asynchronous command initialization") Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Link: https://lore.kernel.org/r/20240711005724.2358446-1-nichen@iscas.ac.cn Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:47:12 -04:00
Martin K. Petersen	22b8d89b9d	Merge patch series "qla2xxx misc. bug fixes" Nilesh Javali <njavali@marvell.com> says: Martin, Please apply the qla2xxx driver miscellaneous bug fixes to the scsi tree at your earliest convenience. Link: https://lore.kernel.org/r/20240710171057.35066-1-njavali@marvell.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:45:08 -04:00
Nilesh Javali	a1392b19ca	scsi: qla2xxx: Update version to 10.02.09.300-k Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20240710171057.35066-12-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:44:10 -04:00
Quinn Tran	c449b41987	scsi: qla2xxx: Use QP lock to search for bsg On bsg timeout, hardware_lock is used as part of search for the srb. Instead, qpair lock should be used to iterate through different qpair. Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20240710171057.35066-11-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:44:10 -04:00
Quinn Tran	beafd69246	scsi: qla2xxx: Reduce fabric scan duplicate code For fabric scan, current code uses switch scan opcode and flags as the method to iterate through different commands to carry out the process. This makes it hard to read. This patch convert those opcode and flags into steps. In addition, this help reduce some duplicate code. Consolidate routines that handle GPNFT & GNNFT. Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20240710171057.35066-10-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:44:10 -04:00
Shreyas Deodhar	348744f27a	scsi: qla2xxx: Fix optrom version displayed in FDMI Bios version was popluated for FDMI response. Systems with EFI would show optrom version as 0. EFI version is populated here and BIOS version is already displayed under FDMI_HBA_BOOT_BIOS_NAME. Cc: stable@vger.kernel.org Signed-off-by: Shreyas Deodhar <sdeodhar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20240710171057.35066-9-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:44:10 -04:00
Manish Rangankar	76f480d7c7	scsi: qla2xxx: During vport delete send async logout explicitly During vport delete, it is observed that during unload we hit a crash because of stale entries in outstanding command array. For all these stale I/O entries, eh_abort was issued and aborted (fast_fail_io = 2009h) but I/Os could not complete while vport delete is in process of deleting. BUG: kernel NULL pointer dereference, address: 000000000000001c #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI Workqueue: qla2xxx_wq qla_do_work [qla2xxx] RIP: 0010:dma_direct_unmap_sg+0x51/0x1e0 RSP: 0018:ffffa1e1e150fc68 EFLAGS: 00010046 RAX: 0000000000000000 RBX: 0000000000000021 RCX: 0000000000000001 RDX: 0000000000000021 RSI: 0000000000000000 RDI: ffff8ce208a7a0d0 RBP: ffff8ce208a7a0d0 R08: 0000000000000000 R09: ffff8ce378aac9c8 R10: ffff8ce378aac8a0 R11: ffffa1e1e150f9d8 R12: 0000000000000000 R13: 0000000000000000 R14: ffff8ce378aac9c8 R15: 0000000000000000 FS: 0000000000000000(0000) GS:ffff8d217f000000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000001c CR3: 0000002089acc000 CR4: 0000000000350ee0 Call Trace: <TASK> qla2xxx_qpair_sp_free_dma+0x417/0x4e0 ? qla2xxx_qpair_sp_compl+0x10d/0x1a0 ? qla2x00_status_entry+0x768/0x2830 ? newidle_balance+0x2f0/0x430 ? dequeue_entity+0x100/0x3c0 ? qla24xx_process_response_queue+0x6a1/0x19e0 ? __schedule+0x2d5/0x1140 ? qla_do_work+0x47/0x60 ? process_one_work+0x267/0x440 ? process_one_work+0x440/0x440 ? worker_thread+0x2d/0x3d0 ? process_one_work+0x440/0x440 ? kthread+0x156/0x180 ? set_kthread_struct+0x50/0x50 ? ret_from_fork+0x22/0x30 </TASK> Send out async logout explicitly for all the ports during vport delete. Cc: stable@vger.kernel.org Signed-off-by: Manish Rangankar <mrangankar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20240710171057.35066-8-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:44:10 -04:00
Shreyas Deodhar	4475afa264	scsi: qla2xxx: Complete command early within lock A crash was observed while performing NPIV and FW reset, BUG: kernel NULL pointer dereference, address: 000000000000001c #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 1 PREEMPT_RT SMP NOPTI RIP: 0010:dma_direct_unmap_sg+0x51/0x1e0 RSP: 0018:ffffc90026f47b88 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000021 RCX: 0000000000000002 RDX: 0000000000000021 RSI: 0000000000000000 RDI: ffff8881041130d0 RBP: ffff8881041130d0 R08: 0000000000000000 R09: 0000000000000034 R10: ffffc90026f47c48 R11: 0000000000000031 R12: 0000000000000000 R13: 0000000000000000 R14: ffff8881565e4a20 R15: 0000000000000000 FS: 00007f4c69ed3d00(0000) GS:ffff889faac80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000001c CR3: 0000000288a50002 CR4: 00000000007706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? __die_body+0x1a/0x60 ? page_fault_oops+0x16f/0x4a0 ? do_user_addr_fault+0x174/0x7f0 ? exc_page_fault+0x69/0x1a0 ? asm_exc_page_fault+0x22/0x30 ? dma_direct_unmap_sg+0x51/0x1e0 ? preempt_count_sub+0x96/0xe0 qla2xxx_qpair_sp_free_dma+0x29f/0x3b0 [qla2xxx] qla2xxx_qpair_sp_compl+0x60/0x80 [qla2xxx] __qla2x00_abort_all_cmds+0xa2/0x450 [qla2xxx] The command completion was done early while aborting the commands in driver unload path but outside lock to avoid the WARN_ON condition of performing dma_free_attr within the lock. However this caused race condition while command completion via multiple paths causing system crash. Hence complete the command early in unload path but within the lock to avoid race condition. Fixes: `0367076b08` ("scsi: qla2xxx: Perform lockless command completion in abort path") Cc: stable@vger.kernel.org Signed-off-by: Shreyas Deodhar <sdeodhar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20240710171057.35066-7-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:44:10 -04:00
Quinn Tran	29e222085d	scsi: qla2xxx: Fix flash read failure Link up failure is observed as a result of flash read failure. Current code does not check flash read return code where it relies on FW checksum to detect the problem. Add check of flash read failure to detect the problem sooner. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/202406210815.rPDRDMBi-lkp@intel.com/ Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20240710171057.35066-6-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:44:10 -04:00
Saurav Kashyap	ce2065c4cc	scsi: qla2xxx: Return ENOBUFS if sg_cnt is more than one for ELS cmds Firmware only supports single DSDs in ELS Pass-through IOCB (0x53h), sg cnt is decided by the SCSI ML. User is not aware of the cause of an acutal error. Return the appropriate return code that will be decoded by API and application and proper error message will be displayed to user. Fixes: `6e98016ca0` ("[SCSI] qla2xxx: Re-organized BSG interface specific code.") Cc: stable@vger.kernel.org Signed-off-by: Saurav Kashyap <skashyap@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20240710171057.35066-5-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:44:10 -04:00
Shreyas Deodhar	c03d740152	scsi: qla2xxx: Fix for possible memory corruption Init Control Block is dereferenced incorrectly. Correctly dereference ICB Cc: stable@vger.kernel.org Signed-off-by: Shreyas Deodhar <sdeodhar@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20240710171057.35066-4-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:44:09 -04:00
Nilesh Javali	eb1d4ce260	scsi: qla2xxx: validate nvme_local_port correctly The driver load failed with error message, qla2xxx [0000:04:00.0]-ffff:0: register_localport failed: ret=ffffffef and with a kernel crash, BUG: unable to handle kernel NULL pointer dereference at 0000000000000070 Workqueue: events_unbound qla_register_fcport_fn [qla2xxx] RIP: 0010:nvme_fc_register_remoteport+0x16/0x430 [nvme_fc] RSP: 0018:ffffaaa040eb3d98 EFLAGS: 00010282 RAX: 0000000000000000 RBX: ffff9dfb46b78c00 RCX: 0000000000000000 RDX: ffff9dfb46b78da8 RSI: ffffaaa040eb3e08 RDI: 0000000000000000 RBP: ffff9dfb612a0a58 R08: ffffffffaf1d6270 R09: 3a34303a30303030 R10: 34303a303030305b R11: 2078787832616c71 R12: ffff9dfb46b78dd4 R13: ffff9dfb46b78c24 R14: ffff9dfb41525300 R15: ffff9dfb46b78da8 FS: 0000000000000000(0000) GS:ffff9dfc67c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000070 CR3: 000000018da10004 CR4: 00000000000206f0 Call Trace: qla_nvme_register_remote+0xeb/0x1f0 [qla2xxx] ? qla2x00_dfs_create_rport+0x231/0x270 [qla2xxx] qla2x00_update_fcport+0x2a1/0x3c0 [qla2xxx] qla_register_fcport_fn+0x54/0xc0 [qla2xxx] Exit the qla_nvme_register_remote() function when qla_nvme_register_hba() fails and correctly validate nvme_local_port. Cc: stable@vger.kernel.org Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20240710171057.35066-3-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:44:09 -04:00
Quinn Tran	c3d98b12ee	scsi: qla2xxx: Unable to act on RSCN for port online The device does not come online when the target port is online. There were multiple RSCNs indicating multiple devices were affected. Driver is in the process of finishing a fabric scan. A new RSCN (device up) arrived at the tail end of the last fabric scan. Driver mistakenly thinks the new RSCN is being taken care of by the previous fabric scan, where this notification is cleared and not acted on. The laser needs to be blinked again to get the device to show up. To prevent driver from accidentally clearing the RSCN notification, each RSCN is given a generation value. A fabric scan will scan for that generation(s). Any new RSCN arrive after the scan start will have a new generation value. This will trigger another scan to get latest data. The RSCN notification flag will be cleared when the scan is associate to that generation. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202406210538.w875N70K-lkp@intel.com/ Fixes: `bb2ca6b3f0` ("scsi: qla2xxx: Relogin during fabric disturbance") Cc: stable@vger.kernel.org Signed-off-by: Quinn Tran <qutran@marvell.com> Signed-off-by: Nilesh Javali <njavali@marvell.com> Link: https://lore.kernel.org/r/20240710171057.35066-2-njavali@marvell.com Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:44:09 -04:00
Martin K. Petersen	af8e69efd7	Merge patch series "Basic inline encryption support for ufs-exynos" Eric Biggers <ebiggers@kernel.org> says: Add support for Flash Memory Protector (FMP), which is the inline encryption hardware on Exynos and Exynos-based SoCs. Specifically, add support for the "traditional FMP mode" that works on many Exynos-based SoCs including gs101. This is the mode that uses "software keys" and is compatible with the upstream kernel's existing inline encryption framework in the block and filesystem layers. I plan to add support for the wrapped key support on gs101 at a later time. Tested on gs101 (specifically Pixel 6) by running the 'encrypt' group of xfstests on a filesystem mounted with the 'inlinecrypt' mount option. This patchset applies to v6.10-rc6, and it has no prerequisites that aren't already upstream. Link: https://lore.kernel.org/r/20240708235330.103590-1-ebiggers@kernel.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:33:34 -04:00
Eric Biggers	c96499fcb4	scsi: ufs: exynos: Add support for Flash Memory Protector (FMP) Add support for Flash Memory Protector (FMP), which is the inline encryption hardware on Exynos and Exynos-based SoCs. Specifically, add support for the "traditional FMP mode" that works on many Exynos-based SoCs including gs101. This is the mode that uses "software keys" and is compatible with the upstream kernel's existing inline encryption framework in the block and filesystem layers. I plan to add support for the wrapped key support on gs101 at a later time. Tested on gs101 (specifically Pixel 6) by running the 'encrypt' group of xfstests on a filesystem mounted with the 'inlinecrypt' mount option. Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240708235330.103590-7-ebiggers@kernel.org Reviewed-by: Peter Griffin <peter.griffin@linaro.org> Tested-by: Peter Griffin <peter.griffin@linaro.org> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:32:30 -04:00
Eric Biggers	4c45dba50a	scsi: ufs: core: Add UFSHCD_QUIRK_KEYS_IN_PRDT Since the nonstandard inline encryption support on Exynos SoCs requires that raw cryptographic keys be copied into the PRDT, it is desirable to zeroize those keys after each request to keep them from being left in memory. Therefore, add a quirk bit that enables the zeroization. We could instead do the zeroization unconditionally. However, using a quirk bit avoids adding the zeroization overhead to standard devices. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Peter Griffin <peter.griffin@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240708235330.103590-6-ebiggers@kernel.org Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:32:30 -04:00
Eric Biggers	8ecea3da15	scsi: ufs: core: Add fill_crypto_prdt variant op Add a variant op to allow host drivers to initialize nonstandard crypto-related fields in the PRDT. This is needed to support inline encryption on the "Exynos" UFS controller. Note that this will be used together with the support for overriding the PRDT entry size that was already added by commit `ada1e653a5` ("scsi: ufs: core: Allow UFS host drivers to override the sg entry size"). Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Peter Griffin <peter.griffin@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240708235330.103590-5-ebiggers@kernel.org Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:32:30 -04:00
Eric Biggers	e95881e008	scsi: ufs: core: Add UFSHCD_QUIRK_BROKEN_CRYPTO_ENABLE Add UFSHCD_QUIRK_BROKEN_CRYPTO_ENABLE which tells the UFS core to not use the crypto enable bit defined by the UFS specification. This is needed to support inline encryption on the "Exynos" UFS controller. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Peter Griffin <peter.griffin@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240708235330.103590-4-ebiggers@kernel.org Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:32:30 -04:00
Eric Biggers	ec99818afb	scsi: ufs: core: fold ufshcd_clear_keyslot() into its caller Fold ufshcd_clear_keyslot() into its only remaining caller. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Peter Griffin <peter.griffin@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240708235330.103590-3-ebiggers@kernel.org Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:32:29 -04:00
Eric Biggers	c2a90eee29	scsi: ufs: core: Add UFSHCD_QUIRK_CUSTOM_CRYPTO_PROFILE Add UFSHCD_QUIRK_CUSTOM_CRYPTO_PROFILE which lets UFS host drivers initialize the blk_crypto_profile themselves rather than have it be initialized by ufshcd-core according to the UFSHCI standard. This is needed to support inline encryption on the "Exynos" UFS controller which has a nonstandard interface. Reviewed-by: Bart Van Assche <bvanassche@acm.org> Reviewed-by: Peter Griffin <peter.griffin@linaro.org> Signed-off-by: Eric Biggers <ebiggers@google.com> Link: https://lore.kernel.org/r/20240708235330.103590-2-ebiggers@kernel.org Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:32:29 -04:00
Martin K. Petersen	e30618a480	Merge patch series "UFS patches for kernel 6.11" Bart Van Assche <bvanassche@acm.org> says: Hi Martin, Please consider this series of UFS driver patches for the next merge window. Thank you, Bart. Link: https://lore.kernel.org/r/20240708211716.2827751-1-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:26:08 -04:00
Martin K. Petersen	5e9a522b07	Merge branch '6.10/scsi-fixes' into 6.11/scsi-staging Pull in my fixes branch to resolve an mpi3mr merge conflict reported by sfr. Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:23:30 -04:00
Bart Van Assche	af568c7e82	scsi: ufs: mcq: Make .get_hba_mac() optional UFSHCI controllers that are compliant with the UFSHCI 4.0 standard report the maximum number of supported commands in the controller capabilities register. Use that value if .get_hba_mac == NULL. Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240708211716.2827751-11-bvanassche@acm.org Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:19:05 -04:00
Bart Van Assche	5e2053a419	scsi: ufs: mcq: Inline ufshcd_mcq_vops_get_hba_mac() Make ufshcd_mcq_decide_queue_depth() easier to read by inlining ufshcd_mcq_vops_get_hba_mac(). Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240708211716.2827751-10-bvanassche@acm.org Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:19:05 -04:00
Bart Van Assche	7e2c268dc3	scsi: ufs: mcq: Move the ufshcd_mcq_enable() call Move the ufshcd_mcq_enable() call from inside ufshcd_config_mcq() to the callers of this function. No functionality is changed by this patch. This patch makes a later patch easier to read ("scsi: ufs: Make .get_hba_mac() optional"). Cc: Peter Wang <peter.wang@mediatek.com> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240708211716.2827751-9-bvanassche@acm.org Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:19:05 -04:00
Bart Van Assche	4a8c859b44	scsi: ufs: mcq: Move the "hba->mcq_enabled = true" assignment Move the "hba->mcq_enabled = true" assignment to prevent that it gets duplicated by a later patch that will introduce more ufshcd_mcq_enable() calls. No functionality is changed by this patch. Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240708211716.2827751-8-bvanassche@acm.org Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:19:05 -04:00
Bart Van Assche	0fca3318e5	scsi: ufs: core: Inline is_mcq_enabled() Improve code readability by inlining is_mcq_enabled(). Cc: Peter Wang <peter.wang@mediatek.com> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240708211716.2827751-7-bvanassche@acm.org Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:19:05 -04:00
Bart Van Assche	f4750af708	scsi: ufs: core: Initialize hba->reserved_slot earlier Move the hba->reserved_slot and the host->can_queue assignments from ufshcd_config_mcq() into ufshcd_alloc_mcq(). The advantages of this change are as follows: - It becomes easier to verify that these two parameters are updated if hba->nutrs is updated. - It prevents unnecessary assignments to these two parameters. While ufshcd_config_mcq() is called during host reset, ufshcd_alloc_mcq() is not. Cc: Can Guo <quic_cang@quicinc.com> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240708211716.2827751-6-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:19:04 -04:00
Bart Van Assche	b53eb9a050	scsi: ufs: core: Rename the MASK_TRANSFER_REQUESTS_SLOTS constant Rename this constant to prepare for the introduction of the MASK_TRANSFER_REQUESTS_SLOTS_MCQ constant. The acronym "SDB" stands for "single doorbell" (mode). Reviewed-by: Peter Wang <peter.wang@mediatek.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240708211716.2827751-5-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:19:04 -04:00
Bart Van Assche	92c0b10fef	scsi: ufs: core: Remove two constants The SCSI host template members .cmd_per_lun and .can_queue are copied into the SCSI host data structure. Before these are used, these are overwritten by ufshcd_init(). Hence, this patch does not change any functionality. Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Keoseong Park <keosung.park@samsung.com> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240708211716.2827751-4-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:19:04 -04:00
Bart Van Assche	93ef12d92f	scsi: ufs: core: Initialize struct uic_command once Instead of first zero-initializing struct uic_command and next initializing it memberwise, initialize all members at once. Reviewed-by: Daejun Park <daejun7.park@samsung.com> Reviewed-by: Avri Altman <avri.altman@wdc.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240708211716.2827751-3-bvanassche@acm.org Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:19:04 -04:00
Bart Van Assche	d502dac69a	scsi: ufs: core: Declare functions once Several functions are declared in include/ufs/ufshcd.h and also in drivers/ufs/core/ufshcd-priv.h. Remove the duplicate declarations. Reviewed-by: Peter Wang <peter.wang@mediatek.com> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Link: https://lore.kernel.org/r/20240708211716.2827751-2-bvanassche@acm.org Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Keoseong Park <keosung.park@samsung.com> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-10 22:19:04 -04:00
Martin K. Petersen	6cd48c8f62	Merge patch series "mpi3mr: Support PCI Error Recovery" Sumit Saxena <sumit.saxena@broadcom.com> says: This patch series contains the changes done in the driver to support PCI error recovery. It is rework of older patch series from Ranjan Kumar, see [1]. [1] https://lore.kernel.org/all/20231214205900.270488-1-ranjan.kumar@broadcom.com/ Link: https://lore.kernel.org/r/20240627101735.18286-1-sumit.saxena@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-04 23:38:16 -04:00
Sumit Saxena	cf82b9e866	scsi: mpi3mr: Driver version update Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Link: https://lore.kernel.org/r/20240627101735.18286-4-sumit.saxena@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-04 23:37:07 -04:00
Sumit Saxena	1c342b0548	scsi: mpi3mr: Prevent PCI writes from driver during PCI error recovery Prevent interaction with the hardware while the error recovery in progress. Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com> Co-developed-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Link: https://lore.kernel.org/r/20240627101735.18286-3-sumit.saxena@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-04 23:37:07 -04:00
Sumit Saxena	30bafe1774	scsi: mpi3mr: Support PCI Error Recovery callback handlers PCI Error recovery support is required to recover the controller upon detection of PCI errors. Add support for the PCI error recovery callback handlers in mpi3mr driver. Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com> Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com> Co-developed-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com> Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com> Link: https://lore.kernel.org/r/20240627101735.18286-2-sumit.saxena@broadcom.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-04 23:37:07 -04:00
Martin K. Petersen	34438552c9	Merge patch series "Update lpfc to revision 14.4.0.3" Justin Tee <justintee8345@gmail.com> says: Update lpfc to revision 14.4.0.3 This patch set contains bug fixes related to discovery, submission of mailbox commands, and proper endianness conversions. The patches were cut against Martin's 6.11/scsi-queue tree. Link: https://lore.kernel.org/r/20240628172011.25921-1-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-04 23:25:40 -04:00
Justin Tee	41972df1a5	scsi: lpfc: Update lpfc version to 14.4.0.3 Update lpfc version to 14.4.0.3. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20240628172011.25921-9-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-04 23:24:52 -04:00
Justin Tee	8bc7c61764	scsi: lpfc: Revise lpfc_prep_embed_io routine with proper endian macro usages On big endian architectures, it is possible to run into a memory out of bounds pointer dereference when FCP targets are zoned. In lpfc_prep_embed_io, the memcpy(ptr, fcp_cmnd, sgl->sge_len) is referencing a little endian formatted sgl->sge_len value. So, the memcpy can cause big endian systems to crash. Redefine the *sgl ptr as a struct sli4_sge_le to make it clear that we are referring to a little endian formatted data structure. And, update the routine with proper le32_to_cpu macro usages. Fixes: `af20bb73ac` ("scsi: lpfc: Add support for 32 byte CDBs") Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20240628172011.25921-8-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-04 23:24:52 -04:00
Justin Tee	f65f31ac12	scsi: lpfc: Fix incorrect request len mbox field when setting trunking via sysfs When setting trunk modes through sysfs, the SLI_CONFIG mailbox command's command payload length is incorrectly hardcoded to 12 bytes. SLI_CONFIG's payload length field should be specified large enough to encompass both the submailbox command header and the submailbox request itself. Thus, replace the hardcoded 12 bytes with a clearer calculation by way of sizeof(struct lpfc_mbx_set_trunk_mode) - sizeof(struct lpfc_sli4_cfg_mhdr). Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20240628172011.25921-7-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-04 23:24:51 -04:00
Justin Tee	ede596b143	scsi: lpfc: Handle mailbox timeouts in lpfc_get_sfp_info The MBX_TIMEOUT return code is not handled in lpfc_get_sfp_info and the routine unconditionally frees submitted mailbox commands regardless of return status. The issue is that for MBX_TIMEOUT cases, when firmware returns SFP information at a later time, that same mailbox memory region references previously freed memory in its cmpl routine. Fix by adding checks for the MBX_TIMEOUT return code. During mailbox resource cleanup, check the mbox flag to make sure that the wait did not timeout. If the MBOX_WAKE flag is not set, then do not free the resources because it will be freed when firmware completes the mailbox at a later time in its cmpl routine. Also, increase the timeout from 30 to 60 seconds to accommodate boot scripts requiring longer timeouts. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20240628172011.25921-6-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-04 23:24:51 -04:00
Justin Tee	15e21dc6d6	scsi: lpfc: Fix handling of fully recovered fabric node in dev_loss callbk In rare cases when a fabric node is recovered after a link bounce and before dev_loss_tmo callbk is reached, the driver may leave the fabric node in an inconsistent state with the NLP_IN_DEV_LOSS flag perpetually set. In lpfc_dev_loss_tmo_callbk, a check is added for a recovered fabric node. If the node is recovered, then don't queue the lpfc_dev_loss_tmo_handler work. In lpfc_dev_loss_tmo_handler, the path taken for the recovered fabric nodes is updated to clear the NLP_IN_DEV_LOSS flag. Signed-off-by: Justin Tee <justin.tee@broadcom.com> Link: https://lore.kernel.org/r/20240628172011.25921-5-justintee8345@gmail.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>	2024-07-04 23:24:51 -04:00

1 2 3 4 5 ...

1279477 Commits