Commit Graph

1279456 Commits

Author SHA1 Message Date
Martin K. Petersen
af8e69efd7 Merge patch series "Basic inline encryption support for ufs-exynos"
Eric Biggers <ebiggers@kernel.org> says:

Add support for Flash Memory Protector (FMP), which is the inline
encryption hardware on Exynos and Exynos-based SoCs.

Specifically, add support for the "traditional FMP mode" that works on
many Exynos-based SoCs including gs101.  This is the mode that uses
"software keys" and is compatible with the upstream kernel's existing
inline encryption framework in the block and filesystem layers.  I
plan to add support for the wrapped key support on gs101 at a later
time.

Tested on gs101 (specifically Pixel 6) by running the 'encrypt' group
of xfstests on a filesystem mounted with the 'inlinecrypt' mount
option.

This patchset applies to v6.10-rc6, and it has no prerequisites that
aren't already upstream.

Link: https://lore.kernel.org/r/20240708235330.103590-1-ebiggers@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:33:34 -04:00
Eric Biggers
c96499fcb4 scsi: ufs: exynos: Add support for Flash Memory Protector (FMP)
Add support for Flash Memory Protector (FMP), which is the inline
encryption hardware on Exynos and Exynos-based SoCs.

Specifically, add support for the "traditional FMP mode" that works on many
Exynos-based SoCs including gs101.  This is the mode that uses "software
keys" and is compatible with the upstream kernel's existing inline
encryption framework in the block and filesystem layers.  I plan to add
support for the wrapped key support on gs101 at a later time.

Tested on gs101 (specifically Pixel 6) by running the 'encrypt' group of
xfstests on a filesystem mounted with the 'inlinecrypt' mount option.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20240708235330.103590-7-ebiggers@kernel.org
Reviewed-by: Peter Griffin <peter.griffin@linaro.org>
Tested-by: Peter Griffin <peter.griffin@linaro.org>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:32:30 -04:00
Eric Biggers
4c45dba50a scsi: ufs: core: Add UFSHCD_QUIRK_KEYS_IN_PRDT
Since the nonstandard inline encryption support on Exynos SoCs requires
that raw cryptographic keys be copied into the PRDT, it is desirable to
zeroize those keys after each request to keep them from being left in
memory.  Therefore, add a quirk bit that enables the zeroization.

We could instead do the zeroization unconditionally.  However, using a
quirk bit avoids adding the zeroization overhead to standard devices.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Peter Griffin <peter.griffin@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20240708235330.103590-6-ebiggers@kernel.org
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:32:30 -04:00
Eric Biggers
8ecea3da15 scsi: ufs: core: Add fill_crypto_prdt variant op
Add a variant op to allow host drivers to initialize nonstandard
crypto-related fields in the PRDT.  This is needed to support inline
encryption on the "Exynos" UFS controller.

Note that this will be used together with the support for overriding the
PRDT entry size that was already added by commit ada1e653a5 ("scsi: ufs:
core: Allow UFS host drivers to override the sg entry size").

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Peter Griffin <peter.griffin@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20240708235330.103590-5-ebiggers@kernel.org
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:32:30 -04:00
Eric Biggers
e95881e008 scsi: ufs: core: Add UFSHCD_QUIRK_BROKEN_CRYPTO_ENABLE
Add UFSHCD_QUIRK_BROKEN_CRYPTO_ENABLE which tells the UFS core to not use
the crypto enable bit defined by the UFS specification.  This is needed to
support inline encryption on the "Exynos" UFS controller.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Peter Griffin <peter.griffin@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20240708235330.103590-4-ebiggers@kernel.org
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:32:30 -04:00
Eric Biggers
ec99818afb scsi: ufs: core: fold ufshcd_clear_keyslot() into its caller
Fold ufshcd_clear_keyslot() into its only remaining caller.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Peter Griffin <peter.griffin@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20240708235330.103590-3-ebiggers@kernel.org
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:32:29 -04:00
Eric Biggers
c2a90eee29 scsi: ufs: core: Add UFSHCD_QUIRK_CUSTOM_CRYPTO_PROFILE
Add UFSHCD_QUIRK_CUSTOM_CRYPTO_PROFILE which lets UFS host drivers
initialize the blk_crypto_profile themselves rather than have it be
initialized by ufshcd-core according to the UFSHCI standard.  This is
needed to support inline encryption on the "Exynos" UFS controller which
has a nonstandard interface.

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Peter Griffin <peter.griffin@linaro.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Link: https://lore.kernel.org/r/20240708235330.103590-2-ebiggers@kernel.org
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:32:29 -04:00
Martin K. Petersen
e30618a480 Merge patch series "UFS patches for kernel 6.11"
Bart Van Assche <bvanassche@acm.org> says:

Hi Martin,

Please consider this series of UFS driver patches for the next merge window.

Thank you,

Bart.

Link: https://lore.kernel.org/r/20240708211716.2827751-1-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:26:08 -04:00
Martin K. Petersen
5e9a522b07 Merge branch '6.10/scsi-fixes' into 6.11/scsi-staging
Pull in my fixes branch to resolve an mpi3mr merge conflict reported
by sfr.

Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:23:30 -04:00
Bart Van Assche
af568c7e82 scsi: ufs: mcq: Make .get_hba_mac() optional
UFSHCI controllers that are compliant with the UFSHCI 4.0 standard report
the maximum number of supported commands in the controller capabilities
register. Use that value if .get_hba_mac == NULL.

Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240708211716.2827751-11-bvanassche@acm.org
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:19:05 -04:00
Bart Van Assche
5e2053a419 scsi: ufs: mcq: Inline ufshcd_mcq_vops_get_hba_mac()
Make ufshcd_mcq_decide_queue_depth() easier to read by inlining
ufshcd_mcq_vops_get_hba_mac().

Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240708211716.2827751-10-bvanassche@acm.org
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:19:05 -04:00
Bart Van Assche
7e2c268dc3 scsi: ufs: mcq: Move the ufshcd_mcq_enable() call
Move the ufshcd_mcq_enable() call from inside ufshcd_config_mcq() to the
callers of this function. No functionality is changed by this patch. This
patch makes a later patch easier to read ("scsi: ufs: Make .get_hba_mac()
optional").

Cc: Peter Wang <peter.wang@mediatek.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240708211716.2827751-9-bvanassche@acm.org
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:19:05 -04:00
Bart Van Assche
4a8c859b44 scsi: ufs: mcq: Move the "hba->mcq_enabled = true" assignment
Move the "hba->mcq_enabled = true" assignment to prevent that it gets
duplicated by a later patch that will introduce more ufshcd_mcq_enable()
calls.  No functionality is changed by this patch.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240708211716.2827751-8-bvanassche@acm.org
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:19:05 -04:00
Bart Van Assche
0fca3318e5 scsi: ufs: core: Inline is_mcq_enabled()
Improve code readability by inlining is_mcq_enabled().

Cc: Peter Wang <peter.wang@mediatek.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240708211716.2827751-7-bvanassche@acm.org
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:19:05 -04:00
Bart Van Assche
f4750af708 scsi: ufs: core: Initialize hba->reserved_slot earlier
Move the hba->reserved_slot and the host->can_queue assignments from
ufshcd_config_mcq() into ufshcd_alloc_mcq(). The advantages of this change
are as follows:

 - It becomes easier to verify that these two parameters are updated if
   hba->nutrs is updated.

 - It prevents unnecessary assignments to these two parameters. While
   ufshcd_config_mcq() is called during host reset, ufshcd_alloc_mcq() is
   not.

Cc: Can Guo <quic_cang@quicinc.com>
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240708211716.2827751-6-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:19:04 -04:00
Bart Van Assche
b53eb9a050 scsi: ufs: core: Rename the MASK_TRANSFER_REQUESTS_SLOTS constant
Rename this constant to prepare for the introduction of the
MASK_TRANSFER_REQUESTS_SLOTS_MCQ constant. The acronym "SDB" stands for
"single doorbell" (mode).

Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240708211716.2827751-5-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:19:04 -04:00
Bart Van Assche
92c0b10fef scsi: ufs: core: Remove two constants
The SCSI host template members .cmd_per_lun and .can_queue are copied into
the SCSI host data structure. Before these are used, these are overwritten
by ufshcd_init(). Hence, this patch does not change any functionality.

Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Keoseong Park <keosung.park@samsung.com>
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240708211716.2827751-4-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:19:04 -04:00
Bart Van Assche
93ef12d92f scsi: ufs: core: Initialize struct uic_command once
Instead of first zero-initializing struct uic_command and next initializing
it memberwise, initialize all members at once.

Reviewed-by: Daejun Park <daejun7.park@samsung.com>
Reviewed-by: Avri Altman <avri.altman@wdc.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240708211716.2827751-3-bvanassche@acm.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:19:04 -04:00
Bart Van Assche
d502dac69a scsi: ufs: core: Declare functions once
Several functions are declared in include/ufs/ufshcd.h and also in
drivers/ufs/core/ufshcd-priv.h. Remove the duplicate declarations.

Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20240708211716.2827751-2-bvanassche@acm.org
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Keoseong Park <keosung.park@samsung.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-10 22:19:04 -04:00
Martin K. Petersen
6cd48c8f62 Merge patch series "mpi3mr: Support PCI Error Recovery"
Sumit Saxena <sumit.saxena@broadcom.com> says:

This patch series contains the changes done in the driver to support
PCI error recovery. It is rework of older patch series from Ranjan
Kumar, see [1].

[1] https://lore.kernel.org/all/20231214205900.270488-1-ranjan.kumar@broadcom.com/

Link: https://lore.kernel.org/r/20240627101735.18286-1-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:38:16 -04:00
Sumit Saxena
cf82b9e866 scsi: mpi3mr: Driver version update
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20240627101735.18286-4-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:37:07 -04:00
Sumit Saxena
1c342b0548 scsi: mpi3mr: Prevent PCI writes from driver during PCI error recovery
Prevent interaction with the hardware while the error recovery in progress.

Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Co-developed-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20240627101735.18286-3-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:37:07 -04:00
Sumit Saxena
30bafe1774 scsi: mpi3mr: Support PCI Error Recovery callback handlers
PCI Error recovery support is required to recover the controller upon
detection of PCI errors. Add support for the PCI error recovery callback
handlers in mpi3mr driver.

Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Co-developed-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Link: https://lore.kernel.org/r/20240627101735.18286-2-sumit.saxena@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:37:07 -04:00
Martin K. Petersen
34438552c9 Merge patch series "Update lpfc to revision 14.4.0.3"
Justin Tee <justintee8345@gmail.com> says:

Update lpfc to revision 14.4.0.3

This patch set contains bug fixes related to discovery, submission of
mailbox commands, and proper endianness conversions.

The patches were cut against Martin's 6.11/scsi-queue tree.

Link: https://lore.kernel.org/r/20240628172011.25921-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:25:40 -04:00
Justin Tee
41972df1a5 scsi: lpfc: Update lpfc version to 14.4.0.3
Update lpfc version to 14.4.0.3.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-9-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:52 -04:00
Justin Tee
8bc7c61764 scsi: lpfc: Revise lpfc_prep_embed_io routine with proper endian macro usages
On big endian architectures, it is possible to run into a memory out of
bounds pointer dereference when FCP targets are zoned.

In lpfc_prep_embed_io, the memcpy(ptr, fcp_cmnd, sgl->sge_len) is
referencing a little endian formatted sgl->sge_len value.  So, the memcpy
can cause big endian systems to crash.

Redefine the *sgl ptr as a struct sli4_sge_le to make it clear that we are
referring to a little endian formatted data structure.  And, update the
routine with proper le32_to_cpu macro usages.

Fixes: af20bb73ac ("scsi: lpfc: Add support for 32 byte CDBs")
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-8-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:52 -04:00
Justin Tee
f65f31ac12 scsi: lpfc: Fix incorrect request len mbox field when setting trunking via sysfs
When setting trunk modes through sysfs, the SLI_CONFIG mailbox command's
command payload length is incorrectly hardcoded to 12 bytes.  SLI_CONFIG's
payload length field should be specified large enough to encompass both the
submailbox command header and the submailbox request itself.

Thus, replace the hardcoded 12 bytes with a clearer calculation by way of
sizeof(struct lpfc_mbx_set_trunk_mode) - sizeof(struct lpfc_sli4_cfg_mhdr).

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:51 -04:00
Justin Tee
ede596b143 scsi: lpfc: Handle mailbox timeouts in lpfc_get_sfp_info
The MBX_TIMEOUT return code is not handled in lpfc_get_sfp_info and the
routine unconditionally frees submitted mailbox commands regardless of
return status.  The issue is that for MBX_TIMEOUT cases, when firmware
returns SFP information at a later time, that same mailbox memory region
references previously freed memory in its cmpl routine.

Fix by adding checks for the MBX_TIMEOUT return code.  During mailbox
resource cleanup, check the mbox flag to make sure that the wait did not
timeout.  If the MBOX_WAKE flag is not set, then do not free the resources
because it will be freed when firmware completes the mailbox at a later
time in its cmpl routine.

Also, increase the timeout from 30 to 60 seconds to accommodate boot
scripts requiring longer timeouts.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:51 -04:00
Justin Tee
15e21dc6d6 scsi: lpfc: Fix handling of fully recovered fabric node in dev_loss callbk
In rare cases when a fabric node is recovered after a link bounce and
before dev_loss_tmo callbk is reached, the driver may leave the fabric node
in an inconsistent state with the NLP_IN_DEV_LOSS flag perpetually set.

In lpfc_dev_loss_tmo_callbk, a check is added for a recovered fabric node.
If the node is recovered, then don't queue the lpfc_dev_loss_tmo_handler
work. In lpfc_dev_loss_tmo_handler, the path taken for the recovered fabric
nodes is updated to clear the NLP_IN_DEV_LOSS flag.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:51 -04:00
Justin Tee
aeaf117cc7 scsi: lpfc: Relax PRLI issue conditions after GID_FT response
If previously in REG_LOGIN_ISSUE state, then remove the requirement that
PLOGI must have been received from the remote port before issuing a PRLI.
After GID_FT completes, it does not matter whether the driver itself sent a
PLOGI or received one.  The fact that we're in REG_LOGIN_ISSUE state simply
means that the next state should be issuing the PRLI to continue discovery
of the remote port.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:51 -04:00
Justin Tee
9609385dd9 scsi: lpfc: Allow DEVICE_RECOVERY mode after RSCN receipt if in PRLI_ISSUE state
Certain vendor specific targets initially register with the fabric as an
initiator function first and then re-register as a target function
afterwards.

The timing of the target function re-registration can cause a race
condition such that the driver is stuck assuming the remote port as an
initiator function and never discovers the target's hosted LUNs.

Expand the nlp_state qualifier to also include NLP_STE_PRLI_ISSUE because
the state means that PRLI was issued but we have not quite reached
MAPPED_NODE state yet.  If we received an RSCN in the PRLI_ISSUE state,
then we should restart discovery again by going into DEVICE_RECOVERY.

Fixes: dded1dc31a ("scsi: lpfc: Modify when a node should be put in device recovery mode during RSCN")
Cc: <stable@vger.kernel.org> # v6.6+
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:51 -04:00
Justin Tee
e999ef1542 scsi: lpfc: Cancel ELS WQE instead of issuing abort when SLI port is inactive
During SLI port errata events, there should be no expectation that
submitted outstanding WQEs will return back CQEs.  In these situations, the
driver should not rely on receiving CQEs from the SLI port to signal WQE
resource clean up.

Put an sli_flag LPFC_SLI_ACTIVE check in lpfc_els_flush_cmd() when walking
the txcmplq.  The sli_flag check helps determine whether to issue an abort
or driver based cancel on outstanding WQEs.  If !LPFC_SLI_ACTIVE, then
there's no point to issue anything to the SLI port.  Instead, let the
driver based cancel logic clean up the submitted WQE resources.

Also, enhance some abort log messages that help with future debugging.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240628172011.25921-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:24:51 -04:00
Damien Le Moal
7a6bbc2829 scsi: sd: Do not repeat the starting disk message
The SCSI disk message "Starting disk" to signal resuming of a suspended
disk is printed in both sd_resume() and sd_resume_common() which results
in this message being printed twice when resuming from e.g. autosuspend:

$ echo 5000 > /sys/block/sda/device/power/autosuspend_delay_ms
$ echo auto > /sys/block/sda/device/power/control

[ 4962.438293] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 4962.501121] sd 0:0:0:0: [sda] Stopping disk

$ echo on > /sys/block/sda/device/power/control

[ 4972.805851] sd 0:0:0:0: [sda] Starting disk
[ 4980.558806] sd 0:0:0:0: [sda] Starting disk

Fix this double print by removing the call to sd_printk() from sd_resume()
and moving the call to sd_printk() in sd_resume_common() earlier in the
function, before the check using sd_do_start_stop().  Doing so, the message
is printed once regardless if sd_resume_common() actually executes
sd_start_stop_device() (i.e. SCSI device case) or not (libsas and libata
managed ATA devices case).

Fixes: 0c76106cb9 ("scsi: sd: Fix TCG OPAL unlock on system resume")
Cc: stable@vger.kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Link: https://lore.kernel.org/r/20240701215326.128067-1-dlemoal@kernel.org
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:05:46 -04:00
Peter Wang
74736103fb scsi: ufs: core: Fix ufshcd_abort_one racing issue
When ufshcd_abort_one is racing with the completion ISR, the completed tag
of the request's mq_hctx pointer will be set to NULL by ISR.  Return
success when request is completed by ISR because ufshcd_abort_one does not
need to do anything.

The racing flow is:

Thread A
ufshcd_err_handler					step 1
	...
	ufshcd_abort_one
		ufshcd_try_to_abort_task
			ufshcd_cmd_inflight(true)	step 3
		ufshcd_mcq_req_to_hwq
			blk_mq_unique_tag
				rq->mq_hctx->queue_num	step 5

Thread B
ufs_mtk_mcq_intr(cq complete ISR)			step 2
	scsi_done
		...
		__blk_mq_free_request
			rq->mq_hctx = NULL;		step 4

Below is KE back trace.
  ufshcd_try_to_abort_task: cmd at tag 41 not pending in the device.
  ufshcd_try_to_abort_task: cmd at tag=41 is cleared.
  Aborting tag 41 / CDB 0x28 succeeded
  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000194
  pc : [0xffffffddd7a79bf8] blk_mq_unique_tag+0x8/0x14
  lr : [0xffffffddd6155b84] ufshcd_mcq_req_to_hwq+0x1c/0x40 [ufs_mediatek_mod_ise]
   do_mem_abort+0x58/0x118
   el1_abort+0x3c/0x5c
   el1h_64_sync_handler+0x54/0x90
   el1h_64_sync+0x68/0x6c
   blk_mq_unique_tag+0x8/0x14
   ufshcd_err_handler+0xae4/0xfa8 [ufs_mediatek_mod_ise]
   process_one_work+0x208/0x4fc
   worker_thread+0x228/0x438
   kthread+0x104/0x1d4
   ret_from_fork+0x10/0x20

Fixes: 93e6c0e19d ("scsi: ufs: core: Clear cmd if abort succeeds in MCQ mode")
Suggested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20240628070030.30929-3-peter.wang@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:04:15 -04:00
Peter Wang
9307a998cb scsi: ufs: core: Fix ufshcd_clear_cmd racing issue
When ufshcd_clear_cmd is racing with the completion ISR, the completed tag
of the request's mq_hctx pointer will be set to NULL by the ISR.  And
ufshcd_clear_cmd's call to ufshcd_mcq_req_to_hwq will get NULL pointer KE.
Return success when the request is completed by ISR because sq does not
need cleanup.

The racing flow is:

Thread A
ufshcd_err_handler					step 1
	ufshcd_try_to_abort_task
		ufshcd_cmd_inflight(true)		step 3
		ufshcd_clear_cmd
			...
			ufshcd_mcq_req_to_hwq
			blk_mq_unique_tag
				rq->mq_hctx->queue_num	step 5

Thread B
ufs_mtk_mcq_intr(cq complete ISR)			step 2
	scsi_done
		...
		__blk_mq_free_request
			rq->mq_hctx = NULL;		step 4

Below is KE back trace:

  ufshcd_try_to_abort_task: cmd pending in the device. tag = 6
  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000194
   pc : [0xffffffd589679bf8] blk_mq_unique_tag+0x8/0x14
   lr : [0xffffffd5862f95b4] ufshcd_mcq_sq_cleanup+0x6c/0x1cc [ufs_mediatek_mod_ise]
   Workqueue: ufs_eh_wq_0 ufshcd_err_handler [ufs_mediatek_mod_ise]
   Call trace:
    dump_backtrace+0xf8/0x148
    show_stack+0x18/0x24
    dump_stack_lvl+0x60/0x7c
    dump_stack+0x18/0x3c
    mrdump_common_die+0x24c/0x398 [mrdump]
    ipanic_die+0x20/0x34 [mrdump]
    notify_die+0x80/0xd8
    die+0x94/0x2b8
    __do_kernel_fault+0x264/0x298
    do_page_fault+0xa4/0x4b8
    do_translation_fault+0x38/0x54
    do_mem_abort+0x58/0x118
    el1_abort+0x3c/0x5c
    el1h_64_sync_handler+0x54/0x90
    el1h_64_sync+0x68/0x6c
    blk_mq_unique_tag+0x8/0x14
    ufshcd_clear_cmd+0x34/0x118 [ufs_mediatek_mod_ise]
    ufshcd_try_to_abort_task+0x2c8/0x5b4 [ufs_mediatek_mod_ise]
    ufshcd_err_handler+0xa7c/0xfa8 [ufs_mediatek_mod_ise]
    process_one_work+0x208/0x4fc
    worker_thread+0x228/0x438
    kthread+0x104/0x1d4
    ret_from_fork+0x10/0x20

Fixes: 8d72903489 ("scsi: ufs: mcq: Add supporting functions for MCQ abort")
Suggested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Peter Wang <peter.wang@mediatek.com>
Link: https://lore.kernel.org/r/20240628070030.30929-2-peter.wang@mediatek.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 23:04:14 -04:00
Terrence Adams
76a20140ef scsi: pm8001: Update log level when reading config table
Reading the main config table occurs as a part of initialization in
pm80xx_chip_init(). Because of this it makes more sense to have it be a
part of the INIT logging.

Signed-off-by: Terrence Adams <tadamsjr@google.com>
Link: https://lore.kernel.org/r/20240627155924.2361370-3-tadamsjr@google.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 22:53:13 -04:00
Igor Pylypiv
e4f949ef15 scsi: pm80xx: Set phy->enable_completion only when we wait for it
pm8001_phy_control() populates the enable_completion pointer with a stack
address, sends a PHY_LINK_RESET / PHY_HARD_RESET, waits 300 ms, and
returns. The problem arises when a phy control response comes late.  After
300 ms the pm8001_phy_control() function returns and the passed
enable_completion stack address is no longer valid. Late phy control
response invokes complete() on a dangling enable_completion pointer which
leads to a kernel crash.

Signed-off-by: Igor Pylypiv <ipylypiv@google.com>
Signed-off-by: Terrence Adams <tadamsjr@google.com>
Link: https://lore.kernel.org/r/20240627155924.2361370-2-tadamsjr@google.com
Acked-by: Jack Wang <jinpu.wang@ionos.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 22:53:13 -04:00
Kyoungrul Kim
7cbff570db scsi: ufs: core: Remove SCSI host only if added
If host tries to remove ufshcd driver from a UFS device it would cause a
kernel panic if ufshcd_async_scan fails during ufshcd_probe_hba before
adding a SCSI host with scsi_add_host and MCQ is enabled since SCSI host
has been defered after MCQ configuration introduced by commit 0cab4023ec
("scsi: ufs: core: Defer adding host to SCSI if MCQ is supported").

To guarantee that SCSI host is removed only if it has been added, set the
scsi_host_added flag to true after adding a SCSI host and check whether it
is set or not before removing it.

Signed-off-by: Kyoungrul Kim <k831.kim@samsung.com>
Signed-off-by: Minwoo Im <minwoo.im@samsung.com>
Link: https://lore.kernel.org/r/20240627085104epcms2p5897a3870ea5c6416aa44f94df6c543d7@epcms2p5
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 22:51:31 -04:00
Ram Prakash Gupta
ed7dac86f1 scsi: ufs: qcom: Enable suspending clk scaling on no request
Enable suspending clk scaling on no request for Qualcomm SoC.

Signed-off-by: Ram Prakash Gupta <quic_rampraka@quicinc.com>
Link: https://lore.kernel.org/r/20240627083756.25340-3-quic_rampraka@quicinc.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 22:48:33 -04:00
Ram Prakash Gupta
50183ac2cf scsi: ufs: core: Suspend clk scaling on no request
Currently UFS clk scaling is getting suspended only when the clks are
scaled down. When high load is generated, a huge amount of latency is added
due to scaling up the clk and completing the request post that.

Suspending the scaling in its existing state when high load is generated
improves the random performance KPI by 28%. So suspending the scaling when
there are no requests. And the clk would be put in low scaled state when
the actual request load is low.

Make this change optional by having the check enabled using vops since for
some devices suspending without bringing the clk in low scaled state might
have impact on power consumption of the SoC.

Signed-off-by: Ram Prakash Gupta <quic_rampraka@quicinc.com>
Link: https://lore.kernel.org/r/20240627083756.25340-2-quic_rampraka@quicinc.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 22:48:33 -04:00
Tomas Henzl
de24085328 scsi: mpi3mr: Correct a test in mpi3mr_sas_port_add()
The test for a possible shift overflow is not correct. Fix it by replacing
the '>' with a '>='.

Signed-off-by: Tomas Henzl <thenzl@redhat.com>
Link: https://lore.kernel.org/r/20240627074827.13672-1-thenzl@redhat.com
Suggested-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-07-04 22:40:32 -04:00
Martin K. Petersen
06b91c00db Merge patch series "mpi3mr: Host diag buffer support"
Ranjan Kumar <ranjan.kumar@broadcom.com> says:

The controllers managed by mpi3mr driver requires system memory to
save hardware and firmware diagnostic information, this patch set
enhances the drivers to provide host memory to the controller for
diagnostic information.  This patch set also provides driver changes
to push kernel messages into the diagnostic buffers reserved for the
driver, so that the information will be available as part of debug
data fetched from the controller.  In addition, support for
configuring automatic diagnostic information is added in the driver.

Link: https://lore.kernel.org/r/20240626102646.14298-1-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26 23:30:54 -04:00
Ranjan Kumar
3f7e469987 scsi: mpi3mr: Update driver version to 8.9.1.0.50
Update driver version to 8.9.1.0.50

Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20240626102646.14298-5-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26 23:30:09 -04:00
Ranjan Kumar
78b506984e scsi: mpi3mr: Add ioctl support for HDB
Add interface for applications to manage the host diagnostic buffers and
update the automatic diag buffer capture triggers.

Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20240626102646.14298-4-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26 23:30:09 -04:00
Ranjan Kumar
d8d08d1638 scsi: mpi3mr: Trigger support
Add functions to process automatic diag triggers. If a condition defined in
the triggers is met, the driver will call appropriate controller functions
to save the diagnostic information.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202405151955.BiAWI1SY-lkp@intel.com/
Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20240626102646.14298-3-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26 23:30:09 -04:00
Ranjan Kumar
fc44449411 scsi: mpi3mr: HDB allocation and posting for hardware and firmware buffers
To be able to debug controller problems it is beneficial to allocate and
configure system/host memory buffers which can be used to capture hardware
and firmware diagnostic information.

Add functions required to allocate and post firmware and hardware
diagnostic buffers to the controller and to set up automatic diagnostic
capture triggers.

Captures will be triggered under the following circumstances:

 1. Firmware is in FAULT state.

 2. Admin commands time out.

 3. Controller reset caused due to I/O timeout

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202405151758.7xrJz6rp-lkp@intel.com/
Co-developed-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Sathya Prakash <sathya.prakash@broadcom.com>
Signed-off-by: Ranjan Kumar <ranjan.kumar@broadcom.com>
Link: https://lore.kernel.org/r/20240626102646.14298-2-ranjan.kumar@broadcom.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26 23:30:09 -04:00
Adrian Hunter
bdee2f1dcd scsi: ufs: ufs-pci: Add support for Intel Panther Lake
Add PCI ID to support Intel Panther Lake, same as MTL.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20240618073158.38504-1-adrian.hunter@intel.com
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26 23:21:10 -04:00
Jeff Johnson
4d66ecc6e5 scsi: ufs: qcom: Add missing MODULE_DESCRIPTION() macro
With ARCH=arm64, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/ufs/host/ufs-qcom.o

Add the missing invocation of the MODULE_DESCRIPTION() macro.

Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Link: https://lore.kernel.org/r/20240625-md-drivers-ufs-host-v2-1-59a56974b05a@quicinc.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26 22:50:59 -04:00
Huai-Yuan Liu
5e0bf3e8ae scsi: lpfc: Fix a possible null pointer dereference
In function lpfc_xcvr_data_show, the memory allocation with kmalloc might
fail, thereby making rdp_context a null pointer. In the following context
and functions that use this pointer, there are dereferencing operations,
leading to null pointer dereference.

To fix this issue, a null pointer check should be added. If it is null,
use scnprintf to notify the user and return len.

Fixes: 479b0917e4 ("scsi: lpfc: Create a sysfs entry called lpfc_xcvr_data for transceiver info")
Signed-off-by: Huai-Yuan Liu <qq810974084@gmail.com>
Link: https://lore.kernel.org/r/20240621082545.449170-1-qq810974084@gmail.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26 22:45:04 -04:00
Xingui Yang
ab2068a6fb scsi: libsas: Fix exp-attached device scan after probe failure scanned in again after probe failed
The expander phy will be treated as broadcast flutter in the next
revalidation after the exp-attached end device probe failed, as follows:

[78779.654026] sas: broadcast received: 0
[78779.654037] sas: REVALIDATING DOMAIN on port 0, pid:10
[78779.654680] sas: ex 500e004aaaaaaa1f phy05 change count has changed
[78779.662977] sas: ex 500e004aaaaaaa1f phy05 originated BROADCAST(CHANGE)
[78779.662986] sas: ex 500e004aaaaaaa1f phy05 new device attached
[78779.663079] sas: ex 500e004aaaaaaa1f phy05:U:8 attached: 500e004aaaaaaa05 (stp)
[78779.693542] hisi_sas_v3_hw 0000:b4:02.0: dev[16:5] found
[78779.701155] sas: done REVALIDATING DOMAIN on port 0, pid:10, res 0x0
[78779.707864] sas: Enter sas_scsi_recover_host busy: 0 failed: 0
...
[78835.161307] sas: --- Exit sas_scsi_recover_host: busy: 0 failed: 0 tries: 1
[78835.171344] sas: sas_probe_sata: for exp-attached device 500e004aaaaaaa05 returned -19
[78835.180879] hisi_sas_v3_hw 0000:b4:02.0: dev[16:5] is gone
[78835.187487] sas: broadcast received: 0
[78835.187504] sas: REVALIDATING DOMAIN on port 0, pid:10
[78835.188263] sas: ex 500e004aaaaaaa1f phy05 change count has changed
[78835.195870] sas: ex 500e004aaaaaaa1f phy05 originated BROADCAST(CHANGE)
[78835.195875] sas: ex 500e004aaaaaaa1f rediscovering phy05
[78835.196022] sas: ex 500e004aaaaaaa1f phy05:U:A attached: 500e004aaaaaaa05 (stp)
[78835.196026] sas: ex 500e004aaaaaaa1f phy05 broadcast flutter
[78835.197615] sas: done REVALIDATING DOMAIN on port 0, pid:10, res 0x0

The cause of the problem is that the related ex_phy's attached_sas_addr was
not cleared after the end device probe failed, so reset it.

Signed-off-by: Xingui Yang <yangxingui@huawei.com>
Link: https://lore.kernel.org/r/20240619091742.25465-1-yangxingui@huawei.com
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
2024-06-26 22:31:53 -04:00