Finn Thain <fthain@linux-m68k.org> says:
This series begins with some work on the mac_scsi driver to improve
compatibility with SCSI2SD v5 devices. Better error handling is needed
there because the PDMA hardware does not tolerate the write latency
spikes which SD cards can produce.
A bug is fixed in the 5380 core driver so that scatter/gather can be
enabled in mac_scsi.
Several patches at the end of this series improve robustness and
correctness in the core driver.
This series has been tested on a variety of mac_scsi hosts. A variety
of SCSI targets was also tested, including Quantum HDD, Fujitsu HDD,
Iomega FDD, Ricoh CD-RW, Matsushita CD-ROM, SCSI2SD and BlueSCSI.
Link: https://lore.kernel.org/r/cover.1723001788.git.fthain@linux-m68k.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
It's not an error for a target to change the bus phase during a transfer.
Unfortunately, the FLAG_DMA_FIXUP workaround does not allow for that -- a
phase change produces a DRQ timeout error and the device borken flag will
be set.
Check the phase match bit during FLAG_DMA_FIXUP processing. Don't forget to
decrement the command residual. While we are here, change shost_printk()
into scmd_printk() for better consistency with other DMA error messages.
Tested-by: Stan Johnson <userm57@yahoo.com>
Fixes: 55181be8ce ("ncr5380: Replace redundant flags with FLAG_NO_DMA_FIXUP")
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/99dc7d1f4c825621b5b120963a69f6cd3e9ca659.1723001788.git.fthain@linux-m68k.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
SD cards can produce write latency spikes on the order of a hundred
milliseconds. If the target firmware does not hide that latency during DATA
IN and OUT phases it can cause the PDMA circuitry to raise a processor bus
fault which in turn leads to an unreliable byte count and a DMA overrun.
The Last Byte Sent flag is used to detect the overrun but this mechanism is
unreliable on some systems. Instead, set a DID_ERROR result whenever there
is a bus fault during a PDMA send, unless the cause was a phase mismatch.
Cc: stable@vger.kernel.org # 5.15+
Reported-and-tested-by: Stan Johnson <userm57@yahoo.com>
Fixes: 7c1f3e3447 ("scsi: mac_scsi: Treat Last Byte Sent time-out as failure")
Signed-off-by: Finn Thain <fthain@linux-m68k.org>
Link: https://lore.kernel.org/r/cc38df687ace2c4ffc375a683b2502fc476b600d.1723001788.git.fthain@linux-m68k.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Replace of_get_property() with the type specific
of_property_count_u32_elems() to get the property length.
This is part of a larger effort to remove callers of of_get_property() and
similar functions. of_get_property() leaks the DT property data pointer
which is a problem for dynamically allocated nodes which may be freed.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20240808170704.1438658-1-robh@kernel.org
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Use of_property_present() to test for property presence rather than
of_find_property(). This is part of a larger effort to remove callers of
of_find_property() and similar functions. of_find_property() leaks the DT
struct property and data pointers which is a problem for dynamically
allocated nodes which may be freed.
Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
Link: https://lore.kernel.org/r/20240808170644.1436991-1-robh@kernel.org
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Don Brace <don.brace@microchip.com> says:
These patches are based on Martin Petersen's 6.11/scsi-queue tree
https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git 6.11/scsi-queue
The functional changes of note to smartpqi are for: multipath failover
and improving the accuracy of our RAID bypass counter.
For multipath we are:
Reverting commit 94a68c8143 ("scsi: smartpqi: Quickly propagate
path failures to SCSI midlayer") because under certain rare
conditions involving encryption-enabled devices, a false path
failure is reported to the SML causing multipath to failover to
the other path.
Improving errors returned from the driver back to the SML by
checking for error codes returned from the firmware and returning
the correct ASC/ASCQ codes to the SML.
The other two patches add PCI-IDs for new controllers and change the
driver version.
Link: https://lore.kernel.org/r/20240711194704.982400-1-don.brace@microchip.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Justin Tee <justintee8345@gmail.com> says:
Update lpfc to revision 14.4.0.4
This patch set contains diagnostic logging improvements, a minor clean
up when submitting abort requests, a bug fix related to reset and
errata paths, and modifications to FLOGI and PRLO ELS command
handling.
The patches were cut against Martin's 6.11/scsi-queue tree.
Link: https://lore.kernel.org/r/20240726231512.92867-1-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
A kref imbalance occurs when handling an unsolicited PRLO in direct
attached topology.
Rework PRLO rcv handling when in MAPPED state. Save the state that we were
handling a PRLO by setting nlp_last_elscmd to ELS_CMD_PRLO. Then in the
lpfc_cmpl_els_logo_acc() completion routine, manually restart discovery.
By issuing the PLOGI, which nlp_gets, before nlp_put at the end of the
lpfc_cmpl_els_logo_acc() routine, we are saving us from a final nlp_put.
And, we are still allowing the unreg_rpi to happen.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In direct attached topology, certain target vendors that are quick to issue
FLOGI followed by a cable pull for more than dev_loss_tmo may result in a
kref imbalance for the remote port ndlp object.
Add an nlp_get when the defer_flogi_acc flag is set. This is expected to
balance the nlp_put in the defer_flogi_acc clause in the
lpfc_issue_els_flogi() routine. Because we need to retain the ndlp ptr,
reorganize all of the defer_flogi_acc information into one
lpfc_defer_flogi_acc struct.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The vport->vmid_flag is unintentionally cleared twice after an issue_lip
via the lpfc_reinit_vmid routine().
The first call to lpfc_reinit_vmid() is in lpfc_cmpl_els_flogi(). Then
lpfc_cmpl_els_flogi_fabric() calls lpfc_register_new_vport(), which calls
lpfc_cmpl_reg_new_vport() when the mbox command completes and calls
lpfc_reinit_vmid() a second time.
Fix by moving the vmid_flag clear outside of the lpfc_reinit_vmid() routine
so that vmid_flag is only cleared once upon FLOGI completion.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
When the HBA is undergoing a reset or is handling an errata event, NULL ptr
dereference crashes may occur in routines such as
lpfc_sli_flush_io_rings(), lpfc_dev_loss_tmo_callbk(), or
lpfc_abort_handler().
Add NULL ptr checks before dereferencing hdwq pointers that may have been
freed due to operations colliding with a reset or errata event handler.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-4-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
During diagnostics, it has been determined that the 0115 log message for
receipt of unknown ELS cmds does not benefit from trace buffer dumps. The
trace buffer dump floods the console with unnecessary information, and the
singular LOG_ELS flag has proven more beneficial in debugging efforts when
dealing with unknown ELS cmds.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240726231512.92867-2-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The default UIC command timeout still remains 500ms. Allow platform
drivers to override the UIC command timeout if desired.
In a real product, the 500ms timeout value is probably good enough.
However, during the product development where there are a lot of logging
and debug messages being printed to the UART console, interrupt starvations
happen occasionally because the UART may print long debug messages from
different modules in the system. While printing, the UART may have
interrupts disabled for more than 500ms, causing UIC command timeout. The
UIC command timeout would trigger more printing from the UFS driver, and
eventually a watchdog timeout may occur unnecessarily.
Add support for overriding the UIC command timeout value with the newly
created uic_cmd_timeout kernel module parameter. Default value is
500ms. Supported values range from 500ms to 2 seconds.
Signed-off-by: Bao D. Nguyen <quic_nguyenb@quicinc.com>
Link: https://lore.kernel.org/r/e4e1c87f3f867f270a3d4b5d57a00139ff0e9741.1721792309.git.quic_nguyenb@quicinc.com
Suggested-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Peter Wang <peter.wang@mediatek.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Replace the deprecated[1] use of 1-element arrays in struct sgmap, struct
sgmap64, struct sgmapraw, struct user_sgmap, and struct user_sgmap64 with
modern flexible arrays. Additionally remove struct user_sgmapraw as it is
unused.
The resulting binary output differences from this change are limited only
to stack space consumption of the smaller "srbu" variable in
aac_issue_safw_bmic_identify() and aac_get_safw_ciss_luns(), as well as the
smaller associated pair of memcpy()s in
aac_send_safw_bmic_cmd(). Artificially growing the size of srbu back to its
prior size removes all binary differences[2].
As an aside, after studying the aacraid driver code I wonder how
aac_send_wellness_command() ever works. It is reporting a size 4 bytes too
small for what it has constructed in memory in the DMA region: sgentry64 is
size 12, whereas sgentry is size 8. Perhaps the hardware doesn't
care. (Regardless, it is unchanged by this patch.)
Link: https://github.com/KSPP/linux/issues/79 [1]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=dev/v6.10-rc2/1-element&id=45e6226bcbc5e982541754eca7ac29f403e82f5e [2]
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20240711215739.208776-2-kees@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
struct aac_srb_unit contains struct aac_srb, which contains struct sgmap,
which ends in a (currently) "fake" (1-element) flexible array. Converting
this to a flexible array is needed so that runtime bounds checking won't
think the array is fixed size (i.e. under CONFIG_FORTIFY_SOURCE=y and/or
CONFIG_UBSAN_BOUNDS=y), as other parts of aacraid use struct sgmap as a
flexible array.
It is not legal to have a flexible array in the middle of a structure, so
it either needs to be split up or rearranged so that it is at the end of
the structure. Luckily, struct aac_srb_unit, which is exclusively
consumed/updated by aac_send_safw_bmic_cmd(), does not depend on member
ordering.
The values set in the on-stack struct aac_srb_unit instance "srbu" by the
only two callers, aac_issue_safw_bmic_identify() and
aac_get_safw_ciss_luns(), do not contain anything in srbu.srb.sgmap.sg, and
they both implicitly initialize srbu.srb.sgmap.count to 0 during
memset(). For example:
memset(&srbu, 0, sizeof(struct aac_srb_unit));
srbcmd = &srbu.srb;
srbcmd->flags = cpu_to_le32(SRB_DataIn);
srbcmd->cdb[0] = CISS_REPORT_PHYSICAL_LUNS;
srbcmd->cdb[1] = 2; /* extended reporting */
srbcmd->cdb[8] = (u8)(datasize >> 8);
srbcmd->cdb[9] = (u8)(datasize);
rcode = aac_send_safw_bmic_cmd(dev, &srbu, phys_luns, datasize);
During aac_send_safw_bmic_cmd(), a separate srb is mapped into DMA, and has
srbu.srb copied into it:
srb = fib_data(fibptr);
memcpy(srb, &srbu->srb, sizeof(struct aac_srb));
Only then is srb.sgmap.count written and srb->sg populated:
srb->count = cpu_to_le32(xfer_len);
sg64 = (struct sgmap64 *)&srb->sg;
sg64->count = cpu_to_le32(1);
sg64->sg[0].addr[1] = cpu_to_le32(upper_32_bits(addr));
sg64->sg[0].addr[0] = cpu_to_le32(lower_32_bits(addr));
sg64->sg[0].count = cpu_to_le32(xfer_len);
But this is happening in the DMA memory, not in srbu.srb. An attempt to
copy the changes back to srbu does happen:
/*
* Copy the updated data for other dumping or other usage if
* needed
*/
memcpy(&srbu->srb, srb, sizeof(struct aac_srb));
But this was never correct: the sg64 (3 u32s) overlap of srb.sg (2 u32s)
always meant that srbu.srb would have held truncated information and any
attempt to walk srbu.srb.sg.sg based on the value of srbu.srb.sg.count
would result in attempting to parse past the end of srbu.srb.sg.sg[0] into
srbu.srb_reply.
After getting a reply from hardware, the reply is copied into
srbu.srb_reply:
srb_reply = (struct aac_srb_reply *)fib_data(fibptr);
memcpy(&srbu->srb_reply, srb_reply, sizeof(struct aac_srb_reply));
This has always been fixed-size, so there's no issue here. It is worth
noting that the two callers _never check_ srbu contents -- neither
srbu.srb nor srbu.srb_reply is examined. (They depend on the mapped
xfer_buf instead.)
Therefore, the ordering of members in struct aac_srb_unit does not matter,
and the flexible array member can moved to the end.
(Additionally, the two memcpy()s that update srbu could be entirely
removed as they are never consumed, but I left that as-is.)
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20240711215739.208776-1-kees@kernel.org
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Replace the deprecated[1] use of a 1-element array in struct
_CONFIG_PAGE_IOC_4 with a modern flexible array.
Additionally add __counted_by annotation since SEP is only ever accessed
after updating ACtiveSEP:
lsi/mpi_cnfg.h: IOC_4_SEP SEP[] __counted_by(ActiveSEP); /* 08h */
mptsas.c: ii = IOCPage4Ptr->ActiveSEP++;
mptsas.c: IOCPage4Ptr->SEP[ii].SEPTargetID = id;
mptsas.c: IOCPage4Ptr->SEP[ii].SEPBus = channel;
No binary differences are present after this conversion.
Link: https://github.com/KSPP/linux/issues/79 [1]
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20240711172821.123936-6-kees@kernel.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Replace the deprecated[1] use of a 1-element array in struct
_CONFIG_PAGE_IOC_3 with a modern flexible array.
Additionally add __counted_by annotation since PhysDisk is only ever
accessed via a loops bounded by NumPhysDisks:
lsi/mpi_cnfg.h: IOC_3_PHYS_DISK PhysDisk[] __counted_by(NumPhysDisks); /* 08h */
mptscsih.c: for (i = 0; i < ioc->raid_data.pIocPg3->NumPhysDisks; i++) {
mptscsih.c: if ((id == ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskID) &&
mptscsih.c: (channel == ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskBus)) {
mptscsih.c: for (i = 0; i < ioc->raid_data.pIocPg3->NumPhysDisks; i++) {
mptscsih.c: ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskNum);
mptscsih.c: ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskNum,
mptscsih.c: for (i = 0; i < ioc->raid_data.pIocPg3->NumPhysDisks; i++) {
mptscsih.c: if ((id == ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskID) &&
mptscsih.c: (channel == ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskBus)) {
mptscsih.c: rc = ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskNum;
mptscsih.c: for (i = 0; i < ioc->raid_data.pIocPg3->NumPhysDisks; i++) {
mptscsih.c: ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskNum);
mptscsih.c: ioc->raid_data.pIocPg3->PhysDisk[i].PhysDiskNum,
No binary differences are present after this conversion.
Link: https://github.com/KSPP/linux/issues/79 [1]
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20240711172821.123936-5-kees@kernel.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Replace the deprecated[1] use of a 1-element array in struct
_CONFIG_PAGE_IOC_2 with a modern flexible array.
Additionally add __counted_by annotation since RaidVolume is only ever
accessed from loops controlled by NumActiveVolumes:
lsi/mpi_cnfg.h: CONFIG_PAGE_IOC_2_RAID_VOL RaidVolume[] __counted_by(NumActiveVolumes); /* 0Ch */
mptbase.c: for (i = 0; i < pIoc2->NumActiveVolumes ; i++)
mptbase.c: pIoc2->RaidVolume[i].VolumeBus,
mptbase.c: pIoc2->RaidVolume[i].VolumeID);
mptsas.c: for (i = 0; i < ioc->raid_data.pIocPg2->NumActiveVolumes; i++) {
mptsas.c: RaidVolume[i].VolumeID) {
mptsas.c: RaidVolume[i].VolumeBus;
mptsas.c: for (i = 0; i < ioc->raid_data.pIocPg2->NumActiveVolumes; i++) {
mptsas.c: ioc->raid_data.pIocPg2->RaidVolume[i].VolumeID, 0);
mptsas.c: ioc->raid_data.pIocPg2->RaidVolume[i].VolumeID);
mptsas.c: ioc->raid_data.pIocPg2->RaidVolume[i].VolumeID, 0);
mptsas.c: for (i = 0; i < ioc->raid_data.pIocPg2->NumActiveVolumes; i++) {
mptsas.c: if (ioc->raid_data.pIocPg2->RaidVolume[i].VolumeID ==
mptsas.c: for (i = 0; i < ioc->raid_data.pIocPg2->NumActiveVolumes; i++)
mptsas.c: if (ioc->raid_data.pIocPg2->RaidVolume[i].VolumeID == id)
mptspi.c: for (i=0; i < ioc->raid_data.pIocPg2->NumActiveVolumes; i++) {
mptspi.c: if (ioc->raid_data.pIocPg2->RaidVolume[i].VolumeID == id) {
No binary differences are present after this conversion.
Link: https://github.com/KSPP/linux/issues/79 [1]
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20240711172821.123936-4-kees@kernel.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Replace the deprecated[1] use of a 1-element array in struct
_CONFIG_PAGE_RAID_PHYS_DISK_1 with a modern flexible array.
Additionally add __counted_by annotation since Path is only ever accessed
via a loops bounded by NumPhysDiskPaths:
lsi/mpi_cnfg.h: RAID_PHYS_DISK1_PATH Path[] __counted_by(NumPhysDiskPaths);/* 0Ch */
mptbase.c: phys_disk->NumPhysDiskPaths = buffer->NumPhysDiskPaths;
mptbase.c: for (i = 0; i < phys_disk->NumPhysDiskPaths; i++) {
mptbase.c: phys_disk->Path[i].PhysDiskID = buffer->Path[i].PhysDiskID;
mptbase.c: phys_disk->Path[i].PhysDiskBus = buffer->Path[i].PhysDiskBus;
No binary differences are present after this conversion.
Link: https://github.com/KSPP/linux/issues/79 [1]
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20240711172821.123936-3-kees@kernel.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Replace the deprecated[1] use of a 1-element array in struct
_CONFIG_PAGE_SAS_IO_UNIT_0 with a modern flexible array.
Additionally add __counted_by annotation since PhyData is only ever
accessed via a loops bounded by NumPhys:
lsi/mpi_cnfg.h: MPI_SAS_IO_UNIT0_PHY_DATA PhyData[] __counted_by(NumPhys); /* 10h */
mptsas.c: port_info->num_phys = buffer->NumPhys;
mptsas.c: for (i = 0; i < port_info->num_phys; i++) {
mptsas.c: mptsas_print_phy_data(ioc, &buffer->PhyData[i]);
mptsas.c: port_info->phy_info[i].phy_id = i;
mptsas.c: port_info->phy_info[i].port_id =
mptsas.c: buffer->PhyData[i].Port;
mptsas.c: port_info->phy_info[i].negotiated_link_rate =
mptsas.c: buffer->PhyData[i].NegotiatedLinkRate;
No binary differences are present after this conversion.
Link: https://github.com/KSPP/linux/issues/79 [1]
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://lore.kernel.org/r/20240711172821.123936-2-kees@kernel.org
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>