Pull SCSI updates from James Bottomley:
"Usual driver updates (ufs, lpfc, target, qla2xxx) plus assorted
cleanups and fixes including the WQ_PERCPU series.
The biggest core change is the new allocation of pseudo-devices which
allow the sending of internal commands to a given SCSI target"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (147 commits)
scsi: MAINTAINERS: Add the UFS include directory
scsi: scsi_debug: Support injecting unaligned write errors
scsi: qla2xxx: Fix improper freeing of purex item
scsi: ufs: rockchip: Fix compile error without CONFIG_GPIOLIB
scsi: ufs: rockchip: Reset controller on PRE_CHANGE of hce enable notify
scsi: ufs: core: Use scsi_device_busy()
scsi: ufs: core: Fix single doorbell mode support
scsi: pm80xx: Add WQ_PERCPU to alloc_workqueue() users
scsi: target: Add WQ_PERCPU to alloc_workqueue() users
scsi: qedi: Add WQ_PERCPU to alloc_workqueue() users
scsi: target: ibmvscsi: Add WQ_PERCPU to alloc_workqueue() users
scsi: qedf: Add WQ_PERCPU to alloc_workqueue() users
scsi: bnx2fc: Add WQ_PERCPU to alloc_workqueue() users
scsi: be2iscsi: Add WQ_PERCPU to alloc_workqueue() users
scsi: message: fusion: Add WQ_PERCPU to alloc_workqueue() users
scsi: lpfc: WQ_PERCPU added to alloc_workqueue() users
scsi: scsi_transport_fc: WQ_PERCPU added to alloc_workqueue users()
scsi: scsi_dh_alua: WQ_PERCPU added to alloc_workqueue() users
scsi: qla2xxx: WQ_PERCPU added to alloc_workqueue() users
scsi: target: sbp: Replace use of system_unbound_wq with system_dfl_wq
...
In 2009, commit c82f63e411 ("PCI: check saved state before restore")
changed the behavior of pci_restore_state() such that it became necessary
to call pci_save_state() afterwards, lest recovery from subsequent PCI
errors fails.
The commit has just been reverted and so all the pci_save_state() after
pci_restore_state() calls that have accumulated in the tree are now
superfluous. Drop them.
Two drivers chose a different approach to achieve the same result:
drivers/scsi/ipr.c and drivers/net/ethernet/intel/e1000e/netdev.c set the
pci_dev's "state_saved" flag to true before calling pci_restore_state().
Drop this as well.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Dave Jiang <dave.jiang@intel.com>
Acked-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com> # qat
Link: https://patch.msgid.link/c2b28cc4defa1b743cf1dedee23c455be98b397a.1760274044.git.lukas@wunner.de
Marco Crivellari <marco.crivellari@suse.com> says:
Hi,
=== Current situation: problems ===
Let's consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
set to the housekeeping CPUs, for !WQ_UNBOUND the local CPU is selected.
This leads to different scenarios if a work item is scheduled on an
isolated CPU where "delay" value is 0 or greater then 0:
schedule_delayed_work(, 0);
This will be handled by __queue_work() that will queue the work item on the
current local (isolated) CPU, while:
schedule_delayed_work(, 1);
Will move the timer on an housekeeping CPU, and schedule the work there.
Currently if a user enqueue a work item using schedule_delayed_work() the
used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
=== Recent changes to the WQ API ===
The following, address the recent changes in the Workqueue API:
- commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
- commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")
The old workqueues will be removed in a future release cycle.
=== Introduced Changes by this series ===
1) [P 1] Replace uses of system_wq and system_unbound_wq
system_unbound_wq is to be used when locality is not required.
Because of that, system_unbound_wq has been replaced with
system_dfl_wq, to make clear it should be used when locality
is not required.
2) [P 2-3-4] WQ_PERCPU added to alloc_workqueue()
This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
Thanks!
Link: https://patch.msgid.link/20251031095643.74246-1-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently if a user enqueue a work item using schedule_delayed_work()
the used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work() that is using system_wq and queue_work(), that makes use
again of WORK_CPU_UNBOUND. This lack of consistentcy cannot be
addressed without refactoring the API.
alloc_workqueue() treats all queues as per-CPU by default, while unbound
workqueues must opt-in via WQ_UNBOUND.
This default is suboptimal: most workloads benefit from unbound queues,
allowing the scheduler to place worker threads where they’re needed and
reducing noise when CPUs are isolated.
This change adds a new WQ_PERCPU flag to explicitly request
alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
This patch continues the effort to refactor worqueue APIs, which has
begun with the change introducing new workqueues and a new
alloc_workqueue flag:
commit 128ea9f6cc ("workqueue: Add system_percpu_wq and system_dfl_wq")
commit 930c2ea566 ("workqueue: Add new WQ_PERCPU flag")
With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
must now use WQ_PERCPU.
Once migration is complete, WQ_UNBOUND can be removed and unbound will
become the implicit default.
Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Marco Crivellari <marco.crivellari@suse.com>
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Link: https://patch.msgid.link/20251104110808.123424-1-marco.crivellari@suse.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
FC-LS and FC-GS specifications outline fields for registering a platform
name identifier (PNI) to the fabric. The PNI assists fabrics with
identifying the physical server source of frames in the fabric.
lpfc generates a PNI based partially on the uuid specific for the
system. Initial attempts to extract a uuid are made from SMBIOS's
System Information 08h uuid entry. If SMBIOS DMI does not exist, a PNI
is not generated and PNI registration with the fabric is skipped.
The PNI is submitted in FLOGI and FDISC frames. After successful fabric
login, the RSPNI_PNI CT frame is submitted to the fabric to register the
OS host name tying it to the PNI.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://patch.msgid.link/20251106224639.139176-10-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Currently, there is a kref put in the lpfc_cleanup() routine that takes
care of outstanding references on fabric controller ndlps in UNUSED
state. While typically there is a state change from UNUSED -> REGLOGIN
when the ndlp successfully logs into the fabric, there may be cases when
FLOGI is unsuccessful and the ndlp will remain in UNUSED state without a
registered rpi, yet the ndlp incorrectly has a kref count of one.
To address this, handling of Fabric Controller ndlps are moved into the
routines: lpfc_issue_els_scr(), lpfc_issue_els_rdf(),
lpfc_cmpl_els_disc_cmd().
In both lpfc_issue_els_scr() and lpfc_issue_els_rdf(), if there does not
exist a previously created fabric controller ndlp, an ndlp will be
created. Otherwise, we can reuse the pre-existing ndlp object.
In lpfc_cmpl_els_disc_cmd(), if the SCR or RDF are not successfully
issued, the initial reference on the ndlp that is not registered with
upper layers will be decremented with a kref_put().
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://patch.msgid.link/20251106224639.139176-7-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In point-to-point topology, the driver sometimes defers the unsolicited
FLOGI LS_ACC until it sends its FLOGI to the remote port. When this
happens, lpfc neglects to release the ndlp allocated for the unsolicited
FLOGI. This patch adds code to release the ndlp for the deferred
unsolicited FLOGI LS_ACC.
An NLP_FLOGI_DFR_ACC flag is introduced to facilitate identifying an
ndlp with an expected deferred FLOGI LS_ACC completion. When
lpfc_cmpl_els_rsp() detects the correct qualifiers, it releases the
initial reference on the ndlp object. And when lpfc_cmpl_els_rsp()
exits, the remaining put for the deferred action is executed and the
ndlp is released.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://patch.msgid.link/20251106224639.139176-6-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Unregistration of an rpi object should be done when a PLOGI is received
as PLOGI receipt implies an implicit LOGO. Previously, the driver would
continue using the same, already registered, rpi and ACC the received
PLOGI.
Replace the ACC and early return statement with break to execute the
rest of the lpfc_rcv_plogi logic outside the switch case statement.
This ensures unregistration and reregistration of an rpi after PLOGI_ACC
completion.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://patch.msgid.link/20251106224639.139176-5-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Add PCI_FIND_NEXT_CAP() and PCI_FIND_NEXT_EXT_CAP() macros that
take config space accessor functions.
Implement pci_find_capability(), pci_find_ext_capability(), and
dwc, dwc endpoint, and cadence capability search interfaces with
them (Hans Zhang)
- Leave parent unit address 0 in 'interrupt-map' so that when we
build devicetree nodes to describe PCI functions that contain
multiple peripherals, we can build this property even when
interrupt controllers lack 'reg' properties (Lorenzo Pieralisi)
- Add a Xeon 6 quirk to disable Extended Tags and limit Max Read
Request Size to 128B to avoid a performance issue (Ilpo Järvinen)
- Add sysfs 'serial_number' file to expose the Device Serial Number
(Matthew Wood)
- Fix pci_acpi_preserve_config() memory leak (Nirmoy Das)
Resource management:
- Align m68k pcibios_enable_device() with other arches (Ilpo
Järvinen)
- Remove sparc pcibios_enable_device() implementations that don't do
anything beyond what pci_enable_resources() does (Ilpo Järvinen)
- Remove mips pcibios_enable_resources() and use
pci_enable_resources() instead (Ilpo Järvinen)
- Clean up bridge window sizing and assignment (Ilpo Järvinen),
including:
- Leave non-claimed bridge windows disabled
- Enable bridges even if a window wasn't assigned because not all
windows are required by downstream devices
- Preserve bridge window type when releasing the resource, since
the type is needed for reassignment
- Consolidate selection of bridge windows into two new
interfaces, pbus_select_window() and
pbus_select_window_for_type(), so this is done consistently
- Compute bridge window start and end earlier to avoid logging
stale information
MSI:
- Add quirk to disable MSI on RDC PCI to PCIe bridges (Marcos Del Sol
Vives)
Error handling:
- Align AER with EEH by allowing drivers to request a Bus Reset on
Non-Fatal Errors (in addition to the reset on Fatal Errors that we
already do) (Lukas Wunner)
- If error recovery fails, emit FAILED_RECOVERY uevents for the
devices, not for the bridge leading to them.
This makes them correspond to BEGIN_RECOVERY uevents (Lukas Wunner)
- Align AER with EEH by calling err_handler.error_detected()
callbacks to notify drivers if error recovery fails (Lukas Wunner)
- Align AER with EEH by restoring device error_state to
pci_channel_io_normal before the err_handler.slot_reset() callback.
This is earlier than before the err_handler.resume() callback
(Lukas Wunner)
- Emit a BEGIN_RECOVERY uevent when driver's
err_handler.error_detected() requests a reset, as well as when it
says recovery is complete or can be done without a reset (Niklas
Schnelle)
- Align s390 with AER and EEH by emitting uevents during error
recovery (Niklas Schnelle)
- Align EEH with AER and s390 by emitting BEGIN_RECOVERY,
SUCCESSFUL_RECOVERY, or FAILED_RECOVERY uevents depending on the
result of err_handler.error_detected() (Niklas Schnelle)
- Fix a NULL pointer dereference in aer_ratelimit() when ACPI GHES
error information identifies a device without an AER Capability
(Breno Leitao)
- Update error decoding and TLP Log printing for new errors in
current PCIe base spec (Lukas Wunner)
- Update error recovery documentation to match the current code
and use consistent nomenclature (Lukas Wunner)
ASPM:
- Enable all ClockPM and ASPM states for devicetree platforms, since
there's typically no firmware that enables ASPM
This is a risky change that may uncover hardware or configuration
defects at boot-time rather than when users enable ASPM via sysfs
later. Booting with "pcie_aspm=off" prevents this enabling
(Manivannan Sadhasivam)
- Remove the qcom code that enabled ASPM (Manivannan Sadhasivam)
Power management:
- If a device has already been disconnected, e.g., by a hotplug
removal, don't bother trying to resume it to D0 when detaching the
driver.
This avoids annoying "Unable to change power state from D3cold to
D0" messages (Mario Limonciello)
- Ensure devices are powered up before config reads for
'max_link_width', 'current_link_speed', 'current_link_width',
'secondary_bus_number', and 'subordinate_bus_number' sysfs files.
This prevents using invalid data (~0) in drivers or lspci and,
depending on how the PCIe controller reports errors, may avoid
error interrupts or crashes (Brian Norris)
Virtualization:
- Add rescan/remove locking when enabling/disabling SR-IOV, which
avoids list corruption on s390, where disabling SR-IOV also
generates hotplug events (Niklas Schnelle)
Peer-to-peer DMA:
- Free struct p2p_pgmap, not a member within it, in the
pci_p2pdma_add_resource() error path (Sungho Kim)
Endpoint framework:
- Document sysfs interface for BAR assignment of vNTB endpoint
functions (Jerome Brunet)
- Fix array underflow in endpoint BAR test case (Dan Carpenter)
- Skip endpoint IRQ test if the IRQ is out of range to avoid false
errors (Christian Bruel)
- Fix endpoint test case for controllers with fixed-size BARs smaller
than requested by the test (Marek Vasut)
- Restore inbound translation when disabling doorbell so the endpoint
doorbell test case can be run more than once (Niklas Cassel)
- Avoid a NULL pointer dereference when releasing DMA channels in
endpoint DMA test case (Shin'ichiro Kawasaki)
- Convert tegra194 interrupt number to MSI vector to fix endpoint
Kselftest MSI_TEST test case (Niklas Cassel)
- Reset tegra194 BARs when running in endpoint mode so the BAR tests
don't overwrite the ATU settings in BAR4 (Niklas Cassel)
- Handle errors in tegra194 BPMP transactions so we don't mistakenly
skip future PERST# assertion (Vidya Sagar)
AMD MDB PCIe controller driver:
- Update DT binding example to separate PERST# to a Root Port stanza
to make multiple Root Ports possible in the future (Sai Krishna
Musham)
- Add driver support for PERST# being described in a Root Port
stanza, falling back to the host bridge if not found there (Sai
Krishna Musham)
Freescale i.MX6 PCIe controller driver:
- Enable the 3.3V Vaux supply if available so devices can request
wakeup with either Beacon or WAKE# (Richard Zhu)
MediaTek PCIe Gen3 controller driver:
- Add optional sys clock ready time setting to avoid sys_clk_rdy
signal glitching in MT6991 and MT8196 (AngeloGioacchino Del Regno)
- Add DT binding and driver support for MT6991 and MT8196
(AngeloGioacchino Del Regno)
NVIDIA Tegra PCIe controller driver:
- When asserting PERST#, disable the controller instead of mistakenly
disabling the PLL twice (Nagarjuna Kristam)
- Convert struct tegra_msi mask_lock to raw spinlock to avoid a lock
nesting error (Marek Vasut)
Qualcomm PCIe controller driver:
- Select PCI Power Control Slot driver so slot voltage rails can be
turned on/off if described in Root Port devicetree node (Qiang Yu)
- Parse only PCI bridge child nodes in devicetree, skipping unrelated
nodes such as OPP (Operating Performance Points), which caused
probe failures (Krishna Chaitanya Chundru)
- Add 8.0 GT/s and 32.0 GT/s equalization settings (Ziyue Zhang)
- Consolidate Root Port 'phy' and 'reset' properties in struct
qcom_pcie_port, regardless of whether we got them from the Root
Port node or the host bridge node (Manivannan Sadhasivam)
- Fetch and map the ELBI register space in the DWC core rather than
in each driver individually (Krishna Chaitanya Chundru)
- Enable ECAM mechanism in DWC core by setting up iATU with 'CFG
Shift Feature' and use this in the qcom driver (Krishna Chaitanya
Chundru)
- Add SM8750 compatible to qcom,pcie-sm8550.yaml (Krishna Chaitanya
Chundru)
- Update qcom,pcie-x1e80100.yaml to allow fifth PCIe host on Qualcomm
Glymur, which is compatible with X1E80100 but doesn't have the
cnoc_sf_axi clock (Qiang Yu)
Renesas R-Car PCIe controller driver:
- Fix a typo that prevented correct PHY initialization (Marek Vasut)
- Add a missing 1ms delay after PWR reset assertion as required by
the V4H manual (Marek Vasut)
- Assure reset has completed before DBI access to avoid SError (Marek
Vasut)
- Fix inverted PHY initialization check, which sometimes led to
timeouts and failure to start the controller (Marek Vasut)
- Pass the correct IRQ domain to generic_handle_domain_irq() to fix a
regression when converting to msi_create_parent_irq_domain()
(Claudiu Beznea)
- Drop the spinlock protecting the PMSR register - it's no longer
required since pci_lock already serializes accesses (Marek Vasut)
- Convert struct rcar_msi mask_lock to raw spinlock to avoid a lock
nesting error (Marek Vasut)
SOPHGO PCIe controller driver:
- Check for existence of struct cdns_pcie.ops before using it to
allow Cadence drivers that don't need to supply ops (Chen Wang)
- Add DT binding and driver for the SOPHGO SG2042 PCIe controller
(Chen Wang)
STMicroelectronics STM32MP25 PCIe controller driver:
- Update pinctrl documentation of initial states and use in runtime
suspend/resume (Christian Bruel)
- Add pinctrl_pm_select_init_state() for use by stm32 driver, which
needs it during resume (Christian Bruel)
- Add devicetree bindings and drivers for the STMicroelectronics
STM32MP25 in host and endpoint modes (Christian Bruel)
Synopsys DesignWare PCIe controller driver:
- Add support for x16 in devicetree 'num-lanes' property (Konrad
Dybcio)
- Verify that if DT specifies a single IRQ for all eDMA channels, it
is named 'dma' (Niklas Cassel)
TI J721E PCIe driver:
- Add MODULE_DEVICE_TABLE() so driver can be autoloaded (Siddharth
Vadapalli)
- Power controller off before configuring the glue layer so the
controller latches the correct values on power-on (Siddharth
Vadapalli)
TI Keystone PCIe controller driver:
- Use devm_request_irq() so 'ks-pcie-error-irq' is freed when driver
exits with error (Siddharth Vadapalli)
- Add Peripheral Virtualization Unit (PVU), which restricts DMA from
PCIe devices to specific regions of host memory, to the ti,am65
binding (Jan Kiszka)
Xilinx NWL PCIe controller driver:
- Clear bootloader E_ECAM_CONTROL before merging in the new driver
value to avoid writing invalid values (Jani Nurminen)"
* tag 'pci-v6.18-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (141 commits)
PCI/AER: Avoid NULL pointer dereference in aer_ratelimit()
MAINTAINERS: Add entry for ST STM32MP25 PCIe drivers
PCI: stm32-ep: Add PCIe Endpoint support for STM32MP25
dt-bindings: PCI: Add STM32MP25 PCIe Endpoint bindings
PCI: stm32: Add PCIe host support for STM32MP25
PCI: xilinx-nwl: Fix ECAM programming
PCI: j721e: Fix incorrect error message in probe()
PCI: keystone: Use devm_request_irq() to free "ks-pcie-error-irq" on exit
dt-bindings: PCI: qcom,pcie-x1e80100: Set clocks minItems for the fifth Glymur PCIe Controller
PCI: dwc: Support 16-lane operation
PCI: Add lockdep assertion in pci_stop_and_remove_bus_device()
PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV
PCI: rcar-host: Convert struct rcar_msi mask_lock into raw spinlock
PCI: tegra194: Rename 'root_bus' to 'root_port_bus' in tegra_pcie_downstream_dev_to_D0()
PCI: tegra: Convert struct tegra_msi mask_lock into raw spinlock
PCI: rcar-gen4: Fix inverted break condition in PHY initialization
PCI: rcar-gen4: Assure reset occurs before DBI access
PCI: rcar-gen4: Add missing 1ms delay after PWR reset assertion
PCI: Set up bridge resources earlier
PCI: rcar-host: Drop PMSR spinlock
...
Pull SCSI updates from James Bottomley:
"Usual driver updates (ufs, mpi3mr, lpfc, pm80xx, mpt3sas) plus
assorted cleanups and fixes.
The only core update is to sd.c and is mostly cosmetic"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (105 commits)
scsi: MAINTAINERS: Update FC element owners
scsi: mpt3sas: Update driver version to 54.100.00.00
scsi: mpt3sas: Add support for 22.5 Gbps SAS link rate
scsi: mpt3sas: Suppress unnecessary IOCLogInfo on CONFIG_INVALID_PAGE
scsi: mpt3sas: Fix crash in transport port remove by using ioc_info()
scsi: ufs: ufs-qcom: Add support for limiting HS gear and rate
scsi: ufs: pltfrm: Add DT support to limit HS gear and gear rate
scsi: ufs: ufs-qcom: Remove redundant re-assignment to hs_rate
scsi: ufs: dt-bindings: Document gear and rate limit properties
scsi: ufs: core: Fix data race in CPU latency PM QoS request handling
scsi: libfc: Fix potential buffer overflow in fc_ct_ms_fill()
scsi: storvsc: Remove redundant ternary operators
scsi: ufs: core: Change MCQ interrupt enable flow
scsi: smartpqi: Replace kmalloc() + copy_from_user() with memdup_user()
scsi: hpsa: Replace kmalloc() + copy_from_user() with memdup_user()
scsi: hpsa: Fix potential memory leak in hpsa_big_passthru_ioctl()
scsi: lpfc: Copyright updates for 14.4.0.11 patches
scsi: lpfc: Update lpfc version to 14.4.0.11
scsi: lpfc: Convert debugfs directory counts from atomic to unsigned int
scsi: lpfc: Clean up extraneous phba dentries
...
Justin Tee <justintee8345@gmail.com> says:
Update lpfc to revision 14.4.0.11
This patch set contains clean up of unused members in various structs,
bug fixes related to discovery and resource allocation, and updates to
handling of debugfs entries.
The patches were cut against Martin's 6.18/scsi-queue tree.
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Right after phba->nvmeio_trc is kzalloc'ed, phba->nvmeio_trc is set to
NULL and the memory reference to free the kzalloc'ed memory is lost.
Remove the phba->nvmeio_trc NULL ptr assignment after kzalloc.
phba->nvmeio_trc is freed in lpfc_debugfs_terminate.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-10-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
To assist in debugging lpfc_xri_rebalancing driver parameter, a debugfs
entry is used. The debugfs file operations for xri rebalancing have
been previously implemented, but lack definition for its information
buffer size. Similar to other pre-existing debugfs entry buffers,
define LPFC_HDWQINFO_SIZE as 8192 bytes.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-9-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
There is a timing race condition when a PRLI may be sent on the wire
before PLOGI_ACC in Point to Point topology. Fix by deferring REG_RPI
mbox completion handling to after PLOGI_ACC's CQE completion. Because
the discovery state machine only sends PRLI after REG_RPI mbox
completion, PRLI is now guaranteed to be sent after PLOGI_ACC.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-8-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
If lpfc_reset_flush_io_context fails to execute, then the wrong return
status code may be passed back to upper layers when issuing a target
reset TMF command. Fix by checking the return status from
lpfc_reset_flush_io_context() first in order to properly return FAILED
or FAST_IO_FAIL.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-7-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
The kref for Fabric_DID ndlps is not decremented after repeated FDISC
failures and exhausting maximum allowed retries. This can leave the
ndlp lingering unnecessarily. Add a test and set bit operation for the
NLP_DROPPED flag. If not previously set, then a kref is decremented. The
ndlp is freed when the remaining reference for the completing ELS is
put.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-6-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
In lpfc_cleanup, there is an extraneous nlp_put for NPIV ports on the
F_Port_Ctrl ndlp object. In cases when an ABTS is issued, the
outstanding kref is needed for when a second XRI_ABORTED CQE is
received. The final kref for the ndlp is designed to be decremented in
lpfc_sli4_els_xri_aborted instead. Also, add a new log message to allow
for future diagnostics when debugging related issues.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-5-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
lpfc_sli4_queue_setup() does not allocate memory and is used for
submitting CREATE_QUEUE mailbox commands. Thus, if such mailbox
commands fail we should clean up by also freeing the memory allocated
for the queues with lpfc_sli4_queue_destroy(). Change the intended
clean up label for the lpfc_sli4_queue_setup() error case to
out_destroy_queue.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-4-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Driver rmmod may take a long time when in a very large SAN environment.
This is because outstanding ELS WQEs may end up taking E_D_TOV seconds
to complete causing long delays. Speed this up by issuing aborts with
the ia bit set so that outstanding ELS WQEs complete faster.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Message-ID: <20250915180811.137530-3-justintee8345@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Change the 'ret' variable in lpfc_sli4_issue_wqe() from uint32_t to int,
as it needs to store either negative error codes or zero returned by
lpfc_sli4_wq_put().
Storing the negative error codes in unsigned type, doesn't cause an
issue at runtime but can be confusing. Additionally, assigning negative
error codes to unsigned type may trigger a GCC warning when the
-Wsign-conversion flag is enabled.
No effect on runtime.
Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com>
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
-Wflex-array-member-not-at-end has been introduced in GCC-14, and we are
getting ready to enable it, globally.
So, in order to avoid ending up with a flexible-array member in the
middle of multiple other structs, we use the '__struct_group()' helper
to create a new tagged 'struct fc_df_desc_fpin_reg_hdr'. This structure
groups together all the members of the flexible 'struct
fc_df_desc_fpin_reg' except the flexible array.
As a result, the array is effectively separated from the rest of the
members without modifying the memory layout of the flexible structure.
We then change the type of the middle struct members currently causing
trouble from 'struct fc_df_desc_fpin_reg' to 'struct
fc_df_desc_fpin_reg_hdr'.
We also want to ensure that in case new members need to be added to the
flexible structure, they are always included within the newly created
tagged struct. For this, we use '_Static_assert()'. This ensures that
the memory layout for both the flexible structure and the new tagged
struct is the same after any changes.
This approach avoids having to implement 'struct fc_df_desc_fpin_reg_hdr'
as a completely separate structure, thus preventing having to maintain
two independent but basically identical structures, closing the door
to potential bugs in the future.
The above is also done for flexible structures 'struct fc_els_rdf' and
'struct fc_els_rdf_resp'
So, with these changes, fix the following warnings:
drivers/scsi/lpfc/lpfc_hw4.h:4936:41: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/scsi/lpfc/lpfc_hw4.h:4942:41: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
drivers/scsi/lpfc/lpfc_hw4.h:4947:41: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/aK6hbQLyQlvlySf8@kspp
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Fix a use-after-free window by correcting the buffer release sequence in
the deferred receive path. The code freed the RQ buffer first and only
then cleared the context pointer under the lock. Concurrent paths (e.g.,
ABTS and the repost path) also inspect and release the same pointer under
the lock, so the old order could lead to double-free/UAF.
Note that the repost path already uses the correct pattern: detach the
pointer under the lock, then free it after dropping the lock. The
deferred path should do the same.
Fixes: 472e146d1c ("scsi: lpfc: Correct upcalling nvmet_fc transport during io done downcall")
Cc: stable@vger.kernel.org
Signed-off-by: John Evans <evans1210144@gmail.com>
Link: https://lore.kernel.org/r/20250828044008.743-1-evans1210144@gmail.com
Reviewed-by: Justin Tee <justin.tee@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Back in 2017, commit 2fd260f03b ("PCI/AER: Remove unused .link_reset()
callback") removed .link_reset() from struct pci_error_handlers, but left
a few code comments behind which still mention it. Remove them.
The code comments in the SolarFlare Ethernet drivers point out that no
.mmio_enabled() callback is needed because the driver's .error_detected()
callback always returns PCI_ERS_RESULT_NEED_RESET, which causes
pcie_do_recovery() to skip .mmio_enabled(). That's not quite correct
because efx_io_error_detected() does return PCI_ERS_RESULT_RECOVERED under
certain conditions and then .mmio_enabled() would indeed be called if it
were implemented. Remove this misleading portion of the code comment as
well.
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/1d72a891a7f57115e78a73046e776f7e0c8cd68f.1755008151.git.lukas@wunner.de
Pull SCSI updates from James Bottomley:
"Smaller set of driver updates than usual (ufs, lpfc, mpi3mr).
The rest (including the core file changes) are doc updates and some
minor bug fixes"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (49 commits)
scsi: libiscsi: Initialize iscsi_conn->dd_data only if memory is allocated
scsi: scsi_transport_fc: Add comments to describe added 'rport' parameter
scsi: bfa: Double-free fix
scsi: isci: Fix dma_unmap_sg() nents value
scsi: mvsas: Fix dma_unmap_sg() nents value
scsi: elx: efct: Fix dma_unmap_sg() nents value
scsi: scsi_transport_fc: Change to use per-rport devloss_work_q
scsi: ufs: exynos: Fix programming of HCI_UTRL_NEXUS_TYPE
scsi: core: Fix kernel doc for scsi_track_queue_full()
scsi: ibmvscsi_tgt: Fix dma_unmap_sg() nents value
scsi: ibmvscsi_tgt: Fix typo in comment
scsi: mpi3mr: Update driver version to 8.14.0.5.50
scsi: mpi3mr: Serialize admin queue BAR writes on 32-bit systems
scsi: mpi3mr: Drop unnecessary volatile from __iomem pointers
scsi: mpi3mr: Fix race between config read submit and interrupt completion
scsi: ufs: ufs-qcom: Enable QUnipro Internal Clock Gating
scsi: ufs: core: Add ufshcd_dme_rmw() to modify DME attributes
scsi: ufs: ufs-qcom: Update esi_vec_mask for HW major version >= 6
scsi: core: Use scsi_cmd_priv() instead of open-coding it
scsi: qla2xxx: Remove firmware URL
...
Move clearing of HBA_SETUP flag out of lpfc_sli_brdrestart_s4 and before
lpfc_sli4_queue_unset. lpfc_sli4_queue_unset kfrees phba queues, so
clear the HBA_SETUP atomic flag to signal that the phba struct is no
longer initialized.
Also, add a check for the HBA_SETUP flag in the lpfc_sli4_io_xri_aborted
routine before dereferencing the ELS WQ.
Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20250618192138.124116-10-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>