Pull hyperv updates from Wei Liu:
- PCI passthrough for Hyper-V confidential VMs (Michael Kelley)
- Hyper-V VTL mode support (Saurabh Sengar)
- Move panic report initialization code earlier (Long Li)
- Various improvements and bug fixes (Dexuan Cui and Michael Kelley)
* tag 'hyperv-next-signed-20230424' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (22 commits)
PCI: hv: Replace retarget_msi_interrupt_params with hyperv_pcpu_input_arg
Drivers: hv: move panic report code from vmbus to hv early init code
x86/hyperv: VTL support for Hyper-V
Drivers: hv: Kconfig: Add HYPERV_VTL_MODE
x86/hyperv: Make hv_get_nmi_reason public
x86/hyperv: Add VTL specific structs and hypercalls
x86/init: Make get/set_rtc_noop() public
x86/hyperv: Exclude lazy TLB mode CPUs from enlightened TLB flushes
x86/hyperv: Add callback filter to cpumask_to_vpset()
Drivers: hv: vmbus: Remove the per-CPU post_msg_page
clocksource: hyper-v: make sure Invariant-TSC is used if it is available
PCI: hv: Enable PCI pass-thru devices in Confidential VMs
Drivers: hv: Don't remap addresses that are above shared_gpa_boundary
hv_netvsc: Remove second mapping of send and recv buffers
Drivers: hv: vmbus: Remove second way of mapping ring buffers
Drivers: hv: vmbus: Remove second mapping of VMBus monitor pages
swiotlb: Remove bounce buffer remapping for Hyper-V
Driver: VMBus: Add Devicetree support
dt-bindings: bus: Add Hyper-V VMBus
Drivers: hv: vmbus: Convert acpi_device to more generic platform_device
...
Pull driver core updates from Greg KH:
"Here is the large set of driver core changes for 6.4-rc1.
Once again, a busy development cycle, with lots of changes happening
in the driver core in the quest to be able to move "struct bus" and
"struct class" into read-only memory, a task now complete with these
changes.
This will make the future rust interactions with the driver core more
"provably correct" as well as providing more obvious lifetime rules
for all busses and classes in the kernel.
The changes required for this did touch many individual classes and
busses as many callbacks were changed to take const * parameters
instead. All of these changes have been submitted to the various
subsystem maintainers, giving them plenty of time to review, and most
of them actually did so.
Other than those changes, included in here are a small set of other
things:
- kobject logging improvements
- cacheinfo improvements and updates
- obligatory fw_devlink updates and fixes
- documentation updates
- device property cleanups and const * changes
- firwmare loader dependency fixes.
All of these have been in linux-next for a while with no reported
problems"
* tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits)
device property: make device_property functions take const device *
driver core: update comments in device_rename()
driver core: Don't require dynamic_debug for initcall_debug probe timing
firmware_loader: rework crypto dependencies
firmware_loader: Strip off \n from customized path
zram: fix up permission for the hot_add sysfs file
cacheinfo: Add use_arch[|_cache]_info field/function
arch_topology: Remove early cacheinfo error message if -ENOENT
cacheinfo: Check cache properties are present in DT
cacheinfo: Check sib_leaf in cache_leaves_are_shared()
cacheinfo: Allow early level detection when DT/ACPI info is missing/broken
cacheinfo: Add arm64 early level initializer implementation
cacheinfo: Add arch specific early level initializer
tty: make tty_class a static const structure
driver core: class: remove struct class_interface * from callbacks
driver core: class: mark the struct class in struct class_interface constant
driver core: class: make class_register() take a const *
driver core: class: mark class_release() as taking a const *
driver core: remove incorrect comment for device_create*
MIPS: vpe-cmp: remove module owner pointer from struct class usage.
...
Pull pci updates from Bjorn Helgaas:
"Resource management:
- Add pci_dev_for_each_resource() and pci_bus_for_each_resource()
iterators
PCIe native device hotplug:
- Fix AB-BA deadlock between reset_lock and device_lock
Power management:
- Wait longer for devices to become ready after resume (as we do for
reset) to accommodate Intel Titan Ridge xHCI devices
- Extend D3hot delay for NVIDIA HDA controllers to avoid
unrecoverable devices after a bus reset
Error handling:
- Clear PCIe Device Status after EDR since generic error recovery now
only clears it when AER is native
ASPM:
- Work around Chromebook firmware defect that clobbers Capability
list (including ASPM L1 PM Substates Cap) when returning from
D3cold to D0
Freescale i.MX6 PCIe controller driver:
- Install imprecise external abort handler only when DT indicates
PCIe support
Freescale Layerscape PCIe controller driver:
- Add ls1028a endpoint mode support
Qualcomm PCIe controller driver:
- Add SM8550 DT binding and driver support
- Add SDX55 DT binding and driver support
- Use bulk APIs for clocks of IP 1.0.0, 2.3.2, 2.3.3
- Use bulk APIs for reset of IP 2.1.0, 2.3.3, 2.4.0
- Add DT "mhi" register region for supported SoCs
- Expose link transition counts via debugfs to help debug low power
issues
- Support system suspend and resume; reduce interconnect bandwidth
and turn off clock and PHY if there are no active devices
- Enable async probe by default to reduce boot time
Miscellaneous:
- Sort controller Kconfig entries by vendor"
* tag 'pci-v6.4-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (56 commits)
PCI: xilinx: Drop obsolete dependency on COMPILE_TEST
PCI: mobiveil: Sort Kconfig entries by vendor
PCI: dwc: Sort Kconfig entries by vendor
PCI: Sort controller Kconfig entries by vendor
PCI: Use consistent controller Kconfig menu entry language
PCI: xilinx-nwl: Add 'Xilinx' to Kconfig prompt
PCI: hv: Add 'Microsoft' to Kconfig prompt
PCI: meson: Add 'Amlogic' to Kconfig prompt
PCI: Use of_property_present() for testing DT property presence
PCI/PM: Extend D3hot delay for NVIDIA HDA controllers
dt-bindings: PCI: qcom: Document msi-map and msi-map-mask properties
PCI: qcom: Add SM8550 PCIe support
dt-bindings: PCI: qcom: Add SM8550 compatible
PCI: qcom: Add support for SDX55 SoC
dt-bindings: PCI: qcom-ep: Fix the unit address used in example
dt-bindings: PCI: qcom: Add SDX55 SoC
dt-bindings: PCI: qcom: Update maintainers entry
PCI: qcom: Enable async probe by default
PCI: qcom: Add support for system suspend and resume
PCI/PM: Drop pci_bridge_wait_for_secondary_bus() timeout parameter
...
Pull irq fix from Borislav Petkov:
- Remove an over-zealous sanity check of the array of MSI-X vectors to
be allocated for a device
* tag 'irq_urgent_for_v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
PCI/MSI: Remove over-zealous hardware size check in pci_msix_validate_entries()
4 commits are involved here:
A (2016): commit 0de8ce3ee8 ("PCI: hv: Allocate physically contiguous hypercall params buffer")
B (2017): commit be66b67365 ("PCI: hv: Use page allocation for hbus structure")
C (2019): commit 877b911a5b ("PCI: hv: Avoid a kmemleak false positive caused by the hbus buffer")
D (2018): commit 68bb7bfb79 ("X86/Hyper-V: Enable IPI enlightenments")
Patch D introduced the per-CPU hypercall input page "hyperv_pcpu_input_arg"
in 2018. With patch D, we no longer need the per-Hyper-V-PCI-bus hypercall
input page "hbus->retarget_msi_interrupt_params" that was added in patch A,
and the issue addressed by patch B is no longer an issue, and we can also
get rid of patch C.
The change here is required for PCI device assignment to work for
Confidential VMs (CVMs) running without a paravisor, because otherwise we
would have to call set_memory_decrypted() for
"hbus->retarget_msi_interrupt_params" before calling the hypercall
HVCALL_RETARGET_INTERRUPT.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Link: https://lore.kernel.org/r/20230421013025.17152-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Pull pci fix from Bjorn Helgaas:
- Previously we ignored PCI devices if the DT "status" property or the
ACPI _STA method said it was not present.
Per spec, _STA cannot be used for that purpose, and using it that way
caused regressions, so skip the _STA check (Rob Herring)
* tag 'pci-v6.3-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI: Restrict device disabled status check to DT
- Use correct PCIE20_PARF_AXI_MSTR_WR_ADDR_HALT_V2 register for v2.7.0
(Manivannan Sadhasivam)
- Remove "PCIE20_" prefix from register definitions (Manivannan Sadhasivam)
- Sort registers and bitfield declarations (Manivannan Sadhasivam)
- Convert to GENMASK and FIELD_PREP (Manivannan Sadhasivam)
- Use bulk APIs for clocks of IP 1.0.0, 2.3.2, 2.3.3 (Manivannan
Sadhasivam)
- Use bulk APIs for reset of IP 2.1.0, 2.3.3, 2.4.0 (Manivannan Sadhasivam)
- Rename qcom_pcie_config_sid_sm8250() to be non SM8250-specific
(Manivannan Sadhasivam)
- Add DT "mhi" register region for supported SoCs (Manivannan Sadhasivam)
- Expose link transition counts via debugfs to help debug low power issues
(Manivannan Sadhasivam)
- Support system suspend and resume; reduce interconnect bandwidth and turn
off clock and PHY if there are no active devices (Manivannan Sadhasivam)
- Enable async probe by default to reduce boot time (Manivannan Sadhasivam)
- Add Manivannan Sadhasivam as qcom DT binding maintainer, replacing
Stanimir Varbanov (Manivannan Sadhasivam)
- Add DT binding and driver support for Qcom SDX55 SoC (Manivannan
Sadhasivam)
- Add DT binding and driver support for SM8550 SoC (Abel Vesa)
- Document msi-map and msi-map-mask DT properties (Manivannan Sadhasivam)
* pci/controller/qcom:
dt-bindings: PCI: qcom: Document msi-map and msi-map-mask properties
PCI: qcom: Add SM8550 PCIe support
dt-bindings: PCI: qcom: Add SM8550 compatible
PCI: qcom: Add support for SDX55 SoC
dt-bindings: PCI: qcom-ep: Fix the unit address used in example
dt-bindings: PCI: qcom: Add SDX55 SoC
dt-bindings: PCI: qcom: Update maintainers entry
PCI: qcom: Enable async probe by default
PCI: qcom: Add support for system suspend and resume
PCI: qcom: Expose link transition counts via debugfs
dt-bindings: PCI: qcom: Add "mhi" register region to supported SoCs
PCI: qcom: Rename qcom_pcie_config_sid_sm8250() to reflect IP version
PCI: qcom: Use macros for defining total no. of clocks & supplies
PCI: qcom: Use bulk reset APIs for handling resets for IP rev 2.4.0
PCI: qcom: Use bulk reset APIs for handling resets for IP rev 2.3.3
PCI: qcom: Use bulk clock APIs for handling clocks for IP rev 2.3.3
PCI: qcom: Use bulk clock APIs for handling clocks for IP rev 2.3.2
PCI: qcom: Use bulk clock APIs for handling clocks for IP rev 1.0.0
PCI: qcom: Use bulk reset APIs for handling resets for IP rev 2.1.0
PCI: qcom: Use lower case for hex
PCI: qcom: Add missing macros for register fields
PCI: qcom: Use bitfield definitions for register fields
PCI: qcom: Sort and group registers and bitfield definitions
PCI: qcom: Remove PCIE20_ prefix from register definitions
PCI: qcom: Fix the incorrect register usage in v2.7.0 config
- Use the PCI_CONF1_ADDRESS() macro to simplify config space address
computation (Pali Rohár)
* pci/controller/ixp4xx:
PCI: ixp4xx: Use PCI_CONF1_ADDRESS() macro
- Install i.MX6 PCI abort handler only when DT contains a PCI controller
claimed by the imx6 driver (H. Nikolaus Schaller)
* pci/controller/dwc:
PCI: imx6: Install the fault handler only on compatible match
- Wait longer for devices to become ready after resume (as we do for reset)
to accommodate Intel Titan Ridge xHCI devices (Mika Westerberg)
- Drop pci_bridge_wait_for_secondary_bus() timeout parameter since all
callers pass the same value (Mika Westerberg)
- Extend D3hot delay for NVIDIA HDA controllers to avoid unrecoverable
devices after a bus reset (Alex Williamson)
* pci/reset:
PCI/PM: Extend D3hot delay for NVIDIA HDA controllers
PCI/PM: Drop pci_bridge_wait_for_secondary_bus() timeout parameter
PCI/PM: Increase wait time after resume
- Use of_property_present(), instead of lower-level functions like
of_get_property(), for testing DT property presence (Rob Herring)
* pci/enumeration:
PCI: Use of_property_present() for testing DT property presence
It is preferred to use typed property access functions (i.e.
of_property_read_<type> functions) rather than low-level
of_get_property()/of_find_property() functions for reading properties. As
part of this, convert of_get_property()/of_find_property() calls to the
recently added of_property_present() helper when we just want to test for
presence of a property and nothing more.
Link: https://lore.kernel.org/r/20230310144719.1544443-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> # pcie-mediatek
Assignment of NVIDIA Ampere-based GPUs have seen a regression since the
below referenced commit, where the reduced D3hot transition delay appears
to introduce a small window where a D3hot->D0 transition followed by a bus
reset can wedge the device. The entire device is subsequently unavailable,
returning -1 on config space read and is unrecoverable without a host
reset.
This has been observed with RTX A2000 and A5000 GPU and audio functions
assigned to a Windows VM, where shutdown of the VM places the devices in
D3hot prior to vfio-pci performing a bus reset when userspace releases the
devices. The issue has roughly a 2-3% chance of occurring per shutdown.
Restoring the HDA controller d3hot_delay to the effective value before the
below commit has been shown to resolve the issue. NVIDIA confirms this
change should be safe for all of their HDA controllers.
Fixes: 3e347969a5 ("PCI/PM: Reduce D3hot delay with usleep_range()")
Link: https://lore.kernel.org/r/20230413194042.605768-1-alex.williamson@redhat.com
Reported-by: Zhiyi Guo <zhguo@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Tarun Gupta <targupta@nvidia.com>
Cc: Abhishek Sahu <abhsahu@nvidia.com>
Cc: Tarun Gupta <targupta@nvidia.com>
For PCI pass-thru devices in a Confidential VM, Hyper-V requires
that PCI config space be accessed via hypercalls. In normal VMs,
config space accesses are trapped to the Hyper-V host and emulated.
But in a confidential VM, the host can't access guest memory to
decode the instruction for emulation, so an explicit hypercall must
be used.
Add functions to make the new MMIO read and MMIO write hypercalls.
Update the PCI config space access functions to use the hypercalls
when such use is indicated by Hyper-V flags. Also, set the flag to
allow the Hyper-V PCI driver to be loaded and used in a Confidential
VM (a.k.a., "Isolation VM"). The driver has previously been hardened
against a malicious Hyper-V host[1].
[1] https://lore.kernel.org/all/20220511223207.3386-2-parri.andrea@gmail.com/
Co-developed-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Link: https://lore.kernel.org/r/1679838727-87310-13-git-send-email-mikelley@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
pci_msix_validate_entries() validates the entries array which is handed in
by the caller for a MSI-X interrupt allocation. Aside of consistency
failures it also detects a failure when the size of the MSI-X hardware table
in the device is smaller than the size of the entries array.
That's wrong for the case of range allocations where the caller provides
the minimum and the maximum number of vectors to allocate, when the
hardware size is greater or equal than the mininum, but smaller than the
maximum.
Remove the hardware size check completely from that function and just
ensure that the entires array up to the maximum size is consistent.
The limitation and range checking versus the hardware size happens
independently of that afterwards anyway because the entries array is
optional.
Fixes: 4644d22eb6 ("PCI/MSI: Validate MSI-X contiguous restriction early")
Reported-by: David Laight <David.Laight@aculab.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87v8i3sg62.ffs@tglx
SM8550 requires two additional clocks for proper working.
Add these two clocks as optional clocks (as only required by this
platform) and compatible for this platform.
While at it, let's also rename the reset variable to "rst" from
"pci_reset" to match the existing naming preference.
Link: https://lore.kernel.org/r/20230320144658.1794991-2-abel.vesa@linaro.org
Signed-off-by: Abel Vesa <abel.vesa@linaro.org>
[lpieralisi@kernel.org: commit log rewording]
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>
Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>
Reviewed-by: Johan Hovold <johan+linaro@kernel.org>
During the system suspend, vote for minimal interconnect bandwidth (1KiB)
to keep the interconnect path active for config access and also turn OFF
the resources like clock and PHY if there are no active devices connected
to the controller. For the controllers with active devices, the resources
are kept ON as removing the resources will trigger access violation during
the late end of suspend cycle as kernel tries to access the config space of
PCIe devices to mask the MSIs.
Also, it is not desirable to put the link into L2/L3 state as that
implies VDD supply will be removed and the devices may go into powerdown
state. This will affect the lifetime of storage devices like NVMe.
And finally, during resume, turn ON the resources if the controller was
truly suspended (resources OFF) and update the interconnect bandwidth
based on PCIe Gen speed.
Suggested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
Link: https://lore.kernel.org/r/20230403154922.20704-2-manivannan.sadhasivam@linaro.org
Tested-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
Reviewed-by: Johan Hovold <johan+linaro@kernel.org>
Acked-by: Dhruva Gole <d-gole@ti.com>
PCIe r6.0 sec 6.6.1 prescribes that a device must be able to respond to
config requests within 1.0 s (PCI_RESET_WAIT) after exiting conventional
reset and this same delay is prescribed when coming out of D3cold (as that
involves reset too).
A device that requires more than 1 second to initialize after reset may
respond to config requests with Request Retry Status completions (sec
2.3.1), and we accommodate that in Linux with a 60 second cap
(PCIE_RESET_READY_POLL_MS).
Previously we waited up to PCIE_RESET_READY_POLL_MS only in the reset code
path, not in the resume path. However, a device has surfaced, namely Intel
Titan Ridge xHCI, which requires a longer delay also in the resume code
path.
Make the resume code path to use this same extended delay as the reset
path.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216728
Link: https://lore.kernel.org/r/20230404052714.51315-2-mika.westerberg@linux.intel.com
Reported-by: Chris Chiu <chris.chiu@canonical.com>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Lukas Wunner <lukas@wunner.de>
Pull pci fixes from Bjorn Helgaas:
- Provide pci_msix_can_alloc_dyn() stub when CONFIG_PCI_MSI unset to
avoid build errors (Reinette Chatre)
- Quirk AMD XHCI controller that loses MSI-X state in D3hot to avoid
broken USB after hotplug or suspend/resume (Basavaraj Natikar)
- Fix use-after-free in pci_bus_release_domain_nr() (Rob Herring)
* tag 'pci-v6.3-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
PCI: Fix use-after-free in pci_bus_release_domain_nr()
x86/PCI: Add quirk for AMD XHCI controller that loses MSI-X state in D3hot
PCI/MSI: Provide missing stub for pci_msix_can_alloc_dyn()
In 2013, commits
2e35afaefe ("PCI: pciehp: Add reset_slot() method")
608c388122 ("PCI: Add slot reset option to pci_dev_reset()")
amended PCIe hotplug to mask Presence Detect Changed events during a
Secondary Bus Reset. The reset thus no longer causes gratuitous slot
bringdown and bringup.
However the commits neglected to serialize reset with code paths reading
slot registers. For instance, a slot bringup due to an earlier hotplug
event may see the Presence Detect State bit cleared during a concurrent
Secondary Bus Reset.
In 2018, commit
5b3f7b7d06 ("PCI: pciehp: Avoid slot access during reset")
retrofitted the missing locking. It introduced a reset_lock which
serializes a Secondary Bus Reset with other parts of pciehp.
Unfortunately the locking turns out to be overzealous: reset_lock is
held for the entire enumeration and de-enumeration of hotplugged devices,
including driver binding and unbinding.
Driver binding and unbinding acquires device_lock while the reset_lock
of the ancestral hotplug port is held. A concurrent Secondary Bus Reset
acquires the ancestral reset_lock while already holding the device_lock.
The asymmetric locking order in the two code paths can lead to AB-BA
deadlocks.
Michael Haeuptle reports such deadlocks on simultaneous hot-removal and
vfio release (the latter implies a Secondary Bus Reset):
pciehp_ist() # down_read(reset_lock)
pciehp_handle_presence_or_link_change()
pciehp_disable_slot()
__pciehp_disable_slot()
remove_board()
pciehp_unconfigure_device()
pci_stop_and_remove_bus_device()
pci_stop_bus_device()
pci_stop_dev()
device_release_driver()
device_release_driver_internal()
__device_driver_lock() # device_lock()
SYS_munmap()
vfio_device_fops_release()
vfio_device_group_close()
vfio_device_close()
vfio_device_last_close()
vfio_pci_core_close_device()
vfio_pci_core_disable() # device_lock()
__pci_reset_function_locked()
pci_reset_bus_function()
pci_dev_reset_slot_function()
pci_reset_hotplug_slot()
pciehp_reset_slot() # down_write(reset_lock)
Ian May reports the same deadlock on simultaneous hot-removal and an
AER-induced Secondary Bus Reset:
aer_recover_work_func()
pcie_do_recovery()
aer_root_reset()
pci_bus_error_reset()
pci_slot_reset()
pci_slot_lock() # device_lock()
pci_reset_hotplug_slot()
pciehp_reset_slot() # down_write(reset_lock)
Fix by releasing the reset_lock during driver binding and unbinding,
thereby splitting and shrinking the critical section.
Driver binding and unbinding is protected by the device_lock() and thus
serialized with a Secondary Bus Reset. There's no need to additionally
protect it with the reset_lock. However, pciehp does not bind and
unbind devices directly, but rather invokes PCI core functions which
also perform certain enumeration and de-enumeration steps.
The reset_lock's purpose is to protect slot registers, not enumeration
and de-enumeration of hotplugged devices. That would arguably be the
job of the PCI core, not the PCIe hotplug driver. After all, an
AER-induced Secondary Bus Reset may as well happen during boot-time
enumeration of the PCI hierarchy and there's no locking to prevent that
either.
Exempting *de-enumeration* from the reset_lock is relatively harmless:
A concurrent Secondary Bus Reset may foil config space accesses such as
PME interrupt disablement. But if the device is physically gone, those
accesses are pointless anyway. If the device is physically present and
only logically removed through an Attention Button press or the sysfs
"power" attribute, PME interrupts as well as DMA cannot come through
because pciehp_unconfigure_device() disables INTx and Bus Master bits.
That's still protected by the reset_lock in the present commit.
Exempting *enumeration* from the reset_lock also has limited impact:
The exempted call to pci_bus_add_device() may perform device accesses
through pcibios_bus_add_device() and pci_fixup_device() which are now
no longer protected from a concurrent Secondary Bus Reset. Otherwise
there should be no impact.
In essence, the present commit seeks to fix the AB-BA deadlocks while
still retaining a best-effort reset protection for enumeration and
de-enumeration of hotplugged devices -- until a general solution is
implemented in the PCI core.
Link: https://lore.kernel.org/linux-pci/CS1PR8401MB0728FC6FDAB8A35C22BD90EC95F10@CS1PR8401MB0728.NAMPRD84.PROD.OUTLOOK.COM
Link: https://lore.kernel.org/linux-pci/20200615143250.438252-1-ian.may@canonical.com
Link: https://lore.kernel.org/linux-pci/ce878dab-c0c4-5bd0-a725-9805a075682d@amd.com
Link: https://lore.kernel.org/linux-pci/ed831249-384a-6d35-0831-70af191e9bce@huawei.com
Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590
Fixes: 5b3f7b7d06 ("PCI: pciehp: Avoid slot access during reset")
Link: https://lore.kernel.org/r/fef2b2e9edf245c049a8c5b94743c0f74ff5008a.1681191902.git.lukas@wunner.de
Reported-by: Michael Haeuptle <michael.haeuptle@hpe.com>
Reported-by: Ian May <ian.may@canonical.com>
Reported-by: Andrey Grodzovsky <andrey2805@gmail.com>
Reported-by: Rahul Kumar <rahul.kumar1@amd.com>
Reported-by: Jialin Zhang <zhangjialin11@huawei.com>
Tested-by: Anatoli Antonovitch <Anatoli.Antonovitch@amd.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v4.19+
Cc: Dan Stein <dstein@hpe.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Alex Michon <amichon@kalrayinc.com>
Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>
Qualcomm PCIe controllers have debug registers in the MHI region that
count PCIe link transitions. Expose them over debugfs to userspace to
help debug the low power issues.
Note that even though the registers are prefixed as PARF_, they don't
live under the "parf" register region. The register naming is following
the Qualcomm's internal documentation as like other registers.
While at it, let's arrange the local variables in probe function to follow
reverse XMAS tree order.
Link: https://lore.kernel.org/r/20230316081117.14288-20-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>
All the resets are asserted and deasserted at the same time. So the bulk
reset APIs can be used to handle them together. This simplifies the code
a lot.
It should be noted that there were delays in-between the reset asserts and
deasserts. But going by the config used by other revisions, those delays
are not really necessary. So a single delay after all asserts and one after
deasserts is used.
The total number of resets supported is 12 but only ipq4019 is using all of
them.
Link: https://lore.kernel.org/r/20230316081117.14288-13-manivannan.sadhasivam@linaro.org
Tested-by: Sricharan Ramabadhran <quic_srichara@quicinc.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>