Commit Graph

1074 Commits

Author SHA1 Message Date
Linus Torvalds
7a0892d283 Merge tag 'pci-v6.18-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci fixes from Bjorn Helgaas:

 - Cache the ASPM L0s/L1 Supported bits early so quirks can override
   them if necessary (Bjorn Helgaas)

 - Add quirks for PA Semi and Freescale Root Ports and a HiSilicon Wi-Fi
   device that are reported to have broken L0s and L1 (Shawn Lin, Bjorn
   Helgaas)

* tag 'pci-v6.18-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/ASPM: Avoid L0s and L1 on Hi1105 [19e5:1105] Wi-Fi
  PCI/ASPM: Avoid L0s and L1 on PA Semi [1959:a002] Root Ports
  PCI/ASPM: Avoid L0s and L1 on Freescale [1957:0451] Root Ports
  PCI/ASPM: Convert quirks to override advertised link states
  PCI/ASPM: Add pcie_aspm_remove_cap() to override advertised link states
  PCI/ASPM: Cache L0s/L1 Supported so advertised link states can be overridden
2025-11-14 15:45:31 -08:00
Bjorn Helgaas
575b98e39d PCI/ASPM: Add pcie_aspm_remove_cap() to override advertised link states
Add pcie_aspm_remove_cap().  A quirk can use this to prevent use of ASPM
L0s or L1 link states, even if the device advertised support for them.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shawn Lin <shawn.lin@rock-chips.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Link: https://patch.msgid.link/20251110222929.2140564-3-helgaas@kernel.org
2025-11-12 18:51:27 -06:00
Bjorn Helgaas
4495bffd86 PCI/ASPM: Cache L0s/L1 Supported so advertised link states can be overridden
Defective devices sometimes advertise support for ASPM L0s or L1 states
even if they don't work correctly.

Cache the L0s Supported and L1 Supported bits early in enumeration so
HEADER quirks can override the ASPM states advertised in Link Capabilities
before pcie_aspm_cap_init() enables ASPM.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Shawn Lin <shawn.lin@rock-chips.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Link: https://patch.msgid.link/20251110222929.2140564-2-helgaas@kernel.org
2025-11-12 18:47:16 -06:00
Bjorn Helgaas
df5192d9bb PCI/ASPM: Enable only L0s and L1 for devicetree platforms
f3ac2ff148 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree
platforms") enabled Clock Power Management and L1 PM Substates, but those
features depend on CLKREQ# and possibly other device-specific
configuration.  We don't know whether CLKREQ# is supported, so we shouldn't
blindly enable Clock PM and L1 PM Substates.

Enable only ASPM L0s and L1, and only when both ends of the link advertise
support for them.

Fixes: f3ac2ff148 ("PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms")
Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Link: https://lore.kernel.org/r/db5c95a1-cf3e-46f9-8045-a1b04908051a@xenosoft.de/
Reported-by: FUKAUMI Naoki <naoki@radxa.com>
Closes: https://lore.kernel.org/r/22594781424C5C98+22cb5d61-19b1-4353-9818-3bb2b311da0b@radxa.com/
Reported-by: Herve Codina <herve.codina@bootlin.com>
Link: https://lore.kernel.org/r/20251015101304.3ec03e6b@bootlin.com/
Reported-by: Diederik de Haas <diederik@cknow-tech.com>
Closes: https://lore.kernel.org/r/DDJXHRIRGTW9.GYC2ULZ5WQAL@cknow-tech.com/
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: FUKAUMI Naoki <naoki@radxa.com>
Tested-by: Diederik de Haas <diederik@cknow-tech.com>
Acked-by: Dragan Simic <dsimic@manjaro.org>
Link: https://patch.msgid.link/20251023180645.1304701-1-helgaas@kernel.org
2025-10-23 20:08:14 -05:00
Bjorn Helgaas
7cc5e1e62b Merge branch 'pci/aspm'
- Enable all ClockPM and ASPM states for devicetree platforms, since
  there's typically no firmware that enables ASPM (Manivannan Sadhasivam)

- Remove the qcom code that enabled ASPM (Manivannan Sadhasivam)

* pci/aspm:
  PCI: qcom: Remove custom ASPM enablement code
  PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms
2025-10-03 12:13:07 -05:00
Breno Leitao
deb2f22838 PCI/AER: Avoid NULL pointer dereference in aer_ratelimit()
When platform firmware supplies error information to the OS, e.g., via the
ACPI APEI GHES mechanism, it may identify an error source device that
doesn't advertise an AER Capability and therefore dev->aer_info, which
contains AER stats and ratelimiting data, is NULL.

pci_dev_aer_stats_incr() already checks dev->aer_info for NULL, but
aer_ratelimit() did not, leading to NULL pointer dereferences like this one
from the URL below:

  {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 0
  {1}[Hardware Error]: event severity: corrected
  {1}[Hardware Error]:   device_id: 0000:00:00.0
  {1}[Hardware Error]:   vendor_id: 0x8086, device_id: 0x2020
  {1}[Hardware Error]:   aer_cor_status: 0x00001000, aer_cor_mask: 0x00002000
  BUG: kernel NULL pointer dereference, address: 0000000000000264
  RIP: 0010:___ratelimit+0xc/0x1b0
  pci_print_aer+0x141/0x360
  aer_recover_work_func+0xb5/0x130

[8086:2020] is an Intel "Sky Lake-E DMI3 Registers" device that claims to
be a Root Port but does not advertise an AER Capability.

Add a NULL check in aer_ratelimit() to avoid the NULL pointer dereference.
Note that this also prevents ratelimiting these events from GHES.

Fixes: a57f2bfb4a ("PCI/AER: Ratelimit correctable and non-fatal error logging")
Link: https://lore.kernel.org/r/buduna6darbvwfg3aogl5kimyxkggu3n4romnmq6sozut6axeu@clnx7sfsy457/
Signed-off-by: Breno Leitao <leitao@debian.org>
[bhelgaas: add crash details to commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20250929-aer_crash_2-v1-1-68ec4f81c356@debian.org
2025-10-02 09:35:10 -05:00
Manivannan Sadhasivam
f3ac2ff148 PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platforms
So far, the PCI subsystem has honored the ASPM and Clock PM states set by
the BIOS (through LNKCTL) during device initialization, if it relies on the
default state selected using:

* Kconfig: CONFIG_PCIEASPM_DEFAULT=y, or
* cmdline: "pcie_aspm=off", or
* FADT: ACPI_FADT_NO_ASPM

This was done conservatively to avoid issues with the buggy devices that
advertise ASPM capabilities, but behave erratically if the ASPM states are
enabled. So the PCI subsystem ended up trusting the BIOS to enable only the
ASPM states that were known to work for the devices.

But this turned out to be a problem for devicetree platforms, especially
the ARM based devicetree platforms powering Embedded and *some* Compute
devices as they tend to run without any standard BIOS. So the ASPM states
on these platforms were left disabled during boot and the PCI subsystem
never bothered to enable them, unless the user has forcefully enabled the
ASPM states through Kconfig, cmdline, and sysfs or the device drivers
themselves, enabling the ASPM states through pci_enable_link_state() APIs.

This caused runtime power issues on those platforms. So a couple of
approaches were tried to mitigate this BIOS dependency without user
intervention by enabling the ASPM states in the PCI controller drivers
after device enumeration, and overriding the ASPM/Clock PM states
by the PCI controller drivers through an API before enumeration.

But it has been concluded that none of these mitigations should really be
required and the PCI subsystem should enable the ASPM states advertised by
the devices without relying on BIOS or the PCI controller drivers. If any
device is found to be misbehaving after enabling ASPM states that they
advertised, then those devices should be quirked to disable the problematic
ASPM/Clock PM states.

In an effort to do so, start by overriding the ASPM and Clock PM states set
by the BIOS for devicetree platforms first. Separate helper functions are
introduced to override the BIOS set states by enabling all of them if
of_have_populated_dt() returns true. To aid debugging, print the overridden
ASPM and Clock PM states as well.

In the future, these helpers could be extended to allow other platforms
like VMD, newer ACPI systems with a cutoff year etc... to follow the path.

Link: https://lore.kernel.org/linux-pci/20250828204345.GA958461@bhelgaas
Suggested-by: Bjorn Helgaas <helgaas@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com>
[bhelgaas: tweak comments and dmesg logs]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250922-pci-dt-aspm-v2-1-2a65cf84e326@oss.qualcomm.com
2025-09-23 18:06:33 -05:00
Vernon Yang
0a27bdb14b PCI/AER: Fix NULL pointer access by aer_info
The kzalloc(GFP_KERNEL) may return NULL, so all accesses to aer_info->xxx
will result in kernel panic. Fix it.

Signed-off-by: Vernon Yang <yanglincheng@kylinos.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250904182527.67371-1-vernon2gm@gmail.com
2025-09-11 17:55:33 -05:00
Lukas Wunner
c8ab5e888b PCI/AER: Print TLP Log for errors introduced since PCIe r1.1
When reporting an error, the AER driver prints the TLP Header / Prefix Log
only for errors enumerated in the AER_LOG_TLP_MASKS macro.

The macro was never amended since its introduction in 2006 with commit
6c2b374d74 ("PCI-Express AER implemetation: AER core and aerdriver").
At the time, PCIe r1.1 was the latest spec revision.

Amend the macro with errors defined since then to avoid omitting the TLP
Header / Prefix Log for newer errors.

The order of the errors in AER_LOG_TLP_MASKS follows PCIe r1.1 sec 6.2.7
rather than 7.10.2, because only the former documents for which errors a
TLP Header / Prefix is logged.  Retain this order.  The section number is
still 6.2.7 in today's PCIe r7.0.

For Completion Timeouts, the TLP Header / Prefix is only logged if the
Completion Timeout Prefix / Header Log Capable bit is set in the AER
Capabilities and Control register.  Introduce a tlp_header_logged() helper
to check whether the TLP Header / Prefix Log is populated and use it in
the two places which currently match against AER_LOG_TLP_MASKS directly.

For Uncorrectable Internal Errors, logging of the TLP Header / Prefix is
optional per PCIe r7.0 sec 6.2.7.  If needed, drivers could indicate
through a flag whether devices are capable and tlp_header_logged() could
then check that flag.

pcitools introduced macros for newer errors with commit 144b0911cc0b
("ls-ecaps: extend decode support for more fields for AER CE and UE
status"):
  https://git.kernel.org/pub/scm/utils/pciutils/pciutils.git/commit/?id=144b0911cc0b

Unfortunately some of those macros are overly long:
  PCI_ERR_UNC_POISONED_TLP_EGRESS
  PCI_ERR_UNC_DMWR_REQ_EGRESS_BLOCKED
  PCI_ERR_UNC_IDE_CHECK
  PCI_ERR_UNC_MISR_IDE_TLP
  PCI_ERR_UNC_PCRC_CHECK
  PCI_ERR_UNC_TLP_XLAT_EGRESS_BLOCKED

This seems unsuitable for <linux/pci_regs.h>, so shorten to:
  PCI_ERR_UNC_POISON_BLK
  PCI_ERR_UNC_DMWR_BLK
  PCI_ERR_UNC_IDE_CHECK
  PCI_ERR_UNC_MISR_IDE
  PCI_ERR_UNC_PCRC_CHECK
  PCI_ERR_UNC_XLAT_BLK

Note that some of the existing macros in <linux/pci_regs.h> do not match
exactly with pcitools (e.g. PCI_ERR_UNC_SDES versus PCI_ERR_UNC_SURPDN),
so it does not seem mandatory for them to be identical.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/5f707caf1260bd8f15012bb032f7da9a9b898aba.1756712066.git.lukas@wunner.de
2025-09-04 10:09:05 -05:00
Lukas Wunner
6633875250 PCI/AER: Support errors introduced by PCIe r6.0
PCIe r6.0 defined five additional errors in the Uncorrectable Error
Status, Mask and Severity Registers (PCIe r7.0 sec 7.8.4.2ff).

lspci has been supporting them since commit 144b0911cc0b ("ls-ecaps:
extend decode support for more fields for AER CE and UE status"):

  https://git.kernel.org/pub/scm/utils/pciutils/pciutils.git/commit/?id=144b0911cc0b

Amend the AER driver to recognize them as well, instead of logging them as
"Unknown Error Bit".

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/21f1875b18d4078c99353378f37dcd6b994f6d4e.1756301211.git.lukas@wunner.de
2025-08-27 16:15:24 -05:00
Lukas Wunner
45bc82563d PCI/ERR: Update device error_state already after reset
After a Fatal Error has been reported by a device and has been recovered
through a Secondary Bus Reset, AER updates the device's error_state to
pci_channel_io_normal before invoking its driver's ->resume() callback.

By contrast, EEH updates the error_state earlier, namely after resetting
the device and before invoking its driver's ->slot_reset() callback.
Commit c58dc575f3 ("powerpc/pseries: Set error_state to
pci_channel_io_normal in eeh_report_reset()") explains in great detail
that the earlier invocation is necessitated by various drivers checking
accessibility of the device with pci_channel_offline() and avoiding
accesses if it returns true.  It returns true for any other error_state
than pci_channel_io_normal.

The device should be accessible already after reset, hence the reasoning
is that it's safe to update the error_state immediately afterwards.

This deviation between AER and EEH seems problematic because drivers
behave differently depending on which error recovery mechanism the
platform uses.  Three drivers have gone so far as to update the
error_state themselves, presumably to work around AER's behavior.

For consistency, amend AER to update the error_state at the same recovery
steps as EEH.  Drop the now unnecessary workaround from the three drivers.

Keep updating the error_state before ->resume() in case ->error_detected()
or ->mmio_enabled() return PCI_ERS_RESULT_RECOVERED, which causes
->slot_reset() to be skipped.  There are drivers doing this even for Fatal
Errors, e.g. mhi_pci_error_detected().

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/4517af6359ffb9d66152b827a5d2833459144e3f.1755008151.git.lukas@wunner.de
2025-08-13 13:20:56 -05:00
Lukas Wunner
9011f0667c PCI/ERR: Notify drivers on failure to recover
According to Documentation/PCI/pci-error-recovery.rst, the following shall
occur on failure to recover from a PCIe Uncorrectable Error:

  STEP 6: Permanent Failure
  -------------------------
  A "permanent failure" has occurred, and the platform cannot recover
  the device.  The platform will call error_detected() with a
  pci_channel_state_t value of pci_channel_io_perm_failure.

  The device driver should, at this point, assume the worst. It should
  cancel all pending I/O, refuse all new I/O, returning -EIO to
  higher layers. The device driver should then clean up all of its
  memory and remove itself from kernel operations, much as it would
  during system shutdown.

Sathya notes that AER does not call error_detected() on failure and thus
deviates from the document (as well as EEH, for which the document was
originally added).

Most drivers do nothing on permanent failure, but the SCSI drivers and a
number of Ethernet drivers do take advantage of the notification to flush
queues and give up resources.

Amend AER to notify such drivers and align with the documentation and EEH.

Link: https://lore.kernel.org/r/f496fc0f-64d7-46a4-8562-dba74e31a956@linux.intel.com/
Suggested-by: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/ec212d4d4f5c65d29349df33acdc9768ff8279d1.1755008151.git.lukas@wunner.de
2025-08-13 13:20:46 -05:00
Lukas Wunner
1cbc5e25fb PCI/ERR: Fix uevent on failure to recover
Upon failure to recover from a PCIe error through AER, DPC or EDR, a
uevent is sent to inform user space about disconnection of the bridge
whose subordinate devices failed to recover.

However the bridge itself is not disconnected.  Instead, a uevent should
be sent for each of the subordinate devices.

Only if the "bridge" happens to be a Root Complex Event Collector or
Integrated Endpoint does it make sense to send a uevent for it (because
there are no subordinate devices).

Right now if there is a mix of subordinate devices with and without
pci_error_handlers, a BEGIN_RECOVERY event is sent for those with
pci_error_handlers but no FAILED_RECOVERY event is ever sent for them
afterwards.  Fix it.

Fixes: 856e1eb9bd ("PCI/AER: Add uevents in AER and EEH error/resume")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org  # v4.16+
Link: https://patch.msgid.link/68fc527a380821b5d861dd554d2ce42cb739591c.1755008151.git.lukas@wunner.de
2025-08-13 13:20:35 -05:00
Lukas Wunner
d0a2dee7d4 PCI/AER: Allow drivers to opt in to Bus Reset on Non-Fatal Errors
When Advanced Error Reporting was introduced in September 2006 by commit
6c2b374d74 ("PCI-Express AER implemetation: AER core and aerdriver"), it
sought to adhere to the recovery flow and callbacks specified in
Documentation/PCI/pci-error-recovery.rst.

That document had been added in January 2006, when Enhanced Error Handling
(EEH) was introduced for PowerPC with commit 065c635907 ("[PATCH] PCI
Error Recovery: documentation").

However the AER driver deviates from the document in that it never
performs a Secondary Bus Reset on Non-Fatal Errors, but always on Fatal
Errors.  By contrast, EEH allows drivers to opt in or out of a Bus Reset
regardless of error severity, by returning PCI_ERS_RESULT_NEED_RESET or
PCI_ERS_RESULT_CAN_RECOVER from their ->error_detected() callback.  If all
drivers agree that they can recover without a Bus Reset, EEH skips it.
Should one of them request a Bus Reset, it overrides all other drivers.

This inconsistency between EEH and AER seems problematic because drivers
need to be aware of and cope with it.

The file Documentation/PCI/pcieaer-howto.rst hints at a rationale for
always performing a Bus Reset on Fatal Errors:  "Fatal errors [...] cause
the link to be unreliable.  [...] This [reset_link] callback is used to
reset the PCIe physical link when a fatal error happens.  If an error
message indicates a fatal error, [...] performing link reset at upstream
is necessary."

There's no such rationale provided for never performing a Bus Reset on
Non-Fatal Errors.

The "xe" driver has a need to attempt a reset of local units on graphics
cards upon a Non-Fatal Error.  If that is insufficient for recovery, the
driver wants to opt in to a Bus Reset.

Accommodate such use cases and align AER more closely with EEH by
performing a Bus Reset in pcie_do_recovery() if drivers request it and the
faulting device's channel_state is pci_channel_io_normal.  The AER driver
sets this channel_state for Non-Fatal Errors.  For Fatal Errors, it uses
pci_channel_io_frozen.

This limits the deviation from Documentation/PCI/pci-error-recovery.rst
and EEH to the unconditional Bus Reset on Fatal Errors.

pcie_do_recovery() is also invoked by the Downstream Port Containment and
Error Disconnect Recover drivers.  They both set the channel_state to
pci_channel_io_frozen, hence pcie_do_recovery() continues to always invoke
the ->reset_subordinates() callback in their case.  That is necessary
because the callback brings the link back up at the containing Downstream
Port.

There are two behavioral changes resulting from this commit:

First, if channel_state is pci_channel_io_normal and one of the affected
drivers returns PCI_ERS_RESULT_NEED_RESET from its ->error_detected()
callback, a Bus Reset will now be performed.  There are drivers doing this
and although it would be possible to avoid a behavioral change by letting
them return PCI_ERS_RESULT_CAN_RECOVER instead, the impression I got from
examination of all drivers is that they actually expect or want a Bus
Reset (cxl_error_detected() is a case in point).  In any case, if they can
cope with a Bus Reset on Fatal Errors, they shouldn't have issues with a
Bus Reset on Non-Fatal Errors.

Second, if channel_state is pci_channel_io_frozen and all affected drivers
return PCI_ERS_RESULT_CAN_RECOVER from ->error_detected(), their
->mmio_enabled() callback is now invoked prior to performing a Bus Reset,
instead of afterwards.  This actually makes sense:  For example,
drivers/scsi/sym53c8xx_2/sym_glue.c dumps debug registers in its
->mmio_enabled() callback.  Doing so after reset right now captures the
post-reset state instead of the faulting state, which is useless.

There is only one other driver which implements ->mmio_enabled() and
returns PCI_ERS_RESULT_CAN_RECOVER from ->error_detected() for
channel_state pci_channel_io_frozen, drivers/scsi/ipr.c (IBM Power RAID).
It appears to only be used on EEH platforms.  So the second behavioral
change is limited to these two drivers.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/28fd805043bb57af390168d05abb30898cf4fc58.1755008151.git.lukas@wunner.de
2025-08-13 13:20:26 -05:00
Linus Torvalds
0bd0a41a51 Merge tag 'pci-v6.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Allow built-in drivers, not just modular drivers, to use async
     initial probing (Lukas Wunner)

   - Support Immediate Readiness even on devices with no PM Capability
     (Sean Christopherson)

   - Consolidate definition of PCIE_RESET_CONFIG_WAIT_MS (100ms), the
     required delay between a reset and sending config requests to a
     device (Niklas Cassel)

   - Add pci_is_display() to check for "Display" base class and use it
     in ALSA hda, vfio, vga_switcheroo, vt-d (Mario Limonciello)

   - Allow 'isolated PCI functions' (multi-function devices without a
     function 0) for LoongArch, similar to s390 and jailhouse (Huacai
     Chen)

  Power control:

   - Add ability to enable optional slot clock for cases where the PCIe
     host controller and the slot are supplied by different clocks
     (Marek Vasut)

  PCIe native device hotplug:

   - Fix runtime PM ref imbalance on Hot-Plug Capable ports caused by
     misinterpreting a config read failure after a device has been
     removed (Lukas Wunner)

   - Avoid creating a useless PCIe port service device for pciehp if the
     slot is handled by the ACPI hotplug driver (Lukas Wunner)

   - Ignore ACPI hotplug slots when calculating depth of pciehp hotplug
     ports (Lukas Wunner)

  Virtualization:

   - Save VF resizable BAR state and restore it after reset (Michał
     Winiarski)

   - Allow IOV resources (VF BARs) to be resized (Michał Winiarski)

   - Add pci_iov_vf_bar_set_size() so drivers can control VF BAR size
     (Michał Winiarski)

  Endpoint framework:

   - Add RC-to-EP doorbell support using platform MSI controller,
     including a test case (Frank Li)

   - Allow BAR assignment via configfs so platforms have flexibility in
     determining BAR usage (Jerome Brunet)

  Native PCIe controller drivers:

   - Convert amazon,al-alpine-v[23]-pcie, apm,xgene-pcie,
     axis,artpec6-pcie, marvell,armada-3700-pcie, st,spear1340-pcie to
     DT schema format (Rob Herring)

   - Use dev_fwnode() instead of of_fwnode_handle() to remove OF
     dependency in altera (fixes an unused variable), designware-host,
     mediatek, mediatek-gen3, mobiveil, plda, xilinx, xilinx-dma,
     xilinx-nwl (Jiri Slaby, Arnd Bergmann)

   - Convert aardvark, altera, brcmstb, designware-host, iproc,
     mediatek, mediatek-gen3, mobiveil, plda, rcar-host, vmd, xilinx,
     xilinx-dma, xilinx-nwl from using pci_msi_create_irq_domain() to
     using msi_create_parent_irq_domain() instead; this makes the
     interrupt controller per-PCI device, allows dynamic allocation of
     vectors after initialization, and allows support of IMS (Nam Cao)

  APM X-Gene PCIe controller driver:

   - Rewrite MSI handling to MSI CPU affinity, drop useless CPU hotplug
     bits, use device-managed memory allocations, and clean things up
     (Marc Zyngier)

   - Probe xgene-msi as a standard platform driver rather than a
     subsys_initcall (Marc Zyngier)

  Broadcom STB PCIe controller driver:

   - Add optional DT 'num-lanes' property and if present, use it to
     override the Maximum Link Width advertised in Link Capabilities
     (Jim Quinlan)

  Cadence PCIe controller driver:

   - Use PCIe Message routing types from the PCI core rather than
     defining private ones (Hans Zhang)

  Freescale i.MX6 PCIe controller driver:

   - Add IMX8MQ_EP third 64-bit BAR in epc_features (Richard Zhu)

   - Add IMX8MM_EP and IMX8MP_EP fixed 256-byte BAR 4 in epc_features
     (Richard Zhu)

   - Configure LUT for MSI/IOMMU in Endpoint mode so Root Complex can
     trigger doorbel on Endpoint (Frank Li)

   - Remove apps_reset (LTSSM_EN) from
     imx_pcie_{assert,deassert}_core_reset(), which fixes a hotplug
     regression on i.MX8MM (Richard Zhu)

   - Delay Endpoint link start until configfs 'start' written (Richard
     Zhu)

  Intel VMD host bridge driver:

   - Add Intel Panther Lake (PTL)-H/P/U Vendor ID (George D Sworo)

  Qualcomm PCIe controller driver:

   - Add DT binding and driver support for SA8255p, which supports ECAM
     for Configuration Space access (Mayank Rana)

   - Update DT binding and driver to describe PHYs and per-Root Port
     resets in a Root Port stanza and deprecate describing them in the
     host bridge; this makes it possible to support multiple Root Ports
     in the future (Krishna Chaitanya Chundru)

   - Add Qualcomm QCS615 to SM8150 DT binding (Ziyue Zhang)

   - Add Qualcomm QCS8300 to SA8775p DT binding (Ziyue Zhang)

   - Drop TBU and ref clocks from Qualcomm SM8150 and SC8180x DT
     bindings (Konrad Dybcio)

   - Document 'link_down' reset in Qualcomm SA8775P DT binding (Ziyue
     Zhang)

   - Add required PCIE_RESET_CONFIG_WAIT_MS delay after Link up IRQ
     (Niklas Cassel)

  Rockchip PCIe controller driver:

   - Drop unused PCIe Message routing and code definitions (Hans Zhang)

   - Remove several unused header includes (Hans Zhang)

   - Use standard PCIe config register definitions instead of
     rockchip-specific redefinitions (Geraldo Nascimento)

   - Set Target Link Speed to 5.0 GT/s before retraining so we have a
     chance to train at a higher speed (Geraldo Nascimento)

  Rockchip DesignWare PCIe controller driver:

   - Prevent race between link training and register update via DBI by
     inhibiting link training after hot reset and link down (Wilfred
     Mallawa)

   - Add required PCIE_RESET_CONFIG_WAIT_MS delay after Link up IRQ
     (Niklas Cassel)

  Sophgo PCIe controller driver:

   - Add DT binding and driver for Sophgo SG2044 PCIe controller driver
     in Root Complex mode (Inochi Amaoto)

  Synopsys DesignWare PCIe controller driver:

   - Add required PCIE_RESET_CONFIG_WAIT_MS after waiting for Link up on
     Ports that support > 5.0 GT/s. Slower Ports still rely on the
     not-quite-correct PCIE_LINK_WAIT_SLEEP_MS 90ms default delay while
     waiting for the Link (Niklas Cassel)"

* tag 'pci-v6.17-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (116 commits)
  dt-bindings: PCI: qcom,pcie-sa8775p: Document 'link_down' reset
  dt-bindings: PCI: Remove 83xx-512x-pci.txt
  dt-bindings: PCI: Convert amazon,al-alpine-v[23]-pcie to DT schema
  dt-bindings: PCI: Convert marvell,armada-3700-pcie to DT schema
  dt-bindings: PCI: Convert apm,xgene-pcie to DT schema
  dt-bindings: PCI: Convert axis,artpec6-pcie to DT schema
  dt-bindings: PCI: Convert st,spear1340-pcie to DT schema
  PCI: Move is_pciehp check out of pciehp_is_native()
  PCI: pciehp: Use is_pciehp instead of is_hotplug_bridge
  PCI/portdrv: Use is_pciehp instead of is_hotplug_bridge
  PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable ports
  selftests: pci_endpoint: Add doorbell test case
  misc: pci_endpoint_test: Add doorbell test case
  PCI: endpoint: pci-epf-test: Add doorbell test support
  PCI: endpoint: Add pci_epf_align_inbound_addr() helper for inbound address alignment
  PCI: endpoint: pci-ep-msi: Add checks for MSI parent and mutability
  PCI: endpoint: Add RC-to-EP doorbell support using platform MSI controller
  PCI: dwc: Add Sophgo SG2044 PCIe controller driver in Root Complex mode
  PCI: vmd: Switch to msi_create_parent_irq_domain()
  PCI: vmd: Convert to lock guards
  ...
2025-08-01 13:59:07 -07:00
Bjorn Helgaas
58d2b6b6b2 Merge branch 'pci/misc'
- Remove resolved hotplug TODO item (Guilherme Giacomo Simoes)

- Fix typos (Bjorn Helgaas)

* pci/misc:
  PCI: Fix typos
  PCI: hotplug: Remove TODO about unused .get_power(), .hardware_test()
2025-07-31 16:12:19 -05:00
Bjorn Helgaas
0e142889f4 Merge branch 'pci/hotplug'
- Fix runtime PM ref imbalance on Hot-Plug Capable ports caused by
  misinterpreting a config read failure after a device has been removed
  (Lukas Wunner)

- Avoid creating a useless PCIe port service device for pciehp if the slot
  is handled by the ACPI hotplug driver (Lukas Wunner)

- Ignore ACPI hotplug slots when calculating depth of pciehp hotplug ports
  (Lukas Wunner)

- Simplify pci_bridge_d3_possible() and clarify comments (Lukas Wunner)

* pci/hotplug:
  PCI: Move is_pciehp check out of pciehp_is_native()
  PCI: pciehp: Use is_pciehp instead of is_hotplug_bridge
  PCI/portdrv: Use is_pciehp instead of is_hotplug_bridge
  PCI/ACPI: Fix runtime PM ref imbalance on Hot-Plug Capable ports
2025-07-31 16:11:42 -05:00
Bjorn Helgaas
010c310577 Merge branch 'pci/aspm'
- Change aspm_disabled and aspm_force from int to bool (Hans Zhang)

- Initialize val at declaration (Hans Zhang)

* pci/aspm:
  PCI/ASPM: Consolidate variable declaration and initialization
  PCI/ASPM: Use boolean type for aspm_disabled and aspm_force
2025-07-31 16:11:40 -05:00
Lukas Wunner
1d60796a62 PCI/portdrv: Use is_pciehp instead of is_hotplug_bridge
The PCIe port driver erroneously creates a subdevice for hotplug on ACPI
slots which are handled by the ACPI hotplug driver.

Avoid by checking the is_pciehp flag instead of is_hotplug_bridge when
deciding whether to create a subdevice.  The latter encompasses ACPI slots
whereas the former doesn't.

The superfluous subdevice has no real negative impact, it occupies memory
and interrupt resources but otherwise just sits there waiting for
interrupts from the slot that are never signaled.

Fixes: f841522283 ("PCI: Use cached copy of PCI_EXP_SLTCAP_HPC bit")
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v4.7+
Link: https://patch.msgid.link/40d5a5fe8d40595d505949c620a067fa110ee85e.1752390102.git.lukas@wunner.de
2025-07-29 11:45:39 -05:00
Bjorn Helgaas
50fcd1c14e PCI: Fix typos
Fix typos.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20250722213743.2822761-1-helgaas@kernel.org
2025-07-23 13:12:38 -05:00
Akshay Jindal
a6f494becf PCI/AER: Add message when AER_MAX_MULTI_ERR_DEVICES limit is hit
When a PCIe device detects an error, it logs the error locally and issues
an error Message routed to the Root Complex (PCIe r6.0, sec 6.2.5). If the
Root Port or RCEC supports AER and Linux has enabled the AER interrupt,
aer_isr() traverses the relevant devices and adds those with AER errors
logged to the aer_err_info.dev[] array for error logging and recovery.

If aer_isr() finds more than AER_MAX_MULTI_ERR_DEVICES devices with AER
errors logged, it silently ignores them, and those extra devices are not
included in the recovery flow.

Emit an error message if we find more than AER_MAX_MULTI_ERR_DEVICES
devices with AER errors logged.

Testing details at link below.

Signed-off-by: Akshay Jindal <akshayaj.lkd@gmail.com>
[bhelgaas: commit log, join error message]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250619185041.73240-1-akshayaj.lkd@gmail.com
2025-06-26 15:34:22 -05:00
Manivannan Sadhasivam
5aa326a6a2 PCI/PTM: Build debugfs code only if CONFIG_DEBUG_FS is enabled
Otherwise, the following build error will happen for CONFIG_DEBUG_FS=n &&
CONFIG_PCIE_PTM=y:

  drivers/pci/pcie/ptm.c:498:25: error: redefinition of 'pcie_ptm_create_debugfs'
    498 | struct pci_ptm_debugfs *pcie_ptm_create_debugfs(struct device *dev, void *pdata,
	|                         ^
  ./include/linux/pci.h:1915:2: note: previous definition is here
   1915 | *pcie_ptm_create_debugfs(struct device *dev, void *pdata,
	|  ^
  drivers/pci/pcie/ptm.c:546:6: error: redefinition of 'pcie_ptm_destroy_debugfs'
    546 | void pcie_ptm_destroy_debugfs(struct pci_ptm_debugfs *ptm_debugfs)
	|      ^
  ./include/linux/pci.h:1918:1: note: previous definition is here
   1918 | pcie_ptm_destroy_debugfs(struct pci_ptm_debugfs *ptm_debugfs) { }
	|

Fixes: 132833405e ("PCI: Add debugfs support for exposing PTM context")
Reported-by: Eric Biggers <ebiggers@kernel.org>
Closes: https://lore.kernel.org/linux-pci/20250607025506.GA16607@sol
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Eric Biggers <ebiggers@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250608033305.15214-1-manivannan.sadhasivam@linaro.org
2025-06-23 12:55:49 -05:00
Hans Zhang
9890dd3fb7 PCI/AER: Use bool for AER disable state tracking
Change pcie_aer_disable variable to bool and update pci_no_aer()
to set it to true. Improves code readability and aligns with modern
kernel practices.

Signed-off-by: Hans Zhang <hans.zhang@cixtech.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250516165223.125083-3-18255117159@163.com
2025-06-17 17:32:27 -05:00
Hans Zhang
64fd90ef25 PCI/ASPM: Consolidate variable declaration and initialization
Merge the declaration and initialization of 'val' into a single statement
for clarity. This eliminates a redundant assignment operation and improves
code readability while maintaining the same functionality.

Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://patch.msgid.link/20250522161533.394689-1-18255117159@163.com
2025-06-17 17:31:54 -05:00
Hans Zhang
c1842b98c9 PCI/ASPM: Use boolean type for aspm_disabled and aspm_force
The aspm_disabled and aspm_force variables are used as boolean flags.
Change their type from int to bool and update assignments to use
true/false instead of 1/0. This improves code clarity.

Signed-off-by: Hans Zhang <18255117159@163.com>
Signed-off-by: Manivannan Sadhasivam <mani@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/20250517154939.139237-1-18255117159@163.com
2025-06-17 17:31:43 -05:00
Bjorn Helgaas
db847adbf9 Merge branch 'pci/ptm-debugfs'
- Add debugfs support for exposing DWC device-specific PTM context
  (Manivannan Sadhasivam)

* pci/ptm-debugfs:
  PCI: qcom-ep: Mask PTM_UPDATING interrupt
  PCI: dwc: Add debugfs support for PTM context
  PCI: dwc: Pass DWC PCIe mode to dwc_pcie_debugfs_init()
  PCI: Add debugfs support for exposing PTM context
2025-06-04 10:50:44 -05:00
Bjorn Helgaas
1acf6a5e79 Merge branch 'pci/bwctrl'
- Simplify link bandwidth controller by replacing the count of Link
  Bandwidth Management Status (LBMS) events with a PCI_LINK_LBMS_SEEN flag
  (Ilpo Järvinen)

- Update the Link Speed after retraining, since the Link Speed may have
  changed (Ilpo Järvinen)

* pci/bwctrl:
  PCI: Update Link Speed after retraining
  PCI/bwctrl: Replace lbms_count with PCI_LINK_LBMS_SEEN flag
2025-06-04 10:49:49 -05:00
Manivannan Sadhasivam
b06d125e62 PCI/ERR: Remove misleading TODO regarding kernel panic
A PCI device is just another peripheral in a system. So failure to
recover it, must not result in a kernel panic. So remove the TODO which
is quite misleading.

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Link: https://patch.msgid.link/20250508-pcie-reset-slot-v4-1-7050093e2b50@linaro.org
2025-05-30 12:21:19 -05:00
Jon Pan-Doh
b4fe7398de PCI/AER: Add sysfs attributes for log ratelimits
Allow userspace to read/write log ratelimits per device (including
enable/disable). Create aer/ sysfs directory to store them and any
future AER configs.

The new sysfs files are:

  /sys/bus/pci/devices/*/aer/correctable_ratelimit_burst
  /sys/bus/pci/devices/*/aer/correctable_ratelimit_interval_ms
  /sys/bus/pci/devices/*/aer/nonfatal_ratelimit_burst
  /sys/bus/pci/devices/*/aer/nonfatal_ratelimit_interval_ms

The default values are ratelimit_burst=10, ratelimit_interval_ms=5000, so
if we try to emit more than 10 messages in a 5 second period, some are
suppressed.

Update AER sysfs ABI filename to reflect the broader scope of AER sysfs
attributes (e.g. stats and ratelimits).

  Documentation/ABI/testing/sysfs-bus-pci-devices-aer_stats ->
    sysfs-bus-pci-devices-aer

Tested using aer-inject[1]. Configured correctable log ratelimit to 5.
Sent 6 AER errors. Observed 5 errors logged while AER stats
(cat /sys/bus/pci/devices/<dev>/aer_dev_correctable) shows 6.

Disabled ratelimiting and sent 6 more AER errors. Observed all 6 errors
logged and accounted in AER stats (12 total errors).

[1] https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git

[bhelgaas: note fatal errors are not ratelimited, "aer_report" ->
"aer_info", replace ratelimit_log_enable toggle with *_ratelimit_interval_ms]

Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
Signed-off-by: Jon Pan-Doh <pandoh@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-21-helgaas@kernel.org
2025-05-23 11:11:45 -05:00
Jon Pan-Doh
a57f2bfb4a PCI/AER: Ratelimit correctable and non-fatal error logging
Spammy devices can flood kernel logs with AER errors and slow/stall
execution. Add per-device ratelimits for AER correctable and non-fatal
uncorrectable errors that use the kernel defaults (10 per 5s).  Logging of
fatal errors is not ratelimited.

There are two AER logging entry points:

  - aer_print_error() is used by DPC and native AER

  - pci_print_aer() is used by GHES and CXL

The native AER aer_print_error() case includes a loop that may log details
from multiple devices, which are ratelimited individually.  If we log
details for any device, we also log the Error Source ID from the Root Port
or RCEC.

If no such device details are found, we still log the Error Source from the
ERR_* Message, ratelimited by the Root Port or RCEC that received it.

The DPC aer_print_error() case is not ratelimited, since this only happens
for fatal errors.

The CXL pci_print_aer() case is ratelimited by the Error Source device.

The GHES pci_print_aer() case is via aer_recover_work_func(), which
searches for the Error Source device.  If the device is not found, there's
no per-device ratelimit, so we use a system-wide ratelimit that covers all
error types (correctable, non-fatal, and fatal).

Sargun at Meta reported internally that a flood of AER errors causes RCU
CPU stall warnings and CSD-lock warnings.

Tested using aer-inject[1]. Sent 11 AER errors. Observed 10 errors logged
while AER stats (cat /sys/bus/pci/devices/<dev>/aer_dev_correctable) show
true count of 11.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/gong.chen/aer-inject.git

[bhelgaas: commit log, factor out trace_aer_event() and aer_print_rp_info()
changes to previous patches, enable Error Source logging if any downstream
detail will be printed, don't ratelimit fatal errors, "aer_report" ->
"aer_info", "cor_log_ratelimit" -> "correctable_ratelimit",
"uncor_log_ratelimit" -> "nonfatal_ratelimit"]

Reported-by: Sargun Dhillon <sargun@meta.com>
Signed-off-by: Jon Pan-Doh <pandoh@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-19-helgaas@kernel.org
2025-05-23 11:11:45 -05:00
Bjorn Helgaas
d72bae4230 PCI/AER: Simplify add_error_device()
Return -ENOSPC error early so the usual path through add_error_device() is
the straightline code.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-18-helgaas@kernel.org
2025-05-23 11:11:45 -05:00
Bjorn Helgaas
94bc15c348 PCI/AER: Convert aer_get_device_error_info(), aer_print_error() to index
Previously aer_get_device_error_info() and aer_print_error() took a pointer
to struct aer_err_info and a pointer to a pci_dev.  Typically the pci_dev
was one of the elements of the aer_err_info.dev[] array (DPC was an
exception, where the dev[] array was unused).

Convert aer_get_device_error_info() and aer_print_error() to take an index
into the aer_err_info.dev[] array instead.  A future patch will add
per-device ratelimit information, so the index makes it convenient to find
the ratelimit associated with the device.

To accommodate DPC, set info->dev[0] to the DPC port before using these
interfaces.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-17-helgaas@kernel.org
2025-05-23 11:11:36 -05:00
Karolina Stolarek
09683a6184 PCI/AER: Rename struct aer_stats to aer_info
Update name to reflect the broader definition of structs/variables that are
stored (e.g. ratelimits). This is a preparatory patch for adding rate limit
support.

[bhelgaas: "aer_report" -> "aer_info"]

Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-16-helgaas@kernel.org
2025-05-23 11:02:33 -05:00
Karolina Stolarek
36c5932074 PCI/AER: Reduce pci_print_aer() correctable error level to KERN_WARNING
Some existing logs in pci_print_aer() log with error severity by default.

Convert them to use KERN_WARNING for correctable errors and KERN_ERR for
uncorrectable errors.

[bhelgaas: commit log]

Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-15-helgaas@kernel.org
2025-05-23 11:02:19 -05:00
Bjorn Helgaas
82013ff394 PCI/ERR: Add printk level to pcie_print_tlp_log()
aer_print_error() produces output at a printk level (KERN_ERR/KERN_WARNING/
etc) that depends on the kind of error, and it calls pcie_print_tlp_log(),
which previously always produced output at KERN_ERR.

Add a "level" parameter so aer_print_error() can control the level of the
pcie_print_tlp_log() output to match.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-14-helgaas@kernel.org
2025-05-23 11:02:03 -05:00
Karolina Stolarek
c8f6791e33 PCI/AER: Check log level once and remember it
When reporting an AER error, we check its type multiple times to determine
the log level for each message. Do this check only in the top-level
functions (aer_isr_one_error(), pci_print_aer()) and save the level in
struct aer_err_info.

[bhelgaas: save log level in struct aer_err_info instead of passing it
as a parameter]

Signed-off-by: Karolina Stolarek <karolina.stolarek@oracle.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-13-helgaas@kernel.org
2025-05-23 11:01:52 -05:00
Bjorn Helgaas
6bb4befbd6 PCI/AER: Trace error event before ratelimiting
As with the AER statistics, we always want to emit trace events, even if
the actual dmesg logging is rate limited.

Call trace_aer_event() immediately after pci_dev_aer_stats_incr() so both
happen before ratelimiting.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-12-helgaas@kernel.org
2025-05-23 11:01:42 -05:00
Bjorn Helgaas
88a7765e62 PCI/AER: Update statistics before ratelimiting
There are two AER logging entry points:

  - aer_print_error() is used by DPC (dpc_process_error()) and native AER
    handling (aer_process_err_devices()).

  - pci_print_aer() is used by GHES (aer_recover_work_func()) and CXL
    (cxl_handle_rdport_errors())

Both use __aer_print_error() to print the AER error bits.  Previously
__aer_print_error() also incremented the AER statistics via
pci_dev_aer_stats_incr().

Call pci_dev_aer_stats_incr() early in the entry points instead of in
__aer_print_error() so we update the statistics even if the actual printing
of error bits is rate limited by a future change.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-11-helgaas@kernel.org
2025-05-23 11:01:28 -05:00
Bjorn Helgaas
ad9839137c PCI/AER: Simplify pci_print_aer()
Simplify pci_print_aer() by initializing the struct aer_err_info "info"
with a designated initializer list (it was previously initialized with
memset()) and using pci_name().

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-10-helgaas@kernel.org
2025-05-23 11:01:17 -05:00
Bjorn Helgaas
57964ba390 PCI/AER: Initialize aer_err_info before using it
Previously the struct aer_err_info "e_info" was allocated on the stack
without being initialized, so it contained junk except for the fields we
explicitly set later.

Initialize "e_info" at declaration with a designated initializer list,
which initializes the other members to zero.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-9-helgaas@kernel.org
2025-05-23 11:01:06 -05:00
Bjorn Helgaas
ca2426a570 PCI/AER: Move aer_print_source() earlier in file
Move aer_print_source() earlier in the file so a future change can use it
from aer_print_error(), where it's easier to rate limit it.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-8-helgaas@kernel.org
2025-05-23 11:00:54 -05:00
Jon Pan-Doh
99c3fd0de8 PCI/AER: Rename aer_print_port_info() to aer_print_source()
Rename aer_print_port_info() to aer_print_source() to be more descriptive.
This logs the Error Source ID logged by a Root Port or Root Complex Event
Collector when it receives an ERR_COR, ERR_NONFATAL, or ERR_FATAL Message.

[bhelgaas: aer_print_rp_info() -> aer_print_source()]

Signed-off-by: Jon Pan-Doh <pandoh@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-7-helgaas@kernel.org
2025-05-23 11:00:38 -05:00
Bjorn Helgaas
f40bd28655 PCI/AER: Extract bus/dev/fn in aer_print_port_info() with PCI_BUS_NUM(), etc
Use PCI_BUS_NUM(), PCI_SLOT(), PCI_FUNC() to extract the bus number,
device, and function number directly from the Error Source ID.  There's no
need to shift and mask it explicitly.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-6-helgaas@kernel.org
2025-05-23 11:00:27 -05:00
Bjorn Helgaas
6a1eda7459 PCI/AER: Consolidate Error Source ID logging in aer_isr_one_error_type()
Previously we decoded the AER Error Source ID in aer_isr_one_error_type(),
then again in find_source_device() if we didn't find any devices with
errors logged in their AER Capabilities.

Consolidate this so we only decode and log the Error Source ID once in
aer_isr_one_error_type().  Add a "found" parameter so we can add a note
when we didn't find any downstream devices with errors logged in their AER
Capability.

This changes the dmesg logging when we found no devices with errors logged:

  - pci 0000:00:01.0: AER: Correctable error message received from 0000:02:00.0
  - pci 0000:00:01.0: AER: found no error details for 0000:02:00.0
  + pci 0000:00:01.0: AER: Correctable error message received from 0000:02:00.0 (no details found)

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-5-helgaas@kernel.org
2025-05-23 11:00:14 -05:00
Bjorn Helgaas
6fc4dae74a PCI/AER: Factor COR/UNCOR error handling out from aer_isr_one_error()
aer_isr_one_error() duplicates the Error Source ID logging and AER error
processing for Correctable Errors and Uncorrectable Errors.  Factor out the
duplicated code to aer_isr_one_error_type().

aer_isr_one_error() doesn't need the struct aer_rpc pointer, so pass it the
Root Port or RCEC pci_dev pointer instead.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-4-helgaas@kernel.org
2025-05-23 11:00:01 -05:00
Bjorn Helgaas
a0b62cc310 PCI/DPC: Log Error Source ID only when valid
DPC Error Source ID is only valid when the DPC Trigger Reason indicates
that DPC was triggered due to reception of an ERR_NONFATAL or ERR_FATAL
Message (PCIe r6.0, sec 7.9.14.5).

When DPC was triggered by ERR_NONFATAL (PCI_EXP_DPC_STATUS_TRIGGER_RSN_NFE)
or ERR_FATAL (PCI_EXP_DPC_STATUS_TRIGGER_RSN_FE) from a downstream device,
log the Error Source ID (decoded into domain/bus/device/function).  Don't
print the source otherwise, since it's not valid.

For DPC trigger due to reception of ERR_NONFATAL or ERR_FATAL, the dmesg
logging changes:

  - pci 0000:00:01.0: DPC: containment event, status:0x000d source:0x0200
  - pci 0000:00:01.0: DPC: ERR_FATAL detected
  + pci 0000:00:01.0: DPC: containment event, status:0x000d, ERR_FATAL received from 0000:02:00.0

and when DPC triggered for other reasons, where DPC Error Source ID is
undefined, e.g., unmasked uncorrectable error:

  - pci 0000:00:01.0: DPC: containment event, status:0x0009 source:0x0200
  - pci 0000:00:01.0: DPC: unmasked uncorrectable error detected
  + pci 0000:00:01.0: DPC: containment event, status:0x0009: unmasked uncorrectable error detected

Previously the "containment event" message was at KERN_INFO and the
"%s detected" message was at KERN_WARNING.  Now the single message is at
KERN_WARNING.

Fixes: 26e5157133 ("PCI: Add Downstream Port Containment driver")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://patch.msgid.link/20250522232339.1525671-3-helgaas@kernel.org
2025-05-23 10:59:40 -05:00
Bjorn Helgaas
a424b598e6 PCI/DPC: Initialize aer_err_info before using it
Previously the struct aer_err_info "info" was allocated on the stack
without being initialized, so it contained junk except for the fields we
explicitly set later.

Initialize "info" at declaration so it starts as all zeros.

Fixes: 8aefa9b0d9 ("PCI/DPC: Print AER status in DPC event handling")
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250522232339.1525671-2-helgaas@kernel.org
2025-05-23 10:58:52 -05:00
Ilpo Järvinen
6ade6e81f8 PCI: Update Link Speed after retraining
PCIe Link Retraining can alter Link Speed. pcie_retrain_link() that
performs the Link Training is called from bwctrl and ASPM driver.

While bwctrl listens for Link Bandwidth Management Status (LBMS) to
pick up changes in Link Speed, there is a race between
pcie_reset_lbms() clearing LBMS after the Link Training and
pcie_bwnotif_irq() reading the Link Status register. If LBMS is already
cleared when the irq handler reads the register, the interrupt handler
will return early with IRQ_NONE and won't update the Link Speed.

When Link Speed update originates from bwctrl,
pcie_bwctrl_change_speed() ensures Link Speed is updated after the
retraining. ASPM driver, however, calls pcie_retrain_link() but does
not update the Link Speed after retraining which can result in stale
Link Speed. Also, it is possible to have ASPM support with
CONFIG_PCIEPORTBUS=n in which case bwctrl will not be built in (and
thus won't update the Link Speed at all).

To ensure Link Speed is not left stale after Link Training, move the
call to pcie_update_link_speed() from pcie_bwctrl_change_speed() into
pcie_retrain_link().

Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Link: https://lore.kernel.org/linux-pci/aBCjpfyYmlkJ12AZ@wunner.de
Link: https://lore.kernel.org/r/20250514132821.15705-1-ilpo.jarvinen@linux.intel.com
2025-05-15 16:28:44 +00:00
Manivannan Sadhasivam
132833405e PCI: Add debugfs support for exposing PTM context
Precision Time Management (PTM) mechanism defined in PCIe spec r6.0,
sec 6.21 allows precise coordination of timing information across multiple
components in a PCIe hierarchy with independent local time clocks.

PCI core already supports enabling PTM in the root port and endpoint
devices through PTM Extended Capability registers. But the PTM context
supported by the PTM capable components such as Root Complex (RC) and
Endpoint (EP) controllers were not exposed as of now. Part of the reason is
that the spec doesn't define how the context information is exposed to the
software and left it to the vendor implementation. So there is no
standardized way to get access to the context information and each vendor
have defined their own way.

This commit adds debugfs support to expose the PTM context to userspace
from both PCIe RC and EP controllers. Since the context information is
exposed in a vendor specific way, the debugfs interface allows the
controller drivers to implement callbacks for each attribute, to be called
by the generic PTM driver.

The Controller drivers are expected to call pcie_ptm_create_debugfs() to
create the debugfs attributes for the PTM context and call
pcie_ptm_destroy_debugfs() to destroy them. The drivers should also
populate the relevant callbacks in the 'struct pcie_ptm_ops' structure
based on the controller implementation.

Below PTM context are exposed through debugfs:

PCIe RC
=======

1. PTM Local clock
2. PTM T2 timestamp
3. PTM T3 timestamp
4. PTM Context valid

PCIe EP
=======

1. PTM Local clock
2. PTM T1 timestamp
3. PTM T4 timestamp
4. PTM Master clock
5. PTM Context update

Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
[kwilczynski: fix overflow issue reported by Dan Carpenter from
https://lore.kernel.org/linux-pci/b41c1754-c6b7-4805-9f14-7c643d6c5304@suswa.mountain]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Link: https://patch.msgid.link/20250505-pcie-ptm-v4-1-02d26d51400b@linaro.org
2025-05-15 09:16:20 +00:00
Ilpo Järvinen
2389d8dc38 PCI/bwctrl: Replace lbms_count with PCI_LINK_LBMS_SEEN flag
PCIe BW controller counted LBMS assertions for the purposes of the Target
Speed quirk (pcie_failed_link_retrain()). It was also a plan to expose the
LBMS count through sysfs to allow better diagnosing link related issues.
Lukas Wunner suggested, however, that adding a trace event would be better
for diagnostics purposes, leaving only pcie_failed_link_retrain() as a user
of the lbms_count.

The logic in pcie_failed_link_retrain() does not require keeping count of
LBMS assertions, so replace lbms_count with a simple flag in pci_dev's
priv_flags.  The reduced complexity allows removing pcie_bwctrl_lbms_rwsem.

Since pcie_failed_link_retrain() runs before bwctrl is probed during boot,
the LBMS in Link Status register still has to be checked by the quirk.

The priv_flags numbering is not continuous because hotplug code added a few
flags to fill numbers 4-5 (hotplug and bwctrl changes are routed through in
different branches).

Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
[kwilczynski: squashed a fix to resolve build failures from
https://lore.kernel.org/all/20250508090036.1528-1-ilpo.jarvinen@linux.intel.com]
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Link: https://patch.msgid.link/20250422115548.1483-1-ilpo.jarvinen@linux.intel.com
2025-05-15 08:38:40 +00:00