Commit Graph

1185183 Commits

Author SHA1 Message Date
Bjorn Helgaas
7e229f0e05 Merge branch 'pci/pm'
- Reduce wait time for secondary bus to be ready to speed up resume (Mika
  Westerberg)

- Avoid putting EloPOS E2/S2/H2 (as well as Elo i2) PCIe Ports in D3cold
  (Ondrej Zary)

- Call _REG when transitioning D-states so AML that uses the PCI config
  space OpRegion works, which fixes some ASMedia GPIO controllers (Mario
  Limonciello)

* pci/pm:
  PCI/ACPI: Call _REG when transitioning D-states
  PCI/ACPI: Validate acpi_pci_set_power_state() parameter
  PCI/PM: Avoid putting EloPOS E2/S2/H2 PCIe Ports in D3cold
  PCI/PM: Shorten pci_bridge_wait_for_secondary_bus() wait time for slow links
2023-06-26 12:59:56 -05:00
Bjorn Helgaas
db5ccb2eda Merge branch 'pci/hotplug'
- Simplify Attention Button logging (Bjorn Helgaas)

- Cancel bringup sequence if card is not present, to keep from blinking
  Power Indicator indefinitely (Rongguang Wei)

- Reassign bridge resources if necessary for ACPI hotplug (Igor Mammedov)

* pci/hotplug:
  PCI: acpiphp: Reassign resources on bridge if necessary
  PCI: pciehp: Cancel bringup sequence if card is not present
  PCI: pciehp: Simplify Attention Button logging
2023-06-26 12:59:56 -05:00
Bjorn Helgaas
1abb473903 Merge branch 'pci/enumeration'
- Add PCI_EXT_CAP_ID_PL_32GT define (Ben Dooks)

- Propagate firmware node by calling device_set_node() for better
  modularity (Andy Shevchenko)

- Discover Data Link Layer Link Active Reporting earlier so quirks can take
  advantage of it (Maciej W. Rozycki)

- Use cached Data Link Layer Link Active Reporting capability in pciehp,
  powerpc/eeh, and mlx5 (Maciej W. Rozycki)

- Run quirk for devices that require OS to clear Retrain Link earlier, so
  later quirks can rely on it (Maciej W. Rozycki)

- Export pcie_retrain_link() for use outside ASPM (Maciej W. Rozycki)

- Add Data Link Layer Link Active Reporting as another way for
  pcie_retrain_link() to determine the link is up (Maciej W. Rozycki)

- Work around link training failures (especially on the ASMedia ASM2824
  switch) by training first at 2.5GT/s and then attempting higher rates
  (Maciej W. Rozycki)

* pci/enumeration:
  PCI: Add failed link recovery for device reset events
  PCI: Work around PCIe link training failures
  PCI: Use pcie_wait_for_link_status() in pcie_wait_for_link_delay()
  PCI: Add support for polling DLLLA to pcie_retrain_link()
  PCI: Export pcie_retrain_link() for use outside ASPM
  PCI: Export PCIe link retrain timeout
  PCI: Execute quirk_enable_clear_retrain_link() earlier
  PCI/ASPM: Factor out waiting for link training to complete
  PCI/ASPM: Avoid unnecessary pcie_link_state use
  PCI/ASPM: Use distinct local vars in pcie_retrain_link()
  net/mlx5: Rely on dev->link_active_reporting
  powerpc/eeh: Rely on dev->link_active_reporting
  PCI: pciehp: Rely on dev->link_active_reporting
  PCI: Initialize dev->link_active_reporting earlier
  PCI: of: Propagate firmware node by calling device_set_node()
  PCI: Add PCI_EXT_CAP_ID_PL_32GT define

# Conflicts:
#	drivers/pci/pcie/aspm.c
2023-06-26 12:59:56 -05:00
Bjorn Helgaas
0f32114ea0 Merge branch 'pci/aspm'
- Disable ASPM on MFD function removal to avoid use-after-free (Ding Hui)

- Tighten up pci_enable_link_state() and pci_disable_link_state()
  interfaces so they don't enable/disable states the driver didn't specify
  (Ajay Agarwal)

- Avoid link retraining race that can happen if ASPM sets link control
  parameters while the link is in the midst of training for some other
  reason (Ilpo Järvinen)

* pci/aspm:
  PCI/ASPM: Avoid link retraining race
  PCI/ASPM: Factor out pcie_wait_for_retrain()
  PCI/ASPM: Return 0 or -ETIMEDOUT from  pcie_retrain_link()
  PCI/ASPM: Remove unnecessary ASPM_STATE_L1SS check
  PCI/ASPM: Rename L1.2-specific functions from 'l1ss' to 'l12'
  PCI/ASPM: Set ASPM_STATE_L1 when driver enables L1.1 or L1.2
  PCI/ASPM: Set only ASPM_STATE_L1 when driver enables L1
  PCI/ASPM: Disable only ASPM_STATE_L1 when driver disables L1
  PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free
2023-06-26 12:59:55 -05:00
Bjorn Helgaas
a274a4e65f Merge branch 'pci/aer'
- Unexport pci_save_aer_state() since it's only used in drivers/pci/ (Bjorn
  Helgaas)

- Drop recommendation for drivers to configure AER Capability, since the
  PCI core does this for all devices (Dave Jiang, Bjorn Helgaas)

* pci/aer:
  Documentation: PCI: Tidy AER documentation
  Documentation: PCI: Update cross references to .rst files
  Documentation: PCI: Drop recommendation to configure AER Capability
  PCI: Unexport pci_save_aer_state()
2023-06-26 12:59:55 -05:00
Mario Limonciello
112a7f9c8e PCI/ACPI: Call _REG when transitioning D-states
ACPI r6.5, sec 6.5.4, describes how AML is unable to access an
OperationRegion unless _REG has been called to connect a handler:

  The OS runs _REG control methods to inform AML code of a change in the
  availability of an operation region. When an operation region handler is
  unavailable, AML cannot access data fields in that region.  (Operation
  region writes will be ignored and reads will return indeterminate data.)

The PCI core does not call _REG at any time, leading to the undefined
behavior mentioned in the spec.

The spec explains that _REG should be executed to indicate whether a
given region can be accessed:

  Once _REG has been executed for a particular operation region, indicating
  that the operation region handler is ready, a control method can access
  fields in the operation region. Conversely, control methods must not
  access fields in operation regions when _REG method execution has not
  indicated that the operation region handler is ready.

An example included in the spec demonstrates calling _REG when devices are
turned off: "when the host controller or bridge controller is turned off
or disabled, PCI Config Space Operation Regions for child devices are
no longer available. As such, ETH0’s _REG method will be run when it
is turned off and will again be run when PCI1 is turned off."

It is reported that ASMedia PCIe GPIO controllers fail functional tests
after the system has returning from suspend (S3 or s2idle). This is because
the BIOS checks whether the OSPM has called the _REG method to determine
whether it can interact with the OperationRegion assigned to the device as
part of the other AML called for the device.

To fix this issue, call acpi_evaluate_reg() when devices are transitioning
to D3cold or D0.

[bhelgaas: split pci_power_t checking to preliminary patch]
Link: https://uefi.org/specs/ACPI/6.5/06_Device_Configuration.html#reg-region
Link: https://lore.kernel.org/r/20230620140451.21007-1-mario.limonciello@amd.com
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
2023-06-23 12:28:08 -05:00
Bjorn Helgaas
5557b62634 PCI/ACPI: Validate acpi_pci_set_power_state() parameter
Previously acpi_pci_set_power_state() assumed the requested power state was
valid (PCI_D0 ... PCI_D3cold).  If a caller supplied something else, we
could index outside the state_conv[] array and pass junk to
acpi_device_set_power().

Validate the pci_power_t parameter and return -EINVAL if it's invalid.

Link: https://lore.kernel.org/r/20230621222857.GA122930@bhelgaas
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
2023-06-23 12:28:01 -05:00
Ondrej Zary
9e30fd26f4 PCI/PM: Avoid putting EloPOS E2/S2/H2 PCIe Ports in D3cold
The quirk for Elo i2 introduced in commit 92597f97a4 ("PCI/PM: Avoid
putting Elo i2 PCIe Ports in D3cold") is also needed by EloPOS E2/S2/H2
which uses the same Continental Z2 board.

Change the quirk to match the board instead of system.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215715
Link: https://lore.kernel.org/r/20230614074253.22318-1-linux@zary.sk
Signed-off-by: Ondrej Zary <linux@zary.sk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
2023-06-23 12:27:58 -05:00
Ilpo Järvinen
e7e3975636 PCI/ASPM: Avoid link retraining race
PCIe r6.0.1, sec 7.5.3.7, recommends setting the link control parameters,
then waiting for the Link Training bit to be clear before setting the
Retrain Link bit.

This avoids a race where the LTSSM may not use the updated parameters if it
is already in the midst of link training because of other normal link
activity.

Wait for the Link Training bit to be clear before toggling the Retrain Link
bit to ensure that the LTSSM uses the updated link control parameters.

[bhelgaas: commit log, return 0 (success)/-ETIMEDOUT instead of bool for
both pcie_wait_for_retrain() and the existing pcie_retrain_link()]
Suggested-by: Lukas Wunner <lukas@wunner.de>
Fixes: 7d715a6c1a ("PCI: add PCI Express ASPM support")
Link: https://lore.kernel.org/r/20230502083923.34562-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org
2023-06-20 14:58:52 -05:00
Ilpo Järvinen
9c7f136433 PCI/ASPM: Factor out pcie_wait_for_retrain()
Factor pcie_wait_for_retrain() out from pcie_retrain_link().  No functional
change intended.

[bhelgaas: split out from
https://lore.kernel.org/r/20230502083923.34562-1-ilpo.jarvinen@linux.intel.com]
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20 14:58:52 -05:00
Bjorn Helgaas
f5297a01ee PCI/ASPM: Return 0 or -ETIMEDOUT from pcie_retrain_link()
"pcie_retrain_link" is not a question with a true/false answer, so "bool"
isn't quite the right return type.  Return 0 for success or -ETIMEDOUT if
the retrain failed.  No functional change intended.

[bhelgaas: based on Ilpo's patch below]
Link: https://lore.kernel.org/r/20230502083923.34562-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20 14:58:31 -05:00
Maciej W. Rozycki
08e3ed12ca PCI: Add failed link recovery for device reset events
Request failed link recovery with any upstream PCIe bridge where a device
has not come back after reset within PCI_RESET_WAIT time.  Reset the
polling interval if recovery succeeded, otherwise continue as usual.

[bhelgaas: inline pcie_parent_link_retrain()]
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2306111631050.64925@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20 10:58:53 -05:00
Maciej W. Rozycki
a89c82249c PCI: Work around PCIe link training failures
Attempt to handle cases such as with a downstream port of the ASMedia
ASM2824 PCIe switch where link training never completes and the link
continues switching between speeds indefinitely with the data link layer
never reaching the active state.

It has been observed with a downstream port of the ASMedia ASM2824 Gen 3
switch wired to the upstream port of the Pericom PI7C9X2G304 Gen 2 switch,
using a Delock Riser Card PCI Express x1 > 2 x PCIe x1 device, P/N 41433,
wired to a SiFive HiFive Unmatched board.  In this setup the switches
should negotiate a link speed of 5.0GT/s, falling back to 2.5GT/s if
necessary.

Instead the link continues oscillating between the two speeds, at the rate
of 34-35 times per second, with link training reported repeatedly active
~84% of the time.  Limiting the target link speed to 2.5GT/s with the
upstream ASM2824 device makes the two switches communicate correctly.
Removing the speed restriction afterwards makes the two devices switch to
5.0GT/s then.

Make use of these observations and detect the inability to train the link
by checking for the Data Link Layer Link Active status bit being off while
the Link Bandwidth Management Status indicating that hardware has changed
the link speed or width in an attempt to correct unreliable link operation.

Restrict the speed to 2.5GT/s then with the Target Link Speed field,
request a retrain and wait 200ms for the data link to go up.  If this is
successful, lift the restriction, letting the devices negotiate a higher
speed.

Also check for a 2.5GT/s speed restriction the firmware may have already
arranged and lift it too with ports of devices known to continue working
afterwards (currently only ASM2824), that already report their data link
being up.

[bhelgaas: reorder and squash stubs from
https://lore.kernel.org/r/alpine.DEB.2.21.2306111619570.64925@angie.orcam.me.uk
to avoid adding stubs that do nothing]
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2203022037020.56670@angie.orcam.me.uk/
Link: https://source.denx.de/u-boot/u-boot/-/commit/a398a51ccc68
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310038540.59226@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20 10:58:53 -05:00
Maciej W. Rozycki
7604bc294c PCI: Use pcie_wait_for_link_status() in pcie_wait_for_link_delay()
Remove a DLLLA status bit polling loop from pcie_wait_for_link_delay() and
call almost identical code in pcie_wait_for_link_status() instead.  This
reduces the lower bound on the polling interval from 10ms to 1ms, possibly
increasing the CPU load on the system in favour to reducing the wait time.

Link: https://lore.kernel.org/r/alpine.DEB.2.21.2306111611170.64925@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20 10:58:53 -05:00
Maciej W. Rozycki
680e9c47a2 PCI: Add support for polling DLLLA to pcie_retrain_link()
Let the caller of pcie_retrain_link() specify whether they want to use the
LT bit or the DLLLA bit of the Link Status Register to determine if link
training has completed.  It is up to the caller to verify whether the use
of the DLLLA bit, the implementation of which is optional, is valid for the
device requested.

Link: https://lore.kernel.org/r/alpine.DEB.2.21.2306110310540.64925@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20 10:58:53 -05:00
Maciej W. Rozycki
37edd87eb6 PCI: Export pcie_retrain_link() for use outside ASPM
Export pcie_retrain_link() for link retrain needs outside ASPM.  Struct
pcie_link_state is local to ASPM and only used by pcie_retrain_link() to
get at the associated PCI device, so change the operand and adjust the lone
call site accordingly.  Document the interface.  No functional change at
this point.

Link: https://lore.kernel.org/r/alpine.DEB.2.21.2306110229010.64925@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20 10:58:53 -05:00
Maciej W. Rozycki
33a176abcc PCI: Export PCIe link retrain timeout
Convert LINK_RETRAIN_TIMEOUT from jiffies to milliseconds, accordingly
rename to PCIE_LINK_RETRAIN_TIMEOUT_MS, and make available via "pci.h" for
the PCI core to use.  Use in pcie_wait_for_link_delay().

Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310030280.59226@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20 10:58:53 -05:00
Maciej W. Rozycki
07a8d698de PCI: Execute quirk_enable_clear_retrain_link() earlier
Make quirk_enable_clear_retrain_link() an early quirk so that any later
fixups can rely on dev->clear_retrain_link to have been already
initialised.

[bhelgaas: reorder to just before it becomes possible to call
pcie_retrain_link() earlier]
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310049000.59226@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20 10:58:53 -05:00
Maciej W. Rozycki
3c0ec896a4 PCI/ASPM: Factor out waiting for link training to complete
Move code polling for the Link Training bit to clear into a function of its
own.

[bhelgaas: reorder to clean up before exposing to PCI core]
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2306111605060.64925@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20 10:58:53 -05:00
Maciej W. Rozycki
fd6e6e38eb PCI/ASPM: Avoid unnecessary pcie_link_state use
[bhelgaas: extract from expose patch, reorder to clean up before exposing]
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2306110229010.64925@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-20 10:58:46 -05:00
Maciej W. Rozycki
b168979977 PCI/ASPM: Use distinct local vars in pcie_retrain_link()
Use separate local variables to hold the respective values retrieved from
the Link Control Register and the Link Status Register.  Improves
readability and it makes it possible for the compiler to detect actual
uninitialised use should this code change in the future.

[bhelgaas: reorder to clean up before exposing to PCI core]
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2306110252260.64925@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-14 17:58:12 -05:00
Maciej W. Rozycki
3bff63ee03 net/mlx5: Rely on dev->link_active_reporting
Use dev->link_active_reporting to determine whether Data Link Layer Link
Active Reporting is available rather than re-retrieving the capability.

Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310125370.59226@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-14 17:58:12 -05:00
Maciej W. Rozycki
1541a21305 powerpc/eeh: Rely on dev->link_active_reporting
Use dev->link_active_reporting to determine whether Data Link Layer Link
Active Reporting is available rather than re-retrieving the capability.

Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310124100.59226@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-14 17:58:12 -05:00
Maciej W. Rozycki
1f087398db PCI: pciehp: Rely on dev->link_active_reporting
Use dev->link_active_reporting to determine whether Data Link Layer Link
Active Reporting is available rather than re-retrieving the capability.

Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310028150.59226@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
2023-06-14 17:58:12 -05:00
Maciej W. Rozycki
42adbdc74c PCI: Initialize dev->link_active_reporting earlier
Determine whether Data Link Layer Link Active Reporting is available before
calling any fixups so that the cached value can be used there and later on.

[bhelgaas: move to set_pcie_port_type() where other PCIe init is done]
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310122210.59226@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-14 17:58:12 -05:00
Bjorn Helgaas
11502feab4 Documentation: PCI: Tidy AER documentation
Consistently use:

  PCIe          previously PCIe, PCI Express, or pci express
  Root Port     previously Root Port or root port
  Endpoint      previously EndPoint or endpoint
  AER           previously AER or aer
  please        previously pls

Also update a few awkward wordings.

Link: https://lore.kernel.org/r/20230609222500.1267795-5-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Stefan Roese <sr@denx.de>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2023-06-12 12:17:36 -05:00
Bjorn Helgaas
f142badf46 Documentation: PCI: Update cross references to .rst files
Change references to *.txt to *.rst to match the current filenames.

Link: https://lore.kernel.org/r/20230609222500.1267795-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Stefan Roese <sr@denx.de>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2023-06-12 12:17:33 -05:00
Bjorn Helgaas
a6378a7a1c Documentation: PCI: Drop recommendation to configure AER Capability
Since f26e58bf6f ("PCI/AER: Enable error reporting when AER is native"),
the PCI core enables PCIe device error reporting for all devices during
enumeration, so drivers don't need to do it.

Remove the recommendation for drivers to configure AER and call
pci_enable_pcie_error_reporting() themselves.

Also remove the suggestion that drivers may change AER mask and severity
registers.  Ownership of these registers is negotiated between the OS and
platform firmware.  If firmware owns these registers, the OS must not
change them.

Link: https://lore.kernel.org/r/20230609222500.1267795-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Stefan Roese <sr@denx.de>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2023-06-12 12:17:26 -05:00
Bjorn Helgaas
ba3da66783 PCI: Unexport pci_save_aer_state()
pci_save_aer_state() and pci_restore_aer_state() are only used in
drivers/pci, so don't expose them to the rest of the kernel.  No functional
change intended.

Link: https://lore.kernel.org/r/20230609222500.1267795-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Stefan Roese <sr@denx.de>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2023-06-12 12:17:22 -05:00
Mika Westerberg
7b3ba09feb PCI/PM: Shorten pci_bridge_wait_for_secondary_bus() wait time for slow links
With slow links (<= 5GT/s) active link reporting is not mandatory, so if a
device is disconnected during system sleep we might end up waiting for it
to respond for ~60s, which slows down resume time.

PCIe r6.0, sec 6.6.1, mandates that software must wait for at least 1s
before it can assume a device is broken, so use that minimum requirement
for slow links and bail out if the device doesn't respond within 1s.
However, if the port supports active link reporting we can wait longer as
we do with the fast links.

This should make system resume time faster for slow links as well while
still following the PCIe spec.

While there move the PCI_RESET_WAIT constant into pci.c because it is
not used outside of that file anymore.

Link: https://lore.kernel.org/r/20230425064751.24951-1-mika.westerberg@linux.intel.com
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2023-06-06 17:18:49 -05:00
Andy Shevchenko
c6f54cf44c PCI: of: Propagate firmware node by calling device_set_node()
Insulate pci_set_of_node() and pci_set_bus_of_node() from possible
changes to fwnode_handle implementation by using device_set_node()
instead of open-coding dev->dev.fwnode assignments.

Link: https://lore.kernel.org/r/20230421100939.68225-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-06-06 17:15:45 -05:00
Ben Dooks
0b3dee602a PCI: Add PCI_EXT_CAP_ID_PL_32GT define
Add the define for PCI_EXT_CAP_ID_PL_32GT for drivers that will want this
whilst doing Gen5/Gen6 accesses.

Link: https://lore.kernel.org/r/20230531095713.293229-1-ben.dooks@codethink.co.uk
Signed-off-by: Ben Dooks <ben.dooks@sifive.com>
Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-05-31 16:34:38 -05:00
Igor Mammedov
40613da52b PCI: acpiphp: Reassign resources on bridge if necessary
When using ACPI PCI hotplug, hotplugging a device with large BARs may fail
if bridge windows programmed by firmware are not large enough.

Reproducer:
  $ qemu-kvm -monitor stdio -M q35  -m 4G \
      -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=on \
      -device id=rp1,pcie-root-port,bus=pcie.0,chassis=4 \
      disk_image

 wait till linux guest boots, then hotplug device:
   (qemu) device_add qxl,bus=rp1

 hotplug on guest side fails with:
   pci 0000:01:00.0: [1b36:0100] type 00 class 0x038000
   pci 0000:01:00.0: reg 0x10: [mem 0x00000000-0x03ffffff]
   pci 0000:01:00.0: reg 0x14: [mem 0x00000000-0x03ffffff]
   pci 0000:01:00.0: reg 0x18: [mem 0x00000000-0x00001fff]
   pci 0000:01:00.0: reg 0x1c: [io  0x0000-0x001f]
   pci 0000:01:00.0: BAR 0: no space for [mem size 0x04000000]
   pci 0000:01:00.0: BAR 0: failed to assign [mem size 0x04000000]
   pci 0000:01:00.0: BAR 1: no space for [mem size 0x04000000]
   pci 0000:01:00.0: BAR 1: failed to assign [mem size 0x04000000]
   pci 0000:01:00.0: BAR 2: assigned [mem 0xfe800000-0xfe801fff]
   pci 0000:01:00.0: BAR 3: assigned [io  0x1000-0x101f]
   qxl 0000:01:00.0: enabling device (0000 -> 0003)
   Unable to create vram_mapping
   qxl: probe of 0000:01:00.0 failed with error -12

However when using native PCIe hotplug
  '-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off'
it works fine, since kernel attempts to reassign unused resources.

Use the same machinery as native PCIe hotplug to (re)assign resources.

Link: https://lore.kernel.org/r/20230424191557.2464760-1-imammedo@redhat.com
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Cc: stable@vger.kernel.org
2023-05-24 11:53:41 -05:00
Rongguang Wei
e8afd0d9fc PCI: pciehp: Cancel bringup sequence if card is not present
If a PCIe hotplug slot has an Attention Button, the normal hot-add flow is:

  - Slot is empty and slot power is off
  - User inserts card in slot and presses Attention Button
  - OS blinks Power Indicator for 5 seconds
  - After 5 seconds, OS turns on Power Indicator, turns on slot power, and
    enumerates the device

Previously, if a user pressed the Attention Button on an *empty* slot,
pciehp logged the following messages and blinked the Power Indicator
until a second button press:

  [0.000] pciehp: Button press: will power on in 5 sec
  [0.001] # Power Indicator starts blinking
  [5.001] # 5 second timeout; slot is empty, so we should cancel the
            request to power on and turn off Power Indicator

  [7.000] # Power Indicator still blinking
  [8.000] # possible card insertion
  [9.000] pciehp: Button press: canceling request to power on

The first button press incorrectly left the slot in BLINKINGON_STATE, so
the second was interpreted as a "cancel power on" event regardless of
whether a card was present.

If the slot is empty, turn off the Power Indicator and return from
BLINKINGON_STATE to OFF_STATE after 5 seconds, effectively canceling the
request to power on.  Putting the slot in OFF_STATE also means the second
button press will correctly request a slot power on if the slot is
occupied.

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20230512021518.336460-1-clementwei90@163.com
Fixes: d331710ea7 ("PCI: pciehp: Become resilient to missed events")
Suggested-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Rongguang Wei <weirongguang@kylinos.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
2023-05-24 11:52:38 -05:00
Bjorn Helgaas
5054133a88 PCI: pciehp: Simplify Attention Button logging
Previously, pressing the Attention Button always logged two lines, the
first from pciehp_ist() and the second from pciehp_handle_button_press():

  Attention button pressed
  Powering on due to button press

Since pciehp_handle_button_press() always logs the more detailed message,
remove the generic "Attention button pressed" message.  Reword the
pciehp_handle_button_press() to be of the form:

  Button press: will power on in 5 sec
  Button press: will power off in 5 sec
  Button press: canceling request to power on
  Button press: canceling request to power off
  Button press: ignoring invalid state %#x

Link: https://lore.kernel.org/r/20230522214051.619337-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
2023-05-24 11:49:50 -05:00
Ajay Agarwal
911afb9f95 PCI/ASPM: Remove unnecessary ASPM_STATE_L1SS check
Previously aspm_l1ss_init() checked if ASPM_STATE_L1SS is supported before
calling aspm_calc_l12_info(), only for that function to return if
ASPM_STATE_L1_2_MASK is not supported. Simplify the logic by directly
checking for ASPM_STATE_L1_2_MASK.

Link: https://lore.kernel.org/r/20230504111301.229358-6-ajayagarwal@google.com
Signed-off-by: Ajay Agarwal <ajayagarwal@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-05-19 10:29:40 -05:00
Ajay Agarwal
05a55d9ca1 PCI/ASPM: Rename L1.2-specific functions from 'l1ss' to 'l12'
The functions aspm_calc_l1ss_info() and calc_l1ss_pwron() perform
calculations and register programming specific to L1.2 state.  Rename them
to aspm_calc_l12_info() and calc_l12_pwron() respectively.

Link: https://lore.kernel.org/r/20230504111301.229358-5-ajayagarwal@google.com
Signed-off-by: Ajay Agarwal <ajayagarwal@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-05-19 10:29:40 -05:00
Ajay Agarwal
80950a5460 PCI/ASPM: Set ASPM_STATE_L1 when driver enables L1.1 or L1.2
Previously pci_enable_link_state(PCIE_LINK_STATE_L1_1) enabled only
ASPM_STATE_L1_1 and did not enable ASPM_STATE_L1.  The L1.1 state only
works when L1 is enabled, so enable ASPM_STATE_L1 in addition, and do the
same for L1.2.

The only current caller is vmd_pm_enable_quirk(), which enables *all* ASPM
states, so this should have no functional effect.

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20230504111301.229358-4-ajayagarwal@google.com
Signed-off-by: Ajay Agarwal <ajayagarwal@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-05-19 10:26:49 -05:00
Ajay Agarwal
25edb25d79 PCI/ASPM: Set only ASPM_STATE_L1 when driver enables L1
Previously pci_enable_link_state(PCIE_LINK_STATE_L1) enabled L1SS as well
as L1.  Enable only ASPM_STATE_L1 when the caller enables L1.

The only current caller is vmd_pm_enable_quirk(), which enables *all* ASPM
states, so this should have no functional effect.

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20230504111301.229358-3-ajayagarwal@google.com
Signed-off-by: Ajay Agarwal <ajayagarwal@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2023-05-18 16:51:50 -05:00
Ajay Agarwal
fb097dcd5a PCI/ASPM: Disable only ASPM_STATE_L1 when driver disables L1
Previously pci_disable_link_state(PCIE_LINK_STATE_L1) disabled L1SS as well
as L1.  This is unnecessary since pcie_config_aspm_link() takes care that
L1SS is not enabled if L1 is disabled.

Disable only ASPM_STATE_L1 when the caller disables L1.  No functional
changes intended.

This is consistent with aspm_attr_store_common(), which disables only L1,
not L1SS, when L1 is disabled via the sysfs "l1_aspm" file.

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20230504111301.229358-2-ajayagarwal@google.com
Signed-off-by: Ajay Agarwal <ajayagarwal@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2023-05-18 16:51:13 -05:00
Ding Hui
456d8aa37d PCI/ASPM: Disable ASPM on MFD function removal to avoid use-after-free
Struct pcie_link_state->downstream is a pointer to the pci_dev of function
0.  Previously we retained that pointer when removing function 0, and
subsequent ASPM policy changes dereferenced it, resulting in a
use-after-free warning from KASAN, e.g.:

  # echo 1 > /sys/bus/pci/devices/0000:03:00.0/remove
  # echo powersave > /sys/module/pcie_aspm/parameters/policy

  BUG: KASAN: slab-use-after-free in pcie_config_aspm_link+0x42d/0x500
  Call Trace:
   kasan_report+0xae/0xe0
   pcie_config_aspm_link+0x42d/0x500
   pcie_aspm_set_policy+0x8e/0x1a0
   param_attr_store+0x162/0x2c0
   module_attr_store+0x3e/0x80

PCIe spec r6.0, sec 7.5.3.7, recommends that software program the same ASPM
Control value in all functions of multi-function devices.

Disable ASPM and free the pcie_link_state when any child function is
removed so we can discard the dangling pcie_link_state->downstream pointer
and maintain the same ASPM Control configuration for all functions.

[bhelgaas: commit log and comment]
Debugged-by: Zongquan Qin <qinzongquan@sangfor.com.cn>
Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Fixes: b5a0a9b59c ("PCI/ASPM: Read and set up L1 substate capabilities")
Link: https://lore.kernel.org/r/20230507034057.20970-1-dinghui@sangfor.com.cn
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2023-05-18 16:12:13 -05:00
Linus Torvalds
ac9a78681b Linux 6.4-rc1 v6.4-rc1 2023-05-07 13:34:35 -07:00
Linus Torvalds
f085df1be6 Merge tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tool updates from Arnaldo Carvalho de Melo:
 "Third version of perf tool updates, with the build problems with with
  using a 'vmlinux.h' generated from the main build fixed, and the bpf
  skeleton build disabled by default.

  Build:

   - Require libtraceevent to build, one can disable it using
     NO_LIBTRACEEVENT=1.

     It is required for tools like 'perf sched', 'perf kvm', 'perf
     trace', etc.

     libtraceevent is available in most distros so installing
     'libtraceevent-devel' should be a one-time event to continue
     building perf as usual.

     Using NO_LIBTRACEEVENT=1 produces tooling that is functional and
     sufficient for lots of users not interested in those libtraceevent
     dependent features.

   - Allow Python support in 'perf script' when libtraceevent isn't
     linked, as not all features requires it, for instance Intel PT does
     not use tracepoints.

   - Error if the python interpreter needed for jevents to work isn't
     available and NO_JEVENTS=1 isn't set, preventing a build without
     support for JSON vendor events, which is a rare but possible
     condition. The two check error messages:

        $(error ERROR: No python interpreter needed for jevents generation. Install python or build with NO_JEVENTS=1.)
        $(error ERROR: Python interpreter needed for jevents generation too old (older than 3.6). Install a newer python or build with NO_JEVENTS=1.)

   - Make libbpf 1.0 the minimum required when building with out of
     tree, distro provided libbpf.

   - Use libsdtc++'s and LLVM's libcxx's __cxa_demangle, a portable C++
     demangler, add 'perf test' entry for it.

   - Make binutils libraries opt in, as distros disable building with it
     due to licensing, they were used for C++ demangling, for instance.

   - Switch libpfm4 to opt-out rather than opt-in, if libpfm-devel (or
     equivalent) isn't installed, we'll just have a build warning:

       Makefile.config:1144: libpfm4 not found, disables libpfm4 support. Please install libpfm4-dev

   - Add a feature test for scandirat(), that is not implemented so far
     in musl and uclibc, disabling features that need it, such as
     scanning for tracepoints in /sys/kernel/tracing/events.

  perf BPF filters:

   - New feature where BPF can be used to filter samples, for instance:

      $ sudo ./perf record -e cycles --filter 'period > 1000' true
      $ sudo ./perf script
           perf-exec 2273949 546850.708501:       5029 cycles:  ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms])
           perf-exec 2273949 546850.708508:      32409 cycles:  ffffffff826f9e25 finish_wait+0x5 ([kernel.kallsyms])
           perf-exec 2273949 546850.708526:     143369 cycles:  ffffffff82b4cdbf xas_start+0x5f ([kernel.kallsyms])
           perf-exec 2273949 546850.708600:     372650 cycles:  ffffffff8286b8f7 __pagevec_lru_add+0x117 ([kernel.kallsyms])
           perf-exec 2273949 546850.708791:     482953 cycles:  ffffffff829190de __mod_memcg_lruvec_state+0x4e ([kernel.kallsyms])
                true 2273949 546850.709036:     501985 cycles:  ffffffff828add7c tlb_gather_mmu+0x4c ([kernel.kallsyms])
                true 2273949 546850.709292:     503065 cycles:      7f2446d97c03 _dl_map_object_deps+0x973 (/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)

   - In addition to 'period' (PERF_SAMPLE_PERIOD), the other
     PERF_SAMPLE_ can be used for filtering, and also some other sample
     accessible values, from tools/perf/Documentation/perf-record.txt:

        Essentially the BPF filter expression is:

        <term> <operator> <value> (("," | "||") <term> <operator> <value>)*

     The <term> can be one of:
        ip, id, tid, pid, cpu, time, addr, period, txn, weight, phys_addr,
        code_pgsz, data_pgsz, weight1, weight2, weight3, ins_lat, retire_lat,
        p_stage_cyc, mem_op, mem_lvl, mem_snoop, mem_remote, mem_lock,
        mem_dtlb, mem_blk, mem_hops

     The <operator> can be one of:
        ==, !=, >, >=, <, <=, &

     The <value> can be one of:
        <number> (for any term)
        na, load, store, pfetch, exec (for mem_op)
        l1, l2, l3, l4, cxl, io, any_cache, lfb, ram, pmem (for mem_lvl)
        na, none, hit, miss, hitm, fwd, peer (for mem_snoop)
        remote (for mem_remote)
        na, locked (for mem_locked)
        na, l1_hit, l1_miss, l2_hit, l2_miss, any_hit, any_miss, walk, fault (for mem_dtlb)
        na, by_data, by_addr (for mem_blk)
        hops0, hops1, hops2, hops3 (for mem_hops)

  perf lock contention:

   - Show lock type with address.

   - Track and show mmap_lock, siglock and per-cpu rq_lock with address.
     This is done for mmap_lock by following the current->mm pointer:

      $ sudo ./perf lock con -abl -- sleep 10
       contended   total wait     max wait     avg wait            address   symbol
       ...
           16344    312.30 ms      2.22 ms     19.11 us   ffff8cc702595640
           17686    310.08 ms      1.49 ms     17.53 us   ffff8cc7025952c0
               3     84.14 ms     45.79 ms     28.05 ms   ffff8cc78114c478   mmap_lock
            3557     76.80 ms     68.75 us     21.59 us   ffff8cc77ca3af58
               1     68.27 ms     68.27 ms     68.27 ms   ffff8cda745dfd70
               9     54.53 ms      7.96 ms      6.06 ms   ffff8cc7642a48b8   mmap_lock
           14629     44.01 ms     60.00 us      3.01 us   ffff8cc7625f9ca0
            3481     42.63 ms    140.71 us     12.24 us   ffffffff937906ac   vmap_area_lock
           16194     38.73 ms     42.15 us      2.39 us   ffff8cd397cbc560
              11     38.44 ms     10.39 ms      3.49 ms   ffff8ccd6d12fbb8   mmap_lock
               1      5.43 ms      5.43 ms      5.43 ms   ffff8cd70018f0d8
            1674      5.38 ms    422.93 us      3.21 us   ffffffff92e06080   tasklist_lock
             581      4.51 ms    130.68 us      7.75 us   ffff8cc9b1259058
               5      3.52 ms      1.27 ms    703.23 us   ffff8cc754510070
             112      3.47 ms     56.47 us     31.02 us   ffff8ccee38b3120
             381      3.31 ms     73.44 us      8.69 us   ffffffff93790690   purge_vmap_area_lock
             255      3.19 ms     36.35 us     12.49 us   ffff8d053ce30c80

   - Update default map size to 16384.

   - Allocate single letter option -M for --map-nr-entries, as it is
     proving being frequently used.

   - Fix struct rq lock access for older kernels with BPF's CO-RE
     (Compile once, run everywhere).

   - Fix problems found with MSAn.

  perf report/top:

   - Add inline information when using --call-graph=fp or lbr, as was
     already done to the --call-graph=dwarf callchain mode.

   - Improve the 'srcfile' sort key performance by really using an
     optimization introduced in 6.2 for the 'srcline' sort key that
     avoids calling addr2line for comparision with each sample.

  perf sched:

   - Make 'perf sched latency/map/replay' to use "sched:sched_waking"
     instead of "sched:sched_waking", consistent with 'perf record'
     since d566a9c2d4 ("perf sched: Prefer sched_waking event when it
     exists").

  perf ftrace:

   - Make system wide the default target for latency subcommand, run the
     following command then generate some network traffic and press
     control+C:

       # perf ftrace latency -T __kfree_skb
     ^C
         DURATION     |      COUNT | GRAPH                                          |
          0 - 1    us |         27 | #############                                  |
          1 - 2    us |         22 | ###########                                    |
          2 - 4    us |          8 | ####                                           |
          4 - 8    us |          5 | ##                                             |
          8 - 16   us |         24 | ############                                   |
         16 - 32   us |          2 | #                                              |
         32 - 64   us |          1 |                                                |
         64 - 128  us |          0 |                                                |
        128 - 256  us |          0 |                                                |
        256 - 512  us |          0 |                                                |
        512 - 1024 us |          0 |                                                |
          1 - 2    ms |          0 |                                                |
          2 - 4    ms |          0 |                                                |
          4 - 8    ms |          0 |                                                |
          8 - 16   ms |          0 |                                                |
         16 - 32   ms |          0 |                                                |
         32 - 64   ms |          0 |                                                |
         64 - 128  ms |          0 |                                                |
        128 - 256  ms |          0 |                                                |
        256 - 512  ms |          0 |                                                |
        512 - 1024 ms |          0 |                                                |
          1 - ...   s |          0 |                                                |
       #

  perf top:

   - Add --branch-history (LBR: Last Branch Record) option, just like
     already available for 'perf record'.

   - Fix segfault in thread__comm_len() where thread->comm was being
     used outside thread->comm_lock.

  perf annotate:

   - Allow configuring objdump and addr2line in ~/.perfconfig., so that
     you can use alternative binaries, such as llvm's.

  perf kvm:

   - Add TUI mode for 'perf kvm stat report'.

  Reference counting:

   - Add reference count checking infrastructure to check for use after
     free, done to the 'cpumap', 'namespaces', 'maps' and 'map' structs,
     more to come.

     To build with it use -DREFCNT_CHECKING=1 in the make command line
     to build tools/perf. Documented at:

       https://perf.wiki.kernel.org/index.php/Reference_Count_Checking

   - The above caught, for instance, fix, present in this series:

        - Fix maps use after put in 'perf test "Share thread maps"':

          'maps' is copied from leader, but the leader is put on line 79
          and then 'maps' is used to read the reference count below - so
          a use after put, with the put of maps happening within
          thread__put.

     Fixed by reversing the order of puts so that the leader is put
     last.

   - Also several fixes were made to places where reference counts were
     not being held.

   - Make this one of the tests in 'make -C tools/perf build-test' to
     regularly build test it and to make sure no direct access to the
     reference counted structs are made, doing that via accessors to
     check the validity of the struct pointer.

  ARM64:

   - Fix 'perf report' segfault when filtering coresight traces by
     sparse lists of CPUs.

   - Add support for 'simd' as a sort field for 'perf report', to show
     ARM's NEON SIMD's predicate flags: "partial" and "empty".

  arm64 vendor events:

   - Add N1 metrics.

  Intel vendor events:

   - Add graniterapids, grandridge and sierraforrest events.

   - Refresh events for: alderlake, aldernaken, broadwell, broadwellde,
     broadwellx, cascadelakx, haswell, haswellx, icelake, icelakex,
     jaketown, meteorlake, knightslanding, sandybridge, sapphirerapids,
     silvermont, skylake, tigerlake and westmereep-dp

   - Refresh metrics for alderlake-n, broadwell, broadwellde,
     broadwellx, haswell, haswellx, icelakex, ivybridge, ivytown and
     skylakex.

  perf stat:

   - Implement --topdown using JSON metrics.

   - Add TopdownL1 JSON metric as a default if present, but disable it
     for now for some Intel hybrid architectures, a series of patches
     addressing this is being reviewed and will be submitted for v6.5.

   - Use metrics for --smi-cost.

   - Update topdown documentation.

  Vendor events (JSON) infrastructure:

   - Add support for computing and printing metric threshold values. For
     instance, here is one found in thesapphirerapids json file:

       {
           "BriefDescription": "Percentage of cycles spent in System Management Interrupts.",
           "MetricExpr": "((msr@aperf@ - cycles) / msr@aperf@ if msr@smi@ > 0 else 0)",
           "MetricGroup": "smi",
           "MetricName": "smi_cycles",
           "MetricThreshold": "smi_cycles > 0.1",
           "ScaleUnit": "100%"
       },

   - Test parsing metric thresholds with the fake PMU in 'perf test
     pmu-events'.

   - Support for printing metric thresholds in 'perf list'.

   - Add --metric-no-threshold option to 'perf stat'.

   - Add rand (reverse and) and has_pmem (optane memory) support to
     metrics.

   - Sort list of input files to avoid depending on the order from
     readdir() helping in obtaining reproducible builds.

  S/390:

   - Add common metrics: - CPI (cycles per instruction), prbstate (ratio
     of instructions executed in problem state compared to total number
     of instructions), l1mp (Level one instruction and data cache misses
     per 100 instructions).

   - Add cache metrics for z13, z14, z15 and z16.

   - Add metric for TLB and cache.

  ARM:

   - Add raw decoding for SPE (Statistical Profiling Extension) v1.3 MTE
     (Memory Tagging Extension) and MOPS (Memory Operations) load/store.

  Intel PT hardware tracing:

   - Add event type names UINTR (User interrupt delivered) and UIRET
     (Exiting from user interrupt routine), documented in table 32-50
     "CFE Packet Type and Vector Fields Details" in the Intel Processor
     Trace chapter of The Intel SDM Volume 3 version 078.

   - Add support for new branch instructions ERETS and ERETU.

   - Fix CYC timestamps after standalone CBR

  ARM CoreSight hardware tracing:

   - Allow user to override timestamp and contextid settings.

   - Fix segfault in dso lookup.

   - Fix timeless decode mode detection.

   - Add separate decode paths for timeless and per-thread modes.

  auxtrace:

   - Fix address filter entire kernel size.

  Miscellaneous:

   - Fix use-after-free and unaligned bugs in the PLT handling routines.

   - Use zfree() to reduce chances of use after free.

   - Add missing 0x prefix for addresses printed in hexadecimal in 'perf
     probe'.

   - Suppress massive unsupported target platform errors in the unwind
     code.

   - Fix return incorrect build_id size in elf_read_build_id().

   - Fix 'perf scripts intel-pt-events.py' IPC output for Python 2 .

   - Add missing new parameter in kfree_skb tracepoint to the python
     scripts using it.

   - Add 'perf bench syscall fork' benchmark.

   - Add support for printing PERF_MEM_LVLNUM_UNC (Uncached access) in
     'perf mem'.

   - Fix wrong size expectation for perf test 'Setup struct
     perf_event_attr' caused by the patch adding
     perf_event_attr::config3.

   - Fix some spelling mistakes"

* tag 'perf-tools-for-v6.4-3-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (365 commits)
  Revert "perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL"
  Revert "perf build: Warn for BPF skeletons if endian mismatches"
  perf metrics: Fix SEGV with --for-each-cgroup
  perf bpf skels: Stop using vmlinux.h generated from BTF, use subset of used structs + CO-RE
  perf stat: Separate bperf from bpf_profiler
  perf test record+probe_libc_inet_pton: Fix call chain match on x86_64
  perf test record+probe_libc_inet_pton: Fix call chain match on s390
  perf tracepoint: Fix memory leak in is_valid_tracepoint()
  perf cs-etm: Add fix for coresight trace for any range of CPUs
  perf build: Fix unescaped # in perf build-test
  perf unwind: Suppress massive unsupported target platform errors
  perf script: Add new parameter in kfree_skb tracepoint to the python scripts using it
  perf script: Print raw ip instead of binary offset for callchain
  perf symbols: Fix return incorrect build_id size in elf_read_build_id()
  perf list: Modify the warning message about scandirat(3)
  perf list: Fix memory leaks in print_tracepoint_events()
  perf lock contention: Rework offset calculation with BPF CO-RE
  perf lock contention: Fix struct rq lock access
  perf stat: Disable TopdownL1 on hybrid
  perf stat: Avoid SEGV on counter->name
  ...
2023-05-07 11:32:18 -07:00
Linus Torvalds
17784de648 Merge tag 'core-debugobjects-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull debugobjects fix from Thomas Gleixner:
 "A single fix for debugobjects:

  The recent fix to ensure atomicity of lookup and allocation
  inadvertently broke the pool refill mechanism, so that debugobject
  OOMs now in certain situations. The reason is that the functions which
  got updated no longer invoke debug_objecs_init(), which is now the
  only place to care about refilling the tracking object pool.

  Restore the original behaviour by adding explicit refill opportunities
  to those places"

* tag 'core-debugobjects-2023-05-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  debugobject: Ensure pool refill (again)
2023-05-07 11:04:26 -07:00
Linus Torvalds
6f69c98181 Merge tag 'v6.4-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:

 - A long-standing bug in crypto_engine

 - A buggy but harmless check in the sun8i-ss driver

 - A regression in the CRYPTO_USER interface

* tag 'v6.4-p2' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: api - Fix CRYPTO_USER checks for report function
  crypto: engine - fix crypto_queue backlog handling
  crypto: sun8i-ss - Fix a test in sun8i_ss_setup_ivs()
2023-05-07 10:57:14 -07:00
Linus Torvalds
63342b1dd5 Merge tag '6.4-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull cifs fixes from Steve French:
 "smb3 client fixes, mostly DFS or reconnect related:

   - Two DFS connection sharing fixes

   - DFS refresh fix

   - Reconnect fix

   - Two potential use after free fixes

   - Also print prefix patch in mount debug msg

   - Two small cleanup fixes"

* tag '6.4-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: Remove unneeded semicolon
  cifs: fix sharing of DFS connections
  cifs: avoid potential races when handling multiple dfs tcons
  cifs: protect access of TCP_Server_Info::{origin,leaf}_fullpath
  cifs: fix potential race when tree connecting ipc
  cifs: fix potential use-after-free bugs in TCP_Server_Info::hostname
  cifs: print smb3_fs_context::source when mounting
  cifs: protect session status check in smb2_reconnect()
  SMB3.1.1: correct definition for app_instance_id create contexts
2023-05-07 10:46:21 -07:00
Linus Torvalds
d6b8a8c49a Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
 "A couple more patches that would be good to get into -rc1:

   - Revert an i.MX patch that's causing video failures because division
     math goes sideways

   - Fix a clang + W=1 build isue where FIELD_PREP() is taking a 32-bit
     variable instead of the usual u64 type

   - Fix a Kconfig bug in the StarFive JH7110 clk config that selects a
     reset controller when it can't be selected"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: starfive: Fix RESET_STARFIVE_JH7110 can't be selected in a specified case
  clk: sp7021: Adjust width of _m in HWM_FIELD_PREP()
  Revert "clk: imx: composite-8m: Add support to determine_rate"
2023-05-07 10:31:45 -07:00
Linus Torvalds
1c1094e47e Merge tag 'mailbox-v6.4' of git://git.linaro.org/landing-teams/working/fujitsu/integration
Pull mailbox updates from Jassi Brar:

 - mailbox api: allow direct registration to a channel and convert omap
   and pcc to use mbox_bind_client

 - omap and hi6220 : use of_property_read_bool

 - test: fix double-free and use spinlock header

 - rockchip and bcm-pdc: drop of_match_ptr

 - mpfs: change config symbol

 - mediatek gce: support MT6795

 - qcom apcs: consolidate of_device_id and support IPQ9574

* tag 'mailbox-v6.4' of git://git.linaro.org/landing-teams/working/fujitsu/integration:
  dt-bindings: mailbox: qcom: add compatible for IPQ9574 SoC
  mailbox: qcom-apcs-ipc: do not grow the of_device_id
  dt-bindings: mailbox: qcom,apcs-kpss-global: use fallbacks for few variants
  dt-bindings: mailbox: mediatek,gce-mailbox: Add support for MT6795
  mailbox: mpfs: convert SOC_MICROCHIP_POLARFIRE to ARCH_MICROCHIP_POLARFIRE
  mailbox: bcm-pdc: drop of_match_ptr for ID table
  mailbox: rockchip: drop of_match_ptr for ID table
  mailbox: mailbox-test: Fix potential double-free in mbox_test_message_write()
  mailbox: mailbox-test: Explicitly include header for spinlock support
  mailbox: Use of_property_read_bool() for boolean properties
  mailbox: pcc: Use mbox_bind_client
  mailbox: omap: Use mbox_bind_client
  mailbox: Allow direct registration to a channel
2023-05-07 10:17:33 -07:00
Linus Torvalds
03e5cb7b50 Merge tag 'for-6.4/io_uring-2023-05-07' of git://git.kernel.dk/linux
Pull more io_uring updates from Jens Axboe:
 "Nothing major in here, just two different parts:

   - A small series from Breno that enables passing the full SQE down
     for ->uring_cmd().

     This is a prerequisite for enabling full network socket operations.
     Queued up a bit late because of some stylistic concerns that got
     resolved, would be nice to have this in 6.4-rc1 so the dependent
     work will be easier to handle for 6.5.

   - Fix for the huge page coalescing, which was a regression introduced
     in the 6.3 kernel release (Tobias)"

* tag 'for-6.4/io_uring-2023-05-07' of git://git.kernel.dk/linux:
  io_uring: Remove unnecessary BUILD_BUG_ON
  io_uring: Pass whole sqe to commands
  io_uring: Create a helper to return the SQE size
  io_uring/rsrc: check for nonconsecutive pages
2023-05-07 10:00:09 -07:00
Arnaldo Carvalho de Melo
9a2d5178b9 Revert "perf build: Make BUILD_BPF_SKEL default, rename to NO_BPF_SKEL"
This reverts commit a980755beb.

We need to better polish building with BPF skels, so revert back to
making it an experimental feature that has to be explicitely enabled
using BUILD_BPF_SKEL=1.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-06 18:07:37 -03:00