Commit Graph

1136013 Commits

Author SHA1 Message Date
Bjorn Helgaas
0084cd6072 Merge branch 'pci/sysfs'
- Fix a double free in the error path of creating sysfs "resource%d"
  attributes (Sascha Hauer)

* pci/sysfs:
  PCI/sysfs: Fix double free in error path
2022-12-10 10:36:35 -06:00
Bjorn Helgaas
8961fc4f8c Merge branch 'pci/resource'
- Remove EfiMemoryMappedIO regions from the E820 map to allow PCI core to
  allocate BARs from them.  The only purpose of EfiMemoryMappedIO is to
  tell the OS to map things needed by EFI runtime services, so it's often
  used for PCI host bridge apertures.  If we can't allocate from those
  apertures, we can't hot-add devices (Bjorn Helgaas)

* pci/resource:
  x86/PCI: Use pr_info() when possible
  x86/PCI: Fix log message typo
  x86/PCI: Tidy E820 removal messages
  PCI: Skip allocate_resource() if too little space available
  efi/x86: Remove EfiMemoryMappedIO from E820 map
2022-12-10 10:36:34 -06:00
Bjorn Helgaas
9303050181 Merge branch 'pci/portdrv'
- Squash portdrv_core.c and portdrv_pci.c into portdrv.c to make it easier
  to find things (Bjorn Helgaas)

- Allow AER service only for Root Ports & RCECs so portdrv can successfully
  bind to other devices that have AER but lack MSI (which they don't need
  for AER), which allows power management for those devices (Bjorn Helgaas)

* pci/portdrv:
  PCI/portdrv: Allow AER service only for Root Ports & RCECs
  PCI/portdrv: Unexport pcie_port_service_register(), pcie_port_service_unregister()
  PCI/portdrv: Move private things to portdrv.c
  PCI/portdrv: Squash into portdrv.c
2022-12-10 10:36:34 -06:00
Bjorn Helgaas
ec7c9a681d Merge branch 'pci/pm-agp'
- Convert AGP efficeon, intel, amd-k7, ati, nvidia to generic power
  management (Bjorn Helgaas)

* pci/pm-agp:
  agp/via: Update to DEFINE_SIMPLE_DEV_PM_OPS()
  agp/sis: Update to DEFINE_SIMPLE_DEV_PM_OPS()
  agp/amd64: Update to DEFINE_SIMPLE_DEV_PM_OPS()
  agp/nvidia: Convert to generic power management
  agp/ati: Convert to generic power management
  agp/amd-k7: Convert to generic power management
  agp/intel: Convert to generic power management
  agp/efficeon: Convert to generic power management
2022-12-10 10:36:33 -06:00
Bjorn Helgaas
e1f2d15397 Merge branch 'pci/pm'
- Remove unused 'state' parameter to pci_legacy_suspend_late() (Bjorn
  Helgaas)

* pci/pm:
  PCI/PM: Remove unused 'state' parameter to pci_legacy_suspend_late()
2022-12-10 10:36:33 -06:00
Bjorn Helgaas
eae10935ef Merge branch 'pci/misc'
- Use METHOD_NAME__UID instead of plain string to make it easier to find
  all uses (Yipeng Zou)

* pci/misc:
  PCI/ACPI: Use METHOD_NAME__UID instead of plain string
2022-12-10 10:36:32 -06:00
Bjorn Helgaas
84c3482963 Merge branch 'pci/hotplug'
- Enable pciehp by default if USB4 is enabled because USB4/Thunderbolt
  tunneling depends on native PCIe hotplug (Albert Zhou)

- Make sure pciehp binds only to Downstream Ports, not Upstream Ports
  (Rafael J. Wysocki)

- Remove unused get_mode1_ECC_cap callback in shpchp (Ian Cowan)

- Enable pciehp Command Completed Interrupt only if supported to reduce
  confusion when looking at lspci output (Pali Rohár)

* pci/hotplug:
  PCI: pciehp: Enable Command Completed Interrupt only if supported
  PCI: shpchp: Remove unused get_mode1_ECC_cap callback
  PCI: acpiphp: Avoid setting is_hotplug_bridge for PCIe Upstream Ports
  PCI/portdrv: Set PCIE_PORT_SERVICE_HP for Root and Downstream Ports only
  PCI: pciehp: Enable by default if USB4 enabled
2022-12-10 10:36:32 -06:00
Bjorn Helgaas
51ef4873c6 Merge branch 'pci/enumeration'
- Only read/write PCIe Link 2 registers for devices with Links and PCIe
  Capability version >= 2 (Maciej W. Rozycki)

- Revert a patch that cleared PCI_STATUS during enumeration because it
  broke Linux guests on Apple's virtualization framework (Bjorn Helgaas)

- Assign PCI domain IDs using IDAs so IDs can be easily reused after
  loading/unloading host bridge drivers (Pali Rohár)

- Fix pci_device_is_present(), which previously always returned "false" for
  VFs because their vendor ID is always 0xfff (Michael S. Tsirkin)

- Check for alloc failure in pci_request_irq() (Zeng Heng)

* pci/enumeration:
  PCI: Check for alloc failure in pci_request_irq()
  PCI: Fix pci_device_is_present() for VFs by checking PF
  PCI: Assign PCI domain IDs by ida_alloc()
  Revert "PCI: Clear PCI_STATUS when setting up device"
  PCI: Access Link 2 registers only for devices with Links
2022-12-10 10:36:32 -06:00
Bjorn Helgaas
cad4f43f36 Merge branch 'pci/doe'
- Fix calculation of DOE length to account for the "0 means 2^18 DWORDs"
  special case (Li Ming)

* pci/doe:
  PCI/DOE: Fix maximum data object length miscalculation
2022-12-10 10:36:31 -06:00
Bjorn Helgaas
d91482bb21 x86/PCI: Use pr_info() when possible
Use pr_info() and similar when possible.  No functional change intended.

Link: https://lore.kernel.org/r/20221209205131.GA1726524@bhelgaas
Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
2022-12-10 10:33:18 -06:00
Bjorn Helgaas
2bfa89fab5 x86/PCI: Fix log message typo
Add missing word in the log message:

  - ... so future kernels can this automatically
  + ... so future kernels can do this automatically

Suggested-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Link: https://lore.kernel.org/r/20221208190341.1560157-5-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
2022-12-10 10:33:18 -06:00
Bjorn Helgaas
00904bf64c x86/PCI: Tidy E820 removal messages
These messages:

  clipped [mem size 0x00000000 64bit] to [mem size 0xfffffffffffa0000 64bit] for e820 entry [mem 0x0009f000-0x000fffff]

aren't as useful as they could be because (a) the resource is often
IORESOURCE_UNSET, so we print the size instead of the start/end and (b) we
print the available resource even if it is empty after removing the E820
entry.

Print the available space by hand to avoid the IORESOURCE_UNSET problem and
only if it's non-empty.  No functional change intended.

Link: https://lore.kernel.org/r/20221208190341.1560157-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
2022-12-10 10:33:11 -06:00
Bjorn Helgaas
5c5fb3c3a7 PCI: Skip allocate_resource() if too little space available
pci_bus_alloc_from_region() allocates MMIO space by iterating through all
the resources available on the bus.  The available resource might be
reduced if the caller requires 32-bit space or we're avoiding BIOS or E820
areas.

Don't bother calling allocate_resource() if we need more space than is
available in this resource.  This prevents some pointless and annoying
messages about avoided areas.

Link: https://lore.kernel.org/r/20221208190341.1560157-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
2022-12-10 10:31:47 -06:00
Bjorn Helgaas
07eab0901e efi/x86: Remove EfiMemoryMappedIO from E820 map
Firmware can use EfiMemoryMappedIO to request that MMIO regions be mapped
by the OS so they can be accessed by EFI runtime services, but should have
no other significance to the OS (UEFI r2.10, sec 7.2).  However, most
bootloaders and EFI stubs convert EfiMemoryMappedIO regions to
E820_TYPE_RESERVED entries, which prevent Linux from allocating space from
them (see remove_e820_regions()).

Some platforms use EfiMemoryMappedIO entries for PCI MMCONFIG space and PCI
host bridge windows, which means Linux can't allocate BAR space for
hot-added devices.

Remove large EfiMemoryMappedIO regions from the E820 map to avoid this
problem.

Leave small (< 256KB) EfiMemoryMappedIO regions alone because on some
platforms, these describe non-window space that's included in host bridge
_CRS.  If we assign that space to PCI devices, they don't work.  On the
Lenovo X1 Carbon, this leads to suspend/resume failures.

The previous solution to the problem of allocating BARs in these regions
was to add pci_crs_quirks[] entries to disable E820 checking for these
machines (see d341838d77 ("x86/PCI: Disable E820 reserved region clipping
via quirks")):

  Acer   DMI_PRODUCT_NAME    Spin SP513-54N
  Clevo  DMI_BOARD_NAME      X170KM-G
  Lenovo DMI_PRODUCT_VERSION *IIL*

Florent reported the BAR allocation issue on the Clevo NL4XLU.  We could
add another quirk for the NL4XLU, but I hope this generic change can solve
it for many machines without having to add quirks.

This change has been tested on Clevo X170KM-G (Konrad) and Lenovo Ideapad
Slim 3 (Matt) and solves the problem even when overriding the existing
quirks by booting with "pci=use_e820".

Link: https://bugzilla.kernel.org/show_bug.cgi?id=216565     Clevo NL4XLU
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206459#c78 Clevo X170KM-G
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1868899    Ideapad Slim 3
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2029207    X1 Carbon
Link: https://lore.kernel.org/r/20221208190341.1560157-2-helgaas@kernel.org
Reported-by: Florent DELAHAYE <kernelorg@undead.fr>
Tested-by: Konrad J Hambrick <kjhambrick@gmail.com>
Tested-by: Matt Hansen <2lprbe78@duck.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
2022-12-10 10:31:42 -06:00
Bjorn Helgaas
d8d2b65a94 PCI/portdrv: Allow AER service only for Root Ports & RCECs
Previously portdrv allowed the AER service for any device with an AER
capability (assuming Linux had control of AER) even though the AER service
driver only attaches to Root Port and RCECs.

Because get_port_device_capability() included AER for non-RP, non-RCEC
devices, we tried to initialize the AER IRQ even though these devices
don't generate AER interrupts.

Intel DG1 and DG2 discrete graphics cards contain a switch leading to a
GPU.  The switch supports AER but not MSI, so initializing an AER IRQ
failed, and portdrv failed to claim the switch port at all.  The GPU itself
could be suspended, but the switch could not be put in a low-power state
because it had no driver.

Don't allow the AER service on non-Root Port, non-Root Complex Event
Collector devices.  This means we won't enable Bus Mastering if the device
doesn't require MSI, the AER service will not appear in sysfs, and the AER
service driver will not bind to the device.

Link: https://lore.kernel.org/r/20221207084105.84947-1-mika.westerberg@linux.intel.com
Link: https://lore.kernel.org/r/20221210002922.1749403-1-helgaas@kernel.org
Based-on-patch-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2022-12-10 10:26:40 -06:00
Pali Rohár
6d4671b534 PCI: pciehp: Enable Command Completed Interrupt only if supported
The No Command Completed Support bit in the Slot Capabilities register
indicates whether Command Completed Interrupt Enable is unsupported.

We already check whether No Command Completed Support bit is set in
pcie_wait_cmd(), and do not wait in this case.

Don't enable this Command Completed Interrupt at all if NCCS is set, so
that when users dump configuration space from userspace, the dump does not
confuse them by saying that Command Completed Interrupt is not supported,
but it is enabled.

Link: https://lore.kernel.org/r/20220927141926.8895-2-kabel@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Marek Behún <kabel@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
2022-12-07 08:27:20 -06:00
Ian Cowan
9676f40618 PCI: shpchp: Remove unused get_mode1_ECC_cap callback
The ->get_mode1_ECC_cap callback in the shpchp_hpc_ops struct is never
called, so remove it.

[bhelgaas: squash]
Link: https://lore.kernel.org/r/20221112142859.319733-2-ian@linux.cowan.aero
Link: https://lore.kernel.org/r/20221112142859.319733-3-ian@linux.cowan.aero
Link: https://lore.kernel.org/r/20221112142859.319733-4-ian@linux.cowan.aero
Signed-off-by: Ian Cowan <ian@linux.cowan.aero>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-11-22 14:09:51 -06:00
Rafael J. Wysocki
c63a3be76d PCI: acpiphp: Avoid setting is_hotplug_bridge for PCIe Upstream Ports
It is reported that on some systems pciehp binds to an Upstream Port and
attempts to operate it which causes devices below the Port to disappear
from the bus.

This happens because acpiphp sets dev->is_hotplug_bridge for that Port
(after receiving a Device Check notification on it from the platform
firmware via ACPI) during the enumeration of PCI devices.

get_port_device_capability() sees that dev->is_hotplug_bridge is set and
adds PCIE_PORT_SERVICE_HP to Port services, which allows pciehp to bind to
the Port in question.

Even though this particular problem can be addressed by making the
portdrv_core checks more robust, it also causes power management to work
differently on the affected systems which generally is not desirable (PCIe
Ports with dev->is_hotplug_bridge set have to pass additional tests to be
allowed to go into the D3hot/cold power states which affects runtime PM of
devices below these Ports).

For this reason, amend check_hotplug_bridge() with a PCIe type check to
prevent it from setting dev->is_hotplug_bridge for Upstream Ports.

Suggested-by: Lukas Wunner <lukas@wunner.de>
Link: https://lore.kernel.org/r/2262230.ElGaqSPkdT@kreacher
Reported-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Tested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
2022-11-22 13:27:28 -06:00
Rafael J. Wysocki
05f5747414 PCI/portdrv: Set PCIE_PORT_SERVICE_HP for Root and Downstream Ports only
It is reported that on some systems pciehp binds to an Upstream Port and
attempts to operate it which causes devices below the Port to disappear
from the bus.

This happens because acpiphp sets dev->is_hotplug_bridge for that Port
(after receiving a Device Check notification on it from the platform
firmware via ACPI) during the enumeration of PCI devices.

get_port_device_capability() sees that dev->is_hotplug_bridge is set and
adds PCIE_PORT_SERVICE_HP to Port services (which allows pciehp to bind to
the Port in question) without consulting the PCIe type, which should be
either Root Port or Downstream Port for the hotplug capability to be
present.

Per PCIe r6.0, sec 7.5.3.2, the Slot Implemented bit is only valid for
Downstream Ports (including Root Ports), and PCIe hotplug depends on the
Slot Capabilities / Control / Status registers.

Make get_port_device_capability() more robust by adding a PCIe type check
to it before adding PCIE_PORT_SERVICE_HP to Port services which helps to
avoid the problem.

[bhelgaas: add spec citation]
Suggested-by: Lukas Wunner <lukas@wunner.de>
Link: https://lore.kernel.org/r/4786090.31r3eYUQgx@kreacher
Reported-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
2022-11-22 13:27:11 -06:00
Zeng Heng
2d9cd957d4 PCI: Check for alloc failure in pci_request_irq()
When kvasprintf() fails to allocate memory, it returns a NULL pointer.
Return error from pci_request_irq() so we don't dereference it.

[bhelgaas: commit log]
Fixes: 704e8953d3 ("PCI/irq: Add pci_request_irq() and pci_free_irq() helpers")
Link: https://lore.kernel.org/r/20221121020029.3759444-1-zengheng4@huawei.com
Signed-off-by: Zeng Heng <zengheng4@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2022-11-21 16:55:18 -06:00
Li Ming
a4ff8e7a71 PCI/DOE: Fix maximum data object length miscalculation
Per PCIe r6.0, sec 6.30.1, a data object Length of 0x0 indicates 2^18
DWORDs (256K DW or 1MB) being transferred.  Adjust the value of data object
length for this case on both sending side and receiving side.

Don't bother checking whether Length is greater than SZ_1M because all
values of the 18-bit Length field are valid, and it is impossible to
represent anything larger than SZ_1M:

  0x00000    256K DW (1M bytes)
  0x00001       1 DW (4 bytes)
  ...
  0x3ffff  256K-1 DW (1M - 4 bytes)

[bhelgaas: commit log]
Link: https://lore.kernel.org/r/20221116015637.3299664-1-ming4.li@intel.com
Fixes: 9d24322e88 ("PCI/DOE: Add DOE mailbox support functions")
Signed-off-by: Li Ming <ming4.li@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Lukas Wunner <lukas@wunner.de>
Cc: stable@vger.kernel.org	# v6.0+
2022-11-16 14:26:46 -06:00
Albert Zhou
e67ad9354a PCI: pciehp: Enable by default if USB4 enabled
Thunderbolt/USB4 PCIe tunneling depends on native PCIe hotplug.  Enable
pciehp by default if USB4 is enabled.

[bhelgaas: squash, update subject, commit logs, tidy whitespace]
Link: https://lore.kernel.org/r/20221115113857.35800-2-albert.zhou.50@gmail.com
Link: https://lore.kernel.org/r/20221115113857.35800-3-albert.zhou.50@gmail.com
Signed-off-by: Albert Zhou <albert.zhou.50@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2022-11-15 09:25:21 -06:00
Michael S. Tsirkin
98b04dd0b4 PCI: Fix pci_device_is_present() for VFs by checking PF
pci_device_is_present() previously didn't work for VFs because it reads the
Vendor and Device ID, which are 0xffff for VFs, which looks like they
aren't present.  Check the PF instead.

Wei Gong reported that if virtio I/O is in progress when the driver is
unbound or "0" is written to /sys/.../sriov_numvfs, the virtio I/O
operation hangs, which may result in output like this:

  task:bash state:D stack:    0 pid: 1773 ppid:  1241 flags:0x00004002
  Call Trace:
   schedule+0x4f/0xc0
   blk_mq_freeze_queue_wait+0x69/0xa0
   blk_mq_freeze_queue+0x1b/0x20
   blk_cleanup_queue+0x3d/0xd0
   virtblk_remove+0x3c/0xb0 [virtio_blk]
   virtio_dev_remove+0x4b/0x80
   ...
   device_unregister+0x1b/0x60
   unregister_virtio_device+0x18/0x30
   virtio_pci_remove+0x41/0x80
   pci_device_remove+0x3e/0xb0

This happened because pci_device_is_present(VF) returned "false" in
virtio_pci_remove(), so it called virtio_break_device().  The broken vq
meant that vring_interrupt() skipped the vq.callback() that would have
completed the virtio I/O operation via virtblk_done().

[bhelgaas: commit log, simplify to always use pci_physfn(), add stable tag]
Link: https://lore.kernel.org/r/20221026060912.173250-1-mst@redhat.com
Reported-by: Wei Gong <gongwei833x@gmail.com>
Tested-by: Wei Gong <gongwei833x@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
2022-11-11 17:37:56 -06:00
Sascha Hauer
aa382ffa70 PCI/sysfs: Fix double free in error path
When pci_create_attr() fails, pci_remove_resource_files() is called which
will iterate over the res_attr[_wc] arrays and frees every non NULL entry.
To avoid a double free here set the array entry only after it's clear we
successfully initialized it.

Fixes: b562ec8f74 ("PCI: Don't leak memory if sysfs_create_bin_file() fails")
Link: https://lore.kernel.org/r/20221007070735.GX986@pengutronix.de/
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
2022-11-09 16:57:29 -06:00
Pali Rohár
c14f7ccc9f PCI: Assign PCI domain IDs by ida_alloc()
Replace assignment of PCI domain IDs from atomic_inc_return() to
ida_alloc().

Use two IDAs, one for static domain allocations (those which are defined in
device tree) and second for dynamic allocations (all other).

During removal of root bus / host bridge, also release the domain ID.  The
released ID can be reused again, for example when dynamically loading and
unloading native PCI host bridge drivers.

This change also allows to mix static device tree assignment and dynamic by
kernel as all static allocations are reserved in dynamic pool.

[bhelgaas: set "err" if "bus->domain_nr < 0"]
Link: https://lore.kernel.org/r/20220714184130.5436-1-pali@kernel.org
Signed-off-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-11-08 11:14:18 -06:00
Bjorn Helgaas
44e985938e Revert "PCI: Clear PCI_STATUS when setting up device"
This reverts commit 6cd514e58f.

Christophe Fergeau reported that 6cd514e58f ("PCI: Clear PCI_STATUS when
setting up device") causes boot failures when trying to start linux guests
with Apple's virtualization framework (for example using
https://developer.apple.com/documentation/virtualization/running_linux_in_a_virtual_machine?language=objc)

6cd514e58f only solved a cosmetic problem, so revert it to fix the boot
failures.

Link: https://bugzilla.redhat.com/show_bug.cgi?id=2137803
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-11-07 15:31:08 -06:00
Yipeng Zou
9f0b4cc174 PCI/ACPI: Use METHOD_NAME__UID instead of plain string
Replace the string "_UID" with the METHOD_NAME__UID macro so instances are
easier to find.

Link: https://lore.kernel.org/r/20221104032430.186424-1-zouyipeng@huawei.com
Signed-off-by: Yipeng Zou <zouyipeng@huawei.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-11-07 15:22:52 -06:00
Maciej W. Rozycki
503fa23614 PCI: Access Link 2 registers only for devices with Links
PCIe r2.0, sec 7.8 added Link Capabilities/Status/Control 2 registers to
the PCIe Capability with Capability Version 2.

Previously we assumed these registers were implemented for all PCIe
Capabilities of version 2 or greater, but in fact they are only
implemented for devices with Links.

Update pcie_capability_reg_implemented() to check whether the device has
a Link.

[bhelgaas: commit log, squash export]
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2209100057070.2275@angie.orcam.me.uk
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2209100057300.2275@angie.orcam.me.uk
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2022-11-04 10:38:11 -05:00
Bjorn Helgaas
5984de0b41 PCI/PM: Remove unused 'state' parameter to pci_legacy_suspend_late()
1a1daf097e ("PCI/PM: Remove unused pci_driver.suspend_late() hook")
removed the legacy .suspend_late() hook, which was the only user of the
"state" parameter to pci_legacy_suspend_late(), but it neglected to remove
the parameter.

Remove the unused "state" parameter to pci_legacy_suspend_late().

Link: https://lore.kernel.org/r/20221025193502.669091-1-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-10-26 11:26:11 -05:00
Bjorn Helgaas
73fcd4520e agp/via: Update to DEFINE_SIMPLE_DEV_PM_OPS()
As of 1a3c7bb088 ("PM: core: Add new *_PM_OPS macros, deprecate old
ones"), SIMPLE_DEV_PM_OPS() is deprecated in favor of
DEFINE_SIMPLE_DEV_PM_OPS(), which has the advantage that the PM callbacks
don't need to be wrapped with #ifdef CONFIG_PM or tagged with
__maybe_unused.

Convert to DEFINE_SIMPLE_DEV_PM_OPS().  No functional change intended.

Link: https://lore.kernel.org/r/20221025203852.681822-9-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2022-10-26 11:25:56 -05:00
Bjorn Helgaas
746e926b9f agp/sis: Update to DEFINE_SIMPLE_DEV_PM_OPS()
As of 1a3c7bb088 ("PM: core: Add new *_PM_OPS macros, deprecate old
ones"), SIMPLE_DEV_PM_OPS() is deprecated in favor of
DEFINE_SIMPLE_DEV_PM_OPS(), which has the advantage that the PM callbacks
don't need to be wrapped with #ifdef CONFIG_PM or tagged with
__maybe_unused.

Convert to DEFINE_SIMPLE_DEV_PM_OPS().  No functional change intended.

Link: https://lore.kernel.org/r/20221025203852.681822-8-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2022-10-26 11:25:52 -05:00
Bjorn Helgaas
8c1f82c710 agp/amd64: Update to DEFINE_SIMPLE_DEV_PM_OPS()
As of 1a3c7bb088 ("PM: core: Add new *_PM_OPS macros, deprecate old
ones"), SIMPLE_DEV_PM_OPS() is deprecated in favor of
DEFINE_SIMPLE_DEV_PM_OPS(), which has the advantage that the PM callbacks
don't need to be wrapped with #ifdef CONFIG_PM or tagged with
__maybe_unused.

Convert to DEFINE_SIMPLE_DEV_PM_OPS().  No functional change intended.

Link: https://lore.kernel.org/r/20221025203852.681822-7-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2022-10-26 11:25:49 -05:00
Bjorn Helgaas
11a8d8774e agp/nvidia: Convert to generic power management
Convert agpgart-nvidia from legacy PCI power management to the generic
power management framework.

Previously agpgart-nvidia used legacy PCI power management, and
agp_nvidia_suspend() and agp_nvidia_resume() were responsible for both
device-specific things and generic PCI things:

  agp_nvidia_suspend
    pci_save_state                         <-- generic PCI
    pci_set_power_state(PCI_D3hot)         <-- generic PCI

  agp_nvidia_resume
    pci_set_power_state(PCI_D0)            <-- generic PCI
    pci_restore_state                      <-- generic PCI
    nvidia_configure                       <-- device-specific

Convert to generic power management where the PCI bus PM methods do the
generic PCI things, and the driver needs only the device-specific part,
i.e.,

  suspend_devices_and_enter
    dpm_suspend_start(PMSG_SUSPEND)
      pci_pm_suspend                       # PCI bus .suspend() method
        agp_nvidia_suspend                 <-- not needed at all; removed
    suspend_enter
      dpm_suspend_noirq(PMSG_SUSPEND)
        pci_pm_suspend_noirq               # PCI bus .suspend_noirq() method
          pci_save_state                   <-- generic PCI
          pci_prepare_to_sleep             <-- generic PCI
            pci_set_power_state
    ...
    dpm_resume_end(PMSG_RESUME)
      pci_pm_resume                        # PCI bus .resume() method
        pci_restore_standard_config
          pci_set_power_state(PCI_D0)      <-- generic PCI
          pci_restore_state                <-- generic PCI
        agp_nvidia_resume                  # driver->pm->resume
          nvidia_configure                 <-- device-specific

Based on 0aeddbd0cb ("via-agp: convert to generic power management") by
Vaibhav Gupta <vaibhavgupta40@gmail.com>.

Link: https://lore.kernel.org/r/20221025203852.681822-6-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2022-10-26 11:25:38 -05:00
Bjorn Helgaas
6a1274ea0e agp/ati: Convert to generic power management
Convert agpgart-ati from legacy PCI power management to the generic power
management framework.

Previously agpgart-ati used legacy PCI power management, and
agp_ati_suspend() and agp_ati_resume() were responsible for both
device-specific things and generic PCI things like saving and restoring
config space and managing power state:

  agp_ati_suspend
    pci_save_state                         <-- generic PCI
    pci_set_power_state(PCI_D3hot)         <-- generic PCI

  agp_ati_resume
    pci_set_power_state(PCI_D0)            <-- generic PCI
    pci_restore_state                      <-- generic PCI
    ati_configure                          <-- device-specific

With generic power management, the PCI bus PM methods do the generic PCI
things, and the driver needs only the device-specific part, i.e.,

  suspend_devices_and_enter
    dpm_suspend_start(PMSG_SUSPEND)
      pci_pm_suspend                       # PCI bus .suspend() method
        agp_ati_suspend                    <-- not needed at all; removed
    suspend_enter
      dpm_suspend_noirq(PMSG_SUSPEND)
        pci_pm_suspend_noirq               # PCI bus .suspend_noirq() method
          pci_save_state                   <-- generic PCI
          pci_prepare_to_sleep             <-- generic PCI
            pci_set_power_state
    ...
    dpm_resume_end(PMSG_RESUME)
      pci_pm_resume                        # PCI bus .resume() method
        pci_restore_standard_config
          pci_set_power_state(PCI_D0)      <-- generic PCI
          pci_restore_state                <-- generic PCI
        agp_ati_resume                     # driver->pm->resume
          ati_configure                    <-- device-specific

Based on 0aeddbd0cb ("via-agp: convert to generic power management") by
Vaibhav Gupta <vaibhavgupta40@gmail.com>.

Link: https://lore.kernel.org/r/20221025203852.681822-5-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2022-10-26 11:25:33 -05:00
Bjorn Helgaas
c78679d1fe agp/amd-k7: Convert to generic power management
Convert agpgart-amdk7 from legacy PCI power management to the generic power
management framework.

Previously agpgart-amdk7 used legacy PCI power management, and
agp_amdk7_suspend() and agp_amdk7_resume() were responsible for both
device-specific things and generic PCI things like saving and restoring
config space and managing power state:

  agp_amdk7_suspend
    pci_save_state                         <-- generic PCI
    pci_set_power_state                    <-- generic PCI

  agp_amdk7_resume
    pci_set_power_state(PCI_D0)            <-- generic PCI
    pci_restore_state                      <-- generic PCI
    amd_irongate_driver.configure          <-- device-specific

Convert to generic power management where the PCI bus PM methods do the
generic PCI things, and the driver needs only the device-specific part,
i.e.,

  suspend_devices_and_enter
    dpm_suspend_start(PMSG_SUSPEND)
      pci_pm_suspend                       # PCI bus .suspend() method
        agp_amdk7_suspend                  <-- not needed at all; removed
    suspend_enter
      dpm_suspend_noirq(PMSG_SUSPEND)
        pci_pm_suspend_noirq               # PCI bus .suspend_noirq() method
          pci_save_state                   <-- generic PCI
          pci_prepare_to_sleep             <-- generic PCI
            pci_set_power_state
    ...
    dpm_resume_end(PMSG_RESUME)
      pci_pm_resume                        # PCI bus .resume() method
        pci_restore_standard_config
          pci_set_power_state(PCI_D0)      <-- generic PCI
          pci_restore_state                <-- generic PCI
        agp_amdk7_resume                   # driver->pm->resume
          amd_irongate_driver.configure    <-- device-specific

Based on 0aeddbd0cb ("via-agp: convert to generic power management") by
Vaibhav Gupta <vaibhavgupta40@gmail.com>.

Link: https://lore.kernel.org/r/20221025203852.681822-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2022-10-26 11:25:28 -05:00
Bjorn Helgaas
7f142022e6 agp/intel: Convert to generic power management
Convert agpgart-intel from legacy PCI power management to the generic power
management framework.

Previously agpgart-intel used legacy PCI power management, and
agp_intel_resume() was responsible for both device-specific things and
generic PCI things like saving and restoring config space and managing
power state.

In this case, agp_intel_suspend() was empty, and agp_intel_resume()
already did only device-specific things, so simply convert it to take a
struct device * instead of a struct pci_dev *.

Based on 0aeddbd0cb ("via-agp: convert to generic power management") by
Vaibhav Gupta <vaibhavgupta40@gmail.com>.

Link: https://lore.kernel.org/r/20221025203852.681822-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2022-10-26 11:25:22 -05:00
Bjorn Helgaas
94e9f9a23f agp/efficeon: Convert to generic power management
Convert agpgart-efficeon from legacy PCI power management to the generic
power management framework.

Previously agpgart-efficeon used legacy PCI power management, which means
agp_efficeon_suspend() and agp_efficeon_resume() were responsible for both
device-specific things and generic PCI things like saving and restoring
config space and managing power state.

In this case, agp_efficeon_suspend() was empty, and agp_efficeon_resume()
already did only device-specific things, so simply convert it to take a
struct device * instead of a struct pci_dev *.

Based on 0aeddbd0cb ("via-agp: convert to generic power management") by
Vaibhav Gupta <vaibhavgupta40@gmail.com>.

Link: https://lore.kernel.org/r/20221025203852.681822-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
2022-10-26 11:25:16 -05:00
Bjorn Helgaas
461a65d7d1 PCI/portdrv: Unexport pcie_port_service_register(), pcie_port_service_unregister()
pcie_port_service_register() and pcie_port_service_unregister() are used
only by the pciehp, aer, dpc, and pme PCIe port service drivers, none of
which can be modules.  Unexport pcie_port_service_register() and
pcie_port_service_unregister().  No functional change intended.

Link: https://lore.kernel.org/r/20221019204127.44463-4-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2022-10-24 14:57:30 -05:00
Bjorn Helgaas
29f193feee PCI/portdrv: Move private things to portdrv.c
Previously several things used by portdrv_core.c and portdrv_pci.c were
shared by defining them in portdrv.h.  Now that portdrv_core.c and
portdrv_pci.c have been squashed, move things that can be private into
portdrv.c.  No functional change intended.

Link: https://lore.kernel.org/r/20221019204127.44463-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2022-10-24 14:57:30 -05:00
Bjorn Helgaas
a1ccd3d911 PCI/portdrv: Squash into portdrv.c
Squash portdrv_core.c and portdrv_pci.c into portdrv.c to make it easier to
find things.  The whole thing is less than 1000 lines, and it's a pain to
bounce back and forth between two files.

Several portdrv_core.c functions were non-static because they were
referenced from portdrv_pci.c.  Make them static since they're now all in
portdrv.c.

No functional change intended.

Link: https://lore.kernel.org/r/20221019204127.44463-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Keith Busch <kbusch@kernel.org>
2022-10-24 14:57:30 -05:00
Linus Torvalds
9abf2313ad Linux 6.1-rc1 v6.1-rc1 2022-10-16 15:36:24 -07:00
Linus Torvalds
f1947d7c8a Merge tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull more random number generator updates from Jason Donenfeld:
 "This time with some large scale treewide cleanups.

  The intent of this pull is to clean up the way callers fetch random
  integers. The current rules for doing this right are:

   - If you want a secure or an insecure random u64, use get_random_u64()

   - If you want a secure or an insecure random u32, use get_random_u32()

     The old function prandom_u32() has been deprecated for a while
     now and is just a wrapper around get_random_u32(). Same for
     get_random_int().

   - If you want a secure or an insecure random u16, use get_random_u16()

   - If you want a secure or an insecure random u8, use get_random_u8()

   - If you want secure or insecure random bytes, use get_random_bytes().

     The old function prandom_bytes() has been deprecated for a while
     now and has long been a wrapper around get_random_bytes()

   - If you want a non-uniform random u32, u16, or u8 bounded by a
     certain open interval maximum, use prandom_u32_max()

     I say "non-uniform", because it doesn't do any rejection sampling
     or divisions. Hence, it stays within the prandom_*() namespace, not
     the get_random_*() namespace.

     I'm currently investigating a "uniform" function for 6.2. We'll see
     what comes of that.

  By applying these rules uniformly, we get several benefits:

   - By using prandom_u32_max() with an upper-bound that the compiler
     can prove at compile-time is ≤65536 or ≤256, internally
     get_random_u16() or get_random_u8() is used, which wastes fewer
     batched random bytes, and hence has higher throughput.

   - By using prandom_u32_max() instead of %, when the upper-bound is
     not a constant, division is still avoided, because
     prandom_u32_max() uses a faster multiplication-based trick instead.

   - By using get_random_u16() or get_random_u8() in cases where the
     return value is intended to indeed be a u16 or a u8, we waste fewer
     batched random bytes, and hence have higher throughput.

  This series was originally done by hand while I was on an airplane
  without Internet. Later, Kees and I worked on retroactively figuring
  out what could be done with Coccinelle and what had to be done
  manually, and then we split things up based on that.

  So while this touches a lot of files, the actual amount of code that's
  hand fiddled is comfortably small"

* tag 'random-6.1-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random:
  prandom: remove unused functions
  treewide: use get_random_bytes() when possible
  treewide: use get_random_u32() when possible
  treewide: use get_random_{u8,u16}() when possible, part 2
  treewide: use get_random_{u8,u16}() when possible, part 1
  treewide: use prandom_u32_max() when possible, part 2
  treewide: use prandom_u32_max() when possible, part 1
2022-10-16 15:27:07 -07:00
Linus Torvalds
8636df94ec Merge tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull more perf tools updates from Arnaldo Carvalho de Melo:

 - Use BPF CO-RE (Compile Once, Run Everywhere) to support old kernels
   when using bperf (perf BPF based counters) with cgroups.

 - Support HiSilicon PCIe Performance Monitoring Unit (PMU), that
   monitors bandwidth, latency, bus utilization and buffer occupancy.

   Documented in Documentation/admin-guide/perf/hisi-pcie-pmu.rst.

 - User space tasks can migrate between CPUs, so when tracing selected
   CPUs, system-wide sideband is still needed, fix it in the setup of
   Intel PT on hybrid systems.

 - Fix metricgroups title message in 'perf list', it should state that
   the metrics groups are to be used with the '-M' option, not '-e'.

 - Sync the msr-index.h copy with the kernel sources, adding support for
   using "AMD64_TSC_RATIO" in filter expressions in 'perf trace' as well
   as decoding it when printing the MSR tracepoint arguments.

 - Fix program header size and alignment when generating a JIT ELF in
   'perf inject'.

 - Add multiple new Intel PT 'perf test' entries, including a jitdump
   one.

 - Fix the 'perf test' entries for 'perf stat' CSV and JSON output when
   running on PowerPC due to an invalid topology number in that arch.

 - Fix the 'perf test' for arm_coresight failures on the ARM Juno
   system.

 - Fix the 'perf test' attr entry for PERF_FORMAT_LOST, adding this
   option to the or expression expected in the intercepted
   perf_event_open() syscall.

 - Add missing condition flags ('hs', 'lo', 'vc', 'vs') for arm64 in the
   'perf annotate' asm parser.

 - Fix 'perf mem record -C' option processing, it was being chopped up
   when preparing the underlying 'perf record -e mem-events' and thus
   being ignored, requiring using '-- -C CPUs' as a workaround.

 - Improvements and tidy ups for 'perf test' shell infra.

 - Fix Intel PT information printing segfault in uClibc, where a NULL
   format was being passed to fprintf.

* tag 'perf-tools-for-v6.1-2-2022-10-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (23 commits)
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  perf auxtrace arm64: Add support for parsing HiSilicon PCIe Trace packet
  perf auxtrace arm64: Add support for HiSilicon PCIe Tune and Trace device driver
  perf auxtrace arm: Refactor event list iteration in auxtrace_record__init()
  perf tests stat+json_output: Include sanity check for topology
  perf tests stat+csv_output: Include sanity check for topology
  perf intel-pt: Fix system_wide dummy event for hybrid
  perf intel-pt: Fix segfault in intel_pt_print_info() with uClibc
  perf test: Fix attr tests for PERF_FORMAT_LOST
  perf test: test_intel_pt.sh: Add 9 tests
  perf inject: Fix GEN_ELF_TEXT_OFFSET for jit
  perf test: test_intel_pt.sh: Add jitdump test
  perf test: test_intel_pt.sh: Tidy some alignment
  perf test: test_intel_pt.sh: Print a message when skipping kernel tracing
  perf test: test_intel_pt.sh: Tidy some perf record options
  perf test: test_intel_pt.sh: Fix return checking again
  perf: Skip and warn on unknown format 'configN' attrs
  perf list: Fix metricgroups title message
  perf mem: Fix -C option behavior for perf mem record
  perf annotate: Add missing condition flags for arm64
  ...
2022-10-16 15:14:29 -07:00
Linus Torvalds
2df76606db Merge tag 'kbuild-fixes-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fixes from Masahiro Yamada:

 - Fix CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y compile error for the
   combination of Clang >= 14 and GAS <= 2.35.

 - Drop vmlinux.bz2 from the rpm package as it just annoyingly increased
   the package size.

 - Fix modpost error under build environments using musl.

 - Make *.ll files keep value names for easier debugging

 - Fix single directory build

 - Prevent RISC-V from selecting the broken DWARF5 support when Clang
   and GAS are used together.

* tag 'kbuild-fixes-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  lib/Kconfig.debug: Add check for non-constant .{s,u}leb128 support to DWARF5
  kbuild: fix single directory build
  kbuild: add -fno-discard-value-names to cmd_cc_ll_c
  scripts/clang-tools: Convert clang-tidy args to list
  modpost: put modpost options before argument
  kbuild: Stop including vmlinux.bz2 in the rpm's
  Kconfig.debug: add toolchain checks for DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT
  Kconfig.debug: simplify the dependency of DEBUG_INFO_DWARF4/5
2022-10-16 11:12:22 -07:00
Linus Torvalds
2fcd8f108f Merge tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull more clk updates from Stephen Boyd:
 "This is the final part of the clk patches for this merge window.

  The clk rate range series needed another week to fully bake. Maxime
  fixed the bug that broke clk notifiers and prevented this from being
  included in the first pull request. He also added a unit test on top
  to make sure it doesn't break so easily again. The majority of the
  series fixes up how the clk_set_rate_*() APIs work, particularly
  around when the rate constraints are dropped and how they move around
  when reparenting clks. Overall it's a much needed improvement to the
  clk rate range APIs that used to be pretty broken if you looked
  sideways.

  Beyond the core changes there are a few driver fixes for a compilation
  issue or improper data causing clks to fail to register or have the
  wrong parents. These are good to get in before the first -rc so that
  the system actually boots on the affected devices"

* tag 'clk-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: (31 commits)
  clk: tegra: Fix Tegra PWM parent clock
  clk: at91: fix the build with binutils 2.27
  clk: qcom: gcc-msm8660: Drop hardcoded fixed board clocks
  clk: mediatek: clk-mux: Add .determine_rate() callback
  clk: tests: Add tests for notifiers
  clk: Update req_rate on __clk_recalc_rates()
  clk: tests: Add missing test case for ranges
  clk: qcom: clk-rcg2: Take clock boundaries into consideration for gfx3d
  clk: Introduce the clk_hw_get_rate_range function
  clk: Zero the clk_rate_request structure
  clk: Stop forwarding clk_rate_requests to the parent
  clk: Constify clk_has_parent()
  clk: Introduce clk_core_has_parent()
  clk: Switch from __clk_determine_rate to clk_core_round_rate_nolock
  clk: Add our request boundaries in clk_core_init_rate_req
  clk: Introduce clk_hw_init_rate_request()
  clk: Move clk_core_init_rate_req() from clk_core_round_rate_nolock() to its caller
  clk: Change clk_core_init_rate_req prototype
  clk: Set req_rate on reparenting
  clk: Take into account uncached clocks in clk_set_rate_range()
  ...
2022-10-16 11:08:19 -07:00
Linus Torvalds
b08cd74448 Merge tag '6.1-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6
Pull more cifs updates from Steve French:

 - fix a regression in guest mounts to old servers

 - improvements to directory leasing (caching directory entries safely
   beyond the root directory)

 - symlink improvement (reducing roundtrips needed to process symlinks)

 - an lseek fix (to problem where some dir entries could be skipped)

 - improved ioctl for returning more detailed information on directory
   change notifications

 - clarify multichannel interface query warning

 - cleanup fix (for better aligning buffers using ALIGN and round_up)

 - a compounding fix

 - fix some uninitialized variable bugs found by Coverity and the kernel
   test robot

* tag '6.1-rc-smb3-client-fixes-part2' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: improve SMB3 change notification support
  cifs: lease key is uninitialized in two additional functions when smb1
  cifs: lease key is uninitialized in smb1 paths
  smb3: must initialize two ACL struct fields to zero
  cifs: fix double-fault crash during ntlmssp
  cifs: fix static checker warning
  cifs: use ALIGN() and round_up() macros
  cifs: find and use the dentry for cached non-root directories also
  cifs: enable caching of directories for which a lease is held
  cifs: prevent copying past input buffer boundaries
  cifs: fix uninitialised var in smb2_compound_op()
  cifs: improve symlink handling for smb2+
  smb3: clarify multichannel warning
  cifs: fix regression in very old smb1 mounts
  cifs: fix skipping to incorrect offset in emit_cached_dirents
2022-10-16 11:01:40 -07:00
Tetsuo Handa
80493877d7 Revert "cpumask: fix checking valid cpu range".
This reverts commit 78e5a33994 ("cpumask: fix checking valid cpu range").

syzbot is hitting WARN_ON_ONCE(cpu >= nr_cpumask_bits) warning at
cpu_max_bits_warn() [1], for commit 78e5a33994 ("cpumask: fix checking
valid cpu range") is broken.  Obviously that patch hits WARN_ON_ONCE()
when e.g.  reading /proc/cpuinfo because passing "cpu + 1" instead of
"cpu" will trivially hit cpu == nr_cpumask_bits condition.

Although syzbot found this problem in linux-next.git on 2022/09/27 [2],
this problem was not fixed immediately.  As a result, that patch was
sent to linux.git before the patch author recognizes this problem, and
syzbot started failing to test changes in linux.git since 2022/10/10
[3].

Andrew Jones proposed a fix for x86 and riscv architectures [4].  But
[2] and [5] indicate that affected locations are not limited to arch
code.  More delay before we find and fix affected locations, less tested
kernel (and more difficult to bisect and fix) before release.

We should have inspected and fixed basically all cpumask users before
applying that patch.  We should not crash kernels in order to ask
existing cpumask users to update their code, even if limited to
CONFIG_DEBUG_PER_CPU_MAPS=y case.

Link: https://syzkaller.appspot.com/bug?extid=d0fd2bf0dd6da72496dd [1]
Link: https://syzkaller.appspot.com/bug?extid=21da700f3c9f0bc40150 [2]
Link: https://syzkaller.appspot.com/bug?extid=51a652e2d24d53e75734 [3]
Link: https://lkml.kernel.org/r/20221014155845.1986223-1-ajones@ventanamicro.com [4]
Link: https://syzkaller.appspot.com/bug?extid=4d46c43d81c3bd155060 [5]
Reported-by: Andrew Jones <ajones@ventanamicro.com>
Reported-by: syzbot+d0fd2bf0dd6da72496dd@syzkaller.appspotmail.com
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-10-16 10:45:17 -07:00
Nathan Chancellor
0a6de78cff lib/Kconfig.debug: Add check for non-constant .{s,u}leb128 support to DWARF5
When building with a RISC-V kernel with DWARF5 debug info using clang
and the GNU assembler, several instances of the following error appear:

  /tmp/vgettimeofday-48aa35.s:2963: Error: non-constant .uleb128 is not supported

Dumping the .s file reveals these .uleb128 directives come from
.debug_loc and .debug_ranges:

  .Ldebug_loc0:
          .byte   4                               # DW_LLE_offset_pair
          .uleb128 .Lfunc_begin0-.Lfunc_begin0    #   starting offset
          .uleb128 .Ltmp1-.Lfunc_begin0           #   ending offset
          .byte   1                               # Loc expr size
          .byte   90                              # DW_OP_reg10
          .byte   0                               # DW_LLE_end_of_list

  .Ldebug_ranges0:
          .byte   4                               # DW_RLE_offset_pair
          .uleb128 .Ltmp6-.Lfunc_begin0           #   starting offset
          .uleb128 .Ltmp27-.Lfunc_begin0          #   ending offset
          .byte   4                               # DW_RLE_offset_pair
          .uleb128 .Ltmp28-.Lfunc_begin0          #   starting offset
          .uleb128 .Ltmp30-.Lfunc_begin0          #   ending offset
          .byte   0                               # DW_RLE_end_of_list

There is an outstanding binutils issue to support a non-constant operand
to .sleb128 and .uleb128 in GAS for RISC-V but there does not appear to
be any movement on it, due to concerns over how it would work with
linker relaxation.

To avoid these build errors, prevent DWARF5 from being selected when
using clang and an assembler that does not have support for these symbol
deltas, which can be easily checked in Kconfig with as-instr plus the
small test program from the dwz test suite from the binutils issue.

Link: https://sourceware.org/bugzilla/show_bug.cgi?id=27215
Link: https://github.com/ClangBuiltLinux/linux/issues/1719
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-10-17 02:06:47 +09:00
Masahiro Yamada
3753af778d kbuild: fix single directory build
Commit f110e5a250 ("kbuild: refactor single builds of *.ko") was wrong.

KBUILD_MODULES _is_ needed for single builds.

Otherwise, "make foo/bar/baz/" does not build module objects at all.

Fixes: f110e5a250 ("kbuild: refactor single builds of *.ko")
Reported-by: David Sterba <dsterba@suse.cz>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Tested-by: David Sterba <dsterba@suse.com>
2022-10-17 02:03:52 +09:00
Linus Torvalds
1501278bb7 Merge tag 'slab-for-6.1-rc1-hotfix' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab
Pull slab hotfix from Vlastimil Babka:
 "A single fix for the common-kmalloc series, for warnings on mips and
  sparc64 reported by Guenter Roeck"

* tag 'slab-for-6.1-rc1-hotfix' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
  mm/slab: use kmalloc_node() for off slab freelist_idx_t array allocation
2022-10-15 17:05:07 -07:00