Pull iommu fixes from Will Deacon:
"We're still resolving a regression with the handling of unexpected
page faults on SMMUv3, but we're not quite there with a fix yet.
- Fix NULL dereference when freeing domain in Unisoc SPRD driver
- Separate assignment statements with semicolons in AMD page-table
code
- Fix Tegra erratum workaround when the CPU is using 16KiB pages"
* tag 'iommu-fixes-v6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux:
iommu: arm-smmu: Fix Tegra workaround for PAGE_SIZE mappings
iommu/amd: Convert comma to semicolon
iommu: sprd: Avoid NULL deref in sprd_iommu_hw_en
Pull iommu updates from Will Deacon:
"Core:
- Support for the "ats-supported" device-tree property
- Removal of the 'ops' field from 'struct iommu_fwspec'
- Introduction of iommu_paging_domain_alloc() and partial conversion
of existing users
- Introduce 'struct iommu_attach_handle' and provide corresponding
IOMMU interfaces which will be used by the IOMMUFD subsystem
- Remove stale documentation
- Add missing MODULE_DESCRIPTION() macro
- Misc cleanups
Allwinner Sun50i:
- Ensure bypass mode is disabled on H616 SoCs
- Ensure page-tables are allocated below 4GiB for the 32-bit
page-table walker
- Add new device-tree compatible strings
AMD Vi:
- Use try_cmpxchg64() instead of cmpxchg64() when updating pte
Arm SMMUv2:
- Print much more useful information on context faults
- Fix Qualcomm TBU probing when CONFIG_ARM_SMMU_QCOM_DEBUG=n
- Add new Qualcomm device-tree bindings
Arm SMMUv3:
- Support for hardware update of access/dirty bits and reporting via
IOMMUFD
- More driver rework from Jason, this time updating the PASID/SVA
support to prepare for full IOMMUFD support
- Add missing MODULE_DESCRIPTION() macro
- Minor fixes and cleanups
NVIDIA Tegra:
- Fix for benign fwspec initialisation issue exposed by rework on the
core branch
Intel VT-d:
- Use try_cmpxchg64() instead of cmpxchg64() when updating pte
- Use READ_ONCE() to read volatile descriptor status
- Remove support for handling Execute-Requested requests
- Avoid calling iommu_domain_alloc()
- Minor fixes and refactoring
Qualcomm MSM:
- Updates to the device-tree bindings"
* tag 'iommu-updates-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (72 commits)
iommu/tegra-smmu: Pass correct fwnode to iommu_fwspec_init()
iommu/vt-d: Fix identity map bounds in si_domain_init()
iommu: Move IOMMU_DIRTY_NO_CLEAR define
dt-bindings: iommu: Convert msm,iommu-v0 to yaml
iommu/vt-d: Fix aligned pages in calculate_psi_aligned_address()
iommu/vt-d: Limit max address mask to MAX_AGAW_PFN_WIDTH
docs: iommu: Remove outdated Documentation/userspace-api/iommu.rst
arm64: dts: fvp: Enable PCIe ATS for Base RevC FVP
iommu/of: Support ats-supported device-tree property
dt-bindings: PCI: generic: Add ats-supported property
iommu: Remove iommu_fwspec ops
OF: Simplify of_iommu_configure()
ACPI: Retire acpi_iommu_fwspec_ops()
iommu: Resolve fwspec ops automatically
iommu/mediatek-v1: Clean up redundant fwspec checks
RDMA/usnic: Use iommu_paging_domain_alloc()
wifi: ath11k: Use iommu_paging_domain_alloc()
wifi: ath10k: Use iommu_paging_domain_alloc()
drm/msm: Use iommu_paging_domain_alloc()
vhost-vdpa: Use iommu_paging_domain_alloc()
...
Current code configures GCR3 even when device is attached to identity
domain. So that we can support SVA with identity domain. This means in
attach device path it updates Guest Translation related bits in DTE.
Commit de111f6b4f ("iommu/amd: Enable Guest Translation after reading
IOMMU feature register") missed to enable Control[GT] bit in resume
path. Its causing certain laptop to fail to resume after suspend.
This is because we have inconsistency between between control register
(GT is disabled) and DTE (where we have enabled guest translation related
bits) in resume path. And IOMMU hardware throws ILLEGAL_DEV_TABLE_ENTRY.
Fix it by enabling GT bit in resume path.
Reported-by: Błażej Szczygieł <spaz16@wp.pl>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218975
Fixes: de111f6b4f ("iommu/amd: Enable Guest Translation after reading IOMMU feature register")
Tested-by: Błażej Szczygieł <spaz16@wp.pl>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20240621101533.20216-1-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit 87a6f1f22c ("iommu/amd: Introduce per-device domain ID to fix
potential TLB aliasing issue") introduced per device domain ID when
domain is configured with v2 page table. And in invalidation path, it
uses per device structure (dev_data->gcr3_info.domid) to get the domain ID.
In detach_device() path, current code tries to invalidate IOMMU cache
after removing dev_data from domain device list. This means when domain
is configured with v2 page table, amd_iommu_domain_flush_all() will not be
able to invalidate cache as device is already removed from domain device
list.
This is causing change domain tests (changing domain type from identity to DMA)
to fail with IO_PAGE_FAULT issue.
Hence invalidate cache and update DTE before updating data structures.
Reported-by: FahHean Lee <fahhean.lee@amd.com>
Reported-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com>
Fixes: 87a6f1f22c ("iommu/amd: Introduce per-device domain ID to fix potential TLB aliasing issue")
Tested-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com>
Tested-by: Sairaj Arun Kodilkar <sairaj.arunkodilkar@amd.com>
Tested-by: FahHean Lee <fahhean.lee@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jerry Snitselaar <jsnitsel@redhat.com>
Link: https://lore.kernel.org/r/20240620060552.13984-1-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Pull pci updates from Bjorn Helgaas:
"Enumeration:
- Skip E820 checks for MCFG ECAM regions for new (2016+) machines,
since there's no requirement to describe them in E820 and some
platforms require ECAM to work (Bjorn Helgaas)
- Rename PCI_IRQ_LEGACY to PCI_IRQ_INTX to be more specific (Damien
Le Moal)
- Remove last user and pci_enable_device_io() (Heiner Kallweit)
- Wait for Link Training==0 to avoid possible race (Ilpo Järvinen)
- Skip waiting for devices that have been disconnected while
suspended (Ilpo Järvinen)
- Clear Secondary Status errors after enumeration since Master Aborts
and Unsupported Request errors are an expected part of enumeration
(Vidya Sagar)
MSI:
- Remove unused IMS (Interrupt Message Store) support (Bjorn Helgaas)
Error handling:
- Mask Genesys GL975x SD host controller Replay Timer Timeout
correctable errors caused by a hardware defect; the errors cause
interrupts that prevent system suspend (Kai-Heng Feng)
- Fix EDR-related _DSM support, which previously evaluated revision 5
but assumed revision 6 behavior (Kuppuswamy Sathyanarayanan)
ASPM:
- Simplify link state definitions and mask calculation (Ilpo
Järvinen)
Power management:
- Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports, where BIOS
apparently doesn't know how to put them back in D0 (Mario
Limonciello)
CXL:
- Support resetting CXL devices; special handling required because
CXL Ports mask Secondary Bus Reset by default (Dave Jiang)
DOE:
- Support DOE Discovery Version 2 (Alexey Kardashevskiy)
Endpoint framework:
- Set endpoint BAR to be 64-bit if the driver says that's all the
device supports, in addition to doing so if the size is >2GB
(Niklas Cassel)
- Simplify endpoint BAR allocation and setting interfaces (Niklas
Cassel)
Cadence PCIe controller driver:
- Drop DT binding redundant msi-parent and pci-bus.yaml (Krzysztof
Kozlowski)
Cadence PCIe endpoint driver:
- Configure endpoint BARs to be 64-bit based on the BAR type, not the
BAR value (Niklas Cassel)
Freescale Layerscape PCIe controller driver:
- Convert DT binding to YAML (Frank Li)
MediaTek MT7621 PCIe controller driver:
- Add DT binding missing 'reg' property for child Root Ports
(Krzysztof Kozlowski)
- Fix theoretical string truncation in PHY name (Sergio Paracuellos)
NVIDIA Tegra194 PCIe controller driver:
- Return success for endpoint probe instead of falling through to the
failure path (Vidya Sagar)
Renesas R-Car PCIe controller driver:
- Add DT binding missing IOMMU properties (Geert Uytterhoeven)
- Add DT binding R-Car V4H compatible for host and endpoint mode
(Yoshihiro Shimoda)
Rockchip PCIe controller driver:
- Configure endpoint BARs to be 64-bit based on the BAR type, not the
BAR value (Niklas Cassel)
- Add DT binding missing maxItems to ep-gpios (Krzysztof Kozlowski)
- Set the Subsystem Vendor ID, which was previously zero because it
was masked incorrectly (Rick Wertenbroek)
Synopsys DesignWare PCIe controller driver:
- Restructure DBI register access to accommodate devices where this
requires Refclk to be active (Manivannan Sadhasivam)
- Remove the deinit() callback, which was only need by the
pcie-rcar-gen4, and do it directly in that driver (Manivannan
Sadhasivam)
- Add dw_pcie_ep_cleanup() so drivers that support PERST# can clean
up things like eDMA (Manivannan Sadhasivam)
- Rename dw_pcie_ep_exit() to dw_pcie_ep_deinit() to make it parallel
to dw_pcie_ep_init() (Manivannan Sadhasivam)
- Rename dw_pcie_ep_init_complete() to dw_pcie_ep_init_registers() to
reflect the actual functionality (Manivannan Sadhasivam)
- Call dw_pcie_ep_init_registers() directly from all the glue
drivers, not just those that require active Refclk from the host
(Manivannan Sadhasivam)
- Remove the "core_init_notifier" flag, which was an obscure way for
glue drivers to indicate that they depend on Refclk from the host
(Manivannan Sadhasivam)
TI J721E PCIe driver:
- Add DT binding J784S4 SoC Device ID (Siddharth Vadapalli)
- Add DT binding J722S SoC support (Siddharth Vadapalli)
TI Keystone PCIe controller driver:
- Add DT binding missing num-viewport, phys and phy-name properties
(Jan Kiszka)
Miscellaneous:
- Constify and annotate with __ro_after_init (Heiner Kallweit)
- Convert DT bindings to YAML (Krzysztof Kozlowski)
- Check for kcalloc() failure in of_pci_prop_intr_map() (Duoming
Zhou)"
* tag 'pci-v6.10-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (97 commits)
PCI: Do not wait for disconnected devices when resuming
x86/pci: Skip early E820 check for ECAM region
PCI: Remove unused pci_enable_device_io()
ata: pata_cs5520: Remove unnecessary call to pci_enable_device_io()
PCI: Update pci_find_capability() stub return types
PCI: Remove PCI_IRQ_LEGACY
scsi: vmw_pvscsi: Do not use PCI_IRQ_LEGACY instead of PCI_IRQ_LEGACY
scsi: pmcraid: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
scsi: mpt3sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
scsi: megaraid_sas: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
scsi: ipr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
scsi: hpsa: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
scsi: arcmsr: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
wifi: rtw89: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
dt-bindings: PCI: rockchip,rk3399-pcie: Add missing maxItems to ep-gpios
Revert "genirq/msi: Provide constants for PCI/IMS support"
Revert "x86/apic/msi: Enable PCI/IMS"
Revert "iommu/vt-d: Enable PCI/IMS"
Revert "iommu/amd: Enable PCI/IMS"
Revert "PCI/MSI: Provide IMS (Interrupt Message Store) support"
...
With WERROR=y, which is default, clang is not happy:
.../amd/pasid.c:168:3: error: call to undeclared function 'mmu_notifier_unregister'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
.../amd/pasid.c:191:8: error: call to undeclared function 'mmu_notifier_register'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
2 errors generated.
Select missed dependency.
Fixes: a5a91e5484 ("iommu/amd: Add SVA domain support")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240429111707.2795194-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This includes :
- Add data structure to track per protection domain dev/pasid binding details
protection_domain->dev_data_list will track attached list of
dev_data/PASIDs.
- Move 'to_pdomain()' to header file
- Add iommu_sva_set_dev_pasid(). It will check whether PASID is supported
or not. Also adds PASID to SVA protection domain list as well as to
device GCR3 table.
- Add iommu_ops.remove_dev_pasid support. It will unbind PASID from
device. Also remove pasid data from protection domain device list.
- Add IOMMU_SVA as dependency to AMD_IOMMU driver
For a given PASID, iommu_set_dev_pasid() will bind all devices to same
SVA protection domain (1 PASID : 1 SVA protection domain : N devices).
This protection domain is different from device protection domain (one
that's mapped in attach_device() path). IOMMU uses domain ID for caching,
invalidation, etc. In SVA mode it will use per-device-domain-ID. Hence in
invalidation path we retrieve domain ID from gcr3_info_table structure and
use that for invalidation.
Co-developed-by: Wei Huang <wei.huang2@amd.com>
Signed-off-by: Wei Huang <wei.huang2@amd.com>
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-14-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Return success from enable_feature(IOPF) path as this interface is going
away. Instead we will enable/disable IOPF support in attach/detach device
path.
In attach device path, if device is capable of PRI, then we will add it to
per IOMMU IOPF queue and enable PPR support in IOMMU. Also it will
attach device to domain even if it fails to enable PRI or add device to
IOPF queue as device can continue to work without PRI support.
In detach device patch it follows following sequence:
- Flush the queue for the given device
- Disable PPR support in DTE[devid]
- Remove device from IOPF queue
- Disable device PRI
Also add IOMMU_IOPF as dependency to AMD_IOMMU driver.
Co-developed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-13-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
AMD IOMMU hardware supports PCI Peripheral Paging Request (PPR) using
a PPR log, which is a circular buffer containing requests from downstream
end-point devices.
There is one PPR log per IOMMU instance. Therefore, allocate an iopf_queue
per IOMMU instance during driver initialization, and free the queue during
driver deinitialization.
Also rename enable_iommus_v2() -> enable_iommus_ppr() to reflect its
usage. And add amd_iommu_gt_ppr_supported() check before enabling PPR
log.
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-10-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Commit eda8c2860a ("iommu/amd: Enable device ATS/PASID/PRI capabilities
independently") changed the way it enables device capability while
attaching devices. I missed to account the attached domain capability.
Meaning if domain is not capable of handling PASID/PRI (ex: paging
domain with v1 page table) then enabling device feature is not required.
This patch enables PASID/PRI only if domain is capable of handling SVA.
Also move pci feature enablement to do_attach() function so that we make
SVA capability in one place. Finally make PRI enable/disable functions as
static functions.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-9-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
SVA can be supported if domain is in passthrough mode or paging domain
with v2 page table. Current code sets up GCR3 table for domain with v2
page table only. Setup GCR3 table for all SVA capable domains.
- Move GCR3 init/destroy to separate function.
- Change default GCR3 table to use MAX supported PASIDs. Ideally it
should use 1 level PASID table as its using PASID zero only. But we
don't have support to extend PASID table yet. We will fix this later.
- When domain is configured with passthrough mode, allocate default GCR3
table only if device is SVA capable.
Note that in attach_device() path it will not know whether device will use
SVA or not. If device is attached to passthrough domain and if it doesn't
use SVA then GCR3 table will never be used. We will endup wasting memory
allocated for GCR3 table. This is done to avoid DTE update when
attaching PASID to device.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-8-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
This variable will track the number of PASIDs supported by the device.
If IOMMU or device doesn't support PASID then it will be zero.
This will be used while allocating GCR3 table to decide required number
of PASID table levels. Also in PASID bind path it will use this variable
to check whether device supports PASID or not.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-7-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
In preparation to subsequent PPR-related patches, and also remove static
declaration for certain helper functions so that it can be reused in other
files.
Also rename below functions:
alloc_ppr_log -> amd_iommu_alloc_ppr_log
iommu_enable_ppr_log -> amd_iommu_enable_ppr_log
free_ppr_log -> amd_iommu_free_ppr_log
iommu_poll_ppr_log -> amd_iommu_poll_ppr_log
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Co-developed-by: Vasant Hegde <vasant.hegde@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240418103400.6229-5-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
It's somewhat hard to see, but arm64's arch_setup_dma_ops() should only
ever call iommu_setup_dma_ops() after a successful iommu_probe_device(),
which means there should be no harm in achieving the same order of
operations by running it off the back of iommu_probe_device() itself.
This then puts it in line with the x86 and s390 .probe_finalize bodges,
letting us pull it all into the main flow properly. As a bonus this lets
us fold in and de-scope the PCI workaround setup as well.
At this point we can also then pull the call up inside the group mutex,
and avoid having to think about whether iommu_group_store_type() could
theoretically race and free the domain if iommu_setup_dma_ops() ran just
*before* iommu_device_use_default_domain() claims it... Furthermore we
replace one .probe_finalize call completely, since the only remaining
implementations are now one which only needs to run once for the initial
boot-time probe, and two which themselves render that path unreachable.
This leaves us a big step closer to realistically being able to unpick
the variety of different things that iommu_setup_dma_ops() has been
muddling together, and further streamline iommu-dma into core API flows
in future.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> # For Intel IOMMU
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Link: https://lore.kernel.org/r/bebea331c1d688b34d9862eefd5ede47503961b8.1713523152.git.robin.murphy@arm.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The Intel IOMMU code currently tries to allocate all DMAR fault interrupt
vectors on the boot cpu. On large systems with high DMAR counts this
results in vector exhaustion, and most of the vectors are not initially
allocated socket local.
Instead, have a cpu on each node do the vector allocation for the DMARs on
that node. The boot cpu still does the allocation for its node during its
boot sequence.
Signed-off-by: Dimitri Sivanich <sivanich@hpe.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Link: https://lore.kernel.org/r/Zfydpp2Hm+as16TY@hpe.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
LOCKDEP detector reported below warning:
----------------------------------------
[ 23.796949] ========================================================
[ 23.796950] WARNING: possible irq lock inversion dependency detected
[ 23.796952] 6.8.0fix+ #811 Not tainted
[ 23.796954] --------------------------------------------------------
[ 23.796954] kworker/0:1/8 just changed the state of lock:
[ 23.796956] ff365325e084a9b8 (&domain->lock){..-.}-{3:3}, at: amd_iommu_flush_iotlb_all+0x1f/0x50
[ 23.796969] but this lock took another, SOFTIRQ-unsafe lock in the past:
[ 23.796970] (pd_bitmap_lock){+.+.}-{3:3}
[ 23.796972]
and interrupts could create inverse lock ordering between them.
[ 23.796973]
other info that might help us debug this:
[ 23.796974] Chain exists of:
&domain->lock --> &dev_data->lock --> pd_bitmap_lock
[ 23.796980] Possible interrupt unsafe locking scenario:
[ 23.796981] CPU0 CPU1
[ 23.796982] ---- ----
[ 23.796983] lock(pd_bitmap_lock);
[ 23.796985] local_irq_disable();
[ 23.796985] lock(&domain->lock);
[ 23.796988] lock(&dev_data->lock);
[ 23.796990] <Interrupt>
[ 23.796991] lock(&domain->lock);
Fix this issue by disabling interrupt when acquiring pd_bitmap_lock.
Note that this is temporary fix. We have a plan to replace custom bitmap
allocator with IDA allocator.
Fixes: 87a6f1f22c ("iommu/amd: Introduce per-device domain ID to fix potential TLB aliasing issue")
Reviewed-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20240404102717.6705-1-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
The host SNP worthiness can determined later, after alternatives have
been patched, in snp_rmptable_init() depending on cmdline options like
iommu=pt which is incompatible with SNP, for example.
Which means that one cannot use X86_FEATURE_SEV_SNP and will need to
have a special flag for that control.
Use that newly added CC_ATTR_HOST_SEV_SNP in the appropriate places.
Move kdump_sev_callback() to its rightful place, while at it.
Fixes: 216d106c7f ("x86/sev: Add SEV-SNP host initialization support")
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Srikanth Aithal <sraithal@amd.com>
Link: https://lore.kernel.org/r/20240327154317.29909-6-bp@alien8.de
Pull iommu updates from Joerg Roedel:
"Core changes:
- Constification of bus_type pointer
- Preparations for user-space page-fault delivery
- Use a named kmem_cache for IOVA magazines
Intel VT-d changes from Lu Baolu:
- Add RBTree to track iommu probed devices
- Add Intel IOMMU debugfs document
- Cleanup and refactoring
ARM-SMMU Updates from Will Deacon:
- Device-tree binding updates for a bunch of Qualcomm SoCs
- SMMUv2: Support for Qualcomm X1E80100 MDSS
- SMMUv3: Significant rework of the driver's STE manipulation and
domain handling code. This is the initial part of a larger scale
rework aiming to improve the driver's implementation of the
IOMMU-API in preparation for hooking up IOMMUFD support.
AMD-Vi Updates:
- Refactor GCR3 table support for SVA
- Cleanups
Some smaller cleanups and fixes"
* tag 'iommu-updates-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (88 commits)
iommu: Fix compilation without CONFIG_IOMMU_INTEL
iommu/amd: Fix sleeping in atomic context
iommu/dma: Document min_align_mask assumption
iommu/vt-d: Remove scalabe mode in domain_context_clear_one()
iommu/vt-d: Remove scalable mode context entry setup from attach_dev
iommu/vt-d: Setup scalable mode context entry in probe path
iommu/vt-d: Fix NULL domain on device release
iommu: Add static iommu_ops->release_domain
iommu/vt-d: Improve ITE fault handling if target device isn't present
iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected
PCI: Make pci_dev_is_disconnected() helper public for other drivers
iommu/vt-d: Use device rbtree in iopf reporting path
iommu/vt-d: Use rbtree to track iommu probed devices
iommu/vt-d: Merge intel_svm_bind_mm() into its caller
iommu/vt-d: Remove initialization for dynamically heap-allocated rcu_head
iommu/vt-d: Remove treatment for revoking PASIDs with pending page faults
iommu/vt-d: Add the document for Intel IOMMU debugfs
iommu/vt-d: Use kcalloc() instead of kzalloc()
iommu/vt-d: Remove INTEL_IOMMU_BROKEN_GFX_WA
iommu: re-use local fwnode variable in iommu_ops_from_fwnode()
...
Commit cf70873e3d ("iommu/amd: Refactor GCR3 table helper functions")
changed GFP flag we use for GCR3 table. Original plan was to move GCR3
table allocation outside spinlock. But this requires complete rework of
attach device path. Hence we didn't do it as part of SVA series. For now
revert the GFP flag to ATOMIC (same as original code).
Fixes: cf70873e3d ("iommu/amd: Refactor GCR3 table helper functions")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20240307052738.116035-1-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
On many systems that have an AMD IOMMU the following sequence of
warnings is observed during bootup.
```
pci 0000:00:00.2 can't derive routing for PCI INT A
pci 0000:00:00.2: PCI INT A: not connected
```
This series of events happens because of the IOMMU initialization
sequence order and the lack of _PRT entries for the IOMMU.
During initialization the IOMMU driver first enables the PCI device
using pci_enable_device(). This will call acpi_pci_irq_enable()
which will check if the interrupt is declared in a PCI routing table
(_PRT) entry. According to the PCI spec [1] these routing entries
are only required under PCI root bridges:
The _PRT object is required under all PCI root bridges
The IOMMU is directly connected to the root complex, so there is no
parent bridge to look for a _PRT entry. The first warning is emitted
since no entry could be found in the hierarchy. The second warning is
then emitted because the interrupt hasn't yet been configured to any
value. The pin was configured in pci_read_irq() but the byte in
PCI_INTERRUPT_LINE return 0xff which means "Unknown".
After that sequence of events pci_enable_msi() is called and this
will allocate an interrupt.
That is both of these warnings are totally harmless because the IOMMU
uses MSI for interrupts. To avoid even trying to probe for a _PRT
entry mark the IOMMU as IRQ managed. This avoids both warnings.
Link: https://uefi.org/htmlspecs/ACPI_Spec_6_4_html/06_Device_Configuration/Device_Configuration.html?highlight=_prt#prt-pci-routing-table [1]
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Fixes: cffe0a2b5a ("x86, irq: Keep balance of IOAPIC pin reference count")
Reviewed-by: Vasant Hegde <vasant.hegde@amd.com>
Link: https://lore.kernel.org/r/20240122233400.1802-1-mario.limonciello@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
With v1 page table, the AMD IOMMU spec states that the hardware must use
the domain ID to tag its internal translation caches. I/O devices with
different v1 page tables must be given different domain IDs. I/O devices
that share the same v1 page table __may__ be given the same domain ID.
This domain ID management policy is currently implemented by the AMD
IOMMU driver. In this case, only the domain ID is needed when issuing the
INVALIDATE_IOMMU_PAGES command to invalidate the IOMMU translation cache
(TLB).
With v2 page table, the hardware uses domain ID and PASID as parameters
to tag and issue the INVALIDATE_IOMMU_PAGES command. Since the GCR3 table
is setup per-device, and there is no guarantee for PASID to be unique
across multiple devices. The same PASID for different devices could
have different v2 page tables. In such case, if multiple devices share the
same domain ID, IOMMU translation cache for these devices would be polluted
due to TLB aliasing.
Hence, avoid the TLB aliasing issue with v2 page table by allocating unique
domain ID for each device even when multiple devices are sharing the same v1
page table. Please note that this fix would result in multiple
INVALIDATE_IOMMU_PAGES commands (one per domain id) when unmapping a
translation.
Domain ID can be shared until device starts using PASID. We will enhance this
code later where we will allocate per device domain ID only when its needed.
Signed-off-by: Vasant Hegde <vasant.hegde@amd.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Link: https://lore.kernel.org/r/20240205115615.6053-18-vasant.hegde@amd.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>