linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-14 05:22:19 -04:00

Author	SHA1	Message	Date
Rolf Eike Beer	85ef671f97	iommu: make inclusion of amd directory conditional Nothing in there is active if CONFIG_AMD_IOMMU is not enabled, so the whole directory can depend on that switch as well. Fixes: `cbe94c6e1a` ("iommu/amd: Move Kconfig and Makefile bits down into amd directory") Signed-off-by: Rolf Eike Beer <eb@emlix.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/1894970.atdPhlSkOF@devpool92.emlix.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-05-16 08:46:46 +02:00
Rolf Eike Beer	ddcc66cfe8	iommu: make inclusion of intel directory conditional Nothing in there is active if CONFIG_INTEL_IOMMU is not enabled, so the whole directory can depend on that switch as well. Fixes: `ab65ba57e3` ("iommu/vt-d: Move Kconfig and Makefile bits down into intel directory") Signed-off-by: Rolf Eike Beer <eb@emlix.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/3818749.MHq7AAxBmi@devpool92.emlix.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-05-16 08:46:46 +02:00
Rolf Eike Beer	9548feff84	iommu: remove duplicate selection of DMAR_TABLE This is already done in intel/Kconfig. Fixes: `70bad345e6` ("iommu: Fix compilation without CONFIG_IOMMU_INTEL") Signed-off-by: Rolf Eike Beer <eb@emlix.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/2232605.Mh6RI2rZIc@devpool92.emlix.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-05-16 08:46:45 +02:00
Lu Baolu	2e9b2ee2ba	iommu: Cleanup comments for dev_enable/disable_feat The dev_enable/disable_feat ops have been removed by commit <f984fb09e60e> ("iommu: Remove iommu_dev_enable/disable_feature()"). Cleanup the comments to make the code clean. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20250430025249.2371751-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-05-02 08:41:43 +02:00
Jason Gunthorpe	e586e22974	iommu: Protect against overflow in iommu_pgsize() On a 32 bit system calling: iommu_map(0, 0x40000000) When using the AMD V1 page table type with a domain->pgsize of 0xfffff000 causes iommu_pgsize() to miscalculate a result of: size=0x40000000 count=2 count should be 1. This completely corrupts the mapping process. This is because the final test to adjust the pagesize malfunctions when the addition overflows. Use check_add_overflow() to prevent this. Fixes: `b1d99dc5f9` ("iommu: Hook up '->unmap_pages' driver callback") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Link: https://lore.kernel.org/r/0-v1-3ad28fc2e3a3+163327-iommu_overflow_pgsize_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:33:30 +02:00
Robin Murphy	da33e87bd2	iommu: Handle yet another race around registration Next up on our list of race windows to close is another one during iommu_device_register() - it's now OK again for multiple instances to run their bus_iommu_probe() in parallel, but an iommu_probe_device() can still also race against a running bus_iommu_probe(). As Johan has managed to prove, this has now become a lot more visible on DT platforms with driver_async_probe where a client driver is attempting to probe in parallel with its IOMMU driver - although commit `b46064a188` ("iommu: Handle race with default domain setup") resolves this from the client driver's point of view, this isn't before of_iommu_configure() has had the chance to attempt to "replay" a probe that the bus walk hasn't even tried yet, and so still cause the out-of-order group allocation behaviour that we're trying to clean up (and now warning about). The most reliable thing to do here is to explicitly keep track of the "iommu_device_register() is still running" state, so we can then special-case the ops lookup for the replay path (based on dev->iommu again) to let that think it's still waiting for the IOMMU driver to appear at all. This still leaves the longstanding theoretical case of iommu_bus_notifier() being triggered during bus_iommu_probe(), but it's not so simple to defer a notifier, and nobody's ever reported that being a visible issue, so let's quietly kick that can down the road for now... Reported-by: Johan Hovold <johan@kernel.org> Fixes: `bcb81ac6ae` ("iommu: Get DT/ACPI parsing into the proper probe path") Signed-off-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/88d54c1b48fed8279aa47d30f3d75173685bb26a.1745516488.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:31:18 +02:00
Lu Baolu	6f7340120a	iommu: Allow attaching static domains in iommu_attach_device_pasid() The idxd driver attaches the default domain to a PASID of the device to perform kernel DMA using that PASID. The domain is attached to the device's PASID through iommu_attach_device_pasid(), which checks if the domain->owner matches the iommu_ops retrieved from the device. If they do not match, it returns a failure. if (ops != domain->owner || pasid == IOMMU_NO_PASID) return -EINVAL; The static identity domain implemented by the intel iommu driver doesn't specify the domain owner. Therefore, kernel DMA with PASID doesn't work for the idxd driver if the device translation mode is set to passthrough. Generally the owner field of static domains is not set because they are already part of iommu ops. Add a helper domain_iommu_ops_compatible() that checks if a domain is compatible with the device's iommu ops. This helper explicitly allows the static blocked and identity domains associated with the device's iommu_ops to be considered compatible. Fixes: `2031c469f8` ("iommu/vt-d: Add support for static identity domain") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220031 Cc: stable@vger.kernel.org Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/linux-iommu/20250422191554.GC1213339@ziepe.ca/ Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20250424034123.2311362-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:23:35 +02:00
Arnd Bergmann	fa26198d30	iommu/io-pgtable-arm: dynamically allocate selftest device struct In general a 'struct device' is way too large to be put on the kernel stack. Apparently something just caused it to grow slightly larger, which pushed the arm_lpae_do_selftests() function over the warning limit in some configurations: drivers/iommu/io-pgtable-arm.c:1423:19: error: stack frame size (1032) exceeds limit (1024) in 'arm_lpae_do_selftests' [-Werror,-Wframe-larger-than] 1423 | static int __init arm_lpae_do_selftests(void) | ^ Change the function to use a dynamically allocated faux_device instead of the on-stack device structure. Fixes: `ca25ec247a` ("iommu/io-pgtable-arm: Remove iommu_dev==NULL special case") Link: https://lore.kernel.org/all/ab75a444-22a1-47f5-b3c0-253660395b5a@arm.com/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Robin Murphy <robin.murphy@arm.com> Link: https://lore.kernel.org/r/20250423164826.2931382-1-arnd@kernel.org Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:21:18 +02:00
Jason Gunthorpe	21c03574df	iommu: Hide ops.domain_alloc behind CONFIG_FSL_PAMU fsl_pamu is the last user of domain_alloc(), and it is using it to create something weird that doesn't really fit into the iommu subsystem architecture. It is not a paging domain since it doesn't have any map/unmap ops. It may be some special kind of identity domain. For now just leave it as is. Wrap its definition in CONFIG_FSL_PAMU to discourage any new drivers from attempting to use it. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/5-v4-ff5fb6b03bd1+288-iommu_virtio_domains_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:14:59 +02:00
Jason Gunthorpe	a4672d0fe1	iommu: Do not call domain_alloc() in iommu_sva_domain_alloc() No driver implements SVA under domain_alloc() anymore, this is dead code. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/4-v4-ff5fb6b03bd1+288-iommu_virtio_domains_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:14:59 +02:00
Jason Gunthorpe	07107e7444	iommu/virtio: Move to domain_alloc_paging() virtio has the complication that it sometimes wants to return a paging domain for IDENTITY which makes this conversion a little different than other drivers. Add a viommu_domain_alloc_paging() that combines viommu_domain_alloc() and viommu_domain_finalise() to always return a fully initialized and finalized paging domain. Use viommu_domain_alloc_identity() to implement the special non-bypass IDENTITY flow by calling viommu_domain_alloc_paging() then viommu_domain_map_identity(). Remove support for deferred finalize and the vdomain->mutex. Remove core support for domain_alloc() IDENTITY as virtio was the last driver using it. Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/3-v4-ff5fb6b03bd1+288-iommu_virtio_domains_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:14:58 +02:00
Jason Gunthorpe	0d609a1450	iommu: Add domain_alloc_identity() virtio-iommu has a mode where the IDENTITY domain is actually a paging domain with an identity mapping covering some of the system address space manually created. To support this add a new domain_alloc_identity() op that accepts the struct device so that virtio can allocate and fully finalize a paging domain to return. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/2-v4-ff5fb6b03bd1+288-iommu_virtio_domains_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:14:57 +02:00
Jason Gunthorpe	0d76a6edae	iommu/virtio: Break out bypass identity support into a global static To make way for a domain_alloc_paging conversion add the typical global static IDENTITY domain. This supports VMMs that have a VIRTIO_IOMMU_F_BYPASS_CONFIG config. If the VMM does not have support then the domain_alloc path is still used, which creates an IDENTITY domain out of a paging domain. Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/1-v4-ff5fb6b03bd1+288-iommu_virtio_domains_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:14:57 +02:00
Lu Baolu	f984fb09e6	iommu: Remove iommu_dev_enable/disable_feature() No external drivers use these interfaces anymore. Furthermore, no existing iommu drivers implement anything in the callbacks. Remove them to avoid dead code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Link: https://lore.kernel.org/r/20250418080130.1844424-9-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:04:35 +02:00
Lu Baolu	be2a24322c	iommufd: Remove unnecessary IOMMU_DEV_FEAT_IOPF The iopf enablement has been moved to the iommu drivers. It is unnecessary for iommufd to handle iopf enablement. Remove the iopf enablement logic to avoid duplication. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Link: https://lore.kernel.org/r/20250418080130.1844424-8-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:04:34 +02:00
Lu Baolu	ec027bf7e8	uacce: Remove unnecessary IOMMU_DEV_FEAT_IOPF None of the drivers implement anything for IOMMU_DEV_FEAT_IOPF anymore, remove it to avoid dead code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Zhangfei Gao <zhangfei.gao@linaro.org> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Link: https://lore.kernel.org/r/20250418080130.1844424-7-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:04:34 +02:00
Lu Baolu	853b01b5ef	dmaengine: idxd: Remove unnecessary IOMMU_DEV_FEAT_IOPF The IOMMU_DEV_FEAT_IOPF implementation in the iommu driver is just a no-op. It will also be removed from the iommu driver in the subsequent patch. Remove it to avoid dead code. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Vinod Koul <vkoul@kernel.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Link: https://lore.kernel.org/r/20250418080130.1844424-6-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:04:33 +02:00
Lu Baolu	c2fa4d4cce	iommufd/selftest: Put iopf enablement in domain attach path Update iopf enablement in the iommufd mock device driver to use the new method, similar to the arm-smmu-v3 driver. Enable iopf support when any domain with an iopf_handler is attached, and disable it when the domain is removed. Add a refcount in the mock device state structure to keep track of the number of domains set to the device and PASIDs that require iopf. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Link: https://lore.kernel.org/r/20250418080130.1844424-5-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:04:33 +02:00
Lu Baolu	17fce9d233	iommu/vt-d: Put iopf enablement in domain attach path Update iopf enablement in the driver to use the new method, similar to the arm-smmu-v3 driver. Enable iopf support when any domain with an iopf_handler is attached, and disable it when the domain is removed. Place all the logic for controlling the PRI and iopf queue in the domain set/remove/replace paths. Keep track of the number of domains set to the device and PASIDs that require iopf. When the first domain requiring iopf is attached, add the device to the iopf queue and enable PRI. When the last domain is removed, remove it from the iopf queue and disable PRI. Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Link: https://lore.kernel.org/r/20250418080130.1844424-4-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:04:31 +02:00
Jason Gunthorpe	7c8896dd4a	iommu: Remove IOMMU_DEV_FEAT_SVA None of the drivers implement anything here anymore, remove the dead code. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Yi Liu <yi.l.liu@intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Link: https://lore.kernel.org/r/20250418080130.1844424-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:04:29 +02:00
Jason Gunthorpe	cfea71aea9	iommu/arm-smmu-v3: Put iopf enablement in the domain attach path SMMUv3 co-mingles FEAT_IOPF and FEAT_SVA behaviors so that fault reporting doesn't work unless both are enabled. This is not correct and causes problems for iommufd which does not enable FEAT_SVA for its fault capable domains. These APIs are both obsolete, update SMMUv3 to use the new method like AMD implements. A driver should enable iopf support when a domain with an iopf_handler is attached, and disable iopf support when the domain is removed. Move the fault support logic to sva domain allocation and to domain attach, refusing to create or attach fault capable domains if the HW doesn't support it. Move all the logic for controlling the iopf queue under arm_smmu_attach_prepare(). Keep track of the number of domains on the master (over all the SSIDs) that require iopf. When the first domain requiring iopf is attached create the iopf queue, when the last domain is detached destroy it. Turn FEAT_IOPF and FEAT_SVA into no ops. Remove the sva_lock, this is all protected by the group mutex. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Zhangfei Gao <zhangfei.gao@linaro.org> Acked-by: Will Deacon <will@kernel.org> Link: https://lore.kernel.org/r/20250418080130.1844424-2-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-28 13:04:28 +02:00
Robin Murphy	0da188c846	iommu: Split out and tidy up Arm Kconfig There are quite a lot of options for the Arm drivers, still all buried in the top-level Kconfig. For ease of use and consistency with all the other subdirectories, break these out into drivers/arm. For similar clarity and self-consistency, also tweak the ARM_SMMU sub-options to use "if" instead of "depends", to match ARM_SMMU_V3. Lastly also clean up the slightly messy description of ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT as highlighted by Geert - by now we really shouldn't need commentary on v4.x kernel behaviour anyway - and downgrade it to EXPERT as the first step in the 6-year-old threat to remove it entirely. Cc: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Link: https://lore.kernel.org/r/a614ec86ba78c09cd16e348f633f6bb38793391f.1742480488.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:34:00 +02:00
Robin Murphy	0c8e9c148e	iommu: Avoid introducing more races Although the lock-juggling is only a temporary workaround, we don't want it to make things avoidably worse. Jason was right to be nervous, since bus_iommu_probe() doesn't care which IOMMU instance it's probing for, so it probably is possible for one walk to finish a probe which a different walk started, thus we do want to check for that. Also there's no need to drop the lock just to have of_iommu_configure() do nothing when a fwspec already exists; check that directly and avoid opening a window at all in that (still somewhat likely) case. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/09d901ad11b3a410fbb6e27f7d04ad4609c3fe4a.1741706365.git.robin.murphy@arm.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:31:07 +02:00
Jason Gunthorpe	249d3327f0	iommu/vtd: Remove iommu_alloc_pages_node() Intel is the only thing that uses this now, convert to the size versions, trying to avoid PAGE_SHIFT. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/23-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:54 +02:00
Jason Gunthorpe	c3b42b6ffa	iommu/amd: Use iommu_alloc_pages_node_sz() for the IRT Use the actual size of the irq_table allocation, limiting to 128 due to the HW alignment needs. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/22-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:53 +02:00
Jason Gunthorpe	5087f663c2	iommu/pages: Remove iommu_alloc_page_node() Use iommu_alloc_pages_node_sz() instead. AMD and Intel are both using 4K pages for these structures since those drivers only work on 4K PAGE_SIZE. riscv is also spec'd to use SZ_4K. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/21-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:51 +02:00
Jason Gunthorpe	28024569e8	iommu/pages: Remove iommu_alloc_page/pages() A few small changes to the remaining drivers using these will allow them to be removed: - Exynos wants to allocate fixed 16K/8K allocations - Rockchip already has a define SPAGE_SIZE which is used by the dma_map immediately following, using SPAGE_ORDER which is a lg2size - tegra has size constants already for its two allocations Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/20-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:50 +02:00
Jason Gunthorpe	d50aaa4a9f	iommu: Update various drivers to pass in lg2sz instead of order to iommu pages Convert most of the places calling get_order() as an argument to the iommu-pages allocator into order_base_2() or the _sz flavour instead. These places already have an exact size, there is no particular reason to use order here. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/19-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:48 +02:00
Jason Gunthorpe	9dda3f01dd	iommu/riscv: Update to use iommu_alloc_pages_node_lg2() One part of RISCV already has a computed size, however the queue allocation must be aligned to 4k. The other objects are 4k by spec. Reviewed-by: Tomasz Jeznach <tjeznach@rivosinc.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/18-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:48 +02:00
Jason Gunthorpe	5faa04c4ed	iommu/amd: Use roundup_pow_two() instead of get_order() If x >= PAGE_SIZE then: 1 << (get_order(x) + PAGE_SHIFT) == roundup_pow_two() Inline this into the only caller, compute the size of the HW device table in terms of 4K pages which matches the HW definition. Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/17-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:47 +02:00
Jason Gunthorpe	e874c666b1	iommu/amd: Change rlookup, irq_lookup, and alias to use kvalloc() This is just CPU memory used by the driver to track things, it doesn't need to use iommu-pages. All of them are indexed by devid and devid is bounded by pci_seg->last_bdf or we are already out of bounds on the page allocation. Switch them to use some version of kvmalloc_array() and drop the now unused constants and remove the tbl_size() round up to PAGE_SIZE multiples logic. Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/16-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:47 +02:00
Jason Gunthorpe	b3efacc451	iommu/pages: Allow sub page sizes to be passed into the allocator Generally drivers have a specific idea what their HW structure size should be. In a lot of cases this is related to PAGE_SIZE, but not always. ARM64, for example, allows a 4K IO page table size on a 64K CPU page table system. Currently we don't have any good support for sub page allocations, but make the API accommodate this by accepting a sub page size from the caller and rounding up internally. This is done by moving away from order as the size input and using size: size == 1 << (order + PAGE_SHIFT) Following patches convert drivers away from using order and try to specify allocation sizes independent of PAGE_SIZE. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/15-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:46 +02:00
Jason Gunthorpe	580ccca4ee	iommu/pages: Move the __GFP_HIGHMEM checks into the common code The entire allocator API is built around using the kernel virtual address, it is illegal to pass GFP_HIGHMEM in as a GFP flag. Block it in the common code. Remove the duplicated checks from drivers. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/14-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:45 +02:00
Jason Gunthorpe	212fcf36c6	iommu/pages: Move from struct page to struct ioptdesc and folio This brings the iommu page table allocator into the modern world of having its own private page descriptor and not re-using fields from struct page for its own purpose. It follows the basic pattern of struct ptdesc which did this transformation for the CPU page table allocator. Currently iommu-pages is pretty basic so this isn't a huge benefit, however I see a coming need for features that CPU allocator has, like sub PAGE_SIZE allocations, and RCU freeing. This provides the base infrastructure to implement those cleanly. Remove numa_node_id() calls from the inlines and instead use NUMA_NO_NODE which will get switched to numa_mem_id(), which seems to be the right ID to use for memory allocations. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/13-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:44 +02:00
Jason Gunthorpe	27bc9f717f	iommu/pages: Remove iommu_put_pages_list_old and the _Generic Nothing uses the old list_head path now, remove it. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/12-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:43 +02:00
Jason Gunthorpe	868240c34e	iommu: Change iommu_iotlb_gather to use iommu_page_list This converts the remaining places using list of pages to the new API. The Intel free path was shared with its gather path, so it is converted at the same time. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/11-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:42 +02:00
Jason Gunthorpe	c70637cdd8	iommu/amd: Convert to use struct iommu_pages_list Change the internal freelist to use struct iommu_pages_list. AMD uses the freelist to batch free the entire table during domain destruction, and to replace table levels with leafs during map. Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/10-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:42 +02:00
Jason Gunthorpe	d4d5153ad6	iommu/riscv: Convert to use struct iommu_pages_list Change the internal freelist to use struct iommu_pages_list. riscv uses this page list to free page table levels that are replaced with leaf ptes. Reviewed-by: Tomasz Jeznach <tjeznach@rivosinc.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/9-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:41 +02:00
Jason Gunthorpe	13f43d7cf3	iommu/pages: Formalize the freelist API We want to get rid of struct page references outside the internal allocator implementation. The free list has the driver open code something like: list_add_tail(&virt_to_page(ptr)->lru, freelist); Move the above into a small inline and make the freelist into a wrapper type 'struct iommu_pages_list' so that the compiler can help check all the conversion. This struct has also proven helpful in some future ideas to convert to a singly linked list to get an extra pointer in the struct page, and to signal that the pages should be freed with RCU. Use a temporary _Generic so we don't need to rename the free function as the patches progress. Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/8-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:40 +02:00
Jason Gunthorpe	f5af4a4f7c	iommu/pages: De-inline the substantial functions These are called in a lot of places and are not trivial. Move them to the core module. Tidy some of the comments and function arguments, fold __iommu_alloc_account() into its only caller, change __iommu_free_account() into __iommu_free_page() to remove some duplication. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/7-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:39 +02:00
Jason Gunthorpe	3e8e986ce8	iommu/pages: Remove iommu_free_page() Use iommu_free_pages() instead. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/6-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:36 +02:00
Jason Gunthorpe	4316ba4a50	iommu/pages: Remove the order argument to iommu_free_pages() Now that we have a folio under the allocation iommu_free_pages() can know the order of the original allocation and do the correct thing to free it. The next patch will rename iommu_free_page() to iommu_free_pages() so we have naming consistency with iommu_alloc_pages_node(). Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/5-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:33 +02:00
Jason Gunthorpe	c11a1a4792	iommu/pages: Make iommu_put_pages_list() work with high order allocations alloc_pages_node(, order) needs to be paired with __free_pages(, order) to free all the allocated pages. For order != 0 the return from alloc_pages_node() is just a page list, it hasn't been formed into a folio. However iommu_put_pages_list() just calls put_page() on the head page of an allocation, which will end up leaking the tail pages if order != 0. Fix this by using __GFP_COMP to create a high order folio and then always use put_page() to free the full high order folio. __iommu_free_account() can get the order of the allocation via folio_order(), which corrects the accounting of high order allocations in iommu_put_pages_list(). This is the same technique slub uses. As far as I can tell, none of the places using high order allocations are also using the free list, so this not a current bug. Fixes: `06c375053c` ("iommu/vt-d: add wrapper functions for page allocations") Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/4-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:33 +02:00
Jason Gunthorpe	8360c03dd9	iommu/pages: Remove __iommu_alloc_pages()/__iommu_free_pages() These were only used by tegra-smmu and leaked the struct page out of the API. Delete them since tega-smmu has been converted to the other APIs. In the process flatten the call tree so we have fewer one line functions calling other one line functions.. iommu_alloc_pages_node() is the real allocator and everything else can just call it directly. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/3-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:32 +02:00
Jason Gunthorpe	a96969a915	iommu/tegra: Do not use struct page as the handle for pts Instead use the virtual address and dma_map_single() like as->pd uses. Introduce a small struct tegra_pt instead of void * to have some clarity what is using this API and add compile safety during the conversion. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/2-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:32 +02:00
Jason Gunthorpe	50568f87d1	iommu/terga: Do not use struct page as the handle for as->pd memory Instead use the virtual address. Change from dma_map_page() to dma_map_single() which works directly on a KVA. Add a type for the pd table level for clarity. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Link: https://lore.kernel.org/r/1-v4-c8663abbb606+3f7-iommu_pages_jgg@nvidia.com Signed-off-by: Joerg Roedel <jroedel@suse.de>	2025-04-17 16:22:31 +02:00
Linus Torvalds	8ffd015db8	Linux 6.15-rc2 v6.15-rc2	2025-04-13 11:54:49 -07:00
Linus Torvalds	004a365eb8	Merge tag 'erofs-for-6.15-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fixes from Gao Xiang: - Properly handle errors when file-backed I/O fails - Fix compilation issues on ARM platform (arm-linux-gnueabi) - Fix parsing of encoded extents - Minor cleanup * tag 'erofs-for-6.15-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: remove duplicate code erofs: fix encoded extents handling erofs: add __packed annotation to union(__le16..) erofs: set error to bio if file-backed IO fails	2025-04-13 10:52:04 -07:00
Linus Torvalds	5aaaedb0cb	Merge tag 'ext4_for_linus-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 fixes from Ted Ts'o: "A few more miscellaneous ext4 bug fixes and cleanups including some syzbot failures and fixing a stale file handing refeencing an inode previously used as a regular file, but which has been deleted and reused as an ea_inode would result in ext4 erroneously considering this a case of fs corruption" * tag 'ext4_for_linus-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: ext4: fix off-by-one error in do_split ext4: make block validity check resistent to sb bh corruption ext4: avoid -Wflex-array-member-not-at-end warning Documentation: ext4: Add fields to ext4_super_block documentation ext4: don't treat fhandle lookup of ea_inode as FS corruption	2025-04-13 07:15:50 -07:00
Linus Torvalds	051ea726ee	Merge tag 'fixes-2025-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock fix from Mike Rapoport: "Fix build of memblock test. Add missing stubs for mutex and free_reserved_area() to memblock tests" * tag 'fixes-2025-04-13' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: memblock tests: Fix mutex related build error	2025-04-13 07:11:33 -07:00


1351491 Commits