linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-06-11 05:55:28 -04:00

Author	SHA1	Message	Date
Jason Gunthorpe	ef7bfe5bbf	iommupt/x86: Support SW bits and permit PT_FEAT_DMA_INCOHERENT VT-d requires PT_FEAT_DMA_INCOHERENT for the x86 page table as well, implement the required SW bits and enable the feature. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:50:19 +01:00
Jason Gunthorpe	1978fac281	iommupt/x86: Set the dirty bit only for writable PTEs AMD and VTD are historically different here, adopt the VTD version of setting the D bit only on writable PTEs as it makes more sense. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:50:19 +01:00
Jason Gunthorpe	5448c1558f	iommupt: Add the Intel VT-d second stage page table format The VT-d second stage format is almost the same as the x86 PAE format, except the bit encodings in the PTE are different and a few new PTE features, like force coherency are present. Among all the formats it is unique in not having a designated present bit. Comparing the performance of several operations to the existing version: iommu_map() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 53,66 , 50,64 , 21.21 2^21, 59,70 , 56,67 , 16.16 2^30, 54,66 , 52,63 , 17.17 256*2^12, 384,524 , 337,516 , 34.34 256*2^21, 387,632 , 336,626 , 46.46 256*2^30, 376,629 , 323,623 , 48.48 iommu_unmap() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 67,86 , 63,84 , 25.25 2^21, 64,84 , 59,80 , 26.26 2^30, 59,78 , 56,74 , 24.24 256*2^12, 216,335 , 198,317 , 37.37 256*2^21, 245,350 , 232,344 , 32.32 256*2^30, 248,345 , 226,339 , 33.33 Cc: Tina Zhang <tina.zhang@intel.com> Cc: Kevin Tian <kevin.tian@intel.com> Cc: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:50:17 +01:00
Jason Gunthorpe	efa03dab7c	iommupt: Flush the CPU cache after any writes to the page table Flush the CPU cache for the page table memory after each set of writes to the page table. The iommu should have visibility to the updated entries as soon as the map/unmap/etc operations return, like normal coherent hardware does. The caches also have to be flushed before any gather can be submitted to the driver. Implement the same solution to the race as io-pgtable-arm by using a software PTE bit to track if a table entry has been flushed or not. If another thread is still flushing then another concurrent map operation could return without IOMMU visibility to a required table entry. The SW bit will tell the second thread to also flush the cache. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:47:45 +01:00
Jason Gunthorpe	aefd967dab	iommupt: Use the incoherent start/stop functions for PT_FEAT_DMA_INCOHERENT This is the first step to supporting an incoherent walker, start and stop the incoherence around the allocation and frees of the page table memory. The iommu_pages API maps this to dma_map/unmap_single(), or arch cache flushing calls. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:47:44 +01:00
Jason Gunthorpe	bcc64b57b4	iommupt: Add basic support for SW bits in the page table SW bits can be placed on items, including table entries, single OA's and individual items within a contiguous OA. They are guaranteed to be ignored by the HW. The API is very basic since the only use case so far is a single bit. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:47:44 +01:00
Jason Gunthorpe	36ae67b139	iommu/pages: Add support for incoherent IOMMU page table walkers Some IOMMU HW cannot snoop the CPU cache when it walks the IO page tables. The CPU is required to flush the cache to make changes visible to the HW. Provide some helpers from iommu-pages to manage this. The helpers combine both the ARM and x86 (used in Intel VT-d) versions of the cache flushing under a single API. The ARM version uses the DMA API to access the cache flush on the assumption that the iommu is using a direct mapping and is already marked incoherent. The helpers will do the DMA API calls to set things up and keep track of DMA mapped folios using a bit in the ioptdesc so that unmapping on error paths is cleaner. The Intel version just calls the arch cache flush call directly and has no need to cleanup prior to destruction. Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:47:43 +01:00
Jason Gunthorpe	bc5233c090	iommupt: Add a kunit test for the IOMMU implementation This intends to have high coverage of the page table format functions and the IOMMU implementation itself, exercising the various corner cases. The kunit tests can be run in the kunit framework, using commands like: tools/testing/kunit/kunit.py run --build_dir build_kunit_arm64 --arch arm64 --make_options LLVM=-19 --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig tools/testing/kunit/kunit.py run --build_dir build_kunit_uml --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig tools/testing/kunit/kunit.py run --build_dir build_kunit_x86_64 --arch x86_64 --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig tools/testing/kunit/kunit.py run --build_dir build_kunit_i386 --arch i386 --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig tools/testing/kunit/kunit.py run --build_dir build_kunit_i386pae --arch i386 --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig --kconfig_add CONFIG_X86_PAE=y There are several interesting corner cases on the 32 bit platforms that need checking. Like the generic tests, these are run on the format's configuration list using kunit "params". This also checks the core iommu parts of the page table code as it enters the logic through a mock iommu_domain. The following are checked: - PT_FEAT_DYNAMIC_TOP properly adds levels one by one - Every page size can be iommu_map()'d, and mapping creates that size - iommu_iova_to_phys() works with every page size - Test converting OA -> non present -> OA when the two OAs overlap and free table levels - Test that unmap stops at holes, unmap doesn't split, and unmap returns the right values for partial unmap requests - Randomly map/unmap. Checks map with random sizes, that map fails when hitting collisions doing nothing, unmap/map with random intersections and full unmap of random sizes. 
Also checks iommu_iova_to_phys() with random sizes - Check for memory leaks by monitoring NR_SECONDARY_PAGETABLE Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:08:58 +01:00
Jason Gunthorpe	2fdf6db436	iommu/amd: Remove AMD io_pgtable support None of this is used anymore, delete it. Reviewed-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:08:57 +01:00
Alejandro Jimenez	789a5913b2	iommu/amd: Use the generic iommu page table Replace the io_pgtable versions with pt_iommu versions. The v2 page table uses the x86 implementation that will be eventually shared with VT-d. This supports the same special features as the original code: - increase_top for the v1 format to allow scaling from 3 to 6 levels - non-present flushing - Dirty tracking for v1 only - __sme_set() to adjust the PTEs for CC - Optimization for flushing with virtualization to minimize the range - amd_iommu_pgsize_bitmap override of the native page sizes - page tables allocate from the device's NUMA node Rework the domain ops so that v1/v2 get their own ops. Make dedicated allocation functions for v1 and v2. Hook up invalidation for a top change to struct pt_iommu_flush_ops. Delete some of the iopgtable related code that becomes unused in this patch. The next patch will delete the rest of it. This fixes a race bug in AMD's increase_address_space() implementation. It stores the top level and top pointer in different memory, which prevents other threads from reading a coherent version: increase_address_space() alloc_pte() level = pgtable->mode - 1; pgtable->root = pte; pgtable->mode += 1; pte = &pgtable->root[PM_LEVEL_INDEX(level, address)]; The iommupt version is careful to put mode and root under a single READ_ONCE and then is careful to only READ_ONCE a single time per walk. Signed-off-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:08:56 +01:00
Jason Gunthorpe	aef5de756e	iommupt: Add the x86 64 bit page table format This is used by x86 CPUs and can be used in AMD/VT-d x86 IOMMUs. When an x86 IOMMU is running SVA the MM will be using this format. This implementation follows the AMD v2 io-pgtable version. There is nothing remarkable here, the format can have 4 or 5 levels and limited support for different page sizes. No contiguous pages support. x86 uses a sign extension mechanism where the top bits of the VA must match the sign bit. The core code supports this through PT_FEAT_SIGN_EXTEND which creates an upper and lower VA range. All the new operations will work correctly in both spaces, however currently there is no way to report the upper space to other layers. Future patches can improve that. In principle this can support 3 page table levels matching the 32 bit PAE table format, but no iommu driver needs this. The focus is on the modern 64 bit 4 and 5 level formats. Comparing the performance of several operations to the existing version: iommu_map() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 71,61 , 66,58 , -13.13 2^21, 66,60 , 61,55 , -10.10 2^30, 59,56 , 56,54 , -3.03 256*2^12, 392,1360 , 345,1289 , 73.73 256*2^21, 383,1159 , 335,1145 , 70.70 256*2^30, 378,965 , 331,892 , 62.62 iommu_unmap() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 77,71 , 73,68 , -7.07 2^21, 76,70 , 70,66 , -6.06 2^30, 69,66 , 66,63 , -4.04 256*2^12, 225,899 , 210,870 , 75.75 256*2^21, 262,722 , 248,710 , 65.65 256*2^30, 251,643 , 244,634 , 61.61 The small -ve values in the iommu_unmap() are due to the core code calling iommu_pgsize() before invoking the domain op. This is unnecessary with this implementation. Future work optimizes this and gets to 2%, 4%, 3%. 
Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:07:14 +01:00
Jason Gunthorpe	e93d5945ed	iommufd: Change the selftest to use iommupt instead of xarray The iommufd self test uses an xarray to store the pfns and their orders to emulate a page table. Make it act more like a real iommu driver by replacing the xarray with an iommupt based page table. The new AMDv1 mock format behaves similarly to the xarray. Add set_dirty() as a iommu_pt operation to allow the test suite to simulate HW dirty. Userspace can select between several formats including the normal AMDv1 format and a special MOCK_IOMMUPT_HUGE variation for testing huge page dirty tracking. To make the dirty tracking test work the page table must only store exactly 2M huge pages otherwise the logic the test uses fails. They cannot be broken up or combined. Aside from aligning the selftest with a real page table implementation, this helps test the iommupt code itself. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:07:13 +01:00
Jason Gunthorpe	e5359dcc61	iommupt: Add a mock pagetable format for iommufd selftest to use The iommufd self test uses an xarray to store the pfns and their orders to emulate a page table. Slightly modify the amdv1 page table to create a real page table that has similar properties: - 2k base granule to simulate something like a 4k page table on a 64K PAGE_SIZE ARM system - Contiguous page support for every PFN order - Dirty tracking AMDv1 is the closest format, as it is the only one that already supports every page size. Tweak it to have only 5 levels and an 11 bit base granule and compile it separately as a format variant. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:07:13 +01:00
Jason Gunthorpe	1dd4187f53	iommupt: Add a kunit test for Generic Page Table This intends to have high coverage of the page table format functions, it uses the IOMMU implementation to create a tree which it then walks through and directly calls the generic page table functions to test them. It is a good starting point to test a new format header as it is often able to find typos and inconsistencies much more directly, rather than with an obscure failure in the iommu implementation. The tests can be run with commands like: tools/testing/kunit/kunit.py run --build_dir build_kunit_arm64 --arch arm64 --make_options LLVM=-19 --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig tools/testing/kunit/kunit.py run --build_dir build_kunit_uml --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig --kconfig_add CONFIG_WERROR=n tools/testing/kunit/kunit.py run --build_dir build_kunit_x86_64 --arch x86_64 --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig tools/testing/kunit/kunit.py run --build_dir build_kunit_i386 --arch i386 --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig tools/testing/kunit/kunit.py run --build_dir build_kunit_i386pae --arch i386 --kunitconfig ./drivers/iommu/generic_pt/.kunitconfig --kconfig_add CONFIG_X86_PAE=y There are several interesting corner cases on the 32 bit platforms that need checking. The format can declare a list of configurations that generate different configurations to initialize the page table, for instance with different top levels or other parameters. The kunit will turn these into "params" which cause each test to run multiple times. The tests are repeated to run at every table level to check that all the item encoding formats work. 
The following are checked: - Basic init works for each configuration - The various log2 functions have the expected behavior at the limits - pt_compute_best_pgsize() works - pt_table_pa() reads back what pt_install_table() writes - range.max_vasz_lg2 works properly - pt_table_oa_lg2sz() and pt_table_item_lg2sz() use a contiguous non-overlapping set of bits from the VA up to the defined max_va - pt_possible_sizes() and pt_can_have_leaf() produce a sensible layout - pt_item_oa(), pt_entry_oa(), and pt_entry_num_contig_lg2() read back what pt_install_leaf_entry() writes - pt_clear_entry() works - pt_attr_from_entry() reads back what pt_iommu_set_prot() & pt_install_leaf_entry() writes - pt_entry_set_write_clean(), pt_entry_make_write_dirty(), and pt_entry_write_is_dirty() work Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:07:11 +01:00
Jason Gunthorpe	4a00f94348	iommupt: Add read_and_clear_dirty op IOMMU HW now supports updating a dirty bit in an entry when a DMA writes to the entry's VA range. iommufd has a uAPI to read and clear the dirty bits from the tables. This is a trivial recursive descent algorithm to read and optionally clear the dirty bits. The format needs a function to tell if a contiguous entry is dirty, and a function to clear a contiguous entry back to clean. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:07:11 +01:00
Jason Gunthorpe	dcd6a011a8	iommupt: Add map_pages op map is slightly complicated because it has to handle a number of special edge cases: - Overmapping a previously shared, but now empty, table level with an OA. Requires validating and freeing the possibly empty tables - Doing the above across an entire to-be-created contiguous entry - Installing a new shared table level concurrently with another thread - Expanding the table by adding more top levels Table expansion is a unique feature of AMDv1, this version is quite similar except we handle racing concurrent lockless map. The table top pointer and starting level are encoded in a single uintptr_t which ensures we can READ_ONCE() without tearing. Any op will do the READ_ONCE() and use that fixed point as its starting point. Concurrent expansion is handled with a table global spinlock. When inserting a new table entry map checks that the entire portion of the table is empty. This includes freeing any empty lower tables that will be overwritten by an OA. A separate free list is used while checking and collecting all the empty lower tables so that writing the new entry is uninterrupted, either the new entry fully writes or nothing changes. A special fast path for PAGE_SIZE is implemented that does a direct walk to the leaf level and installs a single entry. This gives ~15% improvement for iommu_map() when mapping lists of single pages. This version sits under the iommu_domain_ops as map_pages() but does not require the external page size calculation. The implementation is actually map_range() and can do arbitrary ranges, internally handling all the validation and supporting any arrangement of page sizes. A future series can optimize iommu_map() to take advantage of this. 
Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:07:10 +01:00
Jason Gunthorpe	7c53f4238a	iommupt: Add unmap_pages op unmap_pages removes mappings and any fully contained interior tables from the given range. This follows the now-standard iommu_domain API definition where it does not split up larger page sizes into smaller. The caller must perform unmap only on ranges created by map or it must have somehow otherwise determined safe cut points (eg iommufd/vfio use iova_to_phys to scan for them). A future work will provide 'cut' which explicitly does the page size split if the HW can support it. unmap is implemented with a recursive descent of the tree. If the caller provides a VA range that spans an entire table item then the table memory can be freed as well. If an entire table item can be freed then this version will also check the leaf-only level of the tree to ensure that all entries are present to generate -EINVAL. Many of the existing drivers don't do this extra check. This version sits under the iommu_domain_ops as unmap_pages() but does not require the external page size calculation. The implementation is actually unmap_range() and can do arbitrary ranges, internally handling all the validation and supporting any arrangement of page sizes. A future series can optimize __iommu_unmap() to take advantage of this. Freed page table memory is batched up in the gather and will be freed in the driver's iotlb_sync() callback after the IOTLB flush completes. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:07:10 +01:00
Jason Gunthorpe	9d4c274cd7	iommupt: Add iova_to_phys op iova_to_phys is a performance path for the DMA API and iommufd, implement it using an unrolled get_user_pages() like function waterfall scheme. The implementation itself is fairly trivial. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:07:09 +01:00
Jason Gunthorpe	879ced2bab	iommupt: Add the AMD IOMMU v1 page table format AMD IOMMU v1 is unique in supporting contiguous pages with a variable size and it can decode the full 64 bit VA space. Unlike other x86 page tables this explicitly does not do sign extension as part of allowing the entire 64 bit VA space to be supported. The general design is quite similar to the x86 PAE format, except with a 6th level and quite different PTE encoding. This format is the only one that uses the PT_FEAT_DYNAMIC_TOP feature in the existing code as the existing AMDv1 code starts out with a 3 level table and adds levels on the fly if more IOVA is needed. Comparing the performance of several operations to the existing version: iommu_map() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 65,64 , 62,61 , -1.01 2^13, 70,66 , 67,62 , -8.08 2^14, 73,69 , 71,65 , -9.09 2^15, 78,75 , 75,71 , -5.05 2^16, 89,89 , 86,84 , -2.02 2^17, 128,121 , 124,112 , -10.10 2^18, 175,175 , 170,163 , -4.04 2^19, 264,306 , 261,279 , 6.06 2^20, 444,525 , 438,489 , 10.10 2^21, 60,62 , 58,59 , 1.01 256*2^12, 381,1833 , 367,1795 , 79.79 256*2^21, 375,1623 , 356,1555 , 77.77 256*2^30, 356,1338 , 349,1277 , 72.72 iommu_unmap() pgsz ,avg new,old ns, min new,old ns , min % (+ve is better) 2^12, 76,89 , 71,86 , 17.17 2^13, 79,89 , 75,86 , 12.12 2^14, 78,90 , 74,86 , 13.13 2^15, 82,89 , 74,86 , 13.13 2^16, 79,89 , 74,86 , 13.13 2^17, 81,89 , 77,87 , 11.11 2^18, 90,92 , 87,89 , 2.02 2^19, 91,93 , 88,90 , 2.02 2^20, 96,95 , 91,92 , 1.01 2^21, 72,88 , 68,85 , 20.20 256*2^12, 372,6583 , 364,6251 , 94.94 256*2^21, 398,6032 , 392,5758 , 93.93 256*2^30, 396,5665 , 389,5258 , 92.92 The ~5-17x speedup when working with multi-PTE map/unmaps is because the AMD implementation rewalks the entire table on every new PTE while this version retains its position. The same speedup will be seen with dirtys as well. 
The old implementation triggers a compiler optimization that ends up generating a "rep stos" memset for contiguous PTEs. Since AMD can have contiguous PTEs that span 2Kbytes of table this is a huge win compared to a normal movq loop. It is why the unmap side has a fairly flat runtime as the contiguous PTE size increases. This version makes it explicit with a memset64() call. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Vasant Hegde <vasant.hegde@amd.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:07:08 +01:00
Jason Gunthorpe	cdb39d9185	iommupt: Add the basic structure of the iommu implementation The existing IOMMU page table implementations duplicate all of the working algorithms for each format. By using the generic page table API a single C version of the IOMMU algorithms can be created and re-used for all of the different formats used in the drivers. The implementation will provide a single C version of the iommu domain operations: iova_to_phys, map, unmap, and read_and_clear_dirty. Further, adding new algorithms and techniques becomes easy to do across the entire fleet of drivers and formats. The C functions are drop in compatible with the existing iommu_domain_ops using the IOMMU_PT_DOMAIN_OPS() macro. Each per-format implementation compilation unit will produce exported symbols following the pattern pt_iommu_FMT_map_pages() which the macro directly maps to the iommu_domain_ops members. This avoids the additional function pointer indirection like io-pgtable has. The top level struct used by the drivers is pt_iommu_table_FMT. It contains the other structs to allow container_of() to move between the driver, iommu page table, generic page table, and generic format layers. struct pt_iommu_table_amdv1 { struct pt_iommu { struct iommu_domain domain; } iommu; struct pt_amdv1 { struct pt_common common; } amdpt; }; The driver is expected to union the pt_iommu_table_FMT with its own existing domain struct: struct driver_domain { union { struct iommu_domain domain; struct pt_iommu_table_amdv1 amdv1; }; }; PT_IOMMU_CHECK_DOMAIN(struct driver_domain, amdv1, domain); To create an alias to avoid renaming 'domain' in a lot of driver code. This allows all the layers to access all the necessary functions to implement their different roles with no change to any of the existing iommu core code. Implement the basic starting point: pt_iommu_init(), get_info() and deinit(). 
Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:07:07 +01:00
Jason Gunthorpe	ab0b572847	genpt: Add Documentation/ files Add some general description and pull in the kdoc comments from the source file to index most of the useful functions. Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:07:07 +01:00
Jason Gunthorpe	7c5b184db7	genpt: Generic Page Table base API The generic API is intended to be separated from the implementation of page table algorithms. It contains only accessors for walking and manipulating the table and helpers that are useful for building an implementation. Memory management is not in the generic API, but part of the implementation. Using a multi-compilation approach the implementation module would include headers in this order: common.h defs_FMT.h pt_defs.h FMT.h pt_common.h IMPLEMENTATION.h Where each compilation unit would have a combination of FMT and IMPLEMENTATION to produce a per-format per-implementation module. The API is designed so that the format headers have minimal logic, and default implementations are provided if the format doesn't include one. Generally formats provide their code via an inline function using the pattern: static inline FMTpt_XX(..) {} #define pt_XX FMTpt_XX The common code then enforces a function signature so that there is no drift in function arguments, or accidental polymorphic functions (as has been slightly troublesome in mm). Use of function-like #defines is avoided in the format even though many of the functions are small enough. Provide kdocs for the API surface. This is enough to implement the 8 initial format variations with all of their features: * Entries comprised of contiguous blocks of IO PTEs for larger page sizes (AMDv1, ARMv8) * Multi-level tables, up to 6 levels. Runtime selected top level * The size of the top table level can be selected at runtime (ARM's concatenated tables) * The number of levels in the table can optionally increase dynamically during map (AMDv1) * Optional leaf entries at any level * 32 bit/64 bit virtual and output addresses, using every bit * Sign extended addressing (x86) * Dirty tracking A basic simple format takes about 200 lines to declare the required inline functions. 
Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Tested-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com> Tested-by: Pasha Tatashin <pasha.tatashin@soleen.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-11-05 09:07:04 +01:00
Nicolin Chen	fd714986e4	iommu: Pass in old domain to attach_dev callback functions The IOMMU core attaches each device to a default domain on probe(). Then, every new "attach" operation has a two-fold meaning: - detach from its currently attached (old) domain - attach to a given new domain Modern IOMMU drivers following this pattern usually want to clean up the things related to the old domain, so they call iommu_get_domain_for_dev() to fetch the old domain. Pass in the old domain pointer from the core to drivers, aligning with the set_dev_pasid op that does so already. Ensure all low-level attach functions in the core can forward the correct old domain pointer. Thus, rework those functions as well. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-10-27 13:55:35 +01:00
Nicolin Chen	2b33598e66	iommu: Do not revert set_domain for the last gdev The last gdev is the device that failed the __iommu_device_set_domain(). So, it doesn't need to be reverted, given it's attached to group->domain already. This is not a problem currently, since it's simply a re-attach. However, the core will need to pass in the old domain to __iommu_device_set_domain so the old domain pointers would be inconsistent between a failed device and all its prior succeeded devices, as all the prior devices need to be reverted. Avoid the re-attach for the last gdev, by breaking before the revert. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-10-27 13:55:35 +01:00
Nicolin Chen	c21b34762e	iommu/amd: Set release_domain to blocked_domain The set_dev_pasid for a release domain never gets called anyhow. So, there is no point in defining a separate release_domain from the blocked_domain. Simply reuse the blocked_domain. Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-10-27 13:55:35 +01:00
Nicolin Chen	680a6a60fc	iommu/exynos-iommu: Set release_domain to exynos_identity_domain Following a coming core change to pass in the old domain pointer into the attach_dev op and its callbacks, exynos_iommu_identity_attach() will need this new argument too, which the release_device op doesn't provide. Instead, the core provides a release_domain to attach to the device prior to invoking the release_device callback. Thus, simply use that. Acked-by: Marek Szyprowski <m.szyprowski@samsung.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-10-27 13:55:35 +01:00
Nicolin Chen	52f77fb176	iommu/arm-smmu-v3: Set release_domain to arm_smmu_blocked_domain Since the core now takes care of the require_direct case for the release domain, simply use that via the release_domain op. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-10-27 13:55:34 +01:00
Jason Gunthorpe	e94160488e	iommu: Generic support for RMRs during device release Generally an IOMMU driver should leave the translation as BLOCKED until the translation entry is probed onto a struct device. When the struct device is removed, the translation should be put back to BLOCKED. Drivers that are able to work like this can set their release_domain to the blocking domain, and the core code handles this work. The exception is when the device has an IOMMU_RESV_DIRECT region, in which case the OS should continuously allow translations for the given range. And the core code generally prevents using a BLOCKED domain with this device. Continue this logic for the device release and hoist some open coding from drivers. If the device has dev->iommu->require_direct and the driver uses a BLOCKED release_domain, override it to IDENTITY to preserve the semantics. The only remaining required driver code for IOMMU_RESV_DIRECT should preset an IDENTITY translation during early IOMMU startup for those devices. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-10-27 13:55:34 +01:00
Pedro Demarchi Gomes	db340b02b2	iommu/pages: use folio_nr_pages() instead of shift operation folio_nr_pages() is a faster helper function to get the number of pages when NR_PAGES_IN_LARGE_FOLIO is enabled. Signed-off-by: Pedro Demarchi Gomes <pedrodemargomes@gmail.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2025-10-27 12:46:57 +01:00
Linus Torvalds	dcb6fa37fd	Linux 6.18-rc3 v6.18-rc3	2025-10-26 15:59:49 -07:00
Linus Torvalds	4bb1f7e19c	Merge tag 'char-misc-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc driver fixes from Greg KH: "Here are some small char/misc/android driver fixes for 6.18-rc3 for reported issues. Included in here are: - rust binder fixes for reported issues - mei device id addition - mei driver fixes - comedi bugfix - most usb driver bugfixes - fastrpc memory leak fix All of these have been in linux-next for a while with no reported issues" * tag 'char-misc-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: most: usb: hdm_probe: Fix calling put_device() before device initialization most: usb: Fix use-after-free in hdm_disconnect binder: remove "invalid inc weak" check mei: txe: fix initialization order comedi: fix divide-by-zero in comedi_buf_munge() mei: late_bind: Fix -Wincompatible-function-pointer-types-strict misc: fastrpc: Fix dma_buf object leak in fastrpc_map_lookup mei: me: add wildcat lake P DID misc: amd-sbi: Clarify that this is a BMC driver nvmem: rcar-efuse: add missing MODULE_DEVICE_TABLE binder: Fix missing kernel-doc entries in binder.c rust_binder: report freeze notification only when fully frozen rust_binder: don't delete FreezeListener if there are pending duplicates rust_binder: freeze_notif_done should resend if wrong state rust_binder: remove warning about orphan mappings rust_binder: clean `clippy::mem_replace_with_default` warning	2025-10-26 10:33:46 -07:00
Linus Torvalds	40282418e1	Merge tag 'staging-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver fixes from Greg KH: "Here are some small staging driver fixes for the gpib subsystem to resolve some reported issues. Included in here are: - memory leak fixes - error code fixes - proper protocol fixes All of these have been in linux-next for almost 2 weeks now with no reported issues" * tag 'staging-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: gpib: Fix device reference leak in fmh_gpib driver staging: gpib: Return -EINTR on device clear staging: gpib: Fix sending clear and trigger events staging: gpib: Fix no EOI on 1 and 2 byte writes	2025-10-26 10:29:45 -07:00
Linus Torvalds	aa6085a067	Merge tag 'tty-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial driver fixes from Greg KH: "Here are some small tty and serial driver fixes for reported issues. Included in here are: - sh-sci serial driver fixes - 8250_dw and _mtk driver fixes - sc16is7xx driver bugfix - new 8250_exar device ids added All of these have been in linux-next this past week with no reported issues" * tag 'tty-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: serial: 8250_mtk: Enable baud clock and manage in runtime PM serial: 8250_dw: handle reset control deassert error dt-bindings: serial: sh-sci: Fix r8a78000 interrupts serial: sc16is7xx: remove useless enable of enhanced features serial: 8250_exar: add support for Advantech 2 port card with Device ID 0x0018 tty: serial: sh-sci: fix RSCI FIFO overrun handling	2025-10-26 10:24:39 -07:00
Linus Torvalds	6190d0fa18	Merge tag 'usb-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB driver fixes from Greg KH: "Here are some small USB driver fixes and new device ids for 6.18-rc3. Included in here are: - new option serial driver device ids added - dt bindings fixes for numerous platforms - xhci bugfixes for many reported regressions - usbio dependency bugfix - dwc3 driver fix - raw-gadget bugfix All of these have been in linux-next this week with no reported issues" * tag 'usb-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: serial: option: add Telit FN920C04 ECM compositions USB: serial: option: add Quectel RG255C tcpm: switch check for role_sw device with fw_node usb/core/quirks: Add Huawei ME906S to wakeup quirk usb: raw-gadget: do not limit transfer length USB: serial: option: add UNISOC UIS7720 xhci: dbc: enable back DbC in resume if it was enabled before suspend xhci: dbc: fix bogus 1024 byte prefix if ttyDBC read races with stall event usb: xhci-pci: Fix USB2-only root hub registration dt-bindings: usb: qcom,snps-dwc3: Fix bindings for X1E80100 usb: misc: Add x86 dependency for Intel USBIO driver dt-bindings: usb: switch: split out ports definition usb: dwc3: Don't call clk_bulk_disable_unprepare() twice dt-bindings: usb: dwc3-imx8mp: dma-range is required only for imx8mp	2025-10-26 10:21:13 -07:00
Linus Torvalds	dbfc6422a3	Merge tag 'x86_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Remove dead code leftovers after a recent mitigations cleanup which fail a Clang build - Make sure a Retbleed mitigation message is printed only when necessary - Correct the last Zen1 microcode revision for which Entrysign sha256 check is needed - Fix a NULL ptr deref when mounting the resctrl fs on a system which supports assignable counters but where L3 total and local bandwidth monitoring has been disabled at boot * tag 'x86_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/bugs: Remove dead code which might prevent from building x86/bugs: Qualify RETBLEED_INTEL_MSG x86/microcode: Fix Entrysign revision check for Zen1/Naples x86,fs/resctrl: Fix NULL pointer dereference with events force-disabled in mbm_event mode	2025-10-26 09:57:18 -07:00
Linus Torvalds	5fee0dafba	Merge tag 'irq_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq fixes from Borislav Petkov: - Restore the original buslock locking in a couple of places in the irq core subsystem after a rework * tag 'irq_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/manage: Add buslock back in to enable_irq() genirq/manage: Add buslock back in to __disable_irq_nosync() genirq/chip: Add buslock back in to irq_set_handler()	2025-10-26 09:54:36 -07:00
Linus Torvalds	af8159515f	Merge tag 'objtool_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fixes from Borislav Petkov: - Fix x32 build due to wrong format specifier on that sub-arch - Add one more Rust noreturn function to objtool's list * tag 'objtool_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Fix failure when being compiled on x32 system objtool/rust: add one more `noreturn` Rust function	2025-10-26 09:44:36 -07:00
Linus Torvalds	1bc9743b64	Merge tag 'sched_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fix from Borislav Petkov: - Make sure a CFS runqueue on a throttled hierarchy has its PELT clock throttled otherwise task movement and manipulation would lead to dangling cfs_rq references and an eventual crash * tag 'sched_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Start a cfs_rq on throttled hierarchy with PELT clock throttled	2025-10-26 09:42:19 -07:00
Linus Torvalds	7ea5092f52	Merge tag 'timers_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Borislav Petkov: - Do not create more than eight (max supported) AUX clocks sysfs hierarchies * tag 'timers_urgent_for_v6.18_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timekeeping: Fix aux clocks sysfs initialization loop bound	2025-10-26 09:40:16 -07:00
Linus Torvalds	72761a7e31	Merge tag 'driver-core-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core Pull driver core fixes from Danilo Krummrich: - In Device::parent(), do not make any assumptions on the device context of the parent device - Check visibility before changing ownership of a sysfs attribute group - In topology_parse_cpu_capacity(), replace an incorrect usage of PTR_ERR_OR_ZERO() with IS_ERR_OR_NULL() - In devcoredump, fix a circular locking dependency between struct devcd_entry::mutex and kernfs - Do not warn about a pending fw_devlink sync state * tag 'driver-core-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: arch_topology: Fix incorrect error check in topology_parse_cpu_capacity() rust: device: fix device context of Device::parent() sysfs: check visibility before changing group attribute ownership devcoredump: Fix circular locking dependency with devcd->mutex. driver core: fw_devlink: Don't warn about sync_state() pending	2025-10-25 11:03:46 -07:00
Linus Torvalds	818444a61b	Merge tag 'firewire-fixes-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fixes from Takashi Sakamoto: "A small collection of FireWire fixes. This includes corrections to sparse and API documentation" * tag 'firewire-fixes-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: init_ohci1394_dma: add missing function parameter documentation firewire: core: fix __must_hold() annotation	2025-10-25 10:58:32 -07:00
Linus Torvalds	9bb956508c	Merge tag 'riscv-for-linus-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V fixes from Paul Walmsley: - Close a race during boot between userspace vDSO usage and some late-initialized vDSO data - Improve performance on systems with non-CPU-cache-coherent DMA-capable peripherals by enabling write combining on pgprot_dmacoherent() allocations - Add human-readable detail for RISC-V IPI tracing - Provide more information to zsmalloc on 64-bit RISC-V to improve allocation - Silence useless boot messages about CPUs that have been disabled in DT - Resolve some compiler and smatch warnings and remove a redundant macro * tag 'riscv-for-linus-6.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: riscv: hwprobe: avoid uninitialized variable use in hwprobe_arch_id() riscv: cpufeature: avoid uninitialized variable in has_thead_homogeneous_vlenb() riscv: hwprobe: Fix stale vDSO data for late-initialized keys at boot riscv: add a forward declaration for cpuinfo_op RISC-V: Don't print details of CPUs disabled in DT riscv: Remove the PER_CPU_OFFSET_SHIFT macro riscv: mm: Define MAX_POSSIBLE_PHYSMEM_BITS for zsmalloc riscv: Register IPI IRQs with unique names ACPI: RIMT: Fix unused function warnings when CONFIG_IOMMU_API is disabled RISC-V: Define pgprot_dmacoherent() for non-coherent devices	2025-10-25 09:35:26 -07:00
Linus Torvalds	27c0b5c4f6	Merge tag 'xfs-fixes-6.18-rc3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux Pull xfs fixes from Carlos Maiolino: "The main highlight here is a fix for a bug brought in by the removal of attr2 mount option, where some installations might actually have 'attr2' explicitly configured in fstab preventing system to boot by not being able to remount the rootfs as RW. Besides that there are a couple fix to the zonefs implementation, changing XFS_ONLINE_SCRUB_STATS to depend on DEBUG_FS (was select before), and some other minor changes" * tag 'xfs-fixes-6.18-rc3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix locking in xchk_nlinks_collect_dir xfs: loudly complain about defunct mount options xfs: always warn about deprecated mount options xfs: don't set bt_nr_sectors to a negative number xfs: don't use __GFP_NOFAIL in xfs_init_fs_context xfs: cache open zone in inode->i_private xfs: avoid busy loops in GCD xfs: XFS_ONLINE_SCRUB_STATS should depend on DEBUG_FS xfs: do not tightly pack-write large files xfs: Improve CONFIG_XFS_RT Kconfig help	2025-10-25 09:31:13 -07:00
Linus Torvalds	566771afc7	Merge tag 'v6.18-rc2-smb-server-fixes' of git://git.samba.org/ksmbd Pull smb server fixes from Steve French: "smbdirect (RDMA) fixes in order to avoid potential submission queue overflows: - free transport teardown fix - credit related fixes (five server related, one client related)" * tag 'v6.18-rc2-smb-server-fixes' of git://git.samba.org/ksmbd: smb: server: let free_transport() wait for SMBDIRECT_SOCKET_DISCONNECTED smb: client: make use of smbdirect_socket.send_io.lcredits.* smb: server: make use of smbdirect_socket.send_io.lcredits.* smb: server: simplify sibling_list handling in smb_direct_flush_send_list/send_done smb: server: smb_direct_disconnect_rdma_connection() already wakes all waiters on error smb: smbdirect: introduce smbdirect_socket.send_io.lcredits.* smb: server: allocate enough space for RW WRs and ib_drain_qp()	2025-10-24 18:50:15 -07:00
Andy Shevchenko	53abe3e1c1	sched: Remove never used code in mm_cid_get() Clang is not happy with set but unused variable (this is visible with `make W=1` build: kernel/sched/sched.h:3744:18: error: variable 'cpumask' set but not used [-Werror,-Wunused-but-set-variable] It seems like the variable was never used along with the assignment that does not have side effects as far as I can see. Remove those altogether. Fixes: `223baf9d17` ("sched: Fix performance regression introduced by mm_cid") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Tested-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Breno Leitao <leitao@debian.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2025-10-24 16:55:46 -07:00
Linus Torvalds	3d08a425d2	Merge tag 'drm-fixes-2025-10-24' of https://gitlab.freedesktop.org/drm/kernel Pull drm fixes from Simona Vetter: "Very quiet, all just small stuff and nothing scary pending to my knowledge: - drm_panic: bunch of size calculation fixes - panthor: fix kernel panic on partial gpu va unmap - rockchip: hdmi hotplug setup fix - amdgpu: dp mst, dc/display fixes - i915: fix panic structure leak - xe: madvise uapi fix, wq alloc error, vma flag handling fix" * tag 'drm-fixes-2025-10-24' of https://gitlab.freedesktop.org/drm/kernel: drm/xe: Check return value of GGTT workqueue allocation drm/amd/display: use GFP_NOWAIT for allocation in interrupt handler drm/amd/display: increase max link count and fix link->enc NULL pointer access drm/amd/display: Fix NULL pointer dereference drm/panic: Fix 24bit pixel crossing page boundaries drm/panic: Fix divide by 0 if the screen width < font width drm/panic: Fix kmsg text drawing rectangle drm/panic: Fix qr_code, ensure vmargin is positive drm/panic: Fix overlap between qr code and logo drm/panic: Fix drawing the logo on a small narrow screen drm/xe/uapi: Hide the madvise autoreset behind a VM_BIND flag drm/xe: Retain vma flags when recreating and splitting vmas for madvise drm/i915/panic: fix panic structure allocation memory leak drm/panthor: Fix kernel panic on partial unmap of a GPU VA region drm/rockchip: dw_hdmi: use correct SCLIN mask for RK3228	2025-10-24 16:49:16 -07:00
Linus Torvalds	31009296f8	Merge tag 'pci-v6.18-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Add DWC custom pci_ops for the root bus instead of overwriting the DBI base address, which broke drivers that rely on the DBI address for iATU programming; fixes an FU740 probe regression (Krishna Chaitanya Chundru) - Revert qcom ECAM enablement, which is rendered unnecessary by the DWC custom pci_ops (Krishna Chaitanya Chundru) - Fix longstanding MIPS Malta resource registration issues to avoid exposing them when the next commit fixes the boot failure (Maciej W. Rozycki) - Use pcibios_align_resource() on MIPS Malta to fix boot failure caused by using the generic pci_enable_resources() (Ilpo Järvinen) - Enable only ASPM L0s and L1, not L1 PM Substates, for devicetree platforms because we lack information required to configure L1 Substates; fixes regressions on powerpc and rockchip. A qcom regression (L1 Substates no longer enabled) remains and will be addressed next (Bjorn Helgaas) * tag 'pci-v6.18-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI/ASPM: Enable only L0s and L1 for devicetree platforms MIPS: Malta: Use pcibios_align_resource() to block io range MIPS: Malta: Fix PCI southbridge legacy resource reservations MIPS: Malta: Fix keyboard resource preventing i8042 driver from registering Revert "PCI: qcom: Prepare for the DWC ECAM enablement" PCI: dwc: Use custom pci_ops for root bus DBI vs ECAM config access	2025-10-24 16:43:08 -07:00
Nirbhay Sharma	73ba88fb04	firewire: init_ohci1394_dma: add missing function parameter documentation Add missing kernel-doc parameter descriptions for five functions in init_ohci1394_dma.c to fix documentation warnings when building with W=1. This patch addresses the following warnings: - init_ohci1394_wait_for_busresets: missing @ohci description - init_ohci1394_enable_physical_dma: missing @ohci description - init_ohci1394_reset_and_init_dma: missing @ohci description - init_ohci1394_controller: missing @num, @slot, @func descriptions - setup_ohci1394_dma: missing @opt description Tested with GCC 13.2.0 and W=1 flag. All documentation warnings for these functions have been resolved. Signed-off-by: Nirbhay Sharma <nirbhay.lkd@gmail.com> Link: https://lore.kernel.org/r/20251024203219.101990-2-nirbhay.lkd@gmail.com Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>	2025-10-25 08:29:56 +09:00
Linus Torvalds	7083bb6060	Merge tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull crypto library fix from Eric Biggers: "Avoid some false-positive KMSAN warnings by restoring the dependency of the architecture-optimized Poly1305 code on !KMSAN" * tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: lib/crypto: poly1305: Restore dependency of arch code on !KMSAN	2025-10-24 15:51:24 -07:00
Linus Torvalds	f2b2465726	Merge tag '6.18-rc2-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6 Pull smb client fixes from Steve French: - add missing tracepoints - smbdirect (RDMA) fix - fix potential issue with credits underflow - rename fix - improvement to calc_signature and additional cleanup patch * tag '6.18-rc2-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: #include cifsglob.h before trace.h to allow structs in tracepoints cifs: Call the calc_signature functions directly smb: client: get rid of d_drop() in cifs_do_rename() cifs: Fix TCP_Server_Info::credits to be signed cifs: Add a couple of missing smb3_rw_credits tracepoints smb: client: allocate enough space for MR WRs and ib_drain_qp()	2025-10-24 15:48:08 -07:00
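
The release-time rule described in the e94160488e entry ("iommu: Generic support for RMRs during device release") can be sketched in isolation. This is only an illustrative model, not the kernel's code: the types `domain_type`, `domain`, `device_info` and the function `pick_release_domain()` are hypothetical stand-ins, and `require_direct` merely mirrors the `dev->iommu->require_direct` flag named in the commit message. It shows the one decision the message states: a BLOCKED release domain is overridden to IDENTITY when the device requires direct mappings.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel's structures. */
enum domain_type { DOMAIN_BLOCKED, DOMAIN_IDENTITY, DOMAIN_OTHER };

struct domain {
	enum domain_type type;
};

struct device_info {
	bool require_direct; /* models dev->iommu->require_direct */
};

/*
 * Choose the domain to attach just before the device is released.
 * If the driver asked for a BLOCKED release domain but the device has
 * a direct-mapped reserved region, fall back to IDENTITY so that
 * translation for that range keeps working.
 */
static const struct domain *
pick_release_domain(const struct device_info *dev,
		    const struct domain *driver_release_domain,
		    const struct domain *identity_domain)
{
	if (dev->require_direct && driver_release_domain &&
	    driver_release_domain->type == DOMAIN_BLOCKED)
		return identity_domain;

	return driver_release_domain;
}
```

Under these assumptions, a device with `require_direct` set and a BLOCKED release domain gets the identity domain back, while any other combination returns the driver's choice unchanged.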


1397369 Commits