linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-16 04:21:09 -04:00

Author	SHA1	Message	Date
Tomasz Jeznach	1bb54043ff	MAINTAINERS: update Tomasz Jeznach's email address Switch from the previous work address to a linux.dev account, as the work address is no longer actively monitored. Signed-off-by: Tomasz Jeznach <tomasz.jeznach@linux.dev> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-15 07:30:28 +02:00
Jason Gunthorpe	58829512ad	iommupt: Fix the end_index calculation in __map_range_leaf() Sashiko noticed a mismatch of units in this math: num_leaves is actually the number of leaf entries (so a 16-item contiguous leaf is one num_leaves), while index is in items. The mismatch in the math causes __map_range_leaf() to exit early instead of efficiently filling a larger range of contiguous PTEs. The early exit is caught by the functions above and then __map_range_leaf() is re-invoked, so there is no functional issue. Correct the misuse of units by adjusting num_leaves with the leaf size and avoid the performance cost of looping externally. There are also some mismatched types for num_leaves; simplify things to remove the duplicated calculations. Fixes: `d6c65b0fd6` ("iommupt: Avoid rewalking during map") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Tested-by: Josua Mayer <josua@solid-run.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-15 07:29:16 +02:00
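The unit mismatch described in the commit above can be illustrated with a small worked example. This is only a sketch: `leaf_end_index()` and `items_per_leaf` are hypothetical names, not the kernel's, and the real code may express the leaf size differently.

```c
#include <assert.h>

/*
 * Hypothetical helper, not the kernel function: index counts items,
 * while num_leaves counts leaf entries that each span items_per_leaf items.
 */
unsigned int leaf_end_index(unsigned int index, unsigned int num_leaves,
			    unsigned int items_per_leaf)
{
	assert(items_per_leaf > 0);
	/* Convert leaf entries to items before adding to an item index. */
	return index + num_leaves * items_per_leaf;
}
```

With two 16-item contiguous leaves starting at index 0, adding num_leaves directly would give an end index of 2, while the unit-corrected sum gives 32.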
Jason Gunthorpe	8ef3f77c44	iommupt: Check for missing PAGE_SIZE in the pgsize_bitmap Sashiko pointed out that the driver could drop PAGE_SIZE from the pgsize_bitmap. That is technically allowed but nothing does it, and such an iommu_domain would not be used with the DMA API today. Still, it is against the design and it is trivial to fix up. Lift the PT_WARN_ON to the if branch and just skip the fast path. Fixes: `dcd6a011a8` ("iommupt: Add map_pages op") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Tested-by: Josua Mayer <josua@solid-run.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-15 07:29:16 +02:00
Jason Gunthorpe	0735c54804	iommu: Handle unmap error when iommu_debug is enabled Sashiko noticed a latent bug where the map error flow called iommu_unmap(), which calls iommu_debug_unmap_begin()/iommu_debug_unmap_end(). However, since this is an error path, the map flow never actually established the original iommu_debug_map(), so it will malfunction. Lift the unmap error handling into iommu_map_nosync() and reorder it so the trace_map()/iommu_debug_map() records the partial mapping and then immediately unmaps it. This avoids creating the unbalanced tracking and provides saner tracing instead of an unmap unmatched to any map. Fixes: `ccc21213f0` ("iommu: Add calls for IOMMU_DEBUG_PAGEALLOC") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Tested-by: Josua Mayer <josua@solid-run.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-15 07:29:16 +02:00
Jason Gunthorpe	b948a87228	iommu: Fix up map/unmap debugging for iommupt domains Sashiko noticed a few issues in this path, and a few more were found on review. Tidy them up further. These are intertwined because the debug code depends on some of the WARN_ONs to function right: Lift into iommu_map_nosync(): - The might_sleep_if() - 0 pgsize_bitmap WARN_ON - Promote the illegal domain->type to a WARN_ON - WARN_ON for illegal gfp flags Then remove the return 0 since it is now safe to call iommu_debug_map(). Lift into __iommu_unmap(): - 0 pgsize_bitmap WARN_ON - Promote the illegal domain->type to a WARN_ON - iommu_debug_unmap_begin() This now pairs with the unconditional iommu_debug_map() on the mapping side. Thus iommu debugging now works for iommupt along with some of the other debugging features. Fixes: `99fb8afa16` ("iommupt: Directly call iommupt's unmap_range()") Fixes: `d6c65b0fd6` ("iommupt: Avoid rewalking during map") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Tested-by: Josua Mayer <josua@solid-run.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-15 07:29:15 +02:00
Jason Gunthorpe	6fc7e8a3b8	iommu: Fix loss of errno on map failure for classic ops A typo, likely from a rebase, inverted the condition and caused errors to be lost. Fix it to be "if (ret)". This was breaking iommu_create_device_direct_mappings() on drivers that don't use iommupt and don't fully set up their domain in alloc_pages() (i.e., SMMUv2). In this case the first call of iommu_create_device_direct_mappings() should fail due to the incompletely initialized domain. Since it wrongly returns success, the second call to iommu_create_device_direct_mappings() doesn't happen and IOMMU_RESV_DIRECT is never set up. Cc: stable@vger.kernel.org Fixes: `d6c65b0fd6` ("iommupt: Avoid rewalking during map") Reported-by: Josua Mayer <josua@solid-run.com> Closes: https://lore.kernel.org/all/321c2e57-6a17-4aef-ba42-d2ebd577e472@solid-run.com/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Pranjal Shrivastava <praan@google.com> Reviewed-by: Samiullah Khawaja <skhawaja@google.com> Reviewed-by: Mostafa Saleh <smostafa@google.com> Tested-by: Josua Mayer <josua@solid-run.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-15 07:29:15 +02:00
Zhenzhong Duan	79ea2feb91	iommu/vt-d: Avoid NULL pointer dereference or refcount corruption Commit `60f030f741` ("iommu/vt-d: Avoid use of NULL after WARN_ON_ONCE") only partly fixed a NULL pointer dereference in an unlikely situation. If dev_pasid is not found in the dev_pasids list, it remains NULL. However, the teardown operations are executed unconditionally, which leads to a NULL pointer dereference or refcount corruption. If the domain was never attached to this IOMMU, info will be NULL, which would cause an immediate dereference when checking --info->refcnt. Even if info is not NULL, decrementing the refcount without having removed a valid PASID might unbalance the count. This could lead to premature dropping of the refcount to 0, potentially causing a use-after-free for the remaining active devices sharing the domain. Fix it by returning early if dev_pasid is NULL, before executing the teardown operations. Issue found by AI review and suggested by Kevin Tian. https://sashiko.dev/#/patchset/20260421031347.1408890-1-zhenzhong.duan%40intel.com Fixes: `60f030f741` ("iommu/vt-d: Avoid use of NULL after WARN_ON_ONCE") Cc: stable@vger.kernel.org Suggested-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20260422033538.95000-1-zhenzhong.duan@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-11 10:19:37 +02:00
Zhenzhong Duan	a6dea58d86	iommu/vt-d: Fix oops due to out of scope access The oops below triggers when killing the QEMU process: Oops: general protection fault, probably for non-canonical address 0x7fffffff844eaaa7: 0000 [#1] SMP NOPTI Call Trace: <TASK> do_raw_spin_lock+0xaa/0xc0 _raw_spin_lock_irqsave+0x21/0x40 domain_remove_dev_pasid+0x52/0x160 intel_nested_set_dev_pasid+0x1b9/0x1e0 __iommu_set_group_pasid+0x56/0x120 pci_dev_reset_iommu_done+0xe3/0x180 pcie_flr+0x65/0x160 __pci_reset_function_locked+0x5b/0x120 vfio_pci_core_close_device+0x63/0xe0 [vfio_pci_core] vfio_df_close+0x4f/0xa0 vfio_df_unbind_iommufd+0x2d/0x60 vfio_device_fops_release+0x3e/0x40 __fput+0xe5/0x2c0 task_work_run+0x58/0xa0 do_exit+0x2c8/0x600 do_group_exit+0x2f/0xa0 get_signal+0x863/0x8c0 arch_do_signal_or_restart+0x24/0x100 exit_to_user_mode_loop+0x87/0x380 do_syscall_64+0x2ff/0x11e0 entry_SYSCALL_64_after_hwframe+0x76/0x7e The global static blocked domain is a dummy domain without a corresponding dmar_domain structure, so accessing beyond the iommu_domain structure easily triggers an oops. Fix it by returning early in domain_remove_dev_pasid(), as is done for the identity domain. Fixes: `7d0c9da6c1` ("iommu/vt-d: Add set_dev_pasid callback for dma domain") Cc: stable@vger.kernel.org Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/r/20260421031347.1408890-1-zhenzhong.duan@intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-11 10:19:36 +02:00
Naval Alcalá	2cda2e10dc	iommu/vt-d: Disable DMAR for Intel Q35 IGFX Intel Q35 integrated graphics (8086:29b2) exhibits broken DMAR behaviour similar to other G4x/GM45 devices for which DMAR is already disabled via quirks. When DMAR is enabled, the system may hard lock up during boot or early device initialization, requiring a reset. Add the missing PCI ID to the existing quirk list to disable DMAR for this device. Fixes: `1f76249cc3` ("iommu/vt-d: Declare Broadwell igfx dmar support snafu") Cc: stable@vger.kernel.org Closes: https://bugzilla.kernel.org/show_bug.cgi?id=201185 Closes: https://bugzilla.kernel.org/show_bug.cgi?id=216064 Signed-off-by: Naval Alcalá <ari@naval.cat> Link: https://lore.kernel.org/r/20260410161622.13549-1-ari@naval.cat Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-11 10:19:36 +02:00
Nicolin Chen	15dd29ca62	iommu: Warn on premature unblock during DMA aliased sibling reset When two aliased siblings are in the same iommu_group, they might share the same RID. The reset functions don't support this case, though it is unclear whether there is a real case of having an ATS capable device on a PCI/PCI-X bus. Theoretically, however, if two aliased devices are resetting concurrently, one might be unblocked prematurely in the middle of the reset by the other sibling who completes the reset first. This isn't a regression from this series but it's better to spit a warning, so we can know if such use case is common enough for us to make subsequent patches for its coverage. Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-11 10:12:45 +02:00
Nicolin Chen	5474e6e17a	iommu: Fix WARN_ON in __iommu_group_set_domain_nofail() due to reset In __iommu_group_set_domain_internal(), concurrent domain attachments are rejected when any device in the group is recovering. This is necessary to fence concurrent attachments to a multi-device group where devices might share the same RID due to PCI DMA alias quirks, but triggers the WARN_ON in __iommu_group_set_domain_nofail(). Other IOMMU_SET_DOMAIN_MUST_SUCCEED callers in detach/teardown paths, such as __iommu_group_set_core_domain and __iommu_release_dma_ownership, should not be rejected, as the domain would be freed anyway in these nofail paths while group->domain is still pointing to it. So pci_dev_reset_iommu_done() could trigger a UAF when re-attaching group->domain. Honor the IOMMU_SET_DOMAIN_MUST_SUCCEED flag, allowing the callers through the group->recovery_cnt fence, so as to update the group->domain pointer. Instead add a gdev->blocked check in the device iteration loop, to prevent any concurrent per-device detachment. Fixes: `c279e83953` ("iommu: Introduce pci_dev_reset_iommu_prepare/done()") Cc: stable@vger.kernel.org Closes: https://sashiko.dev/#/patchset/20260407194644.171304-1-nicolinc%40nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-11 10:12:44 +02:00
Nicolin Chen	fc3523b16d	iommu: Fix ATS invalidation timeouts during __iommu_remove_group_pasid() If a device is blocked, its PASID domains are already detached. Repeating iommu_remove_dev_pasid() is unnecessary and might trigger ATS invalidation timeouts. Skip the iommu_remove_dev_pasid() call upon gdev->blocked. Fixes: `c279e83953` ("iommu: Introduce pci_dev_reset_iommu_prepare/done()") Cc: stable@vger.kernel.org Closes: https://sashiko.dev/#/patchset/20260407194644.171304-1-nicolinc%40nvidia.com Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-11 10:12:44 +02:00
Nicolin Chen	0d5fd7a932	iommu: Fix nested pci_dev_reset_iommu_prepare/done() Shuai found that cxl_reset_bus_function() calls pci_reset_bus_function() internally while both are calling pci_dev_reset_iommu_prepare/done(). As pci_dev_reset_iommu_prepare() doesn't support re-entry, the inner call will trigger a WARN_ON and return -EBUSY, resulting in failing the entire device reset. On the other hand, removing the outer calls in the PCI callers is unsafe. As pointed out by Kevin, device-specific quirks like reset_hinic_vf_dev() execute custom firmware waits after their inner pcie_flr() completes. If the IOMMU protection relies solely on the inner reset, the IOMMU will be unblocked prematurely while the device is still resetting. Instead, fix this by making pci_dev_reset_iommu_prepare/done() reentrant. Introduce gdev->reset_depth to handle the re-entries on the same device. Fixes: `c279e83953` ("iommu: Introduce pci_dev_reset_iommu_prepare/done()") Cc: stable@vger.kernel.org Reported-by: Shuai Xue <xueshuai@linux.alibaba.com> Closes: https://lore.kernel.org/all/absKsk7qQOwzhpzv@Asurada-Nvidia/ Suggested-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-11 10:12:44 +02:00
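The re-entrancy scheme described in the commit above can be sketched with a depth counter. This is a minimal userspace illustration under stated assumptions: `struct gdev_stub`, `reset_prepare()` and `reset_done()` are hypothetical stand-ins, and the real helpers do considerably more (locking, domain attach and detach).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device state; it only loosely mirrors the commit text. */
struct gdev_stub {
	unsigned int reset_depth; /* prepare() calls not yet matched by done() */
	bool blocked;             /* true while the device is blocked for reset */
};

/* Block on the outermost prepare only; nested calls just bump the depth. */
void reset_prepare(struct gdev_stub *g)
{
	if (g->reset_depth++ == 0)
		g->blocked = true;
}

/* Unblock on the outermost done only, so an inner reset cannot unblock early. */
void reset_done(struct gdev_stub *g)
{
	assert(g->reset_depth > 0); /* done() without prepare() is a caller bug */
	if (--g->reset_depth == 0)
		g->blocked = false;
}
```

In this sketch a nested prepare/done pair leaves the device blocked until the outer done runs, which is the property the commit message asks for when a quirk waits on firmware after its inner pcie_flr().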
Nicolin Chen	1615e8896a	iommu: Fix pasid attach in pci_dev_reset_iommu_prepare/done() Now the helpers handle per-gdev resets. Replace __iommu_set_group_pasid() with set_dev_pasid() accordingly, in the pci_dev_reset_iommu_done(). Also add max_pasids check as other callers. Fixes: `c279e83953` ("iommu: Introduce pci_dev_reset_iommu_prepare/done()") Cc: stable@vger.kernel.org Reported-by: Shuai Xue <xueshuai@linux.alibaba.com> Closes: https://lore.kernel.org/all/ad858513-09fc-455e-bbc5-fe38a225cc78@linux.alibaba.com/ Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-11 10:12:44 +02:00
Nicolin Chen	b296ca1fb4	iommu: Replace per-group resetting_domain with per-gdev blocked flag The core tracks device resetting states with a per-group resetting_domain, while a reset is actually per group-device. Such a mismatch might lead to confusion and even difficulty to untangle per-gdev handling requirement. Shuai found that cxl_reset_bus_function() calls pci_reset_bus_function() internally while both are calling pci_dev_reset_iommu_prepare/done(). And the solution requires the core to track at the group_device level as well. Introduce a 'blocked' flag to struct group_device, to allow a multi-device group to isolate concurrent device resets independently. As the reset routine is per gdev, it cannot clear group->resetting_domain without iterating over the device list to ensure no other device is being reset. Simplify it by replacing the resetting_domain with a 'recovery_cnt' in the struct iommu_group. No functional change. But this is essential to apply following bug fixes. Fixes: `c279e83953` ("iommu: Introduce pci_dev_reset_iommu_prepare/done()") Cc: stable@vger.kernel.org Reported-by: Shuai Xue <xueshuai@linux.alibaba.com> Closes: https://lore.kernel.org/all/absKsk7qQOwzhpzv@Asurada-Nvidia/ Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-11 10:12:43 +02:00
Nicolin Chen	834ab85aa9	iommu: Fix kdocs of pci_dev_reset_iommu_done() Remove the duplicated word. No functional change. Fixes: `c279e83953` ("iommu: Introduce pci_dev_reset_iommu_prepare/done()") Reviewed-by: Shuai Xue <xueshuai@linux.alibaba.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-11 10:12:43 +02:00
Nicolin Chen	d769711fcd	iommu: Fix NULL group->domain dereference in pci_dev_reset_iommu_done() A local sashiko review pointed out that group->domain could be NULL when a default domain fails to allocate during the first probe, which can crash at the domain->ops->attach_dev dereference in __iommu_attach_device() invoked by pci_dev_reset_iommu_done(). pci_dev_reset_iommu_prepare() is fine as an old_domain pointer can be NULL. Skip the re-attach in pci_dev_reset_iommu_done() to fix the bug. Fixes: `c279e83953` ("iommu: Introduce pci_dev_reset_iommu_prepare/done()") Cc: stable@vger.kernel.org Signed-off-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-11 10:12:43 +02:00
Jose Fernandez (Anthropic)	07d0f496fe	iommu/amd: Bounds-check devid in __rlookup_amd_iommu() iommu_device_register() walks every device on the PCI bus via bus_for_each_dev() and calls amd_iommu_probe_device() for each. The inlined check_device() path computes the device's sbdf, calls rlookup_amd_iommu() to find the owning IOMMU, and only afterwards verifies devid <= pci_seg->last_bdf. __rlookup_amd_iommu() indexes rlookup_table[devid] with no bounds check of its own, so for a PCI device whose BDF is not described by the IVRS, the lookup reads past the end of the allocation before the caller's bounds check can run. This was harmless before commit `e874c666b1` ("iommu/amd: Change rlookup, irq_lookup, and alias to use kvalloc()"): the table was a zeroed page-order allocation, so the over-read returned NULL and the caller's NULL check skipped the device. After that commit the table is a tight kvcalloc() and the over-read returns adjacent slab contents, which check_device() then dereferences as a struct amd_iommu *, causing a boot-time GPF. 
Seen on Google Compute Engine ct6e VMs, where the virtualized IVRS describes only the four TPU endpoints 00:04.0-07.0; the gVNIC at 00:08.0 (devid 0x40) indexes 56 bytes past the 456-byte allocation, into the adjacent kmalloc-512 slab object: pci 0000:00:04.0: Adding to iommu group 0 pci 0000:00:05.0: Adding to iommu group 1 pci 0000:00:06.0: Adding to iommu group 2 pci 0000:00:07.0: Adding to iommu group 3 Oops: general protection fault, probably for non-canonical address 0x3a64695f78746382: 0000 [#1] SMP NOPTI CPU: 0 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.18.22 #1 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/06/2025 RIP: 0010:amd_iommu_probe_device+0x54/0x3a0 Call Trace: __iommu_probe_device+0x107/0x520 probe_iommu_group+0x29/0x50 bus_for_each_dev+0x7e/0xe0 iommu_device_register+0xc9/0x240 iommu_go_to_state+0x9c0/0x1c60 amd_iommu_init+0x14/0x40 pci_iommu_init+0x16/0x60 do_one_initcall+0x47/0x2f0 Guard the array access in __rlookup_amd_iommu(). With the fix applied on 6.18.22, the gVNIC at 00:08.0 is skipped cleanly and the VM boots. Fixes: `e874c666b1` ("iommu/amd: Change rlookup, irq_lookup, and alias to use kvalloc()") Cc: stable@vger.kernel.org Reported-by: Ziyuan Chen <zc@anthropic.com> Tested-by: Ziyuan Chen <zc@anthropic.com> Reviewed-by: Josef Bacik <josef@toxicpanda.com> Assisted-by: Claude:unspecified Signed-off-by: Jose Fernandez (Anthropic) <jose.fernandez@linux.dev> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-11 10:07:52 +02:00
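The guard described in the commit above amounts to checking the index before touching the table. The sketch below is a simplified userspace model, not the kernel code: `struct iommu_stub`, `struct pci_seg_stub` and `rlookup_checked()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct iommu_stub {
	int id;
};

struct pci_seg_stub {
	uint16_t last_bdf;                 /* highest device ID the table covers */
	struct iommu_stub **rlookup_table; /* holds last_bdf + 1 entries */
};

/* Return the owning IOMMU, or NULL when devid lies outside the table. */
struct iommu_stub *rlookup_checked(const struct pci_seg_stub *seg,
				   uint16_t devid)
{
	assert(seg != NULL);
	if (devid > seg->last_bdf)
		return NULL;
	return seg->rlookup_table[devid];
}
```

The numbers in the report fit this model: a 456-byte table of 8-byte pointers has 57 entries (devid 0x00 to 0x38, the last being 00:07.0), so devid 0x40 would sit at byte offset 512, which is 56 bytes past the end.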
Eder Zulian	8dfd3d8d74	iommu/amd: Remove latent out-of-bounds access in IOMMU debugfs In iommu_mmio_write() and iommu_capability_write(), the variables dbg_mmio_offset and dbg_cap_offset are declared as int. However, they are populated using kstrtou32_from_user(). If a user provides a sufficiently large value, it can become a negative integer. Prior to this patch, the AMD IOMMU debugfs implementation was already protected by different mechanisms. 1. #define OFS_IN_SZ 8 ensures the user string <= 8 bytes, so e.g. 0xffffffff isn't a valid input. if (cnt > OFS_IN_SZ) return -EINVAL; 2. Implicit type promotion in iommu_mmio_write(), dbg_mmio_offset is int and iommu->mmio_phys_end is u64 if (dbg_mmio_offset > iommu->mmio_phys_end - sizeof(u64)) return -EINVAL; 3. The show handlers would currently catch the negative number and refuse to perform the read. Replace kstrtou32_from_user() with kstrtos32_from_user() to parse the input, and check for negative values to explicitly prevent out-of-bounds memory accesses directly in iommu_mmio_write() and iommu_capability_write(). Signed-off-by: Eder Zulian <ezulian@redhat.com> Fixes: `7a4ee419e8` ("iommu/amd: Add debugfs support to dump IOMMU MMIO registers") Cc: stable@vger.kernel.org Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>	2026-05-11 09:52:54 +02:00
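The approach in the commit above, parsing as signed and rejecting negative values explicitly, can be sketched in userspace C. This is an analogy only: `parse_offset()` is a hypothetical helper built on the standard `strtoll()`, not the kernel's kstrtos32_from_user(), and the error codes are chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a user-supplied offset as a signed 32-bit
 * value and refuse negatives. Returns 0 on success or a negative errno.
 * *out is written only on success.
 */
int parse_offset(const char *s, int32_t *out)
{
	char *end;
	long long v;

	assert(s != NULL && out != NULL);
	errno = 0;
	v = strtoll(s, &end, 0);
	if (end == s || *end != '\0')
		return -EINVAL;         /* empty input or trailing junk */
	if (errno == ERANGE || v > INT32_MAX || v < INT32_MIN)
		return -ERANGE;         /* does not fit in 32 signed bits */
	if (v < 0)
		return -EINVAL;         /* a negative offset is never valid */
	*out = (int32_t)v;
	return 0;
}
```

Because the value is range-checked as signed before it is stored, an input such as 0x80000000 is rejected outright instead of being stored in an int as a negative number.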
Linus Torvalds	5d6919055d	Linux 7.1-rc3 v7.1-rc3	2026-05-10 14:08:09 -07:00
Linus Torvalds	afaa0a4770	Merge tag 'edac_urgent_for_v7.1_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC fix from Borislav Petkov: - Fix a string leak in the versalnet driver * tag 'edac_urgent_for_v7.1_rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/versalnet: Fix device name memory leak	2026-05-10 12:21:57 -07:00
Hyunwoo Kim	aa54b1d27f	rxrpc: Also unshare DATA/RESPONSE packets when paged frags are present The DATA-packet handler in rxrpc_input_call_event() and the RESPONSE handler in rxrpc_verify_response() copy the skb to a linear one before calling into the security ops only when skb_cloned() is true. An skb that is not cloned but still carries externally-owned paged fragments (e.g. SKBFL_SHARED_FRAG set by splice() into a UDP socket via __ip_append_data, or a chained skb_has_frag_list()) falls through to the in-place decryption path, which binds the frag pages directly into the AEAD/skcipher SGL via skb_to_sgvec(). Extend the gate to also unshare when skb_has_frag_list() or skb_has_shared_frag() is true. This catches the splice-loopback vector and other externally-shared frag sources while preserving the zero-copy fast path for skbs whose frags are kernel-private (e.g. NIC page_pool RX, GRO). The OOM/trace handling already in place is reused. Fixes: `d0d5c0cd1e` ("rxrpc: Use skb_unshare() rather than skb_cow_data()") Cc: stable@vger.kernel.org Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com> Reviewed-by: Jiayuan Chen <jiayuan.chen@linux.dev> Acked-by: David Howells <dhowells@redhat.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-05-10 08:15:57 -07:00
Linus Torvalds	a1a10cdbc6	Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk driver fixes from Stephen Boyd: - Mark the DDR bus clk critical in the SpaceMiT driver so that boot doesn't fail - Fix boot on Mobile EyeQ by creating the auxiliary device for the ethernet PHY - Plug an OF node leak in Rockchip rk808 clk driver * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: rk808: fix OF node reference imbalance MAINTAINERS: add myself as a reviewer for the clk subsystem reset: eyeq: drop device_set_of_node_from_dev() done by parent clk: eyeq: add EyeQ5 children auxiliary device for generic PHYs clk: eyeq: use the auxiliary device creation helper clk: spacemit: k3: mark top_dclk as CLK_IS_CRITICAL	2026-05-10 08:10:47 -07:00
Linus Torvalds	515186b7be	Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Pull bpf fixes from Alexei Starovoitov: - Fix sk_local_storage diag dump via netlink (Amery Hung) - Fix off-by-one in arena direct-value access (Junyoung Jang) - Reject TCP_NODELAY in bpf-tcp congestion control (KaFai Wan) - Fix type confusion in bpf__sock() (Kuniyuki Iwashima) - Reject TX-only AF_XDP sockets (Linpu Yu) - Don't run arg-tracking analysis twice on main subprog (Paul Chaignon) - Fix NULL pointer dereference in bpf_sk_storage_clone and fib lookup (Weiming Shi) tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf: bpf: Fix off-by-one boundary validation in arena direct-value access xskmap: reject TX-only AF_XDP sockets bpf: Don't run arg-tracking analysis twice on main subprog bpf: Free reuseport cBPF prog after RCU grace period. bpf: tcp: Fix type confusion in sol_tcp_sockopt(). bpf: tcp: Fix type confusion in bpf_skc_to_tcp6_sock(). bpf: tcp: Fix type confusion in bpf_skc_to_tcp_sock(). mptcp: bpf: Fix type confusion in bpf_mptcp_sock_from_subflow() selftest: bpf: Add test for bpf_tcp_sock() and RAW socket. bpf: tcp: Fix type confusion in bpf_tcp_sock(). tools/headers: Regenerate stddef.h to fix BPF selftests bpf: Fix sk_local_storage diag dumping uninitialized special fields bpf: Fix NULL pointer dereference in bpf_skb_fib_lookup() sockmap: Fix sk_psock_drop() race vs sock_map_{unhash,close,destroy}(). bpf: Fix NULL pointer dereference in bpf_sk_storage_clone and diag paths selftests/bpf: Verify bpf-tcp-cc rejects TCP_NODELAY selftests/bpf: Test TCP_NODELAY in TCP hdr opt callbacks bpf: Reject TCP_NODELAY in bpf-tcp-cc bpf: Reject TCP_NODELAY in TCP header option callbacks	2026-05-09 18:42:54 -07:00
Junyoung Jang	3ac1a467e3	bpf: Fix off-by-one boundary validation in arena direct-value access BPF_MAP_TYPE_ARENA accepts BPF_PSEUDO_MAP_VALUE offsets at exactly the end of the arena mapping (off == arena_size). The boundary check in arena_map_direct_value_addr() uses `>` instead of `>=`, which incorrectly allows a one-past-end pointer to be accepted. Change the condition to `>=` to correctly reject offsets that fall outside the valid arena user_vm range. Fixes: `317460317a` ("bpf: Introduce bpf_arena.") Signed-off-by: Junyoung Jang <graypanda.inzag@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260426172505.1947915-1-graypanda.inzag@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-05-09 16:18:39 -07:00
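The boundary condition in the commit above is easy to see side by side. The two predicates below are hypothetical illustrations, not the body of arena_map_direct_value_addr(); they only contrast the `>` and `>=` forms of the rejection test for a mapping of `size` bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical pre-fix check: rejects only offsets strictly past the end. */
bool rejects_before_fix(uint64_t off, uint64_t size)
{
	return off > size;
}

/* Hypothetical post-fix check: valid offsets are 0 .. size - 1. */
bool rejects_after_fix(uint64_t off, uint64_t size)
{
	return off >= size;
}
```

For a 4096-byte mapping, the first form lets offset 4096 through (a one-past-end pointer), while the second rejects it and still accepts 4095.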
Linpu Yu	bf6d507f7e	xskmap: reject TX-only AF_XDP sockets XSKMAP entries are used as redirect targets for incoming XDP frames. A TX-only AF_XDP socket lacks an Rx ring and cannot handle redirected traffic, but xsk_map_update_elem() currently allows such sockets to be inserted into the map. Redirecting packets to such a socket on the veth generic-XDP path causes a kernel crash in xsk_generic_rcv(). This became possible after xsk_is_setup_for_bpf_map() was removed from the XSKMAP update path, which allowed bound TX-only sockets to be inserted into the map. Reject TX-only sockets during XSKMAP updates to avoid the crash. They remain fully operational for pure Tx purposes outside XSKMAP. Fixes: `968be23cea` ("xsk: Fix possible segfault at xskmap entry insertion") Reported-by: Juefei Pu <tomapufckgml@gmail.com> Reported-by: Yuan Tan <yuantan098@gmail.com> Reported-by: Xin Liu <bird@lzu.edu.cn> Signed-off-by: Yifan Wu <yifanwucs@gmail.com> Signed-off-by: Linpu Yu <linpu5433@gmail.com> Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Link: https://lore.kernel.org/r/20260508144344.694-1-linpu5433@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-05-09 16:17:01 -07:00
Paul Chaignon	512809bb8a	bpf: Don't run arg-tracking analysis twice on main subprog Because subprog 0, the main subprog, is considered a global function, we end up running the arg-tracking dataflow analysis twice on it. That results in slightly longer verification but mostly in more verbose verifier logs. This patch fixes it by keeping only the iteration over global subprogs. When running over all of Cilium's programs with BPF_LOG_LEVEL2, this reduces verbosity by ~20% on average. Fixes: `bf0c571f7f` ("bpf: introduce forward arg-tracking dataflow analysis") Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/e4d7b53d4963ef520541a782f5fc8108a168877c.1778176504.git.paul.chaignon@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>	2026-05-09 16:12:40 -07:00
Linus Torvalds	1bfaee9d33	Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux Pull fsverity fix from Eric Biggers: "Fix a regression in overlayfs caused by an fsverity API change" * tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux: ovl: fix verity lazy-load guard broken by fsverity_active() semantic change	2026-05-09 11:47:39 -07:00
Linus Torvalds	e92b2872d0	Merge tag 'rust-fixes-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux Pull Rust fixes from Miguel Ojeda: "Toolchain and infrastructure: - Add 'bindgen' target to make UML 32-bit builds work with GCC - Disable two Clippy warnings ('collapsible_{if,match}') 'pin-init' crate: - Fix unsoundness issue that created &'static references" * tag 'rust-fixes-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: rust: allow `clippy::collapsible_if` globally rust: allow `clippy::collapsible_match` globally rust: pin-init: fix incorrect accessor reference lifetime rust: pin-init: internal: move alignment check to `make_field_check` rust: arch: um: Fix building 32-bit UML with GCC	2026-05-09 11:24:02 -07:00
Linus Torvalds	ec89572766	Merge tag 'hwmon-for-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - ads7871: Fix endianness bug in 16-bit register reads - lm75: Fix configuration register writes and AS6200/TMP112 setup and alarm handling - lm63: Fix TOCTOU problems - corsair-psu: Close HID device on probe errors - ltc2992: Fix overflow and threshold range - Documentation: fix link to ideapad-laptop.c file - Remove stale CONFIG_SENSORS_SBRMI Makefile reference * tag 'hwmon-for-v7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (ads7871) Fix endianness bug in 16-bit register reads hwmon: (lm75) Fix configuration register writes. hwmon: (lm75) Fix AS6200 and TMP112 setup and alarm handling hwmon: (lm63) Add locking to avoid TOCTOU hwmon: (corsair-psu) Close HID device on probe errors hwmon: Remove stale CONFIG_SENSORS_SBRMI Makefile reference Documentation: hwmon: fix link to ideapad-laptop.c file hwmon: (ltc2992) Fix u32 overflow in power read path hwmon: (ltc2992) Clamp threshold writes to hardware range	2026-05-09 08:32:50 -07:00
Linus Torvalds	234d72ae02	Merge tag 'staging-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging driver fixes from Greg KH: "Here are two small staging driver fixes for 7.1-rc3. They are: - vme_user root device leak fix - NULL dereference bugfix in the rtl8723bs driver Both of these have been in linux-next all this week with no reported issues" * tag 'staging-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: staging: rtl8723bs: os_dep: avoid NULL pointer dereference in rtw_cbuf_alloc staging: vme_user: fix root device leak on init failure	2026-05-09 08:26:08 -07:00
Linus Torvalds	fe3e5bc9e3	Merge tag 'usb-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB driver fixes from Greg KH: "Here are some small USB driver fixes for 7.1-rc3 to resolve some reported issues, and a new device id. These are: - usblp driver heap leak fixes - ulpi driver memory leak fix - typec driver fixes - dwc3 driver fix - omap dma driver fix - new option driver device id addition All of these have been in linux-next for over a week with no reported issues" * tag 'usb-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: USB: serial: option: add Telit Cinterion LE910Cx compositions usb: usblp: fix uninitialized heap leak via LPGETSTATUS ioctl usb: usblp: fix heap leak in IEEE 1284 device ID via short response usb: dwc3: Move GUID programming after PHY initialization usb: typec: tcpm: fix debug accessory mode detection for sink ports usb: typec: tcpm: reset internal port states on soft reset AMS usb: ulpi: fix memory leak on ulpi_register() error paths USB: omap_udc: DMA: Don't enable burst 4 mode	2026-05-09 08:16:24 -07:00
Linus Torvalds	656a95c4a0	Merge tag 'i2c-for-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux Pull i2c fixes from Wolfram Sang: - sanitize more input parameters in the core (found by syzkaller) - usual set of driver fixes (proper completion handling, applying quirks, correct workqueue selection...) - ID additions to simplify dependency handling - new email address for Peter Rosin * tag 'i2c-for-7.1-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: i2c: smbus: reject oversized block transfers in the common path MAINTAINERS: Update mail for Peter Rosin i2c: stub: Reject I2C block transfers with invalid length i2c: Compare the return value of gpiod_get_direction against GPIO_LINE_DIRECTION_OUT i2c: dev: prevent integer overflow in I2C_TIMEOUT ioctl i2c: acpi: Add ELAN0678 to i2c_acpi_force_100khz_device_ids dt-bindings: i2c: apple,i2c: Add t8122 compatible i2c: stm32f7: reinit_completion() per transfer not per msg dt-bindings: i2c: amlogic: Add compatible for T7 SOC i2c: testunit: Replace system_long_wq with system_dfl_long_wq	2026-05-09 08:10:07 -07:00
Linus Torvalds	bf0e022821	Merge tag 'powerpc-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc fixes from Madhavan Srinivasan: - Fix KASAN sanitization flag for core_$(BITS).o - Fixes for handling offset values in pseries htmdump - Fix interrupt mask in cpm1_gpiochip_add16() - ps3/pasemi fixes to drop redundant result assignment - Fixes in papr-hvpipe code path - powerpc/perf: Update check for PERF_SAMPLE_DATA_SRC marked events Thanks to Aboorva Devarajan, Athira Rajeev, Christophe Leroy (CS GROUP), Geert Uytterhoeven, Haren Myneni, Krzysztof Kozlowski, Mukesh Kumar Chaurasiya (IBM), Nathan Chancellor, Ritesh Harjani (IBM), Shivani Nittor, Sourabh Jain, Thomas Zimmermann, and Venkat Rao Bagalkote. * tag 'powerpc-7.1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (21 commits) powerpc/pasemi: Drop redundant res assignment powerpc/ps3: Drop redundant result assignment powerpc/vdso: Drop -DCC_USING_PATCHABLE_FUNCTION_ENTRY from 32-bit flags with clang arch/powerpc: Drop CONFIG_FIRMWARE_EDID from defconfig files powerpc/perf: Update check for PERF_SAMPLE_DATA_SRC marked events powerpc/8xx: Fix interrupt mask in cpm1_gpiochip_add16() powerpc/vmx: avoid KASAN instrumentation in enter_vmx_ops() for kexec powerpc/kdump: fix KASAN sanitization flag for core_$(BITS).o pseries/papr-hvpipe: Fix style and checkpatch issues in enable_hvpipe_IRQ() pseries/papr-hvpipe: Refactor and simplify hvpipe_rtas_recv_msg() pseries/papr-hvpipe: Kill task_struct pointer from struct hvpipe_source_info pseries/papr-hvpipe: Simplify spin unlock usage in papr_hvpipe_handle_release() pseries/papr-hvpipe: Fix the usage of copy_to_user() pseries/papr-hvpipe: Fix & simplify error handling in papr_hvpipe_init() pseries/papr-hvpipe: Fix null ptr deref in papr_hvpipe_dev_create_handle() pseries/papr-hvpipe: Prevent kernel stack memory leak to userspace pseries/papr-hvpipe: Fix race with interrupt handler powerpc/pseries/htmdump: Add memory configuration dump support to htmdump module powerpc/pseries/htmdump: Fix the offset value used in htm status dump powerpc/pseries/htmdump: Fix the offset value used in processor configuration dump ...	2026-05-09 08:03:21 -07:00
Linus Torvalds	70390501d1	Merge tag 'x86-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: - Fix memory map enumeration bug in the Xen e820 parsing code (Juergen Gross) - Re-enable e820 BIOS fallback if e820 table is empty (David Gow) * tag 'x86-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot/e820: Re-enable BIOS fallback if e820 table is empty x86/xen: Fix a potential problem in xen_e820_resolve_conflicts()	2026-05-08 20:28:45 -07:00
Linus Torvalds	6e1e5a33e8	Merge tag 'timers-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer fix from Ingo Molnar: "Fix CPU hotplug activation race in the timer migration code, by Frederic Weisbecker" * tag 'timers-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers/migration: Fix another hotplug activation race	2026-05-08 20:03:39 -07:00
Linus Torvalds	7f00232152	Merge tag 'sched-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: - Fix spurious failures in rseq self-tests (Mark Brown) - Fix rseq rseq::cpu_id_start ABI regression due to TCMalloc's creative use of the supposedly read-only field The fix is to introduce a new ABI variant based on a new (larger) rseq area registration size, to keep the TCMalloc use of rseq backwards compatible on new kernels (Thomas Gleixner) - Fix wakeup_preempt_fair() for not waking up task (Vincent Guittot) - Fix s64 mult overflow in vruntime_eligible() (Zhan Xusheng) * tag 'sched-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/fair: Fix wakeup_preempt_fair() for not waking up task sched/fair: Fix overflow in vruntime_eligible() selftests/rseq: Expand for optimized RSEQ ABI v2 rseq: Reenable performance optimizations conditionally rseq: Implement read only ABI enforcement for optimized RSEQ V2 mode selftests/rseq: Validate legacy behavior selftests/rseq: Make registration flexible for legacy and optimized mode selftests/rseq: Skip tests if time slice extensions are not available rseq: Revert to historical performance killing behaviour rseq: Don't advertise time slice extensions if disabled rseq: Protect rseq_reset() against interrupts rseq: Set rseq::cpu_id_start to 0 on unregistration selftests/rseq: Don't run tests with runner scripts outside of the scripts	2026-05-08 19:42:10 -07:00
Linus Torvalds	e5cf0260a7	Merge tag 'perf-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events fixes from Ingo Molnar: - Fix deadlock in the perf_mmap() failure path (Peter Zijlstra) - Intel ACR (Auto Counter Reload) fixes (Dapeng Mi): - Fix validation and configuration of ACR masks - Fix ACR rescheduling bug causing stale masks - Disable the PMI on ACR-enabled hardware - Enable ACR on Panther Cover uarch too * tag 'perf-urgent-2026-05-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel: Enable auto counter reload for DMR perf/x86/intel: Disable PMI for self-reloaded ACR events perf/x86/intel: Always reprogram ACR events to prevent stale masks perf/x86/intel: Improve validation and configuration of ACR masks perf/core: Fix deadlock in perf_mmap() failure path	2026-05-08 19:39:18 -07:00
Linus Torvalds	27a26ccfd5	Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 fix from Catalin Marinas: - ptrace(PTRACE_SETREGSET) fix to zero the target's fpsimd_state rather than the tracer's * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: arm64/fpsimd: ptrace: zero target's fpsimd_state, not the tracer's	2026-05-08 16:18:35 -07:00
Linus Torvalds	678ede852f	Merge tag 'pci-v7.1-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI fixes from Bjorn Helgaas: - Don't fallback to bus reset after failed slot reset; a bus reset isn't safe if the .reset_slot() callback is implemented (Keith Busch) - Update saved_config_space upon resource assignment to fix passthrough regressions when x86 pcibios_assign_resources() updates BARs (Lukas Wunner) - Initialize a temporary pci_dev->dev in sysfs 'new_id' attribute to fix a lockdep regression after driver_override was moved from PCI to device core (Samiullah Khawaja) - Update MAINTAINERS email addresses (Marek Vasut, Hans Zhang) - Add MAINTAINERS reviewer for PCIe Cadence IP (Aksh Garg) * tag 'pci-v7.1-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: MAINTAINERS: Add Aksh Garg as PCIe CADENCE reviewer MAINTAINERS: Update Hans Zhang email for PCIe CIX Sky1 MAINTAINERS: Update Marek Vasut email for PCIe R-Car PCI: Initialize temporary device in new_id_store() PCI: Update saved_config_space upon resource assignment PCI: Don't fallback to bus reset after failed slot reset	2026-05-08 16:08:58 -07:00
Aksh Garg	9ef40a09c5	MAINTAINERS: Add Aksh Garg as PCIe CADENCE reviewer I wish to contribute to the review process for Cadence PCIe IP drivers, hence add myself as a reviewer. Signed-off-by: Aksh Garg <a-garg7@ti.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260508060951.840233-1-a-garg7@ti.com	2026-05-08 15:50:07 -05:00
Hans Zhang	78e115d806	MAINTAINERS: Update Hans Zhang email for PCIe CIX Sky1 Update my email address as my work email account is no longer in use. Signed-off-by: Hans Zhang <18255117159@163.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260508023006.1787674-1-18255117159@163.com	2026-05-08 15:50:06 -05:00
Marek Vasut	bf5421b3d8	MAINTAINERS: Update Marek Vasut email for PCIe R-Car Use up to date address. No functional change. Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20260428052030.51101-1-marek.vasut+renesas@mailbox.org	2026-05-08 15:50:06 -05:00
Samiullah Khawaja	f45a49a238	PCI: Initialize temporary device in new_id_store() When setting new_id of a PCI device driver using sysfs a lockdep splat occurs. This is because new_id_store() builds a temporary pci_dev for pci_match_device(), which calls device_match_driver_override(). That depends on the driver_override.lock added by `cb3d1049f4` ("driver core: generalize driver_override in struct device"). The new driver_override.lock was not initialized in the temporary pci_dev, resulting in this lockdep splat. Initialize the temporary pci_dev to fix this. Repro: Build with CONFIG_LOCKDEP=y, boot with QEMU, and add a new ID: # echo "8086 10f5" > /sys/bus/pci/drivers/e1000e/new_id INFO: trying to register non-static key. The code is fine but needs lockdep annotation, or maybe you didn't initialize this object before use? turning off the locking correctness validator. CPU: 2 UID: 0 PID: 177 Comm: liveupdate-iomm Not tainted 7.0.0+ #9 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x5d/0x80 register_lock_class+0x77e/0x790 lock_acquire+0xbf/0x2e0 pci_match_device+0x24/0x180 new_id_store+0x189/0x1d0 kernfs_fop_write_iter+0x14f/0x210 vfs_write+0x263/0x5e0 ksys_write+0x79/0xf0 do_syscall_64+0x117/0xf80 Fixes: `10a4206a24` ("PCI: use generic driver_override infrastructure") Fixes: `8895d3bcb8` ("PCI: Fail new_id for vendor/device values already built into driver") Signed-off-by: Samiullah Khawaja <skhawaja@google.com> [bhelgaas: add commit log details and repro, trim backtrace] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Danilo Krummrich <dakr@kernel.org> Link: https://patch.msgid.link/20260505234327.716630-1-skhawaja@google.com	2026-05-08 15:50:06 -05:00
Lukas Wunner	909f7bf9b0	PCI: Update saved_config_space upon resource assignment Bernd reports passthrough failure of a Digital Devices Cine S2 V6 DVB adapter plugged into an ASRock X570S PG Riptide board with BIOS version P5.41 (09/07/2023): ddbridge 0000:05:00.0: detected Digital Devices Cine S2 V6 DVB adapter ddbridge 0000:05:00.0: cannot read registers ddbridge 0000:05:00.0: fail BIOS assigns an incorrect BAR to the DVB adapter which doesn't fit into the upstream bridge window. The kernel corrects the BAR assignment: pci 0000:07:00.0: BAR 0 [mem 0xfffffffffc500000-0xfffffffffc50ffff 64bit]: can't claim; no compatible bridge window pci 0000:07:00.0: BAR 0 [mem 0xfc500000-0xfc50ffff 64bit]: assigned Correction of the BAR assignment happens in an x86-specific fs_initcall, pcibios_assign_resources(), after device enumeration in a subsys_initcall. This order was introduced at the behest of Linus in 2004: https://git.kernel.org/tglx/history/c/a06a30144bbc No other architecture performs such a late BAR correction. Bernd bisected the issue to commit `a2f1e22390` ("PCI/ERR: Ensure error recoverability at all times"), but it only occurs in the absence of commit `4d4c10f763` ("PCI: Explicitly put devices into D0 when initializing"). This combination exists in stable kernel v6.12.70, but not in mainline, hence Bernd cannot reproduce the issue with mainline. Since `a2f1e22390`, config space is saved on enumeration, prior to BAR correction. Upon passthrough, the corrected BAR is overwritten with the incorrect saved value by: vfio_pci_core_register_device() vfio_pci_set_power_state() pci_restore_state() But only if the device's current_state is PCI_UNKNOWN, as it was prior to commit `4d4c10f763`. Since the commit, it is PCI_D0, which changes the behavior of vfio_pci_set_power_state() to no longer restore the state without saving it first. Alexandre is reporting the same issue as Bernd, but in his case, mainline is affected as well. The difference is that on Alexandre's system, the host kernel binds a driver to the device which is unbound prior to passthrough, whereas on Bernd's system no driver gets bound by the host kernel. Unbinding sets current_state to PCI_UNKNOWN in pci_device_remove(), so when vfio-pci is subsequently bound to the device, pci_restore_state() is once again called without invoking pci_save_state() first. To robustly fix the issue, always update saved_config_space upon resource assignment. Reported-by: Bernd Schumacher <bernd@bschu.de> Closes: https://lore.kernel.org/r/acfZrlP0Ua_5D3U4@eldamar.lan/ Reported-by: Alexandre N. <an.tech@mailo.com> Closes: https://lore.kernel.org/r/dd3c3358-de0f-4a56-9c81-04aceaab4058@mailo.com/ Fixes: `a2f1e22390` ("PCI/ERR: Ensure error recoverability at all times") Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Bernd Schumacher <bernd@bschu.de> Tested-by: Alexandre N. <an.tech@mailo.com> Cc: stable@vger.kernel.org # v6.12+ Link: https://patch.msgid.link/febc3f354e0c1f5a9f5b3ee9ffddaa44caccf651.1776268054.git.lukas@wunner.de	2026-05-08 15:50:06 -05:00
Kuniyuki Iwashima	18fc650ccd	bpf: Free reuseport cBPF prog after RCU grace period. Eulgyu Kim reported the splat below with a repro. [0] The repro sets up a UDP reuseport group with a cBPF prog and replaces it with a new one while another thread is sending a UDP packet to the group. The reuseport prog is freed by sk_reuseport_prog_free(). bpf_prog_put() is called for "e"BPF prog to destruct through multiple stages while cBPF prog is freed immediately by bpf_release_orig_filter() and bpf_prog_free(). If a reuseport prog is detached from the setsockopt() path (reuseport_attach_prog() or reuseport_detach_prog()), sk_reuseport_prog_free() is called without waiting for RCU readers to complete, resulting in various bugs. Let's defer freeing the reuseport cBPF prog after one RCU grace period. Note "e"BPF prog is safe as is unless the fast path starts to touch fields destroyed in bpf_prog_put_deferred() and __bpf_prog_put_noref(). [0]: BUG: KASAN: vmalloc-out-of-bounds in reuseport_select_sock+0xedc/0x1220 net/core/sock_reuseport.c:596 Read of size 4 at addr ffffc9000051e004 by task slowme/10208 CPU: 6 UID: 1000 PID: 10208 Comm: slowme Not tainted 7.0.0-geb7ac95ff75e #32 PREEMPT(full) Hardware name: QEMU Ubuntu 24.04 PC v2 (i440FX + PIIX, arch_caps fix, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 Call Trace: <IRQ> dump_stack_lvl+0xe8/0x150 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:378 [inline] print_report+0xca/0x240 mm/kasan/report.c:482 kasan_report+0x118/0x150 mm/kasan/report.c:595 reuseport_select_sock+0xedc/0x1220 net/core/sock_reuseport.c:596 udp4_lib_lookup2+0x3bc/0x950 net/ipv4/udp.c:495 __udp4_lib_lookup+0x768/0xe20 net/ipv4/udp.c:723 __udp4_lib_lookup_skb+0x297/0x390 net/ipv4/udp.c:752 __udp4_lib_rcv+0x1312/0x2620 net/ipv4/udp.c:2752 ip_protocol_deliver_rcu+0x282/0x440 net/ipv4/ip_input.c:207 ip_local_deliver_finish+0x3bb/0x6f0 net/ipv4/ip_input.c:241 NF_HOOK+0x30c/0x3a0 include/linux/netfilter.h:318 NF_HOOK+0x30c/0x3a0 
include/linux/netfilter.h:318 __netif_receive_skb_one_core net/core/dev.c:6181 [inline] __netif_receive_skb net/core/dev.c:6294 [inline] process_backlog+0xaa4/0x1960 net/core/dev.c:6645 __napi_poll+0xae/0x340 net/core/dev.c:7709 napi_poll net/core/dev.c:7772 [inline] net_rx_action+0x5d7/0xf50 net/core/dev.c:7929 handle_softirqs+0x22b/0x870 kernel/softirq.c:622 do_softirq+0x76/0xd0 kernel/softirq.c:523 </IRQ> <TASK> __local_bh_enable_ip+0xf8/0x130 kernel/softirq.c:450 local_bh_enable include/linux/bottom_half.h:33 [inline] rcu_read_unlock_bh include/linux/rcupdate.h:924 [inline] __dev_queue_xmit+0x1dd7/0x3710 net/core/dev.c:4890 neigh_output include/net/neighbour.h:556 [inline] ip_finish_output2+0xca9/0x1070 net/ipv4/ip_output.c:237 NF_HOOK_COND include/linux/netfilter.h:307 [inline] ip_output+0x29f/0x450 net/ipv4/ip_output.c:438 ip_send_skb+0x45/0xc0 net/ipv4/ip_output.c:1508 udp_send_skb+0xb04/0x1510 net/ipv4/udp.c:1195 udp_sendmsg+0x1a71/0x2350 net/ipv4/udp.c:1485 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg net/socket.c:742 [inline] __sys_sendto+0x554/0x680 net/socket.c:2206 __do_sys_sendto net/socket.c:2213 [inline] __se_sys_sendto net/socket.c:2209 [inline] __x64_sys_sendto+0xde/0x100 net/socket.c:2209 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0x160/0xf80 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x415a2d Code: b3 66 2e 0f 1f 84 00 00 00 00 00 66 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007f6bc31e41e8 EFLAGS: 00000212 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007f6bc31e4cdc RCX: 0000000000415a2d RDX: 0000000000000001 RSI: 00007f6bc31e421f RDI: 0000000000000003 RBP: 00007f6bc31e4240 R08: 00007f6bc31e4220 R09: 0000000000000010 R10: 0000000000000000 R11: 0000000000000212 R12: 00007f6bc31e46c0 R13: ffffffffffffffb8 R14: 0000000000000000 R15: 
00007ffc9b0d70b0 </TASK> Fixes: `538950a1b7` ("soreuseport: setsockopt SO_ATTACH_REUSEPORT_[CE]BPF") Reported-by: Eulgyu Kim <eulgyukim@snu.ac.kr> Reported-by: Taeyang Lee <0wn@theori.io> Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://lore.kernel.org/bpf/20260426012647.3233119-1-kuniyu@google.com	2026-05-08 22:40:05 +02:00
Linus Torvalds	cbf457c584	Merge tag 'block-7.1-20260508' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block fixes from Jens Axboe: - Fix for ublk not doing an actual issue from the task_work fallback path. Any request hitting that should be canceled automatically - Fix for uring_cmd prep side handling, for the block side uring_cmd discard handling - Fix for missing validation of the io and physical block size shifts - Fix for a use-after-free in ublk's cancel command handling * tag 'block-7.1-20260508' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: ublk: fix use-after-free in ublk_cancel_cmd() ublk: validate physical_bs_shift, io_min_shift and io_opt_shift block: only read from sqe on initial invocation of blkdev_uring_cmd() ublk: don't issue uring_cmd from fallback task work	2026-05-08 13:18:13 -07:00
Linus Torvalds	8be01e1280	Merge tag 'io_uring-7.1-20260508' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull io_uring fixes from Jens Axboe: - Ensure that the absolute timeouts for both the command side and the waiting side honor the callers time namespace - Ensure tracked NAPI entries are cleared at unregistration time, as the NAPI polling loop checks the list state rather than the general NAPI state. This can lead to NAPI polling even after unregistration has been done. If unregistered, all NAPI polling should be disabled - Fix for eventfd recursive invocation handling * tag 'io_uring-7.1-20260508' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: io_uring/wait: honour caller's time namespace for IORING_ENTER_ABS_TIMER io_uring/timeout: honour caller's time namespace for IORING_TIMEOUT_ABS io_uring/eventfd: reset deferred signal state io_uring/napi: clear tracked NAPI entries on unregister	2026-05-08 13:12:48 -07:00
Martin KaFai Lau	f3b8c28135	Merge branch 'bpf-tcp-fix-type-confusion-in-bpf-helper-functions' Kuniyuki Iwashima says: ==================== bpf: tcp: Fix type confusion in bpf helper functions. bpf_tcp_sock() only check if sk->sk_protocol is IPPROTO_TCP, but RAW socket can bypass it: socket(AF_INET, SOCK_RAW, IPPROTO_TCP) The same issues exist in other bpf functions: * bpf_mptcp_sock_from_subflow() * bpf_skc_to_tcp_sock() * bpf_skc_to_tcp6_sock() * sol_tcp_sockopt() Patch 1 fixes bpf_tcp_sock() and Patch 2 adds a test for it. Patch 3 ~ 6 fix the rest of the functions above. Changes: v2: * Inverse if (err) to if (!err) in the selftest * Add patch 3 ~ 6 v1: https://lore.kernel.org/bpf/20260430184405.1227386-1-kuniyu@google.com/ https://lore.kernel.org/mptcp/20260430-mptcp-bpf-mptcp-sock-type-v1-1-d2ed5cda7da9@kernel.org/ ==================== Link: https://patch.msgid.link/20260504210610.180150-1-kuniyu@google.com Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>	2026-05-08 11:38:11 -07:00
Kuniyuki Iwashima	1c2958e4ab	bpf: tcp: Fix type confusion in sol_tcp_sockopt(). sol_tcp_sockopt() only checks if sk->sk_protocol is IPPROTO_TCP, but RAW socket can bypass it: socket(AF_INET, SOCK_RAW, IPPROTO_TCP) Let's use sk_is_tcp(). Note that initially sol_tcp_sockopt() checked sk->sk_prot->setsockopt. Fixes: `2ab42c7b87` ("bpf: Check the protocol of a sock to agree the calls to bpf_setsockopt().") Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260504210610.180150-7-kuniyu@google.com	2026-05-08 11:38:10 -07:00

1445093 Commits