Now, as the page_ext holds count of IOMMU mappings, we can use it to
assert that any page allocated/freed is indeed not in the IOMMU.
The sanitizer doesn’t protect against mapping/unmapping during this
period. However, that’s less harmful as the page is not used by the
kernel.
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Using the new calls, use an atomic refcount to track how many times
a page is mapped in any of the IOMMUs.
For unmap we need to use iova_to_phys() to get the physical address
of the pages.
We use the smallest supported page size as the granularity of tracking
per domain.
This is important as it is possible to map pages and unmap them with
larger sizes (as in map_sg()) cases.
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Add calls for the new iommu debug config IOMMU_DEBUG_PAGEALLOC:
- iommu_debug_init: Enable the debug mode if configured by the user.
- iommu_debug_map: Track iommu pages mapped, using physical address.
- iommu_debug_unmap_begin: Track start of iommu unmap operation, with
IOVA and size.
- iommu_debug_unmap_end: Track the end of unmap operation, passing the
actual unmapped size versus the tracked one at unmap_begin.
We have to do the unmap_begin/end as once pages are unmapped we lose
the information of the physical address.
This is racy, but the API is racy by construction as it uses refcounts
and doesn't attempt to lock/synchronize with the IOMMU API as that will
be costly, meaning that possibility of false negative exists.
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Add a new config IOMMU_DEBUG_PAGEALLOC, which registers new data to
page_ext.
This config will be used by the IOMMU API to track pages mapped in
the IOMMU to catch drivers trying to free kernel memory that they
still map in their domains, causing all types of memory corruption.
This behaviour is disabled by default and can be enabled using
kernel cmdline iommu.debug_pagealloc.
Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>
Reviewed-by: Pranjal Shrivastava <praan@google.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
PCIe permits a device to ignore ATS invalidation TLPs while processing a
reset. This creates a problem visible to the OS where an ATS invalidation
command will time out: e.g. an SVA domain will have no coordination with a
reset event and can racily issue ATS invalidations to a resetting device.
The PCIe r6.0, sec 10.3.1 IMPLEMENTATION NOTE recommends SW to disable and
block ATS before initiating a Function Level Reset. It also mentions that
other reset methods could have the same vulnerability as well.
The IOMMU subsystem provides pci_dev_reset_iommu_prepare/done() callback
helpers for this matter. Use them in all the existing reset functions.
This will attach the device to its iommu_group->blocking_domain during the
device reset, so as to allow IOMMU driver to:
- invoke pci_disable_ats() and pci_enable_ats(), if necessary
- wait for all ATS invalidations to complete
- stop issuing new ATS invalidations
- fence any incoming ATS queries
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
PCIe permits a device to ignore ATS invalidation TLPs while processing a
reset. This creates a problem visible to the OS where an ATS invalidation
command will time out. E.g. an SVA domain will have no coordination with a
reset event and can racily issue ATS invalidations to a resetting device.
The OS should do something to mitigate this as we do not want production
systems to be reporting critical ATS failures, especially in a hypervisor
environment. Broadly, OS could arrange to ignore the timeouts, block page
table mutations to prevent invalidations, or disable and block ATS.
The PCIe r6.0, sec 10.3.1 IMPLEMENTATION NOTE recommends SW to disable and
block ATS before initiating a Function Level Reset. It also mentions that
other reset methods could have the same vulnerability as well.
Provide a callback from the PCI subsystem that will enclose the reset and
have the iommu core temporarily change all the attached RID/PASID domains
group->blocking_domain so that the IOMMU hardware would fence any incoming
ATS queries. And IOMMU drivers should also synchronously stop issuing new
ATS invalidations and wait for all ATS invalidations to complete. This can
avoid any ATS invaliation timeouts.
However, if there is a domain attachment/replacement happening during an
ongoing reset, ATS routines may be re-activated between the two function
calls. So, introduce a new resetting_domain in the iommu_group structure
to reject any concurrent attach_dev/set_dev_pasid call during a reset for
a concern of compatibility failure. Since this changes the behavior of an
attach operation, update the uAPI accordingly.
Note that there are two corner cases:
1. Devices in the same iommu_group
Since an attachment is always per iommu_group, this means that any
sibling devices in the iommu_group cannot change domain, to prevent
race conditions.
2. An SR-IOV PF that is being reset while its VF is not
In such case, the VF itself is already broken. So, there is no point
in preventing PF from going through the iommu reset.
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
There is a need to stage a resetting PCI device to temporarily the blocked
domain and then attach back to its previously attached domain after reset.
This can be simply done by keeping the "previously attached domain" in the
iommu_group->domain pointer while adding an iommu_group->resetting_domain,
which gives troubles to IOMMU drivers using the iommu_get_domain_for_dev()
for a device's physical domain in order to program IOMMU hardware.
And in such for-driver use cases, the iommu_group->mutex must be held, so
it doesn't fit in external callers that don't hold the iommu_group->mutex.
Introduce a new iommu_driver_get_domain_for_dev() helper, exclusively for
driver use cases that hold the iommu_group->mutex, to separate from those
external use cases.
Add a lockdep_assert_not_held to the existing iommu_get_domain_for_dev()
and highlight that in a kdoc.
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
This function can only be called on the default_domain. Trivally pass it
in. In all three existing cases, the default domain was just attached to
the device.
This avoids iommu_setup_dma_ops() calling iommu_get_domain_for_dev() that
will be used by external callers.
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Tested-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
The iommu_deferred_attach() function invokes __iommu_attach_device(), but
doesn't hold the group->mutex like other __iommu_attach_device() callers.
Though there is no pratical bug being triggered so far, it would be better
to apply the same locking to this __iommu_attach_device(), since the IOMMU
drivers nowaday are more aware of the group->mutex -- some of them use the
iommu_group_mutex_assert() function that could be potentially in the path
of an attach_dev callback function invoked by the __iommu_attach_device().
Worth mentioning that the iommu_deferred_attach() will soon need to check
group->resetting_domain that must be locked also.
Thus, grab the mutex to guard __iommu_attach_device() like other callers.
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com>
Tested-by: Dheeraj Kumar Srivastava <dheerajkumar.srivastava@amd.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Samiullah Khawaja <skhawaja@google.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
Pull perf tool fixes and from Namhyung Kim:
- skip building BPF skeletons if libopenssl is missing
- a couple of test updates
- handle error cases of filename__read_build_id()
- support NVIDIA Olympus for ARM SPE profiling
- update tool headers to sync with the kernel
* tag 'perf-tools-fixes-for-v6.19-2026-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
tools build: Fix the common set of features test wrt libopenssl
tools headers: Sync syscall table with kernel sources
tools headers: Sync linux/socket.h with kernel sources
tools headers: Sync linux/gfp_types.h with kernel sources
tools headers: Sync arm64 headers with kernel sources
tools headers: Sync x86 headers with kernel sources
tools headers: Sync UAPI sound/asound.h with kernel sources
tools headers: Sync UAPI linux/mount.h with kernel sources
tools headers: Sync UAPI linux/fs.h with kernel sources
tools headers: Sync UAPI linux/fcntl.h with kernel sources
tools headers: Sync UAPI KVM headers with kernel sources
tools headers: Sync UAPI drm/drm.h with kernel sources
perf arm-spe: Add NVIDIA Olympus to neoverse list
tools headers arm64: Add NVIDIA Olympus part
perf tests top: Make the test exclusive
perf tests kvm: Avoid leaving perf.data.guest file around
perf symbol: Fix ENOENT case for filename__read_build_id
perf tools: Disable BPF skeleton if no libopenssl found
tools/build: Add a feature test for libopenssl
Pull power management fix from Rafael Wysocki:
"Fix a recent regression that affects system suspend testing
at the 'core' level (Rafael Wysocki)"
* tag 'pm-6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM: sleep: Fix suspend_test() at the TEST_CORE level
Pull crypto library fix from Eric Biggers:
"Fix the kunit_run_irq_test() function (which I recently added for the
CRC and crypto tests) to be less timing-dependent.
This fixes flakiness in the polyval kunit test suite"
* tag 'libcrypto-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
kunit: Enforce task execution in {soft,hard}irq contexts
Pull rdma fixes from Jason Gunthorpe:
- Fix several syzkaller found bugs:
- Poor parsing of the RDMA_NL_LS_OP_IP_RESOLVE netlink
- GID entry refcount leaking when CM destruction races with
multicast establishment
- Missing refcount put in ib_del_sub_device_and_put()
- Fixup recently introduced uABI padding for 32 bit consistency
- Avoid user triggered math overflow in MANA and AFA
- Reading invalid netdev data during an event
- kdoc fixes
- Fix never-working gid copying in ib_get_gids_from_rdma_hdr
- Typo in bnxt when validating the BAR
- bnxt mis-parsed IB_SEND_IP_CSUM so it didn't work always
- bnxt out of bounds access in bnxt related to the counters on new
devices
- Allocate the bnxt PDE table with the right sizing
- Use dma_free_coherent() correctly in bnxt
- Allow rxe to be unloadable when CONFIG_PROVE_LOCKING by adjusting the
tracking of the global sockets it uses
- Missing unlocking on error path in rxe
- Compute the right number of pages in a MR in rtrs
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
RDMA/bnxt_re: fix dma_free_coherent() pointer
RDMA/rtrs: Fix clt_path::max_pages_per_mr calculation
IB/rxe: Fix missing umem_odp->umem_mutex unlock on error path
RDMA/bnxt_re: Fix to use correct page size for PDE table
RDMA/bnxt_re: Fix OOB write in bnxt_re_copy_err_stats()
RDMA/bnxt_re: Fix IB_SEND_IP_CSUM handling in post_send
RDMA/core: always drop device refcount in ib_del_sub_device_and_put()
RDMA/rxe: let rxe_reclassify_recv_socket() call sk_owner_put()
RDMA/bnxt_re: Fix incorrect BAR check in bnxt_qplib_map_creq_db()
RDMA/core: Fix logic error in ib_get_gids_from_rdma_hdr()
RDMA/efa: Remove possible negative shift
RTRS/rtrs: clean up rtrs headers kernel-doc
RDMA/irdma: avoid invalid read in irdma_net_event
RDMA/mana_ib: check cqe length for kernel CQs
RDMA/irdma: Fix irdma_alloc_ucontext_resp padding
RDMA/ucma: Fix rdma_ucm_query_ib_service_resp struct padding
RDMA/cm: Fix leaking the multicast GID table reference
RDMA/core: Check for the presence of LS_NLA_TYPE_DGID correctly
Pull kselftest fixes from Shuah Khan:
- Fix for build failures in tests that use an empty FIXTURE() seen in
Android's build environment, which uses -D_FORTIFY_SOURCE=3, a build
failure occurs in tests that use an empty FIXTURE()
- Fix func_traceonoff_triggers.tc sometimes failures on Kunpeng-920
board resulting from including transient trace file name in checksum
compare
- Fix to remove available_events requirement from toplevel-enable for
instance as it isn't a valid requirement for this test
* tag 'linux_kselftest-fixes-6.19-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
kselftest/harness: Use helper to avoid zero-size memset warning
selftests/ftrace: Test toplevel-enable for instance
selftests/ftrace: traceonoff_triggers: strip off names
Pull block fixes from Jens Axboe:
- Scan partition tables asynchronously for ublk, similarly to how nvme
does it. This avoids potential deadlocks, which is why nvme does it
that way too. Includes a set of selftests as well.
- MD pull request via Yu:
- Fix null-pointer dereference in raid5 sysfs group_thread_cnt
store (Tuo Li)
- Fix possible mempool corruption during raid1 raid_disks update
via sysfs (FengWei Shih)
- Fix logical_block_size configuration being overwritten during
super_1_validate() (Li Nan)
- Fix forward incompatibility with configurable logical block size:
arrays assembled on new kernels could not be assembled on older
kernels (v6.18 and before) due to non-zero reserved pad rejection
(Li Nan)
- Fix static checker warning about iterator not incremented (Li Nan)
- Skip CPU offlining notifications on unmapped hardware queues
- bfq-iosched block stats fix
- Fix outdated comment in bfq-iosched
* tag 'block-6.19-20260102' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
block, bfq: update outdated comment
blk-mq: skip CPU offline notify on unmapped hctx
selftests/ublk: fix Makefile to rebuild on header changes
selftests/ublk: add test for async partition scan
ublk: scan partition in async way
block,bfq: fix aux stat accumulation destination
md: Fix forward incompatibility from configurable logical block size
md: Fix logical_block_size configuration being overwritten
md: suspend array while updating raid_disks via sysfs
md/raid5: fix possible null-pointer dereferences in raid5_store_group_thread_cnt()
md: Fix static checker warning in analyze_sbs
Pull io_uring fixes from Jens Axboe:
- Removed dead argument length for io_uring_validate_mmap_request()
- Use GFP_NOWAIT for overflow CQEs on legacy ring setups rather than
GFP_ATOMIC, which makes it play nicer with memcg limits
- Fix a potential circular locking issue with tctx node removal and
exec based cancelations
* tag 'io_uring-6.19-20260102' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
io_uring/memmap: drop unused sz param in io_uring_validate_mmap_request()
io_uring/tctx: add separate lock for list of tctx's in ctx
io_uring: use GFP_NOWAIT for overflow CQEs on legacy rings
Pull x86 fix from Ingo Molnar:
"Fix the AMD microcode Entrysign signature checking code to include
more models"
* tag 'x86-urgent-2026-01-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode/AMD: Fix Entrysign revision check for Zen5/Strix Halo
Pull LoongArch fixes from Huacai Chen:
"Complete CPUCFG registers definition, set correct protection_map[] for
VM_NONE/VM_SHARED, fix some bugs in the orc stack unwinder, ftrace and
BPF JIT"
* tag 'loongarch-fixes-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson:
samples/ftrace: Adjust LoongArch register restore order in direct calls
LoongArch: BPF: Enhance the bpf_arch_text_poke() function
LoongArch: BPF: Enable trampoline-based tracing for module functions
LoongArch: BPF: Adjust the jump offset of tail calls
LoongArch: BPF: Save return address register ra to t0 before trampoline
LoongArch: BPF: Zero-extend bpf_tail_call() index
LoongArch: BPF: Sign extend kfunc call arguments
LoongArch: Refactor register restoration in ftrace_common_return
LoongArch: Enable exception fixup for specific ADE subcode
LoongArch: Remove unnecessary checks for ORC unwinder
LoongArch: Remove is_entry_func() and kernel_entry_end
LoongArch: Use UNWIND_HINT_END_OF_STACK for entry points
LoongArch: Set correct protection_map[] for VM_NONE/VM_SHARED
LoongArch: Complete CPUCFG registers definition
Pull drm fixes from Dave Airlie:
"Happy New Year, jetlagged fixes from me, still pretty quiet, xe is
most of this, with i915/nouveau/imagination fixes and some shmem
cleanups.
shmem:
- docs and MODULE_LICENSE fix
xe:
- Ensure svm device memory is idle before migration completes
- Fix a SVM debug printout
- Use READ_ONCE() / WRITE_ONCE() for g2h_fence
i915:
- Fix eb_lookup_vmas() failure path
nouveau:
- fix prepare_fb warnings
imagination:
- prevent export of protected objects"
* tag 'drm-fixes-2026-01-02' of https://gitlab.freedesktop.org/drm/kernel:
drm/i915/gem: Zero-initialize the eb.vma array in i915_gem_do_execbuffer
drm/xe/guc: READ/WRITE_ONCE g2h_fence->done
drm/pagemap, drm/xe: Ensure that the devmem allocation is idle before use
drm/xe/svm: Fix a debug printout
drm/gem-shmem: Fix the MODULE_LICENSE() string
drm/gem-shmem: Fix typos in documentation
drm/nouveau/dispnv50: Don't call drm_atomic_get_crtc_state() in prepare_fb
drm/imagination: Disallow exporting of PM/FW protected objects
Pull smb server fixes from Steve French:
- Fix memory leak
- Fix two refcount leaks
- Fix error path in create_smb2_pipe
* tag 'v6.19-rc3-smb3-server-fixes' of git://git.samba.org/ksmbd:
smb/server: fix refcount leak in smb2_open()
smb/server: fix refcount leak in parse_durable_handle_context()
smb/server: call ksmbd_session_rpc_close() on error path in create_smb2_pipe()
ksmbd: Fix memory leak in get_file_all_info()
Pull smb client fixes from Steve French:
- Fix array out of bounds error in copy_file_range
- Add tracepoint to help debug ioctl failures
* tag 'v6.19-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: fix UBSAN array-index-out-of-bounds in smb2_copychunk_range
smb3 client: add missing tracepoint for unsupported ioctls
The function bfq_bfqq_may_idle() was renamed as bfq_better_to_idle()
in commit 277a4a9b56 ("block, bfq: give a better name to
bfq_bfqq_may_idle"). Update the comment accordingly.
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
io_uring_validate_mmap_request() doesn't use its size_t sz argument, so
remove it.
Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
An extra blank line gets printed after printing firmware version
because the build date is null terminated. Remove the "\n" from
dev_info() calls to print firmware version and build date to fix
the problem.
Reported-by: Mario Limonciello <superm1@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When building kselftests with a toolchain that enables source
fortification (e.g., Android's build environment, which uses
-D_FORTIFY_SOURCE=3), a build failure occurs in tests that use an
empty FIXTURE().
The root cause is that an empty fixture struct results in
`sizeof(self_private)` evaluating to 0. The compiler's fortification
checks then detect the `memset()` call with a compile-time constant size
of 0, issuing a `-Wuser-defined-warnings` which is promoted to an error
by `-Werror`.
An initial attempt to guard the call with `if (sizeof(self_private) > 0)`
was insufficient. The compiler's static analysis is aggressive enough
to flag the `memset(..., 0)` pattern before evaluating the conditional,
thus still triggering the error.
To resolve this robustly, this change introduces a `static inline`
helper function, `__kselftest_memset_safe()`. This function wraps the
size check and the `memset()` call. By replacing the direct `memset()`
in the `__TEST_F_IMPL` macro with a call to this helper, we create an
abstraction boundary. This prevents the compiler's static analyzer from
"seeing" the problematic pattern at the macro expansion site, resolving
the build failure.
Build Context:
Compiler: Android (14488419, +pgo, +bolt, +lto, +mlgo, based on r584948) clang version 22.0.0 (https://android.googlesource.com/toolchain/llvm-project 2d65e4108033380e6fe8e08b1f1826cd2bfb0c99)
Relevant Options: -O2 -Wall -Werror -D_FORTIFY_SOURCE=3 -target i686-linux-android10000
Test: m kselftest_futex_futex_requeue_pi
Removed Gerrit Change-Id
Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20251224084120.249417-1-wakel@google.com
Signed-off-by: Wake Liu <wakel@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Pull x86 platform driver fixes from Ilpo Järvinen:
- alienware-wmi-wmax: Area-51, x16, and 16X Aurora laptops support
- asus-armoury:
- Fix FA507R PPT data
- Add TDP data for more laptop models
- asus-nb-wmi: Asus Zenbook 14 display toggle key support
- dell-lis3lv02d: Dell Latitude 5400 support
- hp-bioscfg: Fix out-of-bounds array access in ACPI package parsing
- ibm_rtl: Fix EBDA signature search pointer arithmetic
- ideapad-laptop: Reassign KEY_CUT to KEY_SELECTIVE_SCREENSHOT
- intel/pmt:
- Fix kobject memory leak on init failure
- Use valid pointers on error handling path
- intel/vsec: Correct kernel doc comments
- mellanox: mlxbf-pmc: Fix event names
- msi-laptop: Add sysfs_remove_group()
- samsumg-galaxybook: Do not cast pointer to a shorter type
- think-lmi: WMI certificate thumbprint support for ThinkCenter
- uniwill: Tuxedo Book BA15 Gen10 support
* tag 'platform-drivers-x86-v6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (22 commits)
platform/x86: asus-armoury: add support for G835LW
platform/x86: asus-armoury: fix ppt data for FA507R
platform/x86/intel/pmt/discovery: use valid device pointer in dev_err_probe
platform/x86: hp-bioscfg: Fix out-of-bounds array access in ACPI package parsing
platform/x86: asus-armoury: add support for G615LR
platform/x86: asus-armoury: add support for FA608UM
platform/x86: asus-armoury: add support for GA403WR
platform/x86: asus-armoury: add support for GU605CR
platform/x86: ideapad-laptop: Reassign KEY_CUT to KEY_SELECTIVE_SCREENSHOT
platform/x86: samsung-galaxybook: Fix problematic pointer cast
platform/x86/intel/pmt: Fix kobject memory leak on init failure
platform/x86/intel/vsec: correct kernel-doc comments
platform/x86: ibm_rtl: fix EBDA signature search pointer arithmetic
platform/x86: msi-laptop: add missing sysfs_remove_group()
platform/x86: think-lmi: Add WMI certificate thumbprint support for ThinkCenter
platform/x86: dell-lis3lv02d: Add Latitude 5400
platform/mellanox: mlxbf-pmc: Remove trailing whitespaces from event names
platform/x86: asus-nb-wmi: Add keymap for display toggle
platform/x86/uniwill: Add TUXEDO Book BA15 Gen10
platform/x86: alienware-wmi-wmax: Add support for Alienware 16X Aurora
...
'available_events' is actually not required by
'test.d/event/toplevel-enable.tc' and its Existence has been tested in
'test.d/00basic/basic4.tc'.
So the require of 'available_events' can be dropped and then we can add
'instance' flag to test 'test.d/event/toplevel-enable.tc' for instance.
Test result show as below:
# ./ftracetest test.d/event/toplevel-enable.tc
=== Ftrace unit tests ===
[1] event tracing - enable/disable with top level files [PASS]
[2] (instance) event tracing - enable/disable with top level files [PASS]
# of passed: 2
# of failed: 0
# of unresolved: 0
# of untested: 0
# of unsupported: 0
# of xfailed: 0
# of undefined(test bug): 0
Link: https://lore.kernel.org/r/20230509203659.1173917-1-zhengyejian1@huawei.com
Signed-off-by: Zheng Yejian <zhengyejian1@huawei.com>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Pull VFIO fixes from Alex Williamson:
- Restrict ROM access to dword to resolve a regression introduced with
qword access seen on some Intel NICs. Update VGA region access to the
same given lack of precedent for 64-bit users (Kevin Tian)
- Fix missing .get_region_info_caps callback in the xe-vfio-pci variant
driver due to integration through the DRM tree (Michal Wajdeczko)
- Add aligned 64-bit access macros to tools/include/linux/types.h,
allowing removal of uapi/linux/type.h includes from various vfio
selftest, resolving redefinition warnings for integration with KVM
selftests (David Matlack)
- Fix error path memory leak in pds-vfio-pci variant driver (Zilin Guan)
- Fix error path use-after-free in xe-vfio-pci variant driver (Alper Ak)
* tag 'vfio-v6.19-rc4' of https://github.com/awilliam/linux-vfio:
vfio/xe: Fix use-after-free in xe_vfio_pci_alloc_file()
vfio/pds: Fix memory leak in pds_vfio_dirty_enable()
vfio: selftests: Drop <uapi/linux/types.h> includes
tools include: Add definitions for __aligned_{l,b}e64
vfio/xe: Add default handler for .get_region_info_caps
vfio/pci: Disable qword access to the VGA region
vfio/pci: Disable qword access to the PCI ROM bar
Pull MD fixes from Yu Kuai:
"- Fix null-pointer dereference in raid5 sysfs group_thread_cnt store
(Tuo Li)
- Fix possible mempool corruption during raid1 raid_disks update via
sysfs (FengWei Shih)
- Fix logical_block_size configuration being overwritten during
super_1_validate() (Li Nan)
- Fix forward incompatibility with configurable logical block size:
arrays assembled on new kernels could not be assembled on kernels
<=6.18 due to non-zero reserved pad rejection (Li Nan)
- Fix static checker warning about iterator not incremented (Li Nan)"
* tag 'md-6.19-20251231' of gitolite.kernel.org:pub/scm/linux/kernel/git/mdraid/linux:
md: Fix forward incompatibility from configurable logical block size
md: Fix logical_block_size configuration being overwritten
md: suspend array while updating raid_disks via sysfs
md/raid5: fix possible null-pointer dereferences in raid5_store_group_thread_cnt()
md: Fix static checker warning in analyze_sbs
Initialize the eb.vma array with values of 0 when the eb structure is
first set up. In particular, this sets the eb->vma[i].vma pointers to
NULL, simplifying cleanup and getting rid of the bug described below.
During the execution of eb_lookup_vmas(), the eb->vma array is
successively filled up with struct eb_vma objects. This process includes
calling eb_add_vma(), which might fail; however, even in the event of
failure, eb->vma[i].vma is set for the currently processed buffer.
If eb_add_vma() fails, eb_lookup_vmas() returns with an error, which
prompts a call to eb_release_vmas() to clean up the mess. Since
eb_lookup_vmas() might fail during processing any (possibly not first)
buffer, eb_release_vmas() checks whether a buffer's vma is NULL to know
at what point did the lookup function fail.
In eb_lookup_vmas(), eb->vma[i].vma is set to NULL if either the helper
function eb_lookup_vma() or eb_validate_vma() fails. eb->vma[i+1].vma is
set to NULL in case i915_gem_object_userptr_submit_init() fails; the
current one needs to be cleaned up by eb_release_vmas() at this point,
so the next one is set. If eb_add_vma() fails, neither the current nor
the next vma is set to NULL, which is a source of a NULL deref bug
described in the issue linked in the Closes tag.
When entering eb_lookup_vmas(), the vma pointers are set to the slab
poison value, instead of NULL. This doesn't matter for the actual
lookup, since it gets overwritten anyway, however the eb_release_vmas()
function only recognizes NULL as the stopping value, hence the pointers
are being set to NULL as they go in case of intermediate failure. This
patch changes the approach to filling them all with NULL at the start
instead, rather than handling that manually during failure.
Reported-by: Gangmin Kim <km.kim1503@gmail.com>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/15062
Fixes: 544460c338 ("drm/i915: Multi-BB execbuf")
Cc: stable@vger.kernel.org # 5.16.x
Signed-off-by: Krzysztof Niemiec <krzysztof.niemiec@intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
Link: https://lore.kernel.org/r/20251216180900.54294-2-krzysztof.niemiec@intel.com
(cherry picked from commit 08889b706d)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Ensure that in the ftrace direct call logic, the CPU register state
(with ra = parent return address) is restored to the correct state after
the execution of the custom trampoline function and before returning to
the traced function. Additionally, guarantee the correctness of the jump
logic for jr t0 (traced function address).
Cc: stable@vger.kernel.org
Fixes: 9cdc3b6a29 ("LoongArch: ftrace: Add direct call support")
Reported-by: Youling Tang <tangyouling@kylinos.cn>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Enhance the bpf_arch_text_poke() function to enable accurate location
of BPF program entry points.
When modifying the entry point of a BPF program, skip the "move t0, ra"
instruction to ensure the correct logic and copy of the jump address.
Cc: stable@vger.kernel.org
Fixes: 677e6123e3 ("LoongArch: BPF: Disable trampoline for kernel module function trace")
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Remove the previous restrictions that blocked the tracing of kernel
module functions. Fix the issue that previously caused kernel lockups
when attempting to trace module functions.
Before entering the trampoline code, the return address register ra
shall store the address of the next assembly instruction after the
'bl trampoline' instruction, which is the traced function address, and
the register t0 shall store the parent function return address. Refine
the trampoline return logic to ensure that register data remains correct
when returning to both the traced function and the parent function.
Before this patch was applied, the module_attach test in selftests/bpf
encountered a deadlock issue. This was caused by an incorrect jump
address after the trampoline execution, which resulted in an infinite
loop within the module function.
Cc: stable@vger.kernel.org
Fixes: 677e6123e3 ("LoongArch: BPF: Disable trampoline for kernel module function trace")
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Call the next bpf prog and skip the first instruction of TCC
initialization.
A total of 7 instructions are skipped:
'move t0, ra' 1 inst
'move_imm + jirl' 5 inst
'addid REG_TCC, zero, 0' 1 inst
Relevant test cases: the tailcalls test item in selftests/bpf.
Cc: stable@vger.kernel.org
Fixes: 677e6123e3 ("LoongArch: BPF: Disable trampoline for kernel module function trace")
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Modify the build_prologue() function to ensure the return address
register ra is saved to t0 before entering trampoline operations.
This change ensures the accurate return address handling when a BPF
program calls another BPF program, preventing errors in the BPF-to-BPF
call chain.
Cc: stable@vger.kernel.org
Fixes: 677e6123e3 ("LoongArch: BPF: Disable trampoline for kernel module function trace")
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The bpf_tail_call() index should be treated as a u32 value. Let's
zero-extend it to avoid calling wrong BPF progs. See similar fixes
for x86 [1]) and arm64 ([2]) for more details.
[1]: 90caccdd8c
[2]: 16338a9b3a
Cc: stable@vger.kernel.org
Fixes: 5dc615520c ("LoongArch: Add BPF JIT support")
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The kfunc calls are native calls so they should follow LoongArch calling
conventions. Sign extend its arguments properly to avoid kernel panic.
This is done by adding a new emit_abi_ext() helper. The emit_abi_ext()
helper performs extension in place meaning a value already store in the
target register (Note: this is different from the existing sign_extend()
helper and thus we can't reuse it).
Cc: stable@vger.kernel.org
Fixes: 5dc615520c ("LoongArch: Add BPF JIT support")
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Refactor the register restoration sequence in the ftrace_common_return
function to clearly distinguish between the logic of normal returns and
direct call returns in function tracing scenarios. The logic is as
follows:
1. In the case of a normal return, the execution flow returns to the
traced function, and ftrace must ensure that the register data is
consistent with the state when the function was entered.
ra = parent return address; t0 = traced function return address.
2. In the case of a direct call return, the execution flow jumps to the
custom trampoline function, and ftrace must ensure that the register
data is consistent with the state when ftrace was entered.
ra = traced function return address; t0 = parent return address.
Cc: stable@vger.kernel.org
Fixes: 9cdc3b6a29 ("LoongArch: ftrace: Add direct call support")
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This patch allows the LoongArch BPF JIT to handle recoverable memory
access errors generated by BPF_PROBE_MEM* instructions.
When a BPF program performs memory access operations, the instructions
it executes may trigger ADEM exceptions. The kernel’s built-in BPF
exception table mechanism (EX_TYPE_BPF) will generate corresponding
exception fixup entries in the JIT compilation phase; however, the
architecture-specific trap handling function needs to proactively call
the common fixup routine to achieve exception recovery.
do_ade(): fix EX_TYPE_BPF memory access exceptions for BPF programs,
ensure safe execution.
Relevant test cases: illegal address access tests in module_attach and
subprogs_extable of selftests/bpf.
Signed-off-by: Chenghao Duan <duanchenghao@kylinos.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
According to the following function definitions, __kernel_text_address()
already checks __module_text_address(), so it should remove the check of
__module_text_address() in bt_address() at least.
int __kernel_text_address(unsigned long addr)
{
if (kernel_text_address(addr))
return 1;
...
return 0;
}
int kernel_text_address(unsigned long addr)
{
bool no_rcu;
int ret = 1;
...
if (is_module_text_address(addr))
goto out;
...
return ret;
}
bool is_module_text_address(unsigned long addr)
{
guard(rcu)();
return __module_text_address(addr) != NULL;
}
Furthermore, there are two checks of __kernel_text_address(), one is in
bt_address() and the other is after calling bt_address(), it looks like
redundant.
Handle the exception address first and then use __kernel_text_address()
to validate the calculated address for exception or the normal address
in bt_address(), then it can remove the check of __kernel_text_address()
after calling bt_address().
Just remove unnecessary checks, no functional changes intended.
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>