linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-06-02 00:32:31 -04:00

Author	SHA1	Message	Date
Wendy Liang	140cfee588	accel/amdxdna: Return ERR_PTR on dma_alloc_noncoherent failure dma_alloc_noncoherent() returns NULL on failure, but callers of aie2_alloc_msg_buffer() check for IS_ERR(). Return ERR_PTR(-ENOMEM) instead of NULL to match the amdxdna_iommu_alloc() path and the caller's error checking convention. Fixes: `ece3e89809` ("accel/amdxdna: Allow forcing IOVA-based DMA via module parameter") Signed-off-by: Wendy Liang <wendy.liang@amd.com> Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260323173719.2311474-1-lizhi.hou@amd.com	2026-03-23 13:29:22 -07:00
haoyu.lu	d1c7388485	accel/amdxdna: fix missing newline in pr_err message Add missing newline to pr_err message in amdxdna_mailbox.c. Fixes: `b87f920b93` ("accel/amdxdna: Support hardware mailbox") Signed-off-by: haoyu.lu <hechushiguitu666@gmail.com> Reviewed-by: Lizhi.hou <lizhi.hou@amd.com> Signed-off-by: Lizhi.hou <lizhi.hou@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260323034933.216-1-hechushiguitu666@gmail.com	2026-03-23 09:36:51 -07:00
Max Zhen	d76856beb4	accel/amdxdna: Refactor GEM BO handling and add helper APIs for address retrieval Refactor amdxdna GEM buffer object (BO) handling to simplify address management and unify BO type semantics. Introduce helper APIs to retrieve commonly used BO addresses: - User virtual address (UVA) - Kernel virtual address (KVA) - Device address (IOVA/PA) These helpers centralize address lookup logic and avoid duplicating BO-specific handling across submission and execution paths. This also improves readability and reduces the risk of inconsistent address handling in future changes. As part of the refactor: - Rename SHMEM BO type to SHARE to better reflect its usage. - Merge CMD BO handling into SHARE, removing special-case logic for command buffers. - Consolidate BO type handling paths to reduce code duplication and simplify maintenance. No functional change is intended. The refactor prepares the driver for future enhancements by providing a cleaner abstraction for BO address management. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Max Zhen <max.zhen@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260320210615.1973016-1-lizhi.hou@amd.com	2026-03-20 22:12:49 -07:00
Karol Wachowski	ade00a6c90	accel/ivpu: Perform engine reset instead of device recovery on TDR Replace full device recovery on TDR timeout with per-context abort, allowing individual context handling instead of resetting the entire device. Extend ivpu_jsm_reset_engine() to return the list of contexts impacted by the engine reset and use that information to abort only the affected contexts. Only check for potentially faulty contexts when the engine reset was not triggered by an MMU fault or a job completion error status. This prevents misidentifying non-guilty contexts that happened to be running at the time of the fault. Trigger full device recovery if no contexts were marked by engine reset if triggered by job completion timeout, as there is no way to identify guilty one. Add engine reset counter to debugfs for engine resets bookkeeping for debugging/testing purposes. Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://patch.msgid.link/20260318093927.4080303-1-karol.wachowski@linux.intel.com	2026-03-20 08:03:11 +01:00
Lizhi Hou	25854131c0	accel/amdxdna: Support retrieving hardware context debug information The firmware implements the GET_APP_HEALTH command to collect debug information for a specific hardware context. When a command times out, the driver issues this command to collect the relevant debug information. User space tools can also retrieve this information through the hardware context query IOCTL. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260317044906.1513133-1-lizhi.hou@amd.com	2026-03-17 14:08:57 -07:00
Lizhi Hou	8ed8b02396	accel/amdxdna: Add debug prints for command submission Add debug prints to help diagnose issues with incoming command submissions. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260316175642.1451749-1-lizhi.hou@amd.com	2026-03-16 12:53:22 -07:00
Lizhi Hou	ece3e89809	accel/amdxdna: Allow forcing IOVA-based DMA via module parameter The amdxdna driver normally performs DMA using userspace virtual address plus PASID. For debugging and validation purposes, add a module parameter, force_iova, to force DMA to go through IOMMU IOVA mapping. When force_iova=1 is set, the driver will allocate and map DMA buffers using IOVA. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260126193001.1400545-1-lizhi.hou@amd.com	2026-03-12 12:37:56 -07:00
Maxime Ripard	f08ceb71c5	Merge drm/drm-next into drm-misc-next Biju Das needs a patch for rz-du merged in 7.0-rc3 Signed-off-by: Maxime Ripard <mripard@kernel.org>	2026-03-12 08:25:41 +01:00
Mario Limonciello (AMD)	f66d6cc689	accel/amdxdna: Support sensors for column utilization The AMD PMF driver provides realtime column utilization (npu_busy) metrics for the NPU. Extend the DRM_IOCTL_AMDXDNA_GET_INFO sensor query to expose these metrics to userspace. Add AMDXDNA_SENSOR_TYPE_COLUMN_UTILIZATION to the sensor type enum and update aie2_get_sensors() to return both the total power and up to 8 column utilization sensors if the user buffer permits. Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> [lizhi: support legacy tool which uses small buffer. checkpatch cleanup] Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260311171842.473453-1-lizhi.hou@amd.com	2026-03-11 11:37:36 -07:00
Lizhi Hou	f3bffef037	accel/amdxdna: Add IOCTL to retrieve realtime NPU power estimate The AMD PMF driver provides an interface to obtain realtime power estimates for the NPU. Expose this information to userspace through a new DRM_IOCTL_AMDXDNA_GET_INFO parameter, allowing applications to query the current NPU power level. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com> (Update comment to indicate power and utilization) Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260228061109.361239-2-superm1@kernel.org	2026-03-11 11:36:51 -07:00
Mario Limonciello (AMD)	efa3382a5a	accel/amdxdna: Import AMD_PMF namespace The amdxdna driver uses amd_pmf_get_npu_data() which is exported in the AMD_PMF namespace. Import the AMD_PMF namespace. Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260301005028.367618-1-superm1@kernel.org	2026-03-11 11:36:41 -07:00
Karol Wachowski	ed1021fbf9	accel/ivpu: Apply minor code style cleanups to align with kernel style Replace direct import_attach test with drm_gem_is_imported() in ivpu_bo_bind(). Replace kzalloc(sizeof(*bo), GFP_KERNEL) with kzalloc_obj() in ivpu_gem_create_object(). Remove unnecessary cast to bool in ivpu_dbg_bo(). No functional changes. Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://patch.msgid.link/20260310120736.3341679-1-karol.wachowski@linux.intel.com	2026-03-11 13:31:08 +01:00
Simona Vetter	58351f46de	Merge v7.0-rc3 into drm-next Requested by Maxime Ripard for drm-misc-next because renesas people need `fb797a7010` ("drm: renesas: rz-du: mipi_dsi: Set DSI divider"). Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>	2026-03-11 11:18:31 +01:00
Karol Wachowski	153308a2f9	accel/ivpu: Test for imported buffers with drm_gem_is_imported() Instead of testing import_attach for imported GEM buffers, invoke drm_gem_is_imported() to do the test. The test itself does not change. Suggested-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://patch.msgid.link/20260309092755.3165130-1-karol.wachowski@linux.intel.com	2026-03-09 10:47:26 +01:00
Andrzej Kacprowski	81e62e7bf8	accel/ivpu: Remove boot params address setting via MMIO register The NPU 60XX uses the default boot params location specified in the firmware image header, consistent with earlier generations. Remove the unnecessary MMIO register write, freeing the AON register for future use. Fixes: `44e4c88951` ("accel/ivpu: Implement warm boot flow for NPU6 and unify boot handling") Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com> Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Link: https://patch.msgid.link/20260305142226.194995-1-andrzej.kacprowski@linux.intel.com	2026-03-09 10:05:10 +01:00
Rob Herring (Arm)	021f1b77f7	accel: ethosu: Handle possible underflow in IFM size calculations If the command stream has larger padding sizes than the IFM and OFM diminsions, then the calculations will underflow to a negative value. The result is a very large region bounds which is caught on submit, but it's better to catch it earlier. Current mesa ethosu driver has a signedness bug which resulted in padding of 127 (the max) and triggers this issue. Reviewed-and-Tested-by: Anders Roxell <anders.roxell@linaro.org> Link: https://patch.msgid.link/20260218-ethos-fixes-v1-3-be3fa3ea9a30@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2026-03-05 15:21:17 -06:00
Rob Herring (Arm)	838ae99f9a	accel: ethosu: Fix NPU_OP_ELEMENTWISE validation with scalar The NPU_OP_ELEMENTWISE instruction uses a scalar value for IFM2 if the IFM2_BROADCAST "scalar" mode is set. It is a bit (7) on the u65 and part of a field (bits 3:0) on the u85. The driver was hardcoded to the u85. Fixes: `5a5e9c0228` ("accel: Add Arm Ethos-U NPU driver") Reviewed-and-Tested-by: Anders Roxell <anders.roxell@linaro.org> Link: https://patch.msgid.link/20260218-ethos-fixes-v1-2-be3fa3ea9a30@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2026-03-05 15:21:17 -06:00
Rob Herring (Arm)	150bceb3e0	accel: ethosu: Fix job submit error clean-up refcount underflows If the job submit fails before adding the job to the scheduler queue such as when the GEM buffer bounds checks fail, then doing a ethosu_job_put() results in a pm_runtime_put_autosuspend() without the corresponding pm_runtime_resume_and_get(). The dma_fence_put()'s are also unnecessary, but seem to be harmless. Split the ethosu_job_cleanup() function into 2 parts for the before and after the job is queued. Fixes: `5a5e9c0228` ("accel: Add Arm Ethos-U NPU driver") Reviewed-and-Tested-by: Anders Roxell <anders.roxell@linaro.org> Link: https://patch.msgid.link/20260218-ethos-fixes-v1-1-be3fa3ea9a30@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2026-03-05 15:21:17 -06:00
Lizhi Hou	d5b8b0347f	accel/amdxdna: Split mailbox channel create function The management channel used for firmware control command submission is currently created after the firmware is started. If channel creation fails (for example, due to memory allocation failure or workqueue creation interruption), the firmware remains in a pending state and is unable to receive any control commands. To avoid leaving the firmware in this inconsistent state, split xdna_mailbox_create_channel() into two separate functions so that resource allocation can be completed before interacting with the hardware. xdna_mailbox_alloc_channel() Allocates memory and initializes the workqueue. This can be called earlier, before interacting with the hardware. xdna_mailbox_start_channel() Performs the hardware interaction required to start the channel. Rename xdna_mailbox_destroy_channel() to xdna_mailbox_free_channel(). Ensure that xdna_mailbox_stop_channel() and xdna_mailbox_free_channel() properly unwind the corresponding start and allocation steps, respectively. Fixes: `b87f920b93` ("accel/amdxdna: Support hardware mailbox") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260305062041.3954024-1-lizhi.hou@amd.com	2026-03-05 09:24:33 -08:00
Lizhi Hou	f82859c84a	accel/amdxdna: Fix major version check on NPU1 platform Add the missing major number in npu1_fw_feature_table. Without the major version specified, the firmware feature check fails, preventing new firmware commands from being enabled on the NPU1 platform. With the correct major version populated, the driver properly detects firmware support and enables the new command. Fixes: `f1eac46fe5` ("accel/amdxdna: Update firmware version check for latest firmware") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260304195012.3616908-1-lizhi.hou@amd.com	2026-03-04 12:05:02 -08:00
Karol Wachowski	86a14330bf	accel/ivpu: Limit number of maximum contexts and doorbells per user Implement per-user resource limits to prevent resource exhaustion. Root users can allocate all available contexts (128) and doorbells (255), while non-root users are limited to half of the available resources (64 contexts and 127 doorbells respectively). This prevents scenarios where a single user could monopolize NPU resources and starve other users on multi-user systems. Change doorbell ID and command queue ID allocation errors to debug messages as those are user triggered. Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Link: https://patch.msgid.link/20260302202207.469442-1-maciej.falkowski@linux.intel.com	2026-03-03 12:54:03 +01:00
Thomas Zimmermann	6716101ae4	Merge drm/drm-next into drm-misc-next Backmerge fixes from v7.0-rc2 into drm-misc-next. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2026-03-03 10:40:37 +01:00
Lizhi Hou	6270ee26e1	accel/amdxdna: Fix NULL pointer dereference of mgmt_chann mgmt_chann may be set to NULL if the firmware returns an unexpected error in aie2_send_mgmt_msg_wait(). This can later lead to a NULL pointer dereference in aie2_hw_stop(). Fix this by introducing a dedicated helper to destroy mgmt_chann and by adding proper NULL checks before accessing it. Fixes: `b87f920b93` ("accel/amdxdna: Support hardware mailbox") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260226213857.3068474-1-lizhi.hou@amd.com	2026-03-02 09:43:22 -08:00
Lizhi Hou	89ff45359a	accel/amdxdna: Fill invalid payload for failed command Newer userspace applications may read the payload of a failed command to obtain detailed error information. However, the driver and old firmware versions may not support returning advanced error information. In this case, initialize the command payload with an invalid value so userspace can detect that no detailed error information is available. Fixes: `aac243092b` ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260227004841.3080241-1-lizhi.hou@amd.com	2026-02-27 23:01:36 -08:00
Maciej Falkowski	5b50b7c2ce	accel/ivpu: Update FW Boot API to version 3.29.4 Update firmware boot API to the version 3.29.4. Remove unused boot parameters from the vpu_firmware_header structure. Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Link: https://patch.msgid.link/20260220160116.220367-1-maciej.falkowski@linux.intel.com	2026-02-26 17:01:39 +01:00
Lizhi Hou	75c151ceaa	accel/amdxdna: Use a different name for latest firmware Using legacy driver with latest firmware causes a power off issue. Fix this by assigning a different filename (npu_7.sbin) to the latest firmware. The driver attempts to load the latest firmware first and falls back to the previous firmware version if loading fails. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/5009 Fixes: `f1eac46fe5` ("accel/amdxdna: Update firmware version check for latest firmware") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260225204752.2711734-1-lizhi.hou@amd.com	2026-02-25 13:51:31 -08:00
Lizhi Hou	901ec34709	accel/amdxdna: Validate command buffer payload count The count field in the command header is used to determine the valid payload size. Verify that the valid payload does not exceed the remaining buffer space. Fixes: `aac243092b` ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260219211946.1920485-1-lizhi.hou@amd.com	2026-02-23 09:24:21 -08:00
Lizhi Hou	03808abb1d	accel/amdxdna: Prevent ubuf size overflow The ubuf size calculation may overflow, resulting in an undersized allocation and possible memory corruption. Use check_add_overflow() helpers to validate the size calculation before allocation. Fixes: `bd72d4acda` ("accel/amdxdna: Support user space allocated buffer") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260217192815.1784689-1-lizhi.hou@amd.com	2026-02-23 09:24:21 -08:00
Lizhi Hou	1110a94967	accel/amdxdna: Fix out-of-bounds memset in command slot handling The remaining space in a command slot may be smaller than the size of the command header. Clearing the command header with memset() before verifying the available slot space can result in an out-of-bounds write and memory corruption. Fix this by moving the memset() call after the size validation. Fixes: `3d32eb7a5e` ("accel/amdxdna: Fix cu_idx being cleared by memset() during command setup") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260217185415.1781908-1-lizhi.hou@amd.com	2026-02-23 09:24:20 -08:00
Lizhi Hou	07efce5a66	accel/amdxdna: Fix command hang on suspended hardware context When a hardware context is suspended, the job scheduler is stopped. If a command is submitted while the context is suspended, the job is queued in the scheduler but aie2_sched_job_run() is never invoked to restart the hardware context. As a result, the command hangs. Fix this by modifying the hardware context suspend routine to keep the job scheduler running so that queued jobs can trigger context restart properly. Fixes: `aac243092b` ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260211205341.722982-1-lizhi.hou@amd.com	2026-02-23 09:24:19 -08:00
Lizhi Hou	fdb65acfe6	accel/amdxdna: Fix suspend failure after enabling turbo mode Enabling turbo mode disables hardware clock gating. Suspend requires hardware clock gating to be re-enabled, otherwise suspend will fail. Fix this by calling aie2_runtime_cfg() from aie2_hw_stop() to re-enable clock gating during suspend. Also ensure that firmware is initialized in aie2_hw_start() before modifying clock-gating settings during resume. Fixes: `f4d7b8a6bc` ("accel/amdxdna: Enhance power management settings") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260211204716.722788-1-lizhi.hou@amd.com	2026-02-23 09:24:19 -08:00
Lizhi Hou	1aa82181a3	accel/amdxdna: Fix dead lock for suspend and resume When an application issues a query IOCTL while auto suspend is running, a deadlock can occur. The query path holds dev_lock and then calls pm_runtime_resume_and_get(), which waits for the ongoing suspend to complete. Meanwhile, the suspend callback attempts to acquire dev_lock and blocks, resulting in a deadlock. Fix this by releasing dev_lock before calling pm_runtime_resume_and_get() and reacquiring it after the call completes. Also acquire dev_lock in the resume callback to keep the locking consistent. Fixes: `063db45183` ("accel/amdxdna: Enhance runtime power management") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260211204644.722758-1-lizhi.hou@amd.com	2026-02-23 09:24:17 -08:00
Mario Limonciello	57aa3917a3	accel/amdxdna: Reduce log noise during process termination During process termination, several error messages are logged that are not actual errors but expected conditions when a process is killed or interrupted. This creates unnecessary noise in the kernel log. The specific scenarios are: 1. HMM invalidation returns -ERESTARTSYS when the wait is interrupted by a signal during process cleanup. This is expected when a process is being terminated and should not be logged as an error. 2. Context destruction returns -ENODEV when the firmware or device has already stopped, which commonly occurs during cleanup if the device was already torn down. This is also an expected condition during orderly shutdown. Downgrade these expected error conditions from error level to debug level to reduce log noise while still keeping genuine errors visible. Fixes: `97f2757383` ("accel/amdxdna: Fix potential NULL pointer dereference in context cleanup") Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260210164521.1094274-3-mario.limonciello@amd.com	2026-02-23 09:24:16 -08:00
Lizhi Hou	8363c02863	accel/amdxdna: Fix crash when destroying a suspended hardware context If userspace issues an ioctl to destroy a hardware context that has already been automatically suspended, the driver may crash because the mailbox channel pointer is NULL for the suspended context. Fix this by checking the mailbox channel pointer in aie2_destroy_context() before accessing it. Fixes: `97f2757383` ("accel/amdxdna: Fix potential NULL pointer dereference in context cleanup") Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260206060306.4050531-1-lizhi.hou@amd.com	2026-02-23 09:24:15 -08:00
Lizhi Hou	c68a6af400	accel/amdxdna: Switch to always use chained command Preempt commands are only supported when submitted as chained commands. To ensure preempt support works consistently, always submit commands in chained command format. Set force_cmdlist to true so that single commands are filled using the chained command layout, enabling correct handling of preempt commands. Fixes: `3a0ff7b98a` ("accel/amdxdna: Support preemption requests") Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260206060251.4050512-1-lizhi.hou@amd.com	2026-02-23 09:24:15 -08:00
Lizhi Hou	08fe1b5166	accel/amdxdna: Remove buffer size check when creating command BO Large command buffers may be used, and they do not always need to be mapped or accessed by the driver. Performing a size check at command BO creation time unnecessarily rejects valid use cases. Remove the buffer size check from command BO creation, and defer vmap and size validation to the paths where the driver actually needs to map and access the command buffer. Fixes: `ac49797c18` ("accel/amdxdna: Add GEM buffer object management") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260206060237.4050492-1-lizhi.hou@amd.com	2026-02-23 09:23:41 -08:00
Maxime Ripard	c17ee635fd	Merge drm/drm-fixes into drm-misc-fixes 7.0-rc1 was just released, let's merge it to kick the new release cycle. Signed-off-by: Maxime Ripard <mripard@kernel.org>	2026-02-23 10:09:45 +01:00
Kees Cook	189f164e57	Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL uses Conversion performed via this Coccinelle script: // SPDX-License-Identifier: GPL-2.0-only // Options: --include-headers-for-types --all-includes --include-headers --keep-comments virtual patch @gfp depends on patch && !(file in "tools") && !(file in "samples")@ identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex, kzalloc_obj,kzalloc_objs,kzalloc_flex, kvmalloc_obj,kvmalloc_objs,kvmalloc_flex, kvzalloc_obj,kvzalloc_objs,kvzalloc_flex}; @@ ALLOC(... - , GFP_KERNEL ) $ make coccicheck MODE=patch COCCI=gfp.cocci Build and boot tested x86_64 with Fedora 42's GCC and Clang: Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-22 08:26:33 -08:00
Linus Torvalds	32a92f8c89	Convert more 'alloc_obj' cases to default GFP_KERNEL arguments This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 20:03:00 -08:00
Linus Torvalds	323bbfcf1e	Convert 'alloc_flex' family to use the new default GFP_KERNEL argument This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Linus Torvalds	bf4afc53b7	Convert 'alloc_obj' family to use the new default GFP_KERNEL argument This was done entirely with mindless brute force, using git grep -l '\<k[vmz]alloc_objs(., GFP_KERNEL)' \| xargs sed -i 's/$alloc_objs(.*$, GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2026-02-21 17:09:51 -08:00
Kees Cook	69050f8d6d	treewide: Replace kmalloc with kmalloc_obj for non-scalar types This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(PTR, FAM, COUNT, ...) (where TYPE may also be VAR) The resulting allocations no longer return "void ", instead returning "TYPE ". Signed-off-by: Kees Cook <kees@kernel.org>	2026-02-21 01:02:28 -08:00
Dan Carpenter	7be41fb00e	accel: ethosu: Fix shift overflow in cmd_to_addr() The "((cmd[0] & 0xff0000) << 16)" shift is zero. This was intended to be (((u64)cmd[0] & 0xff0000) << 16). Move the cast to the correct location. Fixes: `5a5e9c0228` ("accel: Add Arm Ethos-U NPU driver") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://patch.msgid.link/aQGmY64tWcwOGFP4@stanley.mountain Signed-off-by: Rob Herring (Arm) <robh@kernel.org>	2026-02-18 16:28:08 -06:00
Linus Torvalds	505d195b0f	Merge tag 'char-misc-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc/IIO driver updates from Greg KH: "Here is the big set of char/misc/iio and other smaller driver subsystem changes for 7.0-rc1. Lots of little things in here, including: - Loads of iio driver changes and updates and additions - gpib driver updates - interconnect driver updates - i3c driver updates - hwtracing (coresight and intel) driver updates - deletion of the obsolete mwave driver - binder driver updates (rust and c versions) - mhi driver updates (causing a merge conflict, see below) - mei driver updates - fsi driver updates - eeprom driver updates - lots of other small char and misc driver updates and cleanups All of these have been in linux-next for a while, with no reported issues" * tag 'char-misc-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (297 commits) mux: mmio: fix regmap leak on probe failure rust_binder: return p from rust_binder_transaction_target_node() drivers: android: binder: Update ARef imports from sync::aref rust_binder: fix needless borrow in context.rs iio: magn: mmc5633: Fix Kconfig for combination of I3C as module and driver builtin iio: sca3000: Fix a resource leak in sca3000_probe() iio: proximity: rfd77402: Add interrupt handling support iio: proximity: rfd77402: Document device private data structure iio: proximity: rfd77402: Use devm-managed mutex initialization iio: proximity: rfd77402: Use kernel helper for result polling iio: proximity: rfd77402: Align polling timeout with datasheet iio: cros_ec: Allow enabling/disabling calibration mode iio: frequency: ad9523: correct kernel-doc bad line warning iio: buffer: buffer_impl.h: fix kernel-doc warnings iio: gyro: itg3200: Fix unchecked return value in read_raw MAINTAINERS: add entry for ADE9000 driver iio: accel: sca3000: remove unused last_timestamp field iio: accel: adxl372: remove unused int2_bitmask field iio: adc: ad7766: Use iio_trigger_generic_data_rdy_poll() iio: magnetometer: Remove IRQF_ONESHOT ...	2026-02-17 09:11:04 -08:00
Lizhi Hou	69674c1c70	accel/amdxdna: Move RPM resume into job run function Currently, amdxdna_pm_resume_get() is called during job creation, and amdxdna_pm_suspend_put() is called when the hardware notifies job completion. If a job is canceled before it is run, no hardware completion notification is generated, resulting in an unbalanced runtime PM resume/suspend pair. Fix this by moving amdxdna_pm_resume_get() to the job run path, ensuring runtime PM is only resumed for jobs that are actually executed. Fixes: `063db45183` ("accel/amdxdna: Enhance runtime power management") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260204171118.3165607-1-lizhi.hou@amd.com	2026-02-04 13:08:48 -08:00
Lizhi Hou	d19d963d2a	accel/amdxdna: Fix incorrect DPM level after suspend/resume The suspend routine sets the DPM level to 0, which unintentionally overwrites the previously saved DPM level. As a result, the device always resumes with DPM level 0 instead of restoring the original value. Fix this by ensuring the suspend path does not overwrite the saved DPM level, allowing the correct DPM level to be restored during resume. Fixes: `f4d7b8a6bc` ("accel/amdxdna: Enhance power management settings") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260204171048.3165580-1-lizhi.hou@amd.com	2026-02-04 13:08:35 -08:00
Lizhi Hou	750817a7c4	accel/amdxdna: Fix incorrect error code returned for failed chain command The driver currently returns an incorrect error code when a chain command fails. In this case, ERT_CMD_STATE_ERROR is expected to be reported for failed chain commands. Fixes: `aac243092b` ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260203184037.2751889-1-lizhi.hou@amd.com	2026-02-03 11:35:53 -08:00
Lizhi Hou	b853007fdc	accel/amdxdna: Remove hardware context status One newly supported command does not require hardware context configuration to be performed upfront. As a result, checking hardware context status causes this command to fail incorrectly. Remove hardware context status handling entirely. For other commands, if userspace submits a request without configuring the hardware context first, the firmware will report an error or time out as appropriate. Fixes: `aac243092b` ("accel/amdxdna: Add command execution") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260202212450.2681273-1-lizhi.hou@amd.com	2026-02-03 09:20:24 -08:00
Zishun Yi	84dd57fb03	accel/amdxdna: Fix memory leak in amdxdna_ubuf_map The amdxdna_ubuf_map() function allocates memory for sg and internal sg table structures, but it fails to free them if subsequent operations (sg_alloc_table_from_pages or dma_map_sgtable) fail. Fixes: `bd72d4acda` ("accel/amdxdna: Support user space allocated buffer") Signed-off-by: Zishun Yi <zishun.yi.dev@gmail.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Min Ma <mamin506@gmail.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260129171022.68578-1-zishun.yi.dev@gmail.com	2026-01-30 11:52:59 -08:00
Lizhi Hou	f1370241fe	accel/amdxdna: Stop job scheduling across aie2_release_resource() Running jobs on a hardware context while it is in the process of releasing resources can lead to use-after-free and crashes. Fix this by stopping job scheduling before calling aie2_release_resource() and restarting it after the release completes. Additionally, aie2_sched_job_run() now checks whether the hardware context is still active. Fixes: `4fd6ca90fc` ("accel/amdxdna: Refactor hardware context destroy routine") Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patch.msgid.link/20260130003255.2083255-1-lizhi.hou@amd.com	2026-01-30 11:52:53 -08:00

1 2 3 4 5 ...

1049 Commits