Currently, amdxdna_pm_resume_get() is called during job creation, and
amdxdna_pm_suspend_put() is called when the hardware notifies job
completion. If a job is canceled before it is run, no hardware
completion notification is generated, resulting in an unbalanced
runtime PM resume/suspend pair.
Fix this by moving amdxdna_pm_resume_get() to the job run path, ensuring
runtime PM is only resumed for jobs that are actually executed.
Fixes: 063db45183 ("accel/amdxdna: Enhance runtime power management")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260204171118.3165607-1-lizhi.hou@amd.com
The suspend routine sets the DPM level to 0, which unintentionally
overwrites the previously saved DPM level. As a result, the device always
resumes with DPM level 0 instead of restoring the original value.
Fix this by ensuring the suspend path does not overwrite the saved DPM
level, allowing the correct DPM level to be restored during resume.
Fixes: f4d7b8a6bc ("accel/amdxdna: Enhance power management settings")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260204171048.3165580-1-lizhi.hou@amd.com
One newly supported command does not require hardware context configuration
to be performed upfront. As a result, checking hardware context status
causes this command to fail incorrectly.
Remove hardware context status handling entirely. For other commands,
if userspace submits a request without configuring the hardware context
first, the firmware will report an error or time out as appropriate.
Fixes: aac243092b ("accel/amdxdna: Add command execution")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260202212450.2681273-1-lizhi.hou@amd.com
Running jobs on a hardware context while it is in the process of
releasing resources can lead to use-after-free and crashes.
Fix this by stopping job scheduling before calling
aie2_release_resource() and restarting it after the release completes.
Additionally, aie2_sched_job_run() now checks whether the hardware
context is still active.
Fixes: 4fd6ca90fc ("accel/amdxdna: Refactor hardware context destroy routine")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260130003255.2083255-1-lizhi.hou@amd.com
Some tests trigger a crash in iommu_sva_unbind_device() due to
accessing iommu_mm after the associated mm structure has been
freed.
Fix this by taking an explicit reference to the mm structure
after successfully binding the device, and releasing it only
after the device is unbound. This ensures the mm remains valid
for the entire SVA bind/unbind lifetime.
Fixes: be462c97b7 ("accel/amdxdna: Add hardware context")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260128002356.1858122-1-lizhi.hou@amd.com
There already is a function to return the offset of the core for a given
struct device, so let's reuse that function instead of reimplementing
the same logic.
There's one change in behavior when a struct device is passed which
doesn't match any core's. Before, we would continue through
rocket_remove() but now we exit early, to match what other callers of
find_core_for_dev() (rocket_device_runtime_resume/suspend()) are doing.
This however should never happen. Aside from that, no intended change in
behavior.
Signed-off-by: Quentin Schulz <quentin.schulz@cherry.de>
Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Link: https://patch.msgid.link/20251215-rocket-reuse-find-core-v1-1-be86a1d2734c@cherry.de
When rocket_job_init() is called, iommu_group_get() has already been
called, therefore we should call iommu_group_put() and make the
iommu_group pointer NULL. This aligns with what's done in
rocket_core_fini().
If pm_runtime_resume_and_get() somehow fails, not only should
rocket_job_fini() be called but we should also unwind everything done
before that, that is, disable PM, put the iommu_group, NULLify it and
then call rocket_job_fini(). This is exactly what's done in
rocket_core_fini() so let's call that function instead of duplicating
the code.
Fixes: 0810d5ad88 ("accel/rocket: Add job submission IOCTL")
Cc: stable@vger.kernel.org
Signed-off-by: Quentin Schulz <quentin.schulz@cherry.de>
Reviewed-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Link: https://patch.msgid.link/20251215-rocket-error-path-v1-1-eec3bf29dc3b@cherry.de
The latest firmware requires the message DMA buffer to
- have a minimum size of 8K
- use a power-of-two size
- be aligned to the buffer size
- not cross 64M boundary
Update the buffer allocation logic to meet these requirements and support
the latest firmware.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251219014356.2234241-1-lizhi.hou@amd.com
Starting from NPU6, the driver can pass boot parameters address through
the AON retention register and toggle between cold/warm boot types using
the boot_type parameter, while setting the cold boot entry point in both
cases.
Refactor the existing cold/warm boot handling to be consistent with the
new NPU6 boot flow requirements and still maintain compatibility with
older boot flows.
This will allow firmware to remove support for legacy warm boot starting
from NPU6.
Fixes: 550f4dd2ce ("accel/ivpu: Add support for Nova Lake's NPU")
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Link: https://patch.msgid.link/20251230142116.540026-1-maciej.falkowski@linux.intel.com
Newer firmware versions prefer temporal sharing only mode. In this mode,
the driver no longer needs to manage AIE array column allocation. Instead,
a new field, num_unused_col, is added to the hardware context creation
request to specify how many columns will not be used by this hardware
context.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251217191150.2145937-1-lizhi.hou@amd.com
amdxdna_flush() was introduced to ensure that the device does not access
a process address space after it has been freed. However, this is no
longer necessary because the driver now increments the mm reference count
when a command is submitted and decrements it only after the command has
completed. This guarantees that the process address space remains valid
for the entire duration of command execution. Remove amdxdna_flush to
simplify the teardown path.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251216031311.2033399-1-lizhi.hou@amd.com
Validate scatter-gather table size matches buffer object size before
mapping. Break mapping early if the table exceeds buffer size to
prevent overwriting existing mappings. Also validate the table is
not smaller than buffer size to avoid unmapped regions that trigger
MMU translation faults.
Log error and fail mapping operation on size mismatch to prevent
data corruption from mismatched host memory locations and NPU
addresses. Unmap any partially mapped buffer on failure.
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20251215070933.520377-1-karol.wachowski@linux.intel.com
aie_destroy_context() is invoked during error handling in
aie2_create_context(). However, aie_destroy_context() assumes that the
context's mailbox channel pointer is non-NULL. If mailbox channel
creation fails, the pointer remains NULL and calling aie_destroy_context()
can lead to a NULL pointer dereference.
In aie2_create_context(), replace aie_destroy_context() with a function
which request firmware to remove the context created previously.
Fixes: be462c97b7 ("accel/amdxdna: Add hardware context")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251212183244.1826318-1-lizhi.hou@amd.com
The firmware sends a response and interrupts the driver before advancing
the mailbox send ring head pointer. As a result, the driver may observe
the response and attempt to send a new request before the firmware has
updated the head pointer. In this window, the send ring still appears
full, causing the driver to incorrectly fail the send operation.
This race can be triggered more easily in a multithreaded environment,
leading to unexpected and spurious "send ring full" failures.
To address this, poll the send ring head pointer for up to 100us before
returning a full-ring condition. This allows the firmware time to update
the head pointer.
Fixes: b87f920b93 ("accel/amdxdna: Support hardware mailbox")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251211045125.1724604-1-lizhi.hou@amd.com
For one command type, cu_idx is assigned before calling memset() on the
command structure. This results in cu_idx being overwritten, causing the
firmware to receive an incomplete or invalid command and leading to
unexpected command failures.
Fix this by moving the memset() call before initializing cu_idx so that
all fields are populated in the correct order.
Fixes: 71829d7f2f ("accel/amdxdna: Use MSG_OP_CHAIN_EXEC_NPU when supported")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251209211639.1636888-1-lizhi.hou@amd.com
When autosuspend is triggered, driver rpm_on flag is set to indicate that
a suspend/resume is already in progress. However, when a userspace
application submits a command during this narrow window,
amdxdna_pm_resume_get() may incorrectly skip the resume operation because
the rpm_on flag is still set. This results in commands being submitted
while the device has not actually resumed, causing unexpected behavior.
The set_dpm() is called by suspend/resume, it relied on rpm_on flag to
avoid calling into rpm suspend/resume recursivly. So to fix this, remove
the use of the rpm_on flag entirely. Instead, introduce aie2_pm_set_dpm()
which explicitly resumes the device before invoking set_dpm(). With this
change, set_dpm() is called directly inside the suspend or resume execution
path. Otherwise, aie2_pm_set_dpm() is called.
Fixes: 063db45183 ("accel/amdxdna: Enhance runtime power management")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251208165356.1549237-1-lizhi.hou@amd.com
In mailbox_get_msg(), mailbox_reg_read_non_zero() is called to poll for a
non-zero tail pointer. This assumed that a zero value indicates an error.
However, certain corner cases legitimately produce a zero tail pointer.
To handle these cases, remove mailbox_reg_read_non_zero(). The zero tail
pointer will be treated as a valid rewind event.
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251204181603.793824-1-lizhi.hou@amd.com
After issuing a firmware suspend request, the driver must ensure that the
suspend operation has completed before proceeding. Add polling of the
MPNPU_PWAITMODE register to confirm that the firmware has fully entered
the suspended state. This prevents race conditions where subsequent
operations assume the firmware is idle before it has actually completed
its suspend sequence.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251202165427.507414-1-lizhi.hou@amd.com
Hardware context destroy function holds dev_lock while waiting for all jobs
to complete. The timeout job also needs to acquire dev_lock, this leads to
a deadlock.
Fix the issue by temporarily releasing dev_lock before waiting for all
jobs to finish, and reacquiring it afterward.
Fixes: 4fd6ca90fc ("accel/amdxdna: Refactor hardware context destroy routine")
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251107181050.1293125-1-lizhi.hou@amd.com
Do not count buffer objects that have no backing pages, including imported
buffers where pages are set by VM faults triggered by userspace or pinned
by other drivers. Instead, return information about actual memory used by
the NPU.
Counting imported buffers results in incorrect calculations when
the same pages are counted multiple times, giving overly high
results.
Fixes: 7bfc9fa995 ("accel/ivpu: Expose NPU memory utilization info in sysfs")
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20251106101052.1050348-3-karol.wachowski@linux.intel.com
After subsystem of the device has crashed it sends a message with
command DEBUG_TRANSFER_INFO to kernel(host). Send ACK for that message
and then prepare to collect the ramdump of the subsystem
Steps of crashdump collection is as follows,
1) Device sends DEBUG_TRANSFER_INFO message indicating that device wants
to send crashdump.
2) Send an acknowledgment to that message either ACK or NACK.
a) NACK will inform the device that host will not download the
crashdump
b) ACK will inform the device that host will download the crashdump
3) Along with the DEBUG_TRANSFER_INFO we receive a table base address and
its length, use that to download that table from device.
a) This table is meta data of the crashdump and not the actual
crashdump.
4) After we respond as ACK for message received on step 1) we start
downloading the table. Use series of MEMORY_READ/MEMORY_READ_RSP SSR
commands to download the entire table.
5) Each entry in the table represents a segment of crashdump. Once the
table downloading is complete, iterate through each entry of table
and download each crashdump segment(same as table itself). Table entry
contains the memory base address and length along with other info.
6) After the entire crashdump is downloaded send DEBUG_TRANSFER_DONE
which marks that host is terminating the crashdump transfer. This
message can be send in both success or error case.
7) After receiving DEBUG_TRANSFER_DONE_RSP hand over the crashdump to
dev_coredumpv() and free all the necessary memory.
Co-developed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Co-developed-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Signed-off-by: Pranjal Ramajor Asha Kanojiya <pkanojiy@codeaurora.org>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Signed-off-by: Zack McKevitt <zachary.mckevitt@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://patch.msgid.link/20251031174059.2814445-4-zachary.mckevitt@oss.qualcomm.com
Failing to set power off indicates an unrecoverable hardware or firmware
error. Update the driver to treat such a failure as a fatal condition
and stop further operations that depend on successful power state
transition.
This prevents undefined behavior when the hardware remains in an
unexpected state after a failed power-off attempt.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251106180521.1095218-1-lizhi.hou@amd.com
Currently, dma_fence_put(job->fence) is called in job notification
callback. However, if a job is canceled, the notification callback is never
invoked, leading to a memory leak. Move dma_fence_put(job->fence)
to the job cleanup function to ensure the fence is always released.
Fixes: aac243092b ("accel/amdxdna: Add command execution")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251105194140.1004314-1-lizhi.hou@amd.com
The driver checks the firmware version during initialization.If preemption
is supported, the driver configures preemption accordingly and handles
userspace preemption requests. Otherwise, the driver returns an error for
userspace preemption requests.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251104185340.897560-1-lizhi.hou@amd.com
Add IOCTL debug bit for logging user provided parameter validation
errors.
Refactor several warning and error messages to better reflect fault
reason. User generated faults should not flood kernel messages with
warnings or errors, so change those to ivpu_dbg(). Add additional debug
logs for parameter validation in IOCTLs.
Check size provided by in metric streamer start and return -EINVAL
together with a debug message print.
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20251104132418.970784-1-karol.wachowski@linux.intel.com
Add three hardware specific attributes to describe device capabilities:
hwctx_limit: The maximum number of hardware context supported.
max_tops: The maximum TOPS supported.
curr_tops: The TOPS achievable with the current power and frequency
configuration.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251104062546.833771-1-lizhi.hou@amd.com
MSG_OP_CHAIN_EXEC_NPU is a unified mailbox message that replaces
MSG_OP_CHAIN_EXEC_BUFFER_CF and MSG_OP_CHAIN_EXEC_DPU.
Add driver logic to check firmware version, and if MSG_OP_CHAIN_EXEC_NPU
is supported, uses it to submit firmware commands.
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20251031014700.2919349-1-lizhi.hou@amd.com
During power down, pending DVFS operations may still be in progress
when the NPU reset is asserted after CDYN=0 is set. Since the READY
bit may already be deasserted at this point, checking only the READY
bit is insufficient to ensure all transactions have completed.
Add an explicit check for CDYN de-assertion after the READY bit check
to guarantee no outstanding transactions remain before proceeding.
Fixes: 550f4dd2ce ("accel/ivpu: Add support for Nova Lake's NPU")
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://patch.msgid.link/20251030091700.293341-1-karol.wachowski@linux.intel.com