Commit Graph

932 Commits

Author SHA1 Message Date
Zack McKevitt
ddf70cb6f8 accel/qaic: Fix mismatched types in min()
Use min_t() instead of min() to resolve compiler warnings for mismatched
types.

Signed-off-by: Zack McKevitt <zmckevit@qti.qualcomm.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251015153715.184143-1-youssef.abdulrahman@oss.qualcomm.com
2025-10-20 08:38:42 -06:00
Zack McKevitt
9cfe6abee1 accel/qaic: Use check_add_overflow in sahara for 64b types
Use check_add_overflow instead of size_add in sahara when
64b types are being added to ensure compatibility with 32b
systems. The size_add function parameters are of size_t, so
64b data types may be truncated when cast to size_t on 32b
systems. When using check_add_overflow, no type casts are made,
making it a more portable option.

Signed-off-by: Zack McKevitt <zmckevit@qti.qualcomm.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251015165408.213645-1-youssef.abdulrahman@oss.qualcomm.com
2025-10-20 08:30:37 -06:00
Lizhi Hou
b291e4f1a4 accel/amdxdna: Support getting last hardware error
Add new parameter DRM_AMDXDNA_HW_LAST_ASYNC_ERR to get array IOCTL. When
hardware reports an error, the driver save the error information and
timestamp. This new get array parameter retrieves the last error.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20251014234119.628453-1-lizhi.hou@amd.com
2025-10-16 09:32:48 -07:00
Wludzik, Jozef
63c7870fab accel/ivpu: Fix race condition when mapping dmabuf
Fix a race that can occur when multiple jobs submit the same dmabuf.
This could cause the sg_table to be mapped twice, leading to undefined
behavior.

Fixes: e0c0891cd6 ("accel/ivpu: Rework bind/unbind of imported buffers")
Signed-off-by: Wludzik, Jozef <jozef.wludzik@intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20251014071725.3047287-1-karol.wachowski@linux.intel.com
2025-10-15 08:10:22 +02:00
Jeff Hugo
7fb19ea1ec accel/qaic: Support the new READ_DATA implementation
AIC200 uses the newer "XBL" firmware implementation which changes the
expectations of how READ_DATA is performed. Larger data requests are
supported via streaming the data over the transport instead of requiring
a single transport transfer for everything.

Co-developed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007224045.605374-1-youssef.abdulrahman@oss.qualcomm.com
2025-10-14 10:38:39 -06:00
Youssef Samir
4ddf4ddfce accel/qaic: Ensure entry belongs to DBC in qaic_perf_stats_bo_ioctl()
struct qaic_perf_stats is defined to have a DBC specified in the header,
followed by struct qaic_perf_stats_entry instances, each pointing to a BO
that is associated with the DBC. Currently, qaic_perf_stats_bo_ioctl() does
not check if the entries belong to the DBC specified in the header.
Therefore, add checks to ensure that each entry in the request is sliced
and belongs to hdr.dbc_id.

Co-developed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Youssef Samir <quic_yabdulra@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007221212.559474-1-youssef.abdulrahman@oss.qualcomm.com
2025-10-14 10:30:07 -06:00
Carl Vanderlip
6bee90901f accel/qaic: Use overflow check function instead of division
Division is an expensive operation. Overflow check functions exist
already. Use existing overflow check functions rather than dividing to
check for overflow.

Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007174218.469867-1-youssef.abdulrahman@oss.qualcomm.com
2025-10-14 09:52:24 -06:00
Aswin Venkatesan
8134da2c9f accel/qaic: Fix incorrect error return path
Found via code inspection that when encode_message() fails in the middle
of processing, instead of returning the actual error code, it always
returns -EINVAL. This is because the entire message length has not been
processed, and the error code is set to -EINVAL.
Instead, take the 'out' path on failure to return the actual error code.

Signed-off-by: Aswin Venkatesan <aswivenk@qti.qualcomm.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007170130.445878-1-youssef.abdulrahman@oss.qualcomm.com
2025-10-14 09:17:25 -06:00
Youssef Samir
754fcd22d1 accel/qaic: Remove redundant retry_count = 0 statement
If msg_xfer() is called and the channel ring does not have enough room
to accommodate the whole message, the function sleeps and tries again.
It uses retry_count to keep track of the number of retrials done. This
variable is not used after the space check succeeds. So, remove the
retry_count = 0 statement used later in the function.

Signed-off-by: Youssef Samir <quic_yabdulra@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007161148.422744-1-youssef.abdulrahman@oss.qualcomm.com
2025-10-14 09:10:10 -06:00
Zack McKevitt
5f63e03387 accel/qaic: Include signal.h in qaic_control.c
Include linux/sched/signal.h in qaic_control.c
to avoid implicit inclusion of signal_pending().

Signed-off-by: Zack McKevitt <zmckevit@qti.qualcomm.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007154525.415039-1-youssef.abdulrahman@oss.qualcomm.com
2025-10-14 08:48:30 -06:00
Youssef Samir
f6d9329aef accel/qaic: Use kvcalloc() for slice requests allocation
When a BO is created, qaic will use the page allocator to request the
memory chunks that the BO will be composed of in-memory. The number of
chunks increases when memory is segmented. For example, a 16MB BO can
be composed of four 4MB chunks or 4096 4KB chunks.

A BO is then sliced into a single or multiple slices to be transferred
to the device on the DBC's xfer queue. For that to happen, the slice
needs to encode its memory chunks into DBC requests and keep track of
them in an array, which is allocated using kcalloc(). Knowing that the
BO might be very fragmented, this array can grow so large that the
allocation may fail to find contiguous memory for it.

Replace kcalloc() with kvcalloc() to allocate the DBC requests array
for a slice.

Signed-off-by: Youssef Samir <quic_yabdulra@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Carl Vanderlip <carl.vanderlip@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007121845.337382-1-youssef.abdulrahman@oss.qualcomm.com
2025-10-14 08:39:10 -06:00
Pranjal Ramajor Asha Kanojiya
dba5f91829 accel/qaic: Add support to export dmabuf fd
Add support to export BO as DMABUF to enable userspace to reuse buffers
and reduce number of copy.

Signed-off-by: Pranjal Ramajor Asha Kanojiya <quic_pkanojiy@quicinc.com>
Signed-off-by: Youssef Samir <youssef.abdulrahman@oss.qualcomm.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20251007053853.193608-1-youssef.abdulrahman@oss.qualcomm.com
2025-10-13 11:26:31 -06:00
Thomas Zimmermann
9b966ae422 Merge drm/drm-next into drm-misc-next
Updating drm-misc-next to the state of v6.18-rc1.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2025-10-13 09:19:19 +02:00
Andrzej Kacprowski
40527034d1 accel/ivpu: Return correct job error status
Currently the driver returns ABORTED for all errors that trigger engine
reset. It is better to distinguish between different error types by
returning the actual error code reported by firmware.
This allows userspace to take different actions based on the error type
and improves debuggability.

Refactor ivpu_job_signal_and_destroy() by extracting engine error
handling logic into a new function ivpu_job_handle_engine_error().
This simplifies engine error handling logic by removing necessity of
calling ivpu_job_singal_and_destroy() multiple times by a single job
changing it's behavior based on job status.

Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20251008061255.2909794-1-karol.wachowski@linux.intel.com
2025-10-08 18:32:39 +02:00
Andrzej Kacprowski
4139eb2490 accel/ivpu: Trigger engine reset for additional job status codes
Trigger engine reset for any status code in the range.
This allows to add additional status codes in the future without
breaking compatibility between the firmware and the driver.

Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20251007083511.2817021-1-karol.wachowski@linux.intel.com
2025-10-08 18:32:39 +02:00
Andrzej Kacprowski
39a0283dba accel/ivpu: Update JSM API header to 3.33.0
New API header includes additional status codes and range definitions
for error handling and improved API documentation.

Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20251007083451.2816990-1-karol.wachowski@linux.intel.com
2025-10-08 18:32:39 +02:00
Lizhi Hou
e00f78679f accel/amdxdna: Resume power for creating and destroying hardware context
When the hardware is powered down by auto-suspend, creating or destroying
a hardware context without resuming power will fail.
Call amdxdna_pm_resume_get() before requesting the hardware to create or
destroy a hardware context.

Fixes: 063db45183 ("accel/amdxdna: Enhance runtime power management")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20251008045324.4171807-1-lizhi.hou@amd.com
2025-10-08 08:31:39 -07:00
Chelsy Ratnawat
e68c994445 accel/qaic: Replace snprintf() with sysfs_emit() in sysfs show functions
Documentation/filesystems/sysfs.rst mentions that show() should only
use sysfs_emit() or sysfs_emit_at() when formatting the value to be
returned to user space. So replace snprintf() with sysfs_emit().

Signed-off-by: Chelsy Ratnawat <chelsyratnawat2001@gmail.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
[jhugo: Fix commit text typos]
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250822112804.1726592-1-chelsyratnawat2001@gmail.com
2025-10-06 14:38:20 -06:00
Thorsten Blum
bed1291240 accel/qaic: Replace kcalloc + copy_from_user with memdup_array_user
Replace kcalloc() followed by copy_from_user() with memdup_array_user()
to improve and simplify both __qaic_execute_bo_ioctl() and
qaic_perf_stats_bo_ioctl().

In __qaic_execute_bo_ioctl(), return early if an error occurs and remove
the obsolete 'free_exec' label.

Since memdup_array_user() already checks for multiplication overflow,
remove the manual check in __qaic_execute_bo_ioctl(). Remove any unused
local variables accordingly.

Since 'ret = copy_from_user()' has been removed, initialize 'ret = 0' to
preserve the same return value on success.

No functional changes intended.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250917124805.90395-4-thorsten.blum@linux.dev
2025-10-06 14:13:55 -06:00
Thorsten Blum
9c81523063 accel/qaic: Replace kzalloc + copy_from_user with memdup_user
Replace kzalloc() followed by copy_from_user() with memdup_user() to
improve and simplify qaic_attach_slice_bo_ioctl().

No functional changes intended.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250917124805.90395-2-thorsten.blum@linux.dev
2025-10-06 14:12:45 -06:00
Karol Wachowski
aa1c2b073a accel/ivpu: Fix DCT active percent format
The pcode MAILBOX STATUS register PARAM2 field expects DCT active
percent in U1.7 value format. Convert percentage value to this
format before writing to the register.

Fixes: a19bffb10c ("accel/ivpu: Implement DCT handling")
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20251001104322.1249896-1-karol.wachowski@linux.intel.com
2025-10-02 07:44:53 +02:00
Jacek Lawrynowicz
30531e9ca7 accel/ivpu: Improve BO alloc/free warnings
Add additional warnings related to allocation and
deallocation of buffer objects to better track possible
memory leaks and generally the BO's lifecycle.

Introduce checks for handle_count to ensure it is zero
before creating a new handle, and exactly one
after successfully creating a handle.

Introduce also a check to warn if the VMA node is not
empty when freeing the buffer object.

Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250925145154.1446427-1-maciej.falkowski@linux.intel.com
2025-10-01 10:00:36 +02:00
Andrzej Kacprowski
9f5627578e accel/ivpu: Fix doc description of job structure
Fix doc description of job structure as it is
improperly formatted. Align order of job structure's
fields according to the documentation.

Fixes: 0bf37f45d5 ("accel/ivpu: Add support for user-managed preemption buffer")
Signed-off-by: Andrzej Kacprowski <andrzej.kacprowski@linux.intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250925145131.1446323-1-maciej.falkowski@linux.intel.com
2025-10-01 10:00:10 +02:00
Jacek Lawrynowicz
8b694b405a accel/ivpu: Fix page fault in ivpu_bo_unbind_all_bos_from_context()
Don't add BO to the vdev->bo_list in ivpu_gem_create_object().
When failure happens inside drm_gem_shmem_create(), the BO is not
fully created and ivpu_gem_bo_free() callback will not be called
causing a deleted BO to be left on the list.

Fixes: 8d88e4cdce ("accel/ivpu: Use GEM shmem helper for all buffers")
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250925145114.1446283-1-maciej.falkowski@linux.intel.com
2025-10-01 09:59:45 +02:00
Jacek Lawrynowicz
e0c0891cd6 accel/ivpu: Rework bind/unbind of imported buffers
Ensure that imported buffers are properly mapped and unmapped in
the same way as regular buffers to properly handle buffers during
device's bind and unbind operations to prevent resource leaks and
inconsistent buffer states.

Imported buffers are now dma_mapped before submission and
dma_unmapped in ivpu_bo_unbind(), guaranteeing they are unmapped
when the device is unbound.

Add also imported buffers to vdev->bo_list for consistent unmapping
on device unbind. The bo->ctx_id is set in open() so imported
buffers have a valid context ID.

Debug logs have been updated to match the new code structure.
The function ivpu_bo_pin() has been renamed to ivpu_bo_bind()
to better reflect its purpose, and unbind tests have been refactored
for improved coverage and clarity.

Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250925145059.1446243-1-maciej.falkowski@linux.intel.com
2025-10-01 09:58:50 +02:00
Tomasz Rusinowicz
87a23fe296 accel/ivpu: Enable MCA ECC signalling based on MSR
Add new boot parameter for NPU5+ that enables
ECC signalling for on-chip memory based on the value
of MSR_INTEGRITY_CAPS register.

Signed-off-by: Tomasz Rusinowicz <tomasz.rusinowicz@intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250925145020.1446208-1-maciej.falkowski@linux.intel.com
2025-10-01 09:58:14 +02:00
Karol Wachowski
2007e210b6 accel/ivpu: Split FW runtime and global memory buffers
Split firmware boot parameters (4KB) and FW version (4KB) into dedicated
buffer objects, separating them from the FW runtime memory buffer. This
creates three distinct buffers with independent allocation control.

This enables future modifications, particularly allowing the FW
image memory to be moved into a read-only buffer.

Fix user range starting address from incorrect 0x88000000 (2GB + 128MB)
overlapping global aperture on 37XX to intended 0xa0000000 (2GB + 512MB).
This caused no issues as FW aligned this range to 512MB anyway, but
corrected for consistency.

Convert ivpu_hw_range_init() from inline helper to function with overflow
validation to prevent potential security issues from address range
arithmetic overflows.

Improve readability of ivpu_fw_parse() function, remove unrelated constant
defines and validate firmware header values based on vdev->hw->ranges.

Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250925074209.1148924-1-karol.wachowski@linux.intel.com
2025-09-25 21:03:57 +02:00
Pavan S
6ca282c3e6 accel/habanalabs: add Infineon version check
On HL338 ASICs, the Infineon first‑stage firmware is not present and
the reported version is 0. In this case printing a version number is
misleading, as it suggests valid firmware when it does not exist.

Fix this by printing the first‑stage Infineon firmware version only
if the reported value is non‑zero. This avoids confusing or incorrect
log messages on devices where the first stage is not applicable.

Signed-off-by: Pavan S <pavan.sreenivas@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 09:14:45 +03:00
Konstantin Sinyuk
a0d866bab1 accel/habanalabs/gaudi2: read preboot status after recovering from dirty state
Dirty state can occur when the host VM undergoes a reset while the
device does not. In such a case, the driver must reset the device before
it can be used again. As part of this reset, the device capabilities
are zeroed. Therefore, the driver must read the Preboot status again to
learn the Preboot state, capabilities, and security configuration.

Signed-off-by: Konstantin Sinyuk <konstantin.sinyuk@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 09:09:32 +03:00
Ariel Aviad
65a3f5bc33 accel/habanalabs: add HL_GET_P_STATE passthrough type
Add a new passthrough type HL_GET_P_STATE to the cpucp generic ioctl
to allow userspace to read the device performance state via firmware.

Signed-off-by: Ariel Aviad <ariel.aviad@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 09:09:31 +03:00
Konstantin Sinyuk
eeb38d0e91 accel/habanalabs: add debugfs interface for HLDIO testing
Add debugfs files for NVMe Direct I/O (HLDIO) functionality.
This interface allows userspace access to direct SSD ↔ device transfers
through debugfs nodes.

Four debugfs files are created under /sys/kernel/debug/habanalabs/hlN/:

  - dio_ssd2hl : trigger SSD-to-device transfers
  - dio_hl2ssd : trigger device-to-SSD transfers
    (placeholder, not yet implemented)
  - dio_stats  : show transfer statistics
  - dio_reset  : reset statistics counters

Usage examples:

  # Perform SSD → device transfer
  echo "fd=3 va=0x10000 off=0 len=4096" > \
    /sys/kernel/debug/habanalabs/hl0/dio_ssd2hl

  # View statistics
  cat /sys/kernel/debug/habanalabs/hl0/dio_stats

  # Reset counters
  echo 1 > /sys/kernel/debug/habanalabs/hl0/dio_reset

This interface provides access to HLDIO functionality for validation
and diagnostics.

Signed-off-by: Konstantin Sinyuk <konstantin.sinyuk@intel.com>
Reviewed-by: Farah Kassabri <farah.kassabri@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 09:09:31 +03:00
Konstantin Sinyuk
8cbacc9a27 accel/habanalabs: add NVMe Direct I/O (HLDIO) infrastructure
Introduce NVMe Direct I/O (HLDIO) infrastructure to support
peer‑to‑peer DMA in the habanalabs driver. This adds internal helpers
and data structures to enable direct transfers between NVMe storage
and device memory.

The feature is built only when CONFIG_HL_HLDIO is enabled. A debugfs
interface is also provided for functional validation.

Signed-off-by: Konstantin Sinyuk <konstantin.sinyuk@intel.com>
Reviewed-by: Farah Kassabri <farah.kassabri@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 09:09:30 +03:00
Moti Haimovski
513024d5a0 accel/habanalabs: support mapping cb with vmalloc-backed coherent memory
When IOMMU is enabled, dma_alloc_coherent() with GFP_USER may return
addresses from the vmalloc range. If such an address is mapped without
VM_MIXEDMAP, vm_insert_page() will trigger a BUG_ON due to the
VM_PFNMAP restriction.

Fix this by checking for vmalloc addresses and setting VM_MIXEDMAP
in the VMA before mapping. This ensures safe mapping and avoids kernel
crashes. The memory is still driver-allocated and cannot be accessed
directly by userspace.

Signed-off-by: Moti Haimovski  <moti.haimovski@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 09:09:30 +03:00
Ilia Levi
0668db41b5 accel/habanalabs: remove old interface variation of 'access_ok()'
The access_ok() API no longer requires the VERIFY_WRITE argument,
and the use of the old interface with VERIFY_WRITE is deprecated.

Clean up the habanalabs memory manager to use the modern access_ok()
interface consistently. This removes old #ifdef guards and aligns the
driver with current upstream kernel APIs.

Signed-off-by: Ilia Levi <ilia.levi@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 09:09:29 +03:00
Konstantin Sinyuk
0529b191ac accel/habanalabs/gaudi2: use the CPLD_SHUTDOWN event handler
After CPLD shutdown event the device is not usable anymore. The common
CPLD_SHUTDOWN event handler disables any subsequent device access.

Signed-off-by: Konstantin Sinyuk <konstantin.sinyuk@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 09:09:29 +03:00
Konstantin Sinyuk
083c53a854 accel/habanalabs: disable device access after CPLD_SHUTDOWN
After a CPLD shutdown event the device becomes unusable. Prevent further
device access once this event is received.

Signed-off-by: Konstantin Sinyuk <konstantin.sinyuk@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 09:09:28 +03:00
Tomer Tayar
d0dd796bec accel/habanalabs: clarify ctx use after hl_ctx_put() in dmabuf release
In hl_release_dmabuf(), ctx is dereferenced after calling hl_ctx_put()
to obtain the compute device file.

This is safe because the dma-buf object holds a file reference taken in
export_dmabuf(), and the file release (which drops another ctx reference)
can only happen after we drop that file reference via fput(). Thus, this
hl_ctx_put() call cannot be the last one at this point.

Add a comment explaining this to avoid confusion.

Signed-off-by: Tomer Tayar <tomer.tayar@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 09:09:27 +03:00
Sharley Calzolari
b5cddeb0dc accel/habanalabs/gaudi2: add support for logging register accesses from debugfs
Add infrastructure for logging the last configuration register accesses
that occur via debugfs read/write operations. At interrupt time, these
log entries can be dumped to dmesg, which helps in diagnosing the cause
of RAZWI and ADDR_DEC interrupts.

The logging is implemented as a ring buffer of access entries, with each
entry recording timestamp and access details. To ensure correctness
under concurrent access, operations are now protected using spinlocks.
Entries are copied under lock and then printed after releasing it, which
minimizes time spent in the critical section.

Signed-off-by: Sharley Calzolari <sharley.calzolari@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 09:09:26 +03:00
Ariel Suller
214e26a43f accel/habanalabs/gaudi2: stringify engine/queue ids
Print engine/queue names instead of numerical engine/queue IDs to make
logs and debug output more readable.

Signed-off-by: Ariel Suller <ariel.suller@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 09:09:25 +03:00
Vitaly Margolin
5295be6c4e accel/habanalabs: add generic message type to get error counters
Add a new CPUCP generic message type to retrieve HBM, SRAM and critical
error counters from the device.

Signed-off-by: Vitaly Margolin <vitaly.margolin@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 09:09:25 +03:00
Vered Yavniely
b4fd8e56c9 accel/habanalabs/gaudi2: fix BMON disable configuration
Change the BMON_CR register value back to its original state before
enabling, so that BMON does not continue to collect information
after being disabled.

Signed-off-by: Vered Yavniely <vered.yavniely@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 09:09:24 +03:00
Tomer Tayar
9f5067531c accel/habanalabs: return ENOMEM if less than requested pages were pinned
EFAULT is currently returned if less than requested user pages are
pinned. This value means a "bad address" which might be confusing to
the user, as the address of the given user memory is not necessarily
"bad".

Modify the return value to ENOMEM, as "out of memory" is more suitable
in this case.

Signed-off-by: Tomer Tayar <tomer.tayar@intel.com>
Reviewed-by: Koby Elbaz <koby.elbaz@intel.com>
Signed-off-by: Koby Elbaz <koby.elbaz@intel.com>
2025-09-25 09:09:22 +03:00
Lizhi Hou
063db45183 accel/amdxdna: Enhance runtime power management
Currently, pm_runtime_resume_and_get() is invoked in the driver's open
callback, and pm_runtime_put_autosuspend() is called in the close
callback. As a result, the device remains active whenever an application
opens it, even if no I/O is performed, leading to unnecessary power
consumption.

Move the runtime PM calls to the AIE2 callbacks that actually interact
with the hardware. The device will automatically suspend after 5 seconds
of inactivity (no hardware accesses and no pending commands), and it will
be resumed on the next hardware access.

Reviewed-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20250923152229.1303625-1-lizhi.hou@amd.com
2025-09-24 13:47:59 -07:00
Andrzej Kacprowski
0bf37f45d5 accel/ivpu: Add support for user-managed preemption buffer
Allow user mode drivers to manage preemption buffers, enabling
memory savings by sharing a single buffer across multiple
command queues within the same memory context.

Introduce DRM_IVPU_PARAM_PREEMPT_BUFFER_SIZE to report the required
preemption buffer size as specified by the firmware.

The preemption buffer is now passed from user space as an entry
in the BO list of DRM_IVPU_CMDQ_SUBMIT. The buffer must be
non-mappable and large enough to hold preemption data.

For backward compatibility, the kernel will allocate an internal
preemption buffer if user space does not provide one.

User space can only provide a single preemption buffer,
simplifying the ioctl interface and parameter validation.
A separate secondary preemption buffer is only needed
to save below 4GB address space on 37xx and only if preemption
buffers are not shared.

Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250915103437.830086-1-karol.wachowski@linux.intel.com
2025-09-18 08:42:38 +02:00
Karol Wachowski
58b8b085b9 accel/ivpu: Update JSM firmware API to latest 3.32.5 version
Synchronize the JSM API header file with the latest 3.32.5 version
to reflect all changes introduced in the new firmware API

Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250916084131.848988-1-karol.wachowski@linux.intel.com
2025-09-18 08:36:28 +02:00
Karol Wachowski
9f6c632857 accel/ivpu: Ensure rpm_runtime_put in case of engine reset/resume fail
Previously, aborting work could return early after engine reset or resume
failure, skipping the necessary runtime_put cleanup leaving the device
with incorrect reference count breaking runtime power management state.

Replace early returns with goto statements to ensure runtime_put is always
executed.

Fixes: a47e36dc5d ("accel/ivpu: Trigger device recovery on engine reset/resume failure")
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250916084809.850073-1-karol.wachowski@linux.intel.com
2025-09-18 08:30:18 +02:00
Andrzej Kacprowski
7c7a395a00 accel/ivpu: Remove unused firmware boot parameters
Remove references to firmware boot parameters that were never used
by any production version of device firmware.

Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250915103553.830151-1-karol.wachowski@linux.intel.com
2025-09-18 08:23:10 +02:00
Jacek Lawrynowicz
f7e79530ef accel/ivpu: Refactor priority_bands_show for readability
Reduce code duplication and improve the overall readability of the debugfs
output for job scheduling priority bands.

Additionally fix clang-tidy warning about missing default case in the
switch statement.

Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250915103401.830045-1-karol.wachowski@linux.intel.com
2025-09-18 08:13:19 +02:00
Karol Wachowski
2274402ac1 accel/ivpu: Reset cmdq->db_id on register failure
Ensure that cmdq->db_id is reset to 0 if ivpu_jsm_register_db fails,
preventing potential reuse of invalid command queue with
unregistered doorbell.

Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Karol Wachowski <karol.wachowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250915103421.830065-1-karol.wachowski@linux.intel.com
2025-09-18 07:42:11 +02:00
Lizhi Hou
457f4393d0 accel/amdxdna: Call dma_buf_vmap_unlocked() for imported object
In amdxdna_gem_obj_vmap(), calling dma_buf_vmap() triggers a kernel
warning if LOCKDEP is enabled. So for imported object, use
dma_buf_vmap_unlocked(). Then, use drm_gem_vmap() for other objects.
The similar change applies to vunmap code.

Fixes: bd72d4acda ("accel/amdxdna: Support user space allocated buffer")
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20250916174842.234709-1-lizhi.hou@amd.com
2025-09-17 11:02:31 -07:00