Commit Graph

867 Commits

Author SHA1 Message Date
Lizhi Hou
6380b1ceba accel/amdxdna: Fix incorrect type used for a local variable
drivers/accel/amdxdna/aie2_pci.c:794:13: sparse: sparse: incorrect type in assignment (different address spaces)

Fixes: c8cea4371e ("accel/amdxdna: Add a function to walk hardware contexts")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202508230855.0b9efFl6-lkp@intel.com/
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20250826171951.801585-1-lizhi.hou@amd.com
2025-08-26 11:18:22 -07:00
Maxime Ripard
1e17ed8326 Merge drm/drm-next into drm-misc-next
Bring v6.17-rc2 in to unstuck for-linux-next.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2025-08-20 09:33:06 +02:00
Dave Airlie
dd489c01c3 Merge tag 'drm-misc-next-2025-08-14' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for v6.18:

UAPI Changes:

- Add DRM_IOCTL_GEM_CHANGE_HANDLE for reassigning GEM handles
- Document DRM_MODE_PAGE_FLIP_EVENT

Cross-subsystem Changes:

fbcon:
- Add missing declarations in fbcon.h

Core Changes:

bridge:
- Fix ref counting

panel:
- Replace and remove mipi_dsi_generic_write_{seq/_chatty}()

sched:
- Fixes

Rust:
- Drop Opaque<> from ioctl arguments

Driver Changes:

amdxdma:
- Support buffers allocated by user space
- Streamline PM interfaces
- Fixes

bridge:
- cdns-dsi: Various improvements to mode setting
- Support Solomon SSD2825 plus DT bindings
- Support Waveshare DSI2DPI plus DT bindings

gud:
- Fixes

ivpu:
- Fixes

nouveau:
- Use GSP firmware by default
- Fixes

panel:
- panel-edp: Support mt8189 Chromebooks; Support BOE NV140WUM-N64;
  Support SHP LQ134Z1; Fixes
- panel-simple: Support Olimex LCD-OLinuXino-5CTS plus DT bindings
- Support Samsung AMS561RA01
- Support Hydis HV101HD1 plus DT bindings

panthor:
- Print task/pid on errors
- Fixes

renesas:
- convert to RUNTIME_PM_OPS

repaper:
- Use shadow-plane helpers

rocket:
- Add driver for Rockchip NPU plus DT bindings

sharp-memory:
- Use shadow-plane helpers

simpledrm:
- Use of_reserved_mem_region_to_resource() helper

tidss:
- Use crtc_ fields for programming display mode
- Remove other drivers from aperture

v3d:
- Support querying nubmer of GPU resets for KHR_robustness

vmwgfx:
- Fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250814072454.GA18104@linux.fritz.box
2025-08-19 07:02:30 +10:00
Lizhi Hou
c8cea4371e accel/amdxdna: Add a function to walk hardware contexts
Walking hardware contexts created by a process is duplicated in multiple
spots. Add a function, amdxdna_hwctx_walk(), and replace all spots.

hwctx_srcu and dev_lock are good enough to protect hardware context list.
Remove hwctx_lock.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20250815171634.3417487-1-lizhi.hou@amd.com
2025-08-18 08:35:57 -07:00
Linus Torvalds
8742b2d893 Merge tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull habanalabs fix from Al Viro:
 "Yet another use-after-free fix due to dma_buf_fd() misuse"

* tag 'pull-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  habanalabs: fix UAF in export_dmabuf()
2025-08-12 12:10:33 -07:00
Thomas Zimmermann
08c51f5bdd Merge drm/drm-next into drm-misc-n
Updating drm-misc-next to the state of v6.17-rc1. Begins a new release
cycle.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2025-08-11 14:37:45 +02:00
Lizhi Hou
d2b48f2b30 accel/amdxdna: Unify pm and rpm suspend and resume callbacks
The suspend and resume callbacks for pm and runtime pm should be same.
During suspending, it needs to stop all hardware contexts first. And
the hardware contexts will be restarted after the device is resumed.

Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20250803191450.1568851-1-lizhi.hou@amd.com
2025-08-06 10:31:55 -07:00
Brigham Campbell
d9c1b06f35 accel/rocket: Fix undeclared const rocket_pm_ops
Fix sparse warning regarding an undeclared const rocket_pm_ops, which is
used in rocket_drv.c.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202508030021.uwdr4P08-lkp@intel.com/
Signed-off-by: Brigham Campbell <me@brighamcampbell.com>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Link: https://lore.kernel.org/r/20250802-fix-rockchip-npu-build-v1-2-fb0f0dacb3fe@brighamcampbell.com
2025-08-03 17:40:23 +02:00
Brigham Campbell
218b15a3e9 accel/rocket: Fix Rockchip NPU compilation
Replace DRM_GPU_SCHED_STAT_NOMINAL with GPU_DRM_SCHED_STAT_RESET, in
accordance with commit 0a5dc1b67e ("drm/sched: Rename
DRM_GPU_SCHED_STAT_NOMINAL to DRM_GPU_SCHED_STAT_RESET")

Pass extra parameter to drm_sched_job_init, as required by commit
2956554823 ("drm/sched: Store the drm client_id in drm_sched_fence")

Signed-off-by: Brigham Campbell <me@brighamcampbell.com>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Link: https://lore.kernel.org/r/20250802-fix-rockchip-npu-build-v1-1-fb0f0dacb3fe@brighamcampbell.com
2025-08-03 17:40:23 +02:00
Linus Torvalds
260f6f4fda Merge tag 'drm-next-2025-07-30' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
 "Highlights:

   - Intel xe enable Panthor Lake, started adding WildCat Lake

   - amdgpu has a bunch of reset improvments along with the usual IP
     updates

   - msm got VM_BIND support which is important for vulkan sparse memory

   - more drm_panic users

   - gpusvm common code to handle a bunch of core SVM work outside
     drivers.

  Detail summary:

  Changes outside drm subdirectory:
   - 'shrink_shmem_memory()' for better shmem/hibernate interaction
   - Rust support infrastructure:
      - make ETIMEDOUT available
      - add size constants up to SZ_2G
      - add DMA coherent allocation bindings
   - mtd driver for Intel GPU non-volatile storage
   - i2c designware quirk for Intel xe

  core:
   - atomic helpers: tune enable/disable sequences
   - add task info to wedge API
   - refactor EDID quirks
   - connector: move HDR sink to drm_display_info
   - fourcc: half-float and 32-bit float formats
   - mode_config: pass format info to simplify

  dma-buf:
   - heaps: Give CMA heap a stable name

  ci:
   - add device tree validation and kunit

  displayport:
   - change AUX DPCD access probe address
   - add quirk for DPCD probe
   - add panel replay definitions
   - backlight control helpers

  fbdev:
   - make CONFIG_FIRMWARE_EDID available on all arches

  fence:
   - fix UAF issues

  format-helper:
   - improve tests

  gpusvm:
   - introduce devmem only flag for allocation
   - add timeslicing support to GPU SVM

  ttm:
   - improve eviction

  sched:
   - tracing improvements
   - kunit improvements
   - memory leak fixes
   - reset handling improvements

  color mgmt:
   - add hardware gamma LUT handling helpers

  bridge:
   - add destroy hook
   - switch to reference counted drm_bridge allocations
   - tc358767: convert to devm_drm_bridge_alloc
   - improve CEC handling

  panel:
   - switch to reference counter drm_panel allocations
   - fwnode panel lookup
   - Huiling hl055fhv028c support
   - Raspberry Pi 7" 720x1280 support
   - edp: KDC KD116N3730A05, N160JCE-ELL CMN, N116BCJ-EAK
   - simple: AUO P238HAN01
   - st7701: Winstar wf40eswaa6mnn0
   - visionox: rm69299-shift
   - Renesas R61307, Renesas R69328 support
   - DJN HX83112B

  hdmi:
   - add CEC handling
   - YUV420 output support

  xe:
   - WildCat Lake support
   - Enable PanthorLake by default
   - mark BMG as SRIOV capable
   - update firmware recommendations
   - Expose media OA units
   - aux-bux support for non-volatile memory
   - MTD intel-dg driver for non-volatile memory
   - Expose fan control and voltage regulator in sysfs
   - restructure migration for multi-device
   - Restore GuC submit UAF fix
   - make GEM shrinker drm managed
   - SRIOV VF Post-migration recovery of GGTT nodes
   - W/A additions/reworks
   - Prefetch support for svm ranges
   - Don't allocate managed BO for each policy change
   - HWMON fixes for BMG
   - Create LRC BO without VM
   - PCI ID updates
   - make SLPC debugfs files optional
   - rework eviction rejection of bound external BOs
   - consolidate PAT programming logic for pre/post Xe2
   - init changes for flicker-free boot
   - Enable GuC Dynamic Inhibit Context switch

  i915:
   - drm_panic support for i915/xe
   - initial flip queue off by default for LNL/PNL
   - Wildcat Lake Display support
   - Support for DSC fractional link bpp
   - Support for simultaneous Panel Replay and Adaptive sync
   - Support for PTL+ double buffer LUT
   - initial PIPEDMC event handling
   - drm_panel_follower support
   - DPLL interface renames
   - allocate struct intel_display dynamically
   - flip queue preperation
   - abstract DRAM detection better
   - avoid GuC scheduling stalls
   - remove DG1 force probe requirement
   - fix MEI interrupt handler on RT kernels
   - use backlight control helpers for eDP
   - more shared display code refactoring

  amdgpu:
   - add userq slot to INFO ioctl
   - SR-IOV hibernation support
   - Suspend improvements
   - Backlight improvements
   - Use scaling for non-native eDP modes
   - cleaner shader updates for GC 9.x
   - Remove fence slab
   - SDMA fw checks for userq support
   - RAS updates
   - DMCUB updates
   - DP tunneling fixes
   - Display idle D3 support
   - Per queue reset improvements
   - initial smartmux support

  amdkfd:
   - enable KFD on loongarch
   - mtype fix for ext coherent system memory

  radeon:
   - CS validation additional GL extensions
   - drop console lock during suspend/resume
   - bump driver version

  msm:
   - VM BIND support
   - CI: infrastructure updates
   - UBWC single source of truth
   - decouple GPU and KMS support
   - DP: rework I/O accessors
   - DPU: SM8750 support
   - DSI: SM8750 support
   - GPU: X1-45 support and speedbin support for X1-85
   - MDSS: SM8750 support

  nova:
   - register! macro improvements
   - DMA object abstraction
   - VBIOS parser + fwsec lookup
   - sysmem flush page support
   - falcon: generic falcon boot code and HAL
   - FWSEC-FRTS: fb setup and load/execute

  ivpu:
   - Add Wildcat Lake support
   - Add turbo flag

  ast:
   - improve hardware generations implementation

  imx:
   - IMX8qxq Display Controller support

  lima:
   - Rockchip RK3528 GPU support

  nouveau:
   - fence handling cleanup

  panfrost:
   - MT8370 support
   - bo labeling
   - 64-bit register access

  qaic:
   - add RAS support

  rockchip:
   - convert inno_hdmi to a bridge

  rz-du:
   - add RZ/V2H(P) support
   - MIPI-DSI DCS support

  sitronix:
   - ST7567 support

  sun4i:
   - add H616 support

  tidss:
   - add TI AM62L support
   - AM65x OLDI bridge support

  bochs:
   - drm panic support

  vkms:
   - YUV and R* format support
   - use faux device

  vmwgfx:
   - fence improvements

  hyperv:
   - move out of simple
   - add drm_panic support"

* tag 'drm-next-2025-07-30' of https://gitlab.freedesktop.org/drm/kernel: (1479 commits)
  drm/tidss: oldi: convert to devm_drm_bridge_alloc() API
  drm/tidss: encoder: convert to devm_drm_bridge_alloc()
  drm/amdgpu: move reset support type checks into the caller
  drm/amdgpu/sdma7: re-emit unprocessed state on ring reset
  drm/amdgpu/sdma6: re-emit unprocessed state on ring reset
  drm/amdgpu/sdma5.2: re-emit unprocessed state on ring reset
  drm/amdgpu/sdma5: re-emit unprocessed state on ring reset
  drm/amdgpu/gfx12: re-emit unprocessed state on ring reset
  drm/amdgpu/gfx11: re-emit unprocessed state on ring reset
  drm/amdgpu/gfx10: re-emit unprocessed state on ring reset
  drm/amdgpu/gfx9.4.3: re-emit unprocessed state on kcq reset
  drm/amdgpu/gfx9: re-emit unprocessed state on kcq reset
  drm/amdgpu: Add WARN_ON to the resource clear function
  drm/amd/pm: Use cached metrics data on SMUv13.0.6
  drm/amd/pm: Use cached data for min/max clocks
  gpu: nova-core: fix bounds check in PmuLookupTableEntry::new
  drm/amdgpu: Replace HQD terminology with slots naming
  drm/amdgpu: Add user queue instance count in HW IP info
  drm/amd/amdgpu: Add helper functions for isp buffers
  drm/amd/amdgpu: Initialize swnode for ISP MFD device
  ...
2025-07-30 19:26:49 -07:00
Tomeu Vizoso
525ad89dd9 accel/rocket: Add IOCTLs for synchronizing memory accesses
The NPU cores have their own access to the memory bus, and this isn't
cache coherent with the CPUs.

Add IOCTLs so userspace can mark when the caches need to be flushed, and
also when a writer job needs to be waited for before the buffer can be
accessed from the CPU.

Initially based on the same IOCTLs from the Etnaviv driver.

v2:
- Don't break UABI by reordering the IOCTL IDs (Jeff Hugo)

v3:
- Check that padding fields in IOCTLs are zero (Jeff Hugo)

v6:
- Fix conversion logic to make sure we use DMA_BIDIRECTIONAL when needed
  (Lucas Stach)

v8:
- Always sync BOs in both directions (Robin Murphy)

Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250721-6-10-rocket-v9-5-77ebd484941e@tomeuvizoso.net
2025-07-25 10:04:46 -06:00
Tomeu Vizoso
0810d5ad88 accel/rocket: Add job submission IOCTL
Using the DRM GPU scheduler infrastructure, with a scheduler for each
core.

Userspace can decide for a series of tasks to be executed sequentially
in the same core, so SRAM locality can be taken advantage of.

The job submission code was initially based on Panfrost.

v2:
- Remove hardcoded number of cores
- Misc. style fixes (Jeffrey Hugo)
- Repack IOCTL struct (Jeffrey Hugo)

v3:
- Adapt to a split of the register block in the DT bindings (Nicolas
  Frattaroli)
- Make use of GPL-2.0-only for the copyright notice (Jeff Hugo)
- Use drm_* logging functions (Thomas Zimmermann)
- Rename reg i/o macros (Thomas Zimmermann)
- Add padding to ioctls and check for zero (Jeff Hugo)
- Improve error handling (Nicolas Frattaroli)

v6:
- Use mutexes guard (Markus Elfring)
- Use u64_to_user_ptr (Jeff Hugo)
- Drop rocket_fence (Rob Herring)

v7:
- Assign its own IOMMU domain to each client, for isolation (Daniel
  Stone and Robin Murphy)

v8:
- Use reset lines to reset the cores (Robin Murphy)
- Use the macros to compute the values for the bitfields (Robin Murphy)
- More descriptive name for the IRQ (Robin Murphy)
- Simplify job interrupt handing (Robin Murphy)
- Correctly acquire a reference to the IOMMU (Robin Murphy)
- Specify the size of the embedded structs in the IOCTLs for future
  extensibility (Rob Herring)
- Expose only 32 bits for the address of the regcmd BO (Robin Murphy)

Tested-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250721-6-10-rocket-v9-4-77ebd484941e@tomeuvizoso.net
2025-07-25 10:02:27 -06:00
Tomeu Vizoso
658ebeac33 accel/rocket: Add IOCTL for BO creation
This uses the SHMEM DRM helpers and we map right away to the CPU and NPU
sides, as all buffers are expected to be accessed from both.

v2:
- Sync the IOMMUs for the other cores when mapping and unmapping.

v3:
- Make use of GPL-2.0-only for the copyright notice (Jeff Hugo)

v6:
- Use mutexes guard (Markus Elfring)

v7:
- Assign its own IOMMU domain to each client, for isolation (Daniel
  Stone and Robin Murphy)

v8:
- Correctly acquire a reference to the IOMMU (Robin Murphy)
- Allocate DMA address ourselves with drm_mm (Robin Murphy)
- Use refcount_read (Heiko Stuebner)
- Remove superfluous dma_sync_sgtable_for_device (Robin Murphy)

Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250721-6-10-rocket-v9-3-77ebd484941e@tomeuvizoso.net
2025-07-25 10:00:36 -06:00
Tomeu Vizoso
ed98261b41 accel/rocket: Add a new driver for Rockchip's NPU
This initial version supports the NPU as shipped in the RK3588 SoC and
described in the first part of its TRM, in Chapter 36.

This NPU contains 3 independent cores that the driver can submit jobs
to.

This commit adds just hardware initialization and power management.

v2:
- Split cores and IOMMUs as independent devices (Sebastian Reichel)
- Add some documentation (Jeffrey Hugo)
- Be more explicit in the Kconfig documentation (Jeffrey Hugo)
- Remove resets, as these haven't been found useful so far (Zenghui Yu)
- Repack structs (Jeffrey Hugo)
- Use DEFINE_DRM_ACCEL_FOPS (Jeffrey Hugo)
- Use devm_drm_dev_alloc (Jeffrey Hugo)
- Use probe log helper (Jeffrey Hugo)
- Introduce UABI header in a later patch (Jeffrey Hugo)

v3:
- Adapt to a split of the register block in the DT bindings (Nicolas
  Frattaroli)
- Move registers header to its own commit (Thomas Zimmermann)
- Misc. cleanups (Thomas Zimmermann and Jeff Hugo)
- Make use of GPL-2.0-only for the copyright notice (Jeff Hugo)
- PM improvements (Nicolas Frattaroli)

v4:
- Use bulk clk API (Krzysztof Kozlowski)

v6:
- Remove mention to NVDLA, as the hardware is only incidentally related
  (Kever Yang)
- Use calloc instead of GFP_ZERO (Jeff Hugo)
- Explicitly include linux/container_of.h (Jeff Hugo)
- pclk and npu clocks are now needed by all cores (Rob Herring)

v7:
- Assign its own IOMMU domain to each client, for isolation (Daniel
  Stone and Robin Murphy)

v8:
- Kconfig: fix depends to be more explicit about Rockchip, and remove
  superfluous selects (Robin Murphy)
- Use reset lines to reset the cores (Robin Murphy)
- Reference count the module
- Set dma_set_max_seg_size
- Correctly acquire a reference to the IOMMU (Robin Murphy)
- Remove notion of top core (Robin Murphy)

Reviewed-by: Robert Foss <rfoss@kernel.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250721-6-10-rocket-v9-2-77ebd484941e@tomeuvizoso.net
2025-07-25 09:56:23 -06:00
Tomeu Vizoso
5fc2bfddb0 accel/rocket: Add registers header
A XML file was generated with the data from the TRM, and then this
header was generated from it.

The canonical location for the XML file is the Mesa3D repository.

v3:
- Make use of GPL-2.0-only for the copyright notice (Jeff Hugo)

v8:
- Remove full MIT license blob, to match other files with the same
  licensing arrangement in the kernel

Reviewed-by: Robert Foss <rfoss@kernel.org>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250721-6-10-rocket-v9-1-77ebd484941e@tomeuvizoso.net
2025-07-25 09:48:41 -06:00
Salah Triki
5982a539cd accel/amdxdna: Delete pci_free_irq_vectors()
The device is managed so pci_free_irq_vectors() is called automatically
no need to do it manually.

Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Salah Triki <salah.triki@gmail.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/aHs8QAfUlFeNp7qL@pc
2025-07-22 09:19:04 -07:00
Lizhi Hou
bd72d4acda accel/amdxdna: Support user space allocated buffer
Enhance DRM_IOCTL_AMDXDNA_CREATE_BO to accept user space allocated
buffer pointer. The buffer pages will be pinned in memory. Unless
the CAP_IPC_LOCK is enabled for the application process, the total
pinned memory can not beyond rlimit_memlock.

Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20250716164414.112091-1-lizhi.hou@amd.com
2025-07-22 08:34:29 -07:00
Maarten Lankhorst
ac6e050e69 accel/ivpu: Remove lockdep_assert_irqs_disabled()
This breaks on PREEMPT_RT, and should be unneeded since
lockdep can track irq disabled status itself.

Link: https://lore.kernel.org/r/20250715165919.33754-1-dev@lankhorst.se
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-07-22 10:26:37 +02:00
Maíra Canal
0a5dc1b67e drm/sched: Rename DRM_GPU_SCHED_STAT_NOMINAL to DRM_GPU_SCHED_STAT_RESET
Among the scheduler's statuses, the only one that indicates an error is
DRM_GPU_SCHED_STAT_ENODEV. Any status other than DRM_GPU_SCHED_STAT_ENODEV
signifies that the operation succeeded and the GPU is in a nominal state.

However, to provide more information about the GPU's status, it is needed
to convey more information than just "OK".

Therefore, rename DRM_GPU_SCHED_STAT_NOMINAL to
DRM_GPU_SCHED_STAT_RESET, which better communicates the meaning of this
status. The status DRM_GPU_SCHED_STAT_RESET indicates that the GPU has
hung, but it has been successfully reset and is now in a nominal state
again.

Reviewed-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250714-sched-skip-reset-v6-1-5c5ba4f55039@igalia.com
Signed-off-by: Maíra Canal <mcanal@igalia.com>
2025-07-15 08:27:00 -03:00
Al Viro
33927f3d0e habanalabs: fix UAF in export_dmabuf()
As soon as we'd inserted a file reference into descriptor table, another
thread could close it.  That's fine for the case when all we are doing is
returning that descriptor to userland (it's a race, but it's a userland
race and there's nothing the kernel can do about it).  However, if we
follow fd_install() with any kind of access to objects that would be
destroyed on close (be it the struct file itself or anything destroyed
by its ->release()), we have a UAF.

dma_buf_fd() is a combination of reserving a descriptor and fd_install().
habanalabs export_dmabuf() calls it and then proceeds to access the
objects destroyed on close.  In particular, it grabs an extra reference to
another struct file that will be dropped as part of ->release() for ours;
that "will be" is actually "might have already been".

Fix that by reserving descriptor before anything else and do fd_install()
only when everything had been set up.  As a side benefit, we no longer
have the failure exit with file already created, but reference to
underlying file (as well as ->dmabuf_export_cnt, etc.) not grabbed yet;
unlike dma_buf_fd(), fd_install() can't fail.

Fixes: db1a8dd916 ("habanalabs: add support for dma-buf exporter")
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2025-07-12 01:11:14 -04:00
Maarten Lankhorst
e21354aea4 Merge remote-tracking branch 'drm/drm-next' into drm-misc-next
Pull in drm-intel-next for the updates to drm panic handling.

Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>
2025-07-08 16:49:07 +02:00
Sunil Khatri
348fe34a61 drm: move drm based debugfs funcs to drm_debugfs.c
Requirement is to create per client-id based directories to
hold key debugging information and for that access to
root debugfs dentry is need which is not in one place
and that information cannot be stored in drm_device.

Move the debugfs functionality from drm_drv.c and drm_accel.c
to drm_debugfs.c This enables debugfs root node reference
directly drm_debugfs.c and hence enable to create per client-id
directory.

v8: Create drm_accel dentry only if it's config is enabled (Jeff, Hugo)
v8: Merge drm_drv and drm_accel debugfs patches (Koenig, Christian)

v10: Since we moved drm_debugfs_root, hence to handle drm bridge
debugfs add a new function which call drm_bridge_debugfs_params where
drm_debugfs_root is accessible.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Link: https://lore.kernel.org/r/20250704075548.1549849-2-sunil.khatri@amd.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
2025-07-04 15:58:22 +02:00
Greg Kroah-Hartman
63dafeb392 Merge 6.16-rc3 into driver-core-next
We need the driver-core fixes that are in 6.16-rc3 into here as well
to build on top of.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-23 07:53:36 +02:00
Dave Airlie
9356b50af5 Merge tag 'drm-misc-next-2025-06-19' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for 6.17:

UAPI Changes:
- Add Task Information for the wedge API

Cross-subsystem Changes:

Core Changes:
- Fix warnings related to export.h
- fbdev: Make CONFIG_FIRMWARE_EDID available on all architectures
- fence: Fix UAF issues
- format-helper: Improve tests

Driver Changes:
- ivpu: Add turbo flag, Add Wildcat Lake Support
- rz-du: Improve MIPI-DSI Support
- vmwgfx: fence improvement

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://lore.kernel.org/r/20250619-perfect-industrious-whippet-8ed3db@houat
2025-06-20 11:34:09 +10:00
Dave Airlie
45215c589e Merge tag 'drm-misc-next-2025-06-12' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
drm-misc-next for 6.17:

UAPI Changes:

Cross-subsystem Changes:

Core Changes:
 - atomic-helpers: Tune the enable / disable sequence
 - bridge: Add destroy hook
 - color management: Add helpers for hardware gamma LUT handling
 - HDMI: Add CEC handling, YUV420 output support
 - sched: tracing improvements

Driver Changes:
 - hyperv: Move out of simple-kms, drm_panic support
 - i915: drm_panel_follower support
 - imx: Add IMX8qxq Display Controller Support
 - lima: Add Rockchip RK3528 GPU Support
 - nouveau: fence handling cleanup
 - panfrost: Add BO labeling, 64-bit registers access
 - qaic: Add RAS Support
 - rz-du: Add RZ/V2H(P) Support, MIPI-DSI DCS Support
 - sun4i: Add H616 Support
 - tidss: Add TI AM62L Support
 - vkms: YUV and R* formats support

 - bridges:
   - Switched to reference counted drm_bridge allocations

 - panels:
   - Switched to reference counted drm_panel allocations
   - Add support for fwnode-based panel lookup
   - himax-hx8394: Support for Huiling hl055fhv028c
   - ilitek-ili9881c: Support for 7" Raspberry Pi 720x1280
   - panel-edp: Support for KDC KD116N3730A05, N160JCE-ELL CMN,
   - panel-simple: Support for AUO P238HAN01
   - st7701: Support for Winstar wf40eswaa6mnn0
   - visionox-rm69299: Support for rm69299-shift
   - New panels: Renesas R61307, Renesas R69328

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://lore.kernel.org/r/20250612-coucal-of-impossible-cleaning-a5eecf@houat
2025-06-18 08:09:35 +10:00
Thomas Weißschuh
fb506e31b3 sysfs: treewide: switch back to attribute_group::bin_attrs
The normal bin_attrs field can now handle const pointers.
This makes the _new variant unnecessary.
Switch all users back.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20250530-sysfs-const-bin_attr-final-v3-4-724bfcf05b99@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-17 10:44:15 +02:00
Thomas Weißschuh
2fbe82037a sysfs: treewide: switch back to bin_attribute::read()/write()
The bin_attribute argument of bin_attribute::read() is now const.
This makes the _new() callbacks unnecessary. Switch all users back.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Link: https://lore.kernel.org/r/20250530-sysfs-const-bin_attr-final-v3-3-724bfcf05b99@weissschuh.net
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-17 10:44:13 +02:00
Lizhi Hou
e252e3f348 accel/amdxdna: Revise device bo creation and free
The device bo is allocated from the device heap memory. (a trunk of
memory dedicated to device)

Rename amdxdna_gem_insert_node_locked to amdxdna_gem_heap_alloc
and move related sanity checks into it.

Add amdxdna_gem_dev_obj_free and move device bo free code into it.

Calculate the kernel virtual address of device bo by the device
heap memory address and offset.

Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20250616091418.2605476-1-lizhi.hou@amd.com
2025-06-16 22:29:09 -07:00
Andrzej Kacprowski
5e861a695a accel/ivpu: Add turbo flag to the DRM_IVPU_CMDQ_CREATE ioctl
Introduce a new parameter to the DRM_IVPU_CMDQ_CREATE ioctl,
enabling turbo mode for jobs submitted via the command queue.
Turbo mode allows jobs to run at higher frequencies,
potentially improving performance for demanding workloads.

Also adds the IVPU_TEST_MODE_TURBO_DISABLE flag to allow test
mode to explicitly disable turbo mode requested by the application.
The IVPU_TEST_MODE_TURBO mode has been renamed to
IVPU_TEST_MODE_TURBO_ENABLE for clarity and consistency.

Signed-off-by: Andrzej Kacprowski <Andrzej.Kacprowski@intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250605162001.1237789-1-maciej.falkowski@linux.intel.com
2025-06-16 13:35:42 +02:00
Maciej Falkowski
8346440506 accel/ivpu: Add initial Wildcat Lake support
Add support for Wildcat Lake (WCL) CPUs.
Wildcat Lake contains NPU5 just like Panther Lake
hence the initial support is very simple and adds
only PCI IDs.

Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250605161947.1237727-1-maciej.falkowski@linux.intel.com
2025-06-16 13:35:27 +02:00
Jeff Hugo
c5ae936db7 accel/qaic: Use dev_printk() in RAS
pci_printk() was removed with commit 1c8a0ed204 ("PCI: Remove unused pci_printk()")
so change to using dev_printk().

Fixes: c11a50b170 ("accel/qaic: Add Reliability, Accessibility, Serviceability (RAS)")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/all/20250610124809.1e1ff0cd@canb.auug.org.au/
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Troy Hanson <quic_thanson@quicinc.com>
Link: https://lore.kernel.org/r/20250610175912.2086773-1-jeff.hugo@oss.qualcomm.com
2025-06-11 09:34:05 -06:00
Thomas Zimmermann
c598d5eb9f Merge drm/drm-next into drm-misc-next
Backmerging to forward to v6.16-rc1

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2025-06-11 09:01:34 +02:00
Lizhi Hou
779a0c9e06 accel/amdxdna: Fix incorrect PSP firmware size
The incorrect PSP firmware size is used for initializing. It may
cause error for newer version firmware.

Fixes: 8c9ff1b181 ("accel/amdxdna: Add a new driver for AMD AI Engine")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://lore.kernel.org/r/20250604143217.1386272-1-lizhi.hou@amd.com
2025-06-09 07:16:32 -07:00
Ingo Molnar
41cb08555c treewide, timers: Rename from_timer() to timer_container_of()
Move this API to the canonical timer_*() namespace.

[ tglx: Redone against pre rc1 ]

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com
2025-06-08 09:07:37 +02:00
Linus Torvalds
5fc6c6f258 Merge tag 'drm-fixes-2025-06-06' of https://gitlab.freedesktop.org/drm/kernel
Pull more drm fixes from Simona Vetter:
 "Another small batch of drm fixes, this time with a different baseline
  and hence separate.

  Drivers:
   - ivpu:
       - dma_resv locking
       - warning fixes
       - reset failure handling
       - improve logging
       - update fw file names
       - fix cmdqueue unregister
   - panel-simple: add Evervision VGG644804

  Core Changes:
   - sysfb: screen_info type check
   - video: screen_info for relocated pci fb
   - drm/sched: signal fence of killed job
   - dummycon: deferred takeover fix"

* tag 'drm-fixes-2025-06-06' of https://gitlab.freedesktop.org/drm/kernel:
  sysfb: Fix screen_info type check for VGA
  video: screen_info: Relocate framebuffers behind PCI bridges
  accel/ivpu: Fix warning in ivpu_gem_bo_free()
  accel/ivpu: Trigger device recovery on engine reset/resume failure
  accel/ivpu: Use dma_resv_lock() instead of a custom mutex
  drm/panel-simple: fix the warnings for the Evervision VGG644804
  accel/ivpu: Reorder Doorbell Unregister and Command Queue Destruction
  accel/ivpu: Use firmware names from upstream repo
  accel/ivpu: Improve buffer object logging
  dummycon: Trigger redraw when switching consoles with deferred takeover
  drm/scheduler: signal scheduled fence when kill job
2025-06-06 09:26:47 -07:00
Simona Vetter
93614c2ab3 accel/qaic: delete qaic_bo.handle
Handles are per-file, not global, so this makes no sense. Plus it's
set only after calling drm_gem_handle_create(), and drivers are not
allowed to further initialize a bo after that function has published it
already.

It is also entirely unused, which helps enormously with removing it
:-)

Since we're still holding a reference to the bo nothing bad can
happen, hence not cc: stable material.

Cc: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Cc: Carl Vanderlip <quic_carlv@quicinc.com>
Cc: linux-arm-msm@vger.kernel.org
Signed-off-by: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Simona Vetter <simona.vetter@intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Link: https://lore.kernel.org/r/20250528091307.1894940-5-simona.vetter@ffwll.ch
2025-06-06 10:24:11 -06:00
Jacek Lawrynowicz
91274fd4ed accel/ivpu: Fix warning in ivpu_gem_bo_free()
Don't WARN if imported buffers are in use in ivpu_gem_bo_free() as they
can be indeed used in the original context/driver.

Fixes: 647371a660 ("accel/ivpu: Add GEM buffer object management")
Cc: stable@vger.kernel.org # v6.3
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250528171220.513225-1-jacek.lawrynowicz@linux.intel.com
2025-06-05 14:49:04 +02:00
Karol Wachowski
a47e36dc5d accel/ivpu: Trigger device recovery on engine reset/resume failure
Trigger full device recovery when the driver fails to restore device state
via engine reset and resume operations. This is necessary because, even if
submissions from a faulty context are blocked, the NPU may still process
previously submitted faulty jobs if the engine reset fails to abort them.
Such jobs can continue to generate faults and occupy device resources.
When engine reset is ineffective, the only way to recover is to perform
a full device recovery.

Fixes: dad945c27a ("accel/ivpu: Add handling of VPU_JSM_STATUS_MVNCI_CONTEXT_VIOLATION_HW")
Cc: stable@vger.kernel.org # v6.15+
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250528154253.500556-1-jacek.lawrynowicz@linux.intel.com
2025-06-05 14:36:56 +02:00
Jacek Lawrynowicz
98d3f772ca accel/ivpu: Use dma_resv_lock() instead of a custom mutex
This fixes a potential race conditions in:
 - ivpu_bo_unbind_locked() where we modified the shmem->sgt without
   holding the dma_resv_lock().
 - ivpu_bo_print_info() where we read the shmem->pages without
   holding the dma_resv_lock().

Using dma_resv_lock() also protects against future syncronisation
issues that may arise when accessing drm_gem_shmem_object or
drm_gem_object members.

Fixes: 42328003ec ("accel/ivpu: Refactor BO creation functions")
Cc: stable@vger.kernel.org # v6.9+
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250528154325.500684-1-jacek.lawrynowicz@linux.intel.com
2025-06-05 14:36:37 +02:00
Linus Torvalds
3719a04a80 Merge tag 'pci-v6.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci
Pull pci updates from Bjorn Helgaas:
 "Enumeration:

   - Print the actual delay time in pci_bridge_wait_for_secondary_bus()
     instead of assuming it was 1000ms (Wilfred Mallawa)

   - Revert 'iommu/amd: Prevent binding other PCI drivers to IOMMU PCI
     devices', which broke resume from system sleep on AMD platforms and
     has been fixed by other commits (Lukas Wunner)

  Resource management:

   - Remove mtip32xx use of pcim_iounmap_regions(), which is deprecated
     and unnecessary (Philipp Stanner)

   - Remove pcim_iounmap_regions() and pcim_request_region_exclusive()
     and related flags since all uses have been removed (Philipp
     Stanner)

   - Rework devres 'request' functions so they are no longer 'hybrid',
     i.e., their behavior no longer depends on whether
     pcim_enable_device or pci_enable_device() was used, and remove
     related code (Philipp Stanner)

   - Warn (not BUG()) about failure to assign optional resources (Ilpo
     Järvinen)

  Error handling:

   - Log the DPC Error Source ID only when it's actually valid (when
     ERR_FATAL or ERR_NONFATAL was received from a downstream device)
     and decode into bus/device/function (Bjorn Helgaas)

   - Determine AER log level once and save it so all related messages
     use the same level (Karolina Stolarek)

   - Use KERN_WARNING, not KERN_ERR, when logging PCIe Correctable
     Errors (Karolina Stolarek)

   - Ratelimit PCIe Correctable and Non-Fatal error logging, with sysfs
     controls on interval and burst count, to avoid flooding logs and
     RCU stall warnings (Jon Pan-Doh)

  Power management:

   - Increment PM usage counter when probing reset methods so we don't
     try to read config space of a powered-off device (Alex Williamson)

   - Set all devices to D0 during enumeration to ensure ACPI opregion is
     connected via _REG (Mario Limonciello)

  Power control:

   - Rename pwrctrl Kconfig symbols from 'PWRCTL' to 'PWRCTRL' to match
     the filename paths. Retain old deprecated symbols for
     compatibility, except for the pwrctrl slot driver
     (PCI_PWRCTRL_SLOT) (Johan Hovold)

   - When unregistering pwrctrl, cancel outstanding rescan work before
     cleaning up data structures to avoid use-after-free issues (Brian
     Norris)

  Bandwidth control:

   - Simplify link bandwidth controller by replacing the count of Link
     Bandwidth Management Status (LBMS) events with a PCI_LINK_LBMS_SEEN
     flag (Ilpo Järvinen)

   - Update the Link Speed after retraining, since the Link Speed may
     have changed (Ilpo Järvinen)

  PCIe native device hotplug:

   - Ignore Presence Detect Changed caused by DPC.

     pciehp already ignores Link Down/Up events caused by DPC, but on
     slots using in-band presence detect, DPC causes a spurious Presence
     Detect Changed event (Lukas Wunner)

   - Ignore Link Down/Up caused by Secondary Bus Reset.

     On hotplug ports using in-band presence detect, the reset causes a
     Presence Detect Changed event, which mistakenly caused teardown and
     re-enumeration of the device. Drivers may need to annotate code
     that resets their device (Lukas Wunner)

  Virtualization:

   - Add an ACS quirk for Loongson Root Ports that don't advertise ACS
     but don't allow peer-to-peer transactions between Root Ports; the
     quirk allows each Root Port to be in a separate IOMMU group (Huacai
     Chen)

  Endpoint framework:

   - For fixed-size BARs, retain both the actual size and the possibly
     larger size allocated to accommodate iATU alignment requirements
     (Jerome Brunet)

   - Simplify ctrl/SPAD space allocation and avoid allocating more space
     than needed (Jerome Brunet)

   - Correct MSI-X PBA offset calculations for DesignWare and Cadence
     endpoint controllers (Niklas Cassel)

   - Align the return value (number of interrupts) encoding for
     pci_epc_get_msi()/pci_epc_ops::get_msi() and
     pci_epc_get_msix()/pci_epc_ops::get_msix() (Niklas Cassel)

   - Align the nr_irqs parameter encoding for
     pci_epc_set_msi()/pci_epc_ops::set_msi() and
     pci_epc_set_msix()/pci_epc_ops::set_msix() (Niklas Cassel)

  Common host controller library:

   - Convert pci-host-common to a library so platforms that don't need
     native host controller drivers don't need to include these helper
     functions (Manivannan Sadhasivam)

  Apple PCIe controller driver:

   - Extract ECAM bridge creation helper from pci_host_common_probe() to
     separate driver-specific things like MSI from PCI things (Marc
     Zyngier)

   - Dynamically allocate RID-to_SID bitmap to prepare for SoCs with
     varying capabilities (Marc Zyngier)

   - Skip ports disabled in DT when setting up ports (Janne Grunau)

   - Add t6020 compatible string (Alyssa Rosenzweig)

   - Add T602x PCIe support (Hector Martin)

   - Directly set/clear INTx mask bits because T602x dropped the
     accessors that could do this without locking (Marc Zyngier)

   - Move port PHY registers to their own reg items to accommodate
     T602x, which moves them around; retain default offsets for existing
     DTs that lack phy%d entries with the reg offsets (Hector Martin)

   - Stop polling for core refclk, which doesn't work on T602x and the
     bootloader has already done anyway (Hector Martin)

   - Use gpiod_set_value_cansleep() when asserting PERST# in probe
     because we're allowed to sleep there (Hector Martin)

  Cadence PCIe controller driver:

   - Drop a runtime PM 'put' to resolve a runtime atomic count underflow
     (Hans Zhang)

   - Make the cadence core buildable as a module (Kishon Vijay Abraham I)

   - Add cdns_pcie_host_disable() and cdns_pcie_ep_disable() for use by
     loadable drivers when they are removed (Siddharth Vadapalli)

  Freescale i.MX6 PCIe controller driver:

   - Apply link training workaround only on IMX6Q, IMX6SX, IMX6SP
     (Richard Zhu)

   - Remove redundant dw_pcie_wait_for_link() from
     imx_pcie_start_link(); since the DWC core does this, imx6 only
     needs it when retraining for a faster link speed (Richard Zhu)

   - Toggle i.MX95 core reset to align with PHY powerup (Richard Zhu)

   - Set SYS_AUX_PWR_DET to work around i.MX95 ERR051624 erratum: in
     some cases, the controller can't exit 'L23 Ready' through Beacon or
     PERST# deassertion (Richard Zhu)

   - Clear GEN3_ZRXDC_NONCOMPL to work around i.MX95 ERR051586 erratum:
     controller can't meet 2.5 GT/s ZRX-DC timing when operating at 8
     GT/s, causing timeouts in L1 (Richard Zhu)

   - Wait for i.MX95 PLL lock before enabling controller (Richard Zhu)

   - Save/restore i.MX95 LUT for suspend/resume (Richard Zhu)

  Mobiveil PCIe controller driver:

   - Return bool (not int) for link-up check in
     mobiveil_pab_ops.link_up() and layerscape-gen4, mobiveil (Hans
     Zhang)

  NVIDIA Tegra194 PCIe controller driver:

   - Create debugfs directory for 'aspm_state_cnt' only when
     CONFIG_PCIEASPM is enabled, since there are no other entries (Hans
     Zhang)

  Qualcomm PCIe controller driver:

   - Add OF support for parsing DT 'eq-presets-<N>gts' property for lane
     equalization presets (Krishna Chaitanya Chundru)

   - Read Maximum Link Width from the Link Capabilities register if DT
     lacks 'num-lanes' property (Krishna Chaitanya Chundru)

   - Add Physical Layer 64 GT/s Capability ID and register offsets for
     8, 32, and 64 GT/s lane equalization registers (Krishna Chaitanya
     Chundru)

   - Add generic dwc support for configuring lane equalization presets
     (Krishna Chaitanya Chundru)

   - Add DT and driver support for PCIe on IPQ5018 SoC (Nitheesh Sekar)

  Renesas R-Car PCIe controller driver:

   - Describe endpoint BAR 4 as being fixed size (Jerome Brunet)

   - Document how to obtain R-Car V4H (r8a779g0) controller firmware
     (Yoshihiro Shimoda)

  Rockchip PCIe controller driver:

   - Reorder rockchip_pci_core_rsts because
     reset_control_bulk_deassert() deasserts in reverse order, to fix a
     link training regression (Jensen Huang)

   - Mark RK3399 as being capable of raising INTx interrupts (Niklas
     Cassel)

  Rockchip DesignWare PCIe controller driver:

   - Check only PCIE_LINKUP, not LTSSM status, to determine whether the
     link is up (Shawn Lin)

   - Increase N_FTS (used in L0s->L0 transitions) and enable ASPM L0s
     for Root Complex and Endpoint modes (Shawn Lin)

   - Hide the broken ATS Capability in rockchip_pcie_ep_init() instead
     of rockchip_pcie_ep_pre_init() so it stays hidden after PERST#
     resets non-sticky registers (Shawn Lin)

   - Call phy_power_off() before phy_exit() in rockchip_pcie_phy_deinit()
     (Diederik de Haas)

  Synopsys DesignWare PCIe controller driver:

   - Set PORT_LOGIC_LINK_WIDTH to one lane to make initial link training
     more robust; this will not affect the intended link width if all
     lanes are functional (Wenbin Yao)

   - Return bool (not int) for link-up check in dw_pcie_ops.link_up()
     and armada8k, dra7xx, dw-rockchip, exynos, histb, keembay,
     keystone, kirin, meson, qcom, qcom-ep, rcar_gen4, spear13xx,
     tegra194, uniphier, visconti (Hans Zhang)

   - Add debugfs support for exposing DWC device-specific PTM context
     (Manivannan Sadhasivam)

  TI J721E PCIe driver:

   - Make j721e buildable as a loadable and removable module (Siddharth
     Vadapalli)

   - Fix j721e host/endpoint dependencies that result in link failures
     in some configs (Arnd Bergmann)

  Device tree bindings:

   - Add qcom DT binding for 'global' interrupt (PCIe controller and
     link-specific events) for ipq8074, ipq8074-gen3, ipq6018, sa8775p,
     sc7280, sc8180x sdm845, sm8150, sm8250, sm8350 (Manivannan
     Sadhasivam)

   - Add qcom DT binding for 8 MSI SPI interrupts for msm8998, ipq8074,
     ipq8074-gen3, ipq6018 (Manivannan Sadhasivam)

   - Add dw rockchip DT binding for rk3576 and rk3562 (Kever Yang)

   - Correct indentation and style of examples in brcm,stb-pcie,
     cdns,cdns-pcie-ep, intel,keembay-pcie-ep, intel,keembay-pcie,
     microchip,pcie-host, rcar-pci-ep, rcar-pci-host, xilinx-versal-cpm
     (Krzysztof Kozlowski)

   - Convert Marvell EBU (dove, kirkwood, armada-370, armada-xp) and
     armada8k from text to schema DT bindings (Rob Herring)

   - Remove obsolete .txt DT bindings for content that has been moved to
     schemas (Rob Herring)

   - Add qcom DT binding for MHI registers in IPQ5332, IPQ6018, IPQ8074
     and IPQ9574 (Varadarajan Narayanan)

   - Convert v3,v360epc-pci from text to DT schema binding (Rob Herring)

   - Change microchip,pcie-host DT binding to be 'dma-noncoherent' since
     PolarFire may be configured that way (Conor Dooley)

  Miscellaneous:

   - Drop 'pci' suffix from intel_mid_pci.c filename to match similar
     files (Andy Shevchenko)

   - All platforms with PCI have an MMU, so add PCI Kconfig dependency
     on MMU to simplify build testing and avoid inadvertent build
     regressions (Arnd Bergmann)

   - Update Krzysztof Wilczyński's email address in MAINTAINERS
     (Krzysztof Wilczyński)

   - Update Manivannan Sadhasivam's email address in MAINTAINERS
     (Manivannan Sadhasivam)"

* tag 'pci-v6.16-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (147 commits)
  MAINTAINERS: Update Manivannan Sadhasivam email address
  PCI: j721e: Fix host/endpoint dependencies
  PCI: j721e: Add support to build as a loadable module
  PCI: cadence-ep: Introduce cdns_pcie_ep_disable() helper for cleanup
  PCI: cadence-host: Introduce cdns_pcie_host_disable() helper for cleanup
  PCI: cadence: Add support to build pcie-cadence library as a kernel module
  MAINTAINERS: Update Krzysztof Wilczyński email address
  PCI: Remove unnecessary linesplit in __pci_setup_bridge()
  PCI: WARN (not BUG()) when we fail to assign optional resources
  PCI: Remove unused pci_printk()
  PCI: qcom: Replace PERST# sleep time with proper macro
  PCI: dw-rockchip: Replace PERST# sleep time with proper macro
  PCI: host-common: Convert to library for host controller drivers
  PCI/ERR: Remove misleading TODO regarding kernel panic
  PCI: cadence: Remove duplicate message code definitions
  PCI: endpoint: Align pci_epc_set_msix(), pci_epc_ops::set_msix() nr_irqs encoding
  PCI: endpoint: Align pci_epc_set_msi(), pci_epc_ops::set_msi() nr_irqs encoding
  PCI: endpoint: Align pci_epc_get_msix(), pci_epc_ops::get_msix() return value encoding
  PCI: endpoint: Align pci_epc_get_msi(), pci_epc_ops::get_msi() return value encoding
  PCI: cadence-ep: Correct PBA offset in .set_msix() callback
  ...
2025-06-04 11:26:17 -07:00
Karol Wachowski
8395204aeb accel/ivpu: Add inference_timeout_ms module parameter
Add new inference_timeout_ms parameter that allows specifying
maximum allowed duration in milliseconds that inference can take before
triggering a recovery.

Calculate maximum number of heartbeat retries based on ratio between
inference timeout and tdr timeout.

Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250515093128.252041-1-jacek.lawrynowicz@linux.intel.com
2025-06-02 14:39:50 +02:00
Linus Torvalds
8477ab1430 Merge tag 'iommu-updates-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux
Pull iommu updates from Joerg Roedel:
 "Core:
   - Introduction of iommu-pages infrastructure to consolitate
     page-table allocation code among hardware drivers. This is
     ground-work for more generalization in the future
   - Remove IOMMU_DEV_FEAT_SVA and IOMMU_DEV_FEAT_IOPF feature flags
   - Convert virtio-iommu to domain_alloc_paging()
   - KConfig cleanups
   - Some small fixes for possible overflows and race conditions

  Intel VT-d driver:
   - Restore WO permissions on second-level paging entries
   - Use ida to manage domain id
   - Miscellaneous cleanups

  AMD-Vi:
   - Make sure notifiers finish running before module unload
   - Add support for HTRangeIgnore feature
   - Allow matching ACPI HID devices without matching UIDs

  ARM-SMMU:
   - SMMUv2:
      - Recognise the compatible string for SAR2130P MDSS in the
        Qualcomm driver, as this device requires an identity domain
      - Fix Adreno stall handling so that GPU debugging is more robust
        and doesn't e.g. result in deadlock
   - SMMUv3:
      - Fix ->attach_dev() error reporting for unrecognised domains
   - IO-pgtable:
      - Allow clients (notably, drivers that process requests from
        userspace) to silence warnings when mapping an already-mapped
        IOVA

  S390:
   - Add support for additional table regions

  Mediatek:
   - Add support for MT6893 MM IOMMU

  And some smaller fixes and improvements in various other drivers"

* tag 'iommu-updates-v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (75 commits)
  iommu/vt-d: Restore context entry setup order for aliased devices
  iommu/mediatek: Fix compatible typo for mediatek,mt6893-iommu-mm
  iommu/arm-smmu-qcom: Make set_stall work when the device is on
  iommu/arm-smmu: Move handing of RESUME to the context fault handler
  iommu/arm-smmu-qcom: Enable threaded IRQ for Adreno SMMUv2/MMU500
  iommu/io-pgtable-arm: Add quirk to quiet WARN_ON()
  iommu: Clear the freelist after iommu_put_pages_list()
  iommu/vt-d: Change dmar_ats_supported() to return boolean
  iommu/vt-d: Eliminate pci_physfn() in dmar_find_matched_satc_unit()
  iommu/vt-d: Replace spin_lock with mutex to protect domain ida
  iommu/vt-d: Use ida to manage domain id
  iommu/vt-d: Restore WO permissions on second-level paging entries
  iommu/amd: Allow matching ACPI HID devices without matching UIDs
  iommu: make inclusion of arm/arm-smmu-v3 directory conditional
  iommu: make inclusion of riscv directory conditional
  iommu: make inclusion of amd directory conditional
  iommu: make inclusion of intel directory conditional
  iommu: remove duplicate selection of DMAR_TABLE
  iommu/fsl_pamu: remove trailing space after \n
  iommu/arm-smmu-qcom: Add SAR2130P MDSS compatible
  ...
2025-05-30 10:44:20 -07:00
Jeff Hugo
c11a50b170 accel/qaic: Add Reliability, Accessibility, Serviceability (RAS)
AIC100 devices generates Reliability, Availability, Serviceability events
via MHI QAIC_STATUS channel. Support such events and print a structured
log with details of the events, and if the event describes an uncorrected
error, reset the device to put it back into service. As these events may
not all be reported via other mechanisms like AER, maintain counts of
the number of errors observed for each type.

Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Reviewed-by: Troy Hanson <quic_thanson@quicinc.com>
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Link: https://lore.kernel.org/r/20250516160634.1408309-1-jeff.hugo@oss.qualcomm.com
2025-05-30 09:14:35 -06:00
Linus Torvalds
b08494a8f7 Merge tag 'drm-next-2025-05-28' of https://gitlab.freedesktop.org/drm/kernel
Pull drm updates from Dave Airlie:
 "As part of building up nova-core/nova-drm pieces we've brought in some
  rust abstractions through this tree, aux bus being the main one, with
  devres changes also in the driver-core tree. Along with the drm core
  abstractions and enough nova-core/nova-drm to use them. This is still
  all stub work under construction, to build the nova driver upstream.

  The other big NVIDIA related one is nouveau adds support for
  Hopper/Blackwell GPUs, this required a new GSP firmware update to
  570.144, and a bunch of rework in order to support multiple fw
  interfaces.

  There is also the introduction of an asahi uapi header file as a
  precursor to getting the real driver in later, but to unblock
  userspace mesa packages while the driver is trapped behind rust
  enablement.

  Otherwise it's the usual mixture of stuff all over, amdgpu, i915/xe,
  and msm being the main ones, and some changes to vsprintf.

  new drivers:
   - bring in the asahi uapi header standalone
   - nova-drm: stub driver

  rust dependencies (for nova-core):
   - auxiliary
       - bus abstractions
       - driver registration
       - sample driver
   - devres changes from driver-core
   - revocable changes

  core:
   - add Apple fourcc modifiers
   - add virtio capset definitions
   - extend EXPORT_SYNC_FILE for timeline syncobjs
   - convert to devm_platform_ioremap_resource
   - refactor shmem helper page pinning
   - DP powerup/down link helpers
   - extended %p4cc in vsprintf.c to support fourcc prints
   - change vsprintf %p4cn to %p4chR, remove %p4cn
   - Add drm_file_err function
   - IN_FORMATS_ASYNC property
   - move sitronix from tiny to their own subdir

  rust:
   - add drm core infrastructure rust abstractions
     (device/driver, ioctl, file, gem)

  dma-buf:
   - adjust sg handling to not cache map on attach
   - allow setting dma-device for import
   - Add a helper to sort and deduplicate dma_fence arrays

  docs:
   - updated drm scheduler docs
   - fbdev todo update
   - fb rendering
   - actual brightness

  ttm:
   - fix delayed destroy resv object

  bridge:
   - add kunit tests
   - convert tc358775 to atomic
   - convert drivers to devm_drm_bridge_alloc
   - convert rk3066_hdmi to bridge driver

  scheduler:
   - add kunit tests

  panel:
   - refcount panels to improve lifetime handling
   - Powertip PH128800T004-ZZA01
   - NLT NL13676BC25-03F, Tianma TM070JDHG34-00
   - Himax HX8279/HX8279-D DDIC
   - Visionox G2647FB105
   - Sitronix ST7571
   - ZOTAC rotation quirk

  vkms:
   - allow attaching more displays

  i915:
   - xe3lpd display updates
   - vrr refactor
   - intel_display struct conversions
   - xe2hpd memory type identification
   - add link rate/count to i915_display_info
   - cleanup VGA plane handling
   - refactor HDCP GSC
   - fix SLPC wait boosting reference counting
   - add 20ms delay to engine reset
   - fix fence release on early probe errors

  xe:
   - SRIOV updates
   - BMG PCI ID update
   - support separate firmware for each GT
   - SVM fix, prelim SVM multi-device work
   - export fan speed
   - temp disable d3cold on BMG
   - backup VRAM in PM notifier instead of suspend/freeze
   - update xe_ttm_access_memory to use GPU for non-visible access
   - fix guc_info debugfs for VFs
   - use copy_from_user instead of __copy_from_user
   - append PCIe gen5 limitations to xe_firmware document

  amdgpu:
   - DSC cleanup
   - DC Scaling updates
   - Fused I2C-over-AUX updates
   - DMUB updates
   - Use drm_file_err in amdgpu
   - Enforce isolation updates
   - Use new dma_fence helpers
   - USERQ fixes
   - Documentation updates
   - SR-IOV updates
   - RAS updates
   - PSP 12 cleanups
   - GC 9.5 updates
   - SMU 13.x updates
   - VCN / JPEG SR-IOV updates

  amdkfd:
   - Update error messages for SDMA
   - Userptr updates
   - XNACK fixes

  radeon:
   - CIK doorbell cleanup

  nouveau:
   - add support for NVIDIA r570 GSP firmware
   - enable Hopper/Blackwell support

  nova-core:
   - fix task list
   - register definition infrastructure
   - move firmware into own rust module
   - register auxiliary device for nova-drm

  nova-drm:
   - initial driver skeleton

  msm:
   - GPU:
       - ACD (adaptive clock distribution) for X1-85
       - drop fictional address_space_size
       - improve GMU HFI response time out robustness
       - fix crash when throttling during boot
   - DPU:
       - use single CTL path for flushing on DPU 5.x+
       - improve SSPP allocation code for better sharing
       - Enabled SmartDMA on SM8150, SC8180X, SC8280XP, SM8550
       - Added SAR2130P support
       - Disabled DSC support on MSM8937, MSM8917, MSM8953, SDM660
   - DP:
       - switch to new audio helpers
       - better LTTPR handling
   - DSI:
       - Added support for SA8775P
       - Added SAR2130P support
   - HDMI:
       - Switched to use new helpers for ACR data
       - Fixed old standing issue of HPD not working in some cases

  amdxdna:
   - add dma-buf support
   - allow empty command submits

  renesas:
   - add dma-buf support
   - add zpos, alpha, blend support

  panthor:
   - fail properly for NO_MMAP bos
   - add SET_LABEL ioctl
   - debugfs BO dumping support

  imagination:
   - update DT bindings
   - support TI AM68 GPU

  hibmc:
   - improve interrupt handling and HPD support

  virtio:
   - add panic handler support

  rockchip:
   - add RK3588 support
   - add DP AUX bus panel support

  ivpu:
   - add heartbeat based hangcheck

  mediatek:
   - prepares support for MT8195/99 HDMIv2/DDCv2

  anx7625:
   - improve HPD

  tegra:
   - speed up firmware loading

* tag 'drm-next-2025-05-28' of https://gitlab.freedesktop.org/drm/kernel: (1627 commits)
  drm/nouveau/tegra: Fix error pointer vs NULL return in nvkm_device_tegra_resource_addr()
  drm/xe: Default auto_link_downgrade status to false
  drm/xe/guc: Make creation of SLPC debugfs files conditional
  drm/i915/display: Add check for alloc_ordered_workqueue() and alloc_workqueue()
  drm/i915/dp_mst: Work around Thunderbolt sink disconnect after SINK_COUNT_ESI read
  drm/i915/ptl: Use everywhere the correct DDI port clock select mask
  drm/nouveau/kms: add support for GB20x
  drm/dp: add option to disable zero sized address only transactions.
  drm/nouveau: add support for GB20x
  drm/nouveau/gsp: add hal for fifo.chan.doorbell_handle
  drm/nouveau: add support for GB10x
  drm/nouveau/gf100-: track chan progress with non-WFI semaphore release
  drm/nouveau/nv50-: separate CHANNEL_GPFIFO handling out from CHANNEL_DMA
  drm/nouveau: add helper functions for allocating pinned/cpu-mapped bos
  drm/nouveau: add support for GH100
  drm/nouveau: improve handling of 64-bit BARs
  drm/nouveau/gv100-: switch to volta semaphore methods
  drm/nouveau/gsp: support deeper page tables in COPY_SERVER_RESERVED_PDES
  drm/nouveau/gsp: init client VMMs with NV0080_CTRL_DMA_SET_PAGE_DIRECTORY
  drm/nouveau/gsp: fetch level shift and PDE from BAR2 VMM
  ...
2025-05-28 09:46:39 -07:00
Pierre-Eric Pelloux-Prayer
2956554823 drm/sched: Store the drm client_id in drm_sched_fence
This will be used in a later commit to trace the drm client_id in
some of the gpu_scheduler trace events.

This requires changing all the users of drm_sched_job_init to
add an extra parameter.

The newly added drm_client_id field in the drm_sched_fence is a bit
of a duplicate of the owner one. One suggestion I received was to
merge those 2 fields - this can't be done right now as amdgpu uses
some special values (AMDGPU_FENCE_OWNER_*) that can't really be
translated into a client id. Christian is working on getting rid of
those; when it's done we should be able to squash owner/drm_client_id
together.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Philipp Stanner <phasta@kernel.org>
Link: https://lore.kernel.org/r/20250526125505.2360-3-pierre-eric.pelloux-prayer@amd.com
2025-05-28 16:15:58 +02:00
Karol Wachowski
4557cc8347 accel/ivpu: Reorder Doorbell Unregister and Command Queue Destruction
Refactor ivpu_cmdq_unregister() to ensure the doorbell is unregistered
before destroying the command queue. The NPU firmware requires doorbells
to be unregistered prior to command queue destruction.

If doorbell remains registered when command queue destroy command is sent
firmware will automatically unregister the doorbell, making subsequent
unregister attempts no-operations (NOPs).

Ensure compliance with firmware expectations by moving the doorbell
unregister call ahead of the command queue destruction logic,
thus preventing unnecessary NOP operation.

Fixes: 465a3914b2 ("accel/ivpu: Add API for command queue create/destroy/submit")
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250515094124.255141-1-jacek.lawrynowicz@linux.intel.com
2025-05-28 11:49:29 +02:00
Jacek Lawrynowicz
1c2c0e29f2 accel/ivpu: Use firmware names from upstream repo
Use FW names from linux-firmware repo instead of deprecated ones.
The vpu_37xx.bin style names were never released and were only used for
internal testing, so it is safe to remove them.

Fixes: c140244f0c ("accel/ivpu: Add initial Panther Lake support")
Cc: stable@vger.kernel.org # v6.13+
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250506092030.280276-1-jacek.lawrynowicz@linux.intel.com
2025-05-28 11:44:03 +02:00
Jacek Lawrynowicz
a01e93ee44 accel/ivpu: Improve buffer object logging
- Fix missing alloc log when drm_gem_handle_create() fails in
  drm_vma_node_allow() and open callback is not called
- Add ivpu_bo->ctx_id that enables to log the actual context
  id instead of using 0 as default
- Add couple WARNs and errors so we can catch more memory
  corruption issues

Fixes: 37dee2a2f4 ("accel/ivpu: Improve buffer object debug logs")
Cc: stable@vger.kernel.org # v6.8+
Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://lore.kernel.org/r/20250506091303.262034-1-jacek.lawrynowicz@linux.intel.com
2025-05-28 11:41:49 +02:00
Joerg Roedel
879b141b7c Merge branches 'fixes', 'apple/dart', 'arm/smmu/updates', 'arm/smmu/bindings', 'fsl/pamu', 'mediatek', 'renesas/ipmmu', 's390', 'intel/vt-d', 'amd/amd-vi' and 'core' into next 2025-05-23 17:14:32 +02:00
Ingo Molnar
412751aa69 Merge tag 'v6.15-rc7' into x86/core, to pick up fixes
Pick up build fixes from upstream to make this tree more testable.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-21 08:45:03 +02:00