Commit Graph

1415627 Commits

Author SHA1 Message Date
Jia Yao
fbbe32618e drm/xe: Add bounds check on pat_index to prevent OOB kernel read in madvise
When user provides a bogus pat_index value through the madvise IOCTL, the
xe_pat_index_get_coh_mode() function performs an array access without
validating bounds. This allows a malicious user to trigger an out-of-bounds
kernel read from the xe->pat.table array.

The vulnerability exists because the validation in madvise_args_are_sane()
directly calls xe_pat_index_get_coh_mode(xe, args->pat_index.val) without
first checking if pat_index is within [0, xe->pat.n_entries).

Although xe_pat_index_get_coh_mode() has a WARN_ON to catch this in debug
builds, it still performs the unsafe array access in production kernels.

v2(Matthew Auld)
- Using array_index_nospec() to mitigate spectre attacks when the value
is used

v3(Matthew Auld)
- Put the declarations at the start of the block

Fixes: ada7486c56 ("drm/xe: Implement madvise ioctl for xe")
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Cc: <stable@vger.kernel.org> # v6.18+
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Jia Yao <jia.yao@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patch.msgid.link/20260205161529.1819276-1-jia.yao@intel.com
(cherry picked from commit 944a3329b0)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-17 19:39:04 -05:00
Michal Wajdeczko
2a673fb4d7 drm/xe/configfs: Fix 'parameter name omitted' errors
On some configs and old compilers we can get following build errors:

  ../drivers/gpu/drm/xe/xe_configfs.h: In function 'xe_configfs_get_ctx_restore_mid_bb':
  ../drivers/gpu/drm/xe/xe_configfs.h:40:76: error: parameter name omitted
   static inline u32 xe_configfs_get_ctx_restore_mid_bb(struct pci_dev *pdev, enum xe_engine_class,
                                                                            ^~~~~~~~~~~~~~~~~~~~
  ../drivers/gpu/drm/xe/xe_configfs.h: In function 'xe_configfs_get_ctx_restore_post_bb':
  ../drivers/gpu/drm/xe/xe_configfs.h:42:77: error: parameter name omitted
   static inline u32 xe_configfs_get_ctx_restore_post_bb(struct pci_dev *pdev, enum xe_engine_class,
                                                                             ^~~~~~~~~~~~~~~~~~~~
when trying to define our configfs stub functions. Fix that.

Fixes: 7a4756b2fd ("drm/xe/lrc: Allow to add user commands mid context switch")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260203193745.576-1-michal.wajdeczko@intel.com
(cherry picked from commit f59cde8a24)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-17 19:38:57 -05:00
Michal Wajdeczko
bf7172cd25 drm/xe/pf: Fix sysfs initialization
In case of devm_add_action_or_reset() failure the provided cleanup
action will be run immediately on the not yet initialized kobject.
This may lead to errors like:

 [ ] kobject: '(null)' (ff110001393608e0): is not initialized, yet kobject_put() is being called.
 [ ] WARNING: lib/kobject.c:734 at kobject_put+0xd9/0x250, CPU#0: kworker/0:0/9
 [ ] RIP: 0010:kobject_put+0xdf/0x250
 [ ] Call Trace:
 [ ]  xe_sriov_pf_sysfs_init+0x21/0x100 [xe]
 [ ]  xe_sriov_pf_init_late+0x87/0x2b0 [xe]
 [ ]  xe_sriov_init_late+0x5f/0x2c0 [xe]
 [ ]  xe_device_probe+0x5f2/0xc20 [xe]
 [ ]  xe_pci_probe+0x396/0x610 [xe]
 [ ]  local_pci_probe+0x47/0xb0

 [ ] refcount_t: underflow; use-after-free.
 [ ] WARNING: lib/refcount.c:28 at refcount_warn_saturate+0x68/0xb0, CPU#0: kworker/0:0/9
 [ ] RIP: 0010:refcount_warn_saturate+0x68/0xb0
 [ ] Call Trace:
 [ ]  kobject_put+0x174/0x250
 [ ]  xe_sriov_pf_sysfs_init+0x21/0x100 [xe]
 [ ]  xe_sriov_pf_init_late+0x87/0x2b0 [xe]
 [ ]  xe_sriov_init_late+0x5f/0x2c0 [xe]
 [ ]  xe_device_probe+0x5f2/0xc20 [xe]
 [ ]  xe_pci_probe+0x396/0x610 [xe]
 [ ]  local_pci_probe+0x47/0xb0

Fix that by calling kobject_init() and kobject_add() separately
and register cleanup action after the kobject is initialized.

Also make this cleanup registration a part of the create helper to
fix another mistake, as in the loop we were wrongly passing parent
kobject while registering cleanup action, and this resulted in some
undetected leaks.

Fixes: 5c170a4d9c ("drm/xe/pf: Prepare sysfs for SR-IOV admin attributes")
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Link: https://patch.msgid.link/20260203235332.1350-1-michal.wajdeczko@intel.com
(cherry picked from commit 98b16727f0)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-17 19:38:50 -05:00
Dave Airlie
2f5db9b400 Merge tag 'drm-xe-next-fixes-2026-02-05' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
- Fix CFI violation in debugfs access (Daniele)
- Kernel-doc fixes (Chaitanya, Shuicheng)
- Disable D3Cold for BMG only on specific platforms (Karthik)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/aYStaLZVJWwKCDZt@intel.com
2026-02-06 13:02:44 +10:00
Dave Airlie
3c5ab2407a Merge tag 'drm-misc-next-fixes-2026-02-05' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
Several fixes for amdxdna around PM handling, error reporting and
memory safety, a compilation fix for ilitek-ili9882t, a NULL pointer
dereference fix for imx8qxp-pixel-combiner and several PTE fixes for
nouveau

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patch.msgid.link/20260205-refreshing-natural-vole-4c73af@houat
2026-02-06 12:52:15 +10:00
Dave Airlie
1099b651ae Merge tag 'drm-intel-next-fixes-2026-02-05' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
- Fix the pixel normalization handling for xe3p_lpd display

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patch.msgid.link/aYROngKfyUIyoQW0@jlahtine-mobl
2026-02-06 10:59:01 +10:00
Karthik Poosa
666c654a5a drm/xe/pm: Disable D3Cold for BMG only on specific platforms
Restrict D3Cold disablement for BMG to unsupported NUC platforms,
instead of disabling it on all platforms.

Signed-off-by: Karthik Poosa <karthik.poosa@intel.com>
Fixes: 3e331a6715 ("drm/xe/pm: Temporarily disable D3Cold on BMG")
Link: https://patch.msgid.link/20260123173238.1642383-1-karthik.poosa@intel.com
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 39125eaf88)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-05 08:03:58 -05:00
Shuicheng Lin
51cedb93da drm/xe: Fix kerneldoc for xe_tlb_inval_job_alloc_dep
Correct the function name in the kerneldoc.
It is for below warning:
"Warning: drivers/gpu/drm/xe/xe_tlb_inval_job.c:210 expecting prototype for
xe_tlb_inval_alloc_dep(). Prototype was for xe_tlb_inval_job_alloc_dep()
instead"

Fixes: 15366239e2 ("drm/xe: Decouple TLB invalidations from GT")
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129233834.419977-8-shuicheng.lin@intel.com
(cherry picked from commit 9f9c117ac5)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-05 08:03:52 -05:00
Shuicheng Lin
904b2e5063 drm/xe: Fix kerneldoc for xe_gt_tlb_inval_init_early
Correct the function name in the kerneldoc.
It is for below warning:
"Warning: drivers/gpu/drm/xe/xe_tlb_inval.c:136 expecting prototype for
xe_gt_tlb_inval_init(). Prototype was for xe_gt_tlb_inval_init_early()
instead"

v2: add () for the function. (Michal)

Fixes: db16f9d90c ("drm/xe: Split TLB invalidation code in frontend and backend")
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129233834.419977-7-shuicheng.lin@intel.com
(cherry picked from commit 0651dbb9d6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-05 08:03:46 -05:00
Shuicheng Lin
5d5ef69549 drm/xe: Fix kerneldoc for xe_migrate_exec_queue
Correct the function name in the kerneldoc.
It is for below warning:
"Warning: drivers/gpu/drm/xe/xe_migrate.c:1262 expecting prototype for
xe_get_migrate_exec_queue(). Prototype was for xe_migrate_exec_queue()
instead"

Fixes: 916ee4704a ("drm/xe/vf: Register CCS read/write contexts with Guc")
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129233834.419977-6-shuicheng.lin@intel.com
(cherry picked from commit 9fd8da7179)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-05 08:03:41 -05:00
Shuicheng Lin
8b52d9ba08 drm/xe/query: Fix topology query pointer advance
The topology query helper advanced the user pointer by the size
of the pointer, not the size of the structure. This can misalign
the output blob and corrupt the following mask. Fix the increment
to use sizeof(*topo).
There is no issue currently, as sizeof(*topo) happens to be equal
to sizeof(topo) on 64-bit systems (both evaluate to 8 bytes).

Fixes: dd08ebf6c3 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patch.msgid.link/20260130043907.465128-2-shuicheng.lin@intel.com
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
(cherry picked from commit c2a6859138)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-05 08:03:35 -05:00
Chaitanya Kumar Borah
6282995188 drm/xe/guc: Fix kernel-doc warning in GuC scheduler ABI header
The GuC scheduler ABI header contains a file-level comment that is not
intended to document a kernel-doc symbol. Using kernel-doc comment
syntax (/** */) triggers kernel-doc warnings.

With "-Werror", this causes the build to fail. Convert the comment to a
regular block comment.

HDRTEST drivers/gpu/drm/xe/abi/guc_scheduler_abi.h
Warning: drivers/gpu/drm/xe/abi/guc_scheduler_abi.h:11 This comment starts with '/**', but isn't a kernel-doc comment. Refer to Documentation/doc-guide/kernel-doc.rst
 * Generic defines required for registration with and submissions to the GuC
1 warnings as errors
make[6]: *** [drivers/gpu/drm/xe/Makefile:377: drivers/gpu/drm/xe/abi/guc_scheduler_abi.hdrtest] Error 3
make[5]: *** [scripts/Makefile.build:544: drivers/gpu/drm/xe] Error 2
make[4]: *** [scripts/Makefile.build:544: drivers/gpu/drm] Error 2
make[3]: *** [scripts/Makefile.build:544: drivers/gpu] Error 2
make[2]: *** [scripts/Makefile.build:544: drivers] Error 2
make[1]: *** [/home/kbuild2/kernel/Makefile:2088: .] Error 2
make: *** [Makefile:248: __sub-make] Error 2

v2:
 - Add Fixes tag (Daniele)

Fixes: b0c5cf4f59 ("drm/gt/guc: extract scheduler-related defines from guc_fwif.h")
Signed-off-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patch.msgid.link/20260130135210.2659200-1-chaitanya.kumar.borah@intel.com
(cherry picked from commit f89dbe14a0)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-05 08:03:30 -05:00
Daniele Ceraolo Spurio
6e035abf98 drm/xe/guc: Fix CFI violation in debugfs access.
xe_guc_print_info is void-returning, but the function pointer it is
assigned to expects an int-returning function, leading to the following
CFI error:

[  206.873690] CFI failure at guc_debugfs_show+0xa1/0xf0 [xe]
(target: xe_guc_print_info+0x0/0x370 [xe]; expected type: 0xbe3bc66a)

Fix this by updating xe_guc_print_info to return an integer.

Fixes: e15826bb3c ("drm/xe/guc: Refactor GuC debugfs initialization")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: George D Sworo <george.d.sworo@intel.com>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patch.msgid.link/20260129182547.32899-2-daniele.ceraolospurio@intel.com
(cherry picked from commit dd8ea2f2ab)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2026-02-05 08:03:25 -05:00
Lizhi Hou
69674c1c70 accel/amdxdna: Move RPM resume into job run function
Currently, amdxdna_pm_resume_get() is called during job creation, and
amdxdna_pm_suspend_put() is called when the hardware notifies job
completion. If a job is canceled before it is run, no hardware
completion notification is generated, resulting in an unbalanced
runtime PM resume/suspend pair.

Fix this by moving amdxdna_pm_resume_get() to the job run path, ensuring
runtime PM is only resumed for jobs that are actually executed.

Fixes: 063db45183 ("accel/amdxdna: Enhance runtime power management")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260204171118.3165607-1-lizhi.hou@amd.com
2026-02-04 13:08:48 -08:00
Lizhi Hou
d19d963d2a accel/amdxdna: Fix incorrect DPM level after suspend/resume
The suspend routine sets the DPM level to 0, which unintentionally
overwrites the previously saved DPM level. As a result, the device always
resumes with DPM level 0 instead of restoring the original value.

Fix this by ensuring the suspend path does not overwrite the saved DPM
level, allowing the correct DPM level to be restored during resume.

Fixes: f4d7b8a6bc ("accel/amdxdna: Enhance power management settings")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260204171048.3165580-1-lizhi.hou@amd.com
2026-02-04 13:08:35 -08:00
Dave Airlie
d19512f5ab nouveau/vmm: start tracking if the LPT PTE is valid. (v6)
When NVK enabled large pages userspace tests were seeing fault
reports at a valid address.

There was a case where an address moving from 64k page to 4k pages
could expose a race between unmapping the 4k page, mapping the 64k
page and unref the 4k pages.

Unref 4k pages would cause the dual-page table handling to always
set the LPTE entry to SPARSE or INVALID, but if we'd mapped a valid
LPTE in the meantime, it would get trashed. Keep track of when
a valid LPTE has been referenced, and don't reset in that case.

This adds an lpte valid tracker and lpte reference count.

Whenever an lpte is referenced, it gets made valid and the ref count
increases, whenever it gets unreference the refcount is tracked.

Link: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14610
Reviewed-by: Mary Guillemard <mary@mary.zone>
Tested-by: Mary Guillemard <mary@mary.zone>
Tested-by: Mel Henning <mhenning@darkrefraction.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patch.msgid.link/20260204030208.2313241-4-airlied@gmail.com
2026-02-05 06:05:09 +10:00
Dave Airlie
9dc983a85e nouveau/vmm: increase size of vmm pte tracker struct to u32 (v2)
We need to tracker large counts of spte than previously due to unref
getting delayed sometimes.

This doesn't fix LPT tracking yet, it just creates space for it.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Tested-by: Mary Guillemard <mary@mary.zone>
Tested-by: Mel Henning <mhenning@darkrefraction.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patch.msgid.link/20260204030208.2313241-3-airlied@gmail.com
2026-02-05 06:04:01 +10:00
Dave Airlie
c4d53e567d nouveau/vmm: rewrite pte tracker using a struct and bitfields.
I want to increase the counters here and start tracking LPTs as well
as there are certain situations where userspace with mixed page sizes
can cause ref/unrefs to live longer so need better reference counting.

This should be entirely non-functional.

Reviewed-by: Mary Guillemard <mary@mary.zone>
Tested-by: Mary Guillemard <mary@mary.zone>
Tested-by: Mel Henning <mhenning@darkrefraction.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patch.msgid.link/20260204030208.2313241-2-airlied@gmail.com
2026-02-05 06:03:26 +10:00
Lizhi Hou
750817a7c4 accel/amdxdna: Fix incorrect error code returned for failed chain command
The driver currently returns an incorrect error code when a chain command
fails. In this case, ERT_CMD_STATE_ERROR is expected to be reported for
failed chain commands.

Fixes: aac243092b ("accel/amdxdna: Add command execution")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260203184037.2751889-1-lizhi.hou@amd.com
2026-02-03 11:35:53 -08:00
Lizhi Hou
b853007fdc accel/amdxdna: Remove hardware context status
One newly supported command does not require hardware context configuration
to be performed upfront. As a result, checking hardware context status
causes this command to fail incorrectly.

Remove hardware context status handling entirely. For other commands,
if userspace submits a request without configuring the hardware context
first, the firmware will report an error or time out as appropriate.

Fixes: aac243092b ("accel/amdxdna: Add command execution")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260202212450.2681273-1-lizhi.hou@amd.com
2026-02-03 09:20:24 -08:00
Liu Ying
fe6d29b082 drm/bridge: imx8qxp-pixel-combiner: Fix bailout for imx8qxp_pc_bridge_probe()
In case the channel0 is unavailable and bailing out from free_child is
needed when we fail to add a DRM bridge for the available channel1,
pointer pc->ch[0] in the bailout path would be NULL and it would be
dereferenced as pc->ch[0]->bridge.next_bridge.  Fix this by checking
pc->ch[0] before dereferencing it.

Fixes: ae754f049c ("drm/bridge: imx8qxp-pixel-combiner: get/put the next bridge")
Fixes: 9976459352 ("drm/bridge: imx8qxp-pixel-combiner: convert to devm_drm_bridge_alloc() API")
Signed-off-by: Liu Ying <victor.liu@nxp.com>
Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Link: https://patch.msgid.link/20260123-imx8qxp-drm-bridge-fixes-v1-3-8bb85ada5866@nxp.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
2026-02-03 16:54:28 +01:00
Nathan Chancellor
45c0a8a702 drm/panel: ilitek-ili9882t: Remove duplicate initializers in tianma_il79900a_dsc
Clang warns (or errors with CONFIG_WERROR=y / W=e):

  drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:95:16: error: initializer overrides prior initialization of this subobject [-Werror,-Winitializer-overrides]
     95 |         .vbr_enable = 0,
        |                       ^
  drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:90:16: note: previous initialization is here
     90 |         .vbr_enable = false,
        |                       ^~~~~
  drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:97:19: error: initializer overrides prior initialization of this subobject [-Werror,-Winitializer-overrides]
     97 |         .rc_model_size = DSC_RC_MODEL_SIZE_CONST,
        |                          ^~~~~~~~~~~~~~~~~~~~~~~
  include/drm/display/drm_dsc.h:22:38: note: expanded from macro 'DSC_RC_MODEL_SIZE_CONST'
     22 | #define DSC_RC_MODEL_SIZE_CONST             8192
        |                                             ^~~~
  drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:91:19: note: previous initialization is here
     91 |         .rc_model_size = DSC_RC_MODEL_SIZE_CONST,
        |                          ^~~~~~~~~~~~~~~~~~~~~~~
  include/drm/display/drm_dsc.h:22:38: note: expanded from macro 'DSC_RC_MODEL_SIZE_CONST'
     22 | #define DSC_RC_MODEL_SIZE_CONST             8192
        |                                             ^~~~
  drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:132:25: error: initializer overrides prior initialization of this subobject [-Werror,-Winitializer-overrides]
    132 |         .initial_scale_value = 32,
        |                                ^~
  drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:126:25: note: previous initialization is here
    126 |         .initial_scale_value = 32,
        |                                ^~
  drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:133:20: error: initializer overrides prior initialization of this subobject [-Werror,-Winitializer-overrides]
    133 |         .nfl_bpg_offset = 3511,
        |                           ^~~~
  drivers/gpu/drm/panel/panel-ilitek-ili9882t.c:108:20: note: previous initialization is here
    108 |         .nfl_bpg_offset = 1402,
        |                           ^~~~

GCC would warn about this in the same manner but its version,
-Woverride-init, is disabled for a normal kernel build in
scripts/Makefile.warn. For clang, -Wextra in drivers/gpu/drm/Makefile
turns it back but GCC respects turning it off earlier in the command
line.

Of all the duplicate fields in the initializer, only nfl_bpg_offset is a
different value. Clear up the duplicate initializers, keeping the
'false' value for .vbr_enable, as it is bool, and the second value for
.nfl_bpg_offset, assuming it is the correct one since it was the one
tested in the original change.

Fixes: 65ce1f5834 ("drm/panel: ilitek-ili9882t: Switch Tianma TL121BVMS07 to DSC 120Hz mode")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Link: https://patch.msgid.link/20260114-panel-ilitek-ili9882t-fix-override-init-v1-1-1d69a2b096df@kernel.org
Signed-off-by: Maxime Ripard <mripard@kernel.org>
2026-02-03 10:03:22 +01:00
Vinod Govindapillai
3e28a67a85 drm/i915/display: fix the pixel normalization handling for xe3p_lpd
Pixel normalizer is enabled with normalization factor as 1.0 for
FP16 formats in order to support FBC for those formats in xe3p_lpd.
Previously pixel normalizer gets disabled during the plane disable
routine. But there could be plane format settings without explicitly
calling the plane disable in-between and we could endup keeping the
pixel normalizer enabled for formats which we don't require that.
This is causing crc mismatches in yuv formats and FIFO underruns in
planar formats like NV12. Fix this by updating the pixel normalizer
configuration based on the pixel formats explicitly during the plane
settings arm calls itself - enable it for FP16 and disable it for
other formats in HDR capable planes.

v2: avoid redundant pixel normalization setting updates

v3: moved the normalization factor definition to intel_fbc.c and some
    updates to comments

v4: simplified the pixel normalizer setting handling

Fixes: 5298eea7ed ("drm/i915/xe3p_lpd: use pixel normalizer for fp16 formats for FBC")
Signed-off-by: Vinod Govindapillai <vinod.govindapillai@intel.com>
Reviewed-by: Uma Shankar <uma.shankar@intel.com>
Link: https://patch.msgid.link/20260130095919.107805-1-vinod.govindapillai@intel.com
(cherry picked from commit c0dc68f4e2)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2026-02-02 13:41:03 +02:00
Dave Airlie
3cc9398a9e Merge tag 'exynos-drm-next-for-v6.20' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos into drm-next
Fix three regressions
. Fix a regression where vidi_connection_ioctl() used the wrong device
  to look up the vidi context. It stores the vidi device in exynos_drm_private
  and uses it in ioctl(), preventing invalid pointer access and related bugs.
. Fix a security regression where vidi_connection_ioctl() directly dereferenced
  a user pointer for EDID data. It copies EDID from user space
  with copy_from_user() into kernel memory before use, preventing arbitrary
  kernel memory access.
. Fix a concurrency regression where vidi_context members related
  to EDID memory were accessed without locking. It protects alloc/free and
  state updates with ctx->lock, preventing race conditions and use-after-free bugs.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Inki Dae <inki.dae@samsung.com>
Link: https://patch.msgid.link/20260201143939.27074-1-inki.dae@samsung.com
2026-02-02 11:17:12 +10:00
Dave Airlie
a60f627cf4 Merge tag 'amd-drm-next-6.20-2026-01-30' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.20-2026-01-30:

amdgpu:
- Misc cleanups
- SMU 13 fixes
- SMU 14 fixes
- GPUVM fault filter fix
- USB4 fixes
- DC FP guard fixes
- Powergating fix
- JPEG ring reset fix
- RAS fixes
- Xclk fix for soc21 APUs
- Fix COND_EXEC handling for GC 11
- UserQ fixes
- MQD size alignment fixes
- SMU feature interface cleanup
- GC 10-12 KGQ init fixes
- GC 11-12 KGQ reset fixes

amdkfd:
- Fix device snapshot reporting
- GC 12.1 trap handler fixes
- MQD size alignment fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260130183257.28879-1-alexander.deucher@amd.com
2026-02-02 05:51:54 +10:00
Jeongjun Park
52b330799e drm/exynos: vidi: use ctx->lock to protect struct vidi_context member variables related to memory alloc/free
Exynos Virtual Display driver performs memory alloc/free operations
without lock protection, which easily causes concurrency problem.

For example, use-after-free can occur in race scenario like this:
```
	CPU0				CPU1				CPU2
	----				----				----
  vidi_connection_ioctl()
    if (vidi->connection) // true
      drm_edid = drm_edid_alloc(); // alloc drm_edid
      ...
      ctx->raw_edid = drm_edid;
      ...
								drm_mode_getconnector()
								  drm_helper_probe_single_connector_modes()
								    vidi_get_modes()
								      if (ctx->raw_edid) // true
								        drm_edid_dup(ctx->raw_edid);
								          if (!drm_edid) // false
								          ...
				vidi_connection_ioctl()
				  if (vidi->connection) // false
				    drm_edid_free(ctx->raw_edid); // free drm_edid
				    ...
								          drm_edid_alloc(drm_edid->edid)
								            kmemdup(edid); // UAF!!
								            ...
```

To prevent these vulns, at least in vidi_context, member variables related
to memory alloc/free should be protected with ctx->lock.

Cc: <stable@vger.kernel.org>
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2026-02-01 23:28:01 +09:00
Jeongjun Park
d4c98c077c drm/exynos: vidi: fix to avoid directly dereferencing user pointer
In vidi_connection_ioctl(), vidi->edid(user pointer) is directly
dereferenced in the kernel.

This allows arbitrary kernel memory access from the user space, so instead
of directly accessing the user pointer in the kernel, we should modify it
to copy edid to kernel memory using copy_from_user() and use it.

Cc: <stable@vger.kernel.org>
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2026-02-01 23:28:01 +09:00
Jeongjun Park
d3968a0d85 drm/exynos: vidi: use priv->vidi_dev for ctx lookup in vidi_connection_ioctl()
vidi_connection_ioctl() retrieves the driver_data from drm_dev->dev to
obtain a struct vidi_context pointer. However, drm_dev->dev is the
exynos-drm master device, and the driver_data contained therein is not
the vidi component device, but a completely different device.

This can lead to various bugs, ranging from null pointer dereferences and
garbage value accesses to, in unlucky cases, out-of-bounds errors,
use-after-free errors, and more.

To resolve this issue, we need to store/delete the vidi device pointer in
exynos_drm_private->vidi_dev during bind/unbind, and then read this
exynos_drm_private->vidi_dev within ioctl() to obtain the correct
struct vidi_context pointer.

Cc: <stable@vger.kernel.org>
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
2026-02-01 23:27:56 +09:00
Zishun Yi
84dd57fb03 accel/amdxdna: Fix memory leak in amdxdna_ubuf_map
The amdxdna_ubuf_map() function allocates memory for sg and
internal sg table structures, but it fails to free them if subsequent
operations (sg_alloc_table_from_pages or dma_map_sgtable) fail.

Fixes: bd72d4acda ("accel/amdxdna: Support user space allocated buffer")
Signed-off-by: Zishun Yi <zishun.yi.dev@gmail.com>
Reviewed-by: Lizhi Hou <lizhi.hou@amd.com>
Reviewed-by: Min Ma <mamin506@gmail.com>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260129171022.68578-1-zishun.yi.dev@gmail.com
2026-01-30 11:52:59 -08:00
Lizhi Hou
f1370241fe accel/amdxdna: Stop job scheduling across aie2_release_resource()
Running jobs on a hardware context while it is in the process of
releasing resources can lead to use-after-free and crashes.

Fix this by stopping job scheduling before calling
aie2_release_resource() and restarting it after the release completes.
Additionally, aie2_sched_job_run() now checks whether the hardware
context is still active.

Fixes: 4fd6ca90fc ("accel/amdxdna: Refactor hardware context destroy routine")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260130003255.2083255-1-lizhi.hou@amd.com
2026-01-30 11:52:53 -08:00
Lizhi Hou
a9162439ad accel/amdxdna: Hold mm structure across iommu_sva_unbind_device()
Some tests trigger a crash in iommu_sva_unbind_device() due to
accessing iommu_mm after the associated mm structure has been
freed.

Fix this by taking an explicit reference to the mm structure
after successfully binding the device, and releasing it only
after the device is unbound. This ensures the mm remains valid
for the entire SVA bind/unbind lifetime.

Fixes: be462c97b7 ("accel/amdxdna: Add hardware context")
Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Link: https://patch.msgid.link/20260128002356.1858122-1-lizhi.hou@amd.com
2026-01-30 11:52:45 -08:00
Dave Airlie
502d2d8e01 Merge tag 'drm-xe-next-fixes-2026-01-29' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
- Reduce LRC timestamp stuck message on VFs to notice (Brost)
- Disable GuC Power DCC strategy on PTL (Vinay)
- Unregister drm device on probe error (Lin)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/aXuyrtsnlAOmj_OB@intel.com
2026-01-30 13:02:41 +10:00
Dave Airlie
8fbe215d37 Merge tag 'drm-misc-next-fixes-2026-01-29' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
Two fixes for NULL pointer dereference in imx8 following the bridge
refcounting conversions, and one for the bridge connector following the
HDMI audio reworks.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patch.msgid.link/20260129-efficient-jerboa-of-ecstasy-822832@houat
2026-01-30 12:54:09 +10:00
Dave Airlie
608fb0a78c Merge tag 'drm-intel-next-fixes-2026-01-29' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
- Prevent u64 underflow in intel_fbc_stolen_end

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patch.msgid.link/aXsWGWjacEJ03rTs@jlahtine-mobl
2026-01-30 12:03:26 +10:00
Alex Deucher
0a6d6ed694 drm/amdgpu/gfx12: adjust KGQ reset sequence
Kernel gfx queues do not need to be reinitialized or
remapped after a reset.  Align with gfx11.

v2: preserve init and remap for MMIO case.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:27:37 -05:00
Alex Deucher
b340ff216f drm/amdgpu/gfx11: adjust KGQ reset sequence
Kernel gfx queues do not need to be reinitialized or
remapped after a reset.  This fixes queue reset failures
on APUs.

v2: preserve init and remap for MMIO case.

Fixes: b3e9bfd866 ("drm/amdgpu/gfx11: add ring reset callbacks")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4789
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:27:29 -05:00
Alex Deucher
a2918f958d drm/amdgpu/gfx12: fix wptr reset in KGQ init
wptr is a 64 bit value and we need to update the
full value, not just 32 bits. Align with what we
already do for KCQs.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:27:27 -05:00
Alex Deucher
1f16866bdb drm/amdgpu/gfx11: fix wptr reset in KGQ init
wptr is a 64 bit value and we need to update the
full value, not just 32 bits. Align with what we
already do for KCQs.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:27:18 -05:00
Alex Deucher
e80b1d1aa1 drm/amdgpu/gfx10: fix wptr reset in KGQ init
wptr is a 64 bit value and we need to update the
full value, not just 32 bits. Align with what we
already do for KCQs.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:27:10 -05:00
Lang Yu
2bddc36c12 drm/amdkfd: Use AMDGPU_MQD_SIZE_ALIGN in gfx11+ kfd mqd manager
MES is enabled by default from gfx11+, use AMDGPU_MQD_SIZE_ALIGN
unconditionally for gfx11+.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:27:02 -05:00
Lang Yu
3aca6f835b drm/amdkfd: Adjust parameter of allocate_mqd
Make allocate_mqd consistent with other callbacks.
Prepare for next patch to use mqd_manager->mqd_size.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:26:58 -05:00
Lang Yu
a6a4dd519c drm/amdgpu: Use AMDGPU_MQD_SIZE_ALIGN in KGD
Use AMDGPU_MQD_SIZE_ALIGN for both kernel and user queue.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: David Belanger <david.belanger@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:26:55 -05:00
Lijo Lazar
0d9a49a2ce drm/amd/pm: Initialize allowed feature list
Instead of returning feature bit mask of allowed features, initialize
the allowed features in the callback implementation itself.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:26:48 -05:00
Lijo Lazar
156c0ab1de drm/amd/pm: Remove unused logic in SMUv14.0.2
Remove commented and redundant logic in get_allowed_feature_mask
implementation.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:26:44 -05:00
Lijo Lazar
c99d381d2d drm/amd/pm: Add smu feature interface functions
Instead of using bitmap operations, add wrapper interface functions to
operate on smu features.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:26:41 -05:00
Lijo Lazar
f28b0a1386 drm/amd/pm: Add smu feature bits data struct
Add a bitmap struct to represent smu feature bits and functions to set/clear features.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:26:36 -05:00
Lang Yu
82a9ab369a drm/amdgpu: Add a helper macro to align mqd size
MES FW uses address(mqd_addr + sizeof(struct mqd) + 3*sizeof(uint32_t))
as fence address and writes a 32 bit fence value to this address. Driver
needs to allocate some extra memory(at least 4 DWs) in addition to
sizeof(struct mqd) as mqd memory(limited to gfx/compute/sdma queue).

For gfx11/12, sizeof(struct mqd) < PAGE_SIZE, KGD allocates mqd memory with
PAGE_SIZE aligned works. For gfx12.1, sizeof(struct mqd) == PAGE_SIZE,
it doesn't work.

KFD mqd manager hardcodes mqd size to PAGE_SIZE/MQD_SIZE across different
IP versions to solve this issue.

To avoid hardcoding in differnet places and across different IP versions.
Let's use AMDGPU_MQD_SIZE_ALIGN instead. It is used in two places.

1. mqd memory alloction
2. mqd stride handling for multi xcc config

v2: Use AMDGPU_GPU_PAGE_ALIGN. (Mukul)

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: David Belanger <david.belanger@amd.com> (v1)
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:26:26 -05:00
Jesse.Zhang
8079b87c02 drm/amdgpu: validate user queue size constraints
Add validation to ensure user queue sizes meet hardware requirements:
- Size must be a power of two for efficient ring buffer wrapping
- Size must be at least AMDGPU_GPU_PAGE_SIZE to prevent undersized allocations

This prevents invalid configurations that could lead to GPU faults or
unexpected behavior.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-29 12:26:15 -05:00
Alex Deucher
ba205ac3d6 drm/amdgpu: Fix cond_exec handling in amdgpu_ib_schedule()
The EXEC_COUNT field must be > 0.  In the gfx shadow
handling we always emit a cond_exec packet after the gfx_shadow
packet, but the EXEC_COUNT never gets patched.  This leads
to a hang when we try and reset queues on gfx11 APUs.

Fixes: c68cbbfd54 ("drm/amdgpu: cleanup conditional execution")
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4789
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-28 16:21:45 -05:00
Alex Deucher
637fee3954 drm/amdgpu/soc21: fix xclk for APUs
The reference clock is supposed to be 100Mhz, but it
appears to actually be slightly lower (99.81Mhz).

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14451
Reviewed-by: Jesse Zhang <Jesse.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-28 16:21:31 -05:00