linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-21 17:15:24 -04:00

Author	SHA1	Message	Date
Dave Airlie	2c38074c36	Merge tag 'drm-rust-fixes-2025-09-05' of https://gitlab.freedesktop.org/drm/rust/kernel into drm-fixes - Add drm-rust tree to MAINTAINERS - Require CONFIG_64BIT for Nova Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alice Ryhl <aliceryhl@google.com> Link: https://lore.kernel.org/r/aLquN1YvdyI_6PJS@google.com	2025-09-12 09:05:42 +10:00
Daniele Ceraolo Spurio	ba391a102e	drm/i915/guc: Include the GuC registers in the error state If GuC hangs, the GuC logs might not contain enough information to understand exactly why the hang occurred. In this case, we need to look at the GuC HW state to try to understand where the GuC is stuck. It is therefore useful to include the GuC HW state in the error capture. The list of registers that are part of the GuC HW state can change based on platform, but it is the same for all platforms from TGL to MTL so we only need to support one version for i915. v2: revised list v3: remove confusing comment, use sizeof(u32) instead of 4 (John) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250909223621.3782625-2-daniele.ceraolospurio@intel.com	2025-09-11 11:28:02 -07:00
Daniele Ceraolo Spurio	8843444843	drm/xe/guc: Set RCS/CCS yield policy All recent platforms (including all the ones officially supported by the Xe driver) do not allow concurrent execution of RCS and CCS workloads from different address spaces, with the HW blocking the context switch when it detects such a scenario. The DUAL_QUEUE flag helps with this, by causing the GuC to not submit a context it knows will not be able to execute. This, however, causes a new problem: if RCS and CCS queues have pending workloads from different address spaces, the GuC needs to choose from which of the 2 queues to pick the next workload to execute. By default, the GuC prioritizes RCS submissions over CCS ones, which can lead to CCS workloads being significantly (or completely) starved of execution time. The driver can tune this by setting a dedicated scheduling policy KLV; this KLV allows the driver to specify a quantum (in ms) and a ratio (percentage value between 0 and 100), and the GuC will prioritize the CCS for that percentage of each quantum. Given that we want to guarantee enough RCS throughput to avoid missing frames, we set the yield policy to 20% of each 80ms interval. v2: updated quantum and ratio, improved comment, use xe_guc_submit_disable in gt_sanitize Fixes: `d9a1ae0d17` ("drm/xe/guc: Enable WA_DUAL_QUEUE for newer platforms") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Tested-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://lore.kernel.org/r/20250905235632.3333247-2-daniele.ceraolospurio@intel.com	2025-09-11 09:45:35 -07:00
Michal Wajdeczko	95c1cfa306	drm/xe/pf: Drop rounddown_pow_of_two fair LMEM limitation This effectively reverts commit `4c3fe5eae4` ("drm/xe/pf: Limit fair VF LMEM provisioning") since we don't need it any more after non-contig VRAM allocations were fixed. This allows larger LMEM auto-provisioning for VFs, so instead: [ ] GT0: PF: LMEM available(14096M) fair(1 x 8192M) [ ] GT0: PF: VF1 provisioned with 8589934592 (8.00 GiB) LMEM or [ ] GT0: PF: LMEM available(14096M) fair(2 x 4096M) [ ] GT0: PF: VF1..VF2 provisioned with 4294967296 (4.00 GiB) LMEM we may get: [ ] GT0: PF: LMEM available(14096M) fair(1 x 14096M) [ ] GT0: PF: VF1 provisioned with 14780727296 (13.8 GiB) LMEM and [ ] GT0: PF: LMEM available(14096M) fair(2 x 7048M) [ ] GT0: PF: VF1..VF2 provisioned with 7390363648 (6.88 GiB) LMEM Fixes: `1e32ffbc9d` ("drm/xe/sriov: support non-contig VRAM provisioning") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Link: https://lore.kernel.org/r/20250910222439.32869-1-michal.wajdeczko@intel.com	2025-09-11 18:20:57 +02:00
Varun Gupta	010629e00d	drm/xe: Fix driver reference in FLR comment Rectify the reference of i915 to Xe in a comment. v2: Cosmetic changes. (Karthik) v3: Rephrased the commit message. (Karthik) Signed-off-by: Varun Gupta <varun.gupta@intel.com> Reviewed-by: Karthik Poosa <karthik.poosa@intel.com> Link: https://lore.kernel.org/r/20250911111712.811524-1-varun.gupta@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-09-11 08:00:33 -07:00
Vinay Belgaumkar	60d2b78991	drm/xe/guc: Add SLPC power profile interface GuC has an interface to set a power profile for the SLPC algorithm. Base mode is default and ensures a balanced performance, power_saving mode has conservative up/down thresholds and is suitable for use with apps that typically need to be power efficient. This will result in lower GT frequencies, thus consuming lower power. Selected power profile will be displayed in this format: $ cat power_profile [base] power_saving $ echo power_saving > power_profile $ cat power_profile base [power_saving] v2: Address review comments (Rodrigo) Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://lore.kernel.org/r/20250903232120.390190-1-vinay.belgaumkar@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-11 08:45:05 -04:00
Daniel Almeida	cf4fd52e32	rust: drm: Introduce the Tyr driver for Arm Mali GPUs Add a Rust driver for ARM Mali CSF-based GPUs. It is a port of Panthor and therefore exposes Panthor's uAPI and name to userspace, and the product of a joint effort between Collabora, Arm and Google engineers. The aim is to incrementally develop Tyr with the abstractions that are currently available until it is consider to be in parity with Panthor feature-wise. The development of Tyr itself started in January, after a few failed attempts of converting Panthor piecewise through a mix of Rust and C code. There is a downstream branch that's much further ahead in terms of capabilities than this initial patch. The downstream code is capable of booting the MCU, doing sync VM_BINDS through the work-in-progress GPUVM abstraction and also doing (trivial) submits through Asahi's drm_scheduler and dma_fence abstractions. So basically, most of what one would expect a modern GPU driver to do, except for power management and some other very important adjacent pieces. It is not at the point where submits can correctly deal with dependencies, or at the point where it can rotate access to the GPU hardware fairly through a software scheduler, but that is simply a matter of writing more code. This first patch, however, only implements a subset of the current features available downstream, as the rest is not implementable without pulling in even more abstractions. In particular, a lot of things depend on properly mapping memory on a given VA range, which itself depends on the GPUVM abstraction that is currently work-in-progress. For this reason, we still cannot boot the MCU and thus, cannot do much for the moment. This constitutes a change in the overall strategy that we have been using to develop Tyr so far. By submitting small parts of the driver upstream iteratively, we aim to: a) evolve together with Nova and rvkms, hopefully reducing regressions due to upstream changes (that may break us because we were not there, in the first place) b) prove any work-in-progress abstractions by having them run on a real driver and hardware and, c) provide a reason to work on and review said abstractions by providing a user, which would be tyr itself. Despite its limited feature-set, we offer IGT tests. It is only tested on the rk3588, so any other SoC is probably not going to work at all for now. The skeleton is basically taken from Nova and also rust_platform_driver.rs. Lastly, the name "Tyr" is inspired by Norse mythology, reflecting ARM's tradition of naming their GPUs after Nordic mythological figures and places. Co-developed-by: Beata Michalska <beata.michalska@arm.com> Signed-off-by: Beata Michalska <beata.michalska@arm.com> Co-developed-by: Carsten Haitzler <carsten.haitzler@foss.arm.com> Signed-off-by: Carsten Haitzler <carsten.haitzler@foss.arm.com> Co-developed-by: Rob Herring <robh@kernel.org> Signed-off-by: Rob Herring <robh@kernel.org> Link: https://www.collabora.com/news-and-blog/news-and-events/introducing-tyr-a-new-rust-drm-driver.html Signed-off-by: Daniel Almeida <daniel.almeida@collabora.com> Acked-by: Boris Brezillon <boris.brezillon@collabora.com> [aliceryhl: minor Kconfig update on apply] [aliceryhl: s/drm::device::/drm::/] Link: https://lore.kernel.org/r/20250910-tyr-v3-1-dba3bc2ae623@collabora.com Co-developed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Alice Ryhl <aliceryhl@google.com>	2025-09-11 12:20:03 +00:00
Aaron Ma	72136efb87	drm/i915/backlight: Honor VESA eDP backlight luminance control capability The VESA AUX backlight fails to be enable luminance based backlight mainpulation becaused luminance_set is false by default. Fix it by using luminance support control capabitliy. Fixes: `e13af5166a` ("drm/i915/backlight: Use drm helper to initialize edp backlight") Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/20250823121647.275834-1-aaron.ma@canonical.com	2025-09-11 14:43:54 +05:30
Thomas Hellström	692a480243	drm/xe: Fix uninitialized return values clang warned about two uninitialized variables used as return values in the exhaustive eviction series. Fix those. Fixes: `1f1541720f` ("drm/xe: Rework instances of variants of xe_bo_create_locked()") Fixes: `7bcb6e38c1` ("drm/xe/display: Convert __xe_pin_fb_vma()") Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Link: https://lore.kernel.org/r/20250910151128.49693-1-thomas.hellstrom@linux.intel.com	2025-09-11 08:45:12 +02:00
Shuicheng Lin	b98775bca9	drm/xe/tile: Release kobject for the failure path Call kobject_put() for the failure path to release the kobject v2: remove extra newline. (Matt) Fixes: `e3d0839aa5` ("drm/xe/tile: Abort driver load for sysfs creation failure") Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://lore.kernel.org/r/20250819153950.2973344-2-shuicheng.lin@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-09-10 21:00:10 -07:00
Dave Airlie	91494dee10	xe: populate buffers before exporting them. Before exporting a buffer, make sure it has been populated with pages at least once. While discussing cgroups we noticed a problem where you could export a BO to a dma-buf without having it ever being backed or accounted for. This meant in low memory situations or eventually with cgroups, a lower privledged process might cause the compositor to try and allocate a lot of memory on it's behalf and this could fail. At least make sure the exporter has managed to allocate the RAM at least once before exporting the object. This only applies currently to TTM_PL_SYSTEM objects, because GTT objects get populated on first validate, and VRAM doesn't use TT. Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://lore.kernel.org/r/20250904021643.2050497-4-airlied@gmail.com	2025-09-11 10:04:58 +10:00
Dave Airlie	3629e1b22e	nouveau: populate buffers before exporting them. Before exporting a buffer, make sure it has been populated with pages at least once. While discussing cgroups we noticed a problem where you could export a BO to a dma-buf without having it ever being backed or accounted for. This meant in low memory situations or eventually with cgroups, a lower privledged process might cause the compositor to try and allocate a lot of memory on it's behalf and this could fail. At least make sure the exporter has managed to allocate the RAM at least once before exporting the object. This only applies currently to TTM_PL_SYSTEM objects, because GTT objects get populated on first validate, and VRAM doesn't use TT. Acked-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://lore.kernel.org/r/20250904021643.2050497-3-airlied@gmail.com	2025-09-11 10:04:55 +10:00
Dave Airlie	619ddf57cf	amdgpu: populate buffers before exporting them. Before exporting a buffer, make sure it has been populated with pages at least once. While discussing cgroups we noticed a problem where you could export a BO to a dma-buf without having it ever being backed or accounted for. This meant in low memory situations or eventually with cgroups, a lower privledged process might cause the compositor to try and allocate a lot of memory on it's behalf and this could fail. At least make sure the exporter has managed to allocate the RAM at least once before exporting the object. This only applies currently to TTM_PL_SYSTEM objects, because GTT objects get populated on first validate, and VRAM doesn't use TT. Reviewed-by: Christian Koenig <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://lore.kernel.org/r/20250904021643.2050497-2-airlied@gmail.com	2025-09-11 10:04:31 +10:00
Dave Airlie	5024307986	ttm/bo: add an API to populate a bo before exporting. While discussing cgroups we noticed a problem where you could export a BO to a dma-buf without having it ever being backed or accounted for. This meant in low memory situations or eventually with cgroups, a lower privledged process might cause the compositor to try and allocate a lot of memory on it's behalf and this could fail. At least make sure the exporter has managed to allocate the RAM at least once before exporting the object. This only applies currently to TTM_PL_SYSTEM objects, because GTT objects get populated on first validate, and VRAM doesn't use TT. Reviewed-by: Christian Koenig <christian.koenig@amd.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Simona Vetter <simona.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://lore.kernel.org/r/20250904021643.2050497-1-airlied@gmail.com	2025-09-11 10:01:38 +10:00
Rob Clark	b5bad77e1e	drm/msm/registers: Sync GPU registers from mesa In particular, to pull in a SP_READ_SEL_LOCATION bitfield size fix to fix a7xx GPU snapshot. Sync from mesa commit 15ee3873aa4d ("freedreno/registers: Update GMU register xml"). Cc: Karmjit Mahil <karmjit.mahil@igalia.com> Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/673558/	2025-09-10 14:48:12 -07:00
Rob Clark	60e9f776b7	drm/msm/registers: Generate _HI/LO builders for reg64 The upstream mesa copy of the GPU regs has shifted more things to reg64 instead of seperate 32b HI/LO reg32's. This works better with the "new- style" c++ builders that mesa has been migrating to for a6xx+ (to better handle register shuffling between gens), but it leaves the C builders with missing _HI/LO builders. So handle the special case of reg64, automatically generating the missing _HI/LO builders. Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/673559/	2025-09-10 14:48:12 -07:00
Rob Clark	29e087f31b	drm/msm/registers: Make TPL1_BICUBIC_WEIGHTS_TABLE an array Synced from mesa commit 77c42c1a5752 ("freedreno/registers: Make TPL1_BICUBIC_WEIGHTS_TABLE an array"). Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/673552/	2025-09-10 14:48:12 -07:00
Rob Clark	9052817620	drm/msm/registers: Sync gen_header.py from mesa Sync from mesa commit 04e2140d8be7 ("freedreno/registers: remove python 3.9 dependency for compiling msm"). Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/673556/	2025-09-10 14:48:12 -07:00
Rob Clark	f03464c638	drm/msm/registers: Remove license/etc from generated headers Since these generated files are no longer checked in, either in mesa or in the linux kernel, simplify things by dropping the verbose generated comment. These were semi-nerf'd on the kernel side, in the name of build reproducibility, by commit `ba64c6737f` ("drivers: gpu: drm: msm: registers: improve reproducibility"), but in a way that was semi- kernel specific. We can just reduce the divergence between kernel and mesa by just dropping all of this. Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com> Patchwork: https://patchwork.freedesktop.org/patch/673551/	2025-09-10 14:48:11 -07:00
Mario Limonciello (AMD)	7df7b728c3	DRM: Add a new 'boot_display' attribute On systems with multiple GPUs there can be uncertainty which GPU is the primary one used to drive the display at bootup. In some desktop environments this can lead to increased power consumption because secondary GPUs may be used for rendering and never go to a low power state. In order to disambiguate this add a new sysfs attribute 'boot_display' that uses the output of video_is_primary_device() to populate whether the PCI device was used for driving the display. Suggested-by: Manivannan Sadhasivam <mani@kernel.org> Acked-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://gitlab.freedesktop.org/xorg/lib/libpciaccess/-/issues/23 Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://lore.kernel.org/r/20250811162606.587759-5-superm1@kernel.org Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>	2025-09-10 09:35:33 -05:00
Johan Hovold	9ba2556cef	drm/mediatek: clean up driver data initialisation The platform and drm devices are only used to look up the drm device and its driver data respectively when initialising the driver data during bind(). Drop the reference counts as soon as they have been used to make the code more readable. Note that the crtc count is never incremented on lookup failures. Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250829090345.21075-3-johan@kernel.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2025-09-10 12:52:59 +00:00
Johan Hovold	4de37a48b6	drm/mediatek: fix potential OF node use-after-free The for_each_child_of_node() helper drops the reference it takes to each node as it iterates over children and an explicit of_node_put() is only needed when exiting the loop early. Drop the recently introduced bogus additional reference count decrement at each iteration that could potentially lead to a use-after-free. Fixes: `1f403699c4` ("drm/mediatek: Fix device/node reference count leaks in mtk_drm_get_all_drm_priv") Cc: Ma Ke <make24@iscas.ac.cn> Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: CK Hu <ck.hu@mediatek.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250829090345.21075-2-johan@kernel.org/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>	2025-09-10 12:49:37 +00:00
Rodrigo Vivi	702fdf3513	Merge drm/drm-next into drm-intel-next Catching up with some display dependencies. Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-10 08:01:42 -04:00
Danilo Krummrich	d4dc08c530	Merge drm-misc-next-2025-08-21 into drm-rust-next We need the DRM Rust changes that went into drm-misc before the existence of the drm-rust tree in here as well. Signed-off-by: Danilo Krummrich <dakr@kernel.org>	2025-09-10 11:07:05 +02:00
Thomas Hellström	844150c255	drm/xe: Convert pinned suspend eviction for exhaustive eviction Pinned suspend eviction and preparation for eviction validates system memory for eviction buffers. Do that under a validation exclusive lock to avoid interfering with other processes validating system graphics memory. v2: - Avoid gotos from within xe_validation_guard(). - Adapt to signature change of xe_validation_guard(). Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250908101246.65025-14-thomas.hellstrom@linux.intel.com	2025-09-10 09:16:10 +02:00
Thomas Hellström	1f1541720f	drm/xe: Rework instances of variants of xe_bo_create_locked() A common pattern is to create a locked bo, pin it without mapping and then unlock it. Add a function to do that, which internally uses xe_validation_guard(). With that we can remove xe_bo_create_locked_range() and add exhaustive eviction to stolen, pf_provision_vf_lmem and psmi_alloc_object. v4: - New patch after reorganization. v5: - Replace DRM_XE_GEM_CPU_CACHING_WB with 0. (CI) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250908101246.65025-13-thomas.hellstrom@linux.intel.com	2025-09-10 09:16:09 +02:00
Thomas Hellström	59eabff2a3	drm/xe: Convert xe_bo_create_pin_map() for exhaustive eviction Introduce an xe_bo_create_pin_map_novm() function that does not take the drm_exec paramenter to simplify the conversion of many callsites. For the rest, ensure that the same drm_exec context that was used for locking the vm is passed down to validation. Use xe_validation_guard() where appropriate. v2: - Avoid gotos from within xe_validation_guard(). (Matt Brost) - Break out the change to pf_provision_vf_lmem8 to a separate patch. - Adapt to signature change of xe_validation_guard(). Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250908101246.65025-12-thomas.hellstrom@linux.intel.com	2025-09-10 09:16:06 +02:00
Thomas Hellström	e6108eade1	drm/xe: Convert xe_bo_create_pin_map_at() for exhaustive eviction Most users of xe_bo_create_pin_map_at() and xe_bo_create_pin_map_at_aligned() are not using the vm parameter, and that simplifies conversion. Introduce an xe_bo_create_pin_map_at_novm() function and make the _aligned() version static. Use xe_validation_guard() for conversion. v2: - Adapt to signature change of xe_validation_guard(). (Matt Brost) - Fix up documentation. v4: - Postpone the change to i915_gem_stolen_insert_node_in_range() to a later patch. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250908101246.65025-11-thomas.hellstrom@linux.intel.com	2025-09-10 09:16:05 +02:00
Thomas Hellström	550a42a8da	drm/xe: Rename ___xe_bo_create_locked() Don't start external function names with underscores. Rename to xe_bo_init_locked(). Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250908101246.65025-10-thomas.hellstrom@linux.intel.com	2025-09-10 09:16:04 +02:00
Thomas Hellström	eb289a5f6c	drm/xe: Convert xe_dma_buf.c for exhaustive eviction Convert dma-buf migration to XE_PL_TT and dma-buf import to support exhaustive eviction, using xe_validation_guard(). It seems unlikely that the import would result in an -ENOMEM, but convert import anyway for completeness. The dma-buf map_attachment() functionality unfortunately doesn't support passing a drm_exec, which means that foreign devices validating a dma-buf that we exported will not, unless they are xeKMD devices, participate in the exhaustive eviction scheme. v2: - Avoid gotos from within xe_validation_guard(). (Matt Brost) - Adapt to signature change of xe_validation_guard(). (Matt Brost) - Remove an unneeded (void)ret. (Matt Brost) - Fix up an error path. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250908101246.65025-9-thomas.hellstrom@linux.intel.com	2025-09-10 09:16:03 +02:00
Thomas Hellström	7bcb6e38c1	drm/xe/display: Convert __xe_pin_fb_vma() Convert __xe_pin_fb_vma() for exhaustive eviction using xe_validation_guard(). v2: - Avoid gotos from within xe_validation_guard(). (Matt Brost) - Adapt to signature change of xe_validation_guard(). (Matt Brost) - Use interruptible waiting, since xe_bo_migrate() already does that. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250908101246.65025-8-thomas.hellstrom@linux.intel.com	2025-09-10 09:16:02 +02:00
Thomas Hellström	c2ae94cf8c	drm/xe: Convert the CPU fault handler for exhaustive eviction The CPU fault handler may populate bos and migrate, and in doing so might interfere with other tasks validating. Rework the CPU fault handler completely into a fastpath and a slowpath. The fastpath trylocks only the validation lock in read-mode. If that fails, there's a fallback to the slowpath, where we do a full validation transaction. This mandates open-coding of bo locking, bo idling and bo populating, but we still call into TTM for fault finalizing. v2: - Rework the CPU fault handler to actually take part in the exhaustive eviction scheme (Matthew Brost). v3: - Don't return anything but VM_FAULT_RETRY if we've dropped the mmap_lock. Not even if a signal is pending. - Rebase on gpu_madvise() and split out fault migration. - Wait for idle after migration. - Check whether the resource manager uses tts to determine whether to map the tt or iomem. - Add a number of asserts. - Allow passing a ttm_operation_ctx to xe_bo_migrate() so that it's possible to try non-blocking migration. - Don't fall through to TTM on migration / population error Instead remove the gfp_retry_mayfail in mode 2 where we must succeed. (Matthew Brost) v5: - Don't allow faulting in the imported bo case (Matthew Brost) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthews Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250908101246.65025-7-thomas.hellstrom@linux.intel.com	2025-09-10 09:16:01 +02:00
Thomas Hellström	8f25e5abcb	drm/xe: Convert existing drm_exec transactions for exhaustive eviction Convert existing drm_exec transactions, like GT pagefault validation, non-LR exec() IOCTL and the rebind worker to support exhaustive eviction using the xe_validation_guard(). v2: - Adapt to signature change in xe_validation_guard() (Matt Brost) - Avoid gotos from within xe_validation_guard() (Matt Brost) - Check error return from xe_validation_guard() v3: - Rebase on gpu_madvise() Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v1 Link: https://lore.kernel.org/r/20250908101246.65025-6-thomas.hellstrom@linux.intel.com	2025-09-10 09:16:00 +02:00
Thomas Hellström	1710cd5c8c	drm/xe: Convert SVM validation for exhaustive eviction Convert SVM validation to support exhaustive eviction, using xe_validation_guard(). v2: - Wrap also xe_vm_range_rebind (Matt Brost) - Adapt to argument changes of xe_validation_guard(). v5: - Rebase on SVM stats. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250908101246.65025-5-thomas.hellstrom@linux.intel.com	2025-09-10 09:15:59 +02:00
Thomas Hellström	a2f2453c2c	drm/xe: Convert xe_bo_create_user() for exhaustive eviction Use the xe_validation_guard() to convert xe_bo_create_user() for exhaustive eviction. v2: - Adapt to argument changes of xe_validation_guard() Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v1 Link: https://lore.kernel.org/r/20250908101246.65025-4-thomas.hellstrom@linux.intel.com	2025-09-10 09:15:57 +02:00
Thomas Hellström	c460bc2311	drm/xe: Introduce an xe_validation wrapper around drm_exec Introduce a validation wrapper xe_validation_guard() as a helper intended to be used around drm_exec transactions what perform validations. Once TTM can handle exhaustive eviction we could remove this wrapper or make it mostly a NO-OP unless other functionality is added to it. Currently the wrapper takes a read lock upon entry and if the transaction hits an OOM, all locks are released and the transaction is retried with a write-lock. If all other validations participate in this scheme, the transaction with the write lock will be the only transaction validating and should have access to all available non-pinned memory. There is currently a problem in that TTM converts -EDEADLOCKS to -ENOMEM, and with ww_mutex slowpath error injections, we can hit -ENOMEMs without having actually ran out of memory. We abuse ww_mutex internals to detect such situations until TTM is fixes to not convert the error code. In the meantime, injecting ww_mutex slowpath -EDEADLOCKs is a good way to test the implementation in the absence of real OOMs. Just introduce the wrapper in this commit. It will be hooked up to the driver in following commits. v2: - Mark class_xe_validation conditional so that the loop is skipped on initialization error. - Argument sanitation (Matt Brost) - Fix conditional execution of xe_validation_ctx_fini() (Matt Brost) - Add a no_block mode for upcoming use in the CPU fault handler. v4: - Update kerneldoc. (Xe CI). Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20250908101246.65025-3-thomas.hellstrom@linux.intel.com	2025-09-10 09:15:56 +02:00
Thomas Hellström	0131514f97	drm/xe: Pass down drm_exec context to validation We want all validation (potential backing store allocation) to be part of a drm_exec transaction. Therefore add a drm_exec pointer argument to xe_bo_validate() and ___xe_bo_create_locked(). Upcoming patches will deal with making all (or nearly all) calls to these functions part of a drm_exec transaction. In the meantime, define special values of the drm_exec pointer: XE_VALIDATION_UNIMPLEMENTED: Implementation of the drm_exec transaction has not been done yet. XE_VALIDATION_UNSUPPORTED: Some Middle-layers (dma-buf) doesn't allow the drm_exec context to be passed down to map_attachment where validation takes place. XE_VALIDATION_OPT_OUT: May be used only for kunit tests where exhaustive eviction isn't crucial and the ROI of converting those is very small. For XE_VALIDATION_UNIMPLEMENTED and XE_VALIDATION_OPT_OUT there is also a lockdep check that a drm_exec transaction can indeed start at the location where the macro is expanded. This is to encourage developers to take this into consideration early in the code development process. v2: - Fix xe_vm_set_validation_exec() imbalance. Add an assert that hopefully catches future instances of this (Matt Brost) v3: - Extend to psmi_alloc_object Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> #v3 Link: https://lore.kernel.org/r/20250908101246.65025-2-thomas.hellstrom@linux.intel.com	2025-09-10 09:15:52 +02:00
David Rosca	3318f2d20c	drm/amdgpu/vcn: Allow limiting ctx to instance 0 for AV1 at any time There is no reason to require this to happen on first submitted IB only. We need to wait for the queue to be idle, but it can be done at any time (including when there are multiple video sessions active). Signed-off-by: David Rosca <david.rosca@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `8908fdce06`) Cc: stable@vger.kernel.org	2025-09-09 16:42:26 -04:00
David Rosca	2b10cb58d7	drm/amdgpu/vcn4: Fix IB parsing with multiple engine info packages There can be multiple engine info packages in one IB and the first one may be common engine, not decode/encode. We need to parse the entire IB instead of stopping after finding first engine info. Signed-off-by: David Rosca <david.rosca@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `dc8f9f0f45`) Cc: stable@vger.kernel.org	2025-09-09 16:41:49 -04:00
Pratap Nirujogi	857ccfc19f	drm/amd/amdgpu: Declare isp firmware binary file Declare isp firmware file isp_4_1_1.bin required by isp4.1.1 device. Suggested-by: Alexey Zagorodnikov <xglooom@gmail.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Pratap Nirujogi <pratap.nirujogi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `d97b74a833`) Cc: stable@vger.kernel.org	2025-09-09 16:41:15 -04:00
Alex Deucher	1d66c3f2b8	drm/amd/display: use udelay rather than fsleep This function can be called from an atomic context so we can't use fsleep(). Fixes: `01f60348d8` ("drm/amd/display: Fix 'failed to blank crtc!'") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4549 Cc: Wen Chen <Wen.Chen3@amd.com> Cc: Fangzhi Zuo <jerry.zuo@amd.com> Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `27e4dc2c05`)	2025-09-09 16:39:16 -04:00
Alex Deucher	7838fb5f11	drm/amdgpu: fix a memory leak in fence cleanup when unloading Commit `b61badd20b` ("drm/amdgpu: fix usage slab after free") reordered when amdgpu_fence_driver_sw_fini() was called after that patch, amdgpu_fence_driver_sw_fini() effectively became a no-op as the sched entities we never freed because the ring pointers were already set to NULL. Remove the NULL setting. Reported-by: Lin.Cao <lincao12@amd.com> Cc: Vitaly Prosyak <vitaly.prosyak@amd.com> Cc: Christian König <christian.koenig@amd.com> Fixes: `b61badd20b` ("drm/amdgpu: fix usage slab after free") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `a525fa37aa`) Cc: stable@vger.kernel.org	2025-09-09 16:38:26 -04:00
Rodrigo Siqueira	0855c764f7	drm/amdgpu/vcn: Change amdgpu_vcn_sw_fini return to void The function amdgpu_vcn_sw_fini() returns an integer, but this number is always 0. This commit changes the amdgpu_vcn_sw_fini() return to void, and eliminates all checks to this return across different VCNs. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-09 16:18:46 -04:00
Rodrigo Siqueira	3d9752f4f9	drm/amdgpu/vcn: Document IRQ per-instance irq behavior for VCN 4.0.3 When examining the VCN function init, it is common to find a loop that initializes VCN rings, which uses one IRQ per instance. However, VCN 4.0.3 deviates from this pattern, as it includes a distinct field to differentiate instances, which results in a slightly different ring init. This commit makes this difference explicit by using a fixed index when initializing the ring buffer and also adds a comment. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-09 16:18:41 -04:00
Prike Liang	8b38bf3883	drm/amdgpu: validate userq hw unmap status for destroying userq Before destroying the userq buffer object, it requires validating the userq HW unmap status and ensuring the userq is unmapped from hardware. If the user HW unmap failed, then it needs to reset the queue for reusing. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-09 16:18:37 -04:00
Srinivasan Shanmugam	11aaec3566	drm/amdgpu: Wire up MMIO_REMAP placement and User-visible strings Wire up the conversions and strings for the new MMIO_REMAP placement: * amdgpu_mem_type_to_domain() maps AMDGPU_PL_MMIO_REMAP -> domain * amdgpu_bo_placement_from_domain() accepts the new domain * amdgpu_bo_mem_stats_placement() and amdgpu_bo_print_info() report it * res cursor supports the new placement * fdinfo prints "mmioremap" for the new placement Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-09 16:18:34 -04:00
Srinivasan Shanmugam	357fe94b66	drm/amdgpu/ttm: Add New AMDGPU_PL_MMIO_REMAP Placement Introduce a kernel-internal TTM placement type AMDGPU_PL_MMIO_REMAP for the HDP flush MMIO remap page Plumbing added: - amdgpu_res_cursor.{first,next}: treat MMIO_REMAP like DOORBELL - amdgpu_ttm_io_mem_reserve(): return BAR bus address + offset for MMIO_REMAP, mark as uncached I/O - amdgpu_ttm_io_mem_pfn(): PFN from register BAR - amdgpu_res_cpu_visible(): visible to CPU - amdgpu_evict_flags()/amdgpu_bo_move(): non-migratable - amdgpu_ttm_tt_pde_flags(): map as SYSTEM - amdgpu_bo_mem_stats_placement(): report AMDGPU_PL_MMIO_REMAP - amdgpu_fdinfo: print “mmioremap” bucket label Cc: Alex Deucher <alexander.deucher@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-09 16:18:30 -04:00
David Rosca	8908fdce06	drm/amdgpu/vcn: Allow limiting ctx to instance 0 for AV1 at any time There is no reason to require this to happen on first submitted IB only. We need to wait for the queue to be idle, but it can be done at any time (including when there are multiple video sessions active). Signed-off-by: David Rosca <david.rosca@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-09 16:18:28 -04:00
David Rosca	dc8f9f0f45	drm/amdgpu/vcn4: Fix IB parsing with multiple engine info packages There can be multiple engine info packages in one IB and the first one may be common engine, not decode/encode. We need to parse the entire IB instead of stopping after finding first engine info. Signed-off-by: David Rosca <david.rosca@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-09 16:18:22 -04:00
Prike Liang	d426a5b6da	drm/amdgpu: clean up the amdgpu_userq_active() This is no invocation for amdgpu_userq_active(). Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-09 16:18:18 -04:00

... 9 10 11 12 13 ...

118570 Commits