linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-02-17 13:40:32 -05:00

Author	SHA1	Message	Date
Alex Deucher	f8b367e6fa	drm/amdgpu: suspend KFD and KGD user queues for S0ix We need to make sure the user queues are preempted so GFX can enter gfxoff. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Tested-by: David Perry <david.perry@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-18 09:43:30 -04:00
Alex Deucher	846de1384a	drm/amdgpu/userq: Optimize S0ix handling In S0i3, GFX state is retained, so it's preferrable to preempt queues rather than unmapping them as the overhead is lower. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Tested-by: David Perry <david.perry@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-18 09:43:23 -04:00
Joe.Wang	f05c03ffc7	drm/amdgpu: Fix PRT flag for gfx12 AMDGPU_PTE_PRT_GFX12 flag is missed during pageTable rework, add it back. Fixes: `6716a823d1` ("drm/amdgpu: rework how PTE flags are generated v3") Signed-off-by: Joe Wang <joe.wang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-18 09:43:10 -04:00
Xiang Liu	1ed511fb76	drm/amdgpu: Check VF critical region before RAS poison injection Check VF critical region before RAS poison injection to ensure that the poison injection will not hit the VF critical region. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-18 09:43:10 -04:00
Alex Deucher	4bfa860993	drm/amdkfd: add proper handling for S0ix When in S0i3, the GFX state is retained, so all we need to do is stop the runlist so GFX can enter gfxoff. Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Tested-by: David Perry <david.perry@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-18 09:43:02 -04:00
Xiang Liu	f1fdeb3d07	drm/amdgpu: Introduce VF critical region check for RAS poison injection The SRIOV guest send requet to host to check whether the poison injection address is in VF critical region or not via mabox. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Shravan Kumar Gande <Shravankumar.Gande@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-18 09:43:02 -04:00
Alex Deucher	18f769ff36	drm/amdgpu: remove non-DC DCE 11 code DC has been the default for ~8 years now and supports many things that the non-DC code does not (audio, DP MST, etc.). No DCE 11.x IPs ever supported analog encoders so that is not an issue. Finally drop this code. Acked-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-18 09:43:02 -04:00
Asad Kamal	34f10da667	drm/amd/pm: Enable npm metrics data Enable npm metrics data for smu_v13_0_12 v3: Add node id check for setting NPM_CAPS (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-18 09:43:02 -04:00
Asad Kamal	acae8ad69b	drm/amd/pm: Fetch npm data from system metrics table Fetch npm data from system metrics table for smu_v13_0_12 v3: Remove intermittent type for npm data, remove node id check, move npm caps check to npm_get_data function (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-18 09:43:02 -04:00
Asad Kamal	ef612f58d9	drm/amd/pm: Add sysfs node for node power Add sysfs node to expose node power limit for smu_v13_0_12 v2: Remove support check from visible function (Kevin) v3: Update comments (Kevin) Remove sysfs remove file, change format specifier for sysfs_emit, use attribute_group.name (Lijo) Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-18 09:43:02 -04:00
Asad Kamal	4072b16dd8	drm/amd/pm: Allow system metrics table in 1vf mode Allow fetching system metrics table in 1VF mode Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-18 09:43:02 -04:00
Thomas Hellström	187e16f69d	drm/xe: Work around clang multiple goto-label error When using drm_exec_retry_on_contention(), clang may consider all labels for which we take addresses in a function as potential retry goto targets, although strictly only one is possible. It will then in some situations generate false positive errors. In this case, the compiler, for some architectures, consider the might_lock(&m->job_mutex); as a potential goto target from drm_exec_retry_on_contention(), and errors. Work around that by moving the xe_validate / drm_exec transaction to a separate function. v2: - New commit message based on analysis of Nathan Chancellor Fixes: `59eabff2a3` ("drm/xe: Convert xe_bo_create_pin_map() for exhaustive eviction") Cc: Matthew Brost <matthew.brost@intel.com> Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202509101853.nDmyxTEM-lkp@intel.com/ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> # build Link: https://lore.kernel.org/r/20250911080324.180307-1-thomas.hellstrom@linux.intel.com	2025-09-18 08:27:00 +02:00
Daniele Ceraolo Spurio	26caeae9fb	drm/xe/guc: Set RCS/CCS yield policy All recent platforms (including all the ones officially supported by the Xe driver) do not allow concurrent execution of RCS and CCS workloads from different address spaces, with the HW blocking the context switch when it detects such a scenario. The DUAL_QUEUE flag helps with this, by causing the GuC to not submit a context it knows will not be able to execute. This, however, causes a new problem: if RCS and CCS queues have pending workloads from different address spaces, the GuC needs to choose from which of the 2 queues to pick the next workload to execute. By default, the GuC prioritizes RCS submissions over CCS ones, which can lead to CCS workloads being significantly (or completely) starved of execution time. The driver can tune this by setting a dedicated scheduling policy KLV; this KLV allows the driver to specify a quantum (in ms) and a ratio (percentage value between 0 and 100), and the GuC will prioritize the CCS for that percentage of each quantum. Given that we want to guarantee enough RCS throughput to avoid missing frames, we set the yield policy to 20% of each 80ms interval. v2: updated quantum and ratio, improved comment, use xe_guc_submit_disable in gt_sanitize Fixes: `d9a1ae0d17` ("drm/xe/guc: Enable WA_DUAL_QUEUE for newer platforms") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Tested-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Link: https://lore.kernel.org/r/20250905235632.3333247-2-daniele.ceraolospurio@intel.com (cherry picked from commit `8843444843`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> [Rodrigo added #include "xe_guc_submit.h" while backporting]	2025-09-17 20:23:47 -04:00
Michal Wajdeczko	fb3c27a69c	drm/xe/sysfs: Simplify sysfs registration Instead of manually maintaining each sysfs file define and use attribute groups and register them using device managed function. Then use is_visible() to filter-out unsupported attributes. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250916170029.3313-3-michal.wajdeczko@intel.com	2025-09-17 21:59:31 +02:00
Michal Wajdeczko	a2d6223d22	drm/xe/vf: Don't expose sysfs attributes not applicable for VFs VFs can't read BMG_PCIE_CAP(0x138340) register nor access PCODE (already guarded by the info.skip_pcode flag) so we shouldn't expose attributes that require any of them to avoid errors like: [] xe 0000:03:00.1: [drm] Tile0: GT0: VF is trying to read an \ inaccessible register 0x138340+0x0 [] RIP: 0010:xe_gt_sriov_vf_read32+0x6c2/0x9a0 [xe] [] Call Trace: [] xe_mmio_read32+0x110/0x280 [xe] [] auto_link_downgrade_capable_show+0x2e/0x70 [xe] [] dev_attr_show+0x1a/0x70 [] sysfs_kf_seq_show+0xaa/0x120 [] kernfs_seq_show+0x41/0x60 Fixes: `0e414bf7ad` ("drm/xe: Expose PCIe link downgrade attributes") Fixes: `cdc36b66cd` ("drm/xe: Expose fan control and voltage regulator version") Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Lukasz Laguna <lukasz.laguna@intel.com> Reviewed-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250916170029.3313-2-michal.wajdeczko@intel.com	2025-09-17 21:59:29 +02:00
Shuicheng Lin	33fe111a35	drm/xe/madvise: Fix ioctl argument check It is "preferred_mem_loc" instead of "atomic" for the ATTR_PREFERRED_LOC path. Also include 2 minor changes with no functional impact. 1. Remove the redundant "attr.atomic_access" assignment. 2. Replace down_read_interruptible() with xe_svm_notifier_lock_interruptible() to pair with xe_svm_notifier_unlock(). Fixes: `ada7486c56` ("drm/xe: Implement madvise ioctl for xe") Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Link: https://lore.kernel.org/r/20250911173139.1405878-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2025-09-17 10:09:34 -07:00
Shuicheng Lin	5959c4da17	drm/xe: Misc refine for svm These changes should have no functional impact. 1. Correct typo of "operation"in macro range_debug(). 2. Combine 2 spin_lock() call in xe_svm_garbage_collector() into 1. 3. Drop redundant preferred_region_is_vram check in xe_svm_range_needs_migrate_to_vram(). 4. Combine the devmem_possible check in xe_svm_handle_pagefault(). need_vram includes the IS_DGFX() check, so there is no change for .devmem_only. v2: revert !ctx.devmem_only change (Matt) v3: rebase code and refine commit message. v4: rebase code and refine commit message. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://lore.kernel.org/r/20250911031405.1371812-2-shuicheng.lin@intel.com	2025-09-17 09:50:44 -07:00
Daniele Ceraolo Spurio	ae5fbbda34	drm/xe: Fix error handling if PXP fails to start Since the PXP start comes after __xe_exec_queue_init() has completed, we need to cleanup what was done in that function in case of a PXP start error. __xe_exec_queue_init calls the submission backend init() function, so we need to introduce an opposite for that. Unfortunately, while we already have a fini() function pointer, it performs other operations in addition to cleaning up what was done by the init(). Therefore, for clarity, the existing fini() has been renamed to destroy(), while a new fini() has been added to only clean up what was done by the init(), with the latter being called by the former (via xe_exec_queue_fini). Fixes: `72d479601d` ("drm/xe/pxp/uapi: Add userspace and LRC support for PXP-using queues") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250909221240.3711023-3-daniele.ceraolospurio@intel.com (cherry picked from commit `626667321d`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-17 12:28:55 -04:00
Zongyao Bai	ff89a4d285	drm/xe/sysfs: Add cleanup action in xe_device_sysfs_init On partial failure, some sysfs files created before the failure might not be removed. Add common cleanup step to remove them all immediately, as is should be harmless to attempt to remove non-existing files. Fixes: `0e414bf7ad` ("drm/xe: Expose PCIe link downgrade attributes") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Zongyao Bai <zongyao.bai@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250915214716.1327379-2-zongyao.bai@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> (cherry picked from commit `1a869168d9`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-17 12:28:55 -04:00
Michal Wajdeczko	5bb5258e35	drm/xe/tests: Add pre-GMDID IP descriptors to param generators Recently introduced kunit parameter generators were based on the existing arrays which have only GDMID-based IPs and didn't take into account IP definitions from pre-GMDID era. Add test only arrays with pre-GMDID IPs (as those will not change) and extend param generators to start iterating over them. [ ] =================== xe_pci (2 subtests) ==================== [ ] ==================== check_graphics_ip ==================== [ ] [PASSED] 12.00 Xe_LP [ ] [PASSED] 12.10 Xe_LP+ [ ] [PASSED] 12.55 Xe_HPG [ ] [PASSED] 12.60 Xe_HPC [ ] [PASSED] 12.70 Xe_LPG [ ] [PASSED] 12.71 Xe_LPG [ ] [PASSED] 12.74 Xe_LPG+ [ ] [PASSED] 20.01 Xe2_HPG [ ] [PASSED] 20.02 Xe2_HPG [ ] [PASSED] 20.04 Xe2_LPG [ ] [PASSED] 30.00 Xe3_LPG [ ] [PASSED] 30.01 Xe3_LPG [ ] [PASSED] 30.03 Xe3_LPG [ ] ================ [PASSED] check_graphics_ip ================ [ ] ===================== check_media_ip ====================== [ ] [PASSED] 12.00 Xe_M [ ] [PASSED] 12.55 Xe_HPM [ ] [PASSED] 13.00 Xe_LPM+ [ ] [PASSED] 13.01 Xe2_HPM [ ] [PASSED] 20.00 Xe2_LPM [ ] [PASSED] 30.00 Xe3_LPM [ ] [PASSED] 30.02 Xe3_LPM [ ] ================= [PASSED] check_media_ip ================== [ ] ===================== [PASSED] xe_pci ====================== Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250916171645.3335-1-michal.wajdeczko@intel.com	2025-09-17 15:29:13 +02:00
Dave Airlie	6f17ab9a63	Merge tag 'drm-rust-next-2025-09-16' of https://gitlab.freedesktop.org/drm/rust/kernel into drm-next DRM Rust changes for v6.18 Alloc - Add BorrowedPage type and AsPageIter trait - Implement Vmalloc::to_page() and VmallocPageIter - Implement AsPageIter for VBox and VVec DMA & Scatterlist - Add dma::DataDirection and type alias for dma_addr_t - Abstraction for struct scatterlist and struct sg_table DRM - In the DRM GEM module, simplify overall use of generics, add DriverFile type alias and drop Object::SIZE. Nova (Core) - Various register!() macro improvements (paving the way for lifting it to common driver infrastructure) - Minor VBios fixes and refactoring - Minor firmware request refactoring - Advance firmware boot stages; process Booter and patch its signature, process GSP and GSP bootloader - Switch development fimrware version to r570.144 - Add basic firmware bindings for r570.144 - Move GSP boot code to its own module - Clean up and take advantage of pin-init features to store most of the driver's private data within a single allocation - Update ARef import from sync::aref - Add website to MAINTAINERS entry Nova (DRM) - Update ARef import from sync::aref - Add website to MAINTAINERS entry Pin-Init - Merge pin-init PR from Benno - `#[pin_data]` now generates a `Projection` struct similar to the `pin-project` crate. - Add initializer code blocks to `[try_][pin_]init!` macros: make initializer macros accept any number of `_: {/ arbitrary code */},` & make them run the code at that point. - Make the `[try_][pin_]init!` macros expose initialized fields via a `let` binding as `&mut T` or `Pin<&mut T>` for later fields. Rust - Various methods for AsBytes and FromBytes traits Tyr - Initial Rust driver skeleton for ARM Mali GPUs. - It can power up the GPU, query for GPU metatdata through MMIO and provide the metadata to userspace via DRM device IOCTL (struct drm_panthor_dev_query). Signed-off-by: Dave Airlie <airlied@redhat.com> From: "Danilo Krummrich" <dakr@kernel.org> Link: https://lore.kernel.org/r/DCUC4SY6SRBD.1ZLHAIQZOC6KG@kernel.org	2025-09-17 16:13:49 +10:00
Daniele Ceraolo Spurio	aaae483657	drm/xe: Allow error injection for xe_pxp_exec_queue_add This will allow us to simulate this function returning an error like we do for other functions called in the exec_queue_create path. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250909221240.3711023-4-daniele.ceraolospurio@intel.com	2025-09-16 15:54:31 -07:00
Daniele Ceraolo Spurio	626667321d	drm/xe: Fix error handling if PXP fails to start Since the PXP start comes after __xe_exec_queue_init() has completed, we need to cleanup what was done in that function in case of a PXP start error. __xe_exec_queue_init calls the submission backend init() function, so we need to introduce an opposite for that. Unfortunately, while we already have a fini() function pointer, it performs other operations in addition to cleaning up what was done by the init(). Therefore, for clarity, the existing fini() has been renamed to destroy(), while a new fini() has been added to only clean up what was done by the init(), with the latter being called by the former (via xe_exec_queue_fini). Fixes: `72d479601d` ("drm/xe/pxp/uapi: Add userspace and LRC support for PXP-using queues") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://lore.kernel.org/r/20250909221240.3711023-3-daniele.ceraolospurio@intel.com	2025-09-16 15:54:28 -07:00
Mario Limonciello	f9b80514a7	drm/amd: Only restore cached manual clock settings in restore if OD enabled If OD is not enabled then restoring cached clock settings doesn't make sense and actually leads to errors in resume. Check if enabled before restoring settings. Fixes: `4e9526924d` ("drm/amd: Restore cached manual clock settings during resume") Reported-by: Jérôme Lécuyer <jerome.4a4c@gmail.com> Closes: https://lore.kernel.org/amd-gfx/0ffe2692-7bfa-4821-856e-dd0f18e2c32b@amd.com/T/#me6db8ddb192626360c462b7570ed7eba0c6c9733 Suggested-by: Jérôme Lécuyer <jerome.4a4c@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit `1a4dd33cc6`) Cc: stable@vger.kernel.org	2025-09-16 17:58:34 -04:00
Christian König	df99f6d112	drm/amdgpu: re-order and document VM code Re-order fields in the VM structure and try to improve the documentation a bit. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:51:48 -04:00
Christian König	930595df25	drm/amdgpu: remove check for BO reservation add assert instead We should leave such checks to lockdep and not implement something manually. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:51:35 -04:00
Asad Kamal	beae798b68	drm/amd/pm: Update pmfw headers for smu_v13_0_12 Update pmfw headers for smu_v13_0_12 to include node power limit Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:51:30 -04:00
Asad Kamal	e09a4fdfbb	drm/amd/pm: Rename amdgpu_hwmon_get_sensor_generic Rename amdgpu_hwmon_get_sensor_generic to use for generic pm interfaces Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:51:24 -04:00
Mario Limonciello	1a4dd33cc6	drm/amd: Only restore cached manual clock settings in restore if OD enabled If OD is not enabled then restoring cached clock settings doesn't make sense and actually leads to errors in resume. Check if enabled before restoring settings. Fixes: `4e9526924d` ("drm/amd: Restore cached manual clock settings during resume") Reported-by: Jérôme Lécuyer <jerome.4a4c@gmail.com> Closes: https://lore.kernel.org/amd-gfx/0ffe2692-7bfa-4821-856e-dd0f18e2c32b@amd.com/T/#me6db8ddb192626360c462b7570ed7eba0c6c9733 Suggested-by: Jérôme Lécuyer <jerome.4a4c@gmail.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:46 -04:00
Rodrigo Siqueira	49e957b289	drm/amd/pm: Use devm_i2c_add_adapter() in the V14_0_2 smu The I2C init for V14_0_2 uses i2c_add_adapter() and i2c_del_adapter(), this commit replaces the use of these two functions with devm_i2c_add_adapter(). Notice that V14_0_2 init initializes multiple I2C buses in a loop; if something goes wrong, the previous adapters are removed, and the amdgpu load is interrupted. Since I2C init is required for the correct load of amdgpu, it is safe to rely on devm_i2c_add_adapter() to handle any previously initialized I2C adapter. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:43 -04:00
Rodrigo Siqueira	c32da00612	drm/amd/pm: Use devm_i2c_add_adapter() in the V13_0_6 smu The I2C init for V13_0_6 uses i2c_add_adapter() and i2c_del_adapter(), this commit replaces the use of these two functions with devm_i2c_add_adapter(). Notice that V13_0_6 init initializes multiple I2C buses in a loop; if something goes wrong, the previous adapters are removed, and the amdgpu load is interrupted. Since I2C init is required for the correct load of amdgpu, it is safe to rely on devm_i2c_add_adapter() to handle any previously initialized I2C adapter. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:40 -04:00
Rodrigo Siqueira	4970883abd	drm/amd/pm: Use devm_i2c_add_adapter() in the V13 smu The I2C init for SMU_V13 uses i2c_add_adapter() and i2c_del_adapter(), this commit replaces the use of these two functions with devm_i2c_add_adapter(). Notice that SMU_V13 init initializes multiple I2C buses in a loop; if something goes wrong, the previous adapters are removed, and the amdgpu load is interrupted. Since I2C init is required for the correct load of amdgpu, it is safe to rely on devm_i2c_add_adapter() to handle any previously initialized I2C adapter. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:38 -04:00
Rodrigo Siqueira	13f785d37a	drm/amd/pm: Use devm_i2c_add_adapter() in the Sienna smu The I2C init for Sienna Cichlid uses i2c_add_adapter() and i2c_del_adapter(), this commit replaces the use of these two functions with devm_i2c_add_adapter(). Notice that Sienna Cichlid init initializes multiple I2C buses in a loop; if something goes wrong, the previous adapters are removed, and the amdgpu load is interrupted. Since I2C init is required for the correct load of amdgpu, it is safe to rely on devm_i2c_add_adapter() to handle any previously initialized I2C adapter. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:36 -04:00
Rodrigo Siqueira	9058cb7775	drm/amd/pm: Use devm_i2c_add_adapter() in the Navi10 smu The I2C init for Navi10 uses i2c_add_adapter() and i2c_del_adapter(), this commit replaces the use of these two functions with devm_i2c_add_adapter(). Notice that Navi10 init initializes multiple I2C buses in a loop; if something goes wrong, the previous adapters are removed, and the amdgpu load is interrupted. Since I2C init is required for the correct load of amdgpu, it is safe to rely on devm_i2c_add_adapter() to handle any previously initialized I2C adapter. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:32 -04:00
Rodrigo Siqueira	439158c475	drm/amd/pm: Use devm_i2c_add_adapter() in the Arcturus smu The I2C init for Arcturus uses i2c_add_adapter() and i2c_del_adapter(), this commit replaces the use of these two functions with devm_i2c_add_adapter(). Notice that Arcturus init initializes multiple I2C buses in a loop; if something goes wrong, the previous adapters are removed, and the amdgpu load is interrupted. Since I2C init is required for the correct load of amdgpu, it is safe to rely on devm_i2c_add_adapter() to handle any previously initialized I2C adapter. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:28 -04:00
Rodrigo Siqueira	f4dfc4447d	drm/amd/pm: Use devm_i2c_add_adapter() in the i2c init Instead of using i2c_add_adapter() and i2c_del_adapter(), replace them with devm_i2c_add_adapter() to simplify the i2c logic. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:26 -04:00
Rodrigo Siqueira	63137c7c8c	drm/amdgpu: Use devm_i2c_add_adapter() in SMU V11 Instead of using i2c_add_adapter() and i2c_del_adapter() in the SMU V11, use devm_i2c_add_adapter() to simplify the code path. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:24 -04:00
Rodrigo Siqueira	0f36a3c6af	drm/amdgpu/amdgpu_i2c: Use devm_i2c_add_adapter instead of i2c_add_adapter This commit replaces i2c_add_adapter() with devm_i2c_add_adapter() and removes part of the cleanup logic since the new function handles the i2c removal. Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:22 -04:00
Rodrigo Siqueira	5b3eca05cf	drm/amd/display: Use devm_i2c_add_adapter to simplify i2c cleanup logic This commit replaces the utilization of i2c_add/del_adapter() with devm_i2c_add_adapter() to reduce the amount of boilerplate. Using devm_i2c_add_adapter() has the advantage of removing the manual manipulation of the I2C adapter. Suggested-by: Robert Beckett <bob.beckett@collabora.com> Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:16 -04:00
James Flowers	70db83e2b9	drm/amd/display: Use kmalloc_array() instead of kmalloc() Documentation/process/deprecated.rst recommends against the use of kmalloc with dynamic size calculations due to the risk of overflow and smaller allocation being made than the caller was expecting. This could lead to buffer overflow in code similar to the memcpy in amdgpu_dm_plane_add_modifier(). Signed-off-by: James Flowers <bold.zone2373@fastmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:11 -04:00
Christian König	39203f5e6d	drm/amdgpu: fix userq VM validation v4 That was actually complete nonsense and not validating the BOs at all. The code just cleared all VM areas were it couldn't grab the lock for a BO. Try to fix this. Only compile tested at the moment. v2: fix fence slot reservation as well as pointed out by Sunil. also validate PDs, PTs, per VM BOs and update PDEs v3: grab the status_lock while working with the done list. v4: rename functions, add some comments, fix waiting for updates to complete. v4: rename amdgpu_vm_lock_done_list(), add some more comments Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Sunil Khatri <sunil.khatri@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:06 -04:00
Christian König	d7ddcf921e	drm/amdgpu: reject gang submissions under SRIOV Gang submission means that the kernel driver guarantees that multiple submissions are executed on the HW at the same time on different engines. Background is that those submissions then depend on each other and each can't finish stand alone. SRIOV now uses world switch to preempt submissions on the engines to allow sharing the HW resources between multiple VFs. The problem is now that the SRIOV world switch can't know about such inter dependencies and will cause a timeout if it waits for a partially running gang submission. To conclude SRIOV and gang submissions are fundamentally incompatible at the moment. For now just disable them. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2025-09-16 17:47:00 -04:00
Yang Li	9e6eb49ec1	drm/xe: Remove duplicate header files Fix some duplicate includes in xe: ./drivers/gpu/drm/xe/xe_tlb_inval.c: xe_tlb_inval.h is included more than once. ./drivers/gpu/drm/xe/xe_pt.c: xe_tlb_inval_job.h is included more than once. While at it, also sort the include lines alphabetically. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=24705 Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=24706 Signed-off-by: Yang Li <yang.lee@linux.alibaba.com> [Reword commit message] Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250916021039.1632766-1-yang.lee@linux.alibaba.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-09-16 12:47:40 -07:00
John Harrison	3b09b11805	drm/xe/guc: Return an error code if the GuC load fails Due to multiple explosion issues in the early days of the Xe driver, the GuC load was hacked to never return a failure. That prevented kernel panics and such initially, but now all it achieves is creating more confusing errors when the driver tries to submit commands to a GuC it already knows is not there. So fix that up. As a stop-gap and to help with debug of load failures due to invalid GuC init params, a wedge call had been added to the inner GuC load function. The reason being that it leaves the GuC log accessible via debugfs. However, for an end user, simply aborting the module load is much cleaner than wedging and trying to continue. The wedge blocks user submissions but it seems that various bits of the driver itself still try to submit to a dead GuC and lots of subsequent errors occur. And with regards to developers debugging why their particular code change is being rejected by the GuC, it is trivial to either add the wedge back in and hack the return code to zero again or to just do a GuC log dump to dmesg. v2: Add support for error injection testing and drop the now redundant wedge call. CC: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://lore.kernel.org/r/20250909224132.536320-1-John.C.Harrison@Intel.com	2025-09-16 12:11:08 -07:00
Dan Carpenter	cbc7f3b4f6	drm/xe: Fix a NULL vs IS_ERR() in xe_vm_add_compute_exec_queue() The xe_preempt_fence_create() function returns error pointers. It never returns NULL. Update the error checking to match. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/aJTMBdX97cof_009@stanley.mountain Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit `75cc23ffe5`) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-09-16 11:49:16 -04:00
Zongyao Bai	1a869168d9	drm/xe/sysfs: Add cleanup action in xe_device_sysfs_init On partial failure, some sysfs files created before the failure might not be removed. Add common cleanup step to remove them all immediately, as is should be harmless to attempt to remove non-existing files. Fixes: `0e414bf7ad` ("drm/xe: Expose PCIe link downgrade attributes") Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Zongyao Bai <zongyao.bai@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20250915214716.1327379-2-zongyao.bai@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-09-16 07:59:36 -07:00
Qi Xi	288dac9fb6	drm: bridge: cdns-mhdp8546: Fix missing mutex unlock on error path Add missing mutex unlock before returning from the error path in cdns_mhdp_atomic_enable(). Fixes: `935a92a1c4` ("drm: bridge: cdns-mhdp8546: Fix possible null pointer dereference") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Qi Xi <xiqi2@huawei.com> Reviewed-by: Luca Ceresoli <luca.ceresoli@bootlin.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://lore.kernel.org/r/20250904034447.665427-1-xiqi2@huawei.com Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>	2025-09-16 15:42:35 +02:00
Aaron Ma	35e526398b	drm/i915/backlight: Honor VESA eDP backlight luminance control capability The VESA AUX backlight fails to be enable luminance based backlight mainpulation becaused luminance_set is false by default. Fix it by using luminance support control capabitliy. Fixes: `e13af5166a` ("drm/i915/backlight: Use drm helper to initialize edp backlight") Signed-off-by: Aaron Ma <aaron.ma@canonical.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Signed-off-by: Suraj Kandpal <suraj.kandpal@intel.com> Link: https://lore.kernel.org/r/20250823121647.275834-1-aaron.ma@canonical.com (cherry picked from commit `72136efb87`) Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>	2025-09-16 09:12:24 +01:00
Tamir Duberstein	7dfabaa0c4	drm/panic: use `core::ffi::CStr` method names Prepare for `core::ffi::CStr` taking the place of `kernel::str::CStr` by avoid methods that only exist on the latter. Link: https://github.com/Rust-for-Linux/linux/issues/1075 Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Benno Lossin <lossin@kernel.org> Acked-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Tamir Duberstein <tamird@gmail.com> Acked-by: Jocelyn Falempe <jfalempe@redhat.com> Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2025-09-16 09:26:59 +02:00
Tamir Duberstein	4b4d06a763	gpu: nova-core: use `kernel::{fmt,prelude::fmt!}` Reduce coupling to implementation details of the formatting machinery by avoiding direct use for `core`'s formatting traits and macros. Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Benno Lossin <lossin@kernel.org> Acked-by: Danilo Krummrich <dakr@kernel.org> Signed-off-by: Tamir Duberstein <tamird@gmail.com> Signed-off-by: Miguel Ojeda <ojeda@kernel.org>	2025-09-16 09:26:58 +02:00

... 5 6 7 8 9 ...

118549 Commits