linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-14 05:22:19 -04:00

Author	SHA1	Message	Date
Sk Anirban	d7364b86e4	drm/i915/selftests: Correct frequency handling in RPS power measurement Fix the frequency calculation by ensuring it uses the raw frequency only. Update live_rps_power test to use the correct frequency values for logging and comparison. Signed-off-by: Sk Anirban <sk.anirban@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250113095912.356147-2-sk.anirban@intel.com	2025-01-28 21:11:15 +01:00
Ranu Maurya	1aeb1c0eda	drm/i915: Add Wa_22010465259 in its respective WA list Add Wa_22010465259 which points to an existing WA, but was missing from the comment list. While at it, update the other WAs and their applicable platforms as well. v1: Initial commit. v2: Add DG2 platform to Wa_22010465259. v3: Removed DG2 platform to Wa_22010465259 since it was for preproduction. Signed-off-by: Ranu Maurya <ranu.maurya@intel.com> Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com> Reviewed-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250116093115.2437154-1-ranu.maurya@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com>	2025-01-28 08:50:24 -08:00
Brian Geffon	9e304a1863	drm/i915: Fix page cleanup on DMA remap failure When converting to folios the cleanup path of shmem_get_pages() was missed. When a DMA remap fails and the max segment size is greater than PAGE_SIZE it will attempt to retry the remap with a PAGE_SIZEd segment size. The cleanup code isn't properly using the folio apis and as a result isn't handling compound pages correctly. v2 -> v3: (Ville) Just use shmem_sg_free_table() as-is in the failure path of shmem_get_pages(). shmem_sg_free_table() will clear mapping unevictable but it will be reset when it retries in shmem_sg_alloc_table(). v1 -> v2: (Ville) Fixed locations where we were not clearing mapping unevictable. Cc: stable@vger.kernel.org Cc: Ville Syrjala <ville.syrjala@linux.intel.com> Cc: Vidya Srinivas <vidya.srinivas@intel.com> Link: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13487 Link: https://lore.kernel.org/lkml/20250116135636.410164-1-bgeffon@google.com/ Fixes: `0b62af28f2` ("i915: convert shmem_sg_free_table() to use a folio_batch") Signed-off-by: Brian Geffon <bgeffon@google.com> Suggested-by: Tomasz Figa <tfiga@google.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250127204332.336665-1-bgeffon@google.com Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Tested-by: Vidya Srinivas <vidya.srinivas@intel.com> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>	2025-01-28 18:01:16 +02:00
Umesh Nerlige Ramappa	431b742e2b	drm/i915/pmu: Fix zero delta busyness issue When running igt@gem_exec_balancer@individual for multiple iterations, it is seen that the delta busyness returned by PMU is 0. The issue stems from a combination of 2 implementation specific details: 1) gt_park is throttling __update_guc_busyness_stats() so that it does not hog PCI bandwidth for some use cases. (Ref: `59bcdb564b`) 2) busyness implementation always returns monotonically increasing counters. (Ref: `cf907f6d29`) If an application queried an engine while it was active, engine->stats.guc.running is set to true. Following that, if all PM wakeref's are released, then gt is parked. At this time the throttling of __update_guc_busyness_stats() may result in a missed update to the running state of the engine (due to (1) above). This means subsequent calls to guc_engine_busyness() will think that the engine is still running and they will keep updating the cached counter (stats->total). This results in an inflated cached counter. Later when the application runs a workload and queries for busyness, we return the cached value since it is larger than the actual value (due to (2) above) All subsequent queries will return the same large (inflated) value, so the application sees a delta busyness of zero. Fix the issue by resetting the running state of engines each time intel_guc_busyness_park() is called. v2: (Rodrigo) - Use the correct tag in commit message - Drop the redundant wakeref check in guc_engine_busyness() and update commit message Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13366 Fixes: `cf907f6d29` ("i915/guc: Ensure busyness counter increases motonically") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250123193839.2394694-1-umesh.nerlige.ramappa@intel.com	2025-01-27 13:06:17 -08:00
Dr. David Alan Gilbert	93b69c0482	drm/i915: Remove unused live_context_for_engine The last use of live_context_for_engine() was removed in 2021 by commit `99919be74a` ("drm/i915/gem: Zap the i915_gem_object_blt code") Remove it. Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250125003846.228514-1-linux@treblig.org Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-01-27 14:10:18 -05:00
Lucas De Marchi	367d7bc6d5	drm/i915/pmu: Remove i915_pmu_event_event_idx() perf event already has a default function that returns 0, no need to override with the same thing. Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250113183329.3138138-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2025-01-21 10:15:58 -08:00
John Harrison	709631924e	drm/i915/uc: Include requested frequency in slow firmware load messages To aid debug of sporadic issues, include the requested frequency in the debug message as well as the actual frequency. That way we know for certain that the clamping is not because the driver forgot to ask. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241221014329.4048408-1-John.C.Harrison@Intel.com	2025-01-17 11:29:08 -08:00
John Harrison	1113fc0e82	drm/i915: Add debug print about hw config table size Add debug info to help investigate a very rare bug: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/13385 Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241221011925.3944625-1-John.C.Harrison@Intel.com	2025-01-15 10:56:33 -08:00
Raag Jadav	9ef80eec5f	drm/i915/selftest: Change throttle criteria for rps Current live_rps_control() implementation errors out on throttling. This was done with the assumption that throttling to minimum frequency is a catastrophic failure, which is incorrect. Throttling can happen due to variety of reasons and often times out of our control. Also, the resulting frequency can be at any given point below the maximum allowed. Change throttle criteria to reflect this logic and drop the error, as it doesn't necessarily mean selftest failure. Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250102110618.174415-1-raag.jadav@intel.com	2025-01-11 13:44:40 +01:00
Nitin Gote	6f0572fa8f	drm/i915/gt: Prefer IS_ENABLED() instead of defined() on config option Use IS_ENABLED() instead of defined() for checking whether a kconfig option is enabled. Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250103062651.798249-2-nitin.r.gote@intel.com	2025-01-09 13:43:16 +01:00
Apoorva Singh	d6b24cc3e2	drm/i915/gt: Prevent uninitialized pointer reads Initialize rq to NULL to prevent uninitialized pointer reads. Signed-off-by: Apoorva Singh <apoorva.singh@intel.com> Reviewed-by: Krzysztof Karas <krzysztof.karas@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241227112920.1547592-1-apoorva.singh@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2025-01-08 08:54:57 -05:00
Dr. David Alan Gilbert	0a1584ec3d	drm/i915: Remove unused intel_ring_cacheline_align The last use of intel_ring_cacheline_align() was removed in 2017 by commit `afa8ce5b30` ("drm/i915: Nuke legacy flip queueing code") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241227113754.25871-3-tursulin@igalia.com	2024-12-30 01:31:56 +01:00
Dr. David Alan Gilbert	64420d2f3e	drm/i915: Remove unused intel_huc_suspend intel_huc_suspend() was added in 2022 by commit `27536e0327` ("drm/i915/huc: track delayed HuC load with a fence") but hasn't been used. Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241227113754.25871-2-tursulin@igalia.com	2024-12-30 01:31:45 +01:00
Dr. David Alan Gilbert	bc6b027e6d	drm/i915: Remove deadcode i915_active_acquire_for_context() was added in 2020 by commit `5d9341370f` ("drm/i915: Export a preallocate variant of i915_active_acquire()") but has never been used. The last use of __i915_gem_object_is_lmem() was removed in 2021 by commit `ff20afc4ce` ("drm/i915: Update error capture code to avoid using the current vma state") Remove them. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Tvrtko Ursulin <tursulin@igalia.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241227113754.25871-1-tursulin@igalia.com	2024-12-30 01:31:31 +01:00
Sebastian Brzezinka	835443da6f	drm/i915/gt: Log reason for setting TAINT_WARN at reset TAINT_WARN is used to notify CI about non-recoverable failures, which require device to be restarted. In some cases, there is no sufficient information about the reason for the restart. The test runner is just killed, and DUT is rebooted, logging only 'probe with driver i915 failed with error -4' to dmesg. Printing error to dmesg before TAINT_WARN, would explain why the device has been restarted, and what caused the malfunction in the first place. Signed-off-by: Sebastian Brzezinka <sebastian.brzezinka@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Andi Shyti <andi.shyti@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241220131714.1309483-1-andi.shyti@linux.intel.com	2024-12-23 20:12:20 +01:00
Nitin Gote	5ed539e327	drm/i915/gt: Use ENGINE_TRACE for tracing. Instead of drm_err(), prefer gt_err() and ENGINE_TRACE() for GEM tracing in i915. So, it will be good to use ENGINE_TRACE() over drm_err() drm_device based logging for engine debug log. v2: Bit more specific in commit description (Andi) v3: Use gt_err() along with ENGINE_TRACE() in place of drm_err() (Andi) Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241217100058.2819053-1-nitin.r.gote@intel.com	2024-12-20 11:42:26 +01:00
Sk Anirban	63b81a3a77	drm/i915/selftests: Implement frequency logging for energy reading validation Add RC6 & RC0 frequency printing to ensure accurate energy readings aimed at addressing GPU energy leaks and power measurement failures. Also update sleep time for RC6 mode to match RC0. v2: - Improved commit message. v3: - Used pr_err log to display frequency (Anshuman) - Sorted headers alphabetically (Sai Teja) v4: - Improved commit message. - Fix pr_err log (Sai Teja) v5: - Add error & debug logging for RC0 power and frequency checks (Anshuman) v6: - Modify debug logging for RC0 power and frequency checks (Sai Teja) v7: - Use pr_debug if RC0 power isn't measured but frequency is (Anshuman) - Improved commit message (Badal) - Change API to read actual frequency without applying forcewake (Badal) - Update sleep time for RC6 mode (Anshuman) Signed-off-by: Sk Anirban <sk.anirban@intel.com> Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241129154716.2764974-1-sk.anirban@intel.com	2024-12-19 11:06:03 +01:00
Nitin Gote	512eadb334	drm/i915/gt: Increase a time to retry RING_HEAD reset Issue seen again where engine resets fails because the engine resumes from an incorrect RING_HEAD. HEAD is still not 0 even after writing into it. This seems to be timing issue and we experimented different values from 5ms to 50ms and found out that 50ms works best based on testing. So, if write doesn't succeed at first then retry again. v2: add a comment (Andi Shyti) Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12806 Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241217063532.2729031-1-nitin.r.gote@intel.com	2024-12-18 15:26:09 +01:00
Jesus Narvaez	f373ebec18	drm/i915/guc: Update guc_err message to show outstanding g2h responses Updating the guc_error message to show how many g2h responses are still outstanding, in order to help with future debugging. Signed-off-by: Jesus Narvaez <jesus.narvaez@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241213204720.3918056-1-jesus.narvaez@intel.com	2024-12-17 11:38:50 -08:00
Umesh Nerlige Ramappa	7ed047da59	i915/guc: Accumulate active runtime on gt reset On gt reset, if a context is running, then accumulate it's active time into the busyness counter since there will be no chance for the context to switch out and update it's run time. v2: Move comment right above the if (John) Fixes: `77cdd054dd` ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241127174006.190128-4-umesh.nerlige.ramappa@intel.com	2024-12-13 15:13:51 -08:00
Umesh Nerlige Ramappa	cf907f6d29	i915/guc: Ensure busyness counter increases motonically Active busyness of an engine is calculated using gt timestamp and the context switch in time. While capturing the gt timestamp, it's possible that the context switches out. This race could result in an active busyness value that is greater than the actual context runtime value by a small amount. This leads to a negative delta and throws off busyness calculations for the user. If a subsequent count is smaller than the previous one, just return the previous one, since we expect the busyness to catch up. Fixes: `77cdd054dd` ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241127174006.190128-3-umesh.nerlige.ramappa@intel.com	2024-12-13 15:13:50 -08:00
Umesh Nerlige Ramappa	abd318237f	i915/guc: Reset engine utilization buffer before registration On GT reset, we store total busyness counts for all engines and re-register the utilization buffer with GuC. At that time we should reset the buffer, so that we don't get spurious busyness counts on subsequent queries. To repro this issue, run igt@perf_pmu@busy-hang followed by igt@perf_pmu@most-busy-idle-check-all for a couple iterations. Fixes: `77cdd054dd` ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241127174006.190128-2-umesh.nerlige.ramappa@intel.com	2024-12-13 15:13:49 -08:00
Sk Anirban	630e03808a	drm/i915/selftests: Add delay to stabilize frequency in live_rps_power Add delays to allow frequency stabilization before power measurement to fix sporadic power conservation issues in live_rps_power test. v2: - Move delay to respective function (Badal) Signed-off-by: Sk Anirban <sk.anirban@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241203061114.2790448-1-sk.anirban@intel.com	2024-12-06 20:15:07 +05:30
Krzysztof Karas	2e0438f9c3	drm/i915: ensure segment offset never exceeds allowed max Commit `255fc1703e` ("drm/i915/gem: Calculate object page offset for partial memory mapping") introduced a new offset, which accounts for userspace mapping not starting from the beginning of object's scatterlist. This works fine for cases where first object pte is larger than the new offset - "r->sgt.curr" counter is set to the offset to match the difference in the number of total pages. However, if object's first pte's size is equal to or smaller than the offset, then information about the offset in userspace is covered up by moving "r->sgt" pointer in remap_sg(): r->sgt.curr += PAGE_SIZE; if (r->sgt.curr >= r->sgt.max) r->sgt = __sgt_iter(__sg_next(r->sgt.sgp), use_dma(r->iobase)); This means that two or more pages from virtual memory are counted for only one page in object's memory, because after moving "r->sgt" pointer "r->sgt.curr" will be 0. We should account for this mismatch by moving "r->sgt" pointer to the next pte. For that we may use "r.sgt.max", which already holds the max allowed size. This change also eliminates possible confusion, when looking at i915_scatterlist.h and remap_io_sg() code: former has scatterlist pointer definition, which differentiates "s.max" value based on "dma" flag (sg_dma_len() is used only when the flag is enabled), while latter uses sg_dma_len() indiscriminately. This patch aims to resolve issue: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12031 v3: - instead of checking if r.sgt.curr would exceed allowed max, changed the value in the while loop to be aligned with `dma` value v4: - remove unnecessary parent relation v5: - update commit message with explanation about page counting mismatch and link to the issue Signed-off-by: Krzysztof Karas <krzysztof.karas@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/upbjdavlbcxku63ns4vstp5kgbn2anxwewpmnppszgb67fn66t@tfclfgkqijue	2024-12-05 14:29:48 +01:00
Zhanjun Dong	b939a08bc3	drm/i915/guc: Flush ct receive tasklet during reset preparation GuC to host communication is interrupt driven, the handling has 3 parts: interrupt context, tasklet and request queue worker. During GuC reset prepare, interrupt is disabled before destroy contexts steps start. The IRQ and worker are flushed to finish any outstanding in-progress message handling. But, the tasklet flush is missing, it might causes 2 race conditions: 1. Tasklet runs after IRQ flushed, add request to queue after worker flush started, causes unexpected G2H message request processing, meanwhile, reset prepare code already get the context destroyed. This will causes error reported about bad context state. (https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/11349 and https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12303) 2. Tasklet runs after intel_guc_submission_reset_prepare, ct_try_receive_message start to run, while intel_uc_reset_prepare already finished guc sanitize and set ct->enable to false. This will causes warning on incorrect ct->enable state. (https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12439) Add the missing tasklet flush to flush all 3 parts. Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241104214103.214702-1-zhanjun.dong@intel.com	2024-11-06 12:29:30 -08:00
Lucas De Marchi	79367b7a58	drm/i915/pmu: Remove pointless synchronize_rcu() call This is already done inside perf_pmu_unregister() - no need to do it before. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241104213512.2314930-5-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-05 11:24:33 -08:00
Lucas De Marchi	6ba29f1352	drm/i915/pmu: Replace closed with registered Since i915 calls perf_pmu_register/perf_pmu_unregister, let's call the variable "registered" so we can flip the logic and rely on it being false by default. Looking at other drivers, it's also more common. Examples: arch/x86/events/intel/uncore.c and drivers/powercap/intel_rapl_common.c. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241104213512.2314930-4-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-05 11:24:33 -08:00
Lucas De Marchi	9116b5760e	drm/i915/pmu: Stop setting event_init to NULL Setting event_init to NULL is mostly done to detect when the driver is partially working: i915 probed, but pmu is not registered. However, checking for event_init is odd as it was supposed to always be set and kernel/events/ would just crash if it found it set to NULL. Since there's already a "closed" boolean, use that instead and extend it's meaning to unregistered/unregistering. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241104213512.2314930-3-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-05 11:24:33 -08:00
Lucas De Marchi	c62018a002	drm/i915/pmu: Rename cpuhp_slot to cpuhp_state Both the documentation and most of other users call the return of cpuhp_setup_state_multi() as "state". Follow that. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241104213512.2314930-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-05 11:24:33 -08:00
Dr. David Alan Gilbert	b7cfe79f06	drm/i915/gt: Remove unused execlists_unwind_incomplete_requests execlists_unwind_incomplete_requests() is unused since 2021's commit `eb5e7da736` ("drm/i915/guc: Reset implementation for new GuC interface") Remove it. Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Link: https://patchwork.freedesktop.org/patch/msgid/20241103144936.238116-1-linux@treblig.org Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-11-04 16:31:44 -05:00
Nitin Gote	6ef0e3ef26	drm/i915/gt: Retry RING_HEAD reset until it get sticks we see an issue where resets fails because the engine resumes from an incorrect RING_HEAD. Since the RING_HEAD doesn't point to the remaining requests to re-run, but may instead point into the uninitialised portion of the ring, the GPU may be then fed invalid instructions from a privileged context, oft pushing the GPU into an unrecoverable hang. If at first the write doesn't succeed, try, try again. v2: Avoid unnecessary timeout macro (Andi) v3: Correct comment format (Andi) v4: Make it generic for all platform as it won't impact (Chris) Link: https://gitlab.freedesktop.org/drm/intel/-/issues/5432 Testcase: igt/i915_selftest/hangcheck Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com> Signed-off-by: Nitin Gote <nitin.r.gote@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241015145710.2478599-1-nitin.r.gote@intel.com	2024-10-22 11:35:07 +02:00
Ville Syrjälä	e217f22041	drm/i915/pmu: Add support for gen2 Implement pmu support for gen2 so that one can use intel_gpu_top on it once again. Gen2 lacks MI_MODE/MODE_IDLE so we'll have to do a bit more work to determine the state of the engine: - to determine if the ring contains unconsumed data we can simply compare RING_TAIL vs. RING_HEAD - also check RING_HEAD vs. ACTHD to catch cases where the hardware is still executing a batch buffer but the ring head has already caught up with the tail. Not entirely sure if that's actually possible or not, but maybe it can happen if the batch buffer is initiated from the very end of the ring? But even if not strictly necessary there's no real harm in checking anyway. - MI_WAIT_FOR_EVENT can be detected via a dedicated bit in RING_HEAD v2: Use genX_ prefix rarther than suffix (Jani) Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241008214349.23331-5-ville.syrjala@linux.intel.com Acked-by: Jani Nikula <jani.nikula@intel.com>	2024-10-15 17:51:00 +03:00
Ville Syrjälä	bdc2917fbd	drm/i915/gt: s/gen3/gen2/ Now that we use the gen3 codepaths also for gen2 rename everything to gen2_ to match. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241008214349.23331-3-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com>	2024-10-15 17:49:24 +03:00
Ville Syrjälä	259f5a9d1c	drm/i915/gt: Nuke gen2_irq_{enable,disable}() We've determined that accessing the (supposedly) 16bit interrupt registers on gen2 as 32bit works just fine. We already dropped the special case from the main interrupt code, do so also for the gt interrupt stuff. Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241008214349.23331-2-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com>	2024-10-15 17:33:15 +03:00
Juston Li	b05f9847ff	drm/i915/guc: Enable PXP GuC autoteardown flow This feature flag enables GuC autoteardown which allows for a grace period before session teardown. Also add a HAS_PXP() helper to share with the other place that wants to check. Signed-off-by: Juston Li <juston.li@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240906174038.1468026-1-John.C.Harrison@Intel.com	2024-10-14 13:06:45 -07:00
Zhang He	aa944281bd	drm/i915/gt: Fixed "CPU" -> "GPU" typo Column header should be GPU, not CPU Signed-off-by: Zhang He <zhanghe9702@163.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240913140721.31165-1-zhanghe9702@163.com	2024-09-16 10:27:19 +02:00
Lucas De Marchi	2c3631fbd8	drm/i915/pmu: Use event_to_pmu() i915 pointer is not needed in this function and all the others simply calculate the i915_pmu container based on the event->pmu. Follow the same logic as in other functions. Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240909204340.3646458-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-09-10 17:45:59 -07:00
Lucas De Marchi	10a7210d59	drm/i915/pmu: Drop is_igp() There's no reason to hardcode checking for integrated graphics on a specific pci slot. That information is already available per platform an can be checked with IS_DGFX(). Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240909204340.3646458-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-09-10 17:45:53 -07:00
Nikita Zhandarovich	1f1c1bd566	drm/i915/guc: prevent a possible int overflow in wq offsets It may be possible for the sum of the values derived from i915_ggtt_offset() and __get_parent_scratch_offset()/ i915_ggtt_offset() to go over the u32 limit before being assigned to wq offsets of u64 type. Mitigate these issues by expanding one of the right operands to u64 to avoid any overflow issues just in case. Found by Linux Verification Center (linuxtesting.org) with static analysis tool SVACE. Fixes: `c2aa552ff0` ("drm/i915/guc: Add multi-lrc context registration") Cc: Matthew Brost <matthew.brost@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Nikita Zhandarovich <n.zhandarovich@fintech.ru> Link: https://patchwork.freedesktop.org/patch/msgid/20240725155925.14707-1-n.zhandarovich@fintech.ru Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-09-06 15:00:32 -04:00
Hongbo Li	596a7f1084	drm/i915: Remove extra unlikely helper In IS_ERR, the unlikely is used for the input parameter, so these is no need to use it again outside. Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240831094655.4153520-1-lihongbo22@huawei.com Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-09-05 15:44:37 -04:00
Raag Jadav	727eb1e3f0	drm/i915/hwmon: expose fan speed Add hwmon support for fan1_input attribute, which will expose fan speed in RPM. With this in place we can monitor fan speed using lm-sensors tool. $ sensors i915-pci-0300 Adapter: PCI adapter in0: 653.00 mV fan1: 3833 RPM power1: N/A (max = 43.00 W) energy1: 32.02 kJ v2: Handle overflow, add mutex protection and ABI documentation Aesthetic adjustments (Riana) v3: Change rotations data type, ABI date and version v4: Fix wakeref leak Drop switch case and simplify hwm_fan_xx() (Andi) v5: Rework time calculation, aesthetic adjustments (Andy) v6: Drop redundant overflow logic (Andy) Split fan_input_read() into dedicated helper (Badal) v7: Fix undefined reference to __udivdi3 for i386 (Andy) Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Riana Tauro <riana.tauro@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Signed-off-by: Anshuman Gupta <anshuman.gupta@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240823034548.2670032-1-raag.jadav@intel.com	2024-08-28 12:06:07 +05:30
Daniele Ceraolo Spurio	03ded4d432	drm/i915: Do not attempt to load the GSC multiple times If the GSC FW fails to load the GSC HW hangs permanently; the only ways to recover it are FLR or D3cold entry, with the former only being supported on driver unload and the latter only on DGFX, for which we don't need to load the GSC. Therefore, if GSC fails to load there is no need to try again because the HW is stuck in the error state and the submission to load the FW would just hang the GSCCS. Note that, due to wa_14015076503, on MTL the GuC escalates all GSCCS hangs to full GT resets, which would trigger a new attempt to load the GSC FW in the post-reset HW re-init; this issue is also fixed by not attempting to load the GSC FW after an error. Fixes: `15bd4a67e9` ("drm/i915/gsc: GSC firmware loading") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: <stable@vger.kernel.org> # v6.3+ Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240820215952.2290807-1-daniele.ceraolospurio@intel.com	2024-08-27 08:08:09 -07:00
John Harrison	54f90b0333	drm/i915/guc: Fix missing enable of Wa_14019159160 on ARL The previous update to enable the workaround on ARL only changed two out of three places where the w/a needs to be enabled. That meant the GuC side was operational but not the KMD side. And as the KMD side is the trigger, it meant the w/a was not actually active. So fix that. Fixes: `104bcfae57` ("drm/i915/arl: Enable Wa_14019159160 for ARL") Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Jonathan Cavitt <jonathan.cavitt@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240809000646.1747507-1-John.C.Harrison@Intel.com	2024-08-26 15:24:49 -07:00
Dnyaneshwar Bhadane	037f93434c	drm/i915/gt: Whitelist COMMON_SLICE_CHICKEN1 for UMD access. As part of the recommended tuning setting, whitelist COMMON_SLICE_CHICKEN1 for MTL/ARL and DG2. The UMD will selectively enable or disable specific bits of the register based on the type of workload and its requirements. v2: Remove the KMD par of enabling specific bits(Matt R) Bspec: 68331 Signed-off-by: Dnyaneshwar Bhadane <dnyaneshwar.bhadane@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240825121156.2498810-1-dnyaneshwar.bhadane@intel.com	2024-08-26 09:57:20 -07:00
Yu Jiaoliang	3126d5fff5	drm/i915/gt: Use kmemdup_array instead of kmemdup for multiple allocation Let the kememdup_array() take care about multiplication and possible overflows. v2: - Change subject - Leave one blank line between the commit log and the tag section - Fix code alignment issue v3: - Fix code alignment - Apply the patch on a clean drm-tip Signed-off-by: Yu Jiaoliang <yujiaoliang@vivo.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240821144036.343556-1-andi.shyti@linux.intel.com	2024-08-24 01:13:08 +02:00
Andi Shyti	6628851159	drm/i915/gt: Continue creating engine sysfs files even after a failure The i915 driver generates sysfs entries for each engine of the GPU in /sys/class/drm/cardX/engines/. The process is straightforward: we loop over the UABI engines and for each one, we: - Create the object. - Create basic files. - If the engine supports timeslicing, create timeslice duration files. - If the engine supports preemption, create preemption-related files. - Create default value files. Currently, if any of these steps fail, the process stops, and no further sysfs files are created. However, it's not necessary to stop the process on failure. Instead, we can continue creating the remaining sysfs files for the other engines. Even if some files fail to be created, the list of engines can still be retrieved by querying i915. Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240819113140.325235-1-andi.shyti@linux.intel.com	2024-08-24 00:28:55 +02:00
Andi Shyti	255fc1703e	drm/i915/gem: Calculate object page offset for partial memory mapping To enable partial memory mapping of GPU virtual memory, it's necessary to introduce an offset to the object's memory (obj->mm.pages) scatterlist. This adjustment compensates for instances when userspace mappings do not start from the beginning of the object. Based on a patch by Chris Wilson. Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Cc: Chris Wilson <chris.p.wilson@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240807100521.478266-3-andi.shyti@linux.intel.com	2024-08-21 15:28:33 +02:00
Andi Shyti	609d8b1c42	drm/i915/gem: Do not look for the exact address in node In preparation for the upcoming partial memory mapping feature, we want to make sure that when looking for a node we consider also the offset and not just the starting address of the virtual memory node. Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240807100521.478266-2-andi.shyti@linux.intel.com	2024-08-21 15:28:33 +02:00
Luca Coelho	0523374e30	drm/i915/gt: remove stray declaration of intel_gt_release_all() When intel_gt_release_all() was removed from the code in commit `e899505533` ("drm/i915: do not clean GT table on error path"), its declaration in the header file remained. Remove it. Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240813140618.387553-1-luciano.coelho@intel.com	2024-08-17 11:24:46 +02:00
Jesus Narvaez	437ad4534a	drm/i915/guc: Change GEM_WARN_ON to guc_err to prevent taints in CI This warning was supposed to catch a harmless issue, but changing to guc_error should prevent kernel taints in CI runs. Signed-off-by: Jesus Narvaez <jesus.narvaez@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240808204943.911727-1-jesus.narvaez@intel.com	2024-08-16 11:04:53 -07:00

1 2 3 4 5 ...

1268053 Commits