linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-02 14:34:13 -04:00

Author	SHA1	Message	Date
Himal Prasad Ghimiray	a4c48a3fa3	drm/xe/mocs: Update handling of xe_force_wake_get return xe_force_wake_get() now returns the reference count-incremented domain mask. If it fails for individual domains, the return value will always be 0. However, for XE_FORCEWAKE_ALL, it may return a non-zero value even in the event of failure. Update the return handling of xe_force_wake_get() to reflect this behavior, and ensure that the return value is passed as input to xe_force_wake_put(). v3 - return xe_wakeref_t instead of int in xe_force_wake_get() - don't use xe_assert() to report HW errors (Michal) v5 - return unsigned int from xe_force_wake_get() - Remove redundant warn v7 - Fix commit message Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Nirmoy Das <Nirmoy.das@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-14-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-10-17 10:17:08 -04:00
Himal Prasad Ghimiray	6a966d677d	drm/xe/tests/mocs: Update xe_force_wake_get() return handling With xe_force_wake_get() now returning the refcount-incremented domain mask, a return value of 0 indicates failure for single domains. Change assert condition to incorporate this change in return and pass the return value to xe_force_wake_put() v3 - return xe_wakeref_t instead of int in xe_force_wake_get() v5 - return unsigned int for xe_force_wake_get() Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-13-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-10-17 10:17:08 -04:00
Himal Prasad Ghimiray	9ffd6ec2de	drm/xe/devcoredump: Update handling of xe_force_wake_get return xe_force_wake_get() now returns the reference count-incremented domain mask. If it fails for individual domains, the return value will always be 0. However, for XE_FORCEWAKE_ALL, it may return a non-zero value even in the event of failure. Use helper xe_force_wake_ref_has_domain to verify all domains are initialized or not. Update the return handling of xe_force_wake_get() to reflect this behavior, and ensure that the return value is passed as input to xe_force_wake_put(). v3 - return xe_wakeref_t instead of int in xe_force_wake_get() v5 - return unsigned int for xe_force_wake_get() v6 - use helper xe_force_wake_ref_has_domain() v7 - Fix commit message Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-12-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-10-17 10:17:08 -04:00
Himal Prasad Ghimiray	a66c198953	drm/xe/xe_gt_idle: Update handling of xe_force_wake_get return xe_force_wake_get() now returns the reference count-incremented domain mask. If it fails for individual domains, the return value will always be 0. However, for XE_FORCEWAKE_ALL, it may return a non-zero value even in the event of failure. Update the return handling of xe_force_wake_get() to reflect this behavior, and ensure that the return value is passed as input to xe_force_wake_put(). v3 - return xe_wakeref_t instead of int in xe_force_wake_get() - xe_force_wake_put() error doesn't need to be checked. It internally WARNS on domain ack failure. v4 - Rebase fix v5 - return unsigned int for xe_force_wake_get() - Remove reudandant WARN calls. v7 - Fix commit message Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-11-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-10-17 10:17:08 -04:00
Himal Prasad Ghimiray	30d105577a	drm/xe/gt: Update handling of xe_force_wake_get return xe_force_wake_get() now returns the reference count-incremented domain mask. If it fails for individual domains, the return value will always be 0. However, for XE_FORCEWAKE_ALL, it may return a non-zero value even in the event of failure. Use helper xe_force_wake_ref_has_domain to verify all domains are initialized or not. Update the return handling of xe_force_wake_get() to reflect this behavior, and ensure that the return value is passed as input to xe_force_wake_put(). v3 - return xe_wakeref_t instead of int in xe_force_wake_get() - xe_force_wake_put() error doesn't need to be checked. It internally WARNS on domain ack failure. v4 - Rebase fix v5 - return unsigned int for xe_force_wake_get() - remove redundant XE_WARN_ON() v6 - use helper for checking all initialized domains are awake or not. v7 - Fix commit message v9 - Remove redundant WARN_ON (Badal) Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-10-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-10-17 10:17:08 -04:00
Himal Prasad Ghimiray	21eb4f178d	drm/xe/gsc: Update handling of xe_force_wake_get return xe_force_wake_get() now returns the reference count-incremented domain mask. If it fails for individual domains, the return value will always be 0. However, for XE_FORCEWAKE_ALL, it may return a non-zero value even in the event of failure. Update the return handling of xe_force_wake_get() to reflect this behavior, and ensure that the return value is passed as input to xe_force_wake_put(). v3 - return xe_wakeref_t instead of int in xe_force_wake_get() v5 - return unsigned int for xe_force_wake_get() - No need to WARN from caller in case of forcewake get failure. v7 - Fix commit message Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-9-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-10-17 10:17:08 -04:00
Himal Prasad Ghimiray	82d9de63ca	drm/xe/hdcp: Update handling of xe_force_wake_get return xe_force_wake_get() now returns the reference count-incremented domain mask. If it fails for individual domains, the return value will always be 0. However, for XE_FORCEWAKE_ALL, it may return a non-zero value even in the event of failure. Update the return handling of xe_force_wake_get() to reflect this behavior, and ensure that the return value is passed as input to xe_force_wake_put(). v3 - return xe_wakeref_t instead of int in xe_force_wake_get() v5 - return unsigned int for xe_force_wake_get() v7 - Fix commit message Cc: Suraj Kandpal <suraj.kandpal@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Suraj Kandpal <suraj.kandpal@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-8-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-10-17 10:17:07 -04:00
Himal Prasad Ghimiray	3b41f8882e	drm/xe/device: Update handling of xe_force_wake_get return xe_force_wake_get() now returns the reference count-incremented domain mask. If it fails for individual domains, the return value will always be 0. However, for XE_FORCEWAKE_ALL, it may return a non-zero value even in the event of failure. Update the return handling of xe_force_wake_get() to reflect this behavior, and ensure that the return value is passed as input to xe_force_wake_put(). v3 - return xe_wakeref_t instead of int in xe_force_wake_get() - xe_force_wake_put() error doesn't need to be escalated/considered as probing error. It internally WARNS on domain ack failure. v5 - return unsigned int xe_force_wake_get() v7 - Fix commit message(Badal) v9 - s/uint/unsigned int (Nikula) Cc: Jani Nikula <jani.nikula@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-7-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-10-17 10:17:07 -04:00
Himal Prasad Ghimiray	79f716bbfa	drm/xe: Modify xe_force_wake_put to handle _get returned mask Instead of calling xe_force_wake_put on all domains that were input to xe_force_wake_get, call _put only on the domains whose reference counts were successfully incremented by the _get call. Since the return value of _get can be a mask that does not match any specific value in the enum xe_force_wake_domains, change the input parameter of _put to unsigned int. v3 - Move WARN to this patch (Badal) - use xe_gt_WARN instead of XE_WARN (Michal) - Stop using xe_force_wake_domains for non enum values. - Remove kernel-doc from this patch (Badal) -v5 - Fix global awake_domain -v6 - put all initialized domains in case of FORCEWAKE_ALL. - Modify ret variable name (Michal) - Modify input var name (Michal) - Modify commit message and warn (Badal) -v9 - Add assert condition. Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-6-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-10-17 10:17:07 -04:00
Himal Prasad Ghimiray	a7ddcea1f5	drm/xe: Error handling in xe_force_wake_get() If an acknowledgment timeout occurs for a forcewake domain awake request, do not increment the reference count for the domain. This ensures that subsequent _get calls do not incorrectly assume the domain is awake. The return value is a mask of domains that got refcounted, and these domains need to be provided for subsequent xe_force_wake_put call. While at it, add simple kernel-doc for xe_force_wake_get() v3 - Use explicit type for mask (Michal/Badal) - Improve kernel-doc (Michal) - Use unsigned int instead of abusing enum (Michal) v5 - Use unsigned int for return (MattB/Badal/Rodrigo) - use xe_gt_WARN for domain awake ack failure (Badal/Rodrigo) v6 - Change XE_FORCEWAKE_ALL to single bit, this helps accommodate actually refcounted domains in return. (Michal) - Modify commit message and warn message (Badal) - Remove unnecessary information in kernel-doc (Michal) v7 - Add assert condition for valid input domains (Badal) v9 - Update kernel-doc and simplify conditions (Michal) Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-5-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-10-17 10:17:07 -04:00
Himal Prasad Ghimiray	9d62b07027	drm/xe/forcewake: Add a helper xe_force_wake_ref_has_domain() The helper xe_force_wake_ref_has_domain() checks if the input domain has been successfully reference-counted and awakened in the reference. v2 - Fix commit message and kernel-doc (Michal) - Remove unnecessary paranthesis (Michal) Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-4-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-10-17 10:17:07 -04:00
Himal Prasad Ghimiray	38820e63a3	drm/xe/forcewake: Change awake_domain datatype Change the datatype of awake_domains to unsigned int to accommodate values that differ from the enum xe_force_wake_domains. Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-3-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-10-17 10:17:07 -04:00
Himal Prasad Ghimiray	f5fc004b33	drm/xe: Add member initialized_domains to xe_force_wake() This field serves as a bitmask representing all initialized forcewake domains on the GT. v2 - Move awake_domains datatype change out of this patch (Michal) - Rename domain_init to init_domain (Michal) - optimize alignment (Michal) Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Badal Nilawar <badal.nilawar@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Reviewed-by: Badal Nilawar <badal.nilawar@intel.com> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241014075601.2324382-2-himal.prasad.ghimiray@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>	2024-10-17 10:17:07 -04:00
Nirmoy Das	649f533b7a	drm/xe: Add caller info to xe_gt_reset_async Add caller info to the xe_gt_reset_async() to help debug issues. v2: s/%pS/%ps(Matt) Cc: Matthew Auld <matthew.auld@intel.com> Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2874 Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241016141717.881143-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>	2024-10-17 14:56:34 +02:00
Shuicheng Lin	2eb460ab9f	drm/xe: Enlarge the invalidation timeout from 150 to 500 There are error messages like below that are occurring during stress testing: "[ 31.004009] xe 0000:03:00.0: [drm] ERROR GT0: Global invalidation timeout". Previously it was hitting this 3 out of 1000 executions of warm reboot. After raising it to 500, 1000 warm reboot executions passed and it didn't fail. Due to the way xe_mmio_wait32() is implemented, the timeout is able to expire early when the register matches the expected value due to the wait increments starting small. So, the larger timeout value should have no effect during normal use cases. v2 (Jonathan): - rework the commit message v3 (Lucas): - add conclusive message for the fail rate and test case v4: - add suggested-by Suggested-by: Jia Yao <jia.yao@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Tested-by: Zongyao Bai <zongyao.bai@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241015161207.1373401-1-shuicheng.lin@intel.com	2024-10-16 16:11:10 +01:00
Shekhar Chauhan	d2822832d7	drm/xe/xe3lpg: Extend Wa_18034896535 to Xe3_LPG. Extend Wa_18034896535 to Xe3_LPG (IP 30.00), steppings A0 to B0. BSpec: 56852 Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241015161938.845996-1-shekhar.chauhan@intel.com	2024-10-16 08:00:38 -07:00
Juha-Pekka Heikkila	73e8e2f9a3	drm/i915/display: Don't allow tile4 framebuffer to do hflip on display20 or greater On display ver 20 onwards tile4 is not supported with horizontal flip Bspec: 69853 Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com> Signed-off-by: Mika Kahola <mika.kahola@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241007182841.2104740-1-juhapekka.heikkila@gmail.com	2024-10-15 11:31:21 +03:00
Juha-Pekka Heikkila	b0228a337d	drm/xe/display: align framebuffers according to hw requirements Align framebuffers in memory according to hw requirements instead of default page size alignment. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: Mika Kahola <mika.kahola@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009151947.2240099-3-juhapekka.heikkila@gmail.com	2024-10-14 17:33:40 +03:00
Juha-Pekka Heikkila	3ad86ae1da	drm/xe: add interface to request physical alignment for buffer objects Add xe_bo_create_pin_map_at_aligned() which augment xe_bo_create_pin_map_at() with alignment parameter allowing to pass required alignemnt if it differ from default. Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: Mika Kahola <mika.kahola@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009151947.2240099-2-juhapekka.heikkila@gmail.com	2024-10-14 17:33:39 +03:00
Matthew Auld	26f69e88dc	drm/xe/xe_sync: initialise ufence.signalled We can incorrectly think that the fence has signalled, if we get a non-zero value here from the kmalloc, which is quite plausible. Just use kzalloc to prevent stuff like this. Fixes: `977e5b82e0` ("drm/xe: Expose user fence from xe_sync_entry") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Cc: <stable@vger.kernel.org> # v6.10+ Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241011133633.388008-2-matthew.auld@intel.com	2024-10-14 09:14:31 +01:00
Nirmoy Das	ec7e6a1d52	drm/xe/ufence: ufence can be signaled right after wait_woken do_comapre() can return success after a timedout wait_woken() which was treated as -ETIME. The loop calling wait_woken() sets correct err so there is no need to re-evaluate err. v2: Remove entire check that reevaluate err at the end(Matt) Fixes: `e670f0b4ef` ("drm/xe/uapi: Return correct error code for xe_wait_user_fence_ioctl") Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/1630 Cc: stable@vger.kernel.org # v6.8+ Cc: Bommu Krishnaiah <krishnaiah.bommu@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241011151029.4160630-1-nirmoy.das@intel.com Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>	2024-10-14 10:03:54 +02:00
Lucas De Marchi	735be7acc5	drm/xe/query: Tidy up error EFAULT returns Move the error handling together in a single branch since all of them are doing similar thing and return the same error. Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241011035618.1057602-3-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-11 19:06:32 -07:00
Lucas De Marchi	5c84985b07	drm/xe/query: Move timestamp reg to hwe_read_timestamp() __read_timestamps() is actually reading the timestamp from a certain hwe. Use it as parameter, move register declarations to be inside that function and rename it. Reviewed-by: Sai Teja Pottumuttu <sai.teja.pottumuttu@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241011035618.1057602-2-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-11 19:06:32 -07:00
Lucas De Marchi	9d559cdcb2	drm/xe/query: Increase timestamp width Starting with Xe2 the timestamp is a full 64 bit counter, contrary to the 36 bit that was available before. Although 36 should be sufficient for any reasonable delta calculation (for Xe2, of about 30min), it's surprising to userspace to get something truncated. Also if the timestamp being compared to is coming from the GPU and the application is not careful enough to apply the width there, a delta calculation would be wrong. Extend it to full 64-bits starting with Xe2. v2: Expand width=64 to media gt, as it's just a wrong tagging in the spec - empirical tests show it goes beyond 36 bits and match the engines for the main gt Bspec: 60411 Cc: Szymon Morek <szymon.morek@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241011035618.1057602-1-lucas.demarchi@intel.com Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-10-11 19:06:32 -07:00
Matthew Brost	b8b1163248	drm/xe: Use bookkeep slots for external BO's in exec IOCTL Fix external BO's dma-resv usage in exec IOCTL using bookkeep slots rather than write slots. This leaves syncing to user space rather than the KMD blindly enforcing write semantics on every external BO. Fixes: `dd08ebf6c3` ("drm/xe: Introduce a new DRM driver for Intel GPUs") Cc: José Roberto de Souza <jose.souza@intel.com> Cc: Kenneth Graunke <kenneth.w.graunke@intel.com> Cc: Paulo Zanoni <paulo.r.zanoni@intel.com> Reported-by: Simona Vetter <simona.vetter@ffwll.ch> Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2673 Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240911152622.903058-1-matthew.brost@intel.com	2024-10-11 15:59:11 -07:00
Matthew Brost	ea2f6a77d0	drm/xe: Don't free job in TDR Freeing job in TDR is not safe as TDR can pass the run_job thread resulting in UAF. It is only safe for free job to naturally be called by the scheduler. Rather free job in TDR, add to pending list. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/2811 Cc: Matthew Auld <matthew.auld@intel.com> Fixes: `e275d61c5f` ("drm/xe/guc: Handle timing out of signaled jobs gracefully") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003001657.3517883-3-matthew.brost@intel.com	2024-10-11 15:48:36 -07:00
Matthew Brost	90521df5fc	drm/xe: Take job list lock in xe_sched_add_pending_job A fragile micro optimization in xe_sched_add_pending_job relied on both the GPU scheduler being stopped and fence signaling stopped to safely add a job to the pending list without the job list lock in xe_sched_add_pending_job. Remove this optimization and just take the job list lock. Fixes: `7ddb9403dd` ("drm/xe: Sample ctx timestamp to determine if jobs have timed out") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003001657.3517883-2-matthew.brost@intel.com	2024-10-11 15:48:29 -07:00
Imre Deak	bbc4a30de0	drm/xe/display: Add missing HPD interrupt enabling during non-d3cold RPM resume Atm the display HPD interrupts that got disabled during runtime suspend, are re-enabled only if d3cold is enabled. Fix things by also re-enabling the interrupts if d3cold is disabled. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009194358.1321200-5-imre.deak@intel.com	2024-10-11 17:01:19 +03:00
Imre Deak	a4de6beb83	drm/xe/display: Separate the d3cold and non-d3cold runtime PM handling For clarity separate the d3cold and non-d3cold runtime PM handling. The only change in behavior is disabling polling later during runtime resume. This shouldn't make a difference, since the poll disabling is handled from a work, which could run at any point wrt. the runtime resume handler. The work will also require a runtime PM reference, syncing it with the resume handler. Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009194358.1321200-4-imre.deak@intel.com	2024-10-11 17:00:57 +03:00
Colin Ian King	46bcb0a121	drm/xe/guc: Fix inverted logic on snapshot->copy check Currently the check to see if snapshot->copy has been allocated is inverted and ends up dereferencing snapshot->copy when free'ing objects in the array when it is null or not free'ing the objects when snapshot->copy is allocated. Fix this by using the correct non-null pointer check logic. Fixes: `d8ce1a9772` ("drm/xe/guc: Use a two stage dump for GuC logs and add more info") Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009160510.372195-1-colin.i.king@gmail.com	2024-10-10 14:53:28 +02:00
Matthew Auld	a187c1b0a8	drm/xe: fix unbalanced rpm put() with declare_wedged() Technically the or_reset() means we call the action on failure, however that would lead to unbalanced rpm put(). Move the get() earlier to fix this. It should be extremely unlikely to ever trigger this in practice. Fixes: `452bca0edb` ("drm/xe: Don't suspend device upon wedge") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009084808.204432-4-matthew.auld@intel.com	2024-10-10 09:15:59 +01:00
Matthew Auld	cfcbc0520d	drm/xe: fix unbalanced rpm put() with fence_fini() Currently we can call fence_fini() twice if something goes wrong when sending the GuC CT for the tlb request, since we signal the fence and return an error, leading to the caller also calling fini() on the error path in the case of stack version of the flow, which leads to an extra rpm put() which might later cause device to enter suspend when it shouldn't. It looks like we can just drop the fini() call since the fence signaller side will already call this for us. There are known mysterious splats with device going to sleep even with an rpm ref, and this could be one candidate. v2 (Matt B): - Prefer warning if we detect double fini() Fixes: `0a382f9bc5` ("drm/xe: Hold a PM ref when GT TLB invalidations are inflight") Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009084808.204432-3-matthew.auld@intel.com	2024-10-10 09:15:58 +01:00
Aradhya Bhatia	8fb1da9f9b	drm/xe/xe2lpg: Extend Wa_15016589081 for xe2lpg Add workaround (wa) 15016589081 which applies to Xe2_v3_LPG_MD. Xe2_v3_LPG_MD is a Lunar Lake platform with GFX version: 20.04. This wa is type: permanent, and hence is applicable on all steppings. Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com> Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241009065542.283151-1-aradhya.bhatia@intel.com	2024-10-09 12:16:02 -07:00
Gustavo Sousa	081cb8948c	drm/xe/xe3: Add initial set of workarounds Implement the initial set of workarounds for Xe3 IPs. Signed-off-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241008204626.55802-2-matthew.s.atwood@intel.com	2024-10-09 06:41:46 -07:00
Thomas Hellström	7a7593e588	drm/xe/tests: Fix the shrinker test compiler warnings. The xe_bo_shrink_kunit test has an uninitialized value and illegal integer size conversions on 32-bit. Fix. v2: - Use div64_u64 to ensure the u64 division compiles everywhere. (Matt Auld) Reported-by: Nathan Chancellor <nathan@kernel.org> Closes: https://lore.kernel.org/20240913195649.GA61514@thelio-3990X/ Fixes: `5a90b60db5` ("drm/xe: Add a xe_bo subtest for shrinking / swapping") Cc: dri-devel@lists.freedesktop.org Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> #v1 Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241004141121.186177-1-thomas.hellstrom@linux.intel.com	2024-10-09 12:18:22 +02:00
Matthew Auld	67ec9f87bd	drm/xe/bmg: improve cache flushing behaviour The BSpec says that EN_L3_RW_CCS_CACHE_FLUSH must be toggled on for manual global invalidation to take effect and actually flush device cache, however this also turns on flushing for things like pipecontrol, which occurs between submissions for compute/render. This sounds like massive overkill for our needs, where we already have the manual flushing on the display side with the global invalidation. Some observations on BMG: 1. Disabling l2 caching for host writes and stubbing out the driver global invalidation but keeping EN_L3_RW_CCS_CACHE_FLUSH enabled, has no impact on wb-transient-vs-display IGT, which makes sense since the pipecontrol is now flushing the device cache after the render copy. Without EN_L3_RW_CCS_CACHE_FLUSH the test then fails, which is also expected since device cache is now dirty and display engine can't see the writes. 2. Disabling EN_L3_RW_CCS_CACHE_FLUSH, but keeping the driver global invalidation also has no impact on wb-transient-vs-display. This suggests that the global invalidation still works as expected and is flushing the device cache without EN_L3_RW_CCS_CACHE_FLUSH turned on. With that drop EN_L3_RW_CCS_CACHE_FLUSH. This helps some workloads since we no longer flush the device cache between submissions as part of pipecontrol. Edit: We now also have clarification from HW side that BSpec was indeed wrong here. v2: - Rebase and update commit message. BSpec: 71718 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Vitasta Wattal <vitasta.wattal@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241007074541.33937-2-matthew.auld@intel.com	2024-10-09 09:01:42 +01:00
Zhanjun Dong	0f1fdf5592	drm/xe/guc: Save manual engine capture into capture list Save manual engine capture into capture list. This removes duplicate register definitions across manual-capture vs guc-err-capture. Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241004193428.3311145-7-zhanjun.dong@intel.com	2024-10-08 09:47:02 -07:00
Zhanjun Dong	ecb6336463	drm/xe/guc: Plumb GuC-capture into dev coredump When we decide to kill a job, (from guc_exec_queue_timedout_job), we could end up with 4 possible scenarios at this starting point of this decision: 1. the guc-captured register-dump is already there. 2. the driver is wedged.mode > 1, so GuC-engine-reset / GuC-err-capture will not happen. 3. the user has started the driver in execlist-submission mode. 4. the guc-captured register-dump is not ready yet so we force GuC to kill that context now, but: A. we don't know yet if GuC will be successful on the engine-reset and get the guc-err-capture, else kmd will do a manual reset later OR B. guc will be successful and we will get a guc-err-capture shortly. So to accomdate the scenarios of 2 and 4A, we will need to do a manual KMD capture first(which is not be reliable in guc-submission mode) and decide later if we need to use that for the cases of 2 or 4A. So this flow is part of the implementation for this patch. Provide xe_guc_capture_get_reg_desc_list to get the register dscriptor list. Add manual capture by read from hw engine if GuC capture is not ready. If it becomes ready at later time, GuC sourced data will be used. Although there may only be a small delay between (1) the check for whether guc-err-capture is available at the start of guc_exec_queue_timedout_job and (2) the decision on using a valid guc-err-capture or manual-capture, lets not take any chances and lock the matching node down so it doesn't get re-claimed if GuC-Err-Capture subsystem is running out of pre-cached nodes. Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241004193428.3311145-6-zhanjun.dong@intel.com	2024-10-08 09:39:58 -07:00
Zhanjun Dong	8bfc496327	drm/xe/guc: Extract GuC error capture lists Upon the G2H Notify-Err-Capture event, parse through the GuC Log Buffer (error-capture-subregion) and generate one or more capture-nodes. A single node represents a single "engine- instance-capture-dump" and contains at least 3 register lists: global, engine-class and engine-instance. An internal link list is maintained to store one or more nodes. Because the link-list node generation happen before the call to devcoredump, duplicate global and engine-class register lists for each engine-instance register dump if we find dependent-engine resets in a engine-capture-group. To avoid dynamically allocate the output nodes during gt reset, pre-allocate a fixed number of empty nodes up front (at the time of ADS registration) that we can consume from or return to an internal cached list of nodes. Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241004193428.3311145-5-zhanjun.dong@intel.com	2024-10-08 09:34:45 -07:00
Zhanjun Dong	84d15f4261	drm/xe/guc: Add capture size check in GuC log buffer Capture-nodes generated by GuC are placed in the GuC capture ring buffer which is a sub-region of the larger Guc-Log-buffer. Add capture output size check before allocating the shared buffer. Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241004193428.3311145-4-zhanjun.dong@intel.com	2024-10-08 09:34:28 -07:00
Zhanjun Dong	b170d696c1	drm/xe/guc: Add XE_LP steered register lists Add the ability for runtime allocation and freeing of steered register list extentions that depend on the detected HW config fuses. Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241004193428.3311145-3-zhanjun.dong@intel.com	2024-10-08 09:34:16 -07:00
Zhanjun Dong	9c8c7a7e6f	drm/xe/guc: Prepare GuC register list and update ADS size for error capture Add referenced registers defines and list of registers. Update GuC ADS size allocation to include space for the lists of error state capture register descriptors. Then, populate GuC ADS with the lists of registers we want GuC to report back to host on engine reset events. This list should include global, engine-class and engine-instance registers for every engine-class type on the current hardware. Ensure we allocate a persistent storage for the register lists that are populated into ADS so that we don't need to allocate memory during GT resets when GuC is reloaded and ADS population happens again. Signed-off-by: Zhanjun Dong <zhanjun.dong@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241004193428.3311145-2-zhanjun.dong@intel.com	2024-10-08 09:34:04 -07:00
Matt Roper	d6d87a10d9	drm/xe/xe3lpm: Add new "instance0" steering table MCR steering on Xe3 media IP is almost the same as it was on Xe2, except for one new range (0x38D0D0 - 0x38D0FF) which has changed to an MCR "MEDIAINF" range on Xe3. Since we can always steer to grpid / instanceid 0 for MEDIAINF ranges, define a new "INSTANCE0" steering table for Xe3 media. Xe3 can continue to use the same OADDRM/GPMXMT table as Xe2. v2: Merge continuous entries 38D0D0 - 38F0FF Bspec: 74298 Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241008013509.61233-7-matthew.s.atwood@intel.com	2024-10-08 09:20:53 -07:00
Haridhar Kalvala	2298d8a81f	drm/xe/ptl: Add PTL platform definition PTL is an integrated GPU based on the Xe3 architecture. v2: explicitly turn off display until display patches land. Bspec: 72574 Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241008013509.61233-6-matthew.s.atwood@intel.com	2024-10-08 09:19:50 -07:00
Haridhar Kalvala	37466119ff	drm/xe/ptl: PTL re-uses Xe2 MOCS table PTL is Xe3 architecture but there is no difference between LNL and PTL in MOCS table. So, PTL uses the same MOCS table as LNL. Bspec: 71582 Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Shekhar Chauhan <shekhar.chauhan@intel.com> Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241008013509.61233-5-matthew.s.atwood@intel.com	2024-10-08 09:18:35 -07:00
Haridhar Kalvala	800d75bf20	drm/xe/xe3: Define Xe3 feature flags Define a common set of Xe3 feature flags and definitions that will be used for all platforms in this family. The feature flags are inherited unchanged from the Xe2 (XE2_FEATURES) platform. Following B-spec details inherited from Xe2 feature flag definition commit. v2: reuse graphics_xe2 definition Bspec: 58695 - dma_mask_size remains 46 (not documented in bspec) - supports_usm=1 (Bspec 59651) - has_flatccs=1 (Bspec 58797) - has_4tile=1 (Bspec 58788) - has_asid=1 (Bspec 59654, 59265, 60288) - has_range_tlb_invalidate=1 (Bspec 71126) - five-level page table (Bspec 59505) - 1 VD + 1 VE + 1 SFC (Bspec 67103, 70819) - platform engine mask (Bspec 60149) Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com> Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241008013509.61233-3-matthew.s.atwood@intel.com	2024-10-08 09:10:59 -07:00
Matt Roper	317d81085c	drm/xe/xe3: Xe3 uses the same PAT settings as Xe2 Xe3 platforms use the same PAT tables as Xe2. Bspec: 71582 Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241008013509.61233-2-matthew.s.atwood@intel.com	2024-10-08 09:10:10 -07:00
Shekhar Chauhan	9ab440a9d0	drm/xe/ptl: L3bank mask is not available on the media GT On PTL platforms with media version 30.00, the fuse registers for reporting L3 bank availability to the GT just read out as ~0 and do not provide proper values. Xe does not use the L3 bank mask for anything internally; it only passes the mask through to userspace via the GT topology query. Since we don't have any way to get the real L3 bank mask, we don't want to pass garbage to userspace. Passing a zeroed mask or a copy of the primary GT's L3 bank mask would also be inaccurate and likely to cause confusion for userspace. The best approach is to simply not include L3 in the list of masks returned by the topology query in cases where we aren't able to provide a meaningful value. This won't change the behavior for any existing platforms (where we can always obtain L3 masks successfully for all GTs), it will only prevent us from mis-reporting bad information on upcoming platform(s). There's a good chance this will become a formal workaround in the future, but for now we don't have a lineage number so "no_media_l3" is used in place of a lineage as the OOB workaround descriptor. v2: - Re-calculate query size to properly match data returned. (Gustavo) - Update kerneldoc to clarify that the L3bank mask may not be included in the query results if the hardware doesn't make it available. (Gustavo) Cc: Matt Atwood <matthew.s.atwood@intel.com> Cc: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Co-developed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Jonathan Cavitt <jonathan.cavitt@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Acked-by: Francois Dugast <francois.dugast@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241007154143.2021124-2-matthew.d.roper@intel.com	2024-10-08 06:56:51 -07:00
John Harrison	691b5a6af3	drm/xe/guc: Add a helper function for dumping GuC log to dmesg Create a helper function that can be used to dump the GuC log to dmesg in a manner that is reliable for extraction and decode. The intention is that calls to this can be added by developers when debugging specific issues that require a GuC log but do not allow easy capture of the log - e.g. failures in selftests and failues that lead to kernel hangs. Also note that this is really a temporary stop-gap. The aim is to allow on demand creation and dumping of devcoredump captures (which includes the GuC log and much more). Currently this is not possible as much of the devcoredump code requires a 'struct xe_sched_job' and those are not available at many places that might want to do the dump. v2: Add kerneldoc - review feedback from Michal W. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-12-John.C.Harrison@Intel.com	2024-10-07 18:35:00 -07:00
John Harrison	8b7dfb9855	drm/xe/guc: Add GuC log to devcoredump captures Include the GuC log in devcoredump captures because they can be useful with debugging certain types of bug. v2: Fix kerneldoc v3: Drop module parameter as now using more compact ascii85 encoding rather than hexdump (although still not compressed) (review feedback from Matthew B). Rebase onto recent refactoring of devcoredump code. v4: Don't move the submission snapshot inside the GuC internals structure 'cos it really doesn't belong there. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241003004611.2323493-11-John.C.Harrison@Intel.com	2024-10-07 18:35:00 -07:00

1 2 3 4 5 ...

1308951 Commits