linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2026-05-10 09:09:55 -04:00

Author	SHA1	Message	Date
Jacek Lawrynowicz	d893da85e0	accel/ivpu: Fix PM related deadlocks in MS IOCTLs Prevent runtime resume/suspend while MS IOCTLs are in progress. Failed suspend will call ivpu_ms_cleanup() that would try to acquire file_priv->ms_lock, which is already held by the IOCTLs. Fixes: `cdfad4db77` ("accel/ivpu: Add NPU profiling support") Cc: stable@vger.kernel.org # v6.11+ Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://lore.kernel.org/r/20250325114306.3740022-3-maciej.falkowski@linux.intel.com	2025-03-31 14:19:05 +02:00
Jacek Lawrynowicz	9a6f56762d	accel/ivpu: Fix deadlock in ivpu_ms_cleanup() Fix deadlock in ivpu_ms_cleanup() by preventing runtime resume after file_priv->ms_lock is acquired. During a failure in runtime resume, a cold boot is executed, which calls ivpu_ms_cleanup_all(). This function calls ivpu_ms_cleanup() that acquires file_priv->ms_lock and causes the deadlock. Fixes: `cdfad4db77` ("accel/ivpu: Add NPU profiling support") Cc: stable@vger.kernel.org # v6.11+ Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://lore.kernel.org/r/20250325114306.3740022-2-maciej.falkowski@linux.intel.com	2025-03-31 14:19:05 +02:00
Jacek Lawrynowicz	6b4568b675	accel/ivpu: Fix warning in ivpu_ipc_send_receive_internal() Warn if device is suspended only when runtime PM is enabled. Runtime PM is disabled during reset/recovery and it is not an error to use ivpu_ipc_send_receive_internal() in such cases. Fixes: `5eaa497411` ("accel/ivpu: Prevent recovery invocation during probe and resume") Cc: stable@vger.kernel.org # v6.13+ Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://lore.kernel.org/r/20250325114219.3739951-1-maciej.falkowski@linux.intel.com	2025-03-31 14:18:10 +02:00
Chris Bainbridge	8ec0fbb28d	drm/nouveau: prime: fix ttm_bo_delayed_delete oops Fix an oops in ttm_bo_delayed_delete which results from dererencing a dangling pointer: Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6b7b: 0000 [#1] PREEMPT SMP CPU: 4 UID: 0 PID: 1082 Comm: kworker/u65:2 Not tainted 6.14.0-rc4-00267-g505460b44513-dirty #216 Hardware name: LENOVO 82N6/LNVNB161216, BIOS GKCN65WW 01/16/2024 Workqueue: ttm ttm_bo_delayed_delete [ttm] RIP: 0010:dma_resv_iter_first_unlocked+0x55/0x290 Code: 31 f6 48 c7 c7 00 2b fa aa e8 97 bd 52 ff e8 a2 c1 53 00 5a 85 c0 74 48 e9 88 01 00 00 4c 89 63 20 4d 85 e4 0f 84 30 01 00 00 <41> 8b 44 24 10 c6 43 2c 01 48 89 df 89 43 28 e8 97 fd ff ff 4c 8b RSP: 0018:ffffbf9383473d60 EFLAGS: 00010202 RAX: 0000000000000001 RBX: ffffbf9383473d88 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffbf9383473d78 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000000 R12: 6b6b6b6b6b6b6b6b R13: ffffa003bbf78580 R14: ffffa003a6728040 R15: 00000000000383cc FS: 0000000000000000(0000) GS:ffffa00991c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000758348024dd0 CR3: 000000012c259000 CR4: 0000000000f50ef0 PKRU: 55555554 Call Trace: <TASK> ? __die_body.cold+0x19/0x26 ? die_addr+0x3d/0x70 ? exc_general_protection+0x159/0x460 ? asm_exc_general_protection+0x27/0x30 ? dma_resv_iter_first_unlocked+0x55/0x290 dma_resv_wait_timeout+0x56/0x100 ttm_bo_delayed_delete+0x69/0xb0 [ttm] process_one_work+0x217/0x5c0 worker_thread+0x1c8/0x3d0 ? apply_wqattrs_cleanup.part.0+0xc0/0xc0 kthread+0x10b/0x240 ? kthreads_online_cpu+0x140/0x140 ret_from_fork+0x40/0x70 ? kthreads_online_cpu+0x140/0x140 ret_from_fork_asm+0x11/0x20 </TASK> The cause of this is: - drm_prime_gem_destroy calls dma_buf_put(dma_buf) which releases the reference to the shared dma_buf. The reference count is 0, so the dma_buf is destroyed, which in turn decrements the corresponding amdgpu_bo reference count to 0, and the amdgpu_bo is destroyed - calling drm_gem_object_release then dma_resv_fini (which destroys the reservation object), then finally freeing the amdgpu_bo. - nouveau_bo obj->bo.base.resv is now a dangling pointer to the memory formerly allocated to the amdgpu_bo. - nouveau_gem_object_del calls ttm_bo_put(&nvbo->bo) which calls ttm_bo_release, which schedules ttm_bo_delayed_delete. - ttm_bo_delayed_delete runs and dereferences the dangling resv pointer, resulting in a general protection fault. Fix this by moving the drm_prime_gem_destroy call from nouveau_gem_object_del to nouveau_bo_del_ttm. This ensures that it will be run after ttm_bo_delayed_delete. Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com> Suggested-by: Christian König <christian.koenig@amd.com> Fixes: `22b33e8ed0` ("nouveau: add PRIME support") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3937 Cc: Stable@vger.kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/Z-P4epVK8k7tFZ7C@debian.local	2025-03-27 13:29:05 +01:00
Vivek Kasireddy	3d50e61a17	drm/virtio: Fix flickering issue seen with imported dmabufs We need to save the reservation object pointer associated with the imported dmabuf in the newly created GEM object to allow drm_gem_plane_helper_prepare_fb() to extract the exclusive fence from it and attach it to the plane state during prepare phase. This is needed to ensure that drm_atomic_helper_wait_for_fences() correctly waits for the relevant fences (move, etc) associated with the reservation object, thereby implementing proper synchronization. Otherwise, artifacts or slight flickering can be seen when apps are dragged across the screen when running Gnome (Wayland). This problem is mostly seen with dGPUs in the case where the FBs are allocated in VRAM but need to be migrated to System RAM as they are shared with virtio-gpu. Fixes: `ca77f27a26` ("drm/virtio: Import prime buffers from other devices as guest blobs") Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Dmitry Osipenko <dmitry.osipenko@collabora.com> Cc: Gurchetan Singh <gurchetansingh@chromium.org> Cc: Chia-I Wu <olvaffe@gmail.com> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com> Reviewed-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com> [dmitry.osipenko@collabora.com: Moved assignment before object_init()] Link: https://patchwork.freedesktop.org/patch/msgid/20250325201021.1315080-1-vivek.kasireddy@intel.com	2025-03-26 23:59:41 +03:00
Brendan King	a5b230e7f3	drm/imagination: fix firmware memory leaks Free the memory used to hold the results of firmware image processing when the module is unloaded. Fix the related issue of the same memory being leaked if processing of the firmware image fails during module load. Ensure all firmware GEM objects are destroyed if firmware image processing fails. Fixes memory leaks on powervr module unload detected by Kmemleak: unreferenced object 0xffff000042e20000 (size 94208): comm "modprobe", pid 470, jiffies 4295277154 hex dump (first 32 bytes): 02 ae 7f ed bf 45 84 00 3c 5b 1f ed 9f 45 45 05 .....E..<[...EE. d5 4f 5d 14 6c 00 3d 23 30 d0 3a 4a 66 0e 48 c8 .O].l.=#0.:Jf.H. backtrace (crc dd329dec): kmemleak_alloc+0x30/0x40 ___kmalloc_large_node+0x140/0x188 __kmalloc_large_node_noprof+0x2c/0x13c __kmalloc_noprof+0x48/0x4c0 pvr_fw_init+0xaa4/0x1f50 [powervr] unreferenced object 0xffff000042d20000 (size 20480): comm "modprobe", pid 470, jiffies 4295277154 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 09 00 00 00 0b 00 00 00 ................ 00 00 00 00 00 00 00 00 07 00 00 00 08 00 00 00 ................ backtrace (crc 395b02e3): kmemleak_alloc+0x30/0x40 ___kmalloc_large_node+0x140/0x188 __kmalloc_large_node_noprof+0x2c/0x13c __kmalloc_noprof+0x48/0x4c0 pvr_fw_init+0xb0c/0x1f50 [powervr] Cc: stable@vger.kernel.org Fixes: `cc1aeedb98` ("drm/imagination: Implement firmware infrastructure and META FW support") Signed-off-by: Brendan King <brendan.king@imgtec.com> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://lore.kernel.org/r/20250318-ddkopsrc-1339-firmware-related-memory-leak-on-module-unload-v1-1-155337c57bb4@imgtec.com Signed-off-by: Matt Coster <matt.coster@imgtec.com>	2025-03-25 17:18:15 +00:00
Brendan King	4ba2abe154	drm/imagination: take paired job reference For paired jobs, have the fragment job take a reference on the geometry job, so that the geometry job cannot be freed until the fragment job has finished with it. The geometry job structure is accessed when the fragment job is being prepared by the GPU scheduler. Taking the reference prevents the geometry job being freed until the fragment job no longer requires it. Fixes a use after free bug detected by KASAN: [ 124.256386] BUG: KASAN: slab-use-after-free in pvr_queue_prepare_job+0x108/0x868 [powervr] [ 124.264893] Read of size 1 at addr ffff0000084cb960 by task kworker/u16:4/63 Cc: stable@vger.kernel.org Fixes: `eaf01ee5ba` ("drm/imagination: Implement job submission and scheduling") Signed-off-by: Brendan King <brendan.king@imgtec.com> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://lore.kernel.org/r/20250318-ddkopsrc-1337-use-after-free-in-pvr_queue_prepare_job-v1-1-80fb30d044a6@imgtec.com Signed-off-by: Matt Coster <matt.coster@imgtec.com>	2025-03-25 17:17:37 +00:00
Xiaogang Chen	021ba7f1ba	udmabuf: fix a buf size overflow issue during udmabuf creation by casting size_limit_mb to u64 when calculate pglimit. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250321164126.329638-1-xiaogang.chen@amd.com Signed-off-by: Christian König <christian.koenig@amd.com>	2025-03-24 14:19:40 +01:00
Jason Gunthorpe	cb83f4b965	gpu: host1x: Do not assume that a NULL domain means no DMA IOMMU Previously with tegra-smmu, even with CONFIG_IOMMU_DMA, the default domain could have been left as NULL. The NULL domain is specially recognized by host1x_iommu_attach() as meaning it is not the DMA domain and should be replaced with the special shared domain. This happened prior to the below commit because tegra-smmu was using the NULL domain to mean IDENTITY. Now that the domain is properly labled the test in DRM doesn't see NULL. Check for IDENTITY as well to enable the special domains. This is the same issue and basic fix as seen in commit `fae6e669cd` ("drm/tegra: Do not assume that a NULL domain means no DMA IOMMU"). Fixes: `c8cc2655cc` ("iommu/tegra-smmu: Implement an IDENTITY domain") Reported-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Closes: https://lore.kernel.org/all/c6a6f114-3acd-4d56-a13b-b88978e927dc@tecnico.ulisboa.pt/ Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patchwork.freedesktop.org/patch/msgid/0-v1-10dcc8ce3869+3a7-host1x_identity_jgg@nvidia.com	2025-03-19 19:05:40 +01:00
Dan Carpenter	67d15c7aa0	accel/qaic: Fix integer overflow in qaic_validate_req() These are u64 variables that come from the user via qaic_attach_slice_bo_ioctl(). Use check_add_overflow() to ensure that the math doesn't have an integer wrapping bug. Cc: stable@vger.kernel.org Fixes: `ff13be8303` ("accel/qaic: Add datapath") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Link: https://patchwork.freedesktop.org/patch/msgid/176388fa-40fe-4cb4-9aeb-2c91c22130bd@stanley.mountain	2025-03-14 10:47:15 -06:00
Jeffrey Hugo	84a833d906	accel/qaic: Fix possible data corruption in BOs > 2G When slicing a BO, we need to iterate through the BO's sgt to find the right pieces to construct the slice. Some of the data types chosen for this process are incorrectly too small, and can overflow. This can result in the incorrect slice construction, which can lead to data corruption in workload execution. The device can only handle 32-bit sized transfers, and the scatterlist struct only supports 32-bit buffer sizes, so our upper limit for an individual transfer is an unsigned int. Using an int is incorrect due to the reservation of the sign bit. Upgrade the length of a scatterlist entry and the offsets into a scatterlist entry to unsigned int for a correct representation. While each transfer may be limited to 32-bits, the overall BO may exceed that size. For counting the total length of the BO, we need a type that can represent the largest allocation possible on the system. That is the definition of size_t, so use it. Fixes: `ff13be8303` ("accel/qaic: Add datapath") Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jeff Hugo <jeff.hugo@oss.qualcomm.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Troy Hanson <quic_thanson@quicinc.com> Reviewed-by: Youssef Samir <quic_yabdulra@quicinc.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250306171959.853466-1-jeff.hugo@oss.qualcomm.com	2025-03-14 10:28:45 -06:00
Maíra Canal	c3e4a25602	drm/v3d: Set job pointer to NULL when the job's fence has an error Similar to commit `e4b5ccd392` ("drm/v3d: Ensure job pointer is set to NULL after job completion"), ensure the job pointer is set to `NULL` when a job's fence has an error. Failing to do so can trigger kernel warnings in specific scenarios, such as: 1. v3d_csd_job_run() assigns `v3d->csd_job = job` 2. CSD job exceeds hang limit, causing a timeout → v3d_gpu_reset_for_timeout() 3. GPU reset 4. drm_sched_resubmit_jobs() sets the job's fence to `-ECANCELED`. 5. v3d_csd_job_run() detects the fence error and returns NULL, not submitting the job to the GPU 6. User-space runs `modprobe -r v3d` 7. v3d_gem_destroy() v3d_gem_destroy() triggers a warning indicating that the CSD job never ended, as we didn't set `v3d->csd_job` to NULL after the timeout. The same can also happen to BIN, RENDER, and TFU jobs. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250313-v3d-gpu-reset-fixes-v4-2-c1e780d8e096@igalia.com	2025-03-13 16:50:43 -03:00
Maíra Canal	80cbee810e	drm/v3d: Don't run jobs that have errors flagged in its fence The V3D driver still relies on `drm_sched_increase_karma()` and `drm_sched_resubmit_jobs()` for resubmissions when a timeout occurs. The function `drm_sched_increase_karma()` marks the job as guilty, while `drm_sched_resubmit_jobs()` sets an error (-ECANCELED) in the DMA fence of that guilty job. Because of this, we must check whether the job’s DMA fence has been flagged with an error before executing the job. Otherwise, the same guilty job may be resubmitted indefinitely, causing repeated GPU resets. This patch adds a check for an error on the job's fence to prevent running a guilty job that was previously flagged when the GPU timed out. Note that the CPU and CACHE_CLEAN queues do not require this check, as their jobs are executed synchronously once the DRM scheduler starts them. Cc: stable@vger.kernel.org Fixes: `d223f98f02` ("drm/v3d: Add support for compute shader dispatch.") Fixes: `1584f16ca9` ("drm/v3d: Add support for submitting jobs to the TFU.") Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Maíra Canal <mcanal@igalia.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250313-v3d-gpu-reset-fixes-v4-1-c1e780d8e096@igalia.com	2025-03-13 16:50:16 -03:00
qianyi liu	a952f1ab69	drm/sched: Fix fence reference count leak The last_scheduled fence leaks when an entity is being killed and adding the cleanup callback fails. Decrement the reference count of prev when dma_fence_add_callback() fails, ensuring proper balance. Cc: stable@vger.kernel.org # v6.2+ [phasta: add git tag info for stable kernel] Fixes: `2fdb8a8f07` ("drm/scheduler: rework entity flush, kill and fini") Signed-off-by: qianyi liu <liuqianyi125@gmail.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250311060251.4041101-1-liuqianyi125@gmail.com	2025-03-13 09:39:06 +01:00
Imre Deak	12d8f31834	drm/dp_mst: Fix locking when skipping CSN before topology probing The handling of the MST Connection Status Notify message is skipped if the probing of the topology is still pending. Acquiring the drm_dp_mst_topology_mgr::probe_lock for this in drm_dp_mst_handle_up_req() is problematic: the task/work this function is called from is also responsible for handling MST down-request replies (in drm_dp_mst_handle_down_rep()). Thus drm_dp_mst_link_probe_work() - holding already probe_lock - could be blocked waiting for an MST down-request reply while drm_dp_mst_handle_up_req() is waiting for probe_lock while processing a CSN message. This leads to the probe work's down-request message timing out. A scenario similar to the above leading to a down-request timeout is handling a CSN message in drm_dp_mst_handle_conn_stat(), holding the probe_lock and sending down-request messages while a second CSN message sent by the sink subsequently is handled by drm_dp_mst_handle_up_req(). Fix the above by moving the logic to skip the CSN handling to drm_dp_mst_process_up_req(). This function is called from a work (separate from the task/work handling new up/down messages), already holding probe_lock. This solves the above timeout issue, since handling of down-request replies won't be blocked by probe_lock. Fixes: `ddf983488c` ("drm/dp_mst: Skip CSN if topology probing is not done yet") Cc: Wayne Lin <Wayne.Lin@amd.com> Cc: Lyude Paul <lyude@redhat.com> Cc: stable@vger.kernel.org # v6.6+ Reviewed-by: Wayne Lin <Wayne.Lin@amd.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250307183152.3822170-1-imre.deak@intel.com	2025-03-11 11:29:18 +02:00
Ville Syrjälä	de93ddf880	drm/atomic: Filter out redundant DPMS calls Video players (eg. mpv) do periodic XResetScreenSaver() calls to keep the screen on while the video playing. The modesetting ddx plumbs these straight through into the kernel as DPMS setproperty ioctls, without any filtering whatsoever. When implemented via atomic these end up as empty commits on the crtc (which will nonetheless take one full frame), which leads to a dropped frame every time XResetScreenSaver() is called. Let's just filter out redundant DPMS property changes in the kernel to avoid this issue. v2: Explain the resulting commits a bit better (Sima) Document the behaviour in uapi docs (Sima) Cc: stable@vger.kernel.org Testcase: igt/kms_flip/flip-vs-dpms-on-nop Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250219160239.17502-1-ville.syrjala@linux.intel.com	2025-03-10 18:18:37 +02:00
Miguel Ojeda	cba3b86974	drm/panic: fix overindented list items in documentation Starting with the upcoming Rust 1.86.0 (to be released 2025-04-03), Clippy warns: error: doc list item overindented --> drivers/gpu/drm/drm_panic_qr.rs:914:5 \| 914 \| /// will be encoded as binary segment, otherwise it will be encoded \| ^^^ help: try using ` ` (2 spaces) \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#doc_overindented_list_items The overindentation is slightly hard to notice, since all the items start with a backquote that makes it look OK, but it is there. Thus fix it. Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Fixes: `cb5164ac43` ("drm/panic: Add a QR code panic screen") Cc: stable@vger.kernel.org # Needed in 6.12.y and 6.13.y only (Rust is pinned in older LTSs). Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250301231602.917580-2-ojeda@kernel.org	2025-03-07 17:23:23 +01:00
Miguel Ojeda	986c2e9ca8	drm/panic: use `div_ceil` to clean Clippy warning Starting with the upcoming Rust 1.86.0 (to be released 2025-04-03), Clippy warns: error: manually reimplementing `div_ceil` --> drivers/gpu/drm/drm_panic_qr.rs:548:26 \| 548 \| let pad_offset = (offset + 7) / 8; \| ^^^^^^^^^^^^^^^^ help: consider using `.div_ceil()`: `offset.div_ceil(8)` \| = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#manual_div_ceil And similarly for `stride`. Thus apply the suggestion to both. The behavior (and thus codegen) is not exactly equivalent [1][2], since `div_ceil()` returns the right value for the values that currently would overflow. Link: https://github.com/rust-lang/rust-clippy/issues/14333 [1] Link: https://godbolt.org/z/dPq6nGnv3 [2] Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Fixes: `cb5164ac43` ("drm/panic: Add a QR code panic screen") Cc: stable@vger.kernel.org # Needed in 6.12.y and 6.13.y only (Rust is pinned in older LTSs). Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com> Signed-off-by: Jocelyn Falempe <jfalempe@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250301231602.917580-1-ojeda@kernel.org	2025-03-07 17:22:23 +01:00
Ivan Abramov	9af152dcf1	drm/gma500: Add NULL check for pci_gfx_root in mid_get_vbt_data() Since pci_get_domain_bus_and_slot() can return NULL, add NULL check for pci_gfx_root in the mid_get_vbt_data(). This change is similar to the checks implemented in mid_get_fuse_settings() and mid_get_pci_revID(), which were introduced by commit `0cecdd818c` ("gma500: Final enables for Oaktrail") as "additional minor bulletproofing". Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `f910b41105` ("gma500: Add the glue to the various BIOS and firmware interfaces") Signed-off-by: Ivan Abramov <i.abramov@mt-integration.ru> Signed-off-by: Patrik Jakobsson <patrik.r.jakobsson@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250306112046.17144-1-i.abramov@mt-integration.ru	2025-03-06 12:30:39 +01:00
Takashi Iwai	80da96d735	drm/bochs: Fix DPMS regression The recent rewrite with the use of regular atomic helpers broke the DPMS unblanking on X11. Fix it by moving the call of bochs_hw_blank(false) from CRTC mode_set_nofb() to atomic_enable(). Fixes: `2037174993` ("drm/bochs: Use regular atomic helpers") Link: https://bugzilla.suse.com/show_bug.cgi?id=1238209 Signed-off-by: Takashi Iwai <tiwai@suse.de> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20250304134203.20534-1-tiwai@suse.de	2025-03-06 08:54:42 +01:00
Philipp Stanner	23e0832d6d	drm/sched: Fix preprocessor guard When writing the header guard for gpu_scheduler_trace.h, a typo, apparently, occurred. Fix the typo and document the scope of the guard. Fixes: `353da3c520` ("drm/amdgpu: add tracepoint for scheduler (v2)") Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com> Signed-off-by: Philipp Stanner <phasta@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250218124149.118002-2-phasta@kernel.org	2025-03-03 16:03:48 +01:00
Alessio Belle	1d2eabb661	drm/imagination: Fix timestamps in firmware traces When firmware traces are enabled, the firmware dumps 48-bit timestamps for each trace as two 32-bit values, highest 32 bits (of which only 16 useful) first. The driver was reassembling them the other way round i.e. interpreting the first value in memory as the lowest 32 bits, and the second value as the highest 32 bits (then truncated to 16 bits). Due to this, firmware trace dumps showed very large timestamps even for traces recorded shortly after GPU boot. The timestamps in these dumps would also sometimes jump backwards because of the truncation. Example trace dumped after loading the powervr module and enabling firmware traces, where each line is commented with the timestamp value in hexadecimal to better show both issues: [93540092739584] : Host Sync Partition marker: 1 // 0x551300000000 [28419798597632] : GPU units deinit // 0x19d900000000 [28548647616512] : GPU deinit // 0x19f700000000 Update logic to reassemble the timestamps halves in the correct order. Fixes: `cb56cd6108` ("drm/imagination: Add firmware trace to debugfs") Signed-off-by: Alessio Belle <alessio.belle@imgtec.com> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250221-fix-fw-trace-timestamps-v1-1-dba4aeb030ca@imgtec.com Signed-off-by: Matt Coster <matt.coster@imgtec.com>	2025-03-03 12:15:55 +00:00
Brendan King	68c3de7f70	drm/imagination: only init job done fences once Ensure job done fences are only initialised once. This fixes a memory manager not clean warning from drm_mm_takedown on module unload. Cc: stable@vger.kernel.org Fixes: `eaf01ee5ba` ("drm/imagination: Implement job submission and scheduling") Signed-off-by: Brendan King <brendan.king@imgtec.com> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250226-init-done-fences-once-v2-1-c1b2f556b329@imgtec.com Signed-off-by: Matt Coster <matt.coster@imgtec.com>	2025-03-03 12:10:13 +00:00
Brendan King	a5c4c3ba95	drm/imagination: Hold drm_gem_gpuva lock for unmap Avoid a warning from drm_gem_gpuva_assert_lock_held in drm_gpuva_unlink. The Imagination driver uses the GEM object reservation lock to protect the gpuva list, but the GEM object was not always known in the code paths that ended up calling drm_gpuva_unlink. When the GEM object isn't known, it is found by calling drm_gpuva_find to lookup the object associated with a given virtual address range, or by calling drm_gpuva_find_first when removing all mappings. Cc: stable@vger.kernel.org Fixes: `4bc736f890` ("drm/imagination: vm: make use of GPUVM's drm_exec helper") Signed-off-by: Brendan King <brendan.king@imgtec.com> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250226-hold-drm_gem_gpuva-lock-for-unmap-v2-1-3fdacded227f@imgtec.com Signed-off-by: Matt Coster <matt.coster@imgtec.com>	2025-03-03 12:09:00 +00:00
Brendan King	df1a1ed5e1	drm/imagination: avoid deadlock on fence release Do scheduler queue fence release processing on a workqueue, rather than in the release function itself. Fixes deadlock issues such as the following: [ 607.400437] ============================================ [ 607.405755] WARNING: possible recursive locking detected [ 607.415500] -------------------------------------------- [ 607.420817] weston:zfq0/24149 is trying to acquire lock: [ 607.426131] ffff000017d041a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: pvr_gem_object_vunmap+0x40/0xc0 [powervr] [ 607.436728] but task is already holding lock: [ 607.442554] ffff000017d105a0 (reservation_ww_class_mutex){+.+.}-{3:3}, at: dma_buf_ioctl+0x250/0x554 [ 607.451727] other info that might help us debug this: [ 607.458245] Possible unsafe locking scenario: [ 607.464155] CPU0 [ 607.466601] ---- [ 607.469044] lock(reservation_ww_class_mutex); [ 607.473584] lock(reservation_ww_class_mutex); [ 607.478114] * DEADLOCK * Cc: stable@vger.kernel.org Fixes: `eaf01ee5ba` ("drm/imagination: Implement job submission and scheduling") Signed-off-by: Brendan King <brendan.king@imgtec.com> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250226-fence-release-deadlock-v2-1-6fed2fc1fe88@imgtec.com Signed-off-by: Matt Coster <matt.coster@imgtec.com>	2025-03-03 12:07:56 +00:00
Dave Airlie	6b481ab0e6	drm/nouveau: select FW caching nouveau tries to load some firmware during suspend that it loaded earlier, but with fw caching disabled it hangs suspend, so just rely on FW cache enabling instead of working around it in the driver. Fixes: `176fdcbddf` ("drm/nouveau/gsp/r535: add support for booting GSP-RM") Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250207012531.621369-1-airlied@gmail.com	2025-02-27 20:27:58 +01:00
Thomas Zimmermann	3603996432	drm/fbdev-dma: Add shadow buffering for deferred I/O DMA areas are not necessarily backed by struct page, so we cannot rely on it for deferred I/O. Allocate a shadow buffer for drivers that require deferred I/O and use it as framebuffer memory. Fixes driver errors about being "Unable to handle kernel NULL pointer dereference at virtual address" or "Unable to handle kernel paging request at virtual address". The patch splits drm_fbdev_dma_driver_fbdev_probe() in an initial allocation, which creates the DMA-backed buffer object, and a tail that sets up the fbdev data structures. There is a tail function for direct memory mappings and a tail function for deferred I/O with the shadow buffer. It is no longer possible to use deferred I/O without shadow buffer. It can be re-added if there exists a reliably test for usable struct page in the allocated DMA-backed buffer object. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reported-by: Nuno Gonçalves <nunojpg@gmail.com> CLoses: https://lore.kernel.org/dri-devel/CAEXMXLR55DziAMbv_+2hmLeH-jP96pmit6nhs6siB22cpQFr9w@mail.gmail.com/ Tested-by: Nuno Gonçalves <nunojpg@gmail.com> Fixes: `5ab91447aa` ("drm/tiny/ili9225: Use fbdev-dma") Cc: Thomas Zimmermann <tzimmermann@suse.de> Cc: <stable@vger.kernel.org> # v6.11+ Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Reviewed-by: Javier Martinez Canillas <javierm@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241211090643.74250-1-tzimmermann@suse.de	2025-02-27 09:37:55 +01:00
Thomas Zimmermann	01f1d77a26	drm/nouveau: Do not override forced connector status Keep user-forced connector status even if it cannot be programmed. Same behavior as for the rest of the drivers. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250114100214.195386-1-tzimmermann@suse.de	2025-02-26 14:18:15 -05:00
Masahiro Yamada	2e064e3f32	drm/imagination: remove unnecessary header include path drivers/gpu/drm/imagination/ includes local headers with the double-quote form (#include "..."). Hence, the header search path addition is unneeded. Signed-off-by: Masahiro Yamada <masahiroy@kernel.org> Reviewed-by: Matt Coster <matt.coster@imgtec.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250210102352.1517115-1-masahiroy@kernel.org Signed-off-by: Matt Coster <matt.coster@imgtec.com>	2025-02-26 09:54:57 +00:00
Harry Wentland	8ec43c58d3	drm/vkms: Round fixp2int conversion in lerp_u16 fixp2int always rounds down, fixp2int_ceil rounds up. We need the new fixp2int_round. Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Harry Wentland <harry.wentland@amd.com> Reviewed-by: Louis Chauvet <louis.chauvet@bootlin.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241220043410.416867-3-alex.hung@amd.com Signed-off-by: Louis Chauvet <louis.chauvet@bootlin.com>	2025-02-25 16:15:55 +01:00
Maarten Lankhorst	511a3444f7	MAINTAINERS: Add entry for DMEM cgroup controller The cgroups controller is currently maintained through the drm-misc tree, so lets add Maxime Ripard, Natalie Vock and me as specific maintainers for dmem. We keep the cgroup mailing list CC'd on all cgroup specific patches. Acked-by: Maxime Ripard <mripard@kernel.org> Acked-by: Natalie Vock <natalie.vock@gmx.de> Acked-by: Tejun Heo <tj@kernel.org> Acked-by: Johannes Weiner <hannes@cmpxchg.org> Acked-by: Michal Koutný <mkoutny@suse.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250220140757.16823-1-dev@lankhorst.se Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>	2025-02-20 19:58:24 +01:00
Su Hui	838c17fd07	accel/amdxdna: Add missing include linux/slab.h When compiling without CONFIG_IA32_EMULATION, there can be some errors: drivers/accel/amdxdna/amdxdna_mailbox.c: In function ‘mailbox_release_msg’: drivers/accel/amdxdna/amdxdna_mailbox.c:197:2: error: implicit declaration of function ‘kfree’. 197 \| kfree(mb_msg); \| ^~~~~ drivers/accel/amdxdna/amdxdna_mailbox.c: In function ‘xdna_mailbox_send_msg’: drivers/accel/amdxdna/amdxdna_mailbox.c:418:11: error:implicit declaration of function ‘kzalloc’. 418 \| mb_msg = kzalloc(sizeof(*mb_msg) + pkg_size, GFP_KERNEL); \| ^~~~~~~ Add the missing include. Fixes: `b87f920b93` ("accel/amdxdna: Support hardware mailbox") Signed-off-by: Su Hui <suhui@nfschina.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250211015354.3388171-1-suhui@nfschina.com	2025-02-19 09:42:17 -08:00
Karol Herbst	6586788f0a	MAINTAINERS: Remove myself I was pondering with myself for a while if I should just make it official that I'm not really involved in the kernel community anymore, neither as a reviewer, nor as a maintainer. Most of the time I simply excused myself with "if something urgent comes up, I can chime in and help out". Lyude and Danilo are doing a wonderful job and I've put all my trust into them. However, there is one thing I can't stand and it's hurting me the most. I'm convinced, no, my core believe is, that inclusivity and respect, working with others as equals, no power plays involved, is how we should work together within the Free and Open Source community. I can understand maintainers needing to learn, being concerned on technical points. Everybody deserves the time to understand and learn. It is my true belief that most people are capable of change eventually. I truly believe this community can change from within, however this doesn't mean it's going to be a smooth process. The moment I made up my mind about this was reading the following words written by a maintainer within the kernel community: "we are the thin blue line" This isn't okay. This isn't creating an inclusive environment. This isn't okay with the current political situation especially in the US. A maintainer speaking those words can't be kept. No matter how important or critical or relevant they are. They need to be removed until they learn. Learn what those words mean for a lot of marginalized people. Learn about what horrors it evokes in their minds. I can't in good faith remain to be part of a project and its community where those words are tolerated. Those words are not technical, they are a political statement. Even if unintentionally, such words carry power, they carry meanings one needs to be aware of. They do cause an immense amount of harm. I wish the best of luck for everybody to continue to try to work from within. You got my full support and I won't hold it against anybody trying to improve the community, it's a thankless job, it's a lot of work. People will continue to burn out. I got burned out enough by myself caring about the bits I maintained, but eventually I had to realize my limits. The obligation I felt was eating me from inside. It stopped being fun at some point and I reached a point where I simply couldn't continue the work I was so motivated doing as I've did in the early days. Please respect my wishes and put this statement as is into the tree. Leaving anything out destroys its entire meaning. Respectfully Karol Signed-off-by: Karol Herbst <kherbst@redhat.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250215073753.1217002-2-kherbst@redhat.com	2025-02-19 13:57:26 +01:00
Aaron Kling	3dbc0215e3	drm/nouveau/pmu: Fix gp10b firmware guard Most kernel configs enable multiple Tegra SoC generations, causing this typo to go unnoticed. But in the case where a kernel config is strictly for Tegra186, this is a problem. Fixes: `989863d7cb` ("drm/nouveau/pmu: select implementation based on available firmware") Signed-off-by: Aaron Kling <webgeek1234@gmail.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250218-nouveau-gm10b-guard-v2-1-a4de71500d48@gmail.com	2025-02-19 13:31:59 +01:00
Friedrich Vock	8821f36333	cgroup/dmem: Don't open-code css_for_each_descendant_pre The current implementation has a bug: If the current css doesn't contain any pool that is a descendant of the "pool" (i.e. when found_descendant == false), then "pool" will point to some unrelated pool. If the current css has a child, we'll overwrite parent_pool with this unrelated pool on the next iteration. Since we can just check whether a pool refers to the same region to determine whether or not it's related, all the additional pool tracking is unnecessary, so just switch to using css_for_each_descendant_pre for traversal. Fixes: `b168ed458d` ("kernel/cgroup: Add "dmem" memory accounting cgroup") Signed-off-by: Friedrich Vock <friedrich.vock@gmx.de> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250127152754.21325-1-friedrich.vock@gmx.de Signed-off-by: Maarten Lankhorst <dev@lankhorst.se>	2025-02-19 09:50:37 +01:00
David Hildenbrand	b3fefbb30a	nouveau/svm: fix missing folio unlock + put after make_device_exclusive_range() In case we have to retry the loop, we are missing to unlock+put the folio. In that case, we will keep failing make_device_exclusive_range() because we cannot grab the folio lock, and even return from the function with the folio locked and referenced, effectively never succeeding the make_device_exclusive_range(). While at it, convert the other unlock+put to use a folio as well. This was found by code inspection. Fixes: `8f187163eb` ("nouveau/svm: implement atomic SVM access") Signed-off-by: David Hildenbrand <david@redhat.com> Reviewed-by: Alistair Popple <apopple@nvidia.com> Tested-by: Alistair Popple <apopple@nvidia.com> Signed-off-by: Danilo Krummrich <dakr@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250124181524.3584236-2-david@redhat.com	2025-02-14 15:52:58 +01:00
Hugo Villeneuve	a8972d5a49	drm: panel: jd9365da-h3: fix reset signal polarity In jadard_prepare() a reset pulse is generated with the following statements (delays ommited for clarity): gpiod_set_value(jadard->reset, 1); --> Deassert reset gpiod_set_value(jadard->reset, 0); --> Assert reset for 10ms gpiod_set_value(jadard->reset, 1); --> Deassert reset However, specifying second argument of "0" to gpiod_set_value() means to deassert the GPIO, and "1" means to assert it. If the reset signal is defined as GPIO_ACTIVE_LOW in the DTS, the above statements will incorrectly generate the reset pulse (inverted) and leave it asserted (LOW) at the end of jadard_prepare(). Fix reset behavior by inverting gpiod_set_value() second argument in jadard_prepare(). Also modify second argument to devm_gpiod_get() in jadard_dsi_probe() to assert the reset when probing. Do not modify it in jadard_unprepare() as it is already properly asserted with "1", which seems to be the intended behavior. Fixes: `6b818c533d` ("drm: panel: Add Jadard JD9365DA-H3 DSI panel") Cc: stable@vger.kernel.org Signed-off-by: Hugo Villeneuve <hvilleneuve@dimonoff.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Link: https://lore.kernel.org/r/20240927135306.857617-1-hugo@hugovil.com Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240927135306.857617-1-hugo@hugovil.com	2025-02-13 17:46:51 +01:00
Imre Deak	e00a2e5d48	drm: Fix DSC BPP increment decoding Starting with DPCD version 2.0 bits 6:3 of the DP_DSC_BITS_PER_PIXEL_INC DPCD register contains the NativeYCbCr422_MAX_bpp_DELTA field, which can be non-zero as opposed to earlier DPCD versions, hence decoding the bit_per_pixel increment value at bits 2:0 in the same register requires applying a mask, do so. Cc: Ankit Nautiyal <ankit.k.nautiyal@intel.com> Fixes: `0c2287c965` ("drm/display/dp: Add helper function to get DSC bpp precision") Reviewed-by: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250212161851.4007005-1-imre.deak@intel.com	2025-02-13 10:20:30 +02:00
Arnd Bergmann	9ab127a180	drm/hisilicon/hibmc: select CONFIG_DRM_DISPLAY_DP_HELPER Without the DP helper code, the newly added displayport support causes a link failure: x86_64-linux-ld: drivers/gpu/drm/hisilicon/hibmc/dp/dp_aux.o: in function `hibmc_dp_aux_init': dp_aux.c:(.text+0x37e): undefined reference to `drm_dp_aux_init' x86_64-linux-ld: drivers/gpu/drm/hisilicon/hibmc/dp/dp_link.o: in function `hibmc_dp_link_set_pattern': dp_link.c:(.text+0xae): undefined reference to `drm_dp_dpcd_write' x86_64-linux-ld: drivers/gpu/drm/hisilicon/hibmc/dp/dp_link.o: in function `hibmc_dp_link_get_adjust_train': dp_link.c:(.text+0x121): undefined reference to `drm_dp_get_adjust_request_voltage' x86_64-linux-ld: dp_link.c:(.text+0x12e): undefined reference to `drm_dp_get_adjust_request_pre_emphasis' x86_64-linux-ld: drivers/gpu/drm/hisilicon/hibmc/dp/dp_link.o: in function `hibmc_dp_link_training': dp_link.c:(.text+0x2b0): undefined reference to `drm_dp_dpcd_write' x86_64-linux-ld: dp_link.c:(.text+0x2e3): undefined reference to `drm_dp_dpcd_write' Add both DRM_DISPLAY_DP_HELPER and DRM_DISPLAY_HELPER, which is in turn required by the former. Fixes: `0ab6ea261c` ("drm/hisilicon/hibmc: add dp module in hibmc") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20250127071059.617567-1-arnd@kernel.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>	2025-02-09 03:56:34 +02:00
Su Hui	3b32b7f638	drm/panthor: avoid garbage value in panthor_ioctl_dev_query() 'priorities_info' is uninitialized, and the uninitialized value is copied to user object when calling PANTHOR_UOBJ_SET(). Using memset to initialize 'priorities_info' to avoid this garbage value problem. Fixes: `f70000ef23` ("drm/panthor: Add DEV_QUERY_GROUP_PRIORITIES_INFO dev query") Signed-off-by: Su Hui <suhui@nfschina.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Steven Price <steven.price@arm.com> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250119025828.1168419-1-suhui@nfschina.com	2025-02-07 17:31:59 +01:00
Rupinderjit Singh	02458fbfaa	gpu: host1x: Fix a use of uninitialized mutex commit `c8347f915e` ("gpu: host1x: Fix boot regression for Tegra") caused a use of uninitialized mutex leading to below warning when CONFIG_DEBUG_MUTEXES and CONFIG_DEBUG_LOCK_ALLOC are enabled. [ 41.662843] ------------[ cut here ]------------ [ 41.663012] DEBUG_LOCKS_WARN_ON(lock->magic != lock) [ 41.663035] WARNING: CPU: 4 PID: 794 at kernel/locking/mutex.c:587 __mutex_lock+0x670/0x878 [ 41.663458] Modules linked in: rtw88_8822c(+) bluetooth(+) rtw88_pci rtw88_core mac80211 aquantia libarc4 crc_itu_t cfg80211 tegra194_cpufreq dwmac_tegra(+) arm_dsu_pmu stmmac_platform stmmac pcs_xpcs rfkill at24 host1x(+) tegra_bpmp_thermal ramoops reed_solomon fuse loop nfnetlink xfs mmc_block rpmb_core ucsi_ccg ina3221 crct10dif_ce xhci_tegra ghash_ce lm90 sha2_ce sha256_arm64 sha1_ce sdhci_tegra pwm_fan sdhci_pltfm sdhci gpio_keys rtc_tegra cqhci mmc_core phy_tegra_xusb i2c_tegra tegra186_gpc_dma i2c_tegra_bpmp spi_tegra114 dm_mirror dm_region_hash dm_log dm_mod [ 41.665078] CPU: 4 UID: 0 PID: 794 Comm: (udev-worker) Not tainted 6.11.0-29.31_1538613708.el10.aarch64+debug #1 [ 41.665838] Hardware name: NVIDIA NVIDIA Jetson AGX Orin Developer Kit/Jetson, BIOS 36.3.0-gcid-35594366 02/26/2024 [ 41.672555] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 41.679636] pc : __mutex_lock+0x670/0x878 [ 41.683834] lr : __mutex_lock+0x670/0x878 [ 41.688035] sp : ffff800084b77090 [ 41.691446] x29: ffff800084b77160 x28: ffffdd4bebf7b000 x27: ffffdd4be96b1000 [ 41.698799] x26: 1fffe0002308361c x25: 1ffff0001096ee18 x24: 0000000000000000 [ 41.706149] x23: 0000000000000000 x22: 0000000000000002 x21: ffffdd4be6e3c7a0 [ 41.713500] x20: ffff800084b770f0 x19: ffff00011841b1e8 x18: 0000000000000000 [ 41.720675] x17: 0000000000000000 x16: 0000000000000000 x15: 0720072007200720 [ 41.728023] x14: 0000000000000000 x13: 0000000000000001 x12: ffff6001a96eaab3 [ 41.735375] x11: 1fffe001a96eaab2 x10: ffff6001a96eaab2 x9 : ffffdd4be4838bbc [ 41.742723] x8 : 00009ffe5691554e x7 : ffff000d4b755593 x6 : 0000000000000001 [ 41.749985] x5 : ffff000d4b755590 x4 : 1fffe0001d88f001 x3 : dfff800000000000 [ 41.756988] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff0000ec478000 [ 41.764251] Call trace: [ 41.766695] __mutex_lock+0x670/0x878 [ 41.770373] mutex_lock_nested+0x2c/0x40 [ 41.774134] host1x_intr_start+0x54/0xf8 [host1x] [ 41.778863] host1x_runtime_resume+0x150/0x228 [host1x] [ 41.783935] pm_generic_runtime_resume+0x84/0xc8 [ 41.788485] __rpm_callback+0xa0/0x478 [ 41.792422] rpm_callback+0x15c/0x1a8 [ 41.795922] rpm_resume+0x698/0xc08 [ 41.799597] __pm_runtime_resume+0xa8/0x140 [ 41.803621] host1x_probe+0x810/0xbc0 [host1x] [ 41.807909] platform_probe+0xcc/0x1a8 [ 41.811845] really_probe+0x188/0x800 [ 41.815347] __driver_probe_device+0x164/0x360 [ 41.819810] driver_probe_device+0x64/0x1a8 [ 41.823834] __driver_attach+0x180/0x490 [ 41.827773] bus_for_each_dev+0x104/0x1a0 [ 41.831797] driver_attach+0x44/0x68 [ 41.835296] bus_add_driver+0x23c/0x4e8 [ 41.839235] driver_register+0x15c/0x3a8 [ 41.843170] __platform_register_drivers+0xa4/0x208 [ 41.848159] tegra_host1x_init+0x4c/0xff8 [host1x] [ 41.853147] do_one_initcall+0xd4/0x380 [ 41.856997] do_init_module+0x1dc/0x698 [ 41.860758] load_module+0xc70/0x1300 [ 41.864435] __do_sys_init_module+0x1a8/0x1d0 [ 41.868721] __arm64_sys_init_module+0x74/0xb0 [ 41.873183] invoke_syscall.constprop.0+0xdc/0x1e8 [ 41.877997] do_el0_svc+0x154/0x1d0 [ 41.881671] el0_svc+0x54/0x140 [ 41.884820] el0t_64_sync_handler+0x120/0x130 [ 41.889285] el0t_64_sync+0x1a4/0x1a8 [ 41.892960] irq event stamp: 69737 [ 41.896370] hardirqs last enabled at (69737): [<ffffdd4be6d7768c>] _raw_spin_unlock_irqrestore+0x44/0xe8 [ 41.905739] hardirqs last disabled at (69736): [<ffffdd4be59dcd40>] clk_enable_lock+0x98/0x198 [ 41.914314] softirqs last enabled at (68082): [<ffffdd4be466b1d0>] handle_softirqs+0x4c8/0x890 [ 41.922977] softirqs last disabled at (67945): [<ffffdd4be44f02a4>] __do_softirq+0x1c/0x28 [ 41.931289] ---[ end trace 0000000000000000 ]--- Inside the probe function when pm_runtime_enable() is called, the PM core invokes a resume callback if the device Host1x is in a suspended state. As it can be seen in the logs above, this leads to host1x_intr_start() function call which is trying to acquire a mutex lock. But, the function host_intr_init() only gets called after the pm_runtime_enable() where mutex is initialised leading to the use of mutex prior to its initialisation. Fix this by moving the mutex initialisation prior to the runtime PM enablement function pm_runtime_enable() in probe. Fixes: `c8347f915e` ("gpu: host1x: Fix boot regression for Tegra") Signed-off-by: Rupinderjit Singh <rusingh@redhat.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: https://patchwork.ozlabs.org/project/linux-tegra/patch/20250206155803.201942-1-rusingh@redhat.com/	2025-02-07 15:57:50 +01:00
Maxime Ripard	5d14c08a47	drm/tests: hdmi: Fix recursive locking The find_preferred_mode() functions takes the mode_config mutex, but due to the order most tests have, is called with the crtc_ww_class_mutex taken. This raises a warning for a circular dependency when running the tests with lockdep. Reorder the tests to call find_preferred_mode before the acquire context has been created to avoid the issue. Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20250129-test-kunit-v2-4-fe59c43805d5@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>	2025-02-06 14:07:56 +01:00
Maxime Ripard	6b6bfd63e1	drm/tests: hdmi: Reorder DRM entities variables assignment The tests all deviate slightly in how they assign their local pointers to DRM entities. This makes refactoring pretty difficult, so let's just move the assignment as soon as the entities are allocated. Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20250129-test-kunit-v2-3-fe59c43805d5@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>	2025-02-06 14:07:47 +01:00
Maxime Ripard	bb4f929a88	drm/tests: hdmi: Remove redundant assignments Some tests have the drm pointer assigned multiple times to the same value. Drop the redundant assignments. Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20250129-test-kunit-v2-2-fe59c43805d5@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>	2025-02-06 14:07:09 +01:00
Maxime Ripard	fb97bc2e47	drm/tests: hdmi: Fix WW_MUTEX_SLOWPATH failures The light_up_connector helper function in the HDMI infrastructure unit tests uses drm_atomic_set_crtc_for_connector(), but fails when it returns an error. This function can return EDEADLK though if the sequence needs to be restarted, and WW_MUTEX_SLOWPATH is meant to test that we handle it properly. Let's handle EDEADLK and restart the sequence in our tests as well. Fixes: `eb66d34d79` ("drm/tests: Add output bpc tests") Reported-by: Dave Airlie <airlied@gmail.com> Closes: https://lore.kernel.org/r/CAPM=9tzJ4-ERDxvuwrCyUPY0=+P44orhp1kLWVGL7MCfpQjMEQ@mail.gmail.com/ Link: https://lore.kernel.org/r/20241031091558.2435850-1-mripard@kernel.org Reviewed-by: Simona Vetter <simona.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20250129-test-kunit-v2-1-fe59c43805d5@kernel.org Signed-off-by: Maxime Ripard <mripard@kernel.org>	2025-02-06 14:07:09 +01:00
Maxime Ripard	2c1ed90752	Merge remote-tracking branch 'drm-misc/drm-misc-next-fixes' into drm-misc-fixes Merge the few remaining patches stuck into drm-misc-next-fixes. Signed-off-by: Maxime Ripard <mripard@kernel.org>	2025-02-06 09:59:35 +01:00
Mario Limonciello	ecee4d0695	accel/amdxdna: Add MODULE_FIRMWARE() declarations Initramfs building tools such as dracut will look for a MODULE_FIRMWARE() declaration to determine which firmware to include in the initramfs when a driver is included in the initramfs. As amdxdna doesn't declare any firmware this causes the driver to fail to load with -ENOENT when in the initramfs. Add the missing declaration for possible firmware. Reported-by: Renjith Pananchikkal <Renjith.Pananchikkal@amd.com> Suggested-by: Alexander Deucher <Alexander.Deucher@amd.com> Fixes: `8c9ff1b181` ("accel/amdxdna: Add a new driver for AMD AI Engine") Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Link: https://lore.kernel.org/r/20250204174031.3425762-1-superm1@kernel.org Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250204174031.3425762-1-superm1@kernel.org	2025-02-04 13:05:48 -06:00
Jacek Lawrynowicz	41a2d8286c	accel/ivpu: Fix error handling in recovery/reset Disable runtime PM for the duration of reset/recovery so it is possible to set the correct runtime PM state depending on the outcome of the `ivpu_resume()`. Don’t suspend or reset the HW if the NPU is suspended when the reset/recovery is requested. Also, move common reset/recovery code to separate functions for better code readability. Fixes: `27d19268cf` ("accel/ivpu: Improve recovery and reset support") Cc: stable@vger.kernel.org # v6.8+ Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250129124009.1039982-4-jacek.lawrynowicz@linux.intel.com	2025-02-03 09:59:18 +01:00
Jacek Lawrynowicz	f2bc2afe34	accel/ivpu: Clear runtime_error after pm_runtime_resume_and_get() fails pm_runtime_resume_and_get() sets dev->power.runtime_error that causes all subsequent pm_runtime_get_sync() calls to fail. Clear the runtime_error using pm_runtime_set_suspended(), so the driver doesn't have to be reloaded to recover when the NPU fails to boot during runtime resume. Fixes: `7d4b4c7443` ("accel/ivpu: Remove suspend_reschedule_counter") Cc: stable@vger.kernel.org # v6.11+ Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250129124009.1039982-3-jacek.lawrynowicz@linux.intel.com	2025-02-03 09:59:18 +01:00
Jacek Lawrynowicz	f3be8a9b1a	accel/ivpu: Fix error handling in ivpu_boot() Ensure IRQs and IPC are properly disabled if HW sched or DCT initialization fails. Fixes: `cc3c72c7e6` ("accel/ivpu: Refactor failure diagnostics during boot") Cc: stable@vger.kernel.org # v6.13+ Reviewed-by: Karol Wachowski <karol.wachowski@intel.com> Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com> Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20250129124009.1039982-2-jacek.lawrynowicz@linux.intel.com	2025-02-03 09:59:17 +01:00

1 2 3 4 5 ...

1326587 Commits