Currently drm_bridge_connector_init() reuses the 'bridge' variable, first
as a loop variable in the long drm_for_each_bridge_in_chain() and then in
the HDMI-specific code as a shortcut to bridge_connector->bridge_cec.
We are about to remove the 'bridge' loop variable for
drm_for_each_bridge_in_chain() by moving to a scoped version of the same
macro, which implies removing the 'bridge' variable from the main function
scope. Additionally reusing the variable can make such long function less
readable.
Similarly to what commit 6f727c838ea8 ("drm/bridge-connector: Fix bridge in
drm_connector_hdmi_audio_init()") already did for the audio HDMI bridge,
use n local variable inside the scopes where it is needed as a
bridge_connector->bridge_hdmi_cec shortcut to make its scope clearer as
well as to allow removing the 'bridge' variable in an upcoming commit.
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Link: https://lore.kernel.org/r/20250808-drm-bridge-alloc-getput-for_each_bridge-v2-1-edb6ee81edf1@bootlin.com
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Before updating the display from the console's shadow buffer, the dirty
worker now waits for a vblank. This allows several screen updates to pile
up and acts as a rate limiter. If a DRM master is present, it could
interfere with the vblank. Don't wait in this case.
v4:
* share code with WAITFORVSYNC ioctl (Emil)
* use lock guard
v3:
* add back helper->lock
* acquire DRM master status while waiting for vblank
v2:
* don't hold helper->lock while waiting for vblank
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Link: https://lore.kernel.org/r/20250829091447.46719-1-tzimmermann@suse.de
Using pmu counters for usage stats. This enables dynamic frequency
scaling on all of the currently supported Tegra gpus.
The register offsets are valid for gk20a, gm20b, gp10b, and gv11b. If
support is added for ga10b, this will need rearchitected.
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
[fixed tab alignment in gk20a_devfreq_target()]
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://lore.kernel.org/r/20250906-gk20a-devfreq-v2-1-0217f53ee355@gmail.com
Starting with Tegra186, gpu clock handling is done by the bpmp and there
is little to be done by the kernel. The only thing necessary for
reclocking is to set the gpcclk to the desired rate and the bpmp handles
the rest. The pstate list is based on the downstream driver generates.
Signed-off-by: Aaron Kling <webgeek1234@gmail.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
[added newline before gp10b_clk macro declaration for checkpatch error]
Signed-off-by: Lyude Paul <lyude@redhat.com>
Link: https://lore.kernel.org/r/20250823-gp10b-reclock-v2-1-90a1974a54e3@gmail.com
At the moment, the way that we currently free gem shmem objects is not
ideal for rust bindings. drm_gem_shmem_free() releases all of the
associated memory with a gem shmem object with kfree(), which means that
for us to correctly release a gem shmem object in rust we have to manually
drop all of the contents of our gem object structure in-place by hand
before finally calling drm_gem_shmem_free() to release the shmem resources
and the allocation for the gem object.
Since the only reason this is an issue is because of drm_gem_shmem_free()
calling kfree(), we can fix this by splitting drm_gem_shmem_free() out into
itself and drm_gem_shmem_release(), where drm_gem_shmem_release() releases
the various gem shmem resources without freeing the structure itself. With
this, we can safely re-acquire the KBox for the gem object's memory
allocation and let rust handle cleaning up all of the other struct members
automatically.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Link: https://lore.kernel.org/r/20250911230147.650077-3-lyude@redhat.com
With gem objects in rust, the most ideal way for us to be able to handle
gem shmem object creation is to be able to handle the memory allocation of
a gem object ourselves - and then have the DRM gem shmem helpers initialize
the object we've allocated afterwards. So, let's split out
drm_gem_shmem_init() from drm_gem_shmem_create() to allow for doing this.
Signed-off-by: Lyude Paul <lyude@redhat.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Link: https://lore.kernel.org/r/20250911230147.650077-2-lyude@redhat.com
If GuC hangs, the GuC logs might not contain enough information to
understand exactly why the hang occurred. In this case, we need to
look at the GuC HW state to try to understand where the GuC is stuck. It
is therefore useful to include the GuC HW state in the error capture.
The list of registers that are part of the GuC HW state can change based
on platform, but it is the same for all platforms from TGL to MTL so we
only need to support one version for i915.
v2: revised list
v3: remove confusing comment, use sizeof(u32) instead of 4 (John)
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://lore.kernel.org/r/20250909223621.3782625-2-daniele.ceraolospurio@intel.com
Add support to register host1x devices without requiring subdevices.
This ensures syncpoint functionality remains available even when engine
subdevices are not present.
Add softdep for tegra-drm to make userspace interface available
without module autoloading through device tree entries.
Signed-off-by: Vamsee Vardhan Thummala <vthummala@nvidia.com>
[mperttunen@nvidia.com: some rewording]
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250708-host1x-allow-no-subdevs-v1-1-93c66c251f03@nvidia.com
The current submission opcode sequence first takes the engine MLOCK,
and then switches to HOST1X class to wait prefences. This is fine
while we only use a single channel per engine and there is no
virtualization, since jobs are serialized on that one channel anyway.
However, when that assumption doesn't hold, we are keeping the
engine locked while not running anything on it while waiting for
prefences to complete.
To resolve this, execute wait commands in the beginning of the job
outside the engine MLOCK. We still take the HOST1X MLOCK because
recent hardware requires register opcodes to be executed within some
MLOCK, but the hardware also allows unlimited channels to take the
HOST1X MLOCK at the same time.
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250708-host1x-wait-prefences-outside-mlock-v1-1-13e98044e35a@nvidia.com
Fix race condition between host1x_syncpt_alloc()
and host1x_syncpt_put() by using kref_put_mutex()
instead of kref_put() + manual mutex locking.
This ensures no thread can acquire the
syncpt_mutex after the refcount drops to zero
but before syncpt_release acquires it.
This prevents races where syncpoints could
be allocated while still being cleaned up
from a previous release.
Remove explicit mutex locking in syncpt_release
as kref_put_mutex() handles this atomically.
Signed-off-by: Mainak Sen <msen@nvidia.com>
Fixes: f5ba33fb96 ("gpu: host1x: Reserve VBLANK syncpoints at initialization")
Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250707-host1x-syncpt-race-fix-v1-1-28b0776e70bc@nvidia.com
The variable j is set, however never used in or outside the loop, thus
resulting in dead code.
Building with GCC 16 results in a build error due to
-Werror=unused-but-set-variable= enabled by default.
This patch clean up the dead code and fixes the build error.
Example build log:
drivers/gpu/drm/tegra/sor.c:1867:19: error: variable ‘j’ set but not used [-Werror=unused-but-set-variable=]
1867 | size_t i, j;
| ^
Signed-off-by: Brahmajit Das <listout@listout.xyz>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Link: https://lore.kernel.org/r/20250901212020.3757519-1-listout@listout.xyz
Before exporting a buffer, make sure it has been populated with
pages at least once.
While discussing cgroups we noticed a problem where you could export
a BO to a dma-buf without having it ever being backed or accounted for.
This meant in low memory situations or eventually with cgroups, a
lower privledged process might cause the compositor to try and allocate
a lot of memory on it's behalf and this could fail. At least make
sure the exporter has managed to allocate the RAM at least once
before exporting the object.
This only applies currently to TTM_PL_SYSTEM objects, because
GTT objects get populated on first validate, and VRAM doesn't
use TT.
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://lore.kernel.org/r/20250904021643.2050497-4-airlied@gmail.com
Before exporting a buffer, make sure it has been populated with
pages at least once.
While discussing cgroups we noticed a problem where you could export
a BO to a dma-buf without having it ever being backed or accounted for.
This meant in low memory situations or eventually with cgroups, a
lower privledged process might cause the compositor to try and allocate
a lot of memory on it's behalf and this could fail. At least make
sure the exporter has managed to allocate the RAM at least once
before exporting the object.
This only applies currently to TTM_PL_SYSTEM objects, because
GTT objects get populated on first validate, and VRAM doesn't
use TT.
Acked-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://lore.kernel.org/r/20250904021643.2050497-3-airlied@gmail.com
Before exporting a buffer, make sure it has been populated with
pages at least once.
While discussing cgroups we noticed a problem where you could export
a BO to a dma-buf without having it ever being backed or accounted for.
This meant in low memory situations or eventually with cgroups, a
lower privledged process might cause the compositor to try and allocate
a lot of memory on it's behalf and this could fail. At least make
sure the exporter has managed to allocate the RAM at least once
before exporting the object.
This only applies currently to TTM_PL_SYSTEM objects, because
GTT objects get populated on first validate, and VRAM doesn't
use TT.
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://lore.kernel.org/r/20250904021643.2050497-2-airlied@gmail.com
While discussing cgroups we noticed a problem where you could export
a BO to a dma-buf without having it ever being backed or accounted for.
This meant in low memory situations or eventually with cgroups, a
lower privledged process might cause the compositor to try and allocate
a lot of memory on it's behalf and this could fail. At least make
sure the exporter has managed to allocate the RAM at least once
before exporting the object.
This only applies currently to TTM_PL_SYSTEM objects, because
GTT objects get populated on first validate, and VRAM doesn't
use TT.
Reviewed-by: Christian Koenig <christian.koenig@amd.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://lore.kernel.org/r/20250904021643.2050497-1-airlied@gmail.com
On systems with multiple GPUs there can be uncertainty which GPU is the
primary one used to drive the display at bootup. In some desktop
environments this can lead to increased power consumption because
secondary GPUs may be used for rendering and never go to a low power
state. In order to disambiguate this add a new sysfs attribute
'boot_display' that uses the output of video_is_primary_device() to
populate whether the PCI device was used for driving the display.
Suggested-by: Manivannan Sadhasivam <mani@kernel.org>
Acked-by: Manivannan Sadhasivam <mani@kernel.org>
Link: https://gitlab.freedesktop.org/xorg/lib/libpciaccess/-/issues/23
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://lore.kernel.org/r/20250811162606.587759-5-superm1@kernel.org
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
There is no reason to require this to happen on first submitted IB only.
We need to wait for the queue to be idle, but it can be done at any
time (including when there are multiple video sessions active).
Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 8908fdce06)
Cc: stable@vger.kernel.org
There can be multiple engine info packages in one IB and the first one
may be common engine, not decode/encode.
We need to parse the entire IB instead of stopping after finding first
engine info.
Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit dc8f9f0f45)
Cc: stable@vger.kernel.org