If the clock mhdp->clk was not enabled in cdns_mhdp_probe(), it should not
be disabled in any path.
The return value of clk_prepare_enable() is not checked. If mhdp->clk was
not enabled, it may be disabled in the error path of cdns_mhdp_probe()
(e.g., if cdns_mhdp_load_firmware() fails) or in cdns_mhdp_remove() after
a successful cdns_mhdp_probe() call.
Use the devm_clk_get_enabled() helper function to ensure proper call
balance for mhdp->clk.
Found by Linux Verification Center (linuxtesting.org) with Klever.
Fixes: fb43aa0acd ("drm: bridge: Add support for Cadence MHDP8546 DPI/DP bridge")
Signed-off-by: Vitalii Mordan <mordan@ispras.ru>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Signed-off-by: Robert Foss <rfoss@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250214154632.1907425-1-mordan@ispras.ru
It is required by firmware to wait up to 2 seconds for pending commands
before sending the destroy hardware context command. After 2 seconds
wait, if there are still pending commands, driver needs to cancel them.
So the context destroy steps need to be:
1. Stop drm scheduler. (drm_sched_entity_destroy)
2. Wait up to 2 seconds for pending commands.
3. Destroy hardware context and cancel the rest pending requests.
4. Wait all jobs associated with the hwctx are freed.
5. Free job resources.
Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
Reviewed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250124173536.148676-1-lizhi.hou@amd.com
Currently, DRM atomic uAPI allows only primary planes to be flipped
asynchronously. However, each driver might be able to perform async
flips in other different plane types. To enable drivers to set their own
restrictions on which type of plane they can or cannot flip, use the
existing atomic_async_check() from struct drm_plane_helper_funcs to
enhance this flexibility, thus allowing different plane types to be able
to do async flips as well.
Create a new parameter for the atomic_async_check(), `bool flip`. This
parameter is used to distinguish when this function is being called from
a plane update from a full page flip.
In order to prevent regressions and such, we keep the current policy: we
skip the driver check for the primary plane, because it is always
allowed to do async flips on it.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Reviewed-by: Christopher Snowhill <chris@kode54.net>
Tested-by: Christopher Snowhill <chris@kode54.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20250127-tonyk-async_flip-v12-1-0f7f8a8610d3@igalia.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
This was previously attempted as xe specific reset uevent but dropped
in commit 77a0d4d1ce ("drm/xe/uapi: Remove reset uevent for now")
as part of refactoring.
Now that we have device wedged event provided by DRM core, make use
of it and support both driver rebind and bus-reset based recovery.
With this in place userspace will be notified of wedged device, on
the basis of which, userspace may take respective action to recover
the device.
$ udevadm monitor --property --kernel
monitor will print the received events for:
KERNEL - the kernel uevent
KERNEL[265.802982] change /devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0 (drm)
ACTION=change
DEVPATH=/devices/pci0000:00/0000:00:01.0/0000:01:00.0/0000:02:01.0/0000:03:00.0/drm/card0
SUBSYSTEM=drm
WEDGED=rebind,bus-reset
DEVNAME=/dev/dri/card0
DEVTYPE=drm_minor
SEQNUM=5208
MAJOR=226
MINOR=0
v2: Change authorship to Himal (Aravind)
Add uevent for all device wedged cases (Aravind)
v3: Generic implementation in DRM subsystem (Lucas)
v4: Change authorship to Raag (Aravind)
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250204070528.1919158-4-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Introduce device wedged event, which notifies userspace of 'wedged'
(hanged/unusable) state of the DRM device through a uevent. This is
useful especially in cases where the device is no longer operating as
expected and has become unrecoverable from driver context. Purpose of
this implementation is to provide drivers a generic way to recover the
device with the help of userspace intervention without taking any drastic
measures (like resetting or re-enumerating the full bus, on which the
underlying physical device is sitting) in the driver.
A 'wedged' device is basically a device that is declared dead by the
driver after exhausting all possible attempts to recover it from driver
context. The uevent is the notification that is sent to userspace along
with a hint about what could possibly be attempted to recover the device
from userspace and bring it back to usable state. Different drivers may
have different ideas of a 'wedged' device depending on hardware
implementation of the underlying physical device, and hence the vendor
agnostic nature of the event. It is up to the drivers to decide when they
see the need for device recovery and how they want to recover from the
available methods.
Driver prerequisites
--------------------
The driver, before opting for recovery, needs to make sure that the
'wedged' device doesn't harm the system as a whole by taking care of the
prerequisites. Necessary actions must include disabling DMA to system
memory as well as any communication channels with other devices. Further,
the driver must ensure that all dma_fences are signalled and any device
state that the core kernel might depend on is cleaned up. All existing
mmaps should be invalidated and page faults should be redirected to a
dummy page. Once the event is sent, the device must be kept in 'wedged'
state until the recovery is performed. New accesses to the device
(IOCTLs) should be rejected, preferably with an error code that resembles
the type of failure the device has encountered. This will signify the
reason for wedging, which can be reported to the application if needed.
Recovery
--------
Current implementation defines three recovery methods, out of which,
drivers can use any one, multiple or none. Method(s) of choice will be
sent in the uevent environment as ``WEDGED=<method1>[,..,<methodN>]`` in
order of less to more side-effects. If driver is unsure about recovery
or method is unknown (like soft/hard system reboot, firmware flashing,
physical device replacement or any other procedure which can't be
attempted on the fly), ``WEDGED=unknown`` will be sent instead.
Userspace consumers can parse this event and attempt recovery as per the
following expectations.
=============== ========================================
Recovery method Consumer expectations
=============== ========================================
none optional telemetry collection
rebind unbind + bind driver
bus-reset unbind + bus reset/re-enumeration + bind
unknown consumer policy
=============== ========================================
The only exception to this is ``WEDGED=none``, which signifies that the
device was temporarily 'wedged' at some point but was recovered from driver
context using device specific methods like reset. No explicit recovery is
expected from the consumer in this case, but it can still take additional
steps like gathering telemetry information (devcoredump, syslog). This is
useful because the first hang is usually the most critical one which can
result in consequential hangs or complete wedging.
Consumer prerequisites
----------------------
It is the responsibility of the consumer to make sure that the device or
its resources are not in use by any process before attempting recovery.
With IOCTLs erroring out, all device memory should be unmapped and file
descriptors should be closed to prevent leaks or undefined behaviour. The
idea here is to clear the device of all user context beforehand and set
the stage for a clean recovery.
Example
-------
Udev rule::
SUBSYSTEM=="drm", ENV{WEDGED}=="rebind", DEVPATH=="*/drm/card[0-9]",
RUN+="/path/to/rebind.sh $env{DEVPATH}"
Recovery script::
#!/bin/sh
DEVPATH=$(readlink -f /sys/$1/device)
DEVICE=$(basename $DEVPATH)
DRIVER=$(readlink -f $DEVPATH/driver)
echo -n $DEVICE > $DRIVER/unbind
echo -n $DEVICE > $DRIVER/bind
Customization
-------------
Although basic recovery is possible with a simple script, consumers can
define custom policies around recovery. For example, if the driver supports
multiple recovery methods, consumers can opt for the suitable one depending
on scenarios like repeat offences or vendor specific failures. Consumers
can also choose to have the device available for debugging or telemetry
collection and base their recovery decision on the findings. This is useful
especially when the driver is unsure about recovery or method is unknown.
v4: s/drm_dev_wedged/drm_dev_wedged_event
Use drm_info() (Jani)
Kernel doc adjustment (Aravind)
v5: Send recovery method with uevent (Lina)
v6: Access wedge_recovery_opts[] using helper function (Jani)
Use snprintf() (Jani)
v7: Convert recovery helpers into regular functions (Andy, Jani)
Aesthetic adjustments (Andy)
Handle invalid recovery method
v8: Allow sending multiple methods with uevent (Lucas, Michal)
static_assert() globally (Andy)
v9: Provide 'none' method for device reset (Christian)
Provide recovery opts using switch cases
v11: Log device reset (André)
Signed-off-by: Raag Jadav <raag.jadav@intel.com>
Reviewed-by: André Almeida <andrealmeid@igalia.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250204070528.1919158-2-raag.jadav@intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
In some cases observed during ESD tests, the TI SN65DSI83 cannot recover
from errors by itself. A full restart of the bridge is needed in those
cases to have the bridge output LVDS signals again.
Also, during tests, cases were observed where reading the status of the
bridge was not even possible. Indeed, in those cases, the bridge stops
to acknowledge I2C transactions. Only a full reset of the bridge (power
off/on) brings back the bridge to a functional state.
The TI SN65DSI83 has some error detection capabilities. Introduce an
error recovery mechanism based on this detection.
The errors detected are signaled through an interrupt. On system where
this interrupt is not available, the driver uses a polling monitoring
fallback to check for errors. When an error is present or when reading
the bridge status leads to an I2C failure, the recovery process is
launched.
Restarting the bridge needs to redo the initialization sequence. This
initialization sequence has to be done with the DSI data lanes driven in
LP11 state. In order to do that, the recovery process resets the whole
output path (i.e the path from the encoder to the connector) where the
bridge is located.
Signed-off-by: Herve Codina <herve.codina@bootlin.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250210132620.42263-5-herve.codina@bootlin.com
Signed-off-by: Maxime Ripard <mripard@kernel.org>
Convert most mutex_(un)lock calls to use (scoped_)guard instead. This
generally reduces line count and prevents bugs like forgetting to unlock
the mutex. I've left traditional calls in a few places where scoped
helpers would be more verbose. This mostly happens where
debugfs_file_put needs to be called regardless. I looked into defining a
CLASS for debugfs_file, but it seems like more effort than it's worth
since debugfs_file_get can fail.
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250207162528.1651426-3-sean.anderson@linux.dev
The Lenovo Thinkpad T14s Gen6 Snapdragon is currently sold with three
different panel versions: OLED, Low Power IPS or IPS with Touchscreen.
My Low Power IPS version had this panel and the kernel complained
about not knowing any details. I don't have any panel documentation,
but as far as I can see the same timings for the already supported
CSO panel also work for this one.
The raw EDID is:
00 ff ff ff ff ff ff 00 0e 6f 13 14 00 00 00 00
00 1e 01 04 a5 1e 13 78 03 82 53 a4 55 4d 9b 24
0d 51 55 00 00 00 01 01 01 01 01 01 01 01 01 01
01 01 01 01 01 01 35 3c 80 a0 70 b0 23 40 30 20
36 00 2e bd 10 00 00 18 00 00 00 fd 00 30 3c 4a
4a 0f 01 0a 20 20 20 20 20 20 00 00 00 fe 00 43
53 4f 54 20 54 33 0a 20 20 20 20 20 00 00 00 fe
00 4d 4e 45 30 30 37 4a 41 31 2d 32 0a 20 00 8b
Signed-off-by: Sebastian Reichel <sre@kernel.org>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20250211014314.94429-1-sre@ring0.de
Look up the mode index for the astdp transmitter ship in the encoder's
atomic check and report an error if the display mode is not supported.
The lookup uses the DRM display mode instead of the driver's internal
VBIOS mode. Both are equivalent. The modesetting code later reads
the calculated index from the connector state to avoid recalculating it.
v2:
- fix typo in commit message (Jocelyn)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250204133209.403327-4-tzimmermann@suse.de