If another driver for a VGA compatible GPU (that is passthrough'd)
locks the VGA resources (by calling vga_get()), then virtio_gpu
driver would encounter the following errors and fail to load during
probe and initialization:
Invalid read at addr 0x7200005014, size 1, region '(null)', reason: rejected
Invalid write at addr 0x7200005014, size 1, region '(null)', reason: rejected
virtio_gpu virtio0: virtio: device uses modern interface but does not have VIRTIO_F_VERSION_1
virtio_gpu virtio0: probe with driver virtio_gpu failed with error -22
This issue is only seen if virtio-gpu and the other GPU are on
different PCI buses, which can happen if the user includes an
additional PCIe port and associates the VGA compatible GPU with
it while launching Qemu:
qemu-system-x86_64...
-device virtio-vga,max_outputs=1,xres=1920,yres=1080,blob=true
-device pcie-root-port,id=pcie.1,bus=pcie.0,addr=1c.0,slot=1,chassis=1,multifunction=on
-device vfio-pci,host=03:00.0,bus=pcie.1,addr=00.0 ...
In the above example, the device 03:00.0 is an Intel DG2 card and
this issue is seen when both i915 driver and virtio_gpu driver are
loading (or initializing) concurrently or when i915 is loaded first.
Note that during initalization, i915 driver does the following in
intel_vga_reset_io_mem():
vga_get_uninterruptible(pdev, VGA_RSRC_LEGACY_IO);
outb(inb(VGA_MIS_R), VGA_MIS_W);
vga_put(pdev, VGA_RSRC_LEGACY_IO);
Although, virtio-gpu might own the VGA resources initially, the
above call (in i915) to vga_get_uninterruptible() would result in
these resources being taken away, which means that virtio-gpu would
not be able to decode VGA anymore. This happens in __vga_tryget()
when it calls
pci_set_vga_state(conflict->pdev, false, pci_bits, flags);
where
pci_bits = PCI_COMMAND_MEMORY | PCI_COMMAND_IO
flags = PCI_VGA_STATE_CHANGE_DECODES | PCI_VGA_STATE_CHANGE_BRIDGE
Therefore, to solve this issue, virtio-gpu driver needs to call
vga_get() whenever it needs to reclaim and access VGA resources,
which is during initial probe and setup. After that, a call to
vga_put() would release the lock to allow other VGA compatible
devices to access these shared VGA resources.
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Gurchetan Singh <gurchetansingh@chromium.org>
Cc: Chia-I Wu <olvaffe@gmail.com>
Reported-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20241211064343.550153-1-vivek.kasireddy@intel.com
Tested-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Fix deadlock in job submission and abort handling.
When a thread aborts currently executing jobs due to a fault,
it first locks the global lock protecting submitted_jobs (#1).
After the last job is destroyed, it proceeds to release the related context
and locks file_priv (#2). Meanwhile, in the job submission thread,
the file_priv lock (#2) is taken first, and then the submitted_jobs
lock (#1) is obtained when a job is added to the submitted jobs list.
CPU0 CPU1
---- ----
(for example due to a fault) (jobs submissions keep coming)
lock(&vdev->submitted_jobs_lock) #1
ivpu_jobs_abort_all()
job_destroy()
lock(&file_priv->lock) #2
lock(&vdev->submitted_jobs_lock) #1
file_priv_release()
lock(&vdev->context_list_lock)
lock(&file_priv->lock) #2
This order of locking causes a deadlock. To resolve this issue,
change the order of locking in ivpu_job_submit().
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-12-maciej.falkowski@linux.intel.com
To prevent looping infinitely in MMU event handler we stop
generating new events by removing 'R' (record) bit from context
descriptor, but to ensure this change has effect KMD has to perform
configuration invalidation followed by sync command.
Because of that move parts of the interrupt handler that can take longer
to a thread not to block in interrupt handler for too long.
This includes:
* disabling event queue for the time KMD updates MMU event queue consumer
to ensure proper synchronization between MMU and KMD
* removal of 'R' (record) bit from context descriptor to ensure no more
faults are recorded until that context is destroyed
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-8-maciej.falkowski@linux.intel.com
Stop dumping consecutive faults from an already faulty context immediately,
instead of waiting for the context abort thread handler (IRQ handler bottom
half) to abort currently executing jobs.
Remove 'R' (record events) bit from context descriptor of a faulty
context to prevent future faults generation.
This change speeds up the IRQ handler by eliminating the need to print the
fault content repeatedly. Additionally, it prevents flooding dmesg with
errors, which was occurring due to the delay in the bottom half of the
handler stopping fault-generating jobs.
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-7-maciej.falkowski@linux.intel.com
With hardware scheduler it is not expected to receive JOB_DONE
notifications from NPU FW for the jobs aborted due to command queue destroy
JSM command.
Remove jobs submitted to unregistered command queue from submitted_jobs_xa
to avoid triggering a TDR in such case.
Add explicit submitted_jobs_lock that protects access to list of submitted
jobs which is now used to find jobs to abort.
Move context abort procedure to separate work queue not to slow down
handling of IPCs or DCT requests in case where job abort takes longer,
especially when destruction of the last job of a specific context results
in context release.
Signed-off-by: Karol Wachowski <karol.wachowski@intel.com>
Signed-off-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>
Reviewed-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Signed-off-by: Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250107173238.381120-4-maciej.falkowski@linux.intel.com
The only reason for the ssd130x-spi driver to have an spi_device_id table
is that the SPI core always reports an "spi:" MODALIAS, even when the SPI
device has been registered via a Device Tree Blob.
Without spi_device_id table information in the module's metadata, module
autoloading would not work because there won't be an alias that matches
the MODALIAS reported by the SPI core.
This spi_device_id table is not needed for device matching though, since
the of_device_id table is always used in this case. For this reason, the
struct spi_driver .id_table member is currently not set in the SPI driver.
Because the spi_device_id table is always required for module autoloading,
the SPI core checks during driver registration that both an of_device_id
table and a spi_device_id table are present and that they contain the same
entries for all the SPI devices.
Not setting the .id_table member in the driver then confuses the core and
leads to the following warning when the ssd130x-spi driver is registered:
[ 41.091198] SPI driver ssd130x-spi has no spi_device_id for sinowealth,sh1106
[ 41.098614] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1305
[ 41.105862] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1306
[ 41.113062] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1307
[ 41.120247] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1309
[ 41.127449] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1322
[ 41.134627] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1325
[ 41.141784] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1327
[ 41.149021] SPI driver ssd130x-spi has no spi_device_id for solomon,ssd1331
To prevent the warning, set the .id_table even though it's not necessary.
Since the check is done even for built-in drivers, drop the condition to
only define the ID table when the driver is built as a module. Finally,
rename the variable to use the "_spi_id" convention used for ID tables.
Fixes: 74373977d2 ("drm/solomon: Add SSD130x OLED displays SPI support")
Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20241231114516.2063201-1-javierm@redhat.com
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
In certain use-cases, a CRTC could switch between two encoders
and because the mode being programmed on the CRTC remains
the same during this switch, the CRTC's mode_changed remains false.
In such cases, the encoder's mode_set also gets skipped.
Skipping mode_set on the encoder for such cases could cause an issue
because even though the same CRTC mode was being used, the encoder
type could have changed like the CRTC could have switched from a
real time encoder to a writeback encoder OR vice-versa.
Allow encoder's mode_set to happen even when connectors changed on a
CRTC and not just when the mode changed.
Signed-off-by: Abhinav Kumar <quic_abhinavk@quicinc.com>
Signed-off-by: Jessica Zhang <quic_jesszhan@quicinc.com>
Reviewed-by: Maxime Ripard <mripard@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20241211-abhinavk-modeset-fix-v3-1-0de4bf3e7c32@quicinc.com
Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Each bridge instance creates up to four auxiliary devices with different
names. However, their IDs are always zero, causing duplicate filename
errors when a system has multiple bridges:
sysfs: cannot create duplicate filename '/bus/auxiliary/devices/ti_sn65dsi86.gpio.0'
Fix this by using a unique instance ID per bridge instance. The
instance ID is derived from the I2C adapter number and the bridge's I2C
address, to support multiple instances on the same bus.
Fixes: bf73537f41 ("drm/bridge: ti-sn65dsi86: Break GPIO and MIPI-to-eDP bridge into sub-drivers")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Link: https://patchwork.freedesktop.org/patch/msgid/7a68a0e3f927e26edca6040067fb653eb06efb79.1733840089.git.geert+renesas@glider.be