On some systems with Navi3x dGPU will attempt to use BACO for runtime
PM but fails to resume properly. This is because on these systems
the root port goes into D3cold which is incompatible with BACO.
This happens because in this case dGPU is connected to a bridge between
root port which causes BOCO detection logic to fail. Fix the intent of
the logic by looking at root port, not the immediate upstream bridge for
_PR3.
Cc: stable@vger.kernel.org
Suggested-by: Jun Ma <Jun.Ma2@amd.com>
Tested-by: David Perry <David.Perry@amd.com>
Fixes: b10c1c5b3a ("drm/amdgpu: add check for ACPI power resources")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
While aligning SMU11 with SMU13 implementation an assumption was made that
`dpm_context->dpm_tables.pcie_table` was populated in dpm table initialization
like in SMU13 but it isn't.
So restore some of the original logic and instead just check for
amdgpu_device_pcie_dynamic_switching_supported() to decide whether to hardcode
values; erring on the side of performance.
Cc: stable@vger.kernel.org # 6.1+
Reported-and-tested-by: Umio Yasuno <coelacanth_dream@protonmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1447#note_2101382
Fixes: e701156ccc ("drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Commit 1ec23ed712 ("drm/i915: Use uabi engines for the default engine
map") switched from using for_each_engine() to for_each_uabi_engine() to
iterate over the user engines. While this seems to be a sensible change,
it's only safe to do when the engines are actually chained using the
rb-tree structure which is not the case during early driver
initialization where it can be either a lock-less list or regular
double-linked list.
In fact, the modesetting initialization code may end up calling
default_engines() through the fb helper code while the engines list
is still llist_node-based:
i915_driver_probe() ->
intel_display_driver_probe() ->
intel_fbdev_init() ->
drm_fb_helper_init() ->
drm_client_init() ->
drm_client_open() ->
drm_file_alloc() ->
i915_driver_open() ->
i915_gem_open() ->
i915_gem_context_open() ->
i915_gem_create_context() ->
default_engines()
Using for_each_uabi_engine() in default_engines() is therefore wrong, as
it would try to interpret the llist as rb-tree, making it find no engine
at all, as the rb_left and rb_right members will still be NULL, as they
haven't been initialized yet.
To fix this type confusion register the engines earlier and at the same
time reduce the amount of code that has to deal with the intermediate
llist state.
Reported-by: sanity checks in grsecurity
Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 1ec23ed712 ("drm/i915: Use uabi engines for the default engine map")
Signed-off-by: Mathias Krause <minipli@grsecurity.net>
Cc: Jonathan Cavitt <jonathan.cavitt@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230928182019.10256-2-minipli@grsecurity.net
[tursulin: fixed commit tag typo]
(cherry picked from commit 2b562f032f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Report the maximum number of IBs that can be pushed with a single
DRM_IOCTL_NOUVEAU_EXEC through DRM_IOCTL_NOUVEAU_GETPARAM.
While the maximum number of IBs per ring might vary between chipsets,
the kernel will make sure that userspace can only push a fraction of the
maximum number of IBs per ring per job, such that we avoid a situation
where there's only a single job occupying the ring, which could
potentially lead to the ring run dry.
Using DRM_IOCTL_NOUVEAU_GETPARAM to report the maximum number of IBs
that can be pushed with a single DRM_IOCTL_NOUVEAU_EXEC implies that
all channels of a given device have the same ring size.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Lyude Paul <lyude@redhat.com>
Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231002135008.10651-3-dakr@redhat.com
Ideally the busyness worker should take a gt pm wakeref because the
worker only needs to be active while gt is awake. However, the gt_park
path cancels the worker synchronously and this complicates the flow if
the worker is also running at the same time. The cancel waits for the
worker and when the worker releases the wakeref, that would call gt_park
and would lead to a deadlock.
The resolution is to take the global pm wakeref if runtime pm is already
active. If not, we don't need to update the busyness stats as the stats
would already be updated when the gt was parked.
Note:
- We do not requeue the worker if we cannot take a reference to runtime
pm since intel_guc_busyness_unpark would requeue the worker in the
resume path.
- If the gt was parked longer than time taken for GT timestamp to roll
over, we ignore those rollovers since we don't care about tracking the
exact GT time. We only care about roll overs when the gt is active and
running workloads.
- There is a window of time between gt_park and runtime suspend, where
the worker may run. This is acceptable since the worker will not find
any new data to update busyness.
v2: (Daniele)
- Edit commit message and code comment
- Use runtime pm in the worker
- Put runtime pm after enabling the worker
- Use Link tag and add Fixes tag
v3: (Daniele)
- Reword commit and comments and add details
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/7077
Fixes: 77cdd054dd ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230925192117.2497058-1-umesh.nerlige.ramappa@intel.com
(cherry picked from commit e2f99b79d4)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
On eDP we can receive invalid modes from dm_update_crtc_state() for
entirely new streams for which drm_mode_set_crtcinfo() shouldn't be
called on. So, instead of calling drm_mode_set_crtcinfo() from within
create_stream_for_sink() we can instead call it from
amdgpu_dm_connector_mode_valid(). Since, we are guaranteed to only call
drm_mode_set_crtcinfo() for valid modes from that function (invalid
modes are rejected by that callback) and that is the only user
of create_validate_stream_for_sink() that we need to call
drm_mode_set_crtcinfo() for (as before commit cb841d27b8
("drm/amd/display: Always pass connector_state to stream validation"),
that is the only place where create_validate_stream_for_sink()'s
dm_state was NULL).
Cc: stable@vger.kernel.org
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2693
Fixes: cb841d27b8 ("drm/amd/display: Always pass connector_state to stream validation")
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch fixes a memory leak in the amdgpu_ras_feature_enable() function.
The leak occurs when the function sends a command to the firmware to enable
or disable a RAS feature for a GFX block. If the command fails, the kfree()
function is not called to free the info memory.
Fixes: 9f051d6ff1 ("drm/amdgpu: Free ras cmd input buffer properly")
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Cong Liu <liucong2@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 7748ce5b69.
vbios_version sysfs node is used to identify Part Number also. Revert to
the same so that it doesn't break scripts/software which parse this.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The removed line prevents the following cleanup function
to execute a dma_fence_put on the out_fence to free its
memory, producing the following output in kmemleak:
unreferenced object 0xffff888126d8ee00 (size 128):
comm "kwin_wayland", pid 981, jiffies 4295380296 (age 390.060s)
hex dump (first 32 bytes):
c8 a1 c2 27 81 88 ff ff e0 14 a9 c0 ff ff ff ff ...'............
30 1a e1 2e a6 00 00 00 28 fc 5b 17 81 88 ff ff 0.......(.[.....
backtrace:
[<0000000011655661>] kmalloc_trace+0x26/0xa0
[<0000000055f15b82>] virtio_gpu_fence_alloc+0x47/0xc0 [virtio_gpu]
[<00000000fa6d96f9>] virtio_gpu_execbuffer_ioctl+0x1a8/0x800 [virtio_gpu]
[<00000000e6cb5105>] drm_ioctl_kernel+0x169/0x240 [drm]
[<000000005ad33e27>] drm_ioctl+0x399/0x6b0 [drm]
[<00000000a19dbf65>] __x64_sys_ioctl+0xc5/0x100
[<0000000011fa801e>] do_syscall_64+0x5b/0xc0
[<0000000065c76d8a>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
unreferenced object 0xffff888121930500 (size 128):
comm "kwin_wayland", pid 981, jiffies 4295380313 (age 390.096s)
hex dump (first 32 bytes):
c8 a1 c2 27 81 88 ff ff e0 14 a9 c0 ff ff ff ff ...'............
f9 ec d7 2f a6 00 00 00 28 fc 5b 17 81 88 ff ff .../....(.[.....
backtrace:
[<0000000011655661>] kmalloc_trace+0x26/0xa0
[<0000000055f15b82>] virtio_gpu_fence_alloc+0x47/0xc0 [virtio_gpu]
[<00000000fa6d96f9>] virtio_gpu_execbuffer_ioctl+0x1a8/0x800 [virtio_gpu]
[<00000000e6cb5105>] drm_ioctl_kernel+0x169/0x240 [drm]
[<000000005ad33e27>] drm_ioctl+0x399/0x6b0 [drm]
[<00000000a19dbf65>] __x64_sys_ioctl+0xc5/0x100
[<0000000011fa801e>] do_syscall_64+0x5b/0xc0
[<0000000065c76d8a>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[...]
This memleak will grow quickly, being possible to see the
following line in dmesg after few minutes of life in the
virtual machine:
[ 706.217388] kmemleak: 10731 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
The patch will remove the line to allow the cleanup
function do its job.
Signed-off-by: José Pekkarinen <jose.pekkarinen@foxhound.fi>
Fixes: e4812ab8e6 ("drm/virtio: Refactor and optimize job submission code path")
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230912060824.5210-1-jose.pekkarinen@foxhound.fi
As a result of the recent Kconfig reworks, the default settings for the
framebuffer interfaces changed in unexpected ways:
Configurations that leave CONFIG_FB disabled but use DRM now get
DRM_FBDEV_EMULATION by default. This also turns on the deprecated /dev/fb
device nodes for machines that don't actually want it.
In turn, configurations that previously had DRM_FBDEV_EMULATION enabled
now only get the /dev/fb front-end but not the more useful framebuffer
console, which is not selected any more.
We had previously decided that any combination of the three frontends
(FB_DEVICE, FRAMEBUFFER_CONSOLE and LOGO) should be selectable, but the
new default settings mean that a lot of defconfig files would have to
get adapted.
Change the defaults back to what they were in Linux 6.5:
- Leave DRM_FBDEV_EMULATION turned off unless CONFIG_FB
is enabled. Previously this was a hard dependency but now the two are
independent. However, configurations that enable CONFIG_FB probably
also want to keep the emulation for DRM, while those without FB
presumably did that intentionally in the past.
- Leave FB_DEVICE turned off for FB=n. Following the same
logic, the deprecated option should not automatically get enabled
here, most users that had FB turned off in the past do not want it,
even if they want the console
- Turn the FRAMEBUFFER_CONSOLE option on if
DRM_FBDEV_EMULATION is set to avoid having to change defconfig
files that relied on it being selected unconditionally in the past.
This also makes sense since both LOGO and FB_DEVICE are now disabled
by default for builds without CONFIG_FB, but DRM_FBDEV_EMULATION
would make no sense if all three are disabled.
Fixes: a5ae331edb ("drm: Drop select FRAMEBUFFER_CONSOLE for DRM_FBDEV_EMULATION")
Fixes: 701d2054fa ("fbdev: Make support for userspace interfaces configurable")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230911205338.2385278-1-arnd@kernel.org
Apparently Acer Chromebook C740 (BDW-ULT) doesn't have the
eDP HPD line properly connected, and thus fails the new
HPD check during eDP probe. The result is that we lose the
eDP output.
I suspect all such machines would be Chromebooks or other
Linux exclusive systems as the Windows driver likely wouldn't
work either. I did check a few other BDW machines here and
those do have eDP HPD connected, one of them even is a
different Chromebook (Samus).
To account for these funky machines let's skip the HPD check when
it looks like the eDP port is the only one using that specific AUX
channel. In case of multiple ports sharing the same AUX CH (eg. on
Asrock B250M-HDV) we still do the check and thus should correctly
ignore the eDP port in favor of the other DP port (usually a DP->VGA
converter).
v2: Don't oops during list iteration
Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9264
Fixes: cfe5bdfb27 ("drm/i915: Check HPD live state during eDP probe")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230908052527.685-1-ville.syrjala@linux.intel.com
Reviewed-by: Luca Coelho <luciano.coelho@intel.com>
(cherry picked from commit 70052100fa)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
On some APU systems, there is no atom context and so the
atom_context struct is null.
Add a check to the VBIOS_INFO branch of amdgpu_info_ioctl
to handle this case, returning all zeroes.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
In drm_dp_mst_topology_mgr_resume() today, it will resume the
mst branch to be ready handling mst mode and also consecutively do
the mst topology probing. Which will cause the dirver have chance
to fire hotplug event before restoring the old state. Then Userspace
will react to the hotplug event based on a wrong state.
[How]
Adjust the mst resume flow as:
1. set dpcd to resume mst branch status
2. restore source old state
3. Do mst resume topology probing
For drm_dp_mst_topology_mgr_resume(), it's better to adjust it to
pull out topology probing work into a 2nd part procedure of the mst
resume. Will have a follow up patch in drm.
Reviewed-by: Chao-kai Wang <stylon.wang@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Acked-by: Stylon Wang <stylon.wang@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>