Ideally the busyness worker should take a gt pm wakeref because the
worker only needs to be active while gt is awake. However, the gt_park
path cancels the worker synchronously and this complicates the flow if
the worker is also running at the same time. The cancel waits for the
worker and when the worker releases the wakeref, that would call gt_park
and would lead to a deadlock.
The resolution is to take the global pm wakeref if runtime pm is already
active. If not, we don't need to update the busyness stats as the stats
would already be updated when the gt was parked.
Note:
- We do not requeue the worker if we cannot take a reference to runtime
pm since intel_guc_busyness_unpark would requeue the worker in the
resume path.
- If the gt was parked longer than time taken for GT timestamp to roll
over, we ignore those rollovers since we don't care about tracking the
exact GT time. We only care about roll overs when the gt is active and
running workloads.
- There is a window of time between gt_park and runtime suspend, where
the worker may run. This is acceptable since the worker will not find
any new data to update busyness.
v2: (Daniele)
- Edit commit message and code comment
- Use runtime pm in the worker
- Put runtime pm after enabling the worker
- Use Link tag and add Fixes tag
v3: (Daniele)
- Reword commit and comments and add details
Link: https://gitlab.freedesktop.org/drm/intel/-/issues/7077
Fixes: 77cdd054dd ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230925192117.2497058-1-umesh.nerlige.ramappa@intel.com
(cherry picked from commit e2f99b79d4)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
On eDP we can receive invalid modes from dm_update_crtc_state() for
entirely new streams for which drm_mode_set_crtcinfo() shouldn't be
called on. So, instead of calling drm_mode_set_crtcinfo() from within
create_stream_for_sink() we can instead call it from
amdgpu_dm_connector_mode_valid(). Since, we are guaranteed to only call
drm_mode_set_crtcinfo() for valid modes from that function (invalid
modes are rejected by that callback) and that is the only user
of create_validate_stream_for_sink() that we need to call
drm_mode_set_crtcinfo() for (as before commit cb841d27b8
("drm/amd/display: Always pass connector_state to stream validation"),
that is the only place where create_validate_stream_for_sink()'s
dm_state was NULL).
Cc: stable@vger.kernel.org
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2693
Fixes: cb841d27b8 ("drm/amd/display: Always pass connector_state to stream validation")
Tested-by: Mark Broadworth <mark.broadworth@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch fixes a memory leak in the amdgpu_ras_feature_enable() function.
The leak occurs when the function sends a command to the firmware to enable
or disable a RAS feature for a GFX block. If the command fails, the kfree()
function is not called to free the info memory.
Fixes: 9f051d6ff1 ("drm/amdgpu: Free ras cmd input buffer properly")
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Cong Liu <liucong2@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 7748ce5b69.
vbios_version sysfs node is used to identify Part Number also. Revert to
the same so that it doesn't break scripts/software which parse this.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The removed line prevents the following cleanup function
to execute a dma_fence_put on the out_fence to free its
memory, producing the following output in kmemleak:
unreferenced object 0xffff888126d8ee00 (size 128):
comm "kwin_wayland", pid 981, jiffies 4295380296 (age 390.060s)
hex dump (first 32 bytes):
c8 a1 c2 27 81 88 ff ff e0 14 a9 c0 ff ff ff ff ...'............
30 1a e1 2e a6 00 00 00 28 fc 5b 17 81 88 ff ff 0.......(.[.....
backtrace:
[<0000000011655661>] kmalloc_trace+0x26/0xa0
[<0000000055f15b82>] virtio_gpu_fence_alloc+0x47/0xc0 [virtio_gpu]
[<00000000fa6d96f9>] virtio_gpu_execbuffer_ioctl+0x1a8/0x800 [virtio_gpu]
[<00000000e6cb5105>] drm_ioctl_kernel+0x169/0x240 [drm]
[<000000005ad33e27>] drm_ioctl+0x399/0x6b0 [drm]
[<00000000a19dbf65>] __x64_sys_ioctl+0xc5/0x100
[<0000000011fa801e>] do_syscall_64+0x5b/0xc0
[<0000000065c76d8a>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
unreferenced object 0xffff888121930500 (size 128):
comm "kwin_wayland", pid 981, jiffies 4295380313 (age 390.096s)
hex dump (first 32 bytes):
c8 a1 c2 27 81 88 ff ff e0 14 a9 c0 ff ff ff ff ...'............
f9 ec d7 2f a6 00 00 00 28 fc 5b 17 81 88 ff ff .../....(.[.....
backtrace:
[<0000000011655661>] kmalloc_trace+0x26/0xa0
[<0000000055f15b82>] virtio_gpu_fence_alloc+0x47/0xc0 [virtio_gpu]
[<00000000fa6d96f9>] virtio_gpu_execbuffer_ioctl+0x1a8/0x800 [virtio_gpu]
[<00000000e6cb5105>] drm_ioctl_kernel+0x169/0x240 [drm]
[<000000005ad33e27>] drm_ioctl+0x399/0x6b0 [drm]
[<00000000a19dbf65>] __x64_sys_ioctl+0xc5/0x100
[<0000000011fa801e>] do_syscall_64+0x5b/0xc0
[<0000000065c76d8a>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[...]
This memleak will grow quickly, being possible to see the
following line in dmesg after few minutes of life in the
virtual machine:
[ 706.217388] kmemleak: 10731 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
The patch will remove the line to allow the cleanup
function do its job.
Signed-off-by: José Pekkarinen <jose.pekkarinen@foxhound.fi>
Fixes: e4812ab8e6 ("drm/virtio: Refactor and optimize job submission code path")
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230912060824.5210-1-jose.pekkarinen@foxhound.fi
As a result of the recent Kconfig reworks, the default settings for the
framebuffer interfaces changed in unexpected ways:
Configurations that leave CONFIG_FB disabled but use DRM now get
DRM_FBDEV_EMULATION by default. This also turns on the deprecated /dev/fb
device nodes for machines that don't actually want it.
In turn, configurations that previously had DRM_FBDEV_EMULATION enabled
now only get the /dev/fb front-end but not the more useful framebuffer
console, which is not selected any more.
We had previously decided that any combination of the three frontends
(FB_DEVICE, FRAMEBUFFER_CONSOLE and LOGO) should be selectable, but the
new default settings mean that a lot of defconfig files would have to
get adapted.
Change the defaults back to what they were in Linux 6.5:
- Leave DRM_FBDEV_EMULATION turned off unless CONFIG_FB
is enabled. Previously this was a hard dependency but now the two are
independent. However, configurations that enable CONFIG_FB probably
also want to keep the emulation for DRM, while those without FB
presumably did that intentionally in the past.
- Leave FB_DEVICE turned off for FB=n. Following the same
logic, the deprecated option should not automatically get enabled
here, most users that had FB turned off in the past do not want it,
even if they want the console
- Turn the FRAMEBUFFER_CONSOLE option on if
DRM_FBDEV_EMULATION is set to avoid having to change defconfig
files that relied on it being selected unconditionally in the past.
This also makes sense since both LOGO and FB_DEVICE are now disabled
by default for builds without CONFIG_FB, but DRM_FBDEV_EMULATION
would make no sense if all three are disabled.
Fixes: a5ae331edb ("drm: Drop select FRAMEBUFFER_CONSOLE for DRM_FBDEV_EMULATION")
Fixes: 701d2054fa ("fbdev: Make support for userspace interfaces configurable")
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230911205338.2385278-1-arnd@kernel.org
Apparently Acer Chromebook C740 (BDW-ULT) doesn't have the
eDP HPD line properly connected, and thus fails the new
HPD check during eDP probe. The result is that we lose the
eDP output.
I suspect all such machines would be Chromebooks or other
Linux exclusive systems as the Windows driver likely wouldn't
work either. I did check a few other BDW machines here and
those do have eDP HPD connected, one of them even is a
different Chromebook (Samus).
To account for these funky machines let's skip the HPD check when
it looks like the eDP port is the only one using that specific AUX
channel. In case of multiple ports sharing the same AUX CH (eg. on
Asrock B250M-HDV) we still do the check and thus should correctly
ignore the eDP port in favor of the other DP port (usually a DP->VGA
converter).
v2: Don't oops during list iteration
Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/9264
Fixes: cfe5bdfb27 ("drm/i915: Check HPD live state during eDP probe")
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230908052527.685-1-ville.syrjala@linux.intel.com
Reviewed-by: Luca Coelho <luciano.coelho@intel.com>
(cherry picked from commit 70052100fa)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
On some APU systems, there is no atom context and so the
atom_context struct is null.
Add a check to the VBIOS_INFO branch of amdgpu_info_ioctl
to handle this case, returning all zeroes.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
In drm_dp_mst_topology_mgr_resume() today, it will resume the
mst branch to be ready handling mst mode and also consecutively do
the mst topology probing. Which will cause the dirver have chance
to fire hotplug event before restoring the old state. Then Userspace
will react to the hotplug event based on a wrong state.
[How]
Adjust the mst resume flow as:
1. set dpcd to resume mst branch status
2. restore source old state
3. Do mst resume topology probing
For drm_dp_mst_topology_mgr_resume(), it's better to adjust it to
pull out topology probing work into a 2nd part procedure of the mst
resume. Will have a follow up patch in drm.
Reviewed-by: Chao-kai Wang <stylon.wang@amd.com>
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Acked-by: Stylon Wang <stylon.wang@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Currently the driver looks DCN registers to access if BL is on or not.
This check is not valid if we are using AUX based brightness control.
This causes driver to not send out "backlight off" command during power off
sequence as it already thinks it is off.
[How]
Only check DCN registers if we aren't using AUX based brightness control.
Reviewed-by: Wenjing Liu <wenjing.liu@amd.com>
Acked-by: Stylon Wang <stylon.wang@amd.com>
Signed-off-by: Swapnil Patel <swapnil.patel@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This loop will exit with "retry" set to -1 if it fails but the code
checks for if "retry" is zero. Fix this by changing post-op to a
pre-op. --retry vs retry--.
Fixes: e01eeffc3f ("drm/amd/pm: avoid driver getting empty metrics table for the first time")
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The CU mask passed from user-space will change based on
different spatial partitioning mode. As a result, update
CU masking code for GFX9.4.3 to work for all partitioning
modes.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update cache info reporting in sysfs to report the correct
number of CUs and associated cache information based on
different spatial partitioning modes.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Currently, we store CU info only for a single XCC assuming
that it is the same for all XCCs. However, that may not be
true. As a result, store CU info for all XCCs. This info is
later used for CU masking.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This patch fixes the case where the code currently passes
absolute register address and not the reg offset, which HWS
expects, when sending the PM4 packet to set/update CWSR grace
period. Additionally, cleanup the signature of
build_grace_period_packet_info function as it no longer needs
the inst parameter.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pull drm ci scripts from Dave Airlie:
"This is a bunch of ci integration for the freedesktop gitlab instance
where we currently do upstream userspace testing on diverse sets of
GPU hardware. From my perspective I think it's an experiment worth
going with and seeing how the benefits/noise playout keeping these
files useful.
Ideally I'd like to get this so we can do pre-merge testing on PRs
eventually.
Below is some info from danvet on why we've ended up making the
decision and how we can roll it back if we decide it was a bad plan.
Why in upstream?
- like documentation, testcases, tools CI integration is one of these
things where you can waste endless amounts of time if you
accidentally have a version that doesn't match your source code
- but also like the above, there's a balance, this is the initial cut
of what we think makes sense to keep in sync vs out-of-tree,
probably needs adjustment
- gitlab supports out-of-repo gitlab integration and that's what's
been used for the kernel in drm, but it results in per-driver
fragmentation and lots of duplicated effort. the simple act of
smashing an arbitrary winner into a topic branch already started
surfacing patches on dri-devel and sparking good cross driver team
discussions
Why gitlab?
- it's not any more shit than any of the other CI
- drm userspace uses it extensively for everything in userspace, we
have a lot of people and experience with this, including
integration of hw testing labs
- media userspace like gstreamer is also on gitlab.fd.o, and there's
discussion to extend this to the media subsystem in some fashion
Can this be shared?
- there's definitely a pile of code that could move to scripts/ if
other subsystem adopt ci integration in upstream kernel git. other
bits are more drm/gpu specific like the igt-gpu-tests/tools
integration
- docker images can be run locally or in other CI runners
Will we regret this?
- it's all in one directory, intentionally, for easy deletion
- probably 1-2 years in upstream to see whether this is worth it or a
Big Mistake. that's roughly what it took to _really_ roll out solid
CI in the bigger userspace projects we have on gitlab.fd.o like
mesa3d"
* tag 'topic/drm-ci-2023-08-31-1' of git://anongit.freedesktop.org/drm/drm:
drm: ci: docs: fix build warning - add missing escape
drm: Add initial ci/ subdirectory