Commit Graph

106821 Commits

Author SHA1 Message Date
Dave Airlie
478a52707b Merge tag 'amd-drm-next-6.11-2024-07-12' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.11-2024-07-12:

amdgpu:
- RAS fixes
- SMU fixes
- GC 12 updates
- SR-IOV fixes
- IH 7 updates
- DCC fixes
- GC 11.5 fixes
- DP MST fixes
- GFX 9.4.4 fixes
- SMU 14 updates
- Documentation updates
- MAINTAINERS updates
- PSR SU fix
- Misc small fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240712171637.2581787-1-alexander.deucher@amd.com
2024-07-18 09:20:00 +10:00
Alex Deucher
e3615bd198 drm/amd/display: fix corruption with high refresh rates on DCN 3.0
This reverts commit bc87d666c0 and the
register changes from commit 6d4279cb99.

Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3412
Cc: mikhail.v.gavrilov@gmail.com
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.10.x
2024-07-17 17:41:28 -04:00
Jiri Pirko
6c85d6b653 virtio: rename virtio_find_vqs_info() to virtio_find_vqs()
Since the original virtio_find_vqs() is no longer present, rename
virtio_find_vqs_info() back to virtio_find_vqs().

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Message-Id: <20240708074814.1739223-20-jiri@resnulli.us>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-17 05:20:58 -04:00
Jiri Pirko
c95e67bac4 virtio: convert the rest virtio_find_vqs() users to virtio_find_vqs_info()
Instead of passing separate names and callbacks arrays
to virtio_find_vqs(), have one of virtual_queue_info structs and
pass it to virtio_find_vqs_info().

Suggested-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Message-Id: <20240708074814.1739223-18-jiri@resnulli.us>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2024-07-17 05:20:58 -04:00
Rodrigo Siqueira
d938ec1a12 drm/amd/display: Add simple struct doc to remove doc build warning
This commit is a part of a series that addresses the following build
warning for opp:

./drivers/gpu/drm/amd/display/dc/inc/hw/opp.h:1: warning: no structured
comments found
./drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h:1: warning: no structured
comments found

This commit fixes this issue by adding a simple kernel-doc to a struct
in the opp.h and the dpp.h files.

Cc: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16 11:45:09 -04:00
Rodrigo Siqueira
4cf300f604 drm/amd/display: Move DIO documentation to the right place
When building the kernel-doc, it complains with the below warning:

./drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio.h:1: warning: no structured comments found
./drivers/gpu/drm/amd/display/dc/link/hwss/link_hwss_dio.h:1: warning: no structured comments found

This warning was caused by the wrong use of the ':export:' and the lack
of function documentation in the file pointed under the ':internal:'.
This commit addresses those issues by relocating the overview
documentation to the correct C file, removing the ':export:' options,
and adding two simple kernel-doc to ensure that ':internal:' does not
have any warning.

Cc: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/dri-devel/20240715085918.68f5ecc9@canb.auug.org.au/
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16 11:44:49 -04:00
Li Ma
5ae8fb9712 drm/amd/swsmu: enable Pstates profile levels for SMU v14.0.4
Enables following UMD stable Pstates profile levels
of power_dpm_force_performance_level for SMU v14.0.4.

    - profile_peak
    - profile_min_mclk
    - profile_min_sclk
    - profile_standard

Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Tim Huang <tim.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16 11:44:43 -04:00
Tim Huang
2fdc99b96e drm/amd/pm: early return if disabling DPMS for GFX IP v11.5.2
This was intended to add support for GFX IP v11.5.2, but it needs
to be applied to all GFX11 and subsequent APUs. Therefore the code
should be revised to accommodate this.

Signed-off-by: Tim Huang <tim.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16 11:44:36 -04:00
YiPeng Chai
b3fb79cda5 drm/amdgpu: add mutex to protect ras shared memory
Add mutex to protect ras shared memory.

v2:
  Add TA_RAS_COMMAND__TRIGGER_ERROR command call
  status check.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16 11:44:30 -04:00
Roman Li
0e2c796b49 drm/amd/display: Add function banner for idle_workqueue
[Why]
htmldocs warning:
drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.h: warning:
Function parameter or struct member 'idle_workqueue' not described in
'amdgpu_display_manager'.

[How]
Add comment section for idle_workqueue with param description.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/dri-devel/20240715090211.736a9b4d@canb.auug.org.au/
Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16 11:44:20 -04:00
Alex Hung
6e169c7e0f drm/amd/display: Add doc entry for program_3dlut_size
Fixes the warning:

Function parameter or struct member 'program_3dlut_size' not described in
'mpc_funcs'

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/dri-devel/20240715090445.7e9387ec@canb.auug.org.au/
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16 11:44:10 -04:00
Boyuan Zhang
7d75ef3736 drm/amdgpu/vcn: not pause dpg for unified queue
For unified queue, DPG pause for encoding is done inside VCN firmware,
so there is no need to pause dpg based on ring type in kernel.

For VCN3 and below, pausing DPG for encoding in kernel is still needed.

v2: add more comments
v3: update commit message

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16 11:44:04 -04:00
Boyuan Zhang
ecfa23c8df drm/amdgpu/vcn: identify unified queue in sw init
Determine whether VCN using unified queue in sw_init, instead of calling
functions later on.

v2: fix coding style

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ruijing Dong <ruijing.dong@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-16 11:43:48 -04:00
Danilo Krummrich
eeb1f825b5 drm/gpuvm: fix missing dependency to DRM_EXEC
In commit 50c1a36f59 ("drm/gpuvm: track/lock/validate external/evicted
objects") we started using drm_exec, but did not select DRM_EXEC in the
Kconfig for DRM_GPUVM, fix this.

Cc: Christian König <christian.koenig@amd.com>
Cc: Boris Brezillon <boris.brezillon@collabora.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 50c1a36f59 ("drm/gpuvm: track/lock/validate external/evicted objects")
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240715135158.133287-1-dakr@redhat.com
2024-07-16 16:17:01 +02:00
Imre Deak
509580fad7 drm/i915/dp: Don't switch the LTTPR mode on an active link
Switching to transparent mode leads to a loss of link synchronization,
so prevent doing this on an active link. This happened at least on an
Intel N100 system / DELL UD22 dock, the LTTPR residing either on the
host or the dock. To fix the issue, keep the current mode on an active
link, adjusting the LTTPR count accordingly (resetting it to 0 in
transparent mode).

v2: Adjust code comment during link training about reiniting the LTTPRs.
   (Ville)

Fixes: 7b2a4ab8b0 ("drm/i915: Switch to LTTPR transparent mode link training")
Reported-and-tested-by: Gareth Yu <gareth.yu@intel.com>
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/10902
Cc: <stable@vger.kernel.org> # v5.15+
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708190029.271247-3-imre.deak@intel.com
(cherry picked from commit 211ad49cf8)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2024-07-16 08:14:29 +00:00
Imre Deak
d13e2a6e95 drm/i915/dp: Reset intel_dp->link_trained before retraining the link
Regularly retraining a link during an atomic commit happens with the
given pipe/link already disabled and hence intel_dp->link_trained being
false. Ensure this also for retraining a DP SST link via direct calls to
the link training functions (vs. an actual commit as for DP MST). So far
nothing depended on this, however the next patch will depend on
link_trained==false for changing the LTTPR mode to non-transparent.

Cc: <stable@vger.kernel.org> # v5.15+
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Ankit Nautiyal <ankit.k.nautiyal@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708190029.271247-2-imre.deak@intel.com
(cherry picked from commit a4d5ce6176)
Signed-off-by: Tvrtko Ursulin <tursulin@ursulin.net>
2024-07-16 08:14:26 +00:00
Linus Torvalds
f998678baf Merge tag 'x86_vmware_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 vmware updates from Borislav Petkov:

 - Add a unified VMware hypercall API layer which should be used by all
   callers instead of them doing homegrown solutions. This will provide
   for adding API support for confidential computing solutions like TDX

* tag 'x86_vmware_for_v6.11_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/vmware: Add TDX hypercall support
  x86/vmware: Remove legacy VMWARE_HYPERCALL* macros
  x86/vmware: Correct macro names
  x86/vmware: Use VMware hypercall API
  drm/vmwgfx: Use VMware hypercall API
  input/vmmouse: Use VMware hypercall API
  ptp/vmware: Use VMware hypercall API
  x86/vmware: Introduce VMware hypercall API
2024-07-15 20:05:40 -07:00
Aurabindo Pillai
7bbae44cf1 drm/amd/display: fix doc entry for bb_from_dmub
Fixes the warning:

Function parameter or struct member 'bb_from_dmub' not described in 'amdgpu_display_manager'

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-15 17:41:23 -04:00
Aurabindo Pillai
28814be882 drm/amd: Bump KMS_DRIVER_MINOR version
Increase the KMS minor version to indicate GFX12 DCC support since this
contains a major change in how DCC is managed across IPs like GFX, DCN
etc. This will be used mainly by userspace like Mesa to figure out
DCC support on GFX12 hardware.

v2: fix version number (Alex)

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-15 17:41:23 -04:00
Maíra Canal
1fe1c66274 drm/v3d: Fix Indirect Dispatch configuration for V3D 7.1.6 and later
`args->cfg[4]` is configured in Indirect Dispatch using the number of
batches. Currently, for all V3D tech versions, `args->cfg[4]` equals the
number of batches subtracted by 1. But, for V3D 7.1.6 and later, we must not
subtract 1 from the number of batches.

Implement the fix by checking the V3D tech version and revision.

Fixes several `dEQP-VK.synchronization*` CTS tests related to Indirect Dispatch.

Fixes: 18b8413b25 ("drm/v3d: Create a CPU job extension for a indirect CSD job")
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240714145243.1223131-2-mcanal@igalia.com
2024-07-15 12:49:52 -03:00
Maíra Canal
7c78fdbace drm/v3d: Add V3D tech revision to the device information
The V3D tech revision can be a useful information when configuring
jobs. Therefore, expose it in the `struct v3d_dev` with the V3D tech
version.

Signed-off-by: Maíra Canal <mcanal@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240714145243.1223131-1-mcanal@igalia.com
2024-07-15 12:45:59 -03:00
Philip Mueller
d60c429610 drm: panel-orientation-quirks: Add quirk for OrangePi Neo
This adds a DMI orientation quirk for the OrangePi Neo Linux Gaming
Handheld.

Signed-off-by: Philip Mueller <philm@manjaro.org>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240715045818.1019979-1-philm@manjaro.org
2024-07-15 15:29:29 +02:00
Alex Deucher
1cff1010be drm/amdgpu/mes12: add missing opcode string
Fixes the indexing of the string array.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-12 11:46:46 -04:00
Alex Deucher
478cb8badf drm/amdgpu/mes11: update opcode strings
Add new packet.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-12 11:46:46 -04:00
Leo Li
7ed58b68ac Revert "drm/amd/display: Reset freesync config before update new state"
This change caused PSR SU panels to not read from their remote fb,
preventing us from entering self-refresh. It is a regression.

This reverts commit eb6dfbb7a9.

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit dc1000bf46)
2024-07-12 11:46:46 -04:00
Dave Airlie
8b68788beb Merge tag 'amd-drm-fixes-6.10-2024-07-11' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.10-2024-07-11:

amdgpu:
- PSR-SU fix
- Reseved VMID fix

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240712005534.803064-1-alexander.deucher@amd.com
2024-07-12 13:32:36 +10:00
Dave Airlie
85e23c6620 Merge tag 'drm-xe-fixes-2024-07-11' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
UAPI Changes:
- Use write-back caching mode for system memory on DGFX (Thomas)

Driver Changes:
- Do not leak object when finalizing hdcp gsc (Nirmoy)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/vgqz35btnxdddko3byrgww5ii36wig2tvondg2p3j3b3ourj4i@rqgolll3wwkh
2024-07-12 13:24:28 +10:00
Nathan Chancellor
c58c39163a drm/omap: Restrict compile testing to PAGE_SIZE less than 64KB
Prior to commit dc6fcaaba5 ("drm/omap: Allow build with
COMPILE_TEST=y"), it was only possible to build the omapdrm driver with
a 4KB page size. After that change, when the PAGE_SIZE is 64KB or
larger, clang points out that the driver has some assumptions around the
page size implicitly by passing PAGE_SIZE to a parameter with a type of
u16:

  drivers/gpu/drm/omapdrm/omap_gem.c:758:7: error: implicit conversion from 'unsigned long' to 'u16' (aka 'unsigned short') changes value from 65536 to 0 [-Werror,-Wconstant-conversion]
    757 |                 block = tiler_reserve_2d(fmt, omap_obj->width, omap_obj->height,
        |                         ~~~~~~~~~~~~~~~~
    758 |                                          PAGE_SIZE);
        |                                          ^~~~~~~~~
  arch/powerpc/include/asm/page.h:25:34: note: expanded from macro 'PAGE_SIZE'
     25 | #define PAGE_SIZE               (ASM_CONST(1) << PAGE_SHIFT)
        |                                  ~~~~~~~~~~~~~^~~~~~~~~~~~~
  drivers/gpu/drm/omapdrm/omap_gem.c:1504:44: error: implicit conversion from 'unsigned long' to 'u16' (aka 'unsigned short') changes value from 65536 to 0 [-Werror,-Wconstant-conversion]
   1504 |                         block = tiler_reserve_2d(fmts[i], w, h, PAGE_SIZE);
        |                                 ~~~~~~~~~~~~~~~~                ^~~~~~~~~
  arch/powerpc/include/asm/page.h:25:34: note: expanded from macro 'PAGE_SIZE'
     25 | #define PAGE_SIZE               (ASM_CONST(1) << PAGE_SHIFT)
        |                                  ~~~~~~~~~~~~~^~~~~~~~~~~~~
  2 errors generated.

As there is a lot of use of a u16 type throughout this driver and it
will only ever be run on hardware that has a 4KB page size, just
restrict compile testing to when the page size is less than 64KB (as no
other issues have been discussed and it keeps compile testing relatively
more available).

Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240620-omapdrm-restrict-compile-test-to-sub-64kb-page-size-v1-1-5e56de71ffca@kernel.org
2024-07-12 13:13:15 +10:00
Dave Airlie
864204e467 Merge tag 'drm-xe-next-fixes-2024-07-11' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
UAPI Changes:
- Rename xe perf layer as xe observation layer (Ashutosh)

Driver Changes:
- Drop trace_xe_hw_fence_free (Brost)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Zo_3ustogPDVKZwu@intel.com
2024-07-12 12:52:28 +10:00
Dave Airlie
38e73004c2 Merge tag 'drm-misc-next-fixes-2024-07-11' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-next
A fix for fbdev on big endian systems, a condition fix for a sharp panel
at removal, and a fix for qxl to prevent unpinned buffer access under
certain conditions.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240711-benign-rich-mouflon-2eeafe@houat
2024-07-12 12:50:30 +10:00
Leo Li
dc1000bf46 Revert "drm/amd/display: Reset freesync config before update new state"
This change caused PSR SU panels to not read from their remote fb,
preventing us from entering self-refresh. It is a regression.

This reverts commit 6b8487cdf9.

Signed-off-by: Leo Li <sunpeng.li@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-11 20:48:42 -04:00
Philipp Stanner
f00059b4c1 drm/vboxvideo: fix mapping leaks
When the PCI devres API was introduced to this driver, it was wrongly
assumed that initializing the device with pcim_enable_device() instead of
pci_enable_device() will make all PCI functions managed.

This is wrong and was caused by the quite confusing PCI devres API in which
some, but not all, functions become managed that way.

The function pci_iomap_range() is never managed.

Replace pci_iomap_range() with the managed function pcim_iomap_range().

Fixes: 8558de401b ("drm/vboxvideo: use managed pci functions")
Link: https://lore.kernel.org/r/20240613115032.29098-14-pstanner@redhat.com
Signed-off-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
2024-07-11 16:20:15 -05:00
Nirmoy Das
609458abd5 drm/xe/display/xe_hdcp_gsc: Free arbiter on driver removal
Free arbiter allocated in intel_hdcp_gsc_init().

Fixes: 152f2df954 ("drm/xe/hdcp: Enable HDCP for XE")
Cc: Suraj Kandpal <suraj.kandpal@intel.com>
Cc: Arun R Murthy <arun.r.murthy@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708125918.23573-1-nirmoy.das@intel.com
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
(cherry picked from commit 33891539f9)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-11 08:25:32 -07:00
Thomas Hellström
5207c393d3 drm/xe: Use write-back caching mode for system memory on DGFX
The caching mode for buffer objects with VRAM as a possible
placement was forced to write-combined, regardless of placement.

However, write-combined system memory is expensive to allocate and
even though it is pooled, the pool is expensive to shrink, since
it involves global CPU TLB flushes.

Moreover write-combined system memory from TTM is only reliably
available on x86 and DGFX doesn't have an x86 restriction.

So regardless of the cpu caching mode selected for a bo,
internally use write-back caching mode for system memory on DGFX.

Coherency is maintained, but user-space clients may perceive a
difference in cpu access speeds.

v2:
- Update RB- and Ack tags.
- Rephrase wording in xe_drm.h (Matt Roper)
v3:
- Really rephrase wording.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Fixes: 622f709ca6 ("drm/xe/uapi: Add support for CPU caching mode")
Cc: Pallavi Mishra <pallavi.mishra@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: dri-devel@lists.freedesktop.org
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Effie Yu <effie.yu@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Jose Souza <jose.souza@intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: <stable@vger.kernel.org> # v6.8+
Acked-by: Matthew Auld <matthew.auld@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Fixes: 622f709ca6 ("drm/xe/uapi: Add support for CPU caching mode")
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Acked-by: Effie Yu <effie.yu@intel.com> #On chat
Link: https://patchwork.freedesktop.org/patch/msgid/20240705132828.27714-1-thomas.hellstrom@linux.intel.com
(cherry picked from commit 01e0cfc994)
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
2024-07-11 08:25:26 -07:00
Matthew Brost
26d289158e drm/xe: Drop trace_xe_hw_fence_free
fence->ctx may be stale memory when trace_xe_hw_fence_free is called
resuling UAF bug when deriving the device name. This tracepoint is not
all that useful, so just drop it.

Fixes: 501c4255c4 ("drm/xe/trace: Print device_id in xe_trace events")
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Gustavo Sousa <gustavo.sousa@intel.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240708211008.956384-1-matthew.brost@intel.com
(cherry picked from commit caaf1f44a6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-07-11 09:54:24 -04:00
Ashutosh Dixit
63347fe031 drm/xe/uapi: Rename xe perf layer as xe observation layer
In Xe, the perf layer allows capture of HW counter streams. These HW
counters are generally performance related but don't have to be necessarily
so. Also, the name "perf" is a carryover from i915 and is not preferred.

Here we propose the name "observation" for this common layer which allows
capture of different types of these counter streams.

v2: Rename observability layer to observation layer (Lucas/Rodrigo)
v3: Rename sysctl file to "observation_paranoid" (Jose)

Fixes: 52c2e956dc ("drm/xe/perf/uapi: "Perf" layer to support multiple perf counter stream types")
Fixes: fe8929bdf8 ("drm/xe/perf/uapi: Add perf_stream_paranoid sysctl")
Acked-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Acked-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240703164801.2561423-1-ashutosh.dixit@intel.com
(cherry picked from commit 8169b2097d)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2024-07-11 09:54:24 -04:00
Qiuxu Zhuo
833cd3e9ad drm/fb-helper: Don't schedule_work() to flush frame buffer during panic()
Sometimes the system [1] hangs on x86 I/O machine checks. However, the
expected behavior is to reboot the system, as the machine check handler
ultimately triggers a panic(), initiating a reboot in the last step.

The root cause is that sometimes the panic() is blocked when
drm_fb_helper_damage() invoking schedule_work() to flush the frame buffer.
This occurs during the process of flushing all messages to the frame
buffer driver as shown in the following call trace:

  Machine check occurs [2]:
    panic()
      console_flush_on_panic()
        console_flush_all()
          console_emit_next_record()
            con->write()
              vt_console_print()
                hide_cursor()
                  vc->vc_sw->con_cursor()
                    fbcon_cursor()
                      ops->cursor()
                        bit_cursor()
                          soft_cursor()
                            info->fbops->fb_imageblit()
                              drm_fbdev_generic_defio_imageblit()
                                drm_fb_helper_damage_area()
                                  drm_fb_helper_damage()
                                    schedule_work() // <--- blocked here
    ...
    emergency_restart()  // wasn't invoked, so no reboot.

During panic(), except the panic CPU, all the other CPUs are stopped.
In schedule_work(), the panic CPU requires the lock of worker_pool to
queue the work on that pool, while the lock may have been token by some
other stopped CPU. So schedule_work() is blocked.

Additionally, during a panic(), since there is no opportunity to execute
any scheduled work, it's safe to fix this issue by skipping schedule_work()
on 'oops_in_progress' in drm_fb_helper_damage().

[1] Enable the kernel option CONFIG_FRAMEBUFFER_CONSOLE,
    CONFIG_DRM_FBDEV_EMULATION, and boot with the 'console=tty0'
    kernel command line parameter.

[2] Set 'panic_timeout' to a non-zero value before calling panic().

Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Reported-by: Yudong Wang <yudong.wang@intel.com>
Tested-by: Yudong Wang <yudong.wang@intel.com>
Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240703141737.75378-1-qiuxu.zhuo@intel.com
Signed-off-by: Maarten Lankhorst,,, <maarten.lankhorst@linux.intel.com>
2024-07-11 13:38:33 +02:00
Christian König
99194e6db5 drm/amdgpu: reject gang submit on reserved VMIDs
A gang submit won't work if the VMID is reserved and we can't flush out
VM changes from multiple engines at the same time.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 320debca1b)
2024-07-10 16:19:19 -04:00
Alex Deucher
8030f6533e drm/amdgpu: remove exp hw support check for gfx12
Enable it by default.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-10 13:45:57 -04:00
YiPeng Chai
e23300dfff drm/amdgpu: timely save bad pages to eeprom after gpu ras reset is completed
The problem case is as follows:
1. GPU A triggers a gpu ras reset, and GPU A drives
   GPU B to also perform a gpu ras reset.
2. After gpu B ras reset started, gpu B queried a DE
   data. Since the DE data was queried in the ras reset
   thread instead of the page retirement thread, bad
   page retirement work would not be triggered. Then
   even if all gpu resets are completed, the bad pages
   will be cached in RAM until GPU B's bad page retirement
   work is triggered again and then saved to eeprom.

This patch can save the bad pages to eeprom in time after gpu
ras reset is completed.

v2:
  1. Add the above description to code comments.
  2. Reuse existing function.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-10 10:13:41 -04:00
YiPeng Chai
c04706914d drm/amdgpu: flush all cached ras bad pages to eeprom
Before uninstalling gpu driver, flush all cached ras
bad pages to eeprom.

v2:
  Put the same code into a function and reuse the function.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-10 10:13:35 -04:00
Sunil Khatri
c39385710c drm/amdgpu: select compute ME engines dynamically
GFX ME right now is one but this could change in
future SOC's. Use no of ME for GFX as start point
for ME for compute for GFX12.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-10 10:13:22 -04:00
Aurabindo Pillai
21e6f6085b drm/amd/display: Allow display DCC for DCN401
To enable mesa to use display dcc, DM should expose them in the
supported modifiers. Add the best (most efficient) modifiers first.

Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-10 10:13:10 -04:00
Sunil Khatri
a85cc86cce drm/amdgpu: select compute ME engines dynamically
GFX ME right now is one but this could change in
future SOC's. Use no of ME for GFX as start point
for ME for compute for GFX11.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-10 10:13:04 -04:00
Alex Deucher
7d570f56f1 drm/amdgpu/job: Replace DRM_INFO/ERROR logging
Use the dev_info/err variants so we get per device logging
in multi-GPU cases.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-10 10:12:55 -04:00
Sunil Khatri
948f2828a6 drm/amdgpu: select compute ME engines dynamically
GFX ME right now is one but this could change in
future SOC's. Use no of ME for GFX as start point
for ME for compute for GFX10.

Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-10 10:12:48 -04:00
Danijel Slivka
708f220567 drm/amd/pm: Ignore initial value in smu response register
Why:
If the reg mmMP1_SMN_C2PMSG_90 is being written to during amdgpu driver
load or driver unload, subsequent amdgpu driver load will fail at
smu_hw_init. The default of mmMP1_SMN_C2PMSG_90 register at a clean
environment is 0x1 and if value differs from expected, amdgpu driver
load will fail.

How to fix:
Ignore the initial value in smu response register before the first smu
message is sent,if smc in SMU_FW_INIT state, just proceed further to
send the message. If register holds an unexpected value after smu message
was sent set, smc_state to SMU_FW_HANG state and no further smu messages
will be sent.

v2:
Set SMU_FW_INIT state at the start of smu hw_init/resume.
Check smc_fw_state before sending smu message if in hang state skip
sending message.
Set SMU_FW_HANG only in case unexpected value is detected

Signed-off-by: Danijel Slivka <danijel.slivka@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-10 10:12:37 -04:00
Lijo Lazar
d02ddefc7e drm/amdgpu: Initialize VF partition mode
For SOCs with GFX v9.4.3, a VF may have multiple compute partitions.
Fetch the partition information during init and initialize partition
nodes. There is no support to switch partition mode in VF mode, hence
disable the same.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-10 10:12:28 -04:00
Gavin Wan
5d64af40e3 drm/amd/amdgpu: fix SDMA IRQ client ID <-> req mapping.
sdma has 2 instances in SRIOV cpx mode. Odd numbered VFs have
sdma0/sdma1 instances. Even numbered vfs have sdma2/sdma3. For
Even numbered vfs, the sdma2 & sdma3 (irq srouce id
CLIENTID_SDMA2 and CLIENTID_SDMA3) should map to irq seq 0 & 1.

Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-07-10 10:12:19 -04:00
Daniel Vetter
dbf35b4dea Merge tag 'drm-intel-next-2024-06-28' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next
drm/i915 feature pull #2 for v6.11:

Features and functionality:
- More eDP Panel Replay enabling (Jouni)
- Add async flip and flip done tracepoints (Ville)

Refactoring and cleanups:
- Clean up BDW+ pipe interrupt register definitions (Ville)
- Prep work for DSB based plane programming (Ville)
- Relocate encoder suspend/shutdown helpers (Imre)
- Polish plane surface alignment handling (Ville)

Fixes:
- Enable more fault interrupts on TGL+/MTL+ (Ville)
- Fix CMRR 32-bit build (Mitul)
- Fix PSR Selective Update Region Scan Line Capture Indication (Jouni)
- Fix cursor fb unpinning (Maarten, Ville)
- Fix Cx0 PHY PLL state verification in TBT mode (Imre)
- Fix unnecessary MG DP programming on MTL+ Type-C (Imre)

DRM changes:
- Rename drm_plane_check_pixel_format() to drm_plane_has_format() and export
  (Ville)
- Add drm_vblank_work_flush_all() (Maarten)

Xe driver changes:
- Call encoder .suspend_complete() hook also on Xe (Imre)

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/875xttazx2.fsf@intel.com
2024-07-10 10:36:47 +02:00