Commit Graph

1156448 Commits

Author SHA1 Message Date
Ayush Gupta
9bb10b7aae drm/amd/display: populate subvp cmd info only for the top pipe
[Why]
System restart observed while changing the display resolution
to 8k with extended mode. Sytem restart was caused by a page fault.

[How]
When the driver populates subvp info it did it for both the pipes using
vblank which caused an outof bounds array access causing the page fault.
added checks to allow the top pipe only to fix this issue.

Co-authored-by: Ayush Gupta <ayush.gupta@amd.com>
Reviewed-by: Alvin Lee <Alvin.Lee2@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Ayush Gupta <ayush.gupta@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28 14:29:36 -05:00
Hersen Wu
504d3cae8b drm/amd/display: dcn32/321 dsc_pg_control not executed properly
[why]
during boot up or resume from s3, hw default value of
domain_power_forceon is 1. when program domain_power_gate
to 1 to power down hw block, hw will not change to power
off due to domain_power_forceon = 1.

[how]
enable_power_gating_plane(true) should be executed to set
domain_power_forceon to 0 before dsc_pg_control.
dsc_pg_control is already called by dcn3x_init_hw-->
init_pipes--> dsc_pg_control. no need be programmed with
dcn3x_init_hw one more time.
to trigger dchub, dsc block power state change, need
program dc_ip_request_cntl to notify hw block.

Reviewed-by: Nevenko Stupar <Nevenko.Stupar@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Hersen Wu <hersenxs.wu@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28 14:29:36 -05:00
Mustapha Ghaddar
a06d565b4a drm/amd/display: Allocation at stream Enable
[WHY & HOW]
After we allocate BW at plug, we will de-alloc
and allocate only what stream needs at
stream_enable()

[HOW]
Introduce bw allocation check at link_enable()
for DPIA links

Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Mustapha Ghaddar <mghaddar@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28 14:29:35 -05:00
Aric Cyr
36951fc946 Revert "drm/amd/display: Do not set DRR on pipe commit"
This reverts commit 4f1b5e739d.

[Why & How]
Original change causes a regression. Revert
until fix is available.

Reviewed-by: Aric Cyr <aric.cyr@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28 14:29:35 -05:00
Jasdeep Dhillon
c32699caec drm/amd/display: Updating Video Format Fall Back Policy.
[WHY]
Adding 1920x1080 as fail safe mode for
Video Format Fall Back Policy.

Reviewed-by: Jerry Zuo <Jerry.Zuo@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Jasdeep Dhillon <jdhillon@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28 14:29:35 -05:00
Aric Cyr
c69fc3d0de drm/amd/display: Reduce CPU busy-waiting for long delays
[WHY]
udelay should not be used for long waits since it keeps CPU active,
wasting power.

[HOW]
Use fsleep where acceptable to allow CPU cores to be parked by the scheduler.

Reviewed-by: Wenjing Liu <Wenjing.Liu@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Aric Cyr <aric.cyr@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28 14:29:35 -05:00
Alex Hung
b4ceeffd13 drm/amd/display: fix shift-out-of-bounds in CalculateVMAndRowBytes
[WHY]
When PTEBufferSizeInRequests is zero, UBSAN reports the following
warning because dml_log2 returns an unexpected negative value:

  shift exponent 4294966273 is too large for 32-bit type 'int'

[HOW]

In the case PTEBufferSizeInRequests is zero, skip the dml_log2() and
assign the result directly.

Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28 14:29:35 -05:00
Ryan Lin
1e5d4d8eb8 drm/amd/display: Ext displays with dock can't recognized after resume
[Why]
Needs to set the default value of the LTTPR timeout after resume.

[How]
Set the default (3.2ms) timeout at resuming if the sink supports
LTTPR

Reviewed-by: Jerry Zuo <Jerry.Zuo@amd.com>
Acked-by: Qingqing Zhuo <qingqing.zhuo@amd.com>
Signed-off-by: Ryan Lin <tsung-hua.lin@amd.com>
Tested-by: Daniel Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28 14:29:24 -05:00
Horatio Zhang
61d2a9bec4 drm/amdgpu: fix ttm_bo calltrace warning in psp_hw_fini
The call trace occurs when the amdgpu is removed after
the mode1 reset. During mode1 reset, from suspend to resume,
there is no need to reinitialize the ta firmware buffer
which caused the bo pin_count increase redundantly.

[  489.885525] Call Trace:
[  489.885525]  <TASK>
[  489.885526]  amdttm_bo_put+0x34/0x50 [amdttm]
[  489.885529]  amdgpu_bo_free_kernel+0xe8/0x130 [amdgpu]
[  489.885620]  psp_free_shared_bufs+0xb7/0x150 [amdgpu]
[  489.885720]  psp_hw_fini+0xce/0x170 [amdgpu]
[  489.885815]  amdgpu_device_fini_hw+0x2ff/0x413 [amdgpu]
[  489.885960]  ? blocking_notifier_chain_unregister+0x56/0xb0
[  489.885962]  amdgpu_driver_unload_kms+0x51/0x60 [amdgpu]
[  489.886049]  amdgpu_pci_remove+0x5a/0x140 [amdgpu]
[  489.886132]  ? __pm_runtime_resume+0x60/0x90
[  489.886134]  pci_device_remove+0x3e/0xb0
[  489.886135]  __device_release_driver+0x1ab/0x2a0
[  489.886137]  driver_detach+0xf3/0x140
[  489.886138]  bus_remove_driver+0x6c/0xf0
[  489.886140]  driver_unregister+0x31/0x60
[  489.886141]  pci_unregister_driver+0x40/0x90
[  489.886142]  amdgpu_exit+0x15/0x451 [amdgpu]

Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>
Signed-off-by: longlyao <Longlong.Yao@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28 14:28:51 -05:00
Tom Rix
1a80993ae3 drm/amdgpu: remove unused variable ring
building with gcc and W=1 reports
drivers/gpu/drm/amd/amdgpu/vcn_v4_0.c:81:29: error: variable
  ‘ring’ set but not used [-Werror=unused-but-set-variable]
   81 |         struct amdgpu_ring *ring;
      |                             ^~~~

ring is not used so remove it.

Signed-off-by: Tom Rix <trix@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28 14:28:51 -05:00
tiancyin
60971b204c drm/amd/display: fix dm irq error message in gpu recover
[Why]
Variable adev->crtc_irq.num_types was initialized as the value of
adev->mode_info.num_crtc at early_init stage, later at hw_init stage,
the num_crtc changed due to the display pipe harvest on some SKUs,
but the num_types was not updated accordingly, that cause below error
in gpu recover.

  *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_crtc_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_pflip_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3
  *ERROR* amdgpu_dm_set_vupdate_irq_state: crtc is NULL at id :3

[How]
Defer the initialization of num_types to eliminate the error logs.

Signed-off-by: tiancyin <tianci.yin@amd.com>
Reviewed-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28 14:28:51 -05:00
Mario Limonciello
33759ce0ce drm/amd: Fix initialization for nbio 7.5.1
A mistake has been made in the BIOS for some ASICs with NBIO 7.5.1
where some NBIO registers aren't properly setup.

Ensure that they're set during initialization.

Tested-by: Richard Gong <richard.gong@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28 14:28:51 -05:00
Harry Wentland
283947bbd5 drm/amd/display: Format input and output CSC matrix
Format the input and output CSC matrix so they
look like 3x4 matrixes. This will make parsing them
much easier and allows us to quickly spot potential
mistakes.

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Cc: Pekka Paalanen <ppaalanen@gmail.com>
Cc: Sebastian Wick <sebastian.wick@redhat.com>
Cc: Vitaly.Prosyak@amd.com
Cc: Joshua Ashton <joshua@froggi.es>
Cc: dri-devel@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Reviewed-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28 14:28:50 -05:00
Harry Wentland
e47f1691ad drm/amd/display: Don't restrict bpc to 8 bpc
This will let us pass the kms_hdr.bpc_switch IGT
test.

The reason the bpc restriction was required is
historical. At one point in time we were not falling
back to a lower bpc when we didn't have enough
bandwidth for the maximum bpc reported by a display.
This meant that we couldn't enable some high refresh
modes unless we limitted the bpc.

Starting with this patch the issue is fixed:
commit cbd14ae7ea ("drm/amd/display: Fix
incorrectly pruned modes with deep color")

This patch implemented a fallback mechanism if mode
validation failed at the max bpc. This means users
now automatically get all modes that can be supported
by at least 6 bpc. The driver will enable the mode
with the highest possible bpc that is supported by
the display.

v2:
 - explain why this is no longer needed (Michel)
 - refer to commit that fixed bpc fallback (Michel)

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Cc: Pekka Paalanen <ppaalanen@gmail.com>
Cc: Sebastian Wick <sebastian.wick@redhat.com>
Cc: Vitaly.Prosyak@amd.com
Cc: Joshua Ashton <joshua@froggi.es>
Cc: dri-devel@lists.freedesktop.org
Cc: amd-gfx@lists.freedesktop.org
Cc: Michel Dänzer <michel.daenzer@mailbox.org>
Reviewed-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28 14:28:39 -05:00
Candice Li
567172bbb4 drm/amdgpu: Make umc_v8_10_convert_error_address static and remove unused variable
Fixes following warnings:
warning: no previous prototype for 'umc_v8_10_convert_error_address'
warning: variable 'channel_index' set but not used

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-28 14:16:44 -05:00
Kun Liu
0c3c993643 drm/amdgpu: added a sysfs interface for thermal throttling
implement apu_thermal_cap r/w callback for vangogh

Signed-off-by: Kun Liu <Kun.Liu2@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:36:00 -05:00
Kun Liu
c3ed0e72c8 drm/amdgpu: added a sysfs interface for thermal throttling
added a sysfs interface for thermal throttling, then userspace
can get/update thermal limit

Signed-off-by: Kun Liu <Kun.Liu2@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:36:00 -05:00
Arthur Grillo
4f101d5710 drm/amd/display: Remove unused local variables and function
Remove a couple of local variables that are only set but never used,
also remove an static utility function that is never used in consequence
of the variable removal.

This decrease the number of -Wunused-but-set-variable warnings.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:36:00 -05:00
Arthur Grillo
4d3ed63264 drm/amd/display: Remove unused local variables
Remove local variables that were just set but were never used. This
decrease the number of -Wunused-but-set-variable warnings.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Arthur Grillo <arthurgrillo@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:36:00 -05:00
Mark Hawrylak
cb1b05287f drm/radeon: Fix eDP for single-display iMac11,2
Apple iMac11,2 (mid 2010) also with Radeon HD-4670 that has the same
issue as iMac10,1 (late 2009) where the internal eDP panel stays dark on
driver load.  This patch treats iMac11,2 the same as iMac10,1,
so the eDP panel stays active.

Additional steps:
Kernel boot parameter radeon.nomodeset=0 required to keep the eDP
panel active.

This patch is an extension of
commit 564d8a2cf3 ("drm/radeon: Fix eDP for single-display iMac10,1 (v2)")
Link: https://lore.kernel.org/all/lsq.1507553064.833262317@decadent.org.uk/
Signed-off-by: Mark Hawrylak <mark.hawrylak@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:59 -05:00
Thomas Weißschuh
4fa01c6357 drm/amdkfd: Make kobj_type structures constant
Since commit ee6d3dd4ed ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.

Take advantage of this to constify the structure definitions to prevent
modification at runtime.

Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:59 -05:00
Thomas Weißschuh
b2daaa9360 drm/amdgpu: make kobj_type structures constant
Since commit ee6d3dd4ed ("driver core: make kobj_type constant.")
the driver core allows the usage of const struct kobj_type.

Take advantage of this to constify the structure definitions to prevent
modification at runtime.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Weißschuh <linux@weissschuh.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:59 -05:00
Jiapeng Chong
28d58468ad drm/amd/display: Modify mismatched function name
No functional modification involved.

drivers/gpu/drm/amd/amdgpu/../display/dc/link/link_detection.c:1199: warning: expecting prototype for dc_link_detect_connection_type(). Prototype was for link_detect_connection_type() instead.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4103
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:59 -05:00
Hans de Goede
d24b77e444 drm/amd/display: Pass proper parent for DM backlight device registration
The parent for the backlight device should be the drm-connector object,
not the PCI device.

Userspace relies on this to be able to detect which backlight class device
to use on hybrid gfx devices where there may be multiple native (raw)
backlight devices registered.

Specifically gnome-settings-daemon expects the parent device to have
an "enabled" sysfs attribute (as drm_connector devices do) and tests
that this returns "enabled" when read.

This aligns the parent of the backlight device with i915, nouveau, radeon.
Note that drivers/gpu/drm/amd/amdgpu/atombios_encoders.c also already
uses the drm_connector as parent, only amdgpu_dm.c used the PCI device
as parent before this change.

Note this is marked as a RFC because I don't have hw to test, so this
has only been compile tested! If someone can test this on actual
hw which hits the changed code path that would be great.

Link: https://gitlab.gnome.org/GNOME/gnome-settings-daemon/-/issues/730
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:59 -05:00
Guchun Chen
424b3d7582 drm/amd/pm: downgrade log level upon SMU IF version mismatch
SMU IF version mismatch as a warning message exists widely
after asic production, however, due to this log level setting,
such mismatch warning will be caught by automation test like
IGT and reported as a fake error after checking. As such mismatch
does not break anything, to reduce confusion, downgrade it from
dev_warn to dev_info.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:59 -05:00
Candice Li
b1e9a718af drm/amdgpu: Add ecc info query interface for umc v8_10
Support ecc info query for umc v8_10.

v2: Simplied by convert_error_address.
v3: Remove unused variable and invalid checking.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:59 -05:00
Candice Li
2d53b579f3 drm/amdgpu: Add convert_error_address function for umc v8_10
Add convert_error_address for umc v8_10.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:59 -05:00
Tao Zhou
22106ed0be drm/amdgpu: add bad_page_threshold check in ras_eeprom_check_err
bad_page_threshold controls page retirement behavior and it should be
also checked.

v2: simplify the condition of bad page handling path.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:59 -05:00
Tao Zhou
f3cbe70e21 drm/amdgpu: change default behavior of bad_page_threshold parameter
Ignore ras umc bad page threshold by default, GPU initialization won't
be stopped in this mode.

v2: refine the description of bad_page_threshold.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:59 -05:00
Tao Zhou
4d33e0f134 drm/amdgpu: exclude duplicate pages from UMC RAS UE count
If a UMC bad page is reserved but not freed by an application, the
application may trigger uncorrectable error repeatly by accessing the page.

v2: add specific function to do the check.
v3: remove duplicate pages, calculate new added bad page number.
v4: reuse save_bad_pages to calculate new added bad page number.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:59 -05:00
Tao Zhou
e69c785723 drm/amdgpu: add umc retire unit element
It records how many bad pages are retired in one uncorrectable error.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:59 -05:00
Evan Quan
6761c4bfee drm/amd/pm: no pptable resetup on runpm exiting
It is assumed the pptable used before runpm is same as
the one used afterwards. Thus, we can reuse the stored
copy and do not need to resetup the pptable again.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <feifei.xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:59 -05:00
Evan Quan
edddc6fd54 drm/amd/pm: correct the baco state setting for ArmD3 scenario
The check for baco support relies on the correct baco state.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Feifei Xu <feifei.xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:58 -05:00
Marek Olšák
b299221faf drm/amdgpu: add more fields into device info, caches sizes, etc.
AMDGPU_IDS_FLAGS_CONFORMANT_TRUNC_COORD: important for conformance on gfx11
Other fields are exposed from IP discovery.
enabled_rb_pipes_mask_hi is added for future chips, currently 0.

Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21403

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:58 -05:00
Qu Huang
4fc8fff378 drm/amdkfd: Fix an illegal memory access
In the kfd_wait_on_events() function, the kfd_event_waiter structure is
allocated by alloc_event_waiters(), but the event field of the waiter
structure is not initialized; When copy_from_user() fails in the
kfd_wait_on_events() function, it will enter exception handling to
release the previously allocated memory of the waiter structure;
Due to the event field of the waiters structure being accessed
in the free_waiters() function, this results in illegal memory access
and system crash, here is the crash log:

localhost kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x185/0x1e0
localhost kernel: RSP: 0018:ffffaa53c362bd60 EFLAGS: 00010082
localhost kernel: RAX: ff3d3d6bff4007cb RBX: 0000000000000282 RCX: 00000000002c0000
localhost kernel: RDX: ffff9e855eeacb80 RSI: 000000000000279c RDI: ffffe7088f6a21d0
localhost kernel: RBP: ffffe7088f6a21d0 R08: 00000000002c0000 R09: ffffaa53c362be64
localhost kernel: R10: ffffaa53c362bbd8 R11: 0000000000000001 R12: 0000000000000002
localhost kernel: R13: ffff9e7ead15d600 R14: 0000000000000000 R15: ffff9e7ead15d698
localhost kernel: FS:  0000152a3d111700(0000) GS:ffff9e855ee80000(0000) knlGS:0000000000000000
localhost kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
localhost kernel: CR2: 0000152938000010 CR3: 000000044d7a4000 CR4: 00000000003506e0
localhost kernel: Call Trace:
localhost kernel: _raw_spin_lock_irqsave+0x30/0x40
localhost kernel: remove_wait_queue+0x12/0x50
localhost kernel: kfd_wait_on_events+0x1b6/0x490 [hydcu]
localhost kernel: ? ftrace_graph_caller+0xa0/0xa0
localhost kernel: kfd_ioctl+0x38c/0x4a0 [hydcu]
localhost kernel: ? kfd_ioctl_set_trap_handler+0x70/0x70 [hydcu]
localhost kernel: ? kfd_ioctl_create_queue+0x5a0/0x5a0 [hydcu]
localhost kernel: ? ftrace_graph_caller+0xa0/0xa0
localhost kernel: __x64_sys_ioctl+0x8e/0xd0
localhost kernel: ? syscall_trace_enter.isra.18+0x143/0x1b0
localhost kernel: do_syscall_64+0x33/0x80
localhost kernel: entry_SYSCALL_64_after_hwframe+0x44/0xa9
localhost kernel: RIP: 0033:0x152a4dff68d7

Allocate the structure with kcalloc, and remove redundant 0-initialization
and a redundant loop condition check.

Signed-off-by: Qu Huang <qu.huang@linux.dev>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:58 -05:00
Jane Jian
6dcb38a19e drm/amdgpu/vcn: set and use harvest config
in early init to set harvest config if the vcn0/1 is disabled
rather than hard-code the ring attributes as before did

Signed-off-by: Jane Jian <Jane.Jian@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:58 -05:00
Mario Limonciello
ca47518663 drm/amd: Don't allow s0ix on APUs older than Raven
APUs before Raven didn't support s0ix.  As we just relieved some
of the safety checks for s0ix to improve power consumption on
APUs that support it but that are missing BIOS support a new
blind spot was introduced that a user could "try" to run s0ix.

Plug this hole so that if users try to run s0ix on anything older
than Raven it will just skip suspend of the GPU.

Fixes: cf488dcd0a ("drm/amd: Allow s0ix without BIOS support")
Suggested-by: Alexander Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:58 -05:00
Hawking Zhang
f9c35f4fff drm/amdgpu: fix incorrect active rb bitmap for gfx11
GFX v11 changes RB_BACKEND_DISABLE related registers
from per SA to global ones. The approach to query active
rb bitmap needs to be changed accordingly. Query per
SE setting returns wrong active RB bitmap especially
in the case when some of SA are disabled. With the new
approach, driver will generate the active rb bitmap
based on active SA bitmap and global active RB bitmap.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:58 -05:00
Shane Xiao
2866cc0961 drm/amdgpu: optimize VRAM allocation when using drm buddy
Since the VRAM manager changed from drm mm to drm buddy. It's
not necessary to allocate 2MB aligned VRAM for more than 2MB
unaligned size, and then do trim. This method improves the
allocation efficiency and reduces memory fragmentation.

v2: Correct the remainder operation

Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:58 -05:00
Shane Xiao
c105518679 drm/amdgpu: remove TOPDOWN flags when allocating VRAM in large bar system
Since VRAM manager is changed from drm mm to drm buddy, the
TOP_DOWN flag should not be set by default in the large bar system.
Removing this flag helps improve drm buddy allocator efficiency and
reduce the risk of splitting higher order block into lower order.

Signed-off-by: Shane Xiao <shane.xiao@amd.com>
Reviewed-by: Christian K�nig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:58 -05:00
Harry Wentland
455ad25997 drm/amdgpu: Select DRM_DISPLAY_HDCP_HELPER in amdgpu
Keeps this selection with the rest of the DRM HELPER
selection.

Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:58 -05:00
Hamza Mahfooz
6d9b6dceaa drm/amd/display: only warn once in dce110_edp_wait_for_hpd_ready()
Since, hot plugging eDP displays isn't supported, it is sufficient for
us to warn about the lack of a connected display once. So, use ASSERT()
in dce110_edp_wait_for_hpd_ready() instead of DC_LOG_WARNING().

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:58 -05:00
Kenneth Feng
d9e1e14f42 drm/amd/pm: re-enable ac/dc on smu_v13_0_0/10
re-enable ac/dc on smu_v13_0_0/10

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:58 -05:00
Jesse Zhang
08c6ab7fb4 drm/amdgpu: add tmz support for GC 10.3.6
this patch to add tmz support for GC 10.3.6

Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:58 -05:00
Ruili Ji
2e2b9baf00 drm/amdkfd: To fix sdma page fault issue for GC 11
For the MQD memory, KMD would always allocate 4K memory,
and mes scheduler would write to the end of MQD for unmap flag.

Signed-off-by: Ruili Ji <ruiliji2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-02-23 17:35:57 -05:00
Dave Airlie
a48bba9838 msm/fbdev: fix unused variable warning with clang.
clang builds showed this:
drivers/gpu/drm/msm/msm_fbdev.c:144:6: error: variable 'helper' is used uninitialized whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
        if (!fbdev)
            ^~~~~~

Fixes: 3fb1f62f80 ("drm/fb-helper: Remove drm_fb_helper_unprepare() from drm_fb_helper_fini()")
Signed-off-by: Dave Airlie <airlied@redhat.com>
2023-02-23 09:48:05 +10:00
Dave Airlie
13daf53619 Merge tag 'drm-misc-next-fixes-2023-02-21' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
Short summary of fixes pull:

Fixes GEM SHMEM locking and generic fbdev hotplugging. Constifies
dma_buf kobj type.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/Y/S6tu3gdQ0VizR+@linux-uq9g
2023-02-23 09:32:41 +10:00
Thomas Zimmermann
3fb1f62f80 drm/fb-helper: Remove drm_fb_helper_unprepare() from drm_fb_helper_fini()
Move drm_fb_helper_unprepare() from drm_fb_helper_fini() into the
calling fbdev implementation. Avoids a possible stale mutex with
generic fbdev code.

As indicated by its name, drm_fb_helper_prepare() prepares struct
drm_fb_helper before setting up the fbdev support with a call to
drm_fb_helper_init(). In legacy fbdev emulation, this happens next
to each other. If successful, drm_fb_helper_fini() later tear down
the fbdev device and also unprepare via drm_fb_helper_unprepare().

Generic fbdev emulation prepares struct drm_fb_helper immediately
after allocating the instance. It only calls drm_fb_helper_init()
as part of processing a hotplug event. If the hotplug-handling fails,
it runs drm_fb_helper_fini(). This unprepares the fb-helper instance
and the next hotplug event runs on stale data.

Solve this by moving drm_fb_helper_unprepare() from drm_fb_helper_fini()
into the fbdev implementations. Call it right before freeing the
fb-helper instance.

Fixes: 643231b283 ("drm/fbdev-generic: Minimize hotplug error handling")
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Maxime Ripard <mripard@kernel.org>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230216140620.17699-1-tzimmermann@suse.de
2023-02-21 13:26:18 +01:00
Dave Airlie
5582f3c1b1 Merge tag 'drm-intel-next-fixes-2023-02-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
drm/i915 fixes for the v6.3 merge window:
- Fix eDP+DSI dual panel systems
- Fix system suspend when fbdev isn't initialized
- Fix memory leaks in scatterlist
- Fix some MCR register annotations
- Fix documentation build warnings

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/87v8k0xyx4.fsf@intel.com
2023-02-21 10:59:20 +10:00
Dave Airlie
fec67d1896 Merge tag 'amd-drm-next-6.3-2023-02-17' of https://gitlab.freedesktop.org/agd5f/linux into drm-next
amd-drm-next-6.3-2023-02-17:

amdgpu:
- GC 11 fixes
- Display fixes
- Backlight cleanup
- SMU13 fixes
- SMU7 regression fix
- GFX9 sw queues fix
- AGP fix for GMC 11
- W1 warning fixes
- S/G display fixes
- Misc spelling fixes
- Driver unload fix
- DCN 3.1.4 fixes
- Display code reorg fixes
- Rotation fixes

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230217230930.64821-1-alexander.deucher@amd.com
2023-02-21 10:14:52 +10:00