Define a utility macro for defining capabilities and their attributes.
Capability attributes are read-only, write-only, read-write.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Rename extra_dw to extra_bytes and document what it's for.
The value is already used as if it were bytes in vcn_v4_0.c
and in amdgpu_ring_init. Just adjust the dword count in
jpeg_v1_0.c so that it becomes a byte count.
v2:
Rename extra_dw to extra_bytes as discussed during review.
Fixes: c8c1a1d2ef ("drm/amdgpu: define and add extra dword for jpeg ring")
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This code isn't needed anymore as we collect the same information
into pm_display_cfg instead.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This commit is necessary for DC to function well with chips
that use the legacy power management code, ie. SI and KV.
Communicate display information from DC to the legacy PM code.
Currently DC uses pm_display_cfg to communicate power management
requirements from the display code to the DPM code.
However, the legacy (non-DC) code path used different fields
and therefore could not take into account anything from DC.
Change the legacy display code to fill the same pm_display_cfg
struct as DC and use the same in the legacy DPM code.
To ease review and reduce churn, this commit does not yet
delete the now unneeded code, that is done in the next commit.
v2:
Rebase.
Fix single_display in amdgpu_dpm_pick_power_state.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This commit adds the pixel_clock field to the display config
struct so that power management (DPM) can use it.
We currently don't have a proper bandwidth calculation on old
GPUs with DCE 6-10 because dce_calcs only supports DCE 11+.
So the power management (DPM) on these GPUs may need to make
ad-hoc decisions for display based on the pixel clock.
Also rename sym_clock to pixel_clock in dm_pp_single_disp_config
to avoid confusion with other code where the sym_clock refers to
the DisplayPort symbol clock.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
xgmi hive reference is taken on function entry, but not released
correctly for all paths. Use __free() to release reference properly.
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Ce Sun <cesun102@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add mmio_remap bookkeeping to amdgpu_device and introduce
amdgpu_ttm_mmio_remap_bo_init()/fini() to manage a kernel-owned,
one-page (4K) BO in AMDGPU_GEM_DOMAIN_MMIO_REMAP.
Bookkeeping:
- adev->rmmio_remap.bo : kernel-owned singleton BO
The BO is allocated during TTM init when a remap bus address is available
(adev->rmmio_remap.bus_addr) and PAGE_SIZE <= AMDGPU_GPU_PAGE_SIZE (4K),
and freed during TTM fini.
v2:
- Check mmio_remap bus address (adev->rmmio_remap.bus_addr) instead of
rmmio_base. (Alex)
- Skip quietly if PAGE_SIZE > AMDGPU_GPU_PAGE_SIZE or no bus address
(no warn). (Alex)
- Use `amdgpu_bo_create()` (not *_kernel) - Only with this The object
is stored in adev->mmio_remap.bo and will later be exposed to
userspace via a GEM handle. (Christian)
v3:
- Remove obvious comment before amdgpu_ttm_mmio_remap_bo_fini() call.
(Alex)
v4:
- Squash bookkeeping into this patch (Christian)
Suggested-by: Christian König <christian.koenig@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a one-page TTM range manager for AMDGPU_PL_MMIO_REMAP via
amdgpu_ttm_init_on_chip(). This only registers the placement with TTM;
no BO is allocated in this patch.
The singleton 4K remap BO is created and freed in the following patch.
This split follows to separate heap bring-up from BO allocation.
Cc: Christian König <christian.koenig@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Implement TTM-level behavior for AMDGPU_PL_MMIO_REMAP so it behaves as a
CPU-visible IO page:
* amdgpu_evict_flags(): mark as unmovable
* amdgpu_res_cpu_visible(): consider CPU-visible
* amdgpu_bo_move(): use null move when src/dst is MMIO_REMAP
* amdgpu_ttm_io_mem_reserve(): program base/is_iomem/caching using
the device's mmio_remap_* metadata
* amdgpu_ttm_io_mem_pfn(): return PFN for the remapped HDP page
* amdgpu_ttm_tt_pde_flags(): set AMDGPU_PTE_SYSTEM for this mem type
v2:
- Drop HDP-specific comment; keep generic remap (Alex).
v3:
- Fix indentation in amdgpu_res_cpu_visible (Christian).
- Use adev->rmmio_remap.bus_addr for MMIO_REMAP bus/PFN calculations
(Alex).
v4:
- Drop unnecessary (resource_size_t) casts in MMIO_REMAP io-mem paths
(Alex)
Cc: Christian König <christian.koenig@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Replace kzalloc() followed by copy_from_user() with memdup_user() to
improve and simplify ta_if_load_debugfs_write() and
ta_if_invoke_debugfs_write().
No functional changes intended.
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Replace kzalloc() followed by copy_from_user() with memdup_user() to
improve and simplify kfd_ioctl_set_cu_mask().
Return early if an error occurs and remove the obsolete 'out' label.
No functional changes intended.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Replace k(v)malloc_array() + copy_from_user() with (v)memdup_array_user().
This shrinks the source code and improves separation between the kernel
and userspace slabs.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Replace kmalloc_array() + copy_from_user() with memdup_array_user().
This shrinks the source code and improves separation between the kernel
and userspace slabs.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Replace kvmalloc_array() + copy_from_user() with vmemdup_array_user() on
the fast path.
This shrinks the source code and improves separation between the kernel
and userspace slabs.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
None of the pointer operations handled by the ring file requires
volatile, for this reason, this commit removes all occurrences of
volatile associated with rings.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The CSB buffer manipulation occurs in memory where the BO is mapped
during initialization, and some references to this buffer are handled
with volatile, which is incorrect in this scenario. There are a few
cases where the use of volatile is accepted, but none of them align with
CSB operations. Therefore, this commit removes all the volatile
variables associated with the CSB code.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The function amdgpu_vcn_sw_fini() returns an integer, but this number is
always 0. This commit changes the amdgpu_vcn_sw_fini() return to void,
and eliminates all checks to this return across different VCNs.
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When examining the VCN function init, it is common to find a loop that
initializes VCN rings, which uses one IRQ per instance. However, VCN
4.0.3 deviates from this pattern, as it includes a distinct field to
differentiate instances, which results in a slightly different ring
init. This commit makes this difference explicit by using a fixed index
when initializing the ring buffer and also adds a comment.
Signed-off-by: Rodrigo Siqueira <siqueira@igalia.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Before destroying the userq buffer object, it requires validating
the userq HW unmap status and ensuring the userq is unmapped from
hardware. If the user HW unmap failed, then it needs to reset the
queue for reusing.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Wire up the conversions and strings for the new MMIO_REMAP placement:
* amdgpu_mem_type_to_domain() maps AMDGPU_PL_MMIO_REMAP -> domain
* amdgpu_bo_placement_from_domain() accepts the new domain
* amdgpu_bo_mem_stats_placement() and amdgpu_bo_print_info() report it
* res cursor supports the new placement
* fdinfo prints "mmioremap" for the new placement
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Introduce a kernel-internal TTM placement type AMDGPU_PL_MMIO_REMAP
for the HDP flush MMIO remap page
Plumbing added:
- amdgpu_res_cursor.{first,next}: treat MMIO_REMAP like DOORBELL
- amdgpu_ttm_io_mem_reserve(): return BAR bus address + offset
for MMIO_REMAP, mark as uncached I/O
- amdgpu_ttm_io_mem_pfn(): PFN from register BAR
- amdgpu_res_cpu_visible(): visible to CPU
- amdgpu_evict_flags()/amdgpu_bo_move(): non-migratable
- amdgpu_ttm_tt_pde_flags(): map as SYSTEM
- amdgpu_bo_mem_stats_placement(): report AMDGPU_PL_MMIO_REMAP
- amdgpu_fdinfo: print “mmioremap” bucket label
Cc: Alex Deucher <alexander.deucher@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There is no reason to require this to happen on first submitted IB only.
We need to wait for the queue to be idle, but it can be done at any
time (including when there are multiple video sessions active).
Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There can be multiple engine info packages in one IB and the first one
may be common engine, not decode/encode.
We need to parse the entire IB instead of stopping after finding first
engine info.
Signed-off-by: David Rosca <david.rosca@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This will help on validating the userq input args, and
rejecting for the invalid userq request at the IOCTLs
first place.
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add a new GEM domain bit AMDGPU_GEM_DOMAIN_MMIO_REMAP to allow
userspace to request the MMIO remap (HDP flush) page via GEM_CREATE.
- include/uapi/drm/amdgpu_drm.h:
* define AMDGPU_GEM_DOMAIN_MMIO_REMAP
* include the bit in AMDGPU_GEM_DOMAIN_MASK
v2: Add early reject in amdgpu_gem_create_ioctl() (Alex).
Cc: Christian König <christian.koenig@amd.com>
Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Commit b61badd20b ("drm/amdgpu: fix usage slab after free")
reordered when amdgpu_fence_driver_sw_fini() was called after
that patch, amdgpu_fence_driver_sw_fini() effectively became
a no-op as the sched entities we never freed because the
ring pointers were already set to NULL. Remove the NULL
setting.
Reported-by: Lin.Cao <lincao12@amd.com>
Cc: Vitaly Prosyak <vitaly.prosyak@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Fixes: b61badd20b ("drm/amdgpu: fix usage slab after free")
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The copy_to_user() function returns the number of bytes that it wasn't
able to copy, but we should return -EFAULT to the user.
Fixes: 4d82724f7f ("drm/amdgpu: Add mapping info option for GEM_OP ioctl")
Fixes: f9db1fc52c ("drm/amdgpu: Add ioctl to get all gem handles for a process")
Reviewed-By: David Francis <David.Francis@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Older GPUs did not support memory protection, so the kernel
driver would validate the command submissions (CS) from userspace
to avoid the GPU accessing any memory it shouldn't.
Change any error messages in that validation to dev_warn_once() to
avoid spamming the kernel log in the event of a bad CS. If users
see any of these messages they should report them to the user space
component, which in most cases is mesa
(https://gitlab.freedesktop.org/mesa/mesa/-/issues).
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20250829171655.GBaLHgh3VOvuM1UfJg@fat_crate.local
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The header comment above amdgpu_gem_list_handles_ioctl referenced
drm_amdgpu_gem_list_handles_ioctl. Update the comment to reflect the
actual function identifier to avoid misleading prototype warnings.
Fixes the below:
drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c:1106: warning: expecting prototype for drm_amdgpu_gem_list_handles_ioctl(). Prototype was for amdgpu_gem_list_handles_ioctl() instead
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Print PD address of VM root instead of GPU address in the debugfs.
On modern GPU's this is what UMR tool expects in the registers
as well.
Fixes: 719b378d37 ("drm/amdgpu: add debugfs support for VM pagetable per client")
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This version brings along following updates:
- Disable stutter when programming watermarks on dcn32
- Fix pbn_div Calculation Error
- Correct sequences and delays for DCN35 PG & RCG
- Define interfaces for hubbub perfmance monitoring support
- Extend to read eDP general capability 2
- Indicate when custom brightness curves are in use
- Dont wait for pipe update during medupdate/highirq
- Add HDCP retry_limit control parameter
Acked-by: Tom Chung <chiahsuan.chung@amd.com>
Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
dm_mst_get_pbn_divider() returns value integer coming from
the cast from fixed point, but the casted integer will then be used
in dfixed_const to be multiplied by 4096. The cast from fixed point to integer
causes the calculation error becomes bigger when multiplied by 4096.
That makes the calculated pbn_div value becomes smaller than
it should be, which leads to the req_slot number becomes bigger.
Such error is getting reflected in 8k30 timing,
where the correct and incorrect calculated req_slot 62.9 Vs 63.1.
That makes the wrong calculation failed to light up 8k30
after a dock under HBR3 x 4.
[How]
Restore the accuracy by keeping the fraction part
calculated for the left shift operation.
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Fangzhi Zuo <Jerry.Zuo@amd.com>
Signed-off-by: Wayne Lin <wayne.lin@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>