On contiguous allocation, we round up the size
to the *next* power of 2, implement a function
to free the unused pages after the newly allocate block.
v2(Matthew Auld):
- replace function name 'drm_buddy_free_unused_pages' with
drm_buddy_block_trim
- replace input argument name 'actual_size' with 'new_size'
- add more validation checks for input arguments
- add overlaps check to avoid needless searching and splitting
- merged the below patch to see the feature in action
- add free unused pages support to i915 driver
- lock drm_buddy_block_trim() function as it calls mark_free/mark_split
are all globally visible
v3(Matthew Auld):
- remove trim method error handling as we address the failure case
at drm_buddy_block_trim() function
v4:
- in case of trim, at __alloc_range() split_block failure path
marks the block as free and removes it from the original list,
potentially also freeing it, to overcome this problem, we turn
the drm_buddy_block_trim() input node into a temporary node to
prevent recursively freeing itself, but still retain the
un-splitting/freeing of the other nodes(Matthew Auld)
- modify the drm_buddy_block_trim() function return type
v5(Matthew Auld):
- revert drm_buddy_block_trim() function return type changes in v4
- modify drm_buddy_block_trim() passing argument n_pages to original_size
as n_pages has already been rounded up to the next power-of-two and
passing n_pages results noop
v6:
- fix warnings reported by kernel test robot <lkp@intel.com>
v7:
- modify drm_buddy_block_trim() function doc description
- at drm_buddy_block_trim() handle non-allocated block as
a serious programmer error
- fix a typo
Signed-off-by: Arunpravin <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221164552.2434-3-Arunpravin.PaneerSelvam@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
Implemented a function which walk through the order list,
compares the offset and returns the maximum offset block,
this method is unpredictable in obtaining the high range
address blocks which depends on allocation and deallocation.
for instance, if driver requests address at a low specific
range, allocator traverses from the root block and splits
the larger blocks until it reaches the specific block and
in the process of splitting, lower orders in the freelist
are occupied with low range address blocks and for the
subsequent TOPDOWN memory request we may return the low
range blocks.To overcome this issue, we may go with the
below approach.
The other approach, sorting each order list entries in
ascending order and compares the last entry of each
order list in the freelist and return the max block.
This creates sorting overhead on every drm_buddy_free()
request and split up of larger blocks for a single page
request.
v2:
- Fix alignment issues(Matthew Auld)
- Remove unnecessary list_empty check(Matthew Auld)
- merged the below patch to see the feature in action
- add top-down alloc support to i915 driver
Signed-off-by: Arunpravin <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221164552.2434-2-Arunpravin.PaneerSelvam@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
- Make drm_buddy_alloc a single function to handle
range allocation and non-range allocation demands
- Implemented a new function alloc_range() which allocates
the requested power-of-two block comply with range limitations
- Moved order computation and memory alignment logic from
i915 driver to drm buddy
v2:
merged below changes to keep the build unbroken
- drm_buddy_alloc_range() becomes obsolete and may be removed
- enable ttm range allocation (fpfn / lpfn) support in i915 driver
- apply enhanced drm_buddy_alloc() function to i915 driver
v3(Matthew Auld):
- Fix alignment issues and remove unnecessary list_empty check
- add more validation checks for input arguments
- make alloc_range() block allocations as bottom-up
- optimize order computation logic
- replace uint64_t with u64, which is preferred in the kernel
v4(Matthew Auld):
- keep drm_buddy_alloc_range() function implementation for generic
actual range allocations
- keep alloc_range() implementation for end bias allocations
v5(Matthew Auld):
- modify drm_buddy_alloc() passing argument place->lpfn to lpfn
as place->lpfn will currently always be zero for i915
v6(Matthew Auld):
- fixup potential uaf - If we are unlucky and can't allocate
enough memory when splitting blocks, where we temporarily
end up with the given block and its buddy on the respective
free list, then we need to ensure we delete both blocks,
and no just the buddy, before potentially freeing them
- fix warnings reported by kernel test robot <lkp@intel.com>
v7(Matthew Auld):
- revert fixup potential uaf
- keep __alloc_range() add node to the list logic same as
drm_buddy_alloc_blocks() by having a temporary list variable
- at drm_buddy_alloc_blocks() keep i915 range_overflows macro
and add a new check for end variable
v8:
- fix warnings reported by kernel test robot <lkp@intel.com>
v9(Matthew Auld):
- remove DRM_BUDDY_RANGE_ALLOCATION flag
- remove unnecessary function description
v10:
- keep DRM_BUDDY_RANGE_ALLOCATION flag as removing the flag
and replacing with (end < size) logic fails amdgpu driver load
Signed-off-by: Arunpravin <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221164552.2434-1-Arunpravin.PaneerSelvam@amd.com
struct drm_display_mode embeds a list head, so overwriting
the full struct with another one will corrupt the list
(if the destination mode is on a list). Use drm_mode_copy()
instead which explicitly preserves the list head of
the destination mode.
Even if we know the destination mode is not on any list
using drm_mode_copy() seems decent as it sets a good
example. Bad examples of not using it might eventually
get copied into code where preserving the list head
actually matters.
Obviously one case not covered here is when the mode
itself is embedded in a larger structure and the whole
structure is copied. But if we are careful when copying
into modes embedded in structures I think we can be a
little more reassured that bogus list heads haven't been
propagated in.
@is_mode_copy@
@@
drm_mode_copy(...)
{
...
}
@depends on !is_mode_copy@
struct drm_display_mode *mode;
expression E, S;
@@
(
- *mode = E
+ drm_mode_copy(mode, &E)
|
- memcpy(mode, E, S)
+ drm_mode_copy(mode, E)
)
@depends on !is_mode_copy@
struct drm_display_mode mode;
expression E;
@@
(
- mode = E
+ drm_mode_copy(&mode, &E)
|
- memcpy(&mode, E, S)
+ drm_mode_copy(&mode, E)
)
@@
struct drm_display_mode *mode;
@@
- &*mode
+ mode
Cc: Emma Anholt <emma@anholt.net>
Cc: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218100403.7028-18-ville.syrjala@linux.intel.com
Fbdev's deferred I/O sorts all dirty pages by default, which incurs a
significant overhead. Make the sorting step optional and update the few
drivers that require it. Use a FIFO list by default.
Most fbdev drivers with deferred I/O build a bounding rectangle around
the dirty pages or simply flush the whole screen. The only two affected
DRM drivers, generic fbdev and vmwgfx, both use a bounding rectangle.
In those cases, the exact order of the pages doesn't matter. The other
drivers look at the page index or handle pages one-by-one. The patch
sets the sort_pagelist flag for those, even though some of them would
probably work correctly without sorting. Driver maintainers should update
their driver accordingly.
Sorting pages by memory offset for deferred I/O performs an implicit
bubble-sort step on the list of dirty pages. The algorithm goes through
the list of dirty pages and inserts each new page according to its
index field. Even worse, list traversal always starts at the first
entry. As video memory is most likely updated scanline by scanline, the
algorithm traverses through the complete list for each updated page.
For example, with 1024x768x32bpp each page covers exactly one scanline.
Writing a single screen update from top to bottom requires updating
768 pages. With an average list length of 384 entries, a screen update
creates (768 * 384 =) 294912 compare operation.
Fix this by making the sorting step opt-in and update the few drivers
that require it. All other drivers work with unsorted page lists. Pages
are appended to the list. Therefore, in the common case of writing the
framebuffer top to bottom, pages are still sorted by offset, which may
have a positive effect on performance.
Playing a video [1] in mplayer's benchmark mode shows the difference
(i7-4790, FullHD, simpledrm, kernel with debugging).
mplayer -benchmark -nosound -vo fbdev ./big_buck_bunny_720p_stereo.ogg
With sorted page lists:
BENCHMARKs: VC: 32.960s VO: 73.068s A: 0.000s Sys: 2.413s = 108.441s
BENCHMARK%: VC: 30.3947% VO: 67.3802% A: 0.0000% Sys: 2.2251% = 100.0000%
With unsorted page lists:
BENCHMARKs: VC: 31.005s VO: 42.889s A: 0.000s Sys: 2.256s = 76.150s
BENCHMARK%: VC: 40.7156% VO: 56.3219% A: 0.0000% Sys: 2.9625% = 100.0000%
VC shows the overhead of video decoding, VO shows the overhead of the
video output. Using unsorted page lists reduces the benchmark's run time
by ~32s/~25%.
v2:
* Make sorted pagelists the special case (Sam)
* Comment on drivers' use of pagelist (Sam)
* Warn about the overhead in comment
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://download.blender.org/peach/bigbuckbunny_movies/big_buck_bunny_720p_stereo.ogg # [1]
Link: https://patchwork.freedesktop.org/patch/msgid/20220211094640.21632-3-tzimmermann@suse.de
Add support to convert from XR24 to reversed monochrome for drivers that
control monochromatic display panels, that only have 1 bit per pixel.
The function does a line-by-line conversion doing an intermediate step
first from XR24 to 8-bit grayscale and then to reversed monochrome.
The drm_fb_gray8_to_mono_reversed_line() helper was based on code from
drivers/gpu/drm/tiny/repaper.c driver.
Signed-off-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Maxime Ripard <maxime@cerno.tech>
Link: https://patchwork.freedesktop.org/patch/msgid/20220214133710.3278506-3-javierm@redhat.com
Recently we added generic "edp-panel"s probed by EDID. To support
panels in this way we look at the panel ID in the EDID and look up the
panel in a table that has power sequence timings. If we find a panel
that's not in the table we will still attempt to use it but we'll use
conservative timings. While it's likely that these conservative
timings will work for most nearly all panels, the performance of
turning the panel off and on suffers.
We'd like to be able to reliably detect the case that we're using the
hardcoded timings without relying on parsing dmesg. This allows us to
implement tests that ensure that no devices get shipped that are
relying on the conservative timings.
Let's add a new debugfs entry to panel devices. It will have one of:
* UNKNOWN - We tried to detect a panel but it wasn't in our table.
* HARDCODED - We're not using generic "edp-panel" probed by EDID.
* A panel name - This is the name of the panel from our table.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220204161245.v2.3.I209d72bcc571e1d7d6b793db71bf15c9c0fc9292@changeid
We'd like panels to be able to add things to debugfs underneath the
connector's directory. Let's plumb it through. A panel will be able to
put things in a "panel" directory under the connector's
directory. Note that debugfs is not ABI and so it's always possible
that the location that the panel gets for its debugfs could change in
the future.
NOTE: this currently only works if you're using a modern
architecture. Specifically the plumbing relies on _both_
drm_bridge_connector and drm_panel_bridge. If you're not using one or
both of these things then things won't be plumbed through.
As a side effect of this change, drm_bridges can also get callbacks to
put stuff underneath the connector's debugfs directory. At the moment
all bridges in the chain have their debugfs_init() called with the
connector's root directory.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220204161245.v2.2.Ib0bd5346135cbb0b63006b69b61d4c8af6484740@changeid
The ti-sn65dsi86 driver shouldn't hand-roll its own bridge
connector. It should use the normal drm_bridge_connector. Let's switch
to do that, removing all of the custom code.
NOTE: this still _doesn't_ implement DRM_BRIDGE_ATTACH_NO_CONNECTOR
support for ti-sn65dsi86 and that would still be a useful thing to do
in the future. It was attempted in the past [1] but put on the back
burner. However, unless we instantly change ti-sn65dsi86 fully from
not supporting DRM_BRIDGE_ATTACH_NO_CONNECTOR at all to _only_
supporting DRM_BRIDGE_ATTACH_NO_CONNECTOR then we'll still need a bit
of time when we support both. This is a better way to support the old
way where the driver hand rolls things itself.
A new notes about the implementation here:
* When using the drm_bridge_connector the connector should be created
after all the bridges, so we change the ordering a bit.
* I'm reasonably certain that we don't need to do anything to "free"
the new drm_bridge_connector. If drm_bridge_connector_init() returns
success then we know drm_connector_init() was called with the
`drm_bridge_connector_funcs`. The `drm_bridge_connector_funcs` has a
.destroy() that does all the cleanup. drm_connector_init() calls
__drm_mode_object_add() with a drm_connector_free() that will call
the .destroy().
* I'm also reasonably certain that I don't need to "undo" the
drm_bridge_attach() if drm_bridge_connector_init() fails. The
"detach" function is private and other similar code doesn't try to
undo the drm_bridge_attach() in error cases. There's also a comment
indicating the lack of balance at the top of drm_bridge_attach().
[1] https://lore.kernel.org/r/20210920225801.227211-4-robdclark@gmail.com
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220204161245.v2.1.I3ab26b7f197cc56c874246a43e57913e9c2c1028@changeid
Support reading edid through aux channel if panel is connected to aux
bus. Extend anx7625_aux_dpcd_trans() to implement aux transfer function:
1. panel is populated in devm_of_dp_aux_populate_ep_devices(), so move
anx7625_parse_dt() after.
2. Use pm runtime autosuspend since aux transfer function is called
multiple times when reading edid.
3. No-op if aux transfer length is 0.
Signed-off-by: Hsin-Yi Wang <hsinyi@chromium.org>
Reviewed-by: Xin Ji <xji@analogixsemi.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Robert Foss <robert.foss@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220213103437.3363848-3-hsinyi@chromium.org
Currently when users try to run an application with lima and that hits
an issue such as a timeout, a message saying "fail to save task state"
and "error task list is full" is shown in dmesg.
The error task dump is a debug feature disabled by default, so the
error task list is usually not going to be available at all.
The message can be misleading and creates confusion in bug reports.
We can avoid that code path and that particular message when the user
has not explicitly set the max_error_tasks parameter to enable the
feature.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Qiang Yu <yuq825@gmail.com>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220209093700.30901-1-nunes.erico@gmail.com
The IDVS group size feature was missing. It is used on some Bifrost and
Valhall GPUs, and is the last kernel-relevant Bifrost feature we're
missing.
This feature adds an extra IDVS group size field to the JM_CONFIG
register. In kbase, the value is configurable via the device tree; kbase
uses 0xF as a default if no value is specified. Until we find a device
demanding otherwise, let's always set the 0xF default on devices which
support this feature mimicking kbase's behaviour.
Tuning this register slightly improves performance of index-driven
vertex shading. On Mali-G52 (with Mesa), overall glmark2 score is
improved from 1026 to 1037. Geometry-heavy scenes like -bshading are
improved from 1068 to 1098.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220211145849.3148-1-alyssa.rosenzweig@collabora.com
When the dw-hdmi bridge is in first place of the bridge chain, this
means there is no way to select an input format of the dw-hdmi HW
component.
Since introduction of display-connector, negotiation was broken since
the dw-hdmi negotiation code only worked when the dw-hdmi bridge was
in last position of the bridge chain or behind another bridge also
supporting input & output format negotiation.
Commit 7cd70656d1 ("drm/bridge: display-connector: implement bus fmts callbacks")
was introduced to make negotiation work again by making display-connector
act as a pass-through concerning input & output format negotiation.
But in the case where the dw-hdmi is single in the bridge chain, for
example on Renesas SoCs, with the display-connector bridge the dw-hdmi
is no more single, breaking output format.
Reported-by: Biju Das <biju.das.jz@bp.renesas.com>
Bisected-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Tested-by: Kieran Bingham <kieran.bingham+renesas@ideasonboard.com>
Fixes: 6c3c719936 ("drm/bridge: synopsys: dw-hdmi: add bus format negociation")
Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
[narmstrong: add proper fixes commit]
Reviewed-by: Robert Foss <robert.foss@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20220204143337.89221-1-narmstrong@baylibre.com
Add drm_fb_helper_memory_range_to_clip(), a helper function that
accepts an linear range of video memory and converts it into a
rectangle. The computed rectangle describes the damaged area in
terms of scanlines and pixels per scanline.
While at it, make the code more readable by using struct drm_rect
and related helpers.
The code was previously part of the deferred I/O helpers, but is
also useful for damage handling of regular write operations. Update
the deferred I/O code to use the new function.
v2:
* rename helper (Javier)
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220209161617.3553-4-tzimmermann@suse.de
Don't clip the damage rectangle against the viewport. This only
works if the viewport is located at the beginning of the video
memory and the video memory doesn't extend the screen (i.e., if
there's no overallocation).
Fbdev emulation transfers data from write operations into a
possible shadow buffer, then into a GEM buffer object, and finally
via graphics driver onto the screen.
If callers write outside the currently visible area, clipping the
damage rectangle against the viewport will loose these updates in
the shadow buffer and the fbdev's buffer object will contain stale
data. Panning the viewport to the stale area of the buffer will
display obsolete data.
Instead, mark all written areas as damaged, so that the damage
handler updates the buffer object from the shadow buffer for all
such areas. The graphics driver's later has the option of clipping
the damaged area against the viewport when updating the screen
from the buffer object.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220209161617.3553-3-tzimmermann@suse.de