linux/include/linux
Lucas De Marchi 5f278dbd54 iosys-map: Add per-word read
Instead of always falling back to memcpy_fromio() for any size, prefer
using read{b,w,l}(). When reading struct members, it is common to read
integer fields one at a time. Going through memcpy_fromio() for each of
them carries a high penalty.

Employ a similar trick as __seqprop() by using _Generic() to generate
only the specific call based on a type-compatible variable.
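
A minimal userspace sketch of the _Generic() dispatch described above.
The demo_* names and the record layout are hypothetical, not the
kernel's, and plain volatile loads stand in for the real MMIO
accessors:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for readb()/readw()/readl(); plain volatile
 * loads here, since userspace has no MMIO mapping to read from. */
static inline uint8_t demo_readb(const volatile void *addr)
{
	return *(const volatile uint8_t *)addr;
}

static inline uint16_t demo_readw(const volatile void *addr)
{
	return *(const volatile uint16_t *)addr;
}

static inline uint32_t demo_readl(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

/* Pick exactly one accessor at compile time from the type of val.
 * There is no default case, so an unsupported type fails to build. */
#define demo_rd_io(val, addr) _Generic((val),	\
	uint8_t:  demo_readb,			\
	uint16_t: demo_readw,			\
	uint32_t: demo_readl)(addr)

/* Hypothetical record layout, for illustration only. */
struct demo_record {
	uint32_t total;
	uint16_t id;
	uint8_t flags;
};

/* Read each member with the accessor matching its width. */
static int demo_selftest(void)
{
	struct demo_record rec = { .total = 0xdeadbeef, .id = 0x1234, .flags = 0x5a };
	uint32_t total = demo_rd_io(rec.total, &rec.total);
	uint16_t id = demo_rd_io(rec.id, &rec.id);
	uint8_t flags = demo_rd_io(rec.flags, &rec.flags);

	assert(total == 0xdeadbeef);
	assert(id == 0x1234);
	assert(flags == 0x5a);
	return 0;
}
```

Because the selection happens at compile time, each call site contains
only the one load that matches the member's type.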

For a particular i915 workload producing GPU context switches,
__get_engine_usage_record() is particularly hot since the engine usage
is read from device local memory with dgfx, possibly multiple times
since it's racy. Execution time for this test shows a ~12.5%
improvement with DG2:

Before:
	nrepeats = 1000; min = 7.63243e+06; max = 1.01817e+07;
	median = 9.52548e+06; var = 526149;
After:
	nrepeats = 1000; min = 7.03402e+06; max = 8.8832e+06;
	median = 8.33955e+06; var = 333113;
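
The ~12.5% figure can be checked against the two medians above. A
small sketch, assuming the improvement is taken as the relative drop in
the median (demo_improvement_pct is a hypothetical helper, not part of
the test tooling):

```c
#include <assert.h>

/* Relative reduction of `after` with respect to `before`, in percent. */
static double demo_improvement_pct(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}

/* Median before and after, as reported above. */
static double demo_median_improvement(void)
{
	return demo_improvement_pct(9.52548e+06, 8.33955e+06);
}
```

With the medians above this comes to roughly 12.45%.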

Other things attempted that didn't prove very useful:
1) Change the _Generic() on x86 to just dereference the memory address
2) Change __get_engine_usage_record() to do just 1 read per loop,
   comparing with the previous value read
3) Change __get_engine_usage_record() to access the fields directly as it
   was before the conversion to iosys-map

(3) did give a small improvement (~3%), but doesn't seem to scale well
to other similar cases in the driver.

Chris Wilson ran an additional test using gem_create from igt, with
some changes to track object creation time. It happens to stress this
code path:

	Pre iosys_map conversion of engine busyness:
	lmem0: Creating    262144 4KiB objects took 59274.2ms

	Unpatched:
	lmem0: Creating    262144 4KiB objects took 108830.2ms

	With readl (this patch):
	lmem0: Creating    262144 4KiB objects took 61348.6ms

	s/readl/READ_ONCE/
	lmem0: Creating    262144 4KiB objects took 61333.2ms

So we do take a little bit more time than before the conversion, but
that is due to other factors: bringing the READ_ONCE back would perform
no better than this patch (61333.2ms vs. 61348.6ms).

v2:
  - Remove default from _Generic() - callers wanting to read more
    than u64 should use iosys_map_memcpy_from()
  - Add READ_ONCE() cases dereferencing the pointer when using system
    memory
v3:
  - Fix precedence issue when casting inside READ_ONCE(). By not using ()
    around vaddr__ the offset was not part of the cast, but rather added
    to it, producing a wrong address
  - Remove compiletime_assert() as READ_ONCE() already contains it
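
The precedence issue fixed in v3 can be illustrated with a simplified
sketch. The macros below are hypothetical and only compute a pointer,
whereas the real helper wraps a dereference in READ_ONCE(); the point
is only how an unparenthesized macro argument interacts with a cast:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macros. The argument is pasted textually, so without
 * parentheses the cast applies to the first operand only. */
#define DEMO_PTR_BAD(type, vaddr) ((const type *)vaddr)
#define DEMO_PTR_OK(type, vaddr)  ((const type *)(vaddr))

/* Byte distance from base that each macro ends up pointing at. */
static ptrdiff_t demo_bad_delta(const uint8_t *base, size_t off)
{
	/* Expands to ((const uint32_t *)base + off): the offset is added
	 * after the cast and scaled by sizeof(uint32_t). */
	return (const uint8_t *)DEMO_PTR_BAD(uint32_t, base + off) - base;
}

static ptrdiff_t demo_ok_delta(const uint8_t *base, size_t off)
{
	/* Expands to ((const uint32_t *)(base + off)): offset in bytes. */
	return (const uint8_t *)DEMO_PTR_OK(uint32_t, base + off) - base;
}

static int demo_precedence_selftest(void)
{
	/* uint32_t storage keeps both resulting pointers aligned. */
	static const uint32_t storage[8];
	const uint8_t *base = (const uint8_t *)storage;

	assert(demo_ok_delta(base, 4) == 4);
	assert(demo_bad_delta(base, 4) == 16);
	return 0;
}
```

Parenthesizing the macro parameter makes the whole base-plus-offset
expression the operand of the cast.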

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com> # v1
Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220628191016.3899428-1-lucas.demarchi@intel.com
2022-06-29 17:41:32 -07:00