Several factors conspire against us when trying to execute
the tiled small-bo tests:
- pre-gen4 platforms require power of two fences, with natural alignment
- the entire gtt may be mappable
- we put a guard page at the end of gtt
What all that means is that when we try to use a tiled object half
the size of the mappable area, we can only fit it in the first half
of the gtt. That leads to a SIGBUS when we try to fault in the
object when there's already something (e.g. fbdev) occupying the
first half of gtt.
So in order to make the tests run on old machines, let's further
halve the object size when things look too tight.
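To illustrate the arithmetic, here is a minimal sketch that counts how many
naturally aligned placements a power-of-two object has once a guard area is
reserved at the end of the aperture. The function name, the 256 MiB aperture
and the 4 KiB guard used in the usage note are assumptions for illustration,
not code or values taken from the tests.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: count the naturally aligned offsets at which a
 * power-of-two sized object fits below (aperture - guard).
 */
static unsigned int count_aligned_slots(uint64_t aperture, uint64_t guard,
					uint64_t size)
{
	unsigned int slots = 0;
	uint64_t offset;

	/* Natural alignment only makes sense for power-of-two sizes. */
	assert(size != 0 && (size & (size - 1)) == 0);
	assert(guard <= aperture);

	/* Walk every offset that is a multiple of the object size. */
	for (offset = 0; offset + size <= aperture - guard; offset += size)
		slots++;

	return slots;
}
```

With a 256 MiB aperture and a 4 KiB guard, a 128 MiB object has a single
valid slot (offset 0), while a 64 MiB object has three, which is why halving
the size again leaves room when something else already sits in the first
half.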
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Some of the copy tests take a while, so let the user know how
far along we are via a progress indicator.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Gen2/3 platforms have some unusual tile dimensions. Account
for them to make the test work correctly.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
igt_kms.c: In function ‘igt_crtc_set_background’:
igt_kms.c:1940:2: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘uint64_t’ [-Wformat=]
LOG(display, "%s.%d: crtc_set_background(%lu)\n",
^
intel_firmware_decode.c: In function ‘csr_open’:
intel_firmware_decode.c:169:2: warning: format ‘%zd’ expects argument of type ‘signed size_t’, but argument 3 has type ‘__off_t’ [-Wformat=]
printf("Firmware: %s (%zd bytes)\n", filename, st.st_size);
^
intel_gpu_top.c: In function ‘main’:
intel_gpu_top.c:683:10: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t’ [-Wformat=]
stats[i] - last_stats[i]);
^
hsw_compute_wrpll.c: In function ‘main’:
hsw_compute_wrpll.c:644:3: warning: format ‘%li’ expects argument of type ‘long int’, but argument 7 has type ‘long long int’ [-Wformat=]
igt_fail_on_f(ref->r2 != r2 || ref->n2 != n2 || ref->p != p,
^
gem_gtt_hog.c: In function ‘__real_main155’:
gem_gtt_hog.c:177:2: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘unsigned int’ [-Wformat=]
igt_info("Time to execute %lu children: %7.3fms\n",
^
kms_flip.c: In function ‘run_test_step’:
kms_flip.c:985:3: warning: format ‘%u’ expects argument of type ‘unsigned int’, but argument 10 has type ‘__time_t’ [-Wformat=]
igt_assert_f(end - start > 0.9 * frame_time(o) &&
^
kms_flip.c:985:3: warning: format ‘%u’ expects argument of type ‘unsigned int’, but argument 11 has type ‘__suseconds_t’ [-Wformat=]
kms_frontbuffer_tracking.c: In function ‘setup_sink_crc’:
kms_frontbuffer_tracking.c:1364:3: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 4 has type ‘ssize_t’ [-Wformat=]
igt_info("Unexpected sink CRC error, rc=:%ld errno:%d %s\n",
^
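Warnings of this kind are usually silenced by using the <inttypes.h> macros
for fixed-width types and by casting types of unspecified width before
printing. The sketch below shows the general pattern only; the function name
and the message are made up and do not reproduce the actual changes.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical example: portable conversions for three of the types above. */
static int format_examples(char *buf, size_t len, uint64_t value,
			   off_t file_size, ssize_t rc)
{
	assert(buf != NULL);

	/*
	 * uint64_t: PRIu64 expands to the right length modifier.
	 * off_t:    width varies, so cast to intmax_t and print with %jd.
	 * ssize_t:  %zd, as suggested by the compiler for size_t-like types.
	 */
	return snprintf(buf, len, "value=%" PRIu64 " size=%jd rc=%zd",
			value, (intmax_t)file_size, rc);
}
```

The same cast-then-print approach applies to time_t and suseconds_t, whose
widths also differ between platforms.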
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
The test tries to anger CHV pipe C cursor by walking the edges of the
screen while moving the cursor across the screen edge.
The actual hw issue only occurs on pipe C, and only on the left screen
edge. The testcase can walk all the edges though, and on all pipes, just
so I could make sure the failure doesn't occur there.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Add support for reading the CRC in non-blocking mode. Useful for tests
that want to start the CRC capture, then do a bunch of operations, then
collect however many CRCs were generated. The current
igt_pipe_crc_new() + igt_pipe_crc_get_crcs() method would block until
it gets the requested number of CRCs, whereas in non-blocking mode we
can just read as many as have been generated thus far.
v2: __attribute__((warn_unused_result)), document the
new igt_pipe_crc_get_crcs() return value (Daniel)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Several tests do one or more of the following:
* igt_create_fb() + igt_paint_test_pattern()
* igt_create_color_fb() + igt_paint_test_pattern()
* igt_create_fb() + igt_paint_image()
Extract them into new helpers: igt_create_pattern_fb(),
igt_create_color_pattern_fb(), igt_create_image_fb().
v2: Fix typos, and improve API docs (Thomas)
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
i915 validates that requested offset is in canonical form, so tests need
to convert the offsets as required.
Also add test to verify non-canonical 48-bit address will be rejected.
v2: Use sign_extend64 for converting to canonical form (Tvrtko)
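For reference, a minimal sketch of the conversion, assuming a 48-bit address
space where bit 47 is the sign bit. The helper names are hypothetical and the
sign extension is written portably rather than copied from the kernel's
sign_extend64().

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend 'value' from bit 'index' (0-based) up to bit 63. */
static uint64_t sign_extend64_sketch(uint64_t value, unsigned int index)
{
	uint64_t sign, mask;

	assert(index < 64);

	sign = UINT64_C(1) << index;
	mask = (sign << 1) - 1;	/* all ones when index == 63 */

	/* Drop the bits above 'index', then replicate the sign bit. */
	value &= mask;
	return (value ^ sign) - sign;
}

/* Assumption: 48-bit addressing, so bit 47 decides bits 63:48. */
static uint64_t canonical_offset_sketch(uint64_t offset)
{
	return sign_extend64_sketch(offset, 47);
}
```

An offset with bit 47 clear is returned unchanged, while 1 << 47 becomes
0xffff800000000000. An offset that differs from its canonical form is the
kind of value the new test expects to be rejected.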
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Reviewed-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
line[strlen(line)] always evaluates to the NUL terminator ('\0'), never
'\n', so line_continuation was always true. That prevented the program
name, pid and log level from ever being printed.
Changed to [strlen(line) - 1] so the last character before the null
terminator is compared with '\n' to determine line_continuation.
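A minimal sketch of the corrected check follows. The function name is
hypothetical, and the guard for an empty string is an extra assumption that
is not described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true when 'line' does not end in a newline. */
static bool line_continues(const char *line)
{
	size_t len;

	assert(line != NULL);
	len = strlen(line);

	/*
	 * line[len] is always '\0', so comparing it against '\n' can never
	 * match. Look at the last real character instead. An empty string
	 * is treated as a continuation here (assumption).
	 */
	return len == 0 || line[len - 1] != '\n';
}
```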
Signed-off-by: Derek Morton <derek.j.morton@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The default is too low for panels that are 30 fps or lower.
Bump the timeout to 50 ms to prevent spurious errors on those
displays.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
In order to do concurrency checks using different allocation functions,
we need to hook those functions up to gem_concurrent_all. So let's add
another layer of combinations! The actual enabling for create2-ioctl
will come in the future.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Like the previous patch to gem_exec_ctx, restrict gem_exec_nop to running
for a fixed length of time, rather than over a range of different
execution counts. In order to retain some measurement of that range,
allow measuring individual execution versus continuous dispatch.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Rather than investigate the curve for dispatch latency, just run for a
fixed time and report an average latency. Instead offer two modes:
average single dispatch latency and average continuous dispatch latency.
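The fixed-time approach can be sketched as below: call an operation
repeatedly until a time budget expires, then divide the elapsed time by the
number of calls. All names are hypothetical and the no-op callback stands in
for a real dispatch; this is not the benchmark's code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Seconds elapsed between two CLOCK_MONOTONIC samples. */
static double elapsed_s(const struct timespec *start,
			const struct timespec *end)
{
	return (double)(end->tv_sec - start->tv_sec) +
	       1e-9 * (double)(end->tv_nsec - start->tv_nsec);
}

/* Placeholder for the operation being measured. */
static void noop_dispatch(void *data)
{
	(void)data;
}

/* Run 'dispatch' for roughly 'timeout' seconds; return seconds per call. */
static double average_latency(void (*dispatch)(void *), void *data,
			      double timeout, unsigned long *count_out)
{
	struct timespec start, now;
	unsigned long count = 0;

	assert(dispatch != NULL && timeout > 0.0);

	clock_gettime(CLOCK_MONOTONIC, &start);
	do {
		dispatch(data);
		count++;
		clock_gettime(CLOCK_MONOTONIC, &now);
	} while (elapsed_s(&start, &now) < timeout);

	if (count_out)
		*count_out = count;

	return elapsed_s(&start, &now) / (double)count;
}
```

In this sketch the two modes would differ only in the callback: one that
waits for each dispatch to complete before returning, and one that queues
work continuously and waits once at the end.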
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
As we didn't recognise the different buffer type, we confused it with
whatever we last decoded (i.e. the render ring buffer).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Recent kernels compress the active objects using zlib + ascii85
encoding. This adapts the tool to decompress those inplace.
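As an illustration of the ascii85 half of that format, the sketch below
decodes a string into 32-bit words. It assumes the common convention of five
characters per word, each offset from '!', with 'z' as shorthand for an
all-zero word; whether that matches the kernel output in every detail is an
assumption, and the zlib inflate step is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical decoder: converts ascii85 text into 32-bit words and
 * returns the number of words written. Decoding stops at the end of the
 * string, at the first malformed group, or once max_words is reached.
 */
static size_t ascii85_decode_sketch(const char *in, uint32_t *out,
				    size_t max_words)
{
	size_t n = 0;

	assert(in != NULL && out != NULL);

	while (*in && n < max_words) {
		uint32_t v = 0;

		if (*in == 'z') {
			/* Shorthand for a zero word. */
			in++;
		} else {
			/* Five base-85 digits, most significant first. */
			for (int i = 0; i < 5; i++) {
				if (in[i] < '!' || in[i] > 'u')
					return n;
				v = v * 85 + (uint32_t)(in[i] - '!');
			}
			in += 5;
		}
		out[n++] = v;
	}

	return n;
}
```

The decoded words would then be passed to zlib's inflate to recover the
original object contents.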
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
If we autotune the workload to only take 0.1s and then repeat the
measurements over 2s, we can bound the benchmark runtime. (Roughly, of
course! Sometimes the disparity between main memory CPU bandwidth and
GPU execution bandwidth throws off the runtime, but that's the purpose
of the benchmark!)
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Due to the clever way the whole sequence block is specified without
forward compatibility, it's not possible to dump most blocks without
this.
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
It's nice to see just how many components the crc claims to have
when the count doesn't match what we expect.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The extra_long_opts passed to igt_*_parse_opts() isn't modified,
so let's make it const.
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The joy of our hardware; don't let two threads attempt to read the same
register at the same time.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Allow the producers to be set with maximum RT priority to verify that
the waiters are not exhibiting priority inversion.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Try a different pattern to cascade the cancellation from producers to
their consumers in order to avoid one potential deadlock.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Do the workload before the nop, so that if combining both, there is a
better chance for the spurious interrupts. Emit just one workload batch
(use the nops to generate spurious interrupts) and apply the factor to
the number of copies to make inside the workload - the intention is that
this gives sufficient time for all producers to run concurrently.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Split the distinct phases (generate interrupts, busywork, measure
latency) into separate batches for finer control.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Knowing how long it takes to execute the workload (and how that scales)
is interesting to put the latency figures into perspective.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Late last night I forgot I had only added the llc CPU mmapping and not
the !llc GTT mapping for byt/bsw.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The goal is to measure how long it takes for clients waiting on results to
wake up after a buffer completes, and in doing so ensure scalability of
the kernel to a large number of clients.
We spawn a number of producers. Each producer submits a busyload to the
system and records in the GPU the BCS timestamp of when the batch
completes. Then each producer spawns a number of waiters, who wait upon
the batch completion and measure the current BCS timestamp register and
compare against the recorded value.
By varying the number of producers and consumers, we can study different
aspects of the design, in particular how many wakeups the kernel does
for each interrupt (end of batch). The more wakeups on each batch, the
longer it takes for any one client to finish.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The upper bound for SLOW_QUICK was added for the benefit of the slow
simulator, not because, as I wrongly thought, of the latency
measurements.
SLOW_QUICK was added in
commit d1e862324b747a0ab5d985eaa6830076817231c5
Author: Damien Lespiau <damien.lespiau@intel.com>
Date: Mon Mar 25 20:06:20 2013 +0000
tests: Instrument tests run in simulation to run quickly
and dropped in
commit 89bcdb9022fb7a1f66635b9f2546356ad0c0761a
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Dec 8 13:42:50 2015 +0000
igt/gem_exec_nop: Remove nop latency measurements
Reported-by: Ben Widawsky <benjamin.widawsky@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>