After loading the module with load failure injection enabled don't try
check the alive state. Also limit the number of failure points to
existing ones, to reduce the run time of the test.
v2:
- make VT binding/snd module loading part of reload and VT bind fail
silently (Chris)
Signed-off-by: Imre Deak <imre.deak@intel.com>
For basic, since CI doesn't hit the same hard lockup on Braswell that is
possible without hpet, stop running for so long!
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Not as stressful as testing inter-ring synchronisation, but it does
allow inspecting the simpler testcases if need be.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Fix the engine selection to exercise all possible rings and in doing so
completely obsoletes gem_multi_bsd_sync_loop.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The goal is to test interengine synchronisation so remove any likelihood
that we introduce synchronisation for performing relocations.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
An idea for testing failure paths along module load is to use a parameter
to perform fault injection. This rudimentary framework should get us
started.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
We don't preserve the execobj.offset nor set the right values into the
batches between iterations.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
A more complicated store variant to stress inter-engine dependencies
(i.e. semaphores and sync). We write a control value from one batch into
the next and then execute it. This is repeated a few times with each
execution happening on a different engine (so the kernel has to
serialise operations between engines) until we finally write the value
out into our scratch buffer where we can check the result, just like a
Chinese whisper.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
gem_reloc_overflow.c: In function ‘__real_main365’:
gem_reloc_overflow.c:384:3: warning: format ‘%lu’ expects argument of type ‘long unsigned int’, but argument 6 has type ‘size_t’ [-Wformat=]
igt_require_f(mlock(reloc, reloc_size) == 0,
^
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
gem_concurrent_all.c: In function ‘__real_main1556’:
gem_concurrent_all.c:1642:4: warning: format ‘%ld’ expects argument of type ‘long int’, but argument 4 has type ‘uint64_t’ [-Wformat=]
igt_debug("Pinning %ld MiB\n", pin_sz);
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
drv_hangman.c: In function ‘hangcheck_unterminated’:
drv_hangman.c:290:27: warning: integer overflow in expression [-Woverflow]
int64_t timeout_ns = 100 * NSEC_PER_SEC; /* 100 seconds */
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Ignore the default ring as that is purely symbolic. On BSD2 systems it
is similarly useful to ignore the symbolic BSD ring.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
A very simple, the simplest!, batch that can execution on any known
engine that just writes a value into memory.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
For a background task the fork helpers are more appropriate, since we
can explicitly cancel children. Also, anything that does real work is
supposed to be in fixtures.
Cc: Tiago Vignatti <tiago.vignatti@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
--
Tiago, can you pls check that I haven't broken anythig?
Thanks, Daniel
Demonstrate how trivial it is to lockup Braswell, at least my N3050 nuc,
by saturating the interrupt handler with a few requests.
References: https://bugs.freedesktop.org/show_bug.cgi?id=93467
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Check that the system operates normally before and after the suspend (as
well as across the suspend). The goal is to isolate the breakage to the
subtest.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The execbuffer2 ABI is not strictly limited to a total of UINT32_MAX
entries, rather each object can have a maximum of UINT32_MAX relocation
entries and the current implementation imposes that the total must be
allocable in a contiguous buffer when necessary (i.e as large as the
kernel can conceivably allocate). This is not an ABI constraint per-se,
just an implementation issue.
Whilst updating the limits for 64bit kernels, review usable of
ioctl-wrappers (i.e. use __gem_execbuf now available) and include a
batch of more tests to explore the boundary conditions of the maximum
relocation size. Note that rather than guess the reloc-max, it would be
better if we queried it. Also it is of vital importance that when
constructing a test to fail in a particular fashion, it must not include
any other error (e.g. we were passing in relocation arrays with invalid
target handle and domains when looking for a potential overflow across
multiple objects).
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The transition to PCI device state D3 is not instantaneous and only
started when runtime suspended. Allow the driver/hardware a little bit
of time to complete the transition before declaring a test failure.
References: https://bugs.freedesktop.org/show_bug.cgi?id=93123
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Now that engines keep references on the last executed contexts,
to fix this test we need to execute an unrelated context last to
ensure the one we are interested in is free to be cleaned up when
we expect it to be.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
kms_setmode.c:384:30: warning: declaration of ‘drm_fd’ shadows a global
declaration [-Wshadow]
kms_setmode.c:45:12: note: shadowed declaration is here static int drm_fd;
kms_setmode.c:391:38: warning: passing argument 8 of ‘drmModeSetCrtc’
discards ‘const’ qualifier from pointer target type
[-Wdiscarded-qualifiers] ids, crtc->connector_count, &crtc->mode);
Signed-off-by: Marius Vlad <marius.c.vlad@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Excercise connector stealing harder. There is a border case in atomic currently where
encoder stealing is not prevented on the same crtc when the encoder is not reassigned.
The following testcase excercises that path and causes a OOPS on my system with nightly.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The test used to assume pipe A was being used for everything, and we
tried to fix this in commit "tests: fix CRTC assignment for a few
tests", but the pipe CRC code was forgotten.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Use the pipe which is given from the possible_crcs for that connected port
instead.
On BSW there are constrains for the crtc<-->connector, this fix make this test
passing on BSW.
v2 (from Paulo): bikeshed the blank lines.
Signed-off-by: Gabriel Feceoru <gabriel.feceoru@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
All the tests I wrote always assumed that every connector supported
CRTC 0. This is not the case for BSW and possibly others, so fix the
tests before the CI reports more failures.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Instead of just giving preference to an eDP primary connector, give
preference to one that's eDP and supports pipe A, then try lesser
optimal combinations later.
We could try to make our test suite use different sets of connectors
when testing FBC and PSR, but that would require some rework, and we
would still be helpless when testing the combination of FBC+PSR.
Also notice that we still hardcode pipe A for the primary connector,
regardless of whether it supports it. This will be solved in the next
commits.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
We're going to make our search for connnectors a little more
complicated, so extract the function since we're going to call it a
few more times.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
We already pass the crtc id, so use the id to retrieve the index.
We'll change the way we pass the crtc id in the next commits, so we'll
have to call a function to calculate the index based on the id at that
point. Do the change now in order to avoid big commits later.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Move it from pm_rpm.c to lib/igt_kms and remove the hardcoded version
from kms_frontbuffer_tracking. I'm also planning to add other callers.
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
On machines that lack an LLC the pm-caching subtest will
terminate with sigbus and thus CRASH during the
I915_CACHING_CACHED iteration. To work around this we reset
the caching to I915_CACHING_NONE before doing memory access.
v2: Various improvements based on feedback from Chris Wilson
v3: Fix incorrect Signed-off-by: line
v4: Further improvements based on feedback from Chris Wilson
Signed-off-by: David Weinehall <david.weinehall@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Marius Vlad <marius.c.vlad@intel.com>
Improve the busy-load for triggering a wait-on-interrupt and check for
extraneous missed-interrupts before and after our tests.
References: https://bugs.freedesktop.org/show_bug.cgi?id=88437
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Trying to allocate and use lots of contexts with execlists and !llc end
ups in faliure very quickly.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
The hangcheck logic will not flag an hang if acthd keeps increasing.
However, if a malformed batch jumps to an invalid offset in the ppgtt it
can potentially continue executing through the whole address space
without triggering the hangcheck mechanism.
This patch adds a test to simulate the issue. I've kept the test running
for more than 10 minutes before killing it on a BDW and no hang occurred.
I've sampled i915_hangcheck_info a few times during the run and got the
following:
Hangcheck active, fires in 468ms
render ring:
seqno = fffff55e [current fffff55e]
ACTHD = 0x47df685ecc [current 0x4926b81d90]
max ACTHD = 0x47df685ecc
score = 0
action = 2
instdone read = 0xffd7ffff 0xffffffff 0xffffffff 0xffffffff
instdone accu = 0x00000000 0x00000000 0x00000000 0x00000000
Hangcheck active, fires in 424ms
render ring:
seqno = fffff55e [current fffff55e]
ACTHD = 0x6c953d3a34 [current 0x6de5e76fa4]
max ACTHD = 0x6c953d3a34
score = 0
action = 2
instdone read = 0xffd7ffff 0xffffffff 0xffffffff 0xffffffff
instdone accu = 0x00000000 0x00000000 0x00000000 0x00000000
Hangcheck active, fires in 1692ms
render ring:
seqno = fffff55e [current fffff55e]
ACTHD = 0x1f49b0366dc [current 0x1f4dcbd88ec]
max ACTHD = 0x1f49b0366dc
score = 0
action = 2
instdone read = 0xffd7ffff 0xffffffff 0xffffffff 0xffffffff
instdone accu = 0x00000000 0x00000000 0x00000000 0x00000000
v2: use the new gem_wait() function (Chris)
v3: switch to unterminated batch and rename test, remove redundant
check, update test requirements (Chris), update top comment
v4: force gpu reset if the hang detection fails (Mika)
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
[Mika: removed batch_len=8]
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
Upcoming tests will call it to recover from bad states caused by
hangcheck bugs.the function was renamed to igt_force_gpu_reset to have a
naming closer to other hang-related functions in the same file.
The value written to the debugfs has also been changed to -1; this makes
no differences with the current implementation but copes with upcoming
TDR changes (still under discussion) that should allow the resetting of
a mask of rings.
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
More num_buffers onto the local struct passed down into the tests to
avoid the issue with having to modify the global value inside the tests
leading to hilarity if the test asserts.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Allow read-only synchronisation on dmabuf mmaps, useful to allow
concurrent read-read testing between the CPU and GPU.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Load detection requires a inactive crtc to run. The CI igt tests are
failing, so ensure there is at least 1 inactive crtc.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The sync test is supposed to complete in 10s. But some bugs cause it to
run very, very slowly. As a defence against those, terminate the test if
we wait for more than 20s.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>