22 Commits

Author SHA1 Message Date
Chris Wilson
eb59497a83 igt/drv_hangman: 32bit compilation warning
drv_hangman.c: In function ‘hangcheck_unterminated’:
drv_hangman.c:290:27: warning: integer overflow in expression [-Woverflow]
  int64_t timeout_ns = 100 * NSEC_PER_SEC; /* 100 seconds */

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-03-10 10:58:11 +00:00
Daniele Ceraolo Spurio
95ca7644db tests/drv_hangman: test for acthd increasing through invalid VM space
The hangcheck logic will not flag an hang if acthd keeps increasing.
However, if a malformed batch jumps to an invalid offset in the ppgtt it
can potentially continue executing through the whole address space
without triggering the hangcheck mechanism.

This patch adds a test to simulate the issue. I've kept the test running
for more than 10 minutes before killing it on a BDW and no hang occurred.
I've sampled i915_hangcheck_info a few times during the run and got the
following:

Hangcheck active, fires in 468ms
render ring:
	seqno = fffff55e [current fffff55e]
	ACTHD = 0x47df685ecc [current 0x4926b81d90]
	max ACTHD = 0x47df685ecc
	score = 0
	action = 2
	instdone read = 0xffd7ffff 0xffffffff 0xffffffff 0xffffffff
	instdone accu = 0x00000000 0x00000000 0x00000000 0x00000000

Hangcheck active, fires in 424ms
render ring:
	seqno = fffff55e [current fffff55e]
	ACTHD = 0x6c953d3a34 [current 0x6de5e76fa4]
	max ACTHD = 0x6c953d3a34
	score = 0
	action = 2
	instdone read = 0xffd7ffff 0xffffffff 0xffffffff 0xffffffff
	instdone accu = 0x00000000 0x00000000 0x00000000 0x00000000

Hangcheck active, fires in 1692ms
render ring:
	seqno = fffff55e [current fffff55e]
	ACTHD = 0x1f49b0366dc [current 0x1f4dcbd88ec]
	max ACTHD = 0x1f49b0366dc
	score = 0
	action = 2
	instdone read = 0xffd7ffff 0xffffffff 0xffffffff 0xffffffff
	instdone accu = 0x00000000 0x00000000 0x00000000 0x00000000

v2: use the new gem_wait() function (Chris)

v3: switch to unterminated batch and rename test, remove redundant
    check, update test requirements (Chris), update top comment

v4: force gpu reset if the hang detection fails (Mika)

Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Arun Siluvery <arun.siluvery@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
[Mika: removed batch_len=8]
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
2016-03-01 16:57:11 +02:00
Chris Wilson
405b3478d1 igt/drv_hangman: Tidy up assertion failure message
Because

(drv_hangman:6035) CRITICAL: Failed assertion: !((__extension__
(__builtin_constant_p (l) && ((__builtin_constant_p (tmp) && strlen
(tmp) < ((size_t) (l))) || (__builtin_constant_p (s) && strlen (s) <
((size_t) (l)))) ? __extension__ ({ size_t __s1_len, __s2_len;
(__builtin_constant_p (tmp) && __builtin_constant_p (s) && (__s1_len =
strlen (tmp), __s2_len = strlen (s), (!((size_t)(const void *)((tmp) +
1) - (size_t)(const void *)(tmp) == 1) || __s1_len >= 4) &&
(!((size_t)(const void *)((s) + 1) - (size_t)(const void *)(s) == 1) ||
__s2_len >= 4)) ? __builtin_strcmp (tmp, s) : (__builtin_constant_p
(tmp) && ((size_t)(const void *)((tmp) + 1) - (size_t)(const void
*)(tmp) == 1) && (__s1_len = strlen (tmp), __s1_len < 4) ?
(__builtin_constant_p (s) && ((size_t)(const void *)((s) + 1) -
(size_t)(const void *)(s) == 1) ? __builtin_strcmp (tmp, s) :
(__extension__ ({ const unsigned char *__s2 = (const unsigned char *)
(const char *) (s); int __result = (((const unsigned char *) (const char
*) (tmp))[0] - __s2[0]); if (__s1_len > 0 && __result == 0) { __result =
(((const unsigned char *) (const char *) (tmp))[1] - __s2[1]); if
(__s1_len > 1 && __result == 0) { __result = (((const unsigned char *)
(const char *) (tmp))[2] - __s2[2]); if (__s1_len > 2 && __result == 0)
__result = (((const unsigned char *) (const char *) (tmp))[3] -
__s2[3]); } } __result; }))) : (__builtin_constant_p (s) &&
((size_t)(const void *)((s) + 1) - (size_t)(const void *)(s) == 1) &&
(__s2_len = strlen (s), __s2_len < 4) ? (__builtin_constant_p (tmp) &&
((size_t)(const void *)((tmp) + 1) - (size_t)(const void *)(tmp) == 1) ?
__builtin_strcmp (tmp, s) : (- (__extension__ ({ const unsigned char
*__s2 = (const unsigned char *) (const char *) (tmp); int __result =
(((const unsigned char *) (const char *) (s))[0] - __s2[0]); if
(__s2_len > 0 && __result == 0) { __result = (((const unsigned char *)
(const char *) (s))[1] - __s2[1]); if (__s2_len > 1 && __result == 0) {
__result = (((const unsigned char *) (const char *) (s))[2] - __s2[2]);
if (__s2_len > 2 && __result == 0) __result = (((const unsigned char *)
(const char *) (s))[3] - __s2[3]); } } __result; })))) :
__builtin_strcmp (tmp, s)))); }) : strncmp (tmp, s, l))) == 0)

is a little hard to understand at a glance.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-02-26 12:59:37 +00:00
Chris Wilson
38fe49d9a8 tests/drv_hangman: Convert to using central list of engines
Rather than encoding our own list of engines, use the common one for
greater coverage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-02-04 11:22:00 +00:00
Chris Wilson
01e467a631 igt/drv_hangman: Make the batchbuffer check more robust
All the external viewer expects of the GPU error capture is to extract
the exact batch that triggered the hang. Everything else is internal
detail to aide in post-mortem debugging of the kernel driver (i.e.
subject to change) and not of the userspace portion (under control of
the test).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-02-04 11:08:07 +00:00
Chris Wilson
0753057446 igt/drv_hangman: Inject a true hang
Wean drv_hangman off the atrocious stop_rings and use a real GPU hang
instead.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-02-04 09:49:36 +00:00
Michał Winiarski
52b5d5016e lib/ioctl_wrappers: Add gem_gtt_type exposing raw HAS_ALIASING_PPGTT param
No functional changes.
While I'm here, let's also rename gem_uses_aliasing_ppgtt (since it's
being used to indicate if we are using ANY kind of ppgtt) and introduce
gem_uses_full_ppgtt to drop some unnecessary code from tests that were
previously calling getparam directly instead of using ioctl wrapper.

v2: drop gem_uses_full_48b_ppgtt since it's no longer used anywhere,
    s/48b/64b (Chris)
v3: rebase

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2016-01-25 19:44:31 +01:00
Daniel Vetter
6cf72724e2 tests/drv_hangman: Open drm fd before doing anything
This way we correctly auto-skip instead of falling over the
lack of i915 debugfs files first and fail the testcase due to
that.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2015-12-04 16:35:31 +01:00
Micah Fedke
c81d293aed convert drm_open_any*() calls to drm_open_driver*(DRIVER_INTEL) calls with cocci
Apply the new API to all call sites within the test suite using the following
semantic patch:

// Semantic patch for replacing drm_open_any* with arch-specific drm_open_driver* calls
@@
identifier i =~ "\bdrm_open_any\b";
@@
- i()
+ drm_open_driver(DRIVER_INTEL)

@@
identifier i =~ "\bdrm_open_any_master\b";
@@
- i()
+ drm_open_driver_master(DRIVER_INTEL)

@@
identifier i =~ "\bdrm_open_any_render\b";
@@
- i()
+ drm_open_driver_render(DRIVER_INTEL)

@@
identifier i =~ "\b__drm_open_any\b";
@@
- i()
+ __drm_open_driver(DRIVER_INTEL)

Signed-off-by: Micah Fedke <micah.fedke@collabora.co.uk>
Signed-off-by: Thomas Wood <thomas.wood@intel.com>
2015-09-11 14:39:43 +01:00
Thomas Wood
804e11f40d lib: add a single include header
Add a header that includes all the headers for the library. This allows
reorganisation of the library without affecting programs using it and
also simplifies the headers that need to be included to use the library.

Signed-off-by: Thomas Wood <thomas.wood@intel.com>
2015-08-21 09:37:10 +01:00
Mika Kuoppala
c37b235202 tests/drv_hangman: Adjust to 64bit bb offsets
commit e1f123257a1f7d3af36a31a0fb2d4c6f40039fed
Author: Michel Thierry <michel.thierry@intel.com>
Date:   Wed Jul 29 17:23:56 2015 +0100

    drm/i915: Expand error state's address width to 64b

changed the batch buffer address to be 64b. Fix the parsing
of gtt offset accordingly.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91638
Cc: Akash Goel <akash.goel@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Reviewed-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
2015-08-19 12:12:28 +03:00
Matt Roper
07be8fec15 igt.cocci: Replace igt_assert() with igt_assert_CMP() where possible
The integer comparison macros give us better error output by including
the actual values that failed the comparison.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2015-03-06 18:06:10 +01:00
Daniel Vetter
3cd45dec2e lib/igt_gt: Document and consolidate
Also move forcewake and stop_rings code from igt_debugfs to igt_gt
since it fits better. And move the hang injection fork helpers from
igt_aux to igt_gt, too.

Also push the intel_gen call into igt_hang_ring while at it.

Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
2015-02-13 09:35:36 +01:00
Damien Lespiau
01153e7d5f drv_hangman: Remove unused function
Signed-off-by: Damien Lespiau <damien.lespiau@intel.com>
2014-12-09 17:10:42 +00:00
Ville Syrjälä
8032f526ef tests: Run lib/igt.cocci
Found some open coded min()/max()/swap() macros.

Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
2014-12-08 19:26:39 +02:00
Tim Gore
4e5c16c17e tests/drv_hangman: remove check for other drm clients
This test will not run on Android as the coreu service
remains running even after the android system is stopped.
Coreu is a client of drm and when the test finds this it
fails an assert.
Coreu is started by the init process and there is no
tidy, non invasive way to stop it (init just restarts it).
Coreu isn't doing anything and would not be expected to
interfere with this test. In addition, all the other
igt tests just rely on the user/test script to ensure
that there are no other drm clients, so this test can
do the same. On Android we must rely on coreu being
dormant when this test runs.

Signed-off-by: Tim Gore <tim.gore@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-12-01 18:09:35 +01:00
Brad Volkin
e34240d4c1 tests/drv_hangman: skip a few asserts when using the cmd parser
This test has a few checks that batch buffer addresses in the error
state match the expected  address for the userspace supplied batch.
But the batch buffer copy piece of the command  parser means that
the logged addresses are actually _supposed_ to be different. So
skip just those checks.

Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-11-05 13:35:29 +01:00
Daniel Vetter
c9c554594e tests: run igt.cocci
Re-run with correct igt_fail rules. Again manually fixup missing
includes for igt_core.h.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-06-13 18:27:59 +02:00
Daniel Vetter
ac0e606677 Revert "tests: Run igt.cocci over tests"
This reverts commit 6903ab04e5f9048e3932eb3225e94b6a228681ba.

The igt_assert conversion rule is broken and doesn't invert the check
as it should.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-06-13 18:09:05 +02:00
Daniel Vetter
6903ab04e5 tests: Run igt.cocci over tests
Cocci is awesome

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-06-13 11:17:23 +02:00
Mika Kuoppala
6fa1934a19 tests/drv_hangman: Add subtest for error state capture/dump
Guarantees that error capture works at a very basic level.

v2: Also check that the ring object contains a reloc with MI_BB_START
for the presumed batch object's address.

v3: Chris review comments:
 - Move variables to local scope.
 - Do not assume there is only one request.
 - Some gen encode flags into the BB start address.
 Also, use igt_set/get_stop_rings as suggested by Mika Kuoppala.

v4: Make as a subtest of drv_hangman.
v5: Rebase

Signed-off-by: Oscar Mateo <oscar.mateo@intel.com> <v4>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
2014-05-22 14:29:49 +03:00
Mika Kuoppala
9b0d3481e8 tests/drv_hangman: Convert test from shell script to c
Mixing script and standlone tests didn't mix well with the
strict i915_ring_stop flags handling. Also squash drv_missed_irq_hang
to the new test.

v2: - Remove missed irq test (Daniel Vetter)
    - gitignore fixed (Oscar Mateo)
    - fix check_other_clients to handle dangling fd's

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78322
Suggested-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Oscar Mateo <oscar.mateo@intel.com> <v1>
Signed-off-by: Mika Kuoppala <mika.kuoppala@intel.com>
2014-05-22 14:18:45 +03:00