intel-gpu-tools

mirror of https://github.com/tiagovignatti/intel-gpu-tools.git synced 2025-11-07 05:27:12 +00:00

Author	SHA1	Message	Date
Chris Wilson	9024a72d29	benchmark/gem_wait: poc for benchmarking i915_wait_request overhead One scenario under recent discussion is that of having a thundering herd in i915_wait_request - where the overhead of waking up every waiter for every batchbuffer was significantly impacting customer throughput. This benchmark tries to replicate something to that effect by having a large number of consumers generating a busy load (a large copy followed by lots of small copies to generate lots of interrupts) and tries to wait upon all the consumers concurrently (to reproduce the thundering herd effect). To measure the overhead, we have a bunch of cpu hogs - less kernel overhead in waiting should allow more CPU throughput. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-10-30 15:04:55 +00:00
Chris Wilson	8253e7dc84	benchmarks: Measure BLT performance Execute N blits and time how long they complete to measure both GPU limited bandwidth and submission overhead. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-10-06 10:24:07 +01:00
Chris Wilson	38b3bd6b7c	benchmarks: Add a microbenchmark for relocation overhead Allow specification of the many different busyness modes and relocation interfaces, along with the number of buffers to use and relocations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-08-11 15:31:02 +01:00
Chris Wilson	4c74a683c1	benchmarks: Do not install to system-wide bin/ These benchmarks are first-and-foremost development tools, not aimed at general users. As such they should not be installed into the system-wide bin/ directory, but installed into libexec/. v2: Now actually install beneath ${libexec} Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-08-10 15:53:08 +01:00
Chris Wilson	0393e7288b	benchmarks: Record and replay calls to EXECBUFFER2 This slightly idealises the behaviour of clients with the aim of measuring the kernel overhead of different workloads. This test focuses on the cost of relocating batchbuffers. A trace file is generated with an LD_PRELOAD intercept around execbuffer, which we can then replay at our leisure. The replay replaces the real buffers with a set of empty ones so the only thing that the kernel has to do is parse the relocations, but without a real workload we lose the impact of having to rewrite active buffers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-08-09 19:20:46 +01:00
Chris Wilson	cd306d4e65	benchmark: Measure allocation time for objects A basic measurement, how fast can we create and populate an object with backing storage? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-07-24 18:56:00 +01:00
Chris Wilson	e984d4965f	benchmarks: Benchmarkify gem_exec_ctx Measure the overhead of execution when doing nothing, switching between a pair of contexts, or creating a new context every time. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-07-24 18:55:49 +01:00
Chris Wilson	d88981f62b	benchmarks: Measure round-trip time for an immediate vblank By measuring both the query and the event round trip time, we can make a reasonable estimate of how long it takes for the query to send the vblank following an interrupt. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-07-23 15:52:53 +01:00
Chris Wilson	f8628a2c98	benchmarks: Add simple mmap benchmarks Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-07-23 12:20:43 +01:00
Chris Wilson	f689e2aa81	benchmarks: Add simple pread/pwrite benchmarks Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-07-23 12:20:05 +01:00
Chris Wilson	b7c33e0939	benchmarks: Benchmarkify gem_exec_nop Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>	2015-07-22 15:14:05 +01:00
Tvrtko Ursulin	d3057d7a1e	tests/gem_userptr_benchmark: Benchmarking userptr surfaces and impact This adds a small benchmark for the new userptr functionality. Apart from basic surface creation and destruction, also tested is the impact of having userptr surfaces in the process address space. Reason for that is the impact of MMU notifiers on common address space operations like munmap() which is per process. v2: * Moved to benchmarks. * Added pointer read/write tests. * Changed output to say iterations per second instead of operations per second. * Multiply result by batch size for multi-create* tests for a more comparable number with create-destroy test. v3: * Use ALIGN macro. * Catchup with big lib/ reorganization. * Removed unused code and one global variable. * Fixed up some warnings. v4: * Fixed feature test, does not matter here but makes it consistent with gem_userptr_blits and clearer. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Brad Volkin <bradley.d.volkin@intel.com> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>	2014-04-25 17:48:49 +02:00
Tvrtko Ursulin	5d7649690c	benchmarks: Build them on Android. They build fine so give them some exposure. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Brad Volkin <bradley.d.volkin@intel.com> Signed-off-by: Thomas Wood <thomas.wood@intel.com>	2014-04-24 13:49:20 +01:00

13 Commits