17 Commits

Chris Wilson
51bb53663e benchmarks/gem_latency: Allow setting an infinite time
Well, 24000 years.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2016-01-06 10:21:40 +00:00
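
A hedged sketch of what "infinite" can mean here, assuming the run time is parsed as a signed integer of seconds with negative values clamped to the type's maximum; the benchmark's actual representation is what yields the ~24000 years quoted above, and the helper below is illustrative only:

    #include <limits.h>
    #include <stdlib.h>

    /* Negative means "forever": clamp to the largest representable
     * duration. The real benchmark's bound works out to ~24000 years;
     * clamping to INT_MAX seconds (~68 years) is just an illustration. */
    static int parse_runtime(const char *arg)
    {
        int seconds = atoi(arg);
        return seconds < 0 ? INT_MAX : seconds;
    }
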
Chris Wilson
1b9085b979 benchmarks/gem_latency: Hide spinlocks for android
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-21 16:32:08 +00:00
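
Android's bionic libc historically shipped without pthread spinlocks, so one plausible reading of "hide spinlocks for android" is a compile-time fallback to mutexes. A minimal sketch of that idiom, with illustrative macro names:

    #include <pthread.h>

    #ifdef ANDROID
    /* bionic has no pthread_spinlock_t; fall back to a mutex */
    typedef pthread_mutex_t igt_spinlock_t;
    #define igt_spin_init(l)   pthread_mutex_init(l, NULL)
    #define igt_spin_lock(l)   pthread_mutex_lock(l)
    #define igt_spin_unlock(l) pthread_mutex_unlock(l)
    #else
    typedef pthread_spinlock_t igt_spinlock_t;
    #define igt_spin_init(l)   pthread_spin_init(l, PTHREAD_PROCESS_PRIVATE)
    #define igt_spin_lock(l)   pthread_spin_lock(l)
    #define igt_spin_unlock(l) pthread_spin_unlock(l)
    #endif
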
Chris Wilson
a1d465a3c5 benchmarks/gem_latency: Serialise mmio reads
The joy of our hardware; don't let two threads attempt to read the same
register at the same time.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-21 13:34:58 +00:00
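
A sketch of the serialisation described above: one process-wide mutex around the register sample, assuming igt's intel_register_read() mmio helper (the lock and function names here are illustrative):

    #include <pthread.h>
    #include <stdint.h>

    static pthread_mutex_t mmio_lock = PTHREAD_MUTEX_INITIALIZER;

    /* Only one thread may sample a register at a time; assumes
     * intel_register_access_init() has already been called. */
    static uint32_t read_reg_serialised(uint32_t reg)
    {
        uint32_t value;

        pthread_mutex_lock(&mmio_lock);
        value = intel_register_read(reg);
        pthread_mutex_unlock(&mmio_lock);

        return value;
    }
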
Chris Wilson
3ebce37b65 benchmarks/gem_latency: Guard against inferior pthreads.h
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-21 10:00:21 +00:00
Chris Wilson
3cc8f957f1 benchmarks/gem_latency: Measure CPU usage
Try to gauge the amount of CPU time used for each dispatch/wait cycle.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-20 21:22:35 +00:00
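
One plausible way to gauge that, sketched here with CLOCK_THREAD_CPUTIME_ID; whether the commit samples this clock or getrusage() is an assumption:

    #include <stdint.h>
    #include <time.h>

    /* CPU time consumed by the calling thread only */
    static uint64_t thread_cpu_ns(void)
    {
        struct timespec ts;

        clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts);
        return ts.tv_sec * 1000000000ull + ts.tv_nsec;
    }

    /* sample before and after N dispatch/wait cycles: */
    /* cpu_per_cycle_ns = (thread_cpu_ns() - start_ns) / N; */
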
Chris Wilson
a91ee853b1 benchmarks/gem_latency: Measure effect of using RealTime priority
Allow the producers to be set with maximum RT priority to verify that
the waiters are not exhibiting priority inversion.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-20 21:22:35 +00:00
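
A minimal sketch of raising a producer to maximum RT priority with SCHED_FIFO (error handling trimmed; the helper name is illustrative, and the caller needs CAP_SYS_NICE, typically root):

    #include <pthread.h>
    #include <sched.h>

    /* Raise the calling thread to the top SCHED_FIFO priority */
    static int set_max_rt_priority(void)
    {
        struct sched_param param = {
            .sched_priority = sched_get_priority_max(SCHED_FIFO),
        };

        return pthread_setschedparam(pthread_self(), SCHED_FIFO, &param);
    }
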
Chris Wilson
27e093dd1f benchmarks/gem_latency: Use RCS on Sandybridge
Reading BCS_TIMESTAMP just returns 0...

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-20 13:02:02 +00:00
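
A hedged sketch of the implied ring selection, keyed off igt's intel_gen() chipset helper; the exact condition used by the commit is an assumption:

    #include <stdint.h>
    #include <i915_drm.h> /* libdrm */

    /* BCS_TIMESTAMP reads as 0 on Sandybridge, so use the render ring
     * (and its RCS timestamp) on gen6; the blitter everywhere else. */
    static unsigned select_ring(uint32_t devid)
    {
        if (intel_gen(devid) == 6)
            return I915_EXEC_RENDER;

        return I915_EXEC_BLT;
    }
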
Chris Wilson
c0942bf528 benchmarks/gem_latency: Rearrange thread cancellation
Try a different pattern to cascade the cancellation from producers to
their consumers in order to avoid one potential deadlock.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-20 13:02:02 +00:00
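
One cascade pattern that avoids such a deadlock is sketched below: a producer marks itself done under its lock and broadcasts, so consumers re-check a flag on every wakeup instead of sleeping on a condition that will never fire (struct and field names are illustrative):

    #include <pthread.h>
    #include <stdbool.h>

    struct producer {
        pthread_mutex_t lock;
        pthread_cond_t wake;
        bool done;
    };

    /* Producer side: rather than cancelling consumers directly, mark
     * ourselves done and broadcast so every sleeper gets to re-check. */
    static void producer_stop(struct producer *p)
    {
        pthread_mutex_lock(&p->lock);
        p->done = true;
        pthread_cond_broadcast(&p->wake);
        pthread_mutex_unlock(&p->lock);
    }

    /* Consumer side: sleep until cancelled, re-checking on every wakeup */
    static void consumer_wait_cancel(struct producer *p)
    {
        pthread_mutex_lock(&p->lock);
        while (!p->done)
            pthread_cond_wait(&p->wake, &p->lock);
        pthread_mutex_unlock(&p->lock);
    }
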
Chris Wilson
8ea61ec1ff benchmarks/gem_latency: Tweak workload
Do the workload before the nop, so that when combining both there is a
better chance of provoking spurious interrupts. Emit just one workload
batch (using the nops to generate the spurious interrupts) and apply the
factor to the number of copies made inside the workload; the intention
is that this gives sufficient time for all producers to run
concurrently.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-20 13:02:02 +00:00
Chris Wilson
db011021a1 benchmarks/gem_latency: Add output field specifier
Just to make it easier to integrate into ezbench.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 15:07:56 +00:00
Chris Wilson
646cab4c0c benchmarks/gem_latency: Split the nop/work/latency measurement
Split the distinct phases (generate interrupts, busywork, measure
latency) into separate batches for finer control.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 12:16:52 +00:00
Chris Wilson
e37a4c8092 benchmarks/gem_latency: Add time control
Allow the user to choose a time to run for (default: 10s).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 12:16:52 +00:00
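
A sketch of such time control, assuming a runtime in seconds and a monotonic-clock deadline loop (the option parsing itself is omitted):

    #include <time.h>

    static double elapsed(const struct timespec *start)
    {
        struct timespec now;

        clock_gettime(CLOCK_MONOTONIC, &now);
        return (now.tv_sec - start->tv_sec) +
               (now.tv_nsec - start->tv_nsec) * 1e-9;
    }

    /* main loop: keep dispatching until the requested time (10s by
     * default) has passed */
    /* while (elapsed(&start) < runtime) run_one_cycle(); */
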
Chris Wilson
2ef368acfa benchmarks/gem_latency: Add nop dispatch latency measurement
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 12:16:52 +00:00
Chris Wilson
1db5b05243 benchmarks/gem_latency: Expose the workload factor
Allow the user to select how many batches each producer submits before
waiting.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 12:16:52 +00:00
Chris Wilson
6dbe0a3012 benchmarks/gem_latency: Measure whole execution throughput
Knowing how long it takes to execute the workload (and how that scales)
is useful for putting the latency figures into perspective.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 12:16:52 +00:00
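
A sketch of the throughput figure this yields, assuming we count completed batches over wall-clock time on the monotonic clock; the counter name is illustrative, not the commit's:

    #include <stdint.h>
    #include <time.h>

    /* Whole-run throughput: batches completed per second of wall time */
    static double throughput(uint64_t total_batches,
                             const struct timespec *start,
                             const struct timespec *end)
    {
        double secs = (end->tv_sec - start->tv_sec) +
                      (end->tv_nsec - start->tv_nsec) * 1e-9;

        return total_batches / secs;
    }
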
Chris Wilson
2f74892ebd benchmarks/gem_latency: Fix for !LLC
Late last night I forgot I had only added the llc CPU mmapping and not
the !llc GTT mapping for byt/bsw.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 10:32:38 +00:00
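
A sketch of the llc/!llc split, querying I915_PARAM_HAS_LLC and choosing between igt's gem_mmap__cpu() and gem_mmap__gtt() helpers; the exact call sites in the commit are assumptions:

    #include <stdbool.h>
    #include <stdint.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <i915_drm.h> /* libdrm */

    static bool has_llc(int fd)
    {
        int val = 0;
        struct drm_i915_getparam gp = {
            .param = I915_PARAM_HAS_LLC,
            .value = &val,
        };

        ioctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
        return val;
    }

    /* On llc parts a cached CPU mmap is coherent with the GPU; on !llc
     * parts such as byt/bsw, fall back to a WC mapping through the GTT. */
    static void *map_results(int fd, uint32_t handle, uint64_t size)
    {
        if (has_llc(fd))
            return gem_mmap__cpu(fd, handle, 0, size, PROT_READ);

        return gem_mmap__gtt(fd, handle, size, PROT_READ);
    }
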
Chris Wilson
c9da0b5221 benchmark: Measure the latency of producers -> consumers, gem_latency
The goal is to measure how long it takes for clients waiting on results
to wake up after a buffer completes, and in doing so ensure scalability
of the kernel to a large number of clients.

We spawn a number of producers. Each producer submits a busyload to the
system and records in the GPU the BCS timestamp of when the batch
completes. Then each producer spawns a number of waiters, who wait upon
the batch's completion, read the current BCS timestamp register, and
compare it against the recorded value.

By varying the number of producers and consumers, we can study different
aspects of the design, in particular how many wakeups the kernel does
for each interrupt (end of batch). The more wakeups on each batch, the
longer it takes for any one client to finish.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
2015-12-19 01:30:57 +00:00
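
As a rough sketch of the measurement described above: the producer's batch stores the ring timestamp into a results buffer on completion, and a woken waiter samples the same timestamp register over mmio and takes the wrap-safe difference. The register offset and the helper below are assumptions, not the commit's exact code:

    #include <stdint.h>

    /* BCS ring base + TIMESTAMP offset; gen-dependent, illustrative */
    #define BCS_TIMESTAMP (0x22000 + 0x358)

    /* Waiter side: the batch wrote its completion timestamp at map[0].
     * Latency is how much later the waiter observed its wakeup. */
    static uint32_t wakeup_latency_ticks(volatile uint32_t *map)
    {
        uint32_t completed = map[0];
        uint32_t now = intel_register_read(BCS_TIMESTAMP); /* igt mmio */

        return now - completed; /* unsigned subtraction survives wrap */
    }
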