mirror of
https://github.com/ioacademy-jikim/debugging
synced 2025-06-08 00:16:11 +00:00
363 lines
19 KiB
HTML
363 lines
19 KiB
HTML
<html>
|
||
<head>
|
||
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
|
||
<title>10. DHAT: a dynamic heap analysis tool</title>
|
||
<link rel="stylesheet" type="text/css" href="vg_basic.css">
|
||
<meta name="generator" content="DocBook XSL Stylesheets V1.78.1">
|
||
<link rel="home" href="index.html" title="Valgrind Documentation">
|
||
<link rel="up" href="manual.html" title="Valgrind User Manual">
|
||
<link rel="prev" href="ms-manual.html" title="9. Massif: a heap profiler">
|
||
<link rel="next" href="sg-manual.html" title="11. SGCheck: an experimental stack and global array overrun detector">
|
||
</head>
|
||
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
|
||
<div><table class="nav" width="100%" cellspacing="3" cellpadding="3" border="0" summary="Navigation header"><tr>
|
||
<td width="22px" align="center" valign="middle"><a accesskey="p" href="ms-manual.html"><img src="images/prev.png" width="18" height="21" border="0" alt="Prev"></a></td>
|
||
<td width="25px" align="center" valign="middle"><a accesskey="u" href="manual.html"><img src="images/up.png" width="21" height="18" border="0" alt="Up"></a></td>
|
||
<td width="31px" align="center" valign="middle"><a accesskey="h" href="index.html"><img src="images/home.png" width="27" height="20" border="0" alt="Up"></a></td>
|
||
<th align="center" valign="middle">Valgrind User Manual</th>
|
||
<td width="22px" align="center" valign="middle"><a accesskey="n" href="sg-manual.html"><img src="images/next.png" width="18" height="21" border="0" alt="Next"></a></td>
|
||
</tr></table></div>
|
||
<div class="chapter">
|
||
<div class="titlepage"><div><div><h1 class="title">
|
||
<a name="dh-manual"></a>10. DHAT: a dynamic heap analysis tool</h1></div></div></div>
|
||
<div class="toc">
|
||
<p><b>Table of Contents</b></p>
|
||
<dl class="toc">
|
||
<dt><span class="sect1"><a href="dh-manual.html#dh-manual.overview">10.1. Overview</a></span></dt>
|
||
<dt><span class="sect1"><a href="dh-manual.html#dh-manual.understanding">10.2. Understanding DHAT's output</a></span></dt>
|
||
<dd><dl>
|
||
<dt><span class="sect2"><a href="dh-manual.html#idm257589292208">10.2.1. Interpreting the max-live, tot-alloc and deaths fields</a></span></dt>
|
||
<dt><span class="sect2"><a href="dh-manual.html#idm257590618576">10.2.2. Interpreting the acc-ratios fields</a></span></dt>
|
||
<dt><span class="sect2"><a href="dh-manual.html#idm257591842928">10.2.3. Interpreting "Aggregated access counts by offset" data</a></span></dt>
|
||
</dl></dd>
|
||
<dt><span class="sect1"><a href="dh-manual.html#dh-manual.options">10.3. DHAT Command-line Options</a></span></dt>
|
||
</dl>
|
||
</div>
|
||
<p>To use this tool, you must specify
|
||
<code class="option">--tool=exp-dhat</code> on the Valgrind
|
||
command line.</p>
|
||
<div class="sect1">
|
||
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
|
||
<a name="dh-manual.overview"></a>10.1. Overview</h2></div></div></div>
|
||
<p>DHAT is a tool for examining how programs use their heap
|
||
allocations.</p>
|
||
<p>It tracks the allocated blocks, and inspects every memory access
|
||
to find which block, if any, it is to. The following data is
|
||
collected and presented per allocation point (allocation
|
||
stack):</p>
|
||
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
|
||
<li class="listitem"><p>Total allocation (number of bytes and
|
||
blocks)</p></li>
|
||
<li class="listitem"><p>maximum live volume (number of bytes and
|
||
blocks)</p></li>
|
||
<li class="listitem"><p>average block lifetime (number of instructions
|
||
between allocation and freeing)</p></li>
|
||
<li class="listitem"><p>average number of reads and writes to each byte in
|
||
the block ("access ratios")</p></li>
|
||
<li class="listitem"><p>for allocation points which always allocate blocks
|
||
only of one size, and that size is 4096 bytes or less: counts
|
||
showing how often each byte offset inside the block is
|
||
accessed.</p></li>
|
||
</ul></div>
|
||
<p>Using these statistics it is possible to identify allocation
|
||
points with the following characteristics:</p>
|
||
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
|
||
<li class="listitem"><p>potential process-lifetime leaks: blocks allocated
|
||
by the point just accumulate, and are freed only at the end of the
|
||
run.</p></li>
|
||
<li class="listitem"><p>excessive turnover: points which chew through a lot
|
||
of heap, even if it is not held onto for very long</p></li>
|
||
<li class="listitem"><p>excessively transient: points which allocate very
|
||
short lived blocks</p></li>
|
||
<li class="listitem"><p>useless or underused allocations: blocks which are
|
||
allocated but not completely filled in, or are filled in but not
|
||
subsequently read.</p></li>
|
||
<li class="listitem"><p>blocks with inefficient layout -- areas never
|
||
accessed, or with hot fields scattered throughout the
|
||
block.</p></li>
|
||
</ul></div>
|
||
<p>As with the Massif heap profiler, DHAT measures program progress
|
||
by counting instructions, and so presents all age/time related figures
|
||
as instruction counts. This sounds a little odd at first, but it
|
||
makes runs repeatable in a way which is not possible if CPU time is
|
||
used.</p>
|
||
</div>
|
||
<div class="sect1">
|
||
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
|
||
<a name="dh-manual.understanding"></a>10.2. Understanding DHAT's output</h2></div></div></div>
|
||
<p>DHAT provides a lot of useful information on dynamic heap usage.
|
||
Most of the art of using it is in interpretation of the resulting
|
||
numbers. That is best illustrated via a set of examples.</p>
|
||
<div class="sect2">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="idm257589292208"></a>10.2.1. Interpreting the max-live, tot-alloc and deaths fields</h3></div></div></div>
|
||
<div class="sect3"><div class="titlepage"><div><div><h4 class="title">
|
||
<a name="idm257590510480"></a>10.2.1.1. A simple example</h4></div></div></div></div>
|
||
<pre class="screen">
|
||
======== SUMMARY STATISTICS ========
|
||
|
||
guest_insns: 1,045,339,534
|
||
[...]
|
||
max-live: 63,490 in 984 blocks
|
||
tot-alloc: 1,904,700 in 29,520 blocks (avg size 64.52)
|
||
deaths: 29,520, at avg age 22,227,424
|
||
acc-ratios: 6.37 rd, 1.14 wr (12,141,526 b-read, 2,174,460 b-written)
|
||
at 0x4C275B8: malloc (vg_replace_malloc.c:236)
|
||
by 0x40350E: tcc_malloc (tinycc.c:6712)
|
||
by 0x404580: tok_alloc_new (tinycc.c:7151)
|
||
by 0x40870A: next_nomacro1 (tinycc.c:9305)
|
||
</pre>
|
||
<p>Over the entire run of the program, this stack (allocation
|
||
point) allocated 29,520 blocks in total, containing 1,904,700 bytes in
|
||
total. By looking at the max-live data, we see that not many blocks
|
||
were simultaneously live, though: at the peak, there were 63,490
|
||
allocated bytes in 984 blocks. This tells us that the program is
|
||
steadily freeing such blocks as it runs, rather than hanging on to all
|
||
of them until the end and freeing them all.</p>
|
||
<p>The deaths entry tells us that 29,520 blocks allocated by this stack
|
||
died (were freed) during the run of the program. Since 29,520 is
|
||
also the number of blocks allocated in total, that tells us that
|
||
all allocated blocks were freed by the end of the program.</p>
|
||
<p>It also tells us that the average age at death was 22,227,424
|
||
instructions. From the summary statistics we see that the program ran
|
||
for 1,045,339,534 instructions, and so the average age at death is
|
||
about 2% of the program's total run time.</p>
|
||
<div class="sect3"><div class="titlepage"><div><div><h4 class="title">
|
||
<a name="idm257591621936"></a>10.2.1.2. Example of a potential process-lifetime leak</h4></div></div></div></div>
|
||
<p>This next example (from a different program than the above)
|
||
shows a potential process lifetime leak. A process lifetime leak
|
||
occurs when a program keeps allocating data, but only frees the
|
||
data just before it exits. Hence the program's heap grows constantly
|
||
in size, yet Memcheck reports no leak, because the program has
|
||
freed up everything at exit. This is particularly a hazard for
|
||
long running programs.</p>
|
||
<pre class="screen">
|
||
======== SUMMARY STATISTICS ========
|
||
|
||
guest_insns: 418,901,537
|
||
[...]
|
||
max-live: 32,512 in 254 blocks
|
||
tot-alloc: 32,512 in 254 blocks (avg size 128.00)
|
||
deaths: 254, at avg age 300,467,389
|
||
acc-ratios: 0.26 rd, 0.20 wr (8,756 b-read, 6,604 b-written)
|
||
at 0x4C275B8: malloc (vg_replace_malloc.c:236)
|
||
by 0x4C27632: realloc (vg_replace_malloc.c:525)
|
||
by 0x56FF41D: QtFontStyle::pixelSize(unsigned short, bool) (qfontdatabase.cpp:269)
|
||
by 0x5700D69: loadFontConfig() (qfontdatabase_x11.cpp:1146)
|
||
</pre>
|
||
<p>There are two tell-tale signs that this might be a
|
||
process-lifetime leak. Firstly, the max-live and tot-alloc numbers
|
||
are identical. The only way that can happen is if these blocks are
|
||
all allocated and then all deallocated.</p>
|
||
<p>Secondly, the average age at death (300 million insns) is 71% of
|
||
the total program lifetime (419 million insns), hence this is not a
|
||
transient allocation-free spike -- rather, it is spread out over a
|
||
large part of the entire run. One interpretation is, roughly, that
|
||
all 254 blocks were allocated in the first half of the run, held onto
|
||
for the second half, and then freed just before exit.</p>
|
||
</div>
|
||
<div class="sect2">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="idm257590618576"></a>10.2.2. Interpreting the acc-ratios fields</h3></div></div></div>
|
||
<div class="sect3"><div class="titlepage"><div><div><h4 class="title">
|
||
<a name="idm257590617824"></a>10.2.2.1. A fairly harmless allocation point record</h4></div></div></div></div>
|
||
<pre class="screen">
|
||
max-live: 49,398 in 808 blocks
|
||
tot-alloc: 1,481,940 in 24,240 blocks (avg size 61.13)
|
||
deaths: 24,240, at avg age 34,611,026
|
||
acc-ratios: 2.13 rd, 0.91 wr (3,166,650 b-read, 1,358,820 b-written)
|
||
at 0x4C275B8: malloc (vg_replace_malloc.c:236)
|
||
by 0x40350E: tcc_malloc (tinycc.c:6712)
|
||
by 0x404580: tok_alloc_new (tinycc.c:7151)
|
||
by 0x4046C4: tok_alloc (tinycc.c:7190)
|
||
</pre>
|
||
<p>The acc-ratios field tells us that each byte in the blocks
|
||
allocated here is read an average of 2.13 times before the block is
|
||
deallocated. Given that the blocks have an average age at death of
|
||
34,611,026, that's one read per block per approximately every 15
|
||
million instructions. So from that standpoint the blocks aren't
|
||
"working" very hard.</p>
|
||
<p>More interesting is the write ratio: each byte is written an
|
||
average of 0.91 times. This tells us that some parts of the allocated
|
||
blocks are never written, at least 9% on average. To completely
|
||
initialise the block would require writing each byte at least once,
|
||
and that would give a write ratio of 1.0. The fact that some block
|
||
areas are evidently unused might point to data alignment holes or
|
||
other layout inefficiencies.</p>
|
||
<p>Well, at least all the blocks are freed (24,240 allocations,
|
||
24,240 deaths).</p>
|
||
<p>If all the blocks had been the same size, DHAT would also show
|
||
the access counts by block offset, so we could see where exactly these
|
||
unused areas are. However, that isn't the case: the blocks have
|
||
varying sizes, so DHAT can't perform such an analysis. We can see
|
||
that they must have varying sizes since the average block size, 61.13,
|
||
isn't a whole number.</p>
|
||
<div class="sect3"><div class="titlepage"><div><div><h4 class="title">
|
||
<a name="idm257592529776"></a>10.2.2.2. A more suspicious looking example</h4></div></div></div></div>
|
||
<pre class="screen">
|
||
max-live: 180,224 in 22 blocks
|
||
tot-alloc: 180,224 in 22 blocks (avg size 8192.00)
|
||
deaths: none (none of these blocks were freed)
|
||
acc-ratios: 0.00 rd, 0.00 wr (0 b-read, 0 b-written)
|
||
at 0x4C275B8: malloc (vg_replace_malloc.c:236)
|
||
by 0x40350E: tcc_malloc (tinycc.c:6712)
|
||
by 0x40369C: __sym_malloc (tinycc.c:6787)
|
||
by 0x403711: sym_malloc (tinycc.c:6805)
|
||
</pre>
|
||
<p>Here, both the read and write access ratios are zero. Hence
|
||
this point is allocating blocks which are never used, neither read nor
|
||
written. Indeed, they are also not freed ("deaths: none") and are
|
||
simply leaked. So, here is 180k of completely useless allocation that
|
||
could be removed.</p>
|
||
<p>Re-running with Memcheck does indeed report the same leak. What
|
||
DHAT can tell us, that Memcheck can't, is that not only are the blocks
|
||
leaked, they are also never used.</p>
|
||
<div class="sect3"><div class="titlepage"><div><div><h4 class="title">
|
||
<a name="idm257587906976"></a>10.2.2.3. Another suspicious example</h4></div></div></div></div>
|
||
<p>Here's one where blocks are allocated, written to,
|
||
but never read from. We see this immediately from the zero read
|
||
access ratio. They do get freed, though:</p>
|
||
<pre class="screen">
|
||
max-live: 54 in 3 blocks
|
||
tot-alloc: 1,620 in 90 blocks (avg size 18.00)
|
||
deaths: 90, at avg age 34,558,236
|
||
acc-ratios: 0.00 rd, 1.11 wr (0 b-read, 1,800 b-written)
|
||
at 0x4C275B8: malloc (vg_replace_malloc.c:236)
|
||
by 0x40350E: tcc_malloc (tinycc.c:6712)
|
||
by 0x4035BD: tcc_strdup (tinycc.c:6750)
|
||
by 0x41FEBB: tcc_add_sysinclude_path (tinycc.c:20931)
|
||
</pre>
|
||
<p>In the previous two examples, it is easy to see blocks that are
|
||
never written to, or never read from, or some combination of both.
|
||
Unfortunately, in C++ code, the situation is less clear. That's
|
||
because an object's constructor will write to the underlying block,
|
||
and its destructor will read from it. So the block's read and write
|
||
ratios will be non-zero even if the object, once constructed, is never
|
||
used, but only eventually destructed.</p>
|
||
<p>Really, what we want is to measure only memory accesses in
|
||
between the end of an object's construction and the start of its
|
||
destruction. Unfortunately I do not know of a reliable way to
|
||
determine when those transitions are made.</p>
|
||
</div>
|
||
<div class="sect2">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="idm257591842928"></a>10.2.3. Interpreting "Aggregated access counts by offset" data</h3></div></div></div>
|
||
<p>For allocation points that always allocate blocks of the same
|
||
size, and which are 4096 bytes or smaller, DHAT counts accesses
|
||
per offset, for example:</p>
|
||
<pre class="screen">
|
||
max-live: 317,408 in 5,668 blocks
|
||
tot-alloc: 317,408 in 5,668 blocks (avg size 56.00)
|
||
deaths: 5,668, at avg age 622,890,597
|
||
acc-ratios: 1.03 rd, 1.28 wr (327,642 b-read, 408,172 b-written)
|
||
at 0x4C275B8: malloc (vg_replace_malloc.c:236)
|
||
by 0x5440C16: QDesignerPropertySheetPrivate::ensureInfo (qhash.h:515)
|
||
by 0x544350B: QDesignerPropertySheet::setVisible (qdesigner_propertysh...)
|
||
by 0x5446232: QDesignerPropertySheet::QDesignerPropertySheet (qdesigne...)
|
||
|
||
Aggregated access counts by offset:
|
||
|
||
[ 0] 28782 28782 28782 28782 28782 28782 28782 28782
|
||
[ 8] 20638 20638 20638 20638 0 0 0 0
|
||
[ 16] 22738 22738 22738 22738 22738 22738 22738 22738
|
||
[ 24] 6013 6013 6013 6013 6013 6013 6013 6013
|
||
[ 32] 18883 18883 18883 37422 0 0 0 0
|
||
[ 36] 5668 11915 5668 5668 11336 11336 11336 11336
|
||
[ 48] 6166 6166 6166 6166 0 0 0 0
|
||
</pre>
|
||
<p>This is fairly typical, for C++ code running on a 64-bit
|
||
platform. Here, we have aggregated access statistics for 5668 blocks,
|
||
all of size 56 bytes. Each byte has been accessed at least 5668
|
||
times, except for offsets 12--15, 36--39 and 52--55. These are likely
|
||
to be alignment holes.</p>
|
||
<p>Careful interpretation of the numbers reveals useful information.
|
||
Groups of N consecutive identical numbers that begin at an N-aligned
|
||
offset, for N being 2, 4 or 8, are likely to indicate an N-byte object
|
||
in the structure at that point. For example, the first 32 bytes of
|
||
this object are likely to have the layout</p>
|
||
<pre class="screen">
|
||
[0 ] 64-bit type
|
||
[8 ] 32-bit type
|
||
[12] 32-bit alignment hole
|
||
[16] 64-bit type
|
||
[24] 64-bit type
|
||
</pre>
|
||
<p>As a counterexample, it's also clear that, whatever is at offset 32,
|
||
it is not a 32-bit value. That's because the last number of the group
|
||
(37422) is not the same as the first three (18883 18883 18883).</p>
|
||
<p>This example leads one to enquire (by reading the source code)
|
||
whether the zeroes at 12--15 and 52--55 are alignment holes, and
|
||
whether 48--51 is indeed a 32-bit type. If so, it might be possible
|
||
to place what's at 48--51 at 12--15 instead, which would reduce
|
||
the object size from 56 to 48 bytes.</p>
|
||
<p>Bear in mind that the above inferences are all only "maybes". That's
|
||
because they are based on dynamic data, not static analysis of the
|
||
object layout. For example, the zeroes might not be alignment
|
||
holes, but rather just parts of the structure which were not used
|
||
at all for this particular run. Experience shows that's unlikely
|
||
to be the case, but it could happen.</p>
|
||
</div>
|
||
</div>
|
||
<div class="sect1">
|
||
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
|
||
<a name="dh-manual.options"></a>10.3. DHAT Command-line Options</h2></div></div></div>
|
||
<p>DHAT-specific command-line options are:</p>
|
||
<div class="variablelist">
|
||
<a name="dh.opts.list"></a><dl class="variablelist">
|
||
<dt>
|
||
<a name="opt.show-top-n"></a><span class="term">
|
||
<code class="option">--show-top-n=<number>
|
||
[default: 10] </code>
|
||
</span>
|
||
</dt>
|
||
<dd><p>At the end of the run, DHAT sorts the accumulated
|
||
allocation points according to some metric, and shows the
|
||
highest scoring entries. <code class="varname">--show-top-n</code>
|
||
controls how many entries are shown. The default of 10 is
|
||
quite small. For realistic applications you will probably need
|
||
to set it much higher, at least several hundred.</p></dd>
|
||
<dt>
|
||
<a name="opt.sort-by"></a><span class="term">
|
||
<code class="option">--sort-by=<string> [default: max-bytes-live] </code>
|
||
</span>
|
||
</dt>
|
||
<dd>
|
||
<p>At the end of the run, DHAT sorts the accumulated
|
||
allocation points according to some metric, and shows the
|
||
highest scoring entries. <code class="varname">--sort-by</code>
|
||
selects the metric used for sorting:</p>
|
||
<p><code class="varname">max-bytes-live </code> maximum live bytes [default]</p>
|
||
<p><code class="varname">tot-bytes-allocd </code> total allocation (turnover)</p>
|
||
<p><code class="varname">max-blocks-live </code> maximum live blocks</p>
|
||
<p>This controls the order in which allocation points are
|
||
displayed. You can choose to look at allocation points with
|
||
the highest maximum liveness, or the highest total turnover, or
|
||
by the highest number of live blocks. These give usefully
|
||
different pictures of program behaviour. For example, sorting
|
||
by maximum live blocks tends to show up allocation points
|
||
creating large numbers of small objects.</p>
|
||
</dd>
|
||
</dl>
|
||
</div>
|
||
<p>One important point to note is that each allocation stack counts
|
||
as a seperate allocation point. Because stacks by default have 12
|
||
frames, this tends to spread data out over multiple allocation points.
|
||
You may want to use the flag --num-callers=4 or some such small
|
||
number, to reduce the spreading.</p>
|
||
</div>
|
||
</div>
|
||
<div>
|
||
<br><table class="nav" width="100%" cellspacing="3" cellpadding="2" border="0" summary="Navigation footer">
|
||
<tr>
|
||
<td rowspan="2" width="40%" align="left">
|
||
<a accesskey="p" href="ms-manual.html"><< 9. Massif: a heap profiler</a> </td>
|
||
<td width="20%" align="center"><a accesskey="u" href="manual.html">Up</a></td>
|
||
<td rowspan="2" width="40%" align="right"> <a accesskey="n" href="sg-manual.html">11. SGCheck: an experimental stack and global array overrun detector >></a>
|
||
</td>
|
||
</tr>
|
||
<tr><td width="20%" align="center"><a accesskey="h" href="index.html">Home</a></td></tr>
|
||
</table>
|
||
</div>
|
||
</body>
|
||
</html>
|