In my last article, I showed you how to profile your code using the Devel::SmallProf
module, which gave you times and counts per line of code. In this article we
will make code graphs with that information by using the Devel::GraphVizProf module.
A graph, in this sense, is a collection of connected nodes. The nodes in code
graphs are the executable statements of the program and are connected by "edges"
which show the flow of code from one statement to the next. This sort of graph
is also known as a "directed graph," since the edges show the direction of flow
from node to node.
GraphViz is an open-source graphing program developed by AT&T that can help
developers visualize structural information, such as code flow, database table
relationships, or the links between web pages. GraphViz and many other
interesting tools are provided free of charge by AT&T. How easy this package is
to install depends on your operating system. On FreeBSD, simply go to
/usr/ports/graphics/graphviz, run make install, and then go off for a cup of
coffee. Installing GraphViz is a bit more involved on Red Hat Linux due to some
incompatibilities mentioned on the Graph Visualization Project development site.
There appears to be an initial version for Windows, but I have not tried it, so
programmer beware.
You can get Devel::GraphVizProf from the Comprehensive Perl Archive Network. It
is in the GraphViz module distribution by Leon Brocard, who also presented a talk
about Perl code graphs at YAPC::Europe 2000. It does not install automatically
as of version 0.12, but all that you need to do is copy the Devel directory to an
appropriate Perl library directory. If you have not done this before or cannot
install modules into the Perl library directories, perlfaq8 can help you figure
out what to do. Although you may suffer a bit more while installing this module,
the coolness factor is worth the pain.
GraphViz can do quite a bit and comes with more tools than I will show, but you
can see the documentation for more details. To show a simple code graph, I wrote
this sample program,

    #!/usr/bin/perl

    my $test = 0;

    while( $test++ < 15 )
    {
        my_print("Hello $test\n");
    }

    sub my_print
    {
        print $_[0];
    }

and then I wrote the graph description of it. Each executable statement is
defined as a node, and the edges are defined as connections between them. In
this case, I connect statements that follow each other during program execution.
Rather than discuss the dot syntax here, I refer you to the dot documentation so
I can get on with the cool stuff. Later the Devel::GraphVizProf module will do
all of this for me.

    digraph test {
        bgcolor="white";
        node2 [color="0,1,0", label="my $test = 0;"];
        node4 [color="0,1,0", label="my_print(\"Hello $test\n\");"];
        node3 [color="0,1,0", label="print $_[0];"];
        node1 [color="0,1,0", label="while($test++ < 15 )"];
        node2 -> node1 [color="0,1,0", len="2", w="0"];
        node4 -> node3 [color="0,1,0", len="2", w="0"];
        node3 -> node1 [color="0,1,0", len="2", w="0"];
        node1 -> node4 [color="0,1,0", len="2", w="0"];
    }

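Writing a description like this by hand gets tedious for anything larger than a toy program. As a minimal sketch of the idea, the following script builds a similar description from a hash of node labels and a list of edges. The helpers quote_label and graph_description are hypothetical names made up for this illustration; they are not part of GraphViz or Devel::GraphVizProf. Unlike the hand-written version, the sketch also escapes backslashes, so a \n in the source code appears literally in the label.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Statements of the sample program, keyed by the node names used above.
my %label_for = (
    node1 => 'while($test++ < 15 )',
    node2 => 'my $test = 0;',
    node3 => 'print $_[0];',
    node4 => 'my_print("Hello $test\n");',
);

# Each pair is one directed edge: [ from, to ].
my @edges = (
    [ qw( node2 node1 ) ],
    [ qw( node1 node4 ) ],
    [ qw( node4 node3 ) ],
    [ qw( node3 node1 ) ],
);

# Hypothetical helper: escape the characters that would otherwise end
# or alter a double-quoted dot string, then add the surrounding quotes.
sub quote_label {
    my( $text ) = @_;
    $text =~ s/(["\\])/\\$1/g;
    return qq("$text");
}

# Hypothetical helper: return the text of a digraph with one line per
# node and one line per edge.
sub graph_description {
    my( $name, $label_for, $edges ) = @_;

    my @lines = ( "digraph $name {" );
    foreach my $node ( sort keys %$label_for ) {
        push @lines, sprintf "\t%s [label=%s];",
            $node, quote_label( $label_for->{$node} );
    }
    foreach my $edge ( @$edges ) {
        my( $from, $to ) = @$edge;
        push @lines, sprintf "\t%s -> %s;", $from, $to;
    }
    push @lines, "}";

    return join "\n", @lines, '';
}

print graph_description( 'test', \%label_for, \@edges );
```

Redirecting the output of this script to a file gives something that dot can read in the same way as the hand-written description.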
Once I have created the nodes and connected them with edges, I transform the
graph description into an image with the dot utility that comes with the GraphViz
distribution. This program can produce output in several formats including Adobe
PostScript, FrameMaker MIF, PNG, and many others. For this article I will use PNG
so you can see the images. To generate the image file, I tell dot which output
format I want with the -T switch and what the output file name is with the -o
switch, along with the name of the file that has the graph description. The -G
switch allows me to specify options for the entire graph. In this case I want the
color of the background to be white. You might not need this, but if you get an
image full of black, that probably means GraphViz does not know which color you
want to use for the background and uses black by default.

    prompt$ dot -Gbgcolor="white" -Tpng -o example.png example.dot

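If you render the same description in several formats, it can help to assemble the dot command line in one place. This is only a sketch: dot_command is a hypothetical helper that builds the argument list from the -G, -T, and -o switches described above and does not run anything itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for dot. Graph-wide
# options become -G switches, the format becomes the -T switch, and
# the output file name is derived from the input file name.
sub dot_command {
    my( $input, $format, %graph_attr ) = @_;

    # example.dot becomes example.png; other names just gain the extension.
    ( my $output = $input ) =~ s/(?:\.dot)?\z/.$format/;

    my @command = ( 'dot' );
    push @command, "-G$_=$graph_attr{$_}" foreach sort keys %graph_attr;
    push @command, "-T$format", '-o', $output, $input;

    return @command;
}

foreach my $format ( qw( png ps ) ) {
    my @command = dot_command( 'example.dot', $format, bgcolor => 'white' );
    print "@command\n";

    # To actually render the image, pass the list to system():
    # system( @command ) == 0 or die "dot failed: $?";
}
```

The first line it prints is dot -Gbgcolor=white -Tpng -o example.png example.dot, which matches the command shown above. Passing the list form to system() avoids any shell quoting of the option values.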
The image shows the graph that I created.
I can also change the color of the edges so that I can encode
more information in the graph. The color of the edge can be used
to indicate how often the program goes from one statement to another.
I can then literally see the parts of the program that might deserve
more consideration for optimization or debugging. In this example,
I have colored the lines involved in the loop blue to indicate that
they execute more often than the other lines.

    digraph test {
        node2 [color="0,1,0", label="my $test = 0;"];
        node4 [color="0,1,0", label="my_print(\"Hello $test\n\");"];
        node3 [color="0,1,0", label="print $_[0];"];
        node1 [color="0,1,0", label="while($test++ < 15 )"];
        // the edge into the loop runs once, so it stays black
        node2 -> node1 [color="0,1,0", len="2", w="0"];
        // the loop edges are blue: hue 0.667 in dot's "H,S,V" notation
        node4 -> node3 [color="0.667,1,1", len="2", w="0"];
        node3 -> node1 [color="0.667,1,1", len="2", w="0"];
        node1 -> node4 [color="0.667,1,1", len="2", w="0"];
    }

I already know that the Devel::SmallProf module can count the number of times a
line of code is executed and how much time it takes to execute that line. The
Devel::GraphVizProf module does the same thing. Rather than output a text report
like Devel::SmallProf does, Devel::GraphVizProf outputs a graph description. It
uses the edge color to encode the line counts. Statements that are connected
infrequently relative to other statements are colored darker, and statements
that are connected more frequently are colored more brightly. In this example,
the edges that are black only happen a couple of times while the ones colored
blue happen very frequently. I can easily identify where my program is spending
time by looking at the colored lines rather than going through lines of text
output. The power of pictures becomes apparent.
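To make the encoding concrete, here is a rough sketch of how a count could be turned into an edge color. It assumes a fixed blue hue whose brightness is the edge's count divided by the largest count in the graph. That formula, the edge_color helper, and the counts are all assumptions for illustration, not necessarily what Devel::GraphVizProf does internally.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption for illustration only: a fixed blue hue (0.667) whose
# brightness grows with the edge's share of the largest count. Returns
# a color in dot's "H,S,V" notation.
sub edge_color {
    my( $count, $max ) = @_;
    my $value = $max > 0 ? $count / $max : 0;
    return sprintf '0.667,1,%.3f', $value;
}

# Hypothetical counts of how often control passed along each edge.
my %count_for = (
    'node2 -> node1' => 1,
    'node1 -> node4' => 15,
    'node4 -> node3' => 15,
    'node3 -> node1' => 15,
);

my( $max ) = sort { $b <=> $a } values %count_for;

foreach my $edge ( sort keys %count_for ) {
    printf "%s [color=\"%s\"];\n",
        $edge, edge_color( $count_for{$edge}, $max );
}
```

With these counts the three loop edges come out at full brightness, while the single pass from the assignment into the loop is nearly black.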
I modified the example script to add some lines of code that will be executed more
often than those in the while loop to show how Devel::GraphVizProf
displays relative frequencies of execution.

    #!/usr/bin/perl

    my $test = 0;

    while( $test++ < 100 )
    {
        my_print("Hello $test\n");
    }

    my $sum = 0;

    foreach( 0 .. 1000 )
    {
        $sum += $_;
    }

    sub my_print
    {
        print $_[0];
    }

I run this script under the Devel::GraphVizProf debugger by using the -d switch.

    prompt$ perl -d:GraphVizProf example.pl

At the end of the program the debugger prints to standard output the information
that I can pass to dot to create the graph. I can send the output to dot
directly, but often the program I graph sends other information to standard
output or I want to change the node information a bit. I save the information in
a file until I am ready to make the graph.

    prompt$ perl -d:GraphVizProf example.pl > example.dot

I then edit out any extraneous output from the program and add
any extra features I might want in the graph (such as
background and foreground colors). Once I am satisfied I make
a PNG image of the graph as I did before.

    prompt$ dot -Gbgcolor="white" -Tpng -o example.png example.dot

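Editing out the program's own output can also be scripted. The following sketch assumes that the graph description starts on its own line with the word digraph and runs to the end of the captured output; extract_graph is a hypothetical helper, and the sample text stands in for the contents of example.dot.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return everything from the first line that
# begins with "digraph" to the end of the text, or undef if there is
# no such line.
sub extract_graph {
    my( $captured ) = @_;
    return $captured =~ /^(digraph\b.*)\z/ms ? $1 : undef;
}

# Stand-in for the captured output of a profiling run: the program's
# own output followed by the graph description.
my $captured = <<'OUTPUT';
Hello 1
Hello 2
digraph test {
    node1 -> node2;
}
OUTPUT

my $graph = extract_graph( $captured );
print defined $graph ? $graph : "no graph description found\n";
```

If the program under test prints its last line without a trailing newline, the digraph keyword will not start a line and this sketch will miss it, so check the result before handing it to dot.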
Look at how large that image is though (118 KB and 5052x2751 pixels). It is large
not only in file size, but in dimension. The interesting code only takes a small
portion of it, since a lot of the code that I see in the image is from the parts
of the debugger program that actually create the image.
I don't want to see all of that, so I limit the graph to the statements in
particular namespaces. To do that, I create a .smallprof file in the same
directory from which I will run the program. The .smallprof file is included in
the Devel::GraphVizProf module at runtime with do, so I can put valid Perl
statements in there. If I create a hash named %DB::packages, Devel::GraphVizProf
only profiles packages that exist as keys in that hash and have a true value
(which is anything other than 0, the string '0', the empty string, or undef), and
these are the only packages that will appear in the code graph. By default, Perl
programs are in the main namespace (or package), which corresponds to the main()
function in C. If I want to profile and graph only statements in the main
namespace, I can use this .smallprof file.

    $DB::packages{'main'} = 1;

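The true-value rule is easy to check outside the debugger. This sketch uses a lexical hash as a stand-in for %DB::packages and a hypothetical helper, profiled_packages, that applies the rule as described above. It illustrates the rule only and is not the module's own code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the sorted names of the packages whose
# values in the hash are true.
sub profiled_packages {
    my( $packages ) = @_;
    my @kept = sort grep { $packages->{$_} } keys %$packages;
    return @kept;
}

# Stand-in for %DB::packages with a mix of true and false values.
my %packages = (
    'main'           => 1,
    'Business::ISBN' => 'yes',
    'Data::Dumper'   => 0,
    'Carp'           => '',
    'Exporter'       => undef,
);

print join( ', ', profiled_packages( \%packages ) ), "\n";
```

This prints Business::ISBN, main. The three packages with false values are left out even though they exist as keys in the hash.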
I then rerun the debugger and redraw the graph, which turns out much smaller and
easier to read.

    prompt$ perl -d:GraphVizProf example.pl > example.dot
    prompt$ dot -Gbgcolor="white" -Tpng -o example.png example.dot

The new image is much smaller and only shows the code of interest. Notice
that the lines that execute more often are connected by lines that are brighter
colors. If I had a much longer program, and a much larger code graph, I could
easily scan the image looking for the brightest colored lines to see where the
program is spending its time. Although this is not going to unlock the secrets
of my program, I can use the graph along with other information to decide how
to optimize or debug it.
Just for kicks, I ran the test.pl script from the Business::ISBN module under the
Devel::GraphVizProf debugger using a different .smallprof file so that I could
also profile code in the Business::ISBN namespace.

    # the naked block defines the scope of the @modules array.
    # i don't want to mess up the rest of the program ;)
    {
        my @modules = qw( main Business::ISBN );
        @DB::packages{ @modules } = @modules;
    }

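The second statement in that block is a hash slice, which sets several keys in one assignment. As a small sketch, with lexical hashes standing in for %DB::packages, the slice and an explicit loop produce the same result. Each package name becomes its own value, and a non-empty name is a true value, so both packages get profiled.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @modules = qw( main Business::ISBN );

# Hash slice, as in the .smallprof file above (a lexical hash stands
# in for %DB::packages).
my %by_slice;
@by_slice{ @modules } = @modules;

# The same assignment written as an explicit loop.
my %by_loop;
$by_loop{$_} = $_ foreach @modules;

foreach my $package ( sort keys %by_slice ) {
    print "$package => $by_slice{$package}\n";
}
```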
The resulting code graph is a rather large image of the program: isbn.gif
(312 KB and 7866x6068).
There is a lot more that you can do with GraphViz to make these
graphs prettier, but that is up to you. You can install more fonts, use different
colors or outlines, and many other things to justify the use of a really expensive
printer. Just do not tell your friends and co-workers how easy it is to do. :)
brian d foy has been a Perl user since 1994. He is the founder of the first Perl
users group, NY.pm, and Perl Mongers, the Perl advocacy organization. He has been
teaching Perl through Stonehenge Consulting for the past three years, and has
been a featured speaker at The Perl Conference, Perl University, YAPC, COMDEX,
and Builder.com. Some of brian's other articles have appeared in The Perl Journal.