Moshe is a systems administrator and operating-system researcher and has an M.S. and a Ph.D. in computer science. He can be contacted at [email protected].
It is one of the ironies of a programmer's life that when we've finished writing a program, the real work has just begun. Once the program compiles cleanly and seems to do what it was supposed to do, good programmers then need to find out how fast (or slow) the program performs its designed purpose. In other words, we need to stress-test the program.
There are commercial stress-test applications on the market, but they are expensive for individuals or small development firms, and often too generalized: capable of most everything, but not really optimized for your particular situation. With its myriad modules (I call them "connectors"), Perl is a great tool to help you stress-test your particular program.
As a senior contributing editor at BYTE.com, I've been able to compare operating systems to one another (usually Linux against something else) over the last few years, and the way I do it is by running the exact same stress-test against standard applications like web servers and databases (MySQL). As it turns out, Perl is my tool of choice for this purpose, thanks in part to the CPAN repository of modules.
Parallel Fork Manager
One of the modules that best suits the stress-test purpose is the Parallel Fork Manager (http://aspn.activestate.com/ASPN/CodeDoc/Parallel-ForkManager/ForkManager.html). Parallelizable parts of a Perl program can be forked into separate processes that continue to run concurrently with the parent process in a multi-CPU environment. In a massively parallel environment, such as my own openMosix technology (http://www.openMosix.org/), every forked Perl program always finds a CPU to run on. The total run time is therefore drastically reduced. The Perl program in Listing 1 can run from many nodes in parallel in an openMosix cluster, and the server being tested is subjected to a veritable bombardment of client-side requests.
Using the Parallel Fork Manager is easy. A simple construct like Example 1 parallelizes very nicely, and by using openMosix you don't even have to care about distribution in the cluster. Several callbacks (or event handlers) are available, which are called on certain events. Among them, you typically use run_on_start $code, which defines a subroutine that the parent process calls each time a child has been forked successfully.
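The pattern described above can be sketched roughly as follows. This is only a minimal illustration, assuming Parallel::ForkManager is installed from CPAN; the values of $max_children and $jobs and the arithmetic done in the child are hypothetical placeholders, not part of the original program.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Parallel::ForkManager;

my $max_children = 4;    # hypothetical: upper bound on concurrent children
my $jobs         = 8;    # hypothetical: total number of units of work

my $pm = Parallel::ForkManager->new($max_children);

my ($started, $finished) = (0, 0);

# Runs in the parent each time a child has been forked.
$pm->run_on_start(sub {
    my ($pid, $ident) = @_;
    $started++;
});

# Runs in the parent each time a child has been reaped.
$pm->run_on_finish(sub {
    my ($pid, $exit_code, $ident) = @_;
    $finished++ if $exit_code == 0;
});

for my $job (1 .. $jobs) {
    # start() returns the child's pid in the parent and 0 in the child.
    $pm->start($job) and next;

    # Child: placeholder for the real work (for example, one client's requests).
    my $sum = 0;
    $sum += $_ for 1 .. 1000;

    $pm->finish(0);      # the child exits here with status 0
}

# Block until every child has exited.
$pm->wait_all_children;
print "started $started children, $finished finished cleanly\n";
```

Both counters are updated in the parent, which is why they can be read after wait_all_children returns; anything the child changes in its own copy of a variable is lost when it exits.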
Looking at Listing 1, all the script does is fill a table with $rows rows and then run $processes client processes in parallel, each of which issues $transactions transactions (SQL queries) against the database. In the beginning, you load the fork-manager and database modules and set the parameters. Then, you define an array of names that will be used later to populate the database with sample records. If it is important to increase the cardinality of the indices, then you can do so by building an array of n names by reading in n words from the UNIX dict file (which in Red Hat is in /usr/share/dict/linux.words).
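Reading the names from the dict file might look something like the sketch below. The helper name load_names, the limit of 10,000 words, and the fallback list are assumptions made for illustration. Words containing anything but letters are skipped, because Listing 1 interpolates the name directly into a single-quoted SQL string, where an apostrophe would break the statement.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read up to $n words from a word list that has one word per line.
# Returns the fallback names if the file cannot be opened or yields nothing.
sub load_names {
    my ($path, $n, @fallback) = @_;
    open(my $fh, '<', $path) or return @fallback;
    my @names;
    while (my $word = <$fh>) {
        chomp $word;
        next unless $word =~ /^[A-Za-z]+$/;   # skip apostrophes, digits, hyphens
        push @names, $word;
        last if @names >= $n;
    }
    close $fh;
    return @names ? @names : @fallback;
}

# The path is the Red Hat location mentioned in the text; other systems differ.
my @NAMES = load_names('/usr/share/dict/linux.words', 10_000,
                       qw/Alf Ben Benny Daniel David/);
printf "loaded %d names\n", scalar @NAMES;
```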
Using the well-known DBI module, you then connect to the database (Oracle in this example) and create the table of this fictitious application, or empty it if it already exists, before populating it. In Part Four of the code (see comments), you tell the openMosix clustering system (through the convenient /proc filesystem of Linux) that all forked Perl programs are free to migrate to other nodes in the cluster, so that they are evenly distributed and the cluster stays load-balanced. Part Seven is where you parallelize the SQL queries to the database using the Fork Manager module.
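Listing 1 leaves the unlock step commented out. A cautious version of it could look like the sketch below, which writes "0" to the lock file named in the listing (/proc/self/lock) only when that file exists, so the same script also runs on a machine without openMosix. Passing the path as a parameter and returning a status are choices made here for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Write "0" to the given lock file; return 1 on success, 0 on failure.
sub unlock {
    my ($lock_file) = @_;
    open(my $out, '>', $lock_file) or return 0;
    print {$out} "0";
    close($out) or return 0;
    return 1;
}

my $lock_file = '/proc/self/lock';   # path taken from Listing 1
if (-e $lock_file) {
    unlock($lock_file) or warn "Could not unlock myself!\n";
}
else {
    print "no $lock_file here, skipping the unlock step\n";
}
```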
That's it. The entire program consists mostly of the SQL handling for connection to the database and housekeeping. The code related to the Fork Manager is just a few lines. The same program structure can be used for stress-testing any server application. Depending on what you need to do with the results, you can use various Perl modules to represent the results graphically or publish them onto a web page.
For my purposes, a simple run-time summary is enough and it keeps the program simple. Everybody is encouraged to use this Perl script. I'd be interested to know how you enhanced it for your particular needs.
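One way to produce such a run-time summary is to take a timestamp before and after the test with the core Time::HiRes module, as in this sketch. The summary helper and its output format are assumptions, and the loop stands in for the real forked workload.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Build a one-line summary from the elapsed time and the size of the workload.
sub summary {
    my ($elapsed, $processes, $transactions) = @_;
    my $total = $processes * $transactions;
    my $rate  = $elapsed > 0 ? $total / $elapsed : 0;
    return sprintf("%d transactions in %.2f s (%.1f per second)",
                   $total, $elapsed, $rate);
}

my $t0 = [gettimeofday];

# Placeholder workload; in the real script this is the fork loop
# followed by wait_all_children.
my $x = 0;
$x += sqrt($_) for 1 .. 100_000;

my $elapsed = tv_interval($t0);      # seconds since $t0, as a float
print summary($elapsed, 30, 5000), "\n";
```

Measuring in the parent around the whole fork loop gives the wall-clock time seen by the tester, which is the number that matters when comparing one platform against another.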
TPJ
Listing 1
#!/usr/bin/perl -w

# Part One - Modules and parameters
use Parallel::ForkManager;
use DBI;
srand;

############# configuration part #######################
my $processes    = 30;
my $transactions = 5000;
my $rows         = 100000;   # rows to be added to the database
########################################################

# Part Two - Seed for DB population
@NAMES = qw/Alf Ben Benny Daniel David Foo Bar Moshe Avivit Jon Linus Larry Safety First/;

# Part Three - DB environment
my $username;
my $money;
my $id;
my $exists = 0;

$SID = $ENV{"ORACLE_SID"};
if (!$SID) {
    print "No ORACLE_SID environment!\n";
    print "... do not know to which database you want to connect\n";
    print "You have to export the database-name in the environment\n";
    print "variable ORACLE_SID e.g.\n";
    print "export ORACLE_SID=[mydb]\n";
    exit -1;
}

# Open a connection; RaiseError makes later DBI errors fatal
sub dbconnect {
    return DBI->connect( "dbi:Oracle:$SID", "scott", "tiger",
                         { RaiseError => 1, AutoCommit => 0 } )
        || die "Database connection not made: $DBI::errstr";
}

# Part Four - openMosix related. unlocks this process and all its children
# (the body is commented out; enable it when running on an openMosix node)
sub unlock {
    #open (OUTFILE,">/proc/self/lock") ||
    #    die "Could not unlock myself!\n";
    #print OUTFILE "0";
}
unlock();

# Part Five - Here we connect to the DB and populate the tables with records
sub filldb {
    my $dbh = dbconnect();
    for ($loop=0; $loop<$rows; $loop++) {
        # prepare the random values
        $username = $NAMES[rand(@NAMES)];
        $money    = rand()*(rand()*100);
        $id       = int(rand()*1000);
        # insert
        my $sql2 = qq{ insert into stress values ($id,'$username',$money) };
        my $sth2 = $dbh->prepare( $sql2 );
        $sth2->execute();
        $sth2->finish();
    }
    $dbh->commit();    # AutoCommit is off, so commit explicitly
    $dbh->disconnect();
}

# delete old entries from db
sub deldb {
    my $dbh = dbconnect();
    # check whether the table already exists
    my $sql1 = qq{ select TABLE_NAME from user_tables where TABLE_NAME='STRESS' };
    my $sth1 = $dbh->prepare( $sql1 );
    $sth1->execute();
    while ( $sth1->fetch() ) {
        $exists = 1;
    }
    if (!$exists) {
        # create
        my $sql2 = qq{ create table STRESS ( id number, name varchar2(255), money float) };
        my $sth2 = $dbh->prepare( $sql2 );
        $sth2->execute();
        $sth2->finish();
    }
    else {
        # cleanup
        my $sql3 = qq{ delete from STRESS };
        my $sth3 = $dbh->prepare( $sql3 );
        $sth3->execute();
        $sth3->finish();
    }
    $sth1->finish();
    $dbh->commit();
    $dbh->disconnect();
}

# Part Six - Stress-testing the DB with parallel instances
sub stressdb {
    my $dbh = dbconnect();
    srand;   # reseed in each child so the forked clients do not share one random sequence
    for ($loop=0; $loop<$transactions; $loop++) {
        # prepare the random values
        my $p     = rand();
        $username = $NAMES[rand(@NAMES)];
        $money    = rand()*(rand()*100);
        $id       = int(rand()*1000);

        if ($p < 0.3) {
            # update transaction
            print "update\n";
            my $sql = qq{ update stress set name='$username', money=$money where id=$id };
            my $sth = $dbh->prepare( $sql );
            $sth->execute();
            $sth->finish();
        }
        elsif ($p < 0.7) {
            # select
            print "select\n";
            my $sql = qq{ select id, name, money from stress };
            my $sth = $dbh->prepare( $sql );
            $sth->execute();
            my ( $sid, $sname, $smoney );
            $sth->bind_columns( \$sid, \$sname, \$smoney );
            while ( $sth->fetch() ) {
                # print "$sid $sname $smoney\n";
            }
        }
        else {
            # delete + insert
            print "delete/insert\n";
            # delete first
            my $sql1 = qq{ delete from stress where id=$id };
            my $sth1 = $dbh->prepare( $sql1 );
            $sth1->execute();
            $sth1->finish();
            # insert
            my $sql2 = qq{ insert into stress values ($id,'$username',$money) };
            my $sth2 = $dbh->prepare( $sql2 );
            $sth2->execute();
            $sth2->finish();
        }
    }
    $dbh->commit();
    $dbh->disconnect();
}

# Ask on STDIN until the user enters a whole number of 1 to 10 digits
sub ask_number {
    my ($question) = @_;
    print "$question ? : ";
    my $answer = <STDIN>;
    exit -1 unless defined $answer;      # end of input
    chomp($answer);
    while ($answer !~ /^\d{1,10}$/) {
        print "Invalid input! Please try again.\n";
        print "$question ? : ";
        $answer = <STDIN>;
        exit -1 unless defined $answer;
        chomp($answer);
    }
    return $answer;
}

############## main ######################
print "\nStarting the orastress test\n\n";

# check for commandline arguments
if (@ARGV != 3) {
    # get values from the user
    $rows         = ask_number("How many rows the database-table should have");
    $processes    = ask_number("How many clients do you want to simulate");
    $transactions = ask_number("How many transactions per client do you want to simulate");
}
else {
    # parse the values from the command line
    $rows         = $ARGV[0];
    $processes    = $ARGV[1];
    $transactions = $ARGV[2];
    if (($rows !~ /^\d{1,10}$/) or ($processes !~ /^\d{1,10}$/) or ($transactions !~ /^\d{1,10}$/)) {
        print "Invalid input! Please try again.\n";
        print "e.g. stress-oracle.pl 10 10 10\n";
        exit -1;
    }
    else {
        print "got the values for the stress test from the command line\n";
        print "rows = $rows\n";
        print "processes = $processes\n";
        print "transactions = $transactions\n";
    }
}

# cleaning up old db
print "delete old entries from db\n";
deldb();
print "fill db with $rows rows\n";
filldb();

# Part Seven - run the clients in parallel
print "starting $processes processes\n";
my $pm = Parallel::ForkManager->new($processes);
$pm->run_on_start(
    sub {
        my ($pid, $ident) = @_;
        print "started, pid: $pid\n";
    }
);
for ($forks=0; $forks<$processes; $forks++) {
    $pm->start and next;   # parent: go on to fork the next child
    stressdb();            # child: run one simulated client
    $pm->finish;           # child: exit
}
$pm->wait_all_children;
print "Simulation finished\n";