Test-Driven Development in Perl
The Perl Journal April 2003
By Piers Cawley
Piers is a freelance writer and programmer, and the writer of the Perl 6 Summary. He has been programming Perl since 1993, and is currently working on Pixie, an object persistence tool. He can be reached at [email protected].
Recently, I've been reading Kent Beck's inspirational book Test Driven Development (Addison-Wesley, 2002; ISBN 0321146530), in which he demonstrates with examples (in Java and Python) the process of driving development by writing a test for new functionality, coding until it passes, refactoring ruthlessly, and going back to the start of the loop with a new piece of functionality. I had also been looking for an opportunity to try out this method, and settled on the idea of creating a class "helper" module as an exercise in test-driven development. In this article, I'll build a basic helper module using this technique.
What Is a Class Helper Module?
If you write object-oriented Perl, you often find yourself writing many simple little pairs of methods with names like foo/set_foo, which simply provide you with uniform access to your object attributes. If you're like me, you get bored with this very quickly and start combing CPAN for a tool to do the grunt work for youa helper module. And you'll find plenty of them. I recommend that you take a look at Class::MethodMaker and Class::MakeMethods for starters.
However, none of the CPAN modules I've found do quite what I want. The main issue I have with almost all of them can almost be thought of as philosophical. In the tools I've tried, setting methods generally return either the new or the old value of the attribute being set. However, I don't like writing factory methods that look like:
sub Human::make_with_name { my $class = shift; my($name, $surname) = @_; my $self = $class->new; $self->set_name($name); $self->set_surname($name); }
I'd much rather have my setting methods return the object being altered because that lets me write a method that looks like:
sub Human::make_with_name { my $class = shift; my($name, $surname) = @_; return $class->new->set_name($name) ->set_surname($surname); }
This eliminates a useless variable and a lot of repetition. It's a matter of style, but I happen to think it's an important one.
Choice of Tools
I could have written my tests using Michael Schwern's excellent Test::More module, but I'm keen on xUnit-style testing, so I chose Adrian Howard's Test::Class, which neatly layers an xUnit-style interface on top of Perl's standard Test::Builder and Test::Harness testing framework.
Requirements
It's impossible to build anything unless you know what you want. Here's an initial list of my requirements with a few comments about their importance and whether they're "real" requirements or constraints. Note that we won't meet all of these requirements in this article. It's important, however, to start from as complete a set of requirements as possible, so I include them all here for the sake of completeness. These requirements will form the basis of our tests.
- A simple constructor method: The basic constructor should need no arguments and return a simple object.
- Constructors should call $self->init(@_).
- It should always be safe to call $self->SUPER::init. This just makes life easier.
- It should generate an accessor method that looks like "attrib_name."
- It should generate a setter method that looks like "set_attrib_name." The setter method should return the object being modified in preference to the new attribute value.
- Lazy initialization. It should be possible to use lazy initialization for objects, either by specifying a method, a default value, or a coderef. (Ideally, this would be broken up into smaller requirements.)
- Special attribute types should have useful helper methods; for instance, lists should allow $obj->attrib_append(@array) and other listlike ops. Again, this needs to be broken up into smaller requirementsat least one per attribute type.
Coding, Testing, and Refactoring
I'll be presenting this process in a series of very small steps. The idea is that at each step of the way, we should have a simple, clear goal that we can reach with simple, clear code.
First, we need to set up an empty package and test environment, and get the project safely into version control. The following commands did the job for me:
$ cd ~/build $ h2xs use-new-tests skip-exporter skip-autoloader \ > -AXPn Class::Builder $ cd Class/Builder $ cvs import Class-Builder vendor release $ cd ~/build $ cvs get Class-Builder $ rm -rf Class
Picking "simple constructor method" from the list of requirements means we can write a test. Our first test looks pretty simple:
package BuilderTest; use base 'Test::Class'; use Test::More; sub test_creation : Test(2) { eval q{ package ScratchClass; use Class::Builder new => 'new' }; die $@ if $@; my $scratch = ScratchClass->new; ok ref $scratch, 'Scratch object is a reference'; isa_ok $scratch => 'ScratchClass', 'Scratch object is an instance of ScratchClass'; } BuilderTest->runtests;
And of course, when we run the test suite, "the bar is red." (The original sUnit interface has a progress bar that updates as all the tests in the suite are executed. When any of the tests in a test suite fail, the bar goes red. When all the tests are passing, the bar is green. Even though Test::Class doesn't have a progress bar, the idiom is too useful not to use it here.) So, we write the simplest code we can, just to get a green bar:
package Class::Builder sub import { *ScratchClass::new = sub { bless {}, 'ScratchClass' }; } 1;
This code isn't great, but our only goal when we have a failing test is to write just enough code to get a green bar. So, we run the test suite, our failing test passes, and the bar is green. Now we can refactor.
Refactoring can be thought of as a process of removing redundancy in code. Looking at the code as it stands, there's some immediately obvious redundancy: the duplication of the string ScratchClass in both the test code and the implementation. So we fix that:
sub import { *ScratchClass::new = sub { bless {}, $_[0] }; }
And the bar stays green. Let's see if we can eliminate the other explicit use of ScratchClass in our import routine:
sub import { my $calling_class = caller(0); *{"${calling_class}::new"} = sub { bless {}, $_[0] }; }
Still green. Of course, there's still a bug there, so we'll expose it with another test:
sub test_custom_creation : Test { eval q{ package ScratchClass2; use Class::Builder new => 'custom_new' }; die $@ if $@; my $scratch = ScratchClass2->custom_new; ok isa_ok $scratch => 'ScratchClass2', 'Scratch object is an instance of ScratchClass2'; }
This fails because there's no custom_new method in ScratchClass2. Writing this test has made me think about constructors. In particular, I wonder if I ever call my simple constructors anything other than new? On reflection, I never do. So I'm better off not bothering to deal with custom constructor names. I can always come back and add them later if a need arises.
There's still something to think about, though. I could just have Class::Builder always generate a constructor and always call that constructor new, but that ignores inheritance. You only need to generate a new method for a parent class, or the package's user might need to write a more complex constructor themselves. So, we need some way of specifying whether or not to generate a constructor. Let's rejig the tests:
use Test::Exception; sub test_creation : Test(2) { eval q{ package ScratchClass; use Class::Builder has_constructor => 1 }; die $@ if $@; my $scratch = ScratchClass->new; ok ref $scratch, 'Scratch object is a reference'; isa_ok $scratch => 'ScratchClass', 'Scratch object is an instance of ScratchClass'; } sub test_custom_creation : Test { eval q{ package ScratchClass2; use Class::Builder }; die $@ if $@; dies_ok { ScratchClass2->new } "Constructor not generated by default."; }
Now, instead of passing a new => <method_name> pair of arguments, we pass has_constructor => 1 if we want Class::Builder to generate a constructor. Running the tests, we get a red bar. Rewriting Class::Builder, we now have:
sub import { my $class = shift; my %args = @_; my $calling_class = caller(0); if ($args{has_constructor}) { no strict 'refs'; *{"${calling_class}::new"} = sub { bless {}, $_[0] }; } }
And the bar is green.
Building Accessor and Modifier Methods
So, let's pull another feature off our desirable list. We're now at the point where we're able to create objects, but we don't have any accessor or modifier methods. Let's add modifier methods first. After all, if you can't set an attribute, you can't very well access it, can you? The first step is coming up with a syntax for specifying them. Poaching from Class::MethodMaker, I reckon that:
use Class::Builder get_set => 'attrib_name' ;
would be a good start. Writing the test, we have:
sub test_setter : Test { eval q{ package ScratchClass3; use Class::Builder has_constructor => 1, get_set => 'attr' }; die $@ if $@; my $scratch = ScratchClass3->new; $scratch->set_attr(10); is $scratch->{attr}, 10, "attr successfully set"; }
This fails. Note that we're treating the class as a glass box and using our knowledge of how it is implemented to write our test. This is fine for the time being because it's the only way we can get the test written. We'll refactor the test once we have a better way of accessing the attribute. First, we concentrate on getting the test to pass:
sub import { my $class = shift; my %args = @_; my $calling_class = caller(0); no strict 'refs'; if ($args{has_constructor}) { *{"${calling_class}::new"} = sub { bless {}, $_[0] }; } if ($args{get_set}) { my $method = $args{get_set}; *{"${calling_class}::set_${method}"} = sub { $_[0]->{$method} = $_[1] }; } }
The bar is green, but our implementation is as ugly as sin. It has some nasty repetition going on. I do not like *{"${calling_class}::new"} = sub {...} and *{"${calling_class}::set_${method}"} = sub {...}, for instance.
So, we put on our refactoring hat and slightly rewrite them:
if ($args{has_constructor}) { my $method = 'new'; *{"${calling_class}::${method}"} = sub { bless {}, $_[0] }; } if ($args{get_set}) { my $attr = $args{get_set}; my $method = "set_$attr"; *{"${calling_class}::${method}"} = sub { $_[0]->{$attr} = $_[1] }; }
The bar is still green, and now we have some absolutely clear duplication. So, lets pull the duplicated behavior out to a new function and rejig import:
sub add_class_method { my($target_class, $method_name, $methodref) = @_; no strict 'refs'; *{"${target_class}::${method_name}"} = $methodref; } sub import { my $class = shift; my %args = @_; my $calling_class = caller(0); if ($args{has_constructor}) { my $method = "new"; add_class_method($calling_class, $method, sub { bless {}, $_[0] }); } if ($args{get_set}) { my $attr = $args{get_set}; my $method = "set_$attr"; add_class_method($calling_class, $method, sub { $_[0]->{$attr} = $_[1] }); } }
The bar is now green. Now that we've added the function call, we can ditch the $method variables we added to make the duplication obvious. (I'm taking baby steps here to make the process explicit. Otherwise, I probably wouldn't have introduced them in the first place. However, in more complex situations, using variables as a way of explaining to yourself what the different bits of a method are doing can be a very handy precursor to extracting sections of code into new methods.) Even after we've removed the useless variable, we still have some duplication. Let's pull the consequent actions from each of those conditionals out into their own methods:
sub import { my $class = shift; my %args = @_; my $calling_class = caller(0); if ($args{has_constructor}) { $class->has_constructor($calling_class, $args{has_constructor}); } if ($args{get_set}) { $class->get_set($calling_class, $args{get_set}); } } sub has_constructor { my($class, $calling_class, $flag) = @_; if ($flag) { add_class_method($calling_class, 'new', sub { bless {}, $_[0] }); } } sub get_set { my($class, $calling_class, $attr) = @_; add_class_method($calling_class, "set_$attr", sub { $_[0]->{$attr} = $_[1] }); }
Doing this makes it clear that there's more duplication in the import method to deal with. Look at the way the key used in the conditional clause is repeated as the name of the called method. Let's refactor again:
sub import { my $class = shift; my %args = @_; my $calling_class = caller(0); foreach my $generation_method ( keys %args ) { $class->$generation_method( $calling_class,$args{$generation_method} ); } }
The only tricky part of that last refactoring was choosing the name for the loop variable. $generation_method is admittedly a little long winded, but it does express quite neatly what we're expecting to see.
The bar is still green, and there's no obvious duplication left in our code, so let's choose something else from our list of requirements.
One of the things we want to be able to do with our generated classes is to chain setting methods, so let's write a test for that:
sub test_chaining : Test { eval q{ package ScratchClass4; use Class::Builder has_constructor => 1, get_set => 'attr1', get_set => 'attr2' }; die $@ if $@; my $scratch = ScratchClass4->new; $scratch ->set_attr1(10) ->set_attr2(20); ok eq_hash $scratch, { attr1 => 10, attr2 => 20 }; }
As expected, we have a red bar, but it doesn't fail where we expected it to fail. It seems that we're not generating a set_attr1 method. A quick look at the import method shows the problem: We're using the wrong data structure. Hash keys are unique; therefore, instead of having two calls to get_set, we only have a single call. So, let's recode the import method, replacing the loop through a set of hash keys with a loop through @_:
sub import { my $class = shift; die "Arguments must be in pairs!" if @_ % 2; my $calling_class = caller(0); while (@_) { my($generation_method, $attrib) = splice @_, 0, 2; $class->$generation_method( $calling_class, $attrib ); } }
This still fails, but it fails as expected. Getting it to pass is simplejust alter the get_set generation method:
sub get_set { my($class, $calling_class, $attr) = @_; add_class_method($calling_class, "set_$attr", sub { $_[0]->{$attr} = $_[1]; $_[0] }); }
and the bar is green.
So, let's pick some more functionality and write another failing test.
I mentioned before that our tests are treating objects as glass boxes and using our knowledge about their internal representation. We can get rid of that assumption by rewriting our tests to depend on accessor methods (and, as a bonus, we get to test useful new functionality at the same time):
sub test_setter : Test { eval q{ package ScratchClass3; use Class::Builder has_constructor => 1, get_set => 'attr' }; die $@ if $@; my $scratch = ScratchClass3->new; $scratch->set_attr(10); is $scratch->attr, 10, "attr successfully set"; } sub test_chaining : Test(2) { eval q{ package ScratchClass4; use Class::Builder has_constructor => 1, get_set => 'attr1', get_set => 'attr2' }; die $@ if $@; my $scratch = ScratchClass4->new; $scratch->set_attr1(10) ->set_attr2(20); is $scratch->attr1, 10; is $scratch->attr2, 20; }
As expected, the bar goes red. So, let's have a look at the get_set generator method and change it to make an accessor method:
sub get_set { my($class, $calling_class, $attr) = @_; add_class_method( $calling_class, "set_$attr", sub { $_[0]->{$attr} = $_[1]; $_[0] }); add_class_method($calling_class, $attr, sub { $_[0]->{$attr} }); }
The bar is green, and this is as far as we'll take it in this article. We now have a helper module that will generate the kind of simple accessor methods that are the bread and butter of much OO development, and we have a simple set of tests that we can use to keep us sane as development proceeds. Note that the tests we've written aren't what could be described as comprehensive, but they're enough to have driven us to the point where we have a working module.
Class::Builder is not available on CPAN. At least, not yet. In this article, I've taken you to the point where it's beginning to be useful, but still a long way from release quality. I've also not provided the final source code, as this whole process was meant only as an exercise, and the module as it stands is just a starting point. I hope this article will help you get started using test-driven development to write your own modules.
TPJ