Ravaged by Robots!
By Randal L. Schwartz
In last month's column, I talked about implementing one type of survey form for customer feedback. Other types of forms often use rating systems or multiple-choice values, which are then summarized into an average score or tallied to find the most frequent responses.
Of course, such forms are meant to be used only once per person. But what if some of your responses are coming from Web robots? A clever Perl hacker could write a ballot-stuffing program with just a few lines of code.
I was actually thinking about this problem the other day. As a human, it's trivial for me to see an image, extract the text content, and type it back into a form element. On the other hand, that has to be reasonably difficult for an automated form submission robot! That got me scurrying off to figure out how to validate a form using an image. After a couple of false starts, I came up with the program presented in Listing 1 as a demonstration of this technique's basics.
Lines 1 through 3 are my standard Perl program header, which enables warnings, turns on compiler restrictions, and disables the buffering of STDOUT. Line 5 brings in all of the CGI shortcuts as functions rather than methods.
Lines 7 through 13 give our program a bit of memory using the Cache::Cache module subclass, Cache::FileCache. The Cache::Cache suite is found in the CPAN, and is being actively developed by DeWitt Clinton.
Here, we're setting up a cache that remembers things for ten minutes. Once an hour, the next lucky participant gets to perform the housekeeping by purging old entries. This way, if anyone leaves in the middle of trying to present a survey form, the resulting mess only stays around for up to an hour. The namespace is also defined. It's unique to this particular application, and I've arbitrarily called it antirobot.
Beginning in line 15, we handle the image generation logic. Because that won't make much sense until we see how the inline image is used, I'll skip that for the moment, and jump down to line 44.
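As a rough sketch of the cache setup described for lines 7 through 13, something like the following should work with Cache::FileCache from the CPAN. The expiry values are given in seconds, and the key and value in the usage lines are made-up placeholders rather than anything taken from the listing.

```perl
use strict;
use warnings;
use Cache::FileCache;    # part of the Cache::Cache distribution on the CPAN

# A file-backed cache with its own namespace. Entries expire after ten
# minutes, and the purge of expired entries runs at most once an hour.
my $cache = Cache::FileCache->new({
    namespace           => 'antirobot',
    default_expires_in  => 600,     # seconds
    auto_purge_interval => 3600,    # seconds
});

# Placeholder key and value, for illustration only.
$cache->set('a1b2c3', 'Xk7pQw9e');
my $stored = $cache->get('a1b2c3');    # undef once expired or removed
$cache->remove('a1b2c3');
```

Because the cache lives on disk, a value stored by one CGI invocation is still available to the next one, which is all the "memory" this program needs.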
Lines 44 to 46 print the standard HTTP header, the top of the HTML head, and an in-page first level header to label the page. Lines 48 to 62 handle the response to the form. Again, that won't make much sense until we've seen the form, so I'll set it aside as well.
Lines 64 to 74 set up a $verify string and a $session value, and store them in the persistent cache. The verify string is eight random characters. To make sure it's fairly distinct even in Courier font, I throw out the ten confusing characters in the character class on line 65 (two digits, and four letters in both lower and upper case). The session ID is designed to be unguessable, so I lifted the code from Apache::Session (as I've done in past columns), to generate a non-predictable 32-character hex string.
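Here is a minimal sketch of generating the two values, using only the core Digest::MD5 module. The helper names are hypothetical, and the ten excluded characters are my guess at plausible look-alikes, not necessarily the set used in the listing. The session ID follows the double-MD5 style of Apache::Session's generator.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: an N-character challenge drawn from letters and
# digits, minus ten look-alike characters. This excluded set is a guess.
sub make_verify {
    my ($length) = @_;
    my @pool = grep { !/[01OoIiLlQq]/ } ('0' .. '9', 'A' .. 'Z', 'a' .. 'z');
    return join '', map { $pool[ rand @pool ] } 1 .. $length;
}

# Hypothetical helper: a hard-to-guess session ID in the style of
# Apache::Session's MD5 generator. An MD5 hex digest is 32 characters.
sub make_session {
    return md5_hex( md5_hex( time() . {} . rand() . $$ ) );
}

my $verify  = make_verify(8);
my $session = make_session();
print "$session => $verify\n";
```

Both values would then be stored in the cache, with the session ID as the key and the verify string as the value.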
The strategy is simple. We provide a challenge (the $verify value) known only to the server, but keyed by the unique session ID (the $session value). This challenge is presented only as an image link, and a hidden field communicates the session ID from the form to the form response action. If the response does not match the challenge, we have a mismatch and must start over.
The form must contain at least two things: an image link that contains the session ID, and a hidden field that contains that same session ID. The hidden field is set in line 73, and printed in line 85. The image link is generated in line 83. It refers back to this same script, but with trailing path information that contains the session ID followed by .png. Line 15 detects this on the subsequent invocation, but let's finish off the form first.
Lines 76 to 87 generate the form, including our one survey element: a request for the user's favorite ice cream flavor. We also have to include our hidden session field, the link to the image, and a text field in which the user types the string read from the image, which should match the string in $verify.
Let's see how that image is generated, starting back in line 15. First, we notice that the script is invoked with some path info. For example, if the session ID were a1b2c3 (and assuming the script is called antirobot), we'd get the URL:
http://www.stonehenge.com/cgi/antirobot/a1b2c3.png
This URL was constructed in line 83, and was automatically adjusted for the installed location of the script. Line 15 pulls out the /a1b2c3.png part into $info. Lines 16 to 21 verify that this is a plausible URL for a session image. If not, a "404 not found" response is generated, which makes sense: you've asked for a file within a directory that doesn't exist.
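The plausibility check can be sketched as a single pattern match. The helper name and the exact pattern are assumptions on my part. In the CGI script, the input would presumably come from CGI.pm's path_info() function.

```perl
use strict;
use warnings;

# Hypothetical helper: return the session ID embedded in path info such as
# "/a1b2c3.png", or undef when the path does not look like a session image.
sub session_from_path_info {
    my ($info) = @_;
    return undef unless defined $info;
    return $info =~ m{\A/([0-9a-f]+)\.png\z} ? $1 : undef;
}

for my $info ('/a1b2c3.png', '/a1b2c3.gif', '/../etc/passwd') {
    my $session = session_from_path_info($info);
    printf "%-16s %s\n", $info, defined $session ? $session : '(404 not found)';
}
```

Anchoring the pattern at both ends and allowing only hex digits means that nothing but a well-formed session ID ever reaches the cache lookup.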
Next, lines 23 to 28 extract the secret $verify string for this session, which was computed in the previous invocation and saved to the cache in line 74. Again, if this doesn't exist, it's either a replay attack (a valid session key is being reused to submit another vote) or a forge attack (in which a session ID is being randomly generated to see if it might be a valid credential). Because of the huge number space of a 128-bit MD5 value, a brute force attack is unlikely to succeed, but in any case, we return the "404 not found" code here as well. (The warnings generated in line 25 would definitely be of some concern, however, and should be watched closely.)
If we have a valid session, and therefore, the verification string for that session, we must next make an image of the string. Three popular tools for doing this are GD, Imager, and the steroid-laden Image::Magick modules, all found in the CPAN. As this was a simple task, I chose GD, which I brought in at line 31. I'm using a fairly recent version of GD that writes PNG files. Older versions generate the controversial GIF format, which also works.
Line 33 selects the giant font built in to the GD package. Lines 34 and 35 create an image that's big enough to hold the string and a one-pixel border. Line 36 allocates the background color as black (red, green, and blue values all zero). Line 37, when uncommented, makes this background transparent. I realized that the output would then be sensitive to the background color of the HTML page, so I commented that out at the last minute. You might want to experiment with it.
Lines 38 and 39 write the string. They first create a white ink (red, green, and blue values all 255, their maximum) and use that to place a string, which is offset by one pixel in each direction to maintain the border. Finally, line 40 pushes the image out with the right HTTP header for a PNG, and line 41 terminates this particular CGI invocation.
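The drawing steps above can be sketched with GD roughly as follows. The helper name is hypothetical, and this assumes a GD build with PNG support. It returns the raw PNG data rather than printing it, so the caller would still need to send an image/png HTTP header first.

```perl
use strict;
use warnings;
use GD;

# Hypothetical helper: render $text as PNG data in GD's built-in giant
# font, white on black, with a one-pixel border around the text.
sub challenge_png {
    my ($text) = @_;
    my $font  = GD::Font->Giant;
    my $image = GD::Image->new(2 + $font->width * length($text),
                               2 + $font->height);

    # The first color allocated becomes the image's background.
    my $background = $image->colorAllocate(0, 0, 0);
    my $ink        = $image->colorAllocate(255, 255, 255);

    # Offset by one pixel in each direction to keep the border.
    $image->string($font, 1, 1, $text, $ink);
    return $image->png;
}

my $png = challenge_png('Xk7pQw9e');    # placeholder challenge string
printf "%d bytes of PNG data\n", length $png;
```

Sizing the image from the font metrics keeps it exactly as large as the string requires, whatever the length of the challenge.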
When I was discussing this program with my peers, a few suggested that using an automated tool to perform optical character recognition on the image would be enough to extract the verification string programmatically. If someone is going to that extreme, and if it were important enough to me, I'd start using low-contrast letters, gradients, or background grids. But we've raised the bar to a point at which most people won't bother trying to get around it (although a few might take it on for the challenge).
Once the form is filled out, we pass it to the standard, response-handling structure beginning in line 48. First, if a verify parameter is returned, then it was a form response. If a session parameter is also included, then we fetch that session from the cache. If it exists, we remove it from the cache, thus preventing a replay attack: only one form response can possibly use a given session/validation pair. Note that there's a very small time window between checking for the session and removing that valid session, during which several simultaneous submissions could all be validated. If we took a little more care in programming, we could eliminate that (using a read/modify/write-locked database, for example), but again, I think we've raised the bar enough to deter all but the most serious ballot-stuffers.
In line 54, if we have a match between the challenge (in $validate) and the response (in $verify), then we have a real human who has correctly examined the image, figured out the original letters and digits, and typed those back in. In that case, we record the real human's vote in line 55 (code not shown; you could save it to XML as I showed in last month's column, for example). Otherwise, line 59 punts your user back to the form again (as described earlier). Note that the hidden fields for the form values do persist, so users won't need to reselect the ice cream flavor, but they will receive a new session ID and validation string.
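The one-shot nature of the challenge can be sketched without any CGI machinery. Here a plain hash stands in for the persistent cache, and the helper name and sample values are hypothetical. The point is only that the stored challenge is removed the moment it is looked up, whether or not the response matches.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the persistent cache: session ID => challenge.
my %cache = (a1b2c3 => 'Xk7pQw9e');

# Hypothetical helper: true only if $response matches the stored challenge.
# The entry is deleted as soon as it is read, so each session/challenge
# pair can be used for at most one submission, right or wrong.
sub check_response {
    my ($session, $response) = @_;
    return 0 unless defined $session and defined $response;
    my $challenge = delete $cache{$session};
    return 0 unless defined $challenge;
    return $challenge eq $response ? 1 : 0;
}
```

With this in place, a first call to check_response('a1b2c3', 'Xk7pQw9e') succeeds, and any repeat of the same call fails because the session is gone. A hash delete in a single process has no race, but a shared cache would still have the small time window described above unless the fetch and the removal were locked together.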
And there you have it. A complete implementation of a robot-ballot-stuffing-proof survey form. This will work until someone else publishes instructions on how to programmatically extract a text string from an image, anyway. Until then, enjoy!
Randal ([email protected]) has coauthored the must-have standards Programming Perl, Learning Perl, and Effective Perl Programming.