Craig is a consultant at n-Link Corp. in Seattle, Washington. He can be contacted at [email protected].
Sometimes I am called paranoid, but I like to think of it as being careful. I have some web pages that I would like to be able to access from anywhere, but keep anyone else from seeing.
Having access to a file from anywhere I wish can be handled easily by hosting the file on my web site, but protecting the data is a little more difficult. Of course, the web page will be password protected, and will be on a secure (SSL) web server so that it isn't publicly accessible and no one can view the data as it traverses the Internet from the web site to the browser. But what about when the data is sitting on the server in a file? If I store it as an HTML file for the web server to access, then the permissions will be open enough for anyone on the server to view the file. I may be able to restrict the permissions so that only the administrator of the site can get to it, but I want to feel secure in the knowledge that only I can view the data, even if the server is compromised.
Just as the connection between the server and my browser is encrypted, then encrypting the file on the server will protect it from prying eyes. The file would be secure even if someone somehow obtained access to the file.
So the question, then, is how to encrypt and decrypt the file. This is where Perl and CPAN come in. I can write a script that will read the file in, encrypt the data, and save the file on the server. The script would also be able to decrypt the file, given the correct passphrase, and send the file to the browser.
Looking through CPAN led me to the Crypt modules, one of which was Crypt::Rijndael. I remembered reading that this algorithm was chosen by the U.S. Government as the new Advanced Encryption Standard (AES), so it should be good at protecting my data. After reading the details on usage, however, I noticed that the input data had to be exact multiples of the blocksize or the encrypt function would croak. I also had to supply a key, not just a password or passphrase. There was even an option for specifying the encryption mode: Electronic Code Book (ECB), Cipher Block Chaining (CBC), Cipher Feed Back (CFB), and several others (see http://www.rsasecurity.com/rsalabs/faq/2-1-4.html for a definition of the modes). While I did remember some of the differences between these modes, and that I should use CBC as was noted in the docs, I began to suspect that maybe there was a better module to use.
I then took a look at Crypt::CBC, which provides an interface to let me pick a cipher (encryption algorithm), uses an arbitrary length passphrase (key), and ensures that the data is of the correct block size (padding it if needed). This was certainly better than interfacing with Crypt::Rijndael directly, but there was one more option that I wasn't quite sure about.
The Intialization Vector (IV) could either be set or randomly generated. The IV is important with block ciphers because it is part of the input in encrypting the first block of data. I didn't realize the impact of this until I hard-coded the IV value and noticed that encrypting the same data with the same passphrase yielded the same encrypted data. Of course, this makes sense, given that the same three inputs (passphrase, data to encrypt, and IV) were used for the encryption. I did notice that the Crypt::CBC has an option to randomly generate the IV and then place this value at the start of the encrypted data. While this is an effective solution, I was a little concerned about putting this data into the encrypted file.
Some more searching on CPAN turned up the Crypt::OpenPGP module, which is a pure Perl implementation of the OpenPGP standard (RFC2440). I did not want to use the public-key aspects of the module, just the symmetric encryption, since I was both encrypting and decrypting the data file. Crypt::OpenPGP turned out to be the module that I needed: It was designed to encrypt individual files, and the Perl module has a very simple interface for encryption and decryption, requiring nothing more than a passphrase. It also handles the plaintext file in memory so that the data is not written to disk, thus reducing further exposure of the data.
Installing Crypt::OpenPGP
Crypt::OpenPGP is not the easiest module to install due to all of the dependencies. I had some trouble when I found out that one of the dependencies, Math::Pari, required an external library. I got around these problems by using the ports system on FreeBSD but you should be able to follow the instructions of the Math::Pari module to get everything configured properly.
Using the Script
When first accessing the script, you are presented with two forms: one to encrypt an HTML page and store it on the server and the second to decrypt a page stored on the server. The encrypt form has input fields for the filename, passphrase, and text data (see Figure 1). The filename is used as the name of the file when storing the encrypted data on disk. This name is also used on decryption to identify the file to decrypt.
The passphrase is used to encrypt the data. While there is no minimum length on the passphrase, it is best to use one greater than six characters using letters, numbers, and punctuation. The text data is the entire file that should be encrypted. The data needs to be a complete HTML file since only this data will be returned to the browser when it is decrypted.
The encrypted file can be decrypted either by using the form on the page as shown in Figure 1 or by accessing the script with the name of the file to decrypt as the extra path info. An example of this is https://server/cgi-bin/securepage.pl/filename/. When decrypting this way, a form is returned so that the passphrase can be entered.
Anatomy of the Script
There is a configuration variable for the directory that should be used when storing the encrypted files and another variable for the encryption algorithm (or cipher) to use when encrypting the file. Crypt::OpenPGP does not need the cipher specified when decrypting the file because the cipher used encrypting the file is stored in the file.
There is a check to ensure that the script is being accessed over a secure link. Because the data is being encrypted on the server, it makes sense that we ensure that the link is also secure. The HTTPS environment can be checked as shown in Listing 1. The $HOST and $SCRIPT variables are set so that a full URL can be constructed for redirecting to the secure URL. It is assumed that the secure URL is constructed by only changing the protocol from http to https.
The input parameters to the script need to be retrieved from the CGI.pm object (Listing 2). The $passphrase and $data variables can be taken as is, but the $filename variable needs to be checked to ensure that there are no slashes (/) in the value. We don't want the filename to be used to backtrack up the directory chain and potentially gain access to other files on the system.
Next, the script needs to determine what the state of the request is. Did the script get called for the first time, did the user click the Encrypt button or the Decrypt button? In Listing 3, you see the tests for state control of the script. It starts with getting the values of the two submit buttons, "decrypt" and "encrypt." Remember that submit buttons set a parameter value in the request.
The first if condition tests whether the decrypt button was clicked and also that the filename variable has been set. If both are true, then the subroutine PerformDecrypt_OpenPGP is called with the passphrase and filename. The second if tests the $pathinfo variable, which was set from the PATH_INFO environment variable. $pathinfo will not be empty if the script is called with extra path info (path in the URL after the script name). With $pathinfo having a value, the f (or filename) parameter of the CGI request is set to $pathinfo minus the first "/" character. ReturnDecryptForm is called to display a form to get the passphrase to decrypt the file. The single parameter to this subroutine is described later.
The third if condition tests the $encrypt variable to see whether the encrypt button was clicked. If true, the PerformEncrypt _OpenPGP subroutine is called with the passphrase, filename, and text data. The final else condition is executed when the script is first requested and calls the ReturnInputForms subroutine. Following the if...elsif construct, exit() is called because the script has completed its execution.
The ReturnInputForms subroutine displays the encryption form to encrypt new files and decryption form to display encrypted files. I used the CGI.pm object to write out all of the tags, but mostly because I can never remember the attributes for using a multipart form. I used the start_multipart_form() method so that I could easily add the function of uploading a file instead of typing the file in the textarea field. However, I decided against enabling this feature, since I didn't want the data I was trying to keep secure written to disk without encryption during the upload process. Also, the form method is POST and not GET to ensure that the passphrase and other data is not sent in the URL of the request. This subroutine finally calls ReturnDecryptForm to display the decryption form.
ReturnDecryptForm prints the html form. Since it can be called to print a standalone HTML page or as part of a larger page, it has a parameter to control whether the opening and closing <html> and other tags should be printed. A True (1) parameter value prints the <html> tags.
The PerformEncrypt_OpenPGP sub creates a new Crypt::OpenPGP object and encrypts the data with the passphrase. The encrypted data or ciphertext is then written to the filename. The parameters to the encrypt sub are the data to be encrypted, the passphrase and the encryption algorithm:
## encrypt the data my $ciphertext = $pgp->encrypt( Data => $data, Passphrase => $passphrase, Compress => 'Zlib', #Armour => 1, Cipher => $enc_algorithm );
I have also specified that the data be compressed before encryption. This is not required, but I wanted to save disk space on the server. The Armour parameter can be used to have the output "ASCII-armored," which creates a text file from the binary encrypted file. The default is to not create it ASCII-armored, and since armoring the file will only increase the file size just to make it viewable, I'll skip it to save on disk.
The resulting ciphertext is written to the file as specified by the filename and the directory set in the configuration. Finally, a short page is returned indicating that the file has been encrypted and a URL printed that can be used to access the file. This URL will display the decryption form, which prompts for the passphrase of the file.
The PerformDecrypt_OpenPGP subdirectory decrypts the file using the given passphrase and returns it to the browser. First, the file name is tested to see that it exists; if not, an error is returned stating that this file does not exist. Next, the Crypt::OpenPGP object is created and the decrypt subdirectory called. Since OpenPGP can read the ciphertext directly from the file, only the filename and passphrase are passed to the decrypt sub.
## decrypt the ciphertext my $plaintext = $pgp->decrypt( ## specify filename and have it read the file from disk Filename => $enc_dir.$filename , Passphrase => $passphrase, );
The plaintext is then printed to the browser because it is assumed that the decrypted file is an HTML web page.
Next Steps
Currently, this solution assumes that the encrypted page is a single, self-contained web page. It can't have any references to another encrypted page such as an image file, javascript file, or css file. It should be possible to have the first decrypted page decrypt the others, but I haven't thought this whole problem through. Another enhancement that would be nice is to encrypt a database file and possibly the script to execute on it. This should be an easy enough enhancement because instead of decrypting the page and sending that to the browser, it could decrypt a script and database file and then execute the script. But I'll leave both of these as an exercise to the reader or maybe another article.
Reference
CBC (http://www.bletchleypark.net/crypt/modes.html).
TPJ
## get the Name of the host for the server and the script name ## will be used when constructing the full url for this script my ($HOST) = $ENV{SERVER_NAME} =~ /(.*)/s; # untaint my ($SCRIPT) = $ENV{SCRIPT_NAME} =~ /(.*)/s; # untaint ## check to see that this script is on a secure link if ( $ENV{HTTPS} ne 'on' ) { print "<strong>ERROR: you should access this over a secure link.</strong><p>"; print " Maybe you want <a href=\"https://".$HOST.$SCRIPT."\">https://".$HOST.$SCRIPT."</a>"; exit; }Back to article
Listing 2
my ($passphrase) = $q->param('pwd') =~ /(.*)/s; ## make sure there are no extra slashes my ($filename) = $q->param('f') =~ /([^\/]*)/s; my ($data) = $q->param('data') =~ /(.*)/s;Back to article
Listing 3
## determine what button was clicked, and what action should be performed ## get submit button values for state control my ($decrypt) = $q->param('decrypt') =~ /(.*)/s; my ($encrypt) = $q->param('encrypt') =~ /(.*)/s; my ($pathinfo) = $ENV{'PATH_INFO'} =~ /(.*)/s; ## the decrypt button was clicked and a filename has been specified if ($decrypt ne '' && $filename ) { &PerformDecrypt_OpenPGP($passphrase, $filename ); } ## the PATH_INFO is set with the name of the file to decrypt elsif ($pathinfo ne '') { ## populate the file (f) field since that is what is expected ## CGI.pm will populate the field in the form $q->param('f',substr($pathinfo,1)) ; ## return form to request the password from the user ## param = 1, to have a full HTML page returned &ReturnDecryptForm( 1 ); } ## encrypt button was clicked elsif ( $encrypt ne '' ) { &PerformEncrypt_OpenPGP($passphrase, $filename, $data); } ## page requested for the first time else { &ReturnInputForms(); } exit;Back to article