Julius is a freelance network consultant in the Philippines. He can be contacted at [email protected].
There are two general methods of scrambling information: public-key encryption and symmetric encryption. Symmetric encryption is divided further into two major classes: stream cipher and block cipher. Of these two, the block cipher is the more popular, due mainly to the immense attention given to the Data Encryption Standard (DES), the most popular encryption algorithm in the world. Designed by IBM, with the aid of the NSA in the early 1970s, DES has been the focus of numerous studies by many mathematicians since its adoption as a Federal Standard in 1977. The lessons learned from these analyses spawned countless other block ciphers that bear many of DES's features.
Some Block Cipher Theory
Using a block cipher, data to be encrypted (also called plaintext) is first divided in chunks (or blocks) of equal sizes. The size corresponds to the block size of the cipher. For DES, data is encrypted in chunks of 64 bits, while the Advanced Encryption Standard (AES) candidates divide data in blocks of 128 bits. Next, the first plaintext block is encrypted, using a secret key, to produce the first encrypted block (also known as the "ciphertext block"). Next, the second plaintext block is processed to create the second ciphertext block, and so on. When all plaintext blocks have been encrypted, the ciphertext blocks are strung together in the order that they were created to form the final encrypted data, called "ciphertext." This mode of operation is known as Electronic Code Book (ECB). See Figures 1 and 2 for visual descriptions of ECB mode.
As you can see from the diagrams, encryption/decryption in ECB mode need not be done sequentially. In fact, since the blocks are processed independently of each other, they are best encrypted/decrypted in parallel.
There are problems in the block cipher, however. First, plaintexts don't always divide neatly in exact multiples of the block size. The last plaintext block may end up shorter than the block size. This is a problem of all block ciphers. The second problem concerns block ciphers in ECB mode: Identical plaintext blocks always produce identical ciphertext blocks. This is a problem because a cryptanalyst can create a code book of plaintexts and corresponding ciphertexts. If he learns, for example, that the plaintext block 0x4e6a89fb encrypts to ciphertext block 0x75f0c3d1, he can immediately decrypt that ciphertext block whenever it appears in another message.
But here's the clincher: If a certain block cipher has a block size of, say, 64 bits, the code book will have 264 entriesmuch too large to precompute and store, even with today's technology. And that's just for one key only. Now, if that block cipher uses a 128-bit key, a total of 264×2128 code book entries will have to be made. Given this immense storage requirement, it is practically certain that no storage device will ever be built to hold this huge code book. So, it turns out, this problem is purely theoretical. Yet, if you're the paranoid type, you'd be happy to know that there is a Perl module that avoids this ECB weakness.
For the first problem, the solution is to pad the last plaintext block with some regular patternzeros, ones, alternating ones, and zerosto make it a complete block. Figure 3 illustrates a typical padding scheme, the so-called PKCS #5. In the figure, assuming that the underlying block cipher has a block size of 64 bits (8 bytes), the partial block (consisting of five blue squares) is extended by padding it with three ASCII 0x03 characters.
For the second problem, the most common solution is to XOR (symbol: ...) the current plaintext block, bit by bit, with the previous ciphertext block before it is encrypted. This way, identical plaintext blocks will not encrypt to identical ciphertext blocks. Of course, plaintexts that start the same will still encrypt to the same ciphertexts up to the first difference. To prevent this, encrypt a random block of data (called an "initialization vector" or IV), and treat it as the first ciphertext block. An IV doesn't have to be meaningful; it's only used to make each message unique. This mode of operation is known as Cipher Block Chaining (CBC). Figures 4 and 5 are visual descriptions of CBC mode.
CBC mode is not without its own drawbacks, however. Some of these problems are:
- The IV must be unique for every message if you wish to encrypt with the same key, and the IV must not be reused. These factors increase the complexity of implementation.
- Encryption cannot be done in parallel because the previous ciphertext block must be computed first before the current plaintext block can be processed (although decryption is parallelizablesee Figure 5).
Perl Implementation of CBC
Lincoln Stein's Crypt::CBC module is a pure Perl implementation of CBC (http://www.cpan.org/authors/id/L/LD/LDS/Crypt-CBC-1.00.readme). Its only requirement is that the underlying block cipher must be able to report the block size and key size it's using when Crypt::CBC asks for them. Lincoln, however, made some exceptions. He accommodated Blowfish, a strong, time-tested block cipher that uses variable-length key sizes, even though it's unable to report its own key size. When Blowfish is used, Crypt::CBC defaults to Blowfish's maximum key size of 448 bits (56 bytes). On the other hand, if no block cipher is specified, Crypt::CBC defaults to DES.
I'll illustrate how Crypt::CBC works by using it in working Perl scripts. The scripts presented here use the block ciphers Khazad and Serpent, both available from CPAN as Crypt-Khazad and Crypt-Serpent, respectively. If you wish to run these scripts, please download and install these block ciphers first.
The first Perl script, khazad, shown in Listing 1, shows how to encrypt simple messages. Notice that our preferred block cipher module, Khazad, is not used anywhere in the script. That's because Crypt::CBC automatically loads it for us.
As with any object-oriented way of doing things, the first order of business is to instantiate an object. This is done by calling the new() function of Crypt::CBC (Line 10). Additionally, six hash keys need to be set before Crypt::CBC goes to work: key, cipher, iv, regenerate_key, padding, and prepend_iv. These are set in lines 10-15.
key. This is the password or passphrase used to encrypt or decrypt a message or file. If you're using a block cipher that takes, for example, a 128-bit (16-byte) key (or whatever length), you must ensure that you supply a key exactly that long. A key that's too short or too long would produce an error similar to this:
Uncaught exception from user code: Key setup error: Key must be 16 bytes long! at /usr/local/lib/perl5/site_perl/5.8.0/Crypt/CBC.pm line 95. Crypt::CBC::new('Crypt::CBC','HASH(0x8116654)') called at ./cbc-mode line 11
To take care of this problem, set the hash key regenerate_key to a nonzero value. See line 13.
regenerate_key. When regenerate_key is set to a nonzero value, Crypt::CBC hashes the supplied key using MD5, a strong one-way hash function that produces a fixed, 128-bit message digest. Crypt::CBC then uses this message digest (a 16-byte binary string) as the actual key for the encryption.
If the underlying block cipher requires a longer key, Crypt::CBC hashes the 16-byte key, thereby producing another 16-byte string, and appends it to the old 16-byte key. The resulting 32-byte key is again hashed, producing another 16-byte string, and appends it to the previous 32-byte key. This hashing process is repeated as often as needed until the final string is at least as long as the cipher's key size. The final string is then truncated to the desired length if it exceeds the required key size. For block ciphers with key sizes less than 128 bits (DES, for example, uses only 64 bits), the hashes of their keys are just truncated to the required lengths.
On Line 7, it is perfectly all right to use a key of arbitrary length because regenerate_key is set to 1.
On the other hand, if regenerate_key is set to 0, Crypt::CBC uses the user-supplied key as is. Just make sure that your key conforms to the key size of your block cipher.
cipher. This is the name of the block cipher to be used. There's no need to use it in your script because Crypt::CBC automatically loads it for you.
iv. The IV should be as long as the block size. Line 8 uses a static IV, which is really not a good idea.
padding. When padding an incomplete plaintext block, Crypt::CBC employs a variety of padding options. These are:
- Null padding
- Usage: 'padding' => 'null'
- Pad the last plaintext block with as many ASCII 0x00 characters necessary to fill the block. If the last block is a full block, an extra block of all ASCII 0x00 characters is still appended. See Figure 6 for a diagram of null padding. This padding option is recommended only for encrypting text data.
- Space padding
- Usage: 'padding' => 'space'
- Same as null padding, but uses ASCII 0x20 instead. An extra block containing all ASCII 0x20 characters is still appended if the last block happens to be complete already. See Figure 7 for a diagram of space padding. This padding option is recommended only for encrypting text data.
- One and zeroes padding
- Usage: 'padding' => 'oneandzeroes'
- Pad the last plaintext block with one ASCII 0x80 character followed by zero or more ASCII 0x00 characters necessary to fill the block. If the last block is already full, append an extra block. This extra block starts with ASCII 0x80 followed by as many ASCII 0x00 characters necessary to complete a block. See Figure 8 for a diagram of one and zeroes padding. The name oneandzeroes comes from the fact that 0x80 is 10000000 in binary representation. The last block is appended with a "1" bit followed by as many "0" bits as necessary to complete a block. This padding option is recommended for all types of data.
- Standard padding
- Usage: 'padding' => 'standard'
- This padding scheme is also known as PKCS #5. An example would clarify this padding option. Suppose that the last block is 5 bytes short of becoming a full block. With this padding scheme, append five ASCII 0x05 characters to that block. See Figure 3 for another visual description of this padding scheme. If, on the other hand, the block happens to be complete already, we still append an extra block. If the block size is, say, 8 bytes, this extra block contains eight ASCII 0x08 characters. See Figure 9 for a diagram of standard padding. This padding option is recommended for all types of data. If no padding method is specified, standard is assumed.
prepend_iv. When prepend_iv is set to a nonzero value, the resulting ciphertext will start with a 16-byte prefix. The first eight characters consist of the string "RandomIV" followed by another eight-character string. During decryption, if Crypt::CBC sees the regular expression /RandomIV.{8}/, the eight characters following "RandomIV" will be used as the IV. Obviously, this is unacceptable for block ciphers with block lengths longer than eight bytes. New block ciphers, like Rijndael, Serpent, RC6, Twofish, and Anubis, use block sizes of 16 bytes. To disable this feature, set prepend_iv to 0.
A Sample File Encryption Script
The next Perl script illustrates the use of Serpent, a 128-bit block cipher that uses a 128-bit key. The whole Serpent script, appropriately called serpent, is shown in Listing 2.
Suppose that you want to encrypt the file sample.txt. To encrypt, type
./serpent encrypt sample.txt > sample.enc
You will be asked to enter a password twice. When this is done, a new file, sample.enc, is created. To decrypt sample.enc, type
./serpent decrypt sample.enc > sample.dec
Just like in the preceding example, you will be asked to enter a password twice. After this, the decrypted file, sample.dec, is created. The original file, sample.txt, should be identical to sample.dec.
In this script, the IV is generated by reading /dev/urandom (line 8, and lines 47-49), and the IV is saved for future use in the file iv.rand (line 9, and lines 50-52). get_input() (lines 16-37) simply asks for a key twice while echo is switched off.
Taking a string argument, start() (lines 62 and 77) is a Crypt::CBC method that initializes the encryption or decryption process. For encryption, the argument must be "E" or any word that begins with "e." For decryption, the argument must be "D" or any word that begins with "d."
During decryption, the file iv.rand is read to retrieve the IV used in previous encryption (lines 66-68).
Lines 80-82 tell Crypt::CBC to call the method crypt() for as long as there is data to be processed. The crypt() method performs the encryption and should be called after start().
Finally, on line 85, the method finish() is called to process the last plaintext block.
Creating a Crypt::CBC-Compliant Module
Suppose you'd like to make your own block cipher module that is Crypt::CBC-compliant. Your block cipher must be able to return the block size it is using, as well as the length of its key when Crypt::CBC asks for them. Suppose that your block cipher is named MyBlockCipher, and uses a block size of 64 bits (8 bytes) and a key size of 128 bits (16 bytes), and further suppose that you'd like to use XS programming. Edit your .xs file and add the following lines:
int keysize(...) CODE: RETVAL = 16; OUTPUT: RETVAL int blocksize(...) CODE: RETVAL = 8; OUTPUT: RETVAL
Your .xs file should now contain the following:
1 #include "EXTERN.h" 2 #include "perl.h" 3 #include "XSUB.h" 4 #include "ppport.h" 5 6 /* some code here */ 7 8 MODULE = Crypt::MyBlockCipher PACKAGE = Crypt::MyBlockCipher 9 10 int 11 keysize(...) 12 CODE: 13 RETVAL = 16; 14 OUTPUT: 15 RETVAL 16 17 int 18 blocksize(...) 19 CODE: 20 RETVAL = 8; 21 OUTPUT: 22 RETVAL 23 24 /* some more code here */
The keyword int must be on a line by itself. RETVAL stands for "return value," and this is the data that Crypt::CBC needs. Read the perlxstut man pages for more details.
In the .xs file above, the keysize() function returns a value of 16 bytes (the 128-bit key), while blocksize() returns 8 bytes (the 64-bit block length).
References
Applied Cryptography: Protocols, Algorithms, and Source Code in C, Second Edition. Bruce Schneier, John Wiley & Sons, 1997, ISBN 0-471-12845-7
Crypt::CBC Module, Lincoln D. Stein, http://www.cpan.org/ authors/id/L/LD/LDS/Crypt-CBC-1.00.readme.
TPJ
1 #!/usr/local/bin/perl 2 use diagnostics; 3 use strict; 4 use warnings; 5 use Crypt::CBC; 6 7 my $key = "hello, there!"; 8 my $IV = pack "H16", "0102030405060708"; 9 10 my $cipher = Crypt::CBC->new({'key' => $key, 11 'cipher' => 'Khazad', 12 'iv' => $IV, 13 'regenerate_key' => 1, 14 'padding' => 'standard', 15 'prepend_iv' => 0 16 }); 17 18 my $plaintext1 = pack "H32", "0123456789abcdeffedcba9876543210"; 19 print "plaintext1 : ", unpack("H*", $plaintext1), "\n"; 20 21 my $ciphertext1 = $cipher->encrypt($plaintext1); 22 print "ciphertext1 : ", unpack("H*", $ciphertext1), "\n"; 23 24 my $plaintext2 = $cipher->decrypt($ciphertext1); 25 print "plaintext2 : ", unpack("H*", $plaintext2), "\n";Back to article
Listing 2
1 #!/usr/local/bin/perl 2 use diagnostics; 3 use strict; 4 use warnings; 5 use Getopt::Long; 6 use Crypt::CBC; 7 8 my $SRC_RAND = "/dev/urandom"; 9 my $IV_FILE = "iv.rand"; 10 my $BLOCKSIZE = 16; 11 my $BLOCK_CIPHER = "Serpent"; 12 my $IV; 13 my ($encrypt, $decrypt) = (); 14 GetOptions("encrypt" => \$encrypt, "decrypt" => \$decrypt); 15 16 sub get_input 17 { 18 my ($message) = @_; 19 local $| = 1; 20 local *TTY; 21 open TTY,"/dev/tty"; 22 my ($tkey1, $tkey2); 23 system "stty -echo </dev/tty"; 24 do { 25 print STDERR "Enter $message: "; 26 chomp($tkey1 = <TTY>); 27 print STDERR "\nRe-type $message: "; 28 chomp($tkey2 = <TTY>); 29 print STDERR "\n"; 30 print STDERR "\nThe two $message", "s don't match. ", 31 "Please try again.\n\n" unless $tkey1 eq $tkey2; 32 } until $tkey1 eq $tkey2; 33 34 system "stty echo </dev/tty"; 35 close TTY; 36 return $tkey1; 37 } 38 39 my $key = &get_input("password"); 40 41 chomp $ARGV[0]; 42 open INFILE, $ARGV[0]; 43 44 my $cipher; 45 46 if ($encrypt) { 47 open RANDSRC, $SRC_RAND; 48 read(RANDSRC, $IV, $BLOCKSIZE); 49 close RANDSRC; 50 open SRC_IV, "> $IV_FILE"; 51 print SRC_IV $IV; 52 close SRC_IV; 53 54 $cipher = Crypt::CBC->new({'key' => $key, 55 'cipher' => $BLOCK_CIPHER, 56 'iv' => $IV, 57 'regenerate_key' => 1, 58 'padding' => 'standard', 59 'prepend_iv' => 0 60 }); 61 62 $cipher->start('encrypt'); 63 } 64 65 if ($decrypt) { 66 open RANDSRC, $IV_FILE; 67 read(RANDSRC, $IV, $BLOCKSIZE); 68 close RANDSRC; 69 70 $cipher = Crypt::CBC->new({'key' => $key, 71 'cipher' => $BLOCK_CIPHER, 72 'iv' => $IV, 73 'regenerate_key' => 1, 74 'padding' => 'standard' 75 }); 76 77 $cipher->start('decrypt'); 78 } 79 80 while (read(INFILE, my $buffer, 1048576)) { 81 print $cipher->crypt($buffer); 82 } 83 84 close INFILE; 85 print $cipher->finish;Back to article