Sharing Cookies
The Perl Journal January 2003
By brian d foy
brian is the founder of the first Perl Users Group, NY.pm, and Perl Mongers, the Perl advocacy organization. He has been teaching Perl through Stonehenge Consulting for the past five years, and can be contacted at [email protected].
I use several different web browsers to do my work. Some were created by other people, like Mozilla, Internet Explorer, and OmniWeb; and some I wrote in Perl using the LWP modules. I want all of these browsers to use the same cookies. Perl gets me most of the way to that goal.
The LWP::Simple module, which Gisle Aas designed for simple web transactions, does not use cookies. I can fetch the web page for this magazine with a couple of lines of code, and even though it tries to set a session cookie, LWP::Simple will not read it. If I try to access the web site again in the same program, the web server will try to set another cookie because my program did not send the first one the server tried to set.
    use LWP::Simple;

    my $data = get( 'http://www.tpj.com' );
I have to do more work to use cookies. Behind the scenes, LWP::Simple uses LWP::UserAgent to do its work, and LWP::UserAgent can automatically send and receive cookies if I tell it to use a cookie file. I set the autosave parameter to a true value when I create the HTTP::Cookies object, so it will save its persistent cookies to a file when it goes out of scope. I tell the user agent which cookie jar to use with the cookie_jar() method.
    use HTTP::Cookies;
    use LWP::UserAgent;

    my $cookie_jar = HTTP::Cookies->new( qw( autosave 1 file cookies.txt ) );

    my $ua = LWP::UserAgent->new;
    $ua->cookie_jar( $cookie_jar );

    # a lot more programming goes here
If I use LWP::UserAgent directly, I have to program the mechanics of the web transaction myself, but often, I just want to add cookie support. The LWP::Simple module exposes its user agent object if I import its $ua variable. Once I have the $ua object, I can change the user agent to do what I need as I did in the previous example, and I still get the benefit of LWP::Simple.
    use HTTP::Cookies;
    use LWP::Simple qw(get $ua);

    my $cookie_jar = HTTP::Cookies->new( qw( autosave 1 file cookies.txt ) );
    $ua->cookie_jar( $cookie_jar );

    my $data = get( 'http://www.example.com' );
After I run this program, I have a file named "cookies.txt" in the current working directory. The cookies are in the HTTP::Cookies format, as shown in Example 1.
The problem with HTTP::Cookies is that it uses its own format for cookies. This has advantages, as I show later, but at the moment, I do not want to have several cookie files. I want one cookie file that all my user agents, including my usual web browsers, can use. All of my Perl programs can use the same cookies file as long as they all use HTTP::Cookies, but what about Netscape's browser, Mozilla, Internet Explorer, and others?
Gisle has already thought about some of this, and provides an HTTP::Cookies::Netscape subclass with HTTP::Cookies. I use the same module, HTTP::Cookies, because HTTP::Cookies::Netscape is in the same file. The only change is my cookie jar constructor.
    use HTTP::Cookies;
    use LWP::Simple qw(get $ua);

    my $cookie_jar = HTTP::Cookies::Netscape->new( qw( autosave 1 file cookies.txt ) );
    $ua->cookie_jar( $cookie_jar );

    my $data = get( 'http://www.example.com' );
After I run this program, my cookies file is in a cookie format that Netscape's browsers use, as shown in Example 2.
The HTTP::Cookies::Netscape module can also read files in the Netscape cookie format, so I can start with a cookie file that exists. If I am using Netscape Navigator from a Linux shell account, for instance, my cookie file is in the .netscape directory.
    $cookie_jar = HTTP::Cookies::Netscape->new(
        file     => "$ENV{HOME}/.netscape/cookies",
        autosave => 1,
    );
My Perl program can read any cookies in that file and store any cookies it gets. So now, my Perl programs can use the same cookies as my Netscape browser. If I visit a web site that sets a cookie in either program, the other will be able to send the same cookie back, assuming that I do not use them simultaneously (since they do not write to the cookie file until they stop).
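A Perl program does not have to wait until it stops to write the file, because the cookie jar's save() method can be called at any point. The following is a minimal sketch of that, not a fix for simultaneous use: the temporary directory, cookie name, and cookie value are made up for illustration, and the cookie is added by hand with set_cookie() so that the example needs no network access.

```perl
use strict;
use warnings;
use HTTP::Cookies;
use File::Temp qw(tempdir);

# Work in a throwaway directory so no real cookie file is touched.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/cookies.txt";

my $cookie_jar = HTTP::Cookies::Netscape->new( file => $file );

# Add a made-up cookie by hand instead of fetching a page. Arguments are:
# version, key, value, path, domain, port, path_spec, secure, maxage, discard
$cookie_jar->set_cookie( 0, 'session', 'abc123', '/', 'www.example.com',
    undef, 1, 0, 3600, 0 );

# Write the file now rather than waiting for the jar to be destroyed.
$cookie_jar->save;

# A second jar, standing in for another program, can load the cookie already.
my $other_jar = HTTP::Cookies::Netscape->new( file => $file );
print $other_jar->as_string;
```

Saving early only narrows the window in which two programs hold different cookies. Whichever program writes the file last still wins.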
What if I want to use another browser, though? I no longer use any of Netscape's browsers, instead favoring the open-source alternative, Mozilla. Since Mozilla is not a product of Netscape, its cookie file is slightly different. The developers removed the word "Netscape" and added a comment about the Mozilla Cookie Manager. The rest of the format, though, is the same. See Example 3.
The HTTP::Cookies::Netscape module cannot read the Mozilla cookies file because its load() method, which reads the cookies from the file and puts them into the internal data structure, looks for "Netscape" in the first line. I have submitted a patch to RT.cpan.org (ticket 1816, http://rt.cpan.org/NoAuth/Bug.html?id=1816) to make this more flexible, but I also slightly modified the HTTP::Cookies::Netscape class to create HTTP::Cookies::Mozilla, which you can download from CPAN. I simply overrode the load() and save() methods to read and write the Mozilla cookie file header instead of the Netscape browser header. The module inherits the rest of its functionality from HTTP::Cookies. Now my Perl programs and Mozilla can share cookie files.
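The general shape of such a subclass can be sketched as follows. This is not the code of HTTP::Cookies::Mozilla. The class name Local::Cookies::LenientHeader is hypothetical, the sample header line is an assumption based on the description above, and the sample cookie is made up. The sketch overrides only load(), skips every comment line rather than checking the first one, and reads the file name from the object's internal hash, which is an implementation detail of HTTP::Cookies.

```perl
use strict;
use warnings;

package Local::Cookies::LenientHeader;    # hypothetical name, for illustration
use HTTP::Cookies;
our @ISA = ('HTTP::Cookies::Netscape');   # save() and the rest are inherited

# Read a tab-separated cookie file without insisting on a particular header.
sub load {
    my ( $self, $file ) = @_;
    $file ||= $self->{file} or return;    # relies on HTTP::Cookies internals
    open my $fh, '<', $file or return;

    while ( my $line = <$fh> ) {
        chomp $line;
        next if $line =~ /^\s*(?:#|$)/;   # skip comment and blank lines
        my ( $domain, $flag, $path, $secure, $expires, $key, $value )
            = split /\t/, $line;
        next unless defined $value;       # skip lines with too few fields
        $self->set_cookie( undef, $key, $value, $path, $domain, undef,
            0, $secure eq 'TRUE', $expires - time, 0 );
    }
    close $fh;
    return 1;
}

package main;
use File::Temp qw(tempfile);

# Build a small sample file with an assumed header and a made-up cookie.
my ( $fh, $file ) = tempfile( UNLINK => 1 );
print {$fh} "# HTTP Cookie File\n\n";
print {$fh} join( "\t", '.example.com', 'TRUE', '/', 'FALSE',
    time + 3600, 'flavor', 'oatmeal' ), "\n";
close $fh;

# The constructor inherited from HTTP::Cookies calls the overridden load().
my $cookie_jar = Local::Cookies::LenientHeader->new( file => $file );
print $cookie_jar->as_string;
```

Because the constructor and save() are inherited, the subclass needs no other code to act as a cookie jar for LWP::UserAgent.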
Once I knew how to make another cookie class, I wanted to do it for all of the browsers that I have on my machine. Since I use Mac OS X, I also have OmniWeb, which stores its cookies in an XML format, as shown in Example 4.
To read and write OmniWeb cookies, I did the same thing I did with Mozilla: I overrode the load() and save() methods to do the right thing. In this case, I had to completely reimplement these methods. The XML format is simple enough that I did not need any XML::Parser magic to create HTTP::Cookies::Omniweb, also available from CPAN. Now my Perl programs can use the same cookies as my OmniWeb browser.
After I wrote a couple of cookie modules, my Perl programs could share cookies with other applications. I would also like those other applications to share cookies with each other, though. Can Mozilla and OmniWeb share cookies?
This problem entails a certain amount of concurrency. If I run both browsers at the same time, each may pick up more cookies from the web sites it visits, and when I quit either browser, it will likely overwrite any changes made to its cookie file in the meantime. Instead of dealing with that problem, which is a simple matter of programming (even if the programming is not simple), I worry about the easy part: converting from one format to another.
I have the ideal setup: an internal format and a way to convert other formats to and from it. If I can get the data into the internal data structure, I can output it in any format that has a cookie class. So far I have four output formats: HTTP::Cookies, Netscape, Mozilla, and OmniWeb. With N formats, I have about N^2 different possible conversions. Since I have the common internal format, however, I only need to program N of them, one per format, each going to and from the internal data structure. Gisle started off with two and I added another two already.
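The idea of converting through a common internal structure can be sketched without any cookie modules at all. In the example below, the two formats ("lines" and "header") are hypothetical toy formats rather than real cookie files, and the internal structure is simply an array reference of name and value pairs.

```perl
use strict;
use warnings;

# Two hypothetical toy formats. The common internal structure is an
# array reference of [ name, value ] pairs.
my %load_from = (
    lines  => sub { return [ map { [ split /\t/, $_, 2 ] } split /\n/,   $_[0] ] },
    header => sub { return [ map { [ split /=/,  $_, 2 ] } split /;\s*/, $_[0] ] },
);
my %save_as = (
    lines  => sub { return join "\n", map { join "\t", @$_ } @{ $_[0] } },
    header => sub { return join '; ', map { join '=',  @$_ } @{ $_[0] } },
);

# Every conversion goes through the internal structure, so a new format
# needs one loader and one saver, not one converter per pair of formats.
sub convert {
    my ( $from, $to, $text ) = @_;
    return $save_as{$to}->( $load_from{$from}->($text) );
}

print convert( 'header', 'lines', 'flavor=oatmeal; size=large' ), "\n";
```

Adding a third toy format here would mean adding one entry to each hash, and convert() would then handle all six directions among the three formats.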
To convert between OmniWeb and Mozilla cookies, I need to read the OmniWeb cookies file just like I did earlier. When I create the object, HTTP::Cookies stores the cookies in its internal data structure.
    use HTTP::Cookies::Omniweb;

    my $omniweb = HTTP::Cookies::Omniweb->new( file => 'Cookies.xml' );
Once I have the cookies in the internal data structure (the common internal format), I only need to rewrite the file in the new format. Since I can muck with most of the object stuff in Perl (do not try this in other languages), I can change an object's identity. In this case, I want to make the $omniweb object use the save() method in the HTTP::Cookies::Mozilla class, which will write the file in the Mozilla format.
In Perl, a method call is just a subroutine invocation where the first argument is the object. I can call the HTTP::Cookies::Mozilla::save() subroutine directly and pass it the $omniweb object. Internally, objects of either class are the same. The only difference is the filename I started with, but save() takes an optional second argument to select the output filename.
    use HTTP::Cookies::Mozilla;

    HTTP::Cookies::Mozilla::save( $omniweb, 'cookies.txt' );
I can also rebless the $omniweb object into a new class. I should still call the save() method with the optional filename argument, or it will try to use the filename with which I created the object. I also leave off the autosave option so that HTTP::Cookies does not try to overwrite the original file when $omniweb goes out of scope.
    use HTTP::Cookies::Omniweb;
    use HTTP::Cookies::Mozilla;

    my $omniweb = HTTP::Cookies::Omniweb->new( file => 'Cookies.xml' );

    bless $omniweb, 'HTTP::Cookies::Mozilla';

    $omniweb->save( 'cookies.txt' );
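Both techniques, calling another class's method as a plain subroutine and reblessing the object, can be tried out without any cookie files. In this minimal sketch, Toy::Upper and Toy::Reversed are hypothetical classes standing in for the two cookie classes, and render() stands in for save().

```perl
use strict;
use warnings;

package Toy::Upper;       # hypothetical stand-in for one cookie class
sub new    { my ( $class, %args ) = @_; return bless {%args}, $class }
sub render { my ($self) = @_; return uc $self->{text} }

package Toy::Reversed;    # hypothetical stand-in for the other class
our @ISA = ('Toy::Upper');
sub render { my ($self) = @_; return scalar reverse $self->{text} }

package main;

my $obj = Toy::Upper->new( text => 'cookie' );

# A normal method call uses the class the object is blessed into.
my $as_method = $obj->render;
print "method call:     $as_method\n";

# Calling the other class's subroutine directly, with the object as the
# first argument, runs that class's code on the same data.
my $as_function = Toy::Reversed::render($obj);
print "subroutine call: $as_function\n";

# Reblessing changes which class handles all later method calls.
bless $obj, 'Toy::Reversed';
my $after_rebless = $obj->render;
print "after rebless:   $after_rebless\n";
```

The direct subroutine call leaves the object's class alone, while bless changes it for every later method call, including any that run when the object is destroyed.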
Now I have the basics for working with cookies between applications. I can let Perl programs share cookies amongst themselves and with other applications, and I can convert formats from one to the other. I am also thinking about a way to do the same tasks with less typing that involves automatic format detection, but that is a topic for another article.