SysAdminMag.com
Using DNSBLs to Monitor Network Security
Luis E. Muñoz
Many email administrators are turning to DNSBLs -- DNS Block Lists -- as useful weapons in the arsenal against spam. There are DNSBLs covering many aspects of the security spectrum related to spam. A brief sample of what the most common lists focus on includes:
- Open HTTP proxies
- Open SMTP proxies
- Zombies or trojaned machines
- Miscellaneous open proxies
- Hosts that send spam to spamtrap addresses
These lists continue to grow despite the efforts of the community to educate the general public and, more importantly, the administrators responsible for the operation or security of the network. No matter how many security measures we implement in our networks, the reality is that many computers, both on the public network and in our datacenters, are compromised each day.
This article will introduce another useful application for the DNSBLs. I'll show how to use this valuable information source to diagnose and monitor the overall security level of a given network. I'll do so by generating a sort of "reputation" or index, based on the information collected from the lists themselves.
The code I will use for this, although simply an example, is available from the Sys Admin Web site:
http://www.sysadminmag.com
The Lists
One of the first things to do is research the existing DNSBLs. To save you some time, I will be using the following lists in this article. Be aware that you must thoroughly understand how each list works and what a listing there really represents:
- l2.spews.org (SPEWS level 2 list): IP ranges listed here usually have long-standing problems that have not been properly addressed by their operators. Alternatively, the IP ranges were associated with spamming operations. Level 1 and Level 2 are "severity levels" associated with a listing. Quoting from the SPEWS FAQ, "... A common practice is to bounce based on the SPEWS Level 1 list, and tag based on the SPEWS Level 2 list."
  A listing in SPEWS is probably something you need to address immediately, as those listings have been known to grow progressively to sizeable chunks of IP space. In this article I'll use the level 2 list, which tends to include more IP space. In practice, this can work as an early warning of an upcoming level 1 listing.
- psbl.surriel.com (Passive Spam Block List): This list includes hosts that have sent spam to certain spamtraps with the rationale that anyone can quickly and easily delist them.
  In my experience, many spammers tend to hit psbl's spamtraps, so this may be a good early warning of problems. However, your mileage will vary.
- list.dsbl.org (Distributed Sender Blackhole List): This list specializes in detecting SMTP relay hosts. An open SMTP relay will likely be listed in dsbl.org pretty quickly when one of the testers receives spam coming from it and subsequently tests the relay.
  This particular version of the list contains hosts that have been included by trusted users of dsbl.org. My experience shows that this is the most common variant of the list used to block email. For statistical purposes, you may want to include unconfirmed.dsbl.org and multihop.dsbl.org, which can also yield useful information about what is leaking out of your network.
  Note that it is actually hard for an ISP not to have its customer-serving mail servers listed in multihop, so this list is less common in mail filtering.
- spam.dnsbl.sorbs.net (SORBS' Spam Database): This is the "Database of hosts sending to SORBS' spamtraps". The SORBS FAQs provide a thorough description of what is listable in this DNSBL. A quick summary is that any host sending spam, in the vicinity of a spammer, or providing spam support can be listed.
- smtp.dnsbl.sorbs.net (Open SMTP Relay Servers): A list of SMTP servers that can be used to relay spam, commonly known as open relays.
- http.dnsbl.sorbs.net (Open HTTP Proxies): A list of HTTP proxy servers that can be used by anyone without (or with trivial) authentication.
- socks.dnsbl.sorbs.net (Open SOCKS Proxies): A list of SOCKS proxy servers that can be used by anyone without (or with trivial) authentication.
- web.dnsbl.sorbs.net (Web Servers with Spammer-Abusable Vulnerabilities): It is a well-known fact that notorious bugs in common Web applications can be exploited to send bulk email, making it seem to come from the Web server hosting those faulty applications. This list is designed to catch those hosts and prevent spam sent in this way from spreading.
- misc.dnsbl.sorbs.net (Non-HTTP and Non-SOCKS Open Proxies): Any other type of open proxy will be listed here. This deals mostly with worm infections that cause a host to become a "zombie". Zombies are commonly used to send huge amounts of spam for the benefit of those who control them. Those zombies are also used to orchestrate DDoS attacks.
- sbl.spamhaus.org (Spamhaus Block List): Straight from the Spamhaus site, "The SBL is a realtime database of IP addresses of verified spam sources ... maintained by the Spamhaus Project team and supplied as a free service to help email administrators better manage incoming email streams."
  A listing in the SBL, especially if it is associated with a spammer in the Register of Known Spamming Operations (ROKSO), can be a very serious matter warranting prompt attention.
- xbl.spamhaus.org (Exploits Block List): Again, straight from the Spamhaus site, this list is "a realtime database of IP addresses of illegal third party exploits, including open proxies..., worms/viruses with built-in spam engines, and other types of trojan-horse exploits."
  The data in the XBL is composed from two public lists: the CBL and the NJABL Open Proxy list. Listings in the XBL, just as listings in other similar DNSBLs, are reliable indicators of host security compromise.
The following crontab entries mirror the list data locally:

    16 9 * * * cd $DNSBL_WORK_DIR && \
        rsync -q psbl.surriel.com::psbl/psbl.txt ./psbl.surriel.com && \
        cp ./psbl.surriel.com ../lists
    21 7,20 * * * cd $DNSBL_WORK_DIR && \
        curl -s -o ./l2.spews.org \
        http://www.spews.org/spews_list_level2.txt && \
        cut -f1 '-d ' l2.spews.org | egrep '^[0-9]' | \
        sort --temporary-directory=/var/tmp | uniq > ../lists/l2.spews.org
    3 10 * * * cd $DNSBL_WORK_DIR && \
        for i in http misc smtp socks spam web; do \
        rsync -q rsync://rsync.us.sorbs.net/rbldnszones/$i.dnsbl.sorbs.net .; \
        cut -f1 '-d ' $i.dnsbl.sorbs.net | egrep '^[0-9]' | \
        sort --temporary-directory=/var/tmp | uniq > ../lists/$i.dnsbl.sorbs.net; done

The entries are wrapped here for readability. Most cron implementations do not understand backslash continuations, so each entry must be written on a single line in the actual crontab.
Notice the seemingly odd times I use for the entries. This is done to help randomize the hits on the sync servers operated by the lists. Also, please make sure you obtain permission from the list operators before mirroring their data. Each list usually publishes information about how to request and obtain copies of its data, which you should review before trying to fetch its source.
As you can see, these crontab entries leave the data for each list in a mnemonically named file in the $DNSBL_WORK_DIR/lists/ directory, which I will be using throughout this article as the repository for list information.
One of the most important metrics that can be extracted from this list data is the proportion of our IP space that has been listed on each DNSBL. This is in fact what this article is about. Tracking this variable over time can give you useful insights into what is really going on in the network and what its most visible symptoms are. More importantly, it can show you whether what you're doing to secure your network is working and to what degree. Additionally, this kind of analysis can provide a robust metric for security incidents, which can help justify the business case for special tools or resources for your area. I've been able to use information like this a few times.
We will calculate this variable once a day and store this sample in a database. Later, we will produce a simple graph that shows the progress of this variable over time. Let's begin with the tools we'll need:
- sqlite3 (Database)
- rrdtool (Graph)
- Perl
- NetAddr::IP (Perl module to handle IP addresses -- use the latest version)
- Class::DBI (an OO interface to databases supported by Perl's DBI)
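The metric described above, the proportion of our IP space listed on a given DNSBL, reduces to simple arithmetic. The following is only a minimal sketch: the helper name listed_fraction and the numbers are made up for illustration and are not part of the article's scripts. It assumes the blocks that make up our IP space do not overlap.

```perl
use strict;
use warnings;

# Hypothetical helper: fraction of our address space listed on one DNSBL.
# $listed is the number of listed addresses; $prefix_lens is an array ref
# with the CIDR prefix lengths of our (non-overlapping) blocks.
sub listed_fraction {
    my ($listed, $prefix_lens) = @_;
    my $total = 0;
    $total += 2 ** (32 - $_) for @$prefix_lens;   # addresses in each block
    return $total ? $listed / $total : 0;
}

# Illustrative numbers only: a single /17 with 150 listed addresses.
my $fraction = listed_fraction(150, [17]);
printf "%.4f%% of the space is listed\n", 100 * $fraction;
```

Storing the raw count, as the rest of the article does, keeps the samples meaningful even if the size of the IP space changes later; the fraction can always be derived at reporting time.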
     3  use strict;
     4  use warnings;
     6  use IO::File;
     7  use File::Find;
    10  use NetAddr::IP 4.00;
    13  use Getopt::Std;
    14  use vars qw/$opt_i $opt_s $opt_h/;
    15  my $opts = 'hi:s:';
    16  getopts($opts);

Lines 3-16 load the modules we will be using and parse the command-line options.
    18  die <<HELP
    19  Usage: count-ips [-h] [-i isp-ip-space] [-s source]
        ...
    34  HELP
    35    if $opt_h;

Lines 18-35 use the HEREDOC syntax for specifying multi-line strings in Perl in order to produce some help text when count-ips is invoked with the -h option.
    41  my @input = ();         # The IP space itself
    43  my $fh = IO::File->new($opt_i, "r")
    44      or die "Unable to open input file $opt_i: $!\n";

Lines 41-44 prepare @input, the variable where the IP space we want to match against the DNSBLs will be stored. This IP space will be read from a file that will be available through the $fh filehandle.
    46  while (my $l = $fh->getline)
    47  {
    48      $l =~ s!(?:\#.*$|\s+)!!g;
    49      chomp $l;
    50      next unless $l =~ /\S/;
    51      my $ip = new NetAddr::IP $l;
    52      unless ($ip)
    53      {
    54          warn "$opt_i:$.: Invalid IP address $l\n";
    55          next;
    56      }
    58      push @input, $ip;
    59  }

The loop at lines 46-59 reads one line at a time from the given file, stripping whitespace and Perl-style comments. This allows for some familiar formatting and documentation to be included in the file. Blank lines are skipped. The rest -- hopefully a CIDR block or lone IP address -- is passed to NetAddr::IP, which can understand most formats for IPv4 network specification (CIDR, range notation, etc.).
If NetAddr::IP fails to recognize an IP address or subnet, its ->new() method returns undef, which then causes a warn() to output a message referencing the bogus entry in the input file. The $. variable holds the line number of the last filehandle read, which allows the script to generate a more useful message.
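To see the comment and whitespace stripping in isolation, here is a minimal sketch that uses only core Perl. The parse_spec helper is a hypothetical, much stricter stand-in for NetAddr::IP->new(): it accepts only dotted quads with an optional /prefix, whereas the real module understands many more notations.

```perl
use strict;
use warnings;

# Hypothetical stand-in for NetAddr::IP->new(): returns [address, prefix]
# for a dotted quad with an optional /prefix, or undef for anything else.
sub parse_spec {
    my ($line) = @_;
    $line =~ s/(?:\#.*$|\s+)//g;          # drop comments and all whitespace
    return unless length $line;           # blank or comment-only line
    my ($addr, $len) = $line =~ m{^(\d+\.\d+\.\d+\.\d+)(?:/(\d+))?$}
        or return;
    $len = 32 unless defined $len;        # a lone address is a /32
    return if $len > 32 or grep { $_ > 255 } split /\./, $addr;
    return [$addr, $len];
}

my @input;
my $n = 0;
for my $line ("# my space\n", " 200.11.128.0/17  # main block\n", "bogus\n") {
    $n++;
    my $spec = parse_spec($line);
    if ($spec) {
        push @input, $spec;
    }
    elsif ($line =~ /^\s*[^#\s]/) {       # had content, but not a valid spec
        warn "line $n: invalid IP spec\n";
    }
}
print "$_->[0]/$_->[1]\n" for @input;
```

Only the second sample line survives; the comment line is skipped silently and the bogus one triggers a warning, mirroring the behavior of lines 46-59.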
    61  @input = NetAddr::IP::Compact @input;
    62  my @ire = map { qr/$_/ } map { $_->re } @input;

After this is done, NetAddr::IP::Compact() is used at line 61 to merge contiguous ranges of IP space. That is, two contiguous /25s will become a single /24. IP blocks will also be nicely sorted as a side effect.
Line 62 uses some map magic to convert each subnet to a Perl regular expression that will match any IP address within it. This is a handy feature for this application, and here is why. In many cases, the DNSBLs contain lots of /32s. Parsing each of those addresses out of its textual representation just to call NetAddr::IP's ->contains() method usually carries too much overhead. Instead, the /32s can simply be matched against the regular expression, which is normally much faster.
Note that two maps are used at line 62. This is because in the future, ->re() might return an array of regular expressions. Whatever the case, this leaves in @ire a list of regular expressions that will match any IP address within our network.
Next comes the slightly complex matching function, responsible for reading in each DNSBL source file and checking it against the @input and @ire representations of our IP space. Let's analyze it piece by piece:
    66  sub slurp_n_check
    67  {
    68      my $file = $File::Find::name;
    69      if ($file =~ m/[\[\]\(\)\{\}\$]/)
    70      {
    71          warn "Possibly dangerous filename: $file - Skipping\n";
    72          return;
    73      }

For paranoia's sake, lines 68-73 verify that the file name supplied by File::Find is not dangerous. (We'll use this module later to scan the directory where the DNSBL sources are.) Basically, the code looks for potential variable interpolations or metacharacters in the file name and refuses to work with such files.
    75      my $fh = IO::File->new($file, "r");
    77      unless ($fh)
    78      {
    79          warn "Failed to open $file: $!\n";
    80          return;
    81      }

If the filename is considered safe enough, it is opened for reading with IO::File. If the open fails, a suitable warning is issued and this DNSBL is skipped. This happens at lines 75-81.
The loop at lines 83-140 reads and processes each line of the DNSBL source file. This seemingly simple task takes a comparatively large amount of code, because a few tradeoffs are made for speed:
    87      $l =~ s!(?:\s+|/32$)!!g;

Line 87 removes whitespace and the (for us) useless /32 mask from some entries.
     89      if ($l =~ m!/(\d{1,2})$!)
     90      {
     91          my $m_len = $1;         # Cheap mask len
     92          my $ip = new NetAddr::IP $l;
     93          unless ($ip)
     94          {
     95              warn "$file:$.: Invalid IP spec <$l>\n";
     96              next IP;
     97          }
     98
     99          for my $n (@input)
    100          {
    101              if ($n->masklen < $m_len)
    102              {
    103                  if ($n->contains($ip))
    104                  {
    105                      print "$file $ip ",
    106                          $ip->broadcast->numeric - $ip->network->numeric + 1,
    107                          "\n";
    108                      next IP;
    109                  }
    110              }
    111              else
    112              {
    113                  if ($ip->contains($n))
    114                  {
    115                      print "$file $n ",
    116                          $n->broadcast->numeric - $n->network->numeric + 1,
    117                          "\n";
    118                      next IP;
    119                  }
    120              }
    121          }
    122
    123      }

Lines 89-123 take care of the case where the address has a netmask. In this case, we parse it using NetAddr::IP and use its ->contains() method to find out how to report the listing: if the listed block is smaller than one of our networks and falls inside it, the listed block is reported; if the listed block is as large as or larger than our network and covers it, our whole network is reported. In both cases the third output field is the number of addresses in the reported block. Lines 93-97 take care of the potentially corrupt entries that cannot be parsed by NetAddr::IP.
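The containment logic of lines 99-121 can be sketched without NetAddr::IP by using plain integer masks. This is an illustration under assumptions, not the module's implementation: ip2int, netmask, and overlap are hypothetical helper names, and every block is assumed to be given as a dotted-quad network address plus a prefix length.

```perl
use strict;
use warnings;
use Socket qw(inet_aton);

# Dotted quad to 32-bit integer, and prefix length to netmask.
sub ip2int  { return unpack 'N', inet_aton($_[0]) }
sub netmask {
    my ($len) = @_;
    return $len ? (0xFFFFFFFF << (32 - $len)) & 0xFFFFFFFF : 0;
}

# Given one of our blocks and one listed block, return the more specific
# block and its size when one contains the other, or an empty list.
sub overlap {
    my ($our_net, $our_len, $bl_net, $bl_len) = @_;
    my ($outer, $inner) = $our_len < $bl_len
        ? ([$our_net, $our_len], [$bl_net, $bl_len])
        : ([$bl_net, $bl_len], [$our_net, $our_len]);
    my $mask = netmask($outer->[1]);
    return unless (ip2int($inner->[0]) & $mask) == (ip2int($outer->[0]) & $mask);
    return ("$inner->[0]/$inner->[1]", 2 ** (32 - $inner->[1]));
}

my ($block, $count) = overlap('200.11.128.0', 17, '200.11.130.0', 24);
print "$block $count\n";    # prints "200.11.130.0/24 256"
```

The size 2 ** (32 - prefix length) is the same value the listing obtains as broadcast minus network plus one.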
    126      else
    127      {
    128          for my $re (@ire)
    129          {
    130              if ($l =~ m/$re/)
    131              {
    132                  print "$file $l 1\n";
    133                  next IP;
    134              }
    135          }
    136      }

Lines 126-136 take care of the simpler case of a lone /32, now without a mask. This can simply be matched against the regular expressions in @ire, reporting in the same way as for the previous case.
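To make the regular expression approach concrete, here is a simplified sketch. The octet_re helper is hypothetical and much less capable than NetAddr::IP's ->re(): it only handles prefixes that fall on an octet boundary (/8, /16, /24), which keeps the generated pattern easy to read.

```perl
use strict;
use warnings;

# Simplified, hypothetical stand-in for ->re(): build a regular expression
# matching every address in a block whose prefix is /8, /16 or /24.
sub octet_re {
    my ($net, $len) = @_;
    die "unsupported prefix /$len\n" unless $len =~ /^(?:8|16|24)$/;
    my @fixed = (split /\./, $net)[0 .. $len / 8 - 1];   # network octets
    my $octet = '(?:25[0-5]|2[0-4]\d|1?\d?\d)';          # any value 0-255
    my $re = join '\.', (map { quotemeta $_ } @fixed),
                        ($octet) x (4 - @fixed);
    return qr/^$re$/;
}

my @ire = map { octet_re(@$_) } (['200.11.130.0', 24], ['10.0.0.0', 8]);

# Lone addresses, as they would appear in a DNSBL source file.
for my $l ('200.11.130.10', '200.11.131.10', '10.1.2.3') {
    print "$l 1\n" if grep { $l =~ $_ } @ire;
}
```

Matching the text directly avoids building an address object for every line of a list that may contain millions of entries, which is the tradeoff the article describes.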
    139      close $fh;

Line 139 explicitly closes the $fh pointing to the DNSBL data file.
    143  File::Find::find
    144  (
    145      {
    146          no_chdir => 'yes',
    147          wanted   => \&slurp_n_check,
    148      },
    149      $opt_s
    150  );

Finally, lines 143-150 summon File::Find to scan the directory provided with the -s option to our script and process each file in it with the function discussed before.
Now I will create the file isp-networks, with content like this:
    # This is a list of all my IP space
    200.11.128.0/17
    ...

A simple run of count-ips would produce output as follows:
    $ mkdir results
    $ count-ips -i isp-networks -s lists > results/hits-`date '+%Y%m%d'`
    $ head results/hits-20060910
    lists/http.dnsbl.sorbs.net 200.11.177.138 1
    lists/http.dnsbl.sorbs.net 200.11.178.204 1
    lists/http.dnsbl.sorbs.net 200.11.181.28 1
    lists/http.dnsbl.sorbs.net 200.11.182.113 1
    lists/http.dnsbl.sorbs.net 200.11.182.114 1
    lists/http.dnsbl.sorbs.net 200.11.182.115 1
    lists/http.dnsbl.sorbs.net 200.11.182.187 1
    lists/http.dnsbl.sorbs.net 200.11.182.58 1
    lists/http.dnsbl.sorbs.net 200.11.182.59 1
    lists/http.dnsbl.sorbs.net 200.11.182.62 1

In fact, I could now put this line in my crontab, as follows:
    7 1 * * * cd $DNSBL_WORK_DIR && \
        ./count-ips -i ./isp-networks -s ./lists > \
        ./results/hits-`date '+\%Y\%m\%d'`; \
        find ./results/ -type f -mtime +10 | xargs rm

This would produce a hits-<date> file with each day's result. Inside a crontab, an unescaped % is turned into a newline, hence the backslash before each % in the date format.
It probably isn't useful to generate more than one datapoint per day. Note that many DNSBLs place a limit on the frequency for downloading their sources, so this may also limit the number of samples per day you can generate. In practice, I've been working with a single sample per day. This provides good results, as the trends are usually a few weeks long.
Note the find | xargs rm added at the end of the command. This should allow you to keep the last 10 days of results, in case you need them. You can adjust this value based on the number of listings you have (and disk space).
These hits files are very useful. As an example, here is a script -- warn-count -- that can alert you whenever a critical part of your network appears in one of your lists. The beginning of the script, up to line 44, is very similar to count-ips, so I won't discuss it. The first relevant code is found below:
    46  my @C = ();             # The critical IP space
    48  my $fh = IO::File->new($opt_c, "r")
    49      or die "Unable to open critical file $opt_c: $!\n";
    50
    51  while (my $l = $fh->getline)
    52  {
    53      $l =~ s!\#.*$!!g;
    54      chomp $l;
    55      next unless $l =~ /\S/;
    56      my ($n, $d) = split(m/\s+/, $l, 2);
    57      my $ip = new NetAddr::IP $n;
    58      unless ($ip)
    59      {
    60          warn "$opt_c:$.: Invalid IP address $n\n";
    61          next;
    62      }
    63
    64      push @C, { ip => $ip, desc => $d };
    65  }

Lines 46-65 are responsible for reading in the "critical" file. This is a file in a format similar to the input file of count-ips, which specifies network ranges that are critical. For instance, your mail servers should be there. The IP subnet specification must be the first whitespace-separated column, and a legend must follow it. This legend should be mnemonic, so that the warnings make more sense. The critical space is stored in a list of hashrefs at @C.
    70  @C = sort { $b->{ip}->masklen <=> $a->{ip}->masklen } @C;

Line 70 sorts @C so that the network specifications are arranged from most specific to least specific. In some cases, you may want to have an entry for a whole datacenter and then more specific entries for groups of servers.
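The effect of that ordering can be sketched with core Perl alone. In this illustration the hash keys net and len, and the most_specific helper, are assumptions made for the example; the article's script stores NetAddr::IP objects instead. The sample ranges are taken from the "critical" file shown later in the article.

```perl
use strict;
use warnings;
use Socket qw(inet_aton);

sub ip2int { return unpack 'N', inet_aton($_[0]) }

# A few critical ranges; "len" is the prefix length (assumed to be 1-32).
my @C = (
    { net => '200.11.128.0', len => 20, desc => 'IP space for internal use' },
    { net => '200.11.130.0', len => 24, desc => 'Misc public servers' },
    { net => '200.11.130.0', len => 25, desc => 'Public mail servers' },
);

# Most specific (longest prefix) first, as on line 70.
@C = sort { $b->{len} <=> $a->{len} } @C;

# Return the description of the first, and therefore most specific, match.
sub most_specific {
    my ($ip) = @_;
    for my $c (@C) {
        my $mask = (0xFFFFFFFF << (32 - $c->{len})) & 0xFFFFFFFF;
        return $c->{desc}
            if (ip2int($ip) & $mask) == (ip2int($c->{net}) & $mask);
    }
    return;
}

print most_specific('200.11.130.10'), "\n";   # prints "Public mail servers"
```

Because the longest prefixes come first, a loop that stops at the first match always reports the narrowest description available for an address.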
     75  while (<>)
     76  {
     77      print $_ if $opt_f;
     78      chomp;
     79      my ($bl, $n, $num) = split(/\s+/, $_, 3);
     80      my $ip = new NetAddr::IP $n;
     81      unless ($ip)
     82      {
     83          warn "Unrecognized IP spec $n (line $.) - Ignoring\n";
     84          next;
     85      }
     86
     87      # Iterate through the critical networks
     88      for my $c (@C)
     89      {
     90          if ($c->{ip}->contains($ip))
     91          {
     92              warn "Listing $bl ($ip) [$num hosts] matches $c->{desc}\n";
     93              last;
     94          }
     95          elsif ($ip->contains($c->{ip}))
     96          {
     97              warn "Listing $bl ($ip) [$num hosts] contains $c->{desc}\n";
    100          }
    101      }
    102  }

Lines 75-102 iterate through the hits being read and match them against the critical networks. The sorting done at line 70 allows the last at line 93 to break the loop early, avoiding unnecessary work.
Now, I'll create a "critical" file with the following contents:
    # Our datacenter space
    200.11.128.0/20 IP space for internal use
    200.11.128.0/24 Firewalls
    200.11.130.0/24 Misc public servers
    200.11.132.0/24 Access servers
    200.11.134.0/24 Misc private servers
    200.11.130.0/25 Public mail servers
    ...

warn-count could be run as in this example:
    $ cd $DNSBL_WORK_DIR
    $ ./warn-count -c critical ./results/hits-20060910
    Listing lists/spam.dnsbl.sorbs.net (200.11.130.10/32) [1 hosts] matches Public mail servers
    Listing lists/spam.dnsbl.sorbs.net (200.11.130.10/32) [1 hosts] matches Misc public servers
    Listing lists/spam.dnsbl.sorbs.net (200.11.130.10/32) [1 hosts] matches IP space for internal use

Each line of the report names the DNSBL file, the IP block that was found, and the number of IP addresses it contains. If placed in the proper crontab file, it would cause this information to be sent via email to the person in charge of dealing with listings of critical infrastructure. The example above would tell you that one of your public mail servers was listed in spam.dnsbl.sorbs.net.
At this point, we know how to generate daily samples of our IP space listings in the DNSBLs we're tracking. Now it's time to store that information in a database, so that we can easily build reports out of it. Let's start by defining a schema like the one shown in Figure 1. The "dnsbl" table will hold the name of each of the DNSBLs we will consider. The "sample" table will hold each of the individual samples that will describe our variable over time.
The file count.sql (Listing 1) contains the SQL description of this schema in SQLite syntax. To create the database in the file $DNSBL_WORK_DIR/count.db, the following command can be used:
    $ cd $DNSBL_WORK_DIR
    $ sqlite3 count.db < count.sql

That's all there is to it, so now let's move on to filling the database with samples. To make that job even easier, let's use Class::DBI. This provides a very nice OO wrapper around the database, making the code considerably cleaner.
Our Class::DBI-derived class will be in $DNSBL_WORK_DIR/lib/Net/Count.pm and the module will be called "Net::Count" for lack of a better name. Within this file, I will define three classes: Net::Count (the base class inheriting from Class::DBI), Net::Count::Dnsbl, and Net::Count::Sample. Let's see some code:
     9  package Net::Count;
    10  use base 'Class::DBI';

Lines 9 and 10 are pretty much it for Net::Count. However, this allows for a handy place to put common code, if the need arises. Most of my Class::DBI hierarchies have a common base like this, just in case.
    12  package Net::Count::Dnsbl;
    13  use base 'Net::Count';
    15  __PACKAGE__->table('dnsbl');
    16  __PACKAGE__->columns(All => qw/id name/);
    17  __PACKAGE__->has_many(samples => 'Net::Count::Sample');

Lines 12-17 define the base properties of Net::Count::Dnsbl: the name of the table, the columns that will be managed by Class::DBI, and the relationship with the Net::Count::Sample class, expressed in this case with a call to ->has_many().
    19  sub normalize_column_values
    20  {
    21      my $self = shift;       # Object or string - Careful!
    22      my $r = shift;
    23      $r->{name} =~ s/\W/_/g;
    24  }

Since we want to be sure that the names of the lists do not contain dangerous characters, I provide a normalize_column_values() function. This is invoked by Class::DBI automatically every time a row is to be inserted or updated in the table, and provides an opportunity to alter the data.
    26  __PACKAGE__->set_sql(names => qq{
    27      SELECT DISTINCT name
    28      FROM __TABLE__
    29  });
    30
    31  sub unique_names
    32  {
    33      my $sth = Net::Count::Dnsbl->sql_names();
    34      $sth->execute;
    35      map { $_->[0] } @{$sth->fetchall_arrayref(['name'])};
    36  }

Finally, lines 26-36 use the extended ->set_sql() provided by Class::DBI to provide a custom ->unique_names() method returning a list of unique DNSBL names in the database. This will be useful later, when we try to generate the graphs from our data.
    38  package Net::Count::Sample;
    39  use base 'Net::Count';
    40
    41  __PACKAGE__->table('sample');
    42  __PACKAGE__->columns(Primary => qw/dnsbl_id sample_utime source/);
    43  __PACKAGE__->columns(Other => qw/count/);
    44  __PACKAGE__->has_a(dnsbl_id => 'Net::Count::Dnsbl');

Lines 38-44 define the basic details of Net::Count::Sample. The relationship with the dnsbl table, through the Net::Count::Dnsbl class, is expressed by Class::DBI->has_a(). Note the calls to ->columns(), which allow the specification of multi-column primary keys as well as regular columns.
    53  __PACKAGE__->set_sql(sources => q{
    54      SELECT DISTINCT source
    55      FROM __TABLE__
    56  });
    58  __PACKAGE__->set_sql(historic_data => q{
    59      SELECT __ESSENTIAL__
    60      FROM dnsbl, __TABLE__
    61      WHERE
    62          dnsbl.name = ?
    63          AND sample.source = ?
    64          AND sample.sample_utime >= ?
    65          AND dnsbl.id = sample.dnsbl_id
    66      ORDER BY sample.sample_utime
    67  });
    69  sub unique_sources
    70  {
    71      my $sth = Net::Count::Sample->sql_sources();
    72      $sth->execute;
    73      map { $_->[0] } @{$sth->fetchall_arrayref(['source'])};
    74  }

Lines 53-74 use ->set_sql() again to produce the list of unique sources and also to query the database for the samples of a given DNSBL and source taken at or after a given time.
These 73 lines of code completely abstract the interaction with the database and allow our data loading script, dbi-count, to be much simpler. Let's see its relevant parts:
    47  while (<>)
    48  {
    49      chomp;
    50      my ($bl, $num) = (split(/\s+/, $_, 3))[0, 2];
    51      $T{$bl} += $num;
    52  }

The first step is to load and summarize all the hits in each DNSBL. This is done by accumulating the total in a simple hash, at lines 47-52.
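The accumulation is easy to try on its own. In this minimal sketch the three sample lines are illustrative (the /30 entry is made up) and follow the "list block count" format written by count-ips:

```perl
use strict;
use warnings;

# Illustrative hits in the format produced by count-ips.
my @hits = (
    "lists/http.dnsbl.sorbs.net 200.11.177.138 1",
    "lists/http.dnsbl.sorbs.net 200.11.178.0/30 4",
    "lists/spam.dnsbl.sorbs.net 200.11.130.10 1",
);

my %T;                                        # total listed addresses per DNSBL
for (@hits) {
    my ($bl, $num) = (split /\s+/, $_, 3)[0, 2];   # first and third fields
    $T{$bl} += $num;
}
print "$_\t$T{$_}\n" for sort keys %T;
```

Summing the third field rather than counting lines matters because a single hit may stand for a whole listed block.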
    55  Net::Count->connection($opt_d, undef, undef);

Line 55 initializes the Class::DBI machinery using either the argument to the -d command-line flag or a default DSN specified earlier in the code.
    58  for my $bl (sort keys %T)
    59  {
    60      my $db_bl = Net::Count::Dnsbl->find_or_create(name => $bl);
    61      my $db_sample = Net::Count::Sample->insert
    62      (
    63          {
    64              dnsbl_id     => $db_bl,
    65              sample_utime => $time,
    66              source       => $opt_s,
    67              count        => $T{$bl},
    68          }
    69      );
    70      # Some fancy output
    71      print "$opt_s\t$bl\t$T{$bl}\n";
    72  }

Lines 58-72 insert each sample in the database. Line 60 uses Class::DBI->find_or_create() to either find or create the DNSBL entry corresponding to the total we intend to store. This allows our database to automatically adapt as we incorporate new DNSBLs into our monitoring, sparing us from having to remember to add columns or modify scripts. Lines 61-69 insert the sample corresponding to this list, with this source, at the time the program is running.
Thanks to having used Class::DBI, our script is extremely portable. It will support any database already supported by the Perl DBI. Running the script could not be easier:
    $ cd $DNSBL_WORK_DIR
    $ ./dbi-count ./results/hits-20060910

Alternatively, you could modify the line in your crontab file where you invoke count-ips like this:
    7 1 * * * cd $DNSBL_WORK_DIR && \
        ./count-ips -i ./isp-networks -s ./lists > \
        ./results/hits-`date '+\%Y\%m\%d'`; \
        ./dbi-count -d $DSN ./results/hits-`date '+\%Y\%m\%d'`; \
        find ./results/ -type f -mtime +10 | xargs rm

This causes the results to be loaded into the database as soon as the scanning of the DNSBL data is finished.
All that's left now is producing the eye candy by converting the data stored in the database to a nice graph we can show the boss. We will use rrdtool for this task. This process has two parts: populating the RRD data and generating the graph.
To accomplish the first task, let's take a look at count-rrd, a simple script that takes the results stored in the database, creates the missing RRD files, and updates them. This makes the process of maintaining the RRDs automatic, thus reducing our workload:
    48  my @dnsbls  = map { s!\W+!_!g; $_ } Net::Count::Dnsbl->unique_names;
    49  my @sources = map { s!\W+!_!g; $_ } Net::Count::Sample->unique_sources;

Lines 48-49 use the ->unique_* methods defined in Net::Count::* to obtain the list of DNSBLs and sample sources in the database. Note the use of map { ... } as an additional measure against dangerous names coming from the database.
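The effect of that substitution is easiest to see on a couple of names. This sketch uses the /r flag, which requires Perl 5.14 or later and returns a modified copy instead of changing the value in place; that is a stylistic variation assumed for the example, not what the article's script does.

```perl
use strict;
use warnings;

# Names as they appear in the hits files.
my @names = ('lists/http.dnsbl.sorbs.net', 'lists/xbl.spamhaus.org');

# Replace every run of non-word characters with a single underscore.
my @safe = map { s/\W+/_/gr } @names;

print "$_\n" for @safe;
# prints:
#   lists_http_dnsbl_sorbs_net
#   lists_xbl_spamhaus_org
```

These sanitized names are what end up in the RRD file names, which is why files such as rrd/lists_http_dnsbl_sorbs_net-default.rrd appear later in graph.sh.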
    52  for my $d (@dnsbls)
    53  {
    54      for my $s (@sources)
    55      {
    56          my $name = "$path/$d-$s.rrd";
    57          if (-f $name)
    58          {
    59              update_rrd($name, $d, $s, 86400);
    60          }
    61          else
    62          {
    63              init_rrd($name, $d, $s);
    64          }
    65      }
    66  }

Lines 52-66 iterate through all possible combinations of DNSBL and sample sources. If the RRD corresponding to that combination is missing, the function init_rrd() is called. Otherwise, update_rrd() is called.
     72  sub init_rrd
     73  {
     74      my $file = shift;
     75      my $dnsbl = shift;
     76      my $source = shift;
     77
     78      # Which data sources and aggregator functions to
     79      # include in each RRD
     80
     81      my @dss = (qw/listings/);
     82      my @agg = (qw/AVERAGE MIN MAX/);
     83
     84      # Create the RRDs
     85      RRDs::create
     86      (
     87          $file,
     88          '--start' => '-1 year',
     89          '--step'  => 86400,
     90          (map { "DS:$_:GAUGE:172800:0:U" } @dss),
     91          (map { "RRA:$_:0.5:1:365" } ('LAST', @agg)),
     92          (map { "RRA:$_:0.5:5:365" } ('LAST', @agg)),
     93          (map { "RRA:$_:0.5:30:365" } ('LAST', @agg)),
     94      );
         ...
     98      # Verify creation errors
     99      my $error = RRDs::error;
    100      die "Failed to create RRD $file: $error\n"
    101          if $error;
    102
    103      # Perform the update
    104      update_rrd($file, $dnsbl, $source, 31536000);
    105  }

Lines 72-105 define the init_rrd() function. This is a rather thin wrapper around RRDs::create() that supplies the parameters to create the RRD as required. This is precisely what allows count-rrd to automatically adapt to new DNSBLs or sample sources. RRDs for new combinations can be created on the fly. In fact, you can erase the RRDs and rerun count-rrd. This will re-create the erased files.
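The RRA definitions on lines 91-93 are easier to judge once translated into days. In an RRA specification of the form RRA:CF:xff:steps:rows, each row consolidates "steps" primary data points, so the archive spans steps x rows x step seconds. A small sketch of that arithmetic, with rra_span_days as a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: days of history held by one RRA, given the RRD step
# in seconds and the RRA's steps-per-row and row count.
sub rra_span_days {
    my ($step, $steps, $rows) = @_;
    return $step * $steps * $rows / 86400;
}

my $step = 86400;                       # one primary data point per day
for my $rra ([1, 365], [5, 365], [30, 365]) {
    my ($steps, $rows) = @$rra;
    printf "RRA %d:%d -> one row every %d day(s), %d days of history\n",
        $steps, $rows, $steps * $step / 86400,
        rra_span_days($step, $steps, $rows);
}
```

With a daily step, the three archives keep one year of daily values, about five years of 5-day consolidations, and about thirty years of 30-day consolidations. The 172800-second heartbeat on line 90 means a sample may arrive up to two days late before the value is recorded as unknown.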
    107  sub update_rrd
    108  {
         ...
    115      my @samples = Net::Count::Sample->search_historic_data($dnsbl,
    116                                                             $source,
    117                                                             $time);
    118      # Now update the RRD file
    119      for my $sample (@samples)
    120      {
    121          # The actual update command
    122          RRDs::update(
    123              $file,
    124              '-t' => 'listings',
    125              $sample->sample_utime . ":" . $sample->count
    126          );
    127
    128          # Verify update errors
    129          my $error = RRDs::error;
    130          warn "Failed to update RRD $file: $error\n"
    131              if $error;
    132      }
    133  }

Lines 107-133 obtain data for a given period of time using the Net::Count::Sample->search_historic_data() method. The data is then passed to RRDs::update() to update the corresponding RRD file.
I can now update all my RRD files by invoking the following command:
    $ cd $DNSBL_WORK_DIR
    $ mkdir rrd
    $ ./count-rrd rrd

This will create all the missing RRDs and update their information. Of course, I could also add this step to the crontab file. The new line would look a lot like this:
    7 1 * * * cd $DNSBL_WORK_DIR && \
        ./count-ips -i ./isp-networks -s ./lists > \
        ./results/hits-`date '+\%Y\%m\%d'`; \
        ./dbi-count -d $DSN ./results/hits-`date '+\%Y\%m\%d'`; \
        ./count-rrd -d $DSN ./rrd; \
        find ./results/ -type f -mtime +10 | xargs rm

All that's left is converting the RRDs into graphs that can be put on Web pages -- probably with rrdcgi -- or added to reports. For demonstration purposes, I prepared graph.sh, which creates a very simple area graph with the lists I've discussed in this article:
    #!/bin/bash
    rrdtool graph 'rrd/summary.svg' --imgformat SVG \
      --start '-6 months' --end '-1 second' --step 86400 \
      --width 600 --height 240 --lower-limit 0 \
      --slope-mode \
      --title 'Listing summary' --watermark 'SysAdmin Magazine' \
      DEF:var0=rrd/lists_http_dnsbl_sorbs_net-default.rrd:listings:LAST \
      DEF:var1=rrd/lists_smtp_dnsbl_sorbs_net-default.rrd:listings:LAST \
      DEF:var2=rrd/lists_socks_dnsbl_sorbs_net-default.rrd:listings:LAST \
      DEF:var3=rrd/lists_spam_dnsbl_sorbs_net-default.rrd:listings:LAST \
      DEF:var4=rrd/lists_web_dnsbl_sorbs_net-default.rrd:listings:LAST \
      DEF:var5=rrd/lists_misc_dnsbl_sorbs_net-default.rrd:listings:LAST \
      DEF:var6=rrd/lists_list_dsbl_org-default.rrd:listings:LAST \
      DEF:var7=rrd/lists_psbl_surriel_com-default.rrd:listings:LAST \
      DEF:var8=rrd/lists_sbl_spamhaus_org-default.rrd:listings:LAST \
      DEF:var9=rrd/lists_xbl_spamhaus_org-default.rrd:listings:LAST \
      AREA:var0#FF0000:'rrd/lists_http_dnsbl_sorbs_net-default.rrd' \
      AREA:var1#FFFF00:'rrd/lists_smtp_dnsbl_sorbs_net-default.rrd':STACK \
      AREA:var2#FF00FF:'rrd/lists_socks_dnsbl_sorbs_net-default.rrd':STACK \
      AREA:var3#CCDD22:'rrd/lists_spam_dnsbl_sorbs_net-default.rrd':STACK \
      AREA:var4#0000FF:'rrd/lists_web_dnsbl_sorbs_net-default.rrd':STACK \
      AREA:var5#0044EE:'rrd/lists_misc_dnsbl_sorbs_net-default.rrd':STACK \
      AREA:var6#CCAAFF:'rrd/lists_list_dsbl_org-default.rrd':STACK \
      AREA:var7#3322DD:'rrd/lists_psbl_surriel_com-default.rrd':STACK \
      AREA:var8#888800:'rrd/lists_sbl_spamhaus_org-default.rrd':STACK \
      AREA:var9#882288:'rrd/lists_xbl_spamhaus_org-default.rrd':STACK

Please see the documentation for rrdtool, as this excellent package includes lots of options for customizing your graphs. As shown, this script produced a graph similar to Figure 2 when fed with data for a few months.
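Since the DEF and AREA arguments in graph.sh follow a regular pattern, they can also be generated rather than typed by hand. The following is a rough sketch, not part of the article's code: graph_args is a hypothetical helper, and the "-default" suffix simply mirrors the file names used above.

```perl
use strict;
use warnings;

# Hypothetical helper: build the DEF and AREA arguments for "rrdtool graph"
# from a list of RRD base names. Every series after the first is stacked.
sub graph_args {
    my ($dir, $colors, @names) = @_;
    my (@defs, @areas);
    for my $i (0 .. $#names) {
        my $file  = "$dir/$names[$i]-default.rrd";
        my $color = $colors->[$i % @$colors];       # cycle through colors
        push @defs,  "DEF:var${i}=${file}:listings:LAST";
        push @areas, "AREA:var${i}#${color}:'${file}'" . ($i ? ':STACK' : '');
    }
    return (@defs, @areas);
}

my @args = graph_args(
    'rrd',
    [qw(FF0000 FFFF00)],
    qw(lists_http_dnsbl_sorbs_net lists_smtp_dnsbl_sorbs_net),
);
print "$_\n" for @args;
```

The printed lines match the first two DEF and AREA entries of graph.sh, so a list of names such as the one returned by ->unique_names() could drive the graph definition as new DNSBLs are added.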
Another example of the look you can achieve can be seen in Figure 3. This graph, in Spanish, shows the results of applying this technique in a real network.
And this eye candy provides an excellent excuse to finish this article, so you can go play with the rrdtool options for changing how your graphics look. I hope this information proves as useful to you as it has been for me.
Luis has been working in various areas of computer science since the late 1980s. Some people blame him for conspiring to bring the Internet into his home country, where currently he spends most of his time teaching others about Perl and taking care of network security at the largest ISP there as its CISO. He also believes that being a sys admin is supposed to be fun.