This site is now 100% read-only, and retired.

Lightweight database work with sqlite

Posted by Steve on Mon 23 May 2005 at 00:35

There are many uses for databases, for example storing results and data processing. However there are cases where using a full database is unfeasible. If your job is small you might consider using the standalone package sqlite to create a simple database.

In this situation there are two useful solutions:

  • Connect to a database upon a different host.
  • Use a lightweight standalone database such as SQLite

Both solutions have pro's and cons. If you're familiar with a large database such as MySQL, Postgres, or Oracle you may well consider using that as the clients may easily run upon another host.

But there are situations where a small database is useful for minimal tasks - upon computers running as gateways, for example.

SQLite is a relatively recent project, but it has an almost complete implementation of SQL 92, and has bindings for several different languages.

One of the attractions of using SQLite is that it works on individual files, which can be copied to other hosts, and it doesn't require a server running at all times - only when you are actively importing data or querying it is anythign running.

As an example of a useful application which you might consider using a database for we'll look at using the perl bindings to first import some logfiles into a database, then deal with some statistics.

(If you're interested in generating real reports from Apache access logs you might be interested in one of the modules for logging accesses direct to a database from Apache - this cuts out the need to import the data).

First of all you'll need to install the database tools, and then the perl runtime:

 apt-get install libdbd-sqlite3-perl

This will give you the ability to "connect" to a database. As there is no database server installed you simply give the module the name of a file when you issue a connect statement - if it doesn't exist it will be created.

To connect to a database stored in the file "logs.db" you'd use the following code:

#!/usr/bin/perl -w
use strict;
use DBI;

my $dbh = DBI->connect( "dbi:SQLite:dbname=logs.db", "", "" );

(Here the last two parameters are "username" and "password" - meaningless in the server-less context of SQLite).

Once connected you could create a table. For this example we're going to work with a standard Apache logfile.

The table I wish to work with will have the following columns:

  • The IP address of the client which made a request.
  • The request made.
  • The status code the server returned.
  • The size of the request.

This information will allow us to generate some interesting statistics.

To create the table we run the following program:

#!/usr/bin/perl -w
use strict;
use DBI;

my $dbh = DBI->connect( "dbi:SQLite:dbname=logs.db", "", "" );

#
#  Delete the table if it already exists.
#
eval 
{
    local $dbh->{PrintError} = 0;
    $dbh->do("DROP TABLE accesses");
};

$dbh->do( "CREATE TABLE accesses (id INTEGER PRIMARY KEY, ip, request, code, size);" );

$dbh->disconnect();

Once running this you should find that you have a file called logs.db which has the table structure inside it. Notice that we didn't give types to each of the columns - this is something that the library tries to manage itself.

Now that we've got the database created we wish to actually import some data into it. The following is a sample excerpt from an Apache access.log file - in the common format:

192.168.1.1 - - [22/May/2005:22:45:09 +0100] "GET /images/pollbarright.gif HTTP/1.0" 304 - "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.7) Gecko/20050421 Firefox/1.0.3 (Debian package 1.0.3-2)"
192.168.1.1 - - [22/May/2005:22:45:09 +0100] "GET /images/pollbar.gif HTTP/1.0" 304 - "-" "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.7) Gecko/20050421 Firefox/1.0.3 (Debian package 1.0.3-2)"

Importing this detail can be simply achieved via a bit of Perl.

#!/usr/bin/perl -w
use strict;
use DBI;

#
#  Connect to the database.
#
my $dbh = DBI->connect( "dbi:SQLite:dbname=logs.db", "", "" );

#
#  Delete the table if it already exists.
#
eval 
{
    local $dbh->{PrintError} = 0;
    $dbh->do("DROP TABLE accesses");
};


#
#  Create it afresh.
#
$dbh->do( "CREATE TABLE accesses (id INTEGER PRIMARY KEY, ip, request, code, size);" );



#
#  Open the specified Apache access log.
#
my $filename = shift;

die "No filename...." if ( ! defined( $filename ) );


open( INPUT, "<$filename" ) or die "Cannot open $filename - $!";


#
#  Parse each line in a *very* simplistic and fragile way.
#
#  Insert items into database
#
foreach my $line ( <INPUT> )
{
    my ( $ip, $request, $code, $size )= "";

    chomp( $line );
    if ( $line =~ /^([^ ]+) / ) 
    { 
        $ip = $1; 
    }
    if ( $line =~ /GET ([^ ]+) HTTP/ ) 
    { 
        $request = $1; 
    }
    if ( $line =~ /HTTP\/1.[01]" ([0-9]+) / )
    {
        $code = $1; 
    }
    if ( $line =~ /HTTP\/1.[01]" ([0-9]+) ([0-9]+) / )
    {
        $size = $2; 
    } 
    else 
    { 
        $size = 0; 
    }

    $dbh->do( "INSERT INTO accesses( ip, request, code, size ) VALUES( '$ip', '$request', $code, $size);" );
}
close( INPUT );

#
#  Disconnect.
#
$dbh->disconnect();

With this saved to a file, import-sql you can run it as follows:

perl import-sql /var/log/apache/access.log

If it completes successfully you should see that your database file has grown:

skx@lappy:~$ ls -l logs.db 
-rw-r--r--  1 skx skx 50176 2005-05-23 01:13 logs.db

Now we can start working with the database contents, for example we might want to see the unique IP addresses which have visited our site - and how many times.

This can be achived with the following query:

SELECT ip,COUNT(*) AS Number FROM accesses GROUP BY ip ORDER BY Number DESC, ip

A simple perl script to show the top accesses by IP address would look like this:

#!/usr/bin/perl -w
use strict;
use DBI;

#
#  Connect to the database.
#
my $dbh = DBI->connect( "dbi:SQLite:dbname=logs.db", "", "" );


my $all = $dbh->selectall_arrayref("SELECT ip,COUNT(*) AS Number FROM accesses GROUP BY ip ORDER BY Number DESC, ip" );

foreach my $row (@$all) 
{
    my ($ip, $count) = @$row;
    print "$ip \t\t| $count\n";
}

#
#  Disconnect.
#
$dbh->disconnect();

Saving this to a file top-ip-sql then running it:

skx@lappy:~$ perl top-ip-sql
127.0.0.1               | 898
192.168.1.1             | 389

This shows us that the IP address 127.0.0.1 made 898 requests to the database - and that the IP address 192.168.1.1 made 389 requests.

We could also look at the most popular requests, by count, in a similar fashion - with the following query:

SELECT request,COUNT(*) AS Number FROM accesses GROUP BY request ORDER BY Number DESC, request LIMIT 10;

Rather than using the Perl interface to the library you can also access the database via the command line - as you might be used to if you're familiar with database client of MySQL, or Postgres.

To do this you will need to install the client program:

apt-get install sqlite3

Then you amy interact with the database directly, via the command:

sqlite3 filename

Here we can run the same queries as we did before - or new ones.

A simple example would be seeing the total amount of traffic which we served:

skx@lappy:~$ sqlite3 logs.db 
SQLite version 3.2.1
Enter ".help" for instructions
sqlite> 
sqlite> select sum( size ) FROM accesses;
7855952
sqlite> .exit

Here we see that 7,855,952 bytes have been served and logged in our access file.

This is just one simple example of the kind of programmatic or on-the-fly logging jobs you can simplify with the SQLite database.

 

 


Re: Lightweight database work with sqlite
Posted by Anonymous (210.18.xx.xx) on Tue 25 Sep 2007 at 03:39
excellent - simple and clean and yet achiveing so much. thanks

[ Parent ]

Re: Lightweight database work with sqlite
Posted by Anonymous (213.137.xx.xx) on Mon 17 Nov 2008 at 21:53
Spot on mate, very usefull information :)

- Peter.

[ Parent ]

Re: Lightweight database work with sqlite
Posted by Anonymous (68.189.xx.xx) on Mon 19 Jan 2009 at 19:37
Thank you for taking the time to make this excellent tutorial. It all worked for me.
Bill

[ Parent ]

Re: Lightweight database work with sqlite
Posted by Anonymous (41.141.xx.xx) on Sat 10 Dec 2011 at 17:33
what ist the right regex of url and agent?
thanks

[ Parent ]