11.22.06

HOWTO: Apache2 + Awstats setup on Debian/Ubuntu (Edgy Eft)

Posted in Unix / Linux, Tips and Documentation at 9:07 am by skoobi

Here is a simple HOWTO explaining how to configure Awstats to analyze Apache2 logs and provide detailed statistics under Ubuntu Edgy Eft. This should also work for other Ubuntu versions, as well as any Debian derivative.

Apache

The first step is to enable logging in Apache, so that Awstats has something to analyze. For instance, you can add something like this to your VirtualHost configuration:

ErrorLog /var/log/apache2/sirika.com-error.log
CustomLog /var/log/apache2/sirika.com-access.log combined
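
Once Apache has served a few requests, it can be worth checking that the log really is in the combined format before pointing Awstats at it. The snippet below is only a rough sketch: the function name looks_combined and the regular expression are my own approximation of a combined line (host, identity, user, bracketed date, quoted request, status, size, quoted referer, quoted user agent), not something shipped with Apache or Awstats.

```shell
#!/bin/sh
# looks_combined FILE: succeed if the first line of FILE resembles
# Apache's "combined" log format.
# Hypothetical helper; the pattern is an approximation, not a full parser.
looks_combined() {
    head -n 1 "$1" | grep -Eq \
        '^[^ ]+ [^ ]+ [^ ]+ \[[^]]+] "[^"]*" [0-9]{3} ([0-9]+|-) "[^"]*" "[^"]*"$'
}
```

For instance, looks_combined /var/log/apache2/sirika.com-access.log && echo "looks like combined" prints the message only when the first line matches.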

Apache also needs a few settings for Awstats: where the icons are and, more importantly, CGI script support (Awstats is written in Perl). This can be done with the following /etc/apache2/conf.d/awstats.conf:

# This provides worldwide access to everything below the directory
# Security concerns:
# * Raw log processing data is accessible too for everyone
# * The directory is by default writable by the httpd daemon, so if
# any PHP, CGI or other script can be tricked into copying or
# symlinking stuff here, you have a looking glass into your server,
# and if stuff can be uploaded to here, you have a public warez site!

<Directory /var/lib/awstats>
    Options None
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>

# This provides worldwide access to everything below the directory
# Security concerns: none known

<Directory /usr/share/awstats/icon>
    Options None
    AllowOverride None
    Order allow,deny
    Allow from all
</Directory>

# This provides worldwide access to everything in the directory
# Security concerns: none known
Alias /awstats-icon/ /usr/share/awstats/icon/

# This (hopefully) enables _all_ CGI scripts in the default directory
# Security concerns: Are you sure _all_ CGI scripts are safe?
ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/

Awstats

The next step is to install Awstats, along with the Perl modules it needs. The optional Awstats modules only work if the Perl modules they depend on are installed. liburi-perl is useful if you use the “decodeutfkeys” module, even though it is NOT listed as recommended or suggested in the awstats package.

sudo apt-get install awstats libnet-dns-perl libnet-ip-perl libgeo-ipfree-perl liburi-perl libnet-xwhois-perl

Once this is installed, Awstats has to be told which logs it should analyze and where it should write its working files (stats). This is done by creating a file /etc/awstats/awstats.website.conf (replace website with your apache2 virtual host name, for instance, and do NOT forget the .conf!).

sudo cp /etc/awstats/awstats.conf /etc/awstats/awstats.website.conf

Editing this file should be pretty straightforward, since it is well commented. In particular, pay attention to the following entries:

LogFile="/var/log/apache2/sirika.com-access.log"

SiteDomain="sirika.com"

HostAliases="www.sirika.com"

DirData="/srv/data/stats/sirika.com"

LogFile should point to the log file configured in Apache2.

SiteDomain is the main domain name, as configured in Apache2.

HostAliases should list ALL the aliases listed in Apache2’s VirtualHost configuration. Usually, you will want the same domain prefixed with www (or without the www prefix if it is already part of the main domain name). This is really annoying and error-prone, and not having a global definition of a “virtual host” on the system is one of the issues I pointed out in “10 things that still suck under linux”. Virtual hosts and aliases should be defined once, globally. On a perfect system, every single concept/thing would be configured once, and only once. Anyways…

DirData should point to an empty directory, whose content will be managed by Awstats.
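
If you have several virtual hosts, the four entries above can also be filled in without opening an editor. What follows is a minimal sketch, assuming the template contains one uncommented line for each of the four directives; the function name make_awstats_conf and its argument order are made up for this example.

```shell
#!/bin/sh
# make_awstats_conf TEMPLATE DOMAIN LOGFILE DATADIR
# Print TEMPLATE with the four site-specific directives rewritten.
# Hypothetical helper: it assumes DOMAIN is given without the "www." prefix
# and that no value contains "|" or "&", which are special to sed here.
make_awstats_conf() {
    template=$1 domain=$2 logfile=$3 datadir=$4
    sed \
        -e "s|^LogFile=.*|LogFile=\"$logfile\"|" \
        -e "s|^SiteDomain=.*|SiteDomain=\"$domain\"|" \
        -e "s|^HostAliases=.*|HostAliases=\"www.$domain\"|" \
        -e "s|^DirData=.*|DirData=\"$datadir\"|" \
        "$template"
}
```

The result goes to standard output, so you can review it before redirecting it to /etc/awstats/awstats.website.conf. Commented-out lines in the template are left alone because the patterns are anchored at the start of the line.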

Cron

Everything is now configured. On a perfect system, the setup would stop here. Awstats could be automagically notified of every change in the logs: it could register for a so-called “http access event”, and its internal behaviour could define its policy for updating the stats (update synchronously, update asynchronously when the system is idle, etc.). This is one of the things I pointed out in “10 things that still suck under linux”. However, we’re not there yet, so we need to run the script every day to update the stats. (Yeah, that’s the reality today.)

This can be done with the following /etc/cron.daily/awstats (do not forget to chmod +x /etc/cron.daily/awstats after creating it):

#!/bin/sh

/usr/share/doc/awstats/examples/awstats_updateall.pl -awstatsprog=/usr/lib/cgi-bin/awstats.pl now > /dev/null

This will update the statistics for all the hosts defined in /etc/awstats/*, on a daily basis. Yes, it’s not as beautiful as having a full-featured event system in which every application could attach to events generated by others, but it has the merit of working…
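
For the curious, the idea behind the helper script is simply to walk the configuration directory and update each site in turn. The loop below is a rough shell equivalent, not the actual code of awstats_updateall.pl; the function names and the AWSTATS_PROG variable are made up for this sketch, and it relies only on the -config= and -update options of awstats.pl.

```shell
#!/bin/sh
# list_awstats_sites DIR: print the site name of every awstats.NAME.conf in DIR.
list_awstats_sites() {
    for conf in "$1"/awstats.*.conf; do
        [ -e "$conf" ] || continue        # the glob matched nothing
        name=${conf##*/awstats.}          # strip directory and "awstats." prefix
        printf '%s\n' "${name%.conf}"     # strip the ".conf" suffix
    done
}

# update_all_sites DIR: run one Awstats update per site found in DIR.
# AWSTATS_PROG is a hypothetical override, defaulting to the Debian path.
update_all_sites() {
    list_awstats_sites "$1" | while read -r site; do
        "${AWSTATS_PROG:-/usr/lib/cgi-bin/awstats.pl}" \
            -config="$site" -update < /dev/null
    done
}
```

Calling update_all_sites /etc/awstats would then update every configured site. The plain awstats.conf template is skipped, because the pattern requires a name between "awstats." and ".conf".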

Logrotate

What happens when your apache logs get rotated (and possibly gzipped, etc.) by logrotate (apt-cache show logrotate for more information), and Awstats has not yet analyzed the end of the log that is about to be rotated?

To avoid this situation, it is necessary to tell logrotate to launch Awstats BEFORE rotating the logs. This can be done by adding the following lines inside the braces of the log definition in /etc/logrotate.d/apache2:

prerotate
/etc/cron.daily/awstats
endscript
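
A quick way to confirm that the edit is in place is to look for the hook in the file. The function below is a hypothetical helper written for this HOWTO; it only checks that a prerotate/endscript pair calling the cron script exists, not that logrotate will accept the file.

```shell
#!/bin/sh
# has_awstats_prerotate FILE: succeed if FILE contains a
# prerotate ... endscript block that calls /etc/cron.daily/awstats.
# Hypothetical helper; it does not validate the rest of the file.
has_awstats_prerotate() {
    awk '
        $1 == "prerotate" { inside = 1; next }
        $1 == "endscript" { inside = 0; next }
        inside && index($0, "/etc/cron.daily/awstats") { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}
```

For a real dry run, logrotate -d /etc/logrotate.d/apache2 runs logrotate in debug mode, which parses the file and shows what would happen without rotating anything.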

Permissions

And of course, permissions must be tweaked:

  • Since Awstats runs as the web user when viewing stats (CGI script), the web user needs read access to /srv/data/stats/*
  • Additionally, you may want to provide the “update now” button on your website stats. So, the web user also needs write access to /srv/data/stats/*
  • Finally, Awstats needs access to the apache2 logs to create the stats. This is not a problem when it is run from a cron script, since it runs as root. But in the case of “update now”, it runs as the web server, so the web server needs read access to its logs. The default permissions are 660 with root:adm, so www-data doesn’t have access to its own logs

The problem with traditional permissions is that there is no decent way of specifying default permissions for files that will be created later. So, we are going to use ACLs for that. You can find more information about them in “Using POSIX ACLs to complement traditional Linux permissions”. This gives, for instance:

# read write execute access for web user to the stats directories
find /srv/data/stats -type d -exec setfacl -m "g:www-data:rwx" {} \;

# read write execute access for FUTURE stats files for the web users
find /srv/data/stats -type d -exec setfacl -d -m "g:www-data:rwx" {} \;

# read write access to the existing stats files for the web user
# (no -d here: default ACLs can only be set on directories)
find /srv/data/stats -type f -exec setfacl -m "g:www-data:rw-" {} \;

# read only access to the logs directory for the web user
find /var/log/apache2 -type d -exec setfacl -m "g:www-data:r-x" {} \;

# read only access to the logs for the web user, for future files
find /var/log/apache2 -type d -exec setfacl -d -m "g:www-data:r-x" {} \;

# read only access to the apache2 logs for the web user
find /var/log/apache2 -type f -exec setfacl -m "g:www-data:r--" {} \;
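
To see whether the ACLs had the intended effect, you can check what the web user is actually able to do. The helper below is a small sketch of mine (the name report_access is made up); it reports access for whichever user runs it, so it is meant to be run as www-data, for example through sudo -u www-data.

```shell
#!/bin/sh
# report_access PATH...: for each PATH, print whether the *current* user
# can read (r) and write (w) it, followed by the path.
# Hypothetical helper; run it as the web user to check the ACLs.
report_access() {
    for path in "$@"; do
        r=-
        w=-
        if [ -r "$path" ]; then r=r; fi
        if [ -w "$path" ]; then w=w; fi
        printf '%s%s %s\n' "$r" "$w" "$path"
    done
}
```

Run as www-data, the access log should show up as readable and the stats directory as readable and writable, if the setfacl commands above did their job.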

And it should work. The last thing would be to protect access to your logs, if you don’t want your users to see them. This can be done using a .htaccess file, and there are plenty of tutorials on the web that explain how to achieve that.

1 Comment »

  1. Val said,

    July 9, 2008 at 7:35 am

    Thanks dude, this is great.
