Some simple Apache optimisations

Posted by Steve on Mon 18 Jul 2005 at 10:02

Apache is the world's most popular webserver, powering over half the websites on the internet. It is a stable and reliable platform, but sometimes it struggles under a lot of load. Here we'll look at a couple of simple changes to increase performance when handling a lot of traffic.

None of these tips are revolutionary, but combined they have allowed this site to stay up under two slashdottings. If you've not heard the term before, a Slashdotting is what happens when a popular website such as Slashdot links to a smaller site - suddenly there are thousands of visitors all coming to your site. The sudden and sustained increase in incoming requests can overload many servers.

Frequently the Slashdot effect will knock a site over, either because there's insufficient bandwidth to handle the incoming connections, or because the webserver isn't set up to handle such a large load. This site has survived two such links; most recently a single article received 16,000 readers in the space of a couple of hours.

So, what can we do to tune Apache? Well, there are several small and large changes that can be made. Depending upon your server some or all of these may not be appropriate, but they've worked for me both here and on other sites I've set up. (At times like this I feel like pimping out my server handholding and remote maintenance services .. ;)

DNS Lookups

The single biggest source of slowdown in most webservers is the time required to perform DNS lookups.

Typically a webserver will record the full host name of each incoming client connection in its access.log. This resolving can eat a significant chunk of time, even with a DNS cache.

Disabling DNS lookups by ensuring your Apache setup contains "HostnameLookups Off" inside either /etc/apache/httpd.conf, or /etc/apache2/apache2.conf can immediately make your server capable of handling more traffic.
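
To see what a configuration file currently says, a small helper along these lines could be used. It is only a sketch: the function name hostname_lookups_setting is made up for illustration, and it inspects a single file without following any Include directives.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last uncommented
# HostnameLookups directive found in the given config file.
# Prints nothing if the directive is absent. Only the named file is
# checked; files pulled in via Include are not followed.
hostname_lookups_setting() {
    awk 'tolower($1) == "hostnamelookups" { value = $2 }
         END { if (value != "") print value }' "$1"
}

# Example usage:
#   hostname_lookups_setting /etc/apache2/apache2.conf
```

If this prints "On" or "Double" then lookups are still enabled in that file.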

You might be concerned that this will make your server log files less readable, and affect any log file analysis you might wish to perform. Thankfully the Debian Apache package ships with the logresolve tool - this reads your log file, performs the hostname lookups, and writes out a new copy with the names filled in.

If you use webalizer or Awstats you can use the logresolve tool to add in the host names before the stats are generated.

I use webalizer to produce my site's statistics and simply instruct it to read its logfile from access.log.resolved instead of the more typical access.log. I produce this file once a day, just before producing the statistics, with the following small script:

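#!/bin/sh
# Resolve the addresses in each site's access log, then rebuild its statistics.

# Stop if a log directory is missing rather than writing files elsewhere.
cd /home/www/www.site1.com/logs || exit 1
logresolve < access.log > access.log.resolved
/usr/bin/webalizer -q

cd /home/www/www.site2.com/logs || exit 1
logresolve < access.log > access.log.resolved
/usr/bin/webalizer -q

MaxClients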

When Apache starts up it will create a number of listening processes, each of which will handle a given number of clients then exit.

(This process is complicated somewhat by the different MPM models available in Apache2 - but in general it's a fair statement.)

If you have a lot of incoming clients you can immediately handle more just by increasing the relevant counts.

If your server has reached the limit of what it can handle you'll see something like this in your error.log file:

[error] server reached MaxClients setting, consider raising the MaxClients setting
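
A quick way to find out how often this has happened is to count the matching lines. The helper below is a sketch; its name and the log path in the usage comment are assumptions, so point it at wherever your own error.log lives.

```shell
#!/bin/sh
# Hypothetical helper: count how many times the MaxClients warning
# appears in the given error log.
count_maxclients_warnings() {
    grep -c 'server reached MaxClients setting' "$1"
}

# Example usage (the path is an assumption; adjust for your system):
#   count_maxclients_warnings /var/log/apache2/error.log
```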

The settings look like this, although if you're using Apache2 you'll discover that your apache2.conf file has multiple versions of these settings, one for each of the process models available:

StartServers         5
MinSpareServers      5
MaxSpareServers     10
MaxClients          35
MaxRequestsPerChild  0

The way to adjust these is to raise each number by a small amount. This should allow you to handle more simultaneous clients, at the expense of running more processes. There's a fine balance to be maintained between running enough processes to handle the traffic, and running so many that your server slows down due to increased load.

Adjusting these settings appropriately will almost certainly be the single most useful change you can make to your server, but it's hard to give appropriate numbers. It really will depend upon your server, and what else you're running.
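
As a starting point only, one common rule of thumb (not something measured on this site) is to divide the memory you can spare for Apache by the size of a typical Apache process. The sketch below does that arithmetic; the function name and the example figures are invented for illustration, and you would need to measure the real numbers on your own machine.

```shell
#!/bin/sh
# Rough upper bound for MaxClients (a rule of thumb, not a measured limit):
#   (total RAM - RAM reserved for everything else) / RAM per Apache process
# All three arguments are in megabytes.
estimate_max_clients() {
    total_mb=$1
    reserved_mb=$2
    per_process_mb=$3
    echo $(( (total_mb - reserved_mb) / per_process_mb ))
}

# Example: 1024 MB of RAM, 256 MB kept back for the OS and a database,
# roughly 20 MB per Apache process.
estimate_max_clients 1024 256 20   # prints 38
```

Treat the result as a ceiling to stay under rather than a target, since going past it is what pushes the machine into swap.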

KeepAlive

Using KeepAlive is closely related to the MaxClients setting above.

Essentially KeepAlive keeps each listening connection alive for a short time to receive a potential followup request. Assuming that a client wishes to make several requests to your server it can do so en masse without having to make multiple distinct connections.

In this scenario KeepAlive is a useful optimisation, but it can mean that you have a lot of connections open uselessly waiting for followup requests which never occur.

A possible solution here is to allow KeepAlive, but only for a few seconds. This means that any client which requests another page quickly will receive it, but if it doesn't then the listening will stop - allowing your server to handle another connection instead.

To do this use:

#
#  Keep connections alive, but only for two seconds.
#
KeepAlive On
KeepAliveTimeout 2
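
To get a feel for why a short timeout matters, here is a back-of-the-envelope estimate. It assumes the worst case, where every connection sits idle for the full KeepAliveTimeout before closing, and the figures (50 new connections per second, a 15 second timeout for comparison) are invented for illustration.

```shell
#!/bin/sh
# Worst-case number of server processes tied up by idle keep-alive
# connections: new connections per second multiplied by the timeout.
idle_keepalive_slots() {
    connections_per_second=$1
    timeout_seconds=$2
    echo $(( connections_per_second * timeout_seconds ))
}

idle_keepalive_slots 50 15   # prints 750
idle_keepalive_slots 50 2    # prints 100
```

Either figure is well above a MaxClients of 35, which is why a long timeout can leave a busy server with no free processes.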

Deny OverRides

Another common source of slowdown in Apache is the use of .htaccess files to change Apache's behaviour.

Many settings can be altered on a per-directory basis using these files, but looking for them and reading them will cause the server to slow down, and do more work than it really needs to.

For example, consider a request for a static file stored beneath prefix/some/long/path/.

This file should be something that Apache can serve quickly; there's nothing (obviously) dynamic about it. But if you allow the use of "Override files" then Apache must scan for and process:

  • prefix/.htaccess
  • prefix/some/.htaccess
  • prefix/some/long/.htaccess
  • prefix/some/long/path/.htaccess

Setting "AllowOverride None" inside the <Directory> sections of your main configuration and virtual hosts will disable this searching and reduce the amount of file testing and reading your server needs to do.
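
The lookups listed above can be reproduced mechanically, which makes it easy to see how the cost grows with directory depth. This is a sketch only: htaccess_candidates is a made-up name, and it expects a relative directory path like the one in the example.

```shell
#!/bin/sh
# Hypothetical helper: print every .htaccess file that would be checked
# for a file stored in the given (relative) directory.
htaccess_candidates() {
    dir=$1
    built=""
    # Split the directory on "/" and rebuild it one component at a time.
    old_ifs=$IFS
    IFS=/
    set -f
    for part in $dir; do
        [ -n "$part" ] || continue
        built="${built:+$built/}$part"
        printf '%s/.htaccess\n' "$built"
    done
    set +f
    IFS=$old_ifs
}

# Example usage:
#   htaccess_candidates prefix/some/long/path
```

Each extra directory level adds one more file for the server to look for on every request.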

Of course many times you will discover that you need some directories to have specific processing - the solution here is to add such configuration settings inside your Apache setup directly.

Compress Content

Compress your content with mod_deflate, or mod_gzip, if you can.

Whilst there's some CPU overhead in performing this compression, when serving a lot of mostly static content network saturation is usually a bigger problem than CPU overload.

If you later spot CPU load problems, the compression is easy to disable again.
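
To confirm that compression is really being applied, you can look at the response headers. The helper below is a sketch with a made-up name; the usage comment assumes curl is installed and that the server answers on localhost.

```shell
#!/bin/sh
# Hypothetical helper: read HTTP response headers on standard input and
# succeed if they declare a gzip or deflate Content-Encoding.
response_is_compressed() {
    grep -i '^Content-Encoding:' | grep -Eiq 'gzip|deflate'
}

# Example usage (assumes curl is installed and the server is on localhost):
#   curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' http://localhost/ \
#       | response_is_compressed && echo "compressed"
```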

Remove Debugging Logs

Many Apache modules such as mod_rewrite (used for making prettier URLs) or mod_security (a simple security module) allow you to set up logfiles useful for debugging problems.

If you're happy that your setup is working correctly then you no longer need these debugging logfiles, so entries such as the following should be removed:

RewriteLog        /tmp/rewrite.log
SecFilterDebugLog /var/log/apache2/modsec_debug_log
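
A quick search will show whether any such entries are still active. This is a sketch: the function name is made up, it only knows about the two directives shown above, and it checks one file at a time.

```shell
#!/bin/sh
# Hypothetical helper: list uncommented RewriteLog and SecFilterDebugLog
# directives in the given config file, with their line numbers.
find_debug_log_directives() {
    grep -nEi '^[[:space:]]*(RewriteLog|SecFilterDebugLog)[[:space:]]' "$1"
}

# Example usage:
#   find_debug_log_directives /etc/apache2/apache2.conf
```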

Hopefully those small tips will allow you to set up your server to handle more load, and perform more efficiently if you get slashdotted.

If you're routinely suffering from lots of load these tips might not be so useful. Instead you might need to consider:

  • Having multiple webservers, each sharing the same common back end if you're using a database driven site.
  • Installing a web cache in front of your server to avoid the overhead of generating a lot of identical content to visitors.

Both of these solutions will ease the load on your servers, but they are overkill for smaller sites.

If you have any tips of your own to share feel free to leave them in the comments!

Re: Some simple Apache optimisations
Posted by Anonymous (81.158.xx.xx) on Mon 18 Jul 2005 at 13:15
There's a hack you can do with pre-compressing static data to save mod_gzip some work. You can set the apache config file to test if the client accepts gzip'ed data, and if it does, send it $request.gz instead of $request, if it exists. It's also possible to do various more advanced twiddles, like getting a 404 handler to do the compression for the first request so you don't need to remember.

Re: Some simple Apache optimisations
Posted by Steve (82.41.xx.xx) on Mon 18 Jul 2005 at 13:17
Would that involve using "Content Negotiation"?

I'd guess that the overhead of searching for matching documents might be almost enough to outweigh the benefit .. although without full details it's hard to test/know.

Steve
-- Steve.org.uk

Re: Some simple Apache optimisations
Posted by Anonymous (82.119.xx.xx) on Mon 18 Jul 2005 at 14:30
Another simple optimisation is disabling unused modules. I can see many webservers (even those providing webhosting services) with all default modules loaded. So the first thing I do is disable any unused modules. I don't know if it has any significant speed effect, but certainly apache requires less memory - which means you can run more apache processes. Because the worst thing for a webserver is when it starts to swap.

Re: Some simple Apache optimisations
Posted by Anonymous (213.164.xx.xx) on Tue 19 Jul 2005 at 06:53
Ouch. You don't "privatise" the ip addresses on your public statistics at all!

Re: Some simple Apache optimisations
Posted by Steve (82.41.xx.xx) on Tue 19 Jul 2005 at 08:16
Is there any point in doing so?

Steve
-- Steve.org.uk

Re: Some simple Apache optimisations
Posted by Anonymous (213.164.xx.xx) on Tue 19 Jul 2005 at 11:08
There must be - you do it here.

Re: Some simple Apache optimisations
Posted by Steve (82.41.xx.xx) on Tue 19 Jul 2005 at 12:28
Ahhh but here it's different, here I'm preventing people from seeing exactly which IP address is linked to each named account.

(Although that's a bit misguided in the case of anonymous users anyway).

In the server statistics that information isn't present - so the inclusion of "real" IP addresses doesn't involve any information leakage.

(And of course site administrators here see the full addresses...)

Steve
-- Steve.org.uk

Re: Some simple Apache optimisations
Posted by cvweiss (68.61.xx.xx) on Tue 19 Jul 2005 at 11:58
I'm surprised you didn't mention ramdrives. If the site's source isn't very large, you could copy it directly into the ramdrive or even just cramfs it.

Advantages: memory fast lookups for any file
Disadvantages: uses memory

Re: Some simple Apache optimisations
Posted by Anonymous (203.173.xx.xx) on Tue 19 Jul 2005 at 23:57
If you have enough RAM the kernel is going to have your docs cached anyway. Setting up a RAM disc is actually probably going to slow things down as you are essentially locking away memory that could be used to cache the documents that are *actually* being accessed.

The only time I have found RAM disks useful to boost performance -- and by RAM disc I do of course mean a tmpfs mount NOT old school RAM discs -- is when you need to write some temp files.

Read-only access is already fast.

Re: Some simple Apache optimisations
Posted by Steve (82.41.xx.xx) on Wed 20 Jul 2005 at 00:41
Agreed.

Although there is an experimental module for Apache, mod_mmap_static, which does cache specific files. It could be used if you were sure you wanted to cache a particular document, and it avoids the overhead of a RAM disk.

Steve
-- Steve.org.uk

Re: Some simple Apache optimisations
Posted by Anonymous (194.149.xx.xx) on Wed 1 Mar 2006 at 14:56
The problem with this approach on a 32-bit architecture is that even with a 4GB:4GB split and switched page tables you still can't mmap a full DVD ISO: you would run out of logical address space and be unable to fork/clone a new request handler. With a 64-bit architecture the limit is a bit further away. The idea of not adding another layer when the system already does the caching quite efficiently is tempting.

But remember that when you mmap too much data you could run out of address space for your temporary variables. You also need a file handle to be able to mmap...

Re: Some simple Apache optimisations
Posted by mgobetti (217.199.xx.xx) on Wed 7 Sep 2005 at 11:12
Hi, does anyone know where I can find something like php my web hosting? The project is now closed but I need something like that.
Thanks in advance, my e-mail is activty@sciarada.net

Re: Some simple Apache optimisations
Posted by Steve (82.41.xx.xx) on Wed 7 Sep 2005 at 11:14
You'd be better off asking elsewhere, like maybe the debian-user mailing list, or one of the relevant newsgroups.

Steve
--
