Make your own configuration deployment system, part 1

Posted by rossen on Mon 30 Jun 2008 at 20:24

In this series of articles, I describe the steps to making a flexible configuration deployment system tailored to your needs. It can be as simple or as complete as you care to make it. And since you made it, you can understand it intimately.

If you have two or more machines to manage, you have probably noticed that they have certain similarities of configuration.

These similarities may include

  • network configuration
  • basic package list
  • configurations of packages
  • aliases and shortcuts
  • internationalisation settings

You may have spent an enormous amount of time finding the ideal configuration for a piece of software and you would really regret losing your masterpiece in an unfortunate accident. Or you may need to rapidly deploy the same configuration change to a hundred machines. Or you may be simply tired of doing the same procedures every time you install a new machine.

A configuration deployment system can greatly reduce the amount of work necessary to manage two or more machines, but the amount of time necessary to learn the ins and outs of currently existing systems may be daunting. ISconf, FAI, cfengine, debconf+LDAP, Subversion, etc. all have their strong points, but if you are just getting started, they are probably overkill. One solution is to build your own system from scratch.

Essential components

The essential components of the system are:

  • a configuration repository, i.e. what to deploy: the database of configuration files, data, package lists, scripts, and jobs
  • a configuration transfer method, i.e. how to get the data to the clients
  • a collection of deployment scripts, i.e. how to apply the data to the clients

Depending on your needs, you can use solutions that overlap the boundaries of these functional divisions, or you can keep them strictly separate, which allows you to easily substitute methods or build on them if the need arises.

Configuration repository

You have a wide choice available for a configuration repository. Here is a non-exhaustive list of possibilities:

  • directories of files, one directory for each machine
  • directories of files, organised by classes
  • tarballs (or .deb or .rpm packages), one for each machine
  • versioning systems like CVS or Subversion
  • LDAP server
  • SQL database

There is a choice of media too. You can use a network-connected server or some removable media like a floppy, USB key, or a CDROM.

Note that one is not limited to one configuration repository - you can have multiple repositories, but you will have to make decisions about their priorities and what to do if a repository fails.

Configuration transfer method

You need a method to get your configuration from the repository to the client machines. This is somewhat determined by your choice of repository, but there is still some flexibility.

Here is yet another non-exhaustive list of common methods:

  • direct copy from removable media
  • direct copy from network-mounted share
  • rsync or scp or SSH
  • download from FTP or web server
  • versioning system check-out (CVS, Subversion, etc.)
  • transfer integrated into configuration-management software (cfengine)

And here are some more exotic methods for transferring configuration info:

  • POP or IMAP
  • LDAP query
  • SQL query
  • SNMP query
  • DHCP query (somewhat limited)
  • IRC download (think "botnet")
  • peer-to-peer (like Bittorrent)
  • DNS query (!)

Deployment scripts

After you get the configuration info to the client it must be used, but how? Again, you have a lot of flexibility.

  • Config files can simply be copied into place automatically, or first manipulated in a local workspace to resolve configuration priorities coming from several repositories and then finally copied into place.
  • Scripts can be used to automatically edit configuration files and registries using the new values of various parameters if a change is necessary.
  • Little jobs to check/signal/reload/restart daemons can be triggered if configuration changes.
  • Old config files can be backed-up before being over-written.
  • A configuration roll-back mechanism can be implemented.
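The "back up, copy, then trigger a job" idea from the list above can be sketched in a few lines of shell. This is only a minimal sketch: the function name install_if_changed, its argument order, and its return codes are inventions for illustration, not part of any existing tool.

```shell
#!/bin/sh
# Hypothetical helper: install_if_changed SRC DST BACKUP_DIR
# Copies SRC over DST only when the contents differ, keeping a
# time-stamped copy of the old DST in BACKUP_DIR first.
# Returns 0 if DST was changed, 1 if it was already up to date,
# 2 on error.
install_if_changed() {
    src=$1 dst=$2 backup_dir=$3

    [ -f "$src" ] || return 2

    # cmp -s is silent and exits 0 when the two files are identical.
    if [ -f "$dst" ] && cmp -s "$src" "$dst"; then
        return 1
    fi

    # Keep the old version before it is overwritten.
    if [ -f "$dst" ]; then
        mkdir -p "$backup_dir" || return 2
        cp -p "$dst" "$backup_dir/$(basename "$dst").$(date '+%Y%m%d%H%M%S')" || return 2
    fi

    cp -p "$src" "$dst" || return 2
    return 0
}
```

Because the function returns 0 only when the file really changed, a daemon restart can be chained onto it. The file and init script named here are just examples:

        install_if_changed /srv/cfg/site/etc/ntp.conf /etc/ntp.conf /var/backup/cfg && /etc/init.d/ntp restart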

Research and define your needs

One of the most thoroughly thought-out configuration systems is ISconf, found at www.isconf.org. ISconf is probably too complicated for a beginner and overkill for just a few systems, but the philosophy and history of the system are detailed at www.infrastructures.org and it is well worth the time to read the paper "Bootstrapping an Infrastructure" at http://www.infrastructures.org/papers/bootstrap/bootstrap.html.

Since I usually use Debian or Ubuntu, my preferred installation/configuration system is FAI, "Fully Automatic Installation", http://www.informatik.uni-koeln.de/fai/.

One of the sub-systems used by FAI is cfengine, www.cfengine.org, a self-contained high-level scripting language and configuration deployment system itself.

Before you reach for your favorite scripting language, think about what you want your system to manage now and in the future. A few hours of reading and reflection at this point could save a few false starts and re-inventions of the wheel.

  • contents of system config files only?
  • file permissions and ownerships?
  • user files too?
  • changes are fully automatic or just advisory?
  • push or pull?
  • polled or instantaneous changes?
  • logging?
  • backups?
  • roll-back capability?
  • multiple sources?
  • package management?
  • multiple distributions?
  • multiple OS?
  • how many sites?
  • integration with present systems?
  • preserve local admin changes?
  • bullet-proof or hackware?
  • cryptographically secured?
  • management interface other than the command-line+vi?
  • uploading of local changes?
  • confirmation of changes?

Hints and warnings

Organise your deployment by following the checklist at http://www.infrastructures.org/bootstrap/checklist.shtml. The principle is to always assemble the lowest-level infrastructure first in order to save time assembling the rest.

Make sure that everything in your DNS is complete and perfectly correct. A misspelling of a machine name or a false address will cause all sorts of time-wasting mysteries.

Use NTP to make sure every machine knows precisely what time it is; otherwise updates based on "make" or file time-stamps can fail in bizarre ways.

Decide on a method for dealing with local changes (AKA cowboy admins). You might consider strictly forbidding local changes to configuration like Infrastructures.Org and FAI recommend.

Install integrit or some other file-system integrity checker and tune it so that configuration changes are obvious. That is, tune it to ignore files that are expected to change so that the reports are always tiny.

Simple examples

Here are some simple examples of configuration deployment systems. For small networks composed of a small number of more-or-less identical machines all on one site, these examples may be all that you need. The examples also illustrate how the functions of configuration repository, transfer, and deployment scripts can overlap.

Simple recursive copy

Assume that you have a directly-accessible repository directory /srv/cfg/site/etc. It contains only /etc files that are valid for every machine at your site, e.g. /etc/resolv.conf, /etc/hosts. To deploy these files, just copy them recursively into place using the GNU "cp" command and its "-a" or "--archive" option to preserve modification time, ownerships, and permissions:

        cp -a /srv/cfg/site/etc -T /etc

There are a few problems with the above example. Firstly, the files will be copied every time the command is run even if the source and target files are already identical. Apart from being inefficient, this might cause file integrity systems (like integrit) to trigger a useless warning. Secondly, if modifications were made to the files in /etc but the repository was not updated, the changes will be wiped out without a backup. Nevertheless, if your needs are simple and you intend to manually run the command only on the rare occasions that there is a change, this may be all that you need.
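Since a plain "cp -a" overwrites without asking, it can be reassuring to see beforehand which files would actually change. Here is a small sketch of such a preview; the function name list_changed is hypothetical, and it assumes that no file name contains a newline.

```shell
#!/bin/sh
# Hypothetical helper: list_changed REPO_DIR TARGET_DIR
# Prints the path of every file under TARGET_DIR that a copy from
# REPO_DIR would create or modify. Files that are already identical
# are not listed.
list_changed() {
    repo=$1 target=$2

    # Walk the repository with paths relative to its root.
    ( cd "$repo" && find . -type f ) | while IFS= read -r rel; do
        rel=${rel#./}
        # cmp exits non-zero if the files differ or the target is missing.
        if ! cmp -s "$repo/$rel" "$target/$rel" 2>/dev/null; then
            printf '%s\n' "$target/$rel"
        fi
    done
}
```

With the example repository it would be called like this, and an empty output means that the copy would change nothing:

        list_changed /srv/cfg/site/etc /etc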

Congratulations - you are done.

Simple recursive update (based on file mod time) with backups

GNU cp has two options that are interesting: the "-u" or "--update" option, which copies a source file only if it is newer than the target file or the target file is missing, and the "-b" or "--backup" option, which makes a single or incrementally-numbered backup of the target file if a copy is done. Here is how they might be used:

        cp -u -a --backup=numbered /srv/cfg/site/etc -T /etc

This method has problems too. You end up with /etc directories cluttered with backup files with names like "hosts.~4~" that need to be dealt with. And if one of your target files is touched, which changes the modification timestamp, the cp will not copy the source to the target since the target is newer. This is a problem if all machines are supposed to be always using the canonical configuration file from the repository. Local administrators might consider this problem to be a feature and not a bug.
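One way to deal with the clutter is to sweep the numbered backups out of /etc from time to time. The following is a sketch only; both function names are hypothetical, and DIR is expected to be given without a trailing slash.

```shell
#!/bin/sh
# Hypothetical helper: list_numbered_backups DIR
# Prints every file under DIR whose name ends in the ".~N~" suffix
# that GNU cp --backup=numbered appends (for example "hosts.~4~").
list_numbered_backups() {
    find "$1" -type f -name '*.~[0-9]*~'
}

# Hypothetical helper: sweep_numbered_backups DIR ARCHIVE_DIR
# Moves those backups out of DIR into ARCHIVE_DIR, recreating the
# relative directory layout so files with the same name stay apart.
sweep_numbered_backups() {
    dir=$1 archive=$2
    list_numbered_backups "$dir" | while IFS= read -r f; do
        rel=${f#"$dir"/}
        mkdir -p "$archive/$(dirname "$rel")" && mv "$f" "$archive/$rel"
    done
}
```

For example, with an archive directory chosen only for illustration:

        sweep_numbered_backups /etc /var/backup/etc-old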

Simple recursive update (based on contents) with backups

Ideally, the updates should be based upon the files' contents, not their modification times. By default rsync will update only files with differing mod times or sizes, but it can be told to ignore these checks and look at file contents with the "-I" (or "--ignore-times") and "-c" (or "--checksum") options. In addition, one can specify a separate directory for keeping backed-up files:

        rsync -I -c -a --backup --backup-dir=/var/backup /srv/cfg/site/etc/ /etc
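Before letting rsync loose on /etc, it may be worth asking it what it would do. The sketch below adds the "-n" (or "--dry-run") and "-i" (or "--itemize-changes") options to the same comparison flags; the function name preview_sync is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: preview_sync SRC_DIR DEST_DIR
# Shows what the content-based update would do without touching DEST_DIR.
#   -n  (--dry-run)          make no changes
#   -i  (--itemize-changes)  print one line per item that would be updated
#   -I -c -a                 same comparison and archive flags as above
# A trailing slash is forced on both paths so that the contents of
# SRC_DIR, not the directory itself, are compared with DEST_DIR.
preview_sync() {
    rsync -n -i -I -c -a "${1%/}/" "${2%/}/"
}
```

Lines of output that begin with ">f" name files whose contents would be transferred:

        preview_sync /srv/cfg/site/etc /etc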

Simple recursive update from a remote repository with date-organised backups

Of course rsync has extra features that make it the ideal simple configuration deployment tool. It has remote file-transfer capabilities that can be used to solve the problem of access to the configuration repository if it is on another machine in your network instead of some locally-accessible media.

Assume that "cfg" is the name (or even better, a DNS alias) for the configuration repository machine and we want to save backups of local files that get replaced into directory hierarchies organised by date (and time, if you need). The configuration deployment commands could be:

        bd=/var/backup/cfg/$(date '+%Y/%m/%d'); mkdir -p "$bd"
        rsync -I -c -a --backup --backup-dir="$bd" root@cfg:/srv/cfg/site/etc/ /etc
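If the deployment might run more than once a day, the time can be added to the backup path so that each run keeps its replaced files apart. A minimal sketch, with a hypothetical function name and an arbitrary choice of time format:

```shell
#!/bin/sh
# Hypothetical helper: new_backup_dir BASE
# Creates BASE/YYYY/MM/DD/HHMMSS and prints its path, so that two runs
# on the same day do not share a backup directory.
new_backup_dir() {
    bd=$1/$(date '+%Y/%m/%d/%H%M%S')
    mkdir -p "$bd" && printf '%s\n' "$bd"
}
```

It would slot into the commands above like this:

        bd=$(new_backup_dir /var/backup/cfg) || exit 1
        rsync -I -c -a --backup --backup-dir="$bd" root@cfg:/srv/cfg/site/etc/ /etc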

Simple recursive update from multiple remote repositories with date-organised backups

So far, we have only been retrieving site-wide /etc files. It is highly probable that we want to add useful files to /usr/local/{bin,sbin}, /root, and other directories. And we probably want to manage customisations that are valid only for a particular machine. The structure of our configuration repository on "cfg" might look like this:

/srv/cfg/site/
/srv/cfg/site/etc/
/srv/cfg/site/etc/hosts
/srv/cfg/site/etc/resolv.conf
...
/srv/cfg/host01/
/srv/cfg/host01/etc/
/srv/cfg/host01/etc/network/
/srv/cfg/host01/etc/network/interfaces
...
/srv/cfg/host02/
...

Here are the deployment commands to run on host01, host02, etc.:

        bd=/var/backup/cfg/$(date '+%Y/%m/%d'); mkdir -p "$bd"
        rsync -I -c -a --backup --backup-dir="$bd" root@cfg:/srv/cfg/site/ /
        rsync -I -c -a --backup --backup-dir="$bd" root@cfg:/srv/cfg/"$(hostname)"/ /
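The two rsync calls follow a pattern: apply one layer after another, with later layers overriding earlier ones. If more layers are ever needed (a class of machines, for instance), the pattern can be wrapped in a loop. This is a sketch under assumptions: the function name deploy_layers and its argument order are hypothetical, and BACKUP_DIR should be an absolute path.

```shell
#!/bin/sh
# Hypothetical helper: deploy_layers REPO TARGET BACKUP_DIR LAYER...
# Applies each LAYER directory under REPO onto TARGET in the order
# given, so that later layers (e.g. the host) override earlier ones
# (e.g. the site). Stops at the first layer that fails.
# REPO may be a local path or an rsync remote such as root@cfg:/srv/cfg.
deploy_layers() {
    repo=$1 target=$2 bd=$3
    shift 3

    mkdir -p "$bd" || return 1

    for layer in "$@"; do
        rsync -I -c -a --backup --backup-dir="$bd" \
            "$repo/$layer/" "${target%/}/" || return 1
    done
}
```

The commands above would then become:

        bd=/var/backup/cfg/$(date '+%Y/%m/%d')
        deploy_layers root@cfg:/srv/cfg / "$bd" site "$(hostname)"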

What next?

Part 2 of this series will probably deal with writing helper tools, for example a script to easily check files from the client into the configuration repository. If there is interest in this article, the direction of the series will be determined in part by any questions that are posed.

About this document

URL: http://www.rtfm-sarl.ch/articles/configuration-deployment-p1.txt

HTML-conversion: txt2html --titlefirst --noanchors --preformat_trigger_lines 1 configuration-deployment-p1.txt > configuration-deployment-p1.html

Title: Make your own configuration deployment system, part 1

Version: 2008-06-27-001

Author: Erik Rossen <rossen@rossen.ch>

Licence: Creative Commons Attribution-Share Alike 2.5 Switzerland, http://creativecommons.org/licenses/by-sa/2.5/ch/

Re: Make your own configuration deployment system, part 1
Posted by Anonymous (190.137.xx.xx) on Tue 1 Jul 2008 at 23:50
Have you ever used Puppet? It's said to be the evolution of cfengine.

Best regards,
Lucas

Re: Make your own configuration deployment system, part 1
Posted by rossen (212.147.xx.xx) on Wed 2 Jul 2008 at 07:18
I have heard of Puppet and read some introductory articles, but I have not felt any compelling reason to switch from cfengine and various home-made scripts.

Part of the reason for this is that I often find myself working for clients who are extremely conservative and it is difficult to convince them to start using an automatic configuration deployment system.

In almost all cases these people started building their infrastructures one PC at a time and that is how they are used to managing things. Often they have quite useful collections of scripts to do everything that they need, but they just lack the courage to make the leap to centralising their collection in order to rapidly and consistently deploy it.

Puppet looks fine and I might decide to switch to it in a few years. The point is that when I make the decision, the roll-out will be very rapid because the sites that I manage will already have a system (or many systems) for deployment in place.

Re: Make your own configuration deployment system, part 1
Posted by AJxn (90.227.xx.xx) on Mon 28 Jul 2008 at 13:34
Or have a look at FAI. It lets you run any script you write. It uses "classes" to select which scripts to run.

Re: Make your own configuration deployment system, part 1
Posted by mwr (97.81.xx.xx) on Wed 2 Jul 2008 at 04:15
Commence the deluge of Puppet fanboys (myself included). Shameless plug for my infrastructure management pages, too.

Three main points to add here:

  1. I suspect by the time a configuration management toolset has been developed from scratch (including package management, users, cron jobs, config files, and which services should be running or disabled), it'll be fairly complex. Not quite as complex as a premade configuration management system, but close.
  2. If one has prebuilt packages for a particular management system, doing simple things shouldn't be much more complex than in the from-scratch version. Example: Steve's Puppet intro. You trade off some hassle in key-signing for other hassles in getting rsync secured against unauthorized access and running without passwords. It may approximately even out.
  3. Small homogeneous infrastructures rarely remain small and homogeneous. One big advantage to Puppet, at least, is the resource abstraction library that lets you define platform-neutral resources in a platform-neutral language. That is, ssh on Debian, Redhat, Solaris, and other systems all operate basically the same way: the same type of configuration files (in different locations), a need to run a service (with differing names, and run via sysvinit, svcadm, or whatever), and so on. Puppet lets you change the specific behavior depending on the OS or other characteristics of the client system (example here).

Re: Make your own configuration deployment system, part 1
Posted by Anonymous (69.66.xx.xx) on Tue 15 Jul 2008 at 21:01
Since we already have Svn set up for tracking in-house application development, I'd like to extend this to storing configuration files.

The repository would look something like this.

-- inbound_mail
   |-- trunk
   |   |-- ldap
   |   |   |-- slapd.conf
   |   |-- scripts
   |   |   |-- count_msgs_in_queue.sh
   |   |-- sendmail
   |       |-- sendmail.mc
   |-- branches
       |-- dev

Deployment would mean running a script via ssh on the remote server that would check out the configuration files from SVN and reload the associated server daemon.

Any suggestions or reasons not to go with this kind of setup?

Re: Make your own configuration deployment system, part 1
Posted by rossen (212.147.xx.xx) on Sat 19 Jul 2008 at 12:31
You might consider structuring your repository like the standard Unix file hierarchy. That way you can easily deploy stuff outside of /etc (e.g. into /usr/local/bin) without a bunch of commands to copy everything into the correct places. Of course, if you are deploying to many different distributions, this might not be a great advantage.

etckeeper (Re: Make your own configuration deployment system, part 1)
Posted by PaulePanter (85.178.xx.xx) on Fri 1 Aug 2008 at 10:04
Dear everybody,


Have you heard of etckeeper [1][2]? It helps you to keep track of /etc with the help of a VCS.


Thanks,

Paul


[1] http://kitenet.net/~joey/code/etckeeper/
[2] aptitude show etckeeper

Re: Make your own configuration deployment system, part 1
Posted by Mrfai (87.79.xx.xx) on Tue 15 Mar 2011 at 18:52
The FAI project has a new home page:

http://fai-project.org
