Cloning a Debian Etch system for redundancy

Posted by dldirector on Mon 21 Jan 2008 at 11:07

I am responsible for a production web server that is critical to our clients and is the bread and butter of our company. We have colocated the server for reliable power, A/C and Internet connectivity, as well as cost-effective high bandwidth. This article describes how we maintain redundancy by keeping an identical standby machine configured as a copy of the production server.

For real peace of mind, we lease two identical server boxes from our colocation provider. With a "private rack" option, the two machines can be configured with Internet addresses from the same subnet, so that one can easily take over for the other. In addition, the two machines are connected via a private local network, which is handy for mirroring.

For this takeover to be useful, the standby machine needs to be a relatively current copy of the production machine. It turns out that this is a fairly simple three-step process, but step 2 is not obvious. Here is the process that we have recently developed and tested.

Our configuration is two identical machines: A, the production server, and B, the standby server. Each machine has two identical 160 GB disks, no RAID. Both machines are running Debian Etch. Machine A Disk 1 is the live production system, where content is constantly updated. Machine B Disk 2 holds a separate installation of Debian Etch, which is what Machine B normally runs; I consider it a maintenance-mode system. Machine B Disk 1 receives the clone of the production system. The default boot configuration for Machine B is to boot from Disk 2.

1. A shell script uses rsync to copy the root partition (in our case the only partition) of the production Machine A Disk 1 to Machine B Disk 1. This is a crontab entry on Machine B; I think of it as a content pull, and it runs twice daily. The local network is used for this update.

It is important to use the --hard-links and --one-file-system switches to rsync, so that hard links are maintained and the separately mounted /proc and /dev file systems are not copied. With the transition to "udev" on Debian systems like Etch, the /dev directory is now virtual and dynamic. What we want is a copy of the "static" /dev directory as it exists on the disk, not as it is seen in the running system. This is solved by step 2.

2. On a live running system, there is a directory called /dev/.static/dev which appears to be the static /dev directory as it appears on the disk image. So all we have to do for this step is rsync from Machine A Disk 1 /dev/.static/dev to Machine B Disk 1 /dev ( perhaps at /mnt/other/dev ).

3. Finally a little housekeeping. I changed the file /boot/grub/menu.lst on Machine B Disk 2 ( NOTE: that is the maintenance mode system, not the cloned system ) to have a new entry labeled "standby" or something similar with the appropriate information to boot Disk 1. Implied of course is that the default grub configuration on this machine is boot Disk 2 which has the maintenance version of the OS.
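
For illustration, a menu.lst stanza of the kind step 3 describes could be generated with a small shell function. This is only a sketch for GRUB legacy: the function name standby_entry is made up, and the GRUB device, kernel path, root device and initrd path passed to it are placeholders that must be replaced with the values for your own Disk 1.

```shell
#!/bin/sh
# Print a GRUB legacy (menu.lst) stanza that boots the cloned system on Disk 1.
# All four arguments are site-specific; none of them come from the article.
standby_entry() {
    grub_root=$1   # GRUB device holding the kernel, e.g. "(hd0,0)"
    kernel=$2      # kernel path as seen by GRUB
    root_dev=$3    # Linux device of the cloned root filesystem
    initrd=$4      # initrd path as seen by GRUB
    cat <<EOF

title       standby
root        $grub_root
kernel      $kernel root=$root_dev ro
initrd      $initrd
EOF
}
```

After checking the output, it could be appended on the maintenance system with something like: standby_entry "(hd0,0)" /boot/vmlinuz-VERSION /dev/sda1 /boot/initrd.img-VERSION >> /boot/grub/menu.lst (every value here is hypothetical). The default entry is left alone, so Machine B still boots Disk 2 unless the "standby" entry is chosen by hand.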

In addition, before booting the "standby" version, I like to change the hostname in Machine B Disk 1 to keep the name that I use for Machine B, so that I don't get confused when rebuilding or repairing the downed production server. The shell prompt shows the hostname. The names the server responds to in Apache are domain names not the localhost name, so as a web server, things look the same. I also edit Machine B Disk 1 /etc/network/interfaces so that Machine B keeps the same local network address which I want to follow the hostname. The outside Internet IP address will remain cloned from Machine A.
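
The hostname change can be scripted. The sketch below assumes the cloned Disk 1 is mounted somewhere on the maintenance system (the mount point is passed in as an argument); the function name set_clone_hostname is hypothetical.

```shell
#!/bin/sh
# Write Machine B's own hostname into the cloned image so the shell prompt
# still identifies the standby box after it boots the clone.
# image_root is the directory where the cloned Disk 1 is mounted.
set_clone_hostname() {
    image_root=$1
    new_name=$2
    if [ ! -d "$image_root/etc" ]; then
        echo "no etc directory under $image_root; is the image mounted?" >&2
        return 1
    fi
    printf '%s\n' "$new_name" > "$image_root/etc/hostname"
}
```

Because every rsync pull copies Machine A's /etc/hostname over the clone again, a call such as set_clone_hostname /mnt/other machine-b (names hypothetical) would have to run after the last sync and before booting the standby entry. The /etc/network/interfaces edit is site-specific and is left as a manual step here.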

If there is a failure of Machine A, I reboot Machine B selecting the grub entry for Disk 1, and Machine B takes over with current or nearly current content.

This could also be done for "hot standby" using Heartbeat and a remote monitoring computer. But for now, I will go with reboot required by a human.

 

 


Re: Cloning a Debian Etch system for redundancy
Posted by Anonymous (67.152.xx.xx) on Mon 21 Jan 2008 at 16:38
Your Step 2:
>So all we have to do for this step is rsync from Machine A Disk 1
>/dev/.static/dev to Machine B Disk 1 /dev ( perhaps at /mnt/other/dev ).

could be a lot clearer. Does the --one-file-system flag do this? Or how would I go about making the /dev/ and /proc/ directories function properly when rsyncing? Currently we just exclude /proc/, /dev/ and /tmp/, but if there is a better way I wish I knew it...

Re: Cloning a Debian Etch system for redundancy
Posted by dldirector (69.17.xx.xx) on Mon 21 Jan 2008 at 17:01
Sorry for the lack of clarity. I have a clarification and a correction. In the article I said that my system is on only one partition. In fact, there is a separate boot partition. Here is a sample shell script that can be used to copy the root partition, the boot partition, and the /dev items.
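#!/bin/sh
# Pull a copy of the production server onto the standby disk.
# Run as root on Machine B with the target partitions already mounted.

# Placeholder: replace with the production server's address.
SERVER=root@production.server.com

# --one-file-system keeps rsync from descending into other mounted file
# systems such as /proc, /dev and the separate /boot partition.
# --devices and --specials are needed so that the device nodes under
# /dev/.static/dev are copied rather than skipped.
# OPTS is deliberately expanded unquoted below so it splits into words.
OPTS="  --recursive --times --perms --owner --group \
        --links --hard-links --devices --specials \
        --one-file-system --delete \
        --stats --rsh=/usr/bin/ssh"

# copy system image to partition mounted on /mnt/sysimage
FROMDIR=/
TODIR=/mnt/sysimage
rsync $OPTS "$SERVER:$FROMDIR" "$TODIR"

# copy boot image to partition mounted on /mnt/bootimage
FROMDIR=/boot/
TODIR=/mnt/bootimage
rsync $OPTS "$SERVER:$FROMDIR" "$TODIR"

# copy the disk image version of /dev
FROMDIR=/dev/.static/dev/
TODIR=/mnt/sysimage/dev
rsync $OPTS "$SERVER:$FROMDIR" "$TODIR"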

Hope this helps. Thanks for the comment.
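
Since the article runs this pull from cron twice a day, one thing worth guarding against is a slow run overlapping the next one. A minimal sketch of such a guard follows; the function name run_exclusive, the lock path and the script paths in the comments are all hypothetical.

```shell
#!/bin/sh
# Run a command only if no earlier run still holds the lock.
# mkdir is atomic, so a directory serves as a simple lock in plain sh.
run_exclusive() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "previous run still active ($lockdir exists); skipping" >&2
        return 1
    fi
    "$@"
    rc=$?
    rmdir "$lockdir"
    return $rc
}

# Hypothetical use inside /usr/local/sbin/pull-production.sh:
#   run_exclusive /var/lock/pull-production.lock /usr/local/sbin/clone-rsync.sh
# Hypothetical line in root's crontab on Machine B (twice daily):
#   30 1,13 * * * /usr/local/sbin/pull-production.sh
```

If the machine is rebooted in the middle of a pull, the stale lock directory has to be removed by hand before the next run will start.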

Re: Cloning a Debian Etch system for redundancy
Posted by Anonymous (217.216.xx.xx) on Tue 22 Jan 2008 at 03:03
Hi dldirector, I have a couple of questions:

- Why do you have just two partitions (/boot and /)? Is there any particular reason?

- Why did you decide to share, for instance, /var/log? Or are the logs 'travelling' to another machine?

Thanks in advance.

Re: Cloning a Debian Etch system for redundancy
Posted by dldirector (69.17.xx.xx) on Thu 24 Jan 2008 at 15:46
I find simpler is better, and if I don't have a good reason for wanting more than one or two partitions, I just use one (or two). If I am lazy during a Debian install, I sometimes let it partition the disk and end up with several.

I share or copy /var/log and everything else for that matter, because in this case I am trying to make a backup server, with everything that existed on the original server.

Re: Cloning a Debian Etch system for redundancy
Posted by Anonymous (118.90.xx.xx) on Mon 18 Aug 2008 at 13:55
From http://packages.debian.org/changelogs/pool/main/u/udev/udev_0.125-5/changelog#versionversion0.124-1:

udev (0.124-1) unstable; urgency=low
...
* Removed the /dev/.static/dev/ hack. It was cool, but its complexity
is not justified anymore. (Closes: #444337, #481559)
...

Thus you can skip the advice about rsyncing /dev/.static/dev
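
A sync script that has to cope with both cases could test for the directory first. The following is a sketch only; the function name static_dev_dir is made up, and it checks a local root, so on the pulling side the equivalent test would have to be run on the production server, for example over ssh.

```shell
#!/bin/sh
# Print the directory that holds the on-disk /dev of the system rooted at $1
# (default /). Etch-era udev exposes it as /dev/.static/dev; where that
# directory is absent, print nothing and return 1 so the caller can skip
# the extra rsync step.
static_dev_dir() {
    root=${1:-/}
    candidate=${root%/}/dev/.static/dev
    if [ -d "$candidate" ]; then
        printf '%s/\n' "$candidate"
    else
        return 1
    fi
}
```

A pull script might wrap the third rsync in a check such as: if ssh "$SERVER" test -d /dev/.static/dev; then ...; fi, where SERVER is the variable from the script earlier in this thread.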

Re: Cloning a Debian Etch system for redundancy
Posted by rak (164.73.xx.xx) on Mon 21 Jan 2008 at 21:23
Did you evaluate heartbeat or something similar?
I'm about to begin a similar project for my employer and would like some information on the topic. The procedure you describe looks simple, though my own plan leaned more toward using that software.

Re: Cloning a Debian Etch system for redundancy
Posted by dldirector (69.17.xx.xx) on Mon 21 Jan 2008 at 22:07
I have installed the Debian package heartbeat in order to get the program send_arp. Send_arp solves a problem that I have experienced before when having one machine take over an IP address from another. If this portable IP address is not the primary IP, it is not used on outgoing packets, so ARP tables don't get updated right away and packets can be lost as they continue to be routed to the network card that previously "owned" the IP address. Send_arp can be used to force immediate ARP updates. As far as using heartbeat, I want to maintain manual control of the switchover for now. As the system matures, and I become more comfortable that the new setup works as desired, I may indeed consider using heartbeat to automate the failover.
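
As a rough illustration of forcing that ARP update, the sketch below sends unsolicited (gratuitous) ARP with arping from the iputils package. Note that this is a different tool from heartbeat's send_arp, whose arguments are not reproduced here; other arping implementations use different flags. The function name announce_ip, the DRY_RUN variable, the interface and the address are all assumptions.

```shell
#!/bin/sh
# Announce that this machine now owns an IP address by sending unsolicited
# (gratuitous) ARP, so neighbours update their ARP caches.
# Uses iputils arping: -U unsolicited mode, -I interface, -c packet count.
# With DRY_RUN=1 the command is printed instead of executed.
announce_ip() {
    iface=$1
    ip=$2
    count=${3:-3}
    set -- arping -U -I "$iface" -c "$count" "$ip"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

Running it with DRY_RUN=1 first, for example announce_ip eth0 192.0.2.10, shows the exact command before anything is sent on the wire; the real call needs root.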

Re: Cloning a Debian Etch system for redundancy
Posted by botox (91.17.xx.xx) on Tue 22 Jan 2008 at 07:25
If you just want to send the ARP packets, you should give the fake package a try.

http://packages.debian.org/etch/fake

Re: Cloning a Debian Etch system for redundancy
Posted by rak (190.64.xx.xx) on Tue 22 Jan 2008 at 16:51
Another option is to give the backup host's interface the same MAC address as the production host, either with ifconfig or by setting it in the /etc/network/interfaces file.
For example:

# ifconfig eth0 down
# ifconfig eth0 hw ether 00:80:48:BA:d1:20
# ifconfig eth0 up
# ifconfig eth0 |grep HWaddr


Taken from http://linuxhelp.blogspot.com/2005/09/how-to-change-mac-address-of-your.html

Cheers

Re: Cloning a Debian Etch system for redundancy
Posted by botox (91.89.xx.xx) on Tue 22 Jan 2008 at 17:00
That could be a problem if the switch remembers which MAC address was on a specific port. As far as I know, the most reliable procedure for sensitive environments is gratuitous ARP.

Re: Cloning a Debian Etch system for redundancy
Posted by rak (200.40.xx.xx) on Tue 22 Jan 2008 at 18:05
Yup, I didn't consider the switch's memory; that could cause some trouble.

Re: Cloning a Debian Etch system for redundancy
Posted by stoffell (81.165.xx.xx) on Tue 22 Jan 2008 at 21:14
Great that you shared how you solved this issue!

I just want to add a few extra things to consider.

Using drbd is also an option; it can be combined with heartbeat, etc.

I have also used OpenVZ for something like what you posted (taking a nightly dump of a virtual machine and copying the dump file to a different server, where it can be restored or just archived).

---
stoffell

Re: Cloning a Debian Etch system for redundancy
Posted by Anonymous (86.7.xx.xx) on Sun 17 Feb 2008 at 00:42
This confuses me. I cannot understand why you include /proc and /dev. Since these are maintained by the running kernel, how can rsyncing them from one machine to another be a good thing?
In my understanding of Linux, if the two machines use the same hardware, those directories will automatically be very similar on both machines. In fact, they would be identical except for things that should be different, e.g. files that contain packet counts, MAC addresses, etc.

Re: Cloning a Debian Etch system for redundancy
Posted by dldirector (64.81.xx.xx) on Sun 17 Feb 2008 at 13:50
I think you misunderstood. /proc and /dev look like mounted file systems. In rsync the --one-file-system switch means DON'T copy other file systems, i.e., don't copy /dev or /proc. In fact, figuring out the correct way to handle /dev was the biggest part of this puzzle, and finding the solution was what motivated me to document it in an article. Hope this helps.
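
To see roughly which directories that switch will leave out on a given machine, the mount table can be listed. This is an approximation under stated assumptions: it simply prints every mount point other than / from /proc/mounts (or from a file in the same format), and the function name skipped_mounts is made up.

```shell
#!/bin/sh
# List mount points other than / from a mounts table (default /proc/mounts).
# These are, roughly, the directories that rsync --one-file-system will not
# descend into when the copy starts at /.
skipped_mounts() {
    mounts_file=${1:-/proc/mounts}
    awk '$2 != "/" { print $2 }' "$mounts_file" | sort -u
}
```

On a system with udev and a separate boot partition, /dev, /proc and /boot would be expected in the output, which is why the boot partition and the static /dev need their own rsync runs.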

Re: Cloning a Debian Etch system for redundancy
Posted by Anonymous (157.161.xx.xx) on Mon 22 Nov 2010 at 11:41
This will not work on Lenny (Debian 5). The directory "/dev/.static/dev" does not exist.

Apart from that, I'm not sure this is a good way to sync two systems, as many users comment here...

Re: Cloning a Debian Etch system for redundancy
Posted by dldirector (69.17.xx.xx) on Mon 22 Nov 2010 at 15:04
When I started doing this in 2007, my crystal ball wasn't working, so I couldn't check compatibility with Lenny ;)
