An introduction to the attic backup program

Posted by Steve on Tue 23 Dec 2014 at 11:09

Over the past year or two several new backup utilities have become popular. These new tools tend to avoid the use of tar.gz files, and instead store backups as collections of files. Usually these storage areas are incrementally updated and avoid wasting space thanks to the detection of duplicate content. Here we're going to look at one of them in particular: attic.

Of the recent tools (obnam, bup, etc.), attic is the one that I'm using myself. It supports backing up content either locally or over an SSH connection to a remote system. There is also support for encryption, so that remote backups don't leak your passwords, etc.

Installation of attic on Jessie

As attic is included in the Jessie/testing release of Debian GNU/Linux you can install it easily:

~ # aptitude install attic

Installation of attic on Wheezy

Unfortunately the Wheezy release of Debian GNU/Linux didn't contain a copy of the package, and there is no available backport yet.

However installation isn't impossible, and is in fact pretty simple if you're willing to take the time. The biggest part of the installation will be enabling the use of the Debian backports system.

To get started install the following packages as you would expect:

~ # apt-get install python3 python3-dev python3-llfuse libacl1-dev 

Now enable the Debian backports:

~ # echo 'deb http://http.debian.net/debian wheezy-backports main' \
   > /etc/apt/sources.list.d/backports.list
~ # apt-get update

At this point you can install the backported msgpack library:

~ # apt-get install -t wheezy-backports python3-msgpack

Finally we can download the most recent release of attic and build it in a static location:

~ # wget https://pypi.python.org/packages/source/A/Attic/Attic-0.14.tar.gz
~ # tar zxf Attic-0.14.tar.gz
~ # cd Attic-0.14

~/Attic-0.14# python3 setup.py install --prefix=/opt/attic
...

If all goes well you'll find you have a populated /opt/attic directory which is ready and waiting for you to use. To ease working with this non-standard location I'd recommend the following wrapper:

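~ # cat > /usr/local/bin/attic <<'EOF'
#!/bin/bash
# Wrapper for the attic installed under /opt/attic.

PYTHONPATH=/opt/attic/lib/python3.2/site-packages/
export PYTHONPATH

# "$@" passes every argument through intact, even ones containing spaces.
exec python3 /opt/attic/bin/attic "$@"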
EOF
~ # chmod 755 /usr/local/bin/attic

If this works you should find that you have a working attic command which you can test:

~ # /usr/local/bin/attic --help
usage: attic [-h]

             {serve,init,check,change-passphrase,create,extract,delete,list,mount,info,prune,help}
             ...

Attic 0.14 - Deduplicated Backups
..

Using attic to create a backup

Attic uses a notion of a "repository" which is used to host a collection of backups. While this repository could be local it might also be stored on a remote host (which is reachable via ssh, providing the remote host also has attic installed).

I tend to create the non-standard /backups directory to contain my backups locally, then arrange for them to be copied offsite using rsync. So we'll first create the directory, then initialize a repository, with the following two commands:

~# mkdir /backups
~# attic init /backups/backups.attic
Initializing repository at "/backups/backups.attic"
Encryption NOT enabled.
Use the "--encryption=passphrase|keyfile" to enable encryption.
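
The offsite copy mentioned above could look something like the following. This is only a sketch: the destination in OFFSITE is a made-up placeholder, and the sync_offsite function name is mine, not part of attic. It should be run once a backup has finished, so that rsync never copies a repository that is being written to.

```shell
#!/bin/sh
# Sketch: mirror the local repository directory to an offsite host.
# OFFSITE is a hypothetical destination; substitute your own user, host and path.
REPO_DIR="${REPO_DIR:-/backups/}"
OFFSITE="${OFFSITE:-backup@offsite.example.com:/srv/backups/}"
RSYNC="${RSYNC:-rsync}"

sync_offsite() {
    # -a preserves permissions and timestamps; --delete removes files on the
    # far side that no longer exist locally (for example after a prune).
    "$RSYNC" -a --delete "$REPO_DIR" "$OFFSITE"
}
```

Calling sync_offsite at the end of a backup job keeps the remote copy identical to /backups/.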

Now we have a repository we can create our first backup. Ordinarily I'd backup the full system:

~# attic create /backups/backups.attic::$(date +%Y-%m-%d-%H:%M:%S) \
     --exclude=/proc \
     --exclude=/sys  \
     --exclude=/dev \
     --exclude=/backups \
     /
Initializing cache...
...

This backup will take a while to complete if your system has a lot of space consumed. In my case I'm testing this on a brand new virtual machine so the full backup only involves archiving around 700MB of data and it completes reasonably quickly.

Because attic will remove duplicate data, and only back up things that have changed since the previous run, additional backups will be quicker and will not necessarily increase the size of the backup repository.

In our case we can see the size of the repository, first listing available backups, then looking at the statistics of the only item we've found:

~# attic list /backups/backups.attic
2014-12-20-19:54:48                  Sat Dec 20 19:56:22 2014

~# attic info  /backups/backups.attic::2014-12-20-19:54:48
Name: 2014-12-20-19:54:48
Fingerprint: 7f2d2014c9ba4cc7a0b8b536da0555719ac881662ae8e63c99feae98b8bb77cd
Hostname: attic.default.skx.uk0.bigv.io
Username: root
Time: Sat Dec 20 19:56:22 2014
Command line: /opt/attic/bin/attic create /backups/backups.attic::2014-12-20-19:54:48 --exclude=/proc --exclude=/sys --exclude=/dev --exclude=/backups /
Number of files: 21239

                       Original size      Compressed size    Deduplicated size
This archive:              710.55 MB            323.47 MB            311.26 MB
All archives:                    0 B                  0 B            311.26 MB

As you can see the first backup had to cope with 710MB of data, which compressed down to 323MB, and deduplication trimmed that to 311MB on disk. Not bad.

Rather than backing up the full filesystem you might prefer to back up just a few directories, such as /etc, /root, and /home. That is done as you would expect:

~# attic create /backups/backups.attic::$(date +%Y-%m-%d-%H:%M:%S) \
     /etc  \
     /home \
     /root

Restoring Files

We've seen how to backup a full system locally. Now let us try restoring a file.

The restoration involves the use of the "extract" command, and you need to give the backup name and the file to be restored. Let us delete a couple of files and then restore them (I suggest you don't play along at home here):

~# rm /etc/motd  /etc/shadow
~# cd /
/# grep root /etc/shadow
grep: /etc/shadow: No such file or directory

/ # attic extract /backups/backups.attic::2014-12-20-19:54:48 etc/shadow etc/motd
/ # grep root /etc/shadow
root:not.my.real.hash:16424:0:99999:7:::

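That was pretty simple. The only caveat is that archived paths are stored without the leading "/" character, so you must name them that way (etc/shadow, not /etc/shadow). Files are extracted relative to the current directory, which is why we changed to / first.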
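
If you would rather not restore straight over the live system, the same extract command can be pointed at a scratch directory first. The helper below is a rough sketch; restore_to and the ATTIC variable are names invented for this example, and setting ATTIC=echo lets you preview the commands it would run.

```shell
#!/bin/sh
# Sketch: restore files into a scratch directory instead of over the live system.
ATTIC="${ATTIC:-attic}"   # set ATTIC=echo to preview the commands

restore_to() {
    archive=$1
    target=$2
    shift 2
    mkdir -p "$target" || return 1
    (
        # attic extract writes relative to the current directory.
        cd "$target" || exit 1
        for path in "$@"; do
            # Archived paths carry no leading "/", so strip one if present.
            "$ATTIC" extract "$archive" "${path#/}" || exit 1
        done
    )
}
```

For example, "restore_to /backups/backups.attic::2014-12-20-19:54:48 /tmp/restore /etc/shadow /etc/motd" should leave the files under /tmp/restore/etc/, where they can be compared with the live copies before being moved into place.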

Other Features

Listing backups

Because a repository might contain multiple backups you'll want to be able to list previous ones. Here's an example of a system which has several backups:

~ # attic list attic@rsync.io:/attic/docker.steve.org.uk.attic
2014-12-21-15:18:33                  Sun Dec 21 15:24:39 2014
2014-12-21-17:57:09                  Sun Dec 21 17:57:42 2014
2014-12-21-17:58:17                  Sun Dec 21 17:58:51 2014
2014-12-22-06:28:46                  Mon Dec 22 06:29:47 2014
2014-12-23-06:25:31                  Tue Dec 23 06:26:25 2014

NOTE that this backup is a remote one, accessed over SSH.
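
The repository in that listing is named after the machine it belongs to. A small helper can build such per-host locations consistently. Treat this as a sketch: remote_repo and the BACKUP_* variables are invented for the example, and backup.example.com is a placeholder server.

```shell
#!/bin/sh
# Sketch: build a per-host repository location of the form
#   user@server:/base/<hostname>.attic
# All defaults below are hypothetical placeholders.
remote_repo() {
    host="${1:-$(hostname)}"
    echo "${BACKUP_USER:-backup}@${BACKUP_SERVER:-backup.example.com}:${BACKUP_BASE:-/attic}/${host}.attic"
}
```

You would then run attic init "$(remote_repo)" once, and afterwards pass "$(remote_repo)::$(date +%Y-%m-%d-%H:%M:%S)" to attic create exactly as with a local repository. The remote host needs attic installed too.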

Listing files within backups

If you wish to see what files were included in an old backup, or track down when a file was removed, you can do that with the list command we've just seen:

~ # attic list  attic@rsync.io:/attic/docker.steve.org.uk.attic::2014-12-23-06:25:31
drwxr-xr-x root   root          0 Dec 21 18:55 bin
-rwxr-xr-x root   root    1029624 Nov 12 23:08 bin/bash
-rwxr-xr-x root   root      31152 Dec 09 17:46 bin/bunzip2
-rwxr-xr-x root   root     640344 Nov 09 20:57 bin/busybox
hrwxr-xr-x root   root          0 Dec 09 17:46 bin/bzcat link to bin/bunzip2
lrwxrwxrwx root   root          0 Dec 09 17:46 bin/bzcmp -> bzdiff
..

Removing old backups

If you take a backup every day you'll find that the repository gradually gets larger and larger. To avoid running out of space you'll want to prune old backups. There are several options here for choosing retention on a daily, weekly, or monthly basis. For my personal systems I keep backups for 50 days and then hope for the best.

Here's how to prune the older backups:

~ # attic prune /backups/backups.attic --keep-within=50d
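
Creating and pruning are usually run together from cron. The following is a sketch of such a job rather than a recommended policy: nightly_backup is a name made up for this example, and the retention counts are illustrative values for the --keep-daily, --keep-weekly and --keep-monthly options described in the attic documentation.

```shell
#!/bin/sh
# Sketch of a nightly job: create a timestamped archive, then prune old ones.
REPOSITORY="${REPOSITORY:-/backups/backups.attic}"
ATTIC="${ATTIC:-attic}"   # set ATTIC=echo to preview the commands

nightly_backup() {
    archive="${REPOSITORY}::$(date +%Y-%m-%d-%H:%M:%S)"
    "$ATTIC" create "$archive" /etc /home /root || return 1
    # Illustrative retention: 7 daily, 4 weekly and 6 monthly archives.
    "$ATTIC" prune "$REPOSITORY" --keep-daily=7 --keep-weekly=4 --keep-monthly=6
}
```

Pruning only runs if the new archive was created successfully, so a failed backup never causes older archives to be thrown away.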

Don't forget to check your backups!

If you don't check your backups you don't know that they're safe. The check command allows you to check that your backups are in a good state.

Usage is probably as simple as you would expect:

~ # attic check attic@rsync.io:/attic/www.steve.org.uk.attic
Starting repository check...
Repository check complete, no problems found.
Starting archive consistency check...
Analyzing archive 2014-... (1/60)
Analyzing archive 2014-... (2/60)
Analyzing archive 2014-... (3/60)
..
Archive consistency check complete, no problems found.
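
A check is most useful when it runs unattended and only makes noise when something is wrong. Here is a minimal sketch, assuming attic check exits with a non-zero status when it finds a problem; verify_repo is a name invented for the example.

```shell
#!/bin/sh
# Sketch: run "attic check" quietly and report only failures.
# Assumes attic exits non-zero when the check finds a problem.
ATTIC="${ATTIC:-attic}"

verify_repo() {
    repo="$1"
    if output=$("$ATTIC" check "$repo" 2>&1); then
        return 0
    fi
    # Print the captured output so cron mails it to the administrator.
    echo "attic check FAILED for $repo:" >&2
    echo "$output" >&2
    return 1
}
```

Run from cron as "verify_repo /backups/backups.attic", this stays silent on success and produces output, and therefore a mail, on failure.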

I hope this introduction was a good starting point. If you're configuring backups for a number of systems from scratch then attic is a good tool to use.

The attic documentation is simple to read and mentions some commands we've not covered here, along with documenting the encryption support.

For most users using encryption is just a simple matter of adding "--encryption=keyfile" when initializing the repository, but I'm not going to blindly suggest that because users need to remember to keep their keyfiles secure, and consider what kind of encryption they require.
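
For reference, initializing an encrypted repository might look like the sketch below. The init_encrypted name and the repository path in the usage note are made up for illustration; the --encryption values are the ones printed by attic init earlier.

```shell
#!/bin/sh
# Sketch: initialise a new repository with key-file encryption.
ATTIC="${ATTIC:-attic}"   # set ATTIC=echo to preview the command

init_encrypted() {
    # --encryption accepts "passphrase" or "keyfile", as the init output
    # earlier in the article shows.
    "$ATTIC" init --encryption=keyfile "$1"
}
```

For example: "init_encrypted /backups/secure.attic". Whichever mode you pick, keep a copy of the key material somewhere other than the machine being backed up, otherwise a dead disk takes the backups with it.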

 

 


Re: An introduction to the attic backup program
Posted by Anonymous (212.250.xx.xx) on Tue 23 Dec 2014 at 17:05
Hi Steve,

Thanks for the post. Currently I'm using RBME/rsync onto an encrypted external drive mounted at /backup/. RBME works pretty well, no complaints, but attic looks interesting.

RBME - https://github.com/schlomo/rbme

cheers

sno

Re: An introduction to the attic backup program
Posted by Steve (94.15.xx.xx) on Tue 23 Dec 2014 at 18:00

Thanks for the link, that does look interesting. Similar to rsnapshot, which also uses rsync and hard links.

Steve

Backup time increases linearly with backups done
Posted by xrat (84.112.xx.xx) on Sat 27 Dec 2014 at 19:23
I have recently set up attic. To test it, it currently backs up 2 servers twice a day. After 2 weeks it was already apparent that the time needed to finish the backups increases linearly with the number of backups ("archives") already created (while the number of files and the backup volume are relatively constant). It seems that the update of the local cache ("Initializing cache...", "Analyzing archive") could be optimized.

Backup times started out with a pleasantly fast 200s for 45G spread over 120k files. 14 days later (2*28 archives) I am up to 330s. It takes 4 minutes just to analyze old archives, over and over again. That's a bummer because my plan was to use attic on many machines. But backup times would quickly increase to not-so-pleasant levels.

-- Andreas

Re: Backup time increases linearly with backups done
Posted by Steve (94.15.xx.xx) on Sat 27 Dec 2014 at 21:47

I've noticed the same thing, but it also occurred with the similar tools I tested. It just seems to be the downside of running incrementals like this - you examine the existing backups to see what you have, and whether "stuff" is new.

obnam, for example, started out significantly more slowly on the initial run, and then managed to get slower still as additional backups were executed.

Steve

Re: Backup time increases linearly with backups done
Posted by mcortese (94.160.xx.xx) on Mon 12 Jan 2015 at 22:32
While this behaviour is common to many incremental backup tools, it doesn't need to be like this. Instead of storing the first backup in "full" and then a series of incremental "diffs", you could arrange the data so that you always keep the last backup in full while the previous ones are "backward diffs". Then, at each new backup you don't need to go through all the archives you have but just examine the last one.

The time to create a new full backup and change the previous one to be a backward diff might be slightly more than to create a new forward diff, but at least it won't grow with the number of past backups. And the time to prune the oldest backups is surely shorter: just delete them.

Please note that I'm talking about linear backups, here: one after the other without jumps. I don't know how well it works with complex schemes like "7 daily, plus 4 weekly, plus 6 monthly backups".

Re: Backup time increases linearly with backups done
Posted by anarcat (72.0.xx.xx) on Tue 10 Feb 2015 at 02:27

while i've heard of similar reports, i have yet to reproduce exactly the same behavior here. i've been running attic since dec 19th, and here are my backup times and sizes:

anarcat@marcos:~$ notmuch show attic cron.daily | egrep '^Duration|^All'
Duration: 15 minutes 17.47 seconds
All archives:               20.11 TB             17.69 TB            333.64 GB
Duration: 14 minutes 58.77 seconds
All archives:               19.75 TB             17.38 TB            333.18 GB
Duration: 15 minutes 39.78 seconds
All archives:               19.40 TB             17.07 TB            332.81 GB
Duration: 15 minutes 29.09 seconds
All archives:               19.04 TB             16.75 TB            332.32 GB
Duration: 15 minutes 8.26 seconds
All archives:               18.69 TB             16.44 TB            331.95 GB
Duration: 16 minutes 26.27 seconds
All archives:               18.34 TB             16.13 TB            331.51 GB
Duration: 14 minutes 51.71 seconds
All archives:               17.98 TB             15.82 TB            330.90 GB
Duration: 15 minutes 0.33 seconds
All archives:               17.63 TB             15.51 TB            330.62 GB
Duration: 14 minutes 55.92 seconds
All archives:               17.28 TB             15.19 TB            330.14 GB
Duration: 14 minutes 25.09 seconds
All archives:               16.92 TB             14.88 TB            329.68 GB
Duration: 15 minutes 24.07 seconds
All archives:               16.57 TB             14.57 TB            329.32 GB
Duration: 14 minutes 51.44 seconds
All archives:               16.21 TB             14.26 TB            328.41 GB
Duration: 15 minutes 6.41 seconds
All archives:               15.86 TB             13.94 TB            328.13 GB
Duration: 15 minutes 18.51 seconds
All archives:               15.51 TB             13.63 TB            327.76 GB
Duration: 15 minutes 40.07 seconds
All archives:               15.15 TB             13.32 TB            327.26 GB
Duration: 14 minutes 54.53 seconds
All archives:               14.79 TB             13.00 TB            326.78 GB
Duration: 15 minutes 8.34 seconds
All archives:               14.43 TB             12.69 TB            326.54 GB
Duration: 14 minutes 51.70 seconds
All archives:               14.07 TB             12.37 TB            326.10 GB
Duration: 15 minutes 11.43 seconds
All archives:               13.72 TB             12.06 TB            325.81 GB
Duration: 21 minutes 59.75 seconds
All archives:               13.36 TB             11.74 TB            325.46 GB
Duration: 11 minutes 17.81 seconds
All archives:               13.00 TB             11.43 TB            324.30 GB
Duration: 9 minutes 32.30 seconds
All archives:               12.65 TB             11.12 TB            323.83 GB
Duration: 8 minutes 35.06 seconds
All archives:               12.29 TB             10.80 TB            323.67 GB
Duration: 10 minutes 44.21 seconds
All archives:               11.93 TB             10.49 TB            323.50 GB
Duration: 10 minutes 59.11 seconds
All archives:               11.57 TB             10.17 TB            322.97 GB
Duration: 10 minutes 19.47 seconds
All archives:               11.21 TB              9.86 TB            322.43 GB
Duration: 10 minutes 54.16 seconds
All archives:               10.86 TB              9.55 TB            322.19 GB
Duration: 10 minutes 45.26 seconds
All archives:               10.50 TB              9.23 TB            321.67 GB
Duration: 10 minutes 32.37 seconds
All archives:               10.14 TB              8.92 TB            321.27 GB
Duration: 10 minutes 17.87 seconds
All archives:                9.78 TB              8.60 TB            320.93 GB
Duration: 11 minutes 9.81 seconds
All archives:                9.43 TB              8.29 TB            320.69 GB
Duration: 10 minutes 34.86 seconds
All archives:                9.07 TB              7.98 TB            320.30 GB
Duration: 10 minutes 42.32 seconds
All archives:                8.71 TB              7.66 TB            319.93 GB
Duration: 10 minutes 33.75 seconds
All archives:                8.35 TB              7.35 TB            319.58 GB
Duration: 10 minutes 43.75 seconds
All archives:                8.00 TB              7.03 TB            319.32 GB
Duration: 10 minutes 41.76 seconds
All archives:                7.64 TB              6.72 TB            318.89 GB
Duration: 10 minutes 16.43 seconds
All archives:                7.28 TB              6.41 TB            318.43 GB
Duration: 10 minutes 39.29 seconds
All archives:                6.92 TB              6.09 TB            318.14 GB
Duration: 8 minutes 30.53 seconds
All archives:                6.57 TB              5.78 TB            317.84 GB
Duration: 12 minutes 20.88 seconds
All archives:                6.21 TB              5.46 TB            317.63 GB
Duration: 10 minutes 54.01 seconds
All archives:                5.85 TB              5.15 TB            316.88 GB
Duration: 10 minutes 37.47 seconds
All archives:                5.49 TB              4.84 TB            316.49 GB
Duration: 10 minutes 22.15 seconds
All archives:                5.14 TB              4.52 TB            316.12 GB
Duration: 10 minutes 38.00 seconds
All archives:                4.78 TB              4.21 TB            315.79 GB
Duration: 10 minutes 27.31 seconds
All archives:                4.43 TB              3.90 TB            315.45 GB
Duration: 10 minutes 50.58 seconds
All archives:                4.07 TB              3.58 TB            315.07 GB
Duration: 10 minutes 25.45 seconds
All archives:                3.71 TB              3.27 TB            314.49 GB
Duration: 10 minutes 52.40 seconds
All archives:                3.35 TB              2.96 TB            314.01 GB
Duration: 10 minutes 34.04 seconds
All archives:                3.00 TB              2.64 TB            313.63 GB
Duration: 10 minutes 50.46 seconds
All archives:                2.62 TB              2.31 TB            313.27 GB
Duration: 10 minutes 32.20 seconds
All archives:                2.25 TB              1.98 TB            312.93 GB
Duration: 13 minutes 16.40 seconds
All archives:                1.87 TB              1.65 TB            312.57 GB

so you see it bumped from 10 to 15 minutes to do its backup, but I'm not sure i would blame attic for that, and it certainly doesn't count as a "linear" increase as it's a one time increase that stays around after. which may be more worrisome. but anyways.

also, keep in mind that i'm not removing backups yet, maybe that could help with the performance.

finally, i think i should mention that attic doesn't do "incremental" backups the way some think it does here. it does deduplication amongst all files in all the backup archives. if there's a linear increase, i do not believe it would be because of the incremental nature - each backup is basically independent of the others. i've done quite a bit of work to document the internals, if you are at all curious:

https://github.com/anarcat/attic/blob/doc-encryption/docs/internals.rst

Re: Backup time increases linearly with backups done
Posted by xrat (84.112.xx.xx) on Tue 10 Feb 2015 at 19:36
Your data shows a spike of 22 min just before times went up to 15 min on average. Looks to me like something on that run could have caused the increase.
Also, you haven't described your setup.

The problems I was facing, which sadly caused me to stay away from attic, appeared only in a setup of multiple machines using a central unencrypted attic archive, because every time attic ran it fully recreated the local cache. This is what takes time, causes a high CPU load, and on my machines transferred even more data over the network than the actual backup itself. Generating and writing the backup data was always fast.

It's fine for a single machine, or for one that is backed up often and another that is backed up rarely, but my conclusion is that for now attic is in general not suitable for backing up several machines to a central archive.

Re: Backup time increases linearly with backups done
Posted by anarcat (70.83.xx.xx) on Tue 10 Feb 2015 at 21:48

Yeah, I am not sure what happened there. My setup is that i run backups on an external WD My Drive 3TB HDD connected through USB2, not sure what else to add there.

I did some more research on your issue, and it seems that backing up multiple servers to the same repository is not well supported:

https://github.com/jborg/attic/issues/186
http://librelist.com/browser//attic/2014/11/11/backing-up-multiple-servers-into-a-single-repository/#e96345aa5a3469a87786675d65da492b

It could be interesting to see if it's possible to fix Attic to work around that issue somehow.

There has also been a post about linear increases in backup times on the mailing list:

http://librelist.com/browser//attic/2014/12/27/times-increase-linearly-with-backups-done/

... with no response unfortunately.

Re: Backup time increases linearly with backups done
Posted by Anonymous (85.91.xx.xx) on Wed 1 Jul 2015 at 11:20

You might find borg-backup interesting (an attic fork, not yet stable).

From its FAQ:

Can I backup from multiple servers into a single repository?

Yes, but in order for the deduplication used by Borg to work, it needs to keep a local cache containing checksums of all file chunks already stored in the repository. This cache is stored in ~/.cache/borg/. If Borg detects that a repository has been modified since the local cache was updated it will need to rebuild the cache. This rebuild can be quite time consuming.

So, yes it's possible. But it will be most efficient if a single repository is only modified from one place. Also keep in mind that Borg will keep an exclusive lock on the repository while creating or deleting archives, which may make simultaneous backups fail.

install process
Posted by anarcat (72.0.xx.xx) on Tue 10 Feb 2015 at 02:14

in debian wheezy, i use this much simpler install procedure:

apt-get install python3-msgpack python3-pip libacl1-dev
pip-3.2 install attic

that way you don't need the non-standard install and all.

Re: install process
Posted by villamarinella (88.78.xx.xx) on Mon 18 Apr 2016 at 15:02
no way.
1. update get problem with publickey.
2. needs too much CPU power, min 85% on RPI 2.
3. first try crashed with syslog message bug, OOPS.

I deleted it.

villamarinella

Re: install process
Posted by anarcat (206.248.xx.xx) on Mon 18 Apr 2016 at 16:00

That sounds like a bug. Maybe you'd like to try the fork of Attic, called Borg backup, which is better maintained than Attic at this point.

Re: install process
Posted by villamarinella (88.78.xx.xx) on Mon 18 Apr 2016 at 16:43
not really, or?
Your link brings me back to this page.
VM

Re: install process
Posted by anarcat (206.248.xx.xx) on Mon 18 Apr 2016 at 17:24

i don't understand your first question.

i did mess up the link, try: http://borgbackup.github.io/
