I really hate losing things, and I am a little obsessive about my photos in particular. I’m a pack-rat – I keep a lot of photos because they increase in value over time. That random test shot with a new camera shows how our house used to look (remember how the plaster was coming off that wall?) or a throwaway shot that’s the only picture of one of the kids with that haircut.
I’m also still reeling from losing almost all of the photos of Jack from before he was two. There was a partition table incident on Helen’s computer. Those photos were digital and are now gone. Friends came to our aid and sent us back photos that we had emailed them, but it’s a tiny fraction of that two years.
So, that’s why I back up. This is how:
I run BitTorrent Sync on my iPhone, which syncs my camera roll with my desktop and HP Microserver at home and my Macbook at work. You do need to remember to launch it in order to sync – it won’t do it in the background (iOS won’t let it) – but assuming you can handle tapping the icon every few days, you’ll be covered. It’s also the most convenient way to get your photos off your phone without a Lightning cable.
I import these into Lightroom, which I have decided to entrust with all of my photos. This means there are now two copies on my desktop, on separate drives, and that iPhone photos enter the same flow as my “proper” camera photos.
I do the usual things with Lightroom: I run the scheduled backups of the meta-data in case of massive corruption. Meta-data is data too, after all. This isn’t that big a deal for me because I’m not actually very organised and I rely on the EXIF data in the original files for most of the meta-data – and that is easily reconstructed if something terrible happens.
git-annex gives you all the power of Haskell with the ease of use of git (ha ha). What it actually does is play cleverly to git’s strengths at managing meta-data – what a file is called, where it’s stored, what its hash is – while filling in the things git is bad at, like large files, encryption, and remotes that don’t have git installed.
Aside from photos, some other things which are large or that I want to keep go into git-annex – historical backups of my blog, email archives and my entire iTunes library are in there. I don’t necessarily want all of these things on every computer that I own, but git-annex lets me be selective about which files I have on which computer.
If I need an MP3 from my iTunes library on my Macbook, for example, I can ask git-annex where it’s stored and then ask it to connect to any of those places and get it for me. Once it’s done, it’ll update its records to say that my Macbook also has a copy.
It will let me remove my local copy of files if I’m low on space, but won’t let me accidentally delete the last copy of my file (more accurately, it will let me specify a minimum number of copies that are in circulation – one is rarely enough for the paranoid).
(It has an interface only a mother could love, but there’s a GUI called the git-annex assistant. I prefer the control the CLI gives me.)
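A typical CLI session looks something like this (the file path is invented for illustration):

```
$ git annex whereis Music/some-album/track.mp3   # list which remotes have a copy
$ git annex get Music/some-album/track.mp3       # fetch it from one of them
$ git annex drop Music/some-album/track.mp3      # free local space – refuses if it
                                                 # would leave too few copies
$ git annex numcopies 2                          # never allow fewer than 2 copies
```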
My current git-annex remotes are:
beta: my Manchester virtual machine keeps most small data, but doesn’t have the iTunes library, family videos or my photo collection.
play: my Microserver has a copy of absolutely everything.
bob: Bob the Server at work has a GPG encrypted copy of everything. It doesn’t even run git-annex – it’s an rsync special remote.
macbook: my Macbook ebbs and flows, as it should. It’s often where large assets (like edited video) are created – then I add them to git-annex, make sure they are distributed around my remotes, and drop my local copy, giving me back SSD space. It has some, but not many, albums from iTunes.
iwebftp: I wrote an iWeb FTP special-remote – this was very, very easy and I happen to know that iWeb are keeping at least three copies of every file on iWeb FTP. This remote is GPG-encrypted as well, even though the transfer is encrypted and the data is stored encrypted at rest. Paranoia, right?
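For the curious, setting up an encrypted rsync special remote like bob is a couple of commands (the hostname, path and GPG key ID here are placeholders):

```
$ git annex initremote bob type=rsync rsyncurl=bob.example.com:/srv/annex \
      encryption=hybrid keyid=0xDEADBEEF
$ git annex copy . --to bob   # files are GPG-encrypted before they leave the machine
```

The remote end only needs rsync and disk space – no git, no git-annex.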
(If you’re not keeping count, that means that so far any photo I take is in seven places, eight if it’s an iPhone photo).
rsync from Manchester
beta is periodically backed up to my house. Nothing fancy.
rsync on the Microserver
Everything on the Microserver is periodically backed up to other disks in the same server. Rather than RAID (which I don’t need, as my SLA with my wife and kids is “no nines”), I use three disks in my Microserver as an original and two copies.
Everything good on the Microserver runs inside an LXC container, which is stored in an LVM logical volume, and backed up from outside of the container. The minimal Debian install for the host runs off a USB pen and changes infrequently. I have a second USB pen with this install on it, in case the flash dies.
Because beta is downloaded to the Microserver, it is now present on four disks in total. And because play is a container on the Microserver, its git-annex install is also copied to another two disks. (10 copies of that iPhone photo, now.)
For stuff that I don’t even think to put into git-annex, but that I’d kick myself for losing, I back my Music folders up on my desktop. They go to an external Western Digital MyBook via the truly excellent Bvckup 2. Bvckup handles things like differential updates and bypassing file locking (with Volume Shadow Copies) while still being a beautiful application and not just a port of rsync to Windows.
While I have the space, I also back this up to a Samba share on play. You’re probably getting the picture by now: because that Samba share is on the Microserver, it’s going to a further two disks, too.
Finally, there’s my Macbook. I really strive to keep everything important off my Mac. All of my code lives in git, I push regularly, my documents are in BitTorrent Sync or Google Drive, my bookmarks are in the cloud and my email is in GMail. I’m reasonably confident that if my Macbook took an unexpected bath, I would not lose more than half a day’s work.
But confidence isn’t control, so I use Bombich Software’s also-excellent Carbon Copy Cloner to create a bootable copy of my whole Macbook. I can plug this USB drive into a brand-new-in-box Mac and be up and running with my entire environment in minutes. It also covers me in case I suddenly realise I created an important file and just left it in my home folder.
Downsides and Weaknesses
Nothing is perfect. I’m happy with this backup setup, but it has some shortcomings and downsides.
- If I do want to purge a file, it’s remarkably difficult to make sure I have gotten all of the copies. This hasn’t really happened yet, but if you want a laugh at other people’s expense you should read about amateur pornographers not being able to delete shots from their Apple Photo Stream. (“I took a picture of my wife in lingerie and my in-laws saw. Turning off the stream.”)
- There are several manual steps. I don’t leave the MyBook connected to my PC (it’s in a safe) – so I have to remember to do all of this. The Lightroom end-of-month copy to git-annex is the most annoying.
- Despite there being up to 20 copies of some photos, they’re actually only split among three real locations: home, work, Manchester. It’s easy to be overconfident. Lower priority data only exists at home, though on several machines. It would be lost to a fire, for example.
- Apart from the built-in delay of requiring manual steps, there’s no provision for keeping historical copies. If you find out that you corrupted a file six months ago, you will not be able to get it back. You’ll just have a dozen copies of the corrupted version.
- Obviously, you require a lot of storage to keep all of these copies of things.
- BitTorrent Sync and Bvckup 2 are proprietary software. This is an issue for some people, and to a limited degree, it is for me. That said, I prefer a proprietary piece of software to a cloud service (cough Dropbox), and Bvckup keeps my data in an open format (its original format, really), so I’m okay with it.