Backups – Part 1 (My Formative Years)

I seem to be slipping into a backup theme in pending posts, plus it’s time for me to adjust my backup strategy at home. So, I figured I’d write up my backup-related history and biases. This is part 1, so that must mean there will at least be a part 2. I’m hoping I don’t need a part 3, but we’ll see.

In the beginning…
Back in “the days of DOS” my backups consisted of floppy disks. (Remember those? And I don’t mean those smallish 3.5″ ones. I mean big floppies that actually flopped.) Usually a “backup” meant making a copy of the diskette, or copying the files from the hard drive to a diskette (and yes, from one floppy to another because there was no hard drive). Diskettes failed so there were usually multiple copies and I was forever trying to remember which was the latest copy and what files were on each disk. (Labels? Too much trouble. Eventually I figured out pencils would write on the diskette itself. Just never a pencil around when I needed it.)

Then technology moved forward and software was created to do backups, to floppies. This solved the problem of numerous, badly labeled diskettes. Now there could be a labeled box with labeled diskettes. The box being “Set 1”, “Set 2”, etc… and the diskettes numbered one through whatever. In other words “a system”. This was great for organization. But then there’d be a need to recover a file and the technology brought new fun to the restore process. Usually a bad diskette in the middle of the set. Oh yeah, also be sure you had extra copies of the backup software as it was needed to read the disks. Technology crept forward and tried to address those problems. But it was never quite right.

Then technology moved forward again and floppies were replaced with backup tapes and even stuff like Zip drives. Less swapping, but now when a single piece of media went bad you lost lots of data. Maybe it was me, but I never had a backup system with media I could really trust. I’d have many copies and somehow managed to survive. But it was a painful existence, and those tapes were expensive.

Lessons learned, habits formed…
Some of the lessons I learned in these early days were:

  • Data on my computer is very organized. Data will tend to disorganization once it leaves the bounds of my computer unless I’m forced to fence it in. If there’s a flat surface I will put something on it. If that something is also flat I will put something else on it. And so on. (FYI – small flat things like diskettes or CDs tend to get larger flat things like paper and magazines put on top of them.)
  • I think I have a good memory, at the time a file is backed up. This means minimal or no labels. Write a date on a disk, I’ll know what’s there. I’m wrong about that, I don’t have a good memory. But I can never remember that fact and am forced to repeat my mistakes. On the plus side, swapping disks to find a file could be done while watching TV and is a mindless activity. It also provides an incentive to protect the original data source in order to avoid the whole exercise.
  • I don’t want to spend time addressing those first two bullet points if it takes any time at all. There were programs to catalog tapes. Labels do exist. They just didn’t exist in my house. It’s a personal failing. I accept that but never attempted to correct it.

Some of the early habits formed were:

  • I always had a copy of current and important work. I’d have a batch file that would copy the current day’s work or important directories to a floppy even in the early days. The more advanced versions would zip files first. Even after CDs were around floppies were easier and quicker. I tended to have many copies of these as a crude versioning system and also as a way to have a second (or third, or fourth) backup.
  • I tended to organize my computer hard drive in a way to make backups easier. Files were “archived” to a section of the disk when I didn’t need them or I knew they’d never change. Then they’d be backed up (or copied) to a couple floppies/tapes/CDs and filed away. Since this was an infrequent event I’d actually take time to label things.
  • Backup media will go bad. I always had extra backups, even if it meant rotating backup sets where that “second” set was older.
  • Restore some files every so often and make sure they work, especially for archived backups. If the restore failed I made a copy of the second set.
  • I backed up data, not programs or program configurations. Whenever I re-installed my PC it was usually for the purpose of cleaning things out so I didn’t want anything except data from my old PC. I kept copies of the program disks but if a program went bad it was a re-install, not a restore. This habit solidified in my Windows days. It’s changed a bit for Linux and OS X. Not so much because there’s not a need to clean up, but just because all the configuration is file based and it’s easy to copy and restore (or delete if it’s suspect).
  • I tended to keep my backups small as it was just easier to deal with. I was never into using clones (like Ghost) as a way of doing backups.
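As a rough modern illustration of two of the habits above (dated copies as crude versioning, and checking that a backup can actually be read back), here is a minimal Python sketch. It is not the original batch file; the function names `backup_folder` and `verify_backup` and the archive naming scheme are assumptions made up for this example.

```python
import hashlib
import zipfile
from datetime import date
from pathlib import Path
from typing import Optional


def backup_folder(source: Path, dest_dir: Path, today: Optional[date] = None) -> Path:
    """Zip every file under `source` into a dated archive inside `dest_dir`."""
    today = today or date.today()
    dest_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: one archive per folder per day.
    archive = dest_dir / f"{source.name}-{today.isoformat()}.zip"
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(source.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(source).as_posix())
    return archive


def verify_backup(source: Path, archive: Path) -> bool:
    """Return True if every file in `source` matches its archived copy."""
    with zipfile.ZipFile(archive) as zf:
        # testzip() returns the first member whose stored CRC fails, or None.
        if zf.testzip() is not None:
            return False
        names = set(zf.namelist())
        for path in source.rglob("*"):
            if not path.is_file():
                continue
            name = path.relative_to(source).as_posix()
            if name not in names:
                return False
            archived = hashlib.sha256(zf.read(name)).digest()
            current = hashlib.sha256(path.read_bytes()).digest()
            if archived != current:
                return False
    return True
```

Calling `backup_folder(Path("work"), Path("backups"))` once a day would leave a trail of dated archives, and `verify_backup` only makes sense right after the backup, before the source files change. The destination should sit outside the source folder so the archive does not try to include itself.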

I got my data files, now what…
Luckily I never got bit by this (just close calls) but just having the data wasn’t enough. This became a problem as I abandoned Windows and moved to Linux. Moving the data wasn’t a problem but I had archived older data. There came a time when I had to pull out some archived data that had been created with Windows software. I had rebuilt my Windows PC without most of the software. I had to hunt around to find copies of the original programs to read the files. Then I had to install the programs to read the data.

Some of the files were simply scans but they had been saved in a proprietary format. My move to Linux had actually solved this issue since all the scans were now standard graphic files or PDFs. There are numerous viewers for them, on every operating system I’m likely to use.

So the new lessons learned here were:

  • Data wants to be widely viewed. I now use a standard format whenever possible so it can be read in whatever is handy. If a standard format does go obsolete it’s time to save some viewers.
  • If the data is proprietary make sure the program that created it is with the backup files. There’s still a catch here in that you need to have an OS that will run the software. See the previous bullet and avoid the whole issue.

The Three Horsemen…
Alright, it’s four horsemen and they bring bad things. But I’ve found I group my data into three categories, which prevents bad things.

  1. The really important stuff (in my life). This is mainly financial or “life” records. Stuff that will cause me financial loss or extreme hardship if it is lost. Generally, this is the stuff I also need to keep locked up so it doesn’t fall into the wrong hands. These are also the things I want in a standard data format, or lacking that I’ll include the software to read the file. I’ll also throw things in here that are easy to save (I have a lot of text files in this category) even if their loss is minimally annoying. An address may not be critical, but it’s easy enough to save. These are also the things I’d need in the event of a major catastrophe that affects more than my hard drive (like a house fire).
  2. Files I want because losing them would entail some financial loss or the loss of time to recreate. Examples are some MP3 files, important pictures, videos and software installation files. Losing these would be a loss, but one I’d get over with minimal pain.
  3. Files I won’t miss or can easily replace if they’re lost. This can include old software and PC configuration files. I also include MP3 files that are both on my iPod and on physical CDs in this category. If things go terribly wrong and I lose both the PC and the iPod I can re-rip them. Misc pictures and videos also fall into this category. If I lose these I may never know it. Or the impact will be no more than a small speed bump.

While I’ve never formally thought of it this way until now, I’ve been doing my backups in this way for a long time. In the old days the category three files may have been too much trouble to back up and they wouldn’t get done. Then CDs came along and I’d burn them to CD every so often. I’d verify the backup when it was made but usually never again.

For the category two files I’d be a little more conscientious and make sure I got backups every week (or month) and test them every so often. These tended to be files that didn’t change that much so occasional backups were no big deal.

I’m paranoid about the category one files and always had some backup routine to make sure they got backed up. Historically, the size of these files was always relatively small (even today this backup is only 200MB).

The three categories are really just a way of putting a value on the data so I know how much I want to pay, in either time or money, to back them up.
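One way to picture “putting a value on the data” is a small table that maps each category to how much effort it gets. The sketch below is only an illustration: the `BackupPolicy` fields, the `is_due` helper, and every number in it are assumptions invented for the example, loosely guided by the descriptions above (weekly for category two, “every so often” for category three).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupPolicy:
    label: str
    interval_days: int  # how often to back up (assumed value)
    copies: int         # how many separate copies to keep (assumed value)
    offsite: bool       # keep a copy away from the house (assumed value)


# Category numbers follow the list above; every value is an illustrative guess.
POLICIES = {
    1: BackupPolicy("critical records", interval_days=1, copies=3, offsite=True),
    2: BackupPolicy("costly to recreate", interval_days=7, copies=2, offsite=False),
    3: BackupPolicy("easily replaced", interval_days=90, copies=1, offsite=False),
}


def is_due(category: int, days_since_last_backup: int) -> bool:
    """Return True when a category's last backup is at least one interval old."""
    return days_since_last_backup >= POLICIES[category].interval_days
```

The point of the table is that the cost side (interval, number of copies, offsite storage) is chosen per category rather than per file, so a new file only needs to be dropped into the right bucket.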

Part 2 – Enter The Modern Age
Until I got my first Mac my strategy was basically “copies everywhere, whenever I got a chance.” Around that time my backups changed from a routine where I had to be there and do something to one that was automated and didn’t require media swaps. So in part two I’ll discuss the specific backup techniques I’ve used recently.