Wednesday, 25 March 2015

Whole disk encryption cloning and fun with Windows MBR

or How I am Paranoid but Not Paranoid Enough.

Working for a large IT company makes me a paranoid person in a paranoid organization - but we are paranoid about different things. The company is crazy about managing the data security of thousands of employees travelling the world with access to confidential corporate information. I'm bothered about my part in that too, but also about how easily I can keep my laptop completely backed up so that I can quickly resurrect my working environment if the laptop is ever lost, damaged or stolen. Data backup is easy and secure in the corporate cloud but that doesn't get close to satisfying my need to be back up and running within hours of losing my laptop - I know from bitter experience that it can take days to reinstall and reconfigure all the software I use if all I have is the data backed up in the cloud.
I have 3 basic requirements for backing up my laptop:
  1. I want to have everything (data, software, configuration) on my HDD (or SSD) in a backed up image.
  2. I want to be able to easily use that backed-up image to re-create a new, identical bootable drive.
  3. I want this to be fast and cheap and not send/receive hundreds of gigabytes over the network.

This used to be an easy problem to solve. There is plenty of very good open source and commercial software that makes it easy to clone a hard drive - all you need is a second hard drive of the same or larger capacity than the source. An hour a week to clone the HDD and you're done. I use Acronis True Image but there are plenty of good alternatives. If I lose the laptop or the primary drive is fried I have an immediately-available bootable replica that is as up-to-date as I have chosen to keep it.

Then along came PGP whole disk encryption (WDE) and a corporate policy to use it.

To me, WDE is something best left to hardware. Encryption in hardware is fast and reliable and doesn't complicate disk cloning at all because the encryption is wholly self-contained within the drive hardware and independent of the actual content stored on the drive. My company's policy for specifically requiring PGP rather than hardware encryption is nothing to do with security or reliability but is all about centralized management of the encryption keys. No issue with that BUT it makes it much harder to achieve all 3 of my backup criteria. And while I can sleep soundly knowing that if I was ever dozy enough to forget the password from which the key is derived my company could recover it, I'm on my own for the second (and hardest) of my three recovery criteria.

Why is backup/recovery harder with software FDE?

I learnt this the hard way. By far the simplest way to satisfy my backup objectives is a disk clone - something I've done regularly for years. This is not possible with robust software-enabled full disk encryption (FDE) because, unlike hardware FDE, something has to be stored on the disk to enable the encryption software to decrypt it. The way PGP and other FDE software works is to use a custom boot manager to present the user with a pre-OS-boot screen to enter the passphrase, which is validated against files stored on the disk. (As a detail, at the time this happens there is no OS file system available, so the location of the required PGPWDE* files, which actually live in the c:\ root directory, is known to the boot manager by disk sector location alone.) During the initial disk encryption, PGP creates a master boot record (MBR) to load that boot manager. While it is generally a very bad thing to ever lose or corrupt your MBR, there aren't enough superlatives beyond very very very bad to describe how catastrophic it is to lose the MBR on a software-encrypted disk. But how stupid or unlucky would you have to be for this to happen? Unfortunately I can't use the unlucky defence.

It's perhaps ironic that it was the act of backing up my disk that utterly corrupted it. And I was backing it up "just to be on the safe side" a week before our biggest customer conference because I couldn't afford to lose any time during the busiest period of my year. It's hard to accurately describe the sick feeling I got after the disk clone activity failed very quickly and left my primary drive unable to boot with just a sickly "MBR Error 3 / MBR Error 1 / Press any key to boot from floppy..." message. This message turns up a lot of hits in any popular web search engine, and Acronis and PGP feature in many of them. You can bet that most people doing that search are pretty desperate.
Bad under any circumstances, disastrous with PGP: MBR Error 3 / MBR Error 1

There are three obvious questions this raises:
  1. Why did this happen?
  2. What is the recovery from it?
  3. What should I do differently in future to back up my software-encrypted disk?

Why Did It Happen?

Acronis and software like it provide multiple strategies for backing disks up. For the reasons I described above, I used the disk clone approach. I connect a 2nd HDD to my laptop and just zap it every time with a clone of my primary drive using Acronis. I can then (if I need to) simply boot from the backup HDD (or switch it with the primary drive) and I'm done. Nothing could be simpler to get to a bootable backup.
In order to be able to copy all the partitions including the MBR, Acronis reboots during the process using its own boot manager to boot into the clone-operation part of the backup. It does this by temporarily replacing the MBR with its own, restoring the original MBR as part of the clone. Running this against a PGP-encrypted disk is a mistake I only made once.
The problem, of course, is that after the Acronis MBR replaced the original PGP one and the machine rebooted, the content on the C drive of the disk was just encrypted garbage - no operating system, no data, nothing - and there was no way to decrypt it. At this point, I needed to get the PGP MBR back in place or my disk was dead. And that is horribly difficult.
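
To make the MBR swap a little more concrete, here is a minimal Python sketch of what a classic MBR sector looks like and how a change in its boot code could be detected. It relies only on the standard MBR layout (a 512-byte first sector holding 446 bytes of boot code, four 16-byte partition entries and a 0x55AA signature). The function names parse_mbr and boot_code_changed are made up for this illustration, and nothing here is specific to PGP or Acronis. In particular, it should not be assumed that saving this one sector is enough to preserve a PGP BootGuard installation, which may well keep data elsewhere on the disk.

```python
import hashlib
import struct

SECTOR_SIZE = 512
BOOT_CODE_LEN = 446       # boot code area at the start of a classic MBR
PART_ENTRY_LEN = 16       # four partition entries follow the boot code
SIGNATURE = b"\x55\xaa"   # last two bytes of a valid MBR sector


def parse_mbr(sector: bytes) -> dict:
    """Return a fingerprint of the boot code plus the non-empty partition entries."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR sector is exactly 512 bytes")
    if sector[510:512] != SIGNATURE:
        raise ValueError("missing 0x55AA boot signature")

    partitions = []
    for i in range(4):
        off = BOOT_CODE_LEN + i * PART_ENTRY_LEN
        status = sector[off]          # 0x80 marks the active (bootable) partition
        ptype = sector[off + 4]       # partition type byte, 0 means unused entry
        lba_start, num_sectors = struct.unpack_from("<II", sector, off + 8)
        if ptype != 0:
            partitions.append({
                "bootable": status == 0x80,
                "type": ptype,
                "lba_start": lba_start,
                "sectors": num_sectors,
            })

    return {
        "boot_code_sha256": hashlib.sha256(sector[:BOOT_CODE_LEN]).hexdigest(),
        "partitions": partitions,
    }


def boot_code_changed(before: bytes, after: bytes) -> bool:
    """True if the boot code differs between two copies of the MBR sector."""
    return parse_mbr(before)["boot_code_sha256"] != parse_mbr(after)["boot_code_sha256"]
```

The sector itself would have to be read from the raw device with administrator rights (on Windows, typically by opening a path such as \\.\PhysicalDrive0 in binary mode and reading the first 512 bytes). Treat that part as an assumption to verify on your own machine before relying on it.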

What Is The Recovery?

There is an answer to #2 but it's not pretty. You certainly can't use any off-the-shelf Windows tools for recovering the MBR in this scenario; you need to use PGP tools. But this is not straightforward - you need to take the messed-up drive, get it fully decrypted using PGP tools and then restore the MBR using a standard Windows recovery disk. Getting it decrypted is the hard part. For all practical purposes, you need to find another machine (or disk) with the same level of PGP installed, connect your messed-up disk as a slave and then use the other machine's PGP installation to decrypt the messed-up disk. The end result of that is a still non-bootable but unencrypted disk. Recreating a default MBR after you've done this is easy - for example on Win7 just boot from a Win7 recovery CD you created earlier, open the command window and run bootrec /fixmbr. At this point, you're ready to use PGP to re-encrypt the disk, which implicitly generates a new MBR for BootGuard.

For more details on the recovery side of things, my colleague Olly Brand described his journey to a successful conclusion using this technique.

But better never to have to get to this point in the first place.

So what's the right way to back up a software-encrypted disk?

I have tried 3 approaches to try to preserve all the requirements above - 2 approaches were successful and one I gave up on.
Successful Approach 1: Decrypt, clone, re-encrypt.

Easy to understand and always works but takes ages (6 hours to decrypt and 6 more to re-encrypt) so fails requirement #3 (fast).

Successful Approach 2: Use an archived backup/restore approach.
I resisted this approach for years because it slightly complicates the 'restore a bootable image' requirement but in the end this is the most practical approach with PGP encryption and it's what I now do regularly. I'm still using Acronis True Image for backup/restore but through the more mainline incremental whole-disk backup approach rather than the disk-clone utility. This creates an archive file that requires Acronis software to restore from, but it works without any PGP-related complications because there's no need to use any special Acronis boot manager or muck about with temporary MBRs.

In order to restore from such an archive you do need to have created bootable media (e.g. on a USB drive) that includes the recovery software to read and restore from the backup archive. Acronis (like other backup solutions) makes this easy so restoring from the archive is a simple 3-step process:
1) Create the bootable media using whatever method the backup solution provides - in my case an image on a USB drive that delivers a bootable Acronis recovery manager. Also create a Windows recovery CD for the last step below.
2) When a restore is required, boot from this media and restore from the backup. In my case I boot with the target to-be-primary disk, source backup archive disk and bootable USB available, booting from the USB. Then use Acronis to restore the disk (primary and boot partition) from the archive to the to-be-primary disk. There is one further step required:
3) On reboot of the restored disk the text "bootguard..." is displayed and the boot will hang. This is because the restored MBR was created by PGP but the restored disk is not encrypted and the BootGuard boot manager cannot make sense of the files at the disk sectors it is pointing to (which it expects to be able to use to decrypt the contents of an encrypted disk). At this point the PGP MBR needs to be replaced by a default Windows MBR. This can easily be done by booting from the Windows recovery CD created in the first step, opening the command window and running bootrec /fixmbr. Finished.

All this is a lot quicker than Approach 1 but does require some preparation. It satisfies all 3 requirements I listed above for 'simple' backup/restore although it's not as simple as the disk clone I could do before I had to use PGP.

Unsuccessful Approach 3: Create a sector-by-sector encrypted clone preserving the PGP MBR
I really wanted this approach to work but kept failing at the restore step. Software like Acronis can perform a sector-by-sector disk clone without messing up the MBR if the clone operation is initiated from outside Windows. Using the Acronis bootable media (and therefore bypassing PGP), I can initiate a disk clone at the raw sector level. As far as Acronis is concerned it's copying garbage but the end result is an encrypted clone. I can see that it's a semi-successful clone by booting from the old primary source disk using the normal PGP boot manager, at which point the PGP software on the primary disk can see and interpret all the data on both the source disk and the new encrypted clone. However, when trying to actually boot from that clone I never got past:

Boot selection failed 0xc000000e

There comes a point when enough is enough. Maybe there's something hardware-drive-specific in the BootGuard sequence which means no amount of faithful cloning can be successful but I lost interest in making that work once I had the good-enough approach #2.
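
For anyone curious what "copying at the raw sector level" amounts to, here is a rough Python sketch of the idea, run against ordinary image files rather than real drives. It is not how Acronis does it; clone_raw and sha256_of are hypothetical helper names for this illustration only. The point is that the copy never interprets the content, so encrypted sectors come out byte-for-byte identical on the other side, and a hash comparison can confirm that.

```python
import hashlib

CHUNK = 1024 * 1024  # copy 1 MiB at a time


def clone_raw(src_path: str, dst_path: str, chunk: int = CHUNK) -> str:
    """Copy src to dst block by block and return the SHA-256 of the bytes copied."""
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(chunk)
            if not block:          # end of source reached
                break
            dst.write(block)
            digest.update(block)
    return digest.hexdigest()


def sha256_of(path: str, chunk: int = CHUNK) -> str:
    """Hash a file (or image) in blocks so large inputs do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()
```

A clone can be considered faithful when clone_raw(src, dst) returns the same value as sha256_of(dst). As the failed boot above shows, though, a faithful copy of every sector is not necessarily enough to get a bootable encrypted disk.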

It's good to be a little paranoid.