An Excerpt from Don Jones' Definitive Guide to Backup 2.0 about Backup and Disaster Recovery
This story really illustrates why I dislike point-in-time Exchange backups so intensely. Sure, you can achieve a lot of business goals using Backup 1.0 techniques and tools, but you have to be so very careful in order to get exactly what you want. Who needs that extra mental overhead?
I work in one of my company’s larger data centers, and support about 100 servers. Most of these are file servers, but there are a couple of domain controllers, a few SQL Server machines, and three Exchange Server boxes. We are very good about patching our computers. We typically will not apply a set of patches until we have a full backup of the computer’s OS and application files, and we usually make a backup right after applying the patch, too. We tend to apply patches during maintenance windows when the servers aren’t otherwise available. On some of our larger servers (the Exchange machines come to mind), it gets difficult to grab one full backup and apply patches during our 6-hour maintenance windows (you try taking email away from people for longer), so sometimes we take a backup one night and apply patches the next, then take a second backup the following night.
The system works—but not always well.
I can recall a couple of instances where Exchange patches have caused problems with some of our third-party software, and we needed to roll back to the pre-patch backup. Unfortunately, a whole day of work had passed since that backup was made, so we lost all that work. People become incredibly unhappy when email goes missing.
In at least one instance, we didn’t realize a general Windows hotfix was causing problems for about a week. At that point, the pre-hotfix backup was pretty aged. This was on a domain controller, so we decided to apply the old backup anyway, knowing that the domain would bring itself up to date through replication. Unfortunately, as we found out, the backup also contained some deleted objects that were near the end of their tombstone life. The practical effect was that about a dozen formerly deleted objects suddenly reappeared in the domain. Our security auditors freaked out, people were yelled at, and it actually took us a while to work out what had happened, since that’s not a scenario you see every day. We’ve since decided to rely less on backups for undoing patches.
We’ve started spending more time testing patches, which is of course a good idea but it’s very boring and it takes a lot of time we didn’t really have to spare. It also means our patches only get rolled out about every other month, rather than every other week, and I worry about what happens when one of those patches fixes some major security hole—and we have to leave the hole open for 2 months just because of our processes.
Again, the Backup 1.0 mentality has deeper-reaching effects than just disaster recovery problems. In this instance, the company has actually decided to run out-of-date software for longer simply because of the way their backup processes work. Unbelievable. If ever there was a case of the “technology driving the business” rather than the other way around like it should be—this must be that case.
There are easy-to-recognize problems here, which should be familiar to you at this point:
- Backup 1.0’s point-in-time snapshots don’t provide much granularity when it comes time to roll something back, nor do they offer the continuous data protection this situation badly needed.
- Backup 1.0’s reliance on backup or maintenance windows took away some of the author’s flexibility with regard to his Exchange infrastructure.
- Except in a few special situations, Backup 1.0 tends to be an all-or-nothing proposition: either you roll back the entire server or you live with what you’ve got. There aren’t many good ways to restore a single application; Backup 2.0, by contrast, can more easily pull out just the bits related to a specific application and restore them with a single click.
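
To make the granularity problem concrete, here is a minimal Python sketch of the arithmetic behind a point-in-time rollback. It is only an illustration: the function name `rollback_loss`, its parameters, and the dates in the usage example are all hypothetical and do not come from the book or from any backup product. The assumption is that undoing a patch means restoring the most recent full backup completed at or before the patch, so everything written between that backup and the moment the problem is discovered is lost.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def rollback_loss(backup_times, patch_time, discovered_time):
    """Return (restore_point, lost_window) for a point-in-time rollback.

    backup_times:    sorted datetimes at which full backups completed
                     (hypothetical input, for illustration only).
    patch_time:      when the patch was applied.
    discovered_time: when the problem was noticed and the rollback decided.
    """
    # Count the backups completed at or before the patch; the last of
    # those is the only "clean" restore point available.
    idx = bisect_right(backup_times, patch_time)
    if idx == 0:
        raise ValueError("no backup exists from before the patch")
    restore_point = backup_times[idx - 1]
    # Everything between the restore point and the discovery is lost.
    return restore_point, discovered_time - restore_point


# Hypothetical schedule: backup one night, patch the next night,
# problem noticed the following morning.
backups = [datetime(2024, 1, 1, 22, 0), datetime(2024, 1, 3, 22, 0)]
restore_point, lost = rollback_loss(
    backups,
    patch_time=datetime(2024, 1, 2, 22, 0),
    discovered_time=datetime(2024, 1, 3, 9, 0),
)
print(restore_point, lost)
```

With nightly backups, the lost window can never be shorter than the gap between the last pre-patch backup and the patch itself, which is the "whole day of work" the author describes. Shrinking that window is what continuous data protection is meant to do.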




