SQL Server, PASS, and other data mishaps
How do you do Disaster Recovery?
Going through the process of a large-scale, multi-location disaster recovery made me stop and think about all the different approaches that can be used to recover database servers.
Living with a datacenter in Hurricane Alley, we’ve been doing disaster preparedness (recovery) on a small scale for many years, but this year we’ve been working towards recovering all of our assets to an offsite colocation. That part of the decision is easy. The actual method used to do these recoveries is definitely up in the air, and I fully expect our processes to change for the better every time we redo our disaster testing (many times a year going forward).
In exploring the recovery process we quickly realized that our “hardware failure” recovery documents weren’t going to work effectively in a datacenter failure situation. So, it was time to design a new set of criteria for success. I thought I’d share our thought process and how we plan on tackling this always fun experience. As a side note, it’s worth mentioning that SQL replication is neither wanted nor allowed in our case.
1st thought: Bring up blank OS builds for the database servers, load SQL Server, and patch it to the correct level while the tape restores of the database backups are happening. Then recover the system databases and kick off the individual restores (which are scripted with the regular nightly backup jobs).
- Benefits to DBA: clean, repeatable, documentable process that we are mostly in control of.
- Drawbacks: Time-consuming, potential version mismatch issues, recovering system databases is always “fun”
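The last step of this first option, kicking off the scripted user database restores, could be sketched roughly as follows. This is only an illustration in Python, not the actual restore scripts: it assumes native SQL Server backups restored with `RESTORE DATABASE ... FROM DISK` (not LiteSpeed), one `.bak` file per database, and the helper name, database names, and folder are all hypothetical.

```python
from pathlib import PureWindowsPath

# System databases are recovered in their own earlier step (tempdb is
# rebuilt at startup), so the generated script skips them.
SYSTEM_DATABASES = ("master", "model", "msdb", "tempdb")


def quote_name(name):
    """Bracket-quote a SQL Server identifier."""
    return "[" + name.replace("]", "]]") + "]"


def build_restore_statements(backup_dir, databases):
    """Return one native RESTORE statement per user database.

    Assumes (hypothetically) that each backup was restored from tape
    to <backup_dir>\\<database>.bak before this runs.
    """
    statements = []
    for db in databases:
        if db.lower() in SYSTEM_DATABASES:
            continue
        backup_file = PureWindowsPath(backup_dir) / f"{db}.bak"
        path_literal = str(backup_file).replace("'", "''")
        statements.append(
            f"RESTORE DATABASE {quote_name(db)} "
            f"FROM DISK = N'{path_literal}' "
            f"WITH REPLACE, RECOVERY, STATS = 10;"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical database names and restore folder.
    for stmt in build_restore_statements(r"R:\restore", ["master", "Sales", "HR"]):
        print(stmt)
```

Generating the statements from a list keeps the process repeatable and easy to document, which is the main benefit of this option. The output still has to be reviewed and run against the rebuilt server by whoever owns the recovery.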
2nd thought: Use a Windows snapshot to restore the OS, the SQL binaries, and the SQL system databases, then recover the user databases using the aforementioned scripts. This also buys us the nicety of having LiteSpeed already installed.
- Benefits to DBA: Faster; system-level recovery is done with a method that is standard for our systems group
- Drawbacks: system/SQL recovery out of our (DBA) control
Since our systems engineers are already asking to go the snapshot route (because that’s common for other application servers), and we expect this method to take less overall time, we are planning on trying that first. Depending on how that test goes, we will likely keep option 1 as a backup plan or potentially try it next time. That’s why we’re testing: so that we can make sure we have it right.
As always, there’s more than one way to accomplish the same outcome, so my question is: how do you do off-site disaster recovery (testing)? Or maybe the better question is: do you do disaster recovery testing? If not, why?
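Since the choice between the two options comes down largely to how long each one takes, it may help to record per-step timings during each test so the runs can be compared afterwards. Here is a minimal Python sketch of that idea. The `StepTimer` class and the step names are hypothetical, and the total assumes the steps run one after another.

```python
import time
from contextlib import contextmanager


class StepTimer:
    """Collects wall-clock durations for the named steps of a DR test run."""

    def __init__(self):
        self.durations = {}  # step name -> seconds, in the order recorded

    @contextmanager
    def step(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the duration even if the step raised an error.
            self.durations[name] = time.perf_counter() - start

    def total(self):
        # Assumes the steps ran serially; overlapping steps would be
        # double counted here.
        return sum(self.durations.values())

    def report(self):
        lines = [f"{name}: {secs / 60:.1f} min"
                 for name, secs in self.durations.items()]
        lines.append(f"total: {self.total() / 60:.1f} min")
        return "\n".join(lines)


if __name__ == "__main__":
    timer = StepTimer()
    with timer.step("snapshot restore"):
        pass  # placeholder for the real work
    with timer.step("user database restores"):
        pass  # placeholder for the real work
    print(timer.report())
```

Keeping the report from each test makes it easier to see whether the snapshot route really is faster than the clean build.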
This entry was posted by Allen Kinsel on January 27, 2010 at 7:45 am, and is filed under Backup and Recovery, SQL Server.

about 3 years ago
What about mirroring to another SQL Server somewhere else? Is that an option? Also, how quickly do you need to be back up? That usually helps to determine some of your options as well.
about 3 years ago
One of the slickest ways I’ve seen recently is to have a completely virtual DR farm. Run just a few VMware (or Hyper-V) hosts, and let each team build up their own virtual servers in DR. You have limited hardware horsepower in DR, but you only do testing of apps there, not full-blown production. You might not even have enough horsepower to turn on all of your DR VMs at the same time.
Then when disaster strikes, you can easily grab more VMware host hardware, spread the load across more servers, and add capacity easily. You don’t have to buy hardware before you need it, and you can usually beef up the hardware fairly quickly even in disaster situations. It’s much easier to do this than to build the servers from scratch, and users are tolerant of decreased performance rather than having no servers at all.
about 3 years ago
Funny you should mention a virtual DR farm. I believe that we’re doing something similar, but the problem as I see it is: how do you physically get the servers recovered? Regardless of the hardware, there has to be a better method of getting the software up. Since most DR sites aren’t permanent, having everything running full-time offsite isn’t an option.