Need Redundancy, Disaster Recovery? Don’t buy that 2nd server!
For over ten years at XCentral, we have served, and continue to serve, businesses of all sizes, all of which have differing IT requirements that stem from their respective business goals. What is clear is that it’s not only Enterprise-sized businesses that need high-quality, highly effective Backup and Disaster Recovery strategies, but Small, Medium and Mid-market businesses as well.
Most businesses will have a backup of some sort whether quite simple or advanced. However, the real challenge has traditionally been the prohibitive cost of Disaster Recovery to achieve High Availability. To that end, a lot of businesses have suffered in silence, needing High Availability, but have been let down by their semi-functional Backup and Disaster Recovery solutions.
The topic of High Availability, Backup and Disaster Recovery is a large one, so the goal of this Blog article is to describe the fundamentals. I will show you that by utilising the latest technology, you can enjoy the feeling of liberation: you are no longer required to purchase multiple servers on-site, or to purchase expensive datacentre racks (Co-Lo), power, cooling and a multitude of additional equipment to keep a secondary datacentre functional in order to achieve a higher level of availability.
RTO and RPO
Before diving in, I want to establish two important concepts that commonly get thrown around, yet are well worth understanding: RPO and RTO. Let me explain the bigger picture through 2 questions:
Question 1: How much data can you afford to lose? 30 seconds? 5 minutes, 15 minutes, 30 minutes, 2 hours, 4 hours….1 day?
Question 2: In the event of a complete failure, how long can you be without your systems for? 30 seconds? 5 minutes, 15 minutes, 30 minutes, 2 hours, 4 hours….1 day?
Let’s presume your answer to Question 1 was 15 minutes. That would mean the following:
- Your RPO (Recovery Point Objective) is 15 minutes. Thus, your system must be backed up at least every 15 minutes, around the clock.
- The Recovery Point is the last time you had a successful backup. The RPO is the maximum age that Recovery Point is allowed to reach.
Now presume your answer to Question 2 was also 15 minutes. That would mean the following:
- Your RTO (Recovery Time Objective) is 15 minutes. Thus, you would need your system to be back up and running at 100% inside 15 minutes.
- The RTO is the time you have before your systems must be fully operational again. That is the Recovery Time.
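To make the two definitions concrete, here is a minimal Python sketch that checks a single incident against a 15-minute RPO and RTO. The function name `check_recovery` and the timestamps are made up purely for illustration; they do not come from any real incident or product.

```python
from datetime import datetime, timedelta


def check_recovery(last_backup, failure, restored, rpo, rto):
    """Compare one incident against the RPO and RTO targets."""
    data_loss = failure - last_backup  # work done since the last Recovery Point is lost
    downtime = restored - failure      # how long the systems were unavailable
    return {
        "data_loss": data_loss,
        "downtime": downtime,
        "rpo_met": data_loss <= rpo,
        "rto_met": downtime <= rto,
    }


# Hypothetical incident: last good backup at 9:50, failure at 10:00,
# systems fully restored at 10:20.
objective = timedelta(minutes=15)
result = check_recovery(
    last_backup=datetime(2015, 1, 1, 9, 50),
    failure=datetime(2015, 1, 1, 10, 0),
    restored=datetime(2015, 1, 1, 10, 20),
    rpo=objective,
    rto=objective,
)
print(result["rpo_met"], result["rto_met"])
```

In this made-up incident only 10 minutes of data is lost, so the RPO is met, but the systems are down for 20 minutes, so the RTO is missed.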
That sounds pretty straightforward, but broadly speaking, the complexity comes into play as follows:
- The quality of your backup system typically governs your RPO. These days, there are many automated backup systems that can keep your systems backed up, right down to a 30-second window.
- So, for the most part, an RPO is typically straightforward to achieve.
- Your RTO is typically governed by good monitoring, good IT staff, and a really rock solid process ‘in case of a disaster.’
- So achieving an RTO is typically a challenge, as it has traditionally been a lot harder to automate.
- In the example above, if we had an outage at 10 am, we would be required to be 100% operational by 10:15 am. If you think about a system crash, by the time you get people on the phone, work out what has gone wrong, and then enact a plan, you can see that 15 minutes would be nearly impossible to meet.
- To that end, this is traditionally why people buy multiple servers, run clusters, and run secondary data centres. That way, in the event of a disaster, automatic failover happens and 15-minute windows can be achieved. Hopefully.
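The 10 am outage example can be sketched as simple arithmetic. The step names and durations below are hypothetical guesses at what a manual response might involve; they are not measurements, and your own numbers will differ.

```python
from datetime import datetime, timedelta


def recovery_finish(outage_start, step_minutes):
    """Return the time at which the last sequential recovery step completes."""
    return outage_start + timedelta(minutes=sum(step_minutes.values()))


outage_start = datetime(2015, 1, 1, 10, 0)       # the 10 am outage from the example
deadline = outage_start + timedelta(minutes=15)  # a 15-minute RTO means 10:15 am

# Hypothetical durations, in minutes, for a fully manual response.
manual_steps = {
    "notice the outage and get people on the phone": 10,
    "work out what has gone wrong": 20,
    "enact the recovery plan": 30,
}

finish = recovery_finish(outage_start, manual_steps)
rto_met = finish <= deadline
print(deadline.strftime("%H:%M"), finish.strftime("%H:%M"), rto_met)
```

With these assumed durations the manual response finishes at 11:00, well past the 10:15 deadline, which is the gap that automatic failover is meant to close.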
So what is the answer?
I promised liberation at the beginning of this Blog, so how can it be achieved? More than that, how can it be achieved without purchasing all the additional expensive ‘stuff’ I listed above and skilled staff to maintain it?
The answer is Azure Site Recovery, or ASR. What on earth is that? ASR is a Microsoft product that runs in the Microsoft Azure cloud and has made Disaster Recovery and High Availability affordable for SMBs. As mentioned previously, such solutions to date have required some or all of the following:
- A second Datacentre, or equipment in a second location
- Multiple pieces of equipment on-site
- Expensive network links
- A large capital outlay, thus a high barrier to entry
- Expensive monthly costs to maintain all the equipment, and to pay rent
- Complex management and multiple IT engineers to manage the various moving cogs
With ASR, by contrast, Small Businesses can now have Disaster Recovery and High Availability by paying a small monthly fee for their servers. That minimal monthly fee keeps a copy of the servers in the Microsoft Azure data centres in standby mode. The price only increases during a DR simulation, or in the event of a Disaster itself. So the beauty of ASR for the Small Business is that the solution is:
- Resilient: Essentially, they have a second Data centre where their servers replicate to, for a small monthly fee
- Cost Effective: Not requiring any capital purchases
- Flexible: The solution is easy to work on, thus reducing labour
- Smart: It is like an off-site insurance policy without all the expensive costs that are typical of traditional solutions
- Secure: Virtual Hard Disks can be encrypted at rest using a secure, customer-managed encryption key that ensures best-in-class security and privacy for your application data when it is replicating to Azure.
- Self-service Disaster Recovery: ASR provides full support for DR drills via test failover, planned failover with zero data loss, unplanned failover, and failback.
- Audit/Compliance Reporting: DR testing and drills can be performed without any impact to production workloads which means you get risk-free, high-confidence testing that meets your audit and compliance objectives. You can also generate reports for every activity performed which means you can meet all your audit requirements.
Is this “Pie in the Sky?”
Let me give you a real-world example through a case study. The customer is Guardian Strata, which is owned by Ossie Pisanu. They are a 15-person business with an RTO and RPO of 15 minutes. Let me repeat that: a 15-person business with an RTO and RPO of 15 minutes. It has been tested, and it is achieved through a single server on-site, accompanied by Azure Site Recovery (ASR) providing the off-site component. You can read more about Guardian Strata through the #ModernBiz campaign here. In Episode 2 of the Tech behind the makeover video, which is here, you can hear all about what we found.
Guardian Strata is the perfect example of the average Small Business which is sensitive to outages, and which would be crippled by a Disaster. In the past, they suffered in silence, needing an Enterprise-grade solution but being left high and dry due to the debilitating costs usually associated with Enterprise-grade solutions. I can confidently state that just 12 months ago, without Microsoft Azure, there is no way Ossie could have afforded a solution that provides a 15-minute RTO/RPO window. The cost would have sent his Small Business bankrupt. Fast forward to 2015: Ossie’s business has a single on-site server and, with the power of ASR, an off-site solution providing peace of mind.
I mentioned before that the difficulty with the RTO (what happens after a Disaster) is the automation, and the fact that human intervention is usually required. With ASR, you can still keep a human interaction aspect if you like, and there are many good reasons to. But the great thing is the Continuous Health Monitoring built into ASR. The ASR service itself is monitoring the state of your servers continuously and remotely from Azure. So the point here is that with ASR, the RTO side of the equation can be completely automated for the Small Business. What a breath of fresh air!
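The general idea of monitoring-driven failover can be sketched in a few lines of Python. This is only an illustration of the concept and not how ASR itself is implemented; the function name `failover_trigger_index`, the threshold of three consecutive failures, and the sample health checks are all assumptions made for the example.

```python
def failover_trigger_index(health_checks, threshold=3):
    """Return the index of the health check at which failover would be
    triggered, or None if the threshold is never reached.

    health_checks is a sequence of booleans (True means the server answered).
    Failover is triggered after `threshold` consecutive failed checks, so a
    single missed check does not cause an unnecessary failover.
    """
    consecutive_failures = 0
    for index, healthy in enumerate(health_checks):
        if healthy:
            consecutive_failures = 0  # any healthy check resets the count
        else:
            consecutive_failures += 1
        if consecutive_failures >= threshold:
            return index
    return None


# Hypothetical sequence of checks: one isolated blip, then a real outage.
checks = [True, True, False, True, False, False, False, True]
trigger = failover_trigger_index(checks)
print(trigger)
```

The isolated failure at position 2 is ignored, and the sketch only signals a failover once three checks in a row have failed. Keeping a human in the loop, as mentioned above, would simply mean asking for confirmation at that point instead of failing over automatically.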
In Small, Medium and Mid-market businesses, every dollar counts. Having a highly scalable and robust DR solution usually means having a stack of IT resources essentially sitting on a shelf collecting dust, costing an arm and a leg. Enterprises throw around terminology such as Backup and DR all the time, and will even ask: What is your BCP, or Business Continuity Plan? Careers are made and destroyed with those 3 terms, and having a workable solution is the Holy Grail of most businesses and IT integrators.
As I have shown, by utilising ASR, Guardian Strata has achieved what the Board Rooms of most Enterprises dream of having. We live in a day and age where we are becoming more and more reliant on the systems we connect to each day, and every blip on the radar is felt more and more. But the great news is that technology is providing new ways to solve old problems and new ways to reduce costs. No matter who you are, I am sure that is important to you.