A network disaster recovery plan is a set of procedures designed to prepare your company or organization to respond to an interruption of network services during a natural or man-made incident. Network disaster recovery planning also provides clear and easy-to-follow guidelines for restoring network services and normal operations following an incident, an emergency, or a crisis. We at Paessler have a detailed recovery plan ourselves, stored on our intranet and accessible to all employees, up to senior management, who have a role to play if the occasion arises.
The best approach to disaster recovery focuses primarily on planning and prevention. While earthquakes and terror attacks are difficult to anticipate, many other disaster scenarios can be analyzed in detail. For those events that can't be prevented, an IT disaster recovery plan considers the need to:
- Detect the outages or other disaster effects as quickly as possible
- Notify any affected party so that they can act
- Isolate the affected systems so that damage cannot spread
- Repair the critical affected systems so that operations can be resumed
These points are collectively called risk management or risk mitigation activities. When executed well, disaster recovery procedures save large sums of money. Even a few hours of network downtime or lost web connectivity can easily cost a corporation a very large amount. Collaboration specialist Avaya researched this topic and found the following: 81% of the European companies surveyed in 2013 suffered network outages, causing average costs of $70,000 at 77% of the surveyed companies. As a side effect, the responsible IT staff were fired in every fifth company concerned, and in Germany in every fourth.
When creating your own recovery plan, the first question should be:
Question # 1: Which Scenarios Can Cause Downtime Within My Network?
There are various scenarios to consider. Possible threats to your network are:
- Attacks (internal and external, including attacks from hackers and viruses)
- Blackouts
- Physical damage (everything from sabotage, weather, and high voltage to accidents, fires and water damage)
- Misconfigurations
- Failed updates
Make sure that you’ve considered every possible threat and scenario. No risk should be missing from your preparation and recovery plan. Be prepared if it happens! Since not all services of your business are equally relevant, the next question to consider is...
Question # 2: How Important Is Each Service for My Business Processes?
Your services very likely differ in importance for your business processes and your success. To find out which actions take priority over others, you need to identify the services with the greatest significance. Ask yourself the following questions:
- Who will be affected by downtime of a given service?
- What is the impact on my business?
- How much downtime is acceptable?
- What are the benchmarks for an “incident”, an “emergency” and a “crisis”?
A network disaster recovery plan lists the contact persons for the various tasks (see question # 4), from your company’s SysAdmins up to higher management positions and the CEO. An “incident” could mean a downtime of 30 to 60 minutes, while a “crisis” could mean a downtime of 24 hours or more and could lead to significant capital losses. Depending on the expected damage, you classify the disaster as an incident, an emergency or a crisis and can respond appropriately.
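The classification by expected downtime can be sketched in a few lines of Python. This is only an illustration: the 24-hour crisis boundary and the 60-minute upper end of an incident come from the examples above, while treating everything in between as an "emergency" (and anything shorter than 30 minutes as an incident) is an assumption, as are the names `Severity` and `classify_downtime`. Your own plan should define its own benchmarks.

```python
from datetime import timedelta
from enum import Enum


class Severity(Enum):
    INCIDENT = "incident"
    EMERGENCY = "emergency"
    CRISIS = "crisis"


# Example benchmarks. The 60-minute and 24-hour values follow the examples
# in the text; the "emergency" band in between is an assumption.
INCIDENT_MAX = timedelta(minutes=60)
CRISIS_MIN = timedelta(hours=24)


def classify_downtime(expected_downtime: timedelta) -> Severity:
    """Map an expected downtime to a severity level (hypothetical helper)."""
    if expected_downtime >= CRISIS_MIN:
        return Severity.CRISIS
    if expected_downtime > INCIDENT_MAX:
        return Severity.EMERGENCY
    return Severity.INCIDENT


if __name__ == "__main__":
    for downtime in (timedelta(minutes=45), timedelta(hours=6), timedelta(hours=30)):
        print(downtime, "->", classify_downtime(downtime).value)
```

Keeping the thresholds in named constants makes it easy to adjust them per service, since an acceptable downtime for an internal wiki may differ from that of a web shop.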
When considering how important a service is, you shouldn’t forget to look at its dependencies:
- Which dependencies does this service have?
- Which services depend on this service?
A good way to visualize your dependencies could be a stairway diagram.
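Both dependency questions can also be answered programmatically once the dependencies are written down. The following Python sketch assumes a hand-maintained mapping from each service to the services it depends on; the service names and the function `affected_by` are hypothetical examples, not taken from any real setup.

```python
from collections import deque

# Hypothetical example data: each service maps to the services it depends on.
DEPENDS_ON = {
    "webshop": ["web_server", "database"],
    "web_server": ["network"],
    "database": ["storage", "network"],
    "email": ["network"],
    "storage": [],
    "network": [],
}


def affected_by(failed, depends_on):
    """Return every service that directly or indirectly depends on `failed`."""
    # Invert the mapping: for each service, collect the services relying on it.
    dependents = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    # Breadth-first walk from the failed service through its dependents.
    affected = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, ()):
            if service not in affected:
                affected.add(service)
                queue.append(service)
    return affected


if __name__ == "__main__":
    print(sorted(affected_by("storage", DEPENDS_ON)))
    print(sorted(affected_by("network", DEPENDS_ON)))
```

A service that many others depend on, such as `network` in this example, is a natural candidate for the highest priority in the plan.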
Question # 3: What Can I Do to Mitigate Some/Most of the Possible Risks and at What Price?
Investment and high availability are directly correlated. Keeping downtimes rare and short can be expensive. Not being able to offer your services to employees and clients may be even more expensive, though. The key is to compare how much it costs to make a service highly available with how likely the different causes of downtime are and how much they would cost you in lost revenue.
Below is a formula that can be used for this analysis. A mitigation measure pays off when its cost is lower than the expected loss:
(costs < (loss in revenue * probability of occurrence))
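As a small worked example, the comparison can be expressed in Python. The function name `mitigation_pays_off` and all figures below are made up for illustration; the probability is assumed to be a value between 0 and 1 for the period you are planning for.

```python
def mitigation_pays_off(cost, loss_in_revenue, probability):
    """Return True if the mitigation cost is below the expected loss.

    Implements: cost < loss_in_revenue * probability
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    expected_loss = loss_in_revenue * probability
    return cost < expected_loss


if __name__ == "__main__":
    # Illustrative numbers only: a 20,000 measure against a 70,000 outage.
    print(mitigation_pays_off(20_000, 70_000, 0.5))  # expected loss 35,000
    print(mitigation_pays_off(20_000, 70_000, 0.2))  # expected loss about 14,000
```

With a 50% chance of the outage, the expected loss of 35,000 exceeds the 20,000 investment, so the measure is worthwhile. At 20% the expected loss drops to about 14,000 and the same investment no longer pays for itself on these numbers alone.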
Question # 4: In Case of an Incident, Emergency or Crisis, Who Are My Contact Persons?
Hope for the best, but prepare for the worst. In the worst-case scenario, the responsible people in your company need to be informed, and they need to inform others. So record:
- Who will inform?
- Who needs to be informed?
And in addition:
- Who are my external contact persons at the police, vendors or data center provider(s)?
The mantra of every viable disaster recovery plan is: “Keep your network safe. But be prepared if something goes terribly wrong.” Those were our top four questions to ask yourself when creating a network disaster recovery plan. Please tell us your opinion in the comment section beneath this article.