Disaster Recovery Planning and Network Services Continuity

Disaster recovery planning (DRP) starts with a discussion that involves key management employees, since their support is essential to any disaster recovery initiative. Explain what disaster recovery is and why it is required for business continuity, cost reduction, revenue generation and improved productivity. Disaster scenarios such as fire, flood, earthquake, cold weather and employee sabotage should be discussed. Alternate vendors should be discussed as well, since a vendor failure is a potential business continuity issue.

Risk Assessment

The risk assessment is a “what-if” analysis that describes the amount of risk associated with the current state of the network. Consider the following before formulating any disaster recovery strategy.

• Average cost per minute that your network is unavailable.

• Cost of replacing servers, applications, circuits and devices.

• Which disaster recovery plan, if any, exists and how extensive it is.

• Whether alternate vendors have been identified in case primary vendors have their own disaster recovery problems.
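
The first two considerations above reduce to simple arithmetic. The following is a minimal sketch of that calculation; the function names and all dollar figures are hypothetical and chosen only for illustration.

```python
def downtime_cost(cost_per_minute: float, outage_minutes: float) -> float:
    """Direct cost of an outage: per-minute cost times outage length."""
    if cost_per_minute < 0 or outage_minutes < 0:
        raise ValueError("inputs must be non-negative")
    return cost_per_minute * outage_minutes


def total_exposure(cost_per_minute: float, outage_minutes: float,
                   replacement_costs: dict) -> float:
    """Outage cost plus the cost of replacing damaged equipment."""
    return downtime_cost(cost_per_minute, outage_minutes) + sum(replacement_costs.values())


# Hypothetical figures: a two-hour outage at $500 per minute,
# plus replacement of servers, circuits and devices.
example = total_exposure(
    cost_per_minute=500.0,
    outage_minutes=120,
    replacement_costs={"servers": 40_000.0, "circuits": 5_000.0, "devices": 15_000.0},
)
print(f"Estimated exposure: ${example:,.2f}")
```

Running the same calculation for each disaster scenario gives management a rough figure to weigh against the cost of the recovery strategy.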

Disaster Recovery Strategy

The disaster recovery strategy describes operational changes, design changes and failover strategies for business continuity. An action plan document describes all of those strategies, along with a detailed escalation procedure to follow should the network become unavailable. It should document employees, responsibilities, time frames, event sequence, vendors and processes.

The following describes recommended operational changes:

1.  Network Documentation

Automate the network documentation process. It is difficult to restore a network without current documentation of its state before it became unavailable. Running a network assessment will collect some information; however, you need application and device configurations as well. Find a tool that will automate this process!

Document these items:

• Current Topology

• Infrastructure

• Security Policies

• Management Strategy

• Application Configurations, Versions and Patches

• Device Configurations, IOS Versions and Firmware
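
One way to keep the documentation items above current is to store them as structured records and flag any that have not been verified recently. The sketch below assumes a hypothetical record layout and a 30-day freshness limit; neither comes from a specific documentation tool, and all device names and versions are invented.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class DeviceRecord:
    """One documented device (hypothetical fields for illustration)."""
    name: str
    role: str
    os_version: str      # e.g. the IOS version
    firmware: str
    config_file: str     # where the saved configuration is stored
    last_verified: str   # ISO date the record was last checked


def stale_records(records, today: date, max_age_days: int = 30):
    """Return names of devices whose documentation is older than max_age_days."""
    return [
        r.name
        for r in records
        if (today - date.fromisoformat(r.last_verified)).days > max_age_days
    ]


# Hypothetical inventory entries.
inventory = [
    DeviceRecord("core-switch-1", "switch", "15.2", "1.4", "configs/core-switch-1.cfg", "2024-02-20"),
    DeviceRecord("edge-router-1", "router", "15.1", "1.2", "configs/edge-router-1.cfg", "2023-12-01"),
]

print(json.dumps([asdict(r) for r in inventory], indent=2))
print("Needs re-verification:", stale_records(inventory, today=date(2024, 3, 1)))
```

A scheduled job could run such a check and report stale entries, so the documentation does not quietly drift from the live network.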

2. Regular Backups rotated off-site and tested for data integrity
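
Testing a backup for data integrity can be as simple as restoring a file and comparing its checksum with the original. The following is a minimal sketch under that assumption, using SHA-256 from the Python standard library; the function names are illustrative and a real backup product may provide its own verification.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large backups do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original: Path, restored: Path) -> bool:
    """True when the restored file is byte-for-byte identical to the original."""
    return sha256_of(original) == sha256_of(restored)
```

Recording the checksum at backup time and comparing it again after a test restore also catches media that has degraded in off-site storage.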

The following describes recommended design changes:

Review and modify design, infrastructure, configuration, security and management for improved network resiliency and availability. It is my contention that running a network assessment is an effective strategy for determining what changes should be made to your network. The argument could be made that all assessment groups have some effect on network availability and resiliency. The availability assessment will collect most of the key information; however, the security assessment must also be considered, since problems with company security will expose your network to attacks. When your network is being attacked, it isn’t available!

Management strategy assessments are key as well, since the absence of effective management policies and applications creates a tenuous situation. For instance, without change management policies, employees will change application and device configurations (assuming they have security authorization) without prior approval and at any time of day. Suppose a configuration change doesn’t work as expected, and it is 10 am, just as employees are starting their day. Guess what, your day just got longer.

Proactive fault and performance monitoring strategies will indicate when a device or server is not operational or is near capacity. Both situations affect network availability. The performance assessment describes how well the network is performing, whether there are any capacity issues and which offices are affected. The infrastructure assessment focuses on issues such as media mismatches, switch port capacity, IOS version problems, router memory shortages, application software versions and protocols. The availability assessment also considers facilities, focusing on rack space, temperature controls, power availability and raised floors.
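
The proactive fault and performance monitoring mentioned above amounts to two checks per device: is it reachable, and is it near capacity. The sketch below illustrates that logic only; the `Sample` structure, the device names and the 80 percent warning threshold are assumptions, and a real deployment would feed it from a monitoring system.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One monitoring reading for a device or server (hypothetical layout)."""
    device: str
    reachable: bool
    utilization_pct: float


def classify(sample: Sample, warn_pct: float = 80.0) -> str:
    """Label a reading as 'down', 'near capacity' or 'ok'."""
    if not sample.reachable:
        return "down"
    if sample.utilization_pct >= warn_pct:
        return "near capacity"
    return "ok"


def alerts(samples, warn_pct: float = 80.0) -> dict:
    """Return only the devices that need attention, keyed by device name."""
    statuses = {s.device: classify(s, warn_pct) for s in samples}
    return {device: status for device, status in statuses.items() if status != "ok"}


# Hypothetical readings.
readings = [
    Sample("core-sw", True, 45.0),
    Sample("edge-rtr", True, 91.5),
    Sample("file-srv", False, 0.0),
]
print(alerts(readings))
```

The threshold is deliberately a parameter, since the right warning level depends on the device and on how quickly its load can grow.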

Select Failover Strategies

1. Online data synchronization between the production Data Center and a remote Data Center facility. The cutover or convergence time should be transparent to employees, and all current data should remain available. This carries the cost of a remote facility with routers, switches and matching servers and applications to keep the two facilities synchronized. Cisco distributed director technology can be used to configure both Data Centers for concurrent operation if that is required.

2. Configure the distributed director to redirect sessions to the alternate Data Center once a certain percentage of
