What is disaster recovery?


Disaster Recovery is a service for recovering IT systems and data after a failure. Cloud providers offer the service either separately or as part of a larger solution. It can be divided into three components: a backup site, DR software solutions, and a recovery plan.

Why Disaster Recovery is important

The more actively a company uses its IT infrastructure, the more it depends on that infrastructure being available. Disruptions directly affect the organization's revenue and reputation. They also have a negative impact on employee efficiency and customer comfort. Therefore, companies invest considerable resources to reduce the risk of infrastructure failures.

Besides ensuring reliability, a company must also be prepared to recover fully and as quickly as possible in the event of a disruption. The sooner everything is up and running again, the fewer negative consequences for the company. This is why Disaster Recovery solutions are important.

Disaster Recovery is part of a business continuity plan. The idea is that the company should keep working regardless of internal failures, cyber-attacks, and other incidents and, if a disaster does occur, should not lose valuable assets and should quickly restore operations.

Disaster recovery requires a backup infrastructure that either stores data and virtual machine templates or serves as an additional system that handles business tasks during the incident.

Disaster Recovery solutions are most often offered by cloud providers. The cloud service provider (CSP) provides cloud capacity to host a backup information system (IS), while the primary IS is located in another data center. Data is transferred simultaneously to both the main and the backup IS through configured communication channels.

It is important to understand that disaster recovery as a service (DRaaS) is not the same as a backup solution. The main task of a backup system is to save data. Disaster Recovery, on the other hand, aims to reduce the downtime of IT systems. A backup alone will not keep operations running on a backup platform while the main site is being restored. With DRaaS, the organization has a site that is identical to the main site and ensures business continuity.

Key Disaster Recovery Parameters

RTO and RPO are the two main parameters of Disaster Recovery solutions. They determine the cost of a disaster-resilient system and the amount of damage in the event of an incident.

RTO (recovery time objective) is the maximum period of time within which the IT system must be restored after a failure. RTO values vary from company to company. For example, a value of 4 hours means that the infrastructure will be up and running within that time. With an RTO of a few seconds, users may not even notice the system crash. Some disaster recovery solutions support automatic traffic transfer to the backup infrastructure, which mitigates the effects of a disaster by making the switchover seamless to users. The RTO reflects how long your business can survive without IT. For example, for a large online store, an RTO of 2-3 hours could lead to serious losses.

RPO (recovery point objective) is the maximum period for which data can be lost due to a disaster, that is, the time between the last valid backup and the outage. For example, an RPO of three hours means that once the system is restored, no more than the last three hours of data before the incident will be missing. An RPO of a few seconds preserves almost all data, which is especially crucial for banks, large developers, and other organizations that cannot afford to lose even a minute of data.

The RPO value affects the frequency of creating IT infrastructure copies.
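The link between the RPO and the copy frequency can be illustrated with a short sketch. The Python below is only an illustration under stated assumptions: the function names and the sample timestamps are hypothetical and do not come from any particular DR product.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, outage: datetime) -> timedelta:
    """Return how much data (measured in time) is lost when the system
    is restored from the last valid copy taken before the outage."""
    if last_backup > outage:
        raise ValueError("the last valid backup must precede the outage")
    return outage - last_backup


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """In the worst case the outage happens just before the next copy,
    so the interval between copies must not exceed the RPO."""
    return backup_interval <= rpo


# Hypothetical incident: last copy at 08:00, outage at 10:30.
last_backup = datetime(2021, 9, 20, 8, 0)
outage = datetime(2021, 9, 20, 10, 30)
print(data_loss_window(last_backup, outage))  # 2:30:00

# With an RPO of 3 hours, copies every 4 hours are too rare,
# while copies every hour are frequent enough.
print(meets_rpo(timedelta(hours=4), timedelta(hours=3)))  # False
print(meets_rpo(timedelta(hours=1), timedelta(hours=3)))  # True
```

In other words, an RPO of three hours requires a copy at least every three hours; anything less frequent risks losing more data than the objective allows.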

The lower the RTO/RPO, the higher the cost of the Disaster Recovery solution. Select a DR model that costs no more than the loss in the event of downtime. You need to find a balance between the cost of disaster resiliency and loss due to an incident, taking into account the time it takes to restore business processes and the amount of data lost.
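One way to reason about this balance is to compare, for each candidate DR model, the yearly price of the solution plus the loss expected from incidents. The sketch below is a deliberately simplified model. Every figure in it (prices, incident frequency, hourly losses) is a made-up assumption for illustration, and the linear loss formula is an assumption as well, not a method taken from any provider.

```python
from dataclasses import dataclass


@dataclass
class DrOption:
    name: str
    annual_cost: float  # yearly price of the DR solution (hypothetical)
    rto_hours: float    # time needed to restore operations
    rpo_hours: float    # window of data that may be lost


def expected_annual_loss(opt: DrOption, incidents_per_year: float,
                         downtime_cost_per_hour: float,
                         data_loss_cost_per_hour: float) -> float:
    """Assumed linear model: each incident costs RTO hours of downtime
    plus RPO hours of lost data."""
    per_incident = (opt.rto_hours * downtime_cost_per_hour
                    + opt.rpo_hours * data_loss_cost_per_hour)
    return incidents_per_year * per_incident


def total_annual_cost(opt: DrOption, incidents_per_year: float,
                      downtime_cost_per_hour: float,
                      data_loss_cost_per_hour: float) -> float:
    """Price of the solution plus the expected loss from incidents."""
    return opt.annual_cost + expected_annual_loss(
        opt, incidents_per_year, downtime_cost_per_hour, data_loss_cost_per_hour)


# All numbers below are illustrative assumptions.
options = [
    DrOption("backup only", annual_cost=5_000, rto_hours=8, rpo_hours=24),
    DrOption("replication", annual_cost=20_000, rto_hours=0.5, rpo_hours=0.25),
    DrOption("synchronous cluster", annual_cost=60_000, rto_hours=0.01, rpo_hours=0),
]
params = dict(incidents_per_year=2, downtime_cost_per_hour=3_000,
              data_loss_cost_per_hour=1_000)

for opt in options:
    print(f"{opt.name}: {total_annual_cost(opt, **params):.0f}")

best = min(options, key=lambda o: total_annual_cost(o, **params))
print("cheapest overall:", best.name)  # replication
```

With these assumed inputs the cheapest solution is not the cheapest overall, and neither is the most resilient one; different loss figures would shift the balance toward either end.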

What is a Disaster Recovery Plan

A Disaster Recovery Plan (DRP) is a document describing in detail all data recovery activities. The plan specifies the roles and responsibilities of the employees involved and the sequence of actions to be taken.

It is not easy to say exactly when a company needs a DRP. The criteria can be formulated as follows:

  • A server/application outage or database loss entails significant financial, reputational, or other losses;
  • The company has a full-fledged IT department with its own budget;
  • There is a real opportunity to allocate funds for a full or at least partial backup in case of disaster.

If losing the database for a day is not critical and the IT department can afford to wait months for new server components, a DRP is hardly necessary, although the document can still be helpful.

The main purpose of the Disaster Recovery Plan is to provide step-by-step instructions that outline the timing of particular procedures. With the plan in place, the company can:

  • Recover IT infrastructure more quickly after a disruption;
  • Keep critical processes running during the main site downtime;
  • Keep important company data safe.

A Disaster Recovery Plan consists of several parts. Primarily, these are the goals, the risk factors, and a list of critical services.

The goals of the DRP include:

  • Preparing employees. It is important that, when faced with an emergency, they do not get confused but act according to the instructions.
  • Keeping the service up and running. Services must be restored quickly and data integrity must be preserved.
  • PR contacts. Correct interaction with media, customers, and partners plays an important role during an incident.
  • Compliance with standards. During disaster recovery, it is important to adhere to corporate standards to avoid chaos.

Risk factors indicate which processes require special attention during disaster recovery. The document spells out actions to address these risks. Examples include verifying that backups are created correctly, that backup communication channels work, and that the right equipment is available.

The list of critical services determines the order of recovery procedures. The more sensitive the process, the faster it must be restored to normal operation. Disaster recovery implies that critical services are migrated to a backup platform, so their availability must be maintained even in the event of a serious incident. But if something goes wrong with the backup platform as well, the recovery work starts with the most mission-critical systems.
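The ordering rule described above can be sketched in a few lines of Python. The service names, criticality levels, and RTO values here are hypothetical, and breaking ties by the tighter RTO is an assumption added for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    criticality: int  # 1 = most mission-critical (assumed scale)
    rto_minutes: int  # target recovery time for this service


def recovery_order(services):
    """Most critical services first; ties go to the tighter RTO."""
    return sorted(services, key=lambda s: (s.criticality, s.rto_minutes))


# Hypothetical list of services taken from a DRP.
services = [
    Service("internal wiki", criticality=3, rto_minutes=1440),
    Service("payment gateway", criticality=1, rto_minutes=15),
    Service("order database", criticality=1, rto_minutes=5),
    Service("email", criticality=2, rto_minutes=240),
]

for step, svc in enumerate(recovery_order(services), start=1):
    print(step, svc.name)
# 1 order database
# 2 payment gateway
# 3 email
# 4 internal wiki
```

Keeping the list in the plan as structured data rather than free text makes the recovery sequence unambiguous when an incident actually happens.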

DRaaS solution by Cloud4Y

Corporate cloud provider Cloud4Y offers three disaster recovery models:

 

Backup. Data backup follows the Active-Passive scheme. The RTO/RPO depend on the data volume and start from 1 hour. The model is suitable for all types of businesses for which recovery time is not crucial and some data loss is acceptable.

VM Replication. Replication of data to a remote site (Active-Standby). RTO and RPO start from 30 and 15 minutes, respectively. This option is suitable for web platforms, e-commerce, and big data. It offers a basic SLA of 99.982%, business continuity, simplified failover and failback, and near-CDP for any virtualized application.

SyncCluster. Synchronous data mirroring using the Active-Active scheme. The RTO is 30 seconds, and the RPO starts from 0 seconds. The solution is suitable for banks, large IT companies, government institutions, and big data. Replication is provided at the storage level. The fault tolerance SLA is 99.99%, and protection even from natural disasters is guaranteed. The distance between the redundant data centers is 10 km.
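To get a feel for what the SLA percentages above mean in practice, they can be converted into the maximum downtime they permit. The sketch below assumes the SLA is measured over a 365-day year, which the article does not state; the function name is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760; assumes a non-leap year


def allowed_downtime_minutes(sla_percent: float) -> float:
    """Maximum yearly downtime, in minutes, permitted by an availability SLA."""
    unavailable_fraction = (100.0 - sla_percent) / 100.0
    return HOURS_PER_YEAR * 60 * unavailable_fraction


for sla in (99.982, 99.99):
    print(f"{sla}% -> {allowed_downtime_minutes(sla):.1f} min/year")
# 99.982% -> 94.6 min/year
# 99.99% -> 52.6 min/year
```

Under this assumption, the step from 99.982% to 99.99% shrinks the permitted yearly downtime from roughly an hour and a half to under an hour.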

Cloud-based Disaster Recovery solutions are easier to organize and manage, and cheaper, than building your own infrastructure. With Cloud4Y's DRaaS service, you can be sure of getting back to normal operations within the period specified in the contract. Designing interaction, connectivity, and routing schemes does not take much time, so any business can easily implement disaster recovery solutions.


author: Alexander Vorontsov
published: 09/20/2021