FORTISEU

Disaster Recovery Planning: A Complete Guide

14 min read · Updated 2026-03-18

A complete guide to disaster recovery planning, covering DR fundamentals; DORA ICT response and recovery requirements under Articles 11 and 12; setting and validating RTO and RPO; DR site strategies with EU data sovereignty considerations; and DR testing, exercising, and lessons-learned processes.

Key Takeaways
  1. Disaster recovery planning derives from business impact analysis and risk assessment. RTO and RPO are business risk decisions set by business stakeholders, not technical parameters chosen by IT teams.

  2. DORA Articles 11-12 impose prescriptive DR requirements on financial entities, including annual testing, backup segregation, restoration verification, and management body governance of ICT business continuity policies.

  3. RTO and RPO targets must be validated through testing under realistic conditions. A stated RTO that cannot be achieved in practice creates a false sense of security and a regulatory compliance gap.

  4. EU data sovereignty requirements constrain DR site selection. Multi-region strategies within the EU provide geographic diversity while maintaining full data residency compliance under GDPR and sector-specific regulations.

  5. DR testing should follow a progressive programme from component testing through full-scale simulation, with documented lessons learned driving continuous improvement and providing supervisory evidence of operational resilience.

1. Disaster Recovery Planning Fundamentals

Disaster recovery (DR) planning is the process of establishing policies, procedures, and technical capabilities to restore critical technology systems and data following a disruptive event. It is a subset of business continuity planning (BCP) — while BCP addresses the organisation's overall ability to continue operating during and after a disruption, DR focuses specifically on the technology layer: recovering IT systems, applications, data, and infrastructure to support business operations. The relationship is hierarchical: business continuity requirements drive disaster recovery requirements, and DR capabilities enable business continuity.

The fundamental inputs to DR planning are a business impact analysis (BIA) and a risk assessment. The BIA identifies critical business processes, the IT systems that support them, and the impact of those systems being unavailable over time. The risk assessment identifies the threats that could cause disruption — ranging from hardware failure and software corruption to natural disasters, cyberattacks, and supply chain failures. Together, the BIA and risk assessment produce the two metrics that drive all DR planning: the Recovery Time Objective (RTO), the maximum acceptable time between a disruption and the restoration of service, and the Recovery Point Objective (RPO), the maximum acceptable amount of data loss measured in time (e.g., an RPO of 1 hour means the organisation can tolerate losing up to 1 hour of data).
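To make the two metrics concrete, the sketch below (a hypothetical illustration; the function and field names are not from any standard) compares one measured recovery against stated RTO and RPO targets. Downtime is the gap between disruption and restoration; worst-case data loss is the gap between the last good backup and the disruption:

```python
from datetime import datetime, timedelta

def evaluate_recovery(disruption_start: datetime,
                      service_restored: datetime,
                      last_good_backup: datetime,
                      rto: timedelta,
                      rpo: timedelta) -> dict:
    """Compare one measured recovery against stated RTO/RPO targets."""
    downtime = service_restored - disruption_start   # actual recovery time
    data_loss = disruption_start - last_good_backup  # worst-case data-loss window
    return {
        "downtime": downtime,
        "data_loss": data_loss,
        "rto_met": downtime <= rto,
        "rpo_met": data_loss <= rpo,
    }
```

For example, a disruption at 09:00 with service restored at 12:30 and a last good backup at 08:30 yields 3.5 hours of downtime and a 30-minute data-loss window, meeting a 4-hour RTO and a 1-hour RPO.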

A DR plan is not a static document — it is a living operational capability. The plan must define recovery procedures for each critical system (step-by-step instructions that operations staff can follow under stress), roles and responsibilities (who declares a disaster, who executes recovery, who communicates with stakeholders), escalation pathways (from technical teams to executive management to external parties including regulators and customers), dependencies and sequencing (the order in which systems must be recovered, based on inter-system dependencies), and resource requirements (personnel, infrastructure, third-party services, and consumables needed for recovery). A plan that exists only as a document in a governance repository, untested and unfamiliar to the teams who would execute it, provides a false sense of security that is arguably worse than having no plan at all.

DR planning is driven by two metrics: RTO (how quickly you must recover) and RPO (how much data loss you can tolerate). Both derive from business impact analysis and should be set by business stakeholders, not IT teams — they are business risk decisions with technology implications.

2. DORA ICT Response and Recovery Requirements (Articles 11-12)

DORA dedicates specific attention to ICT response and recovery in Articles 11 and 12, establishing requirements that go beyond general DR best practice. Article 11 requires financial entities to put in place a comprehensive ICT business continuity policy, which must define arrangements, plans, procedures, and mechanisms to ensure the continuity of the entity's critical or important functions, respond quickly and effectively to all ICT-related incidents to minimise damage and resume operations, and activate dedicated plans containing the measures, actions, and procedures to be applied for each type of ICT-related incident. Article 11(3) requires that the policy be implemented through dedicated, appropriate, and documented arrangements, plans, procedures, and mechanisms.

Article 11(4) introduces specific requirements for ICT business continuity plans: financial entities must test the plans at least annually and after substantive changes to the ICT systems, ensure that testing covers an adequate range of scenarios (including scenarios with severe business disruptions), and ensure that backup systems can be switched to from the primary systems in a timely manner. The testing obligation is not discretionary — it is a regulatory requirement with supervisory oversight. Article 11(6) further requires financial entities to maintain records of all testing activities and their results, and to address any deficiencies identified during testing through a formal remediation process.

Article 12 addresses backup policies and recovery methods specifically. Financial entities must define and implement backup policies specifying the scope of data subject to backup, the frequency of backup, and the recovery procedures. Article 12(2) requires that when restoring data from backups, financial entities use ICT systems that are physically and logically segregated from the source system, and that restoration does not jeopardise the security of network and information systems or the availability, authenticity, integrity, and confidentiality of data. Article 12(3) mandates that financial entities periodically test the backup procedures and the restoration and recovery procedures to verify that recovery can be achieved within the defined RTO and RPO. The regulatory technical standards (RTS) under DORA provide additional technical detail on backup requirements, including expectations for backup encryption, geographically separated storage, and backup integrity verification.
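One way to support the restoration-verification and integrity-verification expectations above is to record a content digest for each file at backup time and recompute it after restoring into the segregated environment. The sketch below is an illustrative approach, not a prescribed DORA mechanism; the manifest format is an assumption:

```python
import hashlib
from pathlib import Path

def file_digest(path: Path) -> str:
    """SHA-256 digest of a file, streamed in chunks to handle large files."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_restoration(manifest: dict[str, str], restore_dir: Path) -> list[str]:
    """Return the relative paths whose restored content is missing or does not
    match the digest recorded at backup time (manifest: path -> hex digest)."""
    failures = []
    for rel_path, expected in manifest.items():
        restored = restore_dir / rel_path
        if not restored.exists() or file_digest(restored) != expected:
            failures.append(rel_path)
    return failures
```

An empty return value indicates that every file in the manifest was restored bit-for-bit; any listed path is a restoration failure that should feed the remediation process.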

For financial entities subject to DORA, disaster recovery planning is not a technology decision delegated to IT — it is a governance obligation under Article 5, requiring management body approval of ICT business continuity policies and oversight of their implementation and testing. The management body must be informed of test results, aware of any gaps or deficiencies, and satisfied that remediation is progressing appropriately.

DORA Article 11(4) requires annual testing of ICT business continuity plans, including DR scenarios. Article 12(3) requires periodic testing of backup restoration to verify RTO/RPO achievement. These are regulatory requirements subject to supervisory examination — not optional best practice.

3. Setting and Validating RTO and RPO

Recovery Time Objectives and Recovery Point Objectives are the quantitative foundations of disaster recovery planning, yet they are frequently set incorrectly — either too aggressively (committing to recovery capabilities the organisation cannot deliver), too conservatively (accepting unnecessary business disruption to avoid DR investment), or without reference to actual business requirements (allowing IT teams to set technical targets without business input). Correct RTO and RPO setting begins with the business impact analysis, not with technology capabilities.

The BIA should quantify the impact of system unavailability over time for each critical business process. Impact dimensions include revenue loss, regulatory non-compliance, contractual breach, reputational damage, and impact on customers or market participants. Plot these impacts on a timeline: what is the impact at 1 hour, 4 hours, 8 hours, 24 hours, 48 hours, and 1 week? The point at which impact becomes unacceptable — as determined by business stakeholders and the governing body — defines the RTO. Similarly, the RPO is determined by quantifying the impact of data loss: how much work would need to be repeated, how much transaction data would be irrecoverable, and what regulatory or contractual consequences would follow from the data loss. For DORA-subject entities, the RTO and RPO must be set at levels consistent with the risk tolerance approved by the management body under Article 5(2)(b).
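The timeline exercise above can be reduced to a simple rule: the candidate RTO is the longest outage whose projected impact still falls within the approved tolerance. The sketch below assumes a BIA that scores impact on a single numeric scale per time interval (an illustrative simplification; real BIAs score multiple impact dimensions):

```python
def derive_rto(impact_timeline: list[tuple[int, int]], tolerance: int) -> int:
    """Given (outage_hours, impact_score) points from the BIA, return the
    longest outage duration whose projected impact stays within the approved
    tolerance -- a candidate RTO for business stakeholder sign-off."""
    acceptable = [hours for hours, impact in impact_timeline if impact <= tolerance]
    if not acceptable:
        raise ValueError("impact exceeds tolerance even at the shortest interval")
    return max(acceptable)
```

For example, with impact scores of 5 at 1 hour, 20 at 4 hours, 55 at 8 hours, and 140 at 24 hours, and an approved tolerance of 60, the candidate RTO is 8 hours. The output is a starting point for the business decision, not a substitute for it.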

Once RTO and RPO targets are set, they must be validated through testing. A stated RTO of 4 hours is worthless if actual recovery takes 12 hours in a test scenario. Validation testing should be conducted under realistic conditions: use the actual recovery procedures, the actual recovery infrastructure, and staff who would be available during a real disaster (not a hand-picked team of experts working during business hours). Measure actual recovery time and actual data loss against the stated RTO and RPO. Where actual recovery time or data loss exceeds the target, investigate whether the gap reflects inadequate technology (needing investment), inadequate procedures (needing improvement), inadequate training (needing practice), or unrealistic targets (needing adjustment). Document validation results and report them to the governing body, as they directly inform the organisation's risk posture and its compliance with DORA Articles 11-12.

4. DR Site Strategies and EU Data Sovereignty Considerations

The choice of disaster recovery site strategy has fundamental implications for recovery capability, cost, regulatory compliance, and data sovereignty. Traditional DR site strategies are categorised by readiness level: cold sites (space and power available but no pre-installed systems — recovery time measured in days to weeks), warm sites (pre-installed but not current systems — recovery time measured in hours to days), and hot sites (fully replicated and current systems ready for failover — recovery time measured in minutes to hours). The appropriate strategy for each system depends on its RTO: systems with RTOs of minutes require hot sites, systems with RTOs of hours can use warm sites, and systems with RTOs of days may use cold sites.
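The RTO-to-tier mapping described above can be expressed as a simple lookup. The thresholds below are illustrative cut-offs chosen to match the minutes/hours/days framing in the text, not fixed industry values:

```python
from datetime import timedelta

def site_strategy(rto: timedelta) -> str:
    """Map a system's RTO to a traditional DR site tier (illustrative thresholds)."""
    if rto <= timedelta(hours=1):
        return "hot site"    # minutes-scale recovery: fully replicated, failover-ready
    if rto <= timedelta(hours=24):
        return "warm site"   # hours-scale recovery: pre-installed but not current
    return "cold site"       # days-scale recovery: space and power only
```

In practice the boundaries are set per organisation, and cost usually pushes each system to the cheapest tier that still satisfies its RTO.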

For EU organisations, DR site strategy must incorporate data sovereignty considerations. NIS2 applies to entities providing services within the EU, and Member State transposition laws may impose data localisation requirements for certain data categories. DORA Article 28(1)(a) requires financial entities to ensure that ICT third-party service providers comply with appropriate information security standards, which includes data processing location. The GDPR constrains personal data transfers outside the EU/EEA unless adequate safeguards exist. These regulatory requirements collectively mean that DR sites must be located within the EU/EEA for many data categories, and the DR site operator must comply with applicable EU regulations.

Cloud-based DR strategies (disaster recovery as a service, or DRaaS) have become the dominant approach for organisations seeking hot or warm site capabilities without the capital investment of physical infrastructure. However, EU organisations must select cloud providers that offer EU-based regions with contractual commitments to data residency, comply with the EU Cloud Code of Conduct or equivalent standards, provide transparency regarding sub-processors and data access jurisdictions, and support the technical controls required by DORA (encryption, access control, audit logging, segregation). Organisations with strict sovereignty requirements — particularly those in the defence, intelligence, or critical national infrastructure sectors — may require sovereign cloud providers that are EU-owned, EU-operated, and subject exclusively to EU jurisdiction. The Gaia-X initiative and national sovereign cloud programmes provide frameworks for assessing provider sovereignty.

Multi-region DR strategies within the EU offer a pragmatic balance between sovereignty and resilience. Deploying primary systems in one EU region (e.g., France) and DR systems in another (e.g., Germany or the Netherlands) provides geographic diversity against regional disasters while maintaining full EU data residency. Ensure that cross-border data flows within the EU do not trigger Member State-specific localisation requirements — while GDPR protects intra-EU data flows, certain sector-specific regulations (financial services, healthcare, public administration) may impose national data residency obligations that constrain the choice of DR region.

For EU organisations, a multi-region DR strategy within the EU (e.g., primary in FR, DR in DE or NL) provides geographic diversity without triggering cross-border data transfer complexities. Verify that no sector-specific national localisation requirements constrain your DR region choice.

5. DR Testing, Exercising, and Lessons Learned

A disaster recovery plan that has not been tested is a hypothesis, not a capability. Testing validates that recovery procedures work, that staff know their roles, that RTO and RPO targets can be achieved, and that dependencies and assumptions hold under stress. DORA Article 11(4) requires annual testing of ICT business continuity plans with adequate scenario coverage. ISO 22301:2019 (Business Continuity Management Systems) requires organisations to exercise and test their business continuity procedures at planned intervals and after significant changes. NIS2 Article 21(2)(c) requires business continuity and crisis management measures including backup management and disaster recovery — measures that must be operationally effective, not merely documented.

DR testing should follow a progressive programme that builds capability over time. Start with component testing — verifying that individual technical mechanisms work (backup restoration, database failover, network re-routing). Progress to integrated testing — recovering multiple interdependent systems in sequence and verifying end-to-end functionality. Then conduct full-scale simulation exercises — declaring a simulated disaster, activating the DR plan, executing recovery procedures in real time, and measuring actual performance against RTO and RPO targets. The most valuable (and most challenging) test type is an unannounced exercise, where the recovery team is activated without prior warning and must execute the plan under realistic conditions including time pressure, incomplete information, and communication challenges.

Each test must produce documented outcomes: what was tested, what scenario was used, what the expected results were, what the actual results were, what gaps or failures were identified, and what corrective actions are required. The lessons learned process is where testing delivers its full governance value. Analyse root causes of any failures: were they procedural (unclear instructions), technical (inadequate infrastructure), human (insufficient training), or environmental (incorrect assumptions)? Prioritise corrective actions by severity and implement them within defined timelines. Track corrective action completion through the governance structure and verify effectiveness in the next test cycle. Over time, the test-learn-improve cycle progressively hardens your DR capability and provides compelling evidence of operational resilience for supervisory authorities.

For DORA-subject entities, testing documentation must be retained and available for supervisory examination. Article 11(6) explicitly requires financial entities to keep records of all testing activities. Ensure your testing programme includes records of test scope, scenarios, participants, results, findings, corrective actions, and action completion status. These records demonstrate not just that testing occurred but that it drove measurable improvement in DR capability.
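The record fields listed above can be captured in a small structured type. The sketch below is a minimal illustration of such a record, not a prescribed DORA schema; all field names are assumptions:

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class DRTestRecord:
    """One DR test record: scope, scenario, participants, results, findings,
    corrective actions, and completion status."""
    test_date: date
    scope: str                  # e.g. "component", "integrated", "full-scale"
    scenario: str
    participants: list[str]
    expected_result: str
    actual_result: str
    findings: list[str] = field(default_factory=list)
    corrective_actions: list[str] = field(default_factory=list)
    actions_completed: bool = False

    def open_actions(self) -> bool:
        """True while corrective actions remain outstanding."""
        return bool(self.corrective_actions) and not self.actions_completed
```

Keeping these records in a queryable form makes it straightforward to show a supervisor not just that tests occurred, but which findings were raised and when each corrective action was closed.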

6. DR Plan Governance and Maintenance

Disaster recovery planning is not a project with a completion date — it is an ongoing governance responsibility that requires continuous maintenance, periodic review, and governance-level oversight. The DR plan must evolve as the IT environment changes: new systems are deployed, existing systems are decommissioned, dependencies shift, regulatory requirements change, and the threat landscape evolves. A DR plan that accurately reflected the IT environment twelve months ago may be dangerously inaccurate today.

Establish a DR plan maintenance process that includes triggered updates and scheduled reviews. Triggered updates occur when a material change affects DR: a new critical system is deployed, a system is migrated to new infrastructure, a third-party dependency changes, an organisational restructuring changes recovery team composition, or a test reveals a gap that requires plan modification. Scheduled reviews occur at defined intervals — quarterly for critical system recovery plans, semi-annually for supporting system plans, and annually for the overall DR programme including strategy, governance framework, and resource adequacy. Both triggered updates and scheduled reviews should be documented, with changes tracked and approved through the governance structure.
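The scheduled-review cadence above can be enforced with a simple due-date check. The tier names and interval values below mirror the cadence described in the text but are otherwise illustrative assumptions:

```python
from datetime import date, timedelta

# Illustrative review intervals matching the cadence described above.
REVIEW_INTERVALS = {
    "critical": timedelta(days=91),     # quarterly: critical system recovery plans
    "supporting": timedelta(days=182),  # semi-annual: supporting system plans
    "programme": timedelta(days=365),   # annual: overall DR programme review
}

def review_overdue(plan_tier: str, last_review: date, today: date) -> bool:
    """True when the scheduled review for this plan tier is overdue."""
    return today - last_review > REVIEW_INTERVALS[plan_tier]
```

A check like this only covers scheduled reviews; triggered updates still need to be raised from change management whenever a material change lands.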

Governance oversight of DR should include regular reporting to the governing body on DR capability status, test results, and remediation progress. For NIS2-subject entities, DR is part of the business continuity and crisis management measures under Article 21(2)(c), which the management body must approve and oversee under Article 20. For DORA-subject entities, the management body must approve the ICT business continuity policy under Article 11 and be informed of testing outcomes and gaps. Include DR status in the regular cybersecurity and ICT risk governance reporting — not as a separate reporting stream but integrated into the overall resilience posture assessment.

Finally, ensure that DR governance addresses the human dimension. Recovery teams must be trained, drilled, and supported. Maintain up-to-date contact lists with multiple communication channels (mobile, personal email, out-of-band messaging) in case primary communication systems are affected by the disaster. Define clear succession arrangements for key DR roles — if the primary recovery lead is unavailable during a disaster, the plan must not depend on their unique knowledge. Cross-train team members, document procedures at a level of detail that enables execution by competent staff who may not be the plan authors, and conduct regular walkthroughs to maintain familiarity. The most sophisticated DR technology is useless if the people responsible for operating it are unreachable, untrained, or unclear on their responsibilities.

Frequently Asked Questions

What is the difference between disaster recovery and business continuity?

Business continuity planning (BCP) addresses the organisation's overall ability to continue delivering products and services during and after a disruptive event. It encompasses people, processes, facilities, and technology. Disaster recovery (DR) is a subset of BCP focused specifically on restoring technology systems — IT infrastructure, applications, data, and communications — to support business operations after a technology disruption. BCP determines what business capabilities must be maintained; DR determines how the underlying technology systems will be recovered. Business continuity requirements drive disaster recovery requirements: the business defines the maximum acceptable disruption (RTO) and data loss (RPO), and DR planning delivers the technology capability to meet those targets.

What does DORA require for disaster recovery specifically?

DORA imposes several specific DR requirements. Article 11 requires an ICT business continuity policy with dedicated recovery plans for different incident types, annual testing covering severe disruption scenarios, and documentation of all testing activities and results. Article 12 requires backup policies specifying scope, frequency, and procedures, physically and logically segregated restoration environments, periodic testing of backup restoration to verify RTO/RPO achievement, and appropriately secured and geographically separated backup storage. Article 5 requires management body approval of these policies. The RTS provide additional technical detail on backup encryption, integrity verification, and geographic separation requirements.

How do we set appropriate RTO and RPO values?

RTO and RPO should be derived from business impact analysis (BIA), not from technology capabilities. For each critical system, quantify the impact of unavailability over time (revenue loss, regulatory non-compliance, contractual breach, reputational damage) across defined time intervals (1 hour, 4 hours, 8 hours, 24 hours, etc.). The point where impact becomes unacceptable defines the RTO. For RPO, quantify the impact of data loss: rework cost, irrecoverable transactions, and regulatory consequences. Set RTO and RPO at levels consistent with the governing body's risk appetite. Then validate through testing — a target you cannot achieve in testing is not a credible target. For DORA-subject entities, targets must align with the management body-approved ICT risk tolerance.

Can we use cloud providers outside the EU for disaster recovery?

This depends on the data involved and applicable regulations. For personal data, GDPR Chapter V governs transfers outside the EU/EEA and requires adequate safeguards (adequacy decisions, standard contractual clauses, etc.). For financial entities under DORA, ICT third-party providers must comply with information security standards and the entity must maintain adequate oversight. For NIS2-subject entities, Member State transposition laws may impose data localisation requirements. In practice, many EU organisations use EU-region deployments of global cloud providers for DR, ensuring contractual data residency commitments. Organisations with strict sovereignty requirements may need EU-owned sovereign cloud providers. A multi-region EU strategy (e.g., primary in one Member State, DR in another) provides resilience within full EU jurisdiction.

How often should we test our disaster recovery plan?

DORA Article 11(4) requires annual testing at minimum for financial entities, with testing also required after substantive changes to ICT systems. ISO 22301 requires testing at planned intervals and after significant changes. Best practice for critical systems is quarterly component testing (backup restoration, failover mechanisms), semi-annual integrated testing (multi-system recovery), and annual full-scale simulation (end-to-end DR plan execution). Unannounced exercises should be conducted at least annually to test readiness under realistic conditions. Every test should produce documented results, gap analysis, and corrective action plans. Testing frequency should be proportionate to the criticality of the systems and the rate of change in the IT environment.
