AWS Webinar 24 - Getting Started with AWS - Understanding DR

Getting Started with AWS:
Understanding Disaster Recovery
Cobus Bernard
Sr Developer Advocate
Amazon Web Services
@cobusbernard

Agenda
Define requirements & SPOFs
Choosing recovery method
Backups
Testing your plan
Resiliency and self-healing systems
Using DR as a migration strategy

Initial questions to answer
How important are the applications to your business?
What are the associated recovery point objective (RPO) and recovery time objective (RTO) for these applications?
How are you storing the data?
Where are you storing the data?
How are you restoring the application?

Why do we back up data?
Minimize data loss
[Diagram: a timeline of data changing over time, marking the protected data at t1, the current time, and the RPO between them.]
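The relationship between backup frequency and RPO on this slide can be sketched in a few lines. This is a minimal illustration, not an AWS API: the function names and the one-hour RPO are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def data_at_risk(last_backup: datetime, now: datetime) -> timedelta:
    """Window of changes that would be lost if a failure happened right now."""
    return now - last_backup


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """Worst case: the failure happens just before the next backup runs."""
    return backup_interval


def meets_rpo(backup_interval: timedelta, rpo: timedelta) -> bool:
    """A schedule meets the RPO only if its worst-case loss fits inside it."""
    return worst_case_data_loss(backup_interval) <= rpo


if __name__ == "__main__":
    rpo = timedelta(hours=1)  # hypothetical business requirement
    print("nightly backups meet RPO:", meets_rpo(timedelta(hours=24), rpo))
    print("15-minute snapshots meet RPO:", meets_rpo(timedelta(minutes=15), rpo))
```

Under these assumptions, a nightly backup cannot satisfy a one-hour RPO, while a 15-minute snapshot schedule can.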

Why do we back up data?
Balance cost with liability
[Diagram: liability weighed against cost.]

AWS offers four levels of backup and DR support
across a spectrum of complexity and time (low to high)
Backup & Restore (RPO/RTO: hours)
• Lower priority use cases
• Solutions: Amazon S3, AWS Storage Gateway
• Cost: $
Pilot light (RPO/RTO: 10s of minutes)
• Meeting lower RTO & RPO requirements
• Core services
• Scale AWS resources in response to a DR event
• Cost: $$
Warm standby in AWS (RPO/RTO: minutes)
• Solutions that require RTO & RPO in minutes
• Business critical services
• Cost: $$$
Hot standby (with multi-site) (RPO/RTO: real-time)
• Auto-failover of your environment in AWS
• Cost: $$$$
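The trade-off across the four levels can be expressed as a small selection helper. This is a sketch only: the numeric thresholds are assumptions that turn the slide's "hours / 10s of minutes / minutes / real-time" into concrete values, and `cheapest_strategy` is a hypothetical name.

```python
from datetime import timedelta

# (strategy, tightest RTO/RPO it is assumed to support, relative cost),
# ordered cheapest first. The thresholds are illustrative assumptions.
STRATEGIES = [
    ("Backup & Restore", timedelta(hours=1), "$"),
    ("Pilot light", timedelta(minutes=10), "$$"),
    ("Warm standby", timedelta(minutes=1), "$$$"),
    ("Hot standby (multi-site)", timedelta(0), "$$$$"),
]


def cheapest_strategy(rto: timedelta, rpo: timedelta):
    """Pick the cheapest strategy whose assumed floor fits both targets."""
    target = min(rto, rpo)  # the stricter of the two requirements wins
    for name, floor, cost in STRATEGIES:
        if target >= floor:
            return name, cost
    # Only reached for negative targets; fall back to the strongest option.
    return STRATEGIES[-1][0], STRATEGIES[-1][2]


if __name__ == "__main__":
    print(cheapest_strategy(timedelta(hours=8), timedelta(hours=24)))
    print(cheapest_strategy(timedelta(minutes=30), timedelta(minutes=15)))
    print(cheapest_strategy(timedelta(seconds=5), timedelta(seconds=5)))
```

Taking the stricter of RTO and RPO keeps the sketch simple. In practice the two targets can be met by different mechanisms and may need to be evaluated separately.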

AWS Backup: centralize compliance, automate
backup, work across services
Services shown: Amazon EFS, Amazon EBS, Amazon RDS, Amazon DynamoDB, AWS Storage Gateway
1. Simplified backup scheduling and lifecycle management across AWS services
2. Centrally manage backup activities, security, and reporting
3. Achieve consistency and meet compliance requirements
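As an illustration of the scheduling and lifecycle point, a backup plan can be described as plain data and then submitted through boto3. This is a sketch under stated assumptions: the plan name, the vault name `Default`, the cron schedule, and the 35-day retention are all hypothetical values, and `create_plan` needs the third-party `boto3` package plus valid AWS credentials. Assigning resources to the plan is a separate step not shown here.

```python
def build_backup_plan(name: str, vault: str, schedule: str, retain_days: int) -> dict:
    """Build the BackupPlan document expected by AWS Backup's CreateBackupPlan."""
    return {
        "BackupPlanName": name,
        "Rules": [
            {
                "RuleName": f"{name}-rule",
                "TargetBackupVaultName": vault,
                "ScheduleExpression": schedule,  # AWS cron syntax
                "Lifecycle": {"DeleteAfterDays": retain_days},
            }
        ],
    }


def create_plan(plan: dict) -> str:
    """Submit the plan to AWS Backup and return the new plan's ID."""
    import boto3  # imported here so the builder above runs without boto3

    client = boto3.client("backup")
    response = client.create_backup_plan(BackupPlan=plan)
    return response["BackupPlanId"]


if __name__ == "__main__":
    # Hypothetical values: a daily backup at 05:00 UTC, kept for 35 days.
    plan = build_backup_plan("daily-backups", "Default", "cron(0 5 ? * * *)", 35)
    print(plan)
```

Keeping the plan as a dictionary makes it easy to review and version before anything is created in an account.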

Pilot light prep
[Diagram: www.example.com is served from the corporate data center, which runs a reverse proxy/caching server, an application server, and the primary database server. Data mirroring/replication feeds a subordinate database server and data volume in the AWS Cloud (the pilot light system). A reverse proxy/caching server and an application server are prepared in AWS but are not running.]

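Pilot light recovery
[Diagram: www.example.com is now served from the AWS Cloud. The reverse proxy/caching server and application server in AWS start in minutes, with additional capacity added if needed, and use the replicated database server and data volume. The corporate data center's reverse proxy/caching server, application server, and primary database server are shown on the other side.]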

Warm standby prep
[Diagram: Route 53 resolves www.example.com to the primary site, which runs a reverse proxy/caching server, an application server, and the primary database server. An AWS Region in the AWS Cloud hosts a scaled-down standby that is not active for production traffic: Elastic Load Balancing, a reverse proxy/caching server, an application server, and a subordinate database server with a data volume, kept current by mirroring/replication. An application data source cut over path is marked.]

Warm standby recover
[Diagram: Route 53 directs www.example.com to the AWS Region in the AWS Cloud, where the standby is now active and scaled up for production: Elastic Load Balancing, a reverse proxy/caching server, an application server, and the database server with its data volume. The original site's reverse proxy/caching server, application server, and primary database server are shown on the other side.]
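The warm standby cut-over can be modelled as a small decision function: keep traffic on the primary while it is healthy, otherwise scale the standby up and send traffic there. This is a toy simulation only. It does not call Route 53 or Auto Scaling, and the `Site` class, `fail_over` function, and capacities are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    healthy: bool
    capacity: int  # number of application servers currently running


def fail_over(primary: Site, standby: Site, production_capacity: int) -> Site:
    """Return the site that should receive www.example.com traffic."""
    if primary.healthy:
        return primary
    if not standby.healthy:
        # Simplification for this sketch: give up rather than pick a site.
        raise RuntimeError("neither site is healthy")
    # Scale the standby up before it takes production traffic.
    standby.capacity = max(standby.capacity, production_capacity)
    return standby


if __name__ == "__main__":
    primary = Site("corporate data center", healthy=False, capacity=4)
    standby = Site("AWS Region", healthy=True, capacity=1)
    active = fail_over(primary, standby, production_capacity=4)
    print(active.name, active.capacity)
```

In a real deployment the health signal would come from health checks and the scale-up from the provisioning tooling in use. The ordering, scale up first and then shift traffic, is the point of the sketch.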

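Hot site prep
[Diagram: Route 53 serves www.example.com from two sites. The primary site runs a reverse proxy/caching server, an application server, and the primary database server. An AWS Region in the AWS Cloud runs an active environment: Elastic Load Balancing, a reverse proxy/caching server, an application server, and a database server with a data volume, kept current by mirroring/replication. An application data source cut over path is marked.]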

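Hot site recovery
[Diagram: Route 53 directs www.example.com to the AWS Region in the AWS Cloud, where the active environment is scaled up for production use: Elastic Load Balancing, a reverse proxy/caching server, an application server, and the database server with its data volume. The original site's reverse proxy/caching server, application server, and primary database server are shown on the other side.]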

“Chaos Engineering is the discipline of
experimenting on a distributed system
in order to build confidence in the system’s
capability to withstand turbulent conditions in
production.”
http://principlesofchaos.org

Phases of Chaos Engineering
1. Steady state
2. Hypothesis
3. Run experiment
4. Verify
5. Fix!
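The phases above can be walked through against a toy system. This is a minimal sketch, not a chaos engineering tool: `ToyService`, `run_experiment`, and the tolerance value are assumptions made for illustration, and the "fault" is simply removing one replica.

```python
class ToyService:
    """Stand-in for a real system: it succeeds while any replica is healthy."""

    def __init__(self, replicas: int) -> None:
        self.healthy = replicas

    def success_rate(self) -> float:
        return 1.0 if self.healthy > 0 else 0.0

    def kill_one(self) -> None:
        self.healthy = max(0, self.healthy - 1)

    def restore_one(self) -> None:
        self.healthy += 1


def run_experiment(service: ToyService, tolerance: float = 0.01) -> dict:
    # 1. Steady state: measure normal behaviour.
    baseline = service.success_rate()
    # 2. Hypothesis: losing one replica keeps the success rate within tolerance.
    # 3. Run experiment: inject the fault, and always roll it back afterwards.
    service.kill_one()
    try:
        observed = service.success_rate()
    finally:
        service.restore_one()
    # 4. Verify: compare the observation with the steady state.
    holds = abs(baseline - observed) <= tolerance
    # 5. Fix: a failed hypothesis points at a weakness to repair.
    return {"baseline": baseline, "observed": observed, "hypothesis_holds": holds}


if __name__ == "__main__":
    print(run_experiment(ToyService(replicas=2)))
    print(run_experiment(ToyService(replicas=1)))
```

With two replicas the hypothesis holds. With a single replica the experiment exposes the single point of failure, which is the finding to fix.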

Chaos engineering
https://github.com/Netflix/SimianArmy

Resiliency: the ability of a system to handle and
eventually recover from unexpected conditions

Multi-AZ architecture
[Diagram: one Region with three Availability Zones (a, b, c), each running instances behind Elastic Load Balancing (ELB). A DB instance runs in one zone with a standby DB instance in another.]

Multi-AZ architecture
[Diagram: the same Region and three Availability Zones (a, b, c) with instances behind Elastic Load Balancing (ELB). The former standby DB instance is now labelled as the new master.]

Auto-scaling for self-healing
[Diagram: an AWS Region with an Auto Scaling group spanning Availability zone 1 and Availability zone 2 behind Elastic Load Balancing (ELB). One instance is marked with an X.]
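The self-healing behaviour can be pictured as a reconcile loop: drop instances that fail their health check and launch replacements until the desired capacity is met again. The sketch below simulates that loop in memory. It is not the Auto Scaling API, and the function name and instance IDs are hypothetical.

```python
from itertools import count
from typing import Dict, Iterator


def reconcile(instances: Dict[str, bool], desired: int, new_ids: Iterator[str]) -> Dict[str, bool]:
    """Return the group after unhealthy instances are replaced.

    `instances` maps an instance ID to its health-check result.
    """
    # Keep only instances that pass their health check.
    group = {iid: True for iid, ok in instances.items() if ok}
    # Launch replacements until the desired capacity is restored.
    while len(group) < desired:
        group[next(new_ids)] = True
    return group


if __name__ == "__main__":
    ids = (f"i-new{n}" for n in count(1))
    before = {"i-a": True, "i-b": False}  # i-b is the instance marked X
    after = reconcile(before, desired=2, new_ids=ids)
    print(after)
```

The group ends up with the same number of healthy instances it was configured for, without anyone repairing the failed instance by hand.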

Availability concepts
High availability: keep your applications running 24x7
Backup: make sure your data is safe
Disaster recovery: get your applications and data back after a major disaster

Learn storage with AWS Training and Certification
Resources created by the experts at AWS to help you build cloud storage skills
45+ free digital courses cover topics related to cloud storage, including:
• Amazon S3
• AWS Storage Gateway
• Amazon S3 Glacier
• Amazon Elastic File System (Amazon EFS)
• Amazon Elastic Block Store (Amazon EBS)
Classroom offerings, like Architecting on AWS, feature AWS expert instructors and hands-on activities
Visit aws.amazon.com/training/path-storage/

Thank you!
Cobus Bernard
Sr Developer Advocate
Amazon Web Services
@cobusbernard
