High Availability in Azure: Storage Redundancies

Note: This blog post is part of a series centered around the topic of high availability in Azure.

I won't be addressing scaling (horizontal or vertical), backups/restores, or resiliency/healing in these posts. Each of those topics deserves its own series; perhaps I'll write about them in the future if time permits.


# Azure Storage Account

(Figure: Azure storage account)

In Azure, the following entities are backed by Azure storage accounts: blobs, file shares, queues, NoSQL table storage, Data Lake Storage (Gen2) and unmanaged disks. In this blog post, we'll go over the various redundancy options available for these storage accounts. We'll compare & contrast them based on the following parameters:

  • Replication latency: How soon after a write are all replicas in full sync?
  • Disaster scenarios: Are you looking at partial data loss or fully unrecoverable data? How easy (or difficult) is it to get back on track once things have hit rock bottom?
  • SLAs: How many 9s?

Hopefully this blog post will serve as a cheat-sheet and help you choose the right Azure storage redundancy options for your use cases.
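
For context, the redundancy option is selected via the storage account's SKU when the account is created. Here's a minimal sketch using the azure-mgmt-storage Python SDK; the subscription ID, resource group, account name and region below are placeholders:

```python
# A minimal sketch (not a full deployment script): creating a storage account
# with a chosen redundancy option via the azure-mgmt-storage Python SDK.
# The subscription ID, resource group, account name and region are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

client = StorageManagementClient(DefaultAzureCredential(), subscription_id="<subscription-id>")

# The SKU name encodes the redundancy option:
#   Standard_LRS, Standard_ZRS, Standard_GRS, Standard_RAGRS
poller = client.storage_accounts.begin_create(
    resource_group_name="my-resource-group",
    account_name="mystorageaccount",
    parameters={
        "location": "westeurope",
        "kind": "StorageV2",
        "sku": {"name": "Standard_ZRS"},
    },
)
account = poller.result()  # long-running operation
print(account.name, account.sku.name)
```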

# LRS (locally-redundant storage)

With LRS, your data is replicated thrice across multiple fault domains & update domains within a single storage scale unit (all within a single datacenter). Note that all three replicas are addressed by a single endpoint (i.e. you can't target individual replicas for read/write operations).
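
To make the "single endpoint" point concrete, here's a minimal sketch using the azure-storage-blob Python SDK (the account URL, container and blob names are placeholders); the client only ever talks to the one account endpoint, never to individual replicas:

```python
# A minimal sketch: all reads/writes go through the storage account's single
# endpoint; the three LRS replicas are never visible to the client.
# The account URL, container and blob names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient(
    account_url="https://mystorageaccount.blob.core.windows.net",
    credential=DefaultAzureCredential(),
)

blob = service.get_blob_client(container="my-container", blob="hello.txt")
blob.upload_blob(b"hello, redundancy!", overwrite=True)  # acknowledged after the write lands on all replicas
print(blob.download_blob().readall())
```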

Replication latency: No replication latency; data is synchronously written to all three replicas on every write request.

Disaster scenarios:

| disaster type | service interruption? | data loss? | recovery possible? |
| --- | --- | --- | --- |
| hardware failure in physical rack/node | NO | NO ¹ | N/A |
| datacenter disaster | YES | YES | NO ² |
| availability zone disaster | YES | YES | NO ² |
| regional disaster | YES | YES | NO ² |
| geographic disaster | YES | YES | NO ² |
| worldwide disaster | YES | YES | NO ² |

  1. Since the replicas are spread across multiple fault domains.
  2. Assuming all three replicas within the storage scale unit are affected, your data is permanently lost & unrecoverable.

SLAs:

  • durability of objects >= 99.999999999% (11 nines)
  • read requests (hot tier) >= 99.9% (3 nines)
  • read requests (cool tier) >= 99% (2 nines)
  • write requests (hot tier) >= 99.9% (3 nines)
  • write requests (cool tier) >= 99% (2 nines)

# ZRS (zone-redundant storage)

With ZRS, your data is replicated across three availability zones within the same region (note that not all regions currently support availability zones). As with LRS, all three replicas are addressed by a single endpoint.

Replication latency: Very low latency; data is synchronously written to all three replicas on every write request.

Disaster scenarios:

| disaster type | service interruption? | data loss? | recovery possible? |
| --- | --- | --- | --- |
| hardware failure in physical rack/node | NO | NO ¹ | N/A |
| datacenter disaster | NO | NO ¹ | N/A |
| availability zone disaster | YES ² | NO | N/A |
| regional disaster | YES | YES | NO ³ |
| geographic disaster | YES | YES | NO ³ |
| worldwide disaster | YES | YES | NO ³ |

  1. Only one replica will be affected, since the replicas are spread across different availability zones.
  2. Temporary service interruption until Azure finishes its DNS updates (I'm not entirely sure how long these updates take; the official docs don't mention it). To mitigate this, it's best to use transient fault-handling patterns (retries with back-offs and circuit breakers) for all reads/writes on the storage account; see the sketch after these notes. More details can be found here and here.
  3. Assuming all three replicas across the availability zones are affected, your data is permanently lost & unrecoverable.
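
For the retry-with-back-off part, here's a minimal sketch: the azure-storage-blob Python SDK has a built-in retry policy that can be tuned via keyword arguments. The account URL and names are placeholders, the retry values are purely illustrative, and a circuit breaker would sit on top of this in your own code:

```python
# A minimal sketch: tuning the azure-storage-blob SDK's built-in retry policy
# (exponential back-off) so transient failures are retried instead of bubbling
# up immediately. The URL/names are placeholders and the retry values are
# illustrative, not a recommendation.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

service = BlobServiceClient(
    account_url="https://mystorageaccount.blob.core.windows.net",
    credential=DefaultAzureCredential(),
    retry_total=5,            # maximum number of retries per operation
    retry_backoff_factor=1,   # base delay (seconds) for the exponential back-off
    retry_backoff_max=30,     # cap on the back-off delay (seconds)
)

blob = service.get_blob_client(container="my-container", blob="data.json")
blob.upload_blob(b'{"ok": true}', overwrite=True)  # transparently retried on transient errors
```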

SLAs:

  • durability of objects >= 99.9999999999% (12 nines)
  • read requests (hot tier) >= 99.9% (3 nines)
  • read requests (cool tier) >= 99% (2 nines)
  • write requests (hot tier) >= 99.9% (3 nines)
  • write requests (cool tier) >= 99% (2 nines)

# GRS (geo-redundant storage)

With GRS, your data is replicated across two paired regions (within the same Azure geography) in a primary region + secondary region setup. This ensures that one regional replica will remain available in the event of a regional disaster.

The primary region & the secondary region are addressed by separate endpoints. The secondary endpoint is generally inaccessible. However, in case of a fail-over, the secondary is promoted to primary and read + write access is enabled for this endpoint. Fail-overs can be initiated by Azure (Microsoft-managed fail-over) in the event of a regional disaster. Azure is also introducing user-initiated fail-overs, which are in preview at the time of writing this post.
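
A user-initiated (customer-managed) fail-over goes through the management plane. Here's a minimal sketch using the azure-mgmt-storage Python SDK, with placeholder subscription, resource group and account names:

```python
# A minimal sketch: triggering a customer-initiated fail-over of a GRS/RA-GRS
# account through the management plane (azure-mgmt-storage). After it completes,
# the former secondary becomes the new primary (replicated via LRS) and any
# un-replicated writes are lost. All names/IDs below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

client = StorageManagementClient(DefaultAzureCredential(), subscription_id="<subscription-id>")

poller = client.storage_accounts.begin_failover(
    resource_group_name="my-resource-group",
    account_name="mystorageaccount",
)
poller.wait()  # long-running operation
```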

Note: Both GRS (geo-redundant storage) and RA-GRS (read-access geo-redundant storage) are misnomers. They don't create redundant copies across Azure geographies, only across paired regions within the same Azure geography.

(Figure: Azure storage GRS)

Replication latency: Your data is first replicated synchronously within the primary region via LRS. The data is then replicated asynchronously to the secondary region (eventually consistent). Within the secondary region, it is replicated synchronously using LRS. The official SLA for Azure storage does not make any guarantees about the time needed for geo-replication.
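
To see how far behind the secondary currently is, you can read the account's geo-replication stats (including the last sync time) from the management plane. A minimal sketch using the azure-mgmt-storage Python SDK, with placeholder names:

```python
# A minimal sketch: reading a GRS/RA-GRS account's geo-replication stats
# (status + last sync time) via azure-mgmt-storage. Writes made before
# last_sync_time have reached the secondary; later writes may still be in flight.
# All names/IDs below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

client = StorageManagementClient(DefaultAzureCredential(), subscription_id="<subscription-id>")

account = client.storage_accounts.get_properties(
    resource_group_name="my-resource-group",
    account_name="mystorageaccount",
    expand="geoReplicationStats",
)
stats = account.geo_replication_stats
print(stats.status, stats.last_sync_time, stats.can_failover)
```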

Disaster scenarios:

| disaster type | service interruption? | data loss? | recovery possible? |
| --- | --- | --- | --- |
| hardware failure in physical rack/node | NO | NO | N/A |
| datacenter disaster | YES ¹ | POSSIBLE ² | YES ³ |
| availability zone disaster | YES ¹ | POSSIBLE ² | YES ³ |
| regional disaster | YES ¹ | POSSIBLE ² | YES ³ |
| geographic disaster | YES | YES | NO |
| worldwide disaster | YES | YES | NO |

  1. Within the primary & secondary regions themselves, the data is replicated via LRS. In the event of a datacenter disaster in the primary region, it is possible that all replicas within the storage scale unit are affected, leaving the primary endpoint both inaccessible & unrecoverable. Although the secondary region has replica data, its endpoint will be inaccessible until a fail-over is initiated (the data will remain inaccessible until the fail-over is complete).
  2. With GRS, the replication from primary to secondary regions is asynchronous. In the event of the primary being destroyed before it has completely replicated the data to secondary, the secondary will have a stale copy and un-replicated writes will be permanently lost.
  3. Only once a fail-over has completed does the secondary endpoint become the new primary, accessible for read + write operations, with LRS replication.

SLAs:

  • durability of objects >= 99.99999999999999% (16 nines)
  • read requests (hot tier) >= 99.9% (3 nines)
  • read requests (cool tier) >= 99% (2 nines)
  • write requests (hot tier) >= 99.9% (3 nines)
  • write requests (cool tier) >= 99% (2 nines)

# RA-GRS (read-access geo-redundant storage)

Same as GRS, but you always have read-only access to the secondary replica.

Replication latency: Same as GRS.
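
Since the secondary is always readable with RA-GRS, you can point a client straight at the -secondary endpoint. A minimal sketch using the azure-storage-blob Python SDK (account, container and blob names are placeholders); the SDK also has a retry_to_secondary option that lets reads fall back to the secondary on retries, provided stale reads are acceptable:

```python
# A minimal sketch: reading from the always-readable secondary endpoint of an
# RA-GRS account. The secondary endpoint is the account name suffixed with
# "-secondary"; account/container/blob names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

secondary = BlobServiceClient(
    account_url="https://mystorageaccount-secondary.blob.core.windows.net",
    credential=DefaultAzureCredential(),
)

# Read-only: writes against this endpoint are rejected until a fail-over
# promotes the secondary to primary.
blob = secondary.get_blob_client(container="my-container", blob="data.json")
print(blob.download_blob().readall())  # may be slightly stale (async geo-replication)
```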

Disaster scenarios:

| disaster type | service interruption? | data loss? | recovery possible? |
| --- | --- | --- | --- |
| hardware failure in physical rack/node | NO | NO | N/A |
| datacenter disaster | YES ¹ | POSSIBLE ² | YES ³ |
| availability zone disaster | YES ¹ | POSSIBLE ² | YES ³ |
| regional disaster | YES ¹ | POSSIBLE ² | YES ³ |
| geographic disaster | YES | YES | NO |
| worldwide disaster | YES | YES | NO |

  1. In the event of the primary replica being destroyed, the secondary region will still offer read-only access, even without a fail-over being initiated (unlike GRS, where the secondary is inaccessible until a fail-over has completed).
  2. In the event of the primary being destroyed before it has completely replicated the data to secondary, the secondary will have a stale copy and un-replicated writes will be permanently lost (same as GRS).
  3. Prior to fail-over, the secondary will have read-access. After fail-over, the secondary becomes the new primary, with read + write access and LRS replication.

SLAs:

  • durability of objects >= 99.99999999999999% (16 nines)
  • read requests (hot tier) >= 99.99% (4 nines)
  • read requests (cool tier) >= 99.9% (3 nines)
  • write requests (hot tier) >= 99.9% (3 nines)
  • write requests (cool tier) >= 99% (2 nines)