ADR: DynamoDB Backup and Recovery Strategy for Customer Mappings

Context

The system stores customer mappings in a DynamoDB table. These mappings are write-once and are never updated after initial creation. It is a strict requirement that all valid customer mappings, regardless of when they were created, must be present after any table restore, including recovery from accidental table deletion.

The original backup strategy used AWS Backup with hourly backups retained for 30 days, which resulted in rapidly escalating and unsustainable costs that were disproportionate to the table’s logical size.

The team evaluated multiple alternatives, including DynamoDB PITR and reduced-frequency AWS Backup configurations. The business has confirmed:

  • A 2-hour Recovery Point Objective (RPO) is acceptable
  • A 7-day retention window is required
  • Table deletion recovery is mandatory
  • AWS Backup is preferred for simplicity and correctness


Considered Options

Option 1: AWS Backup (Hourly, 30-Day Retention — Current State)

  • Hourly backups retained for 30 days
  • Strong recovery guarantees

Rejected due to compounding, time-based cost growth.


Option 2: DynamoDB PITR Only

  • Continuous recovery with second-level granularity
  • Very low cost

Rejected because PITR:

  • Cannot recover data after table deletion
  • Does not capture immutable records created before enablement


Option 3: AWS Backup (Daily, 5-Day Retention)

  • Very low cost
  • Supports deletion recovery

Rejected because an RPO of 24 hours was considered too coarse for some operational scenarios and exceeds the 2-hour RPO confirmed by the business.


Option 4: AWS Backup (Every 2 Hours, 7-Day Retention) — Selected

  • One backup every 2 hours
  • Retain backups for 7 days
  • No PITR enabled

This option balances improved recovery granularity with significant cost reduction, while preserving deletion recovery and immutable data correctness.
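To make the trade-off between the AWS Backup options concrete, the sketch below counts how many recovery points each schedule holds at steady state. It assumes every scheduled run produces exactly one retained recovery point and that the number of retained backups is the main cost driver; neither assumption is stated in this ADR, and the function name is illustrative.

```typescript
// Steady-state number of recovery points for a fixed schedule and retention.
// Assumption: one recovery point per scheduled run, none skipped or expired early.
function retainedBackups(intervalHours: number, retentionDays: number): number {
  return (24 / intervalHours) * retentionDays;
}

// Schedules taken from the AWS Backup options in this ADR.
const backupOptions = [
  { label: "Option 1: hourly, 30 days", intervalHours: 1, retentionDays: 30 },
  { label: "Option 3: daily, 5 days", intervalHours: 24, retentionDays: 5 },
  { label: "Option 4: every 2 hours, 7 days", intervalHours: 2, retentionDays: 7 },
];

backupOptions.forEach((o) => {
  const count = retainedBackups(o.intervalHours, o.retentionDays);
  console.log(`${o.label}: ${count} recovery points`);
});
```

Under these assumptions Option 4 holds 84 recovery points against 720 for the current state, roughly 12% as many, while Option 3 holds only 5.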


Decision

Adopt AWS Backup with 2-hourly backups and 7-day retention as the sole backup and recovery mechanism for the customer mappings DynamoDB table.

Configuration details:

  • Backup frequency: Every 2 hours
  • Retention: 7 days
  • PITR: Disabled
  • Recovery scope: Logical corruption and table deletion
  • RPO: Up to 2 hours
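As an illustration only, the configuration could be written as a backup plan document in the shape AWS Backup accepts when creating a plan (a rule with a ScheduleExpression and a Lifecycle). The plan name, rule name, and vault name below are hypothetical placeholders, the local interfaces cover only the fields used here, and the cron expression assumes backups should start on even hours UTC.

```typescript
// Minimal local types mirroring only the AWS Backup plan fields used in this sketch.
interface BackupRule {
  RuleName: string;
  TargetBackupVaultName: string;
  ScheduleExpression: string;
  Lifecycle: { DeleteAfterDays: number };
}

interface BackupPlanDocument {
  BackupPlanName: string;
  Rules: BackupRule[];
}

// Build an AWS-style cron expression firing at minute 0 of every Nth hour (UTC).
function everyNHours(n: number): string {
  if (Math.floor(n) !== n || n < 1 || 24 % n !== 0) {
    throw new Error("interval must be a whole number of hours that divides 24");
  }
  return `cron(0 0/${n} * * ? *)`;
}

// Hypothetical names: substitute the real plan, rule, and vault names.
const customerMappingsPlan: BackupPlanDocument = {
  BackupPlanName: "customer-mappings-backup",
  Rules: [
    {
      RuleName: "every-2-hours-7-day-retention",
      TargetBackupVaultName: "Default",
      ScheduleExpression: everyNHours(2), // backup frequency: every 2 hours
      Lifecycle: { DeleteAfterDays: 7 }, // retention: 7 days
    },
  ],
};

console.log(JSON.stringify(customerMappingsPlan, null, 2));
```

The table itself would still need to be assigned to the plan through a resource selection, which is omitted here.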

This decision prioritises correctness, simplicity, and a tighter RPO, while still delivering substantial FinOps savings.


Cost Impact

Previous Configuration

  • Backup frequency: Hourly
  • Retention: 30 days
  • Observed monthly cost: $3,500 – $5,800

New Configuration

  • Backup frequency: Every 2 hours
  • Retention: 7 days
  • Estimated monthly cost: $400 – $700

Cost Reduction

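  • Monthly savings: ~$5,100 – $5,400 (measured against the $5,800 peak; ~$2,800 – $3,100 against the $3,500 low)
  • Annual savings: ~$61,000 – $65,000 (against the peak)
  • Percentage reduction: ~88% – 93% against the peak (~80% – 89% against the low)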
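The savings figures can be reproduced from the observed and estimated monthly cost ranges listed above. The sketch below is plain arithmetic on those ranges, with illustrative names, and shows that the ~$5,100 – $5,400 monthly figure corresponds to the upper end of the observed cost.

```typescript
interface CostRange {
  low: number;
  high: number;
}

const previousMonthly: CostRange = { low: 3500, high: 5800 }; // observed, USD
const newMonthly: CostRange = { low: 400, high: 700 }; // estimated, USD

// Savings relative to one baseline figure; the spread comes from the new cost estimate.
function savingsAgainst(baseline: number, next: CostRange) {
  const min = baseline - next.high;
  const max = baseline - next.low;
  return {
    monthly: { low: min, high: max },
    annual: { low: min * 12, high: max * 12 },
    percent: {
      low: Math.round((min / baseline) * 100),
      high: Math.round((max / baseline) * 100),
    },
  };
}

const vsPeak = savingsAgainst(previousMonthly.high, newMonthly);
const vsLow = savingsAgainst(previousMonthly.low, newMonthly);

console.log("Against the peak:", JSON.stringify(vsPeak));
console.log("Against the low:", JSON.stringify(vsLow));
```

Against the $5,800 peak this gives $5,100 – $5,400 per month, $61,200 – $64,800 per year, and 88% – 93%. Against the $3,500 low it gives $2,800 – $3,100 per month and 80% – 89%.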

Consequences

Positive

  • Guaranteed recovery of immutable customer mappings
  • Supports recovery after table deletion
  • Improved RPO (2 hours) compared to daily backups
  • Significant and defensible cost reduction
  • Simple restore process using AWS Backup
  • Lower architectural complexity than PITR-based solutions

Negative

  • Higher cost than daily-backup configuration
  • RPO not as fine-grained as PITR
  • Backup storage cost still scales with the retention period

These trade-offs are acceptable given the operational requirements.


  1. Prevent accidental deletion via CDK

    removalPolicy: cdk.RemovalPolicy.RETAIN

  2. IAM guardrail: restrict dynamodb:DeleteTable to approved break-glass roles only

  3. Cost monitoring: AWS Backup monthly cost alarm (e.g. $1,000 threshold)
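One possible form of the IAM guardrail above is an explicit deny on dynamodb:DeleteTable for every principal except the break-glass roles. The sketch below builds such a policy document as a plain object. The account ID, region, table name, and role name are hypothetical placeholders, and where the policy is attached (for example as a service control policy or a permissions boundary) is left open because the ADR does not specify it.

```typescript
// Hypothetical identifiers: replace with the real table and break-glass role ARNs.
const tableArn = "arn:aws:dynamodb:eu-west-1:111122223333:table/customer-mappings";
const breakGlassRoleArns = ["arn:aws:iam::111122223333:role/break-glass-admin"];

// Deny DeleteTable on the table unless the caller's principal ARN matches a break-glass role.
const denyDeleteTablePolicy = {
  Version: "2012-10-17",
  Statement: [
    {
      Sid: "DenyDeleteTableExceptBreakGlass",
      Effect: "Deny",
      Action: "dynamodb:DeleteTable",
      Resource: tableArn,
      Condition: {
        ArnNotLike: { "aws:PrincipalArn": breakGlassRoleArns },
      },
    },
  ],
};

console.log(JSON.stringify(denyDeleteTablePolicy, null, 2));
```

An explicit deny takes precedence over any allow, so roles with broad DynamoDB permissions would still be unable to delete the table unless they are on the break-glass list.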

Summary

AWS Backup with 2-hourly backups and 7-day retention provides a strong balance between recovery granularity and cost efficiency. This approach meets all stated recovery requirements, preserves deletion recovery, and delivers up to ~93% cost reduction compared to the original hourly/30-day configuration.


References

  • AWS Backup for DynamoDB Documentation
  • DynamoDB Backup and Restore
  • Internal FinOps Cost Review