Ansible Automation Platform High Availability and Disaster Recovery: Single Topology Architecture
By Luca Berton · Published 2024-01-01 · Category: events
AAP HA/DR architecture with proven database failover in under 60 seconds, full AZ recovery in under 3 minutes, and the EDB PostgreSQL partnership.
AAP now provides a single topology that combines High Availability and Disaster Recovery: proven platform resilience with quantified recovery times, documented failure boundaries, and a tested DR blueprint built on the EDB partnership.
Four Enterprise Guarantees
1. Platform Resilience Is Proven, Not Assumed
Verified automatic recovery from:
• Component failure: individual service restarts
• Database failover: under 60 seconds
• Full AZ loss: recovery under 3 minutes
Zero human intervention is required for all of these scenarios.
2. Failure Boundaries Are Documented, Not Discovered in Production
Test scenarios characterize exactly what survives a platform disruption and what does not — giving operators a clear, tested contract for how to design automation that is safe to re-run after any failure event.
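In practice, automation that is safe to re-run is automation whose tasks are idempotent. The playbook below is a minimal sketch of that idea, not part of the AAP test scenarios: the `app_servers` group, the file paths, the `migrate` script, and the `example-app` service are all hypothetical placeholders.

```yaml
# Illustrative only: hosts, paths, script, and service name are hypothetical.
- name: Tasks written to be safe to re-run after an interrupted job
  hosts: app_servers
  become: true
  tasks:
    # lineinfile reports "ok" instead of "changed" when the line already exists
    - name: Ensure the config line is present
      ansible.builtin.lineinfile:
        path: /etc/example/app.conf
        regexp: '^max_connections='
        line: max_connections=200
        create: true
        mode: "0644"

    # "creates" skips the command when the marker file exists.
    # Assumes the script itself writes the marker file on success.
    - name: Run the one-time migration only if its marker file is absent
      ansible.builtin.command:
        cmd: /opt/example/bin/migrate --apply
        creates: /var/lib/example/.migrated

    # Declares the desired state rather than issuing a blind restart
    - name: Ensure the service is running
      ansible.builtin.service:
        name: example-app
        state: started
```

If a failover interrupts this play halfway through, launching it again repeats only the work that was not completed.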
3. Complete, Tested Disaster Recovery Blueprint
Full region DR scenarios validate the entire failover chain:
• Database promotion
• DNS cutover
• Execution reconnection
Every step is classified as automatic or manual and timed against real infrastructure. This is a tested blueprint, not a runbook on paper.
4. Built on the EDB Partnership
Disaster recovery for any platform depends on a strong data strategy. AAP relies on the EDB partnership to make data resiliency a core part of the topology.
See also: Ansible Disaster Recovery Automation: Backup, Failover, and Recovery Playbooks
Recovery Time Objectives
| Failure Scenario | Recovery Time | Intervention |
|---|---|---|
| Single component failure | Seconds | Automatic |
| Database failover | < 60 seconds | Automatic |
| Full Availability Zone loss | < 3 minutes | Automatic |
| Full region DR | Minutes | Semi-automatic (DNS) |
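Recovery targets like these can be checked during a failover drill by timing how long the platform takes to answer again. The playbook below is a rough sketch under stated assumptions: the health URL (`/api/v2/ping/` on a placeholder hostname), the variable names, and the 180-second target taken from the AZ-loss row are illustrative, and the correct endpoint depends on your AAP version.

```yaml
# Sketch: poll a health endpoint during a drill and compare elapsed time
# with a target. URL, variable names, and threshold are assumptions.
- name: Time platform recovery during a failover drill
  hosts: localhost
  gather_facts: false
  vars:
    aap_health_url: "https://aap.example.com/api/v2/ping/"  # hypothetical
    rto_seconds: 180
  tasks:
    - name: Record the start time in epoch seconds
      ansible.builtin.set_fact:
        drill_start: "{{ lookup('ansible.builtin.pipe', 'date +%s') | int }}"

    - name: Poll until the platform answers with HTTP 200
      ansible.builtin.uri:
        url: "{{ aap_health_url }}"
        status_code: 200
        timeout: 5
      register: health
      until: health is succeeded
      retries: 60
      delay: 3

    - name: Compute the elapsed time
      ansible.builtin.set_fact:
        recovery_seconds: "{{ (lookup('ansible.builtin.pipe', 'date +%s') | int) - (drill_start | int) }}"

    - name: Fail the drill if recovery exceeded the target
      ansible.builtin.assert:
        that:
          - recovery_seconds | int < rto_seconds
        fail_msg: "Recovery took {{ recovery_seconds }}s, target was {{ rto_seconds }}s"
```

The measurement is only as precise as the polling interval and request timeout, so treat the result as an upper bound rather than an exact recovery time.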
Architecture
┌─── Region A (Primary) ─────────────────────────────────┐
│                                                        │
│  ┌─── AZ 1 ───────────┐      ┌─── AZ 2 ───────────┐    │
│  │ AAP Controller (P) │      │ AAP Controller (S) │    │
│  │ EDB PostgreSQL (P) │◄────►│ EDB PostgreSQL (S) │    │
│  │ Execution Nodes    │      │ Execution Nodes    │    │
│  └────────────────────┘      └────────────────────┘    │
│                   │                                    │
└───────────────────┼────────────────────────────────────┘
                    │ Async replication
┌─── Region B (DR) ─┼────────────────────────────────────┐
│                   ▼                                    │
│  ┌─── AZ 3 ───────────┐                                │
│  │ AAP Controller (S) │                                │
│  │ EDB PostgreSQL (S) │                                │
│  │ Execution Nodes    │                                │
│  └────────────────────┘                                │
└────────────────────────────────────────────────────────┘
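The same layout can be written down as an Ansible inventory so that playbooks can target a region, an AZ, or all database nodes. This is only a sketch: every hostname, the `db_role` variable, and the region and AZ group names are hypothetical, and it is a generic YAML inventory, not the AAP installer's inventory format. Only the `db_servers` group name comes from the playbook in this article.

```yaml
# Hypothetical inventory mirroring the diagram: two AZs in the primary
# region and one AZ in the DR region. All names are placeholders.
all:
  children:
    region_a:
      children:
        az1:
          hosts:
            controller-az1.example.com:
            db-az1.example.com:
              db_role: primary
            exec-az1.example.com:
        az2:
          hosts:
            controller-az2.example.com:
            db-az2.example.com:
              db_role: standby
            exec-az2.example.com:
    region_b:
      children:
        az3:
          hosts:
            controller-az3.example.com:
            db-az3.example.com:
              db_role: standby   # fed by async replication from Region A
            exec-az3.example.com:
    # Cross-cutting group used by database playbooks
    db_servers:
      hosts:
        db-az1.example.com:
        db-az2.example.com:
        db-az3.example.com:
```

Grouping by region and AZ keeps drills scoped: a play limited to `az2` touches only that zone's hosts.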
See also: AAP 2.6 Backup, Restore, and Disaster Recovery Guide
EDB PostgreSQL Configuration
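- name: Configure EDB Failover Manager
  hosts: db_servers
  roles:
    - role: edb.postgres.efm
      vars:
        efm_cluster_name: aap-cluster
        efm_notification_level: warning
        # Promote a standby automatically when the primary fails
        efm_auto_failover: true
        efm_auto_resume_period: 60
        # Keep the virtual IP in Ansible Vault rather than in plain text
        efm_virtual_ip: "{{ vault_efm_vip }}"
        # Requires fact gathering (enabled by default)
        efm_bind_address: "{{ ansible_default_ipv4.address }}"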
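After the cluster is configured, it is worth confirming that exactly one node acts as primary. The sketch below uses the `community.postgresql.postgresql_query` module and PostgreSQL's `pg_is_in_recovery()` function. The `postgres` OS user and database name are assumptions (EDB Postgres Advanced Server installations commonly use different defaults), and the module needs a PostgreSQL Python driver on the database hosts.

```yaml
# Sketch: report each node's replication role and check for a single primary.
# The become_user and database name are assumptions; adjust to your install.
- name: Verify primary and standby roles
  hosts: db_servers
  become: true
  become_user: postgres
  tasks:
    - name: Ask PostgreSQL whether this node is in recovery (standby) mode
      community.postgresql.postgresql_query:
        login_db: postgres
        query: SELECT pg_is_in_recovery() AS in_recovery
      register: recovery_state

    - name: Show the role of this node
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }} is {{ 'standby' if recovery_state.query_result[0].in_recovery else 'primary' }}"

    - name: Check that exactly one node reports itself as primary
      ansible.builtin.assert:
        that:
          - primaries | length == 1
        fail_msg: "Expected one primary, found {{ primaries | length }}"
      vars:
        # Collect the first result row from every host, keep rows not in recovery
        primaries: >-
          {{ ansible_play_hosts
             | map('extract', hostvars, 'recovery_state')
             | map(attribute='query_result')
             | map('first')
             | rejectattr('in_recovery')
             | list }}
      run_once: true
```

Running the same play after a failover drill shows whether a standby was promoted and whether the old primary rejoined as a standby.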
DR Failover Procedure
| Step | Action | Type | Time |
|---|---|---|---|
| 1 | Detect primary region failure | Automatic | ~30s |
| 2 | Promote EDB standby to primary | Automatic | ~30s |
| 3 | Update DNS to DR region | Manual/Automatic | ~60s |
| 4 | Execution nodes reconnect | Automatic | ~30s |
| 5 | Verify platform health | Automatic | ~30s |
| Total | | | < 3 minutes |
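Step 3 is the one most likely to need a human or an external trigger, so it is a natural candidate for a small playbook. The sketch below assumes the DNS zone is hosted in Amazon Route 53 and uses the `amazon.aws.route53` module. The zone, record, and DR endpoint names are hypothetical, and another DNS provider would need a different module.

```yaml
# Sketch of the DNS cutover step, assuming Route 53 hosts the zone.
# Zone, record, and endpoint names are placeholders.
- name: Point the platform DNS name at the DR region
  hosts: localhost
  gather_facts: false
  vars:
    dns_zone: example.com
    platform_record: aap.example.com
    dr_endpoint: aap-dr.region-b.example.com
  tasks:
    # Do not cut over to an endpoint that is not accepting connections yet
    - name: Wait until the DR endpoint accepts HTTPS connections
      ansible.builtin.wait_for:
        host: "{{ dr_endpoint }}"
        port: 443
        timeout: 60

    - name: Repoint the platform CNAME at the DR endpoint
      amazon.aws.route53:
        state: present
        zone: "{{ dns_zone }}"
        record: "{{ platform_record }}"
        type: CNAME
        ttl: 30
        value: "{{ dr_endpoint }}"
        overwrite: true
```

A short TTL on the record keeps client caches from extending the cutover well past the time the update itself takes.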
See also: Ansible Private Automation Hub: Host & Manage Collections (Guide)
FAQ
Is this topology available in AAP 2.7?
Yes. The single HA/DR topology is the recommended production deployment for AAP 2.7+.
Do I need EDB PostgreSQL or can I use standard PostgreSQL?
EDB PostgreSQL is recommended for the full HA/DR capability including automatic failover. Standard PostgreSQL works for non-HA deployments.
What happens to running jobs during failover?
Running jobs may fail during failover. The documented failure boundaries tell you exactly which jobs are safe to re-run after recovery.
Can I test DR without impacting production?
Yes. The DR blueprint includes test procedures for validating failover without affecting production workloads.
Related Articles
• Ansible Solution Guides: AIOps Partner Walkthroughs
• Red Hat Ansible Automation Platform 2.7: What's New
• Red Hat Summit 2026 Highlights