Ansible Automation Mesh: Scalable Automation Across Hybrid Cloud Environments
By Luca Berton · Published 2024-01-01 · Category: installation
How Ansible Automation Mesh enables scalable automation across datacenters, cloud, and edge. Configure hop nodes, execution nodes, and mesh topologies.
Introduction
Automation Mesh is a feature of Ansible Automation Platform that creates a scalable, resilient overlay network for distributing automation workloads across geographically dispersed infrastructure. Instead of requiring direct connectivity from a central controller to every managed host, Automation Mesh routes automation traffic through intermediate hop nodes and distributes execution to local execution nodes.
Why Automation Mesh?
Traditional automation architectures break down when:
• Network restrictions prevent direct SSH from a central controller to remote datacenters
• Latency makes running playbooks from a central location impractical
• Compliance requires automation execution to stay within specific network boundaries
• Scale exceeds what a single controller instance can handle
Automation Mesh solves these by distributing the automation workload closer to the managed infrastructure.
Mesh Node Types
Control Nodes
The automation controller itself. Manages the web UI, API, job scheduling, RBAC, and dispatches work to the mesh.
Execution Nodes
Run Ansible playbooks locally. Placed close to managed infrastructure for low latency and network isolation compliance.
┌─────────────────────────┐
│     Execution Node      │
│  ┌───────────────────┐  │
│  │  ansible-runner   │  │
│  │  Execution Env    │  │
│  │  (containers)     │  │
│  └───────────────────┘  │
│  Direct access to       │
│  managed hosts          │
└─────────────────────────┘
Hop Nodes
Route traffic between control and execution nodes without running playbooks themselves. Like network routers for automation traffic.
Control Node ──→ Hop Node ──→ Execution Node ──→ Managed Hosts
(HQ)             (DMZ/WAN)    (Remote DC)        (Servers)
Example Topologies
Hub and Spoke
           ┌──────────────┐
           │   Control    │
           │     (HQ)     │
           └──────┬───────┘
     ┌────────────┼────────────┐
     ▼            ▼            ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│   Exec   │ │   Exec   │ │   Exec   │
│  (US-E)  │ │   (EU)   │ │  (APAC)  │
└──────────┘ └──────────┘ └──────────┘
Multi-Hop (Restricted Networks)
Control ──→ Hop (DMZ) ──→ Hop (Partner VPN) ──→ Exec (Partner DC)
(HQ)        Firewall OK   Cross-org tunnel      Local execution
Redundant Mesh
Control-1 ←──→ Control-2
    │  ╲        ╱  │
    │    ╲    ╱    │
    ▼      ╳       ▼
  Hop-A ←──────→ Hop-B
    │              │
    ▼              ▼
  Exec-1         Exec-2
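One way to reason about redundancy in a topology like this is to remove each node in turn and see which execution nodes the controller can no longer reach. The Python sketch below does that for the links as drawn in the diagram; the `LINKS` list and both function names are illustrative assumptions, not part of any Ansible or receptor API.

```python
# Links of the redundant topology as drawn above, treated as undirected.
LINKS = [
    ("Control-1", "Control-2"),
    ("Control-1", "Hop-A"), ("Control-1", "Hop-B"),
    ("Control-2", "Hop-A"), ("Control-2", "Hop-B"),
    ("Hop-A", "Hop-B"),
    ("Hop-A", "Exec-1"), ("Hop-B", "Exec-2"),
]


def reachable(links, start, removed=None):
    """Return the set of nodes reachable from start, ignoring the removed node."""
    graph = {}
    for a, b in links:
        if removed in (a, b):
            continue  # skip every link that touches the failed node
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def single_points_of_failure(links, control, exec_nodes):
    """Map each node to the execution nodes that become unreachable if it fails."""
    nodes = {n for link in links for n in link} - {control}
    result = {}
    for node in sorted(nodes):
        still_up = reachable(links, control, removed=node)
        lost = sorted(e for e in exec_nodes if e != node and e not in still_up)
        if lost:
            result[node] = lost
    return result


print(single_points_of_failure(LINKS, "Control-1", ["Exec-1", "Exec-2"]))
```

For the diagram as drawn, the check reports that each execution node still depends on a single hop node, because Exec-1 is linked only to Hop-A and Exec-2 only to Hop-B. Adding a second uplink per execution node would remove those entries.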
Configuration
Inventory for Mesh Setup
[automationcontroller]
controller.example.com

[execution_nodes]
# The installer expects hop and execution nodes in this group; node_type sets the role.
# peers lists the nodes each host opens an outbound connection to.
exec-us-east.example.com node_type=execution peers=hop-dmz.example.com
exec-eu-west.example.com node_type=execution peers=hop-dmz.example.com
exec-apac.example.com node_type=execution peers=hop-dmz.example.com
hop-dmz.example.com node_type=hop peers=controller.example.com
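The peers settings above describe which node connects to which, and a job's traffic follows those links. As a rough illustration (not how receptor itself computes routes), the Python sketch below treats the links as an undirected graph, since an established connection carries traffic in both directions, and finds the shortest path with a breadth-first search. The `PEERS` dictionary mirrors the example inventory; the function name is hypothetical.

```python
from collections import deque

# Outbound peer links mirroring the example inventory (node -> nodes it connects to).
PEERS = {
    "exec-us-east.example.com": ["hop-dmz.example.com"],
    "exec-eu-west.example.com": ["hop-dmz.example.com"],
    "exec-apac.example.com": ["hop-dmz.example.com"],
    "hop-dmz.example.com": ["controller.example.com"],
}


def shortest_route(peers, source, target):
    """Return the shortest list of nodes from source to target, or None."""
    # Build an undirected adjacency map from the outbound peer links.
    graph = {}
    for node, neighbours in peers.items():
        for other in neighbours:
            graph.setdefault(node, set()).add(other)
            graph.setdefault(other, set()).add(node)

    queue = deque([[source]])
    visited = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in sorted(graph.get(path[-1], ())):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


route = shortest_route(PEERS, "controller.example.com", "exec-eu-west.example.com")
print(" -> ".join(route))
```

Running it prints the controller, then the hop node, then the EU execution node, which matches the single-hop layout the inventory describes.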
Instance Group Assignment
Map execution nodes to instance groups, then assign groups to job templates:
# Instance Groups
US-East Group:
  Nodes: [exec-us-east.example.com]
EU-West Group:
  Nodes: [exec-eu-west.example.com]
APAC Group:
  Nodes: [exec-apac.example.com]

# Job Template → Instance Group
"Deploy US App Servers":
  Instance Group: US-East Group
  # Playbook runs on exec-us-east, close to US servers
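The mapping above amounts to a two-step lookup: job template to instance group, then instance group to execution nodes. The Python sketch below models that lookup with plain dictionaries; the data is a hypothetical in-memory copy of the example, and a real deployment would read it from the controller rather than hard-code it.

```python
# Hypothetical in-memory copy of the example mapping above.
INSTANCE_GROUPS = {
    "US-East Group": ["exec-us-east.example.com"],
    "EU-West Group": ["exec-eu-west.example.com"],
    "APAC Group": ["exec-apac.example.com"],
}
JOB_TEMPLATES = {"Deploy US App Servers": "US-East Group"}


def nodes_for_template(template):
    """Return the execution nodes eligible to run the given job template."""
    group = JOB_TEMPLATES[template]
    nodes = INSTANCE_GROUPS.get(group, [])
    if not nodes:
        # An empty or unknown group would leave the job with nowhere to run.
        raise ValueError(f"instance group {group!r} has no execution nodes")
    return nodes


print(nodes_for_template("Deploy US App Servers"))
```

A check like this can catch a job template that points at an empty or misspelled instance group before the job is launched.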
Mesh Communication
Automation Mesh uses receptor, a lightweight overlay network built on mutual TLS:
• All traffic encrypted with per-node certificates
• Automatic route discovery and failover
• UDP and TCP transport options
• Works across firewalls (outbound-only connections from execution/hop nodes)
Firewall Requirements
| From | To | Port | Protocol |
|------|----|------|----------|
| Exec/Hop | Control | 27199/tcp | Receptor mesh |
| Control | Exec | 27199/tcp | Receptor mesh |
| Exec | Managed hosts | 22/tcp | SSH (automation) |
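Key insight: when hop and execution nodes initiate the outbound connections (as with the peers settings in the inventory above), restricted networks only need to allow outbound 27199/tcp toward the hop node or controller; no inbound rules are needed on the remote firewalls. The Control → Exec rule applies only to execution nodes that the controller connects to directly.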
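Before installing a node, it can help to confirm that the firewall actually permits the outbound connection. The following Python sketch attempts a plain TCP connection to the receptor port listed in the table above. It only shows that the port is reachable; it does not perform the mutual TLS handshake. The function name and default timeout are assumptions chosen for illustration.

```python
import socket

RECEPTOR_PORT = 27199  # receptor mesh port from the firewall table


def can_connect(host, port=RECEPTOR_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Run from an execution node, a call such as `can_connect("hop-dmz.example.com")` returning `False` suggests that the firewall rule is missing or that the peer is not listening yet.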
Use Cases
Multi-Cloud Automation
Controller (AWS us-east-1)
├── Exec Node (AWS eu-west-1) → EU AWS resources
├── Exec Node (Azure westeurope) → Azure resources
└── Exec Node (GCP us-central1) → GCP resources
Edge Computing
Controller (Central DC)
├── Hop (Regional Hub)
│   ├── Exec (Store-001) → POS systems
│   ├── Exec (Store-002) → POS systems
│   └── Exec (Store-003) → POS systems
└── Hop (Regional Hub 2)
    ├── Exec (Store-101)
    └── Exec (Store-102)
Air-Gapped Environments
Controller (Corporate)
└── Hop (Data Diode/Gateway)
    └── Exec (Air-gapped DC)
        → Classified servers (no internet)
Monitoring Mesh Health
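# Check mesh topology via API (requires authentication; CONTROLLER_TOKEN is a placeholder)
curl -H "Authorization: Bearer $CONTROLLER_TOKEN" https://controller.example.com/api/v2/mesh_visualizer/
# Check receptor status on any node (lists connections, known nodes, and routes)
# The socket path is the usual AAP default and may differ on your installation
receptorctl --socket /var/run/awx-receptor/receptor.sock status
# Ping a node through the mesh
receptorctl --socket /var/run/awx-receptor/receptor.sock ping exec-us-east.example.com
# Trace the route taken to a node
receptorctl --socket /var/run/awx-receptor/receptor.sock traceroute exec-us-east.example.com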
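The output of the mesh_visualizer endpoint can feed a simple alerting check. The Python sketch below works on a sample payload and assumes the response contains a `nodes` list whose entries carry `hostname`, `node_type`, and `node_state` fields, with `ready` as the healthy state. Verify those field names against your controller version before relying on them; the function name is hypothetical.

```python
import json

# Sample payload shaped like the assumed /api/v2/mesh_visualizer/ response.
SAMPLE = json.loads("""
{
  "nodes": [
    {"hostname": "controller.example.com", "node_type": "control", "node_state": "ready"},
    {"hostname": "hop-dmz.example.com", "node_type": "hop", "node_state": "ready"},
    {"hostname": "exec-apac.example.com", "node_type": "execution", "node_state": "unavailable"}
  ]
}
""")


def unhealthy_nodes(payload, healthy_state="ready"):
    """Return (hostname, node_type, node_state) for nodes not in the healthy state."""
    return [
        (n.get("hostname"), n.get("node_type"), n.get("node_state"))
        for n in payload.get("nodes", [])
        if n.get("node_state") != healthy_state
    ]


for host, kind, state in unhealthy_nodes(SAMPLE):
    print(f"ALERT: {kind} node {host} is {state}")
```

In practice the payload would come from the authenticated API call shown above, and the alert line would go to whatever monitoring system is already in place.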
Best Practices
• Place execution nodes close to managed infrastructure: minimize latency and network hops
• Use hop nodes for network boundaries: don't expose execution nodes directly to the internet
• Redundant hop nodes: at least two hop nodes per path for failover
• Instance groups per region/environment: map execution nodes to job templates logically
• Monitor receptor mesh health: set up alerts for node disconnections
• Plan capacity: each execution node handles a finite number of concurrent jobs (forks)
• Keep mesh certificates rotated: automate receptor certificate renewal
• Test failover: regularly verify that jobs route through alternate paths when nodes go down
FAQ
How many execution nodes do I need?
Depends on concurrent job count and fork count per job. A single execution node can handle 50-200 concurrent forks depending on hardware. Start with one per region and scale based on queue depth.
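A back-of-the-envelope estimate can make the sizing concrete. The Python sketch below assumes the commonly documented controller defaults of 4 forks per CPU core, 100 MB of memory per fork, and 2 GB of memory reserved for services, with a capacity adjustment between 0 (the lower of the two limits) and 1 (the higher). Treat these numbers as assumptions to check against your own version and settings; the function is illustrative and not a controller API.

```python
def fork_capacity(cpu_cores, memory_mb, capacity_adjustment=1.0,
                  forks_per_cpu=4, mb_per_fork=100, reserved_mb=2048):
    """Estimate how many forks a node can run, under the assumed defaults."""
    cpu_cap = cpu_cores * forks_per_cpu
    mem_cap = max((memory_mb - reserved_mb) // mb_per_fork, 0)
    low, high = sorted((cpu_cap, mem_cap))
    # Interpolate between the stricter and the looser of the two limits.
    return int(low + (high - low) * capacity_adjustment)


# Example: an 8-core node with 16 GB of RAM.
for adjustment in (0.0, 0.5, 1.0):
    print(adjustment, fork_capacity(8, 16384, adjustment))
```

For the 8-core, 16 GB example the estimate ranges from 32 forks (CPU-bound) to 143 forks (memory-bound), which shows how much the result depends on the capacity adjustment as well as the hardware.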
Can I use Automation Mesh with AWX?
Partly. Automation Mesh as a packaged, supported feature ships with AAP, but AWX is built on the same receptor technology, and recent AWX releases can add remote execution and hop nodes as instances.
What's the overhead of hop nodes?
Minimal — hop nodes only relay traffic. A small VM (2 vCPU, 4GB RAM) can handle thousands of concurrent relay connections.
Does mesh work over the internet?
Yes — all receptor traffic is encrypted with mutual TLS. Hop nodes can relay across WAN/internet connections securely.
Conclusion
Automation Mesh transforms Ansible Automation Platform from a centralized tool into a distributed automation fabric. By placing execution close to managed infrastructure and routing through hop nodes, enterprises can automate across hybrid cloud, edge, and restricted network environments without compromising security or performance.