
AAP 2.6 Automation Mesh: Distributed Execution Across Sites and Networks

By Luca Berton · Published 2024-01-01 · Category: installation

Deploy and manage Automation Mesh in AAP 2.6 for distributed automation across data centers, DMZs, and cloud regions. Configure control nodes, execution nodes, and hop nodes with Receptor protocol. Complete topology examples.

What Is Automation Mesh?

Automation Mesh is an overlay network in AAP 2.6 that separates control capacity from execution capacity. It lets you distribute automation execution across multiple sites, network zones, and cloud regions while maintaining centralized control through Automation Controller.

Mesh replaces the legacy isolated nodes concept from Ansible Tower with a more flexible, resilient, peer-to-peer architecture built on the Receptor protocol.

Why Automation Mesh Matters

Without Mesh, all automation runs on or directly from the Controller nodes. This creates bottlenecks:

• Network limitations: the Controller must reach every managed host directly
• Security concerns: the Controller needs firewall access to every network zone
• Scalability ceiling: Controller CPU/RAM limits concurrent job capacity
• Latency: remote sites experience slow execution over WAN links

Mesh solves all of these by offloading execution to distributed nodes that are close to the managed hosts.

Mesh Node Types

| Node Type | Role | Runs Jobs? | Description |
|-----------|------|------------|-------------|
| Control node | Automation Controller | No | Schedules and orchestrates jobs. Dispatches work to execution nodes. |
| Execution node | Job runner | Yes | Runs Ansible playbooks inside Execution Environments. Place near managed hosts. |
| Hop node | Network relay | No | Relays Receptor traffic between nodes. Does not run jobs. Used for DMZ/firewall traversal. |
| Hybrid node | Control + Execution | Yes | Acts as both control and execution. Default for single-node deployments. |

The Receptor Protocol

Automation Mesh uses Receptor (TCP port 27199) for all node-to-node communication (a minimal configuration sketch follows the list):

• Bidirectional: control nodes and execution nodes communicate in both directions
• Encrypted: TLS by default
• Resilient: automatic reconnection after network interruptions
• Efficient: multiplexed connections with lower overhead than SSH
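The installer generates the Receptor configuration on each node; a simplified, hypothetical /etc/receptor/receptor.conf for an execution node might look like the sketch below (the node ID and paths are illustrative, and TLS settings are omitted for brevity):

```yaml
---
# Identity of this node within the mesh
- node:
    id: exec01.example.com

# Listen for Receptor peers on the standard mesh port
- tcp-listener:
    port: 27199

# Local control socket used by receptorctl
- control-service:
    service: control
    filename: /var/run/receptor/receptor.sock

# Accept ansible-runner work dispatched by the Controller
- work-command:
    worktype: ansible-runner
    command: ansible-runner
    params: worker
    allowruntimeparams: true
```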

How Receptor Differs from SSH

| Feature | Receptor (Mesh) | SSH (Legacy Isolated) |
|---------|-----------------|----------------------|
| Protocol | TCP 27199 | TCP 22 |
| Direction | Bidirectional | Control → Isolated only |
| Routing | Multi-hop peer-to-peer | Direct connection only |
| Connection | Persistent, multiplexed | Per-job connection |
| Failover | Automatic path selection | Manual reconfiguration |
| Overhead | Low (single connection) | Higher (per-host SSH) |

Topology Patterns

Pattern 1: Simple Hub and Spoke

Best for single-site deployments with moderate scale.

Pattern 2: DMZ Traversal with Hop Nodes

Execution nodes in a restricted network zone reached through a hop node in the DMZ.
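In the RPM-based installer inventory, this pattern is typically expressed with the node_type and peers host variables; a hedged sketch with placeholder hostnames:

```ini
[execution_nodes]
# Hop node in the DMZ: relays Receptor traffic, never runs jobs
hop01.dmz.example.com node_type=hop

# Execution nodes in the restricted zone reach the Controller through the hop node
exec01.restricted.example.com node_type=execution peers=hop01.dmz.example.com
exec02.restricted.example.com node_type=execution peers=hop01.dmz.example.com
```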

Pattern 3: Multi-Site with Regional Execution

Distribute execution across geographic regions for low-latency automation.

Pattern 4: Cloud Hybrid

On-premises Controller with execution nodes in multiple cloud providers.

Configuring Mesh in the Installer

Container Enterprise Topology
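The containerized installer describes this topology in its inventory file; a rough, hypothetical sketch of the group layout (group names follow recent 2.x containerized installer documentation, hostnames are placeholders, and you should verify against your installer version):

```ini
[automationgateway]
gateway.example.com

[automationcontroller]
controller01.example.com
controller02.example.com

[automationhub]
hub.example.com

[automationeda]
eda.example.com

[execution_nodes]
# Per-node Receptor settings (node type, peers) are supplied as host variables
exec01.example.com
exec02.example.com

[database]
db.example.com
```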

RPM Enterprise Topology
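For an RPM-based enterprise install, the mesh topology is driven by node_type and peers in the installer inventory; a hedged sketch with placeholder hostnames (group names follow the RPM installer conventions):

```ini
[automationgateway]
gateway.example.com

[automationcontroller]
controller01.example.com
controller02.example.com

[automationcontroller:vars]
# Dedicated control nodes: they schedule jobs but never execute them
node_type=control
peers=execution_nodes

[execution_nodes]
hop01.example.com node_type=hop
exec01.example.com node_type=execution peers=hop01.example.com
exec02.example.com node_type=execution peers=hop01.example.com

[automationhub]
hub.example.com

[automationedacontroller]
eda.example.com

[database]
db.example.com
```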

Instance Groups

Instance groups assign execution nodes to specific automation workloads:
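One way to manage these assignments as code is the instance_group module from the awx.awx / ansible.controller collection; a hedged sketch, assuming placeholder node hostnames and using policy_instance_list to pin specific execution nodes:

```yaml
---
- name: Create an instance group for network automation
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Ensure the network-automation instance group exists
      ansible.controller.instance_group:
        name: network-automation
        # Pin the execution nodes that should serve this group (placeholder hostnames)
        policy_instance_list:
          - exec01.example.com
          - exec02.example.com
        state: present
```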

Instance Group Use Cases

| Instance Group | Execution Nodes | Use Case |
|----------------|-----------------|----------|
| default | All general-purpose nodes | Standard automation |
| network-automation | Nodes with network access | Router/switch management |
| dmz-servers | Nodes in DMZ | Web server automation |
| cloud-aws | Nodes in AWS VPC | AWS resource management |
| compliance | Dedicated secure nodes | CIS/STIG scanning |

Mesh System Requirements

Per Red Hat tested configurations, each Automation Mesh node requires:

| Requirement | Minimum |
|-------------|---------|
| RAM | 16 GB |
| CPUs | 4 |
| Local disk | 60 GB |
| Disk IOPS | 3000 |
| OS | RHEL 9.4+ or RHEL 10+ |

Network Requirements

| Port | Protocol | Source | Destination | Purpose |
|------|----------|--------|-------------|---------|
| 27199 | TCP (Receptor) | Controller | Execution node | Direct mesh communication |
| 27199 | TCP (Receptor) | Controller | Hop node | Relay mesh communication |
| 27199 | TCP (Receptor) | Hop node | Execution node | Hop-to-execution relay |
| 80/443 | TCP (HTTPS) | Execution node | Hub / Gateway | Pull EE images, report results |
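On RHEL-based mesh nodes, the Receptor port can be opened with firewalld, for example:

```bash
# Allow inbound Receptor traffic on the mesh node (run as root)
firewall-cmd --permanent --add-port=27199/tcp
firewall-cmd --reload

# Confirm the port is listed
firewall-cmd --list-ports
```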

Monitoring Mesh Health

Via the UI

Navigate to Administration → Topology View in Platform Gateway to see a visual map of all mesh nodes, their connections, and health status.

Via the API
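A hedged example of polling instance health from the Controller API with curl and jq (the token and hostname are placeholders; the field names follow the /api/v2/instances/ endpoint):

```bash
# List mesh instances with their capacity, error state, and last heartbeat
curl -s -H "Authorization: Bearer $CONTROLLER_TOKEN" \
  "https://controller.example.com/api/v2/instances/" \
  | jq '.results[] | {hostname, node_type, capacity, errors, last_seen}'
```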

Health Check Indicators

| Indicator | Healthy | Unhealthy |
|-----------|---------|-----------|
| Node capacity | > 0 | 0 (node offline or overloaded) |
| Errors | Empty string | Error message present |
| Last heartbeat | Within last 120 s | More than 120 s ago |
| Connection count | Expected peers | Missing peers |

Scaling Mesh

Adding Execution Nodes

Add new execution nodes to handle increased workload:

1. Provision a new RHEL 9 VM that meets the minimum requirements.
2. Add it to the installer inventory under [execution_nodes].
3. Re-run the installer (see the sketch below).
4. Assign the node to the appropriate instance groups.
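A hedged sketch of steps 2 and 3 for an RPM-based install (hostnames and the bundle directory name are illustrative; containerized deployments re-run the containerized installer playbook instead):

```bash
# Step 2: append the new node to the installer inventory, for example:
# [execution_nodes]
# exec03.example.com node_type=execution peers=hop01.example.com

# Step 3: re-run the setup program from the installer bundle directory
cd ansible-automation-platform-setup-bundle-*/
./setup.sh
```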

Capacity Planning

Each execution node's capacity is calculated from its available CPU and memory.
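A hedged sketch of the calculation, using the default constants listed below (this mirrors the Controller's general approach rather than its exact internals):

```text
cpu_capacity = cpus × forks_per_cpu
mem_capacity = (total RAM − reserved memory) ÷ per_fork_memory

The effective capacity falls between the smaller and larger of the two values,
controlled by the instance's capacity adjustment setting.
```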

Default values:

• per_fork_memory: 100 MB
• forks_per_cpu: 4
• Reserved memory: ~2 GB for the OS

A node with 16 GB RAM and 4 CPUs therefore has a CPU-based capacity of 16 forks (4 × 4) and a memory-based capacity of roughly 140 forks ((16 GB − 2 GB) ÷ 100 MB); its effective capacity sits between those values depending on the capacity adjustment setting.

Troubleshooting

Execution Node Not Connecting

Check the following (example commands below):

• The firewall allows TCP 27199 between the Controller and the execution node
• The Receptor service is running: systemctl status receptor
• TLS certificates are valid and not expired
• DNS resolution works for the node hostname
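For example, these checks can be run from the execution node and the Controller (the hostname and socket path are placeholders; the socket location may differ per install):

```bash
# On the execution node: is Receptor running and listening on the mesh port?
systemctl status receptor
ss -tlnp | grep 27199

# From the Controller: can the execution node be reached on the Receptor port?
nc -zv exec01.example.com 27199

# Ask the local Receptor daemon how it currently sees the mesh
receptorctl --socket /var/run/receptor/receptor.sock status
```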

Jobs Stuck in Pending

Check:

• The instance group has healthy execution nodes
• Execution nodes have available capacity
• The EE image can be pulled from the registry on the execution nodes

Hop Node Not Relaying

Check the following (verification commands below):

• The hop node has connectivity to both the Controller and the execution nodes on TCP 27199
• Receptor is running and configured with node_type=hop
• The routing table shows the expected paths: receptorctl status
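To confirm the relay path end to end, Receptor's ping and traceroute subcommands can be run from the Controller (the node ID and socket path are placeholders):

```bash
# Verify the Controller can reach the execution node through the mesh
receptorctl --socket /var/run/receptor/receptor.sock ping exec01.example.com

# Show each Receptor hop on the path; the hop node should appear in the output
receptorctl --socket /var/run/receptor/receptor.sock traceroute exec01.example.com
```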

FAQ

Can execution nodes run without continuous Controller connectivity?

No. Execution nodes need a connection to the Controller (directly or through hop nodes) to receive job dispatches and return results. If the connection drops mid-job, Receptor will buffer and retry, but prolonged disconnection will cause job failures.

How many hop nodes do I need?

One hop node per network boundary is typical. For HA, deploy two hop nodes per boundary with both connected to execution nodes. Mesh automatically routes through available hop nodes.

Can I mix containerized and RPM execution nodes?

No. All mesh components in a single AAP deployment must use the same installation method. You cannot mix containerized Controller with RPM execution nodes.

What is the maximum mesh size?

Red Hat tests specific topologies (growth and enterprise). For very large meshes (50+ execution nodes), work with Red Hat to validate your topology. The practical limit depends on job frequency, Controller capacity, and network bandwidth.

Can execution nodes be in a different cloud than Controller?

Yes. This is the cloud hybrid pattern — Controller on-premises (or in one cloud) with execution nodes in other clouds. Use hop nodes in the DMZ to bridge network boundaries. Ensure latency between hop and execution nodes is acceptable (< 100ms recommended).

Conclusion

Automation Mesh is what transforms AAP from a single-site tool into an enterprise-scale distributed automation platform. By separating control from execution and adding hop-node routing, Mesh lets you automate anywhere — across data centers, DMZs, cloud regions, and air-gapped environments — while maintaining centralized visibility and control.

Related Articles

• AAP 2.6 Architecture and Components: Complete Guide
• AAP 2.6 Workflow Templates: Advanced Multi-Step Automation Guide
• AAP 2.6 RBAC and Gateway API
• AAP 2.6 Security Best Practices
• AAP 2.6 Execution Environments: Build, Manage, and Deploy Custom EEs
