Ansible Performance Tuning: Speed Up Playbooks 10x with These Optimizations

By Luca Berton · Published 2024-01-01 · Category: installation

Optimize Ansible playbook performance. Configure pipelining, SSH multiplexing, fact caching, async tasks, mitogen, forks, and callback plugins. Reduce execution time from hours to minutes at scale.

Why Ansible Feels Slow

Without connection reuse, Ansible opens a new SSH connection for every task on every host. A playbook with 20 tasks across 100 hosts therefore makes 20 × 100 = 2,000 SSH connections, and each one pays for a TCP handshake, key exchange, authentication, and module transfer. At scale, this overhead dominates execution time.

The good news: most of this is fixable with configuration.

Quick Wins (ansible.cfg)

Impact of each setting:

| Setting | Default | Optimized | Speedup |
|---------|---------|-----------|---------|
| forks | 5 | 50 | ~10x on 50+ hosts |
| pipelining | False | True | ~2x per task |
| ControlPersist | none | 60s | ~3x for multi-task plays |
| gathering | implicit | smart | Skip unchanged facts |
| fact_caching | off | jsonfile | Skip fact gathering entirely |

Pipelining

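Pipelining is usually the single biggest performance improvement. Without it, Ansible copies each module to the remote host (over SFTP by default), executes it, and then removes the temporary files. With pipelining, the module is piped to the remote Python interpreter through the existing SSH session, so no file transfer is needed.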
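Pipelining is normally switched on in ansible.cfg. As a minimal sketch, the same option can also be set as a connection variable, which is handy for enabling it on one group of hosts at a time. The file location below is an assumption, not something the article specifies:

```yaml
# group_vars/all.yml (assumed location; any inventory variable file works)
# Connection variable read by the ssh connection plugin.
ansible_ssh_pipelining: true
```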

Requirements: requiretty must NOT be set in /etc/sudoers on managed hosts. Most modern distributions don't set it. Check with:
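One possible way to run that check from the control node is a small play, sketched here under the assumption that sudoers rules live in /etc/sudoers and /etc/sudoers.d. Pipelining is forced off for this play so that privilege escalation still works on hosts that do require a tty:

```yaml
- name: Check managed hosts for requiretty (sketch)
  hosts: all
  become: true
  gather_facts: false
  vars:
    ansible_ssh_pipelining: false   # keep pipelining off while checking
  tasks:
    - name: Search sudoers files for requiretty
      ansible.builtin.command: grep -rn requiretty /etc/sudoers /etc/sudoers.d
      register: requiretty_check
      changed_when: false
      # grep exits 0 on a match and 1 on no match; anything else is an error
      failed_when: requiretty_check.rc not in [0, 1]

    - name: Show matching lines for review
      ansible.builtin.debug:
        msg: "{{ requiretty_check.stdout_lines }}"
      when: requiretty_check.rc == 0
```

The search also matches commented-out lines and `!requiretty` entries, so the reported lines still need a human read.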

If it's set, remove it or add an exception:
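An exception can be added with a sudoers drop-in file. The sketch below assumes the automation account is called `ansible` and that the host's sudoers configuration includes /etc/sudoers.d; both the account name and the drop-in filename are hypothetical:

```yaml
- name: Exempt the automation user from requiretty (sketch)
  hosts: all
  become: true
  gather_facts: false
  vars:
    ansible_ssh_pipelining: false   # stay off until the exception is in place
    automation_user: ansible        # hypothetical account name
  tasks:
    - name: Add a sudoers drop-in that disables requiretty for one user
      ansible.builtin.copy:
        dest: /etc/sudoers.d/90-ansible-requiretty   # hypothetical filename
        content: "Defaults:{{ automation_user }} !requiretty\n"
        owner: root
        group: root
        mode: "0440"
        # Refuse to install the file if visudo rejects its syntax
        validate: /usr/sbin/visudo -cf %s
```

Validating with visudo before the file lands avoids locking the account out of sudo with a malformed rule.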

SSH Multiplexing

Reuse SSH connections instead of creating new ones for each task:
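The usual place for these options is ssh_args in ansible.cfg. As a hedged sketch, they can also be supplied through the `ansible_ssh_args` connection variable; the file location is an assumption:

```yaml
# group_vars/all.yml (assumed location)
# Setting ansible_ssh_args replaces Ansible's default SSH arguments,
# so list every option that should apply.
ansible_ssh_args: "-o ControlMaster=auto -o ControlPersist=60s"
```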

This creates a master SSH connection that later sessions to the same host reuse. ControlPersist=60s keeps the master connection open for 60 seconds after it goes idle.

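For large inventories or long hostnames, set a short control_path in the [ssh_connection] section of ansible.cfg so that the socket path stays within the operating system's limit on Unix socket path length.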

Forks — Parallel Execution

Set forks based on your control node's capacity:
• Laptop: 20-30
• Dedicated control node: 50-100
• AAP controller: 200+

Monitor CPU and memory on the control node. Each fork uses ~50-100 MB RAM.

Fact Caching

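Gathering facts (hostname, OS, IP, disks, etc.) takes 2-5 seconds per host. Cache them by setting fact_caching = jsonfile in the [defaults] section of ansible.cfg, together with fact_caching_connection (the directory that holds the cache files) and fact_caching_timeout (the cache lifetime in seconds).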

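Or set fact_caching = redis to share one cache across control nodes. In that case fact_caching_connection holds the Redis connection details instead of a directory.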

Disable Facts When Not Needed
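When a play never references facts, gathering can be turned off for that play. A minimal sketch, with a hypothetical host group and service name:

```yaml
- name: Restart a service without gathering facts
  hosts: webservers        # hypothetical group
  gather_facts: false      # skip the setup module entirely
  become: true
  tasks:
    - name: Restart nginx
      ansible.builtin.service:
        name: nginx        # hypothetical service
        state: restarted
```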

Async Tasks

Run long tasks asynchronously and poll or check later:
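A sketch of the fire-and-check-later pattern, assuming a hypothetical script at /usr/local/bin/long_job.sh. The task is started with `poll: 0`, other work continues, and `async_status` collects the result at the end:

```yaml
- name: Run a long job without blocking the play
  hosts: all
  gather_facts: false
  tasks:
    - name: Start the long-running job
      ansible.builtin.command: /usr/local/bin/long_job.sh   # hypothetical script
      async: 3600   # allow the job up to one hour
      poll: 0       # do not wait; continue with the next task
      register: long_job

    - name: Do other work while the job runs
      ansible.builtin.debug:
        msg: "Started job {{ long_job.ansible_job_id }}"

    - name: Wait for the job to finish
      ansible.builtin.async_status:
        jid: "{{ long_job.ansible_job_id }}"
      register: job_result
      until: job_result.finished
      retries: 120   # 120 checks x 30 seconds = up to one hour
      delay: 30
```

Keeping `retries` × `delay` in line with the `async` limit means the wait ends at roughly the same time the job itself would be cut off.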

Parallel Package Installation

Mitogen

Mitogen replaces Ansible's SSH module transfer mechanism with a persistent Python interpreter on remote hosts. It can provide 1.25x to 7x speedup depending on playbook complexity.

Caveats:
• Not always compatible with the latest Ansible versions
• Doesn't work with all connection types
• Some modules may behave differently
• Test thoroughly before production use
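For testing, Mitogen can be limited to a single play rather than enabled globally. This sketch assumes Mitogen is already installed on the control node and that strategy_plugins in ansible.cfg points at its strategy plugin directory; without that, the strategy name below will not resolve:

```yaml
- name: Try one play with the Mitogen strategy (sketch)
  hosts: staging           # hypothetical group; start outside production
  strategy: mitogen_linear # provided by Mitogen, not by Ansible itself
  tasks:
    - name: Confirm the host is reachable through Mitogen
      ansible.builtin.ping:
```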

Free Strategy — True Parallel Execution

The default strategy (linear) waits for all hosts to complete a task before moving to the next. The free strategy lets each host run through its tasks independently:
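A minimal sketch of a play that opts into the free strategy, using read-only tasks so that ordering between hosts does not matter:

```yaml
- name: Collect uptime from every host independently
  hosts: all
  strategy: free        # each host moves on as soon as its own task finishes
  gather_facts: false
  tasks:
    - name: Read uptime
      ansible.builtin.command: uptime
      changed_when: false
      register: uptime_result

    - name: Show uptime
      ansible.builtin.debug:
        var: uptime_result.stdout
```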

When to use free:
• Tasks are independent (no cross-host dependencies)
• Hosts have different speeds (fast hosts don't wait for slow ones)
• Large-scale deployments where any parallelism helps

Don't use free when:
• Task order matters across hosts (rolling updates)
• You need serial execution (serial: 1)

Reduce Task Count

Use Package Lists Instead of Loops
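The difference can be sketched with two tasks that install the same packages. The package names are placeholders; the first task runs the package module once per item, the second passes the whole list in a single call:

```yaml
- name: Compare looped and list-based package installs
  hosts: all
  become: true
  tasks:
    - name: Slow, one module run per package
      ansible.builtin.package:
        name: "{{ item }}"
        state: present
      loop:
        - git
        - curl
        - vim

    - name: Faster, one module run for the whole list
      ansible.builtin.package:
        name:
          - git
          - curl
          - vim
        state: present
```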

Use ansible.builtin.template with loop for Multiple Files
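A sketch of one looped template task replacing several near-identical tasks. The template names and destinations are hypothetical:

```yaml
- name: Render several config files with one task
  hosts: webservers        # hypothetical group
  become: true
  tasks:
    - name: Render configuration templates
      ansible.builtin.template:
        src: "{{ item.src }}"
        dest: "{{ item.dest }}"
        mode: "0644"
      loop:
        - { src: nginx.conf.j2, dest: /etc/nginx/nginx.conf }
        - { src: site.conf.j2, dest: /etc/nginx/conf.d/site.conf }
      loop_control:
        label: "{{ item.dest }}"   # keep the per-item output short
```

The module still runs once per item, so the main gain here is a shorter playbook with less per-task bookkeeping rather than fewer remote executions.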

Use ansible.builtin.copy with directory Recursion
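A sketch of copying a directory tree in one task instead of one task per file. The source and destination paths are hypothetical:

```yaml
- name: Copy a whole directory in one task
  hosts: webservers        # hypothetical group
  become: true
  tasks:
    - name: Copy the contents of the config directory
      ansible.builtin.copy:
        src: files/app-config/   # trailing slash copies the contents, not the directory itself
        dest: /etc/myapp/
        owner: root
        group: root
        mode: "0644"             # applied to the copied files
```

This suits small trees. For directories with hundreds of files, the copy module's recursion tends to become slow, and a synchronization-based approach may be a better fit.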

Profile Your Playbooks

Enable Timing Callbacks

Output:

Custom Timing
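One possible way to time a specific phase by hand, sketched here as an illustration rather than the article's own method, is to record a timestamp on the control node before and after the work. The fact name `phase_start` and the `sleep` placeholder are hypothetical, and the sketch assumes a `date` command is available on the control node:

```yaml
- name: Time a group of tasks by hand (sketch)
  hosts: all
  gather_facts: false
  tasks:
    - name: Record the start time on the control node
      ansible.builtin.set_fact:
        phase_start: "{{ lookup('ansible.builtin.pipe', 'date +%s') | int }}"

    - name: Work being measured (placeholder)
      ansible.builtin.command: sleep 2
      changed_when: false

    - name: Report elapsed seconds
      ansible.builtin.debug:
        msg: "Phase took {{ (lookup('ansible.builtin.pipe', 'date +%s') | int) - (phase_start | int) }}s"
```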

Optimized ansible.cfg Template

FAQ

How much faster can Ansible get with tuning?

Typical improvement is 3-10x. A playbook taking 30 minutes can often be reduced to 5-10 minutes with pipelining, SSH multiplexing, increased forks, and fact caching. The biggest gains come from pipelining and forks.

Is Mitogen safe for production?

Mitogen is used in production by many organizations, but it has compatibility limitations with newer Ansible versions. Test thoroughly in staging first. The maintained fork at github.com/mitogen-hq/mitogen is the most reliable.

Should I always use strategy: free?

No. Use free only when tasks are independent across hosts. For most production deployments, linear (default) with serial for rolling updates is safer. free is best for read-only operations like compliance checks or fact gathering.

Why is gather_facts so slow?

Ansible runs the setup module which collects hundreds of facts (hardware, network, OS, mounts, etc.). Use gather_subset to collect only what you need:
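A minimal sketch that gathers only network facts, assuming the play needs nothing beyond the default IPv4 address:

```yaml
- name: Gather only the facts the play needs
  hosts: all
  gather_facts: true
  gather_subset:
    - "!all"      # drop the full fact set
    - "!min"      # drop the minimal set as well
    - network     # keep only network facts
  tasks:
    - name: Show the default IPv4 address
      ansible.builtin.debug:
        var: ansible_facts.default_ipv4.address
```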

How many forks should I use?

Rule of thumb: start with the number of CPU cores × 5, so a 4-core machine handles 20 forks well. Monitor memory, because each fork uses ~50-100 MB (100 forks can need 5-10 GB). For 100+ forks, use a dedicated control node with 8+ GB RAM.

Conclusion

Ansible performance tuning is about reducing connection overhead and maximizing parallelism. Enable pipelining, SSH multiplexing, and fact caching in ansible.cfg. Increase forks to match your infrastructure scale. Use async for long-running tasks and the free strategy for independent operations, and profile with callback plugins to find bottlenecks. Together, these optimizations typically deliver a 3-10x speedup with minimal effort.
