Ansible Performance Optimization: Speed Up Playbooks for Large-Scale Environments
By Luca Berton · Published 2024-01-01 · Category: installation
Optimize Ansible playbook performance for enterprise scale. Reduce execution time with pipelining, async tasks, fact caching, mitogen, and parallel execution.
Introduction
A playbook that takes 5 minutes on 10 hosts can take hours across 1,000. Enterprise Ansible deployments require performance optimization to keep execution times practical. This guide covers every major optimization — from ansible.cfg tuning to architectural patterns that dramatically reduce playbook runtime.
See also: Networking Throttle Strategies for Managing 3000 Servers with Ansible
Quick Wins: ansible.cfg
[defaults]
# Increase parallelism (default: 5)
forks = 50
# Enable pipelining — reduces SSH operations
pipelining = True
# Disable host key checking (internal networks)
host_key_checking = False
# Use the YAML callback for more readable output
stdout_callback = yaml
# Smart gathering: skip fact collection for hosts whose facts are already cached
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible-facts
fact_caching_timeout = 3600
# Timeouts for persistent connections (used by network connection plugins such as network_cli)
[persistent_connection]
connect_timeout = 30
command_timeout = 30
[ssh_connection]
# Reuse SSH connections
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o PreferredAuthentications=publickey
# Enable pipelining
pipelining = True
# Transfer method
transfer_method = piped
SSH Pipelining
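Pipelining is usually the single biggest performance improvement. It reduces SSH operations from 5+ per task to 1 by sending the module to the remote Python interpreter over the existing SSH session instead of copying it to a temporary file first: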
# ansible.cfg
[ssh_connection]
pipelining = True
Requirement: requiretty must be disabled in /etc/sudoers on target hosts:
# /etc/sudoers — remove or comment:
# Defaults requiretty
Impact: 2-5x speed improvement on most playbooks.
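If requiretty is currently enforced, the change itself can be rolled out with Ansible. The following is a minimal sketch of one possible approach, not a prescribed method. It assumes a hypothetical remote user named deploy, a hypothetical drop-in file name, a sudoers configuration that includes /etc/sudoers.d, and visudo installed at /usr/sbin/visudo. Pipelining is switched off for this one play so that it can still run on hosts where requiretty is active.

```yaml
# Sketch only: the user name, file name, and visudo path are assumptions.
- name: Allow pipelining for the automation user
  hosts: all
  become: true
  vars:
    ansible_ssh_pipelining: false  # this play must work while requiretty is still on
  tasks:
    - name: Exempt the deploy user from requiretty
      ansible.builtin.copy:
        dest: /etc/sudoers.d/90-ansible-pipelining
        content: |
          Defaults:deploy !requiretty
        owner: root
        group: root
        mode: "0440"
        validate: /usr/sbin/visudo -cf %s  # refuse to install a broken sudoers file
```

The validate step matters here: a syntax error in a sudoers file can lock out privilege escalation on the host, so the file is checked before it is moved into place.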
See also: Why Memory, Not CPU, Is the Critical Bottleneck in Ansible Automation
Fact Gathering Optimization
Disable When Not Needed
- name: Deploy configuration files
  hosts: webservers
  gather_facts: false  # Skip if you don't need ansible_* variables
  tasks:
    - name: Copy config
      ansible.builtin.copy:
        src: app.conf
        dest: /etc/myapp/app.conf
Gather Only What You Need
- name: Targeted fact gathering
  hosts: all
  gather_facts: true
  gather_subset:
    - '!all'
    - '!min'
    - network
    - hardware
  # Gathers only network and hardware facts, skipping everything else
Fact Caching
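# Pick one backend; either stanza goes in the [defaults] section of ansible.cfg.
# Option 1: Redis fact cache (fastest; needs a Redis server and the redis Python package on the controller)
[defaults]
fact_caching = redis
fact_caching_connection = localhost:6379:0
fact_caching_timeout = 86400
# Option 2: JSON file cache (no extra dependencies)
[defaults]
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible-facts
fact_caching_timeout = 3600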
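With a cache configured, fact collection can be separated from the plays that consume the facts. The sketch below illustrates the idea under some assumptions: the two file names in the comments are hypothetical, the plays would normally live in separate playbooks run at different times, and the debug task is only there to show a cached value being read.

```yaml
# Sketch only: file names, schedule, and the debug task are illustrative.

# warm_cache.yml (hypothetical): run on a schedule to refresh the cache
- name: Refresh the fact cache
  hosts: all
  gather_facts: false
  tasks:
    - name: Collect a limited set of facts into the configured cache
      ansible.builtin.setup:
        gather_subset:
          - '!all'
          - network

# deploy.yml (hypothetical): later runs read from the cache instead of gathering
- name: Use cached facts without gathering again
  hosts: webservers
  gather_facts: false
  tasks:
    - name: Show an address taken from the cache
      ansible.builtin.debug:
        msg: "{{ ansible_facts.get('default_ipv4', {}).get('address', 'not cached yet') }}"
```

Cached entries expire after fact_caching_timeout seconds, so the refresh schedule should be shorter than that value.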
Async and Parallel Tasks
Fire-and-Forget
- name: Start long-running tasks in parallel
  hosts: all
  tasks:
    - name: Run system update (async)
      ansible.builtin.apt:
        update_cache: true
        upgrade: dist
      async: 3600  # Max runtime: 1 hour
      poll: 0      # Don't wait: fire and forget
      register: update_job

    - name: Do other work while updates run
      ansible.builtin.copy:
        src: monitoring.conf
        dest: /etc/monitoring/

    - name: Wait for updates to finish
      ansible.builtin.async_status:
        jid: "{{ update_job.ansible_job_id }}"
      register: job_result
      until: job_result.finished
      retries: 60
      delay: 60
Parallel Loop Execution
- name: Download files in parallel
  ansible.builtin.get_url:
    url: "{{ item }}"
    dest: "/tmp/{{ item | basename }}"
  loop: "{{ download_urls }}"
  async: 300
  poll: 0
  register: download_jobs

- name: Wait for all downloads
  ansible.builtin.async_status:
    jid: "{{ item.ansible_job_id }}"
  loop: "{{ download_jobs.results }}"
  register: download_results
  until: download_results.finished
  retries: 30
  delay: 10
See also: Ansible async and poll: Run Long Tasks Without Timeout Complete Guide
Free Strategy
Execute tasks on each host independently (don't wait for slowest host):
- name: Fast rolling deployment
  hosts: webservers
  strategy: free  # Each host proceeds independently
  tasks:
    - name: Pull latest code
      ansible.builtin.git:
        repo: "{{ app_repo }}"
        dest: /opt/myapp
        version: "{{ app_version }}"

    - name: Restart service
      ansible.builtin.systemd:
        name: myapp
        state: restarted
Default strategy (linear): All hosts complete task 1 before any start task 2.
Free strategy: Each host runs through all tasks at its own pace.
Reduce Task Overhead
Use ansible.builtin.copy over ansible.builtin.template when no variables needed
# Slow: processes Jinja2 even with no variables
- ansible.builtin.template:
    src: static-config.conf
    dest: /etc/myapp/config.conf

# Fast: direct file copy
- ansible.builtin.copy:
    src: static-config.conf
    dest: /etc/myapp/config.conf
Batch Operations
# Slow: N tasks for N packages
- ansible.builtin.apt:
    name: "{{ item }}"
    state: present
  loop:
    - nginx
    - python3
    - redis-server

# Fast: single transaction
- ansible.builtin.apt:
    name:
      - nginx
      - python3
      - redis-server
    state: present
Avoid command/shell When Module Exists
# Slow and not idempotent
- ansible.builtin.command: useradd deploy

# Fast and idempotent
- ansible.builtin.user:
    name: deploy
    state: present
Serial Execution for Rolling Updates
- name: Rolling update with controlled parallelism
  hosts: webservers
  serial:
    - 1      # First: 1 canary host
    - "25%"  # Then: 25% at a time
    - "50%"  # Then: 50% at a time
  max_fail_percentage: 10
  tasks:
    - name: Deploy new version
      ansible.builtin.include_role:
        name: deploy_app

    - name: Health check
      ansible.builtin.uri:
        url: "http://{{ inventory_hostname }}:8080/health"
      register: health  # Required: the until condition below reads this variable
      retries: 5
      delay: 10
      until: health.status == 200
Mitogen for Ansible
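Third-party plugin that replaces Ansible's default module execution with a faster Python-based transport, using persistent remote interpreters that are still reached over SSH: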
# Install
pip install mitogen
# ansible.cfg
[defaults]
strategy_plugins = /path/to/mitogen/ansible_mitogen/plugins/strategy
strategy = mitogen_linear
Impact: 1.5-7x speed improvement depending on playbook complexity.
Caveat: Not compatible with all modules; test thoroughly before production use.
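One way to act on this caveat is to enable Mitogen for a single play first instead of globally. The sketch below assumes that strategy_plugins in ansible.cfg already points at the Mitogen plugin directory and that the global strategy setting is left at its default. The staging and production group names are hypothetical.

```yaml
# Sketch only: group names are hypothetical.
# Assumes strategy_plugins is configured and no global "strategy" override is set.
- name: Trial run with Mitogen on a small group
  hosts: staging
  strategy: mitogen_linear
  roles:
    - deploy_app

- name: Everything else keeps the built-in strategy
  hosts: production
  strategy: linear
  roles:
    - deploy_app
```

Comparing profile_tasks timings and results between the two plays gives a direct view of both the speedup and any module incompatibilities before a wider rollout.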
Profiling Playbook Performance
Enable Task Timing
# ansible.cfg
[defaults]
callbacks_enabled = ansible.posix.profile_tasks
Output shows per-task timing:
PLAY RECAP
Tuesday 08 April 2026 12:00:00 +0000 (0:00:02.123) 0:05:43.210 ****
=============================================================
Install packages --------------------------------- 180.23s
Configure nginx ---------------------------------- 45.12s
Deploy application ------------------------------- 23.45s
Restart services --------------------------------- 8.67s
Identify Bottlenecks
# Run with timing callback
ANSIBLE_CALLBACKS_ENABLED=ansible.posix.profile_tasks \
ansible-playbook site.yml
Architecture Patterns
Pull Mode with ansible-pull
# On each host — runs playbook from Git on schedule
ansible-pull -U https://github.com/myorg/ansible-configs.git \
  -d /opt/ansible \
  -i localhost, \
  site.yml \
  --sleep 60 # Random delay to avoid thundering herd
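The "on schedule" part is usually handled by cron or a systemd timer on each host. A minimal sketch of a one-time push play that installs such a cron entry is shown below. The 30-minute interval, the log path, and the cron entry name are assumptions, and the play presumes that Ansible is already installed on the managed hosts and that ansible-pull is on cron's PATH.

```yaml
# Sketch only: schedule, log path, and entry name are assumptions.
- name: Schedule ansible-pull on every host
  hosts: all
  become: true
  tasks:
    - name: Run ansible-pull every 30 minutes
      ansible.builtin.cron:
        name: ansible-pull
        minute: "*/30"
        user: root
        job: >-
          ansible-pull -U https://github.com/myorg/ansible-configs.git
          -d /opt/ansible -i localhost, --sleep 60 site.yml
          >> /var/log/ansible-pull.log 2>&1
```

Because every host fires at the same minute, the --sleep option is what spreads the load on the Git server.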
Split Large Inventories
# Instead of one massive run
ansible-playbook site.yml -i all_2000_hosts
# Split into batches: same inventory, one --limit group per run
ansible-playbook site.yml -i group_a --limit batch1
ansible-playbook site.yml -i group_a --limit batch2
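The --limit values above only work if groups named batch1 and batch2 exist in the inventory. A minimal sketch of a YAML inventory that defines them could look like the following. The host names are hypothetical, and the file would be saved under whatever name is passed to -i.

```yaml
# Sketch only: host names are hypothetical.
all:
  children:
    batch1:
      hosts:
        web001.example.com:
        web002.example.com:
    batch2:
      hosts:
        web003.example.com:
        web004.example.com:
```

Each batch can then run as a separate ansible-playbook invocation, on the same controller in sequence or on different controllers at the same time.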
Performance Comparison
| Optimization | Improvement | Complexity |
|-------------|-------------|------------|
| SSH pipelining | 2-5x | Low — one config line |
| Forks increase (5→50) | 5-10x | Low — one config line |
| Fact caching | 1.5-2x | Low — config + optional Redis |
| gather_facts: false | 1.2-1.5x | Low — per playbook |
| Package batching | 2-3x per task | Low — code change |
| Async tasks | 2-5x for I/O tasks | Medium — code change |
| Free strategy | 1.5-3x | Medium — behavior change |
| Mitogen | 1.5-7x | Medium — plugin install |
| ansible-pull | 10-100x | High — architecture change |
Best Practices
• Start with profiling: measure before optimizing
• Pipelining + forks first: biggest bang for least effort
• Fact caching for repeated runs: avoid re-gathering facts every time
• Batch package operations: single apt/yum call, not a loop
• Async for I/O-bound tasks: downloads, updates, long-running commands
• Free strategy for independent hosts: when tasks don't depend on other hosts' state
• Test Mitogen compatibility: great speedup, but verify with your modules
• Profile regularly: performance characteristics change as playbooks evolve
FAQ
What's the maximum practical forks setting?
Depends on the Ansible controller's resources. Rule of thumb: 50-100 for a dedicated server with 8+ cores and 16GB+ RAM. Monitor CPU and memory during runs.
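A high forks value applies to every task, including ones that hit a shared resource. Where that is a concern, the throttle keyword can cap concurrency for a single task while the rest of the play uses the full forks setting. The sketch below is illustrative only: the URL and the limit of 5 are made-up values.

```yaml
# Sketch only: the URL and the limit of 5 are illustrative values.
- name: Register hosts with a shared API
  hosts: all
  tasks:
    - name: Call a rate-limited endpoint from at most 5 hosts at a time
      ansible.builtin.uri:
        url: "https://cmdb.example.com/api/register/{{ inventory_hostname }}"
        method: POST
      throttle: 5
```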
Does Ansible Tower/AAP handle scaling automatically?
AAP distributes jobs across execution nodes (Automation Mesh). Each node runs its own forks. This scales horizontally — add more execution nodes for more capacity.
ansible-pull vs push — when to switch?
Consider ansible-pull when: 1,000+ hosts, highly repetitive configs, hosts have internet access to Git. Keep push for: orchestrated workflows, sequential operations, one-off tasks.
Conclusion
Ansible performance at scale requires deliberate optimization. Start with the easy wins — pipelining, forks, fact caching — then progressively adopt async execution, free strategy, and architectural patterns like ansible-pull as your environment grows.
Related Articles
• Ansible Automation Mesh
• Ansible Automation Platform 2.6
• Ansible Execution Environments