Ansible Performance Optimization: Speed Up Playbooks for Large-Scale Environments

By Luca Berton · Published 2024-01-01 · Category: installation

Optimize Ansible playbook performance for enterprise scale. Reduce execution time with pipelining, async tasks, fact caching, Mitogen, and parallel execution.

Introduction

A playbook that takes 5 minutes on 10 hosts can take hours across 1,000. Enterprise Ansible deployments require performance optimization to keep execution times practical. This guide covers every major optimization — from ansible.cfg tuning to architectural patterns that dramatically reduce playbook runtime.

Quick Wins: ansible.cfg

[defaults]
# Increase parallelism (default: 5)
forks = 50

# Enable pipelining — reduces SSH operations
pipelining = True

# Disable host key checking (internal networks)
host_key_checking = False

# Use the YAML callback for more readable output
stdout_callback = yaml

# Gather facts only for hosts that have no cached facts yet
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible-facts
fact_caching_timeout = 3600

# Timeouts for persistent connections (used by network connection plugins such as network_cli)
[persistent_connection]
connect_timeout = 30
command_timeout = 30

[ssh_connection]
# Reuse SSH connections
ssh_args = -o ControlMaster=auto -o ControlPersist=60s -o PreferredAuthentications=publickey
# Enable pipelining
pipelining = True
# Transfer method
transfer_method = piped

SSH Pipelining

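One of the biggest performance improvements for the least effort. Instead of copying each module to the target and then executing it, Ansible pipes the module over the existing SSH session, which reduces SSH operations from 5+ per task to 1: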

# ansible.cfg
[ssh_connection]
pipelining = True

Requirement: requiretty must be disabled in /etc/sudoers on target hosts:

# /etc/sudoers — remove or comment:
# Defaults    requiretty

Impact: 2-5x speed improvement on most playbooks.
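
If the targets are already managed with Ansible, the sudoers change can itself be a task. The play below is a minimal sketch under stated assumptions, not a definitive implementation: the `deploy` account and the drop-in file name are hypothetical, `visudo` is assumed to live at `/usr/sbin/visudo`, and the main sudoers file is assumed to include `/etc/sudoers.d`. Pipelining is switched off for this one play so that it can still run on hosts where requiretty is active.

```yaml
- name: Prepare hosts for SSH pipelining
  hosts: all
  become: true
  vars:
    ansible_ssh_pipelining: false   # requiretty may still be active during this run
  tasks:
    - name: Exempt the automation account from requiretty
      ansible.builtin.copy:
        content: "Defaults:deploy !requiretty\n"    # 'deploy' is a hypothetical user
        dest: /etc/sudoers.d/90-ansible-pipelining  # hypothetical file name
        owner: root
        group: root
        mode: "0440"
        validate: /usr/sbin/visudo -cf %s           # reject the file if its syntax is invalid
```

Scoping the exemption to one account leaves requiretty in force for everyone else, which may be easier to get past a security review than removing the global default.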

Fact Gathering Optimization

Disable When Not Needed

- name: Deploy configuration files
  hosts: webservers
  gather_facts: false  # Skip if you don't need ansible_* variables
  tasks:
    - name: Copy config
      ansible.builtin.copy:
        src: app.conf
        dest: /etc/myapp/app.conf

Gather Only What You Need

- name: Targeted fact gathering
  hosts: all
  gather_facts: true
  gather_subset:
    - '!all'
    - '!min'
    - network
    - hardware
  # Gathers only network and hardware facts, skipping everything else
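
Another option is to skip play-level gathering entirely and call the setup module only at the point where a fact is needed. The following is a sketch, assuming a reasonably recent ansible-core where `filter` accepts a list; the debug task is only there to show the fact being used, and the `unknown` fallback covers hosts without a default route.

```yaml
- name: Collect facts only where a task needs them
  hosts: all
  gather_facts: false
  tasks:
    - name: Gather just the default IPv4 facts
      ansible.builtin.setup:
        gather_subset:
          - '!all'
          - '!min'
          - network
        filter:
          - ansible_default_ipv4

    - name: Use the gathered fact
      ansible.builtin.debug:
        msg: "Primary address: {{ ansible_facts.get('default_ipv4', {}).get('address', 'unknown') }}"
```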

Fact Caching

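# Use one backend or the other; ansible.cfg takes a single fact_caching value.

# Option 1: Redis fact cache (fastest; requires the redis Python package on the control node)
[defaults]
fact_caching = redis
fact_caching_connection = localhost:6379:0
fact_caching_timeout = 86400  # seconds (24 hours)

# Option 2: JSON file cache (no extra dependencies)
[defaults]
fact_caching = jsonfile
fact_caching_connection = /tmp/ansible-facts
fact_caching_timeout = 3600  # seconds (1 hour)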
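
With a cache configured, one way to benefit from it is to separate fact collection from the actual work. The two plays below are a sketch and are meant to live in separate playbooks run at different times; the file names in the comments are hypothetical. The second play never runs the setup module and relies on whatever the cache still holds.

```yaml
# refresh-facts.yml (hypothetical): run on a schedule, shorter than fact_caching_timeout
- name: Refresh the fact cache
  hosts: all
  gather_facts: true   # populates the configured cache as a side effect

# deploy.yml (hypothetical): run as often as needed
- name: Deploy using cached facts
  hosts: webservers
  gather_facts: false  # no setup run; facts come from the cache if present
  tasks:
    - name: Report the cached distribution
      ansible.builtin.debug:
        msg: "{{ inventory_hostname }} runs {{ ansible_facts['distribution'] }}"
      when: ansible_facts['distribution'] is defined   # skipped if the cache entry expired
```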

Async and Parallel Tasks

Fire-and-Forget

- name: Start long-running tasks in parallel
  hosts: all
  tasks:
    - name: Run system update (async)
      ansible.builtin.apt:
        update_cache: true
        upgrade: dist
      async: 3600    # Max runtime: 1 hour
      poll: 0        # Don't wait — fire and forget
      register: update_job

    - name: Do other work while updates run
      ansible.builtin.copy:
        src: monitoring.conf
        dest: /etc/monitoring/

    - name: Wait for updates to finish
      ansible.builtin.async_status:
        jid: "{{ update_job.ansible_job_id }}"
      register: job_result
      until: job_result.finished
      retries: 60
      delay: 60
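
For comparison, the same `async` keyword with a non-zero `poll` does not run tasks in parallel; the play waits on the task but checks in periodically instead of holding one SSH session open, which helps when long tasks hit connection timeouts. A minimal sketch, with a hypothetical command path:

```yaml
- name: Run a long task with periodic polling
  hosts: all
  tasks:
    - name: Rebuild search index
      ansible.builtin.command: /opt/myapp/bin/rebuild-index   # hypothetical command
      async: 1800   # allow up to 30 minutes
      poll: 30      # check every 30 seconds; the play blocks here until the job ends
```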

Parallel Loop Execution

    - name: Download files in parallel
      ansible.builtin.get_url:
        url: "{{ item }}"
        dest: "/tmp/{{ item | basename }}"
      loop: "{{ download_urls }}"
      async: 300
      poll: 0
      register: download_jobs

    - name: Wait for all downloads
      ansible.builtin.async_status:
        jid: "{{ item.ansible_job_id }}"
      loop: "{{ download_jobs.results }}"
      register: download_results
      until: download_results.finished
      retries: 30
      delay: 10
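
Async jobs leave a small status file on each target. If those should not accumulate, a follow-up task can remove them once the results have been collected. This is a sketch that assumes it sits in the same task list, after the wait task, and reuses the `download_jobs` variable registered above:

```yaml
- name: Clean up async job status files
  ansible.builtin.async_status:
    jid: "{{ item.ansible_job_id }}"
    mode: cleanup
  loop: "{{ download_jobs.results }}"
```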

Free Strategy

Execute tasks on each host independently (don't wait for slowest host):

- name: Fast rolling deployment
  hosts: webservers
  strategy: free  # Each host proceeds independently
  tasks:
    - name: Pull latest code
      ansible.builtin.git:
        repo: "{{ app_repo }}"
        dest: /opt/myapp
        version: "{{ app_version }}"

    - name: Restart service
      ansible.builtin.systemd:
        name: myapp
        state: restarted

Default strategy (linear): all hosts complete task 1 before any host starts task 2.

Free strategy: each host runs through all tasks at its own pace.
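
Because hosts under the free strategy reach the restart at unpredictable times, it can be worth capping how many run a disruptive task at once. The sketch below adds the `throttle` keyword to the restart task; the limit of 5 is a hypothetical value, and `app_repo` and `app_version` are the same variables assumed by the play above.

```yaml
- name: Free strategy with a capped restart
  hosts: webservers
  strategy: free
  tasks:
    - name: Pull latest code
      ansible.builtin.git:
        repo: "{{ app_repo }}"
        dest: /opt/myapp
        version: "{{ app_version }}"

    - name: Restart service on a limited number of hosts at a time
      ansible.builtin.systemd:
        name: myapp
        state: restarted
      throttle: 5   # hypothetical limit; tune to your spare capacity
```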

Reduce Task Overhead

Use `ansible.builtin.copy` over `ansible.builtin.template` when no variables are needed

# Slow — processes Jinja2 even with no variables
- ansible.builtin.template:
    src: static-config.conf
    dest: /etc/myapp/config.conf

# Fast — direct file copy
- ansible.builtin.copy:
    src: static-config.conf
    dest: /etc/myapp/config.conf

Batch Operations

# Slow — N tasks for N packages
- ansible.builtin.apt:
    name: "{{ item }}"
    state: present
  loop:
    - nginx
    - python3
    - redis-server

# Fast — single transaction
- ansible.builtin.apt:
    name:
      - nginx
      - python3
      - redis-server
    state: present
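
The same single-transaction form works when the package list comes from a variable, which keeps the list in group or host vars instead of in the task. A sketch, where `base_packages` is a hypothetical variable name:

```yaml
- name: Install base packages in one transaction
  hosts: webservers
  become: true
  vars:
    base_packages:   # hypothetical variable; could equally live in group_vars
      - nginx
      - python3
      - redis-server
  tasks:
    - name: Install all packages with a single apt call
      ansible.builtin.apt:
        name: "{{ base_packages }}"
        state: present
```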

Avoid `command`/`shell` When a Module Exists

# Slow and not idempotent
- ansible.builtin.command: useradd deploy

# Fast and idempotent
- ansible.builtin.user:
    name: deploy
    state: present
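
When no module fits and a command is unavoidable, the `creates` argument at least lets Ansible skip the command on later runs. A minimal sketch; the script and marker file paths are hypothetical, and it assumes the script itself writes the marker file on success:

```yaml
- name: Initialise the application database once
  ansible.builtin.command:
    cmd: /opt/myapp/bin/init-db              # hypothetical script
    creates: /var/lib/myapp/.initialised     # skipped when this file already exists
```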

Serial Execution for Rolling Updates

- name: Rolling update with controlled parallelism
  hosts: webservers
  serial:
    - 1        # First: 1 canary host
    - "25%"    # Then: 25% at a time
    - "50%"    # Then: 50% at a time
  max_fail_percentage: 10
  tasks:
    - name: Deploy new version
      ansible.builtin.include_role:
        name: deploy_app

    - name: Health check
      ansible.builtin.uri:
        url: "http://{{ inventory_hostname }}:8080/health"
      register: health   # required so the until condition has a result to inspect
      retries: 5
      delay: 10
      until: health.status == 200

Mitogen for Ansible

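Third-party plugin that keeps a persistent Python interpreter running on each target and sends modules to it over a single SSH connection, instead of starting a new remote process for every task: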

# Install
pip install mitogen

# ansible.cfg
[defaults]
strategy_plugins = /path/to/mitogen/ansible_mitogen/plugins/strategy
strategy = mitogen_linear

Impact: 1.5-7x speed improvement depending on playbook complexity.

Caveat: Not compatible with all modules; test thoroughly before production use.
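
One cautious way to test compatibility is to leave the global strategy alone and opt in one play at a time. The sketch below assumes `strategy_plugins` already points at the Mitogen plugin directory as shown above and that the installed Mitogen release supports your ansible-core version; the `staging` group is hypothetical.

```yaml
- name: Trial Mitogen on a single play
  hosts: staging              # hypothetical group
  strategy: mitogen_linear    # only this play uses Mitogen
  tasks:
    - name: Confirm basic module execution works
      ansible.builtin.ping:
```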

Profiling Playbook Performance

Enable Task Timing

# ansible.cfg
[defaults]
callbacks_enabled = ansible.posix.profile_tasks

Output shows per-task timing:

PLAY RECAP
Wednesday 08 April 2026  12:00:00 +0000 (0:00:02.123)  0:05:43.210 ****
=============================================================
Install packages --------------------------------- 180.23s
Configure nginx ---------------------------------- 45.12s
Deploy application ------------------------------- 23.45s
Restart services --------------------------------- 8.67s

Identify Bottlenecks

# Run with timing callback
ANSIBLE_CALLBACKS_ENABLED=ansible.posix.profile_tasks \
  ansible-playbook site.yml

Architecture Patterns

Pull Mode with ansible-pull

# On each host — runs playbook from Git on schedule
ansible-pull -U https://github.com/myorg/ansible-configs.git \
  -d /opt/ansible \
  -i localhost, \
  site.yml \
  --sleep 60  # Random delay of up to 60 seconds to avoid a thundering herd
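
The schedule itself can be rolled out with a one-time push run. The play below is a sketch that installs a cron entry wrapping the ansible-pull command above; the 30-minute interval and the log file path are assumptions, not recommendations from the original setup.

```yaml
- name: Schedule ansible-pull on every host
  hosts: all
  become: true
  tasks:
    - name: Run ansible-pull every 30 minutes
      ansible.builtin.cron:
        name: ansible-pull
        minute: "*/30"        # assumed interval
        job: >-
          ansible-pull -U https://github.com/myorg/ansible-configs.git
          -d /opt/ansible -i localhost, --sleep 60 site.yml
          >> /var/log/ansible-pull.log 2>&1
```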

Split Large Inventories

# Instead of one massive run
ansible-playbook site.yml -i all_2000_hosts

# Split into smaller batches with --limit and run each separately
ansible-playbook site.yml -i group_a --limit batch1
ansible-playbook site.yml -i group_a --limit batch2
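
For `--limit batch1` to work, the batches have to exist as groups in the inventory. A sketch of a YAML inventory that defines them with host ranges; the host names and batch sizes are hypothetical:

```yaml
all:
  children:
    batch1:
      hosts:
        web[001:500].example.com:    # hypothetical host range
    batch2:
      hosts:
        web[501:1000].example.com:   # hypothetical host range
```

Each batch can then be run from its own shell or CI job, so a slow or failing batch does not hold up the others.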

Performance Comparison

Optimization	Improvement	Complexity
SSH pipelining	2-5x	Low — one config line
Forks increase (5→50)	5-10x	Low — one config line
Fact caching	1.5-2x	Low — config + optional Redis
`gather_facts: false`	1.2-1.5x	Low — per playbook
Package batching	2-3x per task	Low — code change
Async tasks	2-5x for I/O tasks	Medium — code change
Free strategy	1.5-3x	Medium — behavior change
Mitogen	1.5-7x	Medium — plugin install
ansible-pull	10-100x	High — architecture change

Best Practices

Start with profiling — Measure before optimizing
Pipelining + forks first — Biggest bang for least effort
Fact caching for repeated runs — Avoid re-gathering facts every time
Batch package operations — Single apt/yum call, not loop
Async for I/O-bound tasks — Downloads, updates, long-running commands
Free strategy for independent hosts — When tasks don't depend on other hosts' state
Test Mitogen compatibility — Great speedup but verify with your modules
Profile regularly — Performance characteristics change as playbooks evolve

FAQ

What's the maximum practical forks setting?

Depends on the Ansible controller's resources. Rule of thumb: 50-100 for a dedicated server with 8+ cores and 16GB+ RAM. Monitor CPU and memory during runs.

Does Ansible Tower/AAP handle scaling automatically?

AAP distributes jobs across execution nodes (Automation Mesh). Each node runs its own forks. This scales horizontally — add more execution nodes for more capacity.

ansible-pull vs push — when to switch?

Consider ansible-pull when: 1,000+ hosts, highly repetitive configs, hosts have internet access to Git. Keep push for: orchestrated workflows, sequential operations, one-off tasks.

Conclusion

Ansible performance at scale requires deliberate optimization. Start with the easy wins — pipelining, forks, fact caching — then progressively adopt async execution, free strategy, and architectural patterns like ansible-pull as your environment grows.
