Ansible serial: Rolling Updates and Batch Deployment Guide

By Luca Berton · Published 2024-01-01 · Category: installation

A complete guide to the Ansible serial keyword. Implement rolling updates, batch deployments, canary releases, and zero-downtime strategies with max_fail_percentage.

The serial keyword controls how many hosts Ansible updates at once. Without it, Ansible treats every host in the play as a single batch and runs each task across all of them (as fast as forks allows) before moving on to the next task. That is fine for configuration but dangerous for deployments. With serial, you get rolling updates, canary releases, and zero-downtime deployments.

Default Behavior (No serial)

# No serial: all 100 webservers form a single batch
- hosts: webservers
  tasks:
    - name: Restart nginx
      ansible.builtin.systemd:
        name: nginx
        state: restarted
# If something goes wrong, ALL servers are down at once

Basic serial Usage

Fixed Batch Size

# Update 5 hosts at a time
- hosts: webservers
  serial: 5
  tasks:
    - name: Deploy application
      ansible.builtin.include_role:
        name: deploy

Ansible processes hosts in batches of 5, running the whole play on one batch before starting the next. With 20 hosts, that means 4 sequential batches.
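
To see how a given inventory is split, a minimal sketch like the following prints the members of each batch without changing anything. It relies on the ansible_play_batch magic variable and on run_once, which fires once per batch when combined with serial. The webservers group is carried over from the example above, and gather_facts is switched off only to keep the run quick.

```yaml
# Sketch: list the hosts in each batch (read-only, makes no changes)
- hosts: webservers
  gather_facts: false
  serial: 5
  tasks:
    - name: Show the hosts in the current batch
      ansible.builtin.debug:
        msg: "Current batch: {{ ansible_play_batch | join(', ') }}"
      run_once: true    # With serial, run_once executes once per batch
```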

Percentage

# Update 25% of hosts at a time
- hosts: webservers
  serial: "25%"
  tasks:
    - name: Deploy application
      ansible.builtin.include_role:
        name: deploy

One at a Time

# Safest: one host at a time
- hosts: webservers
  serial: 1
  tasks:
    - name: Restart critical service
      ansible.builtin.systemd:
        name: myapp
        state: restarted

Increasing Batch Sizes (Canary Pattern)

Start small to catch problems early, then increase batch size:

# Canary deployment: 1 → 5 → remaining
- hosts: webservers
  serial:
    - 1          # First: test on 1 host (canary)
    - 5          # Then: 5 hosts at a time
    - "100%"     # Finally: all remaining hosts
  tasks:
    - name: Deploy new version
      ansible.builtin.include_role:
        name: deploy

    - name: Health check
      ansible.builtin.uri:
        url: "http://localhost:{{ app_port }}/health"
      retries: 5
      delay: 3
      register: health
      until: health.status == 200

With 50 hosts: batch 1 = 1 host, batch 2 = 5 hosts, batch 3 = 44 remaining hosts.
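
The final "100%" entry is what sweeps up the remainder in one go. As a variation, and assuming the documented behavior that the last entry in the list is reused until no hosts are left, dropping that entry keeps the later batches small. This is a sketch rather than a recommendation for any particular fleet size:

```yaml
# Canary first, then steady batches of 5
- hosts: webservers
  serial:
    - 1     # First: a single canary host
    - 5     # Then: batches of 5, repeated until no hosts remain
  tasks:
    - name: Deploy new version
      ansible.builtin.include_role:
        name: deploy
```

With 50 hosts this should give 1 host, then nine batches of 5, then a last batch of 4.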

Percentage Ramp-Up

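# With 100 hosts: 5, then 25, then the remaining 70
- hosts: webservers
  serial:
    - "5%"      # Canary: ~5% of fleet
    - "25%"     # Quarter of fleet
    - "100%"    # Everything else
  tasks:
    - name: Deploy application
      ansible.builtin.include_role:
        name: deploy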

max_fail_percentage

Stop the entire deployment if too many hosts fail:

- hosts: webservers
  serial: 5
  max_fail_percentage: 10    # Abort if more than 10% of the hosts in a batch fail
  tasks:
    - name: Deploy and verify
      ansible.builtin.include_role:
        name: deploy
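
When serial is set, the threshold is checked per batch, and according to the Ansible documentation the percentage has to be exceeded, not just reached. A sketch of how that arithmetic plays out, with the batch size and threshold chosen purely for illustration:

```yaml
# Illustrative numbers: 10 hosts per batch, 20% tolerance
- hosts: webservers
  serial: 10
  max_fail_percentage: 20
  tasks:
    - name: Deploy and verify
      ansible.builtin.include_role:
        name: deploy
# 2 failures in a batch = 20% -> threshold not exceeded, the rollout continues
# 3 failures in a batch = 30% -> threshold exceeded, the play is aborted
```

The same arithmetic means that with serial: 5 a single failed host is already 20% of the batch, so any max_fail_percentage below 20 behaves like 0.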

Zero Tolerance

# Stop immediately on ANY failure
- hosts: webservers
  serial: 5
  max_fail_percentage: 0
  tasks:
    - name: Critical deployment
      ansible.builtin.include_role:
        name: deploy

Real-World Patterns

Zero-Downtime Web Deployment

---
- name: Rolling deployment with LB management
  hosts: webservers
  serial: 1
  max_fail_percentage: 0

  pre_tasks:
    - name: Remove from load balancer
      community.general.haproxy:
        state: disabled
        host: "{{ inventory_hostname }}"
        backend: web_backend
      delegate_to: "{{ haproxy_host }}"

    - name: Wait for connections to drain
      ansible.builtin.wait_for:
        port: "{{ app_port }}"
        state: drained
        timeout: 60

  tasks:
    - name: Pull latest code
      ansible.builtin.git:
        repo: "{{ app_repo }}"
        dest: "{{ app_dir }}"
        version: "{{ deploy_version }}"

    - name: Install dependencies
      ansible.builtin.command: npm install --production
      args:
        chdir: "{{ app_dir }}"

    - name: Restart application
      ansible.builtin.systemd:
        name: myapp
        state: restarted

  post_tasks:
    - name: Wait for health check
      ansible.builtin.uri:
        url: "http://localhost:{{ app_port }}/health"
        status_code: 200
      retries: 10
      delay: 5
      register: health
      until: health.status == 200

    - name: Re-enable in load balancer
      community.general.haproxy:
        state: enabled
        host: "{{ inventory_hostname }}"
        backend: web_backend
      delegate_to: "{{ haproxy_host }}"

    - name: Wait for traffic to flow
      ansible.builtin.pause:
        seconds: 10

Database Migration with Rolling App Restart

---
# Play 1: Migrate database (once)
- name: Run database migration
  hosts: webservers[0]    # Only first host
  tasks:
    - name: Run migration
      ansible.builtin.command: "{{ app_dir }}/manage.py migrate"
      register: migrate
      changed_when: "'Applying' in migrate.stdout"

# Play 2: Rolling restart of app servers
- name: Rolling restart
  hosts: webservers
  serial: 2
  max_fail_percentage: 0
  tasks:
    - name: Restart application
      ansible.builtin.systemd:
        name: myapp
        state: restarted

    - name: Wait for ready
      ansible.builtin.wait_for:
        port: "{{ app_port }}"
        delay: 5
        timeout: 30

Canary with Automatic Rollback

- name: Canary deployment
  hosts: webservers
  serial:
    - 1
    - "100%"
  max_fail_percentage: 0

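  tasks:
    # rescue is only valid on a block, not at play level
    - name: Deploy and verify, rolling back on failure
      block:
        - name: Deploy new version
          ansible.builtin.include_role:
            name: deploy

        - name: Run smoke tests
          ansible.builtin.uri:
            url: "http://localhost:{{ app_port }}{{ item }}"
            status_code: 200
          loop:
            - /health
            - /api/status
            - /

      rescue:
        - name: Rollback on failure
          ansible.builtin.include_role:
            name: deploy
          vars:
            deploy_version: "{{ previous_version }}"

        - name: Notify team of rollback
          ansible.builtin.uri:
            url: "{{ slack_webhook }}"
            method: POST
            # A dict body is serialized to JSON, so quotes in the message cannot break the payload
            body:
              text: "⚠️ Rollback on {{ inventory_hostname }}: {{ ansible_failed_result.msg | default('unknown error') }}"
            body_format: json
          delegate_to: localhost

        # A successful rescue clears the failure, so fail explicitly to stop the rollout
        - name: Stop the rollout after rollback
          ansible.builtin.fail:
            msg: "Deployment failed on {{ inventory_hostname }}; rolled back to {{ previous_version }}"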

serial vs forks

Setting	What It Controls
`serial`	Batch size — how many hosts per batch (sequential batches)
`forks`	Parallelism — how many hosts run simultaneously within a batch

# ansible.cfg
[defaults]
forks = 20    # Up to 20 hosts in parallel

# With serial: 10 and forks: 20
# → 10 hosts per batch, all 10 run in parallel (forks > serial)
# → Next batch of 10 only starts after first batch completes

# With serial: 50 and forks: 20
# → 50 hosts per batch, but only 20 run in parallel at a time

FAQ

What does serial do in Ansible?

The serial keyword limits how many hosts are processed in each batch. Without it, Ansible treats all hosts in the play as a single batch. With serial: 5, Ansible processes hosts in groups of 5, running the whole play on each batch before starting the next.

How do I do a rolling update with Ansible?

Set serial: 1 (or a small number) on your play to update hosts one at a time. Combine with load balancer management (drain → update → health check → re-enable) for zero-downtime deployments.

What is the difference between serial and forks?

serial controls batch size (how many hosts per round). forks controls parallelism within each batch (how many run simultaneously). serial affects execution order; forks affects speed.

How do I implement a canary deployment?

Use a list for serial: serial: [1, 5, "100%"]. This deploys to 1 host first (canary), then 5, then all remaining. Combined with max_fail_percentage: 0, any failure stops the rollout before affecting more hosts.

What happens when a host fails with serial?

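Without max_fail_percentage, a failed host is dropped from the play and the remaining hosts carry on; the play only stops if every host in a batch fails. When the share of failed hosts in a batch exceeds max_fail_percentage, Ansible aborts the play: the hosts still running in that batch stop after the current task, and no further batches start.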

Conclusion

serial: 1 — Safest, one at a time (use for critical services)
serial: N — Fixed batch size (balance speed vs safety)
serial: [1, 5, "100%"] — Canary pattern (catch problems early)
max_fail_percentage: 0 — Stop on any failure
Always pair with health checks in post_tasks
