Ansible retries & until: Retry Failed Tasks Automatically (Guide)
By Luca Berton · Published 2024-01-01 · Category: installation
Complete guide to Ansible retries and until loops. Retry failed tasks automatically, wait for services, implement polling patterns, and handle transient errors.
Ansible's retries and until directives let you automatically retry tasks that fail due to transient errors — network timeouts, services still starting, APIs returning temporary errors. This is essential for reliable automation in real-world environments.
Basic Syntax
- name: Wait for service to be ready
  ansible.builtin.uri:
    url: http://localhost:8080/health
    status_code: 200
  register: result
  until: result.status == 200
  retries: 10
  delay: 5
• until — Condition that must be true for the task to succeed
• retries — Maximum number of attempts (default: 3)
• delay — Seconds between retries (default: 5)
The task runs once and is then retried up to retries more times, waiting delay seconds between attempts, until the until condition is true. With retries: 10, that allows at most 11 attempts.
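As a rough sketch of the arithmetic, the retry budget can be derived from a target wait time instead of being hard-coded. The variable names wait_timeout and poll_interval are hypothetical and not part of the original example; the URL is the one used above.

```yaml
- name: Wait for service with a retry budget derived from a timeout
  vars:
    wait_timeout: 120   # hypothetical: total seconds we are willing to wait
    poll_interval: 5    # hypothetical: seconds between attempts
  ansible.builtin.uri:
    url: http://localhost:8080/health
    status_code: 200
  register: result
  until: result.status == 200
  # 120 / 5 = 24 retries, so at most 25 attempts including the first one
  retries: "{{ (wait_timeout / poll_interval) | int }}"
  delay: "{{ poll_interval }}"
```

The time spent in each request adds to the delays, so the real upper bound is somewhat longer than wait_timeout.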
Wait for a Service to Start
- name: Start application
  ansible.builtin.service:
    name: myapp
    state: started

- name: Wait for application to respond
  ansible.builtin.uri:
    url: http://localhost:{{ app_port }}/health
    status_code: 200
  register: health_check
  until: health_check.status == 200
  retries: 30
  delay: 10
  # Total wait: up to 5 minutes (30 * 10s)
Wait for a Port to Open
- name: Wait for PostgreSQL to accept connections
  ansible.builtin.wait_for:
    host: "{{ db_host }}"
    port: 5432
    state: started
    timeout: 300
  register: port_check
  until: port_check is success
  retries: 5
  delay: 10
Retry API Calls
- name: Create resource via API (retry on 503)
  ansible.builtin.uri:
    url: "https://api.example.com/resources"
    method: POST
    body:
      name: "{{ resource_name }}"
    body_format: json
    headers:
      Authorization: "Bearer {{ api_token }}"
    status_code: [200, 201]
  register: api_result
  until: api_result.status in [200, 201]
  retries: 5
  delay: 15
Retry Commands Based on Output
- name: Wait for cluster to be healthy
  ansible.builtin.command: kubectl get nodes
  register: kubectl_result
  changed_when: false
  until: "'NotReady' not in kubectl_result.stdout"
  retries: 20
  delay: 15

- name: Wait for database migration to complete
  ansible.builtin.command: /opt/myapp/check_migration_status.sh
  register: migration
  changed_when: false
  until: "'COMPLETE' in migration.stdout"
  retries: 30
  delay: 10
Retry Package Installation
- name: Install package (retry on network errors)
  ansible.builtin.package:
    name: nginx
    state: present
  register: pkg_result
  until: pkg_result is success
  retries: 3
  delay: 30
Retry SSH Connections
- name: Wait for host to come back after reboot
  ansible.builtin.wait_for_connection:
    timeout: 300
    delay: 10    # module option: seconds to wait before the first check
  register: connection
  until: connection is success
  retries: 3
  delay: 60      # task keyword: seconds between retries
Complex Until Conditions
Multiple Conditions (AND)
- name: Wait for app to be fully ready
  ansible.builtin.uri:
    url: http://localhost:8080/status
    return_content: true
  register: status
  until:
    - status.status == 200
    - "'ready' in status.content"
    - "'error' not in status.content"
  retries: 20
  delay: 5
OR Conditions
- name: Wait for one of multiple success states
  ansible.builtin.command: check_status.sh
  register: result
  changed_when: false
  until: "'RUNNING' in result.stdout or 'COMPLETED' in result.stdout"
  retries: 15
  delay: 10
Retry with Registered Variable Checks
- name: Check replication status
  community.postgresql.postgresql_query:
    db: myapp
    query: "SELECT state FROM pg_stat_replication WHERE client_addr = '{{ replica_ip }}'"
  register: repl_status
  until:
    - repl_status.rowcount > 0
    - repl_status.query_result[0].state == 'streaming'
  retries: 12
  delay: 10
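The conditions above assume the registered fields exist on every attempt. When a response may lack the field being tested, an until expression that references it can fail instead of retrying. A minimal sketch of a guarded condition follows. It assumes a hypothetical JSON field named state and that a 503 response means the service is still starting; neither comes from the original examples.

```yaml
- name: Poll status endpoint, tolerating responses without a JSON body
  ansible.builtin.uri:
    url: http://localhost:8080/status
    status_code: [200, 503]   # assumption: 503 means "still starting"
  register: status
  until:
    - status.status == 200
    # default({}) keeps the expression valid when no JSON was returned;
    # "state" is a hypothetical field name used for illustration
    - (status.json | default({})).state | default('') == 'ready'
  retries: 20
  delay: 5
```

The default filter turns a missing key into an empty value, so the condition evaluates to false and the task retries.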
Default Values
When you omit directives:
# These are the defaults:
retries: 3  # Retry up to 3 times
delay: 5    # Wait 5 seconds between retries
If you specify until without retries, it defaults to 3 retries. Without delay, it defaults to 5 seconds.
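As an illustration, a task that relies entirely on the defaults might look like the sketch below. The lock file path is hypothetical.

```yaml
- name: Wait for lock file to disappear (default retries and delay)
  ansible.builtin.stat:
    path: /var/run/myapp.lock   # hypothetical path
  register: lock
  until: not lock.stat.exists
  # no retries or delay given: Ansible falls back to retries: 3 and delay: 5
```

Relying on the defaults keeps short checks compact, but stating retries and delay explicitly makes the maximum wait easier to see at a glance.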
Retry Information in Output
Ansible shows retry progress in the output:
TASK [Wait for service] *************************
FAILED - RETRYING: Wait for service (10 retries left).
FAILED - RETRYING: Wait for service (9 retries left).
ok: [webserver]
Access Retry Information
- name: Task with retries
  ansible.builtin.uri:
    url: http://localhost:8080/health
  register: result
  until: result.status == 200
  retries: 10
  delay: 5

- name: Show retry info
  ansible.builtin.debug:
    msg: "Took {{ result.attempts }} attempts"
The result.attempts variable contains the number of attempts made.
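The attempt count can also drive later tasks. The sketch below prints a warning when the health check needed more than two attempts. The threshold is an arbitrary assumption, and result is the variable registered in the task above.

```yaml
- name: Warn when the health check needed several attempts
  ansible.builtin.debug:
    msg: "Health check passed only after {{ result.attempts }} attempts"
  # threshold of 2 is an arbitrary example value
  when: result.attempts | default(1) | int > 2
```

A recurring warning like this can be an early hint that a service is starting more slowly than expected.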
Real-World Patterns
Wait After Reboot
- name: Reboot the server
  ansible.builtin.reboot:
    reboot_timeout: 600
    msg: "Rebooting for kernel update"

# Alternative manual approach:
- name: Reboot
  ansible.builtin.command: shutdown -r now
  async: 1
  poll: 0

- name: Wait for server to come back
  ansible.builtin.wait_for_connection:
    delay: 30
    timeout: 300

- name: Verify services after reboot
  ansible.builtin.service_facts:
  register: services
  until: "'nginx.service' in services.ansible_facts.services"
  retries: 10
  delay: 10
Wait for Cloud Instance
- name: Create EC2 instance
  amazon.aws.ec2_instance:
    name: webserver
    instance_type: t3.micro
    image_id: "{{ ami_id }}"
    wait: true
  register: ec2

- name: Wait for SSH on new instance
  ansible.builtin.wait_for:
    host: "{{ ec2.instances[0].public_ip_address }}"
    port: 22
    delay: 10
    timeout: 300

- name: Wait for cloud-init to finish
  ansible.builtin.command: cloud-init status --wait
  delegate_to: "{{ ec2.instances[0].public_ip_address }}"
  register: cloud_init
  changed_when: false
  until: cloud_init.rc == 0
  retries: 30
  delay: 10
Poll for Job Completion
- name: Start backup job
  ansible.builtin.uri:
    url: "https://backup-api.example.com/jobs"
    method: POST
    body: '{"type": "full"}'
    body_format: json
  register: job

- name: Wait for backup to complete
  ansible.builtin.uri:
    url: "https://backup-api.example.com/jobs/{{ job.json.id }}"
    method: GET
  register: job_status
  until: job_status.json.state in ['completed', 'failed']
  retries: 60
  delay: 30
  failed_when: job_status.json.state == 'failed'
retries vs async/poll
| Feature | retries/until | async/poll |
|---------|----------------|-------------|
| Purpose | Retry on failure | Background execution |
| Blocks play | Yes (synchronous) | No (with poll: 0) |
| Condition | Custom until expression | Completion only |
| Delay type | Fixed between retries | Fixed polling interval |
| Use when | Transient failures, readiness | Long-running tasks |
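The two mechanisms are often combined: a task starts in the background with async and poll: 0, and a later task polls it with ansible.builtin.async_status under until and retries. The following is a minimal sketch of that pattern. The script path is hypothetical, and the retry budget (120 * 30s) is chosen only to match the one-hour async limit.

```yaml
- name: Start long-running backup in the background
  ansible.builtin.command: /opt/myapp/run_backup.sh   # hypothetical script
  async: 3600   # allow up to one hour of runtime
  poll: 0       # do not wait; return a job ID immediately
  register: backup_job

- name: Poll the background job until it finishes
  ansible.builtin.async_status:
    jid: "{{ backup_job.ansible_job_id }}"
  register: backup_result
  until: backup_result.finished
  retries: 120
  delay: 30
```

Other tasks can run between the two steps, which is the main advantage over blocking on a single retried task.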
FAQ
How do I retry a failed task in Ansible?
Add retries, delay, and until to any task. Register the result and set an until condition: the task retries up to retries times, waiting delay seconds between attempts, until the condition is true.
What are the default values for retries and delay?
The default is 3 retries with 5 seconds delay between each. If you specify until without retries, Ansible uses these defaults.
How do I wait for a service to be ready?
Use ansible.builtin.uri with until: result.status == 200 and set retries and delay based on how long the service typically takes to start. For port-level checks, use ansible.builtin.wait_for.
Can I use retries without until?
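You can, but it's not recommended. The behavior depends on your ansible-core version: recent releases retry the task until it stops failing, while older releases ignore retries when until is absent. With until, you define explicit success criteria, which behaves the same everywhere and is more reliable.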
How do I know how many retries a task needed?
The registered variable includes an attempts field: result.attempts contains the number of attempts made (1 = succeeded first try).
Conclusion
Ansible's retry mechanism is essential for reliable automation:
• until: condition — Define what "success" looks like
• retries: N — Maximum attempts (default: 3)
• delay: N — Seconds between retries (default: 5)
• Always register the result to use in until conditions
• Combine with changed_when: false for read-only checks
Use retries whenever you interact with services that might not be immediately available — APIs, databases, cloud resources, or freshly started services.
Related Articles
• Ansible changed_when & failed_when: Control Task Status
• Ansible block, rescue, always: Error Handling Guide
• Ansible Ignore Errors Complete Guide
• Ansible async: Run Long Tasks in Background
See also
• How to Retry a Failed Task in Ansible (retries, delay, until)