Ansible Event-Driven Automation (EDA): Complete Guide with Rulebooks and Examples

By Luca Berton · Published 2024-01-01 · Category: installation

Master Ansible Event-Driven Automation (EDA) for real-time incident response and auto-remediation.

Ansible Event-Driven Automation (EDA) enables real-time, automated responses to events from monitoring systems, cloud providers, webhooks, and custom sources. Instead of running playbooks on a schedule, EDA listens for events and triggers automation instantly when conditions are met. This guide covers the complete EDA architecture, rulebook syntax, event sources, and production patterns.

What is Event-Driven Automation?

Traditional Ansible automation is operator-driven: a playbook runs when you launch it, either by hand or on a schedule. EDA flips this model: you define rules that watch for specific events and automatically trigger actions when conditions match.

Event Source → Event   → Rule Matching       → Action         → Resolution
(Webhook)      (alert)   (severity=critical)   (run playbook)   (service restored)

Key Concepts

| Concept | Description |
|---------|-------------|
| Rulebook | YAML file defining sources, rules, conditions, and actions |
| Event Source | Plugin that receives events (webhook, Kafka, alertmanager, etc.) |
| Condition | Expression in ansible-rulebook's own condition syntax, evaluated against event data (Jinja2 filters are not supported here) |
| Action | What happens when a condition matches (run_playbook, run_job_template, etc.) |
| ansible-rulebook | CLI tool that executes rulebooks |

Installation

# Install ansible-rulebook
pip install ansible-rulebook

# Install required collections
ansible-galaxy collection install ansible.eda

# Verify installation
ansible-rulebook --version

Requirements: Python 3.9+ and Java 17+ (for the Drools rule engine).

Rulebook Syntax

Basic Structure

---
- name: Respond to monitoring alerts
  hosts: all
  sources:
    - ansible.eda.webhook:
        host: 0.0.0.0
        port: 5000

  rules:
    - name: Restart service on failure alert
      condition: event.payload.status == "firing" and event.payload.alert == "service_down"
      action:
        run_playbook:
          name: playbooks/restart-service.yml
          extra_vars:
            target_service: "{{ event.payload.service }}"
            target_host: "{{ event.payload.host }}"
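
The rulebook hands target_service and target_host to playbooks/restart-service.yml, but the playbook itself is not shown in this guide. Here is a minimal sketch of what it could contain. It assumes the service is managed by the host's init system and that target_host matches an inventory hostname; both are assumptions, not details from the rulebook.

```yaml
---
# playbooks/restart-service.yml (hypothetical contents; only the file name
# and the two extra_vars come from the rulebook)
- name: Restart a failed service
  hosts: "{{ target_host }}"   # assumes the event's host value is an inventory hostname
  become: true
  gather_facts: false
  tasks:
    - name: Restart the service named in the event
      ansible.builtin.service:
        name: "{{ target_service }}"
        state: restarted
```

With hosts: all in the ruleset, the playbook's own hosts pattern should narrow the run to the machine named in the event.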

Running a Rulebook

# Run with verbose output
ansible-rulebook --rulebook rulebook.yml -i inventory.yml --verbose

# Run with specific variables
ansible-rulebook --rulebook rulebook.yml -i inventory.yml \
  --vars vars.yml

Event Sources

Webhook Source

sources:
  - ansible.eda.webhook:
      host: 0.0.0.0
      port: 5000
      token: "{{ webhook_secret }}"
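
The token value is a variable reference, so webhook_secret has to be supplied when the rulebook starts. One way is a variables file passed with --vars, sketched here with a placeholder value. The file name and the value are illustrative assumptions.

```yaml
# vars.yml (hypothetical file, passed as: --vars vars.yml)
# Keep the real value out of version control; this one is a placeholder.
webhook_secret: "replace-with-a-long-random-string"
```

As far as the webhook plugin's behavior is documented, senders then need to present the same value in an Authorization: Bearer header, and requests without it are rejected.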

Alertmanager Source

sources:
  - ansible.eda.alertmanager:
      host: 0.0.0.0
      port: 8888
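
Events from this source do not look like webhook events. With the plugin's default settings, each alert in an Alertmanager notification should arrive as its own event under event.alert. The sketch below relies on that assumption; the rule name and message are illustrative.

```yaml
---
- name: React to Alertmanager alerts
  hosts: all
  sources:
    - ansible.eda.alertmanager:
        host: 0.0.0.0
        port: 8888
  rules:
    - name: Print firing alerts
      # event.alert.* assumes the plugin's default data paths
      condition: event.alert.status == "firing"
      action:
        debug:
          msg: "Firing: {{ event.alert.labels.alertname }}"
```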

Kafka Source

sources:
  - ansible.eda.kafka:
      host: kafka-broker.example.com
      port: 9092
      topic: infrastructure-events
      group_id: eda-consumer
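
A rule for this source needs to know where the message lands in the event. The sketch below assumes the plugin places the decoded message under event.body. The type and host fields are hypothetical message contents, not something the plugin defines.

```yaml
---
- name: React to Kafka messages
  hosts: all
  sources:
    - ansible.eda.kafka:
        host: kafka-broker.example.com
        port: 9092
        topic: infrastructure-events
        group_id: eda-consumer
  rules:
    - name: Report host-down messages
      # "type" and "host" are hypothetical fields in the Kafka message
      condition: event.body.type == "host_down"
      action:
        debug:
          msg: "Host down: {{ event.body.host }}"
```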

File Watch Source

sources:
  - ansible.eda.file_watch:
      path: /var/log/application/
      recursive: true
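
File watch events carry file system details rather than a payload. The sketch below assumes the event exposes change_type and src_path at the top level, which is how the plugin appears to report changes. Verify the field names against your installed collection version.

```yaml
---
- name: React to log directory changes
  hosts: all
  sources:
    - ansible.eda.file_watch:
        path: /var/log/application/
        recursive: true
  rules:
    - name: Report modified files
      # change_type and src_path are assumed event fields
      condition: event.change_type == "modified"
      action:
        debug:
          msg: "Changed: {{ event.src_path }}"
```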

URL Check Source

sources:
  - ansible.eda.url_check:
      urls:
        - https://api.example.com/health
        - https://web.example.com/health
      delay: 30
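
The url_check source polls each URL every delay seconds and reports the result. A sketch of a matching rule, assuming the result is exposed under event.url_check with a status of "up" or "down":

```yaml
---
- name: Watch health endpoints
  hosts: all
  sources:
    - ansible.eda.url_check:
        urls:
          - https://api.example.com/health
          - https://web.example.com/health
        delay: 30
  rules:
    - name: Report an endpoint that is down
      # event.url_check.status and .url are assumed event fields
      condition: event.url_check.status == "down"
      action:
        debug:
          msg: "{{ event.url_check.url }} is down"
```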

Production Rulebook Examples

Example 1: Auto-Remediation for Service Failures

---
- name: Service auto-remediation
  hosts: all
  sources:
    - ansible.eda.webhook:
        host: 0.0.0.0
        port: 5000

  rules:
    # Conditions do not accept Jinja2 filters such as "| int"; numeric
    # comparisons assume the payload field arrives as a JSON number.
    - name: Restart failed service
      condition: >-
        event.payload.status == "firing"
        and event.payload.labels.severity in ["critical", "warning"]
        and event.payload.labels.alertname == "ServiceDown"
      throttle:
        once_within: 5 minutes
        group_by_attributes:
          - event.payload.labels.instance
          - event.payload.labels.service
      action:
        run_playbook:
          name: playbooks/restart-service.yml
          extra_vars:
            target_host: "{{ event.payload.labels.instance }}"
            service_name: "{{ event.payload.labels.service }}"
            alert_severity: "{{ event.payload.labels.severity }}"

    - name: Scale up on high CPU
      condition: >-
        event.payload.labels.alertname == "HighCPU"
        and event.payload.labels.cpu_percent > 90
      throttle:
        once_within: 15 minutes
        group_by_attributes:
          - event.payload.labels.cluster
      action:
        run_playbook:
          name: playbooks/scale-up.yml
          extra_vars:
            cluster: "{{ event.payload.labels.cluster }}"

    - name: Disk cleanup on low space
      condition: >-
        event.payload.labels.alertname == "DiskSpaceLow"
        and event.payload.labels.disk_percent > 85
      action:
        run_playbook:
          name: playbooks/disk-cleanup.yml
          extra_vars:
            target_host: "{{ event.payload.labels.instance }}"
            mount_point: "{{ event.payload.labels.mountpoint }}"

Example 2: Security Incident Response

---
- name: Security incident response
  hosts: all
  sources:
    - ansible.eda.webhook:
        host: 0.0.0.0
        port: 5001

  rules:
    # Numeric comparisons assume the sender delivers numbers, not strings;
    # conditions do not accept Jinja2 filters such as "| int" or "| float".
    - name: Block brute force attacker
      condition: >-
        event.payload.alert_type == "brute_force"
        and event.payload.failed_attempts > 10
      action:
        run_playbook:
          name: playbooks/block-ip.yml
          extra_vars:
            attacker_ip: "{{ event.payload.source_ip }}"
            target_host: "{{ event.payload.target_host }}"
            block_duration: "24h"

    - name: Isolate compromised host
      condition: >-
        event.payload.alert_type == "malware_detected"
        and event.payload.confidence > 0.9
      action:
        run_playbook:
          name: playbooks/isolate-host.yml
          extra_vars:
            compromised_host: "{{ event.payload.hostname }}"
            malware_hash: "{{ event.payload.file_hash }}"
            alert_id: "{{ event.payload.alert_id }}"

    - name: Rotate credentials on exposure
      condition: event.payload.alert_type == "credential_exposure"
      action:
        run_playbook:
          name: playbooks/rotate-credentials.yml
          extra_vars:
            exposed_service: "{{ event.payload.service }}"
            exposure_source: "{{ event.payload.source }}"

Example 3: Cloud Auto-Scaling and Cost Management

---
- name: Cloud infrastructure automation
  hosts: all
  sources:
    - ansible.eda.webhook:
        host: 0.0.0.0
        port: 5002

  rules:
    # Numeric comparisons assume the sender delivers numbers, not strings;
    # conditions do not accept Jinja2 filters such as "| int".
    - name: Auto-scale on queue depth
      condition: >-
        event.payload.metric == "sqs_queue_depth"
        and event.payload.value > 1000
      throttle:
        once_within: 10 minutes
        # Throttle per queue so one busy queue does not mask another
        group_by_attributes:
          - event.payload.queue
      action:
        run_playbook:
          name: playbooks/scale-workers.yml
          extra_vars:
            queue_name: "{{ event.payload.queue }}"
            current_depth: "{{ event.payload.value }}"

    - name: Terminate idle instances
      condition: >-
        event.payload.metric == "instance_idle"
        and event.payload.idle_minutes > 60
        and event.payload.environment != "production"
      action:
        run_playbook:
          name: playbooks/terminate-instance.yml
          extra_vars:
            instance_id: "{{ event.payload.instance_id }}"
            region: "{{ event.payload.region }}"

    - name: Right-size over-provisioned instances
      condition: >-
        event.payload.alert_type == "rightsizing_recommendation"
        and event.payload.savings_percent > 30
      action:
        run_playbook:
          name: playbooks/rightsize-instance.yml
          extra_vars:
            instance_id: "{{ event.payload.instance_id }}"
            recommended_type: "{{ event.payload.recommended_instance_type }}"

Example 4: CI/CD Pipeline Triggers

---
- name: CI/CD event-driven deployment
  hosts: all
  sources:
    - ansible.eda.webhook:
        host: 0.0.0.0
        port: 5003

  rules:
    - name: Deploy on successful build
      condition: >-
        event.payload.event == "build_complete"
        and event.payload.status == "success"
        and event.payload.branch == "main"
      action:
        run_playbook:
          name: playbooks/deploy-application.yml
          extra_vars:
            app_version: "{{ event.payload.version }}"
            artifact_url: "{{ event.payload.artifact_url }}"
            commit_sha: "{{ event.payload.commit }}"

    - name: Rollback on failed deployment
      # consecutive_failures must arrive as a number; "| int" is not valid in conditions
      condition: >-
        event.payload.event == "health_check_failed"
        and event.payload.consecutive_failures >= 3
      action:
        run_playbook:
          name: playbooks/rollback-deployment.yml
          extra_vars:
            app_name: "{{ event.payload.application }}"
            failed_version: "{{ event.payload.version }}"

Integration with AAP Controller

EDA integrates natively with Ansible Automation Platform (AAP) Controller:

rules:
  - name: Trigger AAP job template
    condition: event.payload.alert == "critical"
    action:
      run_job_template:
        name: "Remediate Critical Alert"
        organization: "Operations"
        job_args:
          extra_vars:
            alert_data: "{{ event.payload }}"

EDA Controller Setup in AAP

1. Navigate to Automation Decisions → Rulebook Activations.
2. Create a new activation with your rulebook.
3. Select the appropriate Decision Environment (the execution environment for EDA).
4. Configure credentials for event sources.
5. Enable the activation.

Conditions Reference

Comparison Operators

# Equality
condition: event.payload.status == "critical"

# Numeric comparison (Jinja2 filters such as "| int" are not supported;
# the field must already be a number)
condition: event.payload.value > 100

# String contains
condition: event.payload.message is search("error")

# Regex match
condition: event.payload.host is match("web-.*")

# List membership
condition: event.payload.severity in ["critical", "high"]

# Multiple conditions (AND)
condition: >-
  event.payload.status == "firing"
  and event.payload.severity == "critical"
  and event.payload.environment == "production"

# OR conditions (use any)
condition:
  any:
    - event.payload.alert == "disk_full"
    - event.payload.alert == "disk_readonly"
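
Not every sender includes every field, so it can help to test for a key before comparing it. The condition syntax provides is defined and is not defined for this. The sketch below reuses the event.payload.value field from the numeric comparison; the rule names and messages are illustrative.

```yaml
rules:
  - name: Act only when the value is present and high
    condition: event.payload.value is defined and event.payload.value > 100
    action:
      debug:
        msg: "High value: {{ event.payload.value }}"

  - name: Flag events that arrive without a value
    condition: event.payload is defined and event.payload.value is not defined
    action:
      debug:
        msg: "Event without a value field: {{ event.payload }}"
```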

Throttling and Deduplication

Prevent alert storms from triggering excessive automation:

rules:
  - name: Throttled remediation
    condition: event.payload.alert == "service_down"
    throttle:
      once_within: 5 minutes
      group_by_attributes:
        - event.payload.host
        - event.payload.service
    action:
      run_playbook:
        name: playbooks/remediate.yml
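
once_within acts on the first matching event and suppresses repeats for the rest of the window. ansible-rulebook also documents a once_after option which, as described there, waits for the window to elapse and then fires once for the group. A sketch of that variant, reusing the same hypothetical playbook:

```yaml
rules:
  - name: Remediate once the alert burst has settled
    condition: event.payload.alert == "service_down"
    throttle:
      once_after: 5 minutes        # fire once, after the window closes
      group_by_attributes:
        - event.payload.host
        - event.payload.service
    action:
      run_playbook:
        name: playbooks/remediate.yml
```

Which option fits depends on whether acting immediately or waiting for the burst to finish matters more for the alert in question.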

Best Practices

1. Always Use Throttling

Without throttling, a flapping alert can trigger hundreds of remediation attempts. Always set once_within for production rulebooks.

2. Log All Events

rules:
  - name: Log all events for debugging
    condition: "true"
    action:
      debug:
        msg: "Received event: {{ event }}"
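
For plain event inspection, the print_event action is an alternative to debug. This sketch assumes a webhook-style source, so it keys on event.payload being present; the rule name is illustrative.

```yaml
rules:
  - name: Print every webhook event
    condition: event.payload is defined
    action:
      print_event:
        pretty: true   # pretty-print the event for easier reading
```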

3. Use Decision Environments

Package your EDA dependencies (Python packages, Java, collections) into a Decision Environment for reproducible execution.
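
Decision Environments are typically built with ansible-builder from a definition file. The sketch below follows the version 3 definition format. The base image name and the extra Python package are assumptions to adapt to your registry and plugins, not values from this guide.

```yaml
---
# decision-environment.yml (hypothetical file name)
version: 3

images:
  base_image:
    # Assumed upstream image that already ships ansible-rulebook and Java
    name: quay.io/ansible/ansible-rulebook:main

dependencies:
  galaxy:
    collections:
      - ansible.eda
  python:
    # Example extra dependency for a source plugin; adjust or remove
    - aiokafka
```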

4. Test with Mock Events

# Send test event to webhook
curl -X POST http://localhost:5000/endpoint \
  -H "Content-Type: application/json" \
  -d '{"status": "firing", "alert": "service_down", "service": "nginx", "host": "web-01"}'
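
Rules can also be exercised without any HTTP traffic by swapping the real source for ansible.eda.generic, which replays a fixed list of events. In this sketch the debug action stands in for run_playbook so that nothing is changed during the dry run. The ruleset name and the message text are illustrative.

```yaml
---
- name: Dry run of the restart rule with canned events
  hosts: all
  sources:
    - ansible.eda.generic:
        payload:
          # Each list item becomes one event; nesting the data under
          # "payload" mimics the shape the webhook source produces.
          - payload:
              status: firing
              alert: service_down
              service: nginx
              host: web-01
  rules:
    - name: Restart service on failure alert
      condition: event.payload.status == "firing" and event.payload.alert == "service_down"
      action:
        debug:
          msg: "Would restart {{ event.payload.service }} on {{ event.payload.host }}"
```

ansible-rulebook should exit on its own once the payload list is exhausted, which makes this form convenient for quick local checks.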

5. Implement Escalation

rules:
  # occurrence must arrive as a number; "| int" is not valid in conditions
  - name: Auto-remediate the first three occurrences
    condition: >-
      event.payload.alert == "service_down"
      and event.payload.occurrence <= 3
    action:
      run_playbook:
        name: playbooks/restart-service.yml

  - name: Escalate repeated failures
    condition: >-
      event.payload.alert == "service_down"
      and event.payload.occurrence > 3
    action:
      run_playbook:
        name: playbooks/escalate-to-oncall.yml

Frequently Asked Questions

What is Ansible Event-Driven Automation (EDA)?

EDA is a framework that listens for events from external sources (monitoring systems, cloud providers, webhooks) and automatically triggers Ansible playbooks or AAP job templates when specific conditions are met. It enables real-time, reactive automation instead of scheduled or manual execution.

What events can trigger Ansible EDA?

Any event that can be delivered via webhooks, message queues (Kafka, AMQP), APIs, or file changes. Common sources include Prometheus/Alertmanager, Dynatrace, PagerDuty, AWS EventBridge, ServiceNow, GitHub webhooks, and custom applications.

How is EDA different from running playbooks on a cron schedule?

Cron-based automation runs at fixed intervals regardless of need. EDA responds to events in real-time — when an alert fires, when a file changes, or when a webhook is received. This means faster response times and no wasted execution cycles.

Can EDA integrate with Ansible Automation Platform?

Yes. EDA is a core component of AAP 2.4+. The EDA Controller provides a web UI for managing rulebook activations, viewing event logs, and integrating with AAP Controller job templates. EDA can trigger any job template in your AAP environment.

How do I prevent alert storms from overwhelming EDA?

Use the throttle directive with once_within to limit how often a rule can fire for the same group of events. Define the groups with group_by_attributes to deduplicate by host, service, or other dimensions.

Conclusion

Event-Driven Automation transforms Ansible from a tool you run into a system that acts autonomously. By connecting event sources to rulebooks, you can auto-remediate service failures in seconds, respond to security incidents in real-time, and optimize cloud costs automatically. Start with simple webhook-based rules, add throttling for safety, and gradually expand to cover your most common operational scenarios. EDA is the bridge between monitoring and action.
