Ansible for IoT and Edge Computing: Automate Device Fleets at Scale
By Luca Berton · Published 2024-01-01 · Category: installation
Automate IoT and edge computing infrastructure with Ansible. Manage device fleets, deploy edge applications, configure network gateways, update firmware.
Why Ansible for IoT and Edge?
Edge computing pushes workloads from centralized clouds to thousands of distributed locations — retail stores, factories, cell towers, vehicles, and remote sites. Managing these devices manually is impossible at scale.
Ansible is well suited to edge automation:
• Agentless — no client software to install on resource-constrained devices
• SSH-based — works over any network connection, including cellular and satellite
• Idempotent — safe to re-run on unreliable connections (if it fails, just run again)
• Low overhead — managed devices need only Python and SSH
• Offline-capable — ansible-pull lets devices self-configure without inbound connectivity
Edge Architecture Patterns
Pattern 1: Centralized Push (AAP + Automation Mesh)
[AAP Controller] → [Hop Nodes]  → [Edge Devices]
      (HQ)          (Regional)    (1000s of sites)
AAP's Automation Mesh extends execution across network boundaries:
# Automation Mesh topology
# Controller (HQ) → Hop Node (Region) → Execution Node (Edge)
#
# Hop nodes relay traffic without running playbooks
# Execution nodes run playbooks locally at the edge
Pattern 2: Pull-Based (ansible-pull)
For devices behind NAT or with intermittent connectivity:
# Cron-driven ansible-pull on each device
# /etc/cron.d/ansible-pull
# Note: a cron entry must fit on one line; cron does not support backslash continuations.
*/30 * * * * root ansible-pull -U https://github.com/company/edge-config.git -d /opt/ansible -i localhost, --accept-host-key site.yml >> /var/log/ansible-pull.log 2>&1
# site.yml (in the Git repo)
---
- name: Edge device self-configuration
  hosts: localhost
  connection: local
  become: true
  tasks:
    - name: Ensure edge application is running
      ansible.builtin.systemd:
        name: edge-agent
        state: started
        enabled: true

    - name: Update configuration from Git
      ansible.builtin.template:
        src: edge-config.yml.j2
        dest: /etc/edge-agent/config.yml
      notify: restart edge-agent

    - name: Report status to central server
      ansible.builtin.uri:
        url: "https://management.company.com/api/devices/{{ ansible_hostname }}/heartbeat"
        method: POST
        body_format: json
        body:
          hostname: "{{ ansible_hostname }}"
          uptime: "{{ ansible_uptime_seconds }}"
          version: "{{ edge_agent_version }}"
          ip: "{{ ansible_default_ipv4.address }}"
      ignore_errors: true  # Don't fail if management server unreachable

  handlers:
    - name: restart edge-agent
      ansible.builtin.systemd:
        name: edge-agent
        state: restarted
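With every device on the same */30 schedule, the whole fleet contacts the Git server in the same minute. ansible-pull's -s/--sleep option adds a random delay before each run, which is usually enough. Where a predictable per-device slot is preferred, a stable offset can be derived from the device ID, as in the sketch below. The function names are hypothetical, and the sketch assumes the pull interval divides 60 evenly.

```python
import hashlib


def pull_offset_minutes(device_id: str, interval_minutes: int = 30) -> int:
    """Map a device ID to a stable offset in [0, interval_minutes)."""
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer, then reduce it to the interval.
    return int.from_bytes(digest[:8], "big") % interval_minutes


def cron_minute_field(device_id: str, interval_minutes: int = 30) -> str:
    """Build the cron minute field: the offset, then every interval after it within the hour."""
    offset = pull_offset_minutes(device_id, interval_minutes)
    return ",".join(str(minute) for minute in range(offset, 60, interval_minutes))


if __name__ == "__main__":
    for device in ("edge-001", "edge-002", "edge-003"):
        # The same device always gets the same slot; different devices spread out.
        print(device, cron_minute_field(device))
```

The resulting string replaces */30 in the cron entry, for example when the cron file is rendered from a template.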
Pattern 3: Hybrid (Push + Pull)
# Normal operations: ansible-pull every 30 minutes
# Emergency updates: AAP pushes directly via Automation Mesh
# Firmware updates: AAP workflow with rolling update strategy
Manage Raspberry Pi Fleets
---
- name: Configure Raspberry Pi edge devices
  hosts: raspberry_pis
  become: true
  vars:
    wifi_ssid: "{{ vault_wifi_ssid }}"
    wifi_password: "{{ vault_wifi_password }}"
    ntp_server: "time.company.com"
  tasks:
    - name: Set hostname
      ansible.builtin.hostname:
        name: "edge-{{ site_id }}-{{ inventory_hostname_short }}"

    - name: Configure WiFi
      ansible.builtin.template:
        src: wpa_supplicant.conf.j2
        dest: /etc/wpa_supplicant/wpa_supplicant.conf
        mode: '0600'
      notify: restart networking

    - name: Set timezone
      community.general.timezone:
        name: "{{ device_timezone | default('UTC') }}"

    - name: Configure NTP
      ansible.builtin.template:
        src: timesyncd.conf.j2
        dest: /etc/systemd/timesyncd.conf
      notify: restart timesyncd

    - name: Install edge packages
      ansible.builtin.apt:
        name:
          - docker.io
          - mosquitto-clients  # MQTT client
          - python3-pip
          - jq
          - monitoring-plugins-basic
        state: present
        update_cache: true

    - name: Deploy edge application container
      community.docker.docker_container:
        name: edge-sensor
        image: "registry.company.com/edge-sensor:{{ app_version }}"
        state: started
        restart_policy: always
        ports:
          - "8080:8080"
        volumes:
          - /data/sensor:/data
        env:
          SITE_ID: "{{ site_id }}"
          MQTT_BROKER: "{{ mqtt_broker }}"
          DEVICE_ID: "{{ inventory_hostname }}"

    - name: Configure watchdog
      ansible.builtin.copy:
        content: |
          [Unit]
          Description=Hardware Watchdog
          [Service]
          ExecStart=/usr/sbin/watchdog
          [Install]
          WantedBy=multi-user.target
        dest: /etc/systemd/system/watchdog.service
      notify: enable watchdog

    - name: Set GPU memory split (headless)
      ansible.builtin.lineinfile:
        # On Raspberry Pi OS Bookworm and later this file is /boot/firmware/config.txt
        path: /boot/config.txt
        regexp: '^gpu_mem='
        line: 'gpu_mem=16'
      notify: reboot required

  handlers:
    - name: restart networking
      ansible.builtin.systemd:
        name: networking
        state: restarted

    - name: restart timesyncd
      ansible.builtin.systemd:
        name: systemd-timesyncd
        state: restarted

    - name: enable watchdog
      ansible.builtin.systemd:
        name: watchdog
        state: started
        enabled: true
        daemon_reload: true  # pick up the unit file written above

    - name: reboot required
      ansible.builtin.debug:
        msg: "Reboot required on {{ inventory_hostname }}"
Firmware and OS Updates
Rolling Updates with Serial
---
- name: Rolling firmware update
  hosts: edge_devices
  become: true
  serial: "{{ update_batch_size | default('10%') }}"
  max_fail_percentage: 5
  pre_tasks:
    - name: Health check before update
      ansible.builtin.uri:
        url: "http://localhost:8080/health"
      register: health
      failed_when: health.status != 200

  tasks:
    - name: Download firmware
      ansible.builtin.get_url:
        url: "{{ firmware_url }}"
        dest: "/tmp/firmware-{{ firmware_version }}.bin"
        checksum: "sha256:{{ firmware_checksum }}"

    - name: Stop application
      ansible.builtin.systemd:
        name: edge-agent
        state: stopped

    - name: Apply firmware update
      ansible.builtin.command:
        cmd: "/usr/local/bin/fw-update /tmp/firmware-{{ firmware_version }}.bin"
      register: fw_result
      failed_when: fw_result.rc != 0

    - name: Reboot device
      ansible.builtin.reboot:
        reboot_timeout: 300
        connect_timeout: 30
        post_reboot_delay: 30

  post_tasks:
    - name: Verify firmware version
      ansible.builtin.command:
        cmd: cat /sys/firmware/version
      register: current_fw
      changed_when: false  # read-only check
      failed_when: firmware_version not in current_fw.stdout

    - name: Health check after update
      ansible.builtin.uri:
        url: "http://localhost:8080/health"
      register: post_health
      retries: 5
      delay: 30
      until: post_health.status == 200
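The interaction between serial and max_fail_percentage is easy to misjudge, so here is a rough Python model of the arithmetic. It is a sketch of the documented behavior, not Ansible's own code, and the function names are hypothetical. A percentage serial is rounded down with a minimum of one host, and a batch aborts the play only when the failed share exceeds max_fail_percentage, not when it equals it.

```python
def serial_batch_size(serial, num_hosts):
    """Resolve a serial value ('10%' or an integer) to a host count per batch."""
    if isinstance(serial, str) and serial.endswith("%"):
        pct = int(serial[:-1])
        # Rounded down, but never less than one host per batch.
        return int(pct / 100.0 * num_hosts) or 1
    return int(serial)


def split_into_batches(hosts, serial):
    """Split the host list into consecutive batches; the last one may be smaller."""
    size = serial_batch_size(serial, len(hosts))
    return [hosts[i:i + size] for i in range(0, len(hosts), size)]


def batch_exceeds_failure_limit(failed, batch_size, max_fail_percentage):
    """True when the failed share of the batch is strictly above the limit."""
    return failed * 100 > max_fail_percentage * batch_size


if __name__ == "__main__":
    hosts = [f"edge-{i:03d}" for i in range(250)]
    batches = split_into_batches(hosts, "10%")
    print(len(batches), "batches of", len(batches[0]), "hosts")
    # With 25 hosts per batch and a 5% limit, one failure (4%) is tolerated, two (8%) are not.
    print(batch_exceeds_failure_limit(1, 25, 5), batch_exceeds_failure_limit(2, 25, 5))
```

One consequence: with batches of fewer than 20 hosts, a 5% limit means a single failed device stops the rollout.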
OS Image Updates (A/B Partition)
---
- name: A/B partition OS update
  hosts: edge_devices
  become: true
  tasks:
    - name: Identify inactive partition
      ansible.builtin.shell: |
        current=$(findmnt -n -o SOURCE /)
        if [[ "$current" == *"partA"* ]]; then
          echo "partB"
        else
          echo "partA"
        fi
      args:
        executable: /bin/bash  # [[ ... ]] is a bash construct; the default /bin/sh may be dash
      register: inactive_partition
      changed_when: false

    - name: Write new image to inactive partition
      ansible.builtin.command:
        cmd: >
          dd if=/tmp/os-image-{{ os_version }}.img
          of=/dev/mmcblk0{{ inactive_partition.stdout }}
          bs=4M status=progress
      async: 600
      poll: 30

    - name: Update boot configuration
      ansible.builtin.lineinfile:
        path: /boot/grub/grub.cfg
        regexp: '^set default='
        line: "set default={{ inactive_partition.stdout }}"

    - name: Set rollback timer
      ansible.builtin.copy:
        content: |
          [Unit]
          Description=Rollback if health check fails
          [Timer]
          OnBootSec=300
          [Install]
          WantedBy=timers.target
        dest: /etc/systemd/system/rollback-check.timer

    - name: Reboot into new partition
      ansible.builtin.reboot:
        reboot_timeout: 300
Network Gateway Configuration
---
- name: Configure edge network gateways
  hosts: gateways
  become: true
  tasks:
    - name: Configure MQTT broker
      ansible.builtin.template:
        src: mosquitto.conf.j2
        dest: /etc/mosquitto/mosquitto.conf
      notify: restart mosquitto

    - name: Configure VPN tunnel to HQ
      ansible.builtin.template:
        src: wireguard.conf.j2
        dest: /etc/wireguard/wg0.conf
        mode: '0600'
      notify: restart wireguard

    - name: Enable IP forwarding
      ansible.posix.sysctl:
        name: net.ipv4.ip_forward
        value: '1'
        sysctl_set: true

    - name: Configure NAT for edge network
      ansible.builtin.iptables:
        table: nat
        chain: POSTROUTING
        out_interface: wg0
        jump: MASQUERADE

    - name: Deploy local container registry mirror
      community.docker.docker_container:
        name: registry-mirror
        image: registry:2
        state: started
        restart_policy: always
        ports:
          - "5000:5000"
        volumes:
          - /data/registry:/var/lib/registry
        env:
          REGISTRY_PROXY_REMOTEURL: "https://registry.company.com"

  handlers:
    - name: restart mosquitto
      ansible.builtin.systemd:
        name: mosquitto
        state: restarted

    - name: restart wireguard
      ansible.builtin.systemd:
        name: wg-quick@wg0
        state: restarted
Dynamic Inventory for Edge Devices
#!/usr/bin/env python3
# edge_inventory.py - Dynamic inventory from device management API
import json
import os

import requests

API_URL = os.environ.get('EDGE_API_URL', 'https://management.company.com/api')
API_TOKEN = os.environ.get('EDGE_API_TOKEN')


def get_inventory():
    headers = {'Authorization': f'Bearer {API_TOKEN}'}
    # A timeout keeps an unreachable API from hanging the Ansible run
    response = requests.get(f'{API_URL}/devices', headers=headers, timeout=30)
    response.raise_for_status()
    devices = response.json()

    inventory = {
        '_meta': {'hostvars': {}},
        'all': {'children': ['edge_devices', 'gateways']},
        'edge_devices': {'hosts': []},
        'gateways': {'hosts': []}
    }

    # Group by site
    sites = {}
    for device in devices:
        hostname = device['hostname']
        site = device['site_id']

        # Add to site group
        site_group = f"site_{site}"
        if site_group not in sites:
            sites[site_group] = {'hosts': []}
            inventory['all']['children'].append(site_group)
        sites[site_group]['hosts'].append(hostname)

        # Add to type group
        device_type = 'gateways' if device['role'] == 'gateway' else 'edge_devices'
        inventory[device_type]['hosts'].append(hostname)

        # Host variables
        inventory['_meta']['hostvars'][hostname] = {
            'ansible_host': device['ip_address'],
            'site_id': site,
            'device_type': device['hardware'],
            'firmware_version': device['firmware'],
            'app_version': device.get('app_version', 'unknown')
        }

    inventory.update(sites)
    return inventory


if __name__ == '__main__':
    print(json.dumps(get_inventory(), indent=2))
Monitoring Edge Fleets
---
- name: Deploy monitoring to edge devices
  hosts: edge_devices
  become: true
  tasks:
    - name: Install node-exporter
      ansible.builtin.get_url:
        # go_arch is not set in this playbook; define it per host or group (for example arm64 or armv7)
        url: "https://github.com/prometheus/node_exporter/releases/download/v1.8.0/node_exporter-1.8.0.linux-{{ go_arch }}.tar.gz"
        dest: /tmp/node_exporter.tar.gz

    - name: Deploy node-exporter
      ansible.builtin.unarchive:
        src: /tmp/node_exporter.tar.gz
        dest: /usr/local/bin/
        remote_src: true
        extra_opts: [--strip-components=1]
        creates: /usr/local/bin/node_exporter

    - name: Create systemd service
      ansible.builtin.copy:
        content: |
          [Unit]
          Description=Node Exporter
          After=network.target
          [Service]
          User=nobody
          ExecStart=/usr/local/bin/node_exporter \
            --collector.textfile.directory=/var/lib/node_exporter \
            --collector.systemd \
            --collector.processes
          [Install]
          WantedBy=multi-user.target
        dest: /etc/systemd/system/node-exporter.service
      notify: restart node-exporter

    - name: Create custom metrics directory
      ansible.builtin.file:
        path: /var/lib/node_exporter
        state: directory
        owner: nobody

    - name: Deploy edge-specific metric collector
      ansible.builtin.copy:
        content: |
          #!/bin/bash
          # Custom edge metrics
          echo "# HELP edge_sensor_temperature Edge device CPU temperature"
          echo "# TYPE edge_sensor_temperature gauge"
          temp=$(cat /sys/class/thermal/thermal_zone0/temp 2>/dev/null || echo 0)
          echo "edge_sensor_temperature $((temp / 1000))"
          echo "# HELP edge_uplink_status Edge WAN link status"
          echo "# TYPE edge_uplink_status gauge"
          ping -c 1 -W 2 8.8.8.8 >/dev/null 2>&1 && echo "edge_uplink_status 1" || echo "edge_uplink_status 0"
        dest: /usr/local/bin/edge-metrics.sh
        mode: '0755'

    - name: Schedule metric collection
      ansible.builtin.cron:
        name: "edge metrics"
        minute: "*/5"
        job: "/usr/local/bin/edge-metrics.sh > /var/lib/node_exporter/edge.prom"

  handlers:
    - name: restart node-exporter
      ansible.builtin.systemd:
        name: node-exporter
        state: restarted
        daemon_reload: true
        enabled: true
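The cron job above redirects straight into edge.prom, so node_exporter can scrape the file while it is still being written. The textfile collector only reads files ending in .prom, which makes a write-then-rename approach safe. The following is a minimal sketch of that idea in Python; the function name and the sample metric values are illustrative, not part of the original setup.

```python
import os
import tempfile


def write_prom_atomically(path, lines):
    """Write metric lines to a temp file in the same directory, then rename it into place."""
    directory = os.path.dirname(path) or "."
    # The .tmp suffix keeps the textfile collector from reading the partial file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write("\n".join(lines) + "\n")
        # mkstemp creates the file as 0600; node_exporter runs as another user.
        os.chmod(tmp_path, 0o644)
        # os.replace is an atomic rename when source and target share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "edge.prom")
    write_prom_atomically(target, [
        "# HELP edge_uplink_status Edge WAN link status",
        "# TYPE edge_uplink_status gauge",
        "edge_uplink_status 1",
    ])
    print(open(target).read())
```

The same effect is available in the shell script by writing to a temporary file in /var/lib/node_exporter and finishing with mv.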
FAQ
Can Ansible manage thousands of edge devices?
Yes. AAP's Automation Mesh distributes execution across hop and execution nodes, so capacity can be added close to the devices rather than at a single controller. For pull-based models, each device runs ansible-pull itself, so the control plane is not the bottleneck; the Git server and the network are the limits to plan for. Use dynamic inventory to track devices and serial for rolling updates.
What about devices with intermittent connectivity?
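Use ansible-pull with a cron schedule. The device pulls configuration from a Git repository whenever it has connectivity, and a run that fails because the link is down is simply retried at the next interval. For critical updates pushed from AAP, offline devices are reported as unreachable, so plan to re-run the job against them once they reconnect.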
How do I handle device-specific configuration at scale?
Use dynamic inventory with host variables from a device management database or CMDB. Group variables handle site-level config, host variables handle device-specific settings. Jinja2 templates generate unique configurations per device.
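As a rough illustration of that split, a script-based dynamic inventory can carry site-level settings in a group's "vars" key and device-level settings under "_meta" / "hostvars". The sketch below assumes an inventory dict shaped like the one edge_inventory.py produces; add_site_vars and the sample values are hypothetical.

```python
import json


def add_site_vars(inventory, site_vars):
    """Attach group-level variables to site groups that already exist in the inventory."""
    for group, variables in site_vars.items():
        if group in inventory:
            inventory[group].setdefault("vars", {}).update(variables)
    return inventory


if __name__ == "__main__":
    inventory = {
        "_meta": {"hostvars": {
            # Device-specific settings live in hostvars.
            "edge-001": {"ansible_host": "10.0.1.11", "site_id": "berlin"},
        }},
        "all": {"children": ["site_berlin"]},
        "site_berlin": {"hosts": ["edge-001"]},
    }
    # Site-level settings are shared by every host in the group.
    site_vars = {
        "site_berlin": {
            "mqtt_broker": "mqtt.berlin.company.com",
            "device_timezone": "Europe/Berlin",
        },
    }
    print(json.dumps(add_site_vars(inventory, site_vars), indent=2))
```

A template can then reference mqtt_broker or device_timezone without knowing which site the device belongs to.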
Does Ansible work on ARM devices (Raspberry Pi, Jetson)?
Yes. The control node needs a UNIX-like system such as Linux or macOS, while managed nodes (including ARM) need only Python 3 and SSH access. Raspberry Pi, NVIDIA Jetson, and ARM servers all work as managed nodes.
Conclusion
Ansible automates edge computing at scale through three patterns: centralized push with AAP Automation Mesh for managed environments, ansible-pull for devices behind NAT or with intermittent connectivity, and hybrid approaches combining both. From Raspberry Pi fleets to industrial gateways to retail edge servers, Ansible's agentless architecture and SSH-based communication make it the natural choice for managing distributed device infrastructure.
Related Articles
• AAP 2.6 Automation Mesh: Distributed Execution Architecture
• AAP 2.6 Job Scheduling and Capacity Planning
• Ansible for Network Automation
• AAP 2.6 Monitoring and Logging
• community.docker collection overview