Ansible for Edge Computing and IoT: Managing Thousands of Distributed Devices
By Luca Berton · Published 2024-01-01 · Category: installation
Deploy and manage edge computing and IoT infrastructure with Ansible. Automate Raspberry Pi fleets, edge gateways, and distributed devices at massive scale.
Introduction
Edge computing pushes workloads out of the datacenter — to retail stores, factory floors, cell towers, and IoT gateways. Managing thousands of distributed devices with manual SSH and USB drives doesn't scale. Ansible automates edge device provisioning, configuration, updates, and monitoring using the same agentless approach that works for datacenter servers.
Edge Architecture with Ansible
           ┌─────────────────────┐
           │   AAP Controller    │
           │    (Central DC)     │
           └──────────┬──────────┘
                      │
      ┌───────────────┼───────────────┐
      ▼               ▼               ▼
┌────────────┐  ┌────────────┐  ┌────────────┐
│  Hop Node  │  │  Hop Node  │  │  Hop Node  │
│ (Region 1) │  │ (Region 2) │  │ (Region 3) │
└─────┬──────┘  └─────┬──────┘  └─────┬──────┘
 ┌────┼────┐     ┌────┼────┐     ┌────┼────┐
 ▼    ▼    ▼     ▼    ▼    ▼     ▼    ▼    ▼
Edge Edge Edge  Edge Edge Edge  Edge Edge Edge
001  002  003   004  005  006   007  008  009
Dynamic Inventory for Edge Devices
File-Based Inventory
# inventory/edge-devices.yml
all:
  children:
    retail_stores:
      hosts:
        store-[001:500]:
          ansible_host: "{{ inventory_hostname }}.vpn.example.com"
      vars:
        ansible_user: edge-admin
        ansible_connection: ssh
        device_type: retail_kiosk
    factory_gateways:
      hosts:
        factory-gw-[01:50]:
          ansible_host: "{{ inventory_hostname }}.factory.example.com"
      vars:
        device_type: industrial_gateway
        ansible_user: root
    iot_sensors:
      hosts:
        sensor-[0001:2000]:
      vars:
        ansible_user: pi
        device_type: raspberry_pi
        ansible_python_interpreter: /usr/bin/python3
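The bracketed ranges in this inventory are inclusive and keep their leading zeros. As a rough illustration of what a pattern such as store-[001:500] expands to, the Python sketch below mimics the expansion. It is not Ansible's own parser, and expand_range is a hypothetical helper name; running ansible-inventory --list against the real file remains the authoritative check.

```python
import re
from typing import List

# Illustrative only: mimics how a numeric range such as "store-[001:500]"
# expands into host names. This is not Ansible's own implementation.
RANGE_RE = re.compile(r"\[(\d+):(\d+)\]")


def expand_range(pattern: str) -> List[str]:
    match = RANGE_RE.search(pattern)
    if match is None:
        return [pattern]  # no range in the pattern: a single host
    start, end = match.group(1), match.group(2)
    width = len(start)  # leading zeros in the start value set the padding
    prefix, suffix = pattern[:match.start()], pattern[match.end():]
    return [
        prefix + str(n).zfill(width) + suffix
        for n in range(int(start), int(end) + 1)  # the end value is included
    ]


stores = expand_range("store-[001:500]")
print(len(stores), stores[0], stores[-1])  # 500 store-001 store-500
```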
API-Based Dynamic Inventory
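#!/usr/bin/env python3
# inventory/edge_inventory.py
import json
import os

import requests

# Assumption: the API token comes from an environment variable;
# the name EDGE_API_TOKEN is illustrative.
API_TOKEN = os.environ['EDGE_API_TOKEN']


def get_devices():
    resp = requests.get(
        'https://device-mgmt.example.com/api/v1/devices',
        headers={'Authorization': f'Bearer {API_TOKEN}'},
        timeout=30,
    )
    resp.raise_for_status()  # fail loudly rather than emit a partial inventory
    devices = resp.json()

    inventory = {'_meta': {'hostvars': {}}}
    for device in devices:
        group = device['type']
        if group not in inventory:
            inventory[group] = {'hosts': [], 'vars': {}}
        inventory[group]['hosts'].append(device['hostname'])
        inventory['_meta']['hostvars'][device['hostname']] = {
            'ansible_host': device['ip'],
            'device_serial': device['serial'],
            'firmware_version': device['firmware'],
            'location': device['location'],
            'last_seen': device['last_heartbeat']
        }
    return inventory


if __name__ == '__main__':
    # Ansible runs the script with --list; because _meta is present,
    # it does not call --host once per device.
    print(json.dumps(get_devices()))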
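To make the output shape concrete, here is a minimal sketch of the same grouping logic as a pure function, fed with hand-written records instead of a live API call. The sample hostnames, IP addresses, and firmware values are invented for illustration; only the field names follow the script above.

```python
import json


def build_inventory(devices):
    """Group device records by type in Ansible's dynamic-inventory JSON shape."""
    inventory = {"_meta": {"hostvars": {}}}
    for device in devices:
        group = inventory.setdefault(device["type"], {"hosts": [], "vars": {}})
        group["hosts"].append(device["hostname"])
        inventory["_meta"]["hostvars"][device["hostname"]] = {
            "ansible_host": device["ip"],
            "firmware_version": device["firmware"],
        }
    return inventory


# Hypothetical records standing in for the device-management API response.
sample = [
    {"hostname": "sensor-0001", "type": "raspberry_pi", "ip": "10.0.0.11", "firmware": "1.4.2"},
    {"hostname": "sensor-0002", "type": "raspberry_pi", "ip": "10.0.0.12", "firmware": "1.4.1"},
    {"hostname": "store-001", "type": "retail_kiosk", "ip": "10.1.0.5", "firmware": "2.0.0"},
]

inventory = build_inventory(sample)
print(json.dumps(inventory, indent=2))
```

Each device type becomes a group, and per-device values land under _meta.hostvars, which is what lets Ansible read the whole inventory in a single call.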
Raspberry Pi Fleet Management
Initial Provisioning
---
- name: Provision Raspberry Pi fleet
  hosts: iot_sensors
  become: true
  vars:
    wifi_ssid: "{{ vault_wifi_ssid }}"
    wifi_password: "{{ vault_wifi_password }}"
  # The handlers notified below (restart networking, restart sshd,
  # daemon reload, start monitoring) must be defined in this play's
  # handlers section; they are omitted here.
  tasks:
    - name: Set hostname
      ansible.builtin.hostname:
        name: "{{ inventory_hostname }}"

    - name: Configure WiFi
      ansible.builtin.template:
        src: wpa_supplicant.conf.j2
        dest: /etc/wpa_supplicant/wpa_supplicant.conf
        mode: '0600'
      notify: restart networking
      when: wifi_ssid is defined

    - name: Set timezone
      ansible.builtin.timezone:
        name: "{{ device_timezone | default('UTC') }}"

    - name: Configure SSH hardening
      ansible.builtin.lineinfile:
        path: /etc/ssh/sshd_config
        regexp: "{{ item.regexp }}"
        line: "{{ item.line }}"
      loop:
        - { regexp: '^#?PasswordAuthentication', line: 'PasswordAuthentication no' }
        - { regexp: '^#?PermitRootLogin', line: 'PermitRootLogin no' }
      notify: restart sshd

    - name: Install base packages
      ansible.builtin.apt:
        name:
          - python3
          - python3-pip
          - vim
          - htop
          - unattended-upgrades
        state: present
        update_cache: true

    - name: Enable automatic security updates
      ansible.builtin.template:
        src: auto-upgrades.j2
        dest: /etc/apt/apt.conf.d/20auto-upgrades

    - name: Create monitoring directory
      # copy does not create missing parent directories
      ansible.builtin.file:
        path: /opt/monitoring
        state: directory
        mode: '0755'

    - name: Deploy monitoring agent
      ansible.builtin.copy:
        src: edge-monitor.py
        dest: /opt/monitoring/edge-monitor.py
        mode: '0755'

    - name: Configure monitoring service
      ansible.builtin.template:
        src: edge-monitor.service.j2
        dest: /etc/systemd/system/edge-monitor.service
      notify:
        - daemon reload
        - start monitoring
OTA Firmware Updates
- name: Rolling firmware update for edge fleet
  hosts: iot_sensors
  become: true
  serial: 50  # Update 50 devices at a time
  max_fail_percentage: 5
  tasks:
    - name: Check current firmware version
      ansible.builtin.command: cat /etc/firmware-version
      register: current_fw
      changed_when: false

    - name: Skip if already updated
      ansible.builtin.meta: end_host
      when: current_fw.stdout == target_firmware_version

    - name: Download firmware update
      ansible.builtin.get_url:
        url: "{{ firmware_repo }}/{{ target_firmware_version }}.tar.gz"
        dest: /tmp/firmware-update.tar.gz
        checksum: "sha256:{{ firmware_checksum }}"

    - name: Apply firmware update
      ansible.builtin.unarchive:
        src: /tmp/firmware-update.tar.gz
        dest: /opt/firmware/
        remote_src: true

    - name: Run update script
      ansible.builtin.command: /opt/firmware/update.sh
      register: update_result

    - name: Reboot device
      ansible.builtin.reboot:
        reboot_timeout: 300
      when: update_result.changed

    - name: Verify firmware version
      ansible.builtin.command: cat /etc/firmware-version
      register: new_fw
      changed_when: false  # read-only check
      failed_when: new_fw.stdout != target_firmware_version
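The interaction between serial and max_fail_percentage is easy to misjudge, so here is a small arithmetic sketch in Python. It assumes the 2,000-sensor fleet from the inventory example and relies on Ansible's documented rule that the failure percentage must be exceeded, not merely equalled, before the play aborts. The helper names are hypothetical.

```python
def batches(hosts, serial):
    """Split hosts into consecutive batches of at most `serial` hosts."""
    return [hosts[i:i + serial] for i in range(0, len(hosts), serial)]


def batch_aborts(failed, batch_size, max_fail_percentage):
    """True when the failed share of a batch exceeds the threshold."""
    return failed / batch_size * 100 > max_fail_percentage


fleet = [f"sensor-{n:04d}" for n in range(1, 2001)]
plan = batches(fleet, 50)

print(len(plan))               # 40 batches of 50 devices
print(batch_aborts(2, 50, 5))  # False: 4% does not exceed 5%
print(batch_aborts(3, 50, 5))  # True: 6% exceeds 5%
```

With these settings a batch tolerates two failed devices; the third failure stops the rollout before the next batch starts.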
Edge Application Deployment
Container-Based Edge Apps
- name: Deploy edge application
  hosts: retail_stores
  # Note: with become: true the container below runs under root. For a
  # rootless container, run the Podman tasks as an unprivileged user.
  become: true
  tasks:
    - name: Install Podman
      ansible.builtin.package:
        name: podman
        state: present

    - name: Pull application image
      containers.podman.podman_image:
        name: "registry.example.com/edge-app:{{ app_version }}"
        username: "{{ vault_registry_user }}"
        password: "{{ vault_registry_pass }}"

    - name: Deploy edge application
      containers.podman.podman_container:
        name: edge-app
        image: "registry.example.com/edge-app:{{ app_version }}"
        state: started
        restart_policy: always
        ports:
          - "8080:8080"
        volumes:
          - /data/edge-app:/app/data:Z
        env:
          DEVICE_ID: "{{ inventory_hostname }}"
          CENTRAL_API: "{{ central_api_url }}"
          API_TOKEN: "{{ vault_edge_api_token }}"
      no_log: true

    - name: Health check
      ansible.builtin.uri:
        url: "http://localhost:8080/health"
        status_code: 200
      register: health
      until: health.status == 200  # older ansible-core releases only retry when until is set
      retries: 10
      delay: 5
Offline/Disconnected Operations
- name: Prepare offline update package
  hosts: localhost
  tasks:
    - name: Create staging directory
      ansible.builtin.file:
        path: /tmp/offline-packages
        state: directory
        mode: '0755'

    - name: Download packages for ARM64
      ansible.builtin.command: >
        apt-get download --target-release stable
        -o APT::Architecture=arm64
        {{ item }}
      args:
        chdir: /tmp/offline-packages  # apt-get download writes to the current directory
      loop: "{{ offline_packages }}"

    - name: Create offline bundle
      community.general.archive:
        path: /tmp/offline-packages/
        dest: "/tmp/edge-update-{{ ansible_date_time.date }}.tar.gz"
        format: gz

- name: Apply offline update
  hosts: edge_devices
  become: true  # dpkg -i needs root
  tasks:
    - name: Copy update bundle
      ansible.builtin.copy:
        src: "/tmp/edge-update-{{ update_date }}.tar.gz"
        dest: /tmp/update.tar.gz

    - name: Extract and install
      ansible.builtin.shell: |
        cd /tmp && tar xzf update.tar.gz
        dpkg -i *.deb
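For readers who want to see the bundle layout the install step expects, the sketch below builds an equivalent archive with Python's standard tarfile module. It is an illustration under assumptions, not part of the playbook: the package file names are placeholders and create_bundle is a hypothetical helper. The point it shows is that the .deb files sit at the archive root, which is what dpkg -i *.deb relies on after extraction.

```python
import tarfile
import tempfile
from datetime import date
from pathlib import Path


def create_bundle(package_dir: Path, out_dir: Path, stamp: str) -> Path:
    """Pack every .deb in package_dir into edge-update-<stamp>.tar.gz."""
    bundle = out_dir / f"edge-update-{stamp}.tar.gz"
    with tarfile.open(bundle, "w:gz") as tar:
        for deb in sorted(package_dir.glob("*.deb")):
            # arcname drops the directory part so files land at the archive root
            tar.add(deb, arcname=deb.name)
    return bundle


with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp) / "offline-packages"
    pkg_dir.mkdir()
    # Placeholder files standing in for real downloaded packages.
    for name in ("htop_arm64.deb", "vim_arm64.deb"):
        (pkg_dir / name).write_bytes(b"placeholder")

    bundle = create_bundle(pkg_dir, Path(tmp), date.today().isoformat())
    with tarfile.open(bundle) as tar:
        members = tar.getnames()
    print(bundle.name, members)
```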
Monitoring Edge Fleet
- name: Edge fleet health check
  hosts: all
  gather_facts: true
  tasks:
    - name: Read CPU temperature on the device
      # lookup('file', ...) reads the control node's filesystem, so read remotely
      ansible.builtin.command: cat /sys/class/thermal/thermal_zone0/temp
      register: cpu_temp_raw
      changed_when: false
      failed_when: false

    - name: Collect health metrics
      ansible.builtin.set_fact:
        device_health:
          hostname: "{{ inventory_hostname }}"
          uptime: "{{ ansible_uptime_seconds }}"
          cpu_temp: "{{ (cpu_temp_raw.stdout | default(0, true) | int) / 1000 }}"
          disk_pct: "{{ (ansible_mounts[0].size_total - ansible_mounts[0].size_available) / ansible_mounts[0].size_total * 100 }}"
          memory_pct: "{{ (1 - ansible_memfree_mb / ansible_memtotal_mb) * 100 }}"
          last_update: "{{ ansible_date_time.iso8601 }}"
      failed_when: false

    - name: Report to central monitoring
      ansible.builtin.uri:
        url: "{{ monitoring_api }}/devices/{{ inventory_hostname }}/health"
        method: POST
        body_format: json
        body: "{{ device_health }}"
      delegate_to: localhost
      failed_when: false

    - name: Alert on high temperature
      ansible.builtin.debug:
        msg: "WARNING: {{ inventory_hostname }} CPU temp {{ device_health.cpu_temp }}°C"
      when: device_health.cpu_temp | default(0) | float > 80
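The metric arithmetic in the health check can be verified by hand. The Python sketch below reproduces it with made-up readings: the kernel's thermal_zone file reports millidegrees Celsius, and the disk and memory figures are simple used-over-total percentages. The function names and sample values are illustrative, not part of the playbook.

```python
def cpu_temp_celsius(raw_millidegrees: str) -> float:
    """Convert the sysfs thermal reading (millidegrees C) to degrees C."""
    return int(raw_millidegrees.strip()) / 1000


def used_percent(total: float, available: float) -> float:
    """Share of a resource in use, as a percentage."""
    return (total - available) / total * 100


def is_overheating(temp_c: float, limit_c: float = 80.0) -> bool:
    """Mirror the playbook's alert condition: strictly above the limit."""
    return temp_c > limit_c


temp = cpu_temp_celsius("82500\n")
print(temp, is_overheating(temp))                    # 82.5 True
print(used_percent(32_000_000_000, 8_000_000_000))   # 75.0
```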
Best Practices
Use Automation Mesh: Hop nodes at regional hubs keep automation traffic close to the devices and reduce the impact of slow or unreliable WAN links
Rolling updates with small batches: serial: 50 with max_fail_percentage: 5 stops a bad update before it reaches the whole fleet
Offline capability: Edge devices may lose connectivity; support offline update bundles
Minimal base image: A smaller OS footprint means faster updates and fewer vulnerabilities
Rootless containers: Use Podman for container workloads without root privileges
VPN or SSH tunnels: Never expose edge device SSH directly to the internet
Automatic security updates: Enable unattended-upgrades for security patches between managed updates
Health monitoring: Every device reports health metrics; alert on anomalies
Idempotent updates: Edge devices may retry updates after network interruptions, so a repeated run must be safe
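As a sketch of what idempotent updates mean in practice, the Python below models the two checks the firmware playbook relies on: skip the device when it already runs the target version, and accept a download only when its SHA-256 digest matches. The function names and the sample payload are hypothetical.

```python
import hashlib


def plan_update(current_version: str, target_version: str) -> str:
    """Return 'skip' when the device is already on target, else 'update'."""
    return "skip" if current_version.strip() == target_version else "update"


def sha256_matches(payload: bytes, expected_hex: str) -> bool:
    """Check a downloaded payload against its published SHA-256 digest."""
    return hashlib.sha256(payload).hexdigest() == expected_hex.lower()


payload = b"example firmware image"  # stand-in for a real firmware archive
expected = hashlib.sha256(payload).hexdigest()

print(plan_update("1.4.2\n", "1.4.2"))           # skip
print(plan_update("1.4.1", "1.4.2"))             # update
print(sha256_matches(payload, expected))         # True
print(sha256_matches(payload + b"x", expected))  # False
```

A run interrupted halfway can then be repeated safely: devices that finished are skipped, and a truncated download fails the digest check instead of being installed.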
FAQ
How many devices can Ansible manage?
With Automation Mesh: tens of thousands. Use high forks (200+), fact caching, and regional execution nodes. For 10,000+ devices, consider ansible-pull for routine operations.
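A quick back-of-the-envelope calculation shows why the forks setting matters at this scale. Under the default linear strategy, each task runs on at most forks hosts at a time, so the number of sequential waves per task is the host count divided by forks, rounded up. The function name below is illustrative.

```python
import math


def waves_per_task(host_count: int, forks: int) -> int:
    """Number of sequential waves needed to run one task on every host."""
    return math.ceil(host_count / forks)


print(waves_per_task(10_000, 5))    # 2000 waves with Ansible's default of 5 forks
print(waves_per_task(10_000, 200))  # 50 waves with forks raised to 200
```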
What about devices behind NAT?
Use a VPN (WireGuard/OpenVPN) or reverse SSH tunnels. Automation Mesh hop nodes can bridge NAT boundaries with outbound-only connections.
Battery-powered devices?
Minimize SSH sessions. Use ansible-pull on a schedule (e.g., hourly cron). Avoid gathering facts unnecessarily. Keep playbooks short and focused.
Conclusion
Ansible extends enterprise automation to the edge — managing thousands of distributed devices with the same playbook-based approach used for datacenter servers. Combined with Automation Mesh for network topology and rolling updates for safety, Ansible provides a scalable foundation for edge computing and IoT fleet management.