Ansible for Agentic AI: Automate Multi-Agent Systems Infrastructure (2026 Guide)
By Luca Berton · Published 2024-01-01 · Category: installation
Complete guide to using Ansible for agentic AI infrastructure. Deploy multi-agent systems, orchestrate AI agent workflows, manage LLM backends, configure agent.
Agentic AI — autonomous systems that plan, decide, and execute tasks without constant human input — is the defining technology trend of 2026. Gartner lists multiagent systems in its top 10 strategic trends, and enterprises are redesigning operations around AI agents. Ansible plays a critical role in deploying, configuring, and orchestrating the infrastructure these agents run on.
What Is Agentic AI?
Agentic AI systems go beyond chat interfaces. They autonomously: • Plan — break complex goals into subtasks • Execute — call APIs, run code, interact with systems • Collaborate — multiple agents coordinate on workflows • Learn — adapt based on results and feedback
Examples include autonomous code review pipelines, self-healing infrastructure, automated security response, and multi-agent customer service systems.
See also: Ansible for AI Infrastructure: Deploy LLMs, GPUs & ML Pipelines (2026 Guide)
Why Ansible for Agentic AI?
| Challenge | Ansible Solution | |-----------|-----------------| | Deploy LLM backends (vLLM, Ollama, TGI) | Playbooks for GPU server provisioning | | Configure agent frameworks (AutoGen, CrewAI, LangGraph) | Roles for agent runtime setup | | Manage vector databases (Pgvector, Milvus, Qdrant) | Automated DB deployment and scaling | | Orchestrate multi-agent communication | Network and service mesh configuration | | Scale inference endpoints | Dynamic inventory + auto-scaling | | Secure agent-to-agent traffic | TLS, mTLS, and network policy automation |
Deploy an LLM Backend with Ansible
vLLM Inference Server
- name: Deploy vLLM inference server
hosts: gpu_servers
become: true
vars:
model_name: "meta-llama/Llama-3.1-70B-Instruct"
vllm_port: 8000
gpu_memory_utilization: 0.9
tasks:
- name: Install NVIDIA Container Toolkit
ansible.builtin.shell: |
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
args:
creates: /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
- name: Install nvidia-container-toolkit
ansible.builtin.apt:
name: nvidia-container-toolkit
state: present
update_cache: true
- name: Deploy vLLM container
community.docker.docker_container:
name: vllm-inference
image: "vllm/vllm-openai:latest"
state: started
restart_policy: unless-stopped
ports:
- "{{ vllm_port }}:8000"
volumes:
- /models:/root/.cache/huggingface
env:
HUGGING_FACE_HUB_TOKEN: "{{ vault_hf_token }}"
command: >
--model {{ model_name }}
--gpu-memory-utilization {{ gpu_memory_utilization }}
--max-model-len 8192
device_requests:
- driver: nvidia
count: -1
capabilities:
- - gpu
no_log: true
Ollama for Local Agent Development
- name: Deploy Ollama for local AI agents
hosts: dev_servers
become: true
tasks:
- name: Install Ollama
ansible.builtin.shell: curl -fsSL https://ollama.com/install.sh | sh
args:
creates: /usr/local/bin/ollama
- name: Pull models for agents
ansible.builtin.command: "ollama pull {{ item }}"
loop:
- llama3.1:8b # Fast agent reasoning
- codellama:13b # Code generation agent
- nomic-embed-text # Embedding for RAG
register: pull_result
changed_when: "'pulling' in pull_result.stdout"
- name: Configure Ollama for network access
ansible.builtin.copy:
content: |
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
Environment="OLLAMA_ORIGINS=*"
dest: /etc/systemd/system/ollama.service.d/override.conf
notify: restart ollama
See also: AI DevOps Ansible Community on Skool
Deploy a Multi-Agent Framework
AutoGen Studio
- name: Deploy AutoGen multi-agent platform
hosts: agent_servers
become: true
vars:
autogen_port: 8081
llm_endpoint: "http://{{ hostvars['gpu01']['ansible_host'] }}:8000/v1"
tasks:
- name: Create virtual environment
ansible.builtin.pip:
name:
- autogenstudio
- pyautogen[retrievechat]
virtualenv: /opt/autogen/venv
virtualenv_python: python3.11
- name: Deploy agent configuration
ansible.builtin.template:
src: autogen-config.json.j2
dest: /opt/autogen/config.json
mode: '0600'
no_log: true # Contains API keys
- name: Create systemd service
ansible.builtin.copy:
content: |
[Unit]
Description=AutoGen Studio
After=network.target
[Service]
Type=simple
User=autogen
WorkingDirectory=/opt/autogen
ExecStart=/opt/autogen/venv/bin/autogenstudio ui --host 0.0.0.0 --port {{ autogen_port }}
Restart=always
[Install]
WantedBy=multi-user.target
dest: /etc/systemd/system/autogen-studio.service
notify: restart autogen
CrewAI Agent Deployment
- name: Deploy CrewAI agent crew
hosts: agent_servers
vars:
crew_config:
agents:
- name: researcher
role: "Senior Research Analyst"
goal: "Find and analyze technical information"
backstory: "Expert at finding accurate technical data"
llm: "openai/gpt-4o"
- name: writer
role: "Technical Writer"
goal: "Create clear documentation from research"
backstory: "Skilled at translating complex topics"
llm: "openai/gpt-4o"
tasks:
- name: Install CrewAI
ansible.builtin.pip:
name:
- crewai
- crewai-tools
virtualenv: /opt/crewai/venv
- name: Deploy crew configuration
ansible.builtin.copy:
content: "{{ crew_config | to_nice_yaml }}"
dest: /opt/crewai/agents.yaml
mode: '0644'
- name: Deploy tool configurations
ansible.builtin.template:
src: "{{ item }}"
dest: "/opt/crewai/config/{{ item | basename | regex_replace('.j2$', '') }}"
loop:
- templates/tools.yaml.j2
- templates/tasks.yaml.j2
Deploy Vector Database for Agent Memory
- name: Deploy Qdrant vector database for agent RAG
hosts: vector_db
become: true
vars:
qdrant_port: 6333
qdrant_grpc_port: 6334
tasks:
- name: Create data directory
ansible.builtin.file:
path: /var/lib/qdrant
state: directory
owner: "1000"
group: "1000"
- name: Deploy Qdrant
community.docker.docker_container:
name: qdrant
image: qdrant/qdrant:latest
state: started
restart_policy: unless-stopped
ports:
- "{{ qdrant_port }}:6333"
- "{{ qdrant_grpc_port }}:6334"
volumes:
- /var/lib/qdrant:/qdrant/storage
env:
QDRANT__SERVICE__API_KEY: "{{ vault_qdrant_api_key }}"
no_log: true
- name: Create agent memory collection
ansible.builtin.uri:
url: "http://localhost:{{ qdrant_port }}/collections/agent_memory"
method: PUT
body_format: json
body:
vectors:
size: 768
distance: Cosine
optimizers_config:
indexing_threshold: 10000
headers:
api-key: "{{ vault_qdrant_api_key }}"
status_code: [200, 409] # 409 = already exists
no_log: true
See also: Ansible for Domain-Specific AI Models: Deploy & Manage Enterprise DSLMs (2026 Guide)
Secure Agent-to-Agent Communication
- name: Secure multi-agent communication
hosts: agent_servers
become: true
tasks:
- name: Generate agent TLS certificates
community.crypto.x509_certificate:
path: "/etc/ssl/agents/{{ inventory_hostname }}.crt"
privatekey_path: "/etc/ssl/agents/{{ inventory_hostname }}.key"
provider: ownca
ownca_path: /etc/ssl/agents/ca.crt
ownca_privatekey_path: /etc/ssl/agents/ca.key
common_name: "{{ inventory_hostname }}"
subject_alt_name:
- "DNS:{{ inventory_hostname }}"
- "IP:{{ ansible_host }}"
- name: Configure agent network policies
ansible.builtin.template:
src: agent-network-policy.yaml.j2
dest: /etc/agent-policies/network.yaml
vars:
allowed_agent_ports:
- 8000 # LLM inference
- 6333 # Vector DB
- 8081 # Agent UI
- 9090 # Agent metrics
- name: Deploy agent API gateway
community.docker.docker_container:
name: agent-gateway
image: envoyproxy/envoy:v1.31-latest
state: started
ports:
- "443:8443"
volumes:
- /etc/ssl/agents:/etc/ssl/agents:ro
- /etc/envoy/envoy.yaml:/etc/envoy/envoy.yaml:ro
Monitor Agent Performance
- name: Deploy agent observability stack
hosts: monitoring
become: true
tasks:
- name: Deploy Prometheus for agent metrics
community.docker.docker_container:
name: prometheus
image: prom/prometheus:latest
state: started
ports:
- "9090:9090"
volumes:
- /etc/prometheus:/etc/prometheus
- name: Configure agent scrape targets
ansible.builtin.template:
src: prometheus-agents.yml.j2
dest: /etc/prometheus/prometheus.yml
vars:
agent_targets:
- job: vllm_inference
targets: "{{ groups['gpu_servers'] | map('regex_replace', '$', ':8000') }}"
metrics_path: /metrics
- job: agent_framework
targets: "{{ groups['agent_servers'] | map('regex_replace', '$', ':8081') }}"
- job: vector_db
targets: "{{ groups['vector_db'] | map('regex_replace', '$', ':6333') }}"
metrics_path: /metrics
notify: reload prometheus
Full Multi-Agent Architecture Playbook
# site.yml — Deploy complete agentic AI platform
- name: Common setup
hosts: all
roles:
- common
- security-baseline
- monitoring-agent
- name: GPU inference servers
hosts: gpu_servers
roles:
- nvidia-drivers
- container-runtime
- vllm-inference
- name: Vector databases
hosts: vector_db
roles:
- qdrant
- backup-agent
- name: Agent framework servers
hosts: agent_servers
roles:
- agent-runtime
- agent-gateway
- agent-tools
- name: Monitoring
hosts: monitoring
roles:
- prometheus
- grafana
- alertmanager
Inventory for Multi-Agent Deployment
# inventory/agentic-ai.yml
all:
children:
gpu_servers:
hosts:
gpu01:
ansible_host: 10.0.1.10
gpu_count: 4
gpu_type: A100
gpu02:
ansible_host: 10.0.1.11
gpu_count: 2
gpu_type: L40S
vector_db:
hosts:
qdrant01:
ansible_host: 10.0.2.10
storage_size: 500Gi
agent_servers:
hosts:
agent01:
ansible_host: 10.0.3.10
agent_framework: autogen
agent02:
ansible_host: 10.0.3.11
agent_framework: crewai
monitoring:
hosts:
mon01:
ansible_host: 10.0.4.10
Best Practices
Separate inference from agents — GPU servers handle LLM inference; agent logic runs on CPU servers Use API keys for all agent endpoints — Never expose LLM or vector DB ports without authentication Implement rate limiting — Agentic loops can spiral; add token budgets and call limits Monitor token usage — Track cost per agent, per task, per workflow Version agent configurations — Store agent definitions in Git, deploy with Ansible Test with check mode —ansible-playbook --check before deploying agent changes
Use Ansible Vault — All API keys, tokens, and credentials encrypted
FAQ
What is agentic AI and why does it need infrastructure automation?
Agentic AI systems are autonomous AI agents that plan and execute tasks independently. They need LLM inference servers, vector databases, agent frameworks, and networking — all infrastructure that Ansible automates for consistent, repeatable deployments.
Can Ansible deploy multi-agent systems like AutoGen or CrewAI?
Yes. Ansible can provision GPU servers, deploy LLM backends (vLLM, Ollama), install agent frameworks (AutoGen, CrewAI, LangGraph), set up vector databases for agent memory, and configure secure agent-to-agent communication.
How do I scale agentic AI infrastructure with Ansible?
Use dynamic inventory to discover GPU resources, deploy additional inference endpoints with playbooks, and use serial for rolling updates. Combine with Kubernetes (via the k8s module) for elastic scaling of agent workloads.
What security considerations exist for multi-agent systems?
Secure agent communication with mTLS, authenticate all API endpoints, implement token budgets to prevent runaway costs, isolate agent networks, and use Ansible Vault for all secrets. Monitor agent actions with comprehensive logging.
Conclusion
Agentic AI is moving from demos to production in 2026. Ansible provides the infrastructure automation layer that makes multi-agent systems deployable, reproducible, and manageable at scale. From GPU provisioning to agent framework deployment to security hardening, Ansible playbooks turn complex AI architectures into version-controlled, repeatable infrastructure.
Related Articles
• Ansible Kubernetes k8s Module • Ansible Docker Container Module • Ansible for AWS: Complete GuideCategory: installation