Ansible for Agentic AI: Automate Multi-Agent Systems Infrastructure (2026 Guide)

By Luca Berton · Published 2024-01-01 · Category: installation

Complete guide to using Ansible for agentic AI infrastructure. Deploy multi-agent systems, orchestrate AI agent workflows, manage LLM backends, configure agent.

Agentic AI, meaning autonomous systems that plan, decide, and execute tasks without constant human input, is one of the defining technology trends of 2026. Gartner lists multiagent systems among its top 10 strategic technology trends for 2026, and enterprises are redesigning operations around AI agents. Ansible plays a critical role in deploying, configuring, and orchestrating the infrastructure these agents run on.

What Is Agentic AI?

Agentic AI systems go beyond chat interfaces. They autonomously:

Plan — break complex goals into subtasks
Execute — call APIs, run code, interact with systems
Collaborate — multiple agents coordinate on workflows
Learn — adapt based on results and feedback

Examples include autonomous code review pipelines, self-healing infrastructure, automated security response, and multi-agent customer service systems.

Why Ansible for Agentic AI?

Challenge	Ansible Solution
Deploy LLM backends (vLLM, Ollama, TGI)	Playbooks for GPU server provisioning
Configure agent frameworks (AutoGen, CrewAI, LangGraph)	Roles for agent runtime setup
Manage vector databases (pgvector, Milvus, Qdrant)	Automated DB deployment and scaling
Orchestrate multi-agent communication	Network and service mesh configuration
Scale inference endpoints	Dynamic inventory + auto-scaling
Secure agent-to-agent traffic	TLS, mTLS, and network policy automation

Deploy an LLM Backend with Ansible

vLLM Inference Server

- name: Deploy vLLM inference server
  hosts: gpu_servers
  become: true
  vars:
    model_name: "meta-llama/Llama-3.1-70B-Instruct"
    vllm_port: 8000
    gpu_memory_utilization: 0.9

  tasks:
    - name: Add NVIDIA Container Toolkit apt repository
      ansible.builtin.shell: |
        curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
        curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
          sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
          tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
      args:
        creates: /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

    - name: Install nvidia-container-toolkit
      ansible.builtin.apt:
        name: nvidia-container-toolkit
        state: present
        update_cache: true

    - name: Deploy vLLM container
      community.docker.docker_container:
        name: vllm-inference
        image: "vllm/vllm-openai:latest"
        state: started
        restart_policy: unless-stopped
        ports:
          - "{{ vllm_port }}:8000"
        volumes:
          - /models:/root/.cache/huggingface
        env:
          HUGGING_FACE_HUB_TOKEN: "{{ vault_hf_token }}"
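        # Assumption: gpu_count is set per host in the inventory. A 70B model in
        # 16-bit precision needs roughly 140 GB of GPU memory, so it has to be
        # sharded across several GPUs or replaced with a smaller model.
        command: >
          --model {{ model_name }}
          --tensor-parallel-size {{ gpu_count | default(1) }}
          --gpu-memory-utilization {{ gpu_memory_utilization }}
          --max-model-len 8192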
        device_requests:
          - driver: nvidia
            count: -1
            capabilities:
              - - gpu
      no_log: true
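
A started container does not mean the API is ready, since a large model can take several minutes to load. The play below is a minimal sketch of a readiness check. It assumes vLLM's OpenAI-compatible server answers on `/health` and `/v1/models` at the published port, and it reuses `model_name` and `vllm_port` from the play above. The retry counts are illustrative guesses, not tuned values.

```yaml
- name: Verify vLLM is serving the expected model
  hosts: gpu_servers
  gather_facts: false
  vars:
    model_name: "meta-llama/Llama-3.1-70B-Instruct"
    vllm_port: 8000
  tasks:
    - name: Wait for the health endpoint to answer
      ansible.builtin.uri:
        url: "http://localhost:{{ vllm_port }}/health"
        status_code: 200
      register: vllm_health
      until: vllm_health.status == 200
      retries: 60   # assumption: allow about 10 minutes for model loading
      delay: 10

    - name: List served models
      ansible.builtin.uri:
        url: "http://localhost:{{ vllm_port }}/v1/models"
        return_content: true
      register: vllm_models

    - name: Fail if the configured model is missing
      ansible.builtin.assert:
        that:
          - model_name in (vllm_models.json.data | map(attribute='id') | list)
        fail_msg: "{{ model_name }} is not served on port {{ vllm_port }}"
```

Running this as a separate play keeps the deployment idempotent while still failing fast when a model did not load.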

Ollama for Local Agent Development

- name: Deploy Ollama for local AI agents
  hosts: dev_servers
  become: true
  tasks:
    - name: Install Ollama
      ansible.builtin.shell: curl -fsSL https://ollama.com/install.sh | sh
      args:
        creates: /usr/local/bin/ollama

    - name: Pull models for agents
      ansible.builtin.command: "ollama pull {{ item }}"
      loop:
        - llama3.1:8b        # Fast agent reasoning
        - codellama:13b      # Code generation agent
        - nomic-embed-text   # Embedding for RAG
      register: pull_result
      changed_when: "'pulling' in pull_result.stdout"

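    - name: Create systemd drop-in directory
      ansible.builtin.file:
        path: /etc/systemd/system/ollama.service.d
        state: directory
        mode: '0755'

    - name: Configure Ollama for network access
      ansible.builtin.copy:
        # Listening on 0.0.0.0 with OLLAMA_ORIGINS=* exposes an unauthenticated API.
        # Keep this to trusted development networks or put a firewall in front of it.
        content: |
          [Service]
          Environment="OLLAMA_HOST=0.0.0.0"
          Environment="OLLAMA_ORIGINS=*"
        dest: /etc/systemd/system/ollama.service.d/override.conf
        mode: '0644'
      notify: restart ollama

  handlers:
    - name: restart ollama
      ansible.builtin.systemd:
        name: ollama
        state: restarted
        daemon_reload: true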
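
The `changed_when` check on the pull task depends on the exact text the CLI prints, which may differ between Ollama versions. An alternative sketch is to ask the Ollama API which models are already present and pull only the missing ones. It assumes the default API port 11434 and that `/api/tags` returns a `models` list with a `name` field per model. The variable `agent_models` is hypothetical, and its tags are written out in full (for example `nomic-embed-text:latest`) so that they can be compared with the names the API reports.

```yaml
- name: Pull only missing Ollama models
  hosts: dev_servers
  become: true
  gather_facts: false
  vars:
    agent_models:   # hypothetical variable; explicit tags make the comparison exact
      - llama3.1:8b
      - codellama:13b
      - nomic-embed-text:latest
  tasks:
    - name: List models already present on the host
      ansible.builtin.uri:
        url: http://localhost:11434/api/tags
        return_content: true
      register: ollama_tags

    - name: Pull models that are not present yet
      ansible.builtin.command: "ollama pull {{ item }}"
      loop: >-
        {{ agent_models
           | difference(ollama_tags.json.models | default([], true)
                        | map(attribute='name') | list) }}
      changed_when: true   # this task only runs for models that were missing
```

On a second run the loop is empty, so the play reports no changes.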

Deploy a Multi-Agent Framework

AutoGen Studio

- name: Deploy AutoGen multi-agent platform
  hosts: agent_servers
  become: true
  vars:
    autogen_port: 8081
    llm_endpoint: "http://{{ hostvars['gpu01']['ansible_host'] }}:8000/v1"

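  tasks:
    - name: Create autogen service user
      ansible.builtin.user:
        name: autogen
        system: true
        home: /opt/autogen
        shell: /usr/sbin/nologin

    - name: Install AutoGen packages into a virtual environment
      ansible.builtin.pip:
        name:
          - autogenstudio
          - pyautogen[retrievechat]
        virtualenv: /opt/autogen/venv
        virtualenv_python: python3.11

    - name: Deploy agent configuration
      ansible.builtin.template:
        src: autogen-config.json.j2
        dest: /opt/autogen/config.json
        owner: autogen   # the service runs as this user and must be able to read the file
        mode: '0600'
      no_log: true   # Contains API keys

    - name: Create systemd service
      ansible.builtin.copy:
        content: |
          [Unit]
          Description=AutoGen Studio
          After=network.target

          [Service]
          Type=simple
          User=autogen
          WorkingDirectory=/opt/autogen
          ExecStart=/opt/autogen/venv/bin/autogenstudio ui --host 0.0.0.0 --port {{ autogen_port }}
          Restart=always

          [Install]
          WantedBy=multi-user.target
        dest: /etc/systemd/system/autogen-studio.service
        mode: '0644'
      notify: restart autogen

  handlers:
    - name: restart autogen
      ansible.builtin.systemd:
        name: autogen-studio
        state: restarted
        enabled: true
        daemon_reload: true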

CrewAI Agent Deployment

- name: Deploy CrewAI agent crew
  hosts: agent_servers
  become: true   # needed to write under /opt
  vars:
    crew_config:
      agents:
        - name: researcher
          role: "Senior Research Analyst"
          goal: "Find and analyze technical information"
          backstory: "Expert at finding accurate technical data"
          llm: "openai/gpt-4o"
        - name: writer
          role: "Technical Writer"
          goal: "Create clear documentation from research"
          backstory: "Skilled at translating complex topics"
          llm: "openai/gpt-4o"

  tasks:
    - name: Install CrewAI
      ansible.builtin.pip:
        name:
          - crewai
          - crewai-tools
        virtualenv: /opt/crewai/venv

    - name: Deploy crew configuration
      ansible.builtin.copy:
        content: "{{ crew_config | to_nice_yaml }}"
        dest: /opt/crewai/agents.yaml
        mode: '0644'

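    - name: Create tool configuration directory
      ansible.builtin.file:
        path: /opt/crewai/config
        state: directory
        mode: '0755'

    - name: Deploy tool configurations
      ansible.builtin.template:
        src: "{{ item }}"
        # splitext strips only the trailing .j2 extension (tools.yaml.j2 -> tools.yaml)
        dest: "/opt/crewai/config/{{ item | basename | splitext | first }}"
        mode: '0644'
      loop:
        - templates/tools.yaml.j2
        - templates/tasks.yaml.j2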

Deploy Vector Database for Agent Memory

- name: Deploy Qdrant vector database for agent RAG
  hosts: vector_db
  become: true
  vars:
    qdrant_port: 6333
    qdrant_grpc_port: 6334

  tasks:
    - name: Create data directory
      ansible.builtin.file:
        path: /var/lib/qdrant
        state: directory
        owner: "1000"
        group: "1000"

    - name: Deploy Qdrant
      community.docker.docker_container:
        name: qdrant
        image: qdrant/qdrant:latest
        state: started
        restart_policy: unless-stopped
        ports:
          - "{{ qdrant_port }}:6333"
          - "{{ qdrant_grpc_port }}:6334"
        volumes:
          - /var/lib/qdrant:/qdrant/storage
        env:
          QDRANT__SERVICE__API_KEY: "{{ vault_qdrant_api_key }}"
      no_log: true

    - name: Create agent memory collection
      ansible.builtin.uri:
        url: "http://localhost:{{ qdrant_port }}/collections/agent_memory"
        method: PUT
        body_format: json
        body:
          vectors:
            size: 768
            distance: Cosine
          optimizers_config:
            indexing_threshold: 10000
        headers:
          api-key: "{{ vault_qdrant_api_key }}"
        status_code: [200, 409]   # 409 = already exists
      no_log: true
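
Qdrant may need a few seconds after the container starts before it accepts requests, so the collection call can fail on a first run. The following sketch waits for the REST port and then reads the collection back. It assumes the `GET /collections/agent_memory` response nests the vector size under `result.config.params.vectors.size`, and that the embeddings come from a 768-dimension model such as the `nomic-embed-text` model pulled earlier.

```yaml
- name: Wait for Qdrant and verify the collection
  hosts: vector_db
  gather_facts: false
  vars:
    qdrant_port: 6333
  tasks:
    - name: Wait until the REST port accepts connections
      ansible.builtin.wait_for:
        host: 127.0.0.1
        port: "{{ qdrant_port }}"
        timeout: 60

    - name: Read back the collection definition
      ansible.builtin.uri:
        url: "http://localhost:{{ qdrant_port }}/collections/agent_memory"
        headers:
          api-key: "{{ vault_qdrant_api_key }}"
      register: collection_info
      no_log: true

    - name: Check the vector size matches the embedding model
      ansible.builtin.assert:
        that:
          - collection_info.json.result.config.params.vectors.size == 768
        fail_msg: "agent_memory does not use 768-dimension vectors"
```

The `wait_for` task can also be placed directly before the collection creation task in the play above.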

Secure Agent-to-Agent Communication

- name: Secure multi-agent communication
  hosts: agent_servers
  become: true
  tasks:
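    - name: Generate agent private key
      community.crypto.openssl_privatekey:
        path: "/etc/ssl/agents/{{ inventory_hostname }}.key"
        mode: '0600'

    - name: Generate certificate signing request
      community.crypto.openssl_csr:
        path: "/etc/ssl/agents/{{ inventory_hostname }}.csr"
        privatekey_path: "/etc/ssl/agents/{{ inventory_hostname }}.key"
        common_name: "{{ inventory_hostname }}"
        subject_alt_name:
          - "DNS:{{ inventory_hostname }}"
          - "IP:{{ ansible_host }}"

    # The ownca provider expects the CA certificate and key to exist on the managed host.
    - name: Sign agent TLS certificate with the internal CA
      community.crypto.x509_certificate:
        path: "/etc/ssl/agents/{{ inventory_hostname }}.crt"
        csr_path: "/etc/ssl/agents/{{ inventory_hostname }}.csr"
        provider: ownca
        ownca_path: /etc/ssl/agents/ca.crt
        ownca_privatekey_path: /etc/ssl/agents/ca.key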

    - name: Configure agent network policies
      ansible.builtin.template:
        src: agent-network-policy.yaml.j2
        dest: /etc/agent-policies/network.yaml
      vars:
        allowed_agent_ports:
          - 8000   # LLM inference
          - 6333   # Vector DB
          - 8081   # Agent UI
          - 9090   # Agent metrics

    - name: Deploy agent API gateway
      community.docker.docker_container:
        name: agent-gateway
        image: envoyproxy/envoy:v1.31-latest
        state: started
        ports:
          - "443:8443"
        volumes:
          - /etc/ssl/agents:/etc/ssl/agents:ro
          - /etc/envoy/envoy.yaml:/etc/envoy/envoy.yaml:ro
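
Certificates issued this way eventually expire, and an expired certificate breaks agent-to-agent traffic. A small check play, sketched here with `community.crypto.x509_certificate_info`, can flag certificates that are close to expiry. The 30-day threshold and the key name `in_30_days` are illustrative choices.

```yaml
- name: Check agent certificate lifetime
  hosts: agent_servers
  become: true
  gather_facts: false
  tasks:
    - name: Inspect the agent certificate
      community.crypto.x509_certificate_info:
        path: "/etc/ssl/agents/{{ inventory_hostname }}.crt"
        valid_at:
          in_30_days: "+30d"   # illustrative threshold
      register: agent_cert

    - name: Fail when the certificate is expired or expires within 30 days
      ansible.builtin.assert:
        that:
          - not agent_cert.expired
          - agent_cert.valid_at.in_30_days
        fail_msg: "Certificate for {{ inventory_hostname }} is expired or expires soon"
```

Scheduling this play regularly gives early warning before certificates need to be reissued.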

Monitor Agent Performance

- name: Deploy agent observability stack
  hosts: monitoring
  become: true
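  tasks:
    - name: Create Prometheus configuration directory
      ansible.builtin.file:
        path: /etc/prometheus
        state: directory
        mode: '0755'

    # Render the configuration before the container starts so Prometheus finds it.
    - name: Configure agent scrape targets
      ansible.builtin.template:
        src: prometheus-agents.yml.j2
        dest: /etc/prometheus/prometheus.yml
        mode: '0644'
      vars:
        agent_targets:
          - job: vllm_inference
            targets: "{{ groups['gpu_servers'] | map('regex_replace', '$', ':8000') | list }}"
            metrics_path: /metrics
          - job: agent_framework
            targets: "{{ groups['agent_servers'] | map('regex_replace', '$', ':8081') | list }}"
          - job: vector_db
            targets: "{{ groups['vector_db'] | map('regex_replace', '$', ':6333') | list }}"
            metrics_path: /metrics
      notify: reload prometheus

    - name: Deploy Prometheus for agent metrics
      community.docker.docker_container:
        name: prometheus
        image: prom/prometheus:latest
        state: started
        ports:
          - "9090:9090"
        volumes:
          - /etc/prometheus:/etc/prometheus

  handlers:
    # Prometheus reloads its configuration when it receives SIGHUP.
    - name: reload prometheus
      ansible.builtin.command: docker kill --signal=HUP prometheus
      changed_when: true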
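
The template `prometheus-agents.yml.j2` is not shown in this guide. As a rough illustration, a rendering for the inventory used here might look like the file below. The `scrape_interval` value is an assumption, and the targets use inventory hostnames, so those names must resolve from the monitoring host.

```yaml
# /etc/prometheus/prometheus.yml (hypothetical rendering of prometheus-agents.yml.j2)
global:
  scrape_interval: 15s   # assumption, not taken from the guide
scrape_configs:
  - job_name: vllm_inference
    metrics_path: /metrics
    static_configs:
      - targets:
          - gpu01:8000
          - gpu02:8000
  - job_name: agent_framework
    static_configs:
      - targets:
          - agent01:8081
          - agent02:8081
  - job_name: vector_db
    metrics_path: /metrics
    static_configs:
      - targets:
          - qdrant01:6333
```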

Full Multi-Agent Architecture Playbook

# site.yml — Deploy complete agentic AI platform
- name: Common setup
  hosts: all
  roles:
    - common
    - security-baseline
    - monitoring-agent

- name: GPU inference servers
  hosts: gpu_servers
  roles:
    - nvidia-drivers
    - container-runtime
    - vllm-inference

- name: Vector databases
  hosts: vector_db
  roles:
    - qdrant
    - backup-agent

- name: Agent framework servers
  hosts: agent_servers
  roles:
    - agent-runtime
    - agent-gateway
    - agent-tools

- name: Monitoring
  hosts: monitoring
  roles:
    - prometheus
    - grafana
    - alertmanager

Inventory for Multi-Agent Deployment

# inventory/agentic-ai.yml
all:
  children:
    gpu_servers:
      hosts:
        gpu01:
          ansible_host: 10.0.1.10
          gpu_count: 4
          gpu_type: A100
        gpu02:
          ansible_host: 10.0.1.11
          gpu_count: 2
          gpu_type: L40S
    vector_db:
      hosts:
        qdrant01:
          ansible_host: 10.0.2.10
          storage_size: 500Gi
    agent_servers:
      hosts:
        agent01:
          ansible_host: 10.0.3.10
          agent_framework: autogen
        agent02:
          ansible_host: 10.0.3.11
          agent_framework: crewai
    monitoring:
      hosts:
        mon01:
          ansible_host: 10.0.4.10

Best Practices

Separate inference from agents — GPU servers handle LLM inference; agent logic runs on CPU servers
Use API keys for all agent endpoints — Never expose LLM or vector DB ports without authentication
Implement rate limiting — Agentic loops can spiral; add token budgets and call limits
Monitor token usage — Track cost per agent, per task, per workflow
Version agent configurations — Store agent definitions in Git, deploy with Ansible
Test with check mode — ansible-playbook --check before deploying agent changes
Use Ansible Vault — All API keys, tokens, and credentials encrypted
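
The playbooks in this guide reference `vault_hf_token` and `vault_qdrant_api_key`. One common layout, sketched here with a hypothetical file path and placeholder values, keeps them in a single variables file that is encrypted with Ansible Vault before it is committed.

```yaml
# group_vars/all/vault.yml (hypothetical path; shown before encryption)
# Encrypt with:  ansible-vault encrypt group_vars/all/vault.yml
# Run with:      ansible-playbook -i inventory/agentic-ai.yml site.yml --ask-vault-pass
vault_hf_token: "REPLACE_WITH_HUGGING_FACE_TOKEN"
vault_qdrant_api_key: "REPLACE_WITH_QDRANT_API_KEY"
```

Prefixing the names with `vault_` makes it easy to see in a playbook which values come from the encrypted file.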

FAQ

What is agentic AI and why does it need infrastructure automation?

Agentic AI systems are autonomous AI agents that plan and execute tasks independently. They need LLM inference servers, vector databases, agent frameworks, and networking — all infrastructure that Ansible automates for consistent, repeatable deployments.

Can Ansible deploy multi-agent systems like AutoGen or CrewAI?

Yes. Ansible can provision GPU servers, deploy LLM backends (vLLM, Ollama), install agent frameworks (AutoGen, CrewAI, LangGraph), set up vector databases for agent memory, and configure secure agent-to-agent communication.

How do I scale agentic AI infrastructure with Ansible?

Use dynamic inventory to discover GPU resources, deploy additional inference endpoints with playbooks, and set the serial keyword on a play for rolling updates. Combine with Kubernetes (via the kubernetes.core.k8s module) for elastic scaling of agent workloads.
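
A rolling update can be sketched as a play that handles one inference host at a time and stops at the first failure. The role name `vllm-inference` comes from site.yml. The timeout is an assumption, and an open port is only a rough signal that the server is ready.

```yaml
- name: Rolling update of inference servers
  hosts: gpu_servers
  become: true
  serial: 1                 # update one host at a time
  max_fail_percentage: 0    # stop the rollout on the first failed host
  roles:
    - vllm-inference        # role name taken from site.yml
  post_tasks:
    - name: Wait for the inference port before moving to the next host
      ansible.builtin.wait_for:
        port: 8000
        timeout: 600        # assumption: generous allowance for model loading
```

With `serial: 1`, the remaining hosts keep serving requests while one host is being updated.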

What security considerations exist for multi-agent systems?

Secure agent communication with mTLS, authenticate all API endpoints, implement token budgets to prevent runaway costs, isolate agent networks, and use Ansible Vault for all secrets. Monitor agent actions with comprehensive logging.

Conclusion

Agentic AI is moving from demos to production in 2026. Ansible provides the infrastructure automation layer that makes multi-agent systems deployable, reproducible, and manageable at scale. From GPU provisioning to agent framework deployment to security hardening, Ansible playbooks turn complex AI architectures into version-controlled, repeatable infrastructure.
