Setting Up Neo4j GenAI Environment on Fedora Using Ansible
By Luca Berton · Published 2024-01-01 · Category: installation
Learn how to set up a Neo4j GenAI environment on Fedora using Ansible, including full-text and vector indexing, and OpenAI integration

Setting Up a Neo4j GenAI Environment on Fedora
In this article, we will walk through the steps to set up a Neo4j GenAI Python environment on a Fedora system using Ansible automation. This setup will enable you to deploy a Retrieval-Augmented Generation (RAG) system that integrates with a Neo4j graph database and utilizes OpenAI’s language models for interactive data retrieval and analysis.
Prerequisites
Before starting, ensure you have:
• A Fedora system with root access.
• Ansible installed on your control machine.
• Access to the OpenAI API and a valid API key.
• Credentials and URI for your Neo4j database.
Additionally, we will be using the Northwind dataset to populate our Neo4j database. For more information on importing the Northwind dataset into Neo4j, refer to the Northwind Dataset Guide.
See also: Evaluating RAG Solutions by Luca Berton on Pluralsight
Step-by-Step Setup
Create an Ansible Playbook
Create a file named setup_neo4j_genai.yml with the following content:
---
- name: Set up Neo4j GenAI Python environment on Fedora
  hosts: all
  become: true
  tasks:
    - name: Install necessary system packages
      ansible.builtin.dnf:
        name:
          - python3
          - python3-pip
        state: present
        update_cache: true

    - name: Install necessary Python packages using pip
      ansible.builtin.pip:
        name:
          - neo4j
          - neo4j_genai
          - openai
        state: present
        executable: /usr/bin/pip3

    # Note: with mode 0644 the key in /etc/environment is readable by every local user.
    - name: Set OpenAI API key as environment variable
      ansible.builtin.lineinfile:
        path: /etc/environment
        line: "OPENAI_API_KEY={{ openai_key }}"
        create: true
        state: present
        mode: '0644'

    # This only affects the shell spawned for this task; the run task below
    # receives the key through its own environment block.
    - name: Source environment file to apply changes
      ansible.builtin.shell: source /etc/environment

    - name: Create configuration file for Neo4j connection
      ansible.builtin.copy:
        dest: /etc/neo4j_genai_config.py
        content: |
          from neo4j import GraphDatabase
          URI = "{{ neo4j_uri }}"
          AUTH = ("{{ neo4j_auth.split(':')[0] }}", "{{ neo4j_auth.split(':')[1] }}")
          driver = GraphDatabase.driver(URI, auth=AUTH)
        mode: '0644'

    - name: Create application directory
      ansible.builtin.file:
        path: /opt/neo4j_genai
        state: directory
        mode: '0755'

    - name: Copy Python application to the server
      ansible.builtin.copy:
        src: files/application.py
        dest: /opt/neo4j_genai/application.py
        mode: '0755'

    - name: Run the Neo4j GenAI application
      ansible.builtin.command: python3 /opt/neo4j_genai/application.py
      environment:
        OPENAI_API_KEY: "{{ openai_key }}"
        NEO4J_URI: "{{ neo4j_uri }}"
        NEO4J_AUTH0: "{{ neo4j_auth.split(':')[0] }}"
        NEO4J_AUTH1: "{{ neo4j_auth.split(':')[1] }}"
      register: application_output
      ignore_errors: true

    - name: Debug application output
      ansible.builtin.debug:
        var: application_output.stdout_lines
This playbook automates the setup of the Neo4j GenAI environment by installing the necessary packages, configuring environment variables, and deploying the application.
Define Variables
Create a variables file named servers.yml to store sensitive information, replacing the placeholder values with your actual credentials:
python_version: "3.12"
neo4j_uri: "your-neo4j-uri"
neo4j_auth: "username:password"
openai_key: "your-openai-api-key"
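The playbook derives the Neo4j username and password by splitting neo4j_auth on ':' and taking the first two pieces, which would truncate a password that itself contains a colon. A minimal Python sketch of a more tolerant split follows; split_auth is a hypothetical helper used only for illustration, not part of the playbook or of any library. The same idea should carry over to the Jinja2 expressions as neo4j_auth.split(':', 1), since Jinja2 calls the underlying Python string method.

```python
def split_auth(auth: str) -> tuple[str, str]:
    """Split 'username:password' at the first colon only.

    Hypothetical helper: everything after the first colon is treated
    as the password, so colons inside the password are preserved.
    """
    username, separator, password = auth.partition(":")
    if not separator or not username:
        raise ValueError("expected credentials in the form 'username:password'")
    return username, password


if __name__ == "__main__":
    # A password containing a colon stays intact.
    print(split_auth("neo4j:pa:ss"))  # ('neo4j', 'pa:ss')
```

Passing the username and password as two separate variables would avoid the parsing question entirely, at the cost of one more entry in servers.yml.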
Note: In a production environment, do not keep the Neo4j URI, credentials, and OpenAI API key in plain text; use placeholders in the file and supply the real values through a secure mechanism such as Ansible Vault. To locate your key, follow the guide Where do I find my OpenAI API Key?
Create the Application File
Save the following Python code in files/application.py:
import os
import time

from neo4j import GraphDatabase
from neo4j_genai.embeddings.openai import OpenAIEmbeddings
from neo4j_genai.retrievers import HybridRetriever
from neo4j_genai.llm import OpenAILLM
from neo4j_genai.generation import GraphRAG
from openai import OpenAIError

# Configure Neo4j and OpenAI from the environment variables set by the playbook
URI = os.getenv("NEO4J_URI")
AUTH = (os.getenv("NEO4J_AUTH0"), os.getenv("NEO4J_AUTH1"))
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")

# Initialize driver and retriever
driver = GraphDatabase.driver(URI, auth=AUTH)
embedder = OpenAIEmbeddings(model="text-embedding-ada-002")
retriever = HybridRetriever(
    driver=driver,
    vector_index_name="product_vector",
    fulltext_index_name="product_name_index",
    embedder=embedder,
    return_properties=["productID", "productName", "unitPrice"],
)

# Set up RAG pipeline
llm = OpenAILLM(model_name="gpt-4", model_params={"temperature": 0})
rag = GraphRAG(retriever=retriever, llm=llm)

# Perform retrieval with retry logic
query_text = "What is the ID and price of the product Queso Cabrales?"
print(query_text)
max_retries = 5
retry_delay = 10  # seconds
for attempt in range(max_retries):
    try:
        response = rag.search(query_text=query_text, retriever_config={"top_k": 3})
        print(response.answer)
        break  # Exit the loop if successful
    except OpenAIError as e:
        # Retry only on rate-limit or quota errors; report anything else and stop
        if "rate limit" in str(e).lower() or "insufficient_quota" in str(e).lower():
            print(f"Rate limit exceeded or insufficient quota. Retrying in {retry_delay} seconds...")
            time.sleep(retry_delay)
        else:
            print(f"OpenAI Error: {e}")
            break
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
        break
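The retry loop above waits a fixed 10 seconds between attempts and ends silently once all attempts are used up. One possible refinement, sketched here under the assumption that a retryable failure can be recognized by a caller-supplied predicate, is a small helper with exponential backoff that re-raises the last error when retries run out. The name retry_with_backoff and its parameters are hypothetical and are not part of neo4j_genai or openai.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    func: Callable[[], T],
    is_retryable: Callable[[Exception], bool],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func until it succeeds, doubling the wait after each retryable failure.

    Non-retryable errors, and the error from the final attempt, are re-raised
    so the caller (and the Ansible task) can see that the run failed.
    """
    for attempt in range(max_retries):
        try:
            return func()
        except Exception as exc:
            last_attempt = attempt == max_retries - 1
            if last_attempt or not is_retryable(exc):
                raise
            # Waits base_delay, then 2x, 4x, ... between attempts.
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_retries must be at least 1")
```

In the application, func could wrap the rag.search call in a lambda, and is_retryable could check the error message for "rate limit" or "insufficient_quota" as the original loop does. The sleep parameter exists only so the delay can be replaced when testing.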
This application connects to a Neo4j database, retrieves data using OpenAI's models, and displays the results.
Define a Full-Text Index
Full-text indexes optimize text searches on specific properties of nodes or relationships. To define a full-text index on Product nodes, use the following Cypher command. It uses the CREATE FULLTEXT INDEX syntax of Neo4j 5, where the older db.index.fulltext.createNodeIndex procedure is no longer available:
CREATE FULLTEXT INDEX product_name_index FOR (n:Product) ON EACH [n.productName]
This command creates a full-text index named product_name_index on the Product label, indexing the productName property. For more details on full-text indexes, visit the Neo4j Full-Text Indexes Guide.
Define a Vector Index
Vector indexes enable the storage and retrieval of vector embeddings, which are useful for similarity search. To define a vector index on Product nodes, use the following Cypher command:
CREATE VECTOR INDEX product_vector FOR (n:Product) ON (n.embedding)
This command creates a vector index named product_vector on the Product label, indexing the embedding property. The index only covers nodes that already have an embedding property, so the Product nodes need stored embeddings (generated with the same text-embedding-ada-002 model the application uses) before vector search can return matches. For more information on vector indexes, refer to the Neo4j Vector Indexes Guide.
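Both index definitions can also be applied from Python, which makes the step repeatable. The sketch below is one way to do it, assuming a Neo4j 5 release recent enough to support the CREATE VECTOR INDEX command, a neo4j driver version that provides execute_query, and 1536 dimensions with cosine similarity to match text-embedding-ada-002. The helper names are hypothetical. The explicit OPTIONS clause and IF NOT EXISTS are additions to the commands shown above, chosen so that the statements state the embedding size and remain safe to re-run.

```python
def fulltext_index_statement(name: str, label: str, properties: list[str]) -> str:
    """Build a CREATE FULLTEXT INDEX statement for the given node label."""
    props = ", ".join(f"n.{prop}" for prop in properties)
    return f"CREATE FULLTEXT INDEX {name} IF NOT EXISTS FOR (n:{label}) ON EACH [{props}]"


def vector_index_statement(
    name: str,
    label: str,
    prop: str,
    dimensions: int = 1536,  # assumption: output size of text-embedding-ada-002
    similarity: str = "cosine",  # assumption: cosine similarity
) -> str:
    """Build a CREATE VECTOR INDEX statement with explicit index options."""
    return (
        f"CREATE VECTOR INDEX {name} IF NOT EXISTS "
        f"FOR (n:{label}) ON (n.{prop}) "
        "OPTIONS {indexConfig: {"
        f"`vector.dimensions`: {dimensions}, "
        f"`vector.similarity_function`: '{similarity}'"
        "}}"
    )


def create_indexes(driver) -> list[str]:
    """Run both statements through a neo4j driver and return what was executed.

    Identifiers are interpolated directly into the Cypher text, so only
    trusted, hard-coded names should be passed to the builders.
    """
    statements = [
        fulltext_index_statement("product_name_index", "Product", ["productName"]),
        vector_index_statement("product_vector", "Product", "embedding"),
    ]
    for statement in statements:
        driver.execute_query(statement)
    return statements
```

With the driver object from application.py, calling create_indexes(driver) once before the first query would be enough.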
Run the Playbook
Execute the Ansible playbook with the following command:
ansible-playbook -i inventory setup_neo4j_genai.yml --extra-vars="@servers.yml"
Replace inventory with your Ansible inventory file that includes your Fedora host.
Verify the Setup
After the playbook runs successfully, verify that the application is working as expected by checking the output of the Ansible debug task, which should resemble the following:
TASK [Debug application output] ***************************************************************
ok: [fedora.example.com] => {
    "application_output.stdout_lines": [
        "What is the ID and price of the product Queso Cabrales?",
        "The ID of the product Queso Cabrales is '11' and the price is 21.0."
    ]
}
PLAY RECAP ************************************************************************************
fedora.example.com : ok=10 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Because the run task sets ignore_errors, a failed application run still lets the play finish, so check the answer text itself rather than the recap alone. You can also access the server and review the application logs.
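Beyond reading the playbook output, the two indexes the retriever depends on can be checked directly. The following sketch assumes a neo4j driver version that provides execute_query and relies on the SHOW INDEXES command; missing_indexes and REQUIRED_INDEXES are hypothetical names introduced for illustration.

```python
# Index names taken from the HybridRetriever configuration in application.py.
REQUIRED_INDEXES = frozenset({"product_vector", "product_name_index"})


def missing_indexes(driver, required=REQUIRED_INDEXES) -> set[str]:
    """Return the required index names that are absent or not yet ONLINE."""
    # execute_query returns a (records, summary, keys) result.
    records, _, _ = driver.execute_query("SHOW INDEXES YIELD name, state")
    online = {record["name"] for record in records if record["state"] == "ONLINE"}
    return set(required) - online
```

An empty set means both indexes exist and are online; any returned name points to an index that still needs to be created or has not finished populating.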
Conclusion
This guide demonstrates how to set up a Neo4j GenAI environment on Fedora using Ansible automation. By following these steps, you can integrate advanced AI capabilities into your data management workflows and leverage the power of both Neo4j and OpenAI for intelligent data retrieval and analysis.
For additional information on importing datasets and creating indexes in Neo4j, refer to the following resources:
• Northwind Dataset Import Guide
• Full-Text Indexes in Neo4j
• Vector Indexes in Neo4j
By utilizing these resources, you can further enhance your Neo4j database to support complex queries and AI-based analytics.
See also: Automating AI-Powered Graph Databases with Ansible: Insights from CfgMgmtCamp 2025
Related Articles
• Ansible env var patterns
• multiple vault IDs in Ansible
• Ansible privilege escalation patterns
• managing inventory in Ansible
• Ansible Ignore Errors Guide