What is a RAG system and why is it crucial for your business?
A Retrieval-Augmented Generation (RAG) system combines information retrieval with AI-powered content generation. The main advantage: it drastically reduces AI hallucinations by grounding every answer in real, verifiable data.
In this article, I explain step by step how I built a RAG system specialized in Title I of the Spanish Constitution using n8n, which can be adapted to any type of business documentation.
Technologies used
- n8n: Workflow automation platform that orchestrates the pipeline
- OpenAI Embeddings API: Converts text into vector embeddings
- Gemini: Conversational model that generates the final answers
- JavaScript: Custom code node that splits the document into one chunk per article
- Cohere: Reranking of retrieved chunks by relevance
Why implement RAG in your company
The problem of tokens and costs
Imagine you have 500 pages of documentation. Sending all that content as context on every query quickly burns through millions of tokens, skyrocketing your operating costs. A RAG system solves this elegantly (see the rough calculation after this list):
- Smart search: Retrieves only relevant information
- Reduced costs: Uses only a fraction of the tokens a full-context approach would need
- Accurate answers: Based on your actual documentation
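A rough back-of-envelope illustration (assuming ~500 words per page and ~0.75 words per token, both assumptions): 500 pages ≈ 250,000 words ≈ 330,000 tokens if the whole document travels with every query. Retrieving just 3 chunks of ~1,000 characters (~250 tokens each) cuts the context to under 1,000 tokens, a reduction of over 99%.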
Business use cases
A RAG system is especially valuable for:
- Customer service chatbots that can't afford to fabricate information
- Internal knowledge bases for support teams
- Legal research tools for paralegals handling large volumes of documentation
- Automated FAQ systems
RAG system architecture in n8n
Preparation and processing of documents
The first step is to download and process the PDF document from Google Drive. This is where we encounter the first technical challenge: traditional chunking.
The problem with automatic chunking
Conventional RAG systems split documents into fixed-size chunks of around 1,000 characters. This approach has a critical flaw: it can split a legal article across multiple chunks, making it hard to retrieve the complete information.
Solution: Smart item-by-item chunking
I developed a custom code node (sketched after this list) that:
- Automatically identifies each constitutional article
- Preserves the integrity of the legal content
- Cleans the text of irrelevant elements (numbering, headings)
- Generates metadata for each fragment
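Here is a minimal sketch of what that Code node might look like in n8n. The input field name and the "Artículo N." heading pattern are assumptions based on the Spanish Constitution's format; adapt both to your extraction output.

```javascript
// Minimal sketch of a per-article chunking Code node in n8n.
// Assumptions: the extracted PDF text arrives in $input.first().json.text,
// and each article begins with "Artículo N." (Spanish Constitution format).
const text = $input.first().json.text;

// Split the text right before each article heading, keeping the heading.
const parts = text.split(/(?=Artículo\s+\d+\.)/);

const items = [];
for (const part of parts) {
  const match = part.match(/^Artículo\s+(\d+)\./);
  if (!match) continue; // skip the preamble and heading noise

  items.push({
    json: {
      // Metadata that keeps the chunk traceable to its article.
      article_number: Number(match[1]),
      // Drop the heading and collapse whitespace so only the body remains.
      content: part.replace(/^Artículo\s+\d+\./, '').replace(/\s+/g, ' ').trim(),
    },
  });
}

// n8n Code nodes return an array of items, one per chunk.
return items;
```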
Vectorization with OpenAI Embeddings
Once the articles are processed, we use (a conceptual sketch follows this list):
- OpenAI Embeddings for vectorization
- Supabase Vector Store as a vector database
- Metadata (article_number) to keep every chunk traceable to its source article
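In the workflow these pieces are n8n nodes, but conceptually the indexing step boils down to the sketch below. The documents table and its column names are illustrative assumptions, not something the workflow prescribes.

```javascript
// Conceptual sketch of the indexing step. Assumes a Supabase table
// "documents" with columns content (text), article_number (int), and
// embedding (vector) -- names are illustrative.
import { createClient } from '@supabase/supabase-js';

const supabase = createClient(process.env.SUPABASE_URL, process.env.SUPABASE_KEY);

async function indexArticle(articleNumber, content) {
  // 1. Vectorize the article body with the OpenAI Embeddings API.
  const res = await fetch('https://api.openai.com/v1/embeddings', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ model: 'text-embedding-3-small', input: content }),
  });
  const { data } = await res.json();

  // 2. Store the chunk together with its metadata for traceability.
  await supabase.from('documents').insert({
    content,
    article_number: articleNumber,
    embedding: data[0].embedding,
  });
}
```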
Advanced retrieval system
The query flow includes (a sketch follows the list):
- User query: "What does Article 26 say?"
- Query vectorization with OpenAI
- Initial search: Supabase returns 20 candidate chunks
- Reranking with Cohere: Reduces to the 3 most relevant chunks
- Final selection: The chunk with the highest relevance score
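Outside n8n, the first half of this flow (query vectorization plus the initial Supabase search) looks roughly like this. The match_documents function and its arguments follow Supabase's pgvector guide and are assumptions; your setup may name them differently.

```javascript
// Rough sketch of the initial retrieval step. "match_documents" is the
// similarity-search function from Supabase's pgvector guide -- an
// assumption, not something this workflow mandates.
import { createClient } from '@supabase/supabase-js';

const supabase = createClient(process.env.SUPABASE_URL, process.env.SUPABASE_KEY);

async function retrieveCandidates(query) {
  // Vectorize the query with the same model used at indexing time;
  // mixing embedding models would make the similarities meaningless.
  const res = await fetch('https://api.openai.com/v1/embeddings', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ model: 'text-embedding-3-small', input: query }),
  });
  const { data } = await res.json();

  // Ask Supabase for the 20 nearest chunks; reranking narrows them later.
  const { data: chunks } = await supabase.rpc('match_documents', {
    query_embedding: data[0].embedding,
    match_count: 20,
  });
  return chunks;
}
```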
Implementation of the RAG agent
Interaction model
The system uses Gemini as the conversation model, providing (a rough standalone equivalent is sketched after this list):
- Conversation history stored in PostgreSQL
- Contextual responses based on previous interactions
- Native integration with the vector database
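Inside n8n, the agent node and the Postgres Chat Memory node do this wiring for you. As a rough standalone equivalent using Google's @google/generative-ai SDK (the model name and history shape shown are assumptions):

```javascript
// Conceptual sketch of the conversation step. In the actual workflow,
// n8n's agent and Postgres memory nodes handle this.
import { GoogleGenerativeAI } from '@google/generative-ai';

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY);
const model = genAI.getGenerativeModel({ model: 'gemini-1.5-flash' });

async function answer(question, topChunk, previousTurns) {
  // previousTurns is the chat history loaded from PostgreSQL, e.g.
  // [{ role: 'user', parts: [{ text: '...' }] }, { role: 'model', ... }]
  const chat = model.startChat({ history: previousTurns });

  // Ground the answer in the retrieved chunk instead of the model's memory.
  const result = await chat.sendMessage(
    `Answer using only this context:\n${topChunk}\n\nQuestion: ${question}`
  );
  return result.response.text();
}
```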
Rerank system with Cohere
The key to success lies in the reranking system (sketched after this list):
- Re-scores candidates with a dedicated relevance model, going beyond raw cosine similarity
- A relevance_score value per chunk, used for the final ranking
- Automatic selection of the most relevant chunk
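A minimal sketch of that rerank call against Cohere's Rerank API; the model name is an assumption (a multilingual rerank model suits Spanish legal text):

```javascript
// Minimal sketch of reranking with Cohere's Rerank API. The model name
// is an assumption; pick the one configured in your n8n credentials.
async function rerank(query, chunks) {
  const res = await fetch('https://api.cohere.com/v2/rerank', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${process.env.COHERE_API_KEY}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({
      model: 'rerank-v3.5',
      query,
      documents: chunks.map((c) => c.content),
      top_n: 3, // keep only the 3 most relevant chunks
    }),
  });
  const { results } = await res.json();

  // Cohere returns results sorted by relevance_score (highest first),
  // so results[0] points at the single best chunk.
  return results.map((r) => ({
    ...chunks[r.index],
    relevance_score: r.relevance_score,
  }));
}
```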
Final thoughts
A well-implemented RAG system in n8n offers accuracy, efficiency, and scalability for any organization handling critical documentation. The key lies in intelligent content processing and accurate retrieval of relevant information. The future of enterprise AI lies not in "all-knowing" models, but in systems that intelligently access verified and up-to-date information.