Creating Knowledge Bases with RAG
Build intelligent chatbots that can answer questions from your documents using Retrieval-Augmented Generation (RAG).
1 Understanding RAG
5 min
Retrieval-Augmented Generation (RAG) is a powerful technique that allows AI to access external knowledge sources to provide accurate, up-to-date information.
How RAG Works
- Document Processing: Your documents are split into chunks and converted to embeddings
- Query Processing: User questions are also converted to embeddings
- Similarity Search: The system finds the most relevant document chunks
- Response Generation: The AI uses retrieved information to generate accurate answers
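The four steps above can be sketched end to end. This is a minimal illustration only, not Dify's internal implementation: the `embed` function here is a toy bag-of-characters stand-in for a real embedding model, and step 4 simply prints the chunk that would be passed to the LLM as context.

```python
import math

def embed(text):
    # Toy stand-in for a real embedding model: a bag-of-characters vector.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# 1. Document processing: split documents into chunks, embed each chunk
chunks = ["Reset your password in account settings.",
          "Invoices are emailed on the first of each month."]
index = [(c, embed(c)) for c in chunks]

# 2. Query processing: embed the user's question the same way
query_vec = embed("How do I reset my password?")

# 3. Similarity search: pick the most similar chunk
best_chunk, _ = max(index, key=lambda item: cosine(query_vec, item[1]))

# 4. Response generation: the retrieved chunk becomes LLM context
print(best_chunk)
```

With a real embedding model the same structure holds; only `embed` changes.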
Benefits of RAG
- Accuracy: Reduces AI hallucinations by grounding responses in real data
- Current Information: Access to your latest documents and data
- Source Attribution: Can cite specific sources for transparency
- Cost Effective: Typically cheaper and faster to update than fine-tuning a model
Use Cases
- Customer support with product documentation
- Internal knowledge sharing
- Educational content delivery
- Legal document analysis
- Technical documentation assistance
Tips:
- RAG is perfect when you need AI to answer questions about specific content
- Quality of your source documents directly impacts answer quality
- RAG works best with well-structured, factual content
2 Creating Your First Knowledge Base
6 min
Let's create a knowledge base from your documents that your AI can search and reference.
Step-by-Step Knowledge Base Creation
- Navigate to Knowledge in the main menu
- Click "Create Knowledge"
- Give your knowledge base a descriptive name
- Add a brief description of its contents
- Click "Create"
Supported Data Sources
Dify supports multiple data sources:
- Local Files: Upload PDFs, Word docs, text files, CSV, etc.
- Notion Pages: Sync directly from your Notion workspace
- Web Pages: Scrape content from websites using Jina or Firecrawl API
- Plain Text: Copy and paste content directly
File Requirements and Limits
- Supported formats: PDF, DOCX, TXT, MD, CSV, XLSX
- File size limit: Usually 15MB per file (varies by plan)
- Total size: Depends on your subscription tier
- Language support: Multi-language documents supported
Tips:
- Start with a small set of high-quality documents
- Use descriptive names for easy management
- Organize related content in the same knowledge base
3 Uploading and Processing Documents
8 min
Now let's add documents to your knowledge base and configure how they're processed.
Document Upload Process
- Click "Add Document" in your knowledge base
- Choose your upload method (File, Notion, Web scraping, or Text)
- Select or upload your documents
- Review the document preview
- Configure processing settings
Chunking Configuration
Documents are split into smaller chunks for better retrieval:
- Automatic Chunking: Dify automatically splits by paragraphs
- Custom Rules: Set your own chunk size and overlap
- Chunk Size: 500-1000 characters is usually optimal
- Overlap: 50-100 characters to maintain context
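The chunk size and overlap settings above can be pictured with a small splitter. This is a sketch of the idea only, not Dify's chunker; real splitters prefer paragraph and sentence boundaries over fixed character windows.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks, each overlapping
    the previous one by `overlap` characters to preserve context."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks

doc = "A" * 1200
pieces = chunk_text(doc, chunk_size=500, overlap=50)
print(len(pieces))              # 3 chunks: starts at 0, 450, 900
print([len(p) for p in pieces])  # [500, 500, 300]
```

Note how the last 50 characters of each chunk reappear at the start of the next, so a sentence cut at a boundary still survives intact in one of the two chunks.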
Text Preprocessing Options
- Remove extra spaces: Clean up formatting
- Remove URLs: Filter out web links
- Remove email addresses: Protect privacy
- Custom preprocessing: Advanced filtering rules
Embedding Model Selection
Choose the right embedding model for your content:
- OpenAI text-embedding-3-small: Fast and cost-effective
- OpenAI text-embedding-3-large: Higher accuracy
- Cohere embed-english: Good for English content
- Cohere embed-multilingual: For multiple languages
Example:
# Example document structure for optimal RAG performance:
## Product FAQ Document
### What is Product X?
Product X is a comprehensive solution for...
### How do I install Product X?
1. Download the installer from our website
2. Run the installer as administrator
3. Follow the setup wizard
### Troubleshooting Common Issues
**Issue:** Application won't start
**Solution:** Check system requirements and try running as administrator
Tips:
- Well-structured documents with clear headings work best
- Keep chunk sizes moderate - too small loses context, too large reduces precision
- Choose embedding models based on your primary language
4 Configuring Retrieval Settings
5 min
Fine-tune how your knowledge base searches for and retrieves relevant information.
Retrieval Methods
- Vector Retrieval: Finds semantically similar content using embeddings
- Full-Text Search: Traditional keyword-based search
- Hybrid Retrieval (Recommended): Combines both methods for best results
Hybrid Retrieval Configuration
Adjust the balance between semantic and keyword search:
- Semantic Weight (70%): Finds conceptually related content
- Keyword Weight (30%): Finds exact term matches
- Custom Weights: Adjust based on your content type
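The 70/30 weighting amounts to a linear blend of the two scores. A sketch of the idea only; Dify's hybrid retrieval normalizes and fuses scores internally, but the intuition is the same.

```python
def hybrid_score(semantic, keyword, semantic_weight=0.7):
    """Blend a semantic similarity score with a keyword-match score.
    Both inputs are assumed to be normalized to the range [0, 1]."""
    return semantic_weight * semantic + (1 - semantic_weight) * keyword

# A chunk with a strong semantic match but no exact keyword hit...
print(round(hybrid_score(semantic=0.9, keyword=0.0), 2))  # 0.63
# ...still outranks a chunk that only matched on keywords.
print(round(hybrid_score(semantic=0.2, keyword=1.0), 2))  # 0.44
```

Raising the keyword weight helps content full of exact identifiers (error codes, part numbers); raising the semantic weight helps conversational, paraphrase-heavy content.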
Reranking Models
Improve retrieval accuracy with reranking:
- Cohere Rerank: Reorders results for better relevance
- BGE Reranker: Open-source alternative
- No Reranking: Faster but potentially less accurate
Retrieval Parameters
- Top K: Number of chunks to retrieve (3-5 recommended)
- Score Threshold: Minimum similarity score for inclusion
- Max Tokens: Total token limit for retrieved content
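Top K and Score Threshold interact as a filter-then-truncate step, which can be sketched as:

```python
def select_chunks(scored_chunks, top_k=3, score_threshold=0.5):
    """scored_chunks: list of (chunk_text, similarity_score) pairs.
    Drop chunks below the threshold, then keep the top_k best."""
    kept = [(text, score) for text, score in scored_chunks
            if score >= score_threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]

results = select_chunks(
    [("A", 0.91), ("B", 0.42), ("C", 0.77), ("D", 0.65), ("E", 0.58)],
    top_k=3, score_threshold=0.5,
)
print([text for text, _ in results])  # ['A', 'C', 'D']
```

A Max Tokens budget would add one more pass over the kept chunks, accumulating until the token limit is reached.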
Tips:
- Hybrid retrieval works best for most use cases
- Start with 70% semantic, 30% keyword weighting
- Use reranking for better accuracy when response quality matters most
5 Testing Your Knowledge Base
4 min
Before integrating your knowledge base into an application, test its retrieval accuracy.
Using the Recall Test
- Go to your knowledge base settings
- Click on the "Recall Test" tab
- Enter test queries related to your content
- Review the retrieved chunks and their relevance scores
- Adjust settings if needed
Effective Test Queries
- Direct questions: "How do I reset my password?"
- Conceptual queries: "Security best practices"
- Specific terms: "API rate limits"
- Variations: Test different ways of asking the same thing
Evaluating Results
Look for:
- Relevance: Do retrieved chunks actually answer the question?
- Completeness: Is all necessary information retrieved?
- Ranking: Are the most relevant chunks ranked highest?
- Coverage: Can the system find information across all your documents?
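These evaluation criteria can also be checked systematically outside the UI. A hand-rolled sketch: for each test query you note which chunk should come back, then measure how often it appears in the top results. The `retrieve` function below is a placeholder standing in for whatever search you actually run.

```python
def recall_at_k(test_cases, retrieve, k=3):
    """test_cases: list of (query, expected_chunk_id) pairs.
    retrieve(query) returns a ranked list of chunk ids.
    Returns the fraction of queries whose expected chunk is in the top k."""
    hits = sum(1 for query, expected in test_cases
               if expected in retrieve(query)[:k])
    return hits / len(test_cases)

# Placeholder retriever standing in for the real knowledge base search.
fake_results = {
    "How do I reset my password?": ["faq-12", "faq-3"],
    "API rate limits": ["faq-7", "faq-12"],
    "Security best practices": ["faq-2"],
}
retrieve = lambda q: fake_results.get(q, [])

cases = [("How do I reset my password?", "faq-12"),
         ("API rate limits", "faq-7"),
         ("Security best practices", "faq-9")]
score = recall_at_k(cases, retrieve, k=3)
print(score)  # 2 of 3 queries retrieved the expected chunk
```

Tracking this number while you adjust chunking or retrieval settings makes the effect of each change concrete.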
Common Issues and Solutions
- Poor retrieval: Adjust chunk size or embedding model
- Irrelevant results: Increase score threshold
- Missing information: Check document quality and chunking
- Inconsistent results: Consider using reranking
Tips:
- Test with questions your actual users would ask
- Document any query patterns that don't work well
- Iterate on your settings based on test results
6 Building a RAG-Powered Chatbot
5 min
Now let's create a chatbot that uses your knowledge base to answer questions accurately.
Creating the Chatflow
- Create a new Chatflow application
- Keep the default Start → LLM → Answer flow
- Click on the LLM node to configure it
Adding Your Knowledge Base
- In the LLM node settings, find the "Context" section
- Click "Add Knowledge"
- Select your knowledge base
- Configure retrieval settings if needed
Crafting a RAG-Optimized Prompt
You are a helpful assistant that answers questions based on the provided context.
Instructions:
1. Use the context information below to answer the user's question
2. If the context doesn't contain relevant information, say "I don't have information about that in my knowledge base"
3. Always cite specific parts of the context when possible
4. Be accurate and don't make up information not in the context
Context: {{#knowledge}}
User Question: {{sys.query}}
Please provide a helpful and accurate response based on the context above.
Advanced RAG Techniques
- Question Classification: Route different types of questions appropriately
- Multiple Knowledge Bases: Use different sources for different topics
- Fallback Strategies: Handle cases when no relevant information is found
Tips:
- Always instruct the AI to stay within the provided context
- Enable citation features to show sources
- Test with questions both inside and outside your knowledge base
7 Advanced RAG Workflows
6 min
Create more sophisticated RAG applications with conditional logic and multiple knowledge sources.
Question Classification Workflow
Route different types of questions to appropriate knowledge bases:
- Add a Question Classifier node after Start
- Define categories (e.g., "Product Info", "Technical Support", "Billing")
- Connect different paths to different knowledge bases
- Use conditional logic to route appropriately
Multi-Step RAG Process
- Initial Retrieval: Find relevant chunks
- Relevance Check: Evaluate if information is sufficient
- Follow-up Retrieval: Search additional sources if needed
- Response Generation: Synthesize all found information
Handling Edge Cases
- No Matches Found: Provide helpful guidance on how to rephrase
- Low Confidence: Ask clarifying questions
- Multiple Valid Answers: Present options clearly
- Outdated Information: Include disclaimers about data freshness
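The edge cases above can be handled with a small routing step before generation. A sketch under assumed thresholds (the 0.7/0.4 cut-offs are illustrative, not Dify defaults; tune them against your own recall tests):

```python
def choose_strategy(top_scores, high=0.7, low=0.4):
    """Decide how to respond based on retrieval confidence.
    top_scores: similarity scores of retrieved chunks, best first."""
    if not top_scores or top_scores[0] < low:
        return "no_match"   # suggest how to rephrase the question
    if top_scores[0] < high:
        return "clarify"    # ask a clarifying question
    if len(top_scores) > 1 and top_scores[1] >= high:
        return "multiple"   # several valid answers: present options
    return "answer"         # confident single answer

print(choose_strategy([0.92, 0.31]))  # answer
print(choose_strategy([0.55]))        # clarify
print(choose_strategy([]))            # no_match
print(choose_strategy([0.88, 0.85]))  # multiple
```

Each strategy then maps to a different branch in the workflow, e.g. a dedicated fallback prompt for the `no_match` case.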
# Example multi-step RAG prompt
You are analyzing a user question in two steps:
Step 1: Evaluate if the retrieved context contains sufficient information
Context: {{#knowledge}}
Question: {{sys.query}}
If context is sufficient, respond with: SUFFICIENT
If context is insufficient, respond with: INSUFFICIENT - [reason]
Step 2 (only if sufficient): Provide a complete answer based on the context.
Tips:
- Question classification improves accuracy for diverse knowledge bases
- Always have fallback options when retrieval fails
- Consider the user experience when no good answers are found
8 Monitoring and Optimization
6 min
Continuously improve your RAG system by monitoring performance and optimizing based on usage patterns.
Key Metrics to Track
- Retrieval Accuracy: Percentage of queries with relevant results
- Response Quality: User satisfaction with answers
- Coverage: Percentage of questions that can be answered
- Response Time: Average time to generate answers
- Cost: Token usage for embeddings and generation
Optimization Strategies
- Document Quality: Improve source content structure and clarity
- Chunk Optimization: Adjust size and overlap based on performance
- Embedding Tuning: Experiment with different embedding models
- Prompt Refinement: Continuously improve instructions
Common Performance Issues
- Poor Retrieval:
- Check document quality and structure
- Adjust chunking strategy
- Consider different embedding models
- Slow Responses:
- Optimize retrieval parameters
- Use smaller, more focused knowledge bases
- Consider caching frequent queries
- High Costs:
- Optimize chunk sizes to reduce token usage
- Use more efficient embedding models
- Implement query caching
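Query caching, mentioned under High Costs, can be as simple as memoizing answers for repeated questions after light normalization. A minimal in-memory sketch; a production system would add expiry times and invalidate the cache whenever the knowledge base content changes.

```python
class QueryCache:
    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(query):
        # Light normalization so trivial variants share a cache entry.
        return " ".join(query.lower().split())

    def get_or_compute(self, query, compute):
        key = self._key(query)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        answer = self._store[key] = compute(query)
        return answer

cache = QueryCache()
expensive = lambda q: f"answer to: {q}"  # stand-in for the full RAG pipeline
cache.get_or_compute("What are the API rate limits?", expensive)
cache.get_or_compute("what are the  API rate limits?", expensive)
print(cache.hits, cache.misses)  # 1 1 -- the second variant was a cache hit
```

The hit/miss counters double as one of the metrics listed above: a high hit rate means the cache is saving real embedding and generation cost.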
Best Practices for Production
- Regular Updates: Keep knowledge bases current
- Quality Control: Review and curate content regularly
- User Feedback: Collect and act on user ratings
- A/B Testing: Test different configurations
- Backup Strategies: Maintain multiple knowledge sources
Tips:
- Monitor real user queries to identify content gaps
- Regularly review and update your knowledge base content
- Use analytics to identify the most common query patterns