RAG Memory System
Give your AI a personal knowledge base with Retrieval Augmented Generation.
What is RAG Memory?
RAG (Retrieval-Augmented Generation) Memory lets the AI access and reference your uploaded documents when generating responses. Instead of relying solely on its training data, the AI searches your knowledge base for relevant information and uses it to provide accurate, grounded answers.
How It Works
1. Upload Files
Upload documents, PDFs, text files, or code to your knowledge base.
2. Semantic Search
When you ask a question, the system searches for relevant content using embeddings.
3. Enhanced Response
The AI uses retrieved context to generate accurate, citation-backed answers.
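The three steps above can be sketched in a few lines. This is a toy illustration, not the service's actual implementation: the `embed` function is a hypothetical stand-in that counts words, so it only matches shared vocabulary, whereas a real embedding model also captures meaning across different wording. The function names and sample chunks are assumptions made for the example.

```javascript
// Hypothetical stand-in for an embedding model: a sparse word-count vector.
function embed(text) {
  const vector = new Map();
  for (const word of text.toLowerCase().match(/[a-z0-9]+/g) ?? []) {
    vector.set(word, (vector.get(word) ?? 0) + 1);
  }
  return vector;
}

// Cosine similarity between two sparse vectors (0 when either is empty).
function cosineSimilarity(a, b) {
  let dot = 0;
  for (const [word, count] of a) dot += count * (b.get(word) ?? 0);
  const norm = (v) => Math.sqrt([...v.values()].reduce((sum, x) => sum + x * x, 0));
  const denominator = norm(a) * norm(b);
  return denominator === 0 ? 0 : dot / denominator;
}

// Step 2: score every chunk against the question and keep the top k.
function retrieve(question, chunks, k = 2) {
  const queryVector = embed(question);
  return chunks
    .map((chunk) => ({ ...chunk, score: cosineSimilarity(queryVector, embed(chunk.text)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Step 3: build the prompt the model would see, with sources attached.
function buildPrompt(question, retrieved) {
  const context = retrieved.map((c) => `[Source: ${c.filename}] ${c.text}`).join('\n');
  return `Answer using only the context below.\n\n${context}\n\nQuestion: ${question}`;
}

// Step 1 is represented by chunks that have already been uploaded and split.
const chunks = [
  { filename: 'refund-policy.pdf', text: 'Customers can request a full refund within 30 days of purchase.' },
  { filename: 'shipping.md', text: 'Orders ship within two business days.' },
];
const top = retrieve('What is the refund policy?', chunks, 1);
console.log(buildPrompt('What is the refund policy?', top));
```

Because the retrieved chunk keeps its `filename`, the prompt can carry the source through to the final answer, which is what makes citations possible.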
Key Benefits
- Fewer Hallucinations: Responses are grounded in your actual documents, which reduces (but cannot fully rule out) fabricated answers
- Always Up-to-Date: Update your knowledge base anytime without retraining
- Source Citations: Know exactly where information comes from
- Private & Secure: Your data is isolated and never shared
- Automatic Chunking: Documents are intelligently split for optimal retrieval
- Semantic Search: Finds relevant content even with different wording
Quick Example
Here's how RAG enhances your AI responses:
❌ Without RAG
Question: "What is our company's refund policy?"
Answer: "I don't have access to your specific company policies. Generally, refund policies vary by company..."
✅ With RAG
Question: "What is our company's refund policy?"
Answer: "According to your refund policy document, customers can request a full refund within 30 days of purchase. The refund is processed within 5-7 business days to the original payment method. [Source: refund-policy.pdf]"
Using RAG in API Requests
RAG is enabled by default. Just make a normal chat request:
import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://superagentstack.orionixtech.com/api/v1',
  apiKey: process.env.OPENROUTER_KEY,
  defaultHeaders: {
    'superAgentKey': process.env.SUPER_AGENT_KEY,
  },
});

const completion = await client.chat.completions.create({
  model: 'anthropic/claude-3-sonnet',
  messages: [
    { role: 'user', content: 'What does our privacy policy say about data retention?' }
  ],
  // RAG is enabled by default (useRAG: true)
});

console.log(completion.choices[0].message.content);
// The AI will search your uploaded files and provide an accurate answer
Controlling RAG Behavior
Disable RAG for General Questions
// For general knowledge questions, disable RAG
const completion = await client.chat.completions.create({
  model: 'anthropic/claude-3-sonnet',
  messages: [
    { role: 'user', content: 'What is the capital of France?' }
  ],
  useRAG: false, // Don't search uploaded files
});
Custom RAG Query
// Use a different query for RAG search
const completion = await client.chat.completions.create({
  model: 'anthropic/claude-3-sonnet',
  messages: [
    { role: 'user', content: 'Can you summarize the key points?' }
  ],
  useRAG: true,
  ragQuery: 'privacy policy data retention GDPR', // Custom search query
});
// The AI will search for "privacy policy data retention GDPR"
// but respond to "Can you summarize the key points?"
Supported File Types
Documents
- PDF (.pdf)
- Word (.docx, .doc)
- Text (.txt)
- Markdown (.md)
- Rich Text (.rtf)
Code & Data
- Source Code (.js, .ts, .py, .java, etc.)
- JSON (.json)
- CSV (.csv)
- XML (.xml)
- YAML (.yml, .yaml)
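If you want to catch unsupported files before uploading, a client-side extension check along these lines may help. This is only a sketch: the helper name `isLikelySupported` is hypothetical, and because the source-code list above ends in "etc.", the set below is illustrative rather than exhaustive.

```javascript
// Extensions named in the lists above. The source-code entry ends in "etc.",
// so other code extensions may also be accepted; this set is an assumption.
const SUPPORTED_EXTENSIONS = new Set([
  '.pdf', '.docx', '.doc', '.txt', '.md', '.rtf',
  '.js', '.ts', '.py', '.java',
  '.json', '.csv', '.xml', '.yml', '.yaml',
]);

// Returns true when the filename ends in one of the listed extensions.
function isLikelySupported(filename) {
  const dot = filename.lastIndexOf('.');
  if (dot <= 0) return false; // no extension, or a dotfile such as ".env"
  return SUPPORTED_EXTENSIONS.has(filename.slice(dot).toLowerCase());
}

console.log(isLikelySupported('Report.PDF'));  // case-insensitive match
console.log(isLikelySupported('archive.zip')); // not in the list
```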
Intelligent Chunking
When you upload a file, it's automatically split into smaller chunks for optimal retrieval:
- Semantic Boundaries: Chunks respect paragraph and section boundaries
- Optimal Size: Each chunk is sized for best embedding quality
- Context Preservation: Overlapping chunks maintain context
- Metadata Tracking: Each chunk knows its source file and location
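To make these four properties concrete, here is a minimal sketch of paragraph-aware chunking with overlap. It is not the service's actual algorithm: the 500-character limit, the one-paragraph overlap, and the `chunkText` name and output fields are all assumptions chosen for illustration.

```javascript
// Split text on blank lines (semantic boundaries), pack paragraphs into
// chunks of roughly maxChars, and repeat the last paragraph of each chunk at
// the start of the next one (overlap). A single paragraph longer than
// maxChars is kept whole rather than cut mid-sentence.
function chunkText(text, filename, maxChars = 500) {
  const paragraphs = text.split(/\n\s*\n/).map((p) => p.trim()).filter(Boolean);
  const chunks = [];
  let current = [];       // paragraphs in the chunk being built
  let firstParagraph = 0; // index of the first paragraph in `current`

  const pushChunk = () => {
    chunks.push({
      chunkId: `chunk-${chunks.length + 1}`,
      filename,        // metadata: source file
      firstParagraph,  // metadata: location within the file
      text: current.join('\n\n'),
    });
  };

  paragraphs.forEach((paragraph, index) => {
    const size = current.join('\n\n').length + paragraph.length;
    if (current.length > 0 && size > maxChars) {
      pushChunk();
      // Overlap: start the next chunk with the previous chunk's last paragraph.
      current = [current[current.length - 1]];
      firstParagraph = index - 1;
    }
    current.push(paragraph);
  });
  if (current.length > 0) pushChunk();
  return chunks;
}

const sampleText = ['A'.repeat(30), 'B'.repeat(30), 'C'.repeat(30)].join('\n\n');
const sampleChunks = chunkText(sampleText, 'example.txt', 70);
console.log(sampleChunks.map((c) => `${c.chunkId}: ${c.text.length} chars`));
```

With these inputs the middle paragraph appears in both chunks, so a question that spans the boundary can still be answered from a single chunk.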
RAG Response Metadata
Responses include metadata about RAG usage:
{
  "choices": [...],
  "usage": {...},
  "_metadata": {
    "rag": {
      "enabled": true,
      "resultsFound": 3,
      "query": "privacy policy data retention",
      "sources": [
        {
          "filename": "privacy-policy.pdf",
          "relevanceScore": 0.92,
          "chunkId": "chunk-123"
        },
        {
          "filename": "gdpr-compliance.md",
          "relevanceScore": 0.87,
          "chunkId": "chunk-456"
        }
      ]
    }
  }
}
Privacy & Security
Your uploaded files are:
- Stored securely with encryption
- Isolated per user (never shared)
- Accessible only through your API key
- Deletable at any time
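Returning to the `_metadata.rag` object shown under RAG Response Metadata, the sketch below shows one way a client might read it to list the sources behind an answer. The field names follow that sample; the `summarizeRag` helper is hypothetical, and the 0.8 default threshold is an arbitrary illustrative choice rather than a documented default.

```javascript
// Summarize the RAG metadata attached to a response object.
// Returns { used: false } when RAG was disabled or no metadata is present.
function summarizeRag(response, minScore = 0.8) {
  const rag = response?._metadata?.rag;
  if (!rag || !rag.enabled) return { used: false, sources: [] };

  const sources = (rag.sources ?? [])
    .filter((s) => s.relevanceScore >= minScore)           // drop weak matches
    .sort((a, b) => b.relevanceScore - a.relevanceScore)    // strongest first
    .map((s) => `${s.filename} (${s.relevanceScore.toFixed(2)})`);

  return { used: true, query: rag.query, sources };
}

// Sample shaped like the metadata example in this guide.
const sampleResponse = {
  _metadata: {
    rag: {
      enabled: true,
      resultsFound: 3,
      query: 'privacy policy data retention',
      sources: [
        { filename: 'gdpr-compliance.md', relevanceScore: 0.87, chunkId: 'chunk-456' },
        { filename: 'privacy-policy.pdf', relevanceScore: 0.92, chunkId: 'chunk-123' },
      ],
    },
  },
};

console.log(summarizeRag(sampleResponse));
console.log(summarizeRag(sampleResponse, 0.9));
```

Checking `enabled` first keeps the helper safe to call on responses where `useRAG` was set to `false`.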