RAG Memory System

Give your AI a personal knowledge base with Retrieval-Augmented Generation.

What is RAG Memory?

RAG (Retrieval-Augmented Generation) Memory lets the AI search and reference your uploaded documents when generating responses. Instead of relying solely on its training data, the AI retrieves relevant passages from your knowledge base and uses them to give accurate, grounded answers.

How It Works

1. Upload Files

Upload documents, PDFs, text files, or code to your knowledge base.

2. Semantic Search

When you ask a question, the system uses embeddings to search your knowledge base for content with a similar meaning, not just matching keywords.

3. Enhanced Response

The AI uses retrieved context to generate accurate, citation-backed answers.
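
The three steps above can be illustrated with a toy retrieval sketch. It assumes cosine similarity over embedding vectors, which is a common choice but not one this page confirms for the service. The `Chunk` type, `cosineSimilarity`, `topK`, and the hand-written three-dimensional vectors are all hypothetical stand-ins for a real embedding model.

```typescript
// Toy illustration of embedding-based retrieval (assumption: cosine similarity).
// The vectors below are made up; a real system would get them from an embedding model.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
}

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every chunk against the query and keep the k best matches.
function topK(
  queryEmbedding: number[],
  chunks: Chunk[],
  k: number
): Array<{ chunk: Chunk; score: number }> {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(queryEmbedding, chunk.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Hypothetical knowledge base with hand-written embeddings.
const chunks: Chunk[] = [
  { id: 'refunds', text: 'Customers can request a full refund within 30 days.', embedding: [0.9, 0.1, 0.0] },
  { id: 'privacy', text: 'Toy text about data retention.', embedding: [0.1, 0.9, 0.1] },
  { id: 'shipping', text: 'Toy text about shipping.', embedding: [0.0, 0.2, 0.9] },
];

// Pretend this is the embedding of "How do I get my money back?"
const queryEmbedding = [0.8, 0.2, 0.1];
const results = topK(queryEmbedding, chunks, 2);

for (const { chunk, score } of results) {
  console.log(`${chunk.id}: ${score.toFixed(3)}`);
}
```

The query shares no keywords with the refund chunk, yet that chunk ranks first because the vectors point in a similar direction. In a real pipeline the top-ranked chunk text would then be added to the prompt as context.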

Key Benefits

Fewer Hallucinations: Responses are grounded in your actual documents, which reduces the risk of fabricated answers
Always Up-to-Date: Update your knowledge base anytime without retraining
Source Citations: Know exactly where information comes from
Private & Secure: Your data is isolated and never shared
Automatic Chunking: Documents are intelligently split for optimal retrieval
Semantic Search: Finds relevant content even with different wording

Quick Example

Here's how RAG enhances your AI responses:

❌ Without RAG

Question: "What is our company's refund policy?"

Answer: "I don't have access to your specific company policies. Generally, refund policies vary by company..."

✅ With RAG

Question: "What is our company's refund policy?"

Answer: "According to your refund policy document, customers can request a full refund within 30 days of purchase. The refund is processed within 5-7 business days to the original payment method. [Source: refund-policy.pdf]"

Using RAG in API Requests

RAG is enabled by default. Just make a normal chat request:

rag-example.ts

import OpenAI from 'openai';

const client = new OpenAI({
  baseURL: 'https://superagentstack.orionixtech.com/api/v1',
  apiKey: process.env.OPENROUTER_KEY,
  defaultHeaders: {
    'superAgentKey': process.env.SUPER_AGENT_KEY,
  },
});

const completion = await client.chat.completions.create({
  model: 'anthropic/claude-3-sonnet',
  messages: [
    { role: 'user', content: 'What does our privacy policy say about data retention?' }
  ],
  // RAG is enabled by default (useRAG: true)
});

console.log(completion.choices[0].message.content);
// The AI will search your uploaded files and provide an accurate answer

Controlling RAG Behavior

Disable RAG for General Questions

disable-rag.ts

// For general knowledge questions, disable RAG
const completion = await client.chat.completions.create({
  model: 'anthropic/claude-3-sonnet',
  messages: [
    { role: 'user', content: 'What is the capital of France?' }
  ],
  useRAG: false,  // Don't search uploaded files
});

Custom RAG Query

custom-rag-query.ts

// Use a different query for RAG search
const completion = await client.chat.completions.create({
  model: 'anthropic/claude-3-sonnet',
  messages: [
    { role: 'user', content: 'Can you summarize the key points?' }
  ],
  useRAG: true,
  ragQuery: 'privacy policy data retention GDPR',  // Custom search query
});

// The AI will search for "privacy policy data retention GDPR"
// but respond to "Can you summarize the key points?"
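
Because `useRAG` and `ragQuery` are not part of the standard OpenAI request schema, it can help to build the request body in one typed place. The following is a minimal sketch that assumes only the two fields shown above; `ChatMessage`, `RagOptions`, `RagChatRequest`, and `buildRagRequest` are hypothetical names, not part of any SDK.

```typescript
// Hypothetical typed wrapper around the request fields documented above.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface RagOptions {
  useRAG?: boolean;   // omitted means "use the service default" (RAG on)
  ragQuery?: string;  // optional search query that differs from the user message
}

interface RagChatRequest extends RagOptions {
  model: string;
  messages: ChatMessage[];
}

function buildRagRequest(
  model: string,
  question: string,
  options: RagOptions = {}
): RagChatRequest {
  const request: RagChatRequest = {
    model,
    messages: [{ role: 'user', content: question }],
  };
  if (options.useRAG !== undefined) {
    request.useRAG = options.useRAG;
  }
  // A custom search query is dropped when RAG is explicitly disabled.
  if (options.ragQuery !== undefined && options.useRAG !== false) {
    request.ragQuery = options.ragQuery;
  }
  return request;
}

const model = 'anthropic/claude-3-sonnet';

const generalRequest = buildRagRequest(model, 'What is the capital of France?', { useRAG: false });
const customRequest = buildRagRequest(model, 'Can you summarize the key points?', {
  useRAG: true,
  ragQuery: 'privacy policy data retention GDPR',
});
const defaultRequest = buildRagRequest(model, 'What does our privacy policy say about data retention?');

console.log(JSON.stringify(customRequest, null, 2));
```

The resulting object has the same shape as the request bodies in the examples above, so it could be passed to the chat completion call. Depending on the SDK's type definitions, a cast may be needed for the extra fields.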

Supported File Types

Documents

PDF (.pdf)
Word (.docx, .doc)
Text (.txt)
Markdown (.md)
Rich Text (.rtf)

Code & Data

Source Code (.js, .ts, .py, .java, etc.)
JSON (.json)
CSV (.csv)
XML (.xml)
YAML (.yml, .yaml)

Intelligent Chunking

When you upload a file, it's automatically split into smaller chunks for optimal retrieval:

Semantic Boundaries: Chunks respect paragraph and section boundaries
Optimal Size: Each chunk is sized for best embedding quality
Context Preservation: Overlapping chunks maintain context
Metadata Tracking: Each chunk knows its source file and location
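
The chunking properties listed above can be sketched in a few lines. This is an illustration only, not the service's actual algorithm: it assumes paragraphs are separated by blank lines, uses a character budget (`maxChars`) in place of whatever sizing the service applies, and repeats the last paragraph of each chunk at the start of the next one for overlap. `TextChunk` and `chunkByParagraph` are hypothetical names.

```typescript
// Hypothetical paragraph-aware chunker with overlap and source metadata.
interface TextChunk {
  sourceFile: string; // which file the chunk came from
  index: number;      // position of the chunk within that file
  text: string;
}

function chunkByParagraph(
  text: string,
  sourceFile: string,
  maxChars = 200,
  overlapParagraphs = 1
): TextChunk[] {
  // Split on blank lines so a chunk never cuts a paragraph in half.
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: TextChunk[] = [];
  let current: string[] = [];

  const flush = (): void => {
    if (current.length === 0) return;
    chunks.push({ sourceFile, index: chunks.length, text: current.join('\n\n') });
  };

  for (const paragraph of paragraphs) {
    const candidate = [...current, paragraph].join('\n\n');
    if (current.length > 0 && candidate.length > maxChars) {
      flush();
      // Carry the trailing paragraph(s) into the next chunk to preserve context.
      // Note: the overlap can push a chunk slightly past maxChars.
      current = overlapParagraphs > 0 ? current.slice(-overlapParagraphs) : [];
    }
    current.push(paragraph);
  }
  flush();
  return chunks;
}

const sampleText = [
  'Alpha paragraph about refunds.',
  'Beta paragraph about privacy.',
  'Gamma paragraph about shipping.',
].join('\n\n');

const textChunks = chunkByParagraph(sampleText, 'handbook.md', 70, 1);

for (const chunk of textChunks) {
  console.log(`[${chunk.sourceFile} #${chunk.index}] ${JSON.stringify(chunk.text)}`);
}
```

With a 70-character budget, the sample produces two chunks that share the "Beta" paragraph, so a question touching the boundary between two sections can still match a chunk containing both sides.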

RAG Response Metadata

Responses include metadata about RAG usage:

{
  "choices": [...],
  "usage": {...},
  "_metadata": {
    "rag": {
      "enabled": true,
      "resultsFound": 3,
      "query": "privacy policy data retention",
      "sources": [
        {
          "filename": "privacy-policy.pdf",
          "relevanceScore": 0.92,
          "chunkId": "chunk-123"
        },
        {
          "filename": "gdpr-compliance.md",
          "relevanceScore": 0.87,
          "chunkId": "chunk-456"
        }
      ]
    }
  }
}
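
One way to consume this metadata is to read `_metadata.rag` defensively and turn the sources into citation strings. The sketch below takes the field names from the JSON above; the `RagSource` and `RagMetadata` types and the `getRagMetadata` and `formatSources` helpers are hypothetical, and the standard SDK response type does not declare `_metadata`, hence the cast.

```typescript
// Types mirror the JSON example above; helper names are illustrative.
interface RagSource {
  filename: string;
  relevanceScore: number;
  chunkId: string;
}

interface RagMetadata {
  enabled: boolean;
  resultsFound: number;
  query: string;
  sources: RagSource[];
}

// Returns the RAG metadata if the response carries it, otherwise undefined.
function getRagMetadata(response: unknown): RagMetadata | undefined {
  const withMeta = response as { _metadata?: { rag?: RagMetadata } } | null | undefined;
  return withMeta?._metadata?.rag;
}

// Formats sources at or above minScore as human-readable citation lines.
function formatSources(rag: RagMetadata | undefined, minScore = 0): string[] {
  if (!rag || !rag.enabled) return [];
  return (rag.sources ?? [])
    .filter((source) => source.relevanceScore >= minScore)
    .map(
      (source) =>
        `${source.filename} (${source.chunkId}, score ${source.relevanceScore.toFixed(2)})`
    );
}

// Sample object shaped like the response above (choices and usage omitted).
const sampleResponse = {
  _metadata: {
    rag: {
      enabled: true,
      resultsFound: 3,
      query: 'privacy policy data retention',
      sources: [
        { filename: 'privacy-policy.pdf', relevanceScore: 0.92, chunkId: 'chunk-123' },
        { filename: 'gdpr-compliance.md', relevanceScore: 0.87, chunkId: 'chunk-456' },
      ],
    },
  },
};

const allCitations = formatSources(getRagMetadata(sampleResponse));
const strongCitations = formatSources(getRagMetadata(sampleResponse), 0.9);

console.log(allCitations);
console.log(strongCitations);
```

The 0.9 threshold here is arbitrary; a suitable cutoff depends on how the service scales `relevanceScore`, which this page does not specify.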

Privacy & Security

Your uploaded files are:

Stored securely with encryption
Isolated per user (never shared)
Accessible only through your API key
Deletable at any time