RAG Best Practices

Optimize your RAG system for accuracy, performance, and reliability.

Document Preparation

Clean and Structure Content

  • Remove noise: Strip headers, footers, and irrelevant formatting
  • Use clear headings: Structure documents with H1, H2, and H3 headings
  • Break up long paragraphs: Keep paragraphs focused and concise
  • Include context: Each section should be self-contained

❌ Poor Structure

markdown
Our refund policy is simple. Contact us. We'll help. 
Terms apply. See section 4.2.1 for details.

✅ Good Structure

markdown
# Refund Policy

Customers can request a full refund within 30 days of purchase.

## How to Request
1. Email support@company.com
2. Include order number
3. Refund processed in 5-7 business days
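
The "remove noise" step can be partly automated for text extracted page by page. The sketch below is only one possible approach: `stripRepeatedLines` is a hypothetical helper (not part of any SDK) that drops lines repeated on most pages, which are usually headers or footers, plus bare "Page N" lines. The 60% threshold is an assumption to tune, and body text that legitimately repeats on most pages would also be dropped.

```typescript
// Hypothetical helper: removes lines that repeat on most pages (likely headers/footers).
function stripRepeatedLines(pages: string[], minShare = 0.6): string[] {
  const counts = new Map<string, number>();
  for (const page of pages) {
    // Count each distinct non-empty line once per page
    const seen = new Set<string>();
    for (const raw of page.split('\n')) {
      const line = raw.trim();
      if (line.length === 0 || seen.has(line)) continue;
      seen.add(line);
      counts.set(line, (counts.get(line) ?? 0) + 1);
    }
  }
  // A line counts as boilerplate when it appears on at least this many pages
  const threshold = Math.max(2, Math.ceil(pages.length * minShare));
  return pages.map(page =>
    page
      .split('\n')
      .filter(raw => {
        const line = raw.trim();
        const repeated = (counts.get(line) ?? 0) >= threshold;
        const pageNumber = /^page \d+$/i.test(line);
        return !repeated && !pageNumber;
      })
      .join('\n')
  );
}

const cleanedPages = stripRepeatedLines([
  'ACME Corp - Confidential\nRefunds are available within 30 days.\nPage 1',
  'ACME Corp - Confidential\nEmail support with your order number.\nPage 2',
  'ACME Corp - Confidential\nRefunds take 5-7 business days.\nPage 3',
]);
// cleanedPages[0] is "Refunds are available within 30 days."
```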

File Organization

Use Meaningful Names

❌ Bad Names

  • doc1.pdf
  • final_FINAL_v2.docx
  • untitled.txt

✅ Good Names

  • privacy-policy-2025.pdf
  • user-guide-v2.1.docx
  • api-documentation.md
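
If files are uploaded by a script, names in this style can be generated from the document title. This is a minimal sketch; `toFileName` is a hypothetical helper, and the lowercase-hyphenated convention simply mirrors the good names listed above.

```typescript
// Hypothetical helper: turns a document title into a lowercase, hyphenated file name.
function toFileName(title: string, extension: string, version?: string): string {
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-') // collapse anything non-alphanumeric into one hyphen
    .replace(/^-+|-+$/g, '');    // trim leading and trailing hyphens
  const suffix = version ? `-v${version}` : '';
  return `${slug}${suffix}.${extension}`;
}

toFileName('User Guide', 'docx', '2.1');  // "user-guide-v2.1.docx"
toFileName('Privacy Policy 2025', 'pdf'); // "privacy-policy-2025.pdf"
```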

Categorize with Metadata

good-metadata.ts
// ✅ Well-organized metadata
const metadata = {
  category: 'legal',
  subcategory: 'privacy',
  tags: ['gdpr', 'data-protection', 'eu'],
  version: '2.1',
  effectiveDate: '2025-01-01',
  department: 'legal',
  language: 'en',
  audience: 'customers'
};
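
To keep metadata consistent across uploads, it may help to validate it before sending. The sketch below assumes the field names from the example above. The `DocumentMetadata` type and `validateMetadata` function are hypothetical, and treating category, tags, and version as required is an assumption, not a documented rule.

```typescript
// Assumed shape, based on the fields in the example above
interface DocumentMetadata {
  category: string;
  subcategory?: string;
  tags: string[];
  version: string;
  effectiveDate?: string; // ISO date, e.g. '2025-01-01'
  department?: string;
  language?: string;
  audience?: string;
}

// Hypothetical check: returns a list of problems, empty when the metadata looks complete.
function validateMetadata(meta: Partial<DocumentMetadata>): string[] {
  const problems: string[] = [];
  if (!meta.category) problems.push('category is required');
  if (!meta.tags || meta.tags.length === 0) problems.push('at least one tag is required');
  if (!meta.version) problems.push('version is required');
  // Only validate the date format when a date is provided
  if (meta.effectiveDate && !/^\d{4}-\d{2}-\d{2}$/.test(meta.effectiveDate)) {
    problems.push('effectiveDate must use YYYY-MM-DD');
  }
  return problems;
}

validateMetadata({ category: 'legal', tags: ['gdpr'], version: '2.1' }); // []
```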

Optimal Chunking

While chunking is automatic, you can optimize it by structuring your documents properly:

Ideal Chunk Characteristics

  • Self-contained: Each chunk should make sense on its own
  • Focused topic: One main idea per chunk
  • Proper context: Include necessary background information
  • Clear boundaries: Use headings and paragraphs to define sections

good-chunking.md
# Refund Policy

## Eligibility
Customers are eligible for a full refund within 30 days of purchase. 
The product must be unused and in original packaging. Digital products 
are eligible if no download occurred.

## Process
To request a refund:
1. Email support@company.com with your order number
2. Provide reason for refund
3. Include proof of purchase
4. Refunds are processed within 5-7 business days

## Exceptions
The following items are not eligible for refunds:
- Sale items marked as "final sale"
- Custom or personalized products
- Digital products after download
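
To see why this structure helps, consider a simplified heading-based splitter. The platform's actual chunking algorithm is not described here, so the sketch below is purely illustrative: `chunkByHeading` is a hypothetical function that cuts a markdown document at each "## " heading and prefixes the document title so every chunk carries its own context. Text that appears before the first "## " heading is ignored in this sketch.

```typescript
interface Chunk {
  heading: string;
  text: string;
}

// Illustrative only: splits a markdown document at "## " headings.
function chunkByHeading(markdown: string): Chunk[] {
  const chunks: Chunk[] = [];
  let title = '';
  let current: Chunk | null = null;

  for (const line of markdown.split('\n')) {
    if (line.startsWith('# ')) {
      title = line.slice(2).trim();
    } else if (line.startsWith('## ')) {
      if (current) chunks.push(current);
      // Prefix the document title so the chunk makes sense on its own
      const section = line.slice(3).trim();
      current = { heading: title ? `${title} > ${section}` : section, text: '' };
    } else if (current) {
      current.text += line + '\n';
    }
  }
  if (current) chunks.push(current);
  return chunks.map(c => ({ heading: c.heading, text: c.text.trim() }));
}

const sampleChunks = chunkByHeading(
  '# Refund Policy\n\n## Eligibility\nWithin 30 days of purchase.\n\n## Process\nEmail support@company.com.'
);
// sampleChunks[0] is { heading: 'Refund Policy > Eligibility', text: 'Within 30 days of purchase.' }
```

A wall-of-text document gives a splitter like this nothing to cut on, which is why unstructured content tends to produce chunks that mix unrelated topics.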

Query Optimization

Use Specific Queries

❌ Vague Query

typescript
const completion = await client.chat.completions.create({
  model: 'anthropic/claude-3-sonnet',
  messages: [
    { role: 'user', content: 'Tell me about it' }
  ],
});

✅ Specific Query

typescript
const completion = await client.chat.completions.create({
  model: 'anthropic/claude-3-sonnet',
  messages: [
    { role: 'user', content: 'What is our refund policy for digital products?' }
  ],
});

Custom RAG Queries

Use a custom RAG query when the user's wording differs from the terms you want to search for:

custom-rag-query.ts
// User asks a general question
const completion = await client.chat.completions.create({
  model: 'anthropic/claude-3-sonnet',
  messages: [
    { role: 'user', content: 'How do I get my money back?' }
  ],
  // But search for specific terms
  ragQuery: 'refund policy return process eligibility',
});
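
One simple way to produce a value for `ragQuery` is a lookup from colloquial phrases to the vocabulary your documents use. The sketch below is an assumption-laden example: `searchTerms` and `buildRagQuery` are hypothetical names, and the phrase list is made up for illustration.

```typescript
// Hypothetical mapping from colloquial phrases to the vocabulary used in your documents
const searchTerms: Record<string, string> = {
  'money back': 'refund policy return process eligibility',
  'cancel': 'subscription cancellation policy',
  'delivery': 'shipping times delivery options',
};

// Returns a search query for the first matching phrase, or undefined so the
// caller can fall back to the user's original wording.
function buildRagQuery(question: string): string | undefined {
  const normalized = question.toLowerCase();
  for (const [phrase, terms] of Object.entries(searchTerms)) {
    if (normalized.includes(phrase)) return terms;
  }
  return undefined;
}

buildRagQuery('How do I get my money back?'); // "refund policy return process eligibility"
```

When the function returns `undefined`, omit `ragQuery` so the search uses the user's message as written.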

Performance Optimization

Disable RAG When Not Needed

conditional-rag.ts
function shouldUseRAG(query: string): boolean {
  // General knowledge questions don't need RAG
  const generalKeywords = [
    'what is', 'who is', 'when did', 'where is',
    'define', 'explain', 'how does', 'why does'
  ];
  
  const isGeneral = generalKeywords.some(keyword =>
    query.toLowerCase().includes(keyword)
  );
  
  // Company-specific questions need RAG
  const companyKeywords = [
    'our policy', 'our product', 'our service',
    'company', 'we offer', 'pricing'
  ];
  
  const isCompanySpecific = companyKeywords.some(keyword =>
    query.toLowerCase().includes(keyword)
  );
  
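  // Company-specific wording takes priority: "What is our policy on returns?"
  // matches both lists and still needs RAG. Skip RAG only for purely general questions.
  return isCompanySpecific || !isGeneral;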
}

// Usage
const useRAG = shouldUseRAG(userQuery);

const completion = await client.chat.completions.create({
  model: 'anthropic/claude-3-sonnet',
  messages: [{ role: 'user', content: userQuery }],
  useRAG,
});

Optimize File Sizes

  • Compress PDFs: Use tools to reduce PDF file sizes
  • Remove images: Extract text-only versions for better processing
  • Split large files: Break documents over 10MB into smaller files
  • Use text formats: Markdown and plain text process faster than PDFs
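
For text and markdown sources, splitting can be done at paragraph boundaries so that no section is cut mid-sentence. The following is a minimal sketch: `splitBySize` is a hypothetical helper, and it measures size in UTF-8 bytes with the standard `TextEncoder`. A single paragraph larger than the limit is kept whole rather than broken.

```typescript
// Hypothetical helper: splits text into parts no larger than maxBytes (UTF-8),
// breaking only at paragraph boundaries (blank lines).
function splitBySize(text: string, maxBytes: number): string[] {
  const encoder = new TextEncoder();
  const byteLength = (s: string) => encoder.encode(s).length;

  const parts: string[] = [];
  let current = '';
  for (const paragraph of text.split(/\n{2,}/)) {
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (current && byteLength(candidate) > maxBytes) {
      // Adding this paragraph would exceed the limit, so start a new part
      parts.push(current);
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  if (current) parts.push(current);
  return parts;
}

splitBySize('aaaa\n\nbbbb\n\ncccc', 10); // ["aaaa\n\nbbbb", "cccc"]
```

For the 10MB guideline above, the call would be `splitBySize(text, 10 * 1024 * 1024)`.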

Improving Accuracy

Include Source Context

with-context.ts
const completion = await client.chat.completions.create({
  model: 'anthropic/claude-3-sonnet',
  messages: [
    {
      role: 'system',
      content: 'You are a customer support assistant. Always cite sources when answering from documentation.'
    },
    {
      role: 'user',
      content: 'What is the refund policy?'
    }
  ],
  useRAG: true,
});

Verify Critical Information

verify-info.ts
// For critical queries, check metadata
const completion = await client.chat.completions.create({
  model: 'anthropic/claude-3-sonnet',
  messages: [
    { role: 'user', content: 'What are the legal requirements for GDPR compliance?' }
  ],
  useRAG: true,
});

// Check if RAG found relevant sources
const rag = completion._metadata?.rag;
if (rag && rag.resultsFound > 0) {
  console.log('Answer based on:', rag.sources.map(s => s.filename));
} else {
  console.warn('⚠️ No sources found - answer may not be accurate');
}
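
Because `_metadata` is not part of a standard chat completion type, a small typed accessor can keep these checks in one place. The shape below is an assumption inferred from the fields used on this page (`enabled`, `resultsFound`, `query`, `sources`, `filename`, `relevanceScore`), not an official type, and `getRagMetadata` and `isGrounded` are hypothetical names.

```typescript
// Assumed shape, inferred from the fields used on this page; not an official type
interface RagSource {
  filename: string;
  relevanceScore?: number;
}

interface RagMetadata {
  enabled?: boolean;
  resultsFound: number;
  query?: string;
  sources?: RagSource[];
}

// Returns the RAG metadata if the response carries it, otherwise undefined.
function getRagMetadata(completion: unknown): RagMetadata | undefined {
  const rag = (completion as { _metadata?: { rag?: RagMetadata } } | null)?._metadata?.rag;
  return rag && typeof rag.resultsFound === 'number' ? rag : undefined;
}

// Treats an answer as grounded only when at least one source was retrieved.
function isGrounded(completion: unknown): boolean {
  const rag = getRagMetadata(completion);
  return rag !== undefined && rag.resultsFound > 0;
}
```

For critical topics such as legal or compliance questions, a caller could refuse to show an answer when `isGrounded` returns false instead of only logging a warning.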

Regular Maintenance

Keep Content Updated

update-schedule.ts
// Schedule regular updates
async function updateKnowledgeBase() {
  // 1. Check for outdated files
  const files = await listFiles();
  const sixMonthsAgo = new Date();
  sixMonthsAgo.setMonth(sixMonthsAgo.getMonth() - 6);
  
  const outdated = files.filter((f: any) => {
    const uploadDate = new Date(f.uploadedAt);
    return uploadDate < sixMonthsAgo;
  });
  
  console.log(`Found ${outdated.length} files older than 6 months`);
  
  // 2. Review and update or delete
  for (const file of outdated) {
    console.log(`Review: ${file.filename}`);
    // Manually review and update
  }
  
  // 3. Add new documentation
  await uploadNewFiles('./updated-docs');
}

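// Run monthly. A 30-day delay (2,592,000,000 ms) exceeds the 2,147,483,647 ms
// timer limit, which makes setInterval fire almost immediately, so check daily instead.
const DAY_MS = 24 * 60 * 60 * 1000;
let lastRun = 0;
setInterval(() => {
  if (Date.now() - lastRun >= 30 * DAY_MS) {
    lastRun = Date.now();
    updateKnowledgeBase().catch(console.error);
  }
}, DAY_MS);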

Monitor Performance

monitor-rag.ts
// Track RAG effectiveness
function logRAGMetrics(completion: any) {
  const metadata = completion._metadata;
  
  if (metadata?.rag) {
    console.log('RAG Metrics:', {
      enabled: metadata.rag.enabled,
      resultsFound: metadata.rag.resultsFound,
      query: metadata.rag.query,
      topScore: metadata.rag.sources?.[0]?.relevanceScore,
    });
    
    // Alert if no results found
    if (metadata.rag.resultsFound === 0) {
      console.warn('⚠️ RAG found no relevant documents');
    }
    
    // Alert if low relevance scores
    if (metadata.rag.sources?.[0]?.relevanceScore < 0.7) {
      console.warn('⚠️ Low relevance score - answer may be inaccurate');
    }
  }
}
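
Per-request logs are easier to act on when rolled up. As a rough sketch, the values logged above could be collected and summarized into a miss rate and a mean top score. `RagSample` and `summarize` are hypothetical names, and how samples are stored is left open.

```typescript
interface RagSample {
  resultsFound: number;
  topScore?: number; // relevance score of the best source, if any
}

// Summarizes a batch of samples: share of queries with no sources and mean top score.
function summarize(samples: RagSample[]) {
  const misses = samples.filter(s => s.resultsFound === 0).length;
  const scores = samples
    .map(s => s.topScore)
    .filter((score): score is number => typeof score === 'number');
  const meanTopScore = scores.length
    ? scores.reduce((sum, score) => sum + score, 0) / scores.length
    : null;
  return {
    total: samples.length,
    missRate: samples.length ? misses / samples.length : 0,
    meanTopScore,
  };
}

const summary = summarize([
  { resultsFound: 3, topScore: 0.9 },
  { resultsFound: 0 },
  { resultsFound: 1, topScore: 0.5 },
]);
// summary.total is 3; one of the three queries found no sources
```

A rising miss rate usually points to missing documentation, while a falling mean top score suggests that queries and document wording are drifting apart.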

Common Pitfalls to Avoid

❌ Uploading Duplicate Content

Multiple versions of the same document compete during retrieval, so answers may draw on outdated text. Delete old versions before uploading new ones.
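
Exact duplicates can be caught before upload by comparing normalized text. This is a minimal sketch with a hypothetical `findDuplicates` helper. It only detects copies whose text is identical after whitespace and case normalization, so revised versions of a document still need a manual check.

```typescript
// Hypothetical helper: groups file names whose normalized text is identical.
function findDuplicates(files: { name: string; text: string }[]): string[][] {
  const byText = new Map<string, string[]>();
  for (const file of files) {
    // Normalize whitespace and case so trivial formatting changes do not hide a duplicate
    const key = file.text.replace(/\s+/g, ' ').trim().toLowerCase();
    const names = byText.get(key) ?? [];
    names.push(file.name);
    byText.set(key, names);
  }
  // Keep only groups that contain more than one file
  return Array.from(byText.values()).filter(names => names.length > 1);
}

findDuplicates([
  { name: 'refund-policy.md', text: 'Refunds within 30 days.' },
  { name: 'refund-policy-copy.md', text: 'Refunds  within 30 days.\n' },
  { name: 'shipping.md', text: 'Ships in 2 days.' },
]);
// [["refund-policy.md", "refund-policy-copy.md"]]
```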

❌ Poor File Names

Generic names like "document.pdf" make it hard to manage files. Use descriptive names with dates or versions.

❌ Ignoring Metadata

Files without metadata are hard to organize and filter. Always add category, tags, and version information.

❌ Uploading Unstructured Content

Wall-of-text documents chunk poorly. Use headings, lists, and paragraphs to structure content.

❌ Never Updating Files

Outdated information leads to incorrect answers. Schedule regular reviews and updates of your knowledge base.

Testing Your RAG System

test-rag.ts
// Create test queries for your knowledge base
const testQueries = [
  'What is our refund policy?',
  'How long does shipping take?',
  'What payment methods do you accept?',
  'How do I cancel my subscription?',
];

async function testRAG() {
  for (const query of testQueries) {
    console.log(`\nTesting: ${query}`);
    
    const completion = await client.chat.completions.create({
      model: 'anthropic/claude-3-sonnet',
      messages: [{ role: 'user', content: query }],
      useRAG: true,
    });
    
    // content may be null, so fall back to an empty string
    const answer = completion.choices[0].message.content ?? '';
    const resultsFound = completion._metadata?.rag?.resultsFound ?? 0;
    
    console.log(`Answer: ${answer.substring(0, 100)}...`);
    console.log(`Sources found: ${resultsFound}`);
    
    // Missing metadata counts as "no sources" rather than a pass
    if (resultsFound === 0) {
      console.warn('❌ No sources found - missing documentation!');
    } else {
      console.log('✅ Sources retrieved');
    }
  }
}

testRAG().catch(console.error);

Pro Tip

Create a "test suite" of common questions and run them regularly to ensure your RAG system is working correctly and returning accurate answers.
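
A test suite is more useful when each question also records what a correct result looks like. The sketch below is one possible shape: `RagTestCase` and `checkCase` are hypothetical, and the check is deliberately crude (expected source file plus keywords in the answer). It takes the answer and source file names as plain arguments, so it can be fed from any response.

```typescript
interface RagTestCase {
  query: string;
  expectedSource: string;     // file that should be retrieved
  expectedKeywords: string[]; // terms the answer should mention
}

interface RagTestResult {
  query: string;
  passed: boolean;
  failures: string[];
}

// Compares one answer and its retrieved file names against the expectations.
function checkCase(testCase: RagTestCase, answer: string, sourceFiles: string[]): RagTestResult {
  const failures: string[] = [];
  if (!sourceFiles.includes(testCase.expectedSource)) {
    failures.push(`missing source: ${testCase.expectedSource}`);
  }
  const lower = answer.toLowerCase();
  for (const keyword of testCase.expectedKeywords) {
    if (!lower.includes(keyword.toLowerCase())) {
      failures.push(`missing keyword: ${keyword}`);
    }
  }
  return { query: testCase.query, passed: failures.length === 0, failures };
}

const refundCase: RagTestCase = {
  query: 'What is our refund policy?',
  expectedSource: 'refund-policy.md',
  expectedKeywords: ['30 days'],
};
checkCase(refundCase, 'Refunds are available within 30 days.', ['refund-policy.md']).passed; // true
```

Keyword checks catch regressions such as a deleted or renamed source file, but they do not prove an answer is correct, so periodic manual review is still worthwhile.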

RAG Optimization Checklist

Before Uploading

  • ☐ Clean and format documents
  • ☐ Use clear headings and structure
  • ☐ Choose descriptive file names
  • ☐ Prepare metadata (category, tags, version)
  • ☐ Remove duplicate content

After Uploading

  • ☐ Verify processing completed successfully
  • ☐ Test with sample queries
  • ☐ Check relevance scores
  • ☐ Verify answers are accurate

Regular Maintenance

  • ☐ Review files monthly
  • ☐ Update outdated content
  • ☐ Delete obsolete files
  • ☐ Monitor storage usage
  • ☐ Test common queries
  • ☐ Track RAG performance metrics

Next Steps