RAG Best Practices: Reducing Hallucinations
RAG (Retrieval-Augmented Generation) combines the knowledge in your own data with the reasoning abilities of LLMs. But without careful implementation, you can still get hallucinations.
The Problem
LLMs are trained to always provide an answer. Even when they don't know, they'll confidently make something up.
The Solution: Grounded Generation
1. Retrieve First, Then Generate
Always retrieve relevant context before generating:
// Step 1: Get relevant memories
const context = await client.search.semantic({
  query: userQuestion,
  limit: 5,
});

// Step 2: Pass to LLM with strict instructions
const prompt = `
Answer based ONLY on the following context.
If the answer isn't in the context, say "I don't know."

Context:
${context.results.map(r => r.content).join('\n')}

Question: ${userQuestion}
`;
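The snippet stops at prompt construction. As a minimal sketch of the generation step, assuming the OpenAI Node SDK (any chat-completion client works the same way, and the model name is just a placeholder):

import OpenAI from 'openai';

const openai = new OpenAI();

// Step 3: Generate the answer from the grounded prompt
const completion = await openai.chat.completions.create({
  model: 'gpt-4o-mini', // assumption: substitute whatever model you use
  messages: [{ role: 'user', content: prompt }],
  temperature: 0, // lower temperature means fewer creative leaps
});

const answer = completion.choices[0].message.content;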
2. Include Source Citations
PiyAPI's TruthMeter helps track which sources support the answer:
const response = await client.context.retrieve({
  query: userQuestion,
  includeCitations: true,
});

// response.citations = [
//   { memoryId: "mem_123", relevance: 0.92, snippet: "..." },
//   { memoryId: "mem_456", relevance: 0.87, snippet: "..." }
// ]
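What you do with the citations is up to you. One simple pattern, sketched here rather than taken from the PiyAPI SDK, is to append them to the generated answer so users can verify each claim (`answer` is the LLM output from step 1):

// Render citations as a numbered source list appended to the answer
const sources = response.citations
  .map((c, i) => `[${i + 1}] ${c.memoryId} (relevance ${c.relevance.toFixed(2)})`)
  .join('\n');

const answerWithSources = `${answer}\n\nSources:\n${sources}`;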
3. Set Confidence Thresholds
Don't use low-relevance results:
const context = await client.search.semantic({
  query: userQuestion,
  limit: 10,
  minRelevance: 0.7, // Only high-confidence matches
});
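A threshold can leave you with nothing, which is exactly when the model is most tempted to improvise. A small guard inside whatever function handles the question (a sketch, not a PiyAPI feature) skips generation entirely in that case:

// If nothing clears the relevance bar, answer honestly instead of generating
if (context.results.length === 0) {
  return "I don't have enough information to answer that.";
}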
Measuring Hallucinations
Track the "grounding rate": the percentage of claims in a generated answer that can be traced back to a retrieved source document. Aim for >95%.
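There is no built-in metric for this; as a rough sketch, you can compute it over a sample of claims labeled by human review or an LLM judge (the claims and labels below are hypothetical):

// Each entry marks whether a claim in a generated answer could be
// traced back to one of the retrieved sources.
const labeledClaims = [
  { claim: 'The refund window is 30 days.', supported: true },
  { claim: 'Refunds are issued within 24 hours.', supported: false },
];

function groundingRate(labels) {
  if (labels.length === 0) return 1;
  return labels.filter(l => l.supported).length / labels.length;
}

console.log(groundingRate(labeledClaims)); // 0.5, far below the 95% target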