Why This Playbook?

RAG is often presented as “retrieve then generate,” but production reliability depends on strict grounding behavior. This playbook focuses on that engineering gap: not just answering from docs, but also refusing unsupported claims in a predictable way.

Repo: https://github.com/amiya-pattnaik/rag-engineering-playbook

What It Demonstrates

  • Local knowledge retrieval: .md/.txt files are chunked, embedded, and indexed in memory.
  • Grounding-first responses: answers must include valid citations from retrieved chunks.
  • Guarded abstention: low-confidence retrieval or unsupported questions return a safe fallback instead of a fabricated answer.
  • Coverage-aware behavior: multi-part questions can return grounded known facts while explicitly marking unsupported parts.
  • UI grounding indicator: clear status for grounded vs blocked outcomes.
  • Scenario-based evals: answerable, unanswerable, and partial cases with pass/fail scoring.
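
To make the "chunked" step concrete, here is a minimal sketch of word-based chunking with overlap. The helper name chunkText, the chunk ID format, and the chunkSize/overlap values are assumptions for illustration; the repo may split on different boundaries.

```javascript
// Hypothetical helper: split text into overlapping word-based chunks.
// chunkSize and overlap are illustrative values, not the repo's settings.
function chunkText(text, { chunkSize = 50, overlap = 10 } = {}) {
  if (overlap >= chunkSize) {
    throw new Error("overlap must be smaller than chunkSize");
  }
  const words = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < words.length; start += step) {
    const slice = words.slice(start, start + chunkSize);
    // Assumed ID scheme: a running index per chunk.
    chunks.push({ id: `chunk-${chunks.length}`, text: slice.join(" ") });
    if (start + chunkSize >= words.length) break; // this window reached the end
  }
  return chunks;
}

// Usage: 12 words, windows of 5 words that share 2 words with the previous one.
const sample = Array.from({ length: 12 }, (_, i) => `w${i}`).join(" ");
const chunks = chunkText(sample, { chunkSize: 5, overlap: 2 });
console.log(chunks.map((c) => c.text));
```

Overlap keeps a sentence that straddles a boundary retrievable from either neighboring chunk, at the cost of some duplicated text in the index.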

Architecture (Simple and Practical)

  • Node.js + Express backend with static HTML UI.
  • Ingestion pipeline: chunking -> embedding -> vector store.
  • Retrieval pipeline: cosine similarity ranking + top-k selection.
  • RAG route validates citations and normalizes unsafe outputs.
  • Scenario runner generates JSON/Markdown reports for repeatable checks.
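
The retrieval step above can be sketched as follows. This is a simplified illustration, not the repo's code: the function names, the store entry shape, and the hand-written 3-dimensional vectors are assumptions standing in for real embeddings.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // treat zero vectors as no match
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored chunk against the query embedding and keep the best k.
function retrieveTopK(queryEmbedding, store, k = 3) {
  return store
    .map((entry) => ({
      id: entry.id,
      text: entry.text,
      score: cosineSimilarity(queryEmbedding, entry.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy in-memory store; real embeddings would come from an embedding model.
const store = [
  { id: "refunds-0", text: "Refunds are issued within 14 days.", embedding: [1, 0, 0] },
  { id: "shipping-0", text: "Shipping takes 3 to 5 days.", embedding: [0, 1, 0] },
  { id: "refunds-1", text: "Refunds require a receipt.", embedding: [0.8, 0.2, 0] },
];

const hits = retrieveTopK([0.9, 0.1, 0], store, 2);
console.log(hits.map((h) => `${h.id} ${h.score.toFixed(3)}`));
```

A linear scan like this is adequate for a small in-memory index; the returned scores are what a threshold check would inspect before generation.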

Key controls:

  • MIN_RETRIEVAL_SCORE: blocks generation when top retrieval quality is weak.
  • MIN_QUESTION_COVERAGE: blocks questions not sufficiently covered by retrieved docs.
  • Citation validation: only retrieved chunk IDs are accepted as grounded references.
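
One way the two retrieval gates could fit together is sketched below. The threshold values, the stopword list, and especially the coverage metric (share of question terms found in the retrieved text) are assumptions made for this example; the repo may define coverage differently.

```javascript
// Illustrative thresholds; a real app would read its own values from .env.
const MIN_RETRIEVAL_SCORE = 0.35;
const MIN_QUESTION_COVERAGE = 0.5;

// Assumed stopword list, kept tiny for the example.
const STOPWORDS = new Set([
  "the", "a", "an", "is", "are", "what", "how", "of", "to", "for", "do", "does", "in",
]);

function contentTerms(text) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t && !STOPWORDS.has(t));
}

// Assumed coverage metric: share of distinct question terms found in retrieved text.
function questionCoverage(question, hits) {
  const terms = [...new Set(contentTerms(question))];
  if (terms.length === 0) return 0;
  const corpus = new Set(hits.flatMap((h) => contentTerms(h.text)));
  return terms.filter((t) => corpus.has(t)).length / terms.length;
}

// Runs before any generation call; returns { allowed } plus a reason when blocked.
function checkRetrievalGates(question, hits) {
  const topScore = hits.length > 0 ? Math.max(...hits.map((h) => h.score)) : 0;
  if (topScore < MIN_RETRIEVAL_SCORE) {
    return { allowed: false, reason: "retrieval_below_threshold" };
  }
  if (questionCoverage(question, hits) < MIN_QUESTION_COVERAGE) {
    return { allowed: false, reason: "question_not_covered" };
  }
  return { allowed: true };
}

const hits = [{ id: "refunds-0", text: "Refunds are issued within 14 days.", score: 0.82 }];
console.log(checkRetrievalGates("How are refunds issued?", hits));
console.log(checkRetrievalGates("What is the CEO salary?", hits));
console.log(checkRetrievalGates("How are refunds issued?", []));
```

Checking both gates before calling the model means a blocked question never reaches generation, so there is no fabricated text to clean up afterwards.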

Grounding Contract in UI

  • Show the grounding status as grounded only when the answer has valid citations from retrieved chunks.
  • For unsupported questions, show a blocked/ungrounded status, never grounded.

Blocked reasons include:

  • retrieval_below_threshold
  • question_not_covered
  • invalid_citations
  • no_grounded_citations
  • non_json_model_output
  • empty_answer
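
As a rough illustration of how the output-side reasons could be assigned, the sketch below validates a raw model response against the retrieved chunk IDs. The { answer, citations } JSON shape, the function name, and the exact condition mapped to each reason are assumptions; only the reason strings come from the list above.

```javascript
// Validate raw model output against the chunk IDs that were actually retrieved.
// The { answer, citations } JSON shape is an assumption for this sketch.
function validateModelOutput(rawOutput, retrievedIds) {
  let parsed;
  try {
    parsed = JSON.parse(rawOutput);
  } catch {
    return { grounded: false, reason: "non_json_model_output" };
  }
  // Valid JSON that is not a plain object (number, array, null) is also rejected.
  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    return { grounded: false, reason: "non_json_model_output" };
  }
  const answer = typeof parsed.answer === "string" ? parsed.answer.trim() : "";
  if (answer === "") {
    return { grounded: false, reason: "empty_answer" };
  }
  const citations = Array.isArray(parsed.citations) ? parsed.citations : [];
  if (citations.length === 0) {
    return { grounded: false, reason: "no_grounded_citations" };
  }
  // Assumed rule: a single citation outside the retrieved set invalidates the answer.
  const allowed = new Set(retrievedIds);
  if (!citations.every((id) => allowed.has(id))) {
    return { grounded: false, reason: "invalid_citations" };
  }
  return { grounded: true, answer, citations };
}

const retrieved = ["refunds-0", "refunds-1"];
console.log(validateModelOutput("Sure! Refunds take 14 days.", retrieved));
console.log(validateModelOutput('{"answer":"14 days","citations":["pricing-9"]}', retrieved));
console.log(validateModelOutput('{"answer":"14 days","citations":["refunds-0"]}', retrieved));
```

Returning a machine-readable reason rather than a generic error lets the UI and the eval suite assert on exactly why a response was blocked.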

Quickstart

git clone https://github.com/amiya-pattnaik/rag-engineering-playbook.git
cd rag-engineering-playbook/demo-app
cp .env.example .env
npm install
npm run dev
# open http://localhost:3000

Mock mode works offline by default.

To use OpenAI provider mode:

  • Set OPENAI_API_KEY in .env.
  • Keep OPENAI_TEMPERATURE=0 for more repeatable outputs; a temperature of 0 reduces run-to-run variation but does not guarantee identical responses.

Scenario Runner and Anti-Hallucination Eval

# run all default scenarios
npm run demo:scenarios

# run dedicated anti-hallucination suite
npm run demo:anti-hallucination

The anti-hallucination suite validates:

  • Answerable: must be grounded and include expected facts.
  • Unanswerable: must abstain and avoid grounded factual claims.
  • Partial: must provide grounded known facts and explicitly abstain on unknown parts.
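
The three rules above could be scored roughly as shown below. The scenario and result shapes (type, expectedFacts, grounded, answer, unsupported) are hypothetical and are not taken from the repo's scenario files; the sample cases are made-up data for the example.

```javascript
// Score one scenario result. Both shapes are assumptions for this sketch:
// scenario: { name, type, expectedFacts }, result: { grounded, answer, unsupported }.
function scoreScenario(scenario, result) {
  const answer = (result.answer || "").toLowerCase();
  const hasFacts = (scenario.expectedFacts || []).every((fact) =>
    answer.includes(fact.toLowerCase())
  );
  switch (scenario.type) {
    case "answerable":
      return Boolean(result.grounded) && hasFacts;
    case "unanswerable":
      return !result.grounded;
    case "partial":
      // Grounded facts plus at least one part explicitly marked unsupported.
      return Boolean(result.grounded) && hasFacts && (result.unsupported || []).length > 0;
    default:
      throw new Error(`unknown scenario type: ${scenario.type}`);
  }
}

function runSuite(cases) {
  const results = cases.map(({ scenario, result }) => ({
    name: scenario.name,
    pass: scoreScenario(scenario, result),
  }));
  const failed = results.filter((r) => !r.pass).length;
  return { total: results.length, failed, results };
}

const summary = runSuite([
  {
    scenario: { name: "refund window", type: "answerable", expectedFacts: ["14 days"] },
    result: { grounded: true, answer: "Refunds are issued within 14 days." },
  },
  {
    scenario: { name: "ceo salary", type: "unanswerable" },
    result: { grounded: false, answer: "I cannot answer that from the provided docs." },
  },
  {
    scenario: { name: "refund and salary", type: "partial", expectedFacts: ["14 days"] },
    result: { grounded: true, answer: "Refunds are issued within 14 days.", unsupported: ["CEO salary"] },
  },
  {
    scenario: { name: "leaky abstention", type: "unanswerable" },
    result: { grounded: true, answer: "The CEO earns a lot." },
  },
]);
console.log(`${summary.total - summary.failed}/${summary.total} scenarios passed`);
```

A runner built on this could set a non-zero exit code whenever failed is greater than zero, which is what makes the suite usable as a quality gate.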

Latest run in this repo: 10/10 scenarios passed.

Why This Matters for Engineering and QA

  • Behavioral confidence: you can test grounding policy, not just response style.
  • Safer demos and pilots: unsupported asks are blocked with explicit reasons.
  • CI-friendly quality gate: the eval run exits with a non-zero status when any scenario fails.
  • Extensible foundation: add knowledge docs, providers, or workflows without rewriting the core contract.

Extending

  • Add domain docs under demo-app/data/knowledge/ and run npm run ingest.
  • Tune retrieval/grounding with TOP_K, MIN_RETRIEVAL_SCORE, and MIN_QUESTION_COVERAGE.
  • Expand demo-app/scenarios/anti-hallucination-eval.json with domain-specific answerable/unanswerable/partial cases.
  • Add providers in demo-app/src/providers/ and keep strict JSON + citation validation in demo-app/src/services/rag.js.

Notes

  • The vector store is in-memory for demo simplicity; server restart rebuilds state from knowledge files.
  • grounded means the citations are valid against the retrieved chunks; it does not guarantee factual correctness beyond the provided docs.
  • Mock mode is deterministic for offline demos; provider mode quality depends on model and prompt adherence.

Closing Thought

Reliable RAG is less about elegant prompts and more about explicit engineering contracts: retrieval thresholds, coverage checks, citation checks, and measurable evals. This playbook keeps those contracts visible and runnable.