Deep Research
Use Constclaw's Expand & Elaborate mode to dive deep into complex topics. This example shows how a short passage can be expanded into a comprehensive research briefing.
Retrieval-Augmented Generation (RAG) — Deep Dive
Background
RAG was introduced by Lewis et al. (2020) at Facebook AI Research. The core insight: large language models store knowledge in their parameters, but this knowledge is static, limited by training data, and expensive to update. RAG solves this by adding a retrieval step that fetches current, relevant documents at inference time.
Architecture
A RAG system has two main components:
- Retriever — Typically a dense passage retriever (DPR) that encodes both the query and documents into vector embeddings, then finds the top-k most similar documents using approximate nearest neighbor (ANN) search. Common tools: FAISS, Pinecone, Weaviate, ChromaDB.
- Generator — A seq2seq model (originally BART; now commonly GPT-4, Claude, or another large LLM) that receives the query concatenated with the retrieved passages and generates the final answer.
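The two components above can be sketched end to end. This is a minimal, self-contained illustration that substitutes a toy bag-of-words embedding for a trained dense encoder and brute-force cosine similarity for ANN search; the `embed`, `retrieve`, and `build_prompt` names are illustrative, not from any library.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real retriever would use a trained
    # dense encoder (e.g. a DPR or sentence-embedding model).
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs, k=2):
    # Brute-force top-k similarity search; production systems replace this
    # with an ANN index such as FAISS.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, passages):
    # The generator receives the query concatenated with the retrieved passages.
    context = "\n\n".join(passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = [
    "RAG adds a retrieval step before generation.",
    "BART is a seq2seq model.",
    "FAISS performs approximate nearest neighbor search.",
]
passages = retrieve("What does RAG add before generation?", docs)
prompt = build_prompt("What does RAG add before generation?", passages)
```

In a full system the final `prompt` would be sent to the generator model; the retrieval and prompt-assembly steps shown here are the parts RAG adds around it.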
RAG vs. fine-tuning
Fine-tuning bakes knowledge into model weights: it is expensive to repeat, hard to audit, and goes stale as soon as training ends. RAG keeps knowledge in an external index that can be updated at any time, and lets answers be traced back to the passages that support them. The two are complementary in practice: fine-tuning shapes style and task behavior, while retrieval supplies current facts.
Advanced techniques
- Hybrid search — Combine dense (semantic) and sparse (BM25 keyword) retrieval for better recall
- Re-ranking — Use a cross-encoder to re-score retrieved passages before generation
- Chunking strategies — Split documents by semantic boundaries rather than fixed token windows
- Query decomposition — Break complex queries into sub-queries, retrieve for each, then synthesize
- Self-RAG — The model decides when to retrieve and critically evaluates the retrieved context
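A common way to implement the hybrid-search idea above is reciprocal rank fusion (RRF), which merges the ranked lists from the dense and sparse retrievers using only their ranks. A small sketch, where the doc-id strings and the conventional k=60 smoothing constant are illustrative:

```python
def rrf_fuse(rankings, k=60):
    # rankings: one ranked list of doc ids per retriever (dense, BM25, ...).
    # Each document scores 1/(k + rank + 1) per list it appears in;
    # documents ranked well by multiple retrievers rise to the top.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["d3", "d1", "d2"]   # semantic similarity order
sparse = ["d1", "d4", "d3"]  # BM25 keyword order
fused = rrf_fuse([dense, sparse])
```

Because RRF uses ranks rather than raw scores, it sidesteps the problem that dense similarity and BM25 scores live on incompatible scales.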
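As a sketch of the semantic-boundary chunking strategy, the helper below packs whole sentences into size-bounded chunks instead of cutting at a fixed token window; `chunk_by_sentences` and its sentence-splitting regex are assumptions for illustration, not from any library.

```python
import re

def chunk_by_sentences(text, max_chars=200):
    # Split on sentence-ending punctuation, then greedily pack whole
    # sentences into chunks no longer than max_chars. A sentence is never
    # cut mid-way; a single sentence longer than max_chars becomes its
    # own oversized chunk.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for s in sentences:
        if current and len(current) + len(s) + 1 > max_chars:
            chunks.append(current)
            current = s
        else:
            current = f"{current} {s}".strip() if current else s
    if current:
        chunks.append(current)
    return chunks

chunks = chunk_by_sentences(
    "RAG adds retrieval. It fetches passages. Then the model generates an answer.",
    max_chars=40,
)
```

Production chunkers often add overlap between chunks or split on headings and paragraphs as well, but the core idea is the same: boundaries should follow the document's meaning, not a fixed token count.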
Real-world applications
RAG powers customer support chatbots (retrieval over help docs), enterprise search (internal knowledge bases), legal research (case law retrieval), and medical Q&A (retrieval over PubMed). It's the dominant architecture for production LLM applications that need factual accuracy and up-to-date knowledge.
Key papers
- Lewis et al. (2020) — "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks"
- Karpukhin et al. (2020) — "Dense Passage Retrieval for Open-Domain Question Answering"
- Asai et al. (2023) — "Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection"
Try it yourself
Select any paragraph from a research paper or technical article, choose Expand & Elaborate, and click Analyze. Constclaw will generate a comprehensive briefing with background, comparisons, and further reading.