Retrieval-Augmented Generation (RAG)

Benefits of using RAG

Applications of RAG

RAG paradigms

Three main approaches/paradigms

  • Naive RAG: This is the simplest RAG approach. It retrieves relevant document chunks based on a user query and provides them as context for an LLM to generate a response.
  • Advanced RAG: Building on naive RAG, advanced versions incorporate optimization strategies for better retrieval accuracy and LLM context integration.
  • Modular RAG: The most flexible of the three, this architecture breaks the process into modules that can be swapped and customized for specific tasks, offering better control and adaptability.

Naive RAG: The Simplest Retrieval-Generative Integration
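
The retrieve-then-generate flow of naive RAG can be sketched in a few lines. This is a minimal illustration, not a production design: word overlap stands in for a real retriever, and `generate` is a hypothetical callable representing whatever LLM client you use. All function names and the sample chunks are assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def overlap_score(query: str, chunk: str) -> int:
    """Toy relevance score: number of query tokens that also occur in the chunk."""
    q, c = Counter(tokenize(query)), Counter(tokenize(chunk))
    return sum(min(q[t], c[t]) for t in q)


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks with the highest overlap score."""
    ranked = sorted(chunks, key=lambda ch: overlap_score(query, ch), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Place the retrieved chunks in front of the user question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


def naive_rag(query, chunks, generate, k=2):
    """Retrieve, then read. `generate` is any callable mapping a prompt to text
    (hypothetical stand-in for an LLM call)."""
    return generate(build_prompt(query, retrieve(query, chunks, k)))


# Sample corpus, invented for illustration.
CHUNKS = [
    "RAG retrieves document chunks and passes them to an LLM as context.",
    "Bananas are rich in potassium.",
    "Naive RAG uses a single retrieval step before generation.",
]
```

Passing `generate=lambda prompt: prompt` returns the assembled prompt itself, which is a quick way to inspect what the model would receive.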

Limitations of naive RAG

Challenges by stage:

Retrieval
  • Difficulty in finding relevant and accurate information
  • Inclusion of irrelevant chunks
  • Missing crucial details

Generation
  • Potential for responses unsupported by the retrieved context (hallucination)
  • Risk of generating irrelevant, toxic, or biased responses

Augmentation
  • Challenges in effectively integrating retrieved information
  • Disjointed outputs
  • Redundancy
  • Complexity in determining the significance of different passages
  • Ensuring stylistic consistency

Advanced RAG: Enhancing Model Efficiency and Accuracy

Pre-retrieval process

Optimizing indexing

  • Enhancing data granularity (chunking strategies)
  • Enhancing data granularity (data cleaning)
  • Multi-representation indexing
  • Self-querying retrieval
  • Optimizing index structures
  • Parent document retrieval
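
Two of the indexing ideas above, chunking and parent document retrieval, can be illustrated together. The sketch below assumes word-based fixed-size chunks with overlap (one of many possible chunking strategies) and a simple list of `(child_chunk, parent_id)` pairs as the index; the function names and parameters are hypothetical.

```python
def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words
    with the previous chunk."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def build_child_index(parents: dict[str, str], size: int = 50,
                      overlap: int = 10) -> list[tuple[str, str]]:
    """Split every parent document into small child chunks and remember which
    parent each child came from. Search runs over the children; the parent id
    lets the caller hand the full parent text to the LLM."""
    return [
        (child, parent_id)
        for parent_id, text in parents.items()
        for child in chunk_words(text, size, overlap)
    ]
```

Small children tend to match queries more precisely, while the parent lookup restores the surrounding context that a small chunk would otherwise lose.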

Optimizing query

  • Multi-query
  • Decomposition
  • Step back prompting
  • Query routing
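
As a rough sketch of two of these query optimizations: multi-query retrieval runs several rewordings of the question and merges the hits, and query routing sends the question to the most suitable index. In practice the query variants and the routing decision often come from an LLM; here the variants are passed in by the caller and routing uses keyword sets, both simplifying assumptions. `Retriever`, `multi_query_retrieve`, and `route_query` are hypothetical names.

```python
from typing import Callable

# Any function that maps a query string to a ranked list of documents.
Retriever = Callable[[str], list[str]]


def multi_query_retrieve(variants: list[str], retriever: Retriever) -> list[str]:
    """Run every query variant and merge the hits, dropping duplicates while
    keeping first-seen order."""
    seen: dict[str, None] = {}
    for query in variants:
        for doc in retriever(query):
            seen.setdefault(doc, None)
    return list(seen)


def route_query(query: str, routes: dict[str, set[str]], default: str) -> str:
    """Pick the route whose keyword set overlaps the query the most;
    fall back to `default` when nothing matches."""
    words = set(query.lower().split())
    best, best_hits = default, 0
    for name, keywords in routes.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = name, hits
    return best
```

For example, with routes `{"billing": {"invoice", "refund"}, "account": {"password", "login"}}`, the query "how do I reset my password" is routed to `"account"`.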

Retrieval process

  • Query vectorization
  • Similarity search
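
The two retrieval steps can be sketched as follows. Real systems typically use a learned embedding model and a vector index; this example substitutes plain term-count vectors and a linear scan so that it runs with the standard library only. The function names are illustrative assumptions.

```python
import math
import re
from collections import Counter


def vectorize(text: str) -> Counter:
    """Turn text into a sparse term-count vector (stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def similarity_search(query: str, docs: list[str], k: int = 3) -> list[tuple[float, str]]:
    """Vectorize the query, score every document, and return the top k
    as (score, document) pairs, best first."""
    query_vec = vectorize(query)
    scored = [(cosine(query_vec, vectorize(d)), d) for d in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Swapping `vectorize` for an embedding model changes the quality of the matches but not the shape of the pipeline: the query and the documents must be mapped into the same vector space before similarity is computed.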

Post-retrieval process

  • Reranking
  • Context compression
  • Inference-based filtering
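
Reranking and context compression can be illustrated with a small sketch. Production rerankers are often dedicated models that score each query and passage pair; here a pluggable `score` function is assumed, with term overlap as a toy default. The compression step keeps only sentences that share a term with the query, which is one simple heuristic among many. All names are hypothetical.

```python
import re


def _tokens(text: str) -> set[str]:
    """Set of lowercase alphanumeric tokens in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def term_overlap(query: str, passage: str) -> float:
    """Fraction of query terms that appear in the passage (toy scorer)."""
    q = _tokens(query)
    return len(q & _tokens(passage)) / len(q) if q else 0.0


def rerank(query: str, passages: list[str], score=term_overlap,
           top_n: int = 3) -> list[str]:
    """Reorder passages by score(query, passage), highest first, keep top_n."""
    return sorted(passages, key=lambda p: score(query, p), reverse=True)[:top_n]


def compress(query: str, passage: str) -> str:
    """Keep only the sentences that share at least one term with the query."""
    q = _tokens(query)
    sentences = re.split(r"(?<=[.!?])\s+", passage.strip())
    return " ".join(s for s in sentences if q & _tokens(s))
```

Running `rerank` before `compress` means the more expensive filtering only touches the passages that survive the cut.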

Trade-offs of pre- and post-retrieval processes

Pre-retrieval

Benefits:
  • More relevant results: Filters out irrelevant information before the LLM sees it, improving overall accuracy.
  • Faster searches: Optimized indexing structures significantly improve search speed.
  • Focused context: Provides LLMs with specific information for better responses, reducing the risk of getting sidetracked by irrelevant details.
  • Reduced noise and redundancy: Cleans irrelevant details for clearer and more concise outputs.
  • Simpler implementation: Requires less complex algorithms compared to some post-retrieval techniques.

Drawbacks:
  • Requires additional processing power upfront: Chunking, indexing, and data manipulation can be computationally expensive.
  • Less flexibility: Limited ability to adapt retrieval based on specific queries or tasks compared to post-retrieval.
  • Potential for bias: Pre-defined filtering criteria in pre-retrieval might unintentionally exclude relevant information.

Post-retrieval

Benefits:
  • More flexibility in refining results: Allows for adaptation based on specific needs of the query or task.
  • Can leverage additional information: Can incorporate factors not used in pre-retrieval (e.g., passage coherence) for better ranking.
  • Potential for error correction: Can identify and remove irrelevant passages retrieved during the initial search.

Drawbacks:
  • May require more processing power depending on techniques used: Re-ranking and filtering algorithms can add computational overhead.
  • Less control over initial retrieval accuracy: Relies on the quality of the pre-retrieval stage for a good pool of candidate passages.
  • More complex implementation: Requires sophisticated algorithms for effective post-retrieval processing.

Modular RAG: Building Flexible RAG Pipelines

  • Search module
  • RAG-Fusion
  • Memory module
  • Routing module
  • Predict module
  • Task adapter module
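
As one concrete example of a swappable module, RAG-Fusion is commonly described as running several query variants and merging their ranked results with reciprocal rank fusion (RRF). The sketch below shows only the fusion step and assumes the per-query rankings already exist; `k = 60` is the constant usually quoted for RRF, and the function name is an assumption.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]],
                           k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists into one.

    Each document earns 1 / (k + rank) from every list it appears in
    (rank is 1-based), so documents ranked well by many queries rise
    to the top of the fused list.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Because the function only needs ranked lists as input, it can sit behind any combination of retrievers, which is the kind of interchangeability the modular paradigm aims for.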

Choosing the Best Approach for Your RAG Application

Naive RAG (Retrieve-Read)
  • Description: Simplest paradigm with indexing, retrieval, and generation.
  • Pros: Easy to implement, computationally efficient.
  • Cons: Limited control over retrieved information; the LLM might struggle with synthesis.
  • Applications: Simple question answering, short document summarization.

Advanced RAG (Retrieve-Read-Rewrite-Rerank)
  • Description: Builds on Naive RAG with pre-retrieval and post-retrieval processing for improved retrieval quality.
  • Pros: More control over information, improved response relevance.
  • Cons: More complex to implement than Naive RAG.
  • Applications: Complex question answering, longer document summarization.

Modular RAG (Flexible Architecture)
  • Description: Most versatile paradigm with specialized modules for enhanced retrieval and processing.
  • Pros: Highly customizable, allows for experimentation and innovation.
  • Cons: Most complex to implement; requires deeper understanding of individual RAG components.
  • Applications: Domain-specific question answering, tailored creative text generation tasks.

Factors to consider when choosing:
  • Complexity of the task
  • Domain-specific requirements
  • Computational resources