DavidZWZ/Awesome-RAG-Reasoning

Awesome-RAG-Reasoning

Awesome arXiv Maintenance Contribution Welcome Code License: MIT

RAG and Reasoning System Overview

A curated collection of resources, papers, tools, and implementations that bridge the gap between Retrieval-Augmented Generation (RAG) and Reasoning in Large Language Models and Agents. This repository brings together traditionally separate research domains to enable more powerful Agentic AI systems.

📖 Related Survey: This repository is based on the taxonomy and framework presented in "Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs", featured 🏆 in Hugging Face Daily Papers.

🔍 Dive Deeper: For researchers interested in the latest developments in Agentic Deep Research, including cutting-edge papers and industry-leading deep research products, we recommend exploring our comprehensive collection at Awesome-Deep-Research 🔥🔥🔥.

If you find this repository useful, please cite our papers:

@article{li2025towards,
  title={Towards Agentic RAG with Deep Reasoning: A Survey of RAG-Reasoning Systems in LLMs},
  author={Li, Yangning and Zhang, Weizhi and Yang, Yuyao and Huang, Wei-Chieh and Wu, Yaozu and Luo, Junyu and Bei, Yuanchen and Zou, Henry Peng and Luo, Xiao and Zhao, Yusheng and others},
  journal={arXiv preprint arXiv:2507.09477},
  year={2025}
}

@article{zhang2025web,
  title={From Web Search towards Agentic Deep Research: Incentivizing Search with Reasoning Agents},
  author={Zhang, Weizhi and Li, Yangning and Bei, Yuanchen and Luo, Junyu and Wan, Guancheng and Yang, Liangwei and Xie, Chenxuan and Yang, Yuyao and Huang, Wei-Chieh and Miao, Chunyu and others},
  journal={arXiv preprint arXiv:2506.18959},
  year={2025}
}

📖 Introduction

🔍 Retrieval-Augmented Generation (RAG) has emerged as a powerful paradigm that combines the strengths of large language models with external knowledge retrieval. By augmenting language models with relevant information from external sources, RAG systems can provide more accurate, up-to-date, and factual responses while maintaining the generative capabilities of modern LLMs.

  • Limitations:
    • May retrieve irrelevant or inaccurate information
    • Limited by the quality and coverage of external knowledge bases
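As a rough illustration of the retrieve-then-generate pattern described above, the sketch below ranks passages from a toy in-memory corpus by word overlap and assembles a prompt for a language model. The corpus, the `tokenize`, `retrieve`, and `build_prompt` helpers, and the overlap scoring are all assumptions made for illustration; they are not taken from any paper in this list, and the final LLM call is deliberately left out.

```python
from collections import Counter

# Toy in-memory corpus (hypothetical); a real system would query an external knowledge base.
CORPUS = [
    "The Eiffel Tower is located in Paris.",
    "Mount Everest is the highest mountain above sea level.",
    "Python is a programming language created by Guido van Rossum.",
]


def tokenize(text: str) -> list[str]:
    """Lowercase the text and strip basic punctuation from each word."""
    return [w.strip(".,?!").lower() for w in text.split()]


def retrieve(query: str, corpus: list[str], k: int = 1) -> list[str]:
    """Rank passages by word overlap with the query (a stand-in for a real retriever)."""
    q = Counter(tokenize(query))
    scored = [(sum((q & Counter(tokenize(p))).values()), p) for p in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop passages that share no words with the query.
    return [p for score, p in scored[:k] if score > 0]


def build_prompt(query: str, passages: list[str]) -> str:
    """Prepend the retrieved passages to the question as context."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


question = "Where is the Eiffel Tower?"
prompt = build_prompt(question, retrieve(question, CORPUS))
print(prompt)
```

Both limitations listed above show up even in this toy: an overlap-based retriever can return a passage that merely shares common words with the query, and nothing can be retrieved that the corpus does not contain.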

🧠 Reasoning has recently gained significant popularity as a complementary approach to enhance LLM performance. Reasoning techniques focus on improving the model's ability to process information, perform logical analysis, and arrive at conclusions through structured thinking processes. These methods enable LLMs to tackle complex problems that require multi-step inference, causal understanding, and systematic problem-solving.

  • Limitations:
    • Often hallucinates or mis-grounds facts
    • Struggles with up-to-date or domain-specific information

Although RAG and Reasoning address complementary aspects of a model's capabilities, they have been developed largely independently, with separate research communities, methodologies, and evaluation benchmarks.

This repository serves as a comprehensive collection that bridges these traditionally separate domains, providing resources for researchers and practitioners interested in combining the strengths of both approaches.

RAG and Reasoning Taxonomy

Why RAG + Reasoning?

Large Language Models (LLMs) serve as the foundation for modern AI systems, but they face significant limitations in both knowledge access and reasoning capabilities. While RAG excels at providing factual knowledge and reasoning excels at logical processing, real-world problems often require both capabilities simultaneously. Complex queries demand not just access to relevant information, but also the ability to reason through that information systematically.

Real-World Impact: This combination enables AI systems to tackle complex problems that require both knowledge retrieval and sophisticated reasoning, such as scientific research, legal analysis, medical diagnosis, and strategic planning.

RAG and Reasoning Framework

The Reasoning-Enhanced RAG methods and RAG-Enhanced Reasoning methods represent one-way enhancements. In contrast, the Synergized RAG-Reasoning System performs reasoning and retrieval iteratively, enabling mutual enhancements.
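The iterative pattern can be sketched as a loop in which each reasoning step either issues a new retrieval query or commits to an answer, and each retrieval result feeds the next reasoning step. Everything in this sketch is a hypothetical stand-in: `KB` is a toy lookup table, `reason` is a hard-coded stub where a real system would call an LLM, and `synergized_loop` is not the algorithm of any specific paper listed here.

```python
from typing import Optional

# Toy knowledge base keyed by exact query string (hypothetical, for illustration only).
KB = {
    "author of Hamlet": "William Shakespeare",
    "birthplace of William Shakespeare": "Stratford-upon-Avon",
}


def retrieve(query: str) -> Optional[str]:
    """Look up a query in the toy knowledge base."""
    return KB.get(query)


def reason(question: str, evidence: list[str]) -> tuple[str, str]:
    """Stub for an LLM reasoning step: returns ("search", query) or ("answer", text)."""
    if not evidence:
        return "search", "author of Hamlet"
    if len(evidence) == 1:
        # The next query depends on what the previous retrieval returned.
        return "search", f"birthplace of {evidence[0]}"
    return "answer", evidence[-1]


def synergized_loop(question: str, max_steps: int = 5) -> Optional[str]:
    """Alternate reasoning and retrieval until the reasoner commits to an answer."""
    evidence: list[str] = []
    for _ in range(max_steps):
        action, payload = reason(question, evidence)
        if action == "answer":
            return payload
        result = retrieve(payload)
        if result is None:
            return None  # retrieval failed; a real system might reformulate the query
        evidence.append(result)
    return None


print(synergized_loop("Where was the author of Hamlet born?"))
```

A one-way pipeline would fix the queries up front; here the second query cannot be written until the first retrieval has returned, which is the mutual dependence the paragraph above describes.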


What This Repository Covers

Below you will find a curated selection of research papers, open-source implementations, and benchmarking datasets that drive progress in RAG and Reasoning.

  • 📚 Research Papers and Frameworks: Latest academic publications and open-source implementations that advance the integration of RAG and Reasoning.
  • 📊 Benchmarks and Datasets: The table linked below covers a diverse range of tasks. Each benchmark is annotated with its domain, knowledge type, reasoning capability, and dataset size.
  • 🤝 Contributing: Guidelines for contributing to this repository and adding citation information.


📚 Research Papers and Frameworks: This section is organized according to the taxonomy in our research paper, providing resources for researchers and practitioners to explore, implement, and motivate new methods in the field.

Reasoning-Enhanced RAG

Retrieval Optimization

  • (AAAI 2025) MaFeRw: Query Rewriting with Multi-Aspect Feedbacks for Retrieval-Augmented Large Language Models [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) DeepRetrieval: Hacking Real Search Engines and Retrievers with Large Language Models via Reinforcement Learning [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) Credible Plan-Driven RAG Method for Multi-Hop Question Answering [Paper]

  • (ArXiv 2025) FIND: Fine-grained Information Density Guided Adaptive Retrieval-Augmented Generation for Disease Diagnosis [Paper]

  • (ArXiv 2025) LLM-Independent Adaptive RAG: Let the Question Speak for Itself [Paper] [Code]

  • (ACL 2024) Chain-of-Verification Reduces Hallucination in Large Language Models [Paper]

  • (EMNLP 2024) Learning to Plan for Retrieval-Augmented Large Language Models from Knowledge Graphs [Paper] [Code] GitHub Repo stars

  • (EMNLP 2024) Retrieval and Reasoning on KGs: Integrate Knowledge Graphs into Large Language Models for Complex Question Answering [Paper] [Code] GitHub Repo stars

  • (NAACL 2024) Adaptive-RAG: Learning to Adapt Retrieval-Augmented Large Language Models through Question Complexity [Paper] [Code] GitHub Repo stars

  • (SIGIR 2024) Can Query Expansion Improve Generalization of Strong Cross-Encoder Rankers? [Paper]

  • (LREC-COLING 2024) RADCoT: Retrieval-Augmented Distillation to Specialization Models for Generating Chain-of-Thoughts in Query Expansion [Paper] [Code] GitHub Repo stars

  • (ArXiv 2024) GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning [Paper] [Code] GitHub Repo stars

  • (ArXiv 2024) RuleRAG: Rule-Guided Retrieval-Augmented Generation with Language Models for Question Answering [Paper] [Code]

Integration Enhancement

  • (ArXiv 2025) DualRAG: A Dual-Process Approach to Integrate Reasoning and Retrieval for Multi-Hop Question Answering [Paper]

  • (EMNLP 2024) SEER: Self-Aligned Evidence Extraction for Retrieval-Augmented Generation [Paper] [Code] GitHub Repo stars

  • (ICLR 2024) Making Retrieval-Augmented Language Models Robust to Irrelevant Context [Paper] [Code] GitHub Repo stars

  • (ACL 2024) BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering [Paper]

Generation Enhancement

  • (AAAI 2025) Improving Retrieval Augmented Language Model with Self-Reasoning [Paper]

  • (ArXiv 2025) RARE: Retrieval-Augmented Reasoning Enhancement for Large Language Models [Paper]

  • (ArXiv 2025) AlignRAG: Leveraging Critique Learning for Evidence-Sensitive Retrieval-Augmented Reasoning [Paper] [Code] GitHub Repo stars

  • (EMNLP 2024) Open-RAG: Enhanced Retrieval Augmented Reasoning with Open-Source Large Language Models [Paper] [Code] GitHub Repo stars

  • (EMNLP 2024) TRACE the evidence: Constructing knowledge-grounded reasoning chains for retrieval-augmented generation [Paper] [Code] GitHub Repo stars

⬆️ Back to Table of Contents

RAG-Enhanced Reasoning

External Knowledge Retrieval

Knowledge Base

  • (ICLR 2025) KBLaM: Knowledge Base augmented Language Model [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) Assisting Mathematical Formalization with A Learning-based Premise Retriever [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) ReaRAG: Knowledge-guided Reasoning Enhances Factuality of Large Reasoning Models with Iterative Retrieval Augmented Generation [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) Scaling Test-Time Inference with Policy-Optimized, Dynamic Retrieval-Augmented Generation via KV Caching and Decoding [Paper]

  • (ArXiv 2025) PIKE-RAG: sPecIalized KnowledgE and Rationale Augmented Generation [Paper] [Code] GitHub Repo stars

  • (SIGIR 2024) Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering [Paper]

  • (ICCBR 2024) CBR-RAG: Case-Based Reasoning for Retrieval Augmented Generation in LLMs for Legal Question Answering [Paper] [Code] GitHub Repo stars

  • (LLM4Code 2024) LLM-based and Retrieval-Augmented Control Code Generation [Paper]

  • (ArXiv 2024) MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries [Paper] [Code] GitHub Repo stars

  • (MDPI 2024) CRP-RAG: A Retrieval-Augmented Generation Framework for Supporting Complex Logical Reasoning and Knowledge Planning [Paper]

Web Retrieval

  • (ICTIR 2025) Distillation and Refinement of Reasoning in Small Language Models for Document Re-ranking [Paper] [Code] GitHub Repo stars

  • (NAACL 2025) Step-by-Step Fact Verification System for Medical Claims with Explainable Reasoning [Paper] [Code] GitHub Repo stars

  • (COLM 2024) Web Retrieval Agents for Evidence-Based Misinformation Detection [Paper] [Code] GitHub Repo stars

  • (EMNLP 2024) OPEN-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models [Paper] [Code] GitHub Repo stars

  • (ACL 2024) FRVA: Fact-Retrieval and Verification Augmented Entailment Tree Generation for Explainable Question Answering [Paper]

  • (FEVER 2024) RAGAR, Your Falsehood Radar: RAG-Augmented Reasoning for Political Fact-Checking using Multimodal Large Language Models [Paper]

  • (LREC-COLING 2024) PACAR: Automated Fact-Checking with Planning and Customized Action Reasoning using Large Language Models [Paper]

Tool Using

  • (COLING 2025) Efficient Tool Use with Chain-of-Abstraction Reasoning [Paper]

  • (NAACL 2025) Meta-Reasoning Improves Tool Use in Large Language Models [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) Self-Training Large Language Models for Tool-Use Without Demonstrations [Paper] [Code] GitHub Repo stars

  • (ICLR 2024) Large Language Models As Tool Makers [Paper] [Code] GitHub Repo stars

  • (ICLR 2024) ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs [Paper] [Code] GitHub Repo stars

  • (NeurIPS 2024) AVATAR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning [Paper] [Code] GitHub Repo stars

  • (EMNLP 2024) Re-Invoke: Tool Invocation Rewriting for Zero-Shot Tool Retrieval [Paper]

  • (EMNLP 2024) SCIAGENT: Tool-augmented Language Models for Scientific Reasoning [Paper]

  • (EMNLP 2024) RAR: Retrieval-augmented retrieval for code generation in low-resource languages [Paper]

  • (ACL 2024) MORE: Multi-mOdal REtrieval Augmented Generative Commonsense Reasoning [Paper] [Code] GitHub Repo stars

  • (LREC-COLING 2024) Towards Autonomous Tool Utilization in Language Models: A Unified, Efficient and Scalable Framework [Paper]

  • (NAACL 2024) Making Language Models Better Tool Learners with Execution Feedback [Paper] [Code] GitHub Repo stars

  • (NeurIPS 2023) ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings [Paper] [Code] GitHub Repo stars

In-context Retrieval

Prior Experience

  • (ICLR 2025) Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning [Paper] [Code] GitHub Repo stars

  • (ICLR 2025) Human-like Episodic Memory for Infinite Context LLMs [Paper]

  • (IEEE TPAMI 2025) JARVIS-1: Open-World Multi-Task Agents With Memory-Augmented Multimodal Language Models [Paper]

  • (ArXiv 2025) Reasoning Under 1 Billion: Memory-Augmented Reinforcement Learning for Large Language Models [Paper]

  • (ArXiv 2025) Review of Case-Based Reasoning for LLM Agents: Theoretical Foundations, Architectural Components, and Cognitive Integration [Paper]

  • (NeurIPS 2024) CoPS: Empowering LLM Agents with Provable Cross-Task Experience Sharing [Paper] [Code] GitHub Repo stars

  • (CHI EA 2024) "My agent understands me better": Integrating Dynamic Human-like Memory Recall and Consolidation in LLM-Based Agents [Paper]

  • (ArXiv 2024) Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level [Paper]

  • (ArXiv 2024) RAP: Retrieval-Augmented Planning with Contextual Memory for Multimodal LLM Agents [Paper]

Example or Training Data

  • (ICLR 2025) OpenRAG: Optimizing RAG End-to-End via In-Context Retrieval Learning [Paper]

  • (COLING 2025) PERC: Plan-As-Query Example Retrieval for Underrepresented Code Generation [Paper]

  • (IJCAI 2024) Recall, Retrieve and Reason: Towards Better In-Context Relation Extraction [Paper]

  • (NeurIPS 2024) Mixture of Demonstrations for In-Context Learning [Paper] [Code] GitHub Repo stars

  • (EACL 2024) Learning to Retrieve In-Context Examples for Large Language Models [Paper] [Code] GitHub Repo stars

  • (EMNLP 2023) UPRISE: Universal Prompt Retrieval for Improving Zero-Shot Evaluation [Paper] [Code] GitHub Repo stars

  • (ArXiv 2023) Dr.ICL: Demonstration-Retrieved In-context Learning [Paper]

⬆️ Back to Table of Contents

Synergized RAG and Reasoning

Reasoning Workflow

Chain-based

  • (ICLR 2025) Long-Context LLMs Meet RAG: Overcoming Challenges for Long Inputs in RAG [Paper]

  • (ArXiv 2025) Chain-of-Retrieval Augmented Generation [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) CoT-RAG: Integrating Chain of Thought and Retrieval-Augmented Generation to Enhance Reasoning in Large Language Models [Paper]

  • (ArXiv 2025) RankCoT: Refining Knowledge for Retrieval-Augmented Generation through Ranking Chain-of-Thoughts [Paper] [Code] GitHub Repo stars

  • (EMNLP 2024) Retrieving, Rethinking and Revising: The Chain-of-Verification Can Improve Retrieval Augmented Generation [Paper]

  • (EMNLP 2024) Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models [Paper]

  • (COLM 2024) RAFT: Adapting Language Model to Domain Specific RAG [Paper] [Code] GitHub Repo stars

  • (ArXiv 2024) RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation [Paper] [Code] GitHub Repo stars

  • (ArXiv 2024) TRACE the evidence: Constructing knowledge-grounded reasoning chains for retrieval-augmented generation [Paper] [Code] GitHub Repo stars

  • (ACL 2023) Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions [Paper] [Code] GitHub Repo stars

Tree-based

  • (ACL 2025) ARise: Towards Knowledge-Augmented Reasoning via Risk-Adaptive Search [Paper] [Code] GitHub Repo stars

  • (AAAI 2025) RATT: A Thought Structure for Coherent and Correct LLM Reasoning [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) MCTS-RAG: Enhance Retrieval-Augmented Generation with Monte Carlo Tree Search [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) AirRAG: Activating Intrinsic Reasoning for Retrieval Augmented Generation via Tree-based Search [Paper]

  • (ArXiv 2025) Tree-based RAG-Agent Recommendation System: A Case Study in Medical Test Data [Paper]

  • (ArXiv 2024) SeRTS: Self-Rewarding Tree Search for Biomedical Retrieval-Augmented Generation [Paper]

  • (ArXiv 2024) CORAG: A Cost-Constrained Retrieval Optimization System for Retrieval-Augmented Generation [Paper]

  • (ACL 2023) Tree of clarifications: Answering ambiguous questions with retrieval-augmented large language models [Paper] [Code] GitHub Repo stars

  • (EMNLP 2023) Grove: a retrieval-augmented complex story generation framework with a forest of evidence [Paper]

Graph-based

Walk-on-Graph

  • (ICLR 2025) Reasoning of Large Language Models over Knowledge Graphs with Super-Relations [Paper] [Code] GitHub Repo stars

  • (ICLR 2025) Simple is Effective: The Roles of Graphs and LLMs in Knowledge-Graph-Based RAG [Paper] [Code] GitHub Repo stars

  • (ICLR 2025) StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) From RAG to Memory: Non-Parametric Continual Learning for Large Language Models [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) From Local to Global: A GraphRAG Approach to Query-Focused Summarization [Paper] [Code] GitHub Repo stars

  • (NeurIPS 2024) G-Retriever: Retrieval-Augmented Generation for Textual Graph Understanding and Question Answering [Paper] [Code] GitHub Repo stars

  • (NeurIPS 2024) HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models [Paper] [Code] GitHub Repo stars

  • (ArXiv 2024) DALK: Dynamic Co-Augmentation of LLMs and KG to answer Alzheimer's Disease Questions with Scientific Literature [Paper] [Code] GitHub Repo stars

  • (ArXiv 2024) GNN-RAG: Graph Neural Retrieval for Large Language Model Reasoning [Paper] [Code] GitHub Repo stars

  • (ArXiv 2024) LightRAG: Simple and Fast Retrieval-Augmented Generation [Paper] [Code] GitHub Repo stars

  • (ArXiv 2023) Retrieve-Rewrite-Answer: A KG-to-Text Enhanced LLMs Framework for Knowledge Graph Question Answering [Paper] [Code] GitHub Repo stars

  • (ICLR 2022) GreaseLM: Graph REASoning Enhanced Language Models for Question Answering [Paper] [Code] GitHub Repo stars

  • (ACL 2022) Subgraph Retrieval Enhanced Model for Multi-hop KBQA [Paper] [Code] GitHub Repo stars

  • (NAACL 2021) QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering [Paper] [Code] GitHub Repo stars

  • (ACL 2019) PullNet: Open Domain Question Answering with Iterative Retrieval on Knowledge Bases and Text [Paper] [Code]

Think-on-Graph

  • (ICLR 2025) Think-on-Graph 2.0: Deep and Faithful Large Language Model Reasoning with Knowledge-guided Retrieval Augmented Generation [Paper] [Code] GitHub Repo stars

  • (ICLR 2024) Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph [Paper] [Code] GitHub Repo stars

  • (ICLR 2024) Reasoning on Graphs: Faithful and Interpretable LLM Reasoning (RoG) [Paper]

  • (ACL 2024) Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs [Paper] [Code] GitHub Repo stars

  • (ACL 2024) GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models [Paper]

  • (AAAI 2024) Knowledge Graph Prompting for Multi-Document Question Answering [Paper] [Code] GitHub Repo stars

  • (WWW 2025) KAG: Boosting LLMs in Professional Domains via Knowledge Augmented Generation [Paper] [Code] GitHub Repo stars

  • (EMNLP 2022) Empowering Language Models with Knowledge Graph Reasoning for Question Answering [Paper]

  • (CIS 2024) KnowledgeNavigator: Leveraging Large Language Models for Enhanced Reasoning over Knowledge Graph [Paper] [Code] GitHub Repo stars

  • (ArXiv 2024) HyKGE: A Hypothesis Knowledge Graph Enhanced Framework for Accurate and Reliable Medical LLMs Responses [Paper] [Code] GitHub Repo stars

  • (ArXiv 2024) KG-RAG: Bridging the Gap Between Knowledge and Creativity [Paper]

  • (ArXiv 2024) Mitigating Hallucinations in Large Language Models via Self-Refinement-Enhanced Knowledge Retrieval [Paper]

Agentic Orchestration

Single-Agent

Prompting

  • (ArXiv 2025) Search-o1: Agentic Search-Enhanced Large Reasoning Models [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) Plan∗RAG: Efficient Test-Time Planning for Retrieval Augmented Generation [Paper]

  • (ArXiv 2025) Open Deep Search: Democratizing Search with Open-source Reasoning Agents [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) DeepRAG: Thinking to Retrieval Step by Step for Large Language Models [Paper]

  • (ArXiv 2025) Enhancing Retrieval Systems with Inference-Time Logical Reasoning [Paper]

  • (ArXiv 2025) Self-Taught Agentic Long-Context Understanding [Paper]

  • (ICLR 2024) Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection [Paper] [Code] GitHub Repo stars

  • (KDD Cup 2024) A Hybrid RAG System with Comprehensive Enhancement on Complex Reasoning [Paper]

  • (ICLR 2023) ReAct: Synergizing Reasoning and Acting in Language Models [Paper] [Code] GitHub Repo stars

  • (EMNLP 2023) Measuring and Narrowing the Compositionality Gap in Language Models [Paper]

  • (ACL 2023) Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions [Paper]

Supervised Fine-Tuning

  • (EMNLP 2024) REAR: A Relevance-Aware Retrieval-Augmented Framework for Open-Domain Question Answering [Paper]

  • (EMNLP 2024) RAG-Studio: Towards In-Domain Adaptation of Retrieval Augmented Generation Through Self-Alignment [Paper]

  • (ICML 2024) InstructRetro: Instruction Tuning post Retrieval-Augmented Pretraining [Paper]

  • (ICML 2024) INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning [Paper] [Code] GitHub Repo stars

  • (ICLR 2024) RA-DIT: Retrieval-Augmented Dual Instruction Tuning [Paper]

  • (COLM 2024) RAFT: Adapting Language Model to Domain Specific RAG [Paper]

  • (SR 2024) A fine-tuning enhanced RAG system with quantized influence measure as AI judge [Paper]

  • (ArXiv 2024) SFR-RAG: Towards Contextually Faithful LLMs [Paper] [Code] GitHub Repo stars

  • (NeurIPS 2023) Toolformer: Language Models Can Teach Themselves to Use Tools [Paper]

Reinforcement Learning

  • (ArXiv 2025) DeepResearcher: Scaling Deep Research via Reinforcement Learning in Real-world Environments [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) RAG-RL: Advancing Retrieval-Augmented Generation via RL and Curriculum Learning [Paper]

  • (ArXiv 2025) R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) ReSearch: Learning to Reason with Search for LLMs via Reinforcement Learning [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) ZeroSearch: Incentivize the Search Capability of LLMs without Searching [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) ReARTeR: Retrieval-Augmented Reasoning with Trustworthy Process Rewarding [Paper]

  • (ArXiv 2022) WebGPT: Browser-assisted question-answering with human feedback [Paper]

Multi-Agent

  • (ACL 2025) Agentic Reasoning: Reasoning LLMs with Tools for the Deep Research [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration [Paper]

  • (ArXiv 2025) Knowledge-Aware Iterative Retrieval for Multi-Agent Systems [Paper]

  • (ArXiv 2025) SLA Management in Reconfigurable Multi-Agent RAG: A Systems Approach to Question Answering [Paper]

  • (ArXiv 2025) SurgRAW: Multi-Agent Workflow with Chain of Thought Reasoning for Surgical Intelligence [Paper]

  • (ArXiv 2025) HM-RAG: Hierarchical Multi-Agent Multimodal Retrieval Augmented Generation [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) RAG-KG-IL: A Multi-Agent Hybrid Framework for Reducing Hallucinations and Enhancing LLM Reasoning through RAG and Incremental Knowledge Graph Learning Integration [Paper]

  • (ArXiv 2025) MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding [Paper] [Code] GitHub Repo stars

  • (ArXiv 2025) MANTRA: Enhancing Automated Method-Level Refactoring with Contextual RAG and Multi-Agent LLM Collaboration [Paper]

  • (ArXiv 2025) Talk to Right Specialists: Routing and Planning in Multi-agent System for Question Answering [Paper]

  • (ArXiv 2025) Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning [Paper]

  • (ArXiv 2025) Agentic Information Retrieval [Paper]

  • (ACL 2024) M-RAG: Reinforcing Large Language Model Performance through Retrieval-Augmented Generation with Multiple Partitions [Paper]

  • (NeurIPS 2024) Chain of Agents: Large Language Models Collaborating on Long-Context Tasks [Paper] [Code] GitHub Repo stars

  • (ArXiv 2024) A Collaborative Multi-Agent Approach to Retrieval-Augmented Generation Across Diverse Data [Paper]

  • (ArXiv 2024) MindSearch: Mimicking Human Minds Elicits Deep AI Searcher [Paper] [Code] GitHub Repo stars

⬆️ Back to Table of Contents


📊 Benchmarks and Datasets: These resources enable standardized evaluation and comparison of RAG and Reasoning methods across various real-world scenarios, supporting research progress and practical deployment.

| Title | Venue & Code | Benchmark Task | Domain | Knowledge Type | Reasoning Capability | Size |
|---|---|---|---|---|---|---|
| TriviaQA | ACL'17 GitHub stars | Single-hop QA | General | Commonsense, Logical | Deductive | 650,000+ |
| NQ | ACL'19 GitHub stars | Single-hop QA | General | Commonsense, Logical | Deductive | 307,373 |
| SimpleQA | Arxiv'24 GitHub stars | Single-hop QA | General | Commonsense | Deductive | 4,326 |
| HotpotQA | EMNLP'18 GitHub stars | Multi-hop QA | General | Commonsense | Deductive | 113,000 |
| CWQ | NAACL'18 GitHub stars | Multi-hop QA | General | Commonsense | Deductive | 34,689 |
| IIRC | EMNLP'20 | Multi-hop QA | General | Commonsense, Logical | Deductive | 13,000+ |
| 2WikiMultiHopQA | COLING'20 GitHub stars | Multi-hop QA | General | Commonsense, Logical | Deductive | 192,606 |
| MuSiQue | ACL'22 GitHub stars | Multi-hop QA | General | Commonsense, Logical | Deductive | 25,000 |
| TopiOCQA | TACL'22 GitHub stars | Multi-hop QA | General | Commonsense, Logical | Deductive | 3,920 + 50,574 |
| FRAMES | Arxiv'24 | Multi-hop QA | General | Commonsense, Logical, Arithmetic | Deductive | 824 |
| MINTQA | Arxiv'24 GitHub stars | Multi-hop QA | General | Commonsense, Logical | Deductive | 10,479 |
| GPQA | COLM'24 GitHub stars | Multi-hop QA | Science | Logical | Deductive, Abductive | 448 |
| HLE | Arxiv'25 GitHub stars | Multi-hop QA | Science | Arithmetic, Logical, Multimodal | Deductive, Abductive | 2,500 |
| QuALITY | NAACL'22 GitHub stars | Multi-choice QA | Narrative | Commonsense, Logical | Deductive, Abductive | 6,737 |
| CC/Bamboogle | EMNLP'23 GitHub stars | Multi-choice QA | General | Logical | Deductive, Abductive | 125 |
| BIG-Bench | TMLR'23 GitHub stars | Multi-choice QA | General | Commonsense, Logical | Deductive, Abductive, Inductive, Analogical | 204 |
| ADQA | EMNLP'24 GitHub stars | Multi-choice QA | Health | Commonsense, Logical | Deductive, Abductive | 446 |
| MMLU-Pro | NeurIPS'24 GitHub stars | Multi-choice QA | Science | Arithmetic, Commonsense, Logical | Deductive, Inductive | 12,032 |
| StrategyQA | TACL'21 GitHub stars | Multi-step QA | General | Commonsense, Logical | Deductive | 2,780 |
| CrisisMMD | Arxiv'18 | Multimodal QA | Crisis Response | Commonsense, Multimodal | Abductive | 16,097 |
| ALFWORLD | ICLR'21 GitHub stars | Multimodal QA | Game | Multimodal | Deductive, Abductive | 3,827 |
| SCIENCEQA | NeurIPS'22 GitHub stars | Multimodal QA | Science | Logical, Multimodal | Deductive | 21,000+ |
| WebShop | NeurIPS'22 GitHub stars | Multimodal QA | E-commerce | Multimodal | Inductive, Abductive | 12,087 |
| MMLongBench-DOC | NeurIPS'24 GitHub stars | Multimodal QA | Narrative | Multimodal | Deductive, Abductive | 1,082 |
| UDA | NeurIPS'24 GitHub stars | Multimodal QA | Narrative | Multimodal | Deductive | 29,590 |
| LongDocURL | Arxiv'24 GitHub stars | Multimodal QA | Narrative | Multimodal | Deductive, Abductive | 2,325 |
| SurgCoTBench | Arxiv'25 GitHub stars | Multimodal QA | Health | Multimodal, Logical | Abductive, Deductive | 14,176 |
| ∞BENCH | ACL'24 GitHub stars | Long-form QA | Narrative, General | Multimodal, Logical | Inductive, Abductive | 3,946 |
| GRBENCH | ACL'24 GitHub stars | Graph QA | Narrative, E-commerce, Health | Logical | Deductive, Inductive | 1,740 |
| GraphQA | NeurIPS'24 GitHub stars | Graph QA | Textual Graph Understanding | Commonsense, Multimodal | Deductive, Abductive | 107,503 |
| Refactoring Oracle | IEEE'22 GitHub stars | Code | Software | Logical | Deductive | 7,226 |
| LiveCodeBench | ICLR'25 GitHub stars | Code | General | Logical | Deductive, Abductive | 1,055 |
| ColBench | Arxiv'25 | Code | Software | Logical | Abductive, Inductive | 10,000+ |
| DailyDialog | IJCNLP'17 | Dialog | General | Commonsense | | 13,118 |
| Fever | NAACL'18 GitHub stars | Fact Checking | General | Logical | Deductive, Abductive | 185,445 |
| PubHealth | EMNLP'20 GitHub stars | Fact Checking | Health | Commonsense, Logical | Abductive, Deductive | 11,800 |
| XSum | EMNLP'18 GitHub stars | Text Summarization | Narrative | Logical, Commonsense | Abductive | 226,711 |
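
Because every benchmark in the table above carries the same annotations, the rows can be treated as records and filtered when choosing an evaluation set. The following is a minimal sketch under that assumption: the `Benchmark` dataclass and the `select` helper are hypothetical names introduced here, and only three rows from the table are copied in as sample data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    title: str
    task: str
    domain: str
    knowledge: tuple[str, ...]
    reasoning: tuple[str, ...]
    size: str  # kept as text because some sizes are approximate, e.g. "650,000+"


# Three rows copied from the table above as sample data.
BENCHMARKS = [
    Benchmark("HotpotQA", "Multi-hop QA", "General", ("Commonsense",), ("Deductive",), "113,000"),
    Benchmark("GPQA", "Multi-hop QA", "Science", ("Logical",), ("Deductive", "Abductive"), "448"),
    Benchmark("Fever", "Fact Checking", "General", ("Logical",), ("Deductive", "Abductive"), "185,445"),
]


def select(benchmarks, task=None, reasoning=None):
    """Return titles of benchmarks matching the given task and reasoning capability."""
    return [
        b.title
        for b in benchmarks
        if (task is None or b.task == task)
        and (reasoning is None or reasoning in b.reasoning)
    ]


print(select(BENCHMARKS, task="Multi-hop QA", reasoning="Abductive"))
```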

⬆️ Back to Table of Contents


Contributions are welcome! Please feel free to submit pull requests or open issues to suggest new resources.

🤝 Contributing

We welcome contributions to expand this collection! To add your work, please:

  1. Submit a Pull Request or Open an Issue with the following information:

    • Paper Title: Your paper's full title
    • Paper Link: DOI, arXiv, or conference link
    • GitHub Repository: Link to your open-source implementation (if available)
    • Category: Specify which category under our taxonomy your work belongs to:
      • Reasoning-Enhanced RAG: Retrieval Optimization / Integration Enhancement / Generation Enhancement
      • RAG-Enhanced Reasoning: External Knowledge Retrieval (Knowledge Base/Web Retrieval/Tool Using) / In-context Retrieval (Prior Experience/Example or Training Data)
      • Synergized RAG and Reasoning: Reasoning Workflow (Chain-based/Tree-based/Graph-based) / Agentic Orchestration (Single-Agent/Multi-Agent)
  2. Format: Follow the existing format in the README for consistency.

  3. Quality: Ensure your work is relevant to RAG and Reasoning integration.

Your contributions help build a comprehensive resource for the research community!
