EraRAG: A Scalable, Multi-Layered Graph-Based Retrieval System for Dynamic and Growing Corpora

Large Language Models (LLMs) have revolutionized many areas of natural language processing, but they still face critical limitations when dealing with up-to-date facts, domain-specific information, or complex multi-hop reasoning. Retrieval-Augmented Generation (RAG) approaches aim to address these gaps by allowing language models to retrieve and integrate information from external sources. However, most existing graph-based RAG systems are optimized for static corpora and struggle with efficiency, accuracy, and scalability when the data is continually growing—such as in news feeds, research repositories, or user-generated online content.

Introducing EraRAG: Efficient Updates for Evolving Data

Recognizing these challenges, researchers from Huawei, The Hong Kong University of Science and Technology, and WeBank have developed EraRAG, a novel retrieval-augmented generation framework purpose-built for dynamic, ever-expanding corpora. Rather than rebuilding the entire retrieval structure whenever new data arrives, EraRAG relies on localized, selective updates that touch only those parts of the retrieval graph affected by the changes.

Core Features:

Hyperplane-Based Locality-Sensitive Hashing (LSH):
Every corpus is chunked into small text passages which are embedded as vectors. EraRAG then uses randomly sampled hyperplanes to project these vectors into binary hash codes—a process that groups semantically similar chunks into the same “bucket.” This LSH-based approach maintains both semantic coherence and efficient grouping.
Hierarchical, Multi-Layered Graph Construction:
The core retrieval structure in EraRAG is a multi-layered graph. At each layer, segments (or buckets) of similar text are summarized using a language model. Segments that are too large are split, while those too small are merged—ensuring both semantic consistency and balanced granularity. Summarized representations at higher layers enable efficient retrieval for both fine-grained and abstract queries.
Incremental, Localized Updates:
When new data arrives, its embedding is hashed using the original hyperplanes—ensuring consistency with the initial graph construction. Only the buckets/segments directly impacted by new entries are updated, merged, split, or re-summarized, while the rest of the graph remains untouched. The update propagates up the graph hierarchy, but always remains localized to the affected region, saving significant computation and token costs.
Reproducibility and Determinism:
Unlike standard LSH clustering, EraRAG preserves the set of hyperplanes used during initial hashing. This makes bucket assignment deterministic and reproducible, which is crucial for consistent, efficient updates over time.

Performance and Impact

Comprehensive experiments on a variety of question answering benchmarks demonstrate that EraRAG:

Reduces Update Costs: Achieves up to 95% reduction in graph reconstruction time and token usage compared to leading graph-based RAG methods (e.g., GraphRAG, RAPTOR, HippoRAG).
Maintains High Accuracy: EraRAG consistently outperforms other retrieval architectures in both accuracy and recall—across static, growing, and abstract question answering tasks—with minimal compromise in retrieval quality or multi-hop reasoning capabilities.
Supports Versatile Query Needs: The multi-layered graph design allows EraRAG to efficiently retrieve fine-grained factual details or high-level semantic summaries, tailoring its retrieval pattern to the nature of each query.

Practical Implications

EraRAG offers a scalable and robust retrieval framework ideal for real-world settings where data is continuously added—such as live news, scholarly archives, or user-driven platforms. It strikes a balance between retrieval efficiency and adaptability, making LLM-backed applications more factual, responsive, and trustworthy in fast-changing environments.

Check out the Paper and GitHub. All credit for this research goes to the researchers of this project | Meet the AI Dev Newsletter read by 40k+ Devs and Researchers from NVIDIA, OpenAI, DeepMind, Meta, Microsoft, JP Morgan Chase, Amgen, Aflac, Wells Fargo and 100s more [SUBSCRIBE NOW]

The post EraRAG: A Scalable, Multi-Layered Graph-Based Retrieval System for Dynamic and Growing Corpora appeared first on MarkTechPost.

Source: Read MoreÂ

The Value-Driven AI Roadmap

This week in AI updates: Mistral’s new Le Chat features, ChatGPT updates, and more (September 5, 2025)

Designing For TV: Principles, Patterns And Practical Guidance (Part 2)

Neo4j introduces new graph architecture that allows operational and analytics workloads to be run together

‘Job Hugging’ Trend Emerges as Workers Confront AI Uncertainty

Distribution Release: MocaccinoOS 25.09

Composition in CSS

DataCrunch raises €55M to boost EU AI sovereignty with green cloud infrastructure

Finally, safe array methods in JavaScript

Finally, safe array methods in JavaScript

Perficient Interviewed for Forrester Report on AI’s Transformative Role in DXPs

Perficient’s “What If? So What?” Podcast Wins Gold Stevie® Award for Technology Podcast

Distribution Release: MocaccinoOS 25.09

Distribution Release: MocaccinoOS 25.09

Speed Isn’t Everything When Buying SSDs – Here’s What Really Matters!

14 Themes for Beautifying Your Ghostty Terminal

EraRAG: A Scalable, Multi-Layered Graph-Based Retrieval System for Dynamic and Growing Corpora