
    Meet LEANN: The Tiniest Vector Database that Democratizes Personal AI with Storage-Efficient Approximate Nearest Neighbor (ANN) Search Index

    August 12, 2025

    Embedding-based search outperforms traditional keyword-based methods across many domains by capturing semantic similarity with dense vector representations and approximate nearest neighbor (ANN) search. However, the ANN data structure introduces substantial storage overhead, often 1.5 to 7 times the size of the original raw data. This overhead is manageable in large-scale web applications but becomes impractical for personal devices or large datasets. Reducing storage to under 5% of the original data size is critical for edge deployment, but existing solutions fall short: techniques like product quantization (PQ) can shrink the index, but they either sacrifice accuracy or increase search latency.
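
    As a rough illustration of where that overhead comes from, the back-of-the-envelope sketch below compares index size to raw data size for an assumed corpus; the document count, embedding dimension, and graph degree are illustrative placeholders, not figures from the paper.

        # Back-of-the-envelope estimate of ANN index overhead (illustrative numbers only).
        num_docs = 1_000_000          # assumed corpus size
        avg_doc_bytes = 1_000         # assumed ~1 KB of raw text per document
        embedding_dim = 768           # common dense-embedding dimensionality
        bytes_per_float = 4           # float32
        graph_degree = 32             # assumed average out-degree of a proximity graph
        bytes_per_edge = 4            # one int32 neighbor id per edge

        raw_data = num_docs * avg_doc_bytes
        embeddings = num_docs * embedding_dim * bytes_per_float
        graph = num_docs * graph_degree * bytes_per_edge

        print(f"embeddings + graph vs raw data: {(embeddings + graph) / raw_data:.2f}x")  # ~3.2x
        print(f"graph metadata alone vs raw data: {graph / raw_data:.2%}")                # ~12.8%

    With these assumed numbers, the stored embeddings alone are several times larger than the raw text, and even the graph metadata by itself sits above the 5% target, which is the gap that dropping embeddings and pruning the graph must close.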


    Vector search methods generally rely on inverted file (IVF) indexes or proximity graphs. Graph-based approaches such as HNSW, NSG, and Vamana are considered state-of-the-art due to their balance of accuracy and efficiency. Efforts to reduce graph size, such as learned neighbor selection, are limited by high training costs and dependence on labeled data. For resource-constrained environments, DiskANN and Starling store data on disk, while FusionANNS optimizes hardware usage. Methods like AiSAQ and EdgeRAG attempt to minimize memory usage but still suffer from high storage overhead or performance degradation at scale. Embedding compression techniques like PQ and RaBitQ provide quantization with theoretical error bounds but struggle to maintain accuracy under tight storage budgets.
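
    To make the PQ trade-off concrete, here is a minimal, hypothetical product quantization sketch using scikit-learn k-means codebooks; the dimensions and codebook sizes are assumed for illustration and do not correspond to any of the systems above.

        # Minimal product quantization (PQ) sketch: compress each vector into per-subspace
        # codebook ids. Illustrative only; production systems use tuned training and
        # asymmetric distance computation.
        import numpy as np
        from sklearn.cluster import KMeans

        rng = np.random.default_rng(0)
        d, m, k = 128, 8, 256            # vector dim, subspaces, centroids per subspace (assumed)
        sub = d // m                     # dimensions per subspace
        train = rng.normal(size=(5_000, d)).astype(np.float32)

        # Train one small codebook per subspace.
        codebooks = [KMeans(n_clusters=k, n_init=2, random_state=0)
                     .fit(train[:, i*sub:(i+1)*sub]) for i in range(m)]

        def encode(x):
            # Replace each subvector by the id of its nearest centroid (1 byte per subspace).
            return np.array([cb.predict(x[None, i*sub:(i+1)*sub])[0]
                             for i, cb in enumerate(codebooks)], dtype=np.uint8)

        def decode(code):
            # Reconstruct an approximate vector from the stored centroid ids.
            return np.concatenate([codebooks[i].cluster_centers_[c]
                                   for i, c in enumerate(code)]).astype(np.float32)

        x = train[0]
        code = encode(x)                 # 8 bytes instead of 128 * 4 = 512 bytes
        err = np.linalg.norm(x - decode(code)) / np.linalg.norm(x)
        print(f"compression: {x.nbytes / code.nbytes:.0f}x, relative reconstruction error: {err:.2f}")

    Even this toy example shows the tension: a 64x reduction in embedding storage comes with a visible reconstruction error, which is why aggressive quantization alone struggles to hold recall under tight budgets.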

    Researchers from UC Berkeley, CUHK, Amazon Web Services, and UC Davis have developed LEANN, a storage-efficient ANN search index optimized for resource-limited personal devices. It integrates a compact graph-based structure with an on-the-fly recomputation strategy, enabling fast and accurate retrieval while minimizing storage overhead. LEANN’s index is up to 50 times smaller than standard indexes, reducing storage to under 5% of the original raw data, while maintaining 90% top-3 recall in under 2 seconds on real-world question-answering benchmarks. To reduce latency, LEANN uses a two-level traversal algorithm and dynamic batching that combines embedding computations across search hops, improving GPU utilization.
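
    The sketch below isolates that recomputation idea: a best-first graph traversal that stores only node ids and edges, recomputing embeddings on demand with a small per-query cache instead of reading them from an index. The toy encoder, the random graph, and the recomputation budget are placeholders; LEANN’s actual implementation adds the two-level traversal and dynamic batching described below.

        # Graph search with on-demand embedding recomputation (greatly simplified sketch).
        # embed() is a stand-in for a real text encoder; the graph is random, purely for illustration.
        import heapq
        import zlib
        import numpy as np

        rng = np.random.default_rng(1)
        texts = [f"document {i}" for i in range(1_000)]   # raw data assumed to live on disk

        def embed(text: str) -> np.ndarray:
            # Placeholder encoder: deterministic pseudo-random vector per text.
            seed = zlib.crc32(text.encode())
            return np.asarray(np.random.default_rng(seed).normal(size=64), dtype=np.float32)

        # Stand-in for a proximity graph built offline; only ids and edges are stored.
        graph = {i: rng.choice(len(texts), size=8, replace=False).tolist() for i in range(len(texts))}

        def search(query: str, entry: int = 0, k: int = 3, budget: int = 200):
            q = embed(query)
            cache = {}                                    # per-query embedding cache
            def dist(i):
                if i not in cache:
                    cache[i] = embed(texts[i])            # recompute instead of loading stored vectors
                return float(np.linalg.norm(q - cache[i]))
            visited, frontier, best = {entry}, [(dist(entry), entry)], []
            while frontier and len(cache) < budget:       # stop when the recomputation budget is spent
                d, node = heapq.heappop(frontier)
                heapq.heappush(best, (-d, node))
                if len(best) > k:
                    heapq.heappop(best)                   # keep only the k closest visited nodes
                for nb in graph[node]:
                    if nb not in visited:
                        visited.add(nb)
                        heapq.heappush(frontier, (dist(nb), nb))
            return sorted((-negd, texts[i]) for negd, i in best)

        print(search("document 42"))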

    LEANN’s architecture centers on graph-based recomputation, supported by two main techniques and a simple system workflow. Built on the HNSW framework, it observes that each query needs embeddings for only a small subset of nodes, prompting on-demand computation instead of pre-storing all embeddings. To address the resulting latency and storage challenges, LEANN introduces two techniques: (a) a two-level graph traversal with dynamic batching to lower recomputation latency, and (b) a high-degree-preserving graph pruning method to reduce metadata storage. In the system workflow, LEANN first computes embeddings for all dataset items and constructs a vector index using an off-the-shelf graph-based indexing approach; the embeddings themselves are then discarded, leaving only the pruned graph to be stored.
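
    The pruning side can be sketched as follows; the specific rule here (keep the full neighbor lists of a small fraction of high-degree hub nodes and cap every other node’s degree) is an assumed simplification for illustration, not the paper’s exact algorithm.

        # Degree-aware graph pruning sketch: preserve hub nodes, truncate the rest.
        import numpy as np

        rng = np.random.default_rng(2)
        n = 10_000
        # Fake adjacency lists with skewed degrees (a few hubs, many low-degree nodes).
        degrees = np.minimum(rng.zipf(1.5, size=n) + 4, 64)
        adjacency = {i: rng.choice(n, size=int(d), replace=False).tolist()
                     for i, d in enumerate(degrees)}

        def prune(adjacency, hub_fraction=0.02, max_degree=8):
            # Keep the top hub_fraction of nodes by degree intact; cap everyone else.
            cutoff = np.quantile([len(v) for v in adjacency.values()], 1 - hub_fraction)
            return {node: (nbrs if len(nbrs) >= cutoff else nbrs[:max_degree])
                    for node, nbrs in adjacency.items()}

        before = sum(len(v) for v in adjacency.values())
        after = sum(len(v) for v in prune(adjacency).values())
        print(f"edge metadata kept after pruning: {after / before:.1%}")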

    In terms of storage and latency, LEANN outperforms EdgeRAG, an IVF-based recomputation method, achieving latency reductions ranging from 21.17 to 200.60 times across various datasets and hardware platforms. This advantage stems from LEANN’s polylogarithmic recomputation complexity, which scales far better than EdgeRAG’s √N growth. In terms of accuracy on downstream RAG tasks, LEANN achieves higher performance across most datasets, except GPQA, where a distributional mismatch limits its effectiveness. Similarly, on HotpotQA, the single-hop retrieval setup limits accuracy gains, as the dataset demands multi-hop reasoning. Despite these limitations, LEANN shows strong performance across diverse benchmarks.
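
    The gap between those two complexity classes widens quickly with corpus size, as the toy comparison below illustrates (arbitrary constants; only the trend matters).

        # Polylogarithmic vs sqrt(N) growth in the number of recomputed embeddings per query.
        import math

        for n in (10**5, 10**6, 10**7, 10**8):
            polylog = math.log2(n) ** 2      # stand-in for polylogarithmic recomputation cost
            sqrt_n = math.sqrt(n)            # stand-in for sqrt(N)-style growth
            print(f"N = {n:>11,}:  log2(N)^2 = {polylog:7.0f}   sqrt(N) = {sqrt_n:9.0f}")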

    The researchers introduced LEANN, a storage-efficient neural retrieval system that combines graph-based recomputation with targeted optimizations. By integrating a two-level search algorithm and dynamic batching, it eliminates the need to store full embeddings, achieving significant reductions in storage overhead while maintaining high accuracy. LEANN still has limitations, such as high peak storage usage during index construction, which could be addressed through pre-clustering or other techniques. Future work may focus on reducing latency and improving responsiveness, paving the way for broader adoption in resource-constrained environments.


    Check out the Paper and GitHub Page for more details.


    The post Meet LEANN: The Tiniest Vector Database that Democratizes Personal AI with Storage-Efficient Approximate Nearest Neighbor (ANN) Search Index appeared first on MarkTechPost.

