
    This AI Paper from Microsoft Introduces a DiskANN-Integrated System: A Cost-Effective and Low-Latency Vector Search Using Azure Cosmos DB

    May 19, 2025

    The ability to search high-dimensional vector representations has become a core requirement for modern data systems. These vectors, generated by deep learning models, capture the semantic and contextual meaning of the underlying data, so systems can retrieve results based on relevance and similarity rather than exact matches. Such semantic capabilities are essential in large-scale applications such as web search, AI-powered assistants, and content recommendations, where users and agents alike need to reach information by meaning rather than through structured queries alone.
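To make "retrieval by similarity" concrete, here is a minimal sketch of exact (brute-force) nearest-neighbor search using cosine similarity. The tiny three-dimensional vectors and the names `cosine_similarity`, `top_k`, and `docs` are illustrative assumptions, not anything from the paper; real embeddings have hundreds of dimensions, which is why approximate indexes are used in place of a full scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def top_k(query, vectors, k):
    """Score every stored vector against the query and keep the k best."""
    scored = [(cosine_similarity(query, v), doc_id) for doc_id, v in vectors.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical toy embeddings: "cat" and "kitten" point in a similar direction.
docs = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.1, 0.9],
}
nearest = top_k([1.0, 0.0, 0.0], docs, k=2)
print(nearest)  # ['cat', 'kitten']
```

The scan above touches every vector on every query, so its cost grows linearly with the collection size.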

    One of the main issues faced in vector-based retrieval is the high cost and complexity of operating separate systems for transactional data and vector indexes. Traditionally, vector databases are optimized solely for semantic search performance, but they require users to duplicate data from their primary databases, introducing latency, storage overhead, and risk of inconsistencies. Developers are also burdened with synchronizing two distinct systems, which can limit scalability, flexibility, and data integrity when updates occur rapidly.

    Some popular tools for vector search, like Zilliz and Pinecone, operate as standalone services that offer efficient similarity search. However, these platforms rely on segment-based or fully in-memory architectures. They often require repeated rebuilding of indices and can suffer from latency spikes and significant memory usage. This makes them inefficient in scenarios that involve large-scale or constantly changing data. The issue worsens when dealing with updates, filtering queries, or managing multiple tenants, as these systems lack deep integration with transactional operations and structured indexing.

    Researchers at Microsoft introduced an approach that integrates vector indexing directly into Azure Cosmos DB’s NoSQL engine. They took DiskANN, a graph-based indexing library known for its performance in large-scale semantic search, and re-engineered it to work within Cosmos DB’s infrastructure. This design eliminates the need for a separate vector database. The solution reuses Cosmos DB’s built-in capabilities (high availability, elasticity, multi-tenancy, and automatic partitioning), which keeps it both cost-efficient and scalable. Each collection maintains a single vector index per partition, which is kept in sync with the main document data through the existing Bw-Tree index structure.
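As a rough illustration of what "graph-based indexing" means, the sketch below runs a best-first (greedy) search over a small proximity graph: start at an entry node, repeatedly expand the closest unvisited candidate, and keep only a bounded list of the best nodes seen so far. This is a simplified, in-memory approximation of the general technique, not the DiskANN or Cosmos DB implementation; the function names, the `beam_width` parameter, and the toy graph are assumptions made for the example.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def greedy_search(graph, vectors, query, start, k, beam_width):
    """Best-first search over a proximity graph.

    graph:   node id -> list of neighbor node ids
    vectors: node id -> vector
    Returns the k closest node ids found, nearest first.
    """
    visited = set()
    # Candidate list of (distance, node), kept sorted and capped at beam_width.
    candidates = [(squared_distance(vectors[start], query), start)]
    while True:
        unvisited = [(d, n) for d, n in candidates if n not in visited]
        if not unvisited:
            break  # every node in the beam has been expanded
        _, node = min(unvisited)
        visited.add(node)
        known = {n for _, n in candidates}
        for neighbor in graph[node]:
            if neighbor not in known:
                dist = squared_distance(vectors[neighbor], query)
                candidates.append((dist, neighbor))
                known.add(neighbor)
        candidates.sort()
        candidates = candidates[:beam_width]
    return [n for _, n in candidates[:k]]

# Hypothetical toy index: five points on a line, linked as a chain.
vectors = {0: [0.0], 1: [1.0], 2: [2.0], 3: [3.0], 4: [4.0]}
graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
result = greedy_search(graph, vectors, [3.2], start=0, k=2, beam_width=3)
print(result)  # [3, 4]
```

The search only computes distances for nodes it reaches through graph edges, which is why such indexes can answer queries without scanning the whole collection. A larger `beam_width` explores more of the graph and generally trades latency for recall.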

    The rewritten DiskANN library is implemented in Rust and introduces asynchronous operations so that it fits into a database environment. It lets the database retrieve or update only the vector components it needs, such as quantized versions or neighbor lists, which reduces memory usage. Vector insertions and queries are handled with a hybrid approach in which most computations occur in quantized space. The design supports paginated searches and filter-aware traversal, so queries can handle complex predicates efficiently and scale across billions of vectors. The methodology also includes a sharded indexing mode, which builds separate indices based on defined keys, such as tenant ID or time period.
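The idea of doing most of the work in quantized space can be sketched as a two-stage search: rank candidates using small integer codes, then re-rank a short list with the full-precision vectors. The example below uses plain scalar quantization as a simple stand-in, since the article does not specify the quantization scheme; the function names, the value range of [-1, 1], and the `shortlist_size` parameter are assumptions for illustration only.

```python
def quantize(vector, lo=-1.0, hi=1.0, levels=255):
    """Map each float (clamped to [lo, hi]) to an integer code in [0, levels]."""
    scale = levels / (hi - lo)
    return [round((min(max(x, lo), hi) - lo) * scale) for x in vector]

def squared_distance(a, b):
    """Squared Euclidean distance; works for float vectors and integer codes."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_stage_search(query, full_vectors, codes, k, shortlist_size):
    """Rank by compact codes first, then re-rank a shortlist at full precision."""
    query_code = quantize(query)
    # Stage 1: cheap comparison against the compact integer codes.
    shortlist = sorted(
        codes, key=lambda doc_id: squared_distance(codes[doc_id], query_code)
    )[:shortlist_size]
    # Stage 2: exact distances, computed only for the shortlisted ids.
    shortlist.sort(key=lambda doc_id: squared_distance(full_vectors[doc_id], query))
    return shortlist[:k]

# Hypothetical data: "a" and "b" are nearly identical, "c" and "d" are far away.
full_vectors = {
    "a": [0.10, 0.20],
    "b": [0.11, 0.19],
    "c": [-0.90, 0.80],
    "d": [0.50, -0.50],
}
codes = {doc_id: quantize(v) for doc_id, v in full_vectors.items()}
best = two_stage_search([0.10, 0.20], full_vectors, codes, k=1, shortlist_size=2)
print(best)  # ['a']
```

Only the shortlisted vectors need to be read at full precision, which is the kind of saving the paragraph above attributes to fetching quantized versions instead of whole vectors.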

    In experiments, the system demonstrated strong performance. For a dataset of 10 million 768-dimensional vectors, median (p50) query latency remained below 20 milliseconds, and the system achieved a recall@10 of 94.64%. Compared with the enterprise-tier offerings of Zilliz and Pinecone, Azure Cosmos DB’s query costs were 15× and 41× lower, respectively. Cost-efficiency held as the index grew from 100,000 to 10 million vectors, with less than a 2× rise in latency or Request Units (RUs). On ingestion, Cosmos DB charged about $162.5 for 10 million vector inserts, which was lower than Pinecone and DataStax, though higher than Zilliz. Recall also remained stable during heavy update cycles, with in-place deletions significantly improving accuracy when the data distribution shifts.
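For readers unfamiliar with the recall@10 metric quoted above, the sketch below shows how recall@k is commonly computed: the fraction of the true k nearest neighbors that appear in the k results the index returned, averaged over queries. The function names and the small id lists are illustrative assumptions and do not reproduce the paper's evaluation harness.

```python
def recall_at_k(retrieved, ground_truth, k):
    """Fraction of the true top-k ids that appear in the retrieved top-k."""
    true_top_k = set(ground_truth[:k])
    hits = sum(1 for item in retrieved[:k] if item in true_top_k)
    return hits / k

def mean_recall_at_k(all_retrieved, all_ground_truth, k):
    """Average recall@k over a batch of queries."""
    scores = [
        recall_at_k(retrieved, truth, k)
        for retrieved, truth in zip(all_retrieved, all_ground_truth)
    ]
    return sum(scores) / len(scores)

# Hypothetical results for two queries with k = 4.
retrieved_lists = [[1, 2, 3, 9], [5, 6, 7, 8]]
truth_lists = [[1, 2, 3, 4], [5, 6, 7, 8]]
single = recall_at_k(retrieved_lists[0], truth_lists[0], k=4)
average = mean_recall_at_k(retrieved_lists, truth_lists, k=4)
print(single, average)  # 0.75 0.875
```

Under this definition, a recall@10 of 94.64% means that, on average, between nine and ten of the ten exact nearest neighbors were present in each result set.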

    The study presents a compelling approach to unifying vector search with transactional databases. The research team from Microsoft designed a system that simplifies operations while performing well on cost, latency, and scalability. By embedding vector search within Cosmos DB, they offer a practical template for integrating semantic capabilities directly into operational workloads.


    All credit for this research goes to the researchers of this project.

    The post This AI Paper from Microsoft Introduces a DiskANN-Integrated System: A Cost-Effective and Low-Latency Vector Search Using Azure Cosmos DB appeared first on MarkTechPost.
