    This AI Paper from Microsoft Introduces a DiskANN-Integrated System: A Cost-Effective and Low-Latency Vector Search Using Azure Cosmos DB

    May 19, 2025

    The ability to search high-dimensional vector representations has become a core requirement for modern data systems. These vector representations, generated by deep learning models, encapsulate the semantic and contextual meaning of data, which lets systems retrieve results based on relevance and similarity rather than exact matches. Such semantic capabilities are essential in large-scale applications such as web search, AI-powered assistants, and content recommendation, where users and agents alike need to retrieve information by meaning rather than through structured queries alone.
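
    The retrieval step itself is conceptually simple. Below is a minimal sketch, using randomly generated stand-in embeddings, of ranking documents by cosine similarity to a query vector; it illustrates the idea of similarity-based retrieval, not the indexing machinery the paper describes.

    import numpy as np

    rng = np.random.default_rng(0)
    doc_embeddings = rng.normal(size=(1000, 768))  # 1,000 documents, 768 dimensions
    query = rng.normal(size=768)                   # embedding of the user's query

    def cosine_scores(matrix, vector):
        # Normalize the rows and the query, then take dot products.
        matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
        vector = vector / np.linalg.norm(vector)
        return matrix @ vector

    scores = cosine_scores(doc_embeddings, query)
    top10 = np.argsort(-scores)[:10]               # the 10 most similar documents
    print(top10, scores[top10])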

    One of the main issues faced in vector-based retrieval is the high cost and complexity of operating separate systems for transactional data and vector indexes. Traditionally, vector databases are optimized solely for semantic search performance, but they require users to duplicate data from their primary databases, introducing latency, storage overhead, and risk of inconsistencies. Developers are also burdened with synchronizing two distinct systems, which can limit scalability, flexibility, and data integrity when updates occur rapidly.

    Some popular tools for vector search, like Zilliz and Pinecone, operate as standalone services that offer efficient similarity search. However, these platforms rely on segment-based or fully in-memory architectures. They often require repeated rebuilding of indices and can suffer from latency spikes and significant memory usage. This makes them inefficient in scenarios that involve large-scale or constantly changing data. The issue worsens when dealing with updates, filtering queries, or managing multiple tenants, as these systems lack deep integration with transactional operations and structured indexing.

    Researchers at Microsoft introduced an approach that integrates vector indexing directly into Azure Cosmos DB’s NoSQL engine. They used DiskANN, a graph-based indexing library already known for its performance in large-scale semantic search, and re-engineered it to work within Cosmos DB’s infrastructure. This design eliminates the need for a separate vector database. Cosmos DB’s built-in capabilities—such as high availability, elasticity, multi-tenancy, and automatic partitioning—are fully utilized, making the solution both cost-efficient and scalable. Each collection maintains a single vector index per partition, which is synchronized with the main document data using the existing Bw-Tree index structure.
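
    The paper does not include code, but the integration surfaces through Cosmos DB's existing NoSQL API. The sketch below, using the azure-cosmos Python SDK, shows roughly how a container with a DiskANN-backed vector index is declared and queried; the endpoint, key, container name, and paths are placeholders, and parameter names may differ slightly across SDK versions.

    from azure.cosmos import CosmosClient, PartitionKey

    client = CosmosClient("https://<account>.documents.azure.com:443/", credential="<key>")
    database = client.create_database_if_not_exists("vectordb")

    # Declare which document path holds the embedding and how to index it.
    vector_embedding_policy = {
        "vectorEmbeddings": [{
            "path": "/embedding",
            "dataType": "float32",
            "distanceFunction": "cosine",
            "dimensions": 768,
        }]
    }
    indexing_policy = {
        "vectorIndexes": [{"path": "/embedding", "type": "diskANN"}],
    }

    container = database.create_container_if_not_exists(
        id="documents",
        partition_key=PartitionKey(path="/tenantId"),  # one vector index per partition
        indexing_policy=indexing_policy,
        vector_embedding_policy=vector_embedding_policy,
    )

    # Vector search and ordinary predicates run in the same query engine.
    query = (
        "SELECT TOP 10 c.id, VectorDistance(c.embedding, @q) AS score "
        "FROM c WHERE c.tenantId = @tenant "
        "ORDER BY VectorDistance(c.embedding, @q)"
    )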

    The DiskANN library was rewritten in Rust and given asynchronous operations to ensure compatibility with database environments. It allows the database to retrieve or update only the necessary vector components, such as quantized versions or neighbor lists, reducing memory usage. Vector insertions and queries are managed with a hybrid approach in which most computations occur in quantized space. This design supports paginated searches and filter-aware traversal, so queries can efficiently handle complex predicates and scale across billions of vectors. The methodology also includes a sharded indexing mode, allowing separate indices keyed on a defined attribute such as tenant ID or time period.
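
    A rough sketch of that quantized-then-refine split is shown below: approximate distances are computed on compressed vectors held in memory, and only a short candidate list is re-ranked against the full-precision vectors. The crude 8-bit scalar quantization here stands in for DiskANN's actual compression, which the paper does not spell out at this level; the example only illustrates why most of the work can stay in quantized space.

    import numpy as np

    rng = np.random.default_rng(1)
    full_vectors = rng.normal(size=(10_000, 768)).astype(np.float32)  # the "on-disk" copy
    query = rng.normal(size=768).astype(np.float32)

    # 8-bit scalar quantization: one byte per dimension instead of four.
    lo, hi = full_vectors.min(), full_vectors.max()
    scale = 255.0 / (hi - lo)
    quantized = np.round((full_vectors - lo) * scale).astype(np.uint8)  # the in-memory copy
    q_query = np.round((query - lo) * scale).astype(np.int16)

    # Stage 1: cheap approximate distances entirely in quantized space.
    approx = np.linalg.norm(quantized.astype(np.int16) - q_query, axis=1)
    candidates = np.argsort(approx)[:100]

    # Stage 2: re-rank only the candidates against full-precision vectors.
    exact = np.linalg.norm(full_vectors[candidates] - query, axis=1)
    top10 = candidates[np.argsort(exact)[:10]]
    print(top10)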

    In experiments, the system demonstrated strong performance. For a dataset of 10 million 768-dimensional vectors, query latency remained below 20 milliseconds (p50), and the system achieved a recall@10 of 94.64%. Compared to enterprise-tier offerings, Azure Cosmos DB provided query costs that were 15× lower than Zilliz and 41× lower than Pinecone. Cost-efficiency was maintained even as the index grew from 100,000 to 10 million vectors, with less than a 2× rise in latency or Request Units (RUs). On ingestion, Cosmos DB charged about $162.50 for 10 million vector inserts, lower than Pinecone and DataStax, though higher than Zilliz. Furthermore, recall remained stable even during heavy update cycles, with in-place deletions significantly improving accuracy under shifting data distributions.
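
    For context, a recall@10 figure like the 94.64% above is computed by comparing the identifiers returned by the approximate DiskANN search against the true 10 nearest neighbors from an exact brute-force scan. A small sketch with placeholder ids:

    def recall_at_k(approx_ids, exact_ids, k=10):
        # Fraction of the true top-k neighbors that the approximate search returned.
        hits = sum(len(set(a[:k]) & set(e[:k])) for a, e in zip(approx_ids, exact_ids))
        return hits / (k * len(exact_ids))

    # Two toy queries; the ids are illustrative, not measurements from the paper.
    approx = [[1, 2, 3, 4, 5, 6, 7, 8, 9, 11], [20, 21, 22, 23, 24, 25, 26, 27, 28, 30]]
    exact  = [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]]
    print(recall_at_k(approx, exact))  # 0.9 -> 18 of the 20 true neighbors recovered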

    The study presents a compelling approach to unifying vector search with transactional databases. The research team at Microsoft designed a system that simplifies operations while delivering strong cost, latency, and scalability characteristics. By embedding vector search within Cosmos DB, they offer a practical template for integrating semantic capabilities directly into operational workloads.


    Check out the Paper. All credit for this research goes to the researchers of this project.

    The post This AI Paper from Microsoft Introduces a DiskANN-Integrated System: A Cost-Effective and Low-Latency Vector Search Using Azure Cosmos DB appeared first on MarkTechPost.
