MixedBread AI Introduces Binary MRL: A Novel Embeddings Compression Method, Making Vector Search Scalable and Enable Embeddings-based Applications

Mixedbread.ai recently introduced Binary MRL, a compression method that produces 64-byte embeddings, to address the challenge of scaling embeddings in natural language processing (NLP) applications. Embeddings play a vital role in tasks such as recommendation systems, retrieval, and similarity search. However, their memory requirements pose a significant challenge, particularly when dealing with massive datasets. Binary MRL aims to decrease the memory used by embeddings while maintaining their utility and effectiveness in NLP applications.

Currently, state-of-the-art models produce high-dimensional embeddings (e.g., 1024 dimensions) encoded in float32 format. At 4 bytes per value, a single 1024-dimensional embedding occupies 4,096 bytes, so storage and retrieval require large amounts of memory. To address these limitations, researchers at mixedbread.ai draw on two main approaches: Matryoshka Representation Learning (MRL) and Vector Quantization. MRL focuses on reducing the number of output dimensions of an embedding model while preserving accuracy. This is done by concentrating the most important information in the earlier dimensions of the embedding, which lets the less important later dimensions be cut off. Vector Quantization, on the other hand, reduces the size of each dimension; here, each value is represented as a binary value instead of a floating-point number.
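The two building blocks described above can be sketched separately in NumPy. This is a minimal illustration, not mixedbread.ai's implementation: the function names `truncate_embedding` and `binarize_embedding` are hypothetical, the random vectors stand in for real model output, and thresholding at zero is an assumed (though common) way to map a float to one bit. Truncation only preserves quality when the model was trained with MRL, which random data cannot demonstrate.

```python
import numpy as np


def truncate_embedding(emb: np.ndarray, dims: int) -> np.ndarray:
    """MRL-style truncation: keep the first `dims` values, then re-normalize."""
    truncated = emb[..., :dims]
    norms = np.linalg.norm(truncated, axis=-1, keepdims=True)
    # Guard against division by zero for an all-zero vector.
    return truncated / np.clip(norms, 1e-12, None)


def binarize_embedding(emb: np.ndarray) -> np.ndarray:
    """Binary quantization: one bit per dimension (1 if value > 0, else 0),
    packed 8 bits to a byte. Assumes the dimension count is a multiple of 8."""
    bits = (emb > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)


# Stand-in for model output: 3 embeddings, 1024 float32 dimensions each.
rng = np.random.default_rng(0)
full = rng.standard_normal((3, 1024)).astype(np.float32)

short = truncate_embedding(full, 512)   # fewer dimensions, still float32
packed = binarize_embedding(full)       # same dimensions, 1 bit each
```

Here `short` keeps 512 float32 values per embedding, while `packed` stores all 1024 dimensions in 128 bytes. Each technique attacks a different factor of the memory cost: the number of dimensions versus the bytes per dimension.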

The proposed approach, Binary MRL, combines both methods to achieve simultaneous dimensionality reduction and compression of embeddings. By integrating MRL and Vector Quantization, Binary MRL aims to retain the semantic information encoded in embeddings while significantly reducing their memory footprint.

Binary MRL achieves compression in two steps. First, it reduces the number of output dimensions of the embedding model using MRL. This involves training the model to preserve important information in fewer dimensions, so that the less relevant dimensions can be truncated. Second, Vector Quantization represents each dimension of the reduced embedding as a binary value. This binary representation significantly reduces the memory usage of embeddings while retaining semantic information. The evaluation of Binary MRL on various datasets demonstrates that the method can achieve over 90% of the performance of the original model while using significantly smaller embeddings.
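The two steps can be combined in a few lines, which also makes the memory arithmetic concrete. The sketch below is an assumption-laden illustration rather than the published method: `binary_mrl` is a hypothetical helper, the baseline is assumed to be a 1024-dimensional float32 embedding, and the 512-dimension cut-off is inferred from the 64-byte figure (64 bytes × 8 bits = 512 one-bit dimensions).

```python
import numpy as np


def binary_mrl(emb: np.ndarray, dims: int = 512) -> np.ndarray:
    """Truncate to the first `dims` dimensions, then keep one sign bit each.

    No re-normalization is needed before thresholding at zero, because
    scaling a vector by a positive constant does not change any sign.
    """
    truncated = emb[..., :dims]
    bits = (truncated > 0).astype(np.uint8)
    return np.packbits(bits, axis=-1)


# Stand-in corpus: 1000 embeddings of 1024 float32 values (random, not real).
rng = np.random.default_rng(0)
full = rng.standard_normal((1000, 1024)).astype(np.float32)
compressed = binary_mrl(full, dims=512)

bytes_full = full.shape[1] * full.itemsize        # 1024 * 4 = 4096 bytes
bytes_compressed = compressed.shape[1]            # 512 / 8  = 64 bytes
ratio = bytes_full / bytes_compressed             # 64.0
saving = 1 - bytes_compressed / bytes_full        # 0.984375
```

Under these assumptions each embedding shrinks from 4,096 bytes to 64 bytes, a 64x reduction or roughly 98.4% less memory.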

In conclusion, Binary MRL represents a novel approach to addressing the scalability challenges of embeddings in NLP applications. By combining techniques from MRL and Vector Quantization, Binary MRL achieves significant compression of embeddings while preserving their utility and effectiveness. Not only does this method reduce the costs of large-scale retrieval, but it also enables tasks that were previously infeasible because of memory limits.
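To give a sense of why packed binary codes make large-scale retrieval cheaper, here is a minimal sketch of nearest-neighbor search over 64-byte codes using Hamming distance (the number of differing bits). The article does not describe the search procedure, so the choice of Hamming distance, the helper names `hamming_distances` and `top_k`, and the random corpus are all assumptions made for illustration.

```python
import numpy as np

# Lookup table: number of set bits in every possible byte value.
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)


def hamming_distances(query: np.ndarray, corpus: np.ndarray) -> np.ndarray:
    """Count differing bits between one packed code and every corpus code."""
    xor = np.bitwise_xor(corpus, query)            # shape: (n, n_bytes)
    return POPCOUNT[xor].sum(axis=1, dtype=np.int64)


def top_k(query: np.ndarray, corpus: np.ndarray, k: int = 3) -> np.ndarray:
    """Indices of the k corpus codes closest to the query."""
    distances = hamming_distances(query, corpus)
    return np.argsort(distances, kind="stable")[:k]


# Hypothetical corpus of 100 packed codes, 64 bytes (512 bits) each.
rng = np.random.default_rng(0)
corpus = rng.integers(0, 256, size=(100, 64), dtype=np.uint8)
query = corpus[42].copy()

nearest = top_k(query, corpus)
```

Because the query is a copy of entry 42, that entry comes back first with distance zero. The comparison uses only XOR and bit counting on small integer arrays, which is far lighter than floating-point dot products over 1024 float32 values.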

Follow-up on binary embeddings: 64 bytes per embedding, yee-haw

Reduces memory usage of our embedding model by more than 98% (64x) while retaining over 90% of model performance with binary

Model: https://t.co/ZlbEJf3DKi
Blog: https://t.co/ZaalEm0U92

- mixedbreadai (@mixedbreadai) April 12, 2024

The post MixedBread AI Introduces Binary MRL: A Novel Embeddings Compression Method, Making Vector Search Scalable and Enable Embeddings-based Applications appeared first on MarkTechPost.
