Contrastive image-text models play a central role in large-scale text-to-image and image-to-text retrieval systems, yet optimizing them for retrieval accuracy remains a significant challenge. While these models learn joint embeddings through contrastive loss functions that align matching text-image pairs and separate non-matching pairs, they optimize pretraining objectives such as InfoNCE rather than downstream retrieval performance, which leaves the resulting embeddings suboptimal for practical retrieval tasks. Existing methods also struggle with the hubness problem, in which certain retrieval candidates in high-dimensional embedding spaces dominate as nearest neighbors for many queries, producing incorrect matches. Moreover, prior solutions often require substantial computational resources, additional training across domains, or integration with external databases, making them impractical for limited-compute environments or black-box embedding models.
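To make the pretraining objective concrete, here is a minimal sketch of the symmetric InfoNCE loss used by CLIP-style contrastive models. The function name and temperature value are illustrative, not from the paper; the embeddings are assumed to be L2-normalized so that dot products equal cosine similarities.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE over a batch of matched image-text pairs.

    image_emb, text_emb: (batch, dim) tensors, assumed L2-normalized.
    Matching pairs share the same row index; every other row in the
    batch serves as a negative.
    """
    logits = image_emb @ text_emb.T / temperature        # (batch, batch) similarities
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)          # image -> text direction
    loss_t2i = F.cross_entropy(logits.T, targets)        # text -> image direction
    return (loss_i2t + loss_t2i) / 2
```

Because this objective only pushes matched pairs together within a batch, nothing in it directly penalizes a "hub" embedding that sits close to many queries at test time, which is the gap NNN targets.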
Researchers from the Massachusetts Institute of Technology and Stanford University present Nearest Neighbor Normalization (NNN), a robust training-free approach to improving contrastive retrieval performance. The method addresses the limitations of previous approaches with a computationally efficient correction whose time complexity is sublinear in the size of the reference database. At its core, NNN targets embeddings that receive disproportionately high retrieval scores, normalizing each candidate's score using only the k nearest query embeddings from a reference dataset. This targeted approach surpasses existing methods such as QBNorm and DBNorm while adding minimal inference overhead. NNN consistently improves retrieval accuracy across a range of models and datasets and simultaneously reduces harmful biases, such as gender bias, making it a significant advance in contrastive retrieval systems.
Nearest Neighbor Normalization addresses the hubness problem in contrastive text-to-image retrieval as follows. The method computes a match score s(q, r) between a query q and a database retrieval candidate r as the cosine similarity between the corresponding image and text embeddings. To counteract the bias toward hub images that show high cosine similarity with many query captions, NNN estimates a bias b(r) for each retrieval candidate as α times the mean of the k highest similarity scores between r and a reference query dataset D. The final debiased retrieval score subtracts this estimated bias from the original score: s_D(q, r) = s(q, r) − b(r). The method is efficient because the bias scores can be computed offline and cached, while retrieval retains sublinear time complexity through standard vector retrieval techniques.
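The scoring rule above translates directly into code. Below is a minimal NumPy sketch under the paper's definitions; the function names and the k and α values are illustrative, and the dense matrix product is for clarity, whereas the paper obtains the top-k scores with sublinear vector retrieval.

```python
import numpy as np

def compute_bias(retrieval_emb, reference_query_emb, k=16, alpha=0.75):
    """Offline bias estimation for each retrieval candidate.

    retrieval_emb:       (num_candidates, dim) candidate (e.g., image) embeddings
    reference_query_emb: (num_refs, dim) reference query (e.g., caption) embeddings
    Both are assumed L2-normalized, so dot products are cosine similarities.

    Implements b(r) = alpha * mean of the k highest similarities between
    candidate r and the reference query dataset D.
    """
    sims = retrieval_emb @ reference_query_emb.T          # (num_candidates, num_refs)
    top_k = np.partition(sims, -k, axis=1)[:, -k:]        # k highest scores per row
    return alpha * top_k.mean(axis=1)                     # (num_candidates,)

def debiased_scores(query_emb, retrieval_emb, bias):
    """s_D(q, r) = s(q, r) - b(r), with bias precomputed and cached offline."""
    return query_emb @ retrieval_emb.T - bias             # bias broadcasts over queries
```

Since b(r) depends only on the candidate and the reference set, it is computed once per database; at query time the correction is a single vector subtraction on top of the usual similarity search.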
The evaluation of NNN shows consistent performance improvements across multiple contrastive multimodal models, including CLIP, BLIP, ALBEF, SigLIP, and BEiT, with gains in both text-to-image and image-to-text retrieval. The method outperforms existing approaches while requiring far less computation: DBNorm's hyperparameter optimization demands roughly 100 times more compute, whereas NNN achieves superior results with minimal overhead. NNN is also robust, performing consistently on both in-distribution and out-of-distribution queries and remaining effective across varying reference database sizes. On gender bias in occupation-related image retrieval, NNN reduced bias from 0.348 to 0.072 (n = 6) and from 0.270 to 0.078 (n = 10), while improving average precision from 56.5% to 69.6% for Retrieval@1 and from 49.6% to 56.5% for Retrieval@5, demonstrating gains in both fairness and accuracy.
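To see how the correction affects a retrieval metric, one can compare Recall@1 with and without the bias term. This hypothetical snippet reuses the functions from the previous sketch and is not the paper's evaluation harness.

```python
def recall_at_1(query_emb, retrieval_emb, ground_truth, bias=None):
    """Fraction of queries whose top-ranked candidate is the correct match.

    ground_truth[i] is the index of the correct candidate for query i.
    If bias is provided, candidate scores are debiased NNN-style.
    """
    scores = query_emb @ retrieval_emb.T
    if bias is not None:
        scores = scores - bias
    predictions = scores.argmax(axis=1)
    return float((predictions == ground_truth).mean())

# Usage: baseline vs. NNN-corrected retrieval on the same embeddings.
# bias = compute_bias(image_emb, reference_caption_emb)
# print(recall_at_1(caption_emb, image_emb, gt))
# print(recall_at_1(caption_emb, image_emb, gt, bias=bias))
```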
Nearest Neighbor Normalization represents a significant advance in contrastive multimodal retrieval. Its use of k-nearest neighbors to compute bias correction scores is both more efficient and more accurate than existing test-time inference methods, and its effectiveness across different reference datasets and in reducing gender bias marks it as a practical, powerful way to enhance multimodal retrieval systems.
Check out the Paper and GitHub. All credit for this research goes to the researchers of this project.