
    Nomic Open Sources State-of-the-Art Multimodal Embedding Model

    April 2, 2025

    Nomic has announced the release of “Nomic Embed Multimodal,” an embedding model that achieves state-of-the-art performance on visual document retrieval tasks. The model processes interleaved text, images, and screenshots, and sets a new high score on the ViDoRe-v2 benchmark for visual document retrieval. This matters most for retrieval-augmented generation (RAG) applications that work with PDF documents, where capturing both visual and textual context is crucial.

    Breaking New Ground in Visual Document Retrieval

    The Nomic Embed Multimodal 7B model scores 62.7 NDCG@5 on the ViDoRe-v2 benchmark, a 2.8-point improvement over the previous best-performing models. This marks a significant milestone for multimodal embeddings in document processing.
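
For readers unfamiliar with the metric, the following is a minimal sketch of how NDCG@5 can be computed for a single query. It is not the benchmark's evaluation code: the relevance grades are made up, the linear-gain form of DCG is an assumption (the article does not state the benchmark's exact settings), and leaderboard figures such as 62.7 are presumably this value averaged over all queries and scaled to 0-100.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k ranked relevance grades."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k=5):
    """NDCG@k: DCG of the ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

# Hypothetical relevance grades for one query, listed in the order
# the retriever ranked the candidate pages (1 = relevant, 0 = not).
ranked = [0, 1, 0, 1, 0, 0]
score = ndcg_at_k(ranked, k=5)
print(f"NDCG@5 = {score:.3f}")
```

A ranking that places every relevant page first scores 1.0; pushing relevant pages further down the list lowers the score logarithmically.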

    Unlike traditional retrieval systems that primarily rely on extracted text and often miss crucial visual elements, Nomic’s new model captures the full richness of documents by embedding both text and visual components directly. This approach eliminates the need for complex, error-prone processing pipelines commonly used in document analysis.
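
As an illustration of why a shared embedding space removes the extraction pipeline, the sketch below ranks document pages against a query by cosine similarity. Everything here is hypothetical: the three-dimensional vectors are hand-written stand-ins for what an embedding model would produce for page images and a text query, and no Nomic API is called.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, index, k=2):
    """Return the names of the k indexed items most similar to the query."""
    scored = sorted(index.items(), key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [name for name, _ in scored[:k]]

# Made-up vectors standing in for embeddings of page images and a screenshot.
index = {
    "report.pdf#page=3 (chart)": [0.9, 0.1, 0.2],
    "report.pdf#page=7 (table)": [0.2, 0.8, 0.1],
    "app_screenshot.png": [0.1, 0.2, 0.9],
}

# Made-up vector standing in for the embedding of a text query.
query = [0.8, 0.2, 0.1]
results = top_k(query, index, k=2)
print(results)
```

Because queries and pages live in one vector space, the index can hold pages as images; no OCR or layout parsing step sits between the document and the search.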

    Solving Real-World Document Challenges

    Documents are inherently multimodal, conveying information through text, figures, page layouts, tables, and even fonts. Text-only systems struggle with this complexity, and the usual workarounds require either separate encoders for visual and text inputs or complex preprocessing pipelines.

    Nomic Embed Multimodal provides an elegant solution by supporting interleaved text and image inputs in a single model, making it ideal for:

    • PDF documents and research papers
    • Screenshots of applications and websites
    • Visually rich content where layout matters
    • Multilingual documents where visual context is important

    A Complete Embedding Ecosystem

    With the release of Nomic Embed Multimodal, Nomic has completed a suite of embedding models that achieve state-of-the-art performance across multiple domains:

    • Nomic Embed Multimodal: The latest addition that achieves state-of-the-art performance on interleaved text, images, and screenshots. It is ideal for document retrieval workflows.
    • Nomic Embed Text v2: A powerful multilingual text embedding model that achieves state-of-the-art performance on the MIRACL benchmark. It is ideal for text retrieval workflows in any language.
    • Nomic Embed Code: An embedding model that is specialized for code search applications, achieving a state-of-the-art score on the CodeSearchNet benchmark. It is ideal for code agent applications.

    This complete ecosystem provides developers with cutting-edge tools for handling diverse data types, from pure text to complex multimodal documents and specialized code repositories. Each model in the ecosystem is designed to work seamlessly with modern RAG workflows while delivering best-in-class performance in its domain.

    Availability

    Nomic has made its multimodal embedding models available on Hugging Face, along with the corresponding dataset and GitHub repository, making the technology accessible to researchers and developers worldwide.

    This release represents a significant step forward in multimodal representation learning and document understanding, completing Nomic’s vision of providing state-of-the-art embedding solutions across the full spectrum of data modalities.

    The models will also be available soon on the Nomic Atlas Data and Embedding Platform.


    Thanks to the Nomic team for the thought leadership and resources for this article. The Nomic team supported this article financially and with content.

    The post Nomic Open Sources State-of-the-Art Multimodal Embedding Model appeared first on MarkTechPost.
