
    LLM2Vec: A Simple AI Approach to Transform Any Decoder-Only LLM into a Text Encoder Achieving SOTA Performance on MTEB in the Unsupervised and Supervised Category

    April 12, 2024

    Natural Language Processing (NLP) tasks rely heavily on text embedding models, which translate the semantic meaning of text into vector representations. These representations make it possible to efficiently tackle a variety of NLP tasks, including information retrieval, clustering, and semantic textual similarity.
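
    As a concrete illustration of how such vector representations are consumed downstream, here is a minimal sketch that ranks documents by the cosine similarity of their embeddings to a query embedding. The helper names are hypothetical and not tied to any particular embedding model.

    import numpy as np

    def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
        # Cosine similarity: higher means the two texts are semantically closer.
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    def rank_documents(query_vec: np.ndarray, doc_vecs: list[np.ndarray]) -> list[int]:
        # Return document indices ordered from most to least similar to the query.
        scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
        return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)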

    Pre-trained bidirectional encoders and encoder-decoders, such as BERT and T5, have historically been the preferred models for this purpose. Lately, however, the trend in text embedding tasks has shifted toward decoder-only Large Language Models (LLMs).

    Within NLP, decoder-only LLMs have been slow to take off for text embedding tasks. Their causal attention mechanism, which restricts their capacity to produce rich contextualized representations, is partly responsible: with causal attention, the representation of each token is determined only by the tokens that come before it, limiting the model’s ability to draw information from the full input sequence.
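
    To make this restriction concrete, the sketch below contrasts a causal attention mask with a fully bidirectional one. It is illustrative only; actual implementations apply these masks inside the model’s attention layers.

    import torch

    seq_len = 5

    # Causal mask: position i may only attend to positions <= i, so each token's
    # representation ignores everything that comes after it.
    causal_mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))

    # Bidirectional mask: every position attends to the full sequence.
    bidirectional_mask = torch.ones(seq_len, seq_len, dtype=torch.bool)

    print(causal_mask.int())
    print(bidirectional_mask.int())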

    Despite this drawback, decoder-only LLMs are superior to their encoder-only counterparts in several ways. They are more sample-efficient because they learn from all input tokens during pre-training, and they benefit from an existing ecosystem with a wealth of tooling and pre-training recipes. Thanks to recent improvements in instruction fine-tuning, decoder-only LLMs are now highly proficient at instruction-following tasks, which makes them adaptable to a wide range of NLP applications.

    To overcome this drawback of decoder-only LLMs for text embedding, a team of researchers from Mila, McGill University, ServiceNow Research, and the Facebook CIFAR AI Chair has proposed LLM2Vec, a straightforward, unsupervised method for converting any pre-trained decoder-only LLM into a text encoder. LLM2Vec is highly data- and parameter-efficient and does not require any labeled data.

    LLM2Vec consists of three simple steps. First, it enables bidirectional attention, which lets the model build representations that take into account tokens both before and after each position. Second, it applies masked next token prediction, in which the model predicts tokens that have been masked out of the input sequence, helping it efficiently comprehend and encode contextual information. Finally, it uses unsupervised contrastive learning, which contrasts similar and dissimilar examples in the embedding space to help the model develop robust representations. A sketch of the latter two objectives follows below.
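
    The sketch below illustrates the two training objectives. It is an assumption-laden illustration rather than the authors’ implementation: the shapes and loss formulations follow the description above, the temperature value is a placeholder, and enabling bidirectional attention is indicated only by a comment because it happens inside the model’s attention layers.

    import torch
    import torch.nn.functional as F

    # Step 1 (not shown as code): replace the causal attention mask with a full
    # mask so that every position can attend to the whole sequence.

    def mntp_loss(logits: torch.Tensor, labels: torch.Tensor,
                  masked: torch.Tensor) -> torch.Tensor:
        # Step 2: masked next token prediction. A masked token at position i is
        # predicted from the logits at position i - 1.
        # logits: (batch, seq, vocab); labels: (batch, seq); masked: (batch, seq) bool.
        shifted_logits = logits[:, :-1, :]
        targets = labels[:, 1:]
        keep = masked[:, 1:]
        return F.cross_entropy(shifted_logits[keep], targets[keep])

    def unsupervised_contrastive_loss(z1: torch.Tensor, z2: torch.Tensor,
                                      temperature: float = 0.05) -> torch.Tensor:
        # Step 3: SimCSE-style contrastive loss. z1 and z2 are embeddings of the
        # same sentences produced with different dropout masks; matching rows are
        # positives and all other rows in the batch act as negatives.
        z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
        sims = z1 @ z2.T / temperature
        targets = torch.arange(z1.size(0), device=z1.device)
        return F.cross_entropy(sims, targets)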

    To verify its effectiveness, LLM2Vec has been applied to three well-known LLMs with parameter counts ranging from 1.3 billion to 7 billion. The team tested the converted models on a range of word- and sequence-level tasks in English. The tests showed notable performance gains over conventional encoder-only models, especially on word-level tasks, and the method set a new state of the art among unsupervised approaches on the Massive Text Embeddings Benchmark (MTEB).
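
    For sequence-level evaluation, the per-token representations have to be pooled into a single vector per input. A common choice is mean pooling over non-padding tokens, sketched below; whether this matches the exact pooling used in the paper’s experiments is an assumption here.

    import torch

    def mean_pool(hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq, dim); attention_mask: (batch, seq), 1 = real token.
        mask = attention_mask.unsqueeze(-1).float()
        summed = (hidden * mask).sum(dim=1)
        counts = mask.sum(dim=1).clamp(min=1e-9)
        return summed / counts  # (batch, dim) sequence embeddings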

    The team has shared that combining LLM2Vec with supervised contrastive learning attains state-of-the-art results on MTEB. The extensive experiments and empirical results highlight how well LLMs can serve as universal text encoders, and the transformation is accomplished in a parameter-efficient way, without expensive adaptation or synthetic data generated by proprietary models such as GPT-4.

    Check out the Paper. All credit for this research goes to the researchers of this project.

    The post LLM2Vec: A Simple AI Approach to Transform Any Decoder-Only LLM into a Text Encoder Achieving SOTA Performance on MTEB in the Unsupervised and Supervised Category appeared first on MarkTechPost.
