
NuMind Releases Three SOTA NER Models that Outperform Similar-Sized Foundation Models in the Few-Shot Regime and Compete with Much Larger LLMs

    May 17, 2024

    Named Entity Recognition (NER) is vital in natural language processing, with applications spanning medical coding, financial analysis, and legal document parsing. Custom models are typically created using transformer encoders pre-trained on self-supervised tasks like masked language modeling (MLM). However, recent years have seen the rise of large language models (LLMs) like GPT-3 and GPT-4, which can tackle NER tasks through well-crafted prompts but pose challenges due to high inference costs and potential privacy concerns.

The NuMind team introduces an approach that uses LLMs to minimize the human annotation needed to create custom models. Rather than employing an LLM to annotate a single-domain dataset for one specific NER task, the idea is to use the LLM to annotate a diverse, multi-domain dataset covering many NER problems. A smaller foundation model such as BERT is then further pre-trained on this annotated dataset, and the resulting model can be fine-tuned for any downstream NER task.
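To make the pipeline concrete, here is a minimal sketch of that final fine-tuning step using the Hugging Face transformers library. The checkpoint name and label set are placeholder assumptions for illustration, not NuMind's actual training code.

```python
# Minimal sketch: adapting a compact encoder to a downstream NER task.
# "bert-base-cased" stands in for a NuNER-style pre-trained encoder, and
# the label set is a hypothetical example.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

labels = ["O", "B-DISEASE", "I-DISEASE"]  # hypothetical task-specific tags

tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-cased",
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={l: i for i, l in enumerate(labels)},
)

# A single forward pass; in practice the model is first trained on labeled
# examples (e.g., with transformers.Trainer) before being used.
enc = tokenizer("Patients with type 2 diabetes were excluded.",
                return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits            # shape: (1, seq_len, num_labels)
pred = [labels[i] for i in logits.argmax(-1)[0].tolist()]
print(list(zip(tokenizer.convert_ids_to_tokens(enc["input_ids"][0].tolist()), pred)))
```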

The team introduces the following three NER models:

NuNER Zero: A zero-shot NER model that adopts the GLiNER (Generalist Model for Named Entity Recognition using Bidirectional Transformer) architecture and takes as input a concatenation of entity types and text. Unlike GLiNER, NuNER Zero functions as a token classifier, enabling the detection of arbitrarily long entities. It is trained on the NuNER v2.0 dataset, which merges subsets of Pile and C4 annotated by LLMs using NuNER's procedure. NuNER Zero is the leading compact zero-shot NER model, with a +3.1% token-level F1 score improvement over GLiNER-large-v2.1 on GLiNER's benchmark.
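For illustration, here is a minimal zero-shot inference sketch using the open-source gliner Python package, which the model's Hugging Face card builds on. The example text, labels, and threshold are assumptions, and exact arguments may vary across library versions.

```python
# Minimal sketch of zero-shot NER with NuNER Zero through the open-source
# `gliner` package (pip install gliner). Text, labels, and threshold are
# illustrative assumptions.
from gliner import GLiNER

model = GLiNER.from_pretrained("numind/NuNER_Zero")

text = "NuMind, founded in 2022, released NuNER Zero in May 2024."
labels = ["organization", "date", "model"]   # entity types, lowercased

# The model sees a concatenation of the entity types and the text and
# returns character-level spans with labels and confidence scores.
entities = model.predict_entities(text, labels, threshold=0.5)
for e in entities:
    print(f'{e["text"]} => {e["label"]} ({e["score"]:.2f})')
```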


NuNER Zero 4k: The long-context (4k-token) version of NuNER Zero. It is generally less performant than NuNER Zero but can outperform it in applications where context size matters.


NuNER Zero-span: The span-prediction version of NuNER Zero. It shows slightly better performance than NuNER Zero but cannot detect entities longer than 12 tokens.
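Because NuNER Zero classifies tokens rather than predicting whole spans, adjacent same-label predictions can be merged into a single entity of any length, which is exactly what the 12-token cap of the span version precludes. A merging step, sketched here as an illustrative helper modeled loosely on the one in the NuNER Zero model card, might look like this:

```python
# Illustrative helper: merge adjacent same-label token predictions into one
# entity, letting a token classifier recover arbitrarily long entities.
# `entities` is assumed to be a list of dicts with character-offset "start",
# "end", and "label" keys, as returned by gliner's predict_entities.
def merge_entities(entities, text):
    if not entities:
        return []
    merged = [dict(entities[0])]
    for ent in entities[1:]:
        prev = merged[-1]
        # Merge if labels match and only whitespace separates the spans.
        if ent["label"] == prev["label"] and not text[prev["end"]:ent["start"]].strip():
            prev["end"] = ent["end"]
            prev["text"] = text[prev["start"]:prev["end"]]
        else:
            merged.append(dict(ent))
    return merged
```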


The key features of the three models:

NuNER Zero: Derived from NuNER; a good default for entities of moderate token length.

NuNER Zero 4k: A long-context variant that performs better in scenarios where context size matters.

NuNER Zero-span: The span-prediction version of NuNER Zero; slightly more accurate, but unsuitable for entities longer than 12 tokens.

In conclusion, NER is crucial in natural language processing, yet creating custom models typically relies on transformer encoders trained via MLM, and the alternative of prompting LLMs like GPT-3 and GPT-4 carries high inference costs. The NuMind team proposes using LLMs to reduce human annotation by annotating a diverse, multi-domain dataset and pre-training a compact encoder on it. On top of this, the team introduces three NER models: NuNER Zero, a compact zero-shot model; NuNER Zero 4k, its long-context version; and NuNER Zero-span, a span-prediction version with slightly better performance but limited to entities of at most 12 tokens.

    Sources

    https://huggingface.co/numind/NuNER_Zero-4k

    https://huggingface.co/numind/NuNER_Zero

    https://huggingface.co/numind/NuNER_Zero-span

    https://arxiv.org/pdf/2402.15343

    https://www.linkedin.com/posts/tomaarsen_numind-yc-s22-has-just-released-3-new-state-of-the-art-activity-7195863382783049729-kqko/?utm_source=share&utm_medium=member_ios

This post originally appeared on MarkTechPost.
