Close Menu
    DevStackTipsDevStackTips
    • Home
    • News & Updates
      1. Tech & Work
      2. View All

      Sunshine And March Vibes (2025 Wallpapers Edition)

      May 17, 2025

      The Case For Minimal WordPress Setups: A Contrarian View On Theme Frameworks

      May 17, 2025

      How To Fix Largest Contentful Paint Issues With Subpart Analysis

      May 17, 2025

      How To Prevent WordPress SQL Injection Attacks

      May 17, 2025

      Microsoft’s allegiance isn’t to OpenAI’s pricey models — Satya Nadella’s focus is selling any AI customers want for maximum profits

      May 17, 2025

      If you think you can do better than Xbox or PlayStation in the Console Wars, you may just want to try out this card game

      May 17, 2025

      Surviving a 10 year stint in dev hell, this retro-styled hack n’ slash has finally arrived on Xbox

      May 17, 2025

      Save $400 on the best Samsung TVs, laptops, tablets, and more when you sign up for Verizon 5G Home or Home Internet

      May 17, 2025
    • Development
      1. Algorithms & Data Structures
      2. Artificial Intelligence
      3. Back-End Development
      4. Databases
      5. Front-End Development
      6. Libraries & Frameworks
      7. Machine Learning
      8. Security
      9. Software Engineering
      10. Tools & IDEs
      11. Web Design
      12. Web Development
      13. Web Security
      14. Programming Languages
        • PHP
        • JavaScript
      Featured

      NodeSource N|Solid Runtime Release – May 2025: Performance, Stability & the Final Update for v18

      May 17, 2025
      Recent

      NodeSource N|Solid Runtime Release – May 2025: Performance, Stability & the Final Update for v18

      May 17, 2025

      Big Changes at Meteor Software: Our Next Chapter

      May 17, 2025

      Apps in Generative AI – Transforming the Digital Experience

      May 17, 2025
    • Operating Systems
      1. Windows
      2. Linux
      3. macOS
      Featured

      Microsoft’s allegiance isn’t to OpenAI’s pricey models — Satya Nadella’s focus is selling any AI customers want for maximum profits

      May 17, 2025
      Recent

      Microsoft’s allegiance isn’t to OpenAI’s pricey models — Satya Nadella’s focus is selling any AI customers want for maximum profits

      May 17, 2025

      If you think you can do better than Xbox or PlayStation in the Console Wars, you may just want to try out this card game

      May 17, 2025

      Surviving a 10 year stint in dev hell, this retro-styled hack n’ slash has finally arrived on Xbox

      May 17, 2025
    • Learning Resources
      • Books
      • Cheatsheets
      • Tutorials & Guides
    Home»Development»The “Zero-Shot” Mirage: How Data Scarcity Limits Multimodal AI

    The “Zero-Shot” Mirage: How Data Scarcity Limits Multimodal AI

    April 10, 2024

    Imagine an AI system that can recognize any object, comprehend any text, and generate realistic images without being explicitly trained on those concepts. This is the enticing promise of “zero-shot” capabilities in AI. But how close are we to realizing this vision?

    Major tech companies have released impressive multimodal AI models like CLIP for vision-language tasks and DALL-E for text-to-image generation. These models seem to perform remarkably well on a variety of tasks “out-of-the-box” without being explicitly trained on them – the hallmark of zero-shot learning. However, a new study by researchers from Tubingen AI Center, University of Cambridge, University of Oxford, and Google Deepmind casts doubt on the true generalization abilities of these systems.  

    The researchers conducted a large-scale analysis of the data used to pretrain popular multimodal models like CLIP and Stable Diffusion. They looked at over 4,000 concepts spanning images, text, and various AI tasks. Surprisingly, they found that a model’s performance on a particular concept is strongly tied to how frequently that concept appeared in the pretraining data. The more training examples for a concept, the better the model’s accuracy.

    But here’s the kicker – the relationship follows an exponential curve. To get just a linear increase in performance, the model needs to see exponentially more examples of that concept during pre-training. This reveals a fundamental bottleneck – current AI systems are extremely data hungry and sample inefficient when it comes to learning new concepts from scratch.

    The researchers dug deeper and unearthed some other concerning patterns. Most concepts in the pretraining datasets are relatively rare, following a long-tailed distribution. There are also many cases where the images and text captions are misaligned, containing different concepts. This “noise” likely further impairs a model’s generalization abilities.  

    To put their findings to the test, the team created a new “Let It Wag!” dataset containing many long-tailed, infrequent concepts across different domains like animals, objects, and activities. When evaluated on this dataset, all models – big and small, open and private – showed significant performance drops compared to more commonly used benchmarks like ImageNet. Qualitatively, the models often failed to properly comprehend or render images for these rare concepts.

    The study’s key revelation is that while current AI systems excel at specialized tasks, their impressive zero-shot capabilities are somewhat of an illusion. What seems like broad generalization is largely enabled by the models’ immense training on similar data from the internet. As soon as we move away from this data distribution, their performance craters.

    So where do we go from here? One path is improving data curation pipelines to cover long-tailed concepts more comprehensively. Alternatively, model architectures may need fundamental changes to achieve better compositional generalization and sample efficiency when learning new concepts. Lastly, retrieval mechanisms that can enhance or “look up” a pre-trained model’s knowledge could potentially compensate for generalization gaps.  

    In summary, while zero-shot AI is an exciting goal, we aren’t there yet. Uncovering blind spots like data hunger is crucial for sustaining progress towards true machine intelligence. The road ahead is long, but clearly mapped by this insightful study.

    Check out the Paper. All credit for this research goes to the researchers of this project. Also, don’t forget to follow us on Twitter. Join our Telegram Channel, Discord Channel, and LinkedIn Group.

    If you like our work, you will love our newsletter..

    Don’t Forget to join our 40k+ ML SubReddit

    The post The “Zero-Shot” Mirage: How Data Scarcity Limits Multimodal AI appeared first on MarkTechPost.

    Source: Read More 

    Facebook Twitter Reddit Email Copy Link
    Previous ArticleResearchers at Stanford and MIT Introduced the Stream of Search (SoS): A Machine Learning Framework that Enables Language Models to Learn to Solve Problems by Searching in Language without Any External Support
    Next Article HuggingFace Releases Parler-TTS: An Inference and Training Library for High-Quality, Controllable Text-to-Speech (TTS) Models

    Related Posts

    Development

    February 2025 Baseline monthly digest

    May 17, 2025
    Development

    Learn A1 Level Spanish

    May 17, 2025
    Leave A Reply Cancel Reply

    Hostinger

    Continue Reading

    Optimize your database storage for Oracle workloads on AWS, Part 2: Using hybrid partitioning and ILM data movement policies

    Databases

    Configure change data capture parameters on Amazon RDS for SQL Server

    Databases

    CISA Warns of Critical Jenkins Vulnerability Exploited in Ransomware Attacks

    Development

    MOFI: Learning Image Representation from Noisy Entity Annotated Images

    Development

    Highlights

    Machine Learning

    Best practices for Amazon SageMaker HyperPod task governance

    February 19, 2025

    At AWS re:Invent 2024, we launched a new innovation in Amazon SageMaker HyperPod on Amazon…

    Create Custom Tattoo Designs – ArtiTattoos

    July 9, 2024

    CERT-UA Warns: Dark Crystal RAT Targets Ukrainian Defense via Malicious Signal Messages

    March 20, 2025

    MLOps vs DevOps: Understanding the Key Differences and Synergies

    May 6, 2025
    © DevStackTips 2025. All rights reserved.
    • Contact
    • Privacy Policy

    Type above and press Enter to search. Press Esc to cancel.