Google AI Proposes LANISTR: An Attention-based Machine Learning Framework to Learn from Language, Image, and Structured Data

    May 26, 2024

Google Cloud AI researchers have introduced LANISTR to address the challenge of handling unstructured and structured data effectively and efficiently within a single framework. In machine learning, handling multimodal data comprising language, images, and structured data is increasingly important. The key challenge is missing modalities in large-scale, unlabeled, and structured data such as tables and time series. Traditional methods often struggle when one or more types of data are absent, leading to suboptimal model performance.

    Current methods for multimodal data pre-training typically rely on the availability of all modalities during training and inference, which is often not feasible in real-world scenarios. These methods include various forms of early and late fusion techniques, where data from different modalities is combined either at the feature level or the decision level. However, these approaches are not well-suited to situations where some modalities might be entirely missing or incomplete. 
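To make that limitation concrete, the sketch below contrasts early fusion (combining modalities at the feature level) with late fusion (combining them at the decision level). It is an illustration only, not code from LANISTR; the feature dimensions, classification heads, and PyTorch setup are assumed for the example.

```python
import torch
import torch.nn as nn

# Hypothetical per-example features for two modalities (batch of 8).
text_feat = torch.randn(8, 128)
image_feat = torch.randn(8, 256)

# Early fusion: concatenate features, then classify jointly.
early_head = nn.Linear(128 + 256, 2)
early_logits = early_head(torch.cat([text_feat, image_feat], dim=-1))

# Late fusion: classify each modality separately, then combine the decisions.
text_head, image_head = nn.Linear(128, 2), nn.Linear(256, 2)
late_logits = 0.5 * (text_head(text_feat) + image_head(image_feat))

# If image_feat is missing for some examples, neither path works without
# imputation or retraining, which is the gap LANISTR is designed to address.
```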

Google’s LANISTR (Language, Image, and Structured Data Transformer) is a novel pretraining framework that leverages unimodal and multimodal masking strategies to create a robust pretraining objective able to handle missing modalities effectively. The framework is built on a similarity-based multimodal masking objective, which enables it to learn from the available data while making principled inferences about the missing modalities. The framework aims to improve the adaptability and generalizability of multimodal models, particularly in scenarios with limited labeled data.

    The LANISTR framework employs unimodal masking, where parts of the data within each modality are masked during training. This forces the model to learn contextual relationships within the modality. For example, in text data, certain words might be masked, and the model learns to predict these based on surrounding words. In images, certain patches might be masked, and the model learns to infer these from the visible parts. 
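A minimal sketch of such a unimodal masking step, assuming a PyTorch setup with token IDs for text and ViT-style patch embeddings for images, might look like the following; the masking ratios, mask-token ID, and tensor shapes are illustrative rather than taken from the paper.

```python
import torch

def mask_tokens(token_ids, mask_id, p=0.15):
    """Replace a random fraction p of text tokens with a [MASK] id.
    The model is trained to recover the original tokens at masked positions."""
    mask = torch.rand(token_ids.shape) < p
    masked = token_ids.clone()
    masked[mask] = mask_id
    return masked, mask

def mask_patches(patches, p=0.5):
    """Zero out a random fraction p of image patches (batch, patches, dim).
    The model is trained to reconstruct the hidden patches from the visible ones."""
    mask = torch.rand(patches.shape[:2]) < p
    masked = patches.clone()
    masked[mask] = 0.0
    return masked, mask

# Hypothetical inputs: a batch of 4 text sequences and 4 images split into patches.
token_ids = torch.randint(5, 30000, (4, 32))
patches = torch.randn(4, 196, 768)

masked_ids, text_mask = mask_tokens(token_ids, mask_id=103)
masked_patches, patch_mask = mask_patches(patches)
```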

Multimodal masking extends this concept by masking entire modalities. For instance, in a dataset containing text, images, and structured data, one or two modalities might be masked entirely at random during training. The model is then trained to predict the masked modalities from the available ones. This is where the similarity-based objective comes into play: the model is guided by a similarity measure, ensuring that the representations generated for the missing modalities are coherent with the available data.
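The following sketch shows one way a similarity-based objective for a fully masked modality could be wired up. The cosine-similarity loss, the averaged fusion of the remaining modalities, and the prediction head are assumptions made for illustration, not the published LANISTR implementation.

```python
import torch
import torch.nn.functional as F

def similarity_loss(predicted, target):
    """Pull the representation predicted for a masked modality toward the
    representation obtained when that modality is actually present."""
    return 1.0 - F.cosine_similarity(predicted, target, dim=-1).mean()

# Hypothetical unimodal embeddings for a batch of 8 examples.
text_emb = torch.randn(8, 512)
image_emb = torch.randn(8, 512)
struct_emb = torch.randn(8, 512)

# Mask the image modality entirely for this step and predict it from the rest.
fused_without_image = (text_emb + struct_emb) / 2   # stand-in for the fusion encoder
predict_image = torch.nn.Linear(512, 512)           # hypothetical prediction head

loss = similarity_loss(predict_image(fused_without_image), image_emb)
loss.backward()  # gradients reach the prediction head (and, in practice, the encoders)
```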

The efficacy of LANISTR was evaluated on two real-world datasets: the Amazon Product Review dataset from the retail sector and the MIMIC-IV dataset from the healthcare sector. LANISTR proved effective in out-of-distribution scenarios, where the model encountered data distributions not seen during training. This robustness is crucial in real-world applications, where data variability is a common challenge. LANISTR achieved significant gains in accuracy and generalization even with limited availability of labeled data.

In conclusion, LANISTR addresses a critical problem in multimodal machine learning: missing modalities in large-scale unlabeled datasets. By combining unimodal and multimodal masking strategies with a similarity-based multimodal masking objective, LANISTR enables robust and efficient pretraining. The evaluation demonstrates that LANISTR can learn effectively from incomplete data and generalize well to new, unseen data distributions, making it a valuable tool for advancing multimodal learning.

Check out the Paper and Blog. All credit for this research goes to the researchers of this project.

Source: MarkTechPost