    MedVersa: A Generalist Learner that Enables Flexible Learning and Tasking for Medical Image Interpretation

    May 31, 2024

    Despite advances in artificial intelligence for medicine, most existing systems remain narrow in scope, which leaves a gap between what the models can do and what clinical practice needs. Researchers from Harvard Medical School, USA; Jawaharlal Institute of Postgraduate Medical Education and Research, India; and Scripps Research Translational Institute, USA, proposed MedVersa to address the challenges that hinder the widespread adoption of medical AI systems in clinical practice. The key issue is that existing models are task-specific, so they cannot adapt to the diverse and complex needs of healthcare settings. MedVersa, a generalist learner capable of multifaceted medical image interpretation, aims to solve these challenges.

    Current medical AI systems are predominantly designed for specific tasks, such as identifying chest pathologies or classifying skin diseases. This task-specific design limits their adaptability and usability in real-world clinical scenarios. In contrast, MedVersa is a generalist learner that uses a large language model as a learnable orchestrator. Its architecture lets it learn from both visual and linguistic supervision and supports multimodal inputs and real-time task specification. Unlike previous generalist medical AI models, which rely solely on natural language supervision, MedVersa integrates vision-centric capabilities, allowing it to perform tasks such as detection and segmentation, which are crucial for medical image interpretation.

    MedVersa has three key components: the multimodal input coordinator, the large language model-based learnable orchestrator, and a set of learnable vision modules. The multimodal input coordinator processes both visual and textual inputs, while the large language model orchestrates the execution of tasks using the language and vision modules. This architecture enables MedVersa to handle both vision-language tasks, such as generating radiology reports, and vision-centric tasks, such as detecting anatomical structures and segmenting medical images. To train the model, the researchers combined more than 10 publicly available medical datasets covering various tasks, including MIMIC-CXR, Chest ImaGenome, and Medical-Diff-VQA, into one multimodal dataset, MedInterp.
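The routing idea described above can be sketched in a few lines. This is a toy illustration only: the `<DET>` and `<SEG>` token names, the `Request` type, and the keyword-based `stub_orchestrator` are all assumptions made for this example and are not taken from the MedVersa paper or code, where a trained language model makes the routing decision.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Request:
    instruction: str  # free-text task specification from the user
    image: Any        # placeholder for a 2D or 3D medical image


def stub_orchestrator(instruction: str) -> str:
    """Stand-in for the LLM orchestrator (hypothetical keyword rules).

    Returns either a routing token for a vision module or plain text.
    """
    text = instruction.lower()
    if "segment" in text:
        return "<SEG>"
    if "detect" in text or "locate" in text:
        return "<DET>"
    return "Placeholder report text."


# Hypothetical vision modules keyed by the routing token that triggers them.
VISION_MODULES: Dict[str, Callable[[Any], dict]] = {
    "<DET>": lambda image: {"task": "detection", "boxes": []},
    "<SEG>": lambda image: {"task": "segmentation", "mask": None},
}


def run(request: Request) -> dict:
    """Route a request to a vision module, or answer in text."""
    output = stub_orchestrator(request.instruction)
    module = VISION_MODULES.get(output)
    if module is None:
        # No routing token: treat the orchestrator output as the answer.
        return {"task": "language", "text": output}
    return module(request.image)


if __name__ == "__main__":
    print(run(Request("Segment the left lung", image=None)))
    print(run(Request("Write a report for this chest X-ray", image=None)))
```

The point of the design is that the task is chosen at request time from the instruction, rather than being fixed when the model is built.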

    MedVersa coordinates multimodal inputs using distinct vision encoders and an orchestrator optimized for medical tasks. For the 2D vision encoder, the researchers used the base version of the Swin Transformer pre-trained on ImageNet; for the 3D vision encoder, they used the encoder architecture from the 3D UNet. They cropped 50–100% of the original images, resized them to 224 x 224 pixels with three channels, and applied further task-specific augmentations. The system also uses two separate linear projectors, one for 2D data and one for 3D data. The orchestrator is trained with Low-Rank Adaptation (LoRA). LoRA approximates the update to a large weight matrix in a neural network layer with the product of two low-rank matrices, so only those small matrices are trained while the original weights stay frozen. With the LoRA rank and alpha both set to 16, training stays efficient because only a small fraction of the model's parameters is updated.
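To make the LoRA step concrete, here is a minimal pure-Python sketch of the standard LoRA formulation, in which the effective weight is W + (alpha / rank) * B A. The rank and alpha of 16 come from the article; the matrix sizes, the initialization scales, and the helper names are assumptions chosen for illustration and are not MedVersa's actual training code.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_effective_weight(w, a, b, alpha, rank):
    """Return W + (alpha / rank) * (B @ A); W itself is left unchanged."""
    scale = alpha / rank
    delta = matmul(b, a)  # low-rank update with the same shape as W
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def count_params(d_out, d_in, rank):
    """Return (full-matrix parameters, trainable LoRA parameters)."""
    return d_out * d_in, rank * (d_in + d_out)


random.seed(0)
d_out, d_in = 64, 64    # toy layer size (assumption)
rank, alpha = 16, 16    # values reported in the article

# Frozen pretrained weight, plus the two trainable low-rank factors.
w = [[random.gauss(0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
b = [[0.0] * rank for _ in range(d_out)]  # zero init: no change at start

w_eff = lora_effective_weight(w, a, b, alpha, rank)
assert w_eff == w  # with B = 0 the adapted layer equals the original

full, lora = count_params(4096, 4096, rank)  # hypothetical 4096 x 4096 layer
print(f"trainable fraction: {lora / full:.4%}")
```

For the hypothetical 4096 x 4096 layer, the two factors hold 131,072 parameters against 16,777,216 in the full matrix, which is under 1% of the weights. With rank equal to alpha, the scaling factor alpha / rank is 1, so the low-rank product is added to the frozen weight unscaled.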

    MedVersa outperforms existing state-of-the-art models across multiple tasks, including radiology report generation and chest pathology classification. Its ability to adapt to impromptu task specifications, together with its consistent performance across external cohorts, indicates robustness and generalization. In chest pathology classification, MedVersa reaches an average F1 score of 0.615, compared with 0.580 for DAM. For detection tasks, MedVersa surpasses YOLOv5 in detecting a variety of anatomical structures, with higher IoU scores on most structures, particularly lung zones. Incorporating vision-centric training alongside vision-language training yielded an average improvement of 4.1% over models trained solely on vision-language data.
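For readers unfamiliar with the two metrics quoted above, the following sketch shows how F1 and IoU are conventionally computed. It uses the standard textbook definitions; the function names and the example boxes and counts are made up for illustration, and the paper's exact evaluation protocol (for example, how scores are averaged across pathologies) may differ.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two boxes (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap; zero when the boxes do not intersect.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(per_class_counts):
    """Unweighted mean of per-class F1 over (tp, fp, fn) tuples."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in per_class_counts]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Two 2 x 2 boxes overlapping in a 1 x 1 square: IoU = 1 / 7.
    print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
    # Made-up counts for two classes, averaged without weighting.
    print(macro_f1([(8, 2, 2), (5, 5, 0)]))
```

An average F1 such as the 0.615 reported for MedVersa is typically a mean of per-class scores of this kind, and a per-structure IoU compares each predicted box with its ground-truth box.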

    In conclusion, the study presents a state-of-the-art generalist medical AI (GMAI) model that supports multimodal inputs, multimodal outputs, and on-the-fly task specification. By integrating visual and linguistic supervision in its learning process, MedVersa demonstrates superior performance across a wide range of tasks and modalities. Its adaptability and versatility make it an important resource in medical AI, paving the way for more thorough and efficient AI-assisted clinical decision-making.

    Check out the Paper. All credit for this research goes to the researchers of this project.

    The post MedVersa: A Generalist Learner that Enables Flexible Learning and Tasking for Medical Image Interpretation appeared first on MarkTechPost.
