    RadGraph2: A New Dataset for Tracking Disease Progression in Radiology Reports

    August 10, 2024

    Automated information extraction from radiology notes presents significant challenges in the field of medical informatics. Researchers are trying to develop systems that can accurately extract and interpret complex medical data from radiological reports, particularly focusing on tracking disease progression over time. The primary challenge lies in the limited availability of suitably labeled data that can capture the nuanced information contained in these reports. Current methodologies often struggle with representing the temporal aspects of patient conditions, especially when it comes to comparisons with prior examinations, which are crucial for understanding a patient’s healthcare trajectory.

    To overcome the limitations in capturing temporal changes in radiology reports, researchers have developed RadGraph2, an enhanced hierarchical schema for entities and relations. This new approach builds upon the original RadGraph schema, expanding its capabilities to represent various types of changes observed in patient conditions over time. RadGraph2 was developed through an iterative process, involving continuous feedback from medical practitioners to ensure its coverage, faithfulness, and reliability. The schema maintains the original design principles of maximizing clinically relevant information while preserving simplicity for efficient labeling. This method enables the capture of detailed information about findings and changes described in radiology reports, particularly focusing on comparisons with prior examinations.

    The RadGraph2 method employs a Hierarchical Graph Information Extraction (HGIE) model to annotate radiology reports automatically. The model exploits the structured organization of the labels to improve extraction performance. Its core is a Hierarchical Recognition (HR) component built on an entity taxonomy that encodes the inherent relationships between the entity labels used in the graph. For instance, CHAN-CON-WOR and CHAN-CON-AP are both categorized under changes in patient conditions. The HR component uses a BERT-based model as its backbone and produces 12 scalar outputs, one per entity category. Each output is the conditional probability that a category applies, given that its parent category in the hierarchy applies.
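One natural way to read such conditional outputs is through the chain rule: the probability that a leaf category applies is the product of the conditional probabilities along its path to the root. The sketch below illustrates that reading only. The taxonomy slice, the function name `marginal_probability`, and the probability values are assumptions made for illustration and are not taken from the RadGraph2 paper or code.

```python
# Hypothetical slice of the entity taxonomy: child label -> parent label.
# The article only states that CHAN-CON-WOR and CHAN-CON-AP fall under
# changes in patient conditions; the exact tree shape here is assumed.
PARENT = {
    "CHAN": None,
    "CHAN-CON": "CHAN",
    "CHAN-CON-WOR": "CHAN-CON",
    "CHAN-CON-AP": "CHAN-CON",
}


def marginal_probability(label, conditional):
    """Multiply P(node | parent) along the path from `label` up to the root."""
    prob = 1.0
    node = label
    while node is not None:
        prob *= conditional[node]
        node = PARENT[node]
    return prob


# Made-up values standing in for the model's scalar outputs.
conditional = {
    "CHAN": 0.9,          # P(CHAN)
    "CHAN-CON": 0.8,      # P(CHAN-CON | CHAN)
    "CHAN-CON-WOR": 0.5,  # P(CHAN-CON-WOR | CHAN-CON)
    "CHAN-CON-AP": 0.25,  # P(CHAN-CON-AP | CHAN-CON)
}

# 0.9 * 0.8 * 0.5, i.e. roughly 0.36 up to floating-point rounding.
print(marginal_probability("CHAN-CON-WOR", conditional))
```

Under this reading, a child category can never be more probable than its parent, which is the consistency property a hierarchical label set calls for.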

    RadGraph2’s information schema defines three main entity types: “anatomy,” “observation,” and “change,” along with three relation types: “modify,” “located at,” and “suggestive of.” The entity types are further divided into subtypes, forming a hierarchical structure. Change entities (CHAN) are a key addition to the original RadGraph schema, encompassing subtypes such as No change (CHAN-NC), Change in medical condition (CHAN-CON), and Change in medical devices (CHAN-DEV). Each of these subtypes is further categorized to capture specific aspects of change, such as condition appearance, worsening, improvement, or resolution. Anatomy entities (ANAT) and Observation entities (OBS) are retained from the original schema, with OBS further divided into definitely present, uncertain, and absent subtypes. This hierarchical structure allows for a more nuanced representation of the information contained in radiology reports, particularly emphasizing the temporal aspects and changes in patient conditions.
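Because the label names encode the hierarchy with hyphens, the path from a top-level type to a subtype can be recovered from the label string alone. The following is a minimal sketch of that idea; the helper names `ancestors` and `top_level_type` are hypothetical and are not part of any RadGraph2 tooling.

```python
# Top-level entity types named in the article, keyed by their label prefix.
TOP_LEVEL = {"ANAT": "anatomy", "OBS": "observation", "CHAN": "change"}


def ancestors(label):
    """Return the chain of labels from the top-level type down to `label`."""
    parts = label.split("-")
    return ["-".join(parts[:i]) for i in range(1, len(parts) + 1)]


def top_level_type(label):
    """Map a label such as 'CHAN-CON-WOR' to its top-level entity type."""
    return TOP_LEVEL[label.split("-")[0]]


print(ancestors("CHAN-CON-WOR"))   # ['CHAN', 'CHAN-CON', 'CHAN-CON-WOR']
print(top_level_type("CHAN-NC"))   # change
print(top_level_type("ANAT-DP"))   # anatomy
```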

    RadGraph2’s schema defines three types of relations as directed edges between entities:

    1. Modify relations (modify):

       • Indicate that the first entity modifies the second entity

       • Connect entity types: (OBS-*, OBS-*), (ANAT-DP, ANAT-DP), (CHAN-*, *), and (OBS-*, CHAN-*)

       • Example: “right” → “lung” in “right lung”

    2. Located at relations (located_at):

       • Connect anatomy and observation entities

       • Indicate that the observation is related to the anatomy

       • Connect entity types: (OBS-*, ANAT-DP)

       • Example: “clear” → “lungs” in “lungs are clear”

    3. Suggestive of relations (suggestive_of):

       • Indicate that the status of the second entity is derived from the first entity

       • Connect entity types: (OBS-*, OBS-*), (CHAN-*, OBS-*), and (OBS-*, CHAN-*)

       • Example: “opacity” → “pneumonia” in “The opacity may indicate pneumonia”

    These relations enable RadGraph2 to capture the complex relationships between different entities in radiology reports, including modifications, anatomical associations, and diagnostic inferences. The schema’s relational structure allows for a more comprehensive representation of the information contained in the reports, facilitating a better understanding of the interconnections between observations, anatomical structures, and changes in patient conditions.
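The type constraints listed above can be checked mechanically by treating each allowed pair as a pair of wildcard patterns. The sketch below does this with Python's standard `fnmatch` module. The function name `is_valid_relation` is hypothetical, and the label `OBS-DP` (observation, definitely present) is assumed to follow the naming of the original RadGraph schema.

```python
from fnmatch import fnmatchcase

# Allowed (source, target) label patterns per relation, as listed in the article.
ALLOWED = {
    "modify": [
        ("OBS-*", "OBS-*"),
        ("ANAT-DP", "ANAT-DP"),
        ("CHAN-*", "*"),
        ("OBS-*", "CHAN-*"),
    ],
    "located_at": [("OBS-*", "ANAT-DP")],
    "suggestive_of": [
        ("OBS-*", "OBS-*"),
        ("CHAN-*", "OBS-*"),
        ("OBS-*", "CHAN-*"),
    ],
}


def is_valid_relation(relation, source_label, target_label):
    """Check a directed edge against the schema's entity-type constraints."""
    return any(
        fnmatchcase(source_label, src) and fnmatchcase(target_label, tgt)
        for src, tgt in ALLOWED.get(relation, [])
    )


# "right" -> "lung": both anatomy, so a modify edge is allowed.
print(is_valid_relation("modify", "ANAT-DP", "ANAT-DP"))      # True
# "clear" -> "lungs": observation located at anatomy.
print(is_valid_relation("located_at", "OBS-DP", "ANAT-DP"))   # True
# The reverse direction is not listed in the schema.
print(is_valid_relation("located_at", "ANAT-DP", "OBS-DP"))   # False
```

A check like this is useful as a sanity filter on model predictions or on hand-edited annotations, since the edges are directed and the order of the two labels matters.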

    RadGraph2’s dataset is organized into three main partitions:

    1. Training set:

       • Contains 575 manually labeled reports

       • Used for model training and optimization

    2. Development set:

       • Consists of 75 manually labeled reports

       • Used for model validation and hyperparameter tuning

    3. Test set:

       • Comprises 150 manually labeled reports

       • Used for final model evaluation

    Key characteristics of the dataset:

    • Patient disjointness: Reports in each partition are from distinct sets of patients

    • Consistency with original RadGraph: Maintains the report placement from the original dataset

    • De-identification: All protected health information in the reports is removed

    Additional dataset component:

    • 220,000+ automatically labeled reports:

       – Annotated by the best-performing model (HGIE)

       – Provides a large-scale resource for further research and model development

    This dataset structure ensures a robust evaluation framework for RadGraph2, maintaining data integrity and patient privacy while offering a substantial corpus for training and testing advanced information extraction models in the radiology domain.
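The patient-disjointness property can be verified with a few set intersections. The sketch below assumes each partition can be reduced to a collection of patient identifiers; the identifiers and the function name `overlapping_patients` are hypothetical, and the released files may not expose patient IDs in this form. It also computes the share of each manually labeled partition from the report counts given above.

```python
from itertools import combinations


def overlapping_patients(partitions):
    """Return patient IDs shared by any two partitions, keyed by partition pair."""
    overlaps = {}
    for (name_a, ids_a), (name_b, ids_b) in combinations(partitions.items(), 2):
        shared = set(ids_a) & set(ids_b)
        if shared:
            overlaps[(name_a, name_b)] = shared
    return overlaps


# Toy identifiers for illustration only.
partitions = {
    "train": {"p1", "p2", "p3"},
    "dev": {"p4"},
    "test": {"p5", "p6"},
}
print(overlapping_patients(partitions))  # {} means the partitions are disjoint

# Share of each manually labeled partition, from the counts in the article.
sizes = {"train": 575, "dev": 75, "test": 150}
total = sum(sizes.values())  # 800 reports
for name, count in sizes.items():
    print(f"{name}: {count} reports ({count / total:.1%})")
```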

    The RadGraph2 release includes a set of files to support researchers and developers. The dataset package contains a README.md file with a brief overview, along with train.json, dev.json, and test.json files containing labeled reports from MIMIC-CXR-JPG and CheXpert. In addition, two large inference files, inference-chexpert.json and inference-mimic.json, contain reports labeled by the benchmark model. The file format is similar to that of the original RadGraph dataset: JSON with a hierarchical dictionary structure. Each report is identified by a unique key and carries metadata such as the full text, the data split, the data source, and a flag indicating whether it was part of the original RadGraph dataset. The “entities” key within each report’s dictionary holds the entity and relation labels, including tokens, label types, token indices, and relations to other entities. This format supports efficient processing and analysis of the reports for natural language processing tasks and medical informatics applications.
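To make the described layout concrete, the sketch below parses a small hand-written report in that general shape and walks its relations. Every key name here (`text`, `data_split`, `data_source`, `is_original_radgraph`, `tokens`, `label`, `start_ix`, `end_ix`, `relations`) is an assumption loosely modeled on the original RadGraph format, as is the report ID; the README.md shipped with the dataset is the authoritative reference for the actual field names.

```python
import json

# Hand-written sample, not taken from the dataset. All key names are assumed.
SAMPLE = """
{
  "report_0001": {
    "text": "lungs are clear",
    "data_split": "train",
    "data_source": "MIMIC-CXR",
    "is_original_radgraph": true,
    "entities": {
      "1": {"tokens": "lungs", "label": "ANAT-DP",
            "start_ix": 0, "end_ix": 0, "relations": []},
      "2": {"tokens": "clear", "label": "OBS-DP",
            "start_ix": 2, "end_ix": 2, "relations": [["located_at", "1"]]}
    }
  }
}
"""


def iter_relations(reports):
    """Yield (report_id, source_tokens, relation, target_tokens) for every edge."""
    for report_id, report in reports.items():
        entities = report["entities"]
        for entity in entities.values():
            for relation, target_id in entity["relations"]:
                yield (report_id, entity["tokens"], relation,
                       entities[target_id]["tokens"])


reports = json.loads(SAMPLE)
for edge in iter_relations(reports):
    print(edge)  # ('report_0001', 'clear', 'located_at', 'lungs')
```

For the real files, the same loop would run over the dictionary returned by `json.load` on an open file handle, once the key names have been adjusted to match the release.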

    Image source: https://physionet.org/content/radgraph2-radiology-reports/1.0.0/

    RadGraph2 is an advanced approach to automated information extraction from radiology reports, addressing the challenges of tracking disease progression over time. Key aspects of RadGraph2 include:

    1. Enhanced hierarchical schema: Built upon the original RadGraph, it introduces new entity types to represent various kinds of changes in patient conditions.

    2. Hierarchical Graph Information Extraction model: Utilizes a structured organization of labels and a Hierarchical Recognition component with a BERT-based backbone.

    3. Comprehensive entity types: Includes anatomy, observation, and change entities, with further subtypes to capture nuanced information.

    4. Relation types: Defines modify, located_at, and suggestive_of relations to represent complex relationships between entities.

    5. Dataset structure: Comprises training (575 reports), development (75 reports), and test (150 reports) sets, plus 220,000+ automatically labeled reports.

    6. File format: Uses JSON structure with detailed metadata and entity information for each report.

    RadGraph2 aims to provide a more comprehensive representation of temporal changes in radiology reports, enabling better tracking of disease progression and patient care trajectories. The dataset and schema offer researchers a robust framework for developing advanced natural language processing models in the medical domain.

    Check out the Paper. All credit for this research goes to the researchers of this project.

    The post RadGraph2: A New Dataset for Tracking Disease Progression in Radiology Reports appeared first on MarkTechPost.
