
    SignLLM: A Multilingual Sign Language Model that can Generate Sign Language Gestures from Input Text

    May 31, 2024

The primary goal of Sign Language Production (SLP) is to generate human-like sign avatars from text input. Deep-learning-based SLP methods typically follow a multi-stage pipeline: the text is first translated into gloss, an intermediate written representation of signs and gestures; the gloss is then used to generate a video of poses that mimics sign language; finally, that video is further processed into a more lifelike avatar rendering. The complexity of this pipeline makes sign language data difficult to acquire and process.
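The text-to-gloss-to-pose-to-video pipeline described above can be sketched as a chain of stages. The function names, the toy gloss mapping, and the 50-keypoint frame layout below are all illustrative assumptions, not the paper's actual models:

```python
from typing import List

# Hypothetical sketch of the SLP pipeline: text -> gloss -> poses -> video.
# Each stage is a stub standing in for a trained model.

def text_to_gloss(text: str) -> List[str]:
    # Real systems use a trained text2gloss model; here, a toy uppercase mapping.
    return [w.upper() for w in text.split()]

def gloss_to_poses(glosses: List[str]) -> List[list]:
    # Each gloss maps to skeletal keyframes (here: one dummy frame of
    # 50 upper-body keypoints per gloss).
    return [[(0.0, 0.0)] * 50 for _ in glosses]

def render_avatar(pose_frames: List[list]) -> str:
    # The final stage renders poses into an avatar video; stubbed as a summary.
    return f"video with {len(pose_frames)} keyframes"

poses = gloss_to_poses(text_to_gloss("hello world"))
print(render_avatar(poses))  # video with 2 keyframes
```

Each stage's output feeds the next, which is why an error or data-format mismatch early in the chain propagates through the whole production process.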

Over the past decade, most studies have grappled with the challenges of PHOENIX14T, a German Sign Language (GSL) dataset, and other lesser-known language datasets for Sign Language Production, Recognition, and Translation (SLP, SLR, and SLT) tasks. These challenges, which include the lack of standardized tools and slow progress in research on minority languages, have significantly dampened researchers' enthusiasm. The problem's complexity is further underscored by the fact that studies using American Sign Language (ASL) datasets are still in their infancy.

Thanks to the current mainstream datasets, considerable progress has been made in the field. Nevertheless, these datasets fail to address several emerging problems:

Pre-existing datasets often include files in complicated, heterogeneous formats, such as images, scripts, OpenPose skeleton keypoints, and graphs, along with other preprocessing artifacts. None of these formats provide directly trainable data.

    Annotating glosses by hand is a tedious and time-consuming process. 

After sign video datasets are obtained from sign language experts, the data is transformed into disparate forms, which makes expanding a dataset very difficult.
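The first problem above, raw files that are not directly trainable, can be illustrated with OpenPose output. OpenPose writes one JSON file per frame with a flat `pose_keypoints_2d` array of `[x, y, confidence]` triples; a minimal, assumed flattening step (the compact target layout is an illustration, not Prompt2Sign's exact spec) might look like:

```python
import json

# Hedged sketch: compress an OpenPose-style per-frame JSON record into a
# compact (x, y) list suitable for sequence-model training. Field names
# follow OpenPose's output format; the target layout is an assumption.

def standardize_frame(openpose_json: str) -> list:
    frame = json.loads(openpose_json)
    people = frame.get("people", [])
    if not people:
        return []  # no detected signer in this frame
    kps = people[0]["pose_keypoints_2d"]  # flat [x, y, confidence, ...]
    # Drop confidence scores, keep (x, y) pairs.
    return [kps[i:i + 2] for i in range(0, len(kps), 3)]

sample = json.dumps({"people": [{"pose_keypoints_2d":
                                 [10.0, 20.0, 0.9, 30.0, 40.0, 0.8]}]})
print(standardize_frame(sample))  # [[10.0, 20.0], [30.0, 40.0]]
```

Multiplied across thousands of videos and several incompatible source formats, this kind of per-format conversion is exactly the preprocessing burden the dataset aims to eliminate.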

    Researchers from Rutgers University, Australian National University, Data61/CSIRO, Carnegie Mellon University, University of Texas at Dallas, and University of Central Florida present Prompt2Sign, a groundbreaking dataset that tracks the upper body movements of sign language demonstrators on a massive scale. This is a significant step forward in the field of multilingual sign language recognition and generation, as it is the first comprehensive dataset combining eight distinct sign languages and using publicly available online videos and datasets to address the shortcomings of earlier efforts.

To construct the dataset, the researchers first standardize the pose information of video frames (the tool's raw material) into a preset format using OpenPose, a pose-estimation tool. Storing only the key information in this standardized format reduces redundancy and simplifies training with seq2seq and text2text models. Next, to make the process more cost-effective, they auto-generate prompt words, reducing the need for human annotation. Finally, to address manual preprocessing and data-collection bottlenecks, they raise the tools' level of automation, making them efficient and lightweight; this improves data-processing capability without requiring additional models to be loaded.

The team highlights that current models need adjustment because the new dataset presents different training obstacles. Because sign languages vary from country to country, multiple sets of sign language data cannot simply be trained simultaneously. Handling additional languages and larger datasets also makes training more complex and time-consuming, and makes downloading, storing, and loading data more painful, so fast training techniques must be investigated. Furthermore, under-researched topics such as multilingual SLP, efficient training, and prompt comprehension deserve attention, since current model structures cannot handle more languages or more complicated, natural conversational inputs. This raises questions such as how to improve a large model's generalization and its fundamental ability to understand prompts.

To address these issues, the team presents SignLLM, the first large-scale multilingual SLP model, built on the Prompt2Sign dataset. Given text or prompt inputs, it generates skeletal poses for eight different sign languages. SignLLM offers two modes: (i) the Multi-Language Switching Framework (MLSF), which dynamically adds encoder-decoder groups to generate multiple sign languages in parallel, and (ii) the Prompt2LangGloss module, which supports generation with a single static encoder-decoder pair.
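The MLSF idea of dynamically adding per-language encoder-decoder groups can be sketched as a registry keyed by language. The class and method names below are illustrative, not the paper's code, and the "models" are stubs:

```python
# Minimal sketch of the Multi-Language Switching Framework (MLSF) concept:
# one encoder-decoder group per sign language, registered dynamically on
# first use. Names and behavior are illustrative assumptions.

class EncoderDecoder:
    def __init__(self, lang: str):
        self.lang = lang

    def generate(self, text: str) -> str:
        # A real group would decode skeletal poses; stubbed as a label.
        return f"{self.lang}-pose-sequence({text})"

class MLSF:
    def __init__(self):
        self.groups = {}  # language tag -> encoder-decoder group

    def generate(self, lang: str, text: str) -> str:
        # Dynamically add a new encoder-decoder group for unseen languages.
        if lang not in self.groups:
            self.groups[lang] = EncoderDecoder(lang)
        return self.groups[lang].generate(text)

model = MLSF()
print(model.generate("ASL", "hello"))  # ASL-pose-sequence(hello)
print(model.generate("GSL", "hallo"))  # GSL-pose-sequence(hallo)
```

Keeping one group per language sidesteps cross-language interference, at the cost of parameters growing with the number of languages, which is presumably why efficient training matters at this scale.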

The team aims to use the new dataset to set a standard for multilingual recognition and generation. Their loss function incorporates a novel module grounded in Reinforcement Learning (RL) ideas to accelerate model training on more languages and larger datasets, counteracting the prolonged training times those factors cause. Extensive experiments and ablation studies show that SignLLM outperforms baseline methods on both the development and test sets across all eight sign languages.
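One way to read an RL-grounded loss module is as reward-based reweighting of per-sample losses so that training effort concentrates where it pays off most. The reward definition and weighting scheme below are assumptions for illustration, not the paper's actual formulation:

```python
# Hedged sketch of an RL-style reweighted loss: samples with higher reward
# (e.g. recent prediction-quality improvement) receive larger weight in the
# batch loss. The reward values and normalization are illustrative only.

def rl_weighted_loss(losses, rewards):
    # Normalize rewards into weights, then take a weighted sum of losses.
    total = sum(rewards)
    weights = [r / total for r in rewards]
    return sum(w * l for w, l in zip(weights, losses))

losses = [0.8, 0.2, 0.5]
rewards = [1.0, 3.0, 1.0]
print(round(rl_weighted_loss(losses, rewards), 3))  # 0.38
```

With uniform rewards this reduces to an ordinary mean loss; skewed rewards shift gradient emphasis toward the favored samples, which is the intuition behind using such a signal to speed up multilingual training.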

Even though this work makes great strides in automating the capture and processing of sign language data, it does not yet provide a comprehensive end-to-end solution. For example, the team highlights that to use a private dataset, one must run OpenPose to extract 2D keypoint JSON files and then update them manually.

Check out the Paper and Project. All credit for this research goes to the researchers of this project.

    The post SignLLM: A Multilingual Sign Language Model that can Generate Sign Language Gestures from Input Text appeared first on MarkTechPost.
