
    DigiRL: A Novel Autonomous Reinforcement Learning (RL) Method to Train Device-Control Agents

    June 23, 2024

    Advances in vision-language models (VLMs) have produced impressive common-sense, reasoning, and generalization abilities, which suggests that a fully autonomous digital AI assistant that performs everyday computer tasks from natural-language instructions is within reach. However, better reasoning and common sense do not automatically lead to intelligent assistant behavior. An AI assistant must complete tasks, behave rationally, and recover from mistakes, not just produce plausible responses based on pre-training data. A method is therefore needed to turn pre-trained abilities into practical AI “agents.” Even the best VLMs, such as GPT-4V and Gemini 1.5 Pro, still struggle to take the right actions when completing device tasks.

    The paper discusses three lines of existing work. The first is training multi-modal digital agents, which is challenging because device control happens directly at the pixel level in a coordinate-based action space, and because device ecosystems and the internet are stochastic and unpredictable. The second is environments for device-control agents. These environments are designed for evaluation and offer only a limited range of tasks in fully deterministic, stationary settings. The third is reinforcement learning (RL) for LLMs and VLMs. RL research on foundation models focuses mainly on single-turn tasks such as preference optimization, but optimizing single-turn interaction from expert demonstrations can lead to sub-optimal strategies on multi-step problems.

    Researchers from UC Berkeley, UIUC, and Google DeepMind have introduced DigiRL (RL for Digital Agents), a novel autonomous RL method for training device-control agents. The resulting agent attains state-of-the-art performance on several Android device-control tasks. Training proceeds in two phases: an offline RL phase that initializes the agent from existing data, followed by an offline-to-online RL phase that fine-tunes the resulting model on online data. To support online RL training, the researchers developed a scalable, parallelizable Android learning environment that includes a robust, general-purpose VLM-based evaluator (average error rate of 2.8% against human judgment).
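
    The two-phase schedule described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the one-step "tasks", the `evaluate` function standing in for the VLM-based evaluator, and the success-only preference update are hypothetical placeholders, not the update rule or environment from the paper.

```python
import math
import random

N_TASKS = 4    # toy "instructions" (hypothetical)
N_ACTIONS = 4  # toy action space (hypothetical)

def evaluate(task, action):
    """Hypothetical stand-in for the VLM-based evaluator: 1.0 on success."""
    return 1.0 if action == task else 0.0

def sample_action(prefs, task, rng):
    """Sample an action from a softmax over the task's preference row."""
    row = prefs[task]
    top = max(row)  # subtract the max for numerical stability
    weights = [math.exp(p - top) for p in row]
    return rng.choices(range(N_ACTIONS), weights=weights)[0]

def update(prefs, batch, lr=0.5):
    """Placeholder update: reinforce only actions that were scored as successes."""
    for task, action, reward in batch:
        prefs[task][action] += lr * reward

def train(offline_data, online_rounds=200, rollouts_per_round=8, seed=0):
    rng = random.Random(seed)
    prefs = [[0.0] * N_ACTIONS for _ in range(N_TASKS)]

    # Phase 1 (offline): initialize the policy from pre-collected data.
    update(prefs, offline_data)

    # Phase 2 (offline-to-online): collect fresh rollouts with the current
    # policy, score them with the evaluator, and keep updating.
    for _ in range(online_rounds):
        batch = []
        for _ in range(rollouts_per_round):
            task = rng.randrange(N_TASKS)
            action = sample_action(prefs, task, rng)
            batch.append((task, action, evaluate(task, action)))
        update(prefs, batch)
    return prefs

def greedy(prefs, task):
    """Pick the highest-preference action for a task."""
    return max(range(N_ACTIONS), key=lambda a: prefs[task][a])

# The offline data only covers tasks 0 and 1; tasks 2 and 3 must be
# learned from online interaction.
OFFLINE_DATA = [(0, 0, 1.0), (1, 1, 1.0), (1, 3, 0.0)]
prefs = train(OFFLINE_DATA)
print([greedy(prefs, t) for t in range(N_TASKS)])
```

    The point of the sketch is only the control flow: the offline phase gives the policy a starting point from existing data, and the online phase fills in what that data does not cover by scoring the agent's own rollouts automatically.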

    The researchers ran experiments to evaluate DigiRL on challenging Android device-control problems. The goal was to determine whether DigiRL can produce agents that learn effectively through autonomous interaction while still making use of offline data. To that end, DigiRL was compared against the following:

    State-of-the-art agents built around proprietary VLMs using several prompting and retrieval-style techniques.

    Imitation learning on static human demonstrations with the same instruction distribution.

    A filtered behavior cloning approach.
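
    The last baseline in the list, filtered behavior cloning, can be sketched generically: discard trajectories that were not judged successful, then imitate the actions in the ones that remain. The snippet below is a minimal illustration with a tabular "policy"; the `Trajectory` type, the observation and action names, and the majority-vote imitation step are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

@dataclass
class Trajectory:
    steps: list    # list of (observation, action) pairs
    success: bool  # verdict from an evaluator (hypothetical)

def filtered_bc(trajectories):
    """Imitate only successful trajectories.

    Returns a mapping from observation to the most frequent action
    seen for that observation in successful trajectories.
    """
    counts = defaultdict(Counter)
    for traj in trajectories:
        if not traj.success:
            continue  # the "filter" step: drop failed rollouts
        for obs, action in traj.steps:
            counts[obs][action] += 1
    # The "cloning" step, reduced here to a majority vote per observation.
    return {obs: c.most_common(1)[0][0] for obs, c in counts.items()}

data = [
    Trajectory([("home", "open_settings"), ("settings", "tap_wifi")], True),
    Trajectory([("home", "open_camera")], False),
    Trajectory([("home", "open_settings"), ("settings", "tap_wifi")], True),
]
policy = filtered_bc(data)
print(policy)
```

    In a real system the majority vote would be replaced by supervised fine-tuning of a model on the retained trajectories; the filtering logic is the part this sketch is meant to show.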

    An agent trained using DigiRL was tested on various tasks from the Android in the Wild (AitW) dataset with real Android device emulators. The agent improved on the existing state-of-the-art agent, the 18B CogAgent, by 28.7 percentage points, raising the success rate from 38.5% to 67.2%. It also outperformed the previous top autonomous learning method, based on filtered behavior cloning, by more than 9%. Moreover, despite having only 1.3B parameters, the agent performed better than advanced models like GPT-4V and Gemini 1.5 Pro (17.7% success rate). This makes it the first agent to achieve state-of-the-art performance in device control using an autonomous offline-to-online RL approach.
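
    The 28.7 figure is an absolute gain, that is, the difference between the two success rates quoted above. A short calculation makes the distinction from a relative gain explicit; the variable names are illustrative only.

```python
baseline = 38.5  # success rate (%) of the prior state-of-the-art agent
digirl = 67.2    # success rate (%) of the DigiRL-trained agent

# Absolute gain in percentage points.
absolute_gain = round(digirl - baseline, 1)

# Relative gain, as a percentage of the baseline success rate.
relative_gain = round((digirl - baseline) / baseline * 100, 1)

print(absolute_gain)  # 28.7
print(relative_gain)  # 74.5
```

    Relative to the baseline, the same change corresponds to roughly a 74.5% increase in success rate.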

    In summary, the researchers proposed DigiRL, a novel autonomous RL approach for training device-control agents that sets a new state of the art on several Android control tasks from AitW. To achieve this, they developed a scalable, parallelizable Android environment with a robust, general-purpose VLM-based evaluator for fast online data collection. The agent trained with DigiRL improved on the existing state-of-the-art agent, the 18B CogAgent, by 28.7 percentage points. However, training was limited to tasks from the AitW dataset rather than all possible device tasks. Future work therefore includes further algorithmic research and expanding the task space, with DigiRL as the base algorithm.

    All credit for this research goes to the researchers of this project.

    The post DigiRL: A Novel Autonomous Reinforcement Learning RL Method to Train Device-Control Agents appeared first on MarkTechPost.
