
    Mobile-Agent-E: A Hierarchical Multi-Agent Framework Combining Cognitive Science and AI to Redefine Complex Task Handling on Smartphones

    January 24, 2025

    Smartphones are essential tools in daily life. However, the complexity of tasks on mobile devices often leads to frustration and inefficiency: navigating applications and managing multi-step processes consumes time and effort. Advancements in AI have introduced large multimodal models (LMMs) that enable mobile assistants to perform intricate operations autonomously. While these innovations aim to simplify technology, they often fail to meet practical demands. Addressing these gaps requires advanced AI capabilities and adaptable systems.

    Current mobile assistants struggle to handle complex tasks requiring long-term planning, reasoning, and adaptability. Tasks like creating itineraries or comparing prices involve multiple steps across platforms. These systems treat each task as isolated: they cannot learn from experience or optimize performance for repeated tasks, which leads to inefficiency. They also allocate identical resources to all tasks regardless of complexity, which reduces effectiveness in demanding scenarios.

    Some frameworks address these challenges but remain limited in planning and decision-making. Current mobile agents like AppAgent and Mobile-Agent-v1 focus on short, predefined tasks. Systems like Mobile-Agent-v2, despite improved planning, fail to incorporate a hierarchical structure for effective task delegation and refinement. These limitations highlight the need for more advanced mobile assistant designs.

    Researchers from the University of Illinois Urbana-Champaign and Alibaba Group have developed Mobile-Agent-E, a novel mobile assistant that addresses these challenges through a hierarchical multi-agent framework. The system features a Manager agent responsible for planning and breaking down tasks into sub-goals, supported by four subordinate agents: Perceptor, Operator, Action Reflector, and Notetaker. These agents specialize in visual perception, immediate action execution, error verification, and information aggregation, respectively. A standout feature of Mobile-Agent-E is its self-evolution module, which includes a long-term memory system. This memory is divided into two components:

    1. Tips, which provide generalized guidance based on previous tasks
    2. Shortcuts, which are reusable sequences of operations tailored to specific recurring subroutines
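The two memory components can be pictured as a small data structure. The following is a minimal sketch, not the authors' implementation: the class names `Shortcut` and `LongTermMemory`, the `precondition` field, and the example tip text are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Shortcut:
    """A reusable sequence of operations for a recurring subroutine."""
    name: str
    steps: list[str]        # operation names, e.g. "Tap", "Type", "Enter"
    precondition: str = ""  # hypothetical field: when the Shortcut applies


@dataclass
class LongTermMemory:
    """Hypothetical container for the two memory components."""
    tips: list[str] = field(default_factory=list)
    shortcuts: dict[str, Shortcut] = field(default_factory=dict)

    def add_tip(self, tip: str) -> None:
        # Tips are general guidance, so exact duplicates are skipped.
        if tip not in self.tips:
            self.tips.append(tip)

    def add_shortcut(self, shortcut: Shortcut) -> None:
        # A Shortcut with an existing name replaces the older version.
        self.shortcuts[shortcut.name] = shortcut


memory = LongTermMemory()
memory.add_tip("Dismiss pop-up dialogs before searching.")
memory.add_tip("Dismiss pop-up dialogs before searching.")  # ignored duplicate
memory.add_shortcut(Shortcut("Tap_Type_and_Enter", ["Tap", "Type", "Enter"]))
```

Keeping Tips as free text and Shortcuts as named step lists mirrors the distinction drawn above: Tips are generalized guidance, while Shortcuts are executable sequences.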

    Mobile-Agent-E operates by continuously refining its performance through feedback loops. After completing each task, the system’s Experience Reflectors update its Tips and propose new Shortcuts based on interaction history. These updates are inspired by human cognitive processes, where episodic memory informs future decisions, and procedural knowledge facilitates efficient task execution. For example, if a user frequently performs a sequence of actions, such as searching for a location and creating a note, the system creates a Shortcut to streamline this process in the future. Mobile-Agent-E balances high-level planning and low-level action precision by incorporating these learnings into its hierarchical framework.
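To make the idea of proposing Shortcuts from interaction history concrete, here is a rough sketch. It swaps in a simple frequency count over action sequences as a stand-in for the Experience Reflectors, which the article describes only at a high level; the function name `propose_shortcuts`, its parameters, and the action names in `history` are assumptions for illustration.

```python
from collections import Counter


def propose_shortcuts(history, length=3, min_count=2):
    """Return action sequences of `length` seen at least `min_count` times.

    `history` is a list of tasks, each a list of action names. This counting
    heuristic is only an illustrative stand-in for a learned reflector.
    """
    counts = Counter()
    for task_actions in history:
        # Slide a window of `length` actions over each task's action list.
        for i in range(len(task_actions) - length + 1):
            counts[tuple(task_actions[i:i + length])] += 1
    return [seq for seq, n in counts.items() if n >= min_count]


# Hypothetical interaction history from two completed tasks.
history = [
    ["Open_App", "Tap", "Type", "Enter", "Swipe"],
    ["Tap", "Type", "Enter", "Back"],
]
candidates = propose_shortcuts(history)
print(candidates)  # [('Tap', 'Type', 'Enter')]
```

Only the sequence that recurs across both tasks is proposed, which matches the intuition that a Shortcut should capture a frequently repeated subroutine rather than a one-off sequence.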

    The performance of Mobile-Agent-E has been tested using a new benchmark called Mobile-Eval-E, which evaluates the system’s ability to handle complex real-world tasks. Compared to existing models, Mobile-Agent-E achieves significantly higher satisfaction scores, with a 15% increase in task completion rates. Also, evolved Tips and Shortcuts reduce computational overhead, enabling faster task execution without compromising accuracy. For instance, a single Shortcut that combines actions like “Tap,” “Type,” and “Enter” can save two decision-making iterations, improving efficiency. The system’s hierarchical design enhances error recovery, allowing it to adapt to unforeseen challenges during task execution.
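The saving of two decision-making iterations can be checked with simple arithmetic: three separate actions need three decisions, while one Shortcut covering all three needs one. The sketch below illustrates that count under the assumption that every atomic action or Shortcut invocation costs exactly one decision; `count_decisions` is a hypothetical helper, not part of Mobile-Agent-E.

```python
def count_decisions(actions, shortcuts=()):
    """Count decision steps, assuming each atomic action or Shortcut costs one."""
    steps, i = 0, 0
    while i < len(actions):
        for sc in shortcuts:
            # Use a Shortcut when its steps match the upcoming actions.
            if sc and tuple(actions[i:i + len(sc)]) == tuple(sc):
                i += len(sc)
                break
        else:
            # No Shortcut matched: execute one atomic action.
            i += 1
        steps += 1
    return steps


actions = ["Tap", "Type", "Enter"]
without_shortcut = count_decisions(actions)
with_shortcut = count_decisions(actions, [("Tap", "Type", "Enter")])
print(without_shortcut - with_shortcut)  # 2
```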

    Key takeaways from this research include the following:  

    1. Mobile-Agent-E features a Manager agent supported by four specialized subordinate agents, enabling efficient task delegation and execution.  
    2. The system continuously updates its Tips and Shortcuts, inspired by human cognitive processes, to improve performance and reduce redundant errors.
    3. Shortcuts reduce computational overhead, resulting in faster task execution with fewer resources. For example, task completion time decreased by 20% compared to previous models.
    4. Mobile-Agent-E achieved a 15% increase in satisfaction scores compared to state-of-the-art models, demonstrating its effectiveness in real-world applications.
    5. The system’s capabilities extend to various scenarios, such as planning itineraries, managing notes, and comparing prices across apps, showcasing its versatility and adaptability. 

    In conclusion, Mobile-Agent-E bridges the gap between user needs and technological capabilities by addressing critical challenges in task management, planning, and decision-making. Its hierarchical framework and self-evolution capabilities enhance efficiency and set a new benchmark for intelligent mobile assistants. This research highlights the potential of AI-driven solutions to transform human-device interaction, making technology more accessible and intuitive for all users.


    The post Mobile-Agent-E: A Hierarchical Multi-Agent Framework Combining Cognitive Science and AI to Redefine Complex Task Handling on Smartphones appeared first on MarkTechPost.
