
    Sony Researchers Propose TalkHier: A Novel AI Framework for LLM-MA Systems that Addresses Key Challenges in Communication and Refinement

    February 23, 2025

    LLM-based multi-agent (LLM-MA) systems enable multiple language model agents to collaborate on complex tasks by dividing responsibilities. These systems are used in robotics, finance, and coding, but they face challenges in communication and refinement. Free-form text communication leads to long, unstructured exchanges, making it hard to track tasks, maintain structure, and recall past interactions. Refinement methods such as debate and feedback-based improvement also struggle, because important inputs may be ignored or weighted unevenly depending on the order in which they are processed. These issues limit the efficiency of LLM-MA systems on multi-step problems.

    Current LLM-based multi-agent systems use debate, self-refinement, and multi-agent feedback to handle complex tasks. Because these techniques rely on free-form text interaction, they become unstructured and hard to control. Agents struggle to follow subtasks, remember previous interactions, and respond consistently. Various communication structures, including chain and tree-based models, try to improve efficiency but lack explicit protocols for structuring information. Feedback-refinement techniques aim to increase accuracy but are prone to biased or duplicate inputs, which makes evaluation unreliable. Without systematic communication and feedback at scale, such systems remain inefficient and error-prone.

    To mitigate these issues, researchers from Sony Group Corporation, Japan, proposed TalkHier, a framework that improves communication and task coordination in multi-agent systems using structured protocols and hierarchical refinement. Unlike standard approaches, TalkHier explicitly specifies how agents interact and how tasks are formulated, reducing errors and improving efficiency. Agents carry out formalized roles, and the system adapts its scale automatically to different problems, resulting in improved decision-making and coordination.

    The framework arranges agents in a graph in which each node is an agent and each edge is a communication path. Each agent has its own independent memory, so it can retain pertinent information and make informed decisions without relying on shared memory. Communication follows a formal protocol: each message contains content, background information, and intermediate outputs. Agents are organized into teams overseen by supervisors, and some agents act as both a member of one team and the supervisor of another, which produces a nested hierarchy. Work is allocated, assessed, and revised iteratively until it passes a quality threshold, with the goal of maximizing accuracy and minimizing errors.
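
    The structure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names Message, Agent, Team, and refine, the threshold and max_rounds parameters, and the toy revise and score functions are all hypothetical, and real agents would call an LLM where the toy functions manipulate strings. The sketch covers a single team and leaves out the nested hierarchy.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Message:
    """Structured message with the three parts named in the article."""
    content: str              # the instruction or feedback itself
    background: str           # context the receiver needs
    intermediate_output: str  # work produced so far


@dataclass
class Agent:
    """A graph node. Memory is private to the agent, not shared."""
    name: str
    memory: List[Message] = field(default_factory=list)

    def receive(self, message: Message) -> None:
        self.memory.append(message)


@dataclass
class Team:
    supervisor: Agent
    members: List[Agent]

    def edges(self) -> List[Tuple[str, str]]:
        """Communication paths between the supervisor and each member."""
        return [(self.supervisor.name, m.name) for m in self.members]


def refine(
    team: Team,
    task: str,
    revise: Callable[[Agent, Message], str],  # hypothetical member step
    score: Callable[[str], float],            # hypothetical evaluator step
    threshold: float = 0.9,
    max_rounds: int = 5,
) -> Tuple[str, int]:
    """Iterate allocate -> assess -> revise until the score passes the threshold."""
    draft = ""
    for round_no in range(1, max_rounds + 1):
        for member in team.members:
            msg = Message(task, f"round {round_no}", draft)
            member.receive(msg)
            draft = revise(member, msg)
        # The supervisor records the draft it is asked to assess.
        team.supervisor.receive(Message("evaluate", task, draft))
        if score(draft) >= threshold:
            return draft, round_no
    return draft, max_rounds


# Toy stand-ins for LLM calls (assumptions for illustration only).
def toy_revise(agent: Agent, msg: Message) -> str:
    return (msg.intermediate_output + " " + agent.name).strip()


def toy_score(draft: str) -> float:
    return min(1.0, len(draft.split()) / 4)


team = Team(Agent("supervisor"), [Agent("writer"), Agent("checker")])
final, rounds = refine(team, "summarize", toy_revise, toy_score)
print(final, rounds)
```

    Keeping memory on each Agent rather than in a shared store mirrors the independent-memory design mentioned above, and the explicit Message fields stand in for the formal communication protocol.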

    Researchers evaluated TalkHier on multiple benchmarks. On the MMLU dataset, across the Moral Scenario, College Physics, Machine Learning, Formal Logic, and US Foreign Policy subsets, TalkHier built on GPT-4o achieved the highest accuracy at 88.38%, surpassing AgentVerse (83.66%) and single-agent baselines such as ReAct-7@ (67.19%) and GPT-4o-7@ (71.15%), which demonstrates the benefit of hierarchical refinement. On the WikiQA dataset for open-domain question answering, it outperformed the baselines with a ROUGE-1 score of 0.3461 (+5.32%) and a BERTScore of 0.6079 (+3.30%), exceeding AutoGPT (0.3286 ROUGE-1, 0.5885 BERTScore). An ablation study showed that removing the evaluation supervisor or the structured communication protocol significantly reduced accuracy, confirming the importance of both. On the Camera dataset for ad text generation, TalkHier outperformed OKG by 17.63% across Faithfulness, Fluency, Attractiveness, and Character Count Violation, and human evaluations validated its multi-agent assessments. Although the internal architecture of OpenAI-o1 has not been disclosed, TalkHier posted competitive MMLU scores against it and beat it decisively on WikiQA, showing that it adapts across tasks and outperforms majority-voting baselines and open-source multi-agent systems.
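
    The WikiQA percentages quoted above appear to be relative gains over the AutoGPT scores. That reading is an assumption, not something the article states, but a quick check with the published numbers is consistent with it. The helper name relative_gain is hypothetical.

```python
def relative_gain(new: float, baseline: float) -> float:
    """Relative improvement of `new` over `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Scores as quoted in the article (TalkHier vs. AutoGPT on WikiQA).
rouge_gain = relative_gain(0.3461, 0.3286)
bert_gain = relative_gain(0.6079, 0.5885)

print(f"ROUGE-1 gain:   {rouge_gain:.2f}%")
print(f"BERTScore gain: {bert_gain:.2f}%")
```

    Both values land within a few hundredths of a percentage point of the quoted +5.32% and +3.30%. A difference of that size is what one would expect when the inputs are scores already rounded to four decimals.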

    In conclusion, the proposed framework improved communication, reasoning, and coordination in LLM multi-agent systems by combining a structured protocol with hierarchical refinement, which led to better performance on several benchmarks. Including messages, intermediate results, and context information in each exchange kept interactions structured without sacrificing feedback from heterogeneous agents. Despite higher API costs, TalkHier set a new benchmark for scalable, objective multi-agent cooperation. The methodology can serve as a baseline for subsequent research on efficient communication mechanisms and low-cost multi-agent interaction, and ultimately on advancing LLM-based cooperative systems.


    Check out the Paper and GitHub Page. All credit for this research goes to the researchers of this project.

    The post Sony Researchers Propose TalkHier: A Novel AI Framework for LLM-MA Systems that Addresses Key Challenges in Communication and Refinement appeared first on MarkTechPost.

    © DevStackTips 2025. All rights reserved.