Optimizing Agent Planning: A Parametric AI Approach to World Knowledge

Large Language Models (LLMs) have advanced natural language processing tasks significantly. Recently, using LLMs for physical world planning tasks has shown promise. However, LLMs, primarily autoregressive models, often fail to understand the real world, leading to hallucinatory actions and trial-and-error behavior. Unlike LLMs, humans utilize global task knowledge and local state knowledge to mentally rehearse and execute tasks efficiently, avoiding blind trial-and-error and confusion during the planning and execution stages.

Existing work in LLM-based agent systems focuses on agent planning, external tool utilization, and code generation, often fine-tuning open-source LLMs. These approaches may lead to trial-and-error actions due to a lack of environmental cognition. Knowledge-augmented agent planning, using pre-trained knowledge or structured prompts, faces challenges in transferring across tasks.

Inspired by the human approach to planning, researchers from Zhejiang University – Ant Group Joint Laboratory of Knowledge Graph, National University of Singapore, and Alibaba Group developed a parametric World Knowledge Model (WKM) for agent planning. WKM is built on knowledge from both expert and explored trajectories. The agent model synthesizes task knowledge by comparing these trajectories and summarizes state knowledge for each planning step. This knowledge is integrated into expert trajectories to train the WKM. During planning, WKM provides global task knowledge and maintains dynamic state knowledge, guiding the agent and preventing hallucinatory actions through kNN retrieval and weighted predictions.
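The weighted prediction step can be pictured with a small Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the `weight` parameter, and the example probabilities are hypothetical, and the linear mix of the two distributions is an assumed form of the "weighted predictions" described above.

```python
from collections import Counter


def knn_action_distribution(retrieved_next_actions):
    """Turn the next actions of retrieved neighbours into a probability distribution."""
    counts = Counter(retrieved_next_actions)
    total = sum(counts.values())
    return {action: count / total for action, count in counts.items()}


def combine(agent_probs, knn_probs, weight=0.5):
    """Mix the agent's distribution with the retrieved one (assumed linear weighting)."""
    actions = set(agent_probs) | set(knn_probs)
    scores = {
        a: weight * agent_probs.get(a, 0.0) + (1 - weight) * knn_probs.get(a, 0.0)
        for a in actions
    }
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical numbers: the agent alone prefers an action that no neighbour supports.
agent_probs = {"take apple": 0.3, "fly to kitchen": 0.6, "open fridge": 0.1}
retrieved = ["take apple", "take apple", "open fridge"]

knn_probs = knn_action_distribution(retrieved)
best_action, scores = combine(agent_probs, knn_probs, weight=0.5)
print(best_action)
```

In this toy case the retrieved neighbours outvote the unsupported action, which is the intuition behind using state knowledge to constrain local planning.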

The agent model self-synthesizes task knowledge by comparing expert and sampled trajectories. An experienced agent generates high-quality rejected trajectories, enhancing task knowledge beyond supervised fine-tuning. Task knowledge guides global planning, avoiding blind trial-and-error. State knowledge, summarized at each planning step from expert trajectories, constrains local planning to prevent hallucinatory actions. A state knowledge base, formed by combining state knowledge with preceding and subsequent actions, facilitates retrieval without overloading the context, ensuring effective and accurate agent planning.
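A state knowledge base of this shape can be sketched in a few lines of Python. The sketch makes loud assumptions: the `Entry` fields, the helper names, and the example trajectory are hypothetical, and a bag-of-words cosine similarity stands in for whatever representation the actual system uses for retrieval.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Entry:
    prev_action: str      # action taken before this state
    state_knowledge: str  # summary of the state at this step
    next_action: str      # expert action taken from this state


def build_knowledge_base(actions, state_knowledge):
    """Pair each state summary with its preceding and subsequent expert action."""
    return [
        Entry(actions[i], state_knowledge[i], actions[i + 1])
        for i in range(len(actions) - 1)
    ]


def bag_of_words(text):
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(kb, query, k=2):
    """Return the k entries whose state knowledge is most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(kb, key=lambda e: cosine(q, bag_of_words(e.state_knowledge)), reverse=True)
    return ranked[:k]


# Hypothetical expert trajectory and per-step state summaries.
actions = ["go to fridge", "open fridge", "take apple", "go to table"]
state_knowledge = [
    "you are at the closed fridge",
    "the fridge is open and an apple is visible",
    "you are holding the apple",
]

kb = build_knowledge_base(actions, state_knowledge)
top = retrieve(kb, "the fridge is open with an apple inside", k=1)
print(top[0].next_action)
```

Because only the retrieved entries' next actions are used, the knowledge base can grow without adding text to the agent's context.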

The method is evaluated on ALFWorld, WebShop, and ScienceWorld datasets, with unseen tasks testing generalization. ALFWorld uses binary rewards, while WebShop and ScienceWorld use dense rewards. The models tested include Mistral-7B, Gemma-7B, and Llama-3-8B, compared against prompt-based baselines (REACT, Reflexion), fine-tuning baselines (NAT, ETO), KNOWAGENT, and ChatGPT/GPT-4. Using LoRA training alone, the approach surpasses GPT-4 on ALFWorld (44.29→73.57 on seen tasks, 38.05→76.87 on unseen tasks) and WebShop (62.76→66.64). It also outperforms the fine-tuning baselines, suggesting that integrating world knowledge is more effective than further fine-tuning on negative examples. WKM shows superior performance and generalization compared to human-designed knowledge methods like KNOWAGENT.

This research develops a parametric WKM to enhance language agent model planning. The WKM provides task knowledge for global planning and state knowledge for local planning. Results with three state-of-the-art open-source LLMs show that WKM outperforms strong baselines, including GPT-4 on ALFWorld and WebShop. Analytical experiments demonstrate WKM's ability to reduce trial-and-error, improve generalization to unseen tasks, achieve weak-guide-strong (a weaker knowledge model guiding a stronger agent model), and extend to unified world knowledge training.

The post Optimizing Agent Planning: A Parametric AI Approach to World Knowledge appeared first on MarkTechPost.
