Optimizing Agent Planning: A Parametric AI Approach to World Knowledge

Large Language Models (LLMs) have significantly advanced natural language processing, and recent work shows promise in using them for planning tasks in the physical world. However, LLMs are primarily autoregressive next-token predictors and often lack a real understanding of the environment they act in, which leads to hallucinatory actions and trial-and-error behavior. Humans, by contrast, draw on global task knowledge and local state knowledge to mentally rehearse a task before executing it, which avoids blind trial-and-error during planning and confusion during execution.

Existing work in LLM-based agent systems focuses on agent planning, external tool utilization, and code generation, often fine-tuning open-source LLMs. These approaches may lead to trial-and-error actions due to a lack of environmental cognition. Knowledge-augmented agent planning, using pre-trained knowledge or structured prompts, faces challenges in transferring across tasks.

Inspired by the human approach to planning, researchers from Zhejiang University - Ant Group Joint Laboratory of Knowledge Graph, National University of Singapore, and Alibaba Group developed a parametric World Knowledge Model (WKM) for agent planning. WKM is built on knowledge from both expert and explored trajectories. The agent model synthesizes task knowledge by comparing these trajectories and summarizes state knowledge for each planning step. This knowledge is integrated into expert trajectories to train the WKM. During planning, WKM provides global task knowledge and maintains dynamic state knowledge, guiding the agent and preventing hallucinatory actions through kNN retrieval and weighted predictions.
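The "weighted predictions" step can be pictured as mixing two next-action distributions: one from the agent model and one derived from retrieved state knowledge. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the mixing weight `gamma`, and the example probabilities are all hypothetical and chosen only for illustration.

```python
from typing import Dict


def normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Scale non-negative scores so that they sum to 1."""
    total = sum(scores.values())
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    return {action: s / total for action, s in scores.items()}


def combine_predictions(
    p_agent: Dict[str, float],
    p_knowledge: Dict[str, float],
    gamma: float = 0.5,
) -> Dict[str, float]:
    """Mix two next-action distributions.

    gamma is a hypothetical weight on the agent model; (1 - gamma)
    goes to the knowledge-derived distribution.
    """
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    p_agent = normalize(p_agent)
    p_knowledge = normalize(p_knowledge)
    actions = set(p_agent) | set(p_knowledge)
    return {
        a: gamma * p_agent.get(a, 0.0) + (1.0 - gamma) * p_knowledge.get(a, 0.0)
        for a in actions
    }


def pick_action(combined: Dict[str, float]) -> str:
    """Return the action with the highest combined probability."""
    return max(combined, key=combined.get)


# Illustrative numbers only: the agent slightly prefers the wrong action,
# while the knowledge-derived distribution strongly prefers the right one.
agent_probs = {"open drawer 1": 0.6, "take apple 1": 0.4}
knowledge_probs = {"open drawer 1": 0.1, "take apple 1": 0.9}
combined = combine_predictions(agent_probs, knowledge_probs, gamma=0.5)
best = pick_action(combined)
```

With an equal weighting, the knowledge-derived distribution is enough to overturn the agent's slight preference for the less plausible action, which is the intuition behind constraining local planning with state knowledge.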

The agent model self-synthesizes task knowledge by comparing expert trajectories with trajectories it samples itself. Because the sampled (rejected) trajectories come from an already experienced agent, they are of high quality, which lets the synthesized task knowledge capture more than supervised fine-tuning on expert data alone. Task knowledge guides global planning and avoids blind trial-and-error. State knowledge, summarized at each planning step from expert trajectories, constrains local planning to prevent hallucinatory actions. Rather than placing all state knowledge in the prompt, the method builds a state knowledge base that pairs each piece of state knowledge with its preceding and subsequent actions, so that relevant entries can be retrieved without overloading the context.
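A state knowledge base of this kind can be sketched as a list of (preceding action, state knowledge, subsequent action) entries that is queried by nearest-neighbor search. The example below is only an illustration under loud assumptions: the `Entry` class, the toy entries, and the bag-of-words cosine similarity are hypothetical stand-ins, and a real system would compare learned representations rather than word counts.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entry:
    """One knowledge-base record (hypothetical layout)."""
    prev_action: str
    state_knowledge: str
    next_action: str


def bag_of_words(text: str) -> Counter:
    """Toy text representation: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_next_actions(kb: List[Entry], state_query: str, k: int = 3) -> Dict[str, float]:
    """Return a distribution over next actions from the k most similar entries."""
    query = bag_of_words(state_query)
    ranked = sorted(
        kb,
        key=lambda e: cosine(query, bag_of_words(e.state_knowledge)),
        reverse=True,
    )
    counts = Counter(e.next_action for e in ranked[:k])
    total = sum(counts.values())
    return {action: c / total for action, c in counts.items()}


# Toy entries written for this sketch, not taken from any dataset.
kb = [
    Entry("go to fridge 1", "you are at the fridge and it is closed", "open fridge 1"),
    Entry("open fridge 1", "the fridge is open and you see an apple", "take apple 1 from fridge 1"),
    Entry("go to countertop 1", "you are at the countertop holding an apple", "put apple 1 on countertop 1"),
    Entry("open fridge 1", "the fridge is open and an apple is on the shelf", "take apple 1 from fridge 1"),
]

dist = retrieve_next_actions(kb, "the fridge is open and an apple is inside", k=3)
top_action = max(dist, key=dist.get)
```

Only the k retrieved entries influence the prediction, so the knowledge base can grow without lengthening the prompt, which is the property the paragraph above attributes to retrieval.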

The method is evaluated on the ALFWorld, WebShop, and ScienceWorld datasets, with unseen tasks used to test generalization. ALFWorld uses binary rewards, while WebShop and ScienceWorld use dense rewards. The models tested include Mistral-7B, Gemma-7B, and Llama-3-8B, compared against prompt-based baselines (REACT, Reflexion), fine-tuning baselines (NAT, ETO), KNOWAGENT, and ChatGPT/GPT-4. Using LoRA training alone, the approach surpasses GPT-4 on ALFWorld (44.29 → 73.57 on seen tasks, 38.05 → 76.87 on unseen tasks) and on WebShop (62.76 → 66.64). It also outperforms the fine-tuning baselines, suggesting that integrating world knowledge is more effective than further fine-tuning on negative examples. WKM additionally shows better performance and generalization than methods that rely on human-designed knowledge, such as KNOWAGENT.

This research develops a parametric WKM to enhance planning in language agent models. The WKM provides task knowledge for global planning and state knowledge for local planning. Results show that WKM outperforms strong baselines on state-of-the-art open-source models and surpasses GPT-4 on ALFWorld and WebShop. Analytical experiments demonstrate WKM's ability to reduce trial-and-error, improve generalization to unseen tasks, let a weaker knowledge model guide a stronger agent model ("weak-guide-strong"), and extend to unified world knowledge training.

The post Optimizing Agent Planning: A Parametric AI Approach to World Knowledge appeared first on MarkTechPost.
