    Google DeepMind Introduces Mind Evolution: Enhancing Natural Language Planning with Evolutionary Search in Large Language Models

    January 22, 2025

    Guiding LLMs to think more deeply about complex problems and to use inference-time computation effectively can significantly enhance their problem-solving capabilities. Prior research has explored various strategies, including chain-of-thought reasoning, self-consistency, sequential revision with feedback, and search mechanisms guided by auxiliary verifiers or evaluators. Search-based methods, particularly when paired with solution evaluators, spend additional compute to explore a broader set of solution candidates. Techniques like best-of-N and tree search use this extra exploration to increase the likelihood of finding a successful solution.
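    To make the best-of-N idea concrete, here is a minimal sketch, assuming a proposer that stands in for an LLM call and an evaluator that scores each candidate. The names `best_of_n`, `toy_propose`, and `toy_evaluate` are hypothetical and exist only for illustration; the toy task (ordering cities alphabetically) is not from the paper.

```python
import random
from typing import Callable, List, Tuple


def best_of_n(
    propose: Callable[[random.Random], str],
    evaluate: Callable[[str], float],
    n: int,
    seed: int = 0,
) -> Tuple[str, float]:
    """Sample n independent candidates and return the highest-scoring one."""
    rng = random.Random(seed)
    candidates: List[str] = [propose(rng) for _ in range(n)]
    scored = [(evaluate(c), c) for c in candidates]
    best_score, best = max(scored, key=lambda pair: pair[0])
    return best, best_score


# Toy stand-ins (hypothetical): the "LLM" proposes a random city order,
# and the evaluator rewards orders that are close to alphabetical.
CITIES = ["Austin", "Boston", "Chicago", "Denver"]


def toy_propose(rng: random.Random) -> str:
    order = CITIES[:]
    rng.shuffle(order)
    return " -> ".join(order)


def toy_evaluate(plan: str) -> float:
    order = plan.split(" -> ")
    # Count adjacent pairs that are already in alphabetical order (max 3).
    return float(sum(a < b for a, b in zip(order, order[1:])))


if __name__ == "__main__":
    plan, score = best_of_n(toy_propose, toy_evaluate, n=50)
    print(plan, score)
```

    Because every candidate is sampled independently, a larger N can only keep or improve the best score for a fixed seed, but no candidate ever learns from another. That limitation is what iterative and evolutionary methods try to address.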

    Recent efforts have combined LLMs with evolutionary search for optimization tasks, such as numerical and combinatorial problems and natural language planning. Unlike earlier studies that required task formalization in structured spaces, these approaches evolve solutions directly in natural language, bypassing the need for expert knowledge in formalizing tasks. Evolutionary search has also been applied to prompt optimization and multi-agent system design, such as EvoAgent, which evolved agents for problem-solving. However, these approaches often achieved limited success on tasks like the TravelPlanner benchmark, where Mind Evolution with Gemini 1.5 Flash demonstrates significant improvements. Additionally, program-based evaluators integrated during evolutionary search provide reliable feedback to refine solutions, a technique widely adopted in code generation and response refinement across various domains. Learned feedback models and self-evaluators have also been explored, but they often suffer from noise and unreliability, which leaves room for future work.

    Researchers from Google DeepMind, UC San Diego, and the University of Alberta introduced Mind Evolution, an evolutionary search strategy designed to enhance inference-time computation for LLMs. Unlike previous methods like Best-of-N or sequential refinement, Mind Evolution uses a genetic approach to iteratively generate, refine, and recombine candidate solutions in natural language. It avoids formalizing tasks by relying on a solution evaluator, enabling higher success rates in natural language planning tasks like TravelPlanner and Natural Plan. Mind Evolution achieved 95.6% success on TravelPlanner and introduced new benchmarks like StegPoet, showcasing its versatility across challenging, non-formalized domains.

    Mind Evolution integrates a genetic search approach with an LLM and customized prompts to efficiently address natural language planning tasks. It employs language-based genetic algorithms, where solutions are represented in natural language, enabling LLMs to facilitate key operations like crossover, mutation, and island reset. The process begins by generating initial solutions through LLM-driven prompts. Solutions are iteratively refined using a “Refinement through Critical Conversation” (RCC) process involving critic and author roles for evaluation and improvement. The framework incorporates Boltzmann tournament selection, cyclic migration between islands, and periodic island resets to sustain diversity and optimize solutions effectively.
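    The loop described above can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: `recombine` stands in for the LLM-driven crossover, mutation, and RCC refinement; Boltzmann selection is approximated by sampling parents with probability proportional to exp(score / T); keeping the fittest of parents plus children and reseeding the weakest island from global elites are simplifications chosen for this sketch. All function and parameter names are hypothetical.

```python
import math
import random
from typing import Callable, List

CITIES = ["Austin", "Boston", "Chicago", "Denver", "El Paso"]


def boltzmann_pick(pop: List[str], scores: List[float], temperature: float,
                   rng: random.Random) -> str:
    """Sample one parent with probability proportional to exp(score / T)."""
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp((s - top) / temperature) for s in scores]
    return rng.choices(pop, weights=weights, k=1)[0]


def evolve(init: Callable[[random.Random], str],
           recombine: Callable[[str, str, random.Random], str],
           evaluate: Callable[[str], float],
           n_islands: int = 2, pop_size: int = 6, generations: int = 10,
           temperature: float = 1.0, n_migrants: int = 1,
           reset_every: int = 5, seed: int = 0) -> str:
    rng = random.Random(seed)
    islands = [[init(rng) for _ in range(pop_size)] for _ in range(n_islands)]
    for gen in range(1, generations + 1):
        for i, pop in enumerate(islands):
            scores = [evaluate(s) for s in pop]
            children = []
            for _ in range(pop_size):
                a = boltzmann_pick(pop, scores, temperature, rng)
                b = boltzmann_pick(pop, scores, temperature, rng)
                children.append(recombine(a, b, rng))
            # Assumption: keep the fittest of parents plus children.
            islands[i] = sorted(pop + children, key=evaluate,
                                reverse=True)[:pop_size]
        # Cyclic migration: island i copies its best members to island i + 1.
        migrants = [pop[:n_migrants] for pop in islands]
        for i in range(n_islands):
            dst = (i + 1) % n_islands
            islands[dst] = sorted(islands[dst] + migrants[i], key=evaluate,
                                  reverse=True)[:pop_size]
        # Periodic reset: reseed the weakest island from the global elites.
        if gen % reset_every == 0:
            worst = min(range(n_islands),
                        key=lambda k: evaluate(islands[k][0]))
            everyone = [s for pop in islands for s in pop]
            islands[worst] = sorted(everyone, key=evaluate,
                                    reverse=True)[:pop_size]
    return max((s for pop in islands for s in pop), key=evaluate)


# Toy stand-ins for the LLM prompts and the program-based evaluator.
def toy_init(rng: random.Random) -> str:
    order = CITIES[:]
    rng.shuffle(order)
    return " -> ".join(order)


def toy_recombine(a: str, b: str, rng: random.Random) -> str:
    pa, pb = a.split(" -> "), b.split(" -> ")
    cut = rng.randint(1, len(pa) - 1)
    head = pa[:cut]
    child = head + [c for c in pb if c not in head]  # order crossover
    if rng.random() < 0.3:  # occasional swap mutation
        i, j = rng.sample(range(len(child)), 2)
        child[i], child[j] = child[j], child[i]
    return " -> ".join(child)


def toy_evaluate(plan: str) -> float:
    order = plan.split(" -> ")
    return float(sum(x < y for x, y in zip(order, order[1:])))


if __name__ == "__main__":
    best = evolve(toy_init, toy_recombine, toy_evaluate)
    print(best, toy_evaluate(best))
```

    In a real setting the evaluator would be called once per candidate and cached, since each call (and each LLM-driven recombination) carries a cost. The sketch re-evaluates freely only to stay short.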

    The experiments evaluate Mind Evolution on three natural language planning benchmarks: TravelPlanner, Trip Planning, and Meeting Planning, excluding Calendar Scheduling due to its simplicity. The primary model, Gemini 1.5 Flash, is used with specified hyperparameters, while a two-stage approach incorporates Gemini 1.5 Pro for unsolved cases, improving cost efficiency. Mind Evolution outperforms baselines, achieving over 95% success in TravelPlanner and Trip Planning and 85% in Meeting Planning, with near-perfect results using the two-stage approach. Metrics such as success rate, LLM calls, token usage, and API costs highlight the efficiency of Mind Evolution’s evolutionary search strategy compared to baselines.
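    The two-stage idea can be illustrated with a short sketch, assuming each stage is a solver that returns a plan or `None` when it fails to find one that passes the evaluator. The solver functions here are hypothetical placeholders for a search run backed by a cheaper model (such as Gemini 1.5 Flash) and a stronger one (such as Gemini 1.5 Pro); no real API is called.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A solver returns a plan, or None if it could not solve the task.
Solver = Callable[[str], Optional[str]]


def two_stage_solve(
    tasks: List[str], cheap_solver: Solver, strong_solver: Solver
) -> Tuple[Dict[str, Optional[str]], Dict[str, int]]:
    """Try the cheap solver on every task; escalate only the unsolved ones."""
    results: Dict[str, Optional[str]] = {}
    calls = {"cheap": 0, "strong": 0}
    for task in tasks:
        calls["cheap"] += 1
        plan = cheap_solver(task)
        if plan is None:
            # Only unsolved cases pay for the more expensive model.
            calls["strong"] += 1
            plan = strong_solver(task)
        results[task] = plan
    return results, calls


# Hypothetical toy solvers: the cheap one gives up on "hard" tasks.
def toy_cheap(task: str) -> Optional[str]:
    return None if "hard" in task else f"cheap plan for {task}"


def toy_strong(task: str) -> Optional[str]:
    return f"strong plan for {task}"


if __name__ == "__main__":
    tasks = ["trip A", "trip B (hard)", "trip C"]
    results, calls = two_stage_solve(tasks, toy_cheap, toy_strong)
    print(results)
    print(calls)
```

    The cost saving comes from the call counts: the stronger model runs only on the fraction of tasks the cheaper model leaves unsolved.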

    In conclusion, Mind Evolution introduces an evolutionary search strategy to enhance inference-time computation for complex natural language planning tasks, focusing on stochastic exploration and iterative refinement. Unlike methods relying on formal solvers, Mind Evolution leverages language models to generate, recombine, and refine candidate solutions, requiring only a solution evaluator. It outperforms strategies like Best-of-N and Sequential Revision in benchmarks such as TravelPlanner, Natural Plan, and the newly introduced StegPoet. Controlling for inference costs, it achieves remarkable success, solving over 98% of problem instances in TravelPlanner and Natural Plan benchmarks using Gemini 1.5 Pro, demonstrating its effectiveness without formal solver dependency.


    Check out the Paper. All credit for this research goes to the researchers of this project.

    The post Google DeepMind Introduces Mind Evolution: Enhancing Natural Language Planning with Evolutionary Search in Large Language Models appeared first on MarkTechPost.
