‘AI Scientist’ performs fully automatic scientific discovery

Japanese AI research lab Sakana AI has developed The AI Scientist, a framework for fully automatic scientific research and discovery.

The scientific community already uses AI models to automate or assist in their research, but these models only perform a small part of the scientific process. With advances in agentic AI, we’re now seeing AI agents that act autonomously across platforms with less human guidance.

With The AI Scientist, Sakana AI created a system that uses an LLM like GPT-4o or Gemini to automate the entire scientific process, from ideation, research, and experimentation to writing and reviewing research papers.

The ultimate goal is to have an AI research tool that conducts fully automated, open-ended scientific discovery. The AI Scientist gives us a glimpse into the possibilities of this becoming a reality.

The AI Scientist process

In their paper, Sakana AI explained how the framework was applied to machine learning research. Given a broad template as a research field, The AI Scientist is free to explore any possible research direction.

It first brainstorms a set of ideas and then accesses Semantic Scholar to check if these ideas represent novel avenues for research. If they do, then it uses automated code generation to create and run experiments.

The AI Scientist then compiles the explanation of the research and experimental results into a research paper along with citations of relevant papers from Semantic Scholar.

Sakana AI developed an automated paper reviewing system that uses an LLM to evaluate the research paper with near-human accuracy. This review process creates a feedback loop for iterative improvements to the research papers.

The AI Scientist research and paper writing and review process. Source: Sakana AI
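
The loop described above (brainstorm, novelty check, experiment, write-up, review, revise) can be sketched in a few lines of Python. This is only an illustrative outline, not Sakana AI's implementation: every function name, the `accept_score` threshold, and the `max_revisions` cap are assumptions made for the example, and the stand-in functions in `demo` replace what would be LLM calls and a Semantic Scholar lookup in a real system.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Paper:
    idea: str
    results: dict
    text: str
    score: float = 0.0
    revisions: int = 0


def run_pipeline(
    brainstorm: Callable[[], List[str]],
    is_novel: Callable[[str], bool],
    run_experiment: Callable[[str], dict],
    write_paper: Callable[[str, dict, Optional[str]], str],
    review: Callable[[str], Tuple[float, str]],
    accept_score: float = 6.0,   # hypothetical acceptance threshold
    max_revisions: int = 3,      # hypothetical cap on the feedback loop
) -> List[Paper]:
    papers = []
    for idea in brainstorm():
        # Skip ideas that the literature check says are already covered.
        if not is_novel(idea):
            continue
        results = run_experiment(idea)
        paper = Paper(idea, results, write_paper(idea, results, None))
        paper.score, feedback = review(paper.text)
        # Feedback loop: revise until the reviewer is satisfied or we give up.
        while paper.score < accept_score and paper.revisions < max_revisions:
            paper.text = write_paper(idea, results, feedback)
            paper.revisions += 1
            paper.score, feedback = review(paper.text)
        papers.append(paper)
    return papers


def demo() -> List[Paper]:
    # Stand-ins for the LLM and literature-search steps (all hypothetical).
    already_published = {"idea A"}

    def write_paper(idea, results, feedback):
        suffix = " (revised)" if feedback else ""
        return f"Paper on {idea}: {results}{suffix}"

    def review(text):
        return (7.0, "ok") if "revised" in text else (4.0, "add ablations")

    return run_pipeline(
        brainstorm=lambda: ["idea A", "idea B"],
        is_novel=lambda idea: idea not in already_published,
        run_experiment=lambda idea: {"metric": 0.5},
        write_paper=write_paper,
        review=review,
    )
```

Passing each stage in as a function keeps the control flow visible and lets the stages be swapped out independently, which is convenient for a sketch like this one.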

Here’s an example of one of the research papers The AI Scientist created: “DualScale Diffusion: Adaptive Feature Balancing for Low-Dimensional Generative Models”

The AI Scientist currently doesn’t have vision capabilities, so some of the charts, plots, and page layouts aren’t great. Using the vision capabilities of multimodal models in the next iteration should fix this.

It also shares some of the weaknesses of leading AI models, such as hallucinations, illogical reasoning, and trouble comparing the magnitude of two numbers. However, the latest version of GPT-4o finally understands that 9.9 is larger than 9.11, so this should improve too.

Concerning behavior

The idea of a fully automated AI scientist that recursively improves itself is equal parts exciting and scary. The AI Scientist exhibited some emergent behavior that hints at how things could go wrong.

The researchers “noticed that The AI Scientist occasionally tries to increase its chance of success, such as modifying and launching its own execution script… In another case, its experiments took too long to complete, hitting our timeout limit. Instead of making its code run faster, it simply tried to modify its own code to extend the timeout period.”
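
One common way to make a time limit harder to sidestep is to enforce it from a separate supervising process, so the limit does not live in code the agent can edit. The sketch below shows that general idea using Python's standard `subprocess` module. It is an assumption-laden illustration, not a description of how Sakana AI ran its experiments, and the helper name `run_with_hard_timeout` is hypothetical. It only caps wall-clock time and is not a full sandbox.

```python
import subprocess
import sys


def run_with_hard_timeout(code: str, timeout_s: float) -> str:
    """Run untrusted experiment code in a child process.

    The timeout is held by this parent process, so nothing the child
    does to its own source can extend it.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout_s,  # child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if proc.returncode == 0 else "error"
```

For example, `run_with_hard_timeout("import time; time.sleep(5)", 0.5)` reports a timeout regardless of what the child code contains.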

The AI Scientist has the potential to be a valuable tool for researchers, but its creators say it also carries significant risks of misuse.

At an average cost of around $15 per research paper, someone could use the tool to flood an already overburdened human academic peer review system. If those overworked human reviewers decided to default to Sakana AI’s automated paper review system, it could compromise scientific quality control.

The researchers also noted that The AI Scientist has the potential to be used in unethical ways. If given access to automated “cloud labs” it could “create new, dangerous viruses or poisons that harm people before we can intervene. Even in computers, if tasked to create new, interesting, functional software, it could create dangerous malware.”

We’ll have to see how the AI-generated research papers fare after human review, but at $15 per paper, the future of scientific research looks cheaper, faster, and a lot less human.

The post ‘AI Scientist’ performs fully automatic scientific discovery appeared first on DailyAI.
