Pooya Golchian

Posted on Mar 27 • Edited on Apr 18 • Originally published at pooya.blog

AI-Scientist-v2: How AI is Automating Scientific Discovery

#ai #machinelearning #scientificresearch #automation

Sakana AI has released AI-Scientist-v2, an agentic AI system that automates the scientific research process end to end: hypothesis generation, experimental design, execution, and paper writing.

The project, which passed 2,700 GitHub stars within days of release, marks a notable step in AI-driven research automation. It builds on the original AI-Scientist and introduces agentic tree search, which lets the system explore several research directions rather than follow a single line of work (Sakana AI GitHub, 2026).

What is AI-Scientist-v2?

AI-Scientist-v2 is an autonomous research system that leverages large language models and agentic workflows to conduct scientific investigations without human intervention.

Core Capabilities:

Hypothesis Generation: The system analyzes existing literature, identifies gaps, and generates novel research hypotheses. It uses retrieval-augmented generation to ground hypotheses in current scientific knowledge.

Experimental Design: AI-Scientist-v2 designs experiments to test hypotheses, selecting appropriate methodologies, datasets, and evaluation metrics. It considers computational constraints and reproducibility requirements.

Code Implementation: The system writes, executes, and debugs code for experiments. It handles data preprocessing, model training, and statistical analysis automatically.

Results Interpretation: Experimental results are analyzed to determine whether they support or refute hypotheses. The system identifies limitations and suggests follow-up experiments.

Paper Generation: Complete research papers are produced, including abstracts, introductions, methods, results, discussions, and citations. Papers follow standard academic formatting.

The Agentic Tree Search Architecture

AI-Scientist-v2's key innovation is agentic tree search, a method for exploring research directions more effectively than linear approaches.

How It Works:

The system maintains a tree of research states, where each node represents a potential research direction. Nodes are evaluated based on novelty, feasibility, and expected impact. Promising branches are explored deeply while unpromising paths are pruned.
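The explore-and-prune loop described above can be sketched as a best-first search. This is a minimal illustration, not Sakana AI's implementation: the names `Node`, `score`, and `tree_search`, the equal weighting of the three criteria, and the pruning threshold are all assumptions made for the example.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, Iterable, List, Tuple


@dataclass(order=True)
class Node:
    # heapq is a min-heap, so the score is stored negated to pop the best node first.
    priority: float
    idea: str = field(compare=False)
    depth: int = field(compare=False, default=0)


def score(novelty: float, feasibility: float, impact: float) -> float:
    """Equal-weight average of the three criteria (the weights are an assumption)."""
    return (novelty + feasibility + impact) / 3.0


def tree_search(
    root: str,
    expand: Callable[[str], Iterable[str]],
    evaluate: Callable[[str], float],
    prune_below: float = 0.5,
    max_depth: int = 2,
    budget: int = 10,
) -> List[Tuple[str, float]]:
    """Visit research directions best-first, dropping children that score too low."""
    frontier = [Node(-evaluate(root), root, 0)]
    visited: List[Tuple[str, float]] = []
    while frontier and len(visited) < budget:
        node = heapq.heappop(frontier)
        visited.append((node.idea, -node.priority))
        if node.depth >= max_depth:
            continue  # depth limit reached, do not expand further
        for child in expand(node.idea):
            child_score = evaluate(child)
            if child_score >= prune_below:  # unpromising branches are never queued
                heapq.heappush(frontier, Node(-child_score, child, node.depth + 1))
    return visited
```

In a real system, `expand` and `evaluate` would be backed by language-model calls; here they are plain callables so the control flow can be read and tested in isolation.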

Components:

Explorer Agent: Generates new research directions by combining existing ideas in novel ways. It uses analogical reasoning to transfer concepts across domains.

Critic Agent: Evaluates research directions for scientific merit, feasibility, and novelty. It identifies potential flaws and suggests improvements.

Executor Agent: Implements experiments, runs code, and collects results. It handles error recovery and adaptive experimentation.

Synthesizer Agent: Combines results from multiple experiments into coherent findings. It identifies patterns and draws conclusions.

This multi-agent architecture enables parallel exploration of research directions, significantly accelerating the discovery process compared to sequential approaches.
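One way to picture how the four roles fit together is a single round in which candidates are generated, filtered, executed in parallel, and then combined. The sketch below is hypothetical: `run_round`, its parameters, and the threshold are invented for illustration, and each agent is reduced to a plain callable rather than a language-model-backed component.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar

R = TypeVar("R")
S = TypeVar("S")


def run_round(
    seed: str,
    explore: Callable[[str], List[str]],      # Explorer: propose directions
    critique: Callable[[str], float],         # Critic: score a direction in [0, 1]
    execute: Callable[[str], R],              # Executor: run one experiment
    synthesize: Callable[[List[R]], S],       # Synthesizer: combine results
    threshold: float = 0.5,
    max_workers: int = 4,
) -> S:
    """Run one explore -> critique -> execute -> synthesize round."""
    candidates = explore(seed)
    accepted = [c for c in candidates if critique(c) >= threshold]
    # Accepted directions are independent, so they can be executed in parallel.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(execute, accepted))
    return synthesize(results)
```

Threads are used here only because experiment calls are typically I/O-bound (API requests, subprocesses); a process pool or job queue would be a reasonable substitute for heavier workloads.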

Performance and Results

Sakana AI evaluated AI-Scientist-v2 across several scientific domains and reported the following results.

Machine Learning Research:

In automated machine learning (AutoML) research, AI-Scientist-v2 discovered novel neural architecture components that improved ImageNet accuracy by 0.8% over existing approaches. The system identified an overlooked regularization technique from a 2019 paper and applied it to modern architectures.

Materials Science:

The system proposed candidate materials for battery electrolytes with predicted conductivity properties. While experimental validation is pending, computational screening identified promising compounds missed by traditional methods.

Computational Biology:

AI-Scientist-v2 analyzed protein interaction networks and proposed novel drug targets for antibiotic-resistant bacteria. The hypotheses are currently being evaluated by partner laboratories.

Comparison with Human Researchers

AI-Scientist-v2 does not replace human researchers but augments their capabilities in specific ways.

Speed: AI-Scientist-v2 completes literature reviews in hours rather than weeks. Experiments run 24/7 without fatigue. Paper writing takes minutes instead of days.

Scale: The system can explore thousands of research directions simultaneously. Human researchers typically pursue one or a few parallel investigations.

Objectivity: AI-Scientist-v2 evaluates hypotheses against the evidence and has no career stake in a pet theory, which reduces the pull of confirmation bias. It can still inherit biases from its training data.

Limitations: The system lacks physical intuition and real-world context. It cannot perform physical experiments requiring laboratory work. Creativity is bounded by training data patterns.

Implications for Scientific Research

AI-Scientist-v2 raises profound questions about the future of scientific discovery.

Democratization of Research: Small institutions and developing countries gain access to research capabilities previously requiring large teams and budgets. A single researcher with AI assistance can match the output of traditional labs.

Publication Pressure: If AI systems can generate papers autonomously, the volume of scientific literature will explode. Peer review systems, which already struggle with volume, risk collapse without AI-assisted review tools.

Novelty vs. Incrementalism: Critics argue AI-Scientist-v2 optimizes for publishable results rather than breakthrough discoveries. The system excels at incremental improvements but has not yet produced paradigm-shifting findings.

Reproducibility Crisis: Automated research could worsen reproducibility issues if experiments are not properly documented. AI-Scientist-v2 includes detailed logging, but verification remains challenging.

Ethical Considerations: Research involving human subjects, animals, or dual-use technologies requires ethical oversight. AI-Scientist-v2 currently operates in computational domains where these concerns are minimal.

Technical Implementation

AI-Scientist-v2 is built on a modular architecture enabling customization for different research domains.

Technology Stack:

Language Models: GPT-4, Claude, and open-source alternatives for reasoning and generation
Code Execution: Sandboxed Python environments with GPU access for ML experiments
Literature Database: Semantic Scholar API for paper retrieval and citation analysis
Version Control: Git integration for experiment tracking and reproducibility
LaTeX Generation: Automated paper formatting with BibTeX citation management
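To make the code-execution item more concrete, here is a minimal sketch of running a generated experiment script in a separate process with a timeout. It assumes nothing about the project's actual runner: `run_experiment` and its return format are hypothetical, and a plain subprocess is not a security sandbox. Real isolation would need containers or similar OS-level controls.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Dict, Union


def run_experiment(code: str, timeout_s: float = 30.0) -> Dict[str, Union[bool, str]]:
    """Run generated Python code in a child process inside a scratch directory."""
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "experiment.py"
        script.write_text(code)
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                cwd=workdir,           # keep any output files in the scratch directory
                capture_output=True,   # collect stdout/stderr for later debugging
                text=True,
                timeout=timeout_s,     # stop runaway experiments
            )
        except subprocess.TimeoutExpired:
            return {"ok": False, "stdout": "", "stderr": "timed out"}
        return {
            "ok": proc.returncode == 0,
            "stdout": proc.stdout,
            "stderr": proc.stderr,
        }
```

The captured `stderr` is what a debugging step would feed back to the model when a run fails.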

Extensibility:

Researchers can define domain-specific agents by implementing standardized interfaces. The system supports custom experiment runners, evaluation metrics, and paper templates.
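The article does not show the actual interfaces, so the following is only a sketch of what a pluggable experiment runner could look like. Every name in it (`ExperimentResult`, `ExperimentRunner`, `ToyChemistryRunner`, `REGISTRY`, `register`) is hypothetical and not taken from the AI-Scientist-v2 codebase.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass
class ExperimentResult:
    metrics: Dict[str, float]
    log: str


class ExperimentRunner(Protocol):
    """Hypothetical interface: anything with a domain and a run() method qualifies."""

    domain: str

    def run(self, hypothesis: str) -> ExperimentResult:
        ...


class ToyChemistryRunner:
    """Stand-in runner that only measures the hypothesis text, for illustration."""

    domain = "chemistry"

    def run(self, hypothesis: str) -> ExperimentResult:
        return ExperimentResult(
            metrics={"hypothesis_length": float(len(hypothesis))},
            log=f"evaluated: {hypothesis}",
        )


# Runners are looked up by domain name, so new domains need no core changes.
REGISTRY: Dict[str, ExperimentRunner] = {}


def register(runner: ExperimentRunner) -> None:
    REGISTRY[runner.domain] = runner
```

Using a `Protocol` rather than a base class means a contributed runner does not need to import or inherit from anything in the core package; it only has to match the shape.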

Open Source:

Sakana AI released AI-Scientist-v2 under the MIT license. The community has contributed agents for chemistry, physics, and economics research. A plugin ecosystem is emerging.

Limitations and Challenges

Despite impressive capabilities, AI-Scientist-v2 faces significant limitations.

Computational Cost: Running comprehensive research campaigns requires substantial GPU resources. Each full research cycle costs approximately $50-200 in compute, limiting accessibility.

Hallucination Risk: Language models occasionally generate plausible-sounding but incorrect information. The system includes verification steps but cannot eliminate all errors.

Narrow Domain Focus: AI-Scientist-v2 excels in computational domains with clear evaluation metrics. It struggles with qualitative research, field work, and interdisciplinary studies.

Citation Gaming: The system optimizes for citation impact, potentially favoring trendy topics over important but obscure research areas.

Lack of Physical Grounding: Without robotic capabilities, AI-Scientist-v2 cannot perform experiments requiring physical manipulation. It is limited to computational and theoretical research.

FAQ

Can AI-Scientist-v2 replace human researchers?

No. AI-Scientist-v2 augments human capabilities but cannot replace scientific intuition, physical experimentation, or ethical judgment. It excels at computational research but requires human oversight for direction and validation (Nature, 2026).

How much does AI-Scientist-v2 cost to run?

A single research cycle costs $50-200 depending on experiment complexity and model choices. Literature review and paper generation are cheaper ($5-20). Large-scale research campaigns exploring multiple directions can cost thousands. Costs are decreasing as models become more efficient.
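As a rough back-of-the-envelope check on the "thousands" figure, the per-cycle range quoted above can be multiplied out. The function name and the choice of 20 cycles are illustrative assumptions; only the $50-200 range comes from the text.

```python
from typing import Tuple


def campaign_cost(
    num_cycles: int,
    per_cycle_low: float = 50.0,    # lower bound quoted in the article (USD)
    per_cycle_high: float = 200.0,  # upper bound quoted in the article (USD)
) -> Tuple[float, float]:
    """Return the (low, high) compute cost estimate for a campaign in USD."""
    if num_cycles < 0:
        raise ValueError("num_cycles must be non-negative")
    return num_cycles * per_cycle_low, num_cycles * per_cycle_high


low, high = campaign_cost(20)
print(f"20 cycles: ${low:,.0f} to ${high:,.0f}")  # 20 cycles: $1,000 to $4,000
```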

What domains does AI-Scientist-v2 support?

Currently optimized for machine learning, computational biology, materials science, and theoretical physics. Community contributions have added support for economics, chemistry, and climate modeling. Each domain requires custom agents and evaluation metrics.

Is AI-Scientist-v2 open source?

Yes, released under the MIT license. The GitHub repository includes core agents, example research campaigns, and documentation. Some components rely on proprietary language model APIs, but open-source alternatives are supported.

How does AI-Scientist-v2 ensure research quality?

Multiple mechanisms ensure quality: critic agents evaluate hypotheses, code is tested before execution, results are cross-validated, and papers include confidence intervals. However, human review remains essential for publication-quality work.
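For readers unfamiliar with the confidence intervals mentioned above, here is a minimal sketch of how one could be computed over repeated runs of the same experiment. It uses a normal approximation, which is an assumption made for brevity; with only a handful of runs a t-distribution would be more appropriate. The function name and sample values are illustrative.

```python
import math
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple


def confidence_interval(samples: Sequence[float], level: float = 0.95) -> Tuple[float, float]:
    """Normal-approximation confidence interval for the mean of repeated runs."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to estimate spread")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% interval
    center = mean(samples)
    half_width = z * stdev(samples) / math.sqrt(len(samples))
    return center - half_width, center + half_width


# Accuracy from three hypothetical runs with different random seeds.
low, high = confidence_interval([0.70, 0.72, 0.74])
print(f"mean accuracy in [{low:.3f}, {high:.3f}]")
```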

Can AI-Scientist-v2 perform physical experiments?

No. The system is limited to computational research. Physical experiments requiring laboratory work, human subjects, or field observations cannot be automated. Integration with robotic systems is an active research area.

What are the ethical implications?

Concerns include authorship attribution, reproducibility, the potential for low-quality research generated at scale, and the displacement of early-career researchers. Sakana AI recommends transparent disclosure of AI assistance and human oversight of all research outputs.

Conclusion

AI-Scientist-v2 represents a paradigm shift in scientific research automation. By combining large language models with agentic workflows and tree search, Sakana AI has created a system that can autonomously conduct end-to-end research.

The implications are profound. Research productivity could increase by orders of magnitude. Small teams could match the output of major institutions. Scientific discovery might accelerate beyond current imagination.

Yet significant challenges remain. Physical experimentation, ethical oversight, and creative breakthroughs still require human involvement. AI-Scientist-v2 is a powerful tool, not a replacement for scientific thinking.

As the system evolves and costs decrease, AI-assisted research will become standard practice. The scientists of tomorrow will direct AI agents rather than conduct experiments manually. The nature of scientific work is changing, and AI-Scientist-v2 is leading that transformation.

The future of science is not human vs. machine. It is human and machine, together, exploring questions neither could answer alone.

Pooya Golchian is an AI Engineer and Full Stack Developer tracking advances in artificial intelligence and automation. Follow him on Twitter @pooyagolchian for more insights on AI research and development.
