Introduction
"If AI agents could evolve like biological organisms — autonomously discovering problems, accumulating experience, and optimizing strategies — they would no longer be static tools, but truly 'growing' intelligent entities."
This is Part 10 of the "Open Source Project of the Day" series. Today we explore AgentEvolver (GitHub: https://github.com/modelscope/AgentEvolver).
Traditional AI agent training requires large amounts of manually annotated datasets — expensive and hard to scale. AgentEvolver uses three self-evolving mechanisms — Self-Questioning, Self-Navigating, and Self-Attributing — to enable AI agents to autonomously generate tasks, accumulate experience, and optimize strategies, achieving true self-evolution.
What You'll Learn
- Core self-evolution mechanisms and how AgentEvolver works
- How the three mechanisms (Self-Questioning, Self-Navigating, Self-Attributing) work together
- How to set up and train a self-evolving agent system
- Service-oriented data flow architecture design
- Outstanding performance on AppWorld and BFCL-v3 benchmarks
- Comparative analysis with other agent training frameworks
Prerequisites
- Basic understanding of AI agents and reinforcement learning
- Familiarity with Python programming
- Understanding of basic LLM concepts
- Basic knowledge of reinforcement learning training pipelines (optional)
Project Background
Project Introduction
AgentEvolver is an efficient self-evolving agent system that enables AI agents to autonomously learn and evolve through three core mechanisms:
- Self-Questioning: Agents autonomously explore environments and generate diverse tasks, eliminating the cost of expensive manual dataset construction
- Self-Navigating: Summarizes and reuses cross-task experience to guide higher-quality exploration and improve exploration efficiency
- Self-Attributing: Handles long trajectories, discovers causal contributions of intermediate steps, and enables fine-grained and efficient policy optimization
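Taken together, the three mechanisms form an evolution loop: generate tasks, explore with retrieved experience, then assign step-level credit. The sketch below is a minimal, hypothetical illustration of one such cycle — the function names, data shapes, and placeholder rollout are my own assumptions, not AgentEvolver's actual API:

```python
# Hypothetical sketch of one self-evolution cycle. Names and structure are
# illustrative only; see the AgentEvolver repo for the real implementation.

def self_questioning(env_states):
    """Turn observed environment states into candidate tasks."""
    return [f"task:{s}" for s in env_states]

def self_navigating(task, experience_pool):
    """Retrieve prior experience whose task tag matches the new task."""
    return [e for e in experience_pool if e["task"] == task]

def self_attributing(trajectory, reward):
    """Spread a trajectory-level reward over steps; later steps are weighted
    more heavily here as a stand-in for learned causal attribution."""
    n = len(trajectory)
    total = sum(range(1, n + 1))
    return [reward * (i + 1) / total for i in range(n)]

def evolution_cycle(env_states, experience_pool):
    results = []
    for task in self_questioning(env_states):
        hints = self_navigating(task, experience_pool)   # experience-guided
        trajectory = ["explore", "act", "finish"]        # placeholder rollout
        reward = 1.0 if hints else 0.5                   # placeholder outcome
        results.append((task, self_attributing(trajectory, reward)))
        experience_pool.append({"task": task, "steps": trajectory})
    return results
```

Each pass through `evolution_cycle` grows the experience pool, so later tasks can benefit from earlier rollouts — the core feedback loop the paper describes.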
Core problems the project solves:
- Agent training requires large amounts of expensive, manually annotated data
- A lack of autonomous exploration makes it hard to discover new tasks
- Experience cannot be effectively reused, so exploration efficiency stays low
- Credit assignment over long trajectories is imprecise, making policy optimization inefficient
- Integrating different environments is difficult, and a unified training framework is lacking
Target user groups:
- AI agent researchers and developers
- Researchers needing to train autonomous agents
- Enterprises looking to reduce agent training costs
- Technical professionals interested in self-evolving systems
Author/Team Introduction
Team: ModelScope
- Background: Alibaba DAMO Academy ModelScope team, focused on AI model and system development
- Contributors: 10 contributors including @YunpengZhai, @TaoShuchang, @Xinji-Mai, and others
- Philosophy: Building efficient, autonomous, evolvable AI agent systems
- Website: modelscope.github.io/AgentEvolver
Project created: 2024 (based on GitHub activity; actively maintained)
Project Stats
- ⭐ GitHub Stars: 1.1k+ (continuously growing)
- 🍴 Forks: 128+
- 📦 Version: Latest version (continuously updated)
- 📄 License: Apache-2.0 (fully open source, free to use)
- 🌐 Website: modelscope.github.io/AgentEvolver
- 📚 Documentation: Includes complete usage guides and API documentation
- 💬 Community: Active GitHub Issues
- 📊 Paper: arXiv:2511.10395
Project development history:
- 2024: Project created, started building core self-evolution mechanisms
- 2024-2025: Refined the three mechanisms, added multi-environment support
- 2025: Published paper, achieved outstanding performance on AppWorld and BFCL-v3 benchmarks
- 2026: Continuous optimization, added Game Arena multi-agent scenario support
Main Features
Core Purpose
AgentEvolver's core purpose is to build an efficient self-evolving agent system that enables AI agents to:
- Autonomously generate tasks: Through Self-Questioning, agents autonomously explore environments and generate diverse tasks
- Experience-guided exploration: Through Self-Navigating, summarize and reuse cross-task experience to improve exploration efficiency
- Fine-grained credit assignment: Through Self-Attributing, precisely identify the contributions of key steps in long trajectories
- Efficient policy optimization: Based on fine-grained credit assignment, achieve more efficient policy optimization
Use Cases
- Agent training and research
  - Training autonomously exploring AI agents
  - Researching the effectiveness of self-evolution mechanisms
  - Reducing agent training costs
- Complex environment interaction
  - AppWorld application operation tasks
  - BFCL-v3 complex reasoning tasks
  - Multi-agent social games (Avalon, Diplomacy)
- Automatic task generation
  - Automatically discovering new tasks in the environment
  - Generating diverse training data
  - Reducing manual annotation costs
- Experience reuse and optimization
  - Cross-task experience summarization and reuse
  - Improved exploration efficiency
  - Accelerated agent learning
Quick Start
Installation
AgentEvolver requires conda and CUDA toolkit:
```bash
# Step 1: Install base dependencies
bash install.sh

# Step 2: Set up the environment service (AppWorld as an example)
cd env_service/environments/appworld && bash setup.sh

# Step 3: Set up ReMe (optional, for experience management)
bash external/reme/install_reme.sh

# Step 4: Start training
conda activate agentevolver

# Method 1: Basic example (without ReMe)
python launcher.py --conf examples/basic.yaml --with-appworld

# Method 2: Full example (with ReMe: questioning + navigating + attributing)
python launcher.py --conf examples/overall.yaml --with-appworld --with-reme
```
Prerequisites
- conda: For environment management
- CUDA toolkit: For GPU acceleration
- Python 3.x: Primary programming language
Simplest Usage Example
```bash
# Copy the config file
cp example.env .env

# Edit .env to set your API key and conda path, then run training.

# Basic training (using the built-in environment dataset)
python launcher.py --conf examples/basic.yaml --with-appworld

# Full self-evolving training
python launcher.py --conf examples/overall.yaml --with-appworld --with-reme
```
Core Features
- Self-Questioning: Agents autonomously explore environments, generate diverse tasks, eliminating manual dataset construction costs
- Self-Navigating: Summarizes and reuses cross-task experience to guide high-quality exploration, improving exploration efficiency
- Self-Attributing: Handles long trajectories, discovers causal contributions of intermediate steps, enables fine-grained policy optimization
- Environment compatibility: Standardized interfaces for seamless integration with various external environments and tool APIs
- Flexible context management: Built-in tools for managing multi-turn context and complex interaction logic
- Modular architecture: Decoupled components, easy to customize, extend, and upgrade algorithms
- Game Arena support: Extended to multi-agent social game environments, supporting interaction, evaluation, and training
Project Advantages
| Comparison | AgentEvolver | Traditional agent training | Other self-evolving frameworks |
|---|---|---|---|
| Task generation | ✅ Autonomous generation | ❌ Requires manual annotation | ⚠️ Partial support |
| Experience reuse | ✅ Cross-task experience summary | ❌ Cannot reuse | ⚠️ Limited reuse |
| Credit assignment | ✅ Fine-grained attribution | ⚠️ Coarse-grained | ⚠️ Moderate precision |
| Training efficiency | ✅ Highly efficient | ❌ Expensive | ⚠️ Moderate |
| Environment support | ✅ Standardized interface | ⚠️ Needs adaptation | ⚠️ Limited support |
| Multi-agent | ✅ Game Arena | ❌ Not supported | ⚠️ Partial support |
Why choose AgentEvolver?
Compared to traditional agent training methods, AgentEvolver's three self-evolving mechanisms enable autonomous task generation, experience reuse, and fine-grained credit assignment. This significantly reduces training costs, improves efficiency, and yields strong results on the AppWorld and BFCL-v3 benchmarks.
Detailed Project Analysis
Architecture Design
AgentEvolver adopts a service-oriented data flow architecture, seamlessly integrating environment sandboxes, LLMs, and experience management into modular services.
Core Architecture
```
AgentEvolver System
├── Environment Service
│   ├── AppWorld environment
│   ├── BFCL-v3 environment
│   ├── Game Arena (Avalon, Diplomacy)
│   └── Custom environment interface
├── LLM Service
│   ├── Qwen2.5-7B/14B
│   ├── Other LLM support
│   └── API call wrapper
├── Experience Manager
│   ├── ReMe integration
│   ├── Experience pool management
│   └── Experience summarization and reuse
├── Task Manager
│   ├── Task exploration
│   ├── Synthetic task generation
│   └── Training data management
└── Advantage Processor
    ├── ADCA-GRPO algorithm
    ├── Credit assignment
    └── Policy optimization
```
Self-Questioning Mechanism
Self-Questioning enables agents to autonomously explore environments and generate diverse tasks:
Workflow:
- Agent autonomously explores the environment
- Discovers new tasks and challenges in the environment
- Automatically generates task descriptions and training data
- Eliminates expensive manual dataset construction costs
Advantages:
- High task diversity, covering various scenarios in the environment
- No manual annotation needed, significantly reduces costs
- High task quality, based on actual environment exploration
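One practical concern in autonomous task generation is avoiding near-duplicate tasks. The sketch below is a simplified, hypothetical deduplication pass — `propose_tasks`, `similar`, and the word-overlap threshold are illustrative assumptions, not AgentEvolver's actual filtering logic:

```python
# Hypothetical diversity filter for self-generated tasks: drop candidates
# that overlap too heavily with tasks already proposed.

def similar(a, b, threshold=0.8):
    """Word-level Jaccard similarity check between two task strings."""
    wa, wb = set(a.split()), set(b.split())
    return len(wa & wb) / max(len(wa | wb), 1) >= threshold

def propose_tasks(observations, seen):
    """Turn raw environment observations into task candidates,
    skipping near-duplicates of tasks already proposed."""
    tasks = []
    for obs in observations:
        candidate = f"Complete the action: {obs}"
        if not any(similar(candidate, t) for t in seen + tasks):
            tasks.append(candidate)
    return tasks
```

A real system would use embedding similarity rather than word overlap, but the principle is the same: diversity is enforced at generation time, not by manual curation.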
Self-Navigating Mechanism
Self-Navigating improves exploration efficiency through experience summarization and reuse:
Workflow:
- Summarize successful cross-task experiences
- Build an experience knowledge base
- Reuse relevant experience in new tasks
- Guide higher-quality exploration
Advantages:
- Significantly improves exploration efficiency
- Experience is reusable, avoiding repeated exploration
- Guides higher-quality strategies
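The retrieval step at the heart of this mechanism can be pictured as a top-k lookup against the experience pool. This is a minimal sketch under my own assumptions (keyword Jaccard similarity, a `task`/`tip` record shape); the actual retrieval lives in the ReMe integration:

```python
# Hypothetical experience retrieval: score stored experiences against the
# new task by keyword overlap, return the top-k matches above a threshold.

def jaccard(a, b):
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def retrieve_experience(task, pool, k=2, min_sim=0.2):
    """Return up to k stored experiences similar enough to the new task."""
    scored = sorted(pool, key=lambda e: jaccard(task, e["task"]), reverse=True)
    return [e for e in scored[:k] if jaccard(task, e["task"]) >= min_sim]
```

Retrieved tips would then be injected into the agent's context before rollout, which is what "experience-guided exploration" amounts to in practice.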
Self-Attributing Mechanism
Self-Attributing achieves efficient policy optimization through fine-grained credit assignment:
Workflow:
- Analyze intermediate steps in long trajectories
- Identify causal contributions of key steps
- Assign credit based on contributions
- Implement fine-grained policy optimization
Advantages:
- Precise credit assignment, avoids incorrect attribution
- High policy optimization efficiency
- Supports long trajectory processing
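The architecture lists ADCA-GRPO as the advantage processor, and the general shape of such an algorithm can be sketched as: compute a GRPO-style group-normalized advantage per trajectory, then distribute it over steps using attribution scores. The details below (reward values, attribution weights) are illustrative assumptions, not the paper's exact algorithm:

```python
# Illustrative step-level credit assignment in a GRPO-style setup:
# group-normalize trajectory rewards, then scale one trajectory's
# advantage by per-step attribution scores (assumed to sum to 1).

from statistics import mean, pstdev

def group_advantages(rewards):
    """GRPO-style advantage: normalize rewards within a rollout group."""
    mu, sigma = mean(rewards), pstdev(rewards)
    if sigma == 0:
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]

def step_credits(advantage, attribution_scores):
    """Spread a trajectory advantage over steps by attribution score."""
    return [advantage * s for s in attribution_scores]

rewards = [1.0, 0.0, 0.5, 0.5]        # rewards for 4 sampled trajectories
advs = group_advantages(rewards)
credits = step_credits(advs[0], [0.1, 0.3, 0.6])  # 3-step trajectory
```

The key property is that a single scalar outcome becomes a per-step signal, so steps that mattered receive proportionally more of the policy-gradient update.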
Performance
AgentEvolver achieves outstanding performance on AppWorld and BFCL-v3 benchmarks:
AppWorld Benchmark
- Qwen2.5-7B + AgentEvolver: avg@8: 32.4%, best@8: 51.2%
- Qwen2.5-14B + AgentEvolver: avg@8: 48.7%, best@8: 69.4%
Significant performance improvements over baseline models:
- 7B model: Improved from 1.8% to 32.4% (avg@8)
- 14B model: Improved from 18.0% to 48.7% (avg@8)
BFCL-v3 Benchmark
- Qwen2.5-7B + AgentEvolver: avg@8: 57.9%, best@8: 69.0%
- Qwen2.5-14B + AgentEvolver: avg@8: 66.5%, best@8: 76.7%
Significant performance improvements over baseline models:
- 7B model: Improved from 29.8% to 57.9% (avg@8)
- 14B model: Improved from 41.6% to 66.5% (avg@8)
Mechanism Ablation Study
Experiments show that the three mechanisms achieve the best results when combined:
- +Questioning: Significant performance improvement
- +Questioning&Navigating: Further improves exploration efficiency
- +Questioning&Attributing: Fine-grained optimization brings additional gains
- AgentEvolver (Full): All three mechanisms together, best performance
Game Arena Multi-agent Scenarios
AgentEvolver Game Arena extends AgentEvolver to multi-agent social game environments:
Core Capabilities
- Web interface interaction: Real-time observation of AI agent reasoning and communication, or participate as a human player
- Scalable evaluation: Run large-scale self-play or mixed-model tournaments, supports configuration and leaderboards
- End-to-end training: Directly train LLM agents using reinforcement learning methods (like GRPO) in social game environments
Supported Games
- Avalon: Social reasoning game, tests agents' reasoning and communication abilities
- Diplomacy: Complex multi-agent strategy game, tests long-term planning and collaboration abilities
Training Example
The training curve for the Assassin role in Avalon shows that AgentEvolver can effectively improve agent performance on complex social-reasoning tasks.
Environment Compatibility
AgentEvolver provides standardized interfaces supporting seamless integration with various external environments:
Environment Interface
- Standardized interface: Unified environment interface specification
- Tool API integration: Supports integration with various tools and APIs
- Custom environments: Easy to add custom environments
Supported Environments
- AppWorld: Application operation task environment
- BFCL-v3: Complex reasoning task environment
- Game Arena: Multi-agent social game environment
- Custom environments: Integrated through standard interfaces
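A custom environment typically boils down to implementing a small reset/step contract. The class below is a hypothetical illustration of that idea — the actual AgentEvolver interface may differ, so treat the method names and return shapes as assumptions and check `env_service/environments` in the repo for real examples:

```python
# Hypothetical minimal custom environment following a reset/step contract.
# Method names and return shapes are illustrative, not AgentEvolver's spec.

class EchoEnv:
    """Toy environment: the agent must reply with the target word."""

    def __init__(self, target="hello"):
        self.target = target
        self.done = False

    def reset(self):
        """Start a new episode and return the initial observation."""
        self.done = False
        return {"observation": f"Say the word '{self.target}'."}

    def step(self, action):
        """Apply one action; reward 1.0 for an exact match, else 0.0."""
        reward = 1.0 if action.strip() == self.target else 0.0
        self.done = True
        return {"observation": "episode over", "reward": reward, "done": True}
```

Once an environment exposes this kind of contract, the same training loop can drive AppWorld, BFCL-v3, Game Arena, or your own sandbox without modification.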
Experience Management (ReMe)
AgentEvolver integrates ReMe for experience management:
Features
- Experience summarization: Summarize successful cross-task experiences
- Experience pool management: Manage storage and retrieval of the experience pool
- Experience reuse: Reuse relevant experience in new tasks
Usage
```bash
# Install ReMe
bash external/reme/install_reme.sh

# Train with ReMe
python launcher.py --conf examples/overall.yaml --with-appworld --with-reme
```
Project Resources
Official Resources
- 🌟 GitHub: https://github.com/modelscope/AgentEvolver
- 🌐 Website: modelscope.github.io/AgentEvolver
- 📄 Paper: arXiv:2511.10395
Who Should Use This
AgentEvolver is especially suitable for: AI agent researchers and developers, researchers needing to train autonomous agents, enterprises looking to reduce agent training costs, technical professionals interested in self-evolving systems, research teams needing multi-agent training.
Not suitable for: Users who only need simple agents, scenarios that don't require autonomous learning, developers lacking reinforcement learning background.
Visit my personal homepage for more useful knowledge and interesting products.