TL;DR
- ml-intern is Hugging Face's open-source ML engineering agent (Apache 2.0, 621⭐ as of April 2026).
- Same Anthropic Claude runtime as Claude Code, same MCP protocol, but specialized for ML workflows (papers → datasets → training → deployment).
- Key engineering wins: 170k-token auto-compaction, a 300-iteration agentic loop, a Doom Loop Detector, and deep HF ecosystem integration.
- Sionic AI runs 1,000+ ML experiments per day with the Claude Code + HF Skills stack.
If you've been looking for a Claude Code equivalent that actually understands `load_dataset`, `Trainer`, and `push_to_hub`, this is it.
What is ml-intern?
ml-intern is an autonomous ML engineering agent released by Hugging Face on October 30, 2025. It's a CLI tool that reads papers, trains models, and ships code — end-to-end.
Think of it as "Claude Code, but it speaks fluent Hugging Face."
- Repo: https://github.com/huggingface/ml-intern
- Stars: 621 ⭐ (as of 2026-04)
- Forks: 62
- Languages: Python 69.4%, TypeScript 30.1%
- License: Apache 2.0
- Maintainer: Hugging Face (smolagents org)
Why should you care?
Because the architecture isn't hype. It's a textbook implementation of Anthropic's agentic harness principles:
```
User/CLI
   ↓
submission_loop (agent_loop.py)
   ↓
Handlers.run_agent()
   ↓
Agentic Loop (max 300 iterations)
├─ Session
│  ├─ ContextManager (170k auto-compaction → HF Hub upload)
│  └─ ToolRouter
│     ├─ HF docs & research
│     ├─ HF repos / datasets / papers
│     ├─ HF Jobs (cloud GPU)
│     ├─ GitHub code search
│     ├─ Sandbox & local tools
│     ├─ Planning
│     └─ MCP server tools
└─ Doom Loop Detector
```
The pieces developers will actually appreciate:
- 170k-token auto-compaction — When the context hits 170k tokens, the history is compressed and a snapshot is uploaded to the HF Hub so you can rewind later.
- Doom Loop Detector — The #1 failure mode of agents (infinite loops) is actively detected and broken with corrective prompts; see the sketch after this list.
- 17-event stream — `processing`, `tool_call`, `approval_required`, `compacted`, `interrupted`, etc. Perfect for monitoring dashboards.
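The Doom Loop Detector idea is simple enough to sketch. The class below is my own illustration of the pattern (fingerprint each tool call, flag a loop when the same call repeats within a recent window), not ml-intern's actual implementation:

```python
from collections import deque

# Illustrative sketch of doom-loop detection (not ml-intern's actual code).
# Idea: fingerprint each (tool, args) call; if the same fingerprint repeats
# `threshold` times in the recent window, break the loop with a corrective
# prompt instead of executing the tool again.
class DoomLoopDetector:
    def __init__(self, window: int = 10, threshold: int = 3):
        self.recent: deque = deque(maxlen=window)
        self.threshold = threshold

    def is_looping(self, tool_name: str, tool_args: dict) -> bool:
        """Record a tool call; return True if it repeats suspiciously often."""
        signature = (tool_name, repr(sorted(tool_args.items())))
        self.recent.append(signature)
        return self.recent.count(signature) >= self.threshold


detector = DoomLoopDetector()
# "hf_dataset_search" is a hypothetical tool name, purely for illustration
if detector.is_looping("hf_dataset_search", {"query": "imdb"}):
    # Inject a corrective prompt rather than re-running the tool
    correction = "You've repeated the same tool call. Re-plan before continuing."
```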
Installation (5 minutes)
```bash
# Clone and install
git clone git@github.com:huggingface/ml-intern.git
cd ml-intern
uv sync
uv tool install -e .

# Verify
ml-intern --help
```
Then drop 3 keys in `.env`:
```
ANTHROPIC_API_KEY=sk-ant-...
HF_TOKEN=hf_...
GITHUB_TOKEN=ghp_...
```
Done.
Running it
Interactive mode (for exploration)
```bash
ml-intern
```
REPL-style. Good for first-time users or when you want to approve each tool call.
Headless mode (for automation)
ml-intern "fine-tune mistralai/Mistral-7B-v0.1 on my HF dataset using LoRA"
Auto-approve is the default in headless mode. Drop this in a GitHub Action and you have nightly ML experiments; a minimal driver script is sketched below.
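If you'd rather drive it from a script than a workflow file, here's a minimal nightly driver. It assumes only the CLI invocation shown above; the experiment prompts and log paths are my own placeholders:

```python
import subprocess
from datetime import date
from pathlib import Path

# Hypothetical experiment prompts; swap in your own
EXPERIMENTS = [
    "fine-tune mistralai/Mistral-7B-v0.1 on my HF dataset using LoRA",
    "evaluate last night's checkpoint and push a report to the Hub",
]

log_dir = Path("logs") / date.today().isoformat()
log_dir.mkdir(parents=True, exist_ok=True)

for i, prompt in enumerate(EXPERIMENTS):
    # --no-stream (covered under "Useful flags" below) keeps logs clean
    result = subprocess.run(
        ["ml-intern", "--no-stream", prompt],
        capture_output=True,
        text=True,
    )
    (log_dir / f"experiment_{i}.log").write_text(result.stdout + result.stderr)
```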
Useful flags
```bash
ml-intern --model anthropic/claude-opus-4-6 "complex reasoning task"
ml-intern --max-iterations 100 "bounded budget"
ml-intern --no-stream "CI-friendly output"
```
Extending: Add a custom tool
Open `agent/core/tools.py` and drop in a new `ToolSpec`:
```python
# agent/core/tools.py (ToolSpec is already available in this module)
from typing import List

def create_builtin_tools() -> List[ToolSpec]:
    return [
        # ...existing tools
        ToolSpec(
            name="my_internal_search",
            description="Search my company's internal docs for ML best practices",
            input_schema={
                "type": "object",
                "properties": {
                    "query": {"type": "string"}
                },
                "required": ["query"]
            },
            handler=my_internal_search_handler,
        ),
    ]

async def my_internal_search_handler(query: str) -> str:
    # Your logic here: call your internal search API and return formatted text
    formatted_result = f"No internal docs matched {query!r} (stub)"
    return formatted_result
```
Re-install with `uv tool install -e . --force` and you're done.
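To confirm the registration took, a quick sanity check (the import path is my assumption, matching the file edited above):

```python
# Quick sanity check; module path assumed from the file edited above
from agent.core.tools import create_builtin_tools

tool_names = [tool.name for tool in create_builtin_tools()]
assert "my_internal_search" in tool_names, "custom tool not registered"
```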
Extending: Attach an MCP server
In `configs/main_agent_config.json`:
```json
{
  "mcpServers": {
    "my-db": {
      "command": "node",
      "args": ["/opt/mcp-servers/my-db/index.js"],
      "env": {
        "DB_URL": "${COMPANY_DB_URL}",
        "API_KEY": "${COMPANY_API_KEY}"
      }
    }
  }
}
```
`${ENV_VAR}` placeholders are auto-substituted from your `.env`, so no secrets leak into the JSON.
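The substitution pattern is easy to picture; this is my own sketch, not ml-intern's actual code:

```python
import os
import re

# Illustrative sketch of ${ENV_VAR} substitution (not ml-intern's actual code)
def substitute_env_vars(raw_config: str) -> str:
    """Replace each ${NAME} placeholder with the value of $NAME."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        value = os.environ.get(name)
        if value is None:
            raise KeyError(f"missing environment variable: {name}")
        return value

    return re.sub(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}", replace, raw_config)

# Usage: read the config as text, substitute, then parse as JSON
# config = json.loads(substitute_env_vars(config_path.read_text()))
```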
Real production usage
Sionic AI (Korean ML team) runs 1,000+ experiments per day with a Claude Code + HF Skills pipeline. The HF blog post "We Got Claude to Fine-Tune an Open Source LLM" hit 613 upvotes.
This isn't a toy. It's what solo ML teams use to 10x their throughput.
How does this compare to Claude Code?
| Feature | Claude Code | ml-intern |
|---|---|---|
| Domain | General coding | ML workflows |
| Model | claude-sonnet-4-5 | claude-sonnet-4-5 |
| MCP support | ✅ | ✅ |
| Ecosystem | Generic | HF Hub/Jobs/Spaces/Papers |
| Open source | Partial | Fully open (Apache 2.0) |
| Max iterations | Configurable | 300 default |
| Auto-compaction | Yes | 170k + HF Hub upload |
Same philosophy, different specialization. If your daily work is "read paper → download dataset → fine-tune → deploy Space," ml-intern is the right tool.
The bigger picture: HF's AI Agent-First strategy
ml-intern isn't standalone. It's part of a coordinated push:
- huggingface/skills — Skill repository compatible with Claude Code, Codex, Gemini CLI, and Cursor (30 contributors).
- hf CLI v1.9 — Auto-detects when it's called by an AI agent and strips ANSI codes, saving ~40% of tokens.
- `hf skills add` — One-command installer for agent-specific CLI skills.
- Trackio + HF Jobs — Real-time training monitoring + cloud GPU.
If you're building anything with LLMs in 2026, this ecosystem is worth knowing.
3 things to try today
- ⭐ Star + fork the repo. Apache 2.0 means you can white-label it.
- Install an HF Skill into your existing Claude Code: `hf skills add hf-cli`.
- Read the Sionic AI blog post and reverse-engineer their 1,000-experiment pipeline.
Closing thoughts
The question isn't "should I use an ML agent?" anymore. It's "how fast can I fork and extend one for my workflow?"
ml-intern gives you a production-grade starting point, Apache 2.0 licensed, with the same Anthropic runtime you already trust from Claude Code.
Six months from launch to 621 stars with active development. That's signal.
Links:
- Repo: https://github.com/huggingface/ml-intern
- HF Space demo: https://huggingface.co/spaces/smolagents/ml-intern
- HF Skills: https://github.com/huggingface/skills
- HF blog post: https://api-inference.hf-mirror.com/blog/hf-skills-training
- huggingface_hub v1.9 release: https://github.com/huggingface/huggingface_hub/releases/tag/v1.9.0