The Open Dataset Every AI Developer Needs
If you are building AI agents, you need good training data. Specifically, you need tool-use trajectories.
What Are Tool-Use Trajectories?
A tool-use trajectory is a record of:
- The LLM deciding to use a tool
- The tool being called with specific parameters
- The result returned to the LLM
- The LLM using that result to continue
These trajectories teach models how to act, not just generate text.
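As a rough illustration, the four parts listed above could be stored as one structured record per trajectory. The sketch below is only one possible layout: the class names, field names, and the `get_weather` tool are hypothetical and are not the project's actual schema.

```python
import json
from dataclasses import dataclass, asdict
from typing import Any


@dataclass
class ToolStep:
    """One tool interaction inside a trajectory (hypothetical field names)."""
    decision: str               # the LLM's reasoning for using a tool
    tool_name: str              # which tool was called
    parameters: dict[str, Any]  # the specific parameters passed to the tool
    result: Any                 # the result returned to the LLM
    follow_up: str              # how the LLM used the result to continue


@dataclass
class Trajectory:
    """A task plus the ordered tool steps taken to complete it."""
    task: str
    steps: list[ToolStep]


def to_jsonl_line(trajectory: Trajectory) -> str:
    """Serialize a trajectory as a single JSON line."""
    return json.dumps(asdict(trajectory), sort_keys=True)


# A made-up example with a single tool call.
example = Trajectory(
    task="What is the weather in Paris?",
    steps=[
        ToolStep(
            decision="I need current weather data, so I will call a weather tool.",
            tool_name="get_weather",
            parameters={"city": "Paris"},
            result={"temp_c": 18, "condition": "cloudy"},
            follow_up="It is 18 degrees Celsius and cloudy in Paris.",
        )
    ],
)

line = to_jsonl_line(example)
```

Writing one JSON object per line keeps each trajectory independent, so logs from different contributors can be concatenated without reparsing.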
Why They Matter
Without tool-use data, models struggle to:
- Call APIs reliably
- Use functions or plugins
- Execute multi-step workflows
- Recover from errors
Our Open Dataset Project
We are building an open dataset of tool-use trajectories, and we aim to make it the largest of its kind. Our goals:
- 10,000+ real-world tool interactions
- Diverse domains (search, code, data, APIs)
- Human-annotated for quality
How to Contribute
- Share your logs: Contribute anonymized tool interaction logs
- Annotate: Help label tool-use quality
- Test: Use our dataset to fine-tune your model
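If you plan to share logs, a first anonymization pass might look like the sketch below. It is an assumption about what such a pass could do, not the project's required process: the key names in `SENSITIVE_KEYS` and the placeholder strings are hypothetical, and a simple pattern like this will not catch every kind of personal data, so manual review is still needed.

```python
import re

# Simple email pattern; real logs may contain other identifiers as well.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical key names whose values are always replaced.
SENSITIVE_KEYS = {"api_key", "token", "password", "authorization"}


def redact(value):
    """Return a copy of a JSON-like value with secrets and emails masked."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if str(key).lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    if isinstance(value, str):
        return EMAIL_RE.sub("[EMAIL]", value)
    # Numbers, booleans, and None pass through unchanged.
    return value


# A made-up log entry for illustration.
log = {
    "tool_name": "send_mail",
    "parameters": {"to": "alice@example.com", "api_key": "sk-123"},
    "result": ["sent to alice@example.com", 200],
}
clean = redact(log)
```

The function builds a new structure rather than editing in place, so the original log stays available for comparison during review.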
This is a community effort. The more data we have, the better AI agents become for everyone.
Join us.