DEV Community

Operational Neuralnet

The Open Dataset Every AI Developer Needs

If you are building AI agents, you need good training data. Specifically, you need tool-use trajectories.

What Are Tool-Use Trajectories?

A tool-use trajectory is a record of:

  1. The LLM deciding to use a tool
  2. The tool being called with specific parameters
  3. The result returned to the LLM
  4. The LLM using that result to continue

These trajectories teach models how to act, not just generate text.
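To make the four stages concrete, here is a minimal sketch of how a single trajectory could be stored as a record. The project has not published a schema, so every name here (`ToolStep`, `Trajectory`, the field names, and the `calculator` tool) is a hypothetical chosen for illustration only.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any


@dataclass
class ToolStep:
    """One pass through the four stages of a tool-use trajectory."""
    decision: str              # 1. why the model chose to use a tool
    tool_name: str             # 2. which tool was called...
    arguments: dict[str, Any]  #    ...and with which parameters
    result: Any                # 3. what the tool returned to the model
    follow_up: str             # 4. how the model continued with that result


@dataclass
class Trajectory:
    """A task plus the ordered tool steps taken to complete it."""
    task: str
    steps: list[ToolStep] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() converts nested dataclasses into plain dicts and lists.
        return json.dumps(asdict(self), sort_keys=True)


# Hypothetical example: a one-step trajectory using an imaginary calculator tool.
trajectory = Trajectory(
    task="What is 12 * 7?",
    steps=[
        ToolStep(
            decision="The question needs exact arithmetic, so call the calculator.",
            tool_name="calculator",
            arguments={"expression": "12 * 7"},
            result=84,
            follow_up="12 * 7 = 84.",
        )
    ],
)

record = trajectory.to_json()
print(record)
```

A multi-step workflow would simply append more `ToolStep` entries, including steps where `result` holds an error and `follow_up` records how the model recovered.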

Why They Matter

Models trained without tool-use data tend to struggle to:

  • Call APIs reliably
  • Use functions or plugins
  • Execute multi-step workflows
  • Recover from errors

Our Open Dataset Project

We are building what we hope will become the largest open dataset of tool-use trajectories. Our goals:

  • 10,000+ real-world tool interactions
  • Diverse domains (search, code, data, APIs)
  • Human-annotated for quality

How to Contribute

  1. Share your logs: Send us anonymized logs of your tool interactions
  2. Annotate: Help label the quality of tool use
  3. Test: Fine-tune your model on our dataset
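If you plan to share logs, one possible way to anonymize them first is sketched below. This is not the project's official pipeline: the `scrub` helper, the `SECRET_KEYS` list, and the sample log entry are all assumptions made for illustration, and a simple pattern match like this will not catch every kind of personal data.

```python
import re

# Simple email pattern; an assumption, not an exhaustive PII detector.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical set of argument names whose values are treated as secrets.
SECRET_KEYS = {"api_key", "token", "password", "authorization"}


def scrub(value, key=None):
    """Return a copy of a log value with secrets and email addresses redacted."""
    if isinstance(key, str) and key.lower() in SECRET_KEYS:
        return "[REDACTED]"
    if isinstance(value, dict):
        return {k: scrub(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [scrub(v) for v in value]
    if isinstance(value, str):
        return EMAIL_RE.sub("[EMAIL]", value)
    return value  # numbers, booleans, and None pass through unchanged


# Hypothetical raw log entry for an imaginary send_email tool.
raw_entry = {
    "tool_name": "send_email",
    "arguments": {"to": "alice@example.com", "api_key": "sk-12345", "body": "Hi"},
    "result": {"status": "sent", "recipients": ["alice@example.com"]},
}

clean_entry = scrub(raw_entry)
print(clean_entry)
```

The function builds a new structure rather than editing the log in place, so the original entry stays available for local debugging. Review the scrubbed output by hand before sharing it.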

This is a community effort. The more data we have, the better AI agents become for everyone.

Join us.
