Building Datasets for Agentic AI: A Call for Contributors
Agentic AI is transforming how we interact with LLMs. Unlike traditional prompting, where the model only returns text, agentic systems let a model take actions, use tools, and execute multi-step workflows. But there is a critical gap: most consumer LLMs handle tools and execute actions far less reliably than foundation models.
We are working to change that—and we need your help.
The Problem
Consumer LLMs (like those running on consumer hardware or accessible via API) often fall short at:
- Using tools effectively (APIs, functions, plugins)
- Executing complex multi-step agentic workflows
- Maintaining context across long conversations
- Reasoning about when to invoke external actions
Foundation models like GPT-4 and Claude have reportedly been fine-tuned on large datasets of tool-use interactions. Those datasets are not publicly available, so consumer models have not had the same opportunity to learn.
Our Mission
We are building open datasets specifically focused on agentic AI tool handling—recording how LLMs interact with tools, execute actions, and handle real-world workflows. The goal: fine-tune consumer models to perform at the level of foundation models.
We believe open collaboration accelerates progress. That is why we are making this a community-driven effort.
How You Can Contribute
We are looking for contributors in several areas:
1. Data Collection
Help us gather real-world examples of tool usage, function calls, and agentic workflows. This includes:
- API interaction logs
- Function call sequences
- Tool execution results
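To make the three kinds of data above concrete, here is a minimal sketch of what one collected record might look like. The project has not published a schema, so every name here (`ToolCall`, `InteractionRecord`, the field names, the `get_weather` tool) is a hypothetical placeholder, and JSON Lines is only an assumed storage format.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Any, Dict, List, Optional


@dataclass
class ToolCall:
    """One function call issued by the model (hypothetical schema)."""
    name: str                        # tool or function name
    arguments: Dict[str, Any]        # arguments the model supplied
    result: Any = None               # what the tool returned, if it ran
    error: Optional[str] = None      # error message when the call failed


@dataclass
class InteractionRecord:
    """One agentic episode: the request plus the ordered call sequence."""
    user_request: str
    calls: List[ToolCall] = field(default_factory=list)
    final_answer: str = ""

    def to_jsonl(self) -> str:
        # One JSON object per line keeps large logs easy to append and stream.
        return json.dumps(asdict(self), ensure_ascii=False)


# Illustrative record; the tool name and values are made up for the example.
record = InteractionRecord(
    user_request="What is the weather in Paris?",
    calls=[
        ToolCall(
            name="get_weather",
            arguments={"city": "Paris"},
            result={"temp_c": 18},
        )
    ],
    final_answer="It is 18 degrees Celsius in Paris.",
)
line = record.to_jsonl()
print(line)
```

Keeping the call sequence ordered and storing errors next to results means a single record can cover an API log, a function call sequence, and the execution results at once.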
2. Data Annotation
Help label and categorize tool-use interactions. You will identify:
- Successful vs. failed tool calls
- Appropriate tool selection
- Context preservation quality
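As a rough illustration of the labels listed above, an annotation could be stored as a small typed record and summarized across a batch. This is only a sketch: the `Annotation` fields, the 1 to 5 context scale, and the `summarize` helper are assumptions made for the example, not the project's actual labeling scheme.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class CallOutcome(Enum):
    SUCCESS = "success"
    FAILURE = "failure"


@dataclass(frozen=True)
class Annotation:
    """Human label for one tool call inside an episode (hypothetical schema)."""
    record_id: str           # which episode the call belongs to
    call_index: int          # position of the call within the episode
    outcome: CallOutcome     # successful vs. failed tool call
    tool_appropriate: bool   # was this the right tool for the request?
    context_score: int       # assumed scale: 1 (context lost) to 5 (preserved)

    def __post_init__(self) -> None:
        if not 1 <= self.context_score <= 5:
            raise ValueError("context_score must be between 1 and 5")


def summarize(annotations: List[Annotation]) -> Dict[str, float]:
    """Aggregate a batch of labels into simple rates."""
    total = len(annotations)
    if total == 0:
        raise ValueError("no annotations to summarize")
    successes = sum(a.outcome is CallOutcome.SUCCESS for a in annotations)
    appropriate = sum(a.tool_appropriate for a in annotations)
    return {
        "success_rate": successes / total,
        "appropriate_rate": appropriate / total,
        "mean_context_score": sum(a.context_score for a in annotations) / total,
    }


labels = [
    Annotation("episode-001", 0, CallOutcome.SUCCESS, True, 5),
    Annotation("episode-001", 1, CallOutcome.FAILURE, False, 2),
]
summary = summarize(labels)
print(summary)
```

Validating the score when the label is created catches out-of-range values at annotation time rather than during later analysis.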
3. Model Testing
Test your models against our datasets and share results. Help us understand what works and what does not.
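One simple way to report results is exact-match accuracy over tool-call sequences. The sketch below assumes each call is a dictionary with `name` and `arguments` keys and that a reference sequence exists for every episode. Both are assumptions for illustration, and exact match is only one possible metric, not one the project has specified.

```python
from typing import Any, Dict, List

Call = Dict[str, Any]


def call_matches(predicted: Call, expected: Call) -> bool:
    """A call matches when both the tool name and the arguments are identical."""
    return (
        predicted["name"] == expected["name"]
        and predicted["arguments"] == expected["arguments"]
    )


def tool_call_accuracy(
    predictions: List[List[Call]], references: List[List[Call]]
) -> float:
    """Fraction of episodes whose predicted call sequence equals the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("nothing to evaluate")
    hits = 0
    for pred_seq, ref_seq in zip(predictions, references):
        same_length = len(pred_seq) == len(ref_seq)
        if same_length and all(
            call_matches(p, r) for p, r in zip(pred_seq, ref_seq)
        ):
            hits += 1
    return hits / len(references)


# Made-up episodes: the first prediction is correct, the second picks the wrong tool.
references = [
    [{"name": "get_weather", "arguments": {"city": "Paris"}}],
    [{"name": "search_flights", "arguments": {"origin": "SFO", "dest": "JFK"}}],
]
predictions = [
    [{"name": "get_weather", "arguments": {"city": "Paris"}}],
    [{"name": "get_weather", "arguments": {"city": "New York"}}],
]
accuracy = tool_call_accuracy(predictions, references)
print(f"exact-match accuracy: {accuracy:.2f}")
```

Exact match is strict: it penalizes harmless differences in argument order or wording, so shared results are easier to compare when they state which metric was used.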
4. Feedback & Iteration
Provide feedback on dataset quality, suggest improvements, and help us prioritize future data collection.
Join Us
This is an open project. Whether you are a researcher, developer, or just interested in advancing agentic AI, there is a place for you.
Get involved:
- GitHub: [Link coming soon]
- Discord: [Join our community]
- Email: contributors@onn.ai
We are building the future of agentic AI together. Let us make consumer LLMs as capable as foundation models—through open data, open science, and community collaboration.
If you are working on agentic AI systems and want to contribute your tool-use logs or need access to our datasets, reach out. Together, we can close the gap.