Hey DEV community! 👋
Like many of you, I've spent countless hours writing unit tests. It's one of the most critical parts of building reliable software, but it can also be a real grind. As I've been diving deeper into the world of AI Agents, I thought: what if I could automate this?
So, I started a tiny project to build my own AI agent to handle it. This is my journey of learning in public, and I wanted to share the first version with you all.
What I've Built So Far: The "Dev Engineer" Agent
The first phase is a simple but functional "Dev Engineer" agent. The concept is straightforward:
- You give it a Python source file.
- It gives you back a test_.py file with unit tests, ready to run with pytest.
Under the hood, it's a Python script that uses LangChain to manage the logic and an OpenAI LLM to generate the tests. It's a simple but powerful starting point.
The Big Picture: An Autonomous Testing Team
This is just the beginning. The ultimate goal isn't just to generate tests, but to create a collaborative team of AI agents that can ensure code quality autonomously. The vision is to build a "QA Engineer" agent that will work alongside the "Dev Engineer" in a feedback loop:
1. The Dev Agent writes the tests.
2. The QA Agent runs them, checks for failures, and analyzes code coverage.
3. If anything is wrong, the QA Agent sends feedback to the Dev Agent.
4. The Dev Agent corrects the tests and sends them back.
...and so on, until we have a robust and passing test suite.
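In code, that loop might look something like this skeleton. The agent functions are placeholders for the future Dev and QA agents, and `max_rounds` is an assumed safety cap so the loop can't run forever:

```python
def run_feedback_loop(source_code: str,
                      write_tests,    # Dev Agent: (source, feedback) -> test code
                      review_tests,   # QA Agent: tests -> (passed, feedback)
                      max_rounds: int = 5) -> str:
    """Iterate Dev/QA until the tests pass or we give up."""
    feedback = ""
    tests = write_tests(source_code, feedback)
    for _ in range(max_rounds):
        passed, feedback = review_tests(tests)
        if passed:
            return tests
        # QA found a problem: hand its feedback back to the Dev Agent.
        tests = write_tests(source_code, feedback)
    raise RuntimeError(f"No passing test suite within {max_rounds} rounds")
```

The nice property of this shape is that each agent stays a pure function of its inputs, which keeps the orchestration layer (CrewAI, later) swappable.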
Let's Build This Together! (Call for Collaboration)
This project is my personal learning playground, but I believe it has the potential to become a genuinely useful tool for the community. That's where you come in.
I'm building this completely open source, and I would love for you to get involved. Whether you're an AI expert or just a curious developer, there are plenty of ways to contribute:
Check out the code and give feedback.
Suggest new features or improvements.
Tackle an open issue or a task from the roadmap below.
I've laid out a clear plan for where the project is headed. Take a look and see if anything sparks your interest!
🚀 Project Roadmap
This project is under active development. Below is a summary of my progress and a look at what's ahead. Contributions are highly encouraged!
✅ Phase 1: Core Test Generation Engine (MVP)
[x] Develop "Dev Engineer" Agent: A core agent capable of generating unit tests from a single Python source file.
[x] LLM Integration: Connect the agent to a foundational LLM (e.g., GPT-4o, Llama 3) to power code generation.
[x] Basic CLI: A simple command-line interface to input a file and receive the generated test file.
🎯 Phase 2: Multi-Agent Collaboration & Feedback Loop
[ ] Introduce "QA Engineer" Agent: Develop a second agent responsible for reviewing, validating, and executing the generated tests.
[ ] Implement Test Execution Tool: Create a secure tool for the QA Agent to programmatically run pytest, capture results, and parse code coverage reports.
[ ] Establish Collaborative Framework (CrewAI): Refactor the agent logic into a Crew to manage the feedback loop, allowing the Dev Agent to fix tests based on the QA Agent's feedback until a target coverage is achieved.
🏗️ Phase 3: API-First Architecture & State Management
[ ] Expose via API: Wrap the agent crew in a FastAPI application to make it accessible as a service.
[ ] Job State Management: Integrate Redis or a database to manage the state of long-running jobs, allowing for asynchronous operation.
[ ] Containerization: Create a Dockerfile and docker-compose.yml to ensure a consistent and reproducible environment for the entire application stack.
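The job-state idea in Phase 3 boils down to a small state machine per job. Here is a sketch with an in-memory dict standing in for Redis; the statuses and record fields are illustrative:

```python
import uuid
from enum import Enum
from typing import Optional

class JobStatus(str, Enum):
    PENDING = "pending"
    RUNNING = "running"
    DONE = "done"
    FAILED = "failed"

# In production this would live in Redis (e.g. one hash per job);
# a dict is enough to show the state transitions.
_jobs = {}

def create_job(source_path: str) -> str:
    """Register a new test-generation job and return its id."""
    job_id = uuid.uuid4().hex
    _jobs[job_id] = {"source": source_path,
                     "status": JobStatus.PENDING,
                     "result": None}
    return job_id

def set_status(job_id: str, status: JobStatus,
               result: Optional[str] = None) -> None:
    _jobs[job_id]["status"] = status
    if result is not None:
        _jobs[job_id]["result"] = result

def get_job(job_id: str) -> dict:
    return _jobs[job_id]
```

A FastAPI endpoint would then just call `create_job`, kick off the crew in the background, and let clients poll `get_job` until the status is `DONE`.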
✨ Future Vision
[ ] LLMOps & Observability: Integrate with tools like LangSmith to trace, debug, and evaluate the performance of the agent interactions.
[ ] IDE Integration: Develop a VSCode extension for a seamless developer experience right within the editor.
[ ] Multi-Language Support: Expand capabilities beyond Python to include other languages like JavaScript/TypeScript and Go.
[ ] Automated Code Refactoring: Empower the Dev Agent to suggest fixes in the source code itself, not just the tests.
You can find the repository with all the code for Phase 1 here:
👉 https://github.com/herchila/unittest-ai-agent
What do you think? What other developer chores do you wish you could automate with AI? Let me know in the comments below!
See you!
Top comments (6)
This is brilliant. Automating unit test generation is already a win, but the feedback loop between Dev and QA agents? That’s next-level. I’ve worked with LangChain and FastAPI, and seeing them used like this makes me want to jump in. Also love the idea of expanding to refactoring and multi-language support, feels like the start of a real autonomous dev assistant. Checking out the repo now!
Hey Anik! Thanks so much for the encouraging words. I'm really glad the vision of the Dev/QA feedback loop resonated with you.
It's awesome that you have experience with LangChain and FastAPI, as they are definitely part of the roadmap! The best place to follow the progress and brainstorm ideas is on the GitHub repo itself.
Hope to see you around there!
This is exactly the kind of project I’ve been looking to contribute to. I have experience in Python, and AI/ML, so I can jump in quickly. Let me know how I can help — I’d love to collaborate.
Hi Ahmed! Thanks so much for your comment!
Right now I'm finishing the CLI implementation that generates the UTs with a single command (e.g. ut generate path/to/file.py). Also, I published my roadmap here so everybody can see and follow it -> focusmap.pro/roadmap/45a1b599-aead...
Regarding collaboration, this week I'll start publishing some issues on GitHub so anyone who wants to contribute can fork the repo and open a PR.
Let's keep in touch!
Very interesting, do you plan to use any database or will you not need it?