Daniel Varnai

Posted on Jun 10 • Originally published at qvai.hu

AI Agents Inside GitLab: What We Learned Automating the Ticket Lifecycle

#agents #ai #automation #softwaredevelopment

A real-world case study on using AI agents for ticket grooming, implementation support, and merge request review without removing human control

Most companies do not need another chatbot.

They need less friction in the work that already happens every day.

For software teams, that friction often hides in places that look normal from the outside: unclear tickets, missing acceptance criteria, implementation plans that live only in someone’s head, repeated review comments, and documentation that slowly drifts away from reality.

At QvAI, we worked on a GitLab-based AI agent system, described here in anonymized form, that addressed exactly this kind of friction.

The original case study was published in Hungarian on the QvAI blog. This article is an English adaptation focused on the broader lessons: what we built, why we used multiple agents, where the value appeared, and what other teams can learn from the pattern.

The goal was not to replace engineers.

The goal was to make the engineering workflow clearer, faster, and easier to control.

The real bottleneck was not coding

When people talk about AI in software development, they usually jump straight to code generation.

That makes sense. Code is visible. Code can be demoed. Code feels measurable.

But in many teams, the bigger loss happens before and after the code is written.

A ticket arrives with missing context. The business goal is vague. Acceptance criteria are unclear. The developer asks follow-up questions, waits for answers, makes assumptions, starts implementation, then later finds out during review that something important was never aligned.

Nobody did anything wrong. The process simply leaked context.

That was the real problem in this case.

The team had recurring friction around four areas:

Ticket grooming required too much manual reading and clarification.
Implementation plans were often not explicit enough.
Merge request reviews kept finding issues that could have been clarified earlier.
Ticket state and documentation did not always reflect the real work.

This made the process a good fit for AI agents.

Not because the AI could magically write perfect code, but because the workflow contained repeated reading, checking, summarizing, planning, and reviewing.

That is where AI agents can become genuinely useful.

Why we used multiple agents instead of one general assistant

A common mistake in enterprise AI projects is trying to build one assistant that does everything.

It reads tickets. It writes code. It reviews code. It updates documentation. It answers questions. It becomes a vague digital coworker that is impressive in a demo but hard to control in production.

We took a different approach.

The system used multiple specialized agents, each with a narrow role.

That mattered because each agent could be evaluated against a specific job. The grooming agent did not need to behave like a code reviewer. The review agent did not need to act like a product manager. The implementation agent did not need to decide business priority.

Each agent had a clear place in the workflow.

The grooming agent

The grooming agent read GitLab tickets and looked for missing information.

It identified unclear requirements, suggested clarification questions, and helped formulate acceptance criteria.

This sounds basic, but it changes the rhythm of the team.

Instead of a developer discovering ambiguity halfway through implementation, the team gets a structured pass over the ticket earlier. The agent does not need to solve the whole task. It just needs to make the task clearer before expensive engineering time is spent on it.

This is also a useful reminder for AI projects in general.

The best first agent is often not the most advanced one. It is the one that improves the input quality of the rest of the process.
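A grooming pass of this kind can be sketched in a few lines. This is a minimal illustration, not the system described here: the `Ticket` fields, the `groom` function, and the word-count threshold are all hypothetical, and the rule-based checks stand in for what would be model-driven analysis in a real agent. What it shows is the shape of the output: a list of clarification questions rather than a solution.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    """Hypothetical, simplified ticket; not GitLab's issue schema."""
    title: str
    description: str = ""
    acceptance_criteria: list[str] = field(default_factory=list)
    business_goal: str = ""


@dataclass
class GroomingReport:
    """Structured result of one grooming pass."""
    questions: list[str] = field(default_factory=list)

    @property
    def ready(self) -> bool:
        # A ticket counts as ready only when no open questions remain.
        return not self.questions


def groom(ticket: Ticket, min_description_words: int = 15) -> GroomingReport:
    """Return clarification questions for the gaps found in a ticket."""
    report = GroomingReport()
    if not ticket.business_goal.strip():
        report.questions.append("What business goal does this ticket serve?")
    if not ticket.acceptance_criteria:
        report.questions.append("Which acceptance criteria define 'done'?")
    if len(ticket.description.split()) < min_description_words:
        report.questions.append("Can the description include more context?")
    return report
```

Calling `groom(Ticket(title="Export fails"))` returns a report with three open questions and `ready` set to `False`, which is the signal to send the ticket back for clarification before anyone starts implementing it.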

The implementation agent

Once the ticket was clearer, the implementation agent prepared a technical plan and, in suitable cases, made code changes.

It worked from the ticket, repository context, comments, and shared development guidelines. It could also iterate based on feedback from the review agent or from human reviewers.

The point was not to remove developers from the loop.

The point was to give developers a better starting point.

For simpler tasks, the merge request produced by the agent was often close enough to final that a human could review and merge it. For more complex work, the agent still reduced the blank-page problem. It prepared the shape of the solution, surfaced relevant files, and made the first technical assumptions explicit.

That is valuable even when the final implementation still needs human judgment.

AI does not have to finish the work to save time. Sometimes it creates value by getting the work into a state where a skilled person can finish it faster.

The review agent

The review agent checked the merge request against the ticket goal, the repository context, and the team’s development rules.

It looked for missing pieces, inconsistencies, risky changes, and places where the implementation did not match the intent of the ticket.

This agent was especially useful before merge.

It caught issues that were easy to miss during normal review cycles. More importantly, it made feedback available earlier. The implementation agent could refine the work before a human reviewer spent serious time on it.

The human reviewer still owned the final decision.

The agent simply made the review process sharper.

The most important design choice: AI suggests, humans decide

The safest AI systems are not always the least capable ones.

They are the ones with the clearest boundaries.

In this GitLab workflow, the agents were not given unlimited freedom. They worked within predefined permissions, controlled actions, and auditable steps.

That mattered for three reasons.

First, source code, tickets, and internal documentation are sensitive company data. The system had to use only the context it needed.

Second, engineering decisions carry risk. The system could analyze, prepare, suggest, and review, but final decisions around ticket state, merge, and priority stayed with engineers and responsible leaders.

Third, teams need consistency. If every developer uses a different AI assistant in a different way, the organization gets scattered behavior. In this setup, the agents followed shared guidelines, review criteria, and quality expectations.

That is the difference between giving everyone a chatbot and building a controlled workflow.

This is also how we usually think about AI agents for business workflows at QvAI: narrow responsibilities, useful context, measurable goals, controlled permissions, and human approval where it matters.
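One way to express "AI suggests, humans decide" in code is an allow-list per agent plus a set of actions no agent may ever take. The sketch below is an assumption about how such a boundary could look, not the permission model of the system described here: the action names, `AGENT_PERMISSIONS`, `HUMAN_ONLY_ACTIONS`, and `authorize` are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical action names; each agent gets only what its role needs.
AGENT_PERMISSIONS = {
    "grooming": {"read_ticket", "comment_on_ticket"},
    "implementation": {"read_ticket", "read_repository", "open_merge_request"},
    "review": {"read_ticket", "read_repository", "comment_on_merge_request"},
}

# Decisions that stay with people regardless of agent role.
HUMAN_ONLY_ACTIONS = {"merge", "close_ticket", "set_priority"}


@dataclass
class AuditLog:
    """Keeps every authorization attempt, allowed or not."""
    entries: list[dict] = field(default_factory=list)

    def record(self, agent: str, action: str, allowed: bool) -> None:
        self.entries.append({
            "at": datetime.now(timezone.utc).isoformat(),
            "agent": agent,
            "action": action,
            "allowed": allowed,
        })


def authorize(agent: str, action: str, log: AuditLog) -> bool:
    """Allow an action only if it is on the agent's allow-list and not human-only."""
    allowed = (
        action not in HUMAN_ONLY_ACTIONS
        and action in AGENT_PERMISSIONS.get(agent, set())
    )
    log.record(agent, action, allowed)
    return allowed
```

Denied attempts are logged as well as allowed ones, so the audit trail shows what an agent tried to do, not only what it did. Unknown agents get an empty allow-list by default.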

What changed for the developers

The biggest change was not that developers stopped coding.

The biggest change was that they dealt with fewer poorly prepared tasks.

Tickets more often had clear acceptance criteria. Developers could see the relevant technical area earlier. Risks were easier to spot before implementation went too far. Review comments became more focused on actual engineering judgment instead of basic missing context.

The implementation and review agents also created a useful feedback loop.

The implementation agent proposed a solution. The review agent checked it against the ticket and the shared guidelines. If something was missing, the implementation agent could refine the work before a human reviewer had to step in.

That loop did not make the system perfect.

It made the starting point better.

And in real engineering teams, a better starting point is often worth a lot.
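The loop between the implementation and review agents can be sketched as a bounded refinement cycle. This is an illustration under assumptions: `refine`, the two callable types, and the `max_rounds` limit are hypothetical names, and the callables stand in for model-backed agents. The round limit matters because the loop should hand over to a person rather than iterate indefinitely.

```python
from typing import Callable

# Both callables are hypothetical stand-ins for model-backed agents.
Implement = Callable[[str, list[str]], str]  # (ticket, feedback) -> proposed change
Review = Callable[[str, str], list[str]]     # (ticket, change) -> list of findings


def refine(
    ticket: str,
    implement: Implement,
    review: Review,
    max_rounds: int = 3,
) -> tuple[str, list[str]]:
    """Alternate implementation and review until no findings remain
    or the round limit is reached.

    Returns the latest proposal and any unresolved findings, so the
    human reviewer sees both.
    """
    feedback: list[str] = []
    proposal = ""
    for _ in range(max_rounds):
        proposal = implement(ticket, feedback)
        feedback = review(ticket, proposal)
        if not feedback:
            break
    return proposal, feedback
```

Whatever the loop returns still goes to a human reviewer. Unresolved findings are passed along instead of being dropped, which keeps the agent's remaining doubts visible.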

A side effect: better tickets from humans

One of the most interesting results was not technical.

The grooming agent improved how people wrote tickets over time.

When the agent kept surfacing missing context, vague acceptance criteria, or unclear business goals, it created a feedback loop for the people writing the tickets. They could see what was missing before the issue reached implementation.

That is an underrated benefit of AI in workflows.

A good agent does not only process work. It shows the organization where its process is weak.

In this case, better grooming led to better tickets. Better tickets led to smoother implementation. Smoother implementation led to more focused reviews.

The agent was useful as a worker, but also as a mirror.

Why this pattern is not limited to GitLab

GitLab was the environment in this case, but the pattern is much broader.

The same structure can apply to many business processes:

A request enters the system.

The agent reads the available context.

It checks the request against internal rules.

It asks for missing information or prepares the next step.

Another agent or a human reviews the output.

A person approves the final decision where risk requires it.

In software development, the objects are tickets, merge requests, repository files, and review comments.

In finance, they might be invoices, purchase orders, budget rules, and approval workflows.

In sales, they might be leads, CRM records, proposals, and follow-up tasks.

In HR, they might be applications, internal policies, interview notes, and onboarding checklists.

In operations, they might be support tickets, vendor emails, reports, and exception handling.
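The six steps above can be sketched as one small function that does not care which domain it runs in. Everything here is hypothetical and simplified for illustration: the `Request` and `Outcome` types, the status strings, and the `process` function are assumptions, and the rules and reviewer are plain callables standing in for agents or people.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Request:
    kind: str                 # e.g. "ticket", "invoice", "lead"
    payload: dict
    high_risk: bool = False


@dataclass
class Outcome:
    status: str               # "needs_info", "awaiting_approval", or "prepared"
    notes: list[str] = field(default_factory=list)


# A rule returns a description of the problem, or None if the request passes.
Rule = Callable[[dict], Optional[str]]
Reviewer = Callable[[dict], list[str]]


def process(request: Request, rules: list[Rule], reviewer: Reviewer) -> Outcome:
    """Check a request against rules, review it, and route it for approval."""
    # Read the context and check it against internal rules.
    problems = [msg for rule in rules if (msg := rule(request.payload)) is not None]
    if problems:
        # Ask for missing information instead of guessing.
        return Outcome("needs_info", problems)

    # Another agent or a human reviews the prepared work.
    review_notes = reviewer(request.payload)

    # A person approves wherever risk or open review notes require it.
    if request.high_risk or review_notes:
        return Outcome("awaiting_approval", review_notes)
    return Outcome("prepared", review_notes)
```

Only the rules and the payload change between a ticket, an invoice, and a lead. The control flow, including the point where a person has to approve, stays the same.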

That is why AI automation should usually start with a real process, not a model demo.

The question is not “Can we add AI here?”

The better question is “Where does the same kind of reading, checking, summarizing, or preparation happen again and again?”

That is where AI automation for business processes becomes practical.

What made the system work

Looking back, the system worked because of a few practical choices.

The agents had narrow roles. Grooming, implementation, and review were separate responsibilities.

They worked inside the existing workflow. The team did not have to move all work into a separate chat interface.

They used real context. Tickets, comments, merge requests, repository data, and development guidelines all mattered.

They had controlled permissions. The system was designed around auditability and human decision points.

They produced useful intermediate work. Even when the AI did not finish the task, it often prepared enough of the work to make the human step faster and clearer.

That last point is important.

AI projects are often judged as if the only successful outcome is full automation. In real teams, partial automation can still be highly valuable when it removes the right friction.

A ticket that is 80 percent prepared is not finished work.

But it is much better than a vague ticket that forces a developer to reconstruct the intent from scratch.

The lesson for teams considering AI agents

If you are considering AI agents in your company, do not start by asking for an autonomous system.

Start by finding a recurring workflow where people already spend time reading, checking, summarizing, preparing, or moving information between systems.

Then ask:

Where does work get stuck because context is missing?

Where do people ask the same clarification questions again and again?

Where do reviewers catch the same types of issues?

Which decisions must stay with humans?

Which steps could be safely prepared by an agent?

This framing leads to better AI systems than “let’s build a bot.”

For custom workflows, especially where internal data, permissions, and business rules matter, a generic SaaS tool is often not enough. That is where custom AI development can make sense.

Not because every company needs something complex.

Because the AI has to fit the process, not the other way around.

Final thought

The best AI agent projects are not always the most dramatic.

Sometimes the best agent is the one that quietly improves a workflow the team already depends on.

It reads the ticket before the developer does.

It asks the obvious missing questions.

It prepares a plan.

It checks the merge request.

It gives the human reviewer a better starting point.

That may not look as flashy as a chatbot demo.

But in a real company, it is often much more useful.

If you are exploring a similar workflow around GitLab, GitHub, internal operations, CRM, helpdesk, or document-heavy processes, the original GitLab AI agents case study (in Hungarian, on the QvAI blog) is a good place to start.

You can also read more about how QvAI approaches controlled AI agents for business workflows.
