By Eli Daniel, Head of Engineering, Jellyfish
We all know that AI coding tools are generating a lot of attention and hype, and that their emergence is changing the software development landscape remarkably quickly. We can probably all trade stories of cases where they’ve been transformative, and others where the results didn’t live up to the promise. But stories and anecdotes are one thing; data is another.
Last week Jellyfish published new data around how AI coding tools are being used and the impact they’re having across engineering teams. Here’s what we found across more than 21,000 engineers and 2 million PR reviews:
- AI use has exploded. 51% of PRs in May 2025 used AI, compared to just 14% in June 2024.
- Cycle times for PRs using AI are faster than for those without: AI-assisted PRs were 16% faster in Q2 2025.
- Both faster coding and faster reviews contribute to this speedup.
- What isn’t changing? Quality. We see no meaningful correlation between an organization's level of AI adoption and the number of bugs introduced.
These numbers are notable for being both higher and lower than I might have guessed. They aren’t small: they show real, meaningful shifts. But they’re also well short of some of the 10x stories we hear.
To understand why that might be, we can turn from pure data analysis and statistics to some insights from Jellyfish’s own AI journey.
Drinking Our Own Champagne
In parallel with our external research, our engineering team began its own internal experiments with AI coding tools including Copilot, Cursor, Gemini Code Assist, and others. We also experimented with PR review bots like Greptile and Cursor BugBot, as well as agents like Devin.ai.
So far, we’ve found that in any given two-week period, a little over half of our engineers routinely use AI coding assistants. And this group tends to do more, and do it more quickly. Across this group, we observed:
- 55% decrease in PR cycle time
- 66% increase in Jira issues resolved
AI: Great at Some Things, Less Great at Others
With results like these, you might assume that AI is now mandated across our R&D org. But here’s why it’s not: the causality seems to run in the other direction. Our engineers are generally eager to try anything that will improve their workflows, but today’s tools are not equally effective across all kinds of work.
Work that is well defined, with a clear understanding of what success means? Great! This could mean app interactions (“add a widget to display a metrics graph over here in the same format we usually use”) or even performance optimization (“make this database query go faster”).
But there are lots of engineering activities where AI just isn’t as good. For instance:
- Architectural decisions
- Subtle systems behavior
- Things that don’t have good automated test coverage
One funny illustration of the limits of today’s tools came while using our recently released MCP server, which gives chatbots like Anthropic’s Claude access to Jellyfish data. Playing around with it, one person prompted:
“Ask Jellyfish what Allison worked on last week.”
Claude cheerfully explained that it would query the Jellyfish API, then proceeded to give a completely wrong answer. When challenged, the LLM admitted: “I didn’t actually retrieve any data from Jellyfish. My answer was completely fabricated.”
It’s a funny moment, but also a powerful reminder: these tools sound just as confident when they’re completely wrong. In this case, we quickly learned to build some extra checks into our prompts when working with MCP.
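To make “extra checks” concrete, here’s a minimal sketch of the idea in Python. It’s an illustration under assumptions, not our actual implementation: the guard wording, the `DATA SOURCE:` convention, and the `ask_model` stub are hypothetical stand-ins, not part of the Jellyfish MCP server.

```python
# Toy sketch of prompt-level guardrails for MCP-backed questions.
# The guard wording, DATA SOURCE: convention, and ask_model() are
# illustrative assumptions, not part of the Jellyfish MCP server.

GUARD_INSTRUCTIONS = """\
Before answering, you MUST retrieve data through the Jellyfish MCP tools.
If no tool call succeeded, reply with exactly: NO DATA RETRIEVED.
Otherwise, end your answer with a line starting with 'DATA SOURCE:' naming
the tool and query you used."""


def guarded_prompt(question: str) -> str:
    """Wrap a question so the model must either ground its answer in
    retrieved data or explicitly admit it has none."""
    return f"{GUARD_INSTRUCTIONS}\n\nQuestion: {question}"


def looks_grounded(answer: str) -> bool:
    """Cheap post-check: trust the answer only if it cites a data source
    and didn't admit to coming up empty."""
    if "NO DATA RETRIEVED" in answer:
        return False
    return any(line.startswith("DATA SOURCE:") for line in answer.splitlines())


def ask_model(prompt: str) -> str:
    """Hypothetical stand-in for your real chat client (Claude Desktop,
    an SDK call, etc.). Returns a canned response so the sketch runs."""
    return "NO DATA RETRIEVED"


if __name__ == "__main__":
    answer = ask_model(guarded_prompt("What did Allison work on last week?"))
    if looks_grounded(answer):
        print(answer)
    else:
        print("Answer isn't backed by retrieved data; treat it as unreliable.")
```

The post-check is deliberately crude; the point is simply to make a fabricated answer detectable rather than charming.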
A Net Positive, But Not a Magic Wand
So what’s the outcome of our AI experiments? We learned a few key lessons:
- Today’s tools are powerful, but imperfect
- What’s possible is changing quickly
- Ongoing experimentation is a must
As engineering leaders, we need to keep our team members excited about AI’s possibilities. We should encourage our teams to experiment with these tools where they can comfortably harness the improvements AI delivers today.
But we shouldn’t burn people out trying to force things that don’t yet work. That’s not to say they never will. AI is moving at breakneck speed, and things are changing daily. What worked yesterday might not work tomorrow, and what failed last week might come back 10x better next month. This is a moving target, and experimentation and humility are key to getting the most out of AI.
Jellyfish AI Impact gives engineering leaders unparalleled insight into how their teams interact with tools including GitHub Copilot, Cursor, Gemini Code Assist, Sourcegraph and more. Request a demo here.