I use AI coding agents every day. I believe they are reshaping how we build software, and I think the teams that adopt them deliberately will outperform those that don't.
I am not writing this to warn you away from AI-assisted development.
I am writing this because the loudest voices in the AI enthusiasm camp are also the most allergic to discussing what can go wrong. And that worries me more than the risks themselves.
The productivity gains are real
Let's start with what is undeniable.
By 2024, LangChain's State of AI Agents report already showed 51% of surveyed organizations running agents in production. By 2026, that number has only grown. The global AI agent market is projected to expand from $7.8 billion to over $50 billion by 2030.
This is not a hype cycle anymore. This is infrastructure.
The case studies are equally striking.
Rakuten engineers used a CLI-based agent to implement a complex activation vector extraction method within vLLM, a codebase of roughly 12.5 million lines. A task that would have taken weeks of onboarding and implementation was completed in seven hours with 99.9% numerical accuracy.
TELUS reported shipping code 30% faster with agents, saving over 500,000 hours across the organization.
These are not toy demos. This is production-grade acceleration at enterprise scale.
I find this genuinely exciting. And none of it changes what I am about to say next.
The risks are equally real
Lars Faye's "Agentic Coding is a Trap" struck a nerve because it named something many of us were feeling but not saying out loud. The core argument: the skills you need to supervise AI agents are the exact skills that atrophy when you over-rely on them.
The trade-offs that need honest discussion are already quantifiable:
- Skill atrophy at scale. The debugging and reasoning abilities required to supervise agents degrade measurably when you stop exercising them.
- System complexity to compensate for non-determinism. AI outputs are probabilistic. The guardrails, review layers, and validation infrastructure required to make them production-safe add real engineering overhead.
- Vendor dependency for individuals and entire teams. Claude Code outages have already left teams at a standstill. When your workflow depends on a third-party model, their downtime becomes yours.
- Unpredictable and rising costs. An employee's cost is fixed. Token pricing is a constantly moving target, dictated unilaterally by providers who can "nerf" a model and force you to burn two to three times more tokens for the same result.
- A widening security attack surface. Autonomous agents with broad permissions introduce threat categories that traditional security controls were never designed to handle.
- Regulatory exposure most teams are not preparing for. The EU AI Act's high-risk obligations take effect in August 2026, and many agentic workflows are closer to the compliance line than their operators realize.
These are not hypothetical concerns. Let me expand on the ones that matter most.
Cognitive debt
Lars Faye calls this the "paradox of supervision." Anthropic's own research on how AI assistance impacts coding skill formation backs it up: in a controlled study, developers using AI scored 50% on average versus 67% for those coding manually, with the largest gap appearing specifically in debugging questions.
Senior developers with decades of experience report being unable to explain systems they technically "built" with agents. I have written before about the gap between perceived velocity and actual throughput. The pattern here is the same: the metric that looks good on the dashboard is hiding a cost that only surfaces later.
The cognitive friction of writing code, hitting errors, reading documentation, and resolving conflicts manually is not wasted effort. It is the mechanism through which engineers actually understand what they are building.
As I argued in "From Attention Economy to Thinking Economy", the challenge is not whether AI eliminates jobs. It is whether we protect the cognitive abilities that make us valuable in the first place.
Security surface expansion
Autonomous agents translate a single instruction into long chains of API calls, database queries, and data manipulations. If an adversary compromises an agent's input, the blast radius is far larger than that of a traditional exploit.
Research from 2026 shows an 88% success rate in bypassing guardrails on open-source models using automated probing techniques. Indirect prompt injection, where malicious instructions hide in external content the agent reads, requires far fewer attempts than direct attacks.
Dependency poisoning can inject zero-day vulnerabilities straight into your CI/CD pipeline. A CVSS 10.0 remote code execution vulnerability discovered in Google's Gemini CLI in early 2026, exploitable specifically in CI/CD pipeline environments, made this supply-chain risk impossible to ignore.
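None of this argues that agents cannot be contained, but containment has to be explicit rather than assumed. One common-sense mitigation is to route every agent-proposed action through a permission gate instead of handing the agent broad credentials. The sketch below is purely illustrative; `AgentAction`, `run_action`, and the action names are hypothetical placeholders, not the API of any particular agent framework.

```python
# Minimal sketch of a tool-call allowlist for an agent runtime.
# The names (AgentAction, run_action) are hypothetical; the point is that
# every action an agent proposes passes through an explicit permission gate
# instead of executing with the agent's full credentials.

from dataclasses import dataclass

# Explicit allowlist: anything not listed here is rejected by default.
ALLOWED_ACTIONS = {
    "read_file",
    "run_tests",
    "open_pull_request",
}

# Actions that always require a human in the loop, even when allowlisted.
REQUIRES_APPROVAL = {"open_pull_request"}


@dataclass
class AgentAction:
    name: str        # e.g. "read_file"
    arguments: dict  # tool-specific parameters proposed by the agent


def run_action(action: AgentAction, human_approved: bool = False):
    """Execute an agent-proposed action only if policy allows it."""
    if action.name not in ALLOWED_ACTIONS:
        raise PermissionError(f"Action '{action.name}' is not allowlisted")
    if action.name in REQUIRES_APPROVAL and not human_approved:
        raise PermissionError(f"Action '{action.name}' needs human approval")
    # ... dispatch to the real tool implementation here ...
    return f"executed {action.name}"
```

The specifics differ per stack, but the principle is that the blast radius of a compromised prompt is bounded by the allowlist, not by everything your credentials can reach.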
Regulatory pressure
On August 2, 2026, the EU AI Act's high-risk obligations take effect. Under Annex III, AI systems used to allocate tasks based on individual behavior or to monitor and evaluate worker performance in employment contexts are classified as high-risk.
Coding agents do not automatically fall under this scope, but the line gets blurry fast when orchestrator systems start auto-assigning tickets, ranking PR quality, or feeding into performance reviews.
Article 14 requires that human supervisors understand the system's capabilities and limitations, remain aware of automation bias, correctly interpret outputs, and retain the ability to override them.
Organizations that let engineers rubber-stamp massive AI-generated pull requests without genuine comprehension are building a compliance liability, whether or not they realize it yet.
The real problem is not the risks. It is the denial.
Here is where I part ways with both camps.
The skeptics read all of this and conclude: stop using agents. Go back to writing everything by hand. Agentic coding is a trap, full stop.
The enthusiasts read all of this and shrug. They treat any discussion of downsides as FUD from people who "don't get it." They dismiss cognitive atrophy as a skill issue. They wave away security concerns as solvable later.
Both responses are wrong, but the second one is more dangerous.
In engineering, we do not ship without testing. We do not deploy without monitoring. We do not scale without load testing.
We never adopt a technology by pretending it has no failure modes. That is not engineering. That is wishful thinking.
The people who refuse to discuss the risks of AI-assisted development are not optimists. They are in denial.
And denial is how promising technologies get killed. Not by their limitations, but by the backlash that follows when those limitations are discovered too late by people who were told everything was fine.
I have seen this pattern play out across two decades in this industry. The technologies that survived had honest advocates. The ones that did not were oversold by people who confused enthusiasm with recklessness.
What honest adoption looks like
Anthropic's own data reveals what they call the "Delegation Paradox": engineers use AI in 60% of their workflows but can fully delegate only 0-20% of actual tasks.
This is not a failure of the tools. It is the reality that high-stakes architectural work resists probabilistic automation. Accept it and plan around it instead of fighting it.
That means building deliberate constraints into how you and your team use these tools.
Maintain your skills deliberately. Use agents where they genuinely accelerate: boilerplate, exploration, context retrieval, test scaffolding. The scaffolding use case remains the healthiest relationship most engineers can have with AI right now.
But regularly write core logic yourself. Run pair programming sessions where AI is off. During code reviews, trace the logic manually.
If you do not exercise the debugging and reasoning muscles, they atrophy within months. This is not a metaphor. It is what the data shows.
Respect the context limits. Agents suffer from measurable "context rot." A Databricks study found that model correctness drops significantly around the 32,000-token mark, well before the advertised context window is exhausted.
The "lost in the middle" phenomenon means agents routinely miss critical guidelines buried deep in large contexts. They confidently invent non-existent variables, mix incompatible framework versions, and hallucinate API calls because they lose track of information sitting in the middle of the prompt.
This is not a bug that will be fixed next quarter. It is a fundamental characteristic you need to design around.
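What designing around it can look like in practice: treat context as a hard budget rather than something to fill to the advertised window. The following is a minimal sketch under stated assumptions; `count_tokens` stands in for whatever tokenizer your model uses, and the 32,000-token budget simply echoes the figure cited above, not a universal threshold.

```python
# Minimal sketch: enforce a context budget instead of filling the window.
# count_tokens() is a placeholder for whatever tokenizer your model uses;
# the 32k budget mirrors the correctness drop-off cited above, not a rule.

CONTEXT_BUDGET_TOKENS = 32_000


def count_tokens(text: str) -> int:
    # Placeholder: swap in the real tokenizer for your model.
    return len(text.split())


def build_prompt(task: str, hard_constraints: str, reference_docs: list[str]) -> str:
    """Assemble a prompt that trims reference material first and keeps
    hard constraints right next to the task, where they are least likely
    to be lost in the middle of a long context."""
    core = f"{hard_constraints}\n\n{task}"
    budget = CONTEXT_BUDGET_TOKENS - count_tokens(core)

    included = []
    for doc in reference_docs:       # ordered most to least relevant
        cost = count_tokens(doc)
        if cost > budget:
            break                    # drop the rest rather than overflow
        included.append(doc)
        budget -= cost

    # References first, constraints and task last (recency helps recall).
    return "\n\n".join(included + [core])
```

Keeping the hard constraints and the task at the end is a hedge against the lost-in-the-middle effect; trimming reference material first keeps the parts an agent must not miss inside the budget.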
Never generate more code than you can review. If your agent produced a 10,000-line pull request overnight and your team approved it in 20 minutes, you did not ship faster. You shipped blindly.
The volume mismatch between machine generation speed and human comprehension speed is the single biggest enabler of the "LGTM" culture that is quietly degrading code quality across the industry.
Strict volume constraints are not a productivity bottleneck. They are what keeps your codebase deterministic instead of probabilistic.
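A volume constraint only works if it is enforced mechanically, not by good intentions. Here is a minimal sketch of a CI gate along those lines; the 400-changed-line cap and the `origin/main` base ref are assumptions you would tune to whatever your team can genuinely read line by line.

```python
# Minimal sketch of a CI gate that blocks unreviewably large diffs.
# The 400-changed-line threshold is an example, not a recommendation.

import subprocess
import sys

MAX_CHANGED_LINES = 400


def changed_lines(base_ref: str = "origin/main") -> int:
    """Count added + deleted lines in the current branch vs. base_ref."""
    out = subprocess.run(
        ["git", "diff", "--numstat", base_ref],
        capture_output=True, text=True, check=True,
    ).stdout
    total = 0
    for line in out.splitlines():
        added, deleted, _path = line.split("\t", 2)
        if added != "-":            # binary files report "-"
            total += int(added)
        if deleted != "-":
            total += int(deleted)
    return total


if __name__ == "__main__":
    n = changed_lines()
    if n > MAX_CHANGED_LINES:
        print(f"Diff touches {n} lines; cap is {MAX_CHANGED_LINES}. Split the PR.")
        sys.exit(1)
    print(f"Diff size OK ({n} lines changed).")
```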
Invest in specification as the primary artifact. When implementation is nearly free, the specification becomes the real engineering work.
Formal, machine-readable specs with explicit non-goals, hard constraints, and testable acceptance criteria prevent agents from filling ambiguity with hallucinated assumptions. Spec-driven development is not overhead. It is the structural response to a world where generating code is trivial and verifying it is expensive.
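To make that concrete, here is one deliberately small illustration of what a machine-readable spec can look like. The shape and field names are invented for the example; there is no single standard schema, and many teams keep this in YAML or JSON rather than code.

```python
# Illustrative only: one possible shape for a machine-readable spec.
# Field names here are invented for the example, not a standard schema.

spec = {
    "feature": "Export invoices as CSV",
    "non_goals": [
        "No PDF export in this iteration",
        "No changes to the invoice data model",
    ],
    "hard_constraints": [
        "Use the existing InvoiceRepository; do not add new DB queries",
        "No new third-party dependencies",
    ],
    "acceptance_criteria": [
        "GET /invoices/export returns HTTP 200 with text/csv",
        "Exported rows match the invoice count for the requesting account",
        "Export of 10,000 invoices completes in under 5 seconds",
    ],
}

# Each acceptance criterion should map to at least one automated test,
# so verifying generated code is mechanical rather than vibes-based.
```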
Watch for the junior developer trap. When confronted with bugs in generated code, many junior developers treat the problem as a "prompt engineering issue" rather than a logic flaw. They tweak prompts repeatedly instead of reading the code.
In this dynamic, the agent delivers the results, the developer takes the credit, and nobody builds real engineering skills. If you lead a team, you have a responsibility to ensure your junior engineers build foundations, not just prompting habits. Their long-term career depends on it.
Prepare for regulatory compliance now. The EU AI Act's August 2026 enforcement date is not far away. If your agentic workflows touch task allocation or performance evaluation, you may already be in high-risk territory under Annex III.
Even outside that scope, Article 12 requires continuous logging over the system's lifetime, and Article 14 requires human overseers who genuinely understand the system, not just approve its output.
If your current workflow is "agent generates, junior approves, code ships," start asking whether that process would survive regulatory scrutiny. The organizations that treat governance as infrastructure rather than bureaucracy will be the ones that scale AI adoption sustainably.
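What treating governance as infrastructure can mean at its smallest: an append-only, structured record of every agent action and the human decision attached to it. The sketch below is illustrative, not legal advice; the field names are assumptions, and what the Act actually requires for your system is a question for your compliance team.

```python
# Minimal sketch of an append-only audit trail for agent activity.
# Field names are illustrative; consult your compliance team for what
# the EU AI Act's logging obligations actually require in your context.

import json
import time
from pathlib import Path

AUDIT_LOG = Path("agent_audit.jsonl")


def record_agent_event(agent_id: str, action: str, human_reviewer: str,
                       decision: str, details: dict) -> None:
    """Append one structured, timestamped record per agent action."""
    event = {
        "timestamp": time.time(),
        "agent_id": agent_id,
        "action": action,            # e.g. "generated_pull_request"
        "human_reviewer": human_reviewer,
        "decision": decision,        # e.g. "approved", "rejected", "modified"
        "details": details,          # prompt hash, diff size, model version...
    }
    with AUDIT_LOG.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")


# Hypothetical usage:
# record_agent_event("codegen-agent-1", "generated_pull_request",
#                    "alice", "approved",
#                    {"lines_changed": 180, "model": "example-model"})
```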
The technology deserves better advocates
The cognitive debt is real. The security surface expansion is real. The regulatory pressure is real. The skill atrophy is measurable and documented.
None of this means we should stop using these tools.
All of it means we should use them like engineers: with eyes open, with guardrails in place, and with the humility to admit what we do not yet fully understand.
The enterprises that will thrive are those that explicitly instrument their workflows to prevent human cognition from atrophying. That treat the agent as a tool of the intellect rather than a replacement for it.
The engineers who will thrive are those who master what the probabilistic agent inherently lacks: systemic architectural vision, contextual judgment, and the willingness to take responsibility for what ships.
I am betting on AI-assisted development. And that bet means taking its risks seriously enough to contain them.
Because the best thing you can do for a technology you believe in is to be honest about it.
Originally published on pixari.dev