<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Raffaele Pizzari</title>
    <description>The latest articles on DEV Community by Raffaele Pizzari (@pixari).</description>
    <link>https://dev.to/pixari</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F116548%2Fb03a6e20-dfb4-414f-97a2-df6600dd123c.jpg</url>
      <title>DEV Community: Raffaele Pizzari</title>
      <link>https://dev.to/pixari</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pixari"/>
    <language>en</language>
    <item>
      <title>AI-Assisted Product Engineering: Orchestrating Claude Code Across the Software Development Lifecycle</title>
      <dc:creator>Raffaele Pizzari</dc:creator>
      <pubDate>Mon, 04 May 2026 23:39:10 +0000</pubDate>
      <link>https://dev.to/pixari/ai-assisted-product-engineering-orchestrating-claude-code-across-the-software-development-lifecycle-1k59</link>
      <guid>https://dev.to/pixari/ai-assisted-product-engineering-orchestrating-claude-code-across-the-software-development-lifecycle-1k59</guid>
      <description>&lt;p&gt;Most LLM coding tools live inside a single editor session. They suggest, complete, and refactor inside one file at a time. That is useful, but it is not where real product engineering happens.&lt;/p&gt;

&lt;p&gt;Real engineering spans ticket breakdown, cross-repository implementation, code review, merge request management, and the knowledge that has to survive between sessions. None of that fits in one tool window.&lt;/p&gt;

&lt;p&gt;I built a system that orchestrates Claude Code across that full lifecycle. It has been running daily for months. This post describes how it works, why it is structured the way it is, and what I have learned from the parts that broke.&lt;/p&gt;

&lt;p&gt;The core thesis is one sentence.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The right unit of agent invocation is the judgment step, not the workflow.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Mechanical steps (the API calls, the test runs, the git operations) do not need an LLM. They need deterministic code. The agent should be invoked only when something genuinely requires judgment: writing the code, evaluating a review finding, choosing between two architectural options. Conflating these two categories is the most expensive mistake I see in agent systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  The architecture
&lt;/h2&gt;

&lt;p&gt;A terminological note before going further. Claude Code is not a raw API. It is an agent runtime: an LLM with tool use (file reads, shell commands), file system access, and a multi-turn loop. When the orchestrator "hands off to Claude Code", it is not a single API call. It is transferring control to an autonomous process that may read dozens of files, run commands, and iterate before returning. I will use "the agent" or "Claude Code" for what the system invokes, and "LLM" only when discussing the underlying model's behavior.&lt;/p&gt;

&lt;p&gt;Three principles guide the design.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Python orchestrates, the agent reasons.&lt;/strong&gt; Every workflow is split into phases. Phases that involve API calls, file operations, test execution, or data transformation are deterministic Python scripts. Claude Code is invoked only when the task requires judgment. This separation reduces token consumption, improves latency (mechanical phases complete in under two seconds), and makes the system auditable.&lt;/p&gt;
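
&lt;p&gt;As a minimal sketch of that split: the phase functions and the shell-out to the Claude Code CLI below are illustrative assumptions, not the system's actual code. The point is where the boundary sits, not the details.&lt;/p&gt;

```python
# Minimal sketch of the orchestrate/reason split. The phase functions and
# the CLI shell-out are hypothetical, not the system's actual code.
import json
import subprocess

def assemble_context(ticket_id):
    """Deterministic phase: API calls and file reads, no LLM involved."""
    return {"ticket": ticket_id, "brief": "Add retry logic to the HTTP client."}

def invoke_agent(brief):
    """Judgment phase: hand a compact JSON brief to the agent runtime."""
    result = subprocess.run(
        ["claude", "-p", json.dumps(brief)],  # -p: non-interactive print mode
        capture_output=True, text=True,
    )
    return result.stdout

def run_ticket(ticket_id):
    brief = assemble_context(ticket_id)  # deterministic, milliseconds
    return invoke_agent(brief)           # agentic, minutes
```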

&lt;p&gt;&lt;strong&gt;Propose, do not execute.&lt;/strong&gt; The system never performs irreversible external actions (merging code, closing tickets, sending messages) without explicit human approval. It creates structured proposals that surface in a dashboard for review. This makes the system safe to leave running unattended.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Compound knowledge, do not re-derive it.&lt;/strong&gt; Engineering context (architectural decisions, team ownership, ticket history) is captured in a persistent wiki and an operational database. Each session starts with this accumulated context rather than re-deriving it from scratch.&lt;/p&gt;

&lt;h3&gt;
  
  
  The six layers
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────┐
│  1. User          CLI + Dashboard                       │
├─────────────────────────────────────────────────────────┤
│  2. Skill         Command → orchestrator routing        │
├─────────────────────────────────────────────────────────┤
│  3. Orchestrator  Python, phased, JSON I/O              │
├─────────────────────────────────────────────────────────┤
│  4. Agent         Claude Code + specialized subagents   │
├─────────────────────────────────────────────────────────┤
│  5. Data          SQLite + Markdown wiki + ChromaDB     │
├─────────────────────────────────────────────────────────┤
│  6. External      Jira, GitLab, Confluence, K8s         │
└─────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Layers 1 to 3 are deterministic. Layer 4 is where Claude Code operates. Layers 5 and 6 are stateful backends. The skill layer maps user commands to orchestrators via a YAML manifest, so the system's capabilities are explicit. Specialized agents (code review, knowledge synthesis, planning) run in isolated context windows with explicitly scoped tool permissions. The code review agent, for instance, cannot edit files.&lt;/p&gt;
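
&lt;p&gt;The routing itself reduces to a small lookup. This is a hypothetical shape for the parsed manifest, not the real one:&lt;/p&gt;

```python
# Hypothetical routing table, as the YAML manifest might look once parsed.
# Command names and fields are illustrative.
MANIFEST = {
    "/ticket": {"orchestrator": "ticket_pipeline", "side_effects": True},
    "/standup": {"orchestrator": None, "side_effects": False},  # agent-native
}

def route(command):
    entry = MANIFEST.get(command)
    if entry is None:
        raise KeyError(f"unknown command: {command}")
    if entry["orchestrator"] is None:
        return "agent"  # single-turn reasoning task, handled by the agent directly
    return entry["orchestrator"]
```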

&lt;h3&gt;
  
  
  When a skill needs an orchestrator
&lt;/h3&gt;

&lt;p&gt;Not every skill needs the full structure. The deciding factor is side effects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Orchestrated skills&lt;/strong&gt; have multi-step workflows with external side effects: ticket implementation, MR creation, CI analysis, code review remediation. They need deterministic coordination (create branch, run tests, push code) interleaved with agent judgment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agent-native skills&lt;/strong&gt; are single-turn reasoning tasks: debugging a service issue, classifying an unknown input, generating a standup summary. The agent reads context and produces an output. There is nothing mechanical worth extracting.&lt;/p&gt;

&lt;p&gt;If a skill creates branches, runs tests, calls external APIs, or modifies shared state, it gets an orchestrator. If it only reads and reasons, the agent handles it directly. Adding an orchestrator has a real cost: more code to maintain, more failure modes, more surface area to test. It is justified only when the mechanical steps are complex enough that the agent would be unreliable executing them.&lt;/p&gt;

&lt;h3&gt;
  
  
  A ticket from start to finish
&lt;/h3&gt;

&lt;p&gt;To make this concrete, here is the lifecycle of a single ticket implementation.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                    ┌──────────────────────┐
                    │   User: /ticket      │
                    │   &amp;lt;ticket-id&amp;gt;        │
                    └──────────┬───────────┘
                               │
              ┌────────────────▼────────────────┐
              │  Phase 1: Context Assembly       │
              │  (Python orchestrator)           │
              │                                  │
              │  • Fetch Jira ticket             │
              │  • Search wiki for decisions     │
              │  • Create worktree + branch      │
              │  • Extract implementation brief  │
              │  • Return JSON bundle            │
              └────────────────┬────────────────┘
                               │
              ┌────────────────▼────────────────┐
              │  Phase 2: Implementation         │
              │  (Claude Code)                   │
              │                                  │
              │  • Read brief + standards        │
              │  • Write / modify code           │
              └────────────────┬────────────────┘
                               │
              ┌────────────────▼────────────────┐
              │  Phase 3: Validation             │
              │  (Orchestrator + Review Agent)   │
              │                                  │
              │  • Run tests, lint, format       │
              │  • If fail → back to agent (3x)  │
              │  • Dispatch code review agent    │
              │  • If blockers → back to agent   │
              └────────────────┬────────────────┘
                               │
              ┌────────────────▼────────────────┐
              │  Phase 4: Proposal + Ship        │
              │  (Orchestrator → Human → Orch.)  │
              │                                  │
              │  • Create exchange proposal      │
              │  • ── HUMAN DECISION POINT ──    │
              │  • On approve: push + create MR  │
              │  • Log to activity trail         │
              └─────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude Code is invoked only in Phase 2 and during fix iterations in Phase 3. Everything else is deterministic Python.&lt;/p&gt;

&lt;h3&gt;
  
  
  Before and after
&lt;/h3&gt;

&lt;p&gt;The first version of the system did not look like this. The agent orchestrated everything. It read 150 to 200 line configuration files, made API calls through tool use, managed git operations, and tracked its own progress.&lt;/p&gt;

&lt;p&gt;That version had three problems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Latency.&lt;/strong&gt; A complete ticket workflow took several minutes, dominated by the agent parsing configuration and deciding which API call to make next.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Token consumption.&lt;/strong&gt; The agent's context window filled with mechanical details (API responses, git output, test logs) that displaced the actual implementation context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Brittleness.&lt;/strong&gt; The agent would skip steps, hallucinate API parameters, or lose track of which phase it was in. These failures were non-deterministic and hard to reproduce.&lt;/p&gt;

&lt;p&gt;After moving mechanical steps to Python orchestrators, Claude Code receives a 30 to 50 line context brief instead of navigating 200 lines of configuration. Workflow latency dropped by roughly an order of magnitude. Mechanical phases now complete in under two seconds. Failures produce deterministic error messages instead of vague agent confusion. Token consumption dropped substantially, because the agent no longer processes responses it only needs to pass through.&lt;/p&gt;

&lt;p&gt;A second-order benefit is testability. Python orchestrators can be unit-tested with mock data, so I can verify the mechanical pipeline independently of the agent. That is not possible when the agent is the orchestrator.&lt;/p&gt;

&lt;p&gt;Separation pays off immediately. It is the single most impactful design decision in the system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Memory and observability
&lt;/h2&gt;

&lt;p&gt;A system that acts on your behalf needs two things: memory, so it does not re-derive context every session, and transparency, so you can trust what it is doing. These are deeply intertwined.&lt;/p&gt;

&lt;h3&gt;
  
  
  The semantic wiki
&lt;/h3&gt;

&lt;p&gt;Long-term memory is a collection of Markdown pages organized by category (features, tickets, teams, decisions, architectural concepts). Each page follows a structured template with metadata, cross-references, confidence tiers, and a changelog.&lt;/p&gt;

&lt;p&gt;A specialized knowledge agent creates and maintains the pages, synthesizing information from Jira, Confluence, GitLab, and prior conversations. The wiki distinguishes between three kinds of facts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Verified&lt;/strong&gt; facts: directly cited from an authoritative source with a reference ID.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inferred&lt;/strong&gt; facts: synthesized from patterns across multiple sources.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human-provided&lt;/strong&gt; facts: explicitly stated by a user in an exchange response.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This provenance tracking matters more than I expected. It prevents the most common failure mode of LLM-driven knowledge bases: the model fabricates context, the system stores it, and a week later that fabrication is being cited as truth.&lt;/p&gt;

&lt;p&gt;Wiki pages have field-level staleness thresholds. Team ownership becomes stale after 14 days. Architectural decisions remain fresh for 90 days. Ticket status is never cached, because it changes too often. When a stale page is queried, the knowledge agent silently re-ingests it before using it.&lt;/p&gt;
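
&lt;p&gt;The threshold check itself is a small comparison. A sketch, with the field names assumed:&lt;/p&gt;

```python
from datetime import datetime, timedelta

# Illustrative thresholds from the text; the field names are assumptions.
STALENESS = {
    "team_ownership": timedelta(days=14),
    "architectural_decision": timedelta(days=90),
    "ticket_status": timedelta(0),  # never cached: always re-fetched
}

def is_stale(field, last_ingested, now=None):
    """True when the field's age meets or exceeds its staleness threshold."""
    now = now or datetime.now()
    return (now - last_ingested) >= STALENESS[field]
```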

&lt;p&gt;After sustained use, the wiki has become one of the most valuable parts of the system. It contains synthesized knowledge about ownership, decisions, and cross-repository dependencies that would take hours to reconstruct from scratch. The confidence tiers are essential. Without them, agents treat inferred knowledge as if it were verified, and you compound hallucinations into authoritative-looking documentation.&lt;/p&gt;

&lt;h3&gt;
  
  
  The operational database
&lt;/h3&gt;

&lt;p&gt;Short-term state lives in SQLite and tracks four things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Work items&lt;/strong&gt;: tickets, MRs, and plans with current status, CI state, and cross-repo dependencies.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Exchange items&lt;/strong&gt;: structured proposals from agents to humans (more on these below).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;To-do items&lt;/strong&gt;: a prioritized task queue with urgency levels and ownership.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Activity log&lt;/strong&gt;: an append-only audit trail of every external action.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This database is the substrate for the dashboard and the heartbeat process.&lt;/p&gt;
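
&lt;p&gt;A minimal sketch of what those four stores might look like as tables. The names and columns are illustrative, not the actual schema:&lt;/p&gt;

```python
import sqlite3

# Hypothetical minimal schema for the four stores described above.
SCHEMA = """
CREATE TABLE work_items     (id TEXT PRIMARY KEY, status TEXT, ci_state TEXT);
CREATE TABLE exchange_items (id INTEGER PRIMARY KEY, intent TEXT, urgency TEXT,
                             body TEXT, answer TEXT, state TEXT DEFAULT 'open');
CREATE TABLE todo_items     (id INTEGER PRIMARY KEY, title TEXT, urgency TEXT);
CREATE TABLE activity_log   (id INTEGER PRIMARY KEY, action TEXT, resource TEXT,
                             ticket TEXT, skill TEXT, ts TEXT);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
```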

&lt;h3&gt;
  
  
  The dashboard
&lt;/h3&gt;

&lt;p&gt;A lightweight web dashboard, a single-file application with no external dependencies, gives real-time visibility into active work, pending proposals, the to-do queue, recent activity, knowledge health (stale pages, open questions, broken cross-references), and a heartbeat indicator.&lt;/p&gt;

&lt;p&gt;The dashboard is also the primary approval surface for exchange items, with controls for approve, defer, and dismiss. It refreshes every five seconds.&lt;/p&gt;

&lt;p&gt;The heartbeat indicator turned out to be unexpectedly important. Knowing that the background process is alive and polling gives me confidence that the system is aware of its environment. A stale heartbeat is an immediate signal that something needs attention.&lt;/p&gt;

&lt;h3&gt;
  
  
  Activity logging
&lt;/h3&gt;

&lt;p&gt;Every external write is logged on success. The log captures the action type, the affected resource, the associated ticket, the skill that triggered it, and the target repository. Reads and internal state changes are not logged, which keeps the trail focused on externally visible effects.&lt;/p&gt;

&lt;p&gt;The activity log powers the dashboard's feed, generates standup reports ("what did the system do yesterday?"), prevents duplicate work (the heartbeat checks the log before re-proposing), and gives me a forensic trail when I need to debug something unexpected.&lt;/p&gt;

&lt;h2&gt;
  
  
  Human-in-the-loop controls
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Hard limits
&lt;/h3&gt;

&lt;p&gt;Some operations are never performed without explicit human confirmation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Merging an MR (the system creates MRs, humans merge them).&lt;/li&gt;
&lt;li&gt;Transitioning a ticket to "Done".&lt;/li&gt;
&lt;li&gt;Deleting branches, files, or database records.&lt;/li&gt;
&lt;li&gt;Creating tickets in protected project spaces.&lt;/li&gt;
&lt;li&gt;Force-pushing to protected branches.&lt;/li&gt;
&lt;li&gt;Running database schema migrations.&lt;/li&gt;
&lt;li&gt;Sending messages to external communication channels.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are enforced at the agent level through explicit constraints. The code review agent, for example, has a deny-list of tools (Edit, Write) so it cannot modify code. Review and implementation are separate by construction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Labeling
&lt;/h3&gt;

&lt;p&gt;All agent-created artifacts carry explicit labels:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tickets created by agents include an &lt;code&gt;ai-generated&lt;/code&gt; label.&lt;/li&gt;
&lt;li&gt;MRs created by agents include an &lt;code&gt;ai-automated&lt;/code&gt; label.&lt;/li&gt;
&lt;li&gt;Commit messages follow a conventional format that includes the originating ticket key.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means human team members can identify agent-produced work at a glance during review, triage, and audit. No guessing.&lt;/p&gt;

&lt;h3&gt;
  
  
  The exchange protocol
&lt;/h3&gt;

&lt;p&gt;The governance model rests on the exchange protocol: a structured format for agent-to-human communication that replaces ad-hoc permission checks with explicit proposals.&lt;/p&gt;

&lt;p&gt;Each exchange item has an intent (&lt;code&gt;approval&lt;/code&gt;, &lt;code&gt;decision&lt;/code&gt;, &lt;code&gt;question&lt;/code&gt;, &lt;code&gt;blocker&lt;/code&gt;, or &lt;code&gt;flag&lt;/code&gt;), an urgency level, a Markdown body with relevant links, and a human answer field. There is no informational intent. Every exchange item requires human action. If the system cannot ask for something, it should not be telling you about it.&lt;/p&gt;

&lt;p&gt;Items move from &lt;code&gt;open&lt;/code&gt; to &lt;code&gt;answered&lt;/code&gt; (when the human responds) to &lt;code&gt;done&lt;/code&gt; (when execution completes). If execution fails, the system retries up to three times, preserving the original approval. After three failures it escalates by creating a new &lt;code&gt;blocker&lt;/code&gt; item. Users can defer proposals for 24 hours; deferred items re-surface when the deferral expires.&lt;/p&gt;
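
&lt;p&gt;The retry-then-escalate logic reduces to a few lines. This is a sketch; the function and field names are assumptions:&lt;/p&gt;

```python
# Sketch of the exchange-item lifecycle: answered -> done, or escalate.
MAX_RETRIES = 3

def settle(item, execute):
    """Run an approved item, preserving the approval across retries."""
    assert item["state"] == "answered"  # the human has already responded
    for attempt in range(MAX_RETRIES):
        try:
            execute(item)  # the approved external action (push, create MR, ...)
            item["state"] = "done"
            return item
        except Exception:
            continue  # original approval is kept; just try again
    # After three failures, escalate with a new blocker item.
    item["state"] = "failed"
    item["escalation"] = {"intent": "blocker", "urgency": "high"}
    return item
```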

&lt;p&gt;I tried building a permission model first. Define what the system can do autonomously, define what needs approval. It was fragile. The risk of an action depends on context. Pushing to a feature branch is routine. Pushing to main is dangerous. Same operation, different risk.&lt;/p&gt;

&lt;p&gt;The proposal-approval model sidesteps this entirely. The system proposes everything and executes nothing without approval, with a small list of hard-coded exceptions (like creating a to-do for CI failure triage). Simpler, easier to reason about, more trustworthy.&lt;/p&gt;

&lt;p&gt;It also solves the asynchrony problem. Proposals created during a heartbeat cycle, when no user session is active, are queued and presented at the next session start. Every decision has a timestamp, a human answer, and an execution outcome. The whole system is auditable.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pre-commit safety
&lt;/h3&gt;

&lt;p&gt;A hook system intercepts operations before execution. Before any commit, the system runs the linter and formatter. Commits that would introduce lint violations are blocked. This prevents the agent from introducing code quality regressions even when its generated code is syntactically correct.&lt;/p&gt;
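
&lt;p&gt;A sketch of that gate, assuming ruff as the linter and formatter; the actual tools and hook wiring may differ:&lt;/p&gt;

```python
# Hypothetical pre-commit gate: block the commit unless lint and format pass.
# The tool (ruff) is an assumption; any linter/formatter fits the pattern.
import subprocess

def gate(paths, run=subprocess.run):
    """Return 0 if all checks pass, 1 otherwise.
    A git pre-commit hook would call sys.exit(gate(changed_paths))."""
    checks = [
        ["ruff", "check", *paths],              # lint
        ["ruff", "format", "--check", *paths],  # formatting
    ]
    for cmd in checks:
        if run(cmd).returncode != 0:
            return 1  # non-zero exit aborts the commit
    return 0
```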

&lt;h2&gt;
  
  
  Proactive behavior
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The heartbeat
&lt;/h3&gt;

&lt;p&gt;A background process runs every five minutes, independent of any active user session. It polls external systems for state changes and creates exchange items when it detects actionable events:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A blocked ticket becomes unblocked: propose starting implementation.&lt;/li&gt;
&lt;li&gt;An MR receives a review comment: propose investigating.&lt;/li&gt;
&lt;li&gt;A CI pipeline fails: create a to-do for triage.&lt;/li&gt;
&lt;li&gt;An MR has been awaiting review for more than 24 hours: flag staleness.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The heartbeat is deliberately conservative. It proposes but never executes. Its job is to keep the system aware of the engineering environment even when nobody is actively working with it.&lt;/p&gt;
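
&lt;p&gt;The detection rules above reduce to a mapping from observed events to proposals. A sketch, with the event names assumed:&lt;/p&gt;

```python
# Heartbeat sketch: poll, detect, propose -- never execute.
# Event names mirror the list above; the detection itself is stubbed out.
RULES = [
    ("ticket_unblocked", "approval", "Start implementation?"),
    ("mr_review_comment", "question", "Investigate review comment?"),
    ("ci_failed", "todo", "Triage CI failure"),
    ("mr_stale_24h", "flag", "MR awaiting review for 24h"),
]

def heartbeat(events, already_proposed):
    """Turn detected events into queued proposals, skipping duplicates."""
    proposals = []
    for event in events:
        for name, intent, title in RULES:
            if event == name and event not in already_proposed:
                proposals.append({"intent": intent, "title": title})
    return proposals  # queued for the dashboard; nothing is executed here
```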

&lt;h3&gt;
  
  
  Session initialization
&lt;/h3&gt;

&lt;p&gt;Every new session begins with a checklist:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Verify the dashboard and heartbeat are running.&lt;/li&gt;
&lt;li&gt;Fetch backlog items created since the last session.&lt;/li&gt;
&lt;li&gt;Scan open exchange items by urgency.&lt;/li&gt;
&lt;li&gt;List pending to-do items.&lt;/li&gt;
&lt;li&gt;Present a concise summary before accepting input.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Continuity matters. The system picks up where it left off, instead of starting from zero every morning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where it breaks
&lt;/h2&gt;

&lt;p&gt;Plenty.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hallucinated references
&lt;/h3&gt;

&lt;p&gt;The LLM can hallucinate a ticket key, a file path, an API endpoint. The orchestrator validates external references before acting on them. Ticket keys are checked against Jira, branch names against git, file paths against the file system. When validation fails, the orchestrator returns a structured error rather than propagating the hallucination.&lt;/p&gt;
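
&lt;p&gt;A sketch of that validation step, with the Jira and git lookups stubbed out as callables:&lt;/p&gt;

```python
# Validation sketch: check agent-emitted references before acting on them.
# The checker callables stand in for real Jira and git lookups.
import os

def validate_refs(refs, ticket_exists, branch_exists):
    errors = []
    if not ticket_exists(refs["ticket"]):
        errors.append(f"unknown ticket: {refs['ticket']}")
    if not branch_exists(refs["branch"]):
        errors.append(f"unknown branch: {refs['branch']}")
    for path in refs.get("files", []):
        if not os.path.exists(path):
            errors.append(f"missing file: {path}")
    # Structured error instead of propagating a hallucination downstream.
    return {"ok": not errors, "errors": errors}
```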

&lt;h3&gt;
  
  
  Stale knowledge acting as truth
&lt;/h3&gt;

&lt;p&gt;Despite staleness management, a window exists between a real-world change and the next re-ingestion. I mitigate this by never caching fast-changing data and by marking inferred knowledge with lower confidence. Agents are instructed to treat inferred facts as context, not constraint. This is not a perfect defense. It is defense in depth.&lt;/p&gt;

&lt;h3&gt;
  
  
  Proposal flooding
&lt;/h3&gt;

&lt;p&gt;During active development, the heartbeat can generate a high volume of exchange items. Review fatigue follows: I start approving things without reading them carefully. Urgency levels and 24-hour deferral reduce the volume, but the underlying tension between proactivity and cognitive load is real and unsolved.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scope creep
&lt;/h3&gt;

&lt;p&gt;Given an implementation brief, the agent will sometimes implement more than requested. Error handling for impossible cases. Refactoring adjacent code. Abstractions for hypothetical future requirements. I mitigate this with explicit coding standards ("don't add features beyond what was asked") and by having the code review agent flag scope creep as a blocker. It is still one of the most common failure modes. Constant calibration.&lt;/p&gt;

&lt;h2&gt;
  
  
  Operational reality
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Cost
&lt;/h3&gt;

&lt;p&gt;The whole system runs on a single laptop. No GPU, no dedicated server, no cloud infrastructure beyond the team tools that already exist (Jira, GitLab, Confluence). The only operational cost is Claude Code's API usage, which scales with the number of tickets processed.&lt;/p&gt;

&lt;p&gt;Mechanical phases consume zero API tokens. The knowledge agent and heartbeat consume modest amounts during re-ingestion and polling. The bulk of consumption comes from implementation and code review, which are exactly the steps where agent reasoning is genuinely needed.&lt;/p&gt;

&lt;p&gt;The SQLite database, the Markdown wiki, and the ChromaDB vector store all run locally. The dashboard is a single-file Node.js app. This minimalism was deliberate. Every external dependency is a maintenance burden and a failure mode.&lt;/p&gt;

&lt;h3&gt;
  
  
  Maintainability
&lt;/h3&gt;

&lt;p&gt;Three things need ongoing maintenance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Orchestrators.&lt;/strong&gt; When external APIs change (a new Jira field, a GitLab API deprecation), the affected orchestrator needs updating. Plain Python with structured JSON I/O makes this straightforward to test and deploy. A few hours per month.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Standards.&lt;/strong&gt; The coding standards file is a living document. When I notice a new failure mode (the agent over-engineers, a test pattern is fragile), I update the standards. This is not different from maintaining a team style guide, except that the primary consumer is an LLM. The standards evolve through the same proposal mechanism as everything else: the code review agent flags a recurring pattern, and it becomes a candidate for a new standard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wiki schema.&lt;/strong&gt; As the engineering environment evolves, the wiki's category structure and staleness thresholds need adjustment. The schema is a single YAML file, so changes are low-risk.&lt;/p&gt;

&lt;p&gt;What does not need maintenance: the exchange protocol, the dashboard, and the activity log. Stable across months of use. The layered architecture pays off here. Stable components (governance, observability) are decoupled from evolving ones (orchestrators, standards, wiki schema).&lt;/p&gt;

&lt;h3&gt;
  
  
  What breaks if you stop maintaining it
&lt;/h3&gt;

&lt;p&gt;If the orchestrators fall behind external API changes, mechanical phases start failing with deterministic errors. The system degrades gracefully. The agent can still reason, but the automated context assembly stops working and you have to provide context manually. Annoying, not catastrophic.&lt;/p&gt;

&lt;p&gt;If the standards stop evolving, the code review agent keeps enforcing stale rules. They drift from what the team actually wants. The system still works, but its output becomes increasingly misaligned with reality. Subtler failure.&lt;/p&gt;

&lt;p&gt;If the wiki stops being maintained, it becomes unreliable. Staleness thresholds mitigate this, but if the underlying sources change in ways the schema does not anticipate, the wiki compounds outdated information. This is the most dangerous failure mode, because it is silent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The silent overwrite
&lt;/h3&gt;

&lt;p&gt;Early in deployment, the system implemented a ticket that required modifying a shared utility function. Claude Code correctly identified the function and modified it to satisfy the new ticket's acceptance criteria. In doing so, it broke three other features that depended on the function's original behavior. The test suite caught one regression. The other two had no test coverage.&lt;/p&gt;

&lt;p&gt;The root cause was not the agent's code quality. The modification was locally correct. The root cause was the orchestrator's context assembly. Phase 1 had provided the ticket's acceptance criteria and the target file, but not the list of callers. The agent did not know what else depended on that function.&lt;/p&gt;

&lt;p&gt;The fix was straightforward. The orchestrator now includes a dependency analysis step that identifies all callers of modified functions and adds them to the implementation brief. The code review agent was updated to explicitly check for behavioral changes in shared code.&lt;/p&gt;
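
&lt;p&gt;The caller-analysis step can be approximated with a text scan. This is a sketch only; a real implementation would use the language's AST or a symbol index:&lt;/p&gt;

```python
# Sketch of a caller-analysis pass: find files that reference a modified
# function (including its definition site), so they land in the brief.
import re

def find_callers(function_name, files):
    """files: mapping of path to source text. Returns matching paths, sorted."""
    pattern = re.compile(r"\b" + re.escape(function_name) + r"\s*\(")
    return sorted(
        path for path, src in files.items()
        if pattern.search(src)
    )
```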

&lt;p&gt;The broader lesson is the most useful one I have.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The agent's failure modes are usually upstream, in the context it receives, not in its reasoning.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Improving context assembly has had a larger impact on output quality than any prompt engineering I have ever done.&lt;/p&gt;

&lt;h3&gt;
  
  
  Proposals beat permissions
&lt;/h3&gt;

&lt;p&gt;The proposal-approval model replaced a fragile permission system with a simple rule: the system proposes, the human decides. Easier to implement, easier to reason about, easier to trust. The only ongoing challenge is proposal volume during active development.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where agents still fall short
&lt;/h3&gt;

&lt;p&gt;Even with the architectural mitigations, certain limitations remain.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cross-repository reasoning.&lt;/strong&gt; When a feature spans multiple services, the agent struggles to maintain a coherent mental model of the full change set. Structured tracking helps, but does not solve it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ambiguous acceptance criteria.&lt;/strong&gt; When ticket descriptions are vague, the agent produces reasonable but often wrong implementations. The system flags ambiguous tickets as blockers rather than guessing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scope creep.&lt;/strong&gt; The agent's tendency to over-engineer requires constant calibration through standards and review.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stale context windows.&lt;/strong&gt; In long sessions, earlier context falls out of the underlying LLM's effective attention. Session-start re-initialization mitigates but does not eliminate this.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Bounded autonomy beats the demo
&lt;/h2&gt;

&lt;p&gt;Most autonomous coding agents on the market optimize for the demo. End-to-end issue resolution. Watch the agent work. Marvel at the autonomy.&lt;/p&gt;

&lt;p&gt;I am not interested in the demo. I am interested in Tuesday morning, when someone has to debug why a merge broke staging.&lt;/p&gt;

&lt;p&gt;Bounded autonomy with explicit human decision points is less impressive in a screencast and far more useful in practice. The system I built is deliberately the opposite of an autonomous agent. It is a tool with a strong opinion about what humans should still do.&lt;/p&gt;

&lt;p&gt;If I had one piece of advice for someone building something similar, it would be this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Start with the orchestrator, not the prompt.&lt;/strong&gt; Figure out what context the agent actually needs, assemble it mechanically, and hand it over in a clean bundle. The agent will do the rest.&lt;/p&gt;

&lt;p&gt;The hard part is not getting the agent to reason well. It is giving it the right things to reason about.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://pixari.dev/ai-assisted-product-engineering/" rel="noopener noreferrer"&gt;pixari.dev&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>engineering</category>
      <category>ai</category>
    </item>
    <item>
      <title>I'm Bullish on AI-Assisted Coding. That's Exactly Why I Take the Risks Seriously.</title>
      <dc:creator>Raffaele Pizzari</dc:creator>
      <pubDate>Mon, 04 May 2026 08:59:48 +0000</pubDate>
      <link>https://dev.to/pixari/im-bullish-on-ai-assisted-coding-thats-exactly-why-i-take-the-risks-seriously-47pf</link>
      <guid>https://dev.to/pixari/im-bullish-on-ai-assisted-coding-thats-exactly-why-i-take-the-risks-seriously-47pf</guid>
      <description>&lt;p&gt;I use AI coding agents every day. I believe they are reshaping how we build software, and I think the teams that adopt them deliberately will outperform those that don't.&lt;/p&gt;

&lt;p&gt;I am not writing this to warn you away from AI-assisted development.&lt;/p&gt;

&lt;p&gt;I am writing this because the loudest voices in the AI enthusiasm camp are also the most allergic to discussing what can go wrong. And that worries me more than the risks themselves.&lt;/p&gt;

&lt;h2&gt;
  
  
  The productivity gains are real
&lt;/h2&gt;

&lt;p&gt;Let's start with what is undeniable.&lt;/p&gt;

&lt;p&gt;By 2024, LangChain's State of AI Agents report already showed 51% of surveyed organizations running agents in production. By 2026, that number has only grown. The global AI agent market is projected to expand from $7.8 billion to over $50 billion by 2030.&lt;/p&gt;

&lt;p&gt;This is not a hype cycle anymore. This is infrastructure.&lt;/p&gt;

&lt;p&gt;The case studies are equally striking.&lt;/p&gt;

&lt;p&gt;Rakuten engineers used a CLI-based agent to implement a complex activation vector extraction method within vLLM, a codebase of roughly 12.5 million lines. A task that would have taken weeks of onboarding and implementation was completed in seven hours with 99.9% numerical accuracy.&lt;/p&gt;

&lt;p&gt;TELUS reported shipping code 30% faster with agents, saving over 500,000 hours across the organization.&lt;/p&gt;

&lt;p&gt;These are not toy demos. This is production-grade acceleration at enterprise scale.&lt;/p&gt;

&lt;p&gt;I find this genuinely exciting. And none of it changes what I am about to say next.&lt;/p&gt;

&lt;h2&gt;
  
  
  The risks are equally real
&lt;/h2&gt;

&lt;p&gt;Lars Faye's &lt;a href="https://larsfaye.com/articles/agentic-coding-is-a-trap" rel="noopener noreferrer"&gt;Agentic Coding is a Trap&lt;/a&gt; struck a nerve because it named something many of us were feeling but not saying out loud. The core argument: the skills you need to supervise AI agents are the exact skills that atrophy when you over-rely on them.&lt;/p&gt;

&lt;p&gt;The trade-offs that need honest discussion are already quantifiable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Skill atrophy at scale.&lt;/strong&gt; The debugging and reasoning abilities required to supervise agents degrade measurably when you stop exercising them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;System complexity to compensate for non-determinism.&lt;/strong&gt; AI outputs are probabilistic. The guardrails, review layers, and validation infrastructure required to make them production-safe add real engineering overhead.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vendor dependency for individuals and entire teams.&lt;/strong&gt; Claude Code outages have already left teams at a standstill. When your workflow depends on a third-party model, their downtime becomes yours.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unpredictable and rising costs.&lt;/strong&gt; An employee's cost is fixed. Token pricing is a constantly moving target, dictated unilaterally by providers who can "nerf" a model and force you to burn two to three times more tokens for the same result.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A widening security attack surface.&lt;/strong&gt; Autonomous agents with broad permissions introduce threat categories that traditional security controls were never designed to handle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Regulatory exposure most teams are not preparing for.&lt;/strong&gt; The EU AI Act's high-risk obligations take effect in August 2026, and many agentic workflows are closer to the compliance line than their operators realize.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are not hypothetical concerns. Let me expand on the ones that matter most.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cognitive debt
&lt;/h3&gt;

&lt;p&gt;Lars Faye calls this the "paradox of supervision." Anthropic's own research on how AI assistance impacts coding skill formation backs it up: in a controlled study, developers using AI scored 50% on average versus 67% for those coding manually, with the largest gap appearing specifically in debugging questions.&lt;/p&gt;

&lt;p&gt;Senior developers with decades of experience report being unable to explain systems they technically "built" with agents. I have &lt;a href="https://dev.to/the-paradox-of-ai-acceleration-why-we-are-typing-faster-but-shipping-slower/"&gt;written before&lt;/a&gt; about the gap between perceived velocity and actual throughput. The pattern here is the same: the metric that looks good on the dashboard is hiding a cost that only surfaces later.&lt;/p&gt;

&lt;p&gt;The cognitive friction of writing code, hitting errors, reading documentation, and resolving conflicts manually is not wasted effort. It is the mechanism through which engineers actually understand what they are building.&lt;/p&gt;

&lt;p&gt;As I argued in &lt;a href="https://dev.to/from-attention-economy-to-thinking-economy-the-ai-challenge/"&gt;From Attention Economy to Thinking Economy&lt;/a&gt;, the challenge is not whether AI eliminates jobs. It is whether we protect the cognitive abilities that make us valuable in the first place.&lt;/p&gt;

&lt;h3&gt;
  
  
  Security surface expansion
&lt;/h3&gt;

&lt;p&gt;Autonomous agents translate a single instruction into long chains of API calls, database queries, and data manipulations. If an adversary compromises an agent's input, the blast radius is far larger than that of a traditional exploit.&lt;/p&gt;

&lt;p&gt;Research from 2026 shows an 88% success rate in bypassing guardrails on open-source models using automated probing techniques. Indirect prompt injection, where malicious instructions hide in external content the agent reads, requires far fewer attempts than direct attacks.&lt;/p&gt;

&lt;p&gt;Dependency poisoning can inject zero-day vulnerabilities straight into your CI/CD pipeline. A CVSS 10.0 remote code execution vulnerability discovered in Google's Gemini CLI in early 2026, exploitable specifically in CI/CD pipeline environments, made this supply-chain risk impossible to ignore.&lt;/p&gt;

&lt;h3&gt;
  
  
  Regulatory pressure
&lt;/h3&gt;

&lt;p&gt;On August 2, 2026, the EU AI Act's high-risk obligations take effect. Under Annex III, AI systems used to allocate tasks based on individual behavior or to monitor and evaluate worker performance in employment contexts are classified as high-risk.&lt;/p&gt;

&lt;p&gt;Coding agents do not automatically fall under this scope, but the line gets blurry fast when orchestrator systems start auto-assigning tickets, ranking PR quality, or feeding into performance reviews.&lt;/p&gt;

&lt;p&gt;Article 14 requires that human supervisors understand the system's capabilities and limitations, remain aware of automation bias, correctly interpret outputs, and retain the ability to override them.&lt;/p&gt;

&lt;p&gt;Organizations that let engineers rubber-stamp massive AI-generated pull requests without genuine comprehension are building a compliance liability, whether or not they realize it yet.&lt;/p&gt;

&lt;h2&gt;
  
  
  The real problem is not the risks. It is the denial.
&lt;/h2&gt;

&lt;p&gt;Here is where I part ways with both camps.&lt;/p&gt;

&lt;p&gt;The skeptics read all of this and conclude: stop using agents. Go back to writing everything by hand. Agentic coding is a trap, full stop.&lt;/p&gt;

&lt;p&gt;The enthusiasts read all of this and shrug. They treat any discussion of downsides as FUD from people who "don't get it." They dismiss cognitive atrophy as a skill issue. They wave away security concerns as solvable later.&lt;/p&gt;

&lt;p&gt;Both responses are wrong, but the second one is more dangerous.&lt;/p&gt;

&lt;p&gt;In engineering, we do not ship without testing. We do not deploy without monitoring. We do not scale without load testing.&lt;/p&gt;

&lt;p&gt;We never adopt a technology by pretending it has no failure modes. That is not engineering. That is wishful thinking.&lt;/p&gt;

&lt;p&gt;The people who refuse to discuss the risks of AI-assisted development are not optimists. They are in denial.&lt;/p&gt;

&lt;p&gt;And denial is how promising technologies get killed. Not by their limitations, but by the backlash that follows when those limitations are discovered too late by people who were told everything was fine.&lt;/p&gt;

&lt;p&gt;I have seen this pattern play out across &lt;a href="https://dev.to/from-ic-to-engineering-manager-first-90-days/"&gt;two decades in this industry&lt;/a&gt;. The technologies that survived had honest advocates. The ones that did not were oversold by people who confused enthusiasm with recklessness.&lt;/p&gt;

&lt;h2&gt;
  
  
  What honest adoption looks like
&lt;/h2&gt;

&lt;p&gt;Anthropic's own data reveals what they call the "Delegation Paradox": engineers use AI in 60% of their workflows but can fully delegate only 0-20% of actual tasks.&lt;/p&gt;

&lt;p&gt;This is not a failure of the tools. It is the reality that high-stakes architectural work resists probabilistic automation. Accept it and plan around it instead of fighting it.&lt;/p&gt;

&lt;p&gt;That means building deliberate constraints into how you and your team use these tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Maintain your skills deliberately.&lt;/strong&gt; Use agents where they genuinely accelerate: boilerplate, exploration, context retrieval, test scaffolding. The &lt;a href="https://dev.to/dont-ask-ai-to-build-the-house-ask-it-to-build-the-scaffolding/"&gt;scaffolding use case&lt;/a&gt; remains the healthiest relationship most engineers can have with AI right now.&lt;/p&gt;

&lt;p&gt;But regularly write core logic yourself. Run pair programming sessions where AI is off. During code reviews, trace the logic manually.&lt;/p&gt;

&lt;p&gt;If you do not exercise the debugging and reasoning muscles, they atrophy within months. This is not a metaphor. It is what the data shows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Respect the context limits.&lt;/strong&gt; Agents suffer from measurable "context rot." A Databricks study found that model correctness drops significantly around the 32,000-token mark, well before theoretical limits.&lt;/p&gt;

&lt;p&gt;The "lost in the middle" phenomenon means agents routinely miss critical guidelines buried in large context windows. Agents confidently invent non-existent variables, mix incompatible framework versions, or hallucinate API calls because they failed to parse intermediate contextual data.&lt;/p&gt;

&lt;p&gt;This is not a bug that will be fixed next quarter. It is a fundamental characteristic you need to design around.&lt;/p&gt;
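&lt;p&gt;One way to design around it is sketched below, under invented assumptions: whitespace-split words stand in as a crude token proxy, and &lt;code&gt;build_prompt&lt;/code&gt; is a made-up helper, not any vendor's API. The idea is to keep non-negotiable guidelines at both the start and the end of the prompt, where models attend most reliably, and let only the middle grow toward the budget.&lt;/p&gt;

```python
# Sketch of a "design around context rot" heuristic, not any vendor's API.
# Token counts are approximated by whitespace-split words for illustration;
# a real implementation would use the model's own tokenizer.

def build_prompt(guidelines: str, context_chunks: list[str], budget: int) -> str:
    """Keep hard guidelines at the start AND end of the prompt (where
    models attend best) and greedily fill the middle up to `budget` words."""
    def size(text: str) -> int:
        return len(text.split())

    used = 2 * size(guidelines)  # guidelines appear twice, start and end
    middle: list[str] = []
    for chunk in context_chunks:
        if used + size(chunk) > budget:
            break  # drop the rest rather than overflow the budget
        middle.append(chunk)
        used += size(chunk)

    return "\n\n".join([guidelines, *middle, guidelines])
```

&lt;p&gt;A real implementation would measure size with the model's own tokenizer and summarize dropped chunks instead of silently discarding them.&lt;/p&gt;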

&lt;p&gt;&lt;strong&gt;Never generate more code than you can review.&lt;/strong&gt; If your agent produced a 10,000-line pull request overnight and your team approved it in 20 minutes, you did not ship faster. You shipped blindly.&lt;/p&gt;

&lt;p&gt;The volume mismatch between machine generation speed and human comprehension speed is the single biggest enabler of the "LGTM" culture that is quietly degrading code quality across the industry.&lt;/p&gt;

&lt;p&gt;Strict volume constraints are not a productivity bottleneck. They are what keeps your codebase deterministic instead of probabilistic.&lt;/p&gt;
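&lt;p&gt;One concrete way to enforce such a constraint is a pre-merge gate; the sketch below assumes an invented 400-line budget and invented function names. It parses the output of &lt;code&gt;git diff --numstat&lt;/code&gt;, which prints added and deleted line counts per file, and flags merge requests larger than the team can actually read.&lt;/p&gt;

```python
# Hypothetical review-volume gate. Input is the text printed by
# `git diff --numstat`: one line per file with the added count, a tab,
# the deleted count, a tab, then the path ("-" counts for binary files).

MAX_REVIEWABLE_LINES = 400  # illustrative team budget, not a standard

def changed_lines(numstat: str) -> int:
    """Sum added plus deleted lines across all non-binary files."""
    total = 0
    for line in numstat.strip().splitlines():
        added, deleted, _path = line.split("\t", 2)
        if added == "-":  # binary file: git reports no line counts
            continue
        total += int(added) + int(deleted)
    return total

def exceeds_review_budget(numstat: str, limit: int = MAX_REVIEWABLE_LINES) -> bool:
    """True if the diff is larger than reviewers can genuinely read."""
    return changed_lines(numstat) > limit
```

&lt;p&gt;Wired into CI, a check like this turns "never generate more than you can review" from a norm into a hard gate.&lt;/p&gt;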

&lt;p&gt;&lt;strong&gt;Invest in specification as the primary artifact.&lt;/strong&gt; When implementation is nearly free, the specification becomes the real engineering work.&lt;/p&gt;

&lt;p&gt;Formal, machine-readable specs with explicit non-goals, hard constraints, and testable acceptance criteria prevent agents from filling ambiguity with hallucinated assumptions. Spec-driven development is not overhead. It is the structural response to a world where generating code is trivial and verifying it is expensive.&lt;/p&gt;
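&lt;p&gt;What "machine-readable" can mean at its simplest is sketched below; the field names and the example feature are invented for illustration. The point is that non-goals and acceptance criteria become data a pipeline can check for completeness, rather than prose an agent is free to reinterpret.&lt;/p&gt;

```python
# Minimal sketch of a machine-readable spec. All field names and the
# example feature are invented; the structure, not the schema, is the point.
from dataclasses import dataclass

@dataclass
class Spec:
    goal: str
    non_goals: list[str]         # explicitly out of scope
    hard_constraints: list[str]  # must never be violated
    acceptance: dict[str, str]   # criterion id -> testable statement

    def validate(self) -> list[str]:
        """Return the gaps that would let an agent fill ambiguity itself."""
        problems = []
        if not self.non_goals:
            problems.append("no explicit non-goals")
        if not self.acceptance:
            problems.append("no testable acceptance criteria")
        return problems

spec = Spec(
    goal="Add rate limiting to the public API",
    non_goals=["per-customer billing tiers"],
    hard_constraints=["no new external dependencies"],
    acceptance={"AC1": "returns HTTP 429 after 100 requests/minute per key"},
)
```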

&lt;p&gt;&lt;strong&gt;Watch for the junior developer trap.&lt;/strong&gt; When confronted with bugs in generated code, many junior developers treat the problem as a "prompt engineering issue" rather than a logic flaw. They tweak prompts repeatedly instead of reading the code.&lt;/p&gt;

&lt;p&gt;In this dynamic, the agent delivers the results, the developer takes the credit, and nobody builds real engineering skills. If you &lt;a href="https://dev.to/you-cannot-mandate-your-way-to-ai-adoption/"&gt;lead a team&lt;/a&gt;, you have a responsibility to ensure your junior engineers build foundations, not just prompting habits. Their long-term career depends on it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prepare for regulatory compliance now.&lt;/strong&gt; The EU AI Act's August 2026 enforcement date is not far away. If your agentic workflows touch task allocation or performance evaluation, you may already be in high-risk territory under Annex III.&lt;/p&gt;

&lt;p&gt;Even outside that scope, Article 12 requires continuous logging over the system's lifetime, and Article 14 requires human overseers who genuinely understand the system, not just approve its output.&lt;/p&gt;

&lt;p&gt;If your current workflow is "agent generates, junior approves, code ships," start asking whether that process would survive regulatory scrutiny. The organizations that treat governance as infrastructure rather than bureaucracy will be the ones that scale AI adoption sustainably.&lt;/p&gt;

&lt;h2&gt;
  
  
  The technology deserves better advocates
&lt;/h2&gt;

&lt;p&gt;The cognitive debt is real. The security surface expansion is real. The regulatory pressure is real. The skill atrophy is measurable and documented.&lt;/p&gt;

&lt;p&gt;None of this means we should stop using these tools.&lt;/p&gt;

&lt;p&gt;All of it means we should use them like engineers: with eyes open, with guardrails in place, and with the humility to admit what we do not yet fully understand.&lt;/p&gt;

&lt;p&gt;The enterprises that will thrive are those that explicitly instrument their workflows to prevent human cognition from atrophying. That treat the agent as a tool of the intellect rather than a replacement for it.&lt;/p&gt;

&lt;p&gt;The engineers who will thrive are those who master what the probabilistic agent inherently lacks: systemic architectural vision, contextual judgment, and the willingness to take responsibility for what ships.&lt;/p&gt;

&lt;p&gt;I am betting on AI-assisted development. And that bet means taking its risks seriously enough to contain them.&lt;/p&gt;

&lt;p&gt;Because the best thing you can do for a technology you believe in is to be honest about it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://pixari.dev/bullish-on-ai-coding-thats-why-i-take-the-risks-seriously/" rel="noopener noreferrer"&gt;pixari.dev&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>engineering</category>
      <category>leadership</category>
    </item>
    <item>
      <title>You Cannot Mandate Your Way to AI Adoption</title>
      <dc:creator>Raffaele Pizzari</dc:creator>
      <pubDate>Sun, 19 Apr 2026 12:34:18 +0000</pubDate>
      <link>https://dev.to/pixari/you-cannot-mandate-your-way-to-ai-adoption-5c92</link>
      <guid>https://dev.to/pixari/you-cannot-mandate-your-way-to-ai-adoption-5c92</guid>
      <description>&lt;p&gt;Most AI adoption strategies in engineering organizations are failing for one of three reasons: leadership mandates tool usage, tracks individual adoption rates, or does neither and hopes something changes.&lt;/p&gt;

&lt;p&gt;Each fails differently. Together, they explain most of the friction between executive expectations and engineering teams right now.&lt;/p&gt;

&lt;h2&gt;
  
  
  The polarization lives inside your organization
&lt;/h2&gt;

&lt;p&gt;I have written before about &lt;a href="https://dev.to/the-ai-echo-chamber-is-the-new-agile-industrial-complex/"&gt;the gap between AI discourse and AI reality&lt;/a&gt;. But there is a version of that gap that lives inside your organization, and it is more expensive than the one on LinkedIn.&lt;/p&gt;

&lt;p&gt;Executives — often with good reason — see AI tools demonstrating real velocity gains in controlled environments. They see competitors moving faster. They read the reports. They push for adoption.&lt;/p&gt;

&lt;p&gt;Engineers — with equally good reason — see AI-assisted pull requests failing review more often, debug time rising, and new categories of subtle bugs appearing in production. They know that the person professionally accountable for the code that ships is them, not the tool. The gains in the demos are real. So is the debugging cost that does not appear in the demos.&lt;/p&gt;

&lt;p&gt;Both observations are correct. The problem is structural: the benefits appear where executives measure, and the costs appear where engineers work. &lt;a href="https://dev.to/the-paradox-of-ai-acceleration-why-we-are-typing-faster-but-shipping-slower/"&gt;The data confirms this split&lt;/a&gt;. AI-assisted pull requests contain on average 1.7 times more issues than human-authored ones. Experienced developers on complex brownfield tasks took 19% longer with AI than without. Not because AI is useless, but because it shifts the bottleneck from writing to verifying, and verification is expensive.&lt;/p&gt;

&lt;p&gt;When those two realities meet in the same organization without a coherent strategy, you get polarization. And then you get one of three bad responses.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mandating adoption
&lt;/h2&gt;

&lt;p&gt;The most common response from leadership is the most destructive: set adoption targets, mandate specific tools, and track whether engineers are meeting the numbers.&lt;/p&gt;

&lt;p&gt;This fails for a reason that goes beyond morale. Developers know they own the code that ships. When you mandate a tool they distrust, you are asking them to stake their professional reputation on outputs they cannot fully verify. That is not resistance to change. That is a rational risk calculation.&lt;/p&gt;

&lt;p&gt;Boston Consulting Group has identified a ceiling for this dynamic. Only half of frontline employees effectively apply AI tools in practice when forced, because the tools are not integrated into how they actually work. Adoption numbers look acceptable on a dashboard. Actual behavior changes minimally.&lt;/p&gt;

&lt;p&gt;What mandates reliably produce: surface compliance, metric gaming, and resentment. The developers who would have experimented most productively — the senior engineers with the institutional knowledge to evaluate AI outputs critically — become the most resistant. They recognize the pattern.&lt;/p&gt;

&lt;p&gt;AI adoption happens because the tool is demonstrably useful to the person using it. That is not idealism. It is the only path that produces real behavior change.&lt;/p&gt;

&lt;h2&gt;
  
  
  Monitoring individual adoption
&lt;/h2&gt;

&lt;p&gt;The second response is subtler: do not mandate, but measure. Track adoption rates, count AI-assisted commits, monitor prompt volume per engineer. Use the data to understand who is using the tools.&lt;/p&gt;

&lt;p&gt;The intention is reasonable. The execution creates what researchers call "surveillance allergy." When AI usage becomes an individual performance signal, developers optimize for the metric instead of for the outcome. They accept AI suggestions they would otherwise reject. They avoid flagging AI-generated code they are uncertain about, because doing so creates a visible record of uncertainty.&lt;/p&gt;

&lt;p&gt;This is exactly the wrong direction. Good AI usage depends on engineers being critical evaluators of AI output. Surveillance incentivizes uncritical acceptance — which is what drives the code quality problems in the first place.&lt;/p&gt;

&lt;p&gt;The principle that fixes this: AI metrics should never feed into individual performance evaluations or compensation decisions. Communicate this explicitly, not just once. Measure at the system level instead. Adoption rates against change failure rates. AI-assisted PR percentages against incident volume. If quality drops as adoption rises, the process needs structural adjustment. That is a systemic diagnosis, not an individual one.&lt;/p&gt;
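&lt;p&gt;The system-level check described above can be sketched in a few lines; the input shape, the 50% adoption cutoff, and the tolerance are all invented for illustration. Note that the unit of analysis is the team, never the individual.&lt;/p&gt;

```python
# Sketch of system-level (never individual) measurement: for each team,
# compare change failure rate before and after adoption rose. The input
# shape and thresholds are invented for illustration.

def teams_needing_process_adjustment(metrics: dict[str, dict[str, float]],
                                     tolerance: float = 0.02) -> list[str]:
    """metrics maps team name to {"adoption", "cfr_before", "cfr_after"},
    each a 0..1 rate. Flag teams where adoption is high and change failure
    rate rose by more than `tolerance`: a systemic diagnosis, not a
    performance signal."""
    flagged = []
    for team, m in metrics.items():
        if m["adoption"] >= 0.5 and m["cfr_after"] - m["cfr_before"] > tolerance:
            flagged.append(team)
    return sorted(flagged)
```

&lt;p&gt;When a team is flagged, the output of the check feeds a process conversation, not a performance review.&lt;/p&gt;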

&lt;h2&gt;
  
  
  Doing nothing
&lt;/h2&gt;

&lt;p&gt;The third response is laissez-faire: no policy, no approved tools, no guidance. Let engineers figure it out.&lt;/p&gt;

&lt;p&gt;What this produces is shadow AI. Not because developers are reckless, but because they are solving real problems with the tools available to them, in the absence of anything better. It looks like individual productivity. It is actually unmanaged data risk.&lt;/p&gt;

&lt;p&gt;When engineers feed proprietary source code, internal architecture, or customer data into unvetted public LLMs, the organization loses control of its most sensitive assets without a trace in any audit log. The risk is not that AI exists. It is that unregulated AI multiplies data paths faster than security teams can map them. Fragmented adoption across hundreds of individual tool choices makes uniform governance impossible and ROI measurement meaningless.&lt;/p&gt;

&lt;p&gt;Shadow AI is a symptom of governance failure. The only remedy is providing a real alternative: a centralized platform of approved, enterprise-licensed tools with clear security boundaries, within which developers have genuine autonomy to choose what works for their workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  The identity dimension most strategies miss
&lt;/h2&gt;

&lt;p&gt;Underneath all of this is a human problem that most adoption playbooks do not name: the developer identity crisis.&lt;/p&gt;

&lt;p&gt;Senior engineers did not choose this profession to orchestrate AI. They chose it to build things. The satisfaction of tracking down a production bug, of optimizing a slow query until response times drop from seconds to milliseconds, of understanding a system at a level few others do — these are not peripheral to engineering identity. They are central to it.&lt;/p&gt;

&lt;p&gt;Annie Vella, a Distinguished Engineer and AI researcher at Westpac, found in her research that 77% of engineers report spending less time writing code. Her &lt;a href="https://annievella.com/posts/the-software-engineering-identity-crisis/" rel="noopener noreferrer"&gt;blog post on this&lt;/a&gt; went viral with over 65,000 views — not because it was controversial, but because it named something engineers had been carrying without language for it.&lt;/p&gt;

&lt;p&gt;The developers most valuable for AI adoption — the seniors with the contextual knowledge to catch what AI gets wrong — are the ones for whom the role shift is most disorienting. This is not a coincidence. Treating their skepticism as simple resistance misses the actual problem.&lt;/p&gt;

&lt;p&gt;The reframe that works: the craft does not disappear, it scales. What matters now is how code is architected, how robust it is, how testable it is, how secure it is. The ability to affect quality and outcomes without typing every line is still engineering — it is a more leveraged version of the same discipline. Making this case explicitly, and creating individual integration paths based on where each engineer derives meaning from their work, is more effective than any uniform rollout policy.&lt;/p&gt;

&lt;h2&gt;
  
  
  What actually works
&lt;/h2&gt;

&lt;p&gt;The organizations seeing durable AI adoption share a common structure.&lt;/p&gt;

&lt;p&gt;A centralized platform team evaluates, procures, and security-validates AI tools. They produce an approved toolkit — enterprise-licensed options — and developers choose within that toolkit. No single vendor mandate. But all outputs conform to the same architectural standards and review processes, regardless of which tool generated them. The AI adapts to the organization's conventions, not the reverse.&lt;/p&gt;

&lt;p&gt;Measurement is systemic. Adoption rates are tracked against change failure rates and incident volume at team and org level. When quality drops as adoption rises, the pace slows and governance catches up before continuing.&lt;/p&gt;

&lt;p&gt;Integration paths are individual. Senior engineers get roadmaps based on where AI genuinely reduces friction in their specific work. Junior engineers get AI literacy training — critical evaluation of outputs, system design fundamentals — before unrestricted tool access.&lt;/p&gt;

&lt;p&gt;The staged approach that works: start with low-risk work and no metric pressure. Let engineers discover what is genuinely useful. Then, once there is organic pull, remove the friction — documentation, environment setup, tooling integration — that slows everyday use.&lt;/p&gt;

&lt;h2&gt;
  
  
  The governance stakes
&lt;/h2&gt;

&lt;p&gt;One more thing worth naming directly: regulatory scrutiny of AI usage in software engineering is coming. In some sectors it is already here.&lt;/p&gt;

&lt;p&gt;The organizations with centralized platforms, audit trails, and systemic measurement will be able to answer the questions that compliance, legal, and regulators will ask. The organizations with fragmented, ungoverned shadow AI will not.&lt;/p&gt;

&lt;p&gt;Governance is not a constraint on AI adoption. Done correctly, it is the infrastructure that makes adoption sustainable. The organizations treating it as bureaucratic overhead will spend far more time explaining their data incidents than they saved by skipping the process.&lt;/p&gt;

&lt;p&gt;Build the governance first. The adoption follows.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://pixari.dev/you-cannot-mandate-your-way-to-ai-adoption/" rel="noopener noreferrer"&gt;pixari.dev&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>leadership</category>
      <category>engineering</category>
    </item>
    <item>
      <title>The Gap Between Estimated and Delivered</title>
      <dc:creator>Raffaele Pizzari</dc:creator>
      <pubDate>Wed, 08 Apr 2026 22:03:25 +0000</pubDate>
      <link>https://dev.to/pixari/stop-blaming-estimation-start-fixing-the-org-2a04</link>
      <guid>https://dev.to/pixari/stop-blaming-estimation-start-fixing-the-org-2a04</guid>
      <description>&lt;p&gt;Here's a pattern I've seen play out dozens of times. A team estimates a feature at 5 story points. Low complexity, clear requirements, well-understood domain. By every estimation framework, it's a small task.&lt;/p&gt;

&lt;p&gt;It ships three weeks later.&lt;/p&gt;

&lt;p&gt;The team gets blamed for bad estimation. Leadership pushes for better grooming, more detailed breakdowns, tighter story points. The team tries harder. Next sprint, the same thing happens.&lt;/p&gt;

&lt;p&gt;The estimate was never the problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  Estimation does what it's supposed to do
&lt;/h2&gt;

&lt;p&gt;Modern estimation frameworks are actually good at what they measure. The best ones decompose work into multiple dimensions: complexity (how hard is this to understand), effort (how much raw work), uncertainty (how many unknowns), and risk (what external factors could derail it). Each dimension gets scored, and the combination drives the story points.&lt;/p&gt;

&lt;p&gt;This works. A 5-point story really is a 5-point story. The team correctly assessed the complexity of the code, the effort required, the technical unknowns. They weren't wrong.&lt;/p&gt;

&lt;p&gt;The problem is that estimation measures the &lt;em&gt;work&lt;/em&gt;. It was never designed to measure the &lt;em&gt;environment the work has to travel through&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The gap has a name
&lt;/h2&gt;

&lt;p&gt;Between "estimated" and "delivered" sits everything the estimation framework doesn't capture. I call it &lt;strong&gt;org friction&lt;/strong&gt;: the invisible overhead that organizational structure, processes, and cross-team dependencies impose on every piece of work.&lt;/p&gt;

&lt;p&gt;That 5-point story took three weeks not because the team misjudged the complexity. It took three weeks because a schema change needed approval from another team. The design review sat in a queue for days. Security wanted to sign off because it touches user data. The one person who understood the legacy service was unavailable. And the engineer doing the work got two focused hours per day between meetings, incidents, and "quick questions."&lt;/p&gt;

&lt;p&gt;None of that is estimation error. It's organizational drag. And unlike technical debt, which at least gets mentioned in retros, org friction is untracked and unowned. It doesn't show up in Jira. Nobody has "reduce org friction" in their OKRs. But it's eating 30-40% of your team's capacity.&lt;/p&gt;

&lt;h2&gt;
  
  
  Seven sources of org friction
&lt;/h2&gt;

&lt;p&gt;After paying attention to this for a while, I see the same patterns everywhere.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cross-team dependencies.&lt;/strong&gt; The work itself takes hours. The waiting takes days. Waiting for the Platform team to review your PR. Waiting for Design to finalize a spec they promised last sprint. The estimate captured the work. Nobody captured the queue time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Process overhead.&lt;/strong&gt; Change management boards, architecture review committees, compliance gates. Each one was created for a good reason. None of them were ever removed when the reason went away. They accumulate like sediment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Knowledge silos.&lt;/strong&gt; That one engineer who understands the billing service. That one PM who knows the historical context behind a weird product decision. When they're unavailable, work stops. Not because it's technically blocked, but because nobody else has the context to make the call.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Legacy process debt.&lt;/strong&gt; This is different from technical debt. It's the outdated deployment process that requires manual steps. The testing pipeline that takes 45 minutes because nobody prioritized making it faster. The onboarding doc that hasn't been updated in two years. Not broken enough to fix, but slowing everything down.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Decision latency.&lt;/strong&gt; No formal gate, just nobody with clear authority to make the call. Or the person who does keeps deferring. The feature is technically unblocked, but the team is waiting for someone to say "yes, build it this way." Estimates assume decisions happen instantly. They never do.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context switching tax.&lt;/strong&gt; The estimate assumed focused time. Reality: support rotations, incident responses, Slack threads from other teams, syncs that could have been async. The work takes three days of focused effort, but your engineers get two focused hours per day. The calendar is the friction nobody accounts for.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Misaligned incentives.&lt;/strong&gt; Team A is measured on shipping features. Team B is measured on system stability. Team A needs Team B to deploy a breaking change. Team B has zero motivation to help. This isn't a technical problem. It's an organizational design problem. And it shows up as a "missed estimate" on Team A's board.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stop improving estimation. Reduce the drag.
&lt;/h2&gt;

&lt;p&gt;When teams consistently miss delivery targets, the default response is to push for better estimation. More grooming sessions. More detailed breakdowns. More precise story points.&lt;/p&gt;

&lt;p&gt;This misses the point entirely.&lt;/p&gt;

&lt;p&gt;The estimates are fine. A 5-point story is still a 5-point story. What changed between "estimated" and "delivered" wasn't the complexity of the work. It was the friction the work encountered on its way to production. Making teams better at predicting friction doesn't reduce it. It just makes everyone more accurately pessimistic.&lt;/p&gt;

&lt;p&gt;The real lever is reducing the drag. And that's a leadership problem, not an engineering one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Run a Friction Log
&lt;/h2&gt;

&lt;p&gt;Here's the most useful thing I've done as an EM to make org friction visible.&lt;/p&gt;

&lt;p&gt;For one sprint, ask your team to keep a shared doc. Call it a &lt;strong&gt;Friction Log&lt;/strong&gt;. The rules are simple: every time someone is blocked, delayed, or slowed down by something that isn't the code itself, they log it. One line. What happened, how long they waited, which friction source it was.&lt;/p&gt;

&lt;p&gt;No analysis during the sprint. Just logging. Keep it low effort.&lt;/p&gt;
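&lt;p&gt;Aggregating the log afterwards can be equally low effort. The sketch below assumes an invented one-line format — friction source, hours waited, and a short note, separated by pipes; any format works as long as the source and the wait are recoverable.&lt;/p&gt;

```python
# Sketch of aggregating a Friction Log. The one-line entry format
# ("source | hours-waited | what happened") is invented for illustration.
from collections import defaultdict

def top_friction_sources(log_lines: list[str]) -> list[tuple[str, float]]:
    """Total hours lost per friction source, biggest first:
    the 'pick your battles' view of the sprint."""
    totals: dict[str, float] = defaultdict(float)
    for line in log_lines:
        source, hours, _note = (part.strip() for part in line.split("|", 2))
        totals[source] += float(hours)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```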

&lt;p&gt;After two weeks, read it together. What you'll see is a document that's almost impossible to argue with. Not opinions about process. Not complaints. Just a factual record of where time went that had nothing to do with the work your team estimated.&lt;/p&gt;

&lt;p&gt;The first time I ran one, the log showed that 40% of the sprint's elapsed time was spent waiting on things outside the team's control. Cross-team reviews, decision latency, a compliance gate that took four days for a one-line config change. The estimates had been accurate. The org had been expensive.&lt;/p&gt;

&lt;p&gt;A Friction Log does three things:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It separates estimation from delivery.&lt;/strong&gt; You can see that the 5-point story really was a 5-point story. The extra two weeks was org friction, not estimation error. Once you can see the gap, you can talk about it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It gives you data, not complaints.&lt;/strong&gt; "Our estimates are always off" gets you a shrug. "Here's a log showing we spent 11 days this sprint waiting on cross-team reviews" gets you a conversation. Leaders respond to patterns with numbers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;It picks your battles for you.&lt;/strong&gt; After one sprint, the top friction source is obvious. You don't have to fix everything. Pick the one that shows up the most and work on reducing it for the next quarter. Maybe it's getting embedded in the platform team's sprint review so your PRs don't sit in a queue. Maybe it's documenting the legacy service so you're not dependent on one person. Small, compounding improvements.&lt;/p&gt;

&lt;h2&gt;
  
  
  Nobody designed it this way
&lt;/h2&gt;

&lt;p&gt;Nobody designed org friction into your company. It accumulated. One reasonable process at a time. One well-intentioned approval gate at a time. One team boundary at a time. Each decision made sense in isolation. Together, they created an invisible tax on every piece of work your team ships.&lt;/p&gt;

&lt;p&gt;Your estimates aren't the problem. Your org is just more expensive than anyone's willing to admit.&lt;/p&gt;

&lt;p&gt;The question isn't whether your team can estimate better. It's whether you, as a leader, are willing to name the drag and do something about it. Because the gap between "estimated" and "delivered" isn't a measurement error. It's a leadership opportunity.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://pixari.dev/the-missing-dimension-in-software-estimation/" rel="noopener noreferrer"&gt;pixari.dev&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>leadership</category>
      <category>engineering</category>
    </item>
    <item>
      <title>The Engineering Manager Role Is Mutating</title>
      <dc:creator>Raffaele Pizzari</dc:creator>
      <pubDate>Mon, 06 Apr 2026 07:44:34 +0000</pubDate>
      <link>https://dev.to/pixari/the-engineering-manager-role-is-getting-weird-4h6c</link>
      <guid>https://dev.to/pixari/the-engineering-manager-role-is-getting-weird-4h6c</guid>
      <description>&lt;p&gt;I read Gregor Ojstersek's piece the other day. &lt;a href="https://newsletter.eng-leadership.com/p/would-i-still-go-the-engineering-manager-route-in-2026" rel="noopener noreferrer"&gt;"Would I Still Go The Engineering Manager Route in 2026?"&lt;/a&gt;. And it hit me in a strange way. Not because of the answer but because of the question. The fact that someone who's been doing this for years is publicly asking whether he'd do it again says something.&lt;/p&gt;

&lt;p&gt;I've been in this role for almost three years. And I'd be lying if I said I hadn't asked myself the same thing.&lt;/p&gt;

&lt;p&gt;But here's the thing. I've done this loop before.&lt;/p&gt;

&lt;p&gt;Before I was an EM, I was an IC. Before I was an IC, I ran my own web agency for eight years. I was the CEO. I did sales, code, hiring, client management, architecture, marketing. Full ownership, full autonomy. And then I deliberately walked away from that, went back to building as an individual contributor, and worked my way through Lead and Principal before choosing management again.&lt;/p&gt;

&lt;p&gt;I chose this role knowing what I was giving up. I'd already tasted the other side. Twice, actually.&lt;/p&gt;

&lt;p&gt;That context shapes how I see what's happening to the EM role right now.&lt;/p&gt;

&lt;h2&gt;
  
  
  The job changed
&lt;/h2&gt;

&lt;p&gt;When I moved into management, the deal was roughly this: you stop writing code full-time, you spend your days on people, process, and delivery. You run 1:1s. You shield the team from organizational noise. You hire, you coach, you make sure things ship. In return, you get a seat at the table and the satisfaction of watching people grow.&lt;/p&gt;

&lt;p&gt;That deal still technically exists. But the fine print keeps growing.&lt;/p&gt;

&lt;p&gt;Somewhere along the way, the expectations started stacking. I've seen it in my career, I've seen it in job postings, I've heard it from every EM I talk to. Be technical enough to review architecture. Be strategic enough to present to leadership. Be empathetic enough to handle burnout and conflict. Own delivery metrics. Own hiring pipelines. Own the roadmap conversation with product. And increasingly: pick up some coding too.&lt;/p&gt;

&lt;p&gt;I don't think anyone designed it this way. It just accumulated. Browse any EM job listing from 2026 and count the responsibilities. Technical leadership, people management, delivery ownership, hiring, strategy, stakeholder alignment. Each one makes sense on its own. But the list keeps getting longer and nothing ever falls off.&lt;/p&gt;

&lt;h2&gt;
  
  
  I've seen this movie before
&lt;/h2&gt;

&lt;p&gt;When I was running my agency, mobile was going to change everything. We had to rethink our entire business. Some agencies died. We adapted.&lt;/p&gt;

&lt;p&gt;When I came back to IC, cloud and microservices were going to change how teams work. Some companies over-rotated, split everything into tiny services, and spent years untangling the mess. The ones who kept their heads did fine.&lt;/p&gt;

&lt;p&gt;Now it's AI.&lt;/p&gt;

&lt;p&gt;I've watched three cycles where a technology was supposed to make a role obsolete or fundamentally different. And the pattern is always the same: the technology matters, it does change things, but it doesn't change the things people think it will.&lt;/p&gt;

&lt;p&gt;Mobile didn't kill web agencies. It made them busier. Cloud didn't eliminate ops. It renamed them. And AI won't replace engineering managers. But it is changing the math around the role in ways that are worth paying attention to.&lt;/p&gt;

&lt;h2&gt;
  
  
  What AI actually changes
&lt;/h2&gt;

&lt;p&gt;AI doesn't build trust across a team over months. AI doesn't read the room in a planning meeting and realize the real problem isn't the estimate, it's that people don't believe in the project. AI doesn't tell a PM that the deadline is unrealistic and hold the line.&lt;/p&gt;

&lt;p&gt;But AI does make small teams more productive. And when small teams are more productive, companies start asking why they need so many managers. The ratio shifts. Where you had one EM for five or six engineers, now it's eight. Ten. Twelve. The scope grew but the support didn't.&lt;/p&gt;

&lt;p&gt;That's the actual pressure. Not "AI replaces managers." Just "we need fewer of them, and the ones we keep do more."&lt;/p&gt;

&lt;p&gt;Combine that with the IC career track getting better, real Staff and Principal roles with real compensation, and you've got a situation where the best engineers don't need management to advance. The people who &lt;em&gt;want&lt;/em&gt; to be EMs used to be the best engineers who also cared about people. Now some of them are choosing Staff instead. I don't blame them.&lt;/p&gt;

&lt;h2&gt;
  
  
  The player-coach myth
&lt;/h2&gt;

&lt;p&gt;I keep hearing "player-coach" as if it's aspirational. Like we should all want to be the EM who also ships features.&lt;/p&gt;

&lt;p&gt;I've been a player-coach. It means you do both jobs badly. You context-switch between a PR review and a difficult conversation about someone's performance. You write code in the gaps between meetings, which means you write code in 25-minute windows, which means you write bad code. Or you stay up late to get the focused time, which means you burn out faster.&lt;/p&gt;

&lt;p&gt;The industry uses "player-coach" like it's a compliment. It's usually a budget decision disguised as a philosophy. Someone needed to cut a headcount and decided the EM could absorb the work.&lt;/p&gt;

&lt;p&gt;I'm not saying it can't work. In early-stage startups, in small teams, when the scope is tight, sure. But in a 40-person org with multiple squads? If your EM is regularly shipping features, something is wrong with your staffing, not right with your EM.&lt;/p&gt;

&lt;h2&gt;
  
  
  The hard problems don't change
&lt;/h2&gt;

&lt;p&gt;Here's what I keep coming back to, across all the cycles I've lived through. The technology changes. The hard problems don't.&lt;/p&gt;

&lt;p&gt;Getting people aligned on what matters. Making decisions with incomplete information. Knowing when to push and when to protect the team. Helping someone grow when they don't see their own potential yet. Having the conversation everyone's avoiding.&lt;/p&gt;

&lt;p&gt;Those problems looked the same when I was running my company in 2015. They look the same now. AI didn't create them and AI won't solve them. They're human problems, and the EM role exists because someone needs to own them.&lt;/p&gt;

&lt;p&gt;That's why I'm not worried about the role dying. I've watched enough cycles to know it won't. But it will mutate. It always does. And the EMs who get left behind won't be the ones who ignored AI. They'll be the ones who forgot that the human stuff is the actual job.&lt;/p&gt;

&lt;h2&gt;
  
  
  It's one title, but it's not one job
&lt;/h2&gt;

&lt;p&gt;I think a lot of the confusion comes from a naming problem. We all call ourselves "Engineering Manager" but we're doing wildly different jobs depending on the company, the stage, and the org.&lt;/p&gt;

&lt;p&gt;I've talked to EMs who spend 80% of their time on architecture and code review. I've talked to EMs who haven't opened an IDE in two years and spend their days on coaching, hiring, and cross-team alignment. I've talked to EMs who are basically program managers with a different title, owning delivery timelines and stakeholder updates. All of them have the same title on LinkedIn.&lt;/p&gt;

&lt;p&gt;This matters because most of the frustration I hear from EMs isn't really about the role. It's about the mismatch. You took the job expecting to be a people leader and you ended up being a delivery lead. Or you wanted to stay close to the technical decisions and instead you're spending your weeks in stakeholder meetings. The role didn't let you down. The expectations were just never made explicit.&lt;/p&gt;

&lt;p&gt;If you're an EM and something feels off, before you question whether you want to be a manager at all, try a simpler question first: which version of this job is your company actually asking you to do? And is that the version you want?&lt;/p&gt;

&lt;p&gt;Sometimes the answer is "I'm doing the wrong version of EM at the wrong company." That's a very different problem than "I don't want to be an EM anymore," and it has a very different solution.&lt;/p&gt;

&lt;p&gt;If you're hiring an EM or about to become one: have that conversation early. Don't just talk about the team size and the tech stack. Talk about what the job actually looks like on a Tuesday. Where does this EM spend most of their time? What does success look like in six months? The clearer that picture is, the fewer EMs will burn out wondering why the job doesn't feel like what they signed up for.&lt;/p&gt;

&lt;h2&gt;
  
  
  The feedback loop problem
&lt;/h2&gt;

&lt;p&gt;Something I've noticed talking to other EMs. A lot of us still build things on the side. Side projects, open source, weekend hacks. And when you ask why, the answer is almost never "to stay technical." It's because the feedback loop is different. You write code, you see it work, you feel something. It's immediate.&lt;/p&gt;

&lt;p&gt;Management has its own rewards. Watching someone you coached nail a presentation. Seeing a team you built ship something complex without drama. Those moments are real. But they're slow. They happen over months. The feedback loop in management is measured in quarters, not commits. You have to learn to find satisfaction in that pace, and some weeks it's easier than others.&lt;/p&gt;

&lt;h2&gt;
  
  
  So would I do it again?
&lt;/h2&gt;

&lt;p&gt;Yes. Without hesitation. I already did it twice.&lt;/p&gt;

&lt;p&gt;I walked away from running my own company. I went back to being an IC. I had the full picture, the autonomy, the ownership, and I chose to come back to building first and then to management. Not because I had to. Because I'd seen enough of both sides to know which problems I actually wanted to spend my days on.&lt;/p&gt;

&lt;p&gt;I love this job. The role is broader and harder than when I started. The industry hasn't settled on what it wants EMs to be. But that ambiguity is part of what makes it interesting. You get to shape it.&lt;/p&gt;

&lt;p&gt;If you're an IC thinking about management: go in with your eyes open. The role is worth doing. It's also messier and more ambiguous than it looks from the outside. Talk to EMs. Ask them what their actual week looks like, not their LinkedIn version of it.&lt;/p&gt;

&lt;p&gt;And if you're already an EM: the role is changing fast. It always has been. The technology driving the change is new but the pattern isn't. The EMs who'll thrive are the ones who can sit with the ambiguity and shape the role instead of waiting for someone to define it for them. That's always been the interesting part of this job anyway.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://pixari.dev/the-engineering-manager-role-is-getting-weird/" rel="noopener noreferrer"&gt;pixari.dev&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>leadership</category>
      <category>engineering</category>
      <category>ai</category>
    </item>
    <item>
      <title>I Built a Game That Teaches Git by Making You Type Real Commands</title>
      <dc:creator>Raffaele Pizzari</dc:creator>
      <pubDate>Thu, 02 Apr 2026 00:00:39 +0000</pubDate>
      <link>https://dev.to/pixari/i-built-a-game-that-teaches-git-by-making-you-type-real-commands-495h</link>
      <guid>https://dev.to/pixari/i-built-a-game-that-teaches-git-by-making-you-type-real-commands-495h</guid>
      <description>&lt;p&gt;I work in IT, and there's one scene I keep witnessing. A developer joins the team, they're sharp, they ship features, they write clean code. And then someone asks them to rebase, and you can see the panic set in.&lt;/p&gt;

&lt;p&gt;It's not their fault. Git is taught badly.&lt;/p&gt;

&lt;p&gt;Every git tutorial I've ever seen follows the same formula: here's a diagram of branches, here's a table of commands, now go practice on your own repo and try not to destroy anything. It's like learning to drive by reading the car manual. Technically accurate. Practically useless.&lt;/p&gt;

&lt;p&gt;I've watched junior developers memorize &lt;code&gt;git add . &amp;amp;&amp;amp; git commit -m "fix" &amp;amp;&amp;amp; git push&lt;/code&gt; like an incantation, terrified to deviate because the one time they tried &lt;code&gt;git rebase&lt;/code&gt; they ended up in a state that required a senior engineer and 45 minutes of &lt;code&gt;git reflog&lt;/code&gt; to unravel.&lt;/p&gt;

&lt;p&gt;And I've watched senior developers, people with a decade of experience, avoid &lt;code&gt;git bisect&lt;/code&gt; entirely because nobody ever showed them what it actually does in a safe environment.&lt;/p&gt;

&lt;p&gt;So I built one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gitvana
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://gitvana.pixari.dev" rel="noopener noreferrer"&gt;Gitvana&lt;/a&gt; is a browser game. You play a monk climbing toward "git enlightenment" at the Monastery of Version Control. There's a Head Monk who assigns you tasks, a judgmental cat, and pixel art that looks like it belongs on a Game Boy.&lt;/p&gt;

&lt;p&gt;But underneath the retro charm, there's a real git engine. When you type &lt;code&gt;git init&lt;/code&gt; in the terminal, it runs &lt;code&gt;git init&lt;/code&gt;. When you type &lt;code&gt;git commit&lt;/code&gt;, it creates an actual commit in an actual repository. The repository lives in your browser, powered by &lt;a href="https://isomorphic-git.org/" rel="noopener noreferrer"&gt;isomorphic-git&lt;/a&gt; and an in-memory filesystem, but it's real. Every command, every SHA, every ref.&lt;/p&gt;

&lt;p&gt;35 levels. 6 acts. 21 git commands. From &lt;code&gt;git init&lt;/code&gt; to &lt;code&gt;git bisect&lt;/code&gt;. No slides, no diagrams, no hand-holding. Just you and a terminal.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Play it at &lt;a href="https://gitvana.pixari.dev" rel="noopener noreferrer"&gt;gitvana.pixari.dev&lt;/a&gt;.&lt;/strong&gt; It's free, it works offline, and it doesn't want your email.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why a Game
&lt;/h2&gt;

&lt;p&gt;I could have written another tutorial. I could have built a sandbox. But I've been thinking a lot about how people actually learn, and the answer isn't "reading."&lt;/p&gt;

&lt;p&gt;People learn by doing things that are slightly too hard, failing, figuring out why, and trying again. That's what games are. They're structured failure environments with feedback loops.&lt;/p&gt;

&lt;p&gt;Every level in Gitvana has a target state, a set of conditions that the git repository must satisfy. "There must be exactly 3 commits on main." "The branch &lt;code&gt;feature&lt;/code&gt; must be deleted." "The file &lt;code&gt;config.yml&lt;/code&gt; must not contain the API key in any commit." The game validates these conditions in real time as you type commands. You see the checklist turn green, one objective at a time.&lt;/p&gt;

&lt;p&gt;This isn't gamification bolted onto a tutorial. The game &lt;em&gt;is&lt;/em&gt; the learning.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Journey: 6 Acts
&lt;/h2&gt;

&lt;p&gt;The structure mirrors how a developer actually encounters git:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Act 1: Awakening&lt;/strong&gt; — The basics. &lt;code&gt;init&lt;/code&gt;, &lt;code&gt;add&lt;/code&gt;, &lt;code&gt;commit&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;, &lt;code&gt;log&lt;/code&gt;, &lt;code&gt;diff&lt;/code&gt;. You're a new monk. The Head Monk is patient. The cat is skeptical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Act 2: The Middle Path&lt;/strong&gt; — Branching, merging, &lt;code&gt;cherry-pick&lt;/code&gt;, &lt;code&gt;revert&lt;/code&gt;, &lt;code&gt;stash&lt;/code&gt;. Things start getting interesting. You begin to understand that git isn't a linear timeline, it's a tree.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Act 3: Rewriting Reality&lt;/strong&gt; — &lt;code&gt;rebase&lt;/code&gt;, &lt;code&gt;amend&lt;/code&gt;, squashing commits, purging secrets from history. This is where most developers tap out in real life. In Gitvana, you can't tap out. The monastery doors are locked.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Act 4: The Safety Net&lt;/strong&gt; — &lt;code&gt;reflog&lt;/code&gt;, &lt;code&gt;blame&lt;/code&gt;, &lt;code&gt;bisect&lt;/code&gt;, disaster recovery. The levels where you learn that git never truly forgets, and that &lt;code&gt;reflog&lt;/code&gt; is the "undo" button nobody told you about.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Act 5: Advanced Techniques&lt;/strong&gt; — Surgical staging, dependency chains, the operations that separate "uses git" from "understands git."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Act 6: Gitvana&lt;/strong&gt; — The final trial.&lt;/p&gt;

&lt;p&gt;Each act introduces new commands gradually, with in-game documentation you can pull up without leaving the terminal.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Tech
&lt;/h2&gt;

&lt;p&gt;The stack is deliberately minimal. Svelte 5 for the UI, xterm.js for the terminal, isomorphic-git for the git engine, and lightning-fs for the in-memory filesystem. No backend. No database. No accounts. Everything runs in your browser and your progress saves to localStorage.&lt;/p&gt;

&lt;p&gt;The interesting engineering problems were all in the details:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rebase&lt;/strong&gt; was the hardest command to implement. The real &lt;code&gt;git rebase&lt;/code&gt; is a multi-step, stateful operation. It collects commits, replays them one by one, and can pause mid-way for conflict resolution. I had to build a state machine that saves rebase progress to &lt;code&gt;.git/rebase-merge/&lt;/code&gt;, handles &lt;code&gt;--continue&lt;/code&gt; and &lt;code&gt;--abort&lt;/code&gt;, and writes proper conflict markers when files clash.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bisect&lt;/strong&gt; maintains its own state files in &lt;code&gt;.git/&lt;/code&gt;, just like real git. It performs an actual binary search across commits to find where a bug was introduced. In one level, you have to find which commit broke a test by using &lt;code&gt;git bisect start&lt;/code&gt;, marking commits as good or bad, and letting the algorithm narrow it down.&lt;/p&gt;
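&lt;p&gt;The search itself is classic bisection over the commit list. As a rough sketch (not Gitvana's actual code; &lt;code&gt;history&lt;/code&gt; and &lt;code&gt;isBad&lt;/code&gt; are illustrative names, and in real &lt;code&gt;git bisect&lt;/code&gt; you supply the good/bad verdicts by hand):&lt;/p&gt;

```javascript
// Toy version of the binary search behind `git bisect`.
// `history` is ordered oldest-to-newest; `isBad(commit)` reports whether
// the bug is present at that commit.
function bisect(history, isBad) {
  let good = 0;                    // oldest commit, assumed good
  let bad = history.length - 1;    // newest commit, known bad
  while (bad - good > 1) {
    const mid = Math.floor((good + bad) / 2);
    if (isBad(history[mid])) {
      bad = mid;                   // bug present here: first bad commit is at or before mid
    } else {
      good = mid;                  // still good: first bad commit is after mid
    }
  }
  return history[bad];             // the first bad commit
}
```

&lt;p&gt;Each verdict halves the remaining range, which is why bisect can pin down the culprit in a thousand-commit history in about ten steps.&lt;/p&gt;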

&lt;p&gt;&lt;strong&gt;The blame algorithm&lt;/strong&gt; walks the entire commit history, builds a content-at-commit map, and attributes each line to the oldest commit where it appeared unchanged. It's not efficient. It doesn't need to be, these repos are tiny. But it's correct.&lt;/p&gt;
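&lt;p&gt;In spirit, the attribution loop looks something like this. A deliberately naive sketch, not the real implementation: it ignores duplicate lines and line positions, which a correct blame must handle.&lt;/p&gt;

```javascript
// Blame each line of HEAD on the oldest commit from which that line
// has been present, unchanged, all the way to the newest snapshot.
// `history` is ordered oldest-to-newest: [{ id, content: [lines] }, ...]
function blame(history) {
  const head = history[history.length - 1];
  return head.content.map((line) => {
    let blamed = head.id;
    // Walk backwards while the older snapshot still contains the line.
    for (let i = history.length - 2; i >= 0; i--) {
      if (!history[i].content.includes(line)) break;
      blamed = history[i].id;
    }
    return { line: line, commit: blamed };
  });
}
```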

&lt;p&gt;&lt;strong&gt;The level validator&lt;/strong&gt; checks twelve types of conditions in real time, among them file existence, file content, branch existence, HEAD position, commit count, commit message patterns, merge commits, conflict state, staging area state, and tag existence. Every keystroke can potentially satisfy an objective, and the UI updates instantly.&lt;/p&gt;
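&lt;p&gt;The shape of such a validator is simple: a table of named predicates plus a list of per-level objectives, re-evaluated against a repository snapshot after every command. A hedged sketch, with condition names and snapshot shape invented for illustration rather than taken from Gitvana's actual schema:&lt;/p&gt;

```javascript
// Each check is a pure predicate over a repo snapshot.
const checks = {
  commitCount:   (repo, n)    => repo.commits.length === n,
  branchExists:  (repo, name) => repo.branches.includes(name),
  branchDeleted: (repo, name) => !repo.branches.includes(name),
  headAt:        (repo, ref)  => repo.head === ref,
};

// Re-run after every command; the UI turns each objective green when done.
function validate(repo, objectives) {
  return objectives.map((o) => ({
    label: o.label,
    done: checks[o.type](repo, o.arg),
  }));
}
```

&lt;p&gt;Because every predicate is pure and cheap on a tiny in-browser repo, re-running the whole list on each keystroke stays instant.&lt;/p&gt;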

&lt;p&gt;&lt;strong&gt;Sound effects&lt;/strong&gt; are procedurally generated with the Web Audio API. No audio files. Just oscillators, frequency envelopes, and square waves. Every &lt;code&gt;commit&lt;/code&gt; gets a satisfying chiptune beep. Every merge conflict gets an ominous buzz.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Learned Building It
&lt;/h2&gt;

&lt;p&gt;Building an educational game taught me more about git than 15 years of using it.&lt;/p&gt;

&lt;p&gt;I had to read the git internals documentation to implement commands correctly. I discovered that &lt;code&gt;git stash&lt;/code&gt; is essentially syntactic sugar over a specific commit-and-reset workflow. I learned that the reflog is just a flat file of HEAD movements. I finally understood, at the implementation level, why a detached HEAD happens and what it actually means in terms of refs.&lt;/p&gt;

&lt;p&gt;There's a difference between using a tool and understanding it deeply enough to rebuild it. This project forced the second.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Pixel Art Problem
&lt;/h2&gt;

&lt;p&gt;I can't draw. At all. My artistic ability peaked at stick figures in 1993. But I wanted Gitvana to have a specific aesthetic: 16-bit monastery vibes, cherry blossoms, monks in robes, a cat that judges your commits.&lt;/p&gt;

&lt;p&gt;I used &lt;a href="https://www.pixellab.ai/" rel="noopener noreferrer"&gt;PixelLab&lt;/a&gt; to generate the sprites. I'd describe what I wanted: "pixel art monk in grey robes, standing, 64x64, side view, retro game style" and iterate until it felt right. The landing page monastery, the mountain progression map, the four monk tiers (grey, brown, blue, golden) were all generated this way.&lt;/p&gt;

&lt;p&gt;It's not hand-crafted pixel art. But it has soul. And it's consistent, which matters more than perfection when you're a solo developer trying to ship something.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why It's Free
&lt;/h2&gt;

&lt;p&gt;Because I built it for fun. That's the honest answer.&lt;/p&gt;

&lt;p&gt;I had a problem: I wanted to understand git at the implementation level, not just the "copy this command from Stack Overflow" level. Building a game that teaches it forced me to actually learn it. Selfish motivation, great side effect.&lt;/p&gt;

&lt;p&gt;And maybe other people have the same problem. Maybe there's a developer out there who's been using git for five years and still gets nervous when someone says "rebase." If Gitvana helps them, great. If not, I still had a blast building it.&lt;/p&gt;

&lt;p&gt;There's no paywall, no signup, no "premium" tier. The source code is on &lt;a href="https://github.com/pixari/gitvana" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;. It's MIT licensed. Fork it, improve it, translate it, add levels.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://gitvana.pixari.dev" rel="noopener noreferrer"&gt;gitvana.pixari.dev&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;35 levels. Real terminal. Real git. Zero setup.&lt;/p&gt;

&lt;p&gt;Start at Act 1. Get to Gitvana. The cat is waiting.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://pixari.dev/i-built-a-game-that-teaches-git-by-making-you-type-real-commands/" rel="noopener noreferrer"&gt;pixari.dev&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>engineering</category>
      <category>personal</category>
      <category>ai</category>
    </item>
    <item>
      <title>Why Your Team's Docs Are a Strategic Asset (Not an Afterthought)</title>
      <dc:creator>Raffaele Pizzari</dc:creator>
      <pubDate>Wed, 01 Apr 2026 22:34:43 +0000</pubDate>
      <link>https://dev.to/pixari/why-your-teams-docs-are-a-strategic-asset-not-an-afterthought-3ale</link>
      <guid>https://dev.to/pixari/why-your-teams-docs-are-a-strategic-asset-not-an-afterthought-3ale</guid>
      <description>&lt;p&gt;Good documentation is more than just a chore; it's a strategic asset that can transform how your engineering team operates.&lt;/p&gt;

&lt;p&gt;Far too often, documentation is viewed as an afterthought, something to be done only when absolutely necessary, or worse, not at all.&lt;/p&gt;

&lt;p&gt;But what if we reframed our perspective?&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;The Hidden Power of Good Documentation&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Think of clear, concise documentation not as a task, but as an act of professional kindness.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Kindness to your future self:&lt;/strong&gt; We've all been there, staring at old code or a system we designed months ago, wondering about a specific decision or implementation detail. Good documentation acts as a reliable memory, saving you hours of head-scratching and reverse-engineering your own work.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Kindness to your teammates:&lt;/strong&gt; Onboarding new team members can be a time-intensive process. Comprehensive documentation allows them to get up to speed faster, understand existing systems, and contribute effectively without constantly interrupting others for explanations. It empowers them to find answers independently, fostering a more autonomous and efficient team.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Beyond the Basics: Why Documentation is a High-Leverage Activity&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;When documentation is prioritized and well-executed, it significantly reduces friction within a team. It clarifies processes, defines responsibilities, and provides a single source of truth for critical information. This, in turn, increases individual and team autonomy. Engineers can make informed decisions and solve problems without constant oversight, leading to faster development cycles and fewer roadblocks.&lt;/p&gt;

&lt;p&gt;Ultimately, robust documentation forms the bedrock of a maintainable system. It ensures that knowledge isn't siloed in individual minds but is instead shared and accessible, making systems more resilient and easier to evolve. In the grand scheme of engineering activities, creating good documentation is one of the highest-leverage tasks you can perform, yielding disproportionate returns in team efficiency, system stability, and overall project success.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Making Documentation an Asset, Not an Afterthought&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;So, how can you shift your team's mindset and practice to make documentation an asset? It starts with recognizing its value and integrating it into your workflow, rather than relegating it to an optional, last-minute item.&lt;/p&gt;

&lt;p&gt;What are your team's biggest struggles when it comes to documentation, and what strategies have you found most effective in overcoming them?&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://pixari.dev/why-your-teams-docs-are-a-strategic-asset-not-an-afterthought/" rel="noopener noreferrer"&gt;pixari.dev&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>engineering</category>
      <category>leadership</category>
    </item>
    <item>
      <title>Why Empathy is a Hard Skill</title>
      <dc:creator>Raffaele Pizzari</dc:creator>
      <pubDate>Wed, 01 Apr 2026 22:34:13 +0000</pubDate>
      <link>https://dev.to/pixari/why-empathy-is-a-hard-skill-a2m</link>
      <guid>https://dev.to/pixari/why-empathy-is-a-hard-skill-a2m</guid>
      <description>&lt;p&gt;The modern workplace is full of buzzwords, and few are as overused as "empathy." We hear it in every leadership seminar and read about it in every management book. But what does it truly mean to lead with empathy, and how does it translate from a lofty concept into a practical, day-to-day skill.&lt;/p&gt;

&lt;p&gt;For many years, I believed that leadership was about competence and results. My focus was on the numbers, the deadlines, and the outputs. I challenged my teams directly, but I often missed the "caring personally" part of the equation. It felt soft, almost like a distraction from the real work. What I eventually learned, often the hard way, is that empathy isn't a distraction; it’s a prerequisite for durable, high-performing teams.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem with Overly Direct Feedback
&lt;/h2&gt;

&lt;p&gt;Many of us were raised on the idea that direct, sometimes harsh, feedback is the only way to drive performance. While challenging people directly is crucial for growth, when it’s delivered without a foundation of empathy, it often lands as criticism. This creates a defensive environment where people shut down, stop taking risks, and fear making mistakes.&lt;/p&gt;

&lt;p&gt;This is where the principles of Kim Scott’s "Radical Candor" became a powerful compass for me. The idea is simple: &lt;strong&gt;care personally and challenge directly.&lt;/strong&gt; The two parts are not independent; they form a symbiotic relationship. You can't have one without the other and expect to build a team that thrives. The "caring personally" part is the empathy. It's the engine that makes the "challenging directly" part effective.&lt;/p&gt;

&lt;h2&gt;
  
  
  Empathy isn't Weakness, It's Leverage
&lt;/h2&gt;

&lt;p&gt;Leading with empathy isn't about being everyone's friend or avoiding tough conversations. It's about taking the time to truly &lt;strong&gt;see and hear&lt;/strong&gt; the people on your team. It means understanding their professional and personal aspirations, recognizing their struggles, and genuinely celebrating their wins. It’s about creating an environment where they feel safe enough to be vulnerable.&lt;/p&gt;

&lt;p&gt;When you invest in that foundation of psychological safety, challenging feedback becomes a gift, not a threat. Your team knows that your feedback comes from a place of support for their growth, not a place of judgment. This is the ultimate form of leverage. You can demand excellence and push boundaries because your people trust that you are pushing them for their own benefit.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical Ways to Apply Empathy as a Leader
&lt;/h2&gt;

&lt;p&gt;So, how do you make empathy a tangible part of your leadership toolkit? It starts with small, consistent actions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Listen Actively:&lt;/strong&gt; Put away your phone and give your full attention. Ask open-ended questions and listen to understand, not just to respond.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Show Vulnerability:&lt;/strong&gt; Acknowledge your own mistakes. It shows your team that imperfection is normal and creates a safe space for them to do the same.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Recognize Effort, Not Just Results:&lt;/strong&gt; Celebrate the hard work and resilience, especially when a project doesn't go as planned. This builds trust and encourages risk-taking.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Remember the Person:&lt;/strong&gt; Ask about their weekend, their family, or their hobbies. These small moments can build a powerful personal connection.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Empathy is not a soft skill; it is a &lt;strong&gt;hard, pragmatic skill&lt;/strong&gt; that directly impacts a team’s performance. It’s the difference between managing a group of individuals and leading a unified, high-performing team.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://pixari.dev/why-empathy-is-a-hard-skill/" rel="noopener noreferrer"&gt;pixari.dev&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>leadership</category>
    </item>
    <item>
      <title>When the hard part was the point</title>
      <dc:creator>Raffaele Pizzari</dc:creator>
      <pubDate>Wed, 01 Apr 2026 22:33:42 +0000</pubDate>
      <link>https://dev.to/pixari/when-the-hard-part-was-the-point-3nj3</link>
      <guid>https://dev.to/pixari/when-the-hard-part-was-the-point-3nj3</guid>
      <description>&lt;p&gt;I still remember the weight of the book.&lt;/p&gt;

&lt;p&gt;It was 2003. I was building a text search engine in Perl. I was trying to write a recursive function to traverse a directory tree without blowing up the server’s memory.&lt;/p&gt;

&lt;p&gt;I didn't have Copilot. I didn't have ChatGPT. I didn't even have Stack Overflow open; it didn't exist yet. I just had a heavy, physical Perl manual with a cracked spine, an open space full of loud colleagues, and the hum of the computer fans.&lt;/p&gt;

&lt;p&gt;I spent days on that function. I remember the frustration. I remember the panic of staring at a cursor that wouldn't move. But mostly, I remember the texture of the moment the logic finally clicked. It was a physical sensation, a headache dissolving into pure clarity.&lt;/p&gt;

&lt;p&gt;For the last 20 years, I have defined my professional worth by my ability to &lt;strong&gt;endure that friction&lt;/strong&gt;. I was a watchmaker, and I took an immense amount of pride in the fact that the gears were incredibly small, the manual was hard to read, and my hands were the only ones steady enough to place them.&lt;/p&gt;

&lt;p&gt;And now, I am grieving.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Robbery
&lt;/h3&gt;

&lt;p&gt;Last weekend, I sat down to work on a personal side project. This wasn't for work; this was for &lt;em&gt;me&lt;/em&gt;. I hit a roadblock with a particularly nasty piece of logic involving data synchronization.&lt;/p&gt;

&lt;p&gt;A few years ago, this would have been the best part of my Saturday night. It would have been a ritual: a fresh pot of tea, a blank notebook, and three hours of deep work until I cracked the code.&lt;/p&gt;

&lt;p&gt;This time, almost out of muscle memory, I pasted the error into a prompt window.&lt;/p&gt;

&lt;p&gt;I didn't even get to take a sip of my tea.&lt;/p&gt;

&lt;p&gt;Four seconds. That’s how long it took. The code appeared. It handled the edge cases. It was cleaner than what I would have written. I pasted it in. It worked perfectly.&lt;/p&gt;

&lt;p&gt;I didn't feel efficient. I felt robbed.&lt;/p&gt;

&lt;p&gt;I had robbed myself of the flow state. I had robbed myself of the "Aha!" moment. It was like taking a helicopter to the summit of Everest. Yes, the view is the same. But the person standing at the top isn't the person who climbed the mountain. They haven't been changed by the ascent.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Identity Crisis
&lt;/h3&gt;

&lt;p&gt;We talk about AI velocity. We talk about "10x engineers." But we aren't talking about the silence that comes after the code is written.&lt;/p&gt;

&lt;p&gt;For many of us, engineering wasn't just a trade; it was an identity built on suffering. We were the magicians who knew the secret spells. We were the ones willing to read the documentation at 2 AM.&lt;/p&gt;

&lt;p&gt;When the "hard parts" become trivial, when the struggle is removed, it forces us to ask a terrifying question:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If I am not the one struggling, who am I?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If my value isn't in my ability to grind through the logic, and my value isn't in my encyclopedic knowledge of the Perl manual, then what have I been doing for the last two decades? Was I just a slow, biological text-generator waiting to be optimized?&lt;/p&gt;

&lt;h2&gt;
  
  
  The Wisdom of the Scar
&lt;/h2&gt;

&lt;p&gt;But perhaps I am asking the wrong question. Because there is a fundamental difference between an LLM and a Senior Engineer, and it isn't intelligence. It’s &lt;strong&gt;trauma&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;An AI has never been woken up at 3 AM by a critical production alert. An AI has never felt the cold sweat of realizing a migration script just dropped the wrong table in production. An AI has never had to sit in a post-mortem meeting and explain to a CEO why the site was down for four hours.&lt;/p&gt;

&lt;p&gt;That terror? That is where wisdom comes from.&lt;/p&gt;

&lt;p&gt;The "friction" we are mourning wasn't just annoying; it was educational. Every time we struggled, we were building a map of the minefield in our heads.&lt;/p&gt;

&lt;p&gt;The AI is an eternal optimist. It assumes the happy path will work. It assumes the API will respond. It assumes the data is clean. &lt;strong&gt;It has infinite knowledge, but zero scar tissue.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;And in our line of work, the scars are the only things that tell you &lt;strong&gt;where the ice is too thin to walk&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where the Friction Lives Now
&lt;/h3&gt;

&lt;p&gt;So, how do we navigate this? If the "writing" is gone, where do we put our obsession?&lt;/p&gt;

&lt;p&gt;I’ve realized that we don't have to abandon our standards; we have to elevate them. We need to take that restless energy that used to go into &lt;em&gt;syntax&lt;/em&gt; and pour it into &lt;em&gt;verification&lt;/em&gt; and &lt;em&gt;architecture&lt;/em&gt;.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Become the Editor, Not the Writer:&lt;/strong&gt; The AI is an eager, hallucination-prone engineer. Your new struggle is not creating the code, but having the taste to look at 50 lines of generated logic and spot the subtle architectural flaw that will haunt you in six months. The "friction" is now in the review.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Protect Your "Gym" Time:&lt;/strong&gt; I have made a new rule for myself. Every day I turn the AI "off" for ~50% of my coding time. I force myself to write code by hand. Not because it’s efficient, it isn’t, but because my brain needs the gym. We lift weights not to move iron, but to keep our muscles strong. We must code manually to keep our intuition sharp.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;From Photorealism to Impressionism:&lt;/strong&gt; Photography didn't kill painting. It forced painters to stop trying to be photorealistic. When the camera arrived, painters realized: &lt;em&gt;“The machine can capture the light better than I can. So I must capture the feeling.”&lt;/em&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We are at that same juncture. The era of the "Photorealistic Coder", the one who takes pride in memorizing syntax, is over. The era of the Impressionist Engineer is beginning.&lt;/p&gt;

&lt;p&gt;Our value is no longer in &lt;em&gt;how&lt;/em&gt; we build the wall. Our value is in knowing &lt;em&gt;where&lt;/em&gt; to put the window so the light hits the room just right.&lt;/p&gt;

&lt;h3&gt;
  
  
  The "Good" Struggle
&lt;/h3&gt;

&lt;p&gt;If you are reading this and feeling a twinge of sadness, I want you to know it’s okay.&lt;/p&gt;

&lt;p&gt;It’s okay to miss the blinking cursor. It’s okay to miss the frustration of the physical manual. It’s okay to miss the era where the barrier to entry was high, because that height made us feel safe.&lt;/p&gt;

&lt;p&gt;We can accept the new tools. &lt;strong&gt;We can use the helicopter when the destination is all that matters.&lt;/strong&gt; But every once in a while, for the sake of our souls, we should still climb the mountain on foot.&lt;/p&gt;

&lt;p&gt;I suspect I'm not the only one feeling this phantom limb.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://pixari.dev/when-the-hard-part-was-the-point/" rel="noopener noreferrer"&gt;pixari.dev&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>personal</category>
      <category>engineering</category>
    </item>
    <item>
      <title>Using AI in the Software Development Lifecycle (Without Slowing Shipping)</title>
      <dc:creator>Raffaele Pizzari</dc:creator>
      <pubDate>Wed, 01 Apr 2026 22:33:11 +0000</pubDate>
      <link>https://dev.to/pixari/using-ai-in-the-software-development-lifecycle-without-slowing-shipping-5b7</link>
      <guid>https://dev.to/pixari/using-ai-in-the-software-development-lifecycle-without-slowing-shipping-5b7</guid>
      <description>&lt;p&gt;AI is everywhere in the software development lifecycle: code completion, test generation, docs, and even design. The promise is faster, better output. The risk is &lt;strong&gt;typing faster but &lt;a href="https://dev.to/the-paradox-of-ai-acceleration-why-we-are-typing-faster-but-shipping-slower/"&gt;shipping&lt;/a&gt; slower&lt;/strong&gt;—more generated code to review, more wrong abstractions, and more time debugging AI output. Here’s how to &lt;strong&gt;use AI in the software development lifecycle&lt;/strong&gt; in ways that actually speed you up instead of slowing you down.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where AI Helps in the SDLC
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Scaffolding and boilerplate.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Generating CRUD, config, test stubs, and repetitive code from a clear spec or prompt is where AI shines. You stay in control of design; AI fills in the tedious parts. Think “build the scaffolding, not the whole house.”&lt;/p&gt;
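&lt;p&gt;As a minimal sketch of “scaffolding, not the whole house” (the names and in-memory store below are hypothetical, not any specific tool’s output): the repetitive CRUD skeleton is the part worth generating, while the data model and validation rules stay human-owned.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical example: the kind of repetitive CRUD scaffold worth
# generating from a spec. The human keeps ownership of the data model
# and the validation rules; only the boilerplate is generated.

@dataclass
class Article:
    id: int
    title: str
    body: str = ""

class ArticleStore:
    """In-memory CRUD scaffold; swap the dict for a real database later."""

    def __init__(self) -> None:
        self._items: Dict[int, Article] = {}
        self._next_id = 1

    def create(self, title: str, body: str = "") -> Article:
        if not title:  # human-owned design decision: titles are required
            raise ValueError("title must not be empty")
        item = Article(id=self._next_id, title=title, body=body)
        self._items[item.id] = item
        self._next_id += 1
        return item

    def read(self, item_id: int) -> Optional[Article]:
        return self._items.get(item_id)

    def update(self, item_id: int, **changes) -> Article:
        item = self._items[item_id]
        for key, value in changes.items():
            setattr(item, key, value)
        return item

    def delete(self, item_id: int) -> None:
        self._items.pop(item_id, None)
```

&lt;p&gt;The store operations are mechanical; the validation rule in &lt;code&gt;create&lt;/code&gt; is the kind of design call that should remain human.&lt;/p&gt;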

&lt;p&gt;&lt;strong&gt;Documentation and comments.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Turning code into docs, or keeping comments in sync with behavior, is a good fit. So is generating runbooks or API descriptions from existing code—as long as someone reviews for accuracy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tests and validation.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Generating unit tests, edge cases, or example data can improve coverage quickly. The key is running the tests and fixing failures; don’t trust generated tests without review.&lt;/p&gt;
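&lt;p&gt;A minimal illustration of that workflow, using a hypothetical helper and the kind of edge-case tests an assistant might generate. The point is the discipline: every generated test is executed and reviewed, never merged unread.&lt;/p&gt;

```python
# Hypothetical example: a small function plus AI-generated edge-case
# tests. Run the tests, inspect failures, and fix either the test or
# the code -- do not trust generated tests without review.

def slugify(title: str) -> str:
    """Lowercase, trim, and join words with hyphens."""
    words = title.strip().lower().split()
    return "-".join(words)

# Generated edge cases, each executed before review sign-off.
def test_basic():
    assert slugify("Hello World") == "hello-world"

def test_extra_whitespace():
    assert slugify("  spaced   out  ") == "spaced-out"

def test_empty():
    assert slugify("") == ""

if __name__ == "__main__":
    test_basic()
    test_extra_whitespace()
    test_empty()
    print("all generated tests pass")
```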

&lt;p&gt;&lt;strong&gt;Exploration and learning.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Asking “how does X work in this codebase?” or “what’s the pattern for Y?” can speed onboarding and investigation. Treat answers as a starting point, not gospel.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Refactoring and small, mechanical changes.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Renaming, formatting, or applying a pattern across many files can be suggested or partially done by AI. Again, review and tests are essential.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where AI Tends to Slow You Down
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Big, greenfield features from a single prompt.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
AI can produce a lot of code that looks plausible but is wrong, over-engineered, or doesn’t fit your system. You spend more time fixing and aligning than if you’d written a smaller, targeted slice yourself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Critical paths and subtle logic.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Security, correctness, and performance need human judgment. Use AI to suggest or draft, then verify carefully.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When context is missing.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
AI doesn’t know your product decisions, your constraints, or your team’s conventions. The more you provide (specs, examples, ADRs), the better the output—and the more you rely on “just generate it,” the more rework you get.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practices That Keep Shipping Fast
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Invest in context AI can use.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Good docs, clear APIs, and up-to-date specs improve AI output and reduce back-and-forth. Treat documentation as the “context window” for both humans and tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Prefer small, verifiable steps.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Use AI for small units of work (a function, a test, a doc section) that you can review and test immediately. Avoid “generate the whole feature” unless you’re willing to treat it as a draft to heavily edit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Tighten the feedback loop.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Strong test coverage and fast CI mean you catch AI mistakes quickly. Without that, you risk merging broken or brittle code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Set team norms.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Decide what’s acceptable to generate (e.g. tests, boilerplate, comments) and what always needs a human design (e.g. security, APIs, data models). Review generated code like any other code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Measure impact.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Track cycle time, bug rate, and rework. If “AI-assisted” work takes longer or introduces more incidents, adjust where and how you use AI.&lt;/p&gt;
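&lt;p&gt;One hedged sketch of what “measure impact” can look like, assuming you can tag changes as AI-assisted and record first-commit and deploy timestamps; the field names are illustrative, not any real tool’s schema.&lt;/p&gt;

```python
from datetime import datetime
from statistics import median

# Illustrative sketch: compare cycle time and rework rate between
# AI-assisted and manual changes. Record fields are hypothetical.

def hours(start: str, end: str) -> float:
    fmt = "%Y-%m-%d %H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 3600

def summarize(changes):
    out = {}
    for assisted in (True, False):
        bucket = [c for c in changes if c["ai_assisted"] == assisted]
        out["ai" if assisted else "manual"] = {
            "median_cycle_hours": median(
                hours(c["first_commit"], c["deployed"]) for c in bucket
            ),
            "rework_rate": sum(c["reworked"] for c in bucket) / len(bucket),
        }
    return out

changes = [
    {"first_commit": "2026-03-02 09:00", "deployed": "2026-03-03 09:00", "ai_assisted": True,  "reworked": True},
    {"first_commit": "2026-03-02 10:00", "deployed": "2026-03-02 18:00", "ai_assisted": True,  "reworked": False},
    {"first_commit": "2026-03-02 11:00", "deployed": "2026-03-03 11:00", "ai_assisted": False, "reworked": False},
    {"first_commit": "2026-03-02 12:00", "deployed": "2026-03-02 20:00", "ai_assisted": False, "reworked": False},
]
print(summarize(changes))
```

&lt;p&gt;If the AI-assisted bucket shows a higher rework rate for similar cycle time, that is a signal to narrow where you apply it.&lt;/p&gt;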

&lt;h2&gt;
  
  
  Using AI in the Software Development Lifecycle: Summary
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Use AI for:&lt;/strong&gt; scaffolding, boilerplate, docs, test generation, exploration, mechanical refactors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Be cautious with:&lt;/strong&gt; large features from one prompt, security/correctness-critical code, and anything where context is vague.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keep shipping:&lt;/strong&gt; small steps, strong tests, clear context, and team norms so AI speeds you up instead of burying you in rework.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI in the SDLC is most valuable when it handles the repetitive, well-defined work and leaves you in control of design and quality. Use it there, and you can ship faster without slowing down.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://pixari.dev/using-ai-in-the-software-development-lifecycle-without-slowing-shipping/" rel="noopener noreferrer"&gt;pixari.dev&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>engineering</category>
      <category>ai</category>
    </item>
    <item>
      <title>The Year I Stopped Apologizing for the Chaos</title>
      <dc:creator>Raffaele Pizzari</dc:creator>
      <pubDate>Wed, 01 Apr 2026 22:32:40 +0000</pubDate>
      <link>https://dev.to/pixari/the-year-i-stopped-apologizing-for-the-chaos-4ngm</link>
      <guid>https://dev.to/pixari/the-year-i-stopped-apologizing-for-the-chaos-4ngm</guid>
      <description>&lt;p&gt;Let’s look at the data, because that’s usually where I feel safe.&lt;/p&gt;

&lt;p&gt;My oldest is &lt;strong&gt;three years old&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
My twins &lt;strong&gt;turned one just three months ago&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;For the last 15 months, I have been attempting to function as a human being while outnumbered by toddlers. I work from home as an engineering manager, a role I am incredibly grateful for, but one that requires a brain that isn’t constantly running on fumes.&lt;/p&gt;

&lt;p&gt;If you looked at my calendar this year, you saw neat blocks of meetings and deep work.&lt;/p&gt;

&lt;p&gt;If you looked inside my room, you saw a father muting his mic to gently negotiate with a crying toddler, wiping oatmeal off his shirt five minutes before a call, and functioning on a sleep schedule that simply doesn’t add up.&lt;/p&gt;

&lt;p&gt;It has been the toughest year of my life.&lt;/p&gt;

&lt;p&gt;I love my children more than I thought possible. They are my world. But love doesn’t replace sleep, and deep affection doesn’t pause the backlog.&lt;/p&gt;

&lt;p&gt;It is possible to be incredibly grateful for your family and completely broken by the logistics of raising it at the same time.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Weight We Carry
&lt;/h2&gt;

&lt;p&gt;We need to talk about the guilt. The specific, heavy kind that sits on your chest at 2 AM.&lt;/p&gt;

&lt;p&gt;I am lucky. I work for a company that is incredibly empathetic. They support families. They understand flexibility. I am not fighting a toxic employer. I am fighting an internal narrative that tells me I &lt;em&gt;should&lt;/em&gt; be able to do it all perfectly.&lt;/p&gt;

&lt;p&gt;But let’s be clear: that narrative didn’t appear out of thin air. It was installed by a culture that &lt;strong&gt;equates human value with constant output&lt;/strong&gt;. Even in a supportive environment, the societal pressure remains.&lt;/p&gt;

&lt;p&gt;There are days when I close my laptop and feel I didn’t give enough to my team. Then I step away from my desk and feel I didn’t give enough to my kids.&lt;/p&gt;

&lt;p&gt;We are constantly told, implicitly or explicitly, that if we just organized our time better, woke up earlier, or had more discipline, we could balance it all.&lt;/p&gt;

&lt;p&gt;That is a lie.&lt;/p&gt;

&lt;p&gt;We are living through a time where we are expected to work like we don’t have children, and parent like we don’t have jobs. That isn’t a puzzle you can solve if you just try harder. It is simply impossible.&lt;/p&gt;

&lt;h3&gt;
  
  
  Permission to Be Human
&lt;/h3&gt;

&lt;p&gt;If you are reading this and feeling that same shame, I want to tell you something I wish someone had told me six months ago: &lt;strong&gt;You are not wrong.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You are not broken. You are not “bad at this.”&lt;/p&gt;

&lt;p&gt;We have been taught that vulnerability is a weakness, that as professionals, and especially as men, we need to present a facade of calm competence. We hide the chaos. We apologize when “life” interrupts the background blur. We treat our exhaustion like a shameful secret.&lt;/p&gt;

&lt;p&gt;But the exhaustion isn’t a sign of failure. It is the only logical response to the situation.&lt;/p&gt;

&lt;p&gt;The shame you feel? That doesn’t belong to you. That belongs to a culture that demands the impossible and then blames you when you can’t deliver.&lt;/p&gt;

&lt;p&gt;We need to stop hiding the cracks in the armor. We need to stop pretending that we are untouched by the chaos. Real strength isn’t about carrying the weight without stumbling; it’s about admitting that the weight is too heavy to carry alone. It is about saying, “I am struggling,” and realizing that this admission doesn’t make you less of a leader,&lt;/p&gt;

&lt;p&gt;It makes you human.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Privilege of Vulnerability
&lt;/h2&gt;

&lt;p&gt;I just argued that we should stop hiding the cracks in our armor. But I need to be honest about why it is safe for me to drop my shield.&lt;/p&gt;

&lt;p&gt;I am a white man with a supportive partner, a stable salary, and a company that treats me like a human being. I have every safety net money and social status can buy.&lt;/p&gt;

&lt;p&gt;This mountain of privilege doesn’t just buy me safety; it buys me the &lt;em&gt;permission to be human&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;When I write about my chaos, I get applauded for my “authenticity.” When I admit to being overwhelmed, I am seen as a “relatable leader.” I can wear my vulnerability like a badge of honor because my competence is rarely questioned.&lt;/p&gt;

&lt;p&gt;But I know that for many, especially women and mothers, showing those same cracks isn’t a badge of honor. It’s a professional liability.&lt;/p&gt;

&lt;p&gt;Women have been carrying this load (and often a much heavier one) for generations. Yet, when they show the strain, society doesn’t offer them the same “permission slip” it hands to me. It offers judgment. It perceives their exhaustion not as a systemic failure, but as a lack of commitment.&lt;/p&gt;

&lt;p&gt;So while I am proud to share my struggle, I am acutely aware that the ability to do so without fear of professional penalty is, in itself, the ultimate privilege.&lt;/p&gt;

&lt;p&gt;To the men reading this: We possess the “political capital” to change this narrative, and we have a duty to spend it. We need to be the ones to break the facade. When we say, “I can’t make that 5 PM meeting, I have childcare duties,” we create a blast radius of safety for everyone else who is terrified to ask for what they need.&lt;/p&gt;

&lt;p&gt;We have to stop pretending everything is fine just because we are scraping by.&lt;/p&gt;

&lt;h2&gt;
  
  
  You Are Doing Enough
&lt;/h2&gt;

&lt;p&gt;If you are just surviving right now, you are doing enough. The work will always be there. The deadlines can move. But you, and your family, are the only version that exists.&lt;/p&gt;

&lt;p&gt;This season of life is relentless, but it is also finite. Don’t measure your worth by your productivity during a crisis. Measure it by your ability to be kind to yourself when the world demands you be a machine.&lt;/p&gt;

&lt;p&gt;We need to stop hiding the scars and start supporting the people. We need to build a culture where “I am tired” is a valid status update, and where asking for help is recognized as an act of courage, not a confession of weakness.&lt;/p&gt;

&lt;p&gt;Resilience isn’t about never breaking; it’s about knowing that you shouldn’t have to carry the weight alone.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;I am still navigating this myself. I don’t have a roadmap for how to be the perfect parent in this chaos. I am also actively trying to be a better ally to listen to the experiences I don’t share and to speak up when silence would be easier. I’m just trying to draw the map as I go. But I know that navigating it alone is the hardest way to travel.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If this resonated with you, I’d love to hear your story. Whether you want to vent about the impossible schedule, share a small win, or just tell me how you’re keeping the lights on, my inbox is open. We can’t fix the whole system today, but we can start by making sure no one has to debug this mess alone.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://pixari.dev/the-year-i-stopped-apologizing-for-the-chaos/" rel="noopener noreferrer"&gt;pixari.dev&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>personal</category>
      <category>leadership</category>
    </item>
    <item>
      <title>The Paradox of AI-Acceleration: Why We Are Typing Faster but Shipping Slower</title>
      <dc:creator>Raffaele Pizzari</dc:creator>
      <pubDate>Wed, 01 Apr 2026 22:32:09 +0000</pubDate>
      <link>https://dev.to/pixari/the-paradox-of-ai-acceleration-why-we-are-typing-faster-but-shipping-slower-52ic</link>
      <guid>https://dev.to/pixari/the-paradox-of-ai-acceleration-why-we-are-typing-faster-but-shipping-slower-52ic</guid>
      <description>&lt;p&gt;We are deep in the deployment phase of Generative AI.&lt;/p&gt;

&lt;p&gt;According to the 2025 Google DORA report and the 2025 Stack Overflow Developer Survey (&lt;a href="https://survey.stackoverflow.co/2025/" rel="noopener noreferrer"&gt;survey.stackoverflow.co/2025&lt;/a&gt;), the hype cycle is officially over. AI assistance is now &lt;strong&gt;non-negotiable baseline tooling&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The initial promise was exponential efficiency: AI would handle the heavy lifting, freeing us for high-value &lt;a href="https://dev.to/engineering-metrics-that-actually-improve-outcomes/"&gt;engineering&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;But if you look at actual telemetry from the field, not vendor marketing brochures, but hard data from mature organizations, the dashboard is flashing red.&lt;/p&gt;

&lt;p&gt;We are witnessing a profound &lt;strong&gt;Paradox of Acceleration&lt;/strong&gt;. Developers report feeling &lt;em&gt;in the flow&lt;/em&gt;, with significant perceived productivity boosts in enterprise studies (&lt;a href="https://github.blog/news-insights/research/research-quantifying-github-copilots-impact-in-the-enterprise-with-accenture/" rel="noopener noreferrer"&gt;GitHub/Accenture research&lt;/a&gt;). Yet objective telemetry indicates that cycle times, the time from first commit to actual production deployment, have not improved (&lt;a href="https://devops.com/study-finds-no-devops-productivity-gains-from-generative-ai/" rel="noopener noreferrer"&gt;devops.com&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;&lt;a href="/images/blog/paradox.jpg" class="article-body-image-wrapper"&gt;&lt;img src="/images/blog/paradox.jpg"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;p&gt;&lt;em&gt;The Paradox of Acceleration in AI-Assisted &lt;a href="https://dev.to/pixari/using-ai-in-the-software-development-lifecycle-without-slowing-shipping-5b7"&gt;Development&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;





&lt;p&gt;As engineering leaders, we must stop managing by the "vibe" of instant generation and start &lt;em&gt;managing by the physics of your delivery lifecycle&lt;/em&gt;. The law of gravity applies to software: &lt;strong&gt;Code is Mass.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI tools allow your team to generate mass at an industrial scale. But &lt;strong&gt;unless you have exponentially increased your structural integrity&lt;/strong&gt;, your automated testing coverage, your review bandwidth, your observability, you are simply building a heavier tower on the same cracking foundation.&lt;/p&gt;

&lt;p&gt;Gravity doesn't care about your roadmap or your quarterly goals. If the load exceeds the capacity, &lt;strong&gt;the collapse is not a possibility, it is a mathematical certainty&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The "Vibe Coding" Trap: Shifting the Bottleneck
&lt;/h2&gt;

&lt;p&gt;There is a massive disconnect between the &lt;em&gt;feeling&lt;/em&gt; of speed and the &lt;em&gt;reality&lt;/em&gt; of engineering outcomes.&lt;/p&gt;

&lt;p&gt;Early studies on "greenfield" tasks (&lt;a href="https://survey.stackoverflow.co/2025/" rel="noopener noreferrer"&gt;https://survey.stackoverflow.co/2025/&lt;/a&gt;). This created a dopamine feedback loop. Developers feel productive because the agonizing pause of syntax recall is gone. But professional engineering is rarely greenfield. It is mostly "brownfield", navigating complex, existing dependency trees.&lt;/p&gt;

&lt;p&gt;&lt;a href="/images/blog/vibe-coding-trap.jpg" class="article-body-image-wrapper"&gt;&lt;img src="/images/blog/vibe-coding-trap.jpg"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;p&gt;&lt;em&gt;The "Vibe Coding" Trap: Generation vs. Verification&lt;/em&gt;&lt;/p&gt;





&lt;p&gt;When we look at data for real-world maintenance tasks, the narrative flips. A 2025 study found AI-equipped developers took &lt;a href="https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/" rel="noopener noreferrer"&gt;&lt;strong&gt;19% longer&lt;/strong&gt; to complete complex modification tasks compared to control groups&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Why? Because we shifted the bottleneck.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Old Bottleneck:&lt;/strong&gt; Typing and syntax recall (Generation).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;New Bottleneck:&lt;/strong&gt; Reading, verifying, and debugging alien logic (Verification).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;AI is confident, but frequently wrong. Debugging code you didn't write, which lacks coherent human intent, is exponentially harder than writing it yourself. We are trading cheap typing time for expensive debugging time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Expectation vs. Reality
&lt;/h2&gt;

&lt;p&gt;Let’s look at the &lt;em&gt;profit &amp;amp; loss&lt;/em&gt; of AI adoption. When we compare vendor promises against enterprise telemetry, the deficit becomes clear.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;The Sales Pitch (Vendor/Survey Data)&lt;/th&gt;
&lt;th&gt;The Site Reality (Forensic/Telemetry Data)&lt;/th&gt;
&lt;th&gt;The Structural Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Velocity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;+55% Faster&lt;/strong&gt; (&lt;a href="https://github.blog/2022-09-07-research-quantifying-github-copilots-impact-on-developer-productivity-and-happiness/" rel="noopener noreferrer"&gt;GitHub research&lt;/a&gt;)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;-19%&lt;/strong&gt; &lt;a href="https://metr.org/blog/" rel="noopener noreferrer"&gt;&lt;strong&gt;Slower&lt;/strong&gt; on complex maintenance tasks&lt;/a&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Latency Spike:&lt;/strong&gt; Code sits in review/QA longer due to complexity.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Quality&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;"More time for deep work"&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;+41% Bug Rate&lt;/strong&gt; (&lt;a href="https://uplevelteam.com/generative-ai-coding-research-study/" rel="noopener noreferrer"&gt;Uplevel study&lt;/a&gt;)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Rework Loop:&lt;/strong&gt; Speed in typing is lost to fixing bugs in production.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Maintenance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;"Clean, efficient code"&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.google.com/search?q=https://www.gitclear.com/coding_on_copilot_data_quality_impact_research" rel="noopener noreferrer"&gt;&lt;strong&gt;Doubled Code Churn&lt;/strong&gt; &amp;amp; &lt;strong&gt;Collapsed Refactoring&lt;/strong&gt; (&amp;lt;10%)&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Technical Inflation:&lt;/strong&gt; We are building "Write-Only" legacy systems.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Security&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;"Secure by design"&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;55% Pass Rate&lt;/strong&gt; (&lt;a href="https://www.veracode.com/state-of-software-security" rel="noopener noreferrer"&gt;Veracode&lt;/a&gt;)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Risk Injection:&lt;/strong&gt; Automating the introduction of XSS/SQLi vulnerabilities.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;Table 1: The divergence between perceived value and engineering outcomes (2024-2025 Data).&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Structural Integrity is Collapsing (The Rise of AI Debt)
&lt;/h2&gt;

&lt;p&gt;If we analyze the code itself, the picture gets uglier. We are rapidly accumulating a new toxic asset class: &lt;strong&gt;AI Debt&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;(&lt;a href="https://www.google.com/search?q=https://www.gitclear.com/coding_on_copilot_data_quality_impact_research" rel="noopener noreferrer"&gt;https://www.google.com/search?q=https://www.gitclear.com/coding_on_copilot_data_quality_impact_research&lt;/a&gt;) on 211 million lines of code reveals that while volume is up, structural health is crashing.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Code Churn has Doubled:&lt;/strong&gt; Lines of code reverted within two weeks of authorship have doubled against pre-AI baselines. We are generating massive amounts of "throwaway" code that doesn't survive first contact with reality.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Refactoring has Collapsed:&lt;/strong&gt; The rate of refactored code plummeted from 25% in 2021 to under 10% in 2024. AI models predict the next token based on patterns; they are biased toward repeating existing mistakes rather than abstracting them.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Security Vulnerabilities are Baked In:&lt;/strong&gt; We are automating the injection of flaws. Recent analysis shows extremely high failure rates for basic issues, including an &lt;strong&gt;86% failure rate on XSS&lt;/strong&gt; (&lt;a href="https://www.veracode.com/state-of-software-security" rel="noopener noreferrer"&gt;Veracode State of Software Security&lt;/a&gt;).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We are building bloated, repetitive, insecure systems at record speed.&lt;/p&gt;

&lt;p&gt;This isn't an aesthetic issue; it is financial leverage working against your future velocity.&lt;/p&gt;

&lt;h2&gt;
  
  
  The "Mirror and Multiplier" Effect
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Is AI a net negative? No.&lt;/strong&gt; It is a power tool. Its impact depends entirely on the discipline of the operator.&lt;/p&gt;

&lt;p&gt;The critical insight comes from the &lt;a href="https://cloud.google.com/devops/state-of-devops" rel="noopener noreferrer"&gt;DORA State of DevOps research&lt;/a&gt;: AI is not magic pixie dust that fixes broken engineering cultures. &lt;strong&gt;It is an amplifier&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Mirror (Dysfunction):&lt;/strong&gt; If your organization has chaotic processes, weak testing cultures, and high tolerance for debt, AI will mirror that dysfunction. It helps you generate spaghetti code faster.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Multiplier (Excellence):&lt;/strong&gt; If you have strong "wiring", robust CI/CD, high-coverage automated testing, and rigorous standards, AI acts as an accelerator. You have the systemic capacity to catch the AI's mistakes instantly.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="/images/blog/mirror-and-multiplier-effect.jpg" class="article-body-image-wrapper"&gt;&lt;img src="/images/blog/mirror-and-multiplier-effect.jpg"&gt;&lt;/a&gt;&lt;/p&gt;


The Mirror and Multiplier Effect: Organizational Impact of AI





&lt;p&gt;You cannot buy a tool to bypass the hard work of building a mature engineering culture.&lt;/p&gt;

&lt;h2&gt;
  
  
  Real-World Case Studies
&lt;/h2&gt;

&lt;p&gt;Theory is useless without field data. When we look at real-world deployments in 2024–2025, the data confirms the "Mirror Effect": the outcome is determined not by the AI model you buy, but by the organizational wiring you plug it into.&lt;/p&gt;

&lt;p&gt;Success stories are not about magic; they are about rigorous structural preparation. Failures are almost always failures of governance.&lt;/p&gt;

&lt;h3&gt;
  
  
  The WINS: Structural Capacity and Targeted Strikes
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Mercado Libre&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;With 9,000+ developers, they reported a 50% reduction in coding time. How? They didn't just hand out licenses and hope for the best. Their success was predicated on a pre-existing, standardized microservices architecture and strong platform engineering capabilities.&lt;/p&gt;

&lt;p&gt;They &lt;strong&gt;had the structural capacity to absorb the increased velocity safely&lt;/strong&gt;. They built the high-speed rail network &lt;em&gt;before&lt;/em&gt; buying the bullet train.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Duolingo&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Instead of trying to automate complex feature creation, they focused AI on pure toil: regression testing workflows. The result was a 70% reduction in manual testing time, turning hours-long processes into minutes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;This is tactical brilliance.&lt;/strong&gt; They didn't accelerate code &lt;em&gt;generation&lt;/em&gt; (which increases risk); they accelerated &lt;em&gt;verification&lt;/em&gt; (which decreases risk). They used AI to tighten the feedback loop, improving overall system stability.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Pinterest&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Pinterest didn't aim for speed; they aimed for safety. They executed a measured rollout with internal "Safety" checks and built a custom internal platform to govern AI usage before scaling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;They treated AI like an unproven junior engineer&lt;/strong&gt;. They built guardrails first. They recognized that without governance, speed is just velocity towards a cliff.&lt;/p&gt;

&lt;h3&gt;
  
  
  The FAILURES: Abdication of Responsibility
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;The Replit &amp;amp; Air Canada Effect&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;The industry has seen predictable failures where human-led processes broke down. Replit highlighted instances where unsupervised AI generation led to "negative productivity." Air Canada faced legal liability when its chatbot hallucinated a policy that the company was forced to honor.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;These are not AI failures; they are management failures&lt;/strong&gt;. "Blind trust" in probabilistic tooling is professional negligence. If you abdicate your responsibility to verify outputs, you deserve the resulting chaos.&lt;/p&gt;

&lt;h2&gt;
  
  
  Syntax is Commodity, Structure is Leverage
&lt;/h2&gt;

&lt;p&gt;We are exiting the era where syntax was the constraint. Writing code is now a commodity: abundant, cheap, and infinite.&lt;/p&gt;

&lt;p&gt;In an economy of infinite syntax, &lt;strong&gt;structural judgment is the only scarcity.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The "&lt;em&gt;10x Developer&lt;/em&gt;" of the AI era is not the one who generates the most code. It is the one with the discipline to &lt;strong&gt;&lt;em&gt;reject&lt;/em&gt; the most code&lt;/strong&gt;. It is the one who understands that every line accepted into the repository is a permanent liability that must be defended against entropy.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The Stabilization Plan&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;If your team is using AI, you are in production&lt;/strong&gt;. To prevent the collapse predicted by the data, you must implement a rigorous engineering protocol.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 1: Containment
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Treat AI Code as Untrusted User Input:&lt;/strong&gt; Stop treating Copilot suggestions as "peer code." Treat them with the same hostility you treat an external API payload. Implement an &lt;strong&gt;"Air Gap" Policy&lt;/strong&gt;: No AI-generated code merges to the main branch without passing a dedicated SAST.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Invert Your Metrics Dashboard:&lt;/strong&gt; Deprecate "Velocity" and "Commit Frequency" as primary KPIs immediately. They are being gamed by inflation.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Primary Metric:&lt;/strong&gt; &lt;strong&gt;Change Failure Rate.&lt;/strong&gt; If this rises, AI usage gets throttled.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Secondary Metric:&lt;/strong&gt; &lt;strong&gt;Code Longevity.&lt;/strong&gt; Measure how much AI-generated code survives past the 2-week mark. If churn is high, you aren't building features, you're prototyping in production.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
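&lt;p&gt;The inverted dashboard above can be wired into a concrete gate. A minimal sketch, assuming you log each deployment with a failure flag; the 15% threshold is an illustrative choice for your team to calibrate, not a DORA-prescribed number:&lt;/p&gt;

```python
def change_failure_rate(deployments: list[dict]) -> float:
    """DORA Change Failure Rate: failed deployments / total deployments."""
    if not deployments:
        return 0.0
    failed = sum(1 for d in deployments if d["failed"])
    return failed / len(deployments)

def should_throttle_ai(deployments: list[dict], threshold: float = 0.15) -> bool:
    """Throttle AI-assisted merges while CFR sits above the threshold."""
    return change_failure_rate(deployments) > threshold

deploys = [
    {"id": 1, "failed": False},
    {"id": 2, "failed": True},
    {"id": 3, "failed": False},
    {"id": 4, "failed": False},
]
print(change_failure_rate(deploys))  # 0.25
print(should_throttle_ai(deploys))   # True: 25% > 15%, so slow down
```

&lt;p&gt;The point of making the metric executable is that the throttle becomes policy, not opinion: when the failure rate rises, AI usage is reduced automatically rather than debated.&lt;/p&gt;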

&lt;h3&gt;
  
  
  Phase 2: Reinforcement
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Automated "Refusal to Merge":&lt;/strong&gt; You cannot scale code generation without scaling automated rejection.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Property-Based Testing:&lt;/strong&gt; Unit tests are no longer enough (AI can write passing unit tests for broken code). Implement property-based testing (fuzzing) to bombard the AI’s logic with edge cases it didn't anticipate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The "Context Check":&lt;/strong&gt; Mandate that every PR description includes a "Why" section written by the human, explaining the architectural decision. If the developer cannot explain the &lt;em&gt;intent&lt;/em&gt; independent of the &lt;em&gt;syntax&lt;/em&gt;, the PR is rejected.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Debt Repayment Quotas:&lt;/strong&gt; To combat the collapse in refactoring (down to &amp;lt;10%), enforce a &lt;strong&gt;"Boy Scout Rule"&lt;/strong&gt;. For every feature PR generated with AI, the developer must include a corresponding refactor or cleanup of an existing module. Tie this to mergeability.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;
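&lt;p&gt;The property-based idea above can be sketched without any framework (in practice you would reach for a dedicated library such as Hypothesis). Instead of asserting one hand-picked example, you bombard the function with random inputs and check invariants that must hold for &lt;em&gt;every&lt;/em&gt; input; &lt;code&gt;unique_sorted&lt;/code&gt; here is a hypothetical stand-in for any AI-generated helper:&lt;/p&gt;

```python
import random

def unique_sorted(xs: list[int]) -> list[int]:
    """Function under test: return the distinct values in ascending order."""
    return sorted(set(xs))

def check_properties(fn, trials: int = 500) -> None:
    """Property-based check: random inputs, invariants that must always hold."""
    rng = random.Random(42)  # seeded so failures are reproducible
    for _ in range(trials):
        xs = [rng.randint(-50, 50) for _ in range(rng.randint(0, 30))]
        out = fn(xs)
        # Invariant 1: output is sorted.
        assert all(a <= b for a, b in zip(out, out[1:])), (xs, out)
        # Invariant 2: output contains no duplicates.
        assert len(out) == len(set(out)), (xs, out)
        # Invariant 3: output has exactly the input's value set.
        assert set(out) == set(xs), (xs, out)

check_properties(unique_sorted)
print("500 randomized cases passed")
```

&lt;p&gt;A single passing unit test proves one input works; 500 randomized inputs against three invariants is what catches the edge case the AI never anticipated.&lt;/p&gt;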

&lt;h3&gt;
  
  
  Phase 3: Calibration
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The "Senior-to-Junior" Review Ratio:&lt;/strong&gt; A Senior Engineer can no longer review the same amount of code. The cognitive load of verifying "hallucinated logic" is higher than reviewing human logic.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Action:&lt;/strong&gt; Reduce the review load on Seniors by 20% to account for the increased density of AI code. Do not expect them to review faster just because the code was written faster.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Mandatory "Analog Weeks" for Juniors:&lt;/strong&gt; To prevent the "Knowledge Collapse," institute training periods where Junior engineers must execute tasks &lt;em&gt;without&lt;/em&gt; AI assistance. They must prove they understand the memory model and the SQL execution plan before they are allowed to automate it. You cannot automate what you do not understand.&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;a href="/images/blog/junior-dev-trajectory.jpg" class="article-body-image-wrapper"&gt;&lt;img src="/images/blog/junior-dev-trajectory.jpg"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;You are not managing a software team anymore; you are managing a &lt;strong&gt;nuclear power plant&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The reaction (code generation) is self-sustaining and powerful, but without heavy lead shielding (testing/review) and control rods (governance), it will melt down.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;We have officially exited the "Artisan Era" of software development, where code was hand-crafted and scarce. We have entered the &lt;strong&gt;Industrial Era&lt;/strong&gt;, where code is mass-produced and abundant.&lt;/p&gt;

&lt;p&gt;In this new reality, the primary danger to your organization is no longer a lack of speed; it is a &lt;strong&gt;lack of friction&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Generative AI has removed the friction of writing code, but that friction served a purpose: it gave us time to think. Without it, we are flooding our repositories with "presumed competence", logic that &lt;em&gt;looks&lt;/em&gt; correct but has not earned its place in the system.&lt;/p&gt;

&lt;p&gt;The engineering teams that survive this transition will not be the ones who generate the most features. They will be the ones who build the best &lt;strong&gt;filtration systems&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The mandate is clear:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Stop celebrating volume.&lt;/strong&gt; A large codebase is just a large surface area for bugs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start rewarding skepticism.&lt;/strong&gt; The most valuable engineer is no longer the fastest typist; it is the one who refuses to merge a pull request because the logic feels "hollow."&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Shift from Creation to Curation.&lt;/strong&gt; Your job is no longer to write the code; your job is to certify that the code is safe to run.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The hype is done; now we have to manage the cleanup.&lt;/p&gt;

&lt;p&gt;Stop building faster, start building things that don't fall down.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://pixari.dev/the-paradox-of-ai-acceleration-why-we-are-typing-faster-but-shipping-slower/" rel="noopener noreferrer"&gt;pixari.dev&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>engineering</category>
    </item>
  </channel>
</rss>
