DEV Community

UC Jung

Why Developers Don't Trust Code Built by AI Agents

Let Me Start with the Conclusion

Because they never instructed the AI Agent to produce trustworthy results.

According to Sonar's 2026 State of Code report, based on a survey of over 1,100 enterprise developers, 72% of developers who have tried AI use it every day. Yet 96% don't fully trust that AI-generated code is functionally correct. And only 48% always verify it before committing.

They use it every day but don't trust it. They don't trust it, and yet more than half don't even consistently verify it.

The cause of this contradiction is not a lack of capability in the AI Agent. It's because the person giving instructions never followed the process required to produce trustworthy results.


How Should You Instruct to Get Trustworthy Results?

1. Your instructions are requirements — refine them together with the AI Agent

Most developers instruct their AI Agent like this:

"Build me a login feature."

This is not a requirement. It's a wish.

Instructions are requirements. And requirements don't come out perfect from the start. You need to refine them together with the AI Agent through review. Try asking the AI Agent: "What decisions need to be made to implement this feature?" Authentication method, token strategy, error handling policy, security requirements — the AI Agent will systematically surface decision points you might miss on your own.

2. State intent and purpose explicitly, then explore guidelines and the codebase together

"We want to improve the profile editing UX to reduce user churn. Explore the profile-related APIs and components in the current codebase, and propose an implementation scope that's consistent with existing patterns."

State the Why (intent) and What (objective) explicitly, and let the AI Agent propose the How after exploring the codebase. This reduces the space the AI Agent has to fill with inference. 61% of developers agree that AI often produces code that appears correct but isn't reliable — precisely because the AI Agent fills in unspecified parts with its own assumptions. Reduce the inference space, and you reduce unreliable results.

3. Once requirements are defined, open a separate session and instruct it to build an implementation plan

Don't jump straight to coding once requirements are finalized. Open a separate session and instruct the AI Agent to create an implementation plan referencing the requirements document.

Why a separate session? Because the conversational context accumulated during requirements discussion contaminates implementation judgment. In a fresh session with only the requirements document as reference, the AI Agent can design the implementation approach from a clean state.

4. Review the implementation plan — from a design perspective

Don't accept the implementation plan as-is. Review it.

The review perspectives are clear:

  • Data model design: Do table structures, relationships, and indexes satisfy the requirements?
  • Source architecture: Is there proper layer separation? Are concerns properly separated?
  • SOLID principles: Are Single Responsibility, Open-Closed, and other principles being followed?
  • Design patterns: Are there applicable patterns that were missed?

You can also instruct the AI Agent: "Review this implementation plan from the perspective of SOLID principles and design patterns." What matters is that the human decides whether to review and approves the review results.

5. Get a testing approach proposal and review it

Before implementation, confirm the test strategy too. Ask "Propose what tests should be written and how for this implementation," then review whether unit tests, integration tests, and edge case coverage are sufficient.

6. Once all steps are complete, instruct development based on requirements and implementation plan

If you've made it this far, the AI Agent has a clear requirements document, a reviewed implementation plan, and an agreed-upon test strategy. When you instruct development in this state, the AI Agent doesn't need to infer anything. It simply executes what has already been decided.

This is the process for producing trustworthy results.


An AI Agent Can Be a Capable Co-Worker or a Deluded Fool

Depending on the level of the person giving instructions.

88% of developers reported negative impacts from AI on technical debt. Of those, 53% cited code that appears correct but isn't reliable. This isn't because the AI Agent is stupid. It's because the AI Agent received instructions that forced it to fill gaps with inference.

Without clear requirements, without an implementation plan, without review — when you just say "build it" — the AI Agent fills every blank with inference. And it doesn't tell you whether those inferences were right or wrong. The output looks plausible. Tests pass. Then it blows up in production three weeks later.


"Doesn't All That Take Too Much Time?"

Yes. It takes time.

You don't need requirements refinement, implementation planning, SOLID review, and test strategy design to rename a button. Obviously not.

Deciding how far to go based on the importance of the task — that's the human's responsibility. Simple changes need simple instructions. But for core business logic, authentication systems, or payment processing, you need to follow the process above.

This judgment itself is what separates senior from junior. Knowing what level of verification each task needs. That's experience.


But Think About This

If you're someone who doesn't trust the AI Agent with important work, consider the time difference between doing requirements refinement and implementation planning alone versus doing them together with the AI Agent.

Recently, I refactored source code that had been rapidly developed as an MVP. Redesigning the architecture from a SOLID perspective, identifying areas where design patterns could be applied, checking for errors, and fixing them.

It took exactly 3 days.

Work that would have taken over a month if done alone.

Spend 2 hours refining requirements with the AI Agent, and you save 3 weeks of post-implementation debugging. Spend 1 hour reviewing the implementation plan, and you prevent an all-night production incident response. Spend 30 minutes on a SOLID review, and you save 2 weeks of refactoring 6 months later.

The review process with an AI Agent is not wasted time. It's the most efficient time investment you can make.


Real Case: How 6 Lines of Instructions Became a Detailed Requirements Document

Words alone aren't convincing, so let me show a real case. The original text is attached in full in the appendix below. See for yourself.

The initial prompt I gave the AI Agent contained nothing more than two intentions: "I want to manage requirements by project" and "I want the AI Agent to launch immediately without path input." No specific schemas, no UI layouts, no implementation order.

The AI Agent took this instruction, asked questions through conversation, explored the codebase, and refined the requirements with my approval at each step.

The final deliverable was a detailed requirements document spanning 7 sections, with 5 functional requirements (FR), 4 non-functional requirements (NFR), schema change specifications, decision trees, UI wireframes, system component maps, implementation priorities, and explicit scope exclusions.

All of this was the result of me writing 6 lines and refining through conversation with the AI Agent. All I did was communicate intent, answer questions, review results, and approve.

When I instructed implementation based on these requirements, the AI Agent didn't need to infer. The schema was defined, the logic branches were specified, the affected files were enumerated, and the scope boundaries were drawn.

If instructions at this level of detail had still produced untrustworthy results, I would never have built an automated development WORK-PIPELINE powered by AI Agents.

But the reality is different. A pipeline that refines requirements, reviews implementation plans, designs tests, and then instructs development is actually running in production. Requirements analysis (SPECIFIER) → Planning (PLANNER) → Scheduling (SCHEDULER) → Building (BUILDER) → Verification (VERIFIER) → Committing (COMMITTER). A 6-stage pipeline that executes automatically from a single instruction.

6 lines of intent, refined through conversation, become input to an automated pipeline, and output as trustworthy code. That is what it means to use an AI Agent "properly."
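The six-stage flow above can be sketched as a simple sequential pipeline. This is a minimal illustration, not the author's actual implementation: the stage names come from the article, while the handler types and wiring are assumed.

```typescript
// Hypothetical sketch of the 6-stage pipeline described above.
// Stage names are from the article; everything else is illustrative.
type Stage =
  | "SPECIFIER" | "PLANNER" | "SCHEDULER"
  | "BUILDER" | "VERIFIER" | "COMMITTER";

const STAGES: Stage[] = [
  "SPECIFIER", "PLANNER", "SCHEDULER", "BUILDER", "VERIFIER", "COMMITTER",
];

// Each stage transforms the shared context; the next stage receives the result.
type StageHandler = (context: string) => string;

function runPipeline(
  instruction: string,
  handlers: Record<Stage, StageHandler>,
): string {
  // Run every stage in order, threading the context through.
  return STAGES.reduce((context, stage) => handlers[stage](context), instruction);
}
```

The point of the shape is that each stage only ever sees the output of the previous one, mirroring the article's advice to isolate context between steps.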


Remember This

The level at which you use your AI Agent is your level.

Using the same AI Agent, someone receives untrustworthy code while someone else receives code ready for production deployment. The difference isn't the AI Agent's version. It's the difference in what process the person giving instructions followed.

An AI Agent is a mirror. Give it vague instructions and you get vague results. Give it concrete requirements and a reviewed plan and you get trustworthy results.

The statistic that 96% of developers don't trust AI doesn't reveal a limitation of AI. It reveals that 96% of developers haven't yet learned how to properly instruct an AI Agent.

Before blaming the tool, look at how you're using it first.

And raise your own level by refining requirements and building development plans together with the AI Agent — using that process as a way to deepen your understanding of the project itself.


📌 This article includes content from the AI Agent User Training Guide series that I've created for educational purposes.
The related material — AI Agent User Training Guide — Advanced (Context Management, Context Isolation) — will be refined further and published at a later date.



Appendix: Use Case — From Initial Prompt to Detailed Requirements Document

Below are the original texts from the real case mentioned in the article. The initial prompt and the final requirements document produced through conversational refinement with the AI Agent are attached without modification.


A. Initial Prompt

You must fulfill the role of a requirements analysis expert.
Currently, while requirements are categorized by project in the existing functionality, they cannot be managed by project, resulting in poor usability.
We intend to add a feature to the currently implemented requirements management function that allows for viewing and managing requirements by project.
Currently, the system requires the project folder to be entered when implementing requirements.
We plan to allow the AI Agent to launch and start implementation immediately without requiring a path input if a project is configured in the requirements settings.
Please write a detailed requirement regarding this part and submit it.
The proposed requirement is located at ./requirment/todo_0001.project_setup_feature.md. For matters decided via conversation, request permission to modify the document, and make the changes once approved.


B. Final Output: Requirements Document

REQ-TODO-0001: Project Setup for Requirements — Project-based View & Auto-path Execution

Status: DRAFT
Author: Requirements Analysis Expert
Created: 2026-03-28
Priority: P1
Category: Functional Requirement (FR)


1. Background & Problem Statement

1.1 Current State

The current requirements management system has a Requirement ↔ Project relationship via the optional projectId field. However, this relationship is limited to organizational grouping and list filtering only. The following usability issues exist:

| # | Problem | Impact |
|---|---------|--------|
| P1 | No project-centric requirement view — requirements are displayed in a flat list with a project filter dropdown, but there is no dedicated project-based management screen where users can see project context alongside its requirements. | Users cannot grasp the full picture of a project's requirement status at a glance. |
| P2 | Manual projectRoot input on every execution — each time the AI Agent is launched, the user must manually type or confirm the file system path (projectRoot). This path is not stored on the Project entity. | Repetitive input, error-prone (typos in paths), and blocks one-click execution. |
| P3 | Hard-coded default path — the frontend defaults to C:\rnd\uc-teamspace in RequirementDetail.tsx and RequirementActionBar.tsx. This is developer-specific and does not scale to multi-project or multi-user environments. | Incorrect path for any project other than the main codebase. |
| P4 | No project-level execution configuration — execution parameters (maxTurns, timeoutMinutes, profileId) must be specified per execution. There is no way to set project-level defaults. | Repetitive configuration for projects with consistent execution needs. |

1.2 Goal

Enable project-based requirement management and zero-input AI Agent execution by:

  1. Adding project-level configuration (workspace path, execution defaults)
  2. Providing a project-centric requirement management view
  3. Auto-resolving execution parameters from the project configuration when available

2. Functional Requirements

FR-01: Add projectRoot field to the Project model

Description: Extend the Project entity with a projectRoot field that stores the file system workspace path for that project.

Schema Change:

```
Project {
  ...existing fields...
  + projectRoot    String?     // Absolute path to project workspace (e.g., "C:\rnd\uc-teamspace")
}
```

Rules:

  • Field is optional (nullable) — existing projects without a path remain valid.
  • Value must be a non-empty string when provided (trimmed, no trailing slash normalization needed — stored as-is).
  • No server-side file system validation (backend runs in Docker and cannot access host FS). Validation is deferred to Runner execution time.
  • Editable via project update API and project settings UI.
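The storage rules above can be captured in a small helper. This is a hypothetical sketch of the FR-01 rules (trim, treat empty as null, otherwise store as-is), not code from the actual codebase; the function name is invented for illustration.

```typescript
// Hypothetical helper illustrating FR-01's storage rules for projectRoot:
// - null/undefined stays null (field is optional)
// - value is trimmed; an empty result is stored as null, not ""
// - otherwise the string is stored as-is (no trailing-slash normalization,
//   no server-side file system validation)
function normalizeProjectRoot(input: string | null | undefined): string | null {
  if (input == null) return null;
  const trimmed = input.trim();
  return trimmed.length > 0 ? trimmed : null;
}
```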

Affected Components:
| Layer | File | Change |
|-------|------|--------|
| DB | schema.prisma (Project model) | Add projectRoot String? field |
| Backend | Project DTO (create/update) | Accept projectRoot in create/update payloads |
| Backend | Project service | Persist projectRoot |
| Frontend | Project API types | Add projectRoot to Project interface |
| Frontend | Project settings/edit form | Add path input field |


FR-02: Add project-level execution defaults

Description: Extend the Project entity with optional execution default fields so that frequently-used execution parameters can be pre-configured per project.

Schema Change:

```
Project {
  ...existing fields...
  + defaultProfileId      String?     // Default CliProfile for executions
  + defaultMaxTurns       Int?        // Default maxTurns (1–200)
  + defaultTimeoutMinutes Int?        // Default timeoutMinutes (1–120)
}
```

Rules:

  • All fields are optional. When null, system defaults apply (maxTurns=50, timeoutMinutes=30).
  • defaultProfileId must reference a valid CliProfile.id if provided.
  • Values are used as defaults — users can still override per execution.

Affected Components:
| Layer | File | Change |
|-------|------|--------|
| DB | schema.prisma (Project model) | Add 3 optional fields |
| Backend | Project DTO | Accept new fields |
| Backend | CliExecution service | Resolve defaults from project when not explicitly provided |
| Frontend | Project settings form | Add execution default configuration fields |
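The resolution order implied by the FR-02 rules can be sketched as follows. This is a hypothetical illustration, assuming the precedence explicit request value → project default → system default (maxTurns=50, timeoutMinutes=30); the type and function names are invented.

```typescript
// Hypothetical parameter resolution for FR-02.
interface ExecutionParams {
  maxTurns: number;
  timeoutMinutes: number;
  profileId: string | null;
}

interface ProjectDefaults {
  defaultMaxTurns?: number | null;
  defaultTimeoutMinutes?: number | null;
  defaultProfileId?: string | null;
}

function resolveExecutionParams(
  request: Partial<ExecutionParams>,
  project: ProjectDefaults | null,
): ExecutionParams {
  // Explicit request values win; project defaults are a fallback;
  // system defaults apply when neither is set.
  return {
    maxTurns: request.maxTurns ?? project?.defaultMaxTurns ?? 50,
    timeoutMinutes: request.timeoutMinutes ?? project?.defaultTimeoutMinutes ?? 30,
    profileId: request.profileId ?? project?.defaultProfileId ?? null,
  };
}
```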


FR-03: Auto-resolve projectRoot on AI Agent execution

Description: When launching an AI Agent execution for a requirement that is linked to a project with projectRoot configured, the system should auto-fill the project root path without requiring manual input.

Behavior:

```
User clicks "Execute" on a requirement
  └─ Requirement has projectId?
       ├─ YES → Project has projectRoot?
       │    ├─ YES → Auto-fill projectRoot (user can still edit)
       │    │         + Auto-fill execution defaults (profileId, maxTurns, timeoutMinutes)
       │    └─ NO  → Show empty input (current behavior)
       └─ NO  → Show empty input (current behavior)
```

Frontend Changes:

  • RequirementExecutionPanel.tsx: On mount/open, if requirement has project.projectRoot, pre-fill the input.
  • RequirementActionBar.tsx (batch execution): For batch execution, if all selected requirements share the same project with projectRoot, auto-fill. If mixed projects, show empty input with a note.
  • RequirementDetail.tsx: Remove hard-coded 'C:\\rnd\\uc-teamspace' default. Replace with project-derived value or empty string.
  • When projectRoot is auto-filled from project, display a visual indicator (e.g., small label "from project settings") so the user knows the source.
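The batch rule above can be expressed as a small pure function. This is a hypothetical sketch of the FR-03 behavior (pre-fill only when all selected requirements share one project with projectRoot configured); the types and names are invented for illustration.

```typescript
// Hypothetical batch auto-fill rule for FR-03.
interface SelectedRequirement {
  projectId: string | null;
  projectRoot: string | null; // the linked project's configured root, if any
}

function resolveBatchProjectRoot(selected: SelectedRequirement[]): string {
  if (selected.length === 0) return "";
  const first = selected[0];
  // All items must belong to the same (non-null) project.
  const sameProject = selected.every(
    (r) => r.projectId !== null && r.projectId === first.projectId,
  );
  // Mixed projects, no project, or no configured root → empty input.
  return sameProject && first.projectRoot ? first.projectRoot : "";
}
```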

Backend Changes:

  • StartExecutionDto: projectRoot becomes optional with a fallback — the UI still auto-fills it on the frontend.
  • CliExecutionService.startExecution(): If projectRoot is not provided in the request but the requirement's project has projectRoot, use it as fallback. This enables API-only clients (not just UI) to benefit from the project default.

Acceptance Criteria:

  1. When a requirement is linked to a project with projectRoot set, clicking "Execute" shows the path pre-filled.
  2. User can still modify the pre-filled path before confirming.
  3. Batch execution with requirements from the same project auto-fills the path.
  4. Batch execution with requirements from mixed projects (or no project) shows empty input.
  5. The hard-coded default 'C:\\rnd\\uc-teamspace' is removed from all frontend files.

FR-04: Project-based requirement management view

Description: Add a project-centric view where users can see a project's details and manage all its requirements in one place.

UI Specification:

4a. Project Requirement Dashboard (New Page)

Route: /projects/:projectId/requirements

Layout:

```
┌──────────────────────────────────────────────────┐
│  Project Header                                  │
│  [Project Name] [Code] [Status]                  │
│  Project Root: C:\rnd\project-a  [Edit]          │
│                                                  │
│  Summary Cards                                   │
│  ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐    │
│  │ Total  │ │  Draft │ │In Prog │ │  Done  │    │
│  │  24    │ │   5    │ │   8    │ │  11    │    │
│  └────────┘ └────────┘ └────────┘ └────────┘    │
│                                                  │
│  [+ New Requirement]  [▶ Batch Execute]          │
│                                                  │
│  ┌──────────────────────────────────────────┐    │
│  │  Requirement Table (filtered by project) │    │
│  │  - Same columns as RequirementList       │    │
│  │  - Project column hidden (redundant)     │    │
│  │  - Status/Priority/Category filters      │    │
│  └──────────────────────────────────────────┘    │
└──────────────────────────────────────────────────┘
```

Features:

  • Project header shows key project info and projectRoot (editable inline).
  • Summary cards show requirement count by status.
  • "New Requirement" auto-sets projectId to the current project.
  • "Batch Execute" auto-fills projectRoot from project settings — no path input required if configured.
  • Requirement table is pre-filtered by projectId, with additional sub-filters for status, priority, and category.

4b. Project List enhancement

Current: The project list page exists but does not show requirement summary.

Enhancement: Add a "Requirements" column to the project list showing count and status breakdown (e.g., "12 total · 3 in progress · 2 done").
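The summary string for that column can be derived from requirement statuses with a small formatter. This is a hypothetical sketch; the status names are assumed from the document's "Status: DRAFT" field and the dashboard cards, and the function name is invented.

```typescript
// Hypothetical formatter for the FR-04b "Requirements" column,
// producing strings like "12 total · 3 in progress · 2 done".
type RequirementStatus = "DRAFT" | "IN_PROGRESS" | "DONE"; // names assumed

function formatRequirementSummary(statuses: RequirementStatus[]): string {
  const inProgress = statuses.filter((s) => s === "IN_PROGRESS").length;
  const done = statuses.filter((s) => s === "DONE").length;
  return `${statuses.length} total · ${inProgress} in progress · ${done} done`;
}
```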

Acceptance Criteria:

  1. /projects/:projectId/requirements page renders with project header, summary cards, and filtered requirement table.
  2. Creating a requirement from this page auto-assigns the projectId.
  3. "Batch Execute" from this page auto-fills projectRoot if the project has it configured.
  4. Project list shows requirement summary per project.

FR-05: Project Settings UI for path configuration

Description: Provide a UI for configuring projectRoot and execution defaults on the project entity.

Location: Project edit/settings page (existing or new)

Fields:
| Field | Type | Validation | Description |
|-------|------|------------|-------------|
| Project Root Path | Text input (monospace) | Optional, non-empty if provided | Workspace path for AI Agent execution |
| Default Profile | Dropdown (CliProfile list) | Optional, must be valid profileId | Default execution profile |
| Default Max Turns | Number input | Optional, 1–200 | Default max turns for execution |
| Default Timeout (min) | Number input | Optional, 1–120 | Default timeout for execution |
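The validation column above can be sketched as a form-level check. This is a hypothetical illustration of the stated ranges (maxTurns 1–200, timeoutMinutes 1–120) and the non-empty rule for projectRoot; the function name is invented, and the profileId referential check is omitted because it requires a database lookup.

```typescript
// Hypothetical validation of the FR-05 settings form.
interface ProjectSettingsInput {
  projectRoot?: string | null;
  defaultMaxTurns?: number | null;
  defaultTimeoutMinutes?: number | null;
}

function validateProjectSettings(input: ProjectSettingsInput): string[] {
  const errors: string[] = [];
  const inRange = (v: number, lo: number, hi: number) =>
    Number.isInteger(v) && v >= lo && v <= hi;

  // Optional, but must be non-empty when provided.
  if (input.projectRoot != null && input.projectRoot.trim() === "") {
    errors.push("projectRoot must be non-empty when provided");
  }
  if (input.defaultMaxTurns != null && !inRange(input.defaultMaxTurns, 1, 200)) {
    errors.push("defaultMaxTurns must be an integer between 1 and 200");
  }
  if (input.defaultTimeoutMinutes != null && !inRange(input.defaultTimeoutMinutes, 1, 120)) {
    errors.push("defaultTimeoutMinutes must be an integer between 1 and 120");
  }
  return errors;
}
```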

Acceptance Criteria:

  1. Project settings page shows all 4 fields.
  2. Saving with valid values updates the Project entity.
  3. Saving with projectRoot empty sets it to null (not empty string).
  4. Changes take effect immediately for subsequent executions.

3. Non-Functional Requirements

NFR-01: Backward Compatibility

  • Existing projects without projectRoot continue to work as-is (execution requires manual path input).
  • Existing requirements without projectId are unaffected.
  • Existing API clients that provide explicit projectRoot in StartExecutionDto continue to work — the project default is a fallback, not a forced override.

NFR-02: Data Migration

  • New fields are added as nullable — no data migration required.
  • No existing data is modified.
  • Optional: Provide a one-time admin action to bulk-set projectRoot for existing projects.

NFR-03: Security

  • projectRoot is a host file system path. It is stored and passed through but never accessed by the backend (which runs in Docker).
  • No path traversal validation is needed on the backend. The Runner validates the path at execution time.

NFR-04: Performance

  • Adding fields to the Project model has no meaningful performance impact.
  • The project-based requirement view uses the existing projectId index on the requirements table (already indexed via foreign key).

4. Out of Scope

The following items are explicitly not part of this requirement:

| Item | Reason |
|------|--------|
| Multi-path per project (monorepo support) | Future enhancement — current scope is one path per project |
| Project creation/deletion workflow changes | Existing project CRUD is sufficient |
| Runner-side path validation enhancement | Separate concern — Runner already validates at execution time |
| Requirement mandatory project assignment | projectId remains optional to preserve flexibility |
| Project-level git configuration (branch, remote) | Separate requirement — git config is per-user, not per-project |

5. Implementation Priority & Dependencies

Suggested Implementation Order

```
Phase 1 (Foundation):
  FR-01  Add projectRoot to Project model
  FR-05  Project Settings UI for path configuration

Phase 2 (Execution Enhancement):
  FR-02  Add project-level execution defaults
  FR-03  Auto-resolve projectRoot on execution

Phase 3 (View Enhancement):
  FR-04  Project-based requirement management view
```

Dependencies

  • FR-03 depends on FR-01 (needs projectRoot on Project)
  • FR-03 depends on FR-02 (needs execution defaults on Project)
  • FR-04 depends on FR-01 (needs projectRoot for display and auto-execution)
  • FR-05 depends on FR-01 and FR-02 (needs schema fields to exist)

6. Affected System Components Summary

```
packages/backend/
├── prisma/schema.prisma                          ← FR-01, FR-02: Project model changes
├── src/project/dto/                              ← FR-01, FR-02: Create/Update DTOs
├── src/project/project.service.ts                ← FR-01, FR-02: Persist new fields
├── src/project/project.controller.ts             ← FR-01, FR-02: API endpoints
├── src/cli-execution/cli-execution.service.ts    ← FR-03: Fallback projectRoot resolution
├── src/cli-execution/dto/start-execution.dto.ts  ← FR-03: Optional projectRoot with fallback
└── src/requirement/requirement.service.ts        ← FR-04: Include project.projectRoot in responses

packages/frontend/
├── src/api/project.api.ts                        ← FR-01: Add projectRoot to Project type
├── src/api/cli-execution.api.ts                  ← FR-03: Update StartExecutionData
├── src/pages/RequirementDetail.tsx               ← FR-03: Remove hard-coded default, auto-fill
├── src/pages/ProjectRequirements.tsx             ← FR-04: New page
├── src/components/requirement/
│   ├── RequirementExecutionPanel.tsx             ← FR-03: Auto-fill from project
│   └── RequirementActionBar.tsx                  ← FR-03: Auto-fill for batch, remove default
└── src/pages/ProjectList.tsx                     ← FR-04: Add requirement summary column
```

7. Glossary

| Term | Definition |
|------|------------|
| projectRoot | Absolute file system path on the host machine where the project source code resides. Used as the working directory for AI Agent execution. |
| CliExecution | A record of an AI Agent execution run, including its prompt, status, and results. |
| CliProfile | A named configuration set for AI Agent execution (model, parameters, etc.). |
| Runner | The local service that polls for pending executions and launches the Claude CLI process. |
| PipelineStage | Stages of AI Agent execution: SPECIFIER → PLANNER → SCHEDULER → BUILDER → VERIFIER → COMMITTER |
