Emir Cangir

Posted on Feb 11

AI Agents, Source Context, and Prompt History: A New Software Development Paradigm

#agents #ai #llm #softwaredevelopment

Software development is shifting from “writing code” to “curating intent.” With modern LLMs, a large part of implementation can be produced by an AI agent — but only if the agent is grounded in the project’s truth.

Here’s the simplest mental model:

An AI agent is like a developer with anterograde amnesia.
It can reason and write code, but it doesn’t reliably “remember” your system unless you give it memory. In an AI-first workflow, that memory is built from two first-class artifacts:

Source context: the curated, modular documentation that defines requirements, constraints, architecture, and invariants.
Prompt history: the running dialogue that captures decisions, feedback, and rationale as the project evolves.

Together, these become a kind of language-native codebase: a project defined by intent and constraints, where code is generated and maintained under human oversight.

This idea aligns with Andrej Karpathy’s “Software 3.0” framing: prompts and context increasingly behave like programs, and development becomes a conversation where natural language is the dominant control surface.

But “docs replace code” is not the claim. The real claim is subtler:

Context becomes the project’s constitution. Code remains the executable artifact.

The four-layer stack

You can think of the AI-first repo as a layered system:

1. Code: the executable artifact (still necessary)
2. Source context: the normative spec (“what must be true”)
3. Prompt history: working memory + rationale (“why we chose this”)
4. Agent: the compiler/contributor that converts (2)+(3) into (1) under review

The breakthrough is treating layers 2 and 3 as versioned, reviewed, and intentionally maintained — not accidental chat logs.

Modular context architecture

Large systems fail with a single monolithic context file for the same reason monolithic codebases rot: everything is coupled.

A practical pattern is module-scoped context files, each acting as a “README for humans and agents”:

product_context.md
orders_context.md
payment_context.md
user_auth_context.md

Each file should be small enough to load, specific enough to constrain behavior, and stable enough to trust.

Why modular context works

Clarity: agents load only what matters.
Separation of concerns: requirements and constraints evolve locally.
Easier onboarding: humans and agents ramp faster.
Parallel work: multiple agents can operate safely in different domains.
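To make "agents load only what matters" concrete, here is a minimal sketch of a context selector. The file names come from the list above; the `CONTEXT_FILES` mapping, the `select_context` function, and the character budget are hypothetical names and choices made for illustration, not part of any particular tool.

```python
from pathlib import Path

# Hypothetical mapping from module name to its context file. The file names
# follow the examples in this article; the mapping itself is an assumption.
CONTEXT_FILES = {
    "product": "product_context.md",
    "orders": "orders_context.md",
    "payment": "payment_context.md",
    "user_auth": "user_auth_context.md",
}


def select_context(modules, context_dir, max_chars=20_000):
    """Return the concatenated context for the requested modules only.

    Raises ValueError for an unknown module, or when the combined text
    exceeds max_chars (a crude stand-in for a real token budget).
    """
    parts = []
    for name in modules:
        if name not in CONTEXT_FILES:
            raise ValueError(f"no context file registered for module {name!r}")
        filename = CONTEXT_FILES[name]
        text = (Path(context_dir) / filename).read_text(encoding="utf-8")
        parts.append(f"# {filename}\n{text}")
    combined = "\n\n".join(parts)
    if len(combined) > max_chars:
        raise ValueError("selected context exceeds the budget; consider splitting files")
    return combined
```

A task that touches wishlists would call something like `select_context(["product", "user_auth"], "context/")` and never see the payment or orders files.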

A context file spec (copy this)

Most teams struggle because their “context” is vibes-based. A reliable context file has a consistent shape:

<module>_context.md

Purpose / Non-goals
Public API / Contracts (endpoints, events, schemas)
Core invariants (“must always hold”)
Data model (field meaning; avoid schema dumps)
Workflows / State machines
Security & privacy constraints
Operational constraints (latency, retries, idempotency)
Failure modes & recovery
Observability (logs/metrics/traces expectations)
Test expectations (golden paths + edge cases)
Changelog (dated, human-readable)

The key is normativity: context should state constraints and invariants, not mirror implementation details.
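One way to keep that shape consistent is a small lint step. The sketch below assumes each section in the spec appears as a markdown heading in the context file and matches headings by prefix; both the `REQUIRED_SECTIONS` list (derived from the spec above) and the `missing_sections` helper are illustrative, not an established tool.

```python
import re

# Section names taken from the spec above. Matching headings by prefix
# (e.g. "Purpose" matches "Purpose / Non-goals") is an assumption.
REQUIRED_SECTIONS = [
    "Purpose",
    "Public API",
    "Core invariants",
    "Data model",
    "Workflows",
    "Security",
    "Operational constraints",
    "Failure modes",
    "Observability",
    "Test expectations",
    "Changelog",
]


def missing_sections(markdown_text):
    """Return the required sections that have no matching markdown heading."""
    headings = [
        match.group(1).strip().lower()
        for match in re.finditer(r"^#{1,6}\s+(.+)$", markdown_text, flags=re.MULTILINE)
    ]
    return [
        section
        for section in REQUIRED_SECTIONS
        if not any(h.startswith(section.lower()) for h in headings)
    ]
```

Running this in CI on every `*_context.md` file would flag a module whose context has, say, no "Core invariants" section before an agent ever relies on it.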

Prompt history isn’t memory unless you distill it

Raw conversation history is not durable. It’s noisy, contradictory, and often too long.

So use this rule:

Conversation is a meeting transcript. Context is the meeting minutes.

A mature workflow actively distills prompt history into context updates:

If a decision affects future work, it must be recorded.
If a rule is corrected, it must be added to the relevant context file.
If a decision changes, the old one must be marked as superseded.

This turns “chat” into governance.
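The three rules above can be sketched as a small decision log. This is an illustrative data structure under assumed names (`Decision`, `DecisionLog`, `superseded_by`); a real project might keep the same information in ADR files or in the context changelog instead.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Decision:
    decision_id: str
    module: str
    text: str
    decided_on: date
    superseded_by: Optional[str] = None  # id of the decision that replaced this one


class DecisionLog:
    def __init__(self):
        self._decisions = {}

    def record(self, decision, supersedes=None):
        """Record a decision; optionally mark an older one as superseded."""
        if decision.decision_id in self._decisions:
            raise ValueError(f"duplicate decision id {decision.decision_id!r}")
        if supersedes is not None:
            old = self._decisions[supersedes]  # KeyError if the old id is unknown
            if old.superseded_by is not None:
                raise ValueError(f"{supersedes!r} is already superseded")
            old.superseded_by = decision.decision_id
        self._decisions[decision.decision_id] = decision

    def active(self, module):
        """Decisions for a module that are still in force, in insertion order."""
        return [
            d for d in self._decisions.values()
            if d.module == module and d.superseded_by is None
        ]
```

The old decision is never deleted, only marked, so the rationale trail survives while `active()` gives the agent only the rules currently in force.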

Live context updates and versioning

If context is the constitution, it must evolve with the project.

When an agent implements a feature, it should update the relevant context file(s) in the same PR:

add or revise invariants/contracts
record edge cases discovered during implementation
add a dated changelog entry

Example:

## Changelog

### 2026-02-02
- Added wishlist support: users can store product IDs in `wishlistItems`.
- Added endpoints: GET/POST/DELETE `/users/{id}/wishlist`.
- Enforced owner-only access and idempotent add/remove behavior.

This is how you prevent context drift: docs and code change together.
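A lightweight guard for "docs and code change together" is a check over the list of files changed in a PR. The sketch below assumes a hypothetical layout where module code lives under `src/<module>/` and context files under `context/<module>_context.md`; neither path is prescribed by this article, so adjust to your repository.

```python
# Hypothetical layout: code for a module lives under src/<module>/ and its
# context file is context/<module>_context.md. Both paths are assumptions.
def modules_missing_context_update(changed_files):
    """Return modules whose code changed without their context file changing."""
    changed = set(changed_files)
    touched = set()
    for path in changed:
        parts = path.split("/")
        if len(parts) >= 3 and parts[0] == "src":
            touched.add(parts[1])
    return sorted(
        module
        for module in touched
        if f"context/{module}_context.md" not in changed
    )
```

In CI, the input could come from the output of `git diff --name-only` against the target branch, with the job failing (or just warning) when the returned list is non-empty.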

The collaboration loop: human as executive, agent as operator

A practical AI-first loop looks like:

Plan
- Human states goal.
- Agent proposes approach and affected modules.
Review
- Human checks architecture/security/product intent.
- Agent revises plan.
Implement
- Agent writes code + tests following context rules.
Verify
- Tests, static checks, and human review.
- Agent fixes issues.
Update context
- Agent updates module context + changelog.
- Human reviews context changes too.

This is not “autopilot.” It’s delegation with constraints.

Case study: wishlist in an e-commerce app

Suppose we add Wishlist to an existing e-commerce system.

Modules involved:

product_context.md (product IDs, catalog lookup rules)
user_auth_context.md (identity, authorization constraints)

The agent loads:

global rules (AGENTS.md)
product_context.md
user_auth_context.md
most recent changelog entries

It proposes:

store wishlistItems: string[] (product IDs) on the user profile
endpoints for add/view/remove
verify product exists before adding
prevent duplicates (idempotent semantics)
enforce “owner-only” authorization

The human adds missing requirements (e.g., removal semantics, a maximum list size, logging).

The agent implements, tests, then updates both context files.
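To show how the proposed rules translate into behavior, here is a minimal in-memory sketch. The class and method names (`WishlistService`, `add`, `remove`, `view`) and the default `max_items` value are assumptions for illustration; the article specifies the rules, not this implementation.

```python
class WishlistError(Exception):
    """Raised when a wishlist rule is violated."""


class WishlistService:
    def __init__(self, known_product_ids, max_items=100):
        # max_items is illustrative; the article only says a max size was requested.
        self._products = set(known_product_ids)
        self._max_items = max_items
        self._wishlists = {}  # user_id -> list of product IDs (wishlistItems)

    def _check_owner(self, actor_id, user_id):
        if actor_id != user_id:
            raise WishlistError("only the owner may access this wishlist")

    def add(self, actor_id, user_id, product_id):
        self._check_owner(actor_id, user_id)
        if product_id not in self._products:
            raise WishlistError("unknown product")
        items = self._wishlists.setdefault(user_id, [])
        if product_id in items:
            return list(items)  # idempotent: adding twice changes nothing
        if len(items) >= self._max_items:
            raise WishlistError("wishlist is full")
        items.append(product_id)
        return list(items)

    def remove(self, actor_id, user_id, product_id):
        self._check_owner(actor_id, user_id)
        items = self._wishlists.get(user_id, [])
        # Removing an item that is not present is a no-op (idempotent remove).
        if product_id in items:
            items.remove(product_id)
        return list(items)

    def view(self, actor_id, user_id):
        self._check_owner(actor_id, user_id)
        return list(self._wishlists.get(user_id, []))
```

Each branch maps to one line of context: the existence check to the product rules, the owner check to the auth rules, and the early return to the idempotency invariant. That mapping is what makes the context file testable.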

The important part isn’t the endpoints — it’s the feedback → distillation loop that makes future work faster and safer.

Benefits (and why it’s worth it)

Speed: agent handles boilerplate and iteration.
Maintainability: context stays synced because it’s part of change flow.
Onboarding: context is a human-readable, module-level truth.
Consistency: standards live in context, not tribal memory.
Compounding improvement: every correction becomes durable guidance.

Limitations (and the one you can’t ignore)

Context windows: too much context triggers the “lost in the middle” effect, where models overlook details buried in the middle of a long prompt.
Ambiguity and gaps: agents hallucinate if constraints aren’t explicit.
Maintenance overhead: context needs ownership and review.
Cost/tooling: large contexts raise per-request token cost and latency.

But the most important caution is this:

Context is a new attack surface.
Bad context can silently steer agents into insecure or noncompliant solutions. Treat context changes like code changes: review, provenance, and automated verification.
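As one possible form of automated verification, context files can be hashed and compared against a manifest that was approved in review. This is a sketch under assumptions: the `*_context.md` naming follows the article, while `build_manifest`, `changed_since`, and the idea of storing the approved manifest alongside the repo are hypothetical choices.

```python
import hashlib
from pathlib import Path


def build_manifest(context_dir):
    """Map each *_context.md file name to the SHA-256 hex digest of its contents."""
    return {
        path.name: hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(Path(context_dir).glob("*_context.md"))
    }


def changed_since(approved_manifest, context_dir):
    """Return file names added, removed, or modified relative to the approved manifest."""
    current = build_manifest(context_dir)
    names = set(approved_manifest) | set(current)
    return sorted(
        name for name in names
        if approved_manifest.get(name) != current.get(name)
    )
```

An agent runner could refuse to load any file reported by `changed_since` until a human has reviewed the change and regenerated the manifest. This detects unreviewed edits; it does not judge whether the reviewed content is itself safe.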

Reference Implementation

For a hands-on example of how this paradigm can be applied in practice, see the companion repository on GitHub:

https://github.com/cangiremir/context-driven-ai-development

The repository demonstrates a modular source context architecture, human-in-the-loop governance rules, example context files, ADRs, and prompt templates aligned with the ideas described in this article.

The future: code as a compiled artifact of intent

If this trajectory continues, “the repo” becomes less like a pile of source code and more like a knowledge system:

context docs define intent and constraints
decisions and rationales are preserved
agents translate intent into code
code is continuously verified and regenerated

In that world, the skill of a software engineer expands: you’re not only writing code, you’re designing systems of constraints that both humans and agents can execute.
