DEV Community: HyunKi Lee

Your Notes App: Why Developer Note Taking Fails to Execute

HyunKi Lee — Tue, 21 Jul 2026 23:31:05 +0000

The Entropy of Unstructured Text

Every developer has a directory of markdown files, a private Git repository, or a scratchpad application filled with software concepts. These files represent the initial state of software design. However, unstructured text exists in a state of high entropy. It lacks schema validation, state transitions, and execution paths. When we write "Users can upload a file and get a parsed JSON response" in a markdown file, we have not written a specification; we have written an aspiration. The gap between this unstructured text and a working system is where most software concepts fail.

The primary issue with developer note taking is not the act of capturing thoughts. Capture is a solved problem. The issue is the structural deficit of the medium. Unstructured text editors impose zero schema constraints. You do not have to define types, handle edge cases, or design database schemas to write down a paragraph. This low friction is necessary for initial cognitive offloading, but it creates a false sense of progress. Because the notes app does not enforce structural integrity, it allows us to bypass the hard architectural decisions. We mistake the ease of writing the note for progress toward building the system.

The Structural Deficit of the Scratchpad

To understand why notes apps become graveyards for software concepts, we must analyze the properties of unstructured text.

First, text lacks relational integrity. If you change a concept in note A, note B does not update. If you decide to rename a core entity from "Account" to "Organization" in your database schema, your unstructured notes remain out of sync, creating immediate cognitive debt.

Second, text lacks validation. There is no compiler for notes. A note can contain logical contradictions, impossible state transitions, and missing dependencies without raising a single error. For example, a note might state that "the system transitions from Pending to Active upon payment" and elsewhere state that "payment is only processed after the account is Active." In a text file, these two statements can coexist indefinitely. In code, they form a circular dependency: activation waits on payment, payment waits on activation, and the account never leaves Pending.
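To make the contradiction concrete, here is a minimal Python sketch that encodes the two note sentences as "requires" rules and searches for a cycle. The rule names (`account_active`, `payment_processed`) and the `find_cycle` helper are illustrative assumptions, not part of any existing tool.

```python
from typing import Dict, List, Optional

# Each entry reads "key requires every item in the list first".
# Both rules paraphrase the two note sentences; the names are hypothetical.
REQUIRES: Dict[str, List[str]] = {
    "account_active": ["payment_processed"],  # Pending -> Active upon payment
    "payment_processed": ["account_active"],  # payment only after Active
}


def find_cycle(requires: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of nodes, or None if there is none."""
    path: List[str] = []  # nodes on the current depth-first path
    done = set()          # nodes fully explored without finding a cycle

    def visit(node: str) -> Optional[List[str]]:
        if node in path:
            # The node is already on the path, so the path loops back on itself.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        for dependency in requires.get(node, []):
            found = visit(dependency)
            if found:
                return found
        path.pop()
        done.add(node)
        return None

    for start in requires:
        found = visit(start)
        if found:
            return found
    return None


cycle = find_cycle(REQUIRES)
print(" -> ".join(cycle) if cycle else "no contradiction found")
```

A markdown file accepts both sentences silently; even this small check refuses them.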

Third, text lacks execution paths. A note cannot be executed, tested, or compiled. To turn a note into a working application, a developer must manually translate every sentence into a structured format: database migrations, API routes, state machines, and user interface components. This translation step requires a massive cognitive leap, and the sheer friction of starting from a blank editor window often kills the momentum of the project.

The Methodology of Structured Transition

To bridge the gap between developer note taking and execution, we must treat notes not as final documentation, but as raw, unstructured input for a structured planning pipeline. We need a systematic methodology to transition from flat text to formal specifications.

This transition involves three distinct phases:

1. Entity Extraction and Schema Definition: Identifying the core domain models and their relationships.
2. State Machine Formalization: Mapping the valid states and transitions of those models.
3. Interface Specification: Defining the boundaries, inputs, and outputs of the system components.

Let us look at how we can represent this pipeline programmatically. Suppose we have a raw note describing a simple subscription billing system. Instead of leaving this as text, we can run it through a parser that extracts the domain model and outputs a validated schema.

Here is a TypeScript-style sketch of how a structured planning system processes unstructured developer notes. The two extraction methods return hard-coded results that stand in for real text analysis:

// Pseudo-code: Structured Planning Pipeline
interface RawNote {
  content: string;
  metadata: {
    created_at: string;
    tags: string[];
  };
}

interface DomainEntity {
  name: string;
  properties: Array<{ name: string; type: string; required: boolean }>;
  relations: Array<{ target: string; type: "one-to-one" | "one-to-many" | "many-to-many" }>;
}

interface StateMachine {
  entity: string;
  states: string[];
  transitions: Array<{ from: string; to: string; trigger: string }>;
}

interface SystemSpecification {
  entities: DomainEntity[];
  stateMachines: StateMachine[];
}

class PlanningSystem {
  // Parses raw text to extract structured domain entities
  private extractEntities(text: string): DomainEntity[] {
    // The system analyzes nouns and relationships in the text
    // to construct a normalized relational schema.
    return [
      {
        name: "Subscription",
        properties: [
          { name: "id", type: "uuid", required: true },
          { name: "status", type: "string", required: true },
          { name: "billing_interval", type: "string", required: true }
        ],
        relations: [
          { target: "User", type: "one-to-many" }
        ]
      }
    ];
  }

  // Parses raw text to extract state transitions
  private extractStateMachines(text: string): StateMachine[] {
    // The system identifies lifecycle descriptions and maps them
    // to a formal state machine representation.
    return [
      {
        entity: "Subscription",
        states: ["Pending", "Active", "PastDue", "Canceled"],
        transitions: [
          { from: "Pending", to: "Active", trigger: "payment_success" },
          { from: "Active", to: "PastDue", trigger: "payment_failure" },
          { from: "PastDue", to: "Active", trigger: "payment_success" },
          { from: "PastDue", to: "Canceled", trigger: "grace_period_expiry" }
        ]
      }
    ];
  }

  public compile(note: RawNote): SystemSpecification {
    const entities = this.extractEntities(note.content);
    const stateMachines = this.extractStateMachines(note.content);

    // Validate that all states referenced in transitions exist
    this.validateStateTransitions(stateMachines, entities);

    return {
      entities,
      stateMachines
    };
  }

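  private validateStateTransitions(machines: StateMachine[], entities: DomainEntity[]) {
    for (const machine of machines) {
      const entity = entities.find(e => e.name === machine.entity);
      if (!entity) {
        throw new Error(`State machine references non-existent entity: ${machine.entity}`);
      }
      // Ensure the entity has a status or state property to hold the state
      const hasStateField = entity.properties.some(p => p.name === "status" || p.name === "state");
      if (!hasStateField) {
        throw new Error(`Entity ${entity.name} lacks a state field to support the state machine`);
      }
      // Ensure every transition starts and ends in a declared state
      for (const t of machine.transitions) {
        for (const state of [t.from, t.to]) {
          if (!machine.states.includes(state)) {
            throw new Error(`Transition "${t.trigger}" on ${machine.entity} references undeclared state: ${state}`);
          }
        }
      }
    }
  }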
}

By running our unstructured notes through a validation pipeline like the one sketched above, we immediately expose logical gaps. If our note mentions a "Canceled" state but our database schema has no way to store or transition to that state, the compiler flags it. We are forced to resolve the architectural ambiguity before we write a single line of application code.
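The "Canceled" case can be sketched in a few lines of Python. This is an illustrative check rather than the pipeline's real implementation: the `check_machine` function and its tuple-based transition format are assumptions made for the example. It reports transitions that point at undeclared states and declared states that nothing can reach.

```python
from typing import List, Set, Tuple

Transition = Tuple[str, str, str]  # (from_state, to_state, trigger)


def check_machine(states: List[str], initial: str,
                  transitions: List[Transition]) -> List[str]:
    """Return human-readable problems; an empty list means both checks pass."""
    problems: List[str] = []
    declared: Set[str] = set(states)

    # 1. Every transition endpoint must be a declared state.
    for src, dst, trigger in transitions:
        for name in (src, dst):
            if name not in declared:
                problems.append(f"{trigger}: undeclared state '{name}'")

    # 2. Every declared state must be reachable from the initial state.
    reachable = {initial}
    frontier = [initial]
    while frontier:
        current = frontier.pop()
        for src, dst, _trigger in transitions:
            if src == current and dst not in reachable:
                reachable.add(dst)
                frontier.append(dst)
    for name in states:
        if name not in reachable:
            problems.append(f"state '{name}' is unreachable from '{initial}'")
    return problems


STATES = ["Pending", "Active", "PastDue", "Canceled"]
# The note mentions "Canceled" but forgets to say how a subscription gets there.
INCOMPLETE = [
    ("Pending", "Active", "payment_success"),
    ("Active", "PastDue", "payment_failure"),
    ("PastDue", "Active", "payment_success"),
]
for problem in check_machine(STATES, "Pending", INCOMPLETE):
    print(problem)
```

Adding the missing `PastDue` to `Canceled` transition makes the problem list empty, which is the point at which the plan is ready for implementation.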

The Trade-offs of Formalization

A common objection to this approach is that formalizing ideas too early kills creativity. The argument is that the friction of schema definition prevents the free flow of thoughts. This is a valid concern. If you must write a complete JSON schema just to jot down an idea on a train, you will write down fewer ideas.

The solution is not to abandon developer note taking, but to separate the capture phase from the planning phase.

The Capture Phase: Low friction, unstructured, highly creative. Use whatever tool is fastest.
The Planning Phase: High discipline, structured, adversarial. This is where you import the raw note into a system that forces you to define the schema, the state transitions, and the API boundaries.

The mistake most developers make is trying to go directly from the Capture Phase to the Execution Phase (writing code) without passing through the Planning Phase. They open an IDE and start writing React components or database migrations based on a vague markdown file. This leads to frequent refactoring, abandoned codebases, and wasted effort as they discover logical contradictions mid-implementation.

By introducing a structured planning phase, you narrow the decision space early. You identify critical factors and commit resources only after the plan survives adversarial review. This is systems-thinking applied to software design: planning is not a prelude to execution; it is the high-leverage phase of execution itself.

Conclusion

Unstructured notes are excellent for capturing the spark of an idea, but they are a poor foundation for building software. Without a systematic transition from text to structure, your notes app will remain a graveyard of good concepts. By separating capture from planning, and using structured pipelines to validate your ideas before writing code, you can ensure that your concepts actually make it to production.

This article is a preview of the system design methodology we are building at Bridge; sign up for early access and our newsletter at https://bridgedev.io/?utm_source=devto&utm_medium=social&utm_campaign=prelaunch to join the private preview.

Agentic Note Expansion: From Raw Ideas to App Specs

HyunKi Lee — Sun, 12 Jul 2026 08:01:11 +0000

Agentic Note Expansion with Loops and Goals

Raw, fragmented notes captured on a mobile device are a common starting point for software projects. A developer might jot down a few lines about a data model, a user flow, or a specific business rule while away from their keyboard. However, translating these unstructured fragments into rigorous, actionable engineering artifacts usually requires hours of manual synthesis.

Traditional static templates and simple LLM prompts often fail at this task. They lack the context to resolve ambiguities, leading to generic outputs that do not reflect the original intent. To solve this, we must look at the problem through the lens of system design. By treating note expansion as an agentic workflow governed by closed-loop feedback and explicit goals, we can systematically transform raw input into structured user stories, data schemas, and screen-by-screen UX flows.

The Problem with Static Expansion

When a system attempts to expand a brief note using a single prompt-response cycle, it operates without a feedback loop. If the input is "build an offline-first task manager with sync," a static expansion might generate a standard todo-list schema. It misses the critical engineering questions: What is the conflict resolution strategy? How are binary assets handled offline? What is the sync protocol?

Without a mechanism to identify missing information, the system either makes assumptions that are often incorrect or produces high-level platitudes. To build a reliable tool, the system must be able to:

1. Analyze the input against a defined target schema.
2. Identify gaps in the logic or requirements.
3. Formulate specific queries or sub-tasks to resolve those gaps.
4. Execute iterative refinement loops until the output meets a quality threshold.

This is the core of agentic note expansion.
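The first two capabilities, comparing the input to a target schema and listing the gaps, can be sketched as follows. The section names and questions in `TARGET_SECTIONS` are hypothetical and chosen to match the offline-first task manager example; a real system would derive them from its own goal definition.

```python
from typing import Dict, List

# Hypothetical target: the sections a finished spec must contain, each paired
# with the question to raise when that section is missing.
TARGET_SECTIONS: Dict[str, str] = {
    "data_schema": "Which entities and fields must be stored?",
    "sync_protocol": "How do local changes reach the server?",
    "conflict_resolution": "Which write wins when two devices edit the same record?",
}


def find_gaps(artifacts: Dict[str, str]) -> List[str]:
    """Return the open question for every target section that is missing or blank."""
    return [
        f"{section}: {question}"
        for section, question in TARGET_SECTIONS.items()
        if not artifacts.get(section, "").strip()
    ]


# A first draft that covers the schema but leaves sync blank and omits conflicts.
draft = {"data_schema": "tasks(id, title, done)", "sync_protocol": "  "}
for gap in find_gaps(draft):
    print(gap)
```

Each returned string is a concrete question the next pass must answer, which is what separates this from a single prompt that quietly guesses.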

The Architecture of a Closed-Loop Expansion System

To implement this, we design a system composed of three main components: the Planner, the Executor, and the Evaluator. This classic agentic triad operates over a shared state, executing a loop until a pre-defined goal is satisfied.

+-------------------------------------------------+
|                  Shared State                   |
|  (Raw Input, Current Artifacts, Gap Registry)   |
+-------------------------------------------------+
       ^                       |               ^
       |                       v               |
+--------------+       +--------------+       +--------------+
|   Planner    | ----> |   Executor   | ----> |  Evaluator   |
| (Finds Gaps) |       | (Drafts Spec)|       | (Checks Goal)|
+--------------+       +--------------+       +--------------+

1. The Planner

The Planner reads the raw input and establishes the target goals. For a software feature, the goals might include a relational database schema, a set of OpenAPI specifications, and a step-by-step user flow. The Planner registers what information is missing from the initial note to complete these targets.

2. The Executor

The Executor is responsible for generating the actual content. It takes the current state and the Planner's instructions to write the markdown, SQL, or JSON specifications. It does not operate in a vacuum; it only addresses the specific gaps highlighted by the Planner.

3. The Evaluator

The Evaluator acts as an adversarial gatekeeper. It tests the generated artifacts against strict validation rules. For example, if the Executor generated a database schema, the Evaluator checks for foreign key integrity, missing indexes on frequently queried fields, and compliance with the offline-sync requirements. If validation fails, the Evaluator writes the failures back to the state as new gaps, and the loop repeats.
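One of the Evaluator's rules, foreign key integrity, might look like the sketch below. The dictionary-based schema format and the `check_foreign_keys` name are assumptions made for illustration; the point is that failures come back as plain gap strings that can be written into the shared state.

```python
from typing import Dict, List, TypedDict


class Table(TypedDict):
    columns: List[str]
    foreign_keys: Dict[str, str]  # local column -> "table.column"


def check_foreign_keys(schema: Dict[str, Table]) -> List[str]:
    """Return one gap per foreign key that points at something undefined."""
    gaps: List[str] = []
    for table_name, table in schema.items():
        for column, target in table["foreign_keys"].items():
            if column not in table["columns"]:
                gaps.append(f"{table_name}.{column}: foreign key column is not defined")
            target_table, _, target_column = target.partition(".")
            if target_table not in schema:
                gaps.append(f"{table_name}.{column}: references missing table '{target_table}'")
            elif target_column not in schema[target_table]["columns"]:
                gaps.append(f"{table_name}.{column}: references missing column '{target}'")
    return gaps


# The Executor drafted a tasks table but never defined the projects table.
drafted: Dict[str, Table] = {
    "tasks": {
        "columns": ["id", "title", "project_id"],
        "foreign_keys": {"project_id": "projects.id"},
    },
}
print(check_foreign_keys(drafted))
```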

Implementing the Loop: A Pseudo-Code Implementation

The following pseudo-code demonstrates how to structure this iterative loop in an application. This pattern ensures that the system does not exit until the artifacts meet the defined quality criteria or the maximum iteration limit is reached.

import logging

logger = logging.getLogger(__name__)


class ExpansionState:
    def __init__(self, raw_note: str):
        self.raw_note = raw_note
        self.artifacts = {}
        self.detected_gaps = []
        self.iteration_count = 0


# planner, executor, and evaluator are passed in as collaborators; any objects
# that expose the three methods called below will work.
def expand_note_workflow(raw_note: str, planner, executor, evaluator,
                         max_iterations: int = 5) -> ExpansionState:
    state = ExpansionState(raw_note)

    # Initial planning phase
    state.detected_gaps = planner.analyze_initial_input(state.raw_note)

    while state.detected_gaps and state.iteration_count < max_iterations:
        # Address the highest priority gaps
        active_gaps = state.detected_gaps[:3]

        # Executor drafts updates based on active gaps
        drafted_updates = executor.generate_specifications(state.artifacts, active_gaps)

        # Apply updates to the shared state
        state.artifacts.update(drafted_updates)

        # Evaluator reviews the updated state
        evaluation_result = evaluator.validate_artifacts(state.artifacts)

        # Update the gap registry with new or unresolved issues
        state.detected_gaps = evaluation_result.remaining_gaps
        state.iteration_count += 1

    if state.detected_gaps:
        # Log warnings if the system exited due to iteration limits
        logger.warning("Expansion completed with unresolved gaps", extra={"gaps": state.detected_gaps})

    return state

Trade-offs and System Constraints

While this closed-loop methodology produces highly structured and technically sound specifications, it introduces specific trade-offs that system architects must consider.

Latency vs. Quality

A single-prompt generation takes seconds but often yields shallow results. A closed-loop system running multiple iterations can take significantly longer to complete. For asynchronous workflows, such as processing notes in the background after a user closes their mobile app, this latency is acceptable. For real-time interactive interfaces, the system must provide intermediate feedback to the user to maintain responsiveness.

Token Consumption and Cost

Iterative loops inherently consume more tokens. Each pass requires sending the current state, the generated artifacts, and the evaluation feedback back to the model. To mitigate this, the system should employ state-pruning techniques, passing only the relevant diffs and specific schemas rather than the entire project history on every iteration.
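A minimal sketch of state pruning, assuming artifacts are stored as a name-to-text dictionary as in the loop above. The function names and payload keys are hypothetical; the idea is simply to resend only what changed and refer to everything else by name.

```python
from typing import Dict, List


def artifact_diff(previous: Dict[str, str], current: Dict[str, str]) -> Dict[str, str]:
    """Return only the artifacts that were added or changed since the last pass."""
    return {
        name: body
        for name, body in current.items()
        if previous.get(name) != body
    }


def build_iteration_payload(previous: Dict[str, str], current: Dict[str, str],
                            gaps: List[str]) -> Dict[str, object]:
    """Assemble what the next pass receives: changed bodies, other names, open gaps."""
    changed = artifact_diff(previous, current)
    return {
        "changed_artifacts": changed,
        "unchanged_artifact_names": sorted(set(current) - set(changed)),
        "open_gaps": gaps,
    }


before = {"schema.sql": "CREATE TABLE tasks (id TEXT);", "flows.md": "Dashboard -> Detail"}
after = {
    "schema.sql": "CREATE TABLE tasks (id TEXT PRIMARY KEY);",
    "flows.md": "Dashboard -> Detail",
    "api.yaml": "paths: {}",
}
payload = build_iteration_payload(before, after, ["api.yaml: no endpoints defined"])
print(payload["unchanged_artifact_names"])
```

Here `flows.md` is referenced by name only, so its body is not paid for again on the next pass.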

Loop Termination and Hallucination

There is a risk of the Executor and Evaluator cycling without progress when a requirement cannot be resolved from the available context. For instance, the Evaluator might keep flagging a schema as incomplete while the Executor lacks the information to complete it without human input. Setting a strict iteration cap and implementing a fallback mechanism that flags the ambiguity for human review is essential for production stability.
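One way to implement that fallback is to track the gap registry across passes and escalate any gap that refuses to go away. The sketch below is an assumption about how this could work, not a description of an existing system; `split_stalled_gaps` and the `stall_after` threshold are illustrative names.

```python
from typing import List, Set, Tuple


def split_stalled_gaps(history: List[Set[str]],
                       stall_after: int = 2) -> Tuple[Set[str], Set[str]]:
    """Split the latest gaps into (retry, needs_human).

    A gap is treated as stalled when it appears in each of the last
    `stall_after` passes, meaning repeated drafting did not resolve it.
    """
    if not history:
        return set(), set()
    latest = history[-1]
    if len(history) < stall_after:
        return set(latest), set()
    stalled = set.intersection(*history[-stall_after:])
    return latest - stalled, stalled


# Gap registry after three passes: the conflict strategy never gets resolved.
passes = [
    {"conflict strategy undefined", "missing index on tasks.updated_at"},
    {"conflict strategy undefined", "sync endpoint lacks auth"},
    {"conflict strategy undefined"},
]
retry, needs_human = split_stalled_gaps(passes)
print("retry:", sorted(retry))
print("needs human:", sorted(needs_human))
```

Escalating stalled gaps early also saves the iterations that would otherwise be spent re-drafting something only a person can decide.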

The Path Forward for Developer Tooling

Moving beyond simple text completion requires building systems that understand intent, structure, and validation. By implementing agentic loops that treat note expansion as an engineering problem, we can turn fragmented thoughts into precise, production-ready specifications.

This approach forms the foundation of how we think about software planning and execution. This article is a preview of the kind of systems-thinking methodology Bridge will publish at launch.

System Design Prompting for Building Better Mobile Apps

HyunKi Lee — Mon, 06 Jul 2026 09:43:26 +0000

System Design Prompting for Developers

The Cost of Vague Specifications

When translating a mobile app concept into a working system, the widest gap is not between writing code and compiling it. The widest gap is between a vague product idea and a concrete technical specification.

Most developers have experienced the failure modes of poorly defined requirements. A product manager requests a real-time messaging feature. The developer builds a polling mechanism because it is fast to implement. Later, the product manager reveals that the feature must support presence indicators and typing states, which require a persistent WebSocket connection. The initial implementation must be discarded. This is not a failure of coding skill. It is a failure of system design planning.

To prevent these costly rewrites, developers can use system design prompting. This methodology uses structured prompts to guide an LLM-based planning system through the process of translating high-level product ideas into precise technical specifications. By front-loading the planning phase, you narrow the decision space early and identify critical architectural constraints before writing a single line of code.

The System Design Prompting Methodology

System design prompting is not about asking a generic assistant to write your code. It is a structured, multi-step process that treats the planning system as an adversarial reviewer. The goal is to produce three core artifacts before implementation begins:

A normalized data schema that reflects the business logic.
A complete set of user stories with explicit edge cases.
A screen-by-screen user experience flow that maps directly to database operations.

To achieve this, we use a three-phase prompting workflow. Each phase builds on the output of the previous phase, ensuring that the technical specifications remain consistent throughout the design process.

Phase 1: Schema Definition and Constraint Mapping

The first phase focuses entirely on the data model. Instead of asking for a database schema directly, we prompt the system to identify the core entities, their relationships, and the constraints that govern them.

Here is a pseudo-code representation of how to structure the initial prompt for the planning system:

Define SystemDesignPrompt_Phase1:
  Input: High-level product description
  Output: Normalized relational schema (PostgreSQL dialect)

  Instructions:
    1. Identify all core entities required to support the description.
    2. Define primary and foreign keys for each entity.
    3. Specify data types, nullability, and unique constraints.
    4. Identify potential race conditions or concurrency issues.
    5. Output the schema as valid DDL.

By forcing the system to output valid DDL first, you establish a concrete foundation. If the system cannot represent the product idea in a relational schema, the product idea itself is likely under-defined.
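In application code, the Phase 1 prompt can be assembled from a fixed instruction list so that every product description receives the same constraints. This is a minimal sketch; the `build_phase1_prompt` helper and the exact output layout are assumptions, while the five instructions are taken from the prompt definition above.

```python
from typing import List

PHASE1_INSTRUCTIONS: List[str] = [
    "Identify all core entities required to support the description.",
    "Define primary and foreign keys for each entity.",
    "Specify data types, nullability, and unique constraints.",
    "Identify potential race conditions or concurrency issues.",
    "Output the schema as valid DDL.",
]


def build_phase1_prompt(product_description: str) -> str:
    """Render the Phase 1 prompt for a single product description."""
    description = product_description.strip()
    if not description:
        raise ValueError("product description must not be empty")
    steps = "\n".join(
        f"{number}. {text}"
        for number, text in enumerate(PHASE1_INSTRUCTIONS, start=1)
    )
    return (
        "Input: High-level product description\n"
        f"{description}\n\n"
        "Output: Normalized relational schema (PostgreSQL dialect)\n\n"
        f"Instructions:\n{steps}\n"
    )


prompt = build_phase1_prompt("Private groups that users can join until a member cap is reached.")
print(prompt)
```

Keeping the instructions in a list rather than a hand-edited string makes it harder for a step to be dropped between runs.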

Phase 2: User Story Generation and Edge Case Analysis

Once the schema is established, the second phase maps user actions to database operations. This phase prevents the common mistake of designing user interfaces that require impossible or highly inefficient database queries.

We prompt the system to generate user stories using a strict template:

As a [User Role]
I want to [Action]
So that [Value]
Database Operations: [Specific SELECT, INSERT, UPDATE, or DELETE statements]
Edge Cases: [List of potential failure states and how the system should handle them]

For example, if the user story is "As a user, I want to join a private group," the system must specify the exact database transaction required to insert the membership record and verify that the group has not reached its maximum capacity. If the transaction fails due to a constraint violation, the system must define the error response returned to the client.

Phase 3: Screen-by-Screen UX Flow Mapping

The final phase translates the user stories into a concrete user experience flow. For each screen in the mobile application, the system must define:

The data required to render the screen (the read path).
The user actions available on the screen (the write path).
The state transitions between screens.

This step ensures that the frontend and backend teams are aligned on the API contract before any code is written. The frontend team knows exactly what data to expect, and the backend team knows exactly what endpoints to expose.

Trade-offs and Considerations

While system design prompting reduces architectural errors, it requires a significant upfront investment of time. Developers must resist the urge to begin coding immediately.

For simple applications with well-understood patterns, this level of planning may feel excessive. However, for complex systems with distributed state, real-time requirements, or strict consistency constraints, the time spent planning is recovered during the integration phase. It is far cheaper to modify a text-based specification than to refactor a database schema and rewrite API endpoints in a production environment.

Additionally, the quality of the output depends heavily on the precision of the prompts. Vague prompts yield vague specifications. Developers must treat prompt engineering as a form of system design, applying the same rigor to their instructions as they do to their code.

Conclusion

Planning is not a prelude to execution. It is the high-leverage phase of the work. By using system design prompting to translate vague ideas into concrete technical specifications, developers can identify architectural bottlenecks early, align team members on API contracts, and reduce the need for costly refactoring.

Spec: A Developer Workflow Case Study in Mobile Application Planning

HyunKi Lee — Tue, 30 Jun 2026 02:35:44 +0000

Software architecture suffers when execution outpaces planning. In mobile development, jumping straight to code without a rigorous schema and defined user flows leads to immediate technical debt. The database schema drifts from the UI state, and edge cases in user navigation require late-stage refactoring.

To prevent this, we must treat planning as a high-leverage phase of execution. This developer workflow case study demonstrates how to systematically decompose a raw mobile product concept into a structured development plan. We will generate core pillars, detailed user stories, a relational data schema, and screen-by-screen UX flows before writing a single line of application code.

The Problem with Traditional Specifications

Solo developers and small product teams often skip formal specification because traditional tools require manual synchronization. When the UI changes, the developer must carry the matching schema change by hand across three different documents. This manual overhead causes the specification to rot.

When specifications rot, developers default to improvisational coding. This is the practice of designing database tables and API payloads on the fly while writing UI components. It leads to several critical failure modes:

Orphaned State: UI components requesting data that does not exist in the local database.
Race Conditions: Network synchronization logic written without a clear state machine, leading to duplicate writes.
Scope Creep: Features expanding mid-sprint because the boundaries of the user story were never defined.

To solve this, we need a system where the specification is treated as code: structured, relational, and deterministic.

The Case Study: Offline-First Field Inventory App

To demonstrate this methodology, we will walk through the planning phase of a mobile application designed for field technicians. The core requirement is offline-first inventory tracking. Technicians must be able to view, update, and log inventory changes in remote areas with intermittent connectivity.

We begin with a raw product concept: "An app that lets technicians manage warehouse stock on their phones, even when offline, and syncs back to the main database when they get a signal."

We pass this raw concept to the planner, our structured system designed to decompose product requirements.

Step 1: Domain Modeling and Core Pillars

The planner first identifies the core domain entities and their relationships. Instead of writing a vague text document, the system outputs a structured domain model. This model serves as the single source of truth for both the database schema and the UI state.

Here is the structured representation of our domain model:

{
  "domain": "FieldInventory",
  "entities": {
    "Item": {
      "properties": ["id", "sku", "name", "quantity", "warehouse_id"],
      "relations": {
        "warehouse": {
          "type": "belongs_to",
          "foreign_key": "warehouse_id"
        }
      }
    },
    "Warehouse": {
      "properties": ["id", "name", "location"],
      "relations": {
        "items": {
          "type": "has_many",
          "foreign_key": "warehouse_id"
        }
      }
    },
    "SyncTransaction": {
      "properties": ["id", "table_name", "record_id", "action", "payload", "timestamp"],
      "relations": {}
    }
  }
}

By defining this structure first, we establish clear boundaries. We know exactly what entities exist and how they relate to one another.
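Because the model is structured data, it can be checked mechanically. The sketch below copies the domain model into a Python dictionary and verifies that every relation's foreign key is declared somewhere; `check_relations` is a hypothetical helper, and the rule that a `has_many` key must live on another entity is an assumption about how the model is meant to be read.

```python
import copy
from typing import Any, Dict, List

MODEL: Dict[str, Any] = {
    "domain": "FieldInventory",
    "entities": {
        "Item": {
            "properties": ["id", "sku", "name", "quantity", "warehouse_id"],
            "relations": {"warehouse": {"type": "belongs_to", "foreign_key": "warehouse_id"}},
        },
        "Warehouse": {
            "properties": ["id", "name", "location"],
            "relations": {"items": {"type": "has_many", "foreign_key": "warehouse_id"}},
        },
        "SyncTransaction": {
            "properties": ["id", "table_name", "record_id", "action", "payload", "timestamp"],
            "relations": {},
        },
    },
}


def check_relations(model: Dict[str, Any]) -> List[str]:
    """Return a problem for every relation whose foreign key is not declared."""
    problems: List[str] = []
    entities = model["entities"]
    for name, entity in entities.items():
        for relation_name, relation in entity["relations"].items():
            key = relation["foreign_key"]
            if relation["type"] == "belongs_to":
                # The key must be a property of the entity that owns the relation.
                if key not in entity["properties"]:
                    problems.append(f"{name}.{relation_name}: missing property '{key}'")
            elif relation["type"] == "has_many":
                # The key lives on the child side, so another entity must declare it.
                holders = [
                    other for other, spec in entities.items()
                    if other != name and key in spec["properties"]
                ]
                if not holders:
                    problems.append(f"{name}.{relation_name}: no entity declares '{key}'")
            else:
                problems.append(f"{name}.{relation_name}: unknown relation type '{relation['type']}'")
    return problems


print(check_relations(MODEL))

# Simulate drift: someone removes the column but leaves both relations in place.
broken = copy.deepcopy(MODEL)
broken["entities"]["Item"]["properties"].remove("warehouse_id")
for problem in check_relations(broken):
    print(problem)
```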

Step 2: Generating Structured User Stories

With the domain model established, the system generates user stories. Traditional user stories are often too vague to be actionable. A story like "As a technician, I want to update inventory" leaves too many open questions.

The planner generates stories with strict preconditions, flows, and postconditions. This ensures that every story is testable and directly translatable into integration tests.

story_id: US-01
title: Offline Inventory Update
actor: Field Technician
preconditions:
  - User is authenticated.
  - Local database is initialized and populated with cached data.
flow:
  1. User navigates to the Dashboard.
  2. User selects a specific Warehouse.
  3. User selects an Item from the warehouse list.
  4. User inputs a new quantity value.
  5. User commits the change by tapping "Update".
postconditions:
  - The local SQLite database updates the quantity for the selected Item.
  - A new record is appended to the SyncTransaction table with the action "UPDATE" and the updated payload.
  - The UI reflects the updated quantity immediately.

This level of detail eliminates ambiguity. The developer knows exactly what database operations must occur and what UI states must be handled.

Step 3: Relational Data Schema Generation

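Because the domain model and user stories are structured, generating the relational database schema is largely mechanical: each entity becomes a table and each belongs_to relation becomes a foreign key. For an offline-first mobile app, SQLite is a common choice for the local store.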

The system generates the following SQL schema based on the domain model:

CREATE TABLE warehouses (
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    location TEXT
);

CREATE TABLE items (
    id TEXT PRIMARY KEY,
    sku TEXT UNIQUE NOT NULL,
    name TEXT NOT NULL,
    quantity INTEGER NOT NULL DEFAULT 0,
    warehouse_id TEXT NOT NULL,
    FOREIGN KEY (warehouse_id) REFERENCES warehouses(id) ON DELETE CASCADE
);

CREATE TABLE sync_transactions (
    id TEXT PRIMARY KEY,
    table_name TEXT NOT NULL,
    record_id TEXT NOT NULL,
    action TEXT NOT NULL CHECK (action IN ('INSERT', 'UPDATE', 'DELETE')),
    payload TEXT NOT NULL,
    timestamp INTEGER NOT NULL
);

This schema directly supports the offline-first requirement. The sync_transactions table implements the outbox pattern: every local mutation is recorded there so that it can be replicated to the backend server when connectivity is restored.
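The outbox only works if the local change and its outbox record are written atomically. Here is a minimal sketch using Python's built-in `sqlite3` module and the schema above; the helper names (`open_database`, `update_quantity`, `pending_changes`) and the JSON payload shape are assumptions. It also enables `PRAGMA foreign_keys`, since SQLite does not enforce foreign keys unless that pragma is turned on for the connection.

```python
import json
import sqlite3
import time
import uuid

SCHEMA = """
CREATE TABLE warehouses (id TEXT PRIMARY KEY, name TEXT NOT NULL, location TEXT);
CREATE TABLE items (
    id TEXT PRIMARY KEY,
    sku TEXT UNIQUE NOT NULL,
    name TEXT NOT NULL,
    quantity INTEGER NOT NULL DEFAULT 0,
    warehouse_id TEXT NOT NULL,
    FOREIGN KEY (warehouse_id) REFERENCES warehouses(id) ON DELETE CASCADE
);
CREATE TABLE sync_transactions (
    id TEXT PRIMARY KEY,
    table_name TEXT NOT NULL,
    record_id TEXT NOT NULL,
    action TEXT NOT NULL CHECK (action IN ('INSERT', 'UPDATE', 'DELETE')),
    payload TEXT NOT NULL,
    timestamp INTEGER NOT NULL
);
"""


def open_database() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default
    conn.executescript(SCHEMA)
    return conn


def update_quantity(conn: sqlite3.Connection, item_id: str, quantity: int) -> None:
    """Apply the change and queue it for sync in one transaction."""
    with conn:  # both statements commit together or roll back together
        cursor = conn.execute(
            "UPDATE items SET quantity = ? WHERE id = ?", (quantity, item_id)
        )
        if cursor.rowcount != 1:
            raise LookupError(f"unknown item: {item_id}")
        conn.execute(
            "INSERT INTO sync_transactions "
            "(id, table_name, record_id, action, payload, timestamp) "
            "VALUES (?, 'items', ?, 'UPDATE', ?, ?)",
            (str(uuid.uuid4()), item_id, json.dumps({"quantity": quantity}),
             int(time.time() * 1000)),
        )


def pending_changes(conn: sqlite3.Connection) -> list:
    """Return queued mutations in the order they were recorded."""
    return conn.execute(
        "SELECT record_id, action, payload FROM sync_transactions "
        "ORDER BY timestamp, rowid"
    ).fetchall()


conn = open_database()
with conn:
    conn.execute("INSERT INTO warehouses VALUES ('w1', 'North Depot', NULL)")
    conn.execute("INSERT INTO items VALUES ('i1', 'SKU-1', 'Valve', 4, 'w1')")
update_quantity(conn, "i1", 7)
print(pending_changes(conn))
```

A sync worker would read `pending_changes`, send each row to the server, and delete the row once the server acknowledges it.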

Step 4: Screen-by-Screen UX Flows as State Machines

The final step in the planning phase is mapping out the user interface transitions. Instead of relying solely on visual design tools, we represent the navigation graph as a state machine. This prevents dead-ends in the UI and ensures that all edge cases, such as network loss during a sync operation, are handled.

Here is a TypeScript representation of the navigation state machine:

type ScreenState = 'Dashboard' | 'WarehouseDetail' | 'ItemDetail' | 'SyncStatus';

interface NavigationTransition {
  current: ScreenState;
  event: 'SELECT_WAREHOUSE' | 'SELECT_ITEM' | 'BACK' | 'VIEW_SYNC' | 'SYNC_COMPLETE';
  next: ScreenState;
}

const navigationGraph: NavigationTransition[] = [
  { current: 'Dashboard', event: 'SELECT_WAREHOUSE', next: 'WarehouseDetail' },
  { current: 'WarehouseDetail', event: 'SELECT_ITEM', next: 'ItemDetail' },
  { current: 'ItemDetail', event: 'BACK', next: 'WarehouseDetail' },
  { current: 'WarehouseDetail', event: 'BACK', next: 'Dashboard' },
  { current: 'Dashboard', event: 'VIEW_SYNC', next: 'SyncStatus' },
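  { current: 'SyncStatus', event: 'BACK', next: 'Dashboard' },
  // Assumed: a completed sync returns the user to the Dashboard
  { current: 'SyncStatus', event: 'SYNC_COMPLETE', next: 'Dashboard' }
];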

By modeling navigation as a state machine, we can write automated tests to verify that every screen transition is valid. This approach prevents common mobile bugs where a user double-taps a button and pushes duplicate screens onto the navigation stack.
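The kind of automated test described here can be sketched in Python against the same six transitions. The lookup-table layout and the helper names are illustrative assumptions; the design choice that matters is that an event with no transition from the current screen is ignored instead of pushing a new screen.

```python
from typing import Dict, Set, Tuple

# (current screen, event) -> next screen, mirroring the navigation graph above.
NAVIGATION: Dict[Tuple[str, str], str] = {
    ("Dashboard", "SELECT_WAREHOUSE"): "WarehouseDetail",
    ("WarehouseDetail", "SELECT_ITEM"): "ItemDetail",
    ("ItemDetail", "BACK"): "WarehouseDetail",
    ("WarehouseDetail", "BACK"): "Dashboard",
    ("Dashboard", "VIEW_SYNC"): "SyncStatus",
    ("SyncStatus", "BACK"): "Dashboard",
}


def next_screen(current: str, event: str) -> str:
    """Return the next screen, or stay put if the event is not valid here."""
    return NAVIGATION.get((current, event), current)


def unreachable_screens(start: str) -> Set[str]:
    """Return every screen in the graph that cannot be reached from `start`."""
    all_screens = {source for source, _event in NAVIGATION} | set(NAVIGATION.values())
    seen = {start}
    frontier = [start]
    while frontier:
        current = frontier.pop()
        for (source, _event), target in NAVIGATION.items():
            if source == current and target not in seen:
                seen.add(target)
                frontier.append(target)
    return all_screens - seen


def dead_end_screens() -> Set[str]:
    """Return screens that have no outgoing transition at all."""
    return set(NAVIGATION.values()) - {source for source, _event in NAVIGATION}


# A double tap sends SELECT_WAREHOUSE twice; the second event is a no-op.
screen = "Dashboard"
for event in ("SELECT_WAREHOUSE", "SELECT_WAREHOUSE"):
    screen = next_screen(screen, event)
print(screen)
print(unreachable_screens("Dashboard"), dead_end_screens())
```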

Trade-offs and Analysis

Front-loading the planning phase requires an initial investment of time. Developers who are eager to write code may view this as a bottleneck. However, the trade-offs favor this structured approach:

Reduced Decision Fatigue: When you sit down to write code, you are not deciding how the database schema should look or how the navigation should flow. You are simply implementing a pre-verified specification.
Clearer Boundaries: By defining the schema and user stories upfront, you prevent scope creep. If a new requirement arises, it must first be integrated into the specification before code is modified.
Automated Verification: Structured specifications can be used to generate boilerplate code, database migrations, and test suites automatically.

The system acts as an adversarial reviewer during this process. It identifies missing edge cases, such as what happens to the sync queue if the network drops mid-payload, before you write any code.

Conclusion

Planning is not a prelude to execution; it is the high-leverage phase of execution. By using a structured system to map out domain models, user stories, database schemas, and navigation flows, developers can eliminate architectural drift and build more robust applications.

Large Context Window Prompting: 2M Token Guide

HyunKi Lee — Fri, 26 Jun 2026 22:39:15 +0000

Structuring Prompts for 2M Token Contexts: Maintaining Retrieval Accuracy at Scale

The expansion of Large Language Model (LLM) context windows to 2 million tokens changes how we think about in-context learning. However, a larger context window does not guarantee perfect recall. Standard Needle In A Haystack (NIAH) tests often use simple, isolated keys. In real-world engineering scenarios, where you feed an entire codebase, database schema, and UX specification into a model, retrieval accuracy degrades significantly. This degradation is not uniform; it typically concentrates in the middle of the context window, a phenomenon known as the "lost in the middle" effect.

The Problem with Unstructured Context

When dealing with large context window prompting, developers often treat the context window as a database. This is a conceptual error. A database uses deterministic indexing to retrieve records. An LLM uses soft attention mechanisms that distribute weights across the entire input sequence. When the input sequence spans millions of tokens, the attention signal-to-noise ratio drops.

If you dump unstructured text, raw markdown files, and loose JSON schemas into a 2-million-token prompt, the model will struggle to resolve cross-references. For example, if a database schema is defined at token 200,000, and an API route handler is defined at token 1,500,000, the model may fail to connect the two when generating a new controller. To maintain high retrieval accuracy and prevent hallucination, we must apply strict structural patterns to our inputs.

The Anatomy of a Structured 2M Token Prompt

To optimize attention allocation, we must structure the prompt deterministically. We recommend a hierarchical XML-based structure. XML tags give every section an explicit, named boundary, so the model can tell where one piece of context ends and the next begins.

Here is the recommended structural layout for a massive context prompt:

System Instructions and Constraints (Top)
Global Metadata and Dependency Graph
Static Reference Data (Schemas, API contracts)
Dynamic Codebase/Document Context (The bulk of the tokens)
Task-Specific Instructions and Query (Bottom)

Placing the system instructions and the query at the absolute boundaries (top and bottom) takes advantage of the primacy and recency biases observed in transformer models. The middle of the context is then left for the bulk material: the static reference data and the codebase itself.

Implementing Context Zoning

Let us look at how to structure the dynamic codebase context. Instead of concatenating raw files, wrap each file in an XML block that carries metadata. This metadata gives the model an explicit, index-like handle (path, language, dependencies) for each file.

<context_zone id="codebase">
  <file path="src/models/user.ts" language="typescript">
    <dependencies>
      <dependency>src/types/auth.ts</dependency>
    </dependencies>
    <code>
      // File content goes here
    </code>
  </file>
</context_zone>

By explicitly declaring dependencies within the metadata tags, we assist the model in tracing execution paths without requiring it to infer relationships solely from the code structure.
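
The dependency list does not have to be written by hand. The following is a minimal sketch, assuming a TypeScript project that uses relative ES-module imports; the helper names `extract_dependencies` and `render_dependencies` are hypothetical, and the regex deliberately ignores package imports, dynamic imports, and path aliases.

```python
import posixpath
import re

# Matches relative ES-module imports such as: import { Token } from "../types/auth";
IMPORT_RE = re.compile(r"""from\s+['"](\.{1,2}/[^'"]+)['"]""")


def extract_dependencies(file_path: str, source: str) -> list[str]:
    """Resolve relative import specifiers to project-relative .ts paths."""
    base_dir = posixpath.dirname(file_path)
    deps = []
    for specifier in IMPORT_RE.findall(source):
        resolved = posixpath.normpath(posixpath.join(base_dir, specifier))
        if not resolved.endswith(".ts"):
            resolved += ".ts"  # Assumption: extensionless imports point at .ts files
        if resolved not in deps:
            deps.append(resolved)
    return deps


def render_dependencies(deps: list[str]) -> str:
    """Render the <dependencies> block used in the file metadata."""
    items = "\n".join(f"    <dependency>{dep}</dependency>" for dep in deps)
    return f"<dependencies>\n{items}\n</dependencies>"
```

For a production pipeline a real parser would be more reliable than a regex, but the output shape stays the same.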

Programmatic Prompt Assembly

Assembling a 2-million-token prompt manually is impractical. It must be done programmatically. Below is a Python pseudo-code example that builds a structured context prompt from a directory, estimating token usage with a rough four-characters-per-token heuristic and wrapping each file in structural tags.

# Pseudo-code for structured context assembly
import os
from typing import List, Dict

class ContextAssembler:
    def __init__(self, root_dir: str, max_tokens: int = 2000000):
        self.root_dir = root_dir
        self.max_tokens = max_tokens
        self.token_estimator_factor = 4  # Rough character-to-token ratio

    def estimate_tokens(self, text: str) -> int:
        return len(text) // self.token_estimator_factor

    def build_file_node(self, file_path: str) -> str:
        relative_path = os.path.relpath(file_path, self.root_dir)
        with open(file_path, 'r', encoding='utf-8') as f:
            content = f.read()

        return (
            f'<file path="{relative_path}">\n'
            f'<code>\n{content}\n</code>\n'
            f'</file>\n'
        )

    def assemble(self, query: str, system_instructions: str) -> str:
        prompt_parts = []

        # 1. System Instructions at the top
        prompt_parts.append("<system_instructions>\n" + system_instructions + "\n</system_instructions>")

        # 2. Open Context Zone
        prompt_parts.append("<context_zone id=\"source_code\">")

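        current_tokens = self.estimate_tokens("".join(prompt_parts))
        budget = self.max_tokens - 10000  # Reserve space for the query
        budget_exhausted = False

        for root, dirs, files in os.walk(self.root_dir):
            dirs.sort()  # Deterministic traversal order across runs
            for file in sorted(files):
                if not file.endswith(('.ts', '.py', '.json', '.sql')):
                    continue
                file_path = os.path.join(root, file)
                node = self.build_file_node(file_path)
                node_tokens = self.estimate_tokens(node)

                if current_tokens + node_tokens > budget:
                    budget_exhausted = True
                    break

                prompt_parts.append(node)
                current_tokens += node_tokens
            if budget_exhausted:
                break  # The inner break alone would not stop os.walk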

        prompt_parts.append("</context_zone>")

        # 3. Query at the bottom
        prompt_parts.append("<query>\n" + query + "\n</query>")

        return "\n".join(prompt_parts)
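
One caveat with `build_file_node` as written: file content is interpolated verbatim, so a source file that happens to contain `</code>` or `</file>` would close its block early. The sketch below shows one possible mitigation using the standard library's `xml.sax.saxutils`; the function name `build_file_node_escaped` is hypothetical. Escaping turns `<` into `&lt;` inside the code, which is a trade-off against readability that should be checked against retrieval results.

```python
from xml.sax.saxutils import escape, quoteattr


def build_file_node_escaped(relative_path: str, content: str) -> str:
    """Variant of build_file_node that escapes XML-significant characters."""
    # quoteattr returns the attribute value already wrapped in quotes.
    return (
        f"<file path={quoteattr(relative_path)}>\n"
        f"<code>\n{escape(content)}\n</code>\n"
        f"</file>\n"
    )
```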

Trade-offs and Architectural Decisions

Using a 2-million-token context window is not always the correct architectural choice. Developers must weigh the trade-offs against Retrieval-Augmented Generation (RAG) and fine-tuning.

Latency: Processing 2 million tokens can result in Time-To-First-Token (TTFT) latencies of several seconds or even minutes, depending on the provider and infrastructure. For interactive applications, this is often unacceptable.
Cost: Input token costs scale linearly with prompt length. Running a 2-million-token prompt for every user query is financially non-viable for high-throughput production systems.
Global Synthesis vs. Local Retrieval: RAG is highly efficient for retrieving specific, isolated facts. However, RAG struggles when the task requires global synthesis, such as refactoring an entire codebase to use a new state management library, because no single retrieved chunk contains the whole picture. Large context windows excel at global synthesis because the entire state is present in the model's input at once.

Therefore, the decision framework should be:

Use RAG for point-lookup queries and low-latency requirements.
Use Large Context Windows for complex refactoring, architectural planning, and deep code analysis where global context is mandatory.
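
The framework and the linear cost claim can be sketched in a few lines. This is a minimal illustration, not a sizing tool: the `Workload` fields, the function names, and any price passed in are assumptions, and real provider pricing and latency must be measured.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    needs_global_synthesis: bool  # e.g. a codebase-wide refactor
    max_latency_seconds: float    # latency budget of the caller
    queries_per_day: int


def input_cost_per_day(prompt_tokens: int, queries_per_day: int,
                       usd_per_million_tokens: float) -> float:
    """Input cost grows linearly with both prompt size and query volume."""
    return prompt_tokens / 1_000_000 * usd_per_million_tokens * queries_per_day


def choose_strategy(workload: Workload, long_context_ttft_seconds: float) -> str:
    """Pick the large context only when global context is mandatory and the
    latency budget can absorb the long-context time-to-first-token."""
    fits_latency = workload.max_latency_seconds >= long_context_ttft_seconds
    if workload.needs_global_synthesis and fits_latency:
        return "large_context"
    return "rag"
```

With a placeholder price of 1.00 USD per million input tokens, a 2,000,000-token prompt at 1,000 queries per day comes to 2,000 USD per day in input cost alone, which is why the large-context path suits occasional, high-value tasks.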

Mitigating Attention Degradation with Attention Anchors

To combat the "lost in the middle" effect within a 2-million-token context, we can employ "attention anchors." These are repeated, high-level summaries placed at regular intervals throughout the prompt. For example, every 500,000 tokens, you can inject a structural map of the codebase. The intent is to keep a description of the global architecture close to every region of the prompt, so the model is less likely to lose track of key components.

Another technique is "redundant schema definition." If your query relies heavily on a specific database schema, define that schema both in the static reference section and directly inside the query block at the bottom. With this redundant placement, the model does not have to retrieve the schema from hundreds of thousands of tokens away to resolve basic structural questions.
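
Both techniques are straightforward to apply during prompt assembly. The sketch below is a minimal version under stated assumptions: `insert_anchors`, `build_query_block`, and the `<schema_reminder>` tag are hypothetical names, token counts use the same rough four-characters-per-token estimate as the assembler, and the anchor's own tokens are not counted against the interval.

```python
def insert_anchors(nodes: list[str], anchor: str, interval_tokens: int = 500_000,
                   chars_per_token: int = 4) -> list[str]:
    """Return nodes with the anchor re-inserted roughly every interval_tokens."""
    output = []
    tokens_since_anchor = 0
    for node in nodes:
        node_tokens = len(node) // chars_per_token
        # Insert the anchor before a node that would push the run past the interval.
        if tokens_since_anchor and tokens_since_anchor + node_tokens > interval_tokens:
            output.append(anchor)
            tokens_since_anchor = 0
        output.append(node)
        tokens_since_anchor += node_tokens
    return output


def build_query_block(query: str, schema_sql: str) -> str:
    """Repeat the schema the query depends on directly inside the query block."""
    return (
        f"<query>\n"
        f"<schema_reminder>\n{schema_sql.strip()}\n</schema_reminder>\n"
        f"{query.strip()}\n"
        f"</query>"
    )
```

Anchors are inserted only between file nodes, never inside one, so no file block is split.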

Evaluating Retrieval Accuracy

Before deploying a large context prompt to production, you must measure its retrieval accuracy. Do not rely on generic benchmarks. Instead, implement a synthetic evaluation pipeline:

Generate synthetic needles: Create unique, random UUIDs associated with specific, arbitrary instructions (e.g., "If you see UUID-9823, append the word 'ALPHA' to the output").
Inject needles at varying depths: Place these synthetic needles at 10 percent, 30 percent, 50 percent, 70 percent, and 90 percent of your context window.
Run evaluations: Execute the prompt multiple times and measure the retrieval rate at each depth.
Optimize structure: If retrieval drops below 95 percent at the 50 percent depth, adjust your XML tagging, increase the redundancy of your anchors, or reduce the overall context size.
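
The first three steps can be sketched as a small harness. This is a minimal example, assuming the model is reachable through a caller-supplied `call_model(prompt) -> str` function; that callable, the `MARK-` marker format, and the helper names are all hypothetical. Needles are inserted at line boundaries so they do not split a tag or statement.

```python
import uuid
from typing import Callable

DEPTHS = (0.10, 0.30, 0.50, 0.70, 0.90)


def make_needle() -> tuple[str, str]:
    """Return (instruction, expected_marker) for one synthetic needle."""
    needle_id = f"UUID-{uuid.uuid4().hex[:8]}"
    marker = f"MARK-{uuid.uuid4().hex[:6].upper()}"
    return f"If you see {needle_id}, append the word '{marker}' to the output.", marker


def inject(context: str, needle: str, depth: float) -> str:
    """Insert the needle at the line boundary nearest the given relative depth."""
    lines = context.split("\n")
    index = round(len(lines) * depth)
    return "\n".join(lines[:index] + [needle] + lines[index:])


def retrieval_rates(context: str, call_model: Callable[[str], str],
                    trials: int = 5) -> dict[float, float]:
    """Measure how often the model follows a needle placed at each depth."""
    rates = {}
    for depth in DEPTHS:
        hits = 0
        for _ in range(trials):
            instruction, marker = make_needle()
            output = call_model(inject(context, instruction, depth))
            hits += marker in output  # True counts as 1
        rates[depth] = hits / trials
    return rates
```

A fresh needle per trial avoids a cached or memorized answer inflating the rate; the resulting dictionary can be compared directly against the 95 percent threshold.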

Conclusion

As context windows continue to expand, the bottleneck shifts from capacity to structure. Simply dumping data into a model is a recipe for high latency, high costs, and inaccurate outputs. By treating the context window as a structured memory space, using XML zoning, placing critical instructions at the boundaries, and programmatically assembling inputs, developers can maintain high retrieval accuracy even at the 2-million-token limit.

AI development planning tools: Separating Fact from Fiction

HyunKi Lee — Sat, 13 Jun 2026 19:31:26 +0000

We are exploring the evolving landscape of AI development planning tools. This piece, titled 'AI Dev Planning Tools: Beyond the Hype,' delves into how AI can assist indie founders and small teams in transforming mobile app ideas into structured, executable plans. It offers a preview of the kind of insights Bridge will publish at launch. For more on effective product planning and to stay updated, sign up for our newsletter at https://bridgedev.io/?utm_source=devto&utm_medium=social&utm_campaign=prelaunch.

Beyond Code: A Guide to AI project scoping for developers

HyunKi Lee — Fri, 12 Jun 2026 03:10:13 +0000

Effective software project scoping requires translating high-level requirements into concrete development artifacts. For mobile applications, this often means inferring architectural components from an initial product brief. An AI can parse such a brief, moving beyond natural language processing to generate structured outputs like proposed project pillars, user stories, and preliminary data schemas.

// Proposed Project Pillars
[
  "User Authentication & Profiles",
  "Activity Data Management",
  "Social Interaction & Gamification"
]

// Example User Story Generation (from "Activity Data Management")
AS A user, I WANT to log my runs with GPS data, SO THAT I can track my progress.
AS A user, I WANT to see a summary of my weekly activity, SO THAT I can stay motivated.

// Preliminary Data Schema Suggestion (simplified)
User { id: UUID, name: String, email: String, activities: [Activity] }
Activity { id: UUID, userId: UUID, type: Enum<Run, Cycle>, duration: Int, distance: Float, geoPath: [LatLng] }
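
One way such a preliminary schema might be carried into code is as typed records. The sketch below is an assumption-laden translation into Python dataclasses: the snake_case field names, the units in the comments, and the representation of `LatLng` as a `(lat, lng)` tuple are illustrative choices, not part of the generated schema.

```python
from dataclasses import dataclass, field
from enum import Enum
from uuid import UUID, uuid4


class ActivityType(Enum):
    RUN = "Run"
    CYCLE = "Cycle"


@dataclass
class Activity:
    id: UUID
    user_id: UUID
    type: ActivityType
    duration: int    # Assumed unit: seconds
    distance: float  # Assumed unit: meters
    geo_path: list[tuple[float, float]] = field(default_factory=list)  # (lat, lng) pairs


@dataclass
class User:
    id: UUID
    name: str
    email: str
    activities: list[Activity] = field(default_factory=list)


# Usage: build a user and attach one logged run.
user = User(id=uuid4(), name="Sample User", email="user@example.com")
user.activities.append(
    Activity(id=uuid4(), user_id=user.id, type=ActivityType.RUN,
             duration=1800, distance=5000.0)
)
```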

This process aims to provide a technical blueprint, translating high-level requirements into actionable development artifacts like user stories and preliminary data models, thereby establishing a robust foundation for engineering efforts. We're sharing this as a preview of the kind of technical insights Bridge will publish at launch. Sign up for early access to our platform and more content: https://bridgedev.io/?utm_source=devto&utm_medium=social&utm_campaign=prelaunch

AI for Dev Teams: Decomposing Tasks, Delivering Quality

HyunKi Lee — Wed, 03 Jun 2026 12:42:12 +0000

This post introduces the concept of AI-assisted task decomposition for development teams, a systematic approach to transforming complex projects into structured plans. It highlights how this method can help map dependencies, reduce integration risks, and encourage architectural thinking from the outset. This is a preview of the kind of in-depth guides Bridge will publish at launch; to receive the full article and future content, please sign up for our newsletter and early access updates at https://bridgedev.io/?utm_source=devto&utm_medium=social&utm_campaign=prelaunch.

Solid Mobile App Planning: Avoiding the 'Prompt to App Trap'

HyunKi Lee — Sun, 24 May 2026 07:42:28 +0000

The "prompt to app trap" highlights a critical challenge in modern development: the temptation to generate code rapidly, often bypassing essential architectural planning. While powerful tools can accelerate coding, a robust and well-considered software architecture, guided by a comprehensive mobile app planning framework, is the indispensable foundation for any successful application. This post offers a glimpse into the structured approach necessary to navigate the complexities of software development, emphasizing why architectural planning is paramount. This is the kind of foundational content Bridge will be sharing when we launch; for more insights and early access, visit https://bridgedev.io/?utm_source=devto&utm_medium=social&utm_campaign=prelaunch.