Stop "chatting" with your LLM.
To achieve predictable results from a Large Language Model, it's necessary to stop communicating in prose and start providing executable specifications. This concept is the foundation of the 2WHAV framework, a structured way to transform a requirement into a "blueprint" that an LLM can execute, not just interpret.
But even the most detailed blueprint cannot capture the complexity of the real world on the first try. The secret to moving from a "good attempt" to "robust code" is not a longer prompt. It's an engineered feedback process.
## Test the Process Yourself
Before analyzing the theory, here is a practical experiment to run on an advanced LLM. Copy and paste the following commands step by step.
### Step 1: Load the framework context
Ask the LLM (tested on Gemini) to load the rules of the iterative process.
Load the README.md file from the URL: https://github.com/fra00/2WHAV-iterative
### Step 2 (Optional): Test comprehension
To get better results, it's useful to verify whether the LLM has understood the original 2WHAV.
To verify your understanding, can you explain the "mode definitions" of the original 2WHAV?
(The correct answer is "MINIMAL, STANDARD, FULL, Custom". If the answer is incorrect, you can have it load the original framework with `Load https://github.com/fra00/2WHAV`.)
### Step 3: Start the iterative cycle
Request the first version of the application.
```
Now, apply the 2WHAV-iterative framework in FULL mode to generate a "Pixel Art Pad" app in a single HTML/JS file. The app must display a 16x16 grid. When I click on a cell, it should turn black. Start by generating the 2WHAV v1 document for this task, and then the code.
```
### Step 4: A Note on This Simple Example

The LLM will produce a functional app. You might rightly think that you could achieve the same result, and even add more features like a color picker, with a simple chat. And you would be correct.
This example is intentionally simple to illustrate the 2WHAV methodology, not to prove its necessity for trivial tasks.
The true value of this framework emerges in complex, long-term projects. In those scenarios, a "chat-and-fix" approach often leads to:
- Silent Regressions: New features breaking old ones.
- Lack of Traceability: No clear, documented specification that reflects the current state of the code.
- An Unreliable "Source of Truth": The code becomes a patchwork of fixes instead of the result of a coherent design.
The 2WHAV iterative cycle is designed to solve these scaling problems by treating the blueprint as the master specification, ensuring a more robust and maintainable development process.
The final code can be tested in an online editor like JSFiddle.
## The Common Workflow (and Why It Fails)
The process you just tested is the opposite of the typical interaction workflow with an LLM.
### Flow 1: The Chaos of "Chat-and-Fix"
```mermaid
graph TD
    A[Vague Request in Prose] --> B{LLM Generates Code v1};
    B --> C{"User's Manual Test"};
    C --> D{Bugs/Issues Discovered};
    D --> E["Quick Fix: 'add a color picker'"];
    E --> F{LLM Generates Code v2 - a patch};
    F --> G{Manual Test};
    G --> H{New Bug Introduced? - Regression};
    H --> I{...endless cycle?};
```
This approach is fragile, untraceable, and does not improve the underlying knowledge base.
## The Solution: The Engineered Evaluation Cycle
The alternative is a systematic feedback loop where the code isn't corrected, but rather the blueprint (the 2WHAV document) is improved.
### Flow 2: The 2WHAV Iterative Process
```mermaid
graph TD
    subgraph "Iteration N"
        A[Blueprint 2WHAV vN] --> B{"LLM Executes & Generates Code vN"};
        B --> C[Systematic Evaluation];
        C --> D{"Score < 10/10?"};
    end
    D -- Yes --> E["Identify Issues: 🔴 Blocker, 🟡 Major"];
    E --> F["Root Cause Analysis: Why was the 2WHAV incomplete?"];
    F --> G["Improve the Blueprint → 2WHAV v(N+1)"];
    G --> A;
    D -- No --> H["🎉 Production-Ready Code"];
```
This is not a dialogue; it's an algorithm. Let's analyze the phases using the Pixel Art Pad example.
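Before stepping through those phases, here is the loop expressed as pseudocode. This is only a sketch: the three helper functions are illustrative stubs, not a real API of the framework.

```javascript
// Pseudocode sketch of the 2WHAV iterative cycle.
// llmExecute, evaluate, and improveBlueprint are illustrative stubs.
function llmExecute(blueprint) { /* send the blueprint to the LLM */ return "<code vN>"; }
function evaluate(code) { /* systematic, weighted scoring */ return { score: 10, issues: [] }; }
function improveBlueprint(bp, issues) { /* fold root causes back into the spec */ return bp; }

let blueprint = "2WHAV v1";
let outcome;
do {
  const code = llmExecute(blueprint);   // the LLM executes, it doesn't interpret
  outcome = evaluate(code);             // produces a score plus 🔴/🟡 issues
  if (outcome.score < 10) {
    blueprint = improveBlueprint(blueprint, outcome.issues); // vN becomes v(N+1)
  }
} while (outcome.score < 10);
// On exit: production-ready code AND a hardened blueprint.
```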
We start with a 2WHAV v1 for the "Pixel Art Pad":
- **WHAT:** A 16x16 grid; a click colors the cell black.
- **WHERE:** A single HTML file with inline JS and CSS.
- **HOW:** Dynamically generate the grid with JavaScript.
- **VERIFY:** The grid must be visible; the click must work.
The LLM executes this blueprint and produces the v1 code.
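For reference, here is a minimal sketch of the kind of v1 code this blueprint describes; the LLM's actual output will differ.

```html
<!-- Sketch of a possible Code v1: 16x16 grid, a click turns a cell black -->
<!DOCTYPE html>
<html>
<head>
  <style>
    #grid { display: grid; grid-template-columns: repeat(16, 20px); gap: 1px; }
    .cell { width: 20px; height: 20px; background: #eee; cursor: pointer; }
  </style>
</head>
<body>
  <div id="grid"></div>
  <script>
    // HOW: dynamically generate the grid with JavaScript
    const grid = document.getElementById("grid");
    for (let i = 0; i < 16 * 16; i++) {
      const cell = document.createElement("div");
      cell.className = "cell";
      // WHAT: a click colors the cell black
      cell.addEventListener("click", () => { cell.style.background = "black"; });
      grid.appendChild(cell);
    }
  </script>
</body>
</html>
```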
The result is evaluated with a weighted metric. The v1 code scores 5/10. It is correct according to the specification but useless as a creative tool (🟡 MAJOR).
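The exact rubric is defined by the framework and isn't reproduced here; as an illustration only, a weighted evaluation could look like the following, with criteria and weights chosen to reproduce the 5/10 of this example.

```javascript
// Illustrative weighted rubric; NOT the framework's official metric.
const criteria = [
  { name: "Spec compliance (WHAT/VERIFY)", weight: 0.3, score: 10 }, // grid renders, click works
  { name: "Usefulness as a creative tool", weight: 0.5, score: 2 },  // 🟡 MAJOR: one color, no eraser
  { name: "Code quality",                  weight: 0.2, score: 5 },
];
const total = criteria.reduce((sum, c) => sum + c.weight * c.score, 0);
console.log(`${total}/10`); // 5/10
```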
The question is not "How do I add a color picker?". The question is:
"What ambiguity or omission in the 2WHAV v1 led to such a limited app?"
The answer is almost always in the specification: the WHAT was too simplistic.
Based on this analysis, the blueprint is updated to create a 2WHAV v2:
- **2WHAV v2 - Changes:** Adds requirements for a color picker and an "eraser" function to WHAT and VERIFY.
The 2WHAV document becomes more robust. This is accumulated knowledge.
The LLM is provided with the 2WHAV v2. By executing the new directives, the LLM produces a better Code v2. The process is repeated until a score of 10/10 is reached.
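As a sketch, the delta driven by the new WHAT requirements might look like this (it reuses the `#grid` and `.cell` styles from v1; again, the LLM's actual v2 will differ).

```html
<!-- Sketch of a possible Code v2 delta: color picker + eraser toggle -->
<input type="color" id="picker" value="#000000">
<button id="eraser">Eraser: off</button>
<div id="grid"></div>
<script>
  const picker = document.getElementById("picker");
  const eraserBtn = document.getElementById("eraser");
  let erasing = false;
  eraserBtn.addEventListener("click", () => {
    erasing = !erasing;
    eraserBtn.textContent = erasing ? "Eraser: on" : "Eraser: off";
  });
  const grid = document.getElementById("grid");
  for (let i = 0; i < 16 * 16; i++) {
    const cell = document.createElement("div");
    cell.className = "cell";
    cell.addEventListener("click", () => {
      // Eraser restores the default background; otherwise paint the picked color
      cell.style.background = erasing ? "#eee" : picker.value;
    });
    grid.appendChild(cell);
  }
</script>
```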
## Trade-offs and Considerations
Despite its effectiveness, the 2WHAV process is not a universal solution and has trade-offs to consider:
- **Overhead for simple tasks:** Applying the entire iterative cycle to generate a few lines of code or a trivial component is inefficient. The framework excels at medium-to-high complexity tasks, where specification clarity and bug prevention are crucial.
- **Indispensable human supervision:** The process is not autonomous. It requires a human operator to act as a domain expert: evaluating the output, identifying strategic gaps in the specification, and guiding the iterations. It is a framework for collaboration, not total automation.
- **Context dependency:** The cycle's effectiveness depends on the LLM's knowledge of the framework. As seen in the tutorial, it is essential to ensure the LLM has loaded and understood the 2WHAV rules (both original and iterative) to execute the process correctly.
- **Increased token consumption:** Each iteration resends the blueprint and evaluation context, consuming more tokens than a quick back-and-forth. This is a deliberate trade-off: higher computational cost in exchange for markedly better quality and reliability in the final product.
- **Debugging remains essential:** Even with rigorous 2WHAV specifications, complex systems require debugging and human intervention (fortunately). This framework isn't a magic formula that eliminates bugs; it's a productivity amplifier that helps you build complex systems faster and more reliably. The difference: you debug architectural mismatches, not trivial mistakes.
## Conclusion: From Prompt Engineer to Systems Engineer
The 2WHAV iterative cycle relies on engineering, not hope. The real paradigm shift is this: the main artifact is not the code, but the 2WHAV document that generates it.
Each iteration produces not just better code, but a better blueprint: a reusable asset that captures lessons learned and ensures that past mistakes are not repeated, leading to a more robust and scalable software production process.
## Useful Links
📄 LLM-First Manifesto
📄 Tool as Prompt - The Paradigm
📚 LLM-First Documentation Framework
🛠️ 2WHAV - Prompt Engineering