Necessary Conditions GPT Must Satisfy for Producing Complex Code

Many people are attempting to have GPT generate code directly, guiding it through natural language to complete traditional programming tasks. However, almost no one seriously considers how the generated code will be maintained over the long term.

Based on the fundamental concepts of Reversible Computation theory, we can reason as follows:

  1. If a piece of complex business logic code is to run stably long-term, it's clearly impossible to regenerate the code from scratch with every requirement change. We must revise the original logic in a controlled, incremental (delta) manner. This necessitates defining a Delta space where the program can automatically perform delta merge operations. Further reasoning: to ensure the stability of a delta description for a complex system, it must be defined within a domain-model space that carries business meaning, not within a general-purpose programming language space. The general-purpose language space is too vast and misaligned with the requirement space. A small requirement change can trigger massive shifts in the general-purpose language space, undermining the stability of the logical expression.

  2. Given the inherent ambiguity of natural language, it should not be the stable carrier for complex business logic. Even if a natural language description is unambiguous today, as social contexts shift and word meanings evolve, the same phrasing could be interpreted differently in the future. To express business logic in a stable, precise manner and guarantee absolutely reproducible execution with specified semantics, we must use the brilliant pinnacle of human intelligence developed over the last century: formal language. To ensure that formal-language logic automatically generated by GPT is understandable by people and can be rapidly verified, it should utilize a descriptive language whose complexity matches the business requirements—one that can be automatically verified by tools and from which information can be reverse-extracted for other uses: a Domain-Specific Language (DSL).

  3. Program code stores a vast amount of business knowledge, but previous programming technologies often locked this knowledge inside specific technical implementations, lacking general techniques for extraction. Even if GPT can read existing code and produce a natural-language explanation, writing a program to accurately extract precise knowledge from the system's source code remains difficult. GPT's code explanations can serve as references but are prone to hallucinations. However, if we are using GPT to build a new system, why not start with a structural expression that supports reversible analysis?

  4. If GPT is not just completing tasks via simple Q&A but can call external plugins and design complex execution plans, then—whether for security or stable external interactions—we need to constrain the commands GPT issues to a pre-defined semantic space.

Some programmers may undervalue theoretical analysis, believing that prompt engineering for AI large models is purely a matter of accumulated practical experience. I disagree. In fact, based on the Reversible Computation theory analysis above, we naturally derive the necessary conditions for using GPT as a serious software production tool:

GPT's input and output should be delta-ized DSL (domain language) descriptions.
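
As a sketch of what such a delta-ized input/output might look like, consider a change request against an entity model (the x:extends and x:override attributes follow the Nop Platform's merge conventions discussed later in this article; the file names and fields are hypothetical):

<!-- delta.orm.xml: a delta GPT could return, assuming the base model lives in base.orm.xml -->
<entity x:extends="base.orm.xml" name="test.MyEntity">
  <columns>
    <!-- widen an existing column; the merge locates it by name -->
    <column name="TITLE" length="500" />
    <!-- drop a column the new requirement no longer needs -->
    <column name="PHONE3" x:override="remove" />
  </columns>
</entity>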

A Forest of DSLs Built on a Unified Metamodel

Many believe GPT can understand complex business logic descriptions and generate accurate general-purpose code implementations. If so, why can't GPT master a structurally simpler DSL with clearer semantic definitions? A common misunderstanding is that DSLs use custom, niche syntaxes, and large language models lack sufficient training data to learn such grammars. In reality, what matters in DSLs is the domain semantic space they establish. They use domain-specific terminology to concisely express related business knowledge and can naturally map to user requirement descriptions. For example, describing a user approval flow only requires concepts like process, step, action, and approver. Every token in a DSL carries business semantics; none is added merely to satisfy technical constraints. In contrast, using a general-purpose language inevitably introduces business-irrelevant details—like importing dependencies and declaring variable scopes—arising from language syntax constraints.
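
As a sketch, a hypothetical approval-flow DSL needs nothing beyond those domain concepts (the tag and attribute names here are illustrative, not a real Nop model):

<process name="expense-approval">
  <step id="submit" action="submit-form" />
  <step id="review" approver="dept-manager">
    <action name="approve" next="finance-review" />
    <action name="reject" next="submit" />
  </step>
  <step id="finance-review" approver="finance" />
</process>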

If we focus only on DSL semantics, we can absolutely adopt general XML or JSON syntax as the DSL's universal syntax. Simply put, a DSL can be defined as an AST (Abstract Syntax Tree). This approach resembles LISP's S-expressions, except we can use more readable XML tags. Different representations can be reversibly converted. For example, the Nop Platform defines multiple reversible conversions between XML and JSON; essentially, both can represent the same DSL.

With a unified representation syntax, different DSLs can be constrained using a unified metamodel (similar to JSON Schema), forming a forest of DSLs. Leveraging this unified metamodel, the DSL forest can achieve consistent semantic understanding and support seamless embedding among multiple DSLs.
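
For instance, because all DSLs share one representation and one metamodel language, a node from one DSL can be embedded directly as a subtree of another and still be validated against its own metamodel (a schematic sketch with illustrative tag names):

<!-- a workflow DSL step embedding a rule DSL fragment in place -->
<step id="auto-check">
  <decision>
    <rule when="${order.amount > 10000}" then="manual-review" else="auto-approve" />
  </decision>
</step>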

Traditionally, many programmers think designing a new DSL requires hand-writing a parser, compiler, and even maintaining IDE plugins—a significant workload. But in the Nop Platform, you only need to define an XDef metamodel to automatically obtain a parser, validator, IDE syntax hints, and even the ability to set breakpoints and step-debug directly in IDEA! The Nop Platform can also automatically implement bidirectional conversion between domain objects and Excel template files (convert Excel to DSL, or export Excel from DSL), and can automatically generate visual designers, etc. The Nop Platform offers the concept of a Domain Language Workbench, enabling rapid DSL development and expansion. Its design goal is similar to JetBrains' MPS, but the Nop Platform is built on Reversible Computation theory concepts. Its technical approach is simpler and clearer, with much lower complexity than MPS, while greatly surpassing it in flexibility and extensibility.

Why XML Is a Suitable Syntax Carrier for DSLs

Many programmers have never personally designed an XML-format DSL. They've only heard tales from veterans about how XML was dethroned by newer contenders in ancient times, forming a stereotype that XML is too verbose, suitable only for machine-to-machine data exchange, and unfit for human-computer interaction. This is a mistaken prejudice, stemming from the misuse of XML by XML fundamentalism and from international XML specifications that amplified such misuse.

When some people think of expressing logic in XML, the stereotype that may come to mind is:

<function>
   <name>myFunc</name>
   <args>
      <arg>
         <name>arg1</name>
         <value>3</value>
      </arg>
      <arg>
         <name>arg2</name>
         <value>aa</value>
      </arg>
   </args>
</function>

But in practice, we can simply use:

<myFunc arg1="3" arg2="aa" />

If we need to express that arg1 is an integer, we can extend XML syntax to allow numeric attribute values directly. Or, similar to the Vue framework, add a specific prefix to distinguish value types; for example, stipulate that the @: prefix indicates the subsequent value conforms to JSON syntax.

<myFunc arg1=3 arg2="aa" /> or
<myFunc arg1="@:3" arg2="aa" />

In the Nop Platform, we define rules for bidirectional JSON-XML conversion. For example, for the following AMIS page description:

{
  "type": "crud",
  "draggable": true,
  "bulkActions": [
    {
      "type": "button",
      "label": "Batch Delete",
      "actionType": "ajax",
      "api": "delete:/amis/api/mock2/sample/${ids|raw}",
      "confirmText": "Confirm batch deletion?"
    },
    {
      "type": "button",
      "label": "Batch Update",
      "actionType": "dialog",
      "dialog": {
        "title": "Batch Edit",
        "name": "sample-bulk-edit",
        "body": {
          "type": "form",
          "api": "/amis/api/mock2/sample/bulkUpdate2",
          "body": [
            {
              "type": "hidden",
              "name": "ids"
            },
            {
              "type": "input-text",
              "name": "engine",
              "label": "Engine"
            }
          ]
        }
      }
    }
  ]
}

The corresponding XML format is:

<crud draggable="@:true">
  <bulkActions j:list="true">
    <button label="Batch Delete" actionType="ajax" confirmText="Confirm batch deletion?">
      <api>delete:/amis/api/mock2/sample/${ids|raw}</api>
    </button>
    <button label="Batch Update" actionType="dialog">
      <dialog title="Batch Edit" name="sample-bulk-edit">
        <body>
           <form>
             <api>/amis/api/mock2/sample/bulkUpdate2</api>
             <body>
               <hidden name="ids" />
               <input-text name="engine" label="Engine" />
             </body>
           </form>
        </body>
      </dialog>
    </button>
  </bulkActions>
</crud>

In fact, the XML syntax often appears more compact and intuitive.

Here we use JSON-XML conversion without metamodel constraints, so we need j:list to mark array elements and the @: prefix for non-string values. If the XML file has an XDef metamodel definition, these extra annotations are unnecessary.

Another benefit of XML over JSON is the ease of introducing XML extension tags for code generation, allowing both the code representation and the generation result to be in XML format. In the Lisp world, this is called homoiconicity. Currently, JSON lacks a homoiconic approach to code generation.

<columns>
  <c:for var="col" items="${entityModel.columns}">
    <column name="${col.name}" sqlType="${col.sqlType}" />
  </c:for>
</columns>
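
Because the template and its output share the same XML representation, the expansion is easy to picture. Assuming entityModel holds two VARCHAR columns named SID and TITLE, the template above would expand to something like:

<columns>
  <column name="SID" sqlType="VARCHAR" />
  <column name="TITLE" sqlType="VARCHAR" />
</columns>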

For further discussion on the equivalence of XML and JSON, see: The Equivalence of XML, JSON, and Function ASTs

AI Needs to Understand Metamodels

Large AI models caused a sensation mainly due to their complex logical reasoning capabilities beyond simple pattern memorization. With this capability, LLMs shouldn't need massive program corpora to learn a DSL; informing them of the language's internal structural constraints should suffice.

The discovery of metamodels and metalanguages is among the most revolutionary developments in mathematics over the last 100 years. They hold special importance; the development of category theory is closely related to model theory and metalanguage research. In software development, we should use metamodels to precisely convey DSL syntactic structure and local semantic knowledge to LLMs. Concretely, a metamodel can be seen as a schema definition similar to JSON Schema.

In the Nop Platform, we emphasize a homomorphic relationship between metamodels and concrete model objects. The schema's form should be fundamentally consistent with the data structure itself, unlike XML Schema, which splits a tree-like domain structure into numerous object-attribute relationships expressed in completely different syntax. For example:

<entity name="test.MyEntity" table="my_entity">
  <columns>
    <column name="SID" sqlType="VARCHAR" length="30" />
    <column name="TITLE" sqlType="VARCHAR" length="200" />
  </columns>
</entity>

The corresponding XDef metamodel definition is:

<entity name="!class-name" table="!string">
  <columns xdef:body-type="list" xdef:key-attr="name">
    <column name="!prop-name" sqlType="!std-sql-type" length="int" />
  </columns>
</entity>

Basically, XDef replaces specific values with stdDomain definitions and retains only a single entry for list elements.

stdDomain is similar to a type declaration but is user-extensible. All stdDomains are maintained in a dictionary and can impose local semantic constraints on field values. For example, class-name means the value must satisfy Java class naming rules; not all strings are allowed. An exclamation mark before a stdDomain indicates the attribute value cannot be null.
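
To make the constraint concrete, here are illustrative cases (not output from a real validator):

<!-- name is declared as "!class-name": required, and must be a legal Java class name -->
<entity name="test.MyEntity" table="my_entity" />  <!-- passes -->
<entity name="my entity" table="my_entity" />      <!-- fails: not a valid class name -->
<entity table="my_entity" />                       <!-- fails: "!" marks name as required -->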

Current large models are trained primarily via fill-in-the-blank approaches, so this homomorphic design also helps them quickly grasp metamodels. In today's LLM applications, given a few samples, models can reverse-infer corresponding schema constraints, but such guesses are often inaccurate. For instance, it's hard to teach a model via samples that certain string formats are illegal—like disallowing hyphens as connectors. Through metamodels, we can quickly and efficiently transmit domain knowledge to large models.

Therefore, I believe large model training should intentionally strengthen metamodel training. Metamodels should be distinguished from ordinary models, and it's worth expending extra effort to improve a model's precise mastery of metamodels.

For a concrete attempt at interacting with GPT using metamodels, see my article: A Verified Strategy for GPT-Driven Low-Code Platforms to Produce Complete Applications

Concrete Strategy for Combining the Nop Platform with GPT

The Nop Platform's strategy for communicating with GPT is as follows:

  1. Use the XDef metamodel of the current DSL to help GPT understand the DSL structure more quickly and accurately.
  2. Use delta merge rules from Reversible Computation to guide GPT to return delta descriptions directly.
  3. Merge the returned delta into the current model to form the new current model, then continue interacting with GPT on this basis (a schematic example of this loop follows the list).
  4. Complex logical reasoning often cannot be solved in one step with a single DSL. In this case, we can build a delta pipeline from multiple DSLs, decomposing the problem into several steps.
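
A minimal sketch of the merge loop in steps 2 and 3, using the entity model from the earlier sections (the requirement text and the added field are hypothetical):

<!-- current model -->
<entity name="test.MyEntity" table="my_entity">
  <columns>
    <column name="SID" sqlType="VARCHAR" length="30" />
  </columns>
</entity>

<!-- delta returned by GPT for the requirement "add a STATUS field" -->
<entity>
  <columns>
    <column name="STATUS" sqlType="INTEGER" />
  </columns>
</entity>

<!-- merged result, which becomes the new current model -->
<entity name="test.MyEntity" table="my_entity">
  <columns>
    <column name="SID" sqlType="VARCHAR" length="30" />
    <column name="STATUS" sqlType="INTEGER" />
  </columns>
</entity>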

Based on the DSL support provided by the Nop Platform, AI and humans can collaborate as follows:

  1. AI produces the top-level DSL according to the requirement specification.
  2. A human-written code generator expands the DSL into the next-level DSL.
  3. Humans can refine and adjust the AI-generated DSL using Delta customization.
  4. For the finest details, AI can further refine based on local knowledge.

In short: 1) AI produces the rough cut, 2) humans refine it, 3) AI polishes.

Many programmers currently imagine AI code generation as producing interfaces, classes, properties, and other common software components. The Nop Platform takes a completely different approach. As I keep emphasizing, the class-property abstraction is a consequence of underlying implementation technology constraints and does not fully correspond to internal domain structures. For example, I've repeatedly stressed that mapping the concept of domain structural coordinates to the type level loses information, making precise delta corrections impossible. The Nop Platform's DSL is oriented toward Tree structures and can produce an entire logical tree in one shot.

Some might think of fine-tuning previously generated code structures by having an AI model emit API calls that adjust the model. For example, generating an API call to remove the phone3 field:

entityModel.getColumns().remove("phone3");

Comparing this with the Nop Platform's delta merge operator reveals why the API approach is suboptimal:

<columns>
   <column name="phone3" x:override="remove" />
   <column name="status" sqlType="INTEGER" />
</columns>

The Nop approach offers these advantages:

  1. Mergeable and Simplifiable Deltas: Multiple Delta modifications can be merged and simplified, discarding redundant changes. The API approach uses modification actions as Deltas, but multiple actions cannot be automatically merged or simplified. Without mentally executing each action, the final system state is unclear. This aligns with Reversible Computation theory: a Delta should be independently understandable and definable, satisfying associativity and allowing local simplification (see the sketch after this list).
  2. Analyzable Deltas: Deltas defined on domain models can be automatically analyzed by programs, with information reverse-extracted. Implementing Deltas via APIs lacks simple tools for Delta composition analysis; the impact scope is unknown before application. This reverse information extraction capability is central to Reversible Computation theory.
  3. Precise Positioning: A Delta applies to a base model because we can precisely define the change location—e.g., the field named phone3 within the entity model's field collection. This positional definition is a path with clear business meaning and uniqueness. In contrast, locating "lines 10-20 in the MyEntity model file" is unstable and imprecise. The Nop Platform's Delta customization leverages the domain model's coordinate system accurately, while the API-call approach hides coordinates deep within function call chains. GPT might use ad hoc positional techniques and miss the most effective direct positioning within the domain coordinate system.
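
A minimal sketch of point 1, delta simplification, using the same x:override merge operator (the column names are illustrative):

<!-- Delta 1: widen the TITLE column -->
<columns>
  <column name="TITLE" length="500" />
</columns>

<!-- Delta 2, produced later: remove the TITLE column after all -->
<columns>
  <column name="TITLE" x:override="remove" />
</columns>

<!-- Delta 2 composed with Delta 1 simplifies to a single remove;
     the intermediate widening is discarded without replaying any actions -->
<columns>
  <column name="TITLE" x:override="remove" />
</columns>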

In practice, guided by Reversible Computation theory, proactively re-examining concrete programming practices from the perspectives of metamodels, reversibility, and delta-ization yields new insights and identifies improvement directions.

Viewing Prompts from the Perspective of Reversible Computation

Some sophisticated prompt designs can be naturally explained from the Reversible Computation standpoint, such as the TaskPlan in HuggingGPT:

The AI assistant performs task parsing on user input, generating a list
of tasks with the following format:
[{"task": task, "id", task_id, "dep": dependency_task_ids,
"args": {"text": text, "image": URL, "audio": URL, "video": URL}}].

This prompt format closely resembles an XDef metamodel definition. HuggingGPT's operation involves having GPT return DSL statements that satisfy the metamodel requirements.
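
To make the resemblance concrete, here is a hypothetical XDef-style rendering of the same task structure (my own sketch; the stdDomain names are illustrative, and this metamodel ships with neither project):

<tasks xdef:body-type="list" xdef:key-attr="id">
  <task task="!string" id="!int" dep="int-list">
    <args text="string" image="url" audio="url" video="url" />
  </task>
</tasks>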

Microsoft's guidance project adopts prompts in this format:

role_simulator = guidance('''
{{#system~}}
You are a helpful assistant
{{~/system}}

{{#user~}}
You will answer the user as {{role}} in the following conversation. At every step, I will provide you with the user input, as well as a comment reminding you of your instructions. Never talk about the fact that you are an AI, even if the user asks you. Always answer as {{role}}.
{{#if first_question}}You can also start the conversation.{{/if}}
{{~/user}}

{{~! The assistant either starts the conversation or not, depending on if this is the first or second agent }}
{{#assistant~}}
Ok, I will follow these instructions.
{{#if first_question}}Let me start the conversation now:
{{role}}: {{first_question}}{{/if}}
{{~/assistant}}

{{~! Then the conversation unrolls }}
{{~#geneach 'conversation' stop=False}}
{{#user~}}
User: {{set 'this.input' (await 'input')}}
Comment: Remember, answer as a {{role}}. Start your utterance with {{role}}:
{{~/user}}

{{~! the excerpt is truncated in the original; this minimal completion closes the loop }}
{{#assistant~}}
{{gen 'this.response'}}
{{~/assistant}}
{{~/geneach}}''')

Clearly, introducing a standardized, systematic tree-structured representation benefits both machines and humans.

The low-code platform NopPlatform, designed based on Reversible Computation theory, is open source.
