canonical

Posted on Nov 20

Flexible DSL Embedding Using Prefix-Guided Syntax

#dsl #programming #architecture #designpatterns

In the DSL syntax design of the Nop platform, a crucial concept is layered syntax design. This means that multiple styles of DSLs can be mixed and used together while keeping clear formal boundaries between them. When parsing according to a higher-level DSL syntax, there is no need to consider the parsing requirements of lower-level DSLs. JSX is a typical example of a syntax that does not meet this requirement. JSX can be seen as a mix of ordinary JavaScript and XML syntax. However, the JSX parser recognizes both JavaScript and XML tokens together at the lexer level and parses the syntactic elements into a single unified AST. It cannot first parse XML to obtain an overall structure and then parse JavaScript locally, nor can it parse JavaScript first as a whole and then parse XML locally.

I. Embedding Other Syntaxes in XML

In the Nop platform, the XLang language adopts XML as its foundational syntax form, using XML tags to mark specific syntactic sections. For example:

<c:script>
  let x = 3;
  ...
</c:script>

<c:script lang="groovy">
   // Call the Groovy script engine
</c:script>

Inside the <c:script> tag, XScript (syntax similar to TypeScript) can be executed, or other script engines like Groovy can be selected via the lang attribute, thereby supporting different programming syntaxes. The key point here is that, at the XML level, <c:script> is just an ordinary XML tag, and its content is merely a plain string. Only the processor of the local <c:script> tag needs knowledge of Groovy syntax, while the overall XPL template language does not require any knowledge of Groovy.

Similar to macros in Lisp, the <c:script> tag in the XLang language is essentially a macro tag. It automatically runs at compile time, executes specific parsing logic, and returns the parsed abstract syntax tree. This approach can be easily extended to other DSL syntax formats. For example, the layout syntax used in front-end pages:

<ui:Form>
  <layout>
  fieldA fieldB
  fieldC
  </layout>
</ui:Form>

<ui:Form> indicates generating a form based on the current object’s metadata. This form has two rows: the first row displays fieldA and fieldB, and the second row displays fieldC. The specific controls used are automatically selected based on the field types configured in the meta file. At the implementation level, ui:Form automatically parses the DSL syntax corresponding to the layout at compile time, generates a LayoutModel object, and then automatically generates component code based on the meta information.
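As a rough illustration of that compile-time step, the following Java sketch turns the text inside <layout> into rows of field names. The class LayoutSketch and the method parseLayout are hypothetical names chosen for this example, not the Nop platform's LayoutModel API, and the real layout syntax probably supports more than whitespace-separated names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: not the Nop platform's actual LayoutModel.
final class LayoutSketch {
    // Each non-blank line becomes one form row; whitespace separates field names.
    static List<List<String>> parseLayout(String text) {
        List<List<String>> rows = new ArrayList<>();
        for (String line : text.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty())
                continue; // skip the blank lines around the tag body
            rows.add(Arrays.asList(trimmed.split("\\s+")));
        }
        return rows;
    }

    public static void main(String[] args) {
        String body = "\n  fieldA fieldB\n  fieldC\n  ";
        System.out.println(parseLayout(body)); // prints [[fieldA, fieldB], [fieldC]]
    }
}
```

The XML parser only ever sees the tag body as a string; the row structure exists solely inside the tag's own processor.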

Based on this mechanism, it is actually easy to implement the concept of a Projectional Editor, as proposed by JetBrains' MPS product. The output of a so-called visual editor can be seen as a model defined using a certain DSL. After DSL syntax parsing, the result is an abstract syntax tree (AST). At any AST node, we can reselect a DSL syntax form to achieve a customized expression of the information at that AST node.

II. Embedding Other Syntaxes in JavaScript

In the XScript language, we use template string syntax to embed XML. Unlike in JavaScript, embedded expressions are not automatically recognized in XScript. For example:

let x = xpl `<c:if test="${condition}">...</c:if>`

In standard JavaScript syntax, ${condition} would be automatically parsed as an expression. In XScript, however, the entire template string is treated as a plain string; the only character that needs escaping is the backtick itself, which is written as a repeated backtick.

In my opinion, the automatic recognition of ${expr} in JavaScript template string design is a mistake. It disrupts a natural layered syntax design, mixing the internal DSL syntax of the template string with the external JavaScript syntax. This leads to a series of inconveniences in parsing and processing, while also affecting the intuitiveness of the internal DSL syntax.

In the implementation of XScript, template string syntax is defined as a call to a compile-time macro function. That is, xpl is a macro function that automatically executes at compile time. Its specific implementation is as follows:

@Macro
public static Expression xpl(IXLangCompileScope scope, CallExpression expr) {
    String tpl = getTemplateLiteralArg(expr);
    if (StringHelper.isBlank(tpl))
        return Literal.nullValue(expr.getLocation());

    XNode node = XNodeParser.instance().forFragments(true).parseFromText(expr.getArgument(0).getLocation(), tpl);
    return scope.getCompiler().parseTagBody(node, scope);
}

At compile time, the xpl macro function is executed and receives the AST node corresponding to the template string expression. This is similar to the handling of LINQ expressions in C#, except that macro functions are a more general mechanism. Essentially, their role is similar to that of macros in Lisp.

This mechanism can be used for embedding various DSLs. For example:

let p = xpath `/a/a[@id=a]`

This means that at compile time, the XPath syntax is parsed, and an XPath object is returned and assigned to the variable p.

To embed LINQ-like SQL syntax in XScript, you can use:

function myFunc(x,y){
    return x + y;
} 
let obj = ...
let {a,b} = linq `
  select sum(x + y) as a , sum(x * y) as b
  from obj
  where myFunc(x,y) > 2 and sin(x) > cos(y)
`

In a specially customized linq macro function, we can precisely analyze that myFunc is a function called from the external environment and obj is a variable defined in the external environment, achieving a natural integration of SQL syntax and JavaScript syntax.

The built-in template string functionality in JavaScript can be implemented in XScript through the tpl macro function call:

let x = tpl `sss ${myVar}`

III. Prefix-Guided Syntax

The xpl macro function mechanism introduced in the previous section can be seen as a form of prefix identifier + multi-line text string. The prefix identifier is interpreted as a processing function, while the multi-line text string has both internal and external structures. The external structure is simply ordinary multi-line text, requiring only the recognition and escaping of special backtick characters; other characters like line breaks and backslashes do not need escaping. The internal structure is the DSL syntax recognized solely by the prefix identifier function. The advantage of this form is that it seamlessly embeds the DSL into the external program structure without altering the external program syntax or requiring any knowledge of the internal DSL. I refer to this form as prefix-guided syntax. It can be seen as a general technique in DSL design, with a wide range of applications.
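The two-layer structure described above can be sketched in Java as follows. This is only an illustration under stated assumptions: PrefixGuidedSketch, register and expand are hypothetical names, not XScript's compiler API. The stand-in macros return plain values where real macros would return AST nodes, and backtick escaping is left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of prefix-guided dispatch; not XScript's real compiler API.
final class PrefixGuidedSketch {
    private final Map<String, Function<String, Object>> macros = new HashMap<>();

    void register(String prefix, Function<String, Object> macro) {
        macros.put(prefix, macro);
    }

    // Outer layer: split "prefix `body`" without inspecting the body.
    // Backtick escaping inside the body is deliberately not handled here.
    Object expand(String source) {
        int open = source.indexOf('`');
        int close = source.lastIndexOf('`');
        if (open < 0 || close <= open)
            throw new IllegalArgumentException("expected: prefix `body`");
        String prefix = source.substring(0, open).trim();
        String body = source.substring(open + 1, close);
        Function<String, Object> macro = macros.get(prefix);
        if (macro == null)
            throw new IllegalArgumentException("unknown prefix: " + prefix);
        // Inner layer: only the macro understands the DSL text.
        return macro.apply(body);
    }

    public static void main(String[] args) {
        PrefixGuidedSketch compiler = new PrefixGuidedSketch();
        // Stand-in "parsers": real macros would return AST nodes.
        compiler.register("upper", body -> body.toUpperCase());
        compiler.register("words", body -> body.trim().split("\\s+").length);
        System.out.println(compiler.expand("upper `select 1`"));
        System.out.println(compiler.expand("words `a b  c`"));
    }
}
```

Adding a new DSL means registering one more prefix; the expand method itself never changes.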

In the design of the Nop platform, we extensively use the prefix-guided syntax design form.

3.1 Encrypted Fields

When storing passwords in configuration files, encryption is required. We adopt the convention that configuration values prefixed with @enc: need to be automatically decrypted.

If a field in the ORM model is marked for encrypted storage, it will be automatically encrypted and prefixed with @enc: when saved to the database. This allows automatic recognition of whether decryption is needed when reading, facilitating dynamic adjustment of encryption and decryption settings during system operation.
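A minimal Java sketch of this convention is shown below. The class and method names are hypothetical, and Base64 is used only as a stand-in for a real cipher (it provides no security); the point is the prefix check on the read path.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical sketch. Base64 stands in for a real cipher and provides NO security.
final class EncPrefixSketch {
    static final String ENC_PREFIX = "@enc:";

    // Writing: mark the stored value so readers know it must be decoded.
    static String encode(String plain) {
        String cipherText = Base64.getEncoder()
                .encodeToString(plain.getBytes(StandardCharsets.UTF_8));
        return ENC_PREFIX + cipherText;
    }

    // Reading: only prefixed values are decoded; plain values pass through.
    static String decode(String stored) {
        if (stored == null || !stored.startsWith(ENC_PREFIX))
            return stored;
        byte[] raw = Base64.getDecoder().decode(stored.substring(ENC_PREFIX.length()));
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

Because the marker travels with the value, rows written before encryption was enabled and rows written after it can coexist in the same column.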

3.2 Dynamic Configuration

Values in configuration files are generally fixed. However, in gray release scenarios, we may want to extend static configurations to dynamic configurations, using configuration A for calls that meet certain conditions and configuration B for others. In the Nop platform, we use the @switch prefix to identify dynamic configuration items. For example:

nop.a.b = @switch: {Business rules in JSON format}

When used on the client side:

static final IReference<String> CFG_XXX = AppConfig.varRef("nop.a.b",String.class);

CFG_XXX.get()

This returns a dynamically determined value based on the switch configuration.

Using the prefix-guided syntax form to extend configuration items fully maintains the original key=value configuration structure of the system. It allows reusing previous interfaces and storage, requiring only the addition of local DynamicReference support on the client side.
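The client-side support could look roughly like the following Java sketch. It is an assumption-laden illustration: SwitchConfigSketch and its reference method are hypothetical, the actual IReference and DynamicReference types in the Nop platform may differ, and the rule evaluation is supplied by the caller because the JSON rule format is not specified here.

```java
import java.util.function.Function;
import java.util.function.Supplier;

// Hypothetical sketch; the Nop platform's actual IReference/DynamicReference may differ.
final class SwitchConfigSketch {
    static final String SWITCH_PREFIX = "@switch:";

    // ruleEvaluator receives the text after the prefix (business rules in JSON)
    // and returns the value that applies to the current call.
    static Supplier<String> reference(String rawValue, Function<String, String> ruleEvaluator) {
        if (rawValue != null && rawValue.startsWith(SWITCH_PREFIX)) {
            String rules = rawValue.substring(SWITCH_PREFIX.length()).trim();
            // Re-evaluated on every get(), so the result can vary per call.
            return () -> ruleEvaluator.apply(rules);
        }
        // Ordinary static value: same key=value storage, no evaluation.
        return () -> rawValue;
    }
}
```

Callers keep using get() in both cases, which is why the existing configuration interfaces and storage do not have to change.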

3.3 Redis Cache Encoding

When serializing Java objects to JSON and saving them in a Redis cache, the prefix @data:DATA_CLASS is added before the JSON string. This packages the Java class name required for deserialization (which may include generic information) together with the data and saves it in the cache, facilitating direct deserialization into strongly-typed Java objects.
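The packaging step can be sketched in Java as below. The names are hypothetical, and the '|' separator between the type name and the JSON text is an assumption made for this example, since the exact delimiter is not described here. The sketch only splits the header from the payload; handing the JSON to a deserializer is left to the caller.

```java
// Hypothetical sketch: the '|' separator and class names are assumptions for illustration.
final class TypedCacheValueSketch {
    static final String DATA_PREFIX = "@data:";
    static final char SEPARATOR = '|';

    final String typeName; // null when the value carried no type prefix
    final String json;

    TypedCacheValueSketch(String typeName, String json) {
        this.typeName = typeName;
        this.json = json;
    }

    // Prepend the type information needed to deserialize the JSON later.
    static String encode(String typeName, String json) {
        return DATA_PREFIX + typeName + SEPARATOR + json;
    }

    // Split a cached string back into type name and JSON payload.
    static TypedCacheValueSketch decode(String cached) {
        if (!cached.startsWith(DATA_PREFIX))
            return new TypedCacheValueSketch(null, cached);
        int sep = cached.indexOf(SEPARATOR, DATA_PREFIX.length());
        if (sep < 0)
            throw new IllegalArgumentException("missing separator after type name");
        return new TypedCacheValueSketch(
                cached.substring(DATA_PREFIX.length(), sep),
                cached.substring(sep + 1));
    }
}
```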

3.4 IoC Configuration

In Spring’s XML configuration, if we want to express that a value is not a literal but a reference name, we need to use a new attribute name:

<bean>
  <property name="myProp" ref="xxx" />
</bean>

Here, ref indicates that the value of the property myProp is not the string "xxx" but the bean it references, where "xxx" is the reference name of the bean. In fact, to support the concept of references, Spring introduced multiple additional structures, such as ref on <property> and value-ref and key-ref on map <entry> elements. In the Nop platform, we can use the form @ref:name to uniformly express object references:

<bean>
  <property name="myProp" value="@ref:xxx" /> 
</bean>

The @ref prefix can be used directly anywhere an object reference is needed, greatly simplifying the structure of domain objects. Additionally, since object properties are identified by a unique name rather than split into multiple attribute names like value-ref and value, this facilitates Delta customization (directly overriding by name without considering multiple scenarios or priority issues when multiple attributes coexist).

Spring claims to offer fully declarative dependency injection, but when it comes to some internal concepts of the IoC container, it still relies on explicitly agreed-upon internal interfaces. For example, to inject the bean name, the BeanNameAware interface must be implemented:

interface BeanNameAware{
    void setBeanName(String beanName);
}

In the Nop platform, we can use @bean:id to represent the injection of the bean name:

<bean>
   <property name="beanName" value="@bean:id" />
</bean>

Similarly, @bean:container can be used to represent the injection of the current container, etc.
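To make the idea concrete, here is a small Java sketch of how a container might resolve such prefixed values when setting a property. PrefixValueResolverSketch and its methods are hypothetical and do not reflect the actual NopIoC implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of resolving prefixed property values; not NopIoC's real API.
final class PrefixValueResolverSketch {
    private final Map<String, Object> beans = new HashMap<>();

    void registerBean(String id, Object bean) {
        beans.put(id, bean);
    }

    // currentBeanId is the id of the bean whose property is being configured.
    Object resolve(String currentBeanId, String value) {
        if (value.startsWith("@ref:")) {
            String name = value.substring("@ref:".length());
            Object bean = beans.get(name);
            if (bean == null)
                throw new IllegalArgumentException("unknown bean: " + name);
            return bean;
        }
        if (value.equals("@bean:id"))
            return currentBeanId; // inject the bean's own name
        if (value.equals("@bean:container"))
            return this; // inject the container itself
        return value; // no recognized prefix: treat as a literal
    }
}
```

A single value attribute covers literals, references and container internals, so the bean class needs neither extra attribute names nor marker interfaces.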

The low-code platform NopPlatform, designed based on reversible computation theory, is open source:

gitee: https://gitee.com/canonical-entropy/nop-entropy
github: https://github.com/entropy-cloud/nop-entropy
