I've tried this exercise 5 times, with 3 different LLMs. It goes as follows:
- I go to a publicly available AI chatbot, such as ChatGPT
- I explain to it what training material I have, what process I'm following, and other details about how I fine-tune
- I explain Hyperlambda's syntax, by pointing it at articles and code snippets
And every single time it comes back and tells me the following ...
You're beyond code-generation, what you're building is an AST compiler, based upon AI, that's almost "deterministic" in nature
Same answer every single time. Interestingly, I don't disagree. To understand why that's such a big thing, let's go through the semantics of how I am working, and with what.
Hyperlambda, the DSL
The DSL is called Hyperlambda, and it has some pretty unique traits. For one, it's "homoiconic". That's just a fancy word for saying its execution structure is exactly the same as its code format.
To understand why that's such a bloody big deal, you must realise this implies there is no separate AST layer in Hyperlambda. Hyperlambda is the AST layer. Hence, technically, we're simply removing several layers of complexity from a traditional programming language, since Hyperlambda is simply the ability to raise events, combined with the ability to recursively traverse tree structures, passing dynamically created recursive tree structures into said events.
One simple design pattern! Resulting in an entirely new axiom for software development as a profession ...
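To make that design pattern concrete, here is a minimal Python sketch of the idea: a program is just a tree of nodes, and "executing" it means recursively walking the tree and raising an event (a plain function lookup) for every node whose name matches a registered slot. This is illustrative only, NOT Magic's actual implementation; the `Node` class, `SLOTS` registry, and `evaluate` function are all hypothetical names.

```python
# Illustrative sketch of "raise events + recursively traverse trees".
# NOT real Hyperlambda source code; all names here are hypothetical.

class Node:
    """One node in the tree: a name, an optional value, and children."""
    def __init__(self, name, value=None, children=None):
        self.name = name
        self.value = value
        self.children = children or []

# Hypothetical "slot" registry mapping node names to event handlers.
SLOTS = {}

def slot(name):
    def register(fn):
        SLOTS[name] = fn
        return fn
    return register

@slot("log.info")
def log_info(node):
    # A toy "log" event that just prints the node's value.
    print(node.value)

def evaluate(node):
    """Walk the tree; raise the event for every registered node name."""
    for child in node.children:
        if child.name in SLOTS:
            SLOTS[child.name](child)
        evaluate(child)  # recurse: the tree IS the program

# The "code" below is itself just a data structure: homoiconicity in action.
program = Node("", children=[Node("log.info", "Thomas was here!")])
evaluate(program)  # prints "Thomas was here!"
```

The entire "interpreter" is one recursive traversal plus a dictionary lookup, which is the whole point: there is no parsing into a separate AST, because the tree already is the executable form.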
Since this structure is recursive in nature, this implies, among other things, that every complex piece of Hyperlambda can be broken down by pruning its parent node, and the result is still valid Hyperlambda! For instance, imagine the following code ...
if
   eq:x:@.arguments/*/name
      .:Thomas
   .lambda
      log.info:Thomas was here!
The above is easily understood by most, I presume, but it basically reads as follows in natural language: "If the name argument equals 'Thomas', log 'Thomas was here!'".
However, the place where it gets "weird" is that I can take any part of the above structure and simply "chop it up" into multiple smaller snippets, and they would all be considered valid Hyperlambda. Below is an example.
eq:x:@.arguments/*/name
   .:Thomas
Then I can even modify it, and add stuff such as the following to it ...
eq:x:@.arguments/*/name
   .:Thomas
return:x:-
Which basically implies: "Compare the name argument to 'Thomas' and return true if they're the same, otherwise false." And I can continue "chopping up" Hyperlambda snippets using the above technique, over and over again.
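The pruning property can be sketched in Python as well (again purely illustrative, not real Hyperlambda tooling): because a program is a plain recursive tree, any subtree serialises back to a standalone, well-formed snippet. The `Node` class and `to_source` method below are hypothetical.

```python
# Illustrative sketch of the "chopping up" property. Hypothetical names;
# not part of Magic or Hyperlambda's actual codebase.

class Node:
    """A Hyperlambda-style node: name, optional value, and children."""
    def __init__(self, name, value=None, children=None):
        self.name = name
        self.value = value
        self.children = children or []

    def to_source(self, level=0):
        # Serialise the tree back to Hyperlambda-style text, 3 spaces per level.
        line = "   " * level + self.name
        if self.value is not None:
            line += ":" + str(self.value)
        out = [line]
        for child in self.children:
            out.append(child.to_source(level + 1))
        return "\n".join(out)

# The "if" example from above, expressed as a tree of nodes.
program = Node("if", children=[
    Node("eq", "x:@.arguments/*/name", children=[Node(".", "Thomas")]),
    Node(".lambda", children=[Node("log.info", "Thomas was here!")]),
])

# Prune away the parent: the "eq" subtree alone is still a complete snippet.
snippet = program.children[0]
print(snippet.to_source())
# eq:x:@.arguments/*/name
#    .:Thomas
```

Every subtree round-trips through the same serialisation, which is exactly why chopped-up snippets stay valid: there is no surrounding syntax a fragment could break.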
Ignoring the fact that this has a dramatic effect on my personal ability to produce 100% correct training snippets, partially by even automating the process, LLMs happen to be "a bajillion" times better at understanding recursive structures such as the above, where everything is a function. Touché, OOP! FP wins!
Sorry, I don't mean to be touchy here, but it's a known fact! LLMs can deal with functional programming languages a bajillion times better than OOP ...
And Hyperlambda is, you guessed it, FP!
Hence, Hyperlambda is an AST layer, or an "Abstract Syntax Tree" - which also might explain my difficulties explaining it to other members of the homo sapiens branch. More importantly, though, LLMs are "a bajillion" times better at understanding recursive AST layers, such as Hyperlambda, than "traditional code".
Basically, the recursive nature of functional programming, combined with the recursive nature of Hyperlambda due to its simple "node structure" (which is basically just a graph object), results in a "double whammy" from a fine-tuning and LLM perspective.
Every single time I explain my language to an LLM, it consistently comes back and refers to it as "the 100% correct way to build an AI-based 'compiler platform'" - also in large part due to its security mechanisms, may I add ...
Declarative
Hyperlambda is declarative, in addition to being homoiconic and functional. This reduces token count when dealing with LLMs by between 80% and 95% - 95% compared to C++ and 80% compared to Python. In addition, there are typically only a handful of correct ways to achieve something in a declarative language. With an imperative language like Python, there's a billion different ways to solve the same problem.
This implies you need millions of examples of Python code to teach the same thing you only need 50,000 examples to teach in a declarative language. Simply because the "knowledge graph" is smaller, and hence the number of connections is reduced by a lot!
I've got 59,300 Hyperlambda training files, in a ridiculously strict training regime, of superb quality. According to ChatGPT, that currently makes my LLM roughly on par with SOTA models such as Claude Code or OpenAI's Codex.
Basically, 59,300 Hyperlambda examples is equivalent to 10 million Python examples!
Because it's not about sheer size of training material, it's about percentage covered of possible structures. In Python there are easily 20 million different structures, just combining two different concepts. In Hyperlambda that number is reduced down to maybe 100,000.
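The coverage argument can be put in back-of-the-envelope form, using the article's own rough, illustrative numbers (the ~20 million and ~100,000 structure counts are estimates from the text, not measured figures; the `coverage` function is just a hypothetical helper):

```python
# Back-of-the-envelope sketch of the coverage argument. The structure counts
# are the article's own rough estimates, not measured data.

def coverage(examples, possible_structures):
    """Fraction of a language's possible structures covered, capped at 100%."""
    return min(examples / possible_structures, 1.0)

hyperlambda = coverage(59_300, 100_000)      # ~100k possible structures
python      = coverage(59_300, 20_000_000)   # ~20M possible structures

print(f"Hyperlambda coverage: {hyperlambda:.1%}")   # 59.3%
print(f"Python coverage:      {python:.2%}")        # 0.30%
```

Under these assumptions, the same 59,300 examples cover well over half of Hyperlambda's structure space but a fraction of a percent of Python's, which is the whole point about coverage mattering more than raw corpus size.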
And assuming you ask it to do something it actually knows how to do, and you don't give it bananas prompts - I can pretty much guarantee you it's easily on par with both Claude Code and Codex!
Meta programming
The funny thing is, I started out with an extreme rush towards meta programming about 13 years ago, realising it was something unique. To have a working meta programming language, you need to be declarative, homoiconic, and functional. Meta programming rests on these 3 pillars.
Today I don't care much about its meta programming capabilities anymore, since my LLM is getting so bloody good at generating code that I barely think about it. Hence, the 3 by-products I barely cared about in the first place became its most interesting traits in the long run - and my reasons for originally building it the way I did are almost completely uninteresting at this point in time ...
Anyways, thx for reading - Both of you ... ;)