It's an ambitious project for sure!
The basic process for compiling something is:
1) Split the text of the code into a sequence of atomic pieces (tokens). For example,
function(x, y)
is an identifier, a left parenthesis, an identifier, a comma, some whitespace, an identifier, and a right parenthesis.
2) Parse that list of tokens into something more meaningful by building an abstract syntax tree (AST). For the example above, if your language is made of expressions, the function call is an expression with two child expressions, one per argument. This goes down recursively, since each child can itself be a call with children of its own.
3) Translate that tree into roughly equivalent code in some other language on the other side. Lots of languages compile to another language like C or Rust. Web languages compile to JS. Lots of languages also emit LLVM IR and let LLVM generate the machine code.
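To make the three steps concrete, here is a minimal sketch in Python for a toy language that only has names and function calls. Everything in it is made up for illustration: the names `tokenize`, `parse`, `emit`, `Name`, and `Call` are hypothetical, and the target (Lisp-style prefix notation) is just an arbitrary choice of "something else on the other side". A real compiler would need far more than this.

```python
import re
from dataclasses import dataclass

# Token kinds and their regexes (illustrative names, not from any library).
TOKEN_SPEC = [
    ("IDENT", r"[A-Za-z_]\w*"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("COMMA", r","),
    ("WS", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{kind}>{pat})" for kind, pat in TOKEN_SPEC))


def tokenize(source):
    """Step 1: split source text into (kind, text) pairs."""
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


@dataclass
class Name:
    ident: str


@dataclass
class Call:
    callee: str
    args: list


def expect(tokens, pos, kind):
    """Return the text of the token at pos if it has the given kind."""
    if pos >= len(tokens) or tokens[pos][0] != kind:
        raise SyntaxError(f"expected {kind} at token {pos}")
    return tokens[pos][1], pos + 1


def parse_expr(tokens, pos):
    """Parse one expression starting at pos; return (node, next position)."""
    name, pos = expect(tokens, pos, "IDENT")
    if pos >= len(tokens) or tokens[pos][0] != "LPAREN":
        return Name(name), pos
    pos += 1  # consume "("
    args = []
    if pos < len(tokens) and tokens[pos][0] != "RPAREN":
        while True:
            arg, pos = parse_expr(tokens, pos)  # recursion handles nesting
            args.append(arg)
            if pos < len(tokens) and tokens[pos][0] == "COMMA":
                pos += 1
            else:
                break
    _, pos = expect(tokens, pos, "RPAREN")
    return Call(name, args), pos


def parse(tokens):
    """Step 2: build a tree from the tokens (whitespace is dropped first)."""
    tokens = [tok for tok in tokens if tok[0] != "WS"]
    node, pos = parse_expr(tokens, 0)
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing tokens")
    return node


def emit(node):
    """Step 3: translate the tree into Lisp-style prefix notation."""
    if isinstance(node, Name):
        return node.ident
    return "(" + " ".join([node.callee] + [emit(arg) for arg in node.args]) + ")"
```

With these definitions, `emit(parse(tokenize("function(x, y)")))` gives `"(function x y)"`, and a nested call like `f(g(x), y)` comes out as `"(f (g x) y)"`.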
It's a herculean task to do on your own, but there is some really good tooling out there (search for terms like lexer, tokenizer, and parser generator for your favorite language).
If you want to get your feet wet, look for a tutorial on how to write a LISP compiler in whatever language you want to use to write your language. LISP has about the simplest syntax of any language, so you can get up and running with the basic concepts and tooling, and learn how to think about what makes a sensible language, without spending too much time. From there you can build off of your tutorial code.
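To give a feel for how little syntax there is to deal with, here is a rough sketch of a LISP reader plus a tiny evaluator in Python. Note the assumptions: it is an interpreter rather than a compiler, it only knows integers and three arithmetic operators, and the names `read`, `read_tokens`, `evaluate`, and `ENV` are invented for this example. A tutorial will take you much further.

```python
import operator


def read_tokens(tokens):
    """Consume tokens from the front of the list and return one expression."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(read_tokens(tokens))
        if not tokens:
            raise SyntaxError("missing )")
        tokens.pop(0)  # discard the closing ")"
        return items
    if token == ")":
        raise SyntaxError("unexpected )")
    try:
        return int(token)
    except ValueError:
        return token  # anything that is not an integer is a symbol


def read(source):
    """Turn text like "(+ 1 (* 2 3))" into nested Python lists."""
    # Padding the parentheses with spaces lets str.split do the tokenizing.
    tokens = source.replace("(", " ( ").replace(")", " ) ").split()
    expr = read_tokens(tokens)
    if tokens:
        raise SyntaxError("unexpected trailing input")
    return expr


# Hypothetical global environment with a few built-in functions.
ENV = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def evaluate(expr, env=ENV):
    """Evaluate an expression produced by read()."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env[expr]  # look the symbol up
    # A list is a call: evaluate every item, then apply the first to the rest.
    fn, *args = [evaluate(item, env) for item in expr]
    return fn(*args)
```

Here `read("(+ 1 (* 2 3))")` returns `["+", 1, ["*", 2, 3]]`, and passing that to `evaluate` returns `7`. The tokenizer is one line and the parser is one short recursive function, which is why LISP is such a common starting point.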