
SnowballSH


[Part 0] Create your own programming language, in Python

So, you think you have enough programming skills?

If so, why not do something really fun -- create your own programming language!

Who am I?

I am SnowballSH, a programmer and music producer. I have decent experience in lexical analysis, parsing, and language design. Gorilla, written in Go, is the best programming language I have made. Feel free to try it online!

Why Python?

To be clear, I definitely do not suggest making programming languages in Python.

  • It is slow
  • It is dynamically typed -- more errors will only show up at runtime, in front of the user

However, I chose Python because it is clean and well suited to showing the basic principles and terms of language design.
I will document the type of almost every function argument, so you can follow along in any statically typed programming language too!
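As a small illustration of what documenting argument types might look like, here is a sketch using Python's optional type hints. The function `count_digits` is made up for this example and is not part of the language we will build.

```python
def count_digits(source: str, start: int = 0) -> int:
    """Count consecutive digit characters in `source`, beginning at `start`."""
    position: int = start
    while position < len(source) and source[position].isdigit():
        position += 1
    return position - start


# Python does not enforce these hints at runtime; they act as documentation
# and as input for optional type checkers.
print(count_digits("123+45"))     # 3
print(count_digits("123+45", 4))  # 2
```

The annotations (`str`, `int`) map directly onto the parameter and return types you would write in a statically typed language.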

If you are wondering what Python is...

This is Python:

(Just kidding)

Uhh, OK, but I have absolutely no idea how a programming language like Python is made!

Well, if you already knew how one is made, why would you be here? The fundamentals and basics are exactly what we are here to talk about.

Here is the process of an interpreted programming language:

[Figure: process of an interpreted language]

Credit: Marc-André Cournoyer

The process of a compiled programming language:

[Figure: process of a compiled language]

Differences between "Compiled" and "Interpreted"

Compiled Language:

  • Contains Lexer and Parser
  • Has a compiler that translates the user's code into a lower-level target. The target can be WebAssembly, machine code, assembly, JVM bytecode, or even your own custom bytecode.
  • Pro: It is fast because the code runs natively on the machine.
  • Con: Extremely hard to manage. Native instructions and opcodes differ between architectures and operating systems, which makes it harder to add more features.

Interpreted Language:

  • Contains Lexer and Parser
  • Has a special interpreter that accepts Abstract Syntax Trees (an outline of the code) and walks through them. Done this way, no native code is generated. Everything is run directly by the higher-level language the interpreter is written in (e.g. Python, C++).
  • Pro: Easy to manage. The language you are writing your language in has already done the compilation for you -- you only have to handle one single case: the language you are writing in. This way you can add a feature easily and quickly, often with just one function!
  • Con: It is slow because it runs on top of a high-level programming language. How slow is it? Well, you will see right below.
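To make the lexer, parser, and interpreter stages concrete, here is a minimal sketch of a tree-walking interpreter for arithmetic only. It is not the language this series will build; the names `lex`, `Parser`, `evaluate`, and `run` are assumptions chosen for this illustration.

```python
import operator
import re
from typing import List, Tuple, Union

# An AST node is either a number or a tuple (operator, left, right).
Node = Union[float, Tuple[str, "Node", "Node"]]

TOKEN_PATTERN = re.compile(r"\s*(\d+\.?\d*|[-+*/()])")
OPERATIONS = {"+": operator.add, "-": operator.sub,
              "*": operator.mul, "/": operator.truediv}


def lex(source: str) -> List[str]:
    """Lexer: split the source text into a flat list of tokens."""
    tokens: List[str] = []
    position = 0
    source = source.rstrip()
    while position < len(source):
        match = TOKEN_PATTERN.match(source, position)
        if match is None:
            raise SyntaxError(f"unexpected character at {position}")
        tokens.append(match.group(1))
        position = match.end()
    return tokens


class Parser:
    """Parser: turn tokens into an abstract syntax tree (AST)."""

    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.index = 0

    def peek(self) -> str:
        return self.tokens[self.index] if self.index < len(self.tokens) else ""

    def advance(self) -> str:
        token = self.peek()
        self.index += 1
        return token

    def parse_expression(self) -> Node:
        # expression := term (("+" | "-") term)*
        node = self.parse_term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.parse_term())
        return node

    def parse_term(self) -> Node:
        # term := factor (("*" | "/") factor)*
        node = self.parse_factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.parse_factor())
        return node

    def parse_factor(self) -> Node:
        # factor := NUMBER | "(" expression ")"
        token = self.advance()
        if token == "(":
            node = self.parse_expression()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return node
        try:
            return float(token)
        except ValueError:
            raise SyntaxError(f"unexpected token {token!r}")


def evaluate(node: Node) -> float:
    """Interpreter: walk the tree and compute its value."""
    if isinstance(node, float):
        return node
    op, left, right = node
    return OPERATIONS[op](evaluate(left), evaluate(right))


def run(source: str) -> float:
    parser = Parser(lex(source))
    tree = parser.parse_expression()
    if parser.peek() != "":
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return evaluate(tree)


print(run("1 + 2 * 3"))    # 7.0
print(run("(1 + 2) * 3"))  # 9.0
```

Notice that `evaluate` never produces machine code: Python itself does all the real work, which is exactly why this approach is easy to extend and also why it is slow.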

Most of the time, language makers also create their own virtual machines and bytecode (a series of instructions similar to assembly).
For example, CPython, the standard Python implementation, is written in C: it compiles Python source to "Python bytecode", and a virtual machine, also written in C, executes that bytecode. Because the bytecode is a flat list of instructions rather than a tree, this makes the language faster than walking the tree directly (about 3x, as a ballpark figure).
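The bytecode idea described above can be sketched in a few lines. This is a toy stack machine, assuming a tree made of nested tuples like `("+", 1.0, 2.0)`. The opcode names (`PUSH`, `ADD`, ...) and the functions `compile_node` and `run_vm` are invented for this example and are not CPython's real instruction set.

```python
from typing import List, Tuple, Union

# A tree node is either a number or a tuple (operator, left, right).
Node = Union[float, Tuple[str, "Node", "Node"]]
# An instruction is (opcode, operand); the operand is only used by PUSH.
Instruction = Tuple[str, float]

OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def compile_node(node: Node, code: List[Instruction]) -> List[Instruction]:
    """Compiler: flatten a tree into a linear list of instructions."""
    if isinstance(node, (int, float)):
        code.append(("PUSH", float(node)))
    else:
        op, left, right = node
        compile_node(left, code)   # operands are emitted first...
        compile_node(right, code)
        code.append((OPCODES[op], 0.0))  # ...then the operation itself
    return code


def run_vm(code: List[Instruction]) -> float:
    """Virtual machine: execute the instructions using a value stack."""
    stack: List[float] = []
    for opcode, operand in code:
        if opcode == "PUSH":
            stack.append(operand)
            continue
        right = stack.pop()
        left = stack.pop()
        if opcode == "ADD":
            stack.append(left + right)
        elif opcode == "SUB":
            stack.append(left - right)
        elif opcode == "MUL":
            stack.append(left * right)
        elif opcode == "DIV":
            stack.append(left / right)
        else:
            raise ValueError(f"unknown opcode {opcode}")
    return stack.pop()


tree = ("+", 1.0, ("*", 2.0, 3.0))  # represents 1 + 2 * 3
bytecode = compile_node(tree, [])
for instruction in bytecode:
    print(instruction)
print(run_vm(bytecode))  # 7.0
```

The VM loop never recurses and never inspects the tree; it just steps through a flat list. To look at real CPython bytecode, the standard library's `dis` module can disassemble any Python function.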

Real World Examples

Interpreted

  • Compiles to Bytecode
    • Python
    • Ruby
    • Java (Compiles to JVM Bytecode)
    • C# (Compiles to CIL)
  • Fully Interpreted (at least traditionally; modern implementations often add bytecode or JIT compilation)
    • JavaScript
    • R
    • PHP

Compiled

  • Compiles to Machine Code
    • C
    • C++
    • Ada
    • Go/Golang
    • Crystal (LLVM)
    • Lisp
    • Pascal
  • Compiles to Bytecode
    • Java (as said, to JVM)
    • C# (to CIL)

What we are going to make

For the sake of this tutorial (my first tutorial ever), we will be building a dynamic, object-oriented, fully-interpreted programming language.

Why?

As I said, an interpreted language is easy to manage.

Although it is slow, it shows the process of making a language clearly and simply!

If you feel strongly that we should make a compiled language, I won't disagree -- in fact, I love compiled languages. I chose an interpreted one just to keep things clear for everyone!

Conclusion

Thanks for reading this far! Next, we will talk about the syntax of our language and get started on the lexical analyzer!

See you later :>
