DEV Community

Discussion on: Language Features: Best and Worst

Collapse
 
awwsmm profile image
Andrew (he/him)

Also are you going to be using flex and bison for the language? Or is it going to be a PEG parser?

I don't know what any of this means. (Sorry, I'm new to language design.) Can you elaborate?

Collapse
 
gavinfernandes2012 profile image
Gavin Fernandes

Alright, as far as I know, there are two general types of parsers, top down, and bottom up. Bottom up is more performant iirc, but is harder to get used to, while top down is less performant (O(n3) if you know what I mean, and iirc), but I hear that they're much simpler to work with.

Ohh before I elaborate further, I'll need to know what sorts of systems and tools you're used to working with. What language, OS, compiler?

Thread Thread
 
awwsmm profile image
Andrew (he/him)

In grad school I programmed in C/C++ for 5 years, now I'm mostly Java and R, but I'm teaching myself Haskell, as well. I work on Windows, Ubuntu, and macOS. If you're asking which compiler I use to compile my own code, just gcc or whatever's available. I haven't yet looked into anything for my language project.

I found this article while researching this and it mentions yacc, which I have heard of. I wasn't aware that bison is its successor. I don't know how yacc works, though, I just recognise the name.

Thread Thread
 
isaacrstor profile image
Isaac Yonemoto

Before you move forward with your project I strongly suggest you study Julia (as a replacement for R) and Elixir (as a replacement for Java). They're more modern languages so they've worked through a lot of issues in their recent, early pre 1.0 phases and have thought, hard and heatedly, about a lot of things you're looking at. Both are functional, and both are dynamic (I think strict adherence to staticity is a bit of a religion: Julia kills the idea that static = performance, because of its amazing compiler strategy, and Elixir + dialyzer kills the idea that static = correctness - I basically have no runtime errors with static typechecking engine, and the rock solid dynamic stability of OTP makes it way worth it - I'm a a relatively inexperienced dev, they just promoted me from junior directly to senior =) with a lot of freedom and in elixir I wrote a testing harness in three days for a senior coworker's go program (6 months in the making) that triggered both linux and go program panics/errors, while my harness (living on the same node and dispatching and monitoring tens of thousands of concurrent processes) was fine.

Moreover, both PLs focus on developer productivity and joy. I don't know if that's what you're optimizing for, but I suggest learning from some of the best!

Thread Thread
 
awwsmm profile image
Andrew (he/him)

Thanks, Isaac! I'll have to explore a few more languages before I make a decision one way or the other. I'll definitely check out Julia and Elixir.

Collapse
 
validark profile image
Validark

A PEG is a parsing expression grammar. It is a way of specifying all valid strings in a language. It looks kinda like this

Statement <- FunctionCall
FunctionCall <- Identifier ( Arguments )
Enter fullscreen mode Exit fullscreen mode

These would be converted into a Finite State Automata that can parse a string in LL(1) or O(n). It's basically a program that takes one character after another and changes state depending on the character. Each state can accept different characters leading to different states. This way, a state can encompass multiple non-terminals.

Let's say you define a language as having members "hero" and "hello".

MyLang <- hello | hero
Enter fullscreen mode Exit fullscreen mode

The way to parse a given string without backtracking or lookahead is to construct this finite state machine. Opening state accepts an h, leading to state 1. State 1 accepts e, leading to state 2. State 2 accepts l, leading to state 3, or r leading to state 4. State 3 accepts l, leading to state 5, which accepts o, leading to a state which it is valid to end on. State 4 also accepts an o and is valid to end on. In this case you could reuse state 4 for 5 if desired.

Thread Thread
 
validark profile image
Validark

Bison and Yaac are tools that allow you to write a grammar and it will spit out a finite state machine like I described.