re: Language Features: Best and Worst


Features I like:

  • First-class support for package management. When I install a programming language and can't work out within the first hour how to install a 3rd party library I really lose interest. Being able to do something like pip install library_name and for it to just work™ is awesome.
  • Error message output for humans. Elm and Rust spring to mind.
  • Expression-based languages that let you do something like let x = if something { 1 } else { 2 }.
  • Pattern matching (Elixir, Rust, Scala, etc.).
  • Languages that explicitly avoid the concept of exceptions and try/catch. Managing a single means of passing values up to a caller is hard enough, so I like the idea of encoding errors in the return value. Rust does this by way of the Result type. In Elixir, you return a tuple which includes information on whether an error occurred.
  • match statements like in Rust and Scala that support pattern matching. Bonus points if the compiler enforces that matches are exhaustive.
  • Any form of null-safety (e.g. Option types, Elvis operator, etc.). I see a lot of Java code with nested if (variable != null) { ... } checks and find it really hard to read.
  • Solid abstractions around concurrency (the Actor model, goroutines, compiler-enforced safety guarantees like those in Rust, etc.)
  • Quality of life features around debugging (for example, since Rust 1.32 you can do dbg!(something) to print out a value and all of its data without hand-writing a toString or similar; the type only needs a Debug implementation, which can usually be derived)
  • Native async/await syntax
  • If the language is dynamically typed, some form of type annotation syntax to aid static analysis is really helpful. Python 3 has a typing module which you can use in conjunction with static analysis and I've found it to make code significantly more readable and correct.
  • The ability to compile to a native binary.

Features I dislike:

  • null, NullPointerException, etc.
  • Bloated standard libraries. With a first-class, well supported package management system, “official” libraries could be pulled in when they’re needed.
  • Inconsistent standard libraries (i.e. lack of convention).
  • Excessive verbosity. I’m personally not a huge fan of the do/end syntax in languages like Elixir and Ruby. This is super subjective though and with modern text editors/IDEs it just becomes an aesthetic thing.

I could probably go on all day, but those are the first things that spring to mind!

 

> In Elixir, you return a tuple which includes information on whether an error occurred.

But... Elixir has exceptions... Maybe you are thinking of Go?

 

It has exceptions, but it's considered more idiomatic to return a tuple

OK, I see now. It's not Go's abomination like the word "tuple" implies, but more like the dynamically typed version of enum types.

At any rate, I find it a weird design choice to add exceptions and then encourage a different method of error handling. One of the main complaints about C++'s exceptions was that, due to the language's legacy, it ended up with three types of error handling:

  1. Function returned value + errno.
  2. Setting a field on an object.
  3. Exceptions.

I guess Elixir wanted to go with pattern matching for error handling, but had to support exceptions for Erlang interop?

> I guess Elixir wanted to go with pattern matching for error handling, but had to support exceptions for Erlang interop?

No. Elixir inherits from Erlang's "let it fail" mentality. The PL itself supports supervision trees, restart semantics, etc. So in some cases, you want to just "stop what the thread is doing in its tracks and either throw it away or let the restart semantic kick in". In those cases, you raise an exception and monitor logs. The failure will be contained to the executing thread. You can then make a business decision as to whether or not you REALLY want to bother writing a handler for the error. Does it happen once in 10 thousand? 10 million? Once in a trillion? The scale and importance of the task will dictate whether or not you need to deal with it.

Other times when you might want to raise an exception: 1) when you're scripting short tasks. Elixir lets you create "somewhat out of band tasks". Example: commands attached to your program that create/migrate/drop your databases.

In this case, most failures are 'total failures', and you don't care about the overall stability of the program, since it's "just an out-of-band task". So explicit full fledged error handling is more of a boilerplate burden.

2) when you're writing unit or integration tests. The test harness will catch errors anyway, so why bother with boilerplate? Use exceptions instead of error tuples.
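
The restart idea described above can be sketched roughly like this in Python. This is only an imitation under stated assumptions: plain Python functions are not BEAM processes, and supervise and make_flaky_worker are hypothetical names invented for the example, not part of any library.

```python
from typing import Callable


def supervise(worker: Callable[[], str], max_restarts: int = 3) -> str:
    """Run worker; if it crashes, throw the attempt away and start afresh.

    Loosely imitates a supervisor's restart strategy. The worker itself
    contains no error handling at all.
    """
    crashes = 0
    while True:
        try:
            return worker()
        except Exception as exc:
            if crashes == max_restarts:
                raise  # give up and let the failure propagate upwards
            crashes += 1
            print(f"worker crashed ({exc!r}), restart {crashes}/{max_restarts}")


def make_flaky_worker(failures: int) -> Callable[[], str]:
    """Build a worker that crashes `failures` times and then succeeds."""
    remaining = [failures]

    def worker() -> str:
        if remaining[0] > 0:
            remaining[0] -= 1
            raise RuntimeError("transient failure")
        return "done"

    return worker
```

Calling supervise(make_flaky_worker(2)) returns "done" after two logged restarts, while a worker that keeps failing past max_restarts lets its last exception through.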

Yes, exceptions make sense. I agree with that. Using the returned value for error handling also makes sense - but only if you don't use exceptions. If the language supports exceptions, and not just aborts/panics - actual exceptions you are expected to catch because they indicate things that can reasonably happen and you need to handle - then you can't argue that using the returned value makes everything clear and deterministic and safe and surprise-free - because some function down the call chain can potentially throw an exception.

So, if that function can potentially throw, you already need to code in exception-friendly way - specifically put all your cleanup code in RAII/finally/defer/whatever the language offers. And if you already have to do this for all cases - why not just use exceptions in all cases and not suffer from the confusion that is multiple error handling schemes?
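
A small Python sketch of that situation, with made-up names (parse_ratio, process): the function reports bad input through its return value, yet the caller still needs exception-safe cleanup because something further down can raise.

```python
from typing import List, Tuple


def parse_ratio(text: str) -> Tuple[str, float]:
    """Report malformed input through the return value, tuple style."""
    parts = text.split("/")
    if len(parts) != 2:
        return ("error", 0.0)
    # int() and "/" can still raise ValueError or ZeroDivisionError here.
    return ("ok", int(parts[0]) / int(parts[1]))


def process(text: str, log: List[str]) -> Tuple[str, float]:
    log.append("acquire")  # stands in for opening some resource
    try:
        return parse_ratio(text)
    finally:
        # Needed even though parse_ratio "returns its errors".
        log.append("release")
```

With the input "1/0", process raises ZeroDivisionError instead of returning an error tuple, and only the finally block makes sure "release" is logged.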

You don't have to clean up. That's the point. I can't explain it except to say, if you watch enough of The IT Crowd, you might start to agree that sometimes it's okay to just "turn it off and back on again".

If your system is designed to tolerate thread aborts, it's really refreshing and liberating. Let's say I was making a life-critical application. In the event of a cosmic ray flipping a bit, I would much rather have a system architected so that, say, one of the less important subprocesses just panics and gets automatically restarted from a safe state, with the critical subprocesses still churning along, than a system that brings down everything because it expects to have everything exactly typechecked at runtime.

There is still cleanup going on. Something has to close the open file descriptors and network connections. You may not need to write the cleanup code yourself, as it happens behind the scenes, but as you have said - you need to design your code in a way that does not jam that automatic cleanup. For example, you need to prevent a crashing subprocess from leaving behind corrupted persistent state that the subprocess launched to replace it won't be able to handle.

One of the main arguments of the "exceptions are evil" movement is that having all these hidden control paths makes it hard to reason about the program's flow, especially cleanup code that needs to run in case of error. But... if you already need to design your program to account for the possibility of exceptions, you are losing the benefit of explicit flow control while paying the price of extra verbosity.

This convention in Elixir to prefer returning a tuple seems to me more trendy than thoughtful...

You really don't have to worry about it. The VM takes care of it for you. Unlike Go, there are process listeners that keep track of what's going on. File descriptors are owned by a process id, and if the id goes down it gets closed.

As it's an FP language, most stuff is stateless, and in order to use state you have to be very careful about it, so there usually isn't a whole lot of cleanup to do in general. As I said, I wrote some sloppy code in three days as a multinode networked testbench for an internal product and it was - it had to be - more stable than the code shipped by a senior dev (not in a BEAM language).

There is zero extra verbosity because you write zero lines of code to get these features.

As for the tuples, I wouldn't call it trendy since it's inherited from Erlang, which has had it since the 80s.

I think you have been misinformed about Elixir or Erlang and suggest you give it a try before continuing to make assertions about it.

> You really don't have to worry about it. The VM takes care of it for you. Unlike Go, there are process listeners that keep track of what's going on. File descriptors are owned by a process id, and if the id goes down it gets closed.

Yup - higher level languages do that basic stuff for you. But it can't do all cleanup for you. For example, if a subprocess needs to write two files, and it crashed after writing the first file due to some exception, there will only be one file on the disk. You need to either account for the possibility there will only be one file (when you expected there to be zero or two) or do something to clean up that already-written file.
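
One common way to deal with the two-file case is to stage the output and publish it in a single step. A minimal Python sketch, assuming a POSIX-like filesystem where renaming a directory is atomic and assuming the target directory does not exist yet (write_pair and the file names are hypothetical):

```python
import os
import tempfile


def write_pair(final_dir: str, first: str, second: str) -> None:
    """Publish two files so that readers see either none or both.

    Both files are written into a staging directory that sits next to
    final_dir, and the directory is renamed into place as the last step.
    """
    parent = os.path.dirname(os.path.abspath(final_dir))
    staging = tempfile.mkdtemp(prefix=".staging-", dir=parent)
    with open(os.path.join(staging, "first.txt"), "w") as handle:
        handle.write(first)
    # A crash at this point leaves only the staging directory behind.
    with open(os.path.join(staging, "second.txt"), "w") as handle:
        handle.write(second)
    os.rename(staging, final_dir)  # assumes final_dir does not exist yet
```

Leftover staging directories from crashed runs would still have to be swept up by something, which is arguably the point being made above.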

> There is zero extra verbosity because you write zero lines of code to get these features.

I talked about verbosity in the no-exceptions style error handling, not the one in the exceptions style.

> As for the tuples, I wouldn't call it trendy since it's inherited from Erlang, which has had it since the 80s.

Erlang had tuples, but didn't use them for return-value-based error handling. At least, not from what I could see with a quick Google search. Elixir does use them for error handling.

> I think you have been misinformed about Elixir or Erlang and suggest you give it a try before continuing to make assertions about it.

My "assertion" about Elixir is that it uses both exceptions and pattern-matching-on-returned-values for error handling. Is this incorrect?

> At least, not from what I could see with a quick Google search

ok tuples and error tuples are literally everywhere in Erlang. The result type for the gen_server start function, for example, is {ok, Pid} | ignore | {error, Error}.

 

Wow! Thanks, Darren. I think I agree with most of the points you raise here. Getting a package management system up and running is clearly a step toward making the language accessible to the general public. I have next to no idea what that would entail, but it's certainly something I'd have to think about as the language matures.

 

I find the Erlang/Elixir treatment of null to be acceptable.

It (nil) is an atom (as are false, and true), definitely not conflatable with zero.

The only thing that is "dangerously" affected is "if", which fails on "false" and "nil" exclusively. Everywhere else you have to treat nil as its own entity.

 

Maybe something similar to this is the best approach?

Continuing with my idea of making all data N-dimensional matrices, nil or null would just be an empty matrix. Then a statement like if ([]) wouldn't make any sense because an empty matrix shouldn't be truthy or falsy. It should throw a compiler error.
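
A rough Python sketch of that idea, using a hypothetical Matrix class invented for this example. Python can only reject the condition at runtime, whereas the proposal above is for a compile-time error, so this shows the behaviour rather than the mechanism.

```python
from typing import List


class Matrix:
    """Hypothetical matrix-only value; the empty matrix plays the role of nil."""

    def __init__(self, rows: List[List[float]]):
        self.rows = rows

    def is_empty(self) -> bool:
        return len(self.rows) == 0

    def __bool__(self) -> bool:
        # Refuse to be used as a condition at all.
        raise TypeError("a matrix is neither truthy nor falsy; use is_empty()")
```

Here `if Matrix([]): ...` raises TypeError, so the only way to branch on emptiness is the explicit is_empty() call.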

 

Erlang and Elixir are dynamically typed languages. The "billion-dollar mistake" does not apply to dynamically typed languages. Guaranteeing that a variable cannot be null is not very helpful when you can't guarantee that variable's type.

You can definitely guarantee variables' types in Erlang and Elixir.

By doing explicit checks. How do these differ from null checks?

 

Add to that:

  • Proper static type system with generics, abstract type members, variadic types, tuple types, function types, a sound and complete type system with set-like operations (and, or, not).
  • Flow typing: a variable's deduced type is refined through flow control.
  • Garbage collection (of some kind)
  • I agree on no exceptions, use types.
  • Compiler-as-a-library
  • Macros and other meta-programming
  • Incremental compilation
  • Interactive prompt
  • JIT compilation, AOT compilation and scripting
  • Support Language Server Protocol (and check it works in vim, emacs, vscode)
  • Support Debug Adapter Protocol (which requires a whole debugger, and debug symbol system)
  • Support memory, CPU, and cache profiling tools out of the box (e.g. Valgrind et al.)
  • Support and/or built-in testing
  • Support and/or built-in quality metrics
  • Make it open and freely available
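
As a concrete picture of the flow typing item above, here is a minimal Python sketch. Static checkers such as mypy narrow the type of a variable after an `is None` or isinstance check; the function names shout and size are made up for the example.

```python
from typing import Optional, Union


def shout(name: Optional[str]) -> str:
    if name is None:
        # In this branch a checker treats name as None.
        return "WHO?"
    # Past the early return, name is narrowed from Optional[str] to str,
    # so calling .upper() is accepted without a cast.
    return name.upper() + "!"


def size(value: Union[int, str, list]) -> int:
    if isinstance(value, int):
        return value  # narrowed to int
    # Narrowed to Union[str, list]; both support len().
    return len(value)
```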

I've been designing and building languages (now full time) for many years. You'll find you can end up adding an infinite list of things.

My advice, try writing a very simple Lisp interpreter: it can be done in under a day. Then try adding a few things.

You might also want to check out LLVM, which has a tutorial: "Implementing a Language with LLVM".

My other advice: as soon as possible, "Eat Your Own Dogfood". The programming language, compiler services, and all its libraries should be written in the language itself. To do this, write a bare-minimum compiler from your language to C, C++, Java or whatever (C++ did this initially with "cfront"). Then rewrite that simple pre-processor in your new language. Then add more features.

This is the best and most efficient way to validate your work - if you like using your own language more than some other, you are possibly on the right track.

 

Thanks for all the advice. I'm going to have a huge list of things to research before I even think about starting this project. I'm sure I'll come back to your comment more than a few times.

I forgot to say, the most influential books for me are:

  • "Programming Languages: An Interpreter Based Approach" by Samuel N Kamin - though it says it's about Interpreters, it's really looking at how to implement language features for various languages. This makes you feel like you could do it yourself, because it explains each feature and gives example code. One of the first books I read on the subject.

  • "Types and Programming Languages" by Benjamin C Pierce. Totally opposite and quite heavy reading. Assumes you can do degree level set-theoretic logic - but this book basically tells you how to build a proper type system from a mathematical perspective. The concepts are key, so it's possible to read and gloss over the maths. Often academia has the future or bleeding edge hidden in research papers, so it's worth reading these also, even if the maths goes way over one's head.

As I mentioned, Lisp is a great place to start because it has a very simple lexical and syntactical grammar, and the semantics can be expressed very minimally. A quick Google search gave me: Lisp in Less Than 200 Lines Of Code
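
To give a feel for how small such an interpreter can be, here is a minimal sketch in Python. It is not the linked article's code, just an illustration under its own assumptions: integers only, a handful of built-ins, and three special forms (if, define, lambda). All function names are made up.

```python
import operator
from typing import Any, Dict, List


def tokenize(source: str) -> List[str]:
    """Split source text into parentheses and atoms."""
    return source.replace("(", " ( ").replace(")", " ) ").split()


def parse(tokens: List[str]) -> Any:
    """Build a nested-list syntax tree, consuming tokens from the front."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        expr = []
        while tokens and tokens[0] != ")":
            expr.append(parse(tokens))
        if not tokens:
            raise SyntaxError("missing )")
        tokens.pop(0)  # drop the closing parenthesis
        return expr
    if token == ")":
        raise SyntaxError("unexpected )")
    try:
        return int(token)
    except ValueError:
        return token  # anything that is not an integer is a symbol


def global_env() -> Dict[str, Any]:
    return {
        "+": operator.add, "-": operator.sub, "*": operator.mul,
        "<": operator.lt, "=": operator.eq,
    }


def evaluate(expr: Any, env: Dict[str, Any]) -> Any:
    if isinstance(expr, str):  # symbol lookup
        return env[expr]
    if not isinstance(expr, list):  # number literal
        return expr
    head = expr[0]
    if head == "if":
        _, test, then, otherwise = expr
        return evaluate(then if evaluate(test, env) else otherwise, env)
    if head == "define":
        _, name, value = expr
        env[name] = evaluate(value, env)
        return None
    if head == "lambda":
        _, params, body = expr
        # Each call evaluates the body in a copy of the defining
        # environment, extended with the arguments.
        return lambda *args: evaluate(body, {**env, **dict(zip(params, args))})
    function = evaluate(head, env)
    return function(*[evaluate(arg, env) for arg in expr[1:]])


def run(source: str, env: Dict[str, Any]) -> Any:
    return evaluate(parse(tokenize(source)), env)
```

For example, after defining fact with `(define fact (lambda (n) (if (< n 2) 1 (* n (fact (- n 1))))))`, running `(fact 5)` in the same environment evaluates to 120.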

Don't research absolutely everything to begin with, it's too overwhelming a subject. The basics are covered in the "Dragon Book":

  • Lexical analysis (see flex, antlr, re2c)
  • Syntax/Parsing (bison, antlr)
  • Semantic analysis (you're on your own here, it's too language specific)
  • Type systems (if you're going static typing, which I would highly recommend)
  • Code Generation (see LLVM, it does everything you'd need)
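
The first two phases in the list above can be sketched by hand for a toy arithmetic language. This is a minimal Python illustration, not a substitute for flex, bison or ANTLR: the token names, the grammar and the Parser class are assumptions made for the example, and the parser evaluates as it goes in place of real semantic analysis and code generation.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text)

TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("OP", r"[-+*/()]"),
    ("SKIP", r"\s+"),
    ("ERROR", r"."),
]
TOKEN_RE = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def lex(source: str) -> List[Token]:
    """Lexical analysis: turn characters into (kind, text) tokens."""
    tokens: List[Token] = []
    for match in TOKEN_RE.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(f"unexpected character {text!r}")
        tokens.append((kind, text))
    return tokens


class Parser:
    """Recursive descent over this grammar:
        expr   := term (("+" | "-") term)*
        term   := factor (("*" | "/") factor)*
        factor := NUMBER | "(" expr ")"
    """

    def __init__(self, tokens: List[Token]):
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> str:
        return self.tokens[self.pos][1] if self.pos < len(self.tokens) else ""

    def advance(self) -> Token:
        token = self.tokens[self.pos]
        self.pos += 1
        return token

    def expr(self) -> int:
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()[1]
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self) -> int:
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()[1]
            right = self.factor()
            # "/" is integer division in this toy language.
            value = value * right if op == "*" else value // right
        return value

    def factor(self) -> int:
        if self.pos >= len(self.tokens):
            raise SyntaxError("unexpected end of input")
        kind, text = self.advance()
        if kind == "NUMBER":
            return int(text)
        if text == "(":
            value = self.expr()
            if self.peek() != ")":
                raise SyntaxError("expected )")
            self.advance()
            return value
        raise SyntaxError(f"unexpected token {text!r}")


def calculate(source: str) -> int:
    parser = Parser(lex(source))
    value = parser.expr()
    if parser.pos != len(parser.tokens):
        raise SyntaxError("trailing input")
    return value
```

Splitting the grammar into expr, term and factor is what gives "*" and "/" higher precedence than "+" and "-" without any extra bookkeeping.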

Another thing I tend to do is read/use a lot of languages and steal... err... leverage... ideas. Most also have their compilers and libraries open sourced. Some interesting languages: Swift, C#, Kotlin, Rust, Julia, Scala, Clojure, Erlang.

Don't be daunted, the subject, like most, is quite deep and involved when you really look into it.

Wow! Thanks a lot, Harvey! This is a great list of books. I really appreciate it. I'll definitely have a look at Lisp.
