DEV Community

Discussion on: What is self-hosting, and is there value in it?

Fred Ross

I wonder if it's different now that we have platforms like LLVM. By using the LLVM backend, it's easy to target a wide range of platforms.

Imagine that aliens land and start selling their bare metal microcontrollers for a price we can't refuse. If my compiler actually emits machine instructions directly, then I can add the new microcontroller as a target and produce a compiler for it. That's where the benefit of self-hosting comes in.
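To make that concrete, here is a minimal sketch of a compiler whose code generators are pluggable. Everything in it is hypothetical: the two-node IR, the instruction mnemonics, and the "alien" register machine are invented for illustration. It only shows that supporting a new chip comes down to writing one more emitter function.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union

# A deliberately tiny expression IR (hypothetical, for illustration only).
@dataclass
class Const:
    value: int

@dataclass
class Add:
    left: "Expr"
    right: "Expr"

Expr = Union[Const, Add]

def emit_stack(expr: Expr) -> List[str]:
    """Backend for an imaginary stack machine."""
    if isinstance(expr, Const):
        return [f"PUSH {expr.value}"]
    return emit_stack(expr.left) + emit_stack(expr.right) + ["ADD"]

def emit_alien(expr: Expr, dest: int = 0) -> List[str]:
    """Backend for an imaginary register machine (the 'alien' chip).

    The value of expr ends up in register r<dest>; registers above
    dest are used as scratch space for the right-hand operand.
    """
    if isinstance(expr, Const):
        return [f"LOADI r{dest}, {expr.value}"]
    code = emit_alien(expr.left, dest)
    code += emit_alien(expr.right, dest + 1)
    code.append(f"ADDR r{dest}, r{dest}, r{dest + 1}")
    return code

# Adding a target means adding one entry here.
BACKENDS: Dict[str, Callable[[Expr], List[str]]] = {
    "stack": emit_stack,
    "alien": emit_alien,
}

def compile_expr(expr: Expr, target: str) -> List[str]:
    """Select the backend by name and emit instructions for expr."""
    return BACKENDS[target](expr)

program = Add(Const(1), Add(Const(2), Const(3)))
for line in compile_expr(program, "alien"):
    print(line)
```

The front end and the IR stay untouched when `emit_alien` is added, which is the property the scenario relies on.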

That kind of scenario is simply rare today. Our processor families are fairly entrenched. And LLVM, like FORTRAN for MATLAB, is an assumed environment. If you start assuming LLVM, there's no reason to be self-hosting. That being said, self-hosting languages can quickly develop LLVM backends, and many have, by treating LLVM as a new machine to port to.

I never thought about how it biases a language towards compilers. Maybe that's why it feels so natural to write compilers in C++. :)

If you think it feels natural in C++, you should try the ML family (Standard ML, Haskell, OCaml). Those languages are deeply optimized for that kind of data manipulation.

edA‑qa mort‑ora‑y

Why would emitting machine instructions be better than emitting LLVM IR instructions? Is there some reason to believe that a shared IR would be harder to migrate to a new platform than an exclusive one?

Note, on Leaf I had a Leaf IR, which was already quite low-level. Unlike LLVM IR, Leaf IR still has a tree scope structure.

Fred Ross

If you are targeting a new architecture, you probably don't have a way to translate LLVM IR instructions into machine instructions, so you'll have to do it yourself. Again, if the language assumes that you always have a mature environment where LLVM has been ported, it's irrelevant.

edA‑qa mort‑ora‑y

I think this is a good point you make. LLVM is good for targeting a family of related systems: basically Linux, Windows, macOS. A truly new architecture will either be folded into that family, and thus LLVM will apply, or LLVM won't really help that much.

Though it does target some unusual architectures. I think mainly you'd want to keep it for the shared manpower of optimization. But in Leaf, my IR was low enough that it wouldn't take too much effort to lower it to a target's machine code (albeit less efficient machine code than LLVM would produce).
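As a rough illustration of what lowering a tree-scoped IR can look like, here is a small sketch. It is not Leaf's actual IR: the `Let`, `Print`, and `Block` node types and the `STORE`/`PRINT` instructions are invented for this example. Nested blocks are flattened into a linear instruction list, and each variable gets a numbered slot that is reused once its block ends. Nothing is optimized, which matches the "inefficient but easy" trade-off described above.

```python
from dataclasses import dataclass
from typing import Dict, List, Union

@dataclass
class Let:
    """Declare a variable local to the enclosing block."""
    name: str
    value: int

@dataclass
class Print:
    """Use a variable visible from the current scope."""
    name: str

@dataclass
class Block:
    """A nested scope; its variables disappear when it ends."""
    body: List["Node"]

Node = Union[Let, Print, Block]

def lower(block: Block) -> List[str]:
    """Flatten a tree of scopes into a linear list of instructions."""
    out: List[str] = []
    scopes: List[Dict[str, int]] = []  # innermost scope is last

    def visit(node: Node, next_slot: int) -> int:
        """Emit code for node and return the next free slot number."""
        if isinstance(node, Block):
            scopes.append({})
            slot = next_slot
            for child in node.body:
                slot = visit(child, slot)
            scopes.pop()
            return next_slot  # slots of the finished block can be reused
        if isinstance(node, Let):
            scopes[-1][node.name] = next_slot
            out.append(f"STORE slot{next_slot}, {node.value}")
            return next_slot + 1
        # Print: resolve the name, searching the innermost scope first.
        for scope in reversed(scopes):
            if node.name in scope:
                out.append(f"PRINT slot{scope[node.name]}")
                return next_slot
        raise NameError(node.name)

    visit(block, 0)
    return out

program = Block([
    Let("x", 1),
    Block([Let("y", 2), Print("x"), Print("y")]),
    Block([Let("z", 3), Print("z")]),
])
for line in lower(program):
    print(line)
```

In this sketch `y` and `z` both land in `slot1` because their blocks never overlap; a real backend would then map slots to registers or stack offsets for the target.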