REKKI

Why we killed elixir

borislav nikolov
Updated on ・4 min read


First, this is my opinion as CTO: I don't care about languages, I hate them all.

Here is my hate list of the languages I know well enough to say I hate, from most hated to least hated:

# most hated
java perl javascript
elixir ruby
python cpp objc clojure
go dart
c
# least hated

All those languages have something to offer, but when you decide to use one, you'd better know how it will bite you in the butt. Most overpromise and underdeliver: their philosophies work great on paper, but poorly in practice. By 'in practice' I mean a diverse team of people with different experiences, preferences and backgrounds working on a non-trivially sized project.

When REKKI started, the devs thought they were building a chat app instead of an ordering app, so they chose elixir with phoenix to be able to trivially do things like 'someone is typing' in the chat. However, what they didn't realize was that there is no way two chefs are in the app at the same time. You can read more about that here: https://dev.to/rekki/work-in-the-kitchen-4ifm

We moved from elixir to go around 1 year ago. It was pretty smooth: we just made a few go services and moved the endpoints that we needed. We made a super simple k8s deployment infra: run make build push deploy and you are good to go.

This enabled all frontend devs to write go, learn more sql, make simple endpoints for themselves, and change something if they need to. All of a sudden each frontend dev became a full stack dev.

But elixir has so many powerful concepts: GenServer, hot code reloading, concurrency. It's functional, and phoenix is pretty popular and has a great community. So why would we move away from it?


Let It Fail

First, the "let it fail" thing. Look, in our company things end up in a master database: for 99.9% of things, you open a transaction, write your stuff and close it.

GenServer is great and all, but honestly for us it's completely useless. It requires way too much discipline to not create inconsistencies in the data model, so if you do want to have things fail, you have to actually handle the failure of a weak data model.

Let's look at this example:

A user types a "hello" message. It ends up in a GenServer that sends it to rabbitmq. RMQ is a great piece of software: if something goes in, it is very unlikely to disappear, except when one of the nodes runs out of memory because of a bug in another queue that its consumer was not consuming. So now let's examine this "let it fail" thing.

What exactly can fail?

Nothing really. How about we just write the message to the database and return ok or error to the user?
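A minimal Go sketch of that "write it and return ok or error" idea. The Message type, the MessageStore interface and the in-memory store are all hypothetical stand-ins so the sketch runs without a database; a real implementation would open a transaction, insert the row and commit.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// Message is a hypothetical message record; the fields are assumptions.
type Message struct {
	UserID int
	Text   string
}

// MessageStore stands in for the master database.
type MessageStore interface {
	Save(m Message) error
}

// memoryStore is an in-memory stand-in so the sketch runs without a database.
type memoryStore struct {
	mu   sync.Mutex
	rows []Message
}

// Save rejects empty messages and appends everything else.
func (s *memoryStore) Save(m Message) error {
	if m.Text == "" {
		return errors.New("empty message")
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.rows = append(s.rows, m)
	return nil
}

// SendMessage writes the message and reports the outcome straight to the
// caller: either it is stored, or the user sees an error and can retry.
func SendMessage(store MessageStore, m Message) string {
	if err := store.Save(m); err != nil {
		return "error: " + err.Error()
	}
	return "ok"
}

func main() {
	store := &memoryStore{}
	fmt.Println(SendMessage(store, Message{UserID: 1, Text: "hello"}))
	fmt.Println(SendMessage(store, Message{UserID: 1, Text: ""}))
}
```

There is no queue and no supervisor in between, so there is no intermediate state that can be left half-written.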

We had so many outages because the data model had to be very relaxed in order to support the "let it fail" thing, and so often we ended up with partial data in the database.

Check out https://dev.to/rekki/mutation-is-life-boring-technology-11h0 for an example.

Hot code reloading

For me this is a complete myth. Building a robust hot code reloading deployment pipeline requires an enormous amount of effort to hand off and transform state properly.

It's ok if you want to replace one module or so, or for some prod debugging.

Performance

Runtime performance wise it is slow, about 2-3 times slower than go and 4-5 times slower than c. That is OK for most things, unless you want to rank a million items with some semi-complicated formula, and at that point you will have to scatter-gather.
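For readers unfamiliar with the term, here is a rough Go sketch of scatter-gather ranking: split the items into shards, let each goroutine compute a local top k, then merge the partial results. The Item fields and the score formula are made up for illustration; the real formula is not part of this post.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Item is a hypothetical record to rank; the fields are assumptions.
type Item struct {
	ID    int
	Price float64
	Sales float64
}

// Scored pairs an item ID with its computed score.
type Scored struct {
	ID    int
	Score float64
}

// score is a placeholder for the "semi-complicated formula".
func score(it Item) float64 {
	return it.Sales / (1 + it.Price)
}

// topK sorts by descending score and keeps at most k entries.
func topK(s []Scored, k int) []Scored {
	sort.Slice(s, func(i, j int) bool { return s[i].Score > s[j].Score })
	if len(s) > k {
		s = s[:k]
	}
	return s
}

// RankTopK scatters items across shards goroutines, has each compute a
// local top k, then gathers the partial results and merges them.
func RankTopK(items []Item, shards, k int) []Scored {
	if shards < 1 {
		shards = 1
	}
	partial := make([][]Scored, shards)
	chunk := (len(items) + shards - 1) / shards
	var wg sync.WaitGroup
	for s := 0; s < shards; s++ {
		lo := s * chunk
		if lo > len(items) {
			lo = len(items)
		}
		hi := lo + chunk
		if hi > len(items) {
			hi = len(items)
		}
		wg.Add(1)
		go func(s int, part []Item) {
			defer wg.Done()
			local := make([]Scored, 0, len(part))
			for _, it := range part {
				local = append(local, Scored{ID: it.ID, Score: score(it)})
			}
			partial[s] = topK(local, k) // each goroutine writes only its own slot
		}(s, items[lo:hi])
	}
	wg.Wait()
	var merged []Scored
	for _, p := range partial {
		merged = append(merged, p...)
	}
	return topK(merged, k)
}

func main() {
	items := make([]Item, 1000000)
	for i := range items {
		items[i] = Item{ID: i, Price: float64(i % 100), Sales: float64(i % 1000)}
	}
	for _, s := range RankTopK(items, 8, 3) {
		fmt.Printf("item %d scored %.2f\n", s.ID, s.Score)
	}
}
```

The same shape applies across machines instead of goroutines; only the transport changes.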

Compile time.. omg it is horrible, like early rust horrible.

Type Safety

We had to have a huge amount of useless tests just to avoid typos. Typos, I repeat. It's like perl without strict. Running the linter is slower than flow. Of course we use typespecs and so on, but they have the same problem as flow: things become any fast, and any spreads like a virus.

And worse, you can't delete anything. First the tests will fail, and then even when you fix them you have no confidence your changes are good. Of course typesafe code does not guarantee things are ok, but I have seen elixir bugs that are just pathetic for the year 2020.
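To make the typo point concrete, here is a small Go sketch contrasting a typed struct with an untyped map. The Order type and both functions are hypothetical. With the struct, a misspelled field does not compile; with the map, the misspelled key compiles fine and silently yields zero, which is the kind of bug only a test would catch.

```go
package main

import "fmt"

// Order is a hypothetical typed record.
type Order struct {
	Supplier string
	Quantity int
}

// totalQuantity sums a typed field. Writing o.Quantiy here would be
// rejected by the compiler, so no test is needed to catch the typo.
func totalQuantity(orders []Order) int {
	total := 0
	for _, o := range orders {
		total += o.Quantity
	}
	return total
}

// totalQuantityLoose mimics untyped code. The key "quantiy" is a
// deliberate typo: the lookup finds nothing, the type assertion fails,
// and the function returns 0 without any error.
func totalQuantityLoose(orders []map[string]interface{}) int {
	total := 0
	for _, o := range orders {
		if q, ok := o["quantiy"].(int); ok {
			total += q
		}
	}
	return total
}

func main() {
	typed := []Order{{Supplier: "a", Quantity: 2}, {Supplier: "b", Quantity: 3}}
	loose := []map[string]interface{}{{"quantity": 2}, {"quantity": 3}}
	fmt.Println(totalQuantity(typed))
	fmt.Println(totalQuantityLoose(loose))
}
```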

Reading and Writing code

This is personal now: some people like it, some don't. I won't make a strong argument. For me reading clojure is 10x faster than reading elixir, but I also used clojure more.

The problem is it has a steep learning curve. You can't get a js dev and ask them to change elixir code.

Scale

Do you really think the issues with our scale will be bound to elixir? Most companies will be fine by using bash+netcat webservers.

This is a joke, obviously; nobody should do that.

while true; do 
  echo -ne "HTTP/1.0 200 OK\r\nContent-Length: 5\r\n\r\nhello" | nc -l -p 8080
done

Concurrency

What concurrency? Most languages are ok with that, elixir is no better.

Finding developers

IDK, now it's hard to find any kind of developer, but choosing a niche language makes it even harder. Also, people join just because of the tech, which is not good.


So, since we wanted to unblock the frontend and simplify the action-at-a-distance architecture, I had a few reasonable choices, with go and java at the top of the list. Since it is just easier to write ok-ish code in go than in java, I chose go.

So now we are a go shop. We have 20-30 services, and each dev can make their own service and deploy it.

Things are pretty nice. I can say, after 1 year, it was a huge enabler and multiplier of people. I don't think we would've done half the things we did had we not switched, and the cost of switching was pretty small.

Go has its baggage as well.. but at least we can delete it.

Discussion (29)

AlexR2D2

What concurrency? Most languages are ok with that, elixir is no better.

Only Elixir and Erlang have real preemptive multitasking. Do you know what this is?
There is no such thing in any other language. Many programmers do not understand the value of this.

borislav nikolov Author

do you know the linux kernel is preemptive? and the freebsd one isn't, and yet the world goes on.

Thorsten Hirsch

Well...

To deal with the latency problems, the kernel in FreeBSD has been made preemptive. Currently, we only preempt a kernel thread when we release a sleep mutex or when an interrupt comes in. However, the plan is to make the FreeBSD kernel fully preemptive as described below.

(docs.freebsd.org/en/books/arch-han...)

borislav nikolov Author

haha sorry, i meant back when freebsd was freebsd4, which was a few years ago, and things were ok :)

borislav nikolov Author

aaa... that was 20 years ago apparently :D

borislav nikolov Author

anyway, my point was, beam does not live alone, it will be scheduled in and out when the scheduler deems fit

Robson Roberto Souza Peixoto

Since version 1.14 golang's scheduler is preemptible

AlexR2D2

In my opinion the golang solution looks like a hack.
The implementation uses a "safe point everywhere" concept. But you can still write code with unsafe sections like this:

1)

//go:nosplit
func infiniteLoop() {
	var u int
	for {
		// GC will wait forever...
		u -= 2
		if u == 1 {
			break
		}
	}
}

2)

j := &someStruct{}
p := unsafe.Pointer(j)
// unsafe-point start
u := uintptr(p)
// This is where:
// - no reference to memory 'j'
// - the GC can move the goroutine away and clean the memory (because of the "safe point everywhere" concept)
p = unsafe.Pointer(u)
// unsafe-point end
// bang !

Does every developer understand what the problem is, and will they avoid writing such code?

Besides:

  • The runtime must constantly send system signals to the OS threads running goroutines to check whether a goroutine should be moved away.
  • 10ms time frame per goroutine. Why 10ms? Does it mean that the system signal should be sent to each thread more often than every 10ms?

(source: habr.com/ru/post/502506/) (sorry, in Russian :))

Robson Roberto Souza Peixoto • Edited

I'm not a Go defender. I just said that Go has a preemptible scheduler. So, in the rare case of an infinite loop, it will not stop the service.

About unsafe: as the name says, it is unsafe to use. First, you should not use it unless you know what you are doing. Avoid unsafe because of this: "Packages that import unsafe may be non-portable and are not protected by the Go 1 compatibility guidelines." But having options is a good thing.

And Go has more problems: rytisbiel.com/2021/03/06/darker-co... .

borislav nikolov Author • Edited

the one i hate a lot is that a goroutine panic panics main. nobody expects it, devs are blindsided by it, and it actually causes outages if ignored. this and the for loop var reuse are such a pain in the butt.

we solve the goroutine panic with lint rules that force go func() to have defer util.IgnorePanic() or defer util.ForcePanic() so at least we are explicit, annoying af.
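a rough sketch of what a helper in the spirit of util.IgnorePanic could look like. this is a guess for illustration, not the actual REKKI code; the IgnorePanic body and the runGuarded wrapper here are assumptions. the key detail is that recover only works when called directly by the deferred function.

```go
package main

import (
	"fmt"
	"sync"
)

// IgnorePanic is a guess at what a helper like util.IgnorePanic might do:
// recover a panic in the current goroutine and log it, so the process
// keeps running. It must be deferred directly for recover to take effect.
func IgnorePanic() {
	if r := recover(); r != nil {
		fmt.Println("recovered goroutine panic:", r)
	}
}

// runGuarded is a hypothetical wrapper: it runs fn in a goroutine
// protected by IgnorePanic and waits for it to finish.
func runGuarded(fn func()) {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()     // runs last
		defer IgnorePanic() // runs first and swallows the panic
		fn()
	}()
	wg.Wait()
}

func main() {
	runGuarded(func() { panic("boom") })
	fmt.Println("main is still running")
}
```

without the deferred recover, the panic in the goroutine would terminate the whole program.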

but the one i hate most is variable shadowing.
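a minimal illustration of the shadowing trap, with made-up lookup, shadowed and fixed functions. inside the if block, := declares a new err that hides the outer one, so the function returns nil even though the lookup failed.

```go
package main

import (
	"errors"
	"fmt"
)

// lookup is a hypothetical call that fails for every id except 1.
func lookup(id int) (string, error) {
	if id != 1 {
		return "", errors.New("not found")
	}
	return "chef", nil
}

// shadowed has the bug: := inside the block declares a new err, so the
// outer err is never assigned and the function always returns nil.
func shadowed(id int) error {
	var err error
	if id > 0 {
		_, err := lookup(id) // new variable, shadows the outer err
		if err != nil {
			fmt.Println("inner err:", err)
		}
	}
	return err
}

// fixed uses plain assignment, so the outer err carries the failure.
func fixed(id int) error {
	var err error
	if id > 0 {
		_, err = lookup(id)
	}
	return err
}

func main() {
	fmt.Println("shadowed:", shadowed(2))
	fmt.Println("fixed:", fixed(2))
}
```

the compiler accepts both versions, which is what makes this one so easy to miss in review.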

PS: i actually edited the post to add link to rytisbiel.com/2021/03/06/darker-co... :) thanks for the link

jm

I wish posts like this would come with a resignation letter.

Josef Strzibny

Hah, I also think this post shows a lot of incompetence.

There are problems with Elixir and I hoped this would be a great article that points them out, but no.

borislav nikolov Author

don't be so salty, what would you do? stay with a codebase where it is hard to ship features and every ticket is "waiting for backend implementation"? we basically tripled the number of people who can work on adding features and quadrupled the output.

it was just the wrong tool for the job

jm

There is no shame in stepping down. I myself have realised that I can't keep up with the vast amount of development and changed my position to architecture-focused, having a talented tech lead take over the CTO position.

My point was that you haven't done what is expected from a CTO while assessing a stack, I don't know all the details, but it stands to reason to not just reassess the stack, but also to reassess your position, especially if you had resources committed.

But hey, I'm not on the board of your company, so you do you.

borislav nikolov Author • Edited

best of luck then :)
we will see what the future holds

Oisín

IDK, now it's hard to find any kind of developer, but choosing a niche language makes it even harder. Also, people join just because of the tech, which is not good.

I think this is mostly a myth sustained by hiring managers and recruiters who advertise "Scala roles" and look for "Python devs", rejecting many many good programmers. You should just look for good, smart devs who aren't afraid to try using a new tool instead of being pigeonholed into the one "safe" tool for everything.

We used Elixir in one of my old teams, after one dev used it to rewrite an internal Python service that was causing us a lot of trouble.
He tried it as an experiment, and waited until it was basically working before telling the rest of us (so we wouldn't get angry). We were surprised to discover that it worked really well, and solved many problems with the old Python implementation, especially when it came to concurrency and scaling. It was also much more concise, although this was probably due to that programmer's way of writing compact, easy to understand code.

At that time I didn't know any Elixir, but was able to pick it up and work on that codebase with no problems at all, after spending an afternoon or two going through the basic Elixir tutorial on their website.

Honestly, I feel like you've been fooled by this myth that "you can't get a JS dev and ask them to change Elixir code". What's worse, you've allowed that attitude to spread to your teams. This gives them implicit approval to throw their hands up in the air and say "yeah we can't work on that, we don't know Elixir", when it really isn't true.

I think that as a CTO, you should try to encourage two ideas that are often neglected:

  1. Everyone owns everything.
  2. Anyone can do anything.

When you get teams that have the opposite philosophy (territorial behaviour, "I don't touch that codebase, only Steve knows how that works"), it's usually because of a lack of communication, and a fear of being punished for not delivering results immediately. As a leader, you need to eliminate that fear so that your teams can take risks and do the best they can without looking over their shoulder and thinking of excuses to protect themselves.

borislav nikolov Author

the move to go was incredibly successful and empowering to all developers. previously there was an implicit barrier because of the paradigm difference; now every developer actually owns everything, and anyone can actually do anything, and they do.

we have zero problems that are solved by elixir's strengths.

stevensonmt

I find this a very curious post because a lot of what you hated is what most people love about elixir. The data model should be easier to manage, and incomplete writes easier to avoid, with elixir because of immutable data structures and genservers. Why did you have the opposite experience?
I have not worked on any highly complex apps in elixir yet, but I have not run into compile time issues. I know there are stylistic things that can affect compile time, like splitting things into multiple modules or having dozens and dozens of function signatures for one function name. Can you share some details about your project that might have caused these issues?

borislav nikolov Author

you can check dev.to/rekki/mutation-is-life-bori... and dev.to/rekki/work-in-the-kitchen-4ifm for more details

I think the weakness emerged from applying "let it fail" to things that cannot fail, and that led to the "oh we can't actually have a foreign key here because this can fail" misunderstanding; i joined the company after they had been doing that for 3 years or so. It just slowly crept into the database.

but I don't "hate" those things in elixir, I just said they are not a panacea. What we needed was more dev power, and we got it after moving to go.

Oliver Kriška

I think you didn't get what "let it fail/crash" is: it's more about not crashing at all. I understand that the issue with rabbitmq was bad, but that doesn't mean it was an issue with elixir/erlang logic.

Also, the performance claim is not exact. I suggest reading this article, for example: josephg.com/blog/crdts-go-brrr/. Sometimes it's not about the language but about the implementation.

Matías Reyes

It sounds like you refactored an over-engineered product into a simple architecture. Are you sure it was a language issue? (Also, elixir could work writing to the db without using custom genservers.)
It's also common for a startup to have unclear requirements at the beginning, so don't be so hard on the original team!

borislav nikolov Author • Edited

i have the utmost respect for the original team, as i have told them more than once or twice.
it is incredibly difficult to build a 0 to 1 product.

of course it was not a language issue. as i mention in the post, the language is fine. it was mainly slow tech debt creep that i think could have been easier to handle if the language was typed, but the issue was architectural death by a thousand cuts.

we could've rewritten it in elixir, but then we wouldn't have empowered all the devs to be able to work on the codebase, and hence i chose go.

Metehan

I wouldn't criticize a language without deeply understanding its core concepts. I once saw junior front end developers who were not fluent in any language criticizing clojure. This post made me feel the same way. I was really expecting some valuable points to think about.

borislav nikolov Author • Edited

i wouldn't criticise any language i do not deeply understand.

also the post is not a critique of the language itself, but a comment on the interactions between all aspects of software development, from hiring to complexity creep and technical debt.

as i said, we needed more developer velocity and simpler codebase, and we got them very quickly.

i hope in the future generation of developers, people will stop identifying with technology, but look through the mist, and recognise in the end it's just a bunch of jnz and mov instructions, and not take things personally.

everything has a cost.

Brian Underwood

Thank you for the perspective!

I agree that hot-code reloading has been over-promoted in the Elixir world and I think more recently the discussion that I've seen is compensating away from that for the average web-app (that is: just use docker / blue-green / whatever typical deployment and persist in the database)

I'm most curious about what you mention with GenServer. It sounds like you were just using it to pass messages through (?) to RMQ. But maybe you just didn't need a GenServer in that case. Generally, as I understand them, GenServers are for holding some state in memory (a cache or a user session or something that either you don't mind losing or which you persist regularly as a separate cycle). I think maybe they're another thing that was over-promoted, because people see them as so central to Elixir that they think they need to be used for everything. Could you share more detail about how you were using GenServers?

Jason Woods

Thanks for the post. Good call, making things work for your team. I think you made the right choice. I love elixir and go both and my biggest gripe with go is the docs and dep management. I wish you and your team the best.

Alexandre Drummond

It's odd to see you writing about the spread of 'any' in typespecs when interface{} spreads everywhere in go, just turning off compile time type checks for that kind of code. Sometimes it's a sign the code is being treated as OO code instead of functional code, but I understand: it's easier to change syntax than to change mental models.

Max

WhatsApp? You don't know elixir...

borislav nikolov Author

sure