Using Erlang

Sunday, July 5, 2015 :: Tagged under: engineering essay pablolife. ⏰ 8 minutes.

Hey! Thanks for reading! Just a reminder that I wrote this some years ago, and may have much more complicated feelings about this topic than I did when I wrote it. Happy to elaborate, feel free to reach out to me! 😄

The following is cross-posted with the Ghostlight blog.

It's not common knowledge, but Ghostlight is written in Erlang. This is slightly bananapants. Here are some thoughts and reactions to this choice, now that I've done substantial work on it.

I ❤️ Erlang

Pattern matching.

Srsly, this ruins you when you use other languages. Using raw if statements to choose your branches, and/or extracting nested data into a variable feels just so cumbersome in other languages.
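To make that concrete, here's a minimal sketch. The module, shapes, and tuple layouts are invented for illustration and aren't Ghostlight code. Each function clause picks its branch by the shape of the argument and pulls out the nested fields in the same breath:

```erlang
-module(shape_demo).
-export([area/1, describe/1]).

%% Each clause matches on the shape of its argument and binds the
%% fields right in the function head, so no `if` chain is needed.
area({circle, Radius}) ->
    math:pi() * Radius * Radius;
area({rectangle, Width, Height}) ->
    Width * Height;
area({square, Side}) ->
    Side * Side.

%% Nested data is destructured in one step; `_Name` is matched but ignored.
describe({show, Title, {venue, _Name, City}}) ->
    {Title, City};
describe(_Other) ->
    unknown.
```

The equivalent in most other languages is a type check, then an `if`, then a couple of accessor calls per branch.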

Binaries

I don't have much use for them in Ghostlight, but when I have needed them (once for a very old project) it's far and away the easiest way to work with binary data. This is illustrated pretty well in this solution to a Go challenge.
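As a rough sketch of what the bit syntax buys you, here's a parser for a made-up wire format (1-byte version, 2-byte big-endian payload length, then the payload). The format and the module name are assumptions for the example only:

```erlang
-module(binary_demo).
-export([parse_packet/1]).

%% Hypothetical format: <<Version, Length (16-bit big-endian), Payload, Rest>>.
%% `Length` is bound by an earlier segment and then used as the size of
%% the payload segment within the same pattern.
parse_packet(<<Version:8, Length:16/big, Payload:Length/binary, Rest/binary>>) ->
    {ok, Version, Payload, Rest};
parse_packet(_Incomplete) ->
    %% Anything too short (or otherwise malformed) falls through to here.
    {error, incomplete}.
```

For instance, `binary_demo:parse_packet(<<1, 0, 2, "hi">>)` matches the first clause, while a packet that claims 5 bytes of payload but carries 2 lands in the second.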

"Small, simple language"

This phrase apparently gets repeated a lot with some eyeroll, given how hard it can be to get any of your code running in Erlang before you understand OTP Applications and Releases, but there's some truth to it: you have atoms, lists, binaries, tuples, and the occasional function. When something breaks it's often obvious why (and consequently what the fix is), even in third-party code.

Dare I say: it's got the appeal of JavaScript's "simplicity" (where you can just inspect transparent objects), but smarter — no forced universal floating-point precision, no UTF-16 code points by default, no null and undefined, no 4 meanings of this, an atom type, and tuples and lists as distinct data types.

Oh, and

Being the only (kinda, sorta, I guess) mainstream language that gets software right

Most "Why Erlang?" posts cover this beautifully, so you can start by reading Evan Miller. In particular, I love his points that many of the benefits of Erlang are "back-loaded": they're non-obvious unless you've been building software for a substantial amount of time, and put up with Erlang's quirks for a long enough time to observe them (most give up, understandably, before they reach the top of the mountain to enjoy the view).

My absolute favorite thing about Erlang is its story around reliability: the best way to describe it succinctly comes from the fabulous Fred Hebert's book, Erlang in Anger. It's a PDF, so I'll quote the relevant section:

There’s something rather unique in Erlang in how it approaches failure compared to most other programming languages. There’s this common way of thinking where the language, programming environment, and methodology do everything possible to prevent errors. Something going wrong at run-time is something that needs to be prevented, and if it cannot be prevented, then it’s out of scope for whatever solution people have been thinking about.

The program is written once, and after that, it’s off to production, whatever may happen there. If there are errors, new versions will need to be shipped.

Erlang, on the other hand, takes the approach that failures will happen no matter what, whether they’re developer-, operator-, or hardware-related. It is rarely practical or even possible to get rid of all errors in a program or a system. If you can deal with some errors rather than preventing them at all cost, then most undefined behaviours of a program can go in that "deal with it" approach.

This is where the "Let it Crash" idea comes from: Because you can now deal with failure, and because the cost of weeding out all of the complex bugs from a system before it hits production is often prohibitive, programmers should only deal with the errors they know how to handle, and leave the rest for another process (a supervisor) or the virtual machine to deal with.

Given that most bugs are transient, simply restarting processes back to a state known to be stable when encountering an error can be a surprisingly good strategy.

Erlang is a programming environment where the approach taken is equivalent to the human body’s immune system, whereas most other languages only care about hygiene to make sure no germ enters the body. Both forms appear extremely important to me. Almost every environment offers varying degrees of hygiene. Nearly no other environment offers the immune system where errors at run time can be dealt with and seen as survivable.

Because the system doesn’t collapse the first time something bad touches it, Erlang/OTP also allows you to be a doctor. You can go in the system, pry it open right there in production, carefully observe everything inside as it runs, and even try to fix it interactively. To continue with the analogy, Erlang allows you to perform extensive tests to diagnose the problem and various degrees of surgery (even very invasive surgery), without the patients needing to sit down or interrupt their daily activities.

There is a lot of twee romance about Lisp and how Lisp is Beautiful and Lisp Will Ruin You So Don't Learn Lisp Because You'll Never Want To Program In Anything Else. I learned a few Lisps, had a ball, felt many things, but that wasn't one of them. I do feel that way with Erlang.

Per pattern matching above — once you have a system that is built around smart setting of boundaries with guarantees, not best efforts, and a realistic view of transient errors… well goddamn, you'll never feel comfortable programming anywhere else.
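The "boundaries" part can be sketched in a few lines. This is not how you'd structure a real OTP app (you'd use supervisors for that), and the module and function names are made up, but it shows the core trick: risky work runs in its own process, and a crash there arrives at the caller as an ordinary message instead of taking everything down.

```erlang
-module(crash_demo).
-export([run_isolated/1]).

%% Runs Fun in a separate, monitored process. If Fun crashes, only
%% that process dies; the caller is told why via a 'DOWN' message.
run_isolated(Fun) ->
    Parent = self(),
    {Pid, Ref} = spawn_monitor(fun() -> Parent ! {result, self(), Fun()} end),
    receive
        {result, Pid, Value} ->
            %% Worker finished; drop the monitor and any pending 'DOWN'.
            erlang:demonitor(Ref, [flush]),
            {ok, Value};
        {'DOWN', Ref, process, Pid, Reason} ->
            {crashed, Reason}
    end.
```

Calling `crash_demo:run_isolated(fun() -> error(boom) end)` returns a `{crashed, Reason}` tuple to a caller that is still very much alive. A supervisor is roughly this plus a policy for what to restart.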

Where Erlang has bitten me

(For a web app,) Too many data formats

We deal with 4 primary resources in Ghostlight:

Productions (the code calls them "shows").
Organizations
People
Pieces (the code calls them "works," as in "works of art")

Each of these is represented by an Erlang record, which, until Maps became more-or-less ratified (and they're not even done in bleeding-edge distributions!), was the standard name-value store.

It is well-known that records suck. Or are… basic, in any case. Because they're just a compile-time hack to tuples, there are a few operations (i.e. indexing by a variable) that are just impossible. When people complain about "ugly syntax," records are the only part of that I agree with. But! Whatever! I'm not here to complain about records. Only to say that, internally, I'm using them.
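Here's a small sketch of the "compile-time hack" point. The `person` record and its fields are invented for the example and aren't Ghostlight's actual definitions. A record is a tagged tuple at runtime, and since `Person#person.Field` with a variable `Field` won't compile, looking a field up by name means finding its position yourself:

```erlang
-module(record_demo).
-export([as_tuple/0, get_field/2]).

%% Hypothetical record; the field names are assumptions.
-record(person, {id, name, bio}).

%% At runtime a record is just a tuple tagged with the record name.
as_tuple() ->
    #person{id = 1, name = <<"Ada">>} =:= {person, 1, <<"Ada">>, undefined}.

%% Field names are resolved at compile time, so a workaround is to look
%% up the field's position in the tuple by hand.
get_field(Field, #person{} = Person) ->
    Fields = record_info(fields, person),   % [id, name, bio]
    Index = index_of(Field, Fields, 2),     % element 1 is the record tag
    element(Index, Person).

%% Crashes with function_clause if the field doesn't exist.
index_of(Field, [Field | _], N) -> N;
index_of(Field, [_ | Rest], N) -> index_of(Field, Rest, N + 1).
```

It works, but you end up writing (or generating) this kind of helper for each record type.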

But we don't send records to the server; the accepted way to speak to a server these days is to send and receive JSON, and so that's how we insert or update any of the 4 resources. So each of the above also has a JSON representation that needs to be validated, parsed from, and converted to.

Yet! We're not storing records or JSON in our persistent database, Postgres!¹ That requires you to take a record, then split it up into whatever representation you have in the various tables and Postgres datatypes! Then when you fetch it, you need to build up a record again for the application layer to use.

But we're not done! Because to actually render variables into an HTML template, we're using ErlyDTL, which uses proplists!²

This means every resource has/needs:

Code to parse JSON and form a record.
Code to split the record into SQL INSERT or UPDATE statements.
Code to parse the columns of your SELECT queries into a record.
Code to take your record and form a proplist for its HTML templates.

If you want to create JSON from your backend (and I wrote this for things like Elasticsearch, and making an API you can GET), you also need

Code to form a JSON representation of your record.
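To give a feel for the shuffling, here's a stripped-down sketch for one resource. Everything in it is hypothetical: the `person` record, the `people` table, and the assumption that your JSON library has already decoded the request body into a list of `{BinaryKey, Value}` pairs. It isn't Ghostlight's code, just the shape of it:

```erlang
-module(format_demo).
-export([from_decoded_json/1, to_proplist/1, to_sql_params/1]).

%% Hypothetical record; the field names are assumptions.
-record(person, {id, name, bio}).

%% Decoded JSON (assumed shape: [{<<"key">>, Value}]) -> record.
%% Missing keys come back as `undefined`.
from_decoded_json(Pairs) ->
    #person{id   = proplists:get_value(<<"id">>, Pairs),
            name = proplists:get_value(<<"name">>, Pairs),
            bio  = proplists:get_value(<<"bio">>, Pairs)}.

%% Record -> proplist, the format the HTML templates want.
to_proplist(#person{id = Id, name = Name, bio = Bio}) ->
    [{id, Id}, {name, Name}, {bio, Bio}].

%% Record -> statement text plus positional parameters for the database.
to_sql_params(#person{id = Id, name = Name, bio = Bio}) ->
    {"INSERT INTO people (id, name, bio) VALUES ($1, $2, $3)",
     [Id, Name, Bio]}.
```

That's three functions naming the same three fields, before validation, before the SELECT direction, and before any nesting. Multiply by four resources.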

While I have many engineering qualms against a lot of hot tech (Node is garbage, ORMs are poison), it's cases like these where a Node + JSON object store looks mighty attractive, since so much code is spent just shuffling the format.

Static Types

We have Dialyzer, but I can't help but feel if you're going for a Weird Language, you should go big and get one with even a passable static type system. Static type systems are like polio vaccines in that we have the solution to a giant class of suffering and just choose not to put it in widespread use.

Tooling

rebar3 is fabulous compared to what came before, but it's still a very far cry from a more modern package manager and/or builder, like npm, or even things like gradle or the go command-line tool. This isn't due to bad work on its maintainers' part (the system they've devised is fabulous given the constraints of the Erlang ecosystem), but the end result is still challenging for the user.

As mentioned before, bundling code in OTP Applications and releasing with Releases has a high learning curve. Those are solutions to real problems, but they can be daunting when you're just trying to get your shit up.

Bas-ackwards Abstraction Model

This is harder to say concretely, but there are many things that are done differently in Erlang than virtually everywhere else. While this is already true by virtue of being an immutable, functional programming language, there's plenty of weird stuff on top of that that bends your brain a bit, even if you're used to other FP. Things like:

Global namespaces: it's irresponsible to not prefix the application name to your module (leading to pretty long module names) because it's all in one namespace.
Building generalized components without direct parametrization: SML gives you functors. Haskell lets functions operate on typeclasses that you pass by parameter.

Erlang did have parametrized modules, but apparently its 11 oldest users hated them, and they were promptly removed, despite some notable Erlang apps using them. The proper way to have lots of modules share common structure is, like gen_server, to call a top-level module, and have it call your exported callbacks.

It's not impossible, but the syntax/structure of modules make it sound (and feel) a lot weirder than it is.
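For anyone who hasn't seen the callback style, here's about the smallest gen_server I can sketch (a counter; the module and its API are made up for the example). The generic module owns the process loop, and your module only exports the callbacks it calls back into:

```erlang
-module(counter_server).
-behaviour(gen_server).

%% Public API
-export([start_link/0, increment/1, value/1]).
%% Callbacks that the generic gen_server module invokes
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    %% gen_server runs the process loop and calls ?MODULE:init(0).
    gen_server:start_link(?MODULE, 0, []).

increment(Pid) -> gen_server:cast(Pid, increment).
value(Pid)     -> gen_server:call(Pid, value).

init(Initial) ->
    {ok, Initial}.

%% Synchronous request: reply with the count, keep the state unchanged.
handle_call(value, _From, Count) ->
    {reply, Count, Count}.

%% Asynchronous request: bump the count, no reply.
handle_cast(increment, Count) ->
    {noreply, Count + 1}.
```

So the "parameter" you'd hand to a functor elsewhere is here just the module name passed to `gen_server:start_link/3`.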

I could be misremembering because I can't find the original link (she's changed her site), but I credit author Karen Traviss with an old "advice for young writers" checklist she had on her site. It had some lovely, memorable lines like (paraphrasing) "writing really is like having homework for the rest of your life," and "the competition isn't really as fierce as you think it is: for every 100 wannabe novelists, only 10 will finish their novel, 5 will submit, and 2 of them will be disqualified for using 9pt Comic Sans on pink paper."

One that stuck with me while working side projects was "for every novel, you'll heavily consider trashing it at the 25% and 75% points. Fight through it." Ghostlight is there — I have a cute, if ugly, demo; but I hear little birdies in my head telling me I should scrap it and do it with Phoenix (Elixir is Erlang's future!), or just write an Express Node server like I've done twice professionally (Speed matters more than engineering — Node sucks terribly but I work fast in it, I'll tell you that).

I've killed those little birdies for saying that. Too many promising things chase Perfection and never hit that better thing, Completion. So I'll keep pushing on. But there's still so much work to do…

1: If minimizing data representations was really my priority, I could have implemented a simple prototype in Mnesia (allows Erlang terms) or a JSON store, like Postgres JSON columns or, God forbid, Mongo. If this was a VC-backed boondoggle and needed to ship yesterday and find product-market fit, I probably would have done one of these.

2: Use of the record_info option lets you ostensibly pass in records instead, but that's harder to configure in a program that's already tedious to configure, and you'll end up defining a bunch of nested, intermediary records anyways since your template will usually contain more than just the basics of a resource or list of them.
