swyx

Posted on Mar 12, 2021 • Edited on May 25, 2021 • Originally published at swyx.io

Preemptive Pluralization is (Probably) Not Evil

#reflections #programming #advice

See discussions on Twitter, Reddit, Lobste.rs, and my mixtape

Before you write any code — ask if you could ever possibly want multiple kinds of the thing you are coding. If yes, just do it. Now, not later.

A few common examples to illustrate:

You assume that one team has many users, and correspondingly, one user belongs to one team.
- Eventually, you find that a user may need multiple teams.
- This is actually fantastic for your business!
- But you are depressed because you now have to spend 2 months refactoring every line of code and every database schema that assumed one team per user 😱 More examples here.
You assume that loading state only has two states — true / false — so you make a boolean isLoading variable
- Then you realize you need to track error state, so you do, with isError. You do some work to make sure the 4 combinations of states behave intuitively. You write tests for each of them like the good developer you are. Of course.
- Then you realize you need an isCanceled state. 8 and counting...
- Eventually you realize that each new flag doubles the number of combinations, so every addition or modification takes longer than the last to account for each edge case. It's hard to even tell if you've covered them all. Most people don't.
- The solution is explicit state machines - but at this point you're too far in to justify a rewrite.
Internationalization. If you winced at that, you know the pain.
Pagination. To quote Simon Willison, co-creator of Django, "Refactoring an existing non-paginated API to support pagination will break everything. Better to fake pagination but only ever return a single page, just in case".
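To make the loading-state example concrete, here is a minimal sketch of an explicit state machine written as a TypeScript discriminated union. The names (`RequestState`, `RequestEvent`, `transition`) are hypothetical and not taken from any library; the point is only that adding a fifth state means adding one variant rather than doubling the boolean combinations.

```typescript
// Hypothetical names throughout; a sketch, not a library API.
// One variant per state, instead of isLoading / isError / isCanceled flags.
type RequestState<T> =
  | { status: "idle" }
  | { status: "loading" }
  | { status: "success"; data: T }
  | { status: "error"; message: string }
  | { status: "canceled" };

type RequestEvent<T> =
  | { type: "FETCH" }
  | { type: "RESOLVE"; data: T }
  | { type: "REJECT"; message: string }
  | { type: "CANCEL" };

// Every legal transition is listed once; any other event leaves the state unchanged.
function transition<T>(
  state: RequestState<T>,
  event: RequestEvent<T>
): RequestState<T> {
  switch (state.status) {
    case "idle":
    case "success":
    case "error":
    case "canceled":
      // From a resting state, the only thing you can do is start a fetch.
      return event.type === "FETCH" ? { status: "loading" } : state;
    case "loading":
      switch (event.type) {
        case "RESOLVE":
          return { status: "success", data: event.data };
        case "REJECT":
          return { status: "error", message: event.message };
        case "CANCEL":
          return { status: "canceled" };
        default:
          // A second FETCH while loading is ignored in this sketch.
          return state;
      }
  }
}
```

Impossible combinations such as "loading and errored at the same time" cannot be represented here, so there is nothing to test for them.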

You can listen to Ben Orenstein of Tuple discuss this on my mixtape:

I've done this refactoring a million times. I'll be like, I thought there would only ever be one subscription team, user plan, name, address, and it always ends up being like, "Oh, actually there's more." I almost never go the other way. What if you just paid the upfront cost of thinking "This is just always a collection"?

Donald Knuth is famous for noting that Premature Optimization is the Root of All Evil (there's some nuance to that, btw). I am very sympathetic to the appeal to simplicity — if "You Ain't Gonna Need It", then don't use it. But I think Preemptive Pluralization — projecting forward into hypothetical situations when you need N types of a thing — is exempt, even though you are literally optimizing for a future you don't currently live in.

It is a LOT easier to scale code from a cardinality of 2 to 3 than it is to refactor from a cardinality of 1 to 2. This is a fundamentally under-appreciated nonlinearity. In other words, Preemptive Pluralization can make the difference between "sure, I'll add that today" and "this is going to take us 2 months and we'll introduce merge conflicts with every other in-progress feature."
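A rough sketch of what this looks like for the team example from earlier, assuming hypothetical `User` and `Team` shapes: the user holds a collection of team IDs from day one, even though it only ever contains one entry at first.

```typescript
// Hypothetical shapes for illustration only.
interface Team {
  id: string;
  name: string;
}

// Pluralized from the start: a list of team IDs rather than a single teamId.
interface User {
  id: string;
  teamIds: string[];
}

// Call sites ask "is this team among the user's teams?" rather than
// comparing against a single field, so they survive the move to N teams.
function canAccess(user: User, teamId: string): boolean {
  return user.teamIds.some((id) => id === teamId);
}

function teamsFor(user: User, allTeams: Team[]): Team[] {
  return allTeams.filter((team) => canAccess(user, team.id));
}

// Going from 1 team to 2 is an append, not a schema change.
function joinTeam(user: User, teamId: string): User {
  if (canAccess(user, teamId)) return user;
  return { ...user, teamIds: [...user.teamIds, teamId] };
}
```

With the singular `teamId` design, `joinTeam` would not be expressible without first touching every reader of that field.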

Write Robust Code

Requirements volatility is a core problem of software engineering. As a software engineer, writing code that does what you ask of it today is the bare minimum. Your real skill comes in what happens next — what you do when requirements inevitably change, whether by new feature request or scaling issues arising from I/O or compute bounds.

It may not be enough to write code for what you foresee in the near term — those are just more requirements. Software design and architecture is all about making it easy to respond to unforeseen changes.

Hillel Wayne has proposed calling these requirement perturbations. If a small, typical feature request can throw your whole design out of whack, then you have fragile code. Clearly you want the opposite of fragile — I am tempted to call it "Antifragile" because that gets clicks — but really the best you can hope for is code that mostly doesn't fall apart due to 1-2 standard deviation changes in requirements. In other words: robust code. Robust code is optimized for change (more in a future blogpost).

The nonlinearity in how expensive it is to make a change comes from the "emergent sclerosis" of code. Code that is robust to future changes is far cheaper to write today than to retrofit later, when it has become technical debt that you must pay off before you can even start on a feature request. Fragile code is like the payday loan lender of technical debt.

Preemptive Pluralization creates Robust Code.

Is it Even Premature?

I'm so committed to not prematurely optimizing that I want to make a final pitch for why Preemptive Pluralization is not premature. Let's address two obvious criticisms of Preemptive Pluralization:

Increased code complexity: Functional languages and other abstractions can help make array or matrix operations almost as easy to work with as regular operations.
Slow performance from doing extra loops: Loops only cost significantly when you have lots of N. By definition, if you are pluralizing prematurely, N = 1.
Other concerns raised by readers: Perf bottlenecks from excessive joins, Code Communication
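Simon Willison's fake-pagination advice quoted earlier is a fair illustration of how small both costs can be. The sketch below uses hypothetical names (`Page`, `listProjects`, `fetchAll`) and a synchronous fetch for brevity: the server only ever returns one page today, and the client loop runs exactly once.

```typescript
// Hypothetical response envelope; real APIs vary in how they name these fields.
interface Page<T> {
  items: T[];
  nextCursor: string | null; // null means there are no more pages
}

// Today's "fake" pagination: everything fits in a single page,
// so the cursor is accepted but ignored.
function listProjects(all: string[], _cursor: string | null): Page<string> {
  return { items: all, nextCursor: null };
}

// The client is written for N pages from the start. With N = 1 the loop
// body runs once, which is the whole performance cost of pluralizing.
function fetchAll<T>(fetchPage: (cursor: string | null) => Page<T>): T[] {
  const collected: T[] = [];
  let cursor: string | null = null;
  do {
    const page: Page<T> = fetchPage(cursor);
    collected.push(...page.items);
    cursor = page.nextCursor;
  } while (cursor !== null);
  return collected;
}
```

If the server later starts returning real cursors, `fetchAll` and its callers do not need to change.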

Ultimately I think what makes something premature or not is your definition of what you need to write. If you view "code that works today" as your job, preemptive pluralization is premature. If you view "code that doesn't blow up in my face a year from now" as your job, then it is not.

Make Robust Code a design requirement from the start.

More Pluralization Points

More awkward things to pluralize:

Single tenant open source -> Multi tenant hosted service
Versions -> from no version to v1/v2, or going from "legacy"/"new" to "new new" (hence Stripe just uses dates)
Number of independently shipping frontends in your company (hence module federation)
Number of clouds in your company (you think you will avoid this... until you can't, per the Hashimoto lemma)

Related Preemptive Moves

Timestamps! You might as well turn your boolean into a timestamp: created_at, updated_at, deleted_at. But also modified_by, created_by, and more.
UUID? Version? - don't use the UUID as a PK.
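As a small sketch of the boolean-to-timestamp move, assuming a hypothetical `Post` record with camel-cased versions of the `deleted_at` and `modified_by` fields: the boolean can always be derived from the timestamp, but not the other way around.

```typescript
// Hypothetical record shape for illustration.
interface Post {
  id: string;
  // Instead of `isDeleted: boolean`. null means "not deleted".
  deletedAt: Date | null;
  // Who made the last change, if anyone.
  modifiedBy: string | null;
}

// Soft-delete records when and by whom, at no extra cost to the caller.
function softDelete(post: Post, userId: string, now: Date = new Date()): Post {
  return { ...post, deletedAt: now, modifiedBy: userId };
}

// The old boolean is still available as a derived value.
function isDeleted(post: Post): boolean {
  return post.deletedAt !== null;
}
```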

Acknowledgements

Thanks to Jon Wong for reviewing a draft of this post and contributing the boolean and internationalization examples
Endorsements & Horror stories from Simon Willison, Ben Orenstein, Andrew Culver, Andrew Ingram, Tanner Linsley, Ryan Murphy, David Welch and Khrome!
Daniel Buckmaster compares this idea to Sandi Metz's POODR.
Martin Grönlund compares this to data oriented design.
Jared Palmer and Daniel Yokomizo mention the Zero one infinity rule - though this article argues FOR preemptively favoring "infinity" over "one"

Top comments (5)

Daniel Buckmaster • Mar 13 '21

Robust code is optimized for change

Sandi Metz, in POODR, would call this "reasonable":

Code should be

Transparent The consequences of change should be obvious in the code that is changing and in distant code that relies upon it.

Reasonable The cost of any change should be proportional to the benefits the change achieves.

Usable Existing code should be usable in new and unexpected contexts.

Exemplary The code itself should encourage those who change it to perpetuate these qualities.

When code is reasonable, small changes are easy to make, even if big changes may be hard to make. It sounds like pre-emptive pluralization is a way to make code more reasonable!

Martin Grönlund • Mar 13 '21

There's even a whole discussion to be had here about arrays, locality of reference, cache eviction, false sharing, parallelization, etc. Summed up in short: data oriented design. Performance will improve vastly if you design for collections instead of objects. It's almost only stupid to think in lone objects these days.

agentofuser • Mar 14 '21

I've never used one of them for anything beyond hello worlds, but I like the idea of array programming languages, where variables are collections by default:

en.wikipedia.org/wiki/Array_progra...

Seems a design pattern worth taking to heart.