Developers talk about abstractions all the time. For example:
The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.
-- Edsger Dijkstra
or
Prefer duplication over the wrong abstraction.
-- Sandi Metz
or
How do we create abstractions that hide implementation details of a particular UI part?
-- Dan Abramov
Dan also mentioned an "abstraction ladder" in one of his tweets, and I used "abstraction tower" in my previous post. What do developers mean when they say "abstraction"?
What is an abstraction?
Abstraction, the cognitive process of isolating, or “abstracting,” a common feature or relationship observed in a number of things, or the product of such a process.
What is abstracted—i.e., the abstraction or abstractum—is sometimes taken to be a concept (or “abstract idea”) rather than a property or relation.-- Britannica
I also checked Oxford Dictionaries, Merriam-Webster, and SEP, but they weren't particularly helpful.
A shorter definition would be: abstraction is an artifact of the process of removing details. Note that this doesn't specify which details to remove, or why and how we remove them. It is quite a wide definition, so when programmers use the word "abstraction" it can mean a wide range of things. Let's see which.
Abstraction as a model
When we need to represent a real (or imaginary) object in an application, we usually choose only some of its properties (details) to pay attention to. This attempt to represent something is called modeling. We pay attention only to the chosen details and throw away the rest: we are abstracting.
Examples: modeling entities as tables in relational DB, modeling entities as classes in OOP, modeling entities as types in advanced type systems.
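To make the modeling idea concrete, here is a minimal sketch of modeling an entity as a class. The Customer class and its fields are hypothetical and chosen only for illustration; the point is that the model keeps the few details the application cares about and drops everything else.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    """A model of a person that keeps only what a billing system needs."""

    customer_id: int
    name: str
    email: str
    # Deliberately left out: height, favourite colour, shoe size, ...
    # Those details exist in reality but are irrelevant to this model.


alice = Customer(customer_id=1, name="Alice", email="alice@example.com")
print(alice)
```

A different application (say, a shoe shop) would probably pick a different set of details for the same person, and that would be a different model.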
Abstraction as implementation details hiding
When we construct an API, interface, or contract, we hide how it is implemented, so we can change the implementation without changing the API. We hide implementation details, i.e. remove them from sight: we are abstracting.
Examples: high-level languages, which abstract away the details of how a program is executed at the hardware level; a relational database, which exposes its API in the form of SQL, so we don't need to know what algorithms and data structures it uses under the hood to work with it.
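The same idea at a small scale, as a rough sketch: the KeyValueStore contract, the two implementations, and the greet function below are all made-up names for illustration. The caller depends only on the contract, so the implementation can be swapped without the caller noticing.

```python
from typing import Optional, Protocol


class KeyValueStore(Protocol):
    """The contract callers depend on; it says nothing about storage."""

    def put(self, key: str, value: str) -> None: ...

    def get(self, key: str) -> Optional[str]: ...


class DictStore:
    """One implementation: backed by a hash table."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


class ListStore:
    """Another implementation: backed by a list of (key, value) pairs."""

    def __init__(self) -> None:
        self._pairs = []

    def put(self, key: str, value: str) -> None:
        # Drop any existing pair with the same key, then append the new one.
        self._pairs = [(k, v) for k, v in self._pairs if k != key]
        self._pairs.append((key, value))

    def get(self, key: str) -> Optional[str]:
        for k, v in self._pairs:
            if k == key:
                return v
        return None


def greet(store: KeyValueStore) -> str:
    """Caller code: it only knows the contract, not the storage details."""
    store.put("name", "Ada")
    return f"Hello, {store.get('name')}!"


print(greet(DictStore()))
print(greet(ListStore()))
```

Both calls behave the same from the caller's point of view, even though the data structures underneath are different.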
Sometimes an abstraction hides other abstractions, so we get layers of abstractions, also known as the tower of abstraction or the ladder of abstraction.
If we still need to understand how it works under the hood, it is called a "leaky abstraction". For example, indexes in relational databases are typically implemented as B-trees, which is why a query with LIKE abc% can use the index and will be fast, but one with LIKE %abc will be slow.
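Here is a rough analogy for why the prefix query is cheap and the suffix query is not. It is not how any particular database is implemented: a sorted Python list stands in for the ordered index, and prefix_search and suffix_search are hypothetical helpers written for this sketch.

```python
import bisect


def prefix_search(sorted_words, prefix):
    """Like LIKE abc%: binary-search to the first candidate, then read
    neighbours until the prefix stops matching."""
    start = bisect.bisect_left(sorted_words, prefix)
    matches = []
    for i in range(start, len(sorted_words)):
        if not sorted_words[i].startswith(prefix):
            break  # sorted order guarantees there are no further matches
        matches.append(sorted_words[i])
    return matches


def suffix_search(sorted_words, suffix):
    """Like LIKE %abc: the ordering does not help, so every entry is checked."""
    return [word for word in sorted_words if word.endswith(suffix)]


words = sorted(["abc", "abcd", "abd", "xabc", "zzabc", "b"])
print(prefix_search(words, "abc"))  # ['abc', 'abcd']
print(suffix_search(words, "abc"))  # ['abc', 'xabc', 'zzabc']
```

The ordering groups entries by how they start, not by how they end. That is exactly the kind of implementation detail that leaks through the SQL abstraction.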
Abstraction as a generalization
We can pay attention only to some detail common to a group of things, so that we can put the group into one category and reason about the group as a whole. We remove the details that are not shared: we are abstracting.
As you can see, the choice of which details to pay attention to changes which group we get. In this sense, abstraction is a point of view.
Examples: a base class in an OOP (inheritance) hierarchy.
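A small sketch of the base-class example. The Shape, Rectangle, and Circle classes are hypothetical; the only detail the group shares here is "has an area", and that is enough to reason about all of them at once.

```python
import math
from abc import ABC, abstractmethod


class Shape(ABC):
    """The generalization: the only detail we keep is that a shape has an area."""

    @abstractmethod
    def area(self) -> float: ...


class Rectangle(Shape):
    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def area(self) -> float:
        return self.width * self.height


class Circle(Shape):
    def __init__(self, radius: float) -> None:
        self.radius = radius

    def area(self) -> float:
        return math.pi * self.radius ** 2


def total_area(shapes) -> float:
    """Works for the whole group without knowing any shape-specific details."""
    return sum(shape.area() for shape in shapes)


print(total_area([Rectangle(2, 3), Circle(1)]))
```

If we had picked a different common detail (for example "has corners"), Circle would fall out of the group, which is the "point of view" aspect mentioned above.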
Issue with a broad definition
As we can see, a lot of things can be considered an abstraction in software development, and this can be confusing (I think). Do we need to distinguish between different types of abstraction? Are they really similar, or do they just seem similar? I don't know.
Wrong abstraction
Sometimes we hear a phrase like "the wrong abstraction". What makes an abstraction wrong (or bad)? An abstraction is wrong when we choose to focus on the wrong details. For example, when two pieces of code are syntactically similar (i.e. they look alike) but don't necessarily serve the same purpose, introducing an abstraction can be a big mistake (this is what Sandi Metz talks about).
When we say "focus on the wrong details", we need to understand that "wrong" depends on the context. An abstraction can become wrong over time as the context changes: for example, new requirements are introduced and our old abstraction no longer fits the new context.
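A made-up illustration of how this can play out. All function names and rules below are hypothetical: two checks look identical today, so it is tempting to merge them, but once their requirements diverge the shared helper starts collecting flags that only one caller ever uses.

```python
# Two rules that happen to look identical today, but exist for different reasons.
def is_valid_username(value: str) -> bool:
    return 3 <= len(value) <= 20


def is_valid_coupon_code(value: str) -> bool:
    return 3 <= len(value) <= 20


# The tempting "abstraction": merge them because the code is the same.
# Later, usernames must not contain spaces and coupon codes must be upper case,
# so the shared helper grows one flag per caller-specific requirement.
def is_valid_text(
    value: str,
    *,
    forbid_spaces: bool = False,
    require_uppercase: bool = False,
) -> bool:
    if not 3 <= len(value) <= 20:
        return False
    if forbid_spaces and " " in value:
        return False
    if require_uppercase and not value.isupper():
        return False
    return True


print(is_valid_text("ada lovelace", forbid_spaces=True))  # username rule
print(is_valid_text("SAVE10", require_uppercase=True))    # coupon rule
```

The syntactic similarity was real, but the two rules were never the same concept, so every new requirement makes the merged helper a little harder to follow.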
Abstract thinking
Abstraction goes beyond software development. It is a cognitive process. Abstract thinking has downsides, just as categorical thinking does, and we need to be aware of them.
Different things may seem similar
When we abstract, we can choose a set of details that makes different things look similar. For example, a Ferris wheel and a car: both have wheels and you can ride both, but only one can be used to get from one place to another.
Similar things may seem different
When we abstract, we can choose a set of details that makes similar things look different. For example, whales are more closely related to hippos than to sharks.
Different people can choose different abstractions
Abstraction is like a coordinate system: it can be chosen arbitrarily and, as a consequence, it can be a source of confusion.
Further reading
- "What Do You Mean?" by Kevlin Henney
- "On Abstraction" by Zach Tellman
- "On the Nature of Abstraction" by Ron Pressler
- "The abstraction paths: from experience to concept"
Top comments (17)
Here's my simple definition of "abstraction":
Abstraction is reuse.
Reuse is abstraction.
OK, they're equivalent. But what does "reuse" mean?
Reuse means re...use.
So the process of abstraction means the process of creating something reusable!
I assume you are talking about reuse through the introduction of new entities rather than copy-paste, right? But the process of introducing a new thing is more about symbolic thinking (the new thing is a symbol). It can be a process of abstraction (removal of details by hiding implementation) or not. If we call all cases of reuse an abstraction, it can get even more confusing.
Even if you successfully remove all details / hide implementation, it's not an abstraction if it's un-reusable (per my definition)
Well, people do not always build abstractions for reuse; sometimes they do it for simplification. For example, people introduce a lot of small functions with concise names (clearly communicating the purpose of each) and then construct the final thing out of them. They may use each of those functions only once (i.e. no reuse). This can still be referred to as an abstraction.
I got you, but I just want to be more precise about terminology here.
Mine is more of a goal-oriented definition.
In your case, I would call it just refactoring, or simplification.
The abstraction I mean is this: in real-world projects, we want to avoid copy-pasting (DRY), and that's why we "abstract" those repetitive pieces into a single piece.
Or, put more clearly: if you find yourself duplicating code in two places that are almost the same, "abstract" it!
I'm thinking out loud.
On the other hand, it can be interpreted the following way: building software is building understanding, and software is precise and executable knowledge. In this case, abstraction is the "classical" cognitive trick for managing complexity and building knowledge. Ultimately, nobody builds software for the sake of the code; people build it to achieve something, and software doesn't work in a vacuum: it also expects some knowledge from its users. In this case, repetition of pieces of code is not a big deal, but recognizing the same concepts (abstracts) in different places is important.
I just want to remind you that our (developers') pursuit of DRYness often bites us, and building the DRYest code ever is not the ultimate purpose. The ultimate purpose is to make it work, make it right, make it fast. This is what Sandi Metz talks about:
The fact that you made the wrong "abstraction" doesn't invalidate my definition of abstraction. ;)
For clarity: I didn't try to invalidate your definition. It was more thinking out loud.
Speaking of terminology...
In my world that's called "refactoring". You can stay in the same abstraction layer while refactoring, e.g. you build a simulation like The Sims and have objects like Cars as well as Trucks, each with its own constructor. When refactoring, you find duplicate code in these constructors and decide to apply the DRY principle. This might result in a VehicleFactory, which is a generalisation, but not an abstraction. You're still in the object/terminology world of your Sims, and you might also find a NatureFactory that will create Trees and Flowers in your Sims' world. All these things are in the same abstraction layer. Going one layer up the ladder, you might find your VehicleFactory and NatureFactory being called in initialiseGame. You know that you're in a different abstraction layer when you've got a different vocabulary. The Sims know what a Tree is and what a Car is, because these are the objects in their world. But a Sim wouldn't understand initialiseGame, because that's the vocabulary of the player.
Hmm. We might be speaking of an “abstraction” as either a concept or a component of a software system. In each case, reuse is a useful property of an abstraction and a good reason to keep it. However, I might create an abstraction to be used only once and still be justified in that, as it might allow me to better organize my thoughts. I could also likely reuse things that are not abstractions, such as a doorknob. (Though one could argue that my using the word doorknob to describe a meaningful group of atoms is nothing if not an abstraction... ?)
Therefore, in my view, reuse is a benefit of abstraction but not its definition.
So you consider generalization and abstraction as different things? How would you define both?
I go along with Dijkstra ("to create a new semantic level in which one can be absolutely precise"). IMHO re-usability is just a (good) side-effect of abstractions.
Do you agree that the uppermost abstraction layer should represent (or use the language of) the business domain? I think this is a good approach, which comes easy when designing top-down.
I don't see any trolling here.
Abstraction in art is essentially the same process as I described for CS. It is about removing details.
Image from "Understanding Comics: The Invisible Art" by Scott McCloud.
Abstract art by European modernists like Josef Albers and Piet Mondrian was insanely mathy and rule based. Basically an exercise in seeing how far you could divide and routinely paint a grid. I wouldn't want to be them.
Abstract expressionism like Jackson Pollock was all about expressing what he thought was freedom, chance and randomness.
Both are dubbed "abstract" but operate on different implementations lol.
Living artist with no painting skills just has to pitch in. Comment irrelevant but tangentially relevant, on some abstract plane.
Say that we ignore certain aspects of the hyponyms, such as all the cat-specific and dog-specific qualities, when creating the hyperonym. If there were no such qualities to ignore, we could just say we had two dogs.
So the idea that some detail is removed is intrinsic to the process of abstracting when using Aleksei’s idea of what abstraction is.
Abstraction is an artifact of the process of details removal: yes.
Abstraction is also an artifact of the process of pattern recognition, or the surfacing of meaningful groupings.
One defines it in terms of what it removes and the other in terms of what it surfaces.
What is the artifact, by the way? Just a useful name for something.
It seems similar to how neuronal clustering fuses groupings of information together into neighborhoods, highways, maps, etc.
Thank you for writing about this stuff stereobooster! I’m glad you’re out here focusing on the abstract stuff.
Thank you for this overview of computational (?) abstraction-thinking. I'm halfway through and will return to add my thoughts. I enjoyed your tracing of the etymology. 🤔 (thinking abstract thoughts.)