I got into a Twitter chat culminating in this tweet with Lee Robinson:
Lee Robinson (@leeerob):
@TomGranot @purpleorangeye4 I'm searching for a 500-1000 word document explaining tradeoffs at a high level for why you might choose:
- Vanilla database
- Database client / ORM
- Auto-generated API over a database
- Fullstack framework
It probably depends on what you want to build. That's the hard part.
12:46 PM - 10 Aug 2020
I love this suggestion for an article. I really do - his tweets suggest that he's truly in a bind about all the possibilities, which means (since he's a prominent developer) that a lot of other, more silent developers are too. He wrote his own version here, but I figured I'd roll my own as well.
Some context: up until recently I was a Site Reliability Engineer - an ops guy, tasked with making sure our entire stack works as it should, with all its different parts behaving nicely. This gives me some understanding of how different pieces fit together, and I think I could shed some light on the darker sides of the stack.
Lee's article is very practical and to the point. This article is a bit more "philosophical" in nature, and is aimed at people who want to get a "feel" for what the different options out there are like. That usually means more experienced developers, so if you're just starting out, or want very practical, to-the-point answers to your questions - go with Lee. Otherwise - strap in.
What's in a backend?
I get the feeling that when Lee talks about a backend he's talking about a "data machine" - one that knows how to do your regular CRUD activities, and lets you focus on your front-end logic instead of focusing on operational concerns.
The backend, from my perspective, is the cornerstone of two - very different - concerns:
- Running "correct" software - your backend is responding correctly to your requests
- Running "performant" software - your backend is capable of handling the traffic you throw at it without wasting too many resources, in a quick and cost-effective fashion
Generally speaking, this is also the order of importance - your software first and foremost has to do what it should, and only then do it as fast, and with as few operational concerns, as possible.
Following Lee's tweet, I am going to enumerate 4 different options, show some examples, and then discuss the tradeoffs.
I'm making 4 (valid, in my book) assumptions here:
- We're talking about websites, and not various system services or more low-level applications / Machine Learning / Data Science stuff. Those "other" types of software usually use a different type of front-end than the ones front-end devs are used to. Qt comes to mind for desktop apps, for example.
- We're intentionally disregarding the fact that multiple developers - and DevOps people and DBAs and sysadmins - need to work, maintain and run this software in production. We are talking about a single developer, working on a single application, on their own. The human facet of things plays so, so much into technology selection, and it's way too large a concept to dive into here.
- The "usual" flow of work for front-end devs is "call API, parse data, send to front". That means a lot of different backend APIs, all tailored towards a specific, "small" goal like setting a property of an object or getting information about a cross-section of objects.
- Most front-end devs use JavaScript and its myriad of frameworks to write their application logic.
Option 1 - Vanilla Database (Database Client)
This means that your backend is simply a database that you interface with directly. There are basically four variants of databases you can go with here:
- Key-value Stores - Redis, DynamoDB, etc.
- Relational Databases - MySQL, PostgreSQL, etc.
- NoSQL Databases - MongoDB, CouchDB, etc.
- Graph Databases - Don't, unless you specifically have a need for them (and then you'd probably know everything in this article already).
The choice of database changes the way you interact with it. Relational databases use SQL, NoSQL databases have a variety of data models and thus have a variety of ways of interacting with them, and key-value stores usually allow you to get and set key-value pairs.
The list above is actually ordered by the level of complexity each database system presents to you, in my opinion. Using a key-value store is more like dealing with localStorage, so it should be somewhat familiar to front-end devs. SQL / NoSQL are... trickier.
There's a misconception in the tweet, by the way - a database client and an ORM are two different things. A client is usually just a library that allows you to run commands on the database (read: write SQL queries), whereas an ORM is another layer of abstraction on top of the database (read: write JavaScript code). I'll deal with ORMs in Option 2.
Considerations
How complicated to deploy?
Relatively easy. Setting up the database is really easy, especially with database addons / plugins by the leading push-to-deploy tools like Netlify. The hard thing is choosing which database to use, maintaining the database, watching that it behaves, optimising it, creating a schema for it, etc. It's the "cleanest" way to go about storing data - no layers of abstraction between you and the database - but it's for people who want to deal with databases (like me!).
There is so much documentation about databases out there, it's insane. It's really easy to get confused. Choosing a database holds with it a very large set of considerations - most of which are completely irrelevant to the front-end developer.
I can abstract some of the mystery away by noting that the choice of which database to use mainly depends on where your code runs. Figure out where you want to deploy to, then google for "How to set up a database on X", where "X" is your platform of choice (Heroku, Netlify, etc). Most of the platforms have a huge amount of documentation already, since they want you to come aboard.
There's also the installation of the client library for that database, but that's usually an npm install away.
How much code do I have to write?
A large amount (SQL / NoSQL) or a medium amount (key-value stores). Note that there's no API here. That means that where you would do a fetch before, you'd now need to write an SQL query to get the data you want, dispatch it to the database using a client (most databases have JS clients implemented as open-source libraries), then parse the response into the form you want the data in. Same goes for updating data, just inversely (you have some data, then need to turn it into an SQL query to dispatch to the database). With data-heavy applications, that can mean hundreds (and often thousands) of different queries of varying length.
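To make that flow concrete, here is a minimal sketch of the three steps (write the query, dispatch it through a client, parse the response). The `items` table, its columns and the `fakeClient` object are all made up for illustration. The fake stands in for a real client library so the sketch runs without a database; its `query(text, values)` method resolving to an object with `rows` is loosely modelled on how clients such as node-postgres behave.

```javascript
// Hypothetical stand-in for a real database client, so the sketch runs
// without a database. It always returns the same two made-up rows.
const fakeClient = {
  async query(text, values) {
    return {
      rows: [
        { id: 1, owner_id: values[0], display_name: 'First item' },
        { id: 2, owner_id: values[0], display_name: 'Second item' },
      ],
    };
  },
};

// Step 1: write the query. Values go in as parameters ($1), never by
// string concatenation, so user input cannot change the SQL itself.
function buildItemsQuery(ownerId) {
  return {
    text: 'SELECT id, owner_id, display_name FROM items WHERE owner_id = $1',
    values: [ownerId],
  };
}

// Step 3: parse the raw rows into the shape the front-end expects.
function rowsToItems(rows) {
  return rows.map((row) => ({
    id: row.id,
    ownerId: row.owner_id,
    name: row.display_name,
  }));
}

// Step 2 ties it together: dispatch the query through the client.
async function getItemsByOwner(client, ownerId) {
  const { text, values } = buildItemsQuery(ownerId);
  const result = await client.query(text, values);
  return rowsToItems(result.rows);
}
```

Calling `getItemsByOwner(fakeClient, 7)` resolves to an array of plain objects with `id`, `ownerId` and `name`. Roughly speaking, every endpoint you used to fetch becomes a function like this one, which is where the "large amount of code" comes from.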
Working with key-value stores is a bit easier, since you're writing JSON-like data (and sometimes actual JSON) to the database. It still requires defining a general schema for your data, however, or you will quickly have a real mess on your hands.
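A minimal sketch of what that looks like, using an in-memory `Map` as a stand-in for a real key-value store such as Redis. The `user:<id>` key convention and the `saveUser` / `loadUser` helpers are assumptions for illustration, not part of any real client library.

```javascript
// In-memory stand-in for a key-value store; a real store would live on a server.
const store = new Map();

// A naming convention for keys is the "general schema" here:
// every user lives under "user:<id>".
function userKey(id) {
  return `user:${id}`;
}

function saveUser(user) {
  // Values are stored as JSON strings, much like with localStorage.
  store.set(userKey(user.id), JSON.stringify(user));
}

function loadUser(id) {
  const raw = store.get(userKey(id));
  // A missing key yields undefined from Map; normalise that to null.
  return raw === undefined ? null : JSON.parse(raw);
}

saveUser({ id: 42, name: 'Ada', tags: ['admin'] });
console.log(loadUser(42).name); // prints "Ada"
console.log(loadUser(99)); // prints null
```

Without a convention like `userKey`, different parts of the code tend to invent their own key formats, which is the "real mess" mentioned above.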
How complex will my code be?
Quite complex (SQL / NoSQL), or not very (key-value stores). I actually wanted to write that using SQL simplifies your code greatly - no extra APIs to learn - but that's assuming that SQL flows through your fingers. Most (good) backend devs I know speak fluent SQL, but from what I gather it's not something front-end tutorials and videos focus on. I'm doing my best to step out of my shoes and into the shoes of a front-end dev, so SQL fluency is not necessarily a common skill.
That means that any code that has complex SQL queries can be considered complex. Same goes for whatever data structure NoSQL databases use, with the added concern that they are often less well represented in online tutorials than their SQL counterparts. There's material out there, sure, it's just not as easy to stumble upon as the SQL stuff.
I have to note, however, that key-value stores are relatively straightforward if you're coming from JS, and aren't necessarily foreign-looking to most JavaScript devs, who are used to working with JSON and JavaScript objects.
Conclusion
I would opt for a database only if you really want to understand the lowermost abstraction in your stack that deals with persisting data. If that's not interesting for you, choose one of the other options.
Option 2 - an ORM (Object Relational Mapper)
An ORM is another level of abstraction between you and the database. It allows you to call "familiar" constructs (read: objects) to perform common activities, instead of relying on raw queries.
An example: you want to create a new item, that has a few values for the properties that define it. With an ORM, you would do so by calling the relevant ORM API for items:
Item.create({ property1: 'value1', property2: 'value2', property3: 'value3' })
With a raw SQL query, you would do it like so:
INSERT INTO items (property1, property2, property3) VALUES ('value1', 'value2', 'value3');
This saves you a lot of SQL work, but is actually not the same as using a "normal" API endpoint. It's just a more comfortable wrapper around SQL queries, that is not custom-tailored to a specific need.
In other words, you still work with tables - they're just exposed to you as JavaScript objects. There are much more sophisticated ORMs that read your database schema and do all sorts of magic with it, but at their core - ORMs are just wrappers around tables. They spare you from dropping down to raw SQL.
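To see why an ORM is "just a wrapper around a table", here is a toy sketch. The `defineModel` helper is hypothetical and is not the API of Sequelize or any other real ORM; it only shows how a call like `Item.create({...})` can be translated into a parameterised INSERT statement.

```javascript
// Toy model factory: maps a JavaScript object onto an INSERT for one table.
function defineModel(tableName, columns) {
  return {
    create(props) {
      // Reject properties the table does not have.
      for (const key of Object.keys(props)) {
        if (!columns.includes(key)) {
          throw new Error(`Unknown column "${key}" for table "${tableName}"`);
        }
      }
      // Only include the columns that were actually supplied.
      const used = columns.filter((column) => column in props);
      const placeholders = used.map((_, index) => `$${index + 1}`);
      return {
        sql: `INSERT INTO ${tableName} (${used.join(', ')}) VALUES (${placeholders.join(', ')})`,
        values: used.map((column) => props[column]),
      };
    },
  };
}

const Item = defineModel('items', ['property1', 'property2', 'property3']);
const statement = Item.create({ property1: 'value1', property2: 'value2', property3: 'value3' });

console.log(statement.sql);
// INSERT INTO items (property1, property2, property3) VALUES ($1, $2, $3)
console.log(statement.values);
```

A real ORM would also send the statement to the database and hand back the created row; this sketch stops at generating the SQL, which is the part the ORM hides from you.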
In Option 3 I talk about a different approach to the same idea.
Considerations
How complicated to deploy?
Relatively easy. ORMs still require you to deploy a database, and then install an ORM library for your framework of choice or vanilla JS (Sequelize is an example of a JavaScript ORM). That's not that different from deploying a raw database.
How much code do I have to write?
A large amount (models + accessing the ORM). Since your ORM doesn't actually know how you want your data to be structured, you need to define Models in your code. Sequelize's docs make a great intro for understanding what this means in practice, but for the sake of discussion you can think about it as creating "virtual" tables.
This means that you're still doing basically the same thing you were doing with raw SQL queries - but instead of defining the tables in the database and then querying them from your code, you're defining your models in your code and the ORM creates the tables for you. This can take quite a lot of code if you have a lot of tables.
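The "define models in code, let the ORM create the tables" idea can be sketched roughly like this. The model format and the `createTableSql` function are invented for illustration and are far simpler than what a real ORM such as Sequelize does.

```javascript
// A hypothetical model definition: column names mapped to SQL types.
const itemModel = {
  table: 'items',
  columns: {
    id: 'INTEGER PRIMARY KEY',
    name: 'TEXT NOT NULL',
    price: 'NUMERIC',
  },
};

// Turn the model into the CREATE TABLE statement an ORM would run for you.
function createTableSql(model) {
  const columnLines = Object.entries(model.columns)
    .map(([name, type]) => `  ${name} ${type}`)
    .join(',\n');
  return `CREATE TABLE IF NOT EXISTS ${model.table} (\n${columnLines}\n);`;
}

console.log(createTableSql(itemModel));
```

One model like `itemModel` per table is where the extra code comes from: the schema has moved out of the database and into your JavaScript.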
The rest is interacting with those tables via the ORM - which is usually around the same amount of code as using raw SQL queries.
How complex will my code be?
Not very. Your code will be entirely JavaScript - no SQL. This provides for a much more native experience. The only "new" thing will be the ORM library's code, which is usually straightforward (Tablename.CRUDAction({propertiesObject})).
Conclusion
This choice is still somewhat verbose, and is basically one step up from interacting with the database directly. Option 3 details a path that offers a somewhat different way of thinking and resembles your current way of working, with REST-style APIs, more closely.
Option 3 - Auto-generated API over a database
This option is somewhat tricky to explain, because there are a few technologies that are all considered some variant of "API auto-generation", but are in fact very different things. These include software that turns a database into an API (like Hasura), and databases that come with an auto-generated API out of the box (like CouchDB).
These are more like "traditional" backend APIs, in the sense that they abstract away the need to deal with the database at all - and instead just give you an API you can fetch to and from. This means that you get all the information in the format you're used to - JSON - and there're no parts in the middle.
Note that this does not mean you are exempt from modelling the data in your database. The auto-generated API still relies on you telling it how the information you want to use is modelled. The nice part, though, is that once you model your data you don't really need to touch it anymore. Everything else is done via familiar APIs.
One comment - there's a technology called GraphQL that allows you to query APIs just like you would query a database, i.e. using a query language. This means you can use a single GraphQL call to the queryroot (a GraphQL system's main API endpoint) instead of mixing-and-matching different, multiple API queries.
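As a rough sketch of what a single call looks like: a GraphQL request is typically an HTTP POST whose JSON body carries a `query` string and optional `variables`. The endpoint URL, the `items` and `owner` fields, and the `buildGraphQLRequest` helper below are assumptions for illustration; the actual field names depend entirely on your own schema.

```javascript
// One query asks for items AND each item's owner, which with REST-style
// endpoints would usually take two or more round trips.
const ITEMS_WITH_OWNER = `
  query ItemsWithOwner($limit: Int!) {
    items(limit: $limit) {
      id
      name
      owner {
        name
      }
    }
  }
`;

// Build the options object you would pass to fetch(url, options).
function buildGraphQLRequest(query, variables = {}) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ query, variables }),
  };
}

const options = buildGraphQLRequest(ITEMS_WITH_OWNER, { limit: 10 });

// Hypothetical endpoint; uncomment to send the request for real:
// fetch('https://example.com/v1/graphql', options)
//   .then((response) => response.json())
//   .then(({ data }) => console.log(data.items));
console.log(JSON.parse(options.body).variables); // prints { limit: 10 }
```

The shape of the response mirrors the shape of the query, so the "parse data" step from the usual front-end flow mostly disappears.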
Hasura creates a GraphQL API over a database, while CouchDB only allows you to access the database via an API. It's a tricky differentiation to make, but I would say those are two completely different worlds, and one should not confuse the two. What I'm referring to in this article is Hasura-like services, not CouchDB-like ones.
Considerations
How complicated to deploy?
Really easy. Especially with Hasura and Hasura Cloud, getting up and running is very fast. The service is there, you model your data and you're good to go.
How much code do I have to write?
Probably less than you would have before. An auto-generated API is basically not a change at all from the way you used to work. You call an API exactly like you used to. The only difference is that the source of the API is not some backend code crafted by a developer, but an automated API over your database.
Especially with GraphQL, you're looking at shaving off a lot of different API calls, which will result in you writing less code. You will have to, however, define your models in your database / Hasura Cloud console, which - as you can probably see by now - is part of the cost of playing.
One comment though: since you're working with a model of the database, expect that building your logic might sometimes be more verbose than what you would have with dedicated backend API endpoints. This really depends on what you are trying to create, and deserves an entirely different discussion. Creating data models is truly an art form, and part of the reason why hardcore programmers are so, so much more efficient than their peers - they're using the correct model for their problem.
How complex will my code be?
Generally simple. An auto-generated API is, in many ways, a front-ender's dream come true - an almost full abstraction of the backend. There's no SQL to write and the flow of work is similar to what you're used to - there's an API, right there in front of you, for the taking.
If you modelled your data correctly before, then the same logic you used previously will probably work here as well. If you're migrating, though, it's probably a good idea to re-think the model and see whether you can simplify it to reduce the number of API calls you're making.
If your old APIs were very complicated and specific, you might find that this new model allows for much more expressiveness with significantly less code. I dislike generalisations and catchphrases, but these services are a gold mine for most applications.
GraphQL itself is, however, somewhat foreign even for SQL veterans. It has a small learning curve, but there is legitimately amazing material out there - like this - that will take you all the way with your existing set of tools and frameworks.
Conclusion
If you're attempting to abstract away the backend, go with GraphQL over a database, like Hasura.
Option 4 - Full Stack Framework
A full-stack JavaScript framework - like Redwood - combines all you need to get a fully-functional web-app without the hassles of the separation of concerns - namely a backend and a frontend. It's a different type of philosophy, aiming to create a "unified" experience for you as a developer.
In practice, a full-stack framework is usually a combination of one of the ideas I mentioned before with the other, "normal" front-end parts of the application. Redwood is based around Prisma, which is a database toolkit (but you can think of it, for the sake of simplicity, as a type of very advanced and easy to use ORM), and uses GraphQL and React under the hood. The beauty of wrapping all of the relevant tools needed for an application in one bundle comes from the ability to stay in the same "state of mind" all of the way - everything is JavaScript, everything is available from the same "vendor" (i.e. your framework) and generally speaking you can "do it all" on your own.
If I had to guess, I'd say that this is where the web is going - a consolidated JS experience for developers, operations people and anyone in between.
Considerations
How complicated to deploy?
Relatively easy. Everything is available out of the box, which means that deploying the framework is as easy as finding a place to host it. Pretty much the same as the other options, albeit with all the documentation and concepts under the same roof - the framework's docs.
How much code do I have to write?
Probably less than you would have before. Since you are modelling your own data under the hood, you still need to define how it's going to be built. So writing full-stack code consists of defining what your data looks like and then using those definitions to write actual application logic. Pretty similar to the amount of code you would have written in Option 3.
How complex will my code be?
Generally simple. Again, it's all JavaScript - but you have to get familiar with the framework's syntax, which might scare away some people afraid of being "boxed" into the framework. Fear not - Redwood, for example, utilises well-known open-source projects in the mix, so knowledge you gain by using the platform can generally be transferred later to other, adjacent worlds.
Conclusion
Full-stack frameworks are not yet popular enough to be considered the "de facto standard" of the future, but it sure does feel like they're getting there. I would suggest going first with something a bit more established like an auto-generated API (Hasura) and then making your way to a full-stack framework if it becomes too much to handle.
Wrapping it all up
We've gone on quite a journey here.
I'd like to sign off with a personal message. I'm a systems guy - I like dealing with the nitty gritty, with trying different deployment options, with looking at why my memory is running out, with tearing down complicated infrastructure and building it all up again. That means I'm a generalist, rather than a specialist.
That doesn't mean, though, that you have to be one too. There's a whole world of content on both ends of the spectrum. Learn about what interests you most, go deep instead of wide if you want to, and mostly just enjoy the run. There are enough people working on the foundations for your dream project right now - you don't have to build (or even understand) everything yourself.
It does, however, mean you need to share your knowledge, so that other - oppositely-inclined - people can benefit the same way you have. Spend time writing detailed GitHub issues, and blog posts, and tutorial videos.
We're all in this together.
Questions? Comments? Hit me up in a private message or leave a comment here.