Jonathan Gamble

Posted on Mar 11, 2022

Firestore is Stunted, So... What is the Perfect Database in 2022?

#database #firebase #graphql #webdev

Firestore

To be fair I meant only Firestore really, not Firebase the Platform. Ok, Firestore is not dead. It is quite popular. But it should be dead. It should be dead for the reasons I listed in my post almost a year ago. The team has spent the last 14 months building every single SDK, plugin, and monitoring addon for Firebase possible. This includes the controversial Firebase 9 interface (although I do like the word "faster").

But you know what hasn't been updated... not even a little?

FIRESTORE!!!!

We still can't do any kind of basic aggregation without writing an error-prone Firebase function that doesn't take into account our current data.
We still can't do a basic full-text search. Google... 🤨
MOST IMPORTANTLY, we still can't count our search results. Sure, we can create automated collection counts by incrementing a random table value (yuck!). Try indexing every possible where clause in just one collection.
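To make the counter complaint concrete, here is a minimal sketch of what "a counter per where clause" turns into. It is plain TypeScript with an in-memory `Map` standing in for counter documents; it does not use the Firestore SDK, and the `Post` shape, `counterKeys`, and `addPost` are hypothetical names made up for illustration.

```typescript
// Hypothetical post shape; the three fields are assumptions for this sketch.
type Post = { author: string; tag: string; published: boolean };

// Stand-in for counter documents: one entry per filter we want to count.
const counters = new Map<string, number>();

function bump(key: string, delta: number): void {
  counters.set(key, (counters.get(key) ?? 0) + delta);
}

// Every combination of equality filters needs its own counter.
// Three filterable fields already mean 2^3 = 8 counters touched per write.
function counterKeys(p: Post): string[] {
  return [
    "all",
    `author=${p.author}`,
    `tag=${p.tag}`,
    `published=${p.published}`,
    `author=${p.author}&tag=${p.tag}`,
    `author=${p.author}&published=${p.published}`,
    `tag=${p.tag}&published=${p.published}`,
    `author=${p.author}&tag=${p.tag}&published=${p.published}`,
  ];
}

// Each write has to update every counter the new post matches.
function addPost(p: Post): void {
  for (const key of counterKeys(p)) bump(key, 1);
}

addPost({ author: "ann", tag: "db", published: true });
addPost({ author: "ann", tag: "js", published: false });
addPost({ author: "bob", tag: "db", published: true });

console.log(counters.get("tag=db&published=true")); // 2
console.log(counters.size); // 18 counters after only three posts
```

Range filters, ordering, and deletes are not even covered here, which is the point: the bookkeeping grows with every field you want to filter on.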

It can't be done Marty!

I have created a backend solution and a frontend solution for most use cases. I spent hours writing a backend package, but these days I prefer the frontend solutions.

Why I love Firestore

Firestore is so freakin easy to develop with. I love that it is cloud hosted. I don't love the vendor lock-in. I love that it has the easiest to use client side API for JavaScript development. I love that I can secure these things with Firestore Rules. I love how easily I can just add the subscribe function to get real time data. I love how Firestore can auto-scale for a few, or many users. I don't love having to manage a second database, and the lack of growth.

The Perfect Database

I want the perfect database. It has these features:

Scalable
Relational (SQL or Graph)
Realtime Data Support
A JS/TS Client Side API (or GraphQL)
Middleware Security
Cloud Hosted DbaaS Option
No Vendor Lock-In
ACL with Login Methods

I have been researching the perfect database, and it does not quite exist... yet. However, there are some options that come close, and some future prospects to watch out for.

noSQL

noSQL databases in general are great at scaling. You know what they're not good at? Everything else... Firestore recommends you use Algolia with it in order to add search functionality. Besides the obvious problem with a Google product requiring you to use a non-Google product to function correctly, you would be storing your data in one noSQL database and have to manage a second noSQL database just for searching. Huh? This makes no sense, but it is the world we live in.

MongoDB simulates joins by aggregating data. However, you can't actually do joins. Firestore just says, hey, write your own damn code that is probably going to be problematic to handle aggregations. Oh, and, there are limits on that too! So, I can't just duplicate 1,000,000 user profiles on posts, as the function would time out. Yeah, you got to think outside of the box to hack a way of doing this with even more joins and complications. Oh, did I mention the follower feed problem? If you've been following my posts for a while, you know my many theories about how this CAN'T really be done outside of theory... at least not scalable in both users and followers collections. I have 10 posts on the subject. Here is the latest. Now, try aggregating data, keeping it up-to-date, and adding functionality later to existing data that grows with Firestore or MongoDB. Try this with plain-ol-redis. 😂

All data is relational. You simply cannot create a realistic scalable database solution with ONLY a noSQL database... unless...

a graph database is built on top of that noSQL database.

noSQL databases need to die as main databases. MongoDB, I do give you mucho credit for trying... key word "try" here.

Hello my lovely and remarkable... Graph Database 🥰... with one problem...

There is a huge misconception about Graph Databases. I have literally talked with a PhD computer scientist who works at Twitter who didn't understand why anyone who is not querying analytical data would use a Graph Database.

I got news for you buddy. All data is relational! Graphs relate data better than SQL. Graph Databases are better than SQL.

It used to be true that Graph Databases were not scalable. That is no longer true. It used to be true that Graph Databases were slower than SQL. That is also no longer true.

Every database can be awesome, it just depends on what is underneath. A great database is scalable by distributing the data, uses sharding, maybe multi-tenancy, and could even be deployed to grow... all automatically. We also have all needed features without having to spin up a second database. This is usually the problem.

Graph Databases are evolving. There are really two classic types: triple stores and property graphs. Triple stores usually store RDF (that subject-predicate-object thing queried with SPARQL), and property graphs have edges and nodes. Despite what some articles say, they both can have index-free adjacency. This is the real key to the future of a graph database.

Memory Location

Imagine a foreign key in a many-to-many SQL table. We often have to have a junction table (aka pivot table) when doing joins. We join one table to another, just to get the value of the real table we want. We need two joins to relate only two pieces of data. This is the real reason Graph Databases are amazing. We store the direct place in memory to go to, and we go there. We don't make a stop in between. Relational data is actually relational in a Graph Database.
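Here is a rough sketch of that difference in plain TypeScript. It is an analogy, not how any real database engine stores data: arrays play the role of tables, and object references play the role of stored memory locations. All type and function names are hypothetical.

```typescript
// SQL-style: two tables plus a junction table of id pairs.
type Course = { id: number; title: string };
type Enrollment = { studentId: number; courseId: number };

const courses: Course[] = [
  { id: 10, title: "Databases" },
  { id: 11, title: "Graphs" },
];
const enrollments: Enrollment[] = [
  { studentId: 1, courseId: 10 },
  { studentId: 1, courseId: 11 },
];

// Two lookups: student id -> junction rows -> course rows.
function coursesViaJunction(studentId: number): string[] {
  return enrollments
    .filter((e) => e.studentId === studentId)
    .map((e) => courses.find((c) => c.id === e.courseId))
    .filter((c): c is Course => c !== undefined)
    .map((c) => c.title);
}

// Graph-style: the node holds direct references to its neighbors.
type CourseNode = { title: string };
type StudentNode = { name: string; enrolledIn: CourseNode[] };

const databases: CourseNode = { title: "Databases" };
const graphs: CourseNode = { title: "Graphs" };
const ann: StudentNode = { name: "Ann", enrolledIn: [databases, graphs] };

// One hop: follow the stored references, no stop in between.
function coursesViaEdges(student: StudentNode): string[] {
  return student.enrolledIn.map((c) => c.title);
}

console.log(coursesViaJunction(1)); // [ 'Databases', 'Graphs' ]
console.log(coursesViaEdges(ann)); // [ 'Databases', 'Graphs' ]
```

Both functions return the same answer; the second one simply never touches an intermediate table to get there.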

n + 1

There is also the n+1 problem in SQL. If I query junction tables, I am querying n + one more than I need. SELECT and WHERE statements also require you to query more than you need. It is basically over and under-fetching.
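For reference, the usual meaning of n+1 is one query for a list plus one more query per row in that list. The sketch below simulates that with in-memory arrays and a round-trip counter. The `query*` functions are hypothetical stand-ins for database calls, not a real driver API.

```typescript
type Post = { id: number; authorId: number };
type Author = { id: number; name: string };

const posts: Post[] = [
  { id: 1, authorId: 1 },
  { id: 2, authorId: 2 },
  { id: 3, authorId: 1 },
];
const authors: Author[] = [
  { id: 1, name: "Ann" },
  { id: 2, name: "Bob" },
];

// Counts simulated trips to the database.
let roundTrips = 0;

function queryPosts(): Post[] {
  roundTrips++;
  return posts;
}

function queryAuthorById(id: number): Author | undefined {
  roundTrips++;
  return authors.find((a) => a.id === id);
}

function queryAuthorsByIds(ids: number[]): Author[] {
  roundTrips++;
  return authors.filter((a) => ids.includes(a.id));
}

// n+1: one query for the posts, then one query per post for its author.
function namesNPlusOne(): string[] {
  return queryPosts().map((p) => queryAuthorById(p.authorId)?.name ?? "unknown");
}

// Batched: one query for the posts, one query for all needed authors.
function namesBatched(): string[] {
  const found = queryPosts();
  const ids = Array.from(new Set(found.map((p) => p.authorId)));
  const nameById = new Map<number, string>();
  for (const a of queryAuthorsByIds(ids)) nameById.set(a.id, a.name);
  return found.map((p) => nameById.get(p.authorId) ?? "unknown");
}

roundTrips = 0;
const slowNames = namesNPlusOne();
const slowTrips = roundTrips; // 1 + 3 posts = 4

roundTrips = 0;
const fastNames = namesBatched();
const fastTrips = roundTrips; // 2

console.log(slowNames, slowTrips);
console.log(fastNames, fastTrips);
```

A GraphQL layer or a graph query that resolves the relation in one pass is doing the batched version for you, which is roughly why it is said to handle n+1 automatically.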

Granted, both of these problems can theoretically be fixed if you're really good at SQL querying, and you use advanced JSON querying techniques. However, most cases are simply faster in a graph database. They were made for this kind of SH*T. Try querying multiple deeply nested data objects. If you don't know what a nested data object is, it is because you don't know how to think in graph terms.

Real problem with graph databases...

The real problem with a graph database is not that it is only made for graph data. ALL DATA IS GRAPH DATA!!!

The real problem is the same problem with Svelte... maturity. Most Graph Databases do not have all the necessary features to produce a web app like constraints, policies, triggers, full text search, and security. In fact, that list is very very short.

The definition of a Graph Database is also evolving. First you had graph databases, then you had graph databases built on noSQL or other KV stores, then you had graph databases built on SQL (AgensGraph), and now you have hybrids that use more than one database under the hood. The focus is no longer on edges and nodes, as a node itself can be used as an edge. The focus is on the speed you save by storing the memory location directly, and not the location of another intermediate table. The focus is on querying.

So which databases are in the lineup, huh?

Get on with it!

In order to compete with Firestore, you have to have pre-written and customizable middleware. I am only going to list databases that have this. In Firebase, this would be the Client Side Javascript API plus Firestore Rules for Security.

1. Supabase

Supabase is by far Firestore's number one competitor. Sure, it is also Firebase's competitor because it has storage buckets and login methods, but it is really Firestore's competitor because it has a client side securable interface similar to Firebase. It is so easy and lovely to use. You may enjoy the flexibility of a schemaless database, but the tradeoffs for relations are incredible. It uses PostgreSQL under the hood. PostgreSQL is faster for handling large sets of data, while mySQL is faster for smaller sets of data. It now even supports secure subscriptions. You must get used to using Policies and Constraints, but frankly Supabase makes writing these a pleasure, seriously. They are also working on a GraphQL Layer, which theoretically would mostly automatically handle the n+1 problem. There is one tiny caveat. PostgreSQL, while made for large datasets, is not made for scalable data. Sure, you can scale vertically with more computing power, but you can't scale (easily) horizontally with more computers / virtual computers. They may one day support this, but it won't be easy. mySQL can.

2. Fauna

Fauna is pretty freaking cool. I admit I have not yet had the pleasure to build anything with it. It is one of those freaky hybrids. You can store data in a key value store, but query it like a graph. The FQL Client API looks like Firebase 9, in that you need to import a lot of functions within functions. You use internal database techniques, like in Supabase, for security. The biggest two problems I see with Fauna are 1) Vendor lock-in and 2) Learning Curve: it does not seem as easy to create links etc. as a graph database or sql database.

3. Hasura

Hasura gives you several choices of SQL databases to build on, but specializes in postgres. It also has the most advanced GraphQL engine that exists, although it is still missing some required features. You need to combine Hasura with Firebase Auth, Auth0, or some other login system, but technically the middleware is there. It suffers the same scalability problems and feature problems as DGraph. You can also use NHost.io to automatically set up an instance of your database with a built in login system and file storage. I have not built anything complex with Hasura yet, but I have read about missing features like nested updates. I think once you get to the complex level, the GraphQL alone won't cut it. Honestly, no GraphQL cuts it... yet.

Honorable Mention

4. Dgraph

Dgraph was chewed up, spit out, killed, brought back to life, and now split. A month ago I would not have listed Dgraph at all, even though I love Dgraph. They basically got some VC money, spent the money, fired half the staff, started producing a decent return, and split. One side got the Founder and programmers and forked the open source part (now Outcaste), the other side got the board and VC money, as well as the paying cloud users. They are honestly going to be very different products in a year. I personally am spending a lot of time with both CEOs to give them my ideas, and the compiled feedback I have seen from the community of users. There is an unofficial Discord recently started with over 300 users. You will find both communities, and managing staff, active on there. I do not care to take sides, I just believe in the original product's potential. There has been active talk of a second fork as well. Dgraph specializes in GraphQL written in Go for extreme speed, and is arguably better and worse than Hasura at GraphQL. I would say Hasura, Prisma, and Dgraph all are in a fight for the best GraphQL. I wrote the j-dgraph package just so it works like Firebase, querying the GraphQL automatically through JS methods. Dgraph checks ALL my boxes, and I believe in one year that one (or even both) versions will take the #1 and maybe #2 spots on this list. This product is absolutely amazing in every way, so follow me for updates.

5. neo4j

neo4j is that 'most popular' graph database everyone knows. They focus too much on the analytical users, and are missing out on the Firebase users. They have advanced querying capabilities with cypher, math functions, and triggers. While they do have basic constraints, they do not have policies. However, you could write your own with triggers. neo4j is really a beast and competes more with sql databases than you know. However they're missing out. They offer a cloud platform, but expect you to host your own GraphQL. They could make this process easy. They also haven't developed subscriptions, although people keep asking for it. It supports huge amounts of data, but it does not support sharding like DGraph. I have heard DGraph users switched from neo4j due to the inability to support the large datasets. The enterprise version can scale for high availability, so it sort of scales. Full disclosure: I have not tested any of this, nor am I an expert by any means.

6. Planetscale with Prisma

These are really two different products. Really you could choose any cloud hosted mySQL database and plug Prisma into it, but the Fireship guy tweeted about Planetscale (and they spend a lot of money on Google ads), so I suspect they're legit. I need to spend more time researching this. Technically there is some setup needed for Prisma. Prisma itself is in the top tier of GraphQL, has its own api too like Firebase, but no frontend caching like pure GraphQL with URQL or Apollo. Prisma has subscription capabilities. This may be your best option... TBD.

Up and Coming

7. EdgeDB

EdgeDB looks pretty freakin awesome. It basically seems to re-write SQL and Graph databases together to create some new-ish programming language. It takes care of all the problems GraphQL has, and seems to be built separately but on top of postgres. It is really something unique, beautiful, and powerful. They don't have a security layer yet or a cloud hosting environment, but both are in the works. However, postgres still suffers from the scalability problems we all know. If you like unique fetching and strong typing, also check out TypeDB. It doesn't make its own list number because there is no cloud version, middleware, etc. However, it is worth checking out.

8. 8base

8base has perhaps the most beautiful UI. It is mySQL, so it is serverless and scalable. It uses GraphQL, so has middleware. It checks all boxes except features. The GraphQL is adequate. It is not the best, not the worst. It needs nested filters, nested updates, etc. I am going to create an app soon with this thing so I can truly test its functionalities. There is also something called Grafbase for producing GraphQL apis, but I'm not sure where it stands... yet.

Who Wins?

Nobody. All options are missing something or another. I am currently building in Supabase because it is so damn easy, and they sent me a T-Shirt due to my past article☝. I love DGraph and will continue to give updates on that and Outserv (Outcaste's product). I think mySQL is the best overall database. noSQL scales well, but is not relational. Hybrid databases are built on other databases, or given names like newSQL. Ultimately great databases are hybrids, we just want all the management done automatically. 8base and EdgeDB should definitely be on your radar. Also, MongoDB Atlas is an option if you don't need a follower feed, but it doesn't have a front end client system.

If we stick with these rules:

real time data
client side api
cloud hosting
middleware

We only have: Fauna, Hasura, and Supabase...

Ok technically DGraph too. 8base is just a baby.

Complain though I have, the Firebase Team forgot about Firestore. We should not only be angry, but frankly feel disrespected. They quit listening to their users. I have high respect for all the team members, especially the active ones in the community, but low respect for whoever is choosing to keep Firestore stagnant. If they start development again, I will gladly continue spreading the good word. Hopefully, they create a cloud platform for an actual relational database instead. Either way, the product used to be great, but it is getting passed by. Firebase, do something about it!

One thing to remember

Separate your code. Build your React or Svelte app so that your Firestore or Supabase code is totally separate from your key elements. Use good DRY, SOLID, and KISS techniques. When we find that perfect database, your app will be built, and it will be easy to change your code. Otherwise, find the tradeoffs that work for you. Maybe you love noSQL databases like Cassandra. Maybe you want that Web3 database that syncs with the blockchain... did I mention that is Outserv, Dgraph's fork?! Maybe you don't mind managing a second database just for searching. That is fine too.
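One way to do that, as a minimal sketch: hide the database behind a small interface and let the rest of the app depend only on that interface. Everything here (`TodoRepository`, `InMemoryTodoRepository`, `addAndCount`) is a hypothetical name for illustration. A real Firestore or Supabase implementation would be another class with the same shape, which this sketch does not attempt.

```typescript
type Todo = { id: string; title: string };

// The only thing your components and services should know about.
interface TodoRepository {
  add(title: string): Promise<Todo>;
  list(): Promise<Todo[]>;
}

// In-memory implementation, handy for tests and for this sketch.
// A Firestore- or Supabase-backed class would implement the same interface.
class InMemoryTodoRepository implements TodoRepository {
  private todos: Todo[] = [];
  private nextId = 1;

  async add(title: string): Promise<Todo> {
    const todo: Todo = { id: String(this.nextId++), title };
    this.todos.push(todo);
    return todo;
  }

  async list(): Promise<Todo[]> {
    // Return a copy so callers cannot mutate internal state.
    return this.todos.slice();
  }
}

// App code takes the interface, never a concrete database client.
async function addAndCount(repo: TodoRepository, title: string): Promise<number> {
  await repo.add(title);
  return (await repo.list()).length;
}

addAndCount(new InMemoryTodoRepository(), "try the next database").then((count) => {
  console.log(count); // 1
});
```

Swapping databases then means writing one new class and changing the line that constructs it, instead of hunting for SDK calls across your components.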

I personally am ready for the future. This year is a new beginning.

Check out my databases from last year.

Until then, keep building.

Top comments (38)

Nikolas Burk • Mar 11 '22 • Edited

Hey Jonathan, this is a really insightful article, thanks so much for writing it 👏

Technically there is some setup needed for Prisma. Prisma itself is in the top tier of GraphQL, has its own api too like Firebase, but no frontend caching like pure GraphQL with URQL or Apollo. Prisma has subscription capabilities. This may should be your best option... TBD.

I work at Prisma and shortly want to react to this. It seems like you're referring to Prisma 1, which was used to create a GraphQL server for your databases (similar to Hasura or Postgraphile).

However, the latest version of Prisma is actually a new kind of ORM (so it falls into the same category of tools as Sequelize or TypeORM rather than being related to GraphQL). Prisma solves the same kinds of database problems for developers as these tools, but approaches them in an entirely new way.

You can read more about it here: The Complete ORM for Node.js & TypeScript

Fireship also released a YouTube video about Prisma that might be worth checking out: Prisma in 100 seconds.

Hope this helps!

Jonathan Gamble • Mar 11 '22

I need to test Prisma out. I thought Prisma 2 used GraphQL under the hood, and was just a JSON to GraphQL Client API? Very intrigued!

Scott Cook • Mar 11 '22

Firebase is dead. Jk it's not, I don't mean Firebase, I mean Firestore. Well actually Firestore is really popular, I mean I wish it was.

Dude f off with your click bait what a way to ruin your credibility.

Jonathan Gamble • Mar 11 '22

I think my point here is that the development of Firestore is dead, and Firestore is not to be confused with Firebase. I'm hoping they change that.

Scott Cook • Mar 11 '22

Also apologies, I was having a crabby morning... It is a nice article, just had a "rahhhh" moment after being bait and switched a little upon reading the article :) but you got me to click it.. so that's part of the job i guess :)

Scott Cook • Mar 11 '22

Your article is good and I understand your point. It's one worth making. But, I think you also get my point that if you state in this comment that "Firestore is not to be confused with Firebase" then you go and name your article "Firebase is dead" and then go on to talk about Firestore is dead but not actually Firebase.. that is intentionally misleading.. it is confusing and contradictory.

Jonathan Gamble • Mar 12 '22

Hi Scott. I honestly thought that was a good title when I finished it at 1 in the morning. However, I changed it to be more precise.

Rich Winter • Mar 11 '22

Woot! Some STRONG opinions here!! I love it! Great assessment breakdown.
As you write, a lot of this has to do with the use case. Fbase and its integration with Firestore, its ability to host both front and back end, and the simplicity of its auth across providers all make it really great for an MVP and initial launch. Also, "don't let the perfect be the enemy of the good."
Also, refactoring aside, there's an issue with ver. 9? That's news to me. This isn't an Angular 1->2 jump. What's the "controversy"?

Jonathan Gamble • Mar 11 '22

Some people didn't like the Firebase 9 changes. I personally like it better now.

development-vargamarcel • Mar 11 '22 • Edited

Check out Directus too (graphql, rest, db connected to knex in custom endpoints if you want direct db queries, auth, Any relational db...)

Cédrick Lunven • Mar 11 '22 • Edited

Hey Jonathan, lot of great insights in the article loved it.

Have you ever tried AstraDB ? (astra.datastax.com). It is based on noSQL Apache Cassandra with OSS Gateway on top of it named Stargate.io.

Scalable => 💯 (Cassandra better strength)
Realtime Data Support => 💯
A JS/TS Client Side API (or GraphQL) => 💯 GraphQL + JS client (github.com/datastax/astrajs)
Middleware Security => the Stargate layer does the security,
Cloud Hosted DbaaS Option => 💯
No Vendor Lock-In => 💯 , OSS Cassandra + OSS Stargate (sample docker compose github.com/datastax/astra-sdk-java...)
ACL with Login Methods => 💯 , Token will hold claims
Relational (SQL or Graph) => No, graph should come later this year though

(I work there, ping me if you need something)
Cheers.

Jonathan Gamble • Mar 11 '22 • Edited

I will check it out!

Michael Fudge • Mar 11 '22

Besides the article title being clickbait....

This reminds me of the saying "it's not the hammer's fault that it won't drive a screw. It's the carpenter's"

Use the right database for the job. Relational is good at OLTP data. Document makes sense when you don't want to join 6 tables to simply render a webpage. For transactions use time oriented dbs like timescale or wide column. If you need text search ship the data to elastic.

The best advice is to not worry about any of these issues until you need to scale horizontally. If you still don't want to bother with any of that, then just pick a new SQL system like cockroachdb, or even better, just use a cloud provider offering like memsql, dynamo or cosmos. Yeah, you're going to pay for it, but then you can focus on building great things instead of worrying about n+1 problems and over fetching of rows.

The point of a database is to persist data... Use cases are many and that is reflected in the choices at our disposal. Moral: Pick the right tools for the task at hand.

Jonathan Gamble • Mar 11 '22

Well put. I am a firm believer that you should use the right tool for the job. That being said, people need to see the alternatives that CAN do what Firestore can't do out of the box. That is what this post is about. I believe competition creates innovation.

Michael Fudge • Mar 11 '22

Well if that's what the post is about how about a title that reflects that??

You've already debunked the two statements in your post title

Firebase is not dead.
There is no perfect database [model] and there never will be.

Jonathan Gamble • Mar 11 '22

Hi Michael. I really didn't mean for this post to be click bait, I just had a lot of ideas I wanted to put in one post, and maybe I chose the title poorly. Perhaps it should have been a question instead of a statement.

I don't think I agree with those statements.

Most people join the Firebase platform due to Firestore. Firebase is still in development, but Firestore is not. The development is definitely dead on Firestore, and I purposely want to get their attention and call them out on that. Every other database grows after a year in features, but they have stopped.
I also do believe there WILL be a nearly perfect database, perhaps as early as next year. We shall see.
I am a firm believer that you should not have to manage two databases, when one database should add simple features like Full Text Search and Counters. MongoDB is a perfect example of how this should be done.

Michael Fudge • Mar 12 '22

Let me take a different angle here. According to Gartner the database market size is a 60+ BILLION dollar industry. If there's going to be one perfect database in 2023, then there are a bunch of companies out there spending a lot of resources on what surely seems like a hopeless cause.

Hey there is no doubt we are seeing convergence in the industry as each product works to scale horizontally and adopt different data models (relational, search, document, wide-column, key-value, Realtime, timeseries, and graph.) One thing the industry is learning from each other is that each model has an appropriate use case, and no one model is the "best".

Sure Moore's law and the cloud will eventually make some of these decisions less relevant, but never irrelevant.

Cloud providers creating back-ends as services with pay as you go auto scale - like firebase - are what will really make databases irrelevant to programmers. And I for one, welcome it. I'd rather build interesting applications and spend my time on user experience than coding up DTOs, repository patterns, REST endpoints and graphQL interfaces so that I can self host on Kubernetes or swap out my back end databases on a whim. Been there, done that. Anyone is right to argue that using these services create lock in - especially if your code is not designed to be loosely coupled - but the real question is - why should it matter in 2022?

There will always be a need for SQL and data engineering, for example shipping data changes in real time from cloud firestore into some other database more conducive to ad-hoc analytics. But these skills will not be required of full stack developers of the future.

Krishna Pankhania • Mar 11 '22

People in 90's be like "Machines and Robots will take over the world" after 2020, and here we are looping through firestore docs, and checking if our read counts doesn't increase rapidly 🙄. At some point I think It is happening I mean It is controlling us 🤔. (Just kidding! No can do!)

Brian McBride • Apr 24 '22

You might want to check out ArangoDB
arangodb.com/

It is my fav graph database. It scales more like MongoDB, so not as easily horizontally as Firestore, but not bad. For analytics, ArangoDB is pretty cool.

When you are using ArangoDB for more transactional UI stuff, it still is good - but I found an issue when using graph traversals for data sets... pagination. I think this might be an issue with all graph databases. When you do a traversal query, it would make sense that the order is deterministic - but there really isn't a good "start at". You have to traverse the whole graph regardless on each page. Maybe there is something that I was missing.

I do agree with you that Firestore really needs some query improvements. There is a lot I like about Google Cloud. What I don't like about Google is that they tend to focus only on what is trending. Firestore is still better than AWS's DynamoDB, but then there is AWS DocumentDB now. Google needs to do something here.

Jonathan Gamble • Apr 24 '22

I don’t think all Graph databases have that problem, and some can index differently. My issue with ArangoDB, like TigerGraph, is the price. Need to see examples to compare. Thanks for the info.

Brian McBride • Apr 24 '22

Well, it is open source with the Apache v2 license.
github.com/arangodb/arangodb

Similar to how MongoDB worked, they make money on the enterprise extras that are probably necessary for large scale deployments. However, you can roll a cluster or a single server on some compute. They also have Oasis, which is like Atlas.

All that said, I do like Firestore's pay as you grow model and GCP has a great free tier for prototyping. Most databases, if you still want performance, require a minimum buy-in regardless of use. Sometimes the difference of a few hundred a month can pause experimental projects. I really hope Google does some nice updates to Firestore.

Magne • Feb 9 '23

When you are using ArangoDB for more transactional UI stuff, it still is good - but I found an issue when using graph traversals for data sets... pagination. I think this might be an issue with all graph databases. When you do a traversal query, it would make sense that the order is deterministic - but there really isn't a good "start at". You have to traverse the whole graph regardless on each page. Maybe there is something that I was missing.

can’t you combine an AQL Traversal with the high-level operation LIMIT or use a FILTER to only include a subset of the ascending indexes?

Brian McBride • Feb 9 '23

Yes. But then your next page, you are running your traversal again.
One suggestion I read is to put a key/value of a random value then run a weighted traversal using that random number as a weight. That creates a deterministic response. Helps a bit. Still, if you traverse and return 100 responses then want 101 to 200, then you are still running the full traversal.

In a SQL or NoSQL database, you're more likely to be skipping items on an index. That is much faster than running a traversal then skipping. And if your traversal is not deterministic, then the second page might return unexpected results.

Dhravya • Mar 11 '22

I like Supabase, but still won't use it in production because of how strong google is - with their own infrastructure and everything
Redis is an amazinggggg choice for a database too. But still, I'd stick to SQL and firebase for now, just because of the good infrastructure

Ben • Jun 23 '23 • Edited

not a single mention of postgres... has been renamed to supabase ?

it is indeed not to compare with firestore; but as supabase totally relies on it, it should be given credits. i also see nhost positionning itself as a supabase competitor; also relying completely on postgres.

with redis, both have been around for decades. kind of recently, neon for postgres, somehow promoted by vercel, is something to look into
