Brantley Harris

Posted on Jan 19, 2021

Firestore: What You Should Know

#backendless #firebase #webdev #architecture

Here are my notes on Firestore, and what I wish I had known before I started. This is a rundown, so I've put bits in bold so you can scan.

The Basics

Firebase is one of the most popular "backendless" services. The whole idea is: let's provide everything we need for our applications as direct services that make setting up a backend server redundant. Firestore continues this: let's just connect right to our database from the client. The client here is our website, or a mobile app, a desktop client, whatever.

Firestore is a document store database, which means everything is an amorphous document that you set values on. It is schemaless, meaning each document can have whatever fields you want on it; you aren't setting up columns or properties ahead of time.

Every document is in a collection, kind of like a table, but you can think of it as a directory with the documents as files. In fact, they are named exactly like directory paths, /collection-name/document-id, and can keep going: you can have sub-collections like /collection-name/document-id/subcollection-name/subdocument-id. In truth, there is no real relation between the sub-document and the document: deleting the document won't even delete the sub-document. The nesting merely serves as a way to organize.
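To make the path convention above concrete, here is a minimal sketch that classifies a path by counting its segments. `describePath` is a hypothetical helper written for this post, not part of the Firestore SDK; it only relies on the fact that paths alternate collection, document, collection, document.

```javascript
// Classify a Firestore-style path by counting its segments.
// Paths alternate: collection / document / collection / document ...
// NOTE: describePath is a hypothetical helper, not an SDK function.
function describePath(path) {
  const segments = path.split('/').filter((s) => s.length > 0);
  if (segments.length === 0) {
    throw new Error('empty path');
  }
  // An even number of segments ends on a document id,
  // an odd number ends on a collection name.
  const isDocument = segments.length % 2 === 0;
  return {
    kind: isDocument ? 'document' : 'collection',
    id: segments[segments.length - 1],
    // Containing collection (for a document) or parent document
    // (for a sub-collection); null for a top-level collection.
    parent: segments.length > 1 ? segments.slice(0, -1).join('/') : null,
  };
}
```

So `/users/alice` comes back as a document inside `users`, and `/users/alice/posts` as a collection whose parent path is `users/alice`. The parent here is purely a naming relationship, which matches the point above that deleting a document leaves its sub-collections alone.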

Again, you connect directly to Firestore from the client / front end. You can update the value of a document, you can read a document, and you can query for multiple documents in a collection. Also, you can subscribe to a document, collection, or collection query, and get real-time updates in the form of "snapshots".

Queries are really simple. We're not doing SQL queries here, just things like "is this field equal to this value" and "is this field bigger than this value". You can combine multiple query statements, but some combinations (for example, an equality filter on one field plus a range filter or ordering on another) require creating a special composite index. Fortunately, when you run a query that needs an index that doesn't exist, the library is smart enough to throw an error and give you an exact link to click on to create the index. But probably 80% of the time, when I make an index, it turns out I don't actually need it. If you're making a lot of indexes in Firestore, it's a red flag that you're laying out your data wrong.

You might be thinking: if it connects directly and we read and write directly, how do we keep those reads and writes secure? That's done with a magical system called Firebase Security Rules, or just "Rules". You set up rules using a special domain-specific language that sort of resembles JavaScript. The rules dictate which requests can read and write which documents. In the rule statement you can reference the incoming request, the user, and the requested document.

Generally, there is no processing on writes and reads, so whatever the client writes goes directly into the database. Whatever is in the database, the client reads. However, there is a way to trigger Cloud Functions when a document is created, updated, or deleted. This lets you simulate processing but, importantly, it happens after the fact, so there's a quasi-state between writing and processing where things can be out of whack. The function gets scheduled somewhere by Firebase; in my tests it ran a few seconds after the data was saved.

And that's basically it. Firestore is purposefully super simple. If you need more, you're using the wrong tool.

Where it shines

Don't overthink Firestore. It exists in a larger offering called "Google Cloud Platform". It has a lot of friends to pick up its slack, and it's not trying too hard, so you shouldn't either.

It's one of the fastest ways I know of going from zero to a full "backend" solution. It's amazing for prototyping, amazing for simple applications, and great as a front for a larger system, as a sort of caching layer. It's also pretty cheap, given the alternatives. Just make sure you aren't doing a lot of writes.

Connecting directly from the client is as easy as you'd think it should be. It feels like cheating, and it's worth the price of admission. Although the documentation can be pretty obtuse, once you figure out that you're mostly basing everything off of a document reference: firestore.collection('collection-name').doc('doc-id'), you begin to fly. And once you figure out snapshots, you're golden.
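Here is a rough sketch of the "document reference plus snapshot" pattern described above. It assumes `db` is a Firestore instance with the namespaced (v8-style) API, where `onSnapshot` calls back with the current state and then again on every change, and returns a function that stops listening. `watchDocument` is a hypothetical wrapper name, not something the SDK provides.

```javascript
// Subscribe to one document and hand plain data to a callback.
// ASSUMPTION: `db` exposes the namespaced (v8-style) Firestore API,
// i.e. db.collection(name).doc(id).onSnapshot(callback).
// watchDocument itself is a hypothetical helper for illustration.
function watchDocument(db, collectionName, docId, onChange) {
  // Everything hangs off a document reference.
  const ref = db.collection(collectionName).doc(docId);
  // onSnapshot returns an unsubscribe function; pass it through so
  // the caller can stop listening (e.g. when a view is torn down).
  return ref.onSnapshot((snapshot) => {
    // A snapshot for a missing document has exists === false.
    onChange(snapshot.exists ? snapshot.data() : null);
  });
}
```

In an app this might be called as `const stop = watchDocument(firebase.firestore(), 'users', 'alice', render)`, with `render` being whatever function redraws your UI, and `stop()` called when the view goes away.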

It's fast. Special caching and indexing make it feel like real-time, especially when you subscribe to changes: they seem to arrive instantly.

It has a live interface to view and edit the data. It's amazing for prototyping because you can see real-time changes. If your client is subscribing to the data, you can make the change in the interface and see it live. Absolute filth for live demos to upper management. "Oh, you want to change this name? Already done."

Another "feature" is that it forces you to simplify your data organization. This can be bad and good, but I've found that it gives you all you need for most simple applications. So if you're struggling with data layout, it's a red flag that you either need a separate solution for that feature, or you're thinking about it all wrong.

Where it's dull

The simplicity will quickly box you in, though. This isn't a relational database. It's a specialized non-relational database, and you will have to learn a different way of thinking that matches it, specifically.

With no real way of doing relations, you're going to make a lot of reads and round-trips. This is by design, and probably fine, but we're not able to do anything fancy with joins or graph queries. This also means being very, very aware of how you set up your data. In many places, you will be duplicating data, and you will have to keep the copies in sync yourself in a system that, frankly, doesn't want to help you.

But easily the biggest problem with Firestore is the rules system. This system is nasty. First of all, let's have a new language for you to learn that is not well documented and feels just close enough to JavaScript that you expect it to behave like JavaScript, but no. Sorry, not sorry. It would be bad enough to leave it there, but there's almost no feedback. Your requests either work or don't. Debugging is practically impossible. The console lets you "test" requests against your rules, but it's buggy, and I don't recommend it. The biggest issue here is that you are simply left wondering about your security. Not a good place to be.

Another big issue is that there is no schema, at all. You cannot be sure of data consistency. That means you have to set up defenses in your client code when you read, and/or set up post-processing cloud function triggers, to ensure sanity. Expect a lot of object.title || '', or use a schema validation library. It really is pushing the schema work to the client. This is a pretty awful indictment when you compare it to services offering things like GraphQL that handle this inherently.
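The defensive reading described above can be gathered into one place instead of being scattered as `||` fallbacks. This is a minimal sketch of that idea; the `normalizeArticle` name and the `title`, `tags`, and `published` fields are made up for the example, and a real schema validation library would do this more thoroughly.

```javascript
// Coerce an untrusted document into a predictable shape before the
// rest of the client touches it.
// ASSUMPTION: the field names (title, tags, published) are
// hypothetical examples, not from any real schema.
function normalizeArticle(raw) {
  // Treat anything that isn't an object as an empty document.
  const data = raw !== null && typeof raw === 'object' ? raw : {};
  return {
    // Wrong type or missing: fall back to an empty string.
    title: typeof data.title === 'string' ? data.title : '',
    // Keep only string entries; anything else is dropped.
    tags: Array.isArray(data.tags)
      ? data.tags.filter((t) => typeof t === 'string')
      : [],
    // Only a literal true counts as published.
    published: data.published === true,
  };
}
```

Running every snapshot's data through a function like this means a document with a numeric title or a stray `published: 'yes'` degrades to safe defaults rather than blowing up a render.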

And given that a nefarious user could, with some console scripting, update whatever they have access to with whatever data they wanted, you can't rely on any of it. Again, security becomes a big question.

Finally: pricing is weird. On the surface, Firestore has a straightforward pricing model: you pay a certain amount per read and per write. But things can get problematic. It's hard to figure out what special functions like subscribing to collection queries do, and it's hard to know what cloud functions might trigger either a read or a write. At best, you'll be able to estimate based on projected users and activity. At worst, like most cloud services, you'll have to do some linear regressions after you've run it for a while to see how much your app will cost when it scales. But, hey, this is why we have CFOs, right?
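For the "estimate based on projected users and activity" end of that range, the arithmetic is simple enough to sketch. Everything here is an assumption: `estimateMonthlyCost` is a hypothetical helper, the per-operation prices are parameters you would fill in from the current pricing page, and it ignores storage, bandwidth, and any free tier.

```javascript
// Back-of-the-envelope monthly cost from projected activity.
// ASSUMPTIONS: prices are caller-supplied placeholders, not real
// rates; storage, bandwidth, and free quotas are ignored.
function estimateMonthlyCost(usage, prices) {
  // Cost of one user's reads and writes for one day.
  const perUserPerDay =
    usage.readsPerUserPerDay * prices.perRead +
    usage.writesPerUserPerDay * prices.perWrite;
  return usage.dailyUsers * perUserPerDay * usage.daysPerMonth;
}

// Example inputs; the prices below are made-up placeholders.
const exampleCost = estimateMonthlyCost(
  { dailyUsers: 1000, readsPerUserPerDay: 50, writesPerUserPerDay: 10, daysPerMonth: 30 },
  { perRead: 0.000001, perWrite: 0.000002 }
);
```

The hard part is not this formula but getting honest values for reads and writes per user, since subscriptions and triggered functions add operations you didn't write explicitly.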

Strategies

Here are some strategies for success with Firestore:

Employ pricing limits.

First thing, first thing: Set up pricing limits. Make something reasonable. Da pricing calculator don't care that your script went into an infinite loop, your credit card is still on the line. Big daddy Google ain't messing around. Though to be fair, I've heard of people messing this up and Google has been lenient. Still, put the limit in.

Keep public and private data separated.

As a general rule, separating public and private data is great. In Firebase it's even more greaterific. This lets you write public rules that are super simple, and private rules that are focused and well crafted. It's also a good line on which to duplicate data, e.g. have private user profiles with all their information, and public profiles, in a separate collection, with only a subset of fields.
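One way to keep the duplicated public profile honest is to derive it from the private one with an explicit allow-list, so a new private field never leaks by accident. This is a sketch under assumptions: `toPublicProfile` and the field names are hypothetical, and where you run it (client, server, or a trigger) is up to you.

```javascript
// Fields that are safe to copy into the public collection.
// ASSUMPTION: these field names are hypothetical examples.
const PUBLIC_PROFILE_FIELDS = ['displayName', 'avatarUrl', 'bio'];

// Build the public copy of a profile from the private document.
// Only allow-listed fields are copied; everything else is left out.
function toPublicProfile(privateProfile, allowedFields = PUBLIC_PROFILE_FIELDS) {
  const publicProfile = {};
  for (const field of allowedFields) {
    if (Object.prototype.hasOwnProperty.call(privateProfile, field)) {
      publicProfile[field] = privateProfile[field];
    }
  }
  return publicProfile;
}
```

An allow-list is the safer direction here: with a block-list, every field you add to the private profile is public until you remember to exclude it.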

Consider moving writes to a separate service.

I abandoned making writes at all directly from the client. Instead, I set up an App Engine REST API to do this. You could easily do serverless functions, an API Gateway with Cloud Run, whatever. Yes, this throws out the idea of fully backendless, as you're literally making a backend. But I find it much, much easier to ensure security and data schema in custom functions. Best to do this after the prototyping phase. Get the functionality down, then move on to data consistency and security: batten down the hatches, so to speak.
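The kind of check that becomes easy once writes go through your own service looks roughly like this. It is only a sketch: `validateNote`, the `title` and `body` fields, and the length limit are all hypothetical, and the actual write to Firestore would happen in your handler after this passes.

```javascript
// Validate a write payload on the server before storing it.
// ASSUMPTION: the note shape (title, body) and the 10000-character
// limit are hypothetical; adapt them to your own data.
function validateNote(payload) {
  if (payload === null || typeof payload !== 'object') {
    return { ok: false, errors: ['payload must be an object'] };
  }
  const errors = [];
  if (typeof payload.title !== 'string' || payload.title.trim() === '') {
    errors.push('title must be a non-empty string');
  }
  if (typeof payload.body !== 'string' || payload.body.length > 10000) {
    errors.push('body must be a string of at most 10000 characters');
  }
  // Reject fields the schema doesn't know about, so a client can't
  // smuggle extra data into the document.
  const allowed = ['title', 'body'];
  for (const key of Object.keys(payload)) {
    if (!allowed.includes(key)) {
      errors.push(`unexpected field: ${key}`);
    }
  }
  return { ok: errors.length === 0, errors };
}
```

Unlike defensive reads on the client, this rejects bad data outright, so the documents your clients read can be trusted to have the shape you expect.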

An alternative to this is to have special collections set up for writes, and then have serverless functions trigger and process them. This is a fine system, but you're essentially just doing the same thing as above.

Consider using Firestore to augment your existing services.

If you start thinking of Firestore as essentially a specialized, fast, subscribable read-only data source, it's easy to imagine gradually putting it in between your existing services and clients. Gradually put data into it and move clients to read from it. Slowly your backend becomes write-only, decreasing complexity and cost.

Test your security rules with a deployment script.

The rules are such little jerks that you need to be sure they are working. Don't leave it to the interface; it's just waiting to fail.

On server-side, use transactions.

The client bends backward to make every operation idempotent, so you can't get the database in a weird state. But on the server, you need to use transactions to make sure you don't create a weird state of the database. Also, it can save on pricing as a single transaction sort of bundles costs up. But, uh, don't quote me on that last part, it's not well defined.

Watch the videos.

Google has done an impressive job creating videos to help you understand how to organize your data. I've dealt with a lot of document store and non-relational databases and they still very much helped me think in Firestore.

Conclusion

Welp, that's all I got. Overall, I really like Firestore. It lets me get building fast. But it's a special thing, and it's best to understand it before you jump in.

It's well-executed; most of the issues are from trade-offs that I can't fault them on. Although the rules system really needs a lot of work.

Going forward, the horizon looks good for some other backendless technologies. AWS is coming around the corner with Amplify, and any GraphQL service is going to have a huge benefit here since we're already giving up REST. But it remains to be seen if the simplicity, cost, and speed of Firebase will shine through.

Cover image: Photo by burak kostak from Pexels

Meet me on twitter: @deadwisdom

I'm available for consulting services on architecture and web development.

Top comments (3)

Mark Okoh • Feb 10 '21

Nice article. I've been using the Firebase Realtime database for a while - I know it well now, and quite like it (though the security rules are crap too). Thinking of switching to Cloud Firestore for my latest project since it's newer, and supposedly better. Do you think it's worth making the switch?

Brantley Harris • Feb 11 '21 • Edited

That you'd have to figure out yourself, I'm afraid. Personally, I would only switch if I had specific features in mind. Otherwise, they are too similar to bother migrating, especially if your app already works with Realtime.

The docs have a good guide on Firestore and Realtime:
firebase.google.com/docs/database/...

Mark Okoh • Feb 11 '21

Yeah, I won't change the existing projects, but I'll probably try Firestore on a new one. Thanks.