DEV Community

Thijs Koerselman

How to write clean, typed Firestore code

Ever since I started using Firestore about 8 years ago, I have been wanting to find a better way to type my code and reduce boilerplate. In this article I will explain how I finally found a way to write clean, strongly-typed code with abstractions that are easy to use and adopt in any Typescript project.

TL;DR

By defining typed reusable references for all database collections, we can let other functions infer their types from them, sparing us from having to import and apply types all over the place.

I have created a set of abstractions based on this concept, for both server environments and React / React Native applications.

If you want to see them applied in a working example you can check out mono-ts.

Why Abstract?

Below is an example of how you might try to apply typing in your server code the basic way, using the official API methods.

import { db } from "~/firebase";
import { FieldValue } from "firebase-admin/firestore";
import type { UpdateData, Timestamp } from "firebase-admin/firestore";

type Book = {
  id: string;
  title: string;
  author: string;
  is_published: boolean;
  published_by?: string;
  published_at?: Timestamp;
  is_available: boolean;
}

async function getBook (bookId: string): Promise<Book | null> {
  const doc = await db.collection('books').doc(bookId).get();

  if (!doc.exists) {
    return null;
  }

  // doc.data() returns untyped DocumentData here, so a cast is needed.
  const bookData = doc.data() as Omit<Book, "id">;
  return { id: doc.id, ...bookData };
};

async function publishBook (publisherId: string, bookId: string) {
  const book = await getBook(bookId)

  if(!book) {
    throw new Error(`No book exists with id ${bookId}`);
  }

  if (book.is_published) {
    throw new Error(`Book ${bookId} was already published`);
  }

  await db.collection("books").doc(bookId).update({
    is_published: true,
    published_by: publisherId,
    published_at: FieldValue.serverTimestamp()
  } satisfies UpdateData<Book>);
};

type ListItem = Pick<Book, "id" | "title" | "author" | "published_at">;

async function listLatestPublishedBooks(): Promise<ListItem[]> {
  const snapshot = await db
    .collection("books")
    .where("is_published", "==", true)
    .orderBy("published_at", "desc")
    .select("title", "author", "published_at")
    .limit(10)
    .get();

  const items = snapshot.docs.map((doc) => {
    // Cast needed: the select() projection is not reflected in the type.
    const bookData = doc.data() as Omit<ListItem, "id">;
    return { id: doc.id, ...bookData };
  });

  return items;
}


There are many things to dislike here, but I am going to focus on these issues first:

  1. The getBook function is so generic, you do not want to have to write this type of boilerplate code for every collection you have.
  2. The Book type has the data and id mashed together, while for Firestore they are actually unrelated. A document is allowed to have an id field separate from the ID under which it is stored. In the example, having an id field in the data would even overwrite the actual document id. I think this is bad practice, because apart from being technically flawed, it also complicates typing.
  3. The publishBook update method is problematic, because you have to remember to write satisfies, otherwise typos can result in corrupt data. Also, we need to construct the document path again to mutate the book, even though we have already fetched the book.
  4. In listLatestPublishedBooks we are narrowing the data with a select statement, but we need to make sure to keep the Pick<T> in sync to have a matching type. Also, the function contains code which looks very similar to getBook for constructing the Book data.

These points might not seem like a big deal in this contrived example, but if you write a significant amount of code that has to be maintained over many years, these are the things you want to think about.
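
To make the second point concrete, here is a minimal sketch that uses plain objects in place of Firestore snapshots. The stored data and the document ID are made up for illustration; only the spread order is taken from getBook above.

```typescript
// Hypothetical stored data that happens to contain its own `id` field.
// Typed loosely, like the untyped DocumentData returned by doc.data().
const storedData: Record<string, unknown> = {
  id: "legacy-import-42",
  title: "Dune",
};

// The ID under which the document is stored (made up for this sketch).
const documentId = "b7Xk2";

// Same shape as in getBook: the spread comes last, so the stored `id` wins.
const merged = { id: documentId, ...storedData };

// Keeping the ID and the data apart avoids the collision.
const separated = { id: documentId, data: storedData };

console.log(merged.id); // "legacy-import-42", not the document ID
console.log(separated.id); // "b7Xk2"
```

Because the data is loosely typed, the compiler has no way to warn about the overwrite, which is one more reason to keep id and data in separate fields.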

My First Abstractions

The initial abstractions I created were about simplifying the API and avoiding boilerplate. They combined the document data and id together with the ref in a generic type.

type FsDocument<T> = {
    id: string;
    data: T;
    ref: DocumentReference<T>;
}

Using this type I made functions to get a single document with getDocument<T> or query many with getDocuments<T>, and with these abstractions I solved issues 1, 2, and part of 3.

The example code would now look like this:

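import { db } from "~/firebase";
import { getDocument, getDocuments } from "~/lib/firestore";
// FsDocument is assumed to be exported from the same module as the helpers.
import type { FsDocument } from "~/lib/firestore";
import { FieldValue } from "firebase-admin/firestore";
import type { UpdateData, Timestamp } from "firebase-admin/firestore";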

type Book = {
  title: string;
  author: string;
  is_published: boolean;
  published_by?: string;
  published_at?: Timestamp;
  is_available: boolean;
};

async function publishBook (publisherId: string, bookId: string) {
  const book = await getDocument<Book>(db.collection("books").doc(bookId));

  if (book.data.is_published) {
    throw new Error(`Book ${bookId} was already published`);
  }

  await book.ref.update({
    is_published: true,
    published_by: publisherId,
    published_at: FieldValue.serverTimestamp()
  } satisfies UpdateData<Book>);
}

type ListItem = Pick<Book, "title" | "author" | "published_at">;

async function listLatestPublishedBooks(): Promise<FsDocument<ListItem>[]> {
  const items = await getDocuments<ListItem>(db
    .collection("books")
    .where("is_published", "==", true)
    .orderBy("published_at", "desc")
    .select("title", "author", "published_at")
    .limit(10))

  return items
}

I hope you can agree that this is already a significant improvement. I would like to point out a few things:

  • We no longer need to write a get function for every document type.
  • We bundle id and data but keep them nicely separated. The Pick and select arguments now also match because of this.
  • Book can be updated directly using the ref on the variable we already had, making the code more readable and less error-prone.
  • The list function is reduced to just the query part. This also makes it feasible to inline the function body at the calling context if you only need to use this query once. It could make your code more transparent, so I think it is great to have the option.

You might be wondering what happened to the null check on book in publishBook… I decided early on that I would rather throw, because that is what I seem to want in the vast majority of cases.

For situations where you want to fetch a document that might not exist, I created a specific function: getDocumentMaybe.
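
As a rough sketch of how such a pair of functions could be built, the block below implements the throwing variant on top of the "maybe" variant. It is not the actual implementation: DocRefLike and SnapshotLike are hypothetical stand-ins for the SDK types so that the sketch runs without firebase-admin, and returning undefined for a missing document is an assumption.

```typescript
// Hypothetical structural stand-ins for the SDK's snapshot and reference types.
interface SnapshotLike<T> {
  id: string;
  exists: boolean;
  data(): T | undefined;
}

interface DocRefLike<T> {
  id: string;
  path: string;
  get(): Promise<SnapshotLike<T>>;
}

type FsDocument<T> = {
  id: string;
  data: T;
  ref: DocRefLike<T>;
};

/** Resolves to undefined when the document does not exist (assumed behavior). */
async function getDocumentMaybe<T>(
  ref: DocRefLike<T>,
): Promise<FsDocument<T> | undefined> {
  const snapshot = await ref.get();
  const data = snapshot.data();

  if (!snapshot.exists || data === undefined) {
    return undefined;
  }

  return { id: snapshot.id, data, ref };
}

/** Throws when the document does not exist, so callers can skip the null check. */
async function getDocument<T>(ref: DocRefLike<T>): Promise<FsDocument<T>> {
  const doc = await getDocumentMaybe(ref);

  if (!doc) {
    throw new Error(`No document available at ${ref.path}`);
  }

  return doc;
}
```

With this split, the common case stays free of existence checks, and the rarer "might not exist" case is visible in the function name at the call site.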

Problems to Solve

I used these abstractions for many years, and while helpful, they do not provide type-safety. As the codebase grew, a few things were increasingly bothering me:

  1. The repetition of db.collection("some_collection"), and the risk of typos.
  2. Having to import and manually apply types everywhere.
  3. Having to remember to use satisfies UpdateData<T> for all update statements
  4. Having to remember to keep select() and Pick<T> statements in sync.

Out of these, I think the last one bothered me the most, because this is very easy to mess up, and particularly risky if you are writing database "migration" scripts that check some property on existing documents before deciding to mutate them.

Reusable Refs

In order to solve problem 1, I figured it would be better to define refs only once, and then re-use them everywhere:

export const refs = {
  users: db.collection("users"),
  books: db.collection("books"),

  /** For sub-collections you can use a function that returns the reference. */
  userWishlist: (userId: string) =>
    db
      .collection("users")
      .doc(userId)
      .collection("wishlist"),
} as const;

As I was working towards strong typing and exploring alternative approaches, it felt sensible to consistently pass the refs as a separate argument, and so the signature of the abstractions changed to this:

const book = await getDocument<Book>(refs.books, bookId);

const recentlyPublishedBooks = await getDocuments<ListItem>(refs.books, 
    (query) => query
    .where("is_published", "==", true)
    .orderBy("published_at", "desc")
    .select("title", "author", "published_at")
    .limit(10))

Typing Collection Refs

I had noticed that most Firestore types have generics, but I didn’t know how to apply them in a useful way, so I mostly ignored them, but then it finally clicked for me…

If we can type the collection refs, we can probably have other functions infer their type from it!

Below are the refs making use of type generics.

// db-refs.ts
import { db } from "~/firebase"
import type { User, Book, WishlistItem } from "~/types";
import type { CollectionReference } from "firebase-admin/firestore";

export const refs = {
  users: db.collection("users") as CollectionReference<User>,
  books: db.collection("books") as CollectionReference<Book>,

  userWishlist: (userId: string) =>
    db
      .collection("users")
      .doc(userId)
      .collection("wishlist") as CollectionReference<WishlistItem>,
} as const;

This is quite nice, because not only do we have one place to define our database shape, we now also see clearly what types are associated with each collection. This is useful documentation for anyone working on the codebase.
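
The inference idea can be tried out without the SDK. In the sketch below, CollectionRefLike, collection and updaterFor are hypothetical names invented for illustration; the real CollectionReference<T> carries its type through its methods rather than through a phantom field.

```typescript
// Hypothetical stand-in for CollectionReference<T>. The type parameter only
// exists at compile time; `_type` is never set at runtime.
interface CollectionRefLike<T> {
  readonly path: string;
  readonly _type?: T;
}

type Book = { title: string; author: string; is_published: boolean };
type User = { displayName: string };

// Mirrors `db.collection(path) as CollectionReference<T>` from the refs file.
function collection<T>(path: string): CollectionRefLike<T> {
  return { path };
}

const refs = {
  books: collection<Book>("books"),
  users: collection<User>("users"),
} as const;

// T is inferred from the ref argument, so the caller never names Book.
function updaterFor<T>(ref: CollectionRefLike<T>) {
  return (id: string, data: Partial<T>): string =>
    `${ref.path}/${id}: ${Object.keys(data).sort().join(", ")}`;
}

const updateBook = updaterFor(refs.books);
const summary = updateBook("abc", { is_published: true, title: "Dune" });
console.log(summary); // "books/abc: is_published, title"

// These would be rejected by the compiler:
// updateBook("abc", { is_publishd: true });        // typo in field name
// updaterFor(refs.users)("u1", { title: "Dune" }); // not a User field
```

The calling code contains no type imports or annotations, yet a misspelled field is caught at compile time.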

The Final Abstraction

The typed refs solved problem 2 for the most part, because it allowed me to remove most of the type imports, but to solve the remaining issues, there was still work to do:

  1. Provide typed mutation methods on the document abstraction
  2. Separate the select statement from the query, in order to narrow the type together with the data.

And so, finally, the code looks like this:

// Import path assumed; the refs are defined in db-refs.ts.
import { refs } from "~/db-refs";
import { getDocument, getDocuments } from "~/lib/firestore";
import { FieldValue } from "firebase-admin/firestore";

async function publishBook (publisherId: string, bookId: string) {
  const book = await getDocument(refs.books, bookId);

  if (book.data.is_published) {
    throw new Error(`Book ${bookId} was already published`);
  }

  await book.update({
    is_published: true,
    published_by: publisherId,
    published_at: FieldValue.serverTimestamp()
  });
}

async function listLatestPublishedBooks() {
  const items = await getDocuments(refs.books,
    query => query
    .where("is_published", "==", true)
    .orderBy("published_at", "desc")
    .limit(10),
    { select: ["title", "author", "published_at"]}
  );

  return items
}

The key takeaways are:

  • The code does not import the database types anymore. Those are only applied once in the db-refs.ts definition file.
  • The book variable contains a typed update function, which will only accept fields that are part of our type, while also allowing FieldValue types.
  • The select statement is now type-safe. It only accepts keys from the type, and it not only selects the data but also narrows the return type accordingly. As a result, the previous type assignment, as well as the function return type, are obsolete.
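
The narrowing described in the last point can be sketched with a generic key parameter. This is only an illustration of the typing technique on in-memory objects, not the library's code; selectFields is a hypothetical helper.

```typescript
type Book = {
  title: string;
  author: string;
  is_published: boolean;
  published_by?: string;
};

// K is inferred from the `select` array, and the result narrows to Pick<T, K>.
function selectFields<T extends object, K extends keyof T>(
  data: T,
  select: readonly K[],
): Pick<T, K> {
  const narrowed = {} as Pick<T, K>;

  for (const key of select) {
    // Skip optional fields that are absent from the source object.
    if (key in data) {
      narrowed[key] = data[key];
    }
  }

  return narrowed;
}

const books: Book[] = [
  { title: "Dune", author: "Frank Herbert", is_published: true },
  { title: "Draft", author: "Unknown", is_published: false },
];

// items has type Pick<Book, "title" | "author">[] without any annotation.
const items = books
  .filter((book) => book.is_published)
  .map((book) => selectFields(book, ["title", "author"]));

console.log(items); // [ { title: "Dune", author: "Frank Herbert" } ]

// These would be rejected by the compiler:
// selectFields(books[0], ["titel"]); // not a key of Book
// items[0].is_published;             // not part of the selection
```

Since the selection and the resulting type come from the same array literal, they cannot drift apart the way a separate select() and Pick<T> can.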

Writing Transactions

Technically, our publishBook function is not safe. It is possible that, in between fetching the book document, checking its published state, and publishing it, another process could have published the same book, and so we might publish the same book twice, and possibly with different publishers!

In this contrived example it does not seem like much of a problem, but in production code, where logic is more complex and processes take longer to complete, it could be critical to avoid these types of race-conditions.

A situation like this is solved with a transaction, and the library provides similar abstractions that make working with them a bit easier. This is what the function from earlier would look like using a transaction:

import { getDocumentInTransaction } from "~/lib/firestore";
import { FieldValue, runInTransaction } from "firebase-admin/firestore";

async function publishBook (publisherId: string, bookId: string) {
  await runInTransaction(async (tx) => {
    const book = await getDocumentInTransaction(tx, refs.books, bookId);

    if (book.data.is_published) {
        throw new Error(`Book ${bookId} was already published`);
    }

    book.update({
        is_published: true,
        published_by: publisherId,
        published_at: FieldValue.serverTimestamp()
    });
  })
}

If you are familiar with transactions, I hope you can agree that this is more readable than using only the official API, and the abstractions make it very consistent with non-transaction code.

Note that book.update is now not async anymore. When you get a document from getDocumentInTransaction its update method calls the transaction update function. In a transaction, all mutations are synchronous because they are deferred and executed together.
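
A simplified sketch of why update can be synchronous inside a transaction: the document's update method only hands the write to the transaction object. TransactionLike, makeTxDocument and runFakeTransaction are hypothetical names used for illustration, and the fake runner skips reads and committing entirely.

```typescript
interface RefLike {
  id: string;
  path: string;
}

// Hypothetical stand-in for the write side of a Firestore transaction.
interface TransactionLike {
  update(ref: RefLike, data: unknown): void;
}

type TxDocument<T> = {
  id: string;
  data: T;
  /** Synchronous: the write is only queued on the transaction. */
  update(patch: Partial<T>): void;
};

function makeTxDocument<T>(
  tx: TransactionLike,
  ref: RefLike,
  data: T,
): TxDocument<T> {
  return {
    id: ref.id,
    data,
    update: (patch) => tx.update(ref, patch),
  };
}

type QueuedWrite = { path: string; data: unknown };

// Fake runner that records writes instead of committing them.
function runFakeTransaction(body: (tx: TransactionLike) => void): QueuedWrite[] {
  const queued: QueuedWrite[] = [];

  body({
    update: (ref, data) => {
      queued.push({ path: ref.path, data });
    },
  });

  // A real transaction would commit all queued writes atomically here.
  return queued;
}

const writes = runFakeTransaction((tx) => {
  const book = makeTxDocument(
    tx,
    { id: "abc", path: "books/abc" },
    { is_published: false },
  );

  if (book.data.is_published) {
    throw new Error("Book abc was already published");
  }

  book.update({ is_published: true }); // no await needed
});

console.log(writes.length); // 1
```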

Processing Entire Collections

As a NoSQL document store, Firestore does not have a concept of database migrations. In other words, if you alter your database "schemas" over time, you might have to run code to patch existing data in order to keep things consistent with your updated type definitions.

Over the years, I found myself writing a lot of code to cycle over many or all documents in a collection to mutate or analyze them. The @typed-firestore/server library contains abstractions that make it trivial to query and process documents in chunks, so you can handle very large collections with constant low memory usage.

Below is an example. I think by now, the API shape will start to look familiar.

await processDocuments(refs.books,
  (query) => query.where("is_published", "==", true),
  async (book) => {
    /** Only author and title are available here, because we selected them below */
    console.log(book.author, book.title);
  },
  { select: ["author", "title"] }
);


In the above code, an unlimited number of documents is fetched in chunks of 500 (the default), and for each chunk the handler function is awaited 500 times in parallel. If you would instead like to handle each chunk as a whole, you can use processDocumentsByChunk.

If you use a .limit() on the query, it is detected at runtime, and the automatic pagination is disabled.

The query itself is optional. Set it to null to process an entire collection.

Additionally, you can set a chunk size, and throttle the speed of progress by setting a minimum time for each chunk to pass before moving on to the next. Throttling might be useful if you’re making async requests to another system, and you want to prevent overloading its processing capacity.
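
To illustrate the general idea of chunked processing with throttling, here is a generic sketch that is independent of Firestore. processInChunks and its option names are hypothetical, and it pages by offset for simplicity, whereas a Firestore implementation would more likely use query cursors.

```typescript
type ChunkOptions = {
  /** Number of items to fetch and handle per chunk. */
  chunkSize?: number;
  /** Minimum time in milliseconds to spend on each chunk. */
  throttleMs?: number;
};

async function processInChunks<T>(
  fetchPage: (offset: number, limit: number) => Promise<T[]>,
  handler: (item: T) => Promise<void>,
  options: ChunkOptions = {},
): Promise<number> {
  const { chunkSize = 500, throttleMs = 0 } = options;
  let offset = 0;

  for (;;) {
    const startedAt = Date.now();
    const chunk = await fetchPage(offset, chunkSize);

    if (chunk.length === 0) {
      break;
    }

    // Handle all items of the chunk in parallel, then move on.
    await Promise.all(chunk.map((item) => handler(item)));
    offset += chunk.length;

    // A short page means there is nothing left to fetch.
    if (chunk.length < chunkSize) {
      break;
    }

    // Wait out the remainder of the minimum chunk duration, if any.
    const remainingMs = throttleMs - (Date.now() - startedAt);
    if (remainingMs > 0) {
      await new Promise((resolve) => setTimeout(resolve, remainingMs));
    }
  }

  // Total number of processed items.
  return offset;
}
```

Only one chunk is held in memory at a time, which is what keeps memory usage constant regardless of the collection size.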

Handling Firestore Events

For cloud functions v2 there are a few utilities that make it easy to get data from onWritten and onUpdated events.

import {
  getDataOnWritten,
  getBeforeAndAfterOnWritten,
} from "@typed-firestore/server/functions";
import { onDocumentWritten } from "firebase-functions/v2/firestore";

export const handleBookUpdates = onDocumentWritten(
  {
    document: "books/{documentId}",
  },
  async (event) => {
    /** Get only the most recent data */
    const data = getDataOnWritten(refs.books, event);

    /** Get the before and after data */
    const [before, after] = getBeforeAndAfterOnWritten(refs.books, event);
  }
);

Here we pass the typed reference only to facilitate type inference, and to keep things consistent. At runtime, the data is extracted from the event and the ref remains unused.
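
The "ref for inference only" pattern could look roughly like the sketch below. It is not the library's implementation: CollectionRefLike and WrittenEventLike are hypothetical stand-ins for the SDK types, and beforeAndAfter is an invented name.

```typescript
// Hypothetical stand-in for CollectionReference<T>; only the type matters.
interface CollectionRefLike<T> {
  readonly path: string;
  readonly _type?: T;
}

// Hypothetical stand-in for the payload of a v2 onDocumentWritten event.
interface WrittenEventLike {
  data?: {
    before: { data(): unknown };
    after: { data(): unknown };
  };
}

// The ref is unused at runtime. It only lets the compiler infer T.
function beforeAndAfter<T>(
  _ref: CollectionRefLike<T>,
  writeEvent: WrittenEventLike,
): readonly [T | undefined, T | undefined] {
  const before = writeEvent.data?.before.data() as T | undefined;
  const after = writeEvent.data?.after.data() as T | undefined;
  return [before, after];
}

type Book = { title: string; is_published: boolean };
const booksRef: CollectionRefLike<Book> = { path: "books" };

// Simulates a document that was just created: no data before, data after.
const createdEvent: WrittenEventLike = {
  data: {
    before: { data: () => undefined },
    after: { data: () => ({ title: "Dune", is_published: false }) },
  },
};

const [bookBefore, bookAfter] = beforeAndAfter(booksRef, createdEvent);
console.log(bookBefore, bookAfter?.title); // undefined "Dune"
```

The cast inside the helper is the single place where the untyped event data meets the application type, so the handler code itself stays free of type imports.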

React Hooks

The react and react-native libraries provide a number of hooks as well as plain functions that can be used with libraries like ReactQuery.

The hooks are a little unconventional in that they throw errors instead of returning them. The documentation explains this decision in more detail, but one benefit of throwing errors is that we can link the isLoading boolean to the existence of the data property. Typescript understands that if isLoading is false, the data is available (or an error was thrown).

Let me show you what it looks like:

import { useDocument } from "@typed-firestore/react";
import { serverTimestamp } from "firebase/firestore";

export function DisplayName({userId}: {userId: string}) {

  /** Returns user as FsMutableDocument<User> */
  const [user, isLoading] = useDocument(refs.users, userId);

  if (isLoading) {
    return <LoadingIndicator/>;
  }

  function handleUpdate() {
    /** Here update is typed to User, and FieldValues are allowed */
    user.update({modifiedAt: serverTimestamp()})
  }

  /**
   * Typescript knows that user.data is available, because isLoading is false.
   */
  return <div onClick={handleUpdate}>{user.data.displayName}</div>;
}
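
One way to link isLoading to the presence of the data is a union of tuples, sketched below without React. HookResult, toHookResult and renderName are hypothetical names, and the narrowing of destructured tuple elements assumes TypeScript 4.6 or later; the library's actual types may differ.

```typescript
type Doc<T> = { id: string; data: T };

// The document is only defined in the tuple variant where isLoading is false.
type HookResult<T> = readonly [Doc<T>, false] | readonly [undefined, true];

function toHookResult<T>(doc: Doc<T> | undefined): HookResult<T> {
  return doc ? [doc, false] : [undefined, true];
}

type User = { displayName: string };

function renderName(result: HookResult<User>): string {
  const [user, isLoading] = result;

  if (isLoading) {
    return "Loading...";
  }

  // Narrowed: the compiler knows `user` is defined when isLoading is false.
  return user.data.displayName;
}

console.log(renderName(toHookResult<User>(undefined))); // "Loading..."
console.log(
  renderName(toHookResult<User>({ id: "u1", data: { displayName: "Ada" } })),
); // "Ada"
```

If errors were returned as a third tuple element instead of being thrown, a false isLoading would no longer guarantee that the data exists, and the narrowing would need an extra check.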

Where Typing Was Ignored

You might have noticed that the query where() function is still using the official Firestore API. No type-safety is provided there at the moment. I think this part would be quite difficult to type fully, and I fear the API shape would have to be very different.

Besides wanting strong typing, I also want these abstractions to be non-intrusive and easy-to-adopt. I would argue that the where() clause is the least critical part anyway. If you make a mistake with it, there is little to no chance to ruin things in the database and you will likely discover the mistake already during development.

It might even be possible to create a fully-typed query builder function that looks like the current official API, by using some advanced type gymnastics, but that seems to be outside of my current skills, and it is not something I am willing to spend a lot of time on.

For now, this trade-off for the sake of simplicity and familiarity is something I am perfectly comfortable with.

Note that the Typescript compiler will still let you write the select statement directly on the query, but the library detects this and will throw an error if you do.

What About Alternatives?

Using withConverter Server-Side

The official Firestore approach for getting your data typed on the server seems to be withConverter. There is an article here discussing it.

The API never appealed to me, because I do not have a desire for runtime conversion between my database documents and my application code. It seems like you have to write a lot of code to make it work.

Using Type Generics Client-Side

When I was researching solutions similar to mine, I found an article by Jamie Curnow from 2021 in which he already describes the use of generics to type collection refs using the v9 web SDK. As it turns out, in web you can get typing for most Firestore methods out of the box!

For a moment, I got nervous, and feared that my abstractions were maybe the result of ignorance about the officially intended use of types in the SDK. What if a typed solution was already available under my nose for many years?

Luckily, that wasn’t the case, and I will show you in a bit…

I actually remembered reading Jamie’s article years ago, but apparently it didn’t stick with me. The v9 web SDK was the first to introduce type generics, and at the time I was mostly writing backend code where generics were not yet available.

Also, I never mutate documents client-side so I wasn’t calling update or set, and as such, the inferred typing in the web SDK didn’t bring me much over the abstractions I was already using.

From what I remember, the communication around the v9 release was mostly about modularity, because it allowed for smaller bundle sizes. I am surprised that the Firebase team was not more vocal about the typing part of things. I suspect that a lot of Typescript developers miss this, because I also never came across any good Typescript examples in the docs.

I tried consulting Google’s own Gemini AI a few times, prompting for examples on how to work with typed documents in a convenient way, but none of the responses hinted at typing the collection refs. I find this pretty peculiar.

Using Type Generics Server-Side

With my newly discovered use of types in the web SDK, I needed to see if I could apply the same concept on the server with the current firebase-admin v13 SDK, and this was the result:

// Import path assumed; the refs are defined in db-refs.ts.
import { refs } from "~/db-refs";
import { FieldValue } from "firebase-admin/firestore";
import type { Book } from "~/types";

async function getBook(bookId: string) {
  const doc = await refs.books.doc(bookId).get();

  if (!doc.exists) {
    return null;
  }

  const bookData = doc.data();
  return { id: doc.id, ...bookData } as Book;
}

export async function publishBook(publisherId: string, bookId: string) {
  const book = await getBook(bookId);

  if (!book) {
    throw new Error(`No book exists with id ${bookId}`);
  }

  if (book.is_published) {
    throw new Error(`Book ${bookId} was already published`);
  }

  await refs.books.doc(bookId).update({
    is_published: true,
    published_by: publisherId,
    published_at: FieldValue.serverTimestamp(),
  });
}

type ListItem = Pick<Book, "id" | "title" | "author" | "published_at">;

export async function listLatestPublishedBooks() {
  const snapshot = await refs.books
    .where("is_published", "==", true)
    .orderBy("published_at", "desc")
    .select("title", "author", "published_at")
    .limit(10)
    .get();

  const items = snapshot.docs.map((doc) => {
    const bookData = doc.data();
    return { id: doc.id, ...bookData } as ListItem;
  });

  return items;
}

If you remember the initial code, I think you’ll agree that the difference is not very significant. The update function is now correctly typed, which is obviously very welcome, but that’s about it. All the other issues I pointed out in the initial example code still remain.

As it turns out, I had to cast the return at the end of getBook to Book because the doc.data() function returns T | undefined, and so we end up having to import database types again :(

The web SDK doesn’t have this problem, so while it has been a few years since the release of v9 on web, it seems that the firebase-admin v13 SDK for the server still hasn’t fully caught up.

In any case, I hope you agree that a layer of abstractions can provide clear benefits in terms of code readability, type-safety and maintenance.

Sharing Types Between Server and Client

When you share your Firestore document types between server and client code, you will likely run into a problem with the Timestamp type, because the web and server SDKs currently have slightly incompatible types. The web timestamp has a toJSON method which doesn’t exist on the server.

One way to work around this is to use a type alias called FsTimestamp in all document types. Then, in each client-side or server-side application, I declare this type globally in a global.d.ts file.

For web it looks like this:

import type { Timestamp } from "firebase/firestore";

declare global {
  type FsTimestamp = Timestamp;
}

For my server code it looks like this:

import type { Timestamp } from "firebase-admin/firestore";

declare global {
  type FsTimestamp = Timestamp;
}

The mono-ts example code also applies this pattern.

Conclusion

Let me end this by listing a few highlights from the proposed abstractions:

  • A document container type that improves readability and reduces boilerplate
  • Similar abstractions for frontend and backend code
  • Simplified transaction code
  • Type-safe select statements (server only)
  • Automatic pagination (server only)
  • Convenient collection processing functions (server only)
  • Convenient data extraction from cloud function events (server only)

Because these are only very thin abstractions, I think there is no reason to fear any restrictions imposed by them. If you ever find yourself needing a native Firestore API that is not covered, you should be able to work with the document ref directly.

For more info, check out the documentation here:

For a working example, check out my mono-ts boilerplate, where I showcase how to configure a modern monorepo for Typescript including Firebase deployments.

These abstractions are based on my experience working with Firestore for about 8 years. We have applied them on two fairly complex projects for two successful companies, and they seem to cover all of our use-cases.

Now that they are out in the wild, I am curious to see if they also fit other people's needs.

I hope you find these abstractions as useful as I do.

Enjoy!


Top comments (1)

Anupam Chakrawarti • Edited

This is so well thought out. Really appreciate you sharing it. 👏🏻👏🏻

I am using firestore with Next.js and this would cleanup the code drastically. 🍻
