Humza Tareen

Originally published at humzakt.github.io

The Firestore Default Database Trap: Why Your Data Is Going to the Wrong Place

Firestore treats one database ID, (default), as the fallback: any client created without an explicit database ID reads from and writes to it. We ran multiple Firestore databases in production, yet several code paths were accidentally hitting (default) instead.

This guide covers how Firestore's default database works, how to detect misrouting, and how to fix it in Python and JavaScript/React.

The Problem: Silent Data Misrouting

We use multiple Firestore databases for tenant isolation. Each evaluation tenant has its own database:

evaluations-db-prod-tenant-1
evaluations-db-prod-tenant-2
evaluations-db-prod-tenant-3
...
evaluations-db-prod-tenant-12

But some code paths were missing explicit database references:

# ❌ BAD: Routes to (default) database
from google.cloud import firestore

db = firestore.Client()  # No database specified!
db.collection("evaluations").add({"score": 0.95})

This code writes to (default), not the tenant-specific database. The bug was silent: no errors, no warnings, just data in the wrong place.

How Firestore Defaults Work

A client ends up pointing at a database in one of two ways:

1. Explicit Database Reference (Correct)

from google.cloud import firestore

# Specify database ID explicitly
db = firestore.Client(database="evaluations-db-prod-tenant-1")
db.collection("evaluations").add({"score": 0.95})

2. Default Database (What Happens If You Don't Specify)

# No database specified → routes to (default)
db = firestore.Client()  # Uses (default) database!

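The (default) database is typically created when Firestore is first set up for a project, so many projects have one even if nobody intended to use it. When it exists, an unqualified client writes to it without complaint. If the project has no (default) database, the same client fails on its first request instead, which is at least a visible error.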
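One way to see why the fallback is silent: every Firestore document lives under a resource name of the form projects/{project}/databases/{database}/documents/..., and an omitted database ID simply fills the {database} slot with (default). The sketch below is illustrative only; document_resource_name is a hypothetical helper, not part of the client library.

```python
from typing import Optional

DEFAULT_DATABASE = "(default)"  # the ID Firestore falls back to when none is given


def document_resource_name(
    project: str,
    collection: str,
    doc_id: str,
    database: Optional[str] = None,
) -> str:
    """Build the full resource name a document write would target.

    Hypothetical helper for illustration; the real client does this internally.
    """
    database_id = database or DEFAULT_DATABASE
    return (
        f"projects/{project}/databases/{database_id}"
        f"/documents/{collection}/{doc_id}"
    )


# Same call site, two very different destinations:
print(document_resource_name("my-project", "evaluations", "abc"))
print(document_resource_name(
    "my-project", "evaluations", "abc", database="evaluations-db-prod-tenant-1"
))
```

Nothing in the first call looks wrong at the call site, which is why the misrouting tends to surface only when someone inspects the data.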

Detecting Misrouting

Method 1: Check Database Usage in Console

Go to Firebase Console → Firestore → Databases. Check if (default) has unexpected collections:

(default) database:
  - evaluations (shouldn't be here!)
  - scores (shouldn't be here!)

evaluations-db-prod-tenant-1:
  - evaluations (correct)
  - scores (correct)

Method 2: Query Default Database Explicitly

from google.cloud import firestore

# Check what's in (default)
default_db = firestore.Client(database="(default)")
evaluations = default_db.collection("evaluations").stream()

for doc in evaluations:
    print(f"Found evaluation in default DB: {doc.id}")
    # This shouldn't exist!
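If you don't know which collection names to look for, a broader check is to list every top-level collection in (default). This is a minimal sketch, assuming the object passed in exposes collections() the way google-cloud-firestore's Client does; unexpected_collections and its allowed parameter are names made up for this example.

```python
from typing import Iterable, List


def unexpected_collections(client, allowed: Iterable[str] = ()) -> List[str]:
    """Return IDs of top-level collections that are not in the allowed set.

    `client` is expected to be a Firestore client pointed at (default);
    anything returned here was probably written by a misrouted code path.
    """
    allowed_ids = set(allowed)
    found = {col.id for col in client.collections()}
    return sorted(found - allowed_ids)


# Usage with a real client (requires google-cloud-firestore and credentials):
#   from google.cloud import firestore
#   default_db = firestore.Client(database="(default)")
#   for name in unexpected_collections(default_db):
#       print(f"Unexpected collection in (default): {name}")
```

Passing the client in, rather than constructing it inside the function, keeps the check easy to exercise against a stub in tests.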

Method 3: Add Logging

import logging
from google.cloud import firestore

def get_firestore_client(database_id: str):
    """Get Firestore client with explicit database."""
    if not database_id:
        logging.error("Database ID is required!")
        raise ValueError("Database ID must be specified")

    client = firestore.Client(database=database_id)
    logging.info(f"Using Firestore database: {database_id}")
    return client

# Usage
db = get_firestore_client("evaluations-db-prod-tenant-1")

Fixing Python Code

Pattern 1: Environment Variable

import os
from google.cloud import firestore

# Get database ID from environment
DATABASE_ID = os.getenv("FIRESTORE_DATABASE_ID")
if not DATABASE_ID:
    raise ValueError("FIRESTORE_DATABASE_ID environment variable must be set")

db = firestore.Client(database=DATABASE_ID)
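The check above still accepts a variable that is set to (default) or to a typo. A stricter sketch follows, assuming the tenant naming scheme shown earlier (evaluations-db-prod-tenant-N); the function name and the regular expression are illustrative assumptions, not a Firestore rule.

```python
import os
import re
from typing import Mapping, Optional

# Assumed naming scheme, taken from the tenant database list in this post.
TENANT_DB_PATTERN = re.compile(r"evaluations-db-prod-tenant-[1-9][0-9]*")


def database_id_from_env(
    var_name: str = "FIRESTORE_DATABASE_ID",
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Read the database ID from the environment and reject risky values."""
    env = os.environ if env is None else env
    database_id = env.get(var_name, "").strip()
    if not database_id:
        raise ValueError(f"{var_name} environment variable must be set")
    if database_id == "(default)":
        raise ValueError(f"{var_name} must not point at the (default) database")
    if not TENANT_DB_PATTERN.fullmatch(database_id):
        raise ValueError(f"{var_name}={database_id!r} is not a known tenant database")
    return database_id
```

Failing at startup on a malformed ID is cheaper than discovering a new, unintended database name in an error log later.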

Pattern 2: Configuration Class

from dataclasses import dataclass
from typing import Optional

from google.cloud import firestore

@dataclass
class FirestoreConfig:
    database_id: str
    project_id: Optional[str] = None

    def get_client(self):
        """Get Firestore client with explicit database."""
        kwargs = {"database": self.database_id}
        if self.project_id:
            kwargs["project"] = self.project_id
        return firestore.Client(**kwargs)

# Usage
config = FirestoreConfig(
    database_id="evaluations-db-prod-tenant-1",
    project_id="my-project"
)
db = config.get_client()

Pattern 3: Factory Function

from google.cloud import firestore
from functools import lru_cache

@lru_cache(maxsize=12)  # Cache clients for each database
def get_firestore_client(database_id: str) -> firestore.Client:
    """Get Firestore client for a specific database."""
    if not database_id:
        raise ValueError("database_id is required")

    return firestore.Client(database=database_id)

# Usage
db = get_firestore_client("evaluations-db-prod-tenant-1")

Fixing JavaScript/React Code

Frontend: Firebase SDK

// ❌ BAD: Uses default database
import { getFirestore } from 'firebase/firestore';

const db = getFirestore(app);  // No database specified!

// ✅ GOOD: Explicit database reference
import { getFirestore } from 'firebase/firestore';

const db = getFirestore(app, 'evaluations-db-prod-tenant-1');

React Hook Pattern

import { useState, useEffect } from 'react';
import { getFirestore, collection, onSnapshot } from 'firebase/firestore';
import { app } from './firebase-config';

function useFirestoreCollection(collectionName, databaseId) {
  const [data, setData] = useState([]);
  const [loading, setLoading] = useState(true);
  const [error, setError] = useState(null);

  useEffect(() => {
    if (!databaseId) {
      setError('Database ID is required');
      setLoading(false);  // Otherwise the component stays on "Loading..." forever
      return;
    }

    // Explicit database reference
    const db = getFirestore(app, databaseId);
    const colRef = collection(db, collectionName);

    const unsubscribe = onSnapshot(
      colRef,
      (snapshot) => {
        const docs = snapshot.docs.map(doc => ({
          id: doc.id,
          ...doc.data()
        }));
        setData(docs);
        setLoading(false);
      },
      (err) => {
        setError(err.message);
        setLoading(false);
      }
    );

    return () => unsubscribe();
  }, [collectionName, databaseId]);

  return { data, loading, error };
}

// Usage
function EvaluationsList({ tenantId }) {
  const databaseId = `evaluations-db-prod-tenant-${tenantId}`;
  const { data, loading, error } = useFirestoreCollection('evaluations', databaseId);

  if (loading) return <div>Loading...</div>;
  if (error) return <div>Error: {error}</div>;

  return (
    <ul>
      {data.map(evaluation => (
        <li key={evaluation.id}>{evaluation.id}</li>
      ))}
    </ul>
  );
}

Backend: Admin SDK

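// ❌ BAD: Uses default database
const admin = require('firebase-admin');
const db = admin.firestore();

// ✅ GOOD: Explicit database reference
// Needs a firebase-admin release with multi-database support;
// assumes initializeApp() has already been called.
const { getFirestore } = require('firebase-admin/firestore');
const tenantDb = getFirestore('evaluations-db-prod-tenant-1');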

Real-Time Listeners on Non-Default Databases

Real-time listeners must also specify the database:

// ❌ BAD: Listens to default database
import { getFirestore, collection, onSnapshot } from 'firebase/firestore';

const db = getFirestore(app);  // Missing database ID!
const colRef = collection(db, 'evaluations');

onSnapshot(colRef, (snapshot) => {
  // This listens to (default) database, not tenant-specific!
});

// ✅ GOOD: Explicit database in listener
const db = getFirestore(app, 'evaluations-db-prod-tenant-1');
const colRef = collection(db, 'evaluations');

onSnapshot(colRef, (snapshot) => {
  // Now listening to the correct database
});

Auto-Export Configuration

If you're using Firestore auto-export (for backups or analytics), make sure it's configured for the correct databases:

# Export a specific database (not default)
gcloud firestore export gs://BUCKET_NAME/export \
  --database=evaluations-db-prod-tenant-1 \
  --collection-ids=evaluations,scores

Gotcha: Each gcloud firestore export invocation operates on a single database, and if you omit --database it targets (default). Backing up every tenant therefore takes one command per database, and a scheduled job that forgets the flag quietly exports the wrong data.
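For a fleet of tenant databases, the per-database commands can be generated rather than typed. The sketch below only builds argument lists and does not run anything. It assumes gcloud firestore export accepts a single --database value per invocation; build_export_command and the per-database bucket layout are made up for illustration.

```python
from typing import List, Sequence


def build_export_command(
    bucket: str,
    database_id: str,
    collection_ids: Sequence[str] = (),
) -> List[str]:
    """Build the gcloud argument list that exports one named database."""
    if not database_id:
        raise ValueError("database_id is required")
    command = [
        "gcloud", "firestore", "export",
        # Hypothetical layout: one export prefix per database
        f"gs://{bucket}/export/{database_id}",
        f"--database={database_id}",
    ]
    if collection_ids:
        command.append("--collection-ids=" + ",".join(collection_ids))
    return command


# One command per tenant database (tenants 1 through 12)
commands = [
    build_export_command(
        "BUCKET_NAME",
        f"evaluations-db-prod-tenant-{n}",
        ["evaluations", "scores"],
    )
    for n in range(1, 13)
]
print(" ".join(commands[0]))
```

Once the printed commands look right, each list could be handed to subprocess.run(command, check=True).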

Migration: Moving Data from Default

If you've already written data to (default), you need to migrate it:

from google.cloud import firestore

def migrate_from_default(target_database_id: str, collection_name: str):
    """Copy one top-level collection from (default) to the target database.

    Subcollections are not copied, and the source documents are left in place.
    """
    default_db = firestore.Client(database="(default)")
    target_db = firestore.Client(database=target_database_id)

    # Read from default
    docs = default_db.collection(collection_name).stream()

    # Write to target
    batch = target_db.batch()
    count = 0

    for doc in docs:
        batch.set(
            target_db.collection(collection_name).document(doc.id),
            doc.to_dict()
        )
        count += 1

        # Commit in batches of 500 (Firestore limit)
        if count % 500 == 0:
            batch.commit()
            batch = target_db.batch()
            print(f"Migrated {count} documents...")

    # Commit remaining
    if count % 500 != 0:
        batch.commit()

    print(f"Migration complete: {count} documents migrated")

# Usage
migrate_from_default("evaluations-db-prod-tenant-1", "evaluations")
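Before deleting anything from (default), it is worth confirming that every document actually arrived. This is a minimal sketch that compares document IDs only, not field contents; missing_document_ids is a hypothetical helper, and it assumes both arguments behave like Firestore clients with collection(name).stream().

```python
from typing import List


def missing_document_ids(source_db, target_db, collection_name: str) -> List[str]:
    """Return IDs present in the source collection but absent from the target."""
    source_ids = {doc.id for doc in source_db.collection(collection_name).stream()}
    target_ids = {doc.id for doc in target_db.collection(collection_name).stream()}
    return sorted(source_ids - target_ids)


# Usage with real clients (requires google-cloud-firestore and credentials):
#   from google.cloud import firestore
#   missing = missing_document_ids(
#       firestore.Client(database="(default)"),
#       firestore.Client(database="evaluations-db-prod-tenant-1"),
#       "evaluations",
#   )
#   print(missing or "All documents present in target")
```

Both ID sets are held in memory, so for very large collections a paginated comparison would be a better fit.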

Prevention: Code Review Checklist

Add this to your PR template:

## Firestore Database Checklist
- [ ] All Firestore clients specify explicit `database` parameter
- [ ] No `firestore.Client()` calls without database ID
- [ ] Environment variables set for database IDs
- [ ] Frontend real-time listeners specify database
- [ ] Tests use explicit database IDs (not default)
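The first two checklist items can also be checked mechanically. Here is a rough sketch using Python's ast module that flags Client(...) calls lacking a database keyword. It matches on the name Client only, so it is a heuristic: it will also flag unrelated classes named Client, and calls that pass **kwargs are skipped rather than flagged. find_unqualified_clients is a name invented here.

```python
import ast
from typing import List


def find_unqualified_clients(source: str) -> List[int]:
    """Return line numbers of Client(...) calls with no `database=` keyword."""
    flagged = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Matches both firestore.Client(...) and a bare Client(...)
        name = func.attr if isinstance(func, ast.Attribute) else getattr(func, "id", None)
        if name != "Client":
            continue
        keywords = {kw.arg for kw in node.keywords}
        # kw.arg is None for **kwargs, which may or may not carry `database`
        if "database" not in keywords and None not in keywords:
            flagged.append(node.lineno)
    return sorted(flagged)


sample = '''
from google.cloud import firestore
bad = firestore.Client()
good = firestore.Client(database="evaluations-db-prod-tenant-1")
'''
print(find_unqualified_clients(sample))  # → [3]
```

Run over each changed file in CI, a check like this turns a checklist item that reviewers can miss into a failing build.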

Complete Example: Production-Ready Pattern

import logging
import os
from typing import Dict, Optional

from google.cloud import firestore

class FirestoreManager:
    """Manages Firestore clients with explicit database references."""

    def __init__(self, project_id: Optional[str] = None):
        self.project_id = project_id or os.getenv("GCP_PROJECT_ID")
        if not self.project_id:
            raise ValueError("GCP_PROJECT_ID must be set")
        # Per-instance cache; lru_cache on a method would also key on (and pin) self
        self._clients: Dict[str, firestore.Client] = {}

    def get_client(self, database_id: str) -> firestore.Client:
        """Get (or create and cache) the Firestore client for a specific database."""
        if not database_id:
            raise ValueError("database_id is required")

        if database_id == "(default)":
            logging.warning("Using (default) database - is this intentional?")

        if database_id not in self._clients:
            self._clients[database_id] = firestore.Client(
                project=self.project_id,
                database=database_id
            )
            logging.info("Created Firestore client for database: %s", database_id)
        return self._clients[database_id]

    def get_tenant_database(self, tenant_id: str) -> firestore.Client:
        """Get Firestore client for a tenant-specific database."""
        database_id = f"evaluations-db-prod-tenant-{tenant_id}"
        return self.get_client(database_id)

# Usage
manager = FirestoreManager()

# Get tenant-specific database
db = manager.get_tenant_database("1")
db.collection("evaluations").add({"score": 0.95})

# Explicit database
db = manager.get_client("evaluations-db-prod-tenant-2")

TL;DR

| Problem | Solution |
| --- | --- |
| Code routes to (default) database | Always specify the database parameter |
| Silent data misrouting | Check console, add logging, query default DB |
| Frontend uses wrong database | Pass database ID to getFirestore() |
| Real-time listeners on wrong DB | Specify database in listener setup |
| Export targets the wrong database | Pass --database for each database you export |

Key takeaway: Never rely on Firestore defaults. Always specify the database ID explicitly. The (default) database is a trap: it silently accepts whatever you forget to route elsewhere.


More production GCP articles on my blog. I write about patterns from real infrastructure — find me at humzakt.github.io.

