Agbo, Daniel Onuoha

Firestore's New Query Engine

Firestore's new query engine, introduced in January 2026, brings a powerful and expressive way to query and transform data server-side. This guide walks through setup, core concepts, and real-world implementation patterns.

Prerequisites and Setup

Pipeline operations are available exclusively on Firestore Enterprise edition. If your project currently uses a Standard edition database, you'll need to create a new database — in-place upgrades are not supported.

Step 1: Create a Firestore Enterprise Database

From the Google Cloud console or Firebase console:

  1. Navigate to Firestore → Databases → Create Database
  2. Select Enterprise edition and choose Native mode
  3. Pick a region and click Create

Your new Enterprise database gets its own database ID: either a default such as (enterprise-default) or a custom name you choose.

Step 2: Update Your SDKs

Pipeline operations require the latest versions of the Firestore SDKs. Update your dependencies:

# Web / Node.js
npm install firebase@latest

# Admin SDK (Node.js)
npm install firebase-admin@latest

# Python
pip install google-cloud-firestore --upgrade

Pipeline operations are currently available in the Android, iOS, Web, and Admin SDKs. Support for Flutter, Unity, and C++ is forthcoming.

Step 3: Connect to the Enterprise Database

When initializing your client, specify the Enterprise database ID explicitly:

// Web SDK
import { initializeApp } from "firebase/app";
import { getFirestore } from "firebase/firestore";

const app = initializeApp(firebaseConfig);
const db = getFirestore(app, "your-enterprise-db-id");
// Admin SDK (Node.js)
import { initializeApp, cert } from "firebase-admin/app";
import { getFirestore } from "firebase-admin/firestore";

const app = initializeApp({ credential: cert(serviceAccount) });
const db = getFirestore(app, "your-enterprise-db-id");
# Python (Admin SDK)
import firebase_admin
from firebase_admin import firestore

app = firebase_admin.initialize_app()
db = firestore.client(database_id="your-enterprise-db-id")

Important: Calling .pipeline() against a Standard edition database throws a server error. Always ensure you're connected to an Enterprise database before using pipeline APIs.

Core Concepts: Stages, Expressions, and Functions

Before writing pipeline queries, it helps to understand the three building blocks:

Stages are the sequential steps of a pipeline. Each stage receives a stream of documents, transforms it in some way, and passes the result to the next stage. Common stages include collection(), where(), select(), aggregate(), group(), unnest(), sort(), limit(), and sample().
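This stage model can be sketched in plain Python. The sketch below is purely illustrative (toy data and hand-rolled stage functions, not the Firestore SDK), but it shows how each stage consumes a document stream and emits a new one:

```python
from itertools import islice

# Toy document stream standing in for a collection.
docs = [
    {"name": "Toronto", "population": 2_794_000},
    {"name": "Guelph", "population": 144_000},
    {"name": "Ottawa", "population": 1_017_000},
]

# Each "stage" takes a stream of documents and returns a new stream.
def where(stream, predicate):
    return (d for d in stream if predicate(d))

def select(stream, *fields):
    return ({f: d[f] for f in fields} for d in stream)

def limit(stream, n):
    return islice(stream, n)

# Chain the stages exactly as a pipeline would: each feeds the next.
result = list(
    limit(select(where(docs, lambda d: d["population"] > 500_000), "name"), 2)
)
# result == [{"name": "Toronto"}, {"name": "Ottawa"}]
```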

Expressions are used within stages to reference fields or constants. Because pipelines distinguish between a field reference and a literal value, you must be explicit:

import { field, constant } from "firebase/firestore";

// "name" refers to the document field named "name"
field("name").equal(constant("Toronto"))

// Without constant(), a bare string would be ambiguous

Functions are higher-level operations built on expressions — things like countAll(), sum(), avg(), min(), max(), substring(), regexMatch(), and arrayContainsAll().
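To make the semantics concrete, here is a plain-Python illustration of what a few of these functions compute. This is a toy sketch, not the SDK; the comments map each line back to the pipeline function it mimics:

```python
import re
from statistics import mean

# Toy documents.
orders = [{"amount": 10}, {"amount": 30}, {"amount": 20}]

count_all = len(orders)                      # countAll()
total = sum(o["amount"] for o in orders)     # sum(field("amount"))
average = mean(o["amount"] for o in orders)  # avg(field("amount"))

# arrayContainsAll: true only when every needle appears in the array.
def array_contains_all(arr, needles):
    return set(needles) <= set(arr)

# regexMatch: does the string match the pattern?
regex_hit = bool(re.fullmatch(r"ws-\d+", "ws-42"))
```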

Your First Pipeline Query

The simplest pipeline mirrors a standard Firestore query. Here is how a traditional query translates to a pipeline:

// Traditional query
const query = db.collection("cities")
  .where("population", ">", 100000)
  .orderBy("name")
  .limit(10);

// Equivalent pipeline
import { field, constant } from "firebase/firestore";

const pipeline = db.pipeline()
  .collection("cities")
  .where(field("population").greaterThan(constant(100000)))
  .sort(field("name").ascending())
  .limit(10);

const snapshot = await pipeline.execute();
snapshot.forEach(doc => console.log(doc.id, doc.data()));

You can also convert an existing query object directly into a pipeline, which is useful when incrementally migrating:

// Start from an existing query, then extend it with pipeline stages
const existingQuery = db.collection("recipes").where("authorId", "==", userId);
const pipeline = existingQuery.pipeline()
  .sort(field("createdAt").descending())
  .limit(20);

Filtering and Projection

Filtering with where()

The where() stage in pipelines supports a much richer set of operators than the Standard edition, including regex_match, array_contains_all, and comparisons across computed values.

// Find cities with populations between 500,000 and 5,000,000
const pipeline = db.pipeline()
  .collection("cities")
  .where(
    field("population").greaterThan(constant(500000))
      .and(field("population").lessThan(constant(5000000)))
  );
# Python equivalent
from google.cloud.firestore_v1.pipeline_expressions import Field, Constant

pipeline = (
    db.pipeline()
    .collection("cities")
    .where(
        Field.of("population").greater_than(Constant.of(500000))
        .and_(Field.of("population").less_than(Constant.of(5000000)))
    )
)

Projecting Fields with select()

By default, Firestore returns all fields in a document. Using select() restricts the response to only the fields your application needs, reducing network egress and latency:

const pipeline = db.pipeline()
  .collection("users")
  .where(field("active").equal(constant(true)))
  .select("displayName", "email", "lastLogin");

You can also add computed fields to the projection:

import { field, add } from "firebase/firestore";

const pipeline = db.pipeline()
  .collection("orders")
  .select(
    "orderId",
    "subtotal",
    add(field("subtotal"), field("shippingFee")).as("totalDue")
  );

Aggregations and Grouping

This is where pipelines become genuinely powerful. Standard edition queries can count documents, but pipelines can compute arbitrary aggregations — sums, averages, min/max — across the entire collection or within groups.

Global Aggregation

import { field, constant, countAll, sum, avg } from "firebase/firestore";

// Total orders, total revenue, and average order value
const pipeline = db.pipeline()
  .collection("orders")
  .where(field("status").equal(constant("completed")))
  .aggregate(
    countAll().as("totalOrders"),
    sum(field("amount")).as("totalRevenue"),
    avg(field("amount")).as("avgOrderValue")
  );

const snapshot = await pipeline.execute();
const stats = snapshot.docs[0].data();
console.log(`${stats.totalOrders} orders, $${stats.totalRevenue} revenue`);

Grouped Aggregation

The group() stage lets you aggregate within buckets. Think GROUP BY in SQL:

// Revenue and order count broken down by product category
const pipeline = db.pipeline()
  .collection("orders")
  .group(
    { category: field("category") },
    {
      orderCount: countAll(),
      revenue: sum(field("amount"))
    }
  )
  .sort(field("revenue").descending());
# Python equivalent
from google.cloud.firestore_v1.pipeline_expressions import Field
from google.cloud.firestore_v1.pipeline_stages import Accumulator

pipeline = (
    db.pipeline()
    .collection("orders")
    .group(
        groups=["category"],
        accumulators=[
            Accumulator.count_all().alias("orderCount"),
            Accumulator.sum(Field.of("amount")).alias("revenue"),
        ]
    )
    .sort(Field.of("revenue").descending())
)

Unnesting Arrays

One of the most-requested capabilities missing from Standard edition was the ability to work with array fields inside a query. The unnest() stage explodes an array field into individual rows, one per element, allowing per-element filtering and aggregation.

Finding Trending Hashtags

Consider a recipe app where each document has a tags array. To find the most-used tags across all recipes, you'd previously have had to maintain a separate aggregation collection. With pipelines:

import { countAll, field } from "firebase/firestore";

const pipeline = db.pipeline()
  .collection("recipes")
  .unnest(field("tags").as("tag"))           // Explode the tags array
  .aggregate({
    accumulators: [countAll().as("tagCount")],
    groups: ["tag"]
  })
  .sort(field("tagCount").descending())
  .limit(10);

const snapshot = await pipeline.execute();
snapshot.forEach(doc => {
  const { tag, tagCount } = doc.data();
  console.log(`#${tag}: ${tagCount} recipes`);
});

Filtering on Array Element Values

After unnesting, you can filter on the unnested values just like any other field:

// Find recipes that contain at least one ingredient flagged as an allergen
const pipeline = db.pipeline()
  .collection("recipes")
  .unnest(field("ingredients").as("ingredient"))
  .where(field("ingredient.allergen").equal(constant(true)))
  .group(
    { recipeId: field("__name__"), title: field("title") },
    { allergenCount: countAll() }
  )
  .where(field("allergenCount").greaterThan(constant(0)));

String Operations

Pipelines expose full string matching capabilities, including substring search and regular expressions — both previously impossible without application-layer processing.

Substring Search

import { field, constant } from "firebase/firestore";

// Find products whose names contain "wireless" (case-insensitive)
const pipeline = db.pipeline()
  .collection("products")
  .where(
    field("name").lower().contains(constant("wireless"))
  );

Regex Matching

import { field, constant, regexMatch } from "firebase/firestore";

// Find users whose emails are from a corporate domain
const pipeline = db.pipeline()
  .collection("users")
  .where(
    regexMatch(field("email"), constant("^[^@]+@(acme|globex)\\.com$"))
  );
# Python equivalent
from google.cloud.firestore_v1.pipeline_expressions import Field, Function

pipeline = (
    db.pipeline()
    .collection("users")
    .where(
        Function.regex_match(Field.of("email"), "^[^@]+@(acme|globex)\\.com$")
    )
)

Chaining Stages: A Real-World Example

The real power of pipelines comes from chaining stages together. Here is a complete example: a leaderboard query for a game app that computes player stats, filters by tier, and returns a ranked top-ten.

import { field, constant, sum, avg, countAll } from "firebase/firestore";

const pipeline = db.pipeline()
  .collection("gameEvents")
  .where(field("eventType").equal(constant("match_completed")))
  .group(
    { playerId: field("playerId"), username: field("username") },
    {
      totalWins:   sum(field("won")),        // won is 0 or 1
      totalGames:  countAll(),
      avgScore:    avg(field("score"))
    }
  )
  // Compute win rate as a projected field
  .select(
    "playerId",
    "username",
    "totalWins",
    "totalGames",
    "avgScore",
    field("totalWins").divide(field("totalGames")).as("winRate")
  )
  // Only include players who've played at least 10 games
  .where(field("totalGames").greaterThanOrEqual(constant(10)))
  .sort(field("winRate").descending(), field("avgScore").descending())
  .limit(10);

const snapshot = await pipeline.execute();

Notice the filter on totalGames coming after the aggregation. This kind of post-aggregation filtering — equivalent to HAVING in SQL — was completely impossible in Standard edition.
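This aggregate-then-filter flow can be illustrated in plain Python (a toy sketch of the group-and-filter steps, not the SDK):

```python
from collections import defaultdict

# Toy match events, mirroring the shape used in the leaderboard example.
events = [
    {"playerId": "p1", "won": 1, "score": 90},
    {"playerId": "p1", "won": 0, "score": 70},
    {"playerId": "p2", "won": 1, "score": 50},
]

# The group() stage: bucket by player and accumulate.
stats = defaultdict(lambda: {"totalWins": 0, "totalGames": 0})
for e in events:
    s = stats[e["playerId"]]
    s["totalWins"] += e["won"]
    s["totalGames"] += 1

# The HAVING-style where() stage: filter on aggregated values.
# Only players with at least 2 games survive.
qualified = {p: s for p, s in stats.items() if s["totalGames"] >= 2}
# qualified == {"p1": {"totalWins": 1, "totalGames": 2}}
```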

Sampling Documents

Enterprise edition also adds a sample() stage, useful for analytics, ML training data preparation, or testing on a subset of production data:

// Return 100 random documents from the collection
const pipeline = db.pipeline()
  .collection("events")
  .sample(100);

// Or sample by percentage (each document has a 10% chance of being returned)
const sampledPipeline = db.pipeline()
  .database()
  .sample({ percent: 0.1 });

Query Explain: Diagnosing Performance

Because the Enterprise edition uses optional indexing, understanding whether your query is hitting an index or falling back to a full collection scan is critical. The explain() API surfaces this information.

Analyzing a Query Without Executing It

// Get the query plan only (dry run)
const explainResult = await pipeline.explain({ analyze: false });
console.log(explainResult.stats);
// → { resultsReturned: 0, executionDuration: null, indexesUsed: [...] }

Running with Full Profiling

// Execute the query and capture execution metrics
const snapshot = await pipeline.explain({ analyze: true });
console.log(snapshot.explainMetrics);
// → {
//     planSummary: { indexesUsed: [{ properties: "category ASC, revenue ASC" }] },
//     executionStats: { resultsReturned: 42, executionDuration: "0.05s", readOperations: 42 }
//   }

If indexesUsed is empty, your query is doing a collection scan. That's fine for small collections or development, but for production workloads you'll want to create a supporting index.

Creating an Index

Once Query Explain tells you which fields a query is filtering and sorting on, create the supporting index via the Firebase console, the CLI, or the Firestore API. For Pipeline operations, the recommended index field order is:

  1. Equality filter fields (in any order)
  2. Sort fields (in the same order as your sort() stage)
  3. Range/inequality filter fields
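Following that ordering, an index definition in firestore.indexes.json for the earlier cities query (range filter on population, sort on name) might look like this. The collection and field names come from that example; adjust them to your schema:

```json
{
  "indexes": [
    {
      "collectionGroup": "cities",
      "queryScope": "COLLECTION",
      "fields": [
        { "fieldPath": "name", "order": "ASCENDING" },
        { "fieldPath": "population", "order": "ASCENDING" }
      ]
    }
  ]
}
```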
# Using Firebase CLI
firebase firestore:indexes
# Add index definition to firestore.indexes.json and deploy
firebase deploy --only firestore:indexes

Input Stages: More Than Just Collections

Most queries start with .collection(), but pipelines also support two additional input modes:

// Query all documents in the entire database
const pipeline = db.pipeline()
  .database()
  .where(field("status").equal(constant("flagged")));

// Query a specific set of document references
const pipeline = db.pipeline()
  .documents([
    db.collection("users").doc("user_1"),
    db.collection("users").doc("user_2"),
    db.collection("users").doc("user_3")
  ])
  .select("displayName", "email");

The database() input is powerful for cross-collection analytics but will be slow without appropriate indexing. Use it carefully in production.

Known Limitations in Preview

As of early 2026, Pipeline operations are in Preview status. A few constraints to be aware of:

No real-time listeners. The onSnapshot() API is not yet supported for pipeline queries. If you need live updates, you must use Standard edition Core operations for those use cases. Mixing pipelines (for complex one-time reads) and Core operations (for real-time sync) in the same application is a valid architecture.

No local emulator support. The Firestore emulator does not yet execute pipeline queries, which complicates local testing. One workaround is to maintain a separate test Enterprise database in the cloud and run integration tests against it.

Array-contains and vector search indexes are not yet supported. If your pipeline uses array_contains or find_nearest expressions, Firestore will fall back to ascending/descending indexes as a substitute. Performance may lag for these specific operations compared to Standard edition equivalents. Google has flagged this as an area being actively improved.

Migration Notes

If you're moving data from an existing Standard edition database:

  1. Export your Standard database to Cloud Storage using Firestore's managed export feature.
  2. Import the exported data into your new Enterprise database.
  3. Recreate security rules and indexes manually — they do not transfer automatically.
  4. Update client code to point at the new database ID.
  5. Use pipelines for new query patterns; Core operations remain available for real-time sync.
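For steps 1 and 2, the managed export and import can be driven from the gcloud CLI. The bucket path and database IDs below are placeholders; substitute your own:

```shell
# Step 1: export the Standard edition database to a Cloud Storage bucket
gcloud firestore export gs://your-bucket/firestore-export \
    --database='(default)'

# Step 2: import that export into the new Enterprise database
gcloud firestore import gs://your-bucket/firestore-export \
    --database='your-enterprise-db-id'
```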

Summary

Firestore Pipeline operations give developers a genuinely expressive server-side query language — one that handles aggregations, array unnesting, string matching, computed fields, and post-aggregation filtering without pushing data into application code first. The key steps to get started are: create an Enterprise edition database, update your SDKs, connect using the Enterprise database ID, and begin chaining .pipeline() stages. Use Query Explain early to understand your index needs, and keep an eye on the preview limitations, particularly around real-time listeners, before committing to a full migration.
