This tutorial was written by James Shisiah.
MongoDB is a document-oriented NoSQL database with a flexible schema designed to store, query, and manipulate large volumes of data. For PHP development, we have the official MongoDB PHP driver, which consists of a PHP extension as well as a high-level library (mongodb/mongodb) that exposes a clean API for interacting with collections, documents, and advanced database features, including full-text and semantic search on MongoDB Atlas.
With the library, you get tools for standard database operations like inserts, updates, and deletes. Additionally, it supports bulk write operations via the MongoDB BulkWrite API. MongoDB BulkWrite is a command that allows an application to send a mixed list of write operations to a MongoDB server in a single network request. Write operations include insert, update, replace, and delete commands.
In this guide, we will explore the MongoDB BulkWrite API and its performance improvements, and we’ll demonstrate how to handle mixed bulk operations with sample code using the PHP library.
Prerequisites
- PHP installed (>= 8.1 required)
- MongoDB database server (Atlas for easy setup)—or locally if preferred, via the provided docker-compose.yml file
- PIE (a modern PHP installer for extensions)—used to install the PHP MongoDB extension
- Composer for managing PHP dependencies, used to install mongodb/mongodb
Why use BulkWrite for batch operations?
Many real-world applications require inserting, updating, or deleting large numbers of documents at once. Examples include:
- Bulk importing thousands of records, e.g., from CSV or log files.
- Updating product prices in bulk.
- Syncing customer data from external systems.
- Cleaning old or inactive entries.
- Data migrations and mass updates.
Using individual write operations (insertOne, updateOne, etc.) for each record incurs performance penalties because every write is sent separately over the network.
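To make that cost concrete, here is a minimal sketch of the per-document approach that BulkWrite replaces (assuming $collection is a MongoDB\Collection and $rows holds the records to write):
foreach ($rows as $row) {
    // Each insertOne() call is a separate network round-trip to the server
    $collection->insertOne($row);
}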
How MongoDB’s BulkWrite API solves this problem
- Improved performance: Multiple operations are grouped into a single batch and sent together, reducing the time spent on the network round-trip.
- Fewer network calls: Instead of 1,000 operations, which would make 1,000 requests, BulkWrite can send them all in a single request.
- Ideal for imports, updates, and data cleanups: Whether you're inserting new data, updating existing records, or performing deletions, BulkWrite can handle all operations efficiently and atomically at the document level.
Comparing old vs. new BulkWrite APIs in PHP
Before proceeding with the tutorial, let’s look into the different APIs that MongoDB provides for working with large bulk write operations. The BulkWrite API performs bulk write operations at two levels. The old API performs write operations at the collection level, whereas the new API performs them at the MongoDB cluster (client) level.
Old BulkWrite API (collection level)
At the collection level, you can perform bulk write operations on a single collection. The old MongoDB bulk write implementation provides two related APIs:
- MongoDB\Driver\Manager::executeBulkWrite()
- MongoDB\Collection::bulkWrite()
Collection::bulkWrite() is a convenience wrapper provided by the high-level PHP library. Internally, it prepares a MongoDB\Driver\BulkWrite object and executes it via Manager::executeBulkWrite(), automatically supplying the correct namespace. In other words, Collection::bulkWrite() hands a namespace plus write operations to Manager::executeBulkWrite(), making your code more readable.
The Collection::bulkWrite() method makes a database call for each type of write operation. It can perform multiple insert operations in one call, but it makes two separate calls to the database for an insert operation and a replace operation. This is efficient when a bulk operation contains a single type of write (e.g., inserting thousands of rows from a CSV file), but if you combine inserts and updates in one ops array, it will make a separate call to the database for each type. Let’s explore sample code snippets with the old bulk write APIs.
Sample code with Manager::executeBulkWrite()
// $mongoDBuri holds your MongoDB connection string
$bulk = new MongoDB\Driver\BulkWrite();

// Queue one operation of each type
$bulk->insert(['name' => 'Alice']);
$bulk->update(['email' => 'john@example.com'], ['$set' => ['active' => true]]);
$bulk->delete(['status' => 'inactive']);

$manager = new MongoDB\Driver\Manager($mongoDBuri);
$result = $manager->executeBulkWrite('bulkwritedb.users', $bulk);
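The legacy call returns a MongoDB\Driver\WriteResult. As a quick sketch of reading its summary counts (the values are non-null for acknowledged writes):
printf(
    "Inserted: %d, Modified: %d, Deleted: %d\n",
    $result->getInsertedCount(),
    $result->getModifiedCount(),
    $result->getDeletedCount()
);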
Sample code with Collection::bulkWrite()
$client = new MongoDB\Client($mongoDBuri);
$database = $client->selectDatabase('bulkwritedb');
$collection = $database->selectCollection('users');
$operations = [
    ['insertOne' => [['name' => 'Alice', 'age' => 25]]],
    ['insertOne' => [['name' => 'Bob', 'age' => 30]]],
    ['insertOne' => [['name' => 'Charlie', 'age' => 28]]],
];
$result = $collection->bulkWrite($operations);
In this example, Collection::bulkWrite() internally provides the namespace bulkwritedb.users to Manager::executeBulkWrite(), which then makes the database call.
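To illustrate the per-type behavior described earlier, here is a sketch that mixes operation types in one ops array; the legacy API accepts it, but the driver groups the operations by type and issues a separate server command for each group:
$operations = [
    ['insertOne' => [['name' => 'Dave', 'age' => 41]]],
    ['updateOne' => [['name' => 'Alice'], ['$set' => ['age' => 26]]]],
    ['deleteOne' => [['name' => 'Bob']]],
];
// One library call, but the insert, update, and delete become separate server commands
$result = $collection->bulkWrite($operations);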
Keep in mind the following points when working with the old bulk write APIs:
- Operates on a single namespace
- Returns a single write result object
- Can fail if the server’s response exceeds the maximum BSON document size
- Well-suited for small to medium batch operations
New BulkWrite API (cluster/client level)
The new MongoDB BulkWrite API makes calls at the client level, allowing you to perform operations on multiple collections and databases in the same cluster. The new bulkWrite API is exposed via the MongoDB\Client::bulkWrite() method and works with MongoDB Server version 8.0 or later. If you are still running an older MongoDB version, this is your sign to upgrade. For instance, bulkWrite operations on MongoDB 8.0 can run up to 56% faster than bulkWrite operations on MongoDB 7.0. The new API performs all write operations (inserts, updates, replaces, and deletes) in one database call, unlike the old API, which makes different calls for each type of write operation. It takes the following execution path:
- Creates a MongoDB\Driver\BulkWriteCommand instance
- Executes it via Manager::executeBulkWriteCommand()
- Returns results using a cursor (the old API does not return results using a cursor)
By returning results through a cursor, MongoDB can stream responses incrementally rather than sending a single, potentially large BSON document. This works around the size limitation of the old BulkWrite API, so much larger bulk operations can be handled safely, and your application gains resilience during large imports and migrations.
Sample code with MongoDB\Client::bulkWrite()
$client = new MongoDB\Client($mongoDBuri);
// Create the ClientBulkWrite for your 'users' collection
$usersCollection = $client->bulkwritedb->users;
$bulkWrite = MongoDB\ClientBulkWrite::createWithCollection($usersCollection);
// Add your operations
$bulkWrite->insertOne(['name' => 'Alice', 'age' => 25]);
$bulkWrite->insertOne(['name' => 'Bob', 'age' => 30]);
$bulkWrite->insertOne(['name' => 'Charlie', 'age' => 28]);
// Perform the bulk write operation
$result = $client->bulkWrite($bulkWrite);
The sample code above demonstrates how to create a ClientBulkWrite instance from a MongoDB\Collection instance by using the createWithCollection() method.
In this guide, we will be working with the new MongoDB bulkWrite API. You will learn how to write to multiple collections by calling the withCollection() method on your ClientBulkWrite instance before executing MongoDB\Client::bulkWrite(). For modern, scalable applications, especially those writing to multiple collections, we recommend MongoDB\Client::bulkWrite(), since it allows you to iterate through its results reliably.
1. Project setup
Before we start working with MongoDB’s BulkWrite operations in PHP, let’s set up a sample project environment. We have avoided working with frameworks so you can focus purely on understanding how MongoDB’s PHP extension and the high-level library work. The code for this project can be retrieved from this GitHub repository, which you can clone with the command:
git clone git@github.com:jaymoh/mongodb-bulkwrite-php.git
Create the project folder
Choose a location on your machine and create a new directory, e.g.:
mongodb-bulkwrite-php/
Inside it, we will have a few files, such as:
mongodb-bulkwrite-php/
├── composer.json
├── bulk_write_demo.php
└── customers_10000.csv
You will add more demo files as needed during the tutorial.
Install the MongoDB PHP extension if it is not already installed
The MongoDB PHP extension (mongodb.so) must be installed before the library can work.
Check if it is enabled using the command:
php -m | grep mongodb
Running the command should display mongodb in your terminal. If nothing appears, install the extension by following the official MongoDB PHP setup documentation or the PHP manual. For a quick setup, you can install and enable it using PIE with the following command:
pie install mongodb/mongodb-extension
Install the MongoDB high-level PHP library via Composer
The preferred method for installing MongoDB’s PHP library is via Composer. Install it with the command:
composer require mongodb/mongodb
Since we will be working with a MongoDB connection string, and we do not want it pushed to the git repositories, we will save it in an environment variable (.env) file. This file is then added to .gitignore so it’s ignored when pushing your code. Any dotenv library (symfony/dotenv, vlucas/phpdotenv, etc.) can be used to read environment variables. For this guide, we will be using the vlucas/phpdotenv library. You can install it into your project directory with the command:
composer require vlucas/phpdotenv
That is it for this step. Let’s establish a connection to MongoDB in the next step.
2. Connect to MongoDB
Now that the project is set up, the next step is to connect your PHP script to a MongoDB database. For the simplicity of this tutorial, we will work with a free MongoDB Atlas cluster, since it works online, requires no local installation, and gives you a clean connection URI.
However, if you prefer running MongoDB locally (e.g., via mongod on your machine) or via the MongoDB Atlas Docker image, you can also use a local URI; the steps are the same. We have included a docker-compose.yml file that you can easily use to run the MongoDB Atlas Docker image locally.
Get your connection string
Visit MongoDB Atlas and register for a free Atlas account. The Get Started With MongoDB Atlas docs outline clear steps you can follow to create a free cluster, set up user credentials, and add allowed IP addresses (here, you should include your current IP or add 0.0.0.0/0 to allow access from anywhere, for development purposes only).
When you create your cluster for the first time, a setup wizard guides you through a quick configuration.
We named our cluster BulkWrite for the sake of this tutorial. In the next step of the connect flow, under Choose a connection method, a connection string is displayed. You can choose your programming language (PHP) to view the code sample.
A connection string looks like this:
mongodb+srv://jamesshisiah_db_user:<db_password>@bulkwrite.vcdi5mk.mongodb.net/?appName=BulkWrite
Add the connection string to your project’s .env file
Copy your connection string, then create an .env file in your project folder. Add the following variables, replacing the MONGODB_URI value with your connection string:
MONGODB_URI=mongodb+srv://jamesshisiah_db_user:<db_password>@bulkwrite.vcdi5mk.mongodb.net/?appName=BulkWrite
MONGODB_DB=bulkwritedb
Next, create a PHP file to test the connection. We can call it bulk_write_connect.php. Add the following code snippet:
<?php
require 'vendor/autoload.php';

use MongoDB\Client;
use Dotenv\Dotenv;

$dotenv = Dotenv::createImmutable(__DIR__);
$dotenv->load();

$uri = $_ENV['MONGODB_URI'] ?? null; // or paste '<your_mongodb_connection_string_here>'

// Connect to MongoDB running on MongoDB Atlas
$client = new Client($uri);

// Test the connection by listing databases
$databases = $client->listDatabases();

// If the above line does not throw an exception, the connection is successful
echo "Successfully connected to MongoDB!\n";
Run the script:
php bulk_write_connect.php
If everything is set up correctly, you should see:
Successfully connected to MongoDB!
If, for some reason, you did not see the success message or there was an exception thrown in your code, consider checking out the Connection Troubleshooting page for potential solutions to issues that you might encounter when working with the MongoDB PHP library.
With your connection set up, you're now ready to start working with MongoDB’s new BulkWrite API operations.
3. Prepare BulkWrite operations
After confirming that you can connect to a MongoDB database, let's write some PHP code to see how MongoDB’s BulkWrite feature works. BulkWrite allows you to combine multiple write operations, such as inserts, updates, and deletes, into a single batch request that can be executed on multiple collections or multiple databases in the same cluster.
Create BulkWriteCommand instance using MongoDB\ClientBulkWrite
Given that it works at the client level, the first step in your code is to create a client connection instance to the MongoDB server:
use MongoDB\Client;
$client = new Client("your-mongodb-uri-here");
Next, use the MongoDB\ClientBulkWrite builder class to create a BulkWriteCommand instance that specifies the write operations to perform. To do this, call the createWithCollection() method and pass in the first collection you wish to target. For example:
$usersCollection = $client->bulkwritedb->users;
$bulkWrite = MongoDB\ClientBulkWrite::createWithCollection($usersCollection);
Then, call one or more of the following write methods on the ClientBulkWrite instance to construct the bulk write operation:
- deleteOne()
- deleteMany()
- insertOne()
- replaceOne()
- updateOne()
- updateMany()
For example, here is how you can insert into the users collection using the created ClientBulkWrite instance:
$bulkWrite->insertOne(['name' => 'Bob', 'age' => 30]);
You can use the same $bulkWrite instance to perform an update operation:
$bulkWrite->updateOne(
    ['name' => 'John Who'],
    ['$set' => ['age' => 35]],
    ['upsert' => true],
);
Switch to a different collection using withCollection()
Crucially, you can change the target collection for subsequent bulk operations by calling the withCollection() method on an existing ClientBulkWrite instance before making the database call. For example, you can insert into a customers collection in another database (e.g., my_shop_db) using the code:
$customersCollection = $client->my_shop_db->customers;
$bulkWrite = $bulkWrite->withCollection($customersCollection);
$bulkWrite->insertOne(['name' => 'Noella Gray', 'orders' => 10]);
You can switch to yet another collection and perform a deletion. For example, suppose one of your users (user_id_10) wants all their data erased from your app’s database:
$logsCollection = $client->my_app->user_logs;
$bulkWrite = $bulkWrite->withCollection($logsCollection);
$bulkWrite->deleteMany(['user_id' => 'user_id_10']);
When we finally execute the database call, the above line will remove all documents from the user_logs collection where the user_id field is equal to “user_id_10.”
Perform the bulk write operation
Once you have chained all your write operations (inserts, updates, deletes), you can then perform the bulk write operation with the code:
$result = $client->bulkWrite($bulkWrite);
The MongoDB bulk write API sends all the operations, across different collections and databases, in a single call, which is significantly faster than sending them separately. Additionally, the fluent API is cleaner, offering a good developer experience.
Example script with all bulkWrite operations
Let’s say you have two application databases: one for your restaurant app and another for your e-commerce website. Both databases are in one cluster, so you can connect to them using a single client instance, and each database has different collections. In the GitHub repository, you will find a bulk_write_demo.php script. Here is a code snippet extracted from the script:
// Start with e-commerce database
$ecommerceDb = $client->ecommerce;
$customersCollection = $ecommerceDb->customers;
$ordersCollection = $ecommerceDb->orders;
// Restaurant database
$restaurantDb = $client->restaurant;
$menusCollection = $restaurantDb->menus;
// Create bulk write starting with customers
$bulkWrite = ClientBulkWrite::createWithCollection($customersCollection);
// Add customer operations
$bulkWrite->insertOne(['name' => 'Alice', 'email' => 'alice@example.com']);
$bulkWrite->updateOne(['name' => 'Alice'], ['$set' => ['status' => 'premium']]);
// Switch to orders collection (same database)
$bulkWrite = $bulkWrite->withCollection($ordersCollection);
$bulkWrite->insertOne(['customer' => 'Alice', 'total' => 99.99]);
// Switch to menus collection (different database!)
$bulkWrite = $bulkWrite->withCollection($menusCollection);
$bulkWrite->insertOne(['name' => 'Pizza', 'price' => 12.99]);
$bulkWrite->deleteMany(['available' => false]);
// Execute ALL operations in a single request and get a result object
$result = $client->bulkWrite($bulkWrite);
The script puts together the concepts we have covered in this section, including:
- Cross-database operations: E-commerce DB and Restaurant DB in a single batch database call.
- Collection switching: Using withCollection() to switch between collections.
- All write operations: insertOne, updateOne, updateMany, replaceOne, deleteOne, and deleteMany.
- Options: Demonstrates upsert option with replaceOne(). Upsert means "update or insert." When you run an update/replace with ['upsert' => true], MongoDB will update matching documents; if none match, it will insert a new document instead.
- ID capture: Shows how to capture generated _id values from insertOne(). The second parameter of insertOne() takes a variable by reference and assigns the inserted document’s _id to it. For example, $bobId will hold the generated _id (see the sketch after this list):
$bulkWrite->insertOne(['name' => 'Bob Smith'], $bobId);
- Result reporting: Displays counts for all operation types (more on this in the next step).
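Here is a short, hedged sketch of the ID-capture point from the list above (assuming $client and $bulkWrite are set up as in the demo script):
// Pass a variable by reference as the second argument to capture the _id
$bulkWrite->insertOne(['name' => 'Bob Smith'], $bobId);

$result = $client->bulkWrite($bulkWrite);

// $bobId now holds the generated MongoDB\BSON\ObjectId for the new document
echo "Bob's _id: " . $bobId . "\n";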
4. Handle BulkWrite results
After running a bulk operation, MongoDB returns a MongoDB\BulkWriteCommandResult object. A significant improvement in the new BulkWrite API (Client::bulkWrite()) is its ability to return results using a cursor, rather than a single large BSON document as in the previous API. This change allows you to handle much larger bulk write responses, removing the risk of failure if the response exceeds the maximum BSON document size (16MB), an issue that could occur with the older API (Collection::bulkWrite()).
Understanding how to read these values is essential for logging, debugging, and validating large data-processing operations. The new BulkWrite API returns two types of results: summary counts and cursor-based verbose results.
Summary counts (always available)
Summary counts are always returned and can be accessed from the result object using the following methods:
- getInsertedCount(): total documents inserted
- getMatchedCount(): total documents matched by filters
- getModifiedCount(): total documents modified
- getUpsertedCount(): total documents upserted
- getDeletedCount(): total documents deleted
- isAcknowledged(): whether the write was acknowledged
Note: isAcknowledged() returns true when using write concern w: 1 or higher (including majority). With w: 0, MongoDB does not wait for confirmation, and the summary count methods above return NULL. The default write concern, majority, ensures the write is durable and can survive a replica set primary failover.
Sample code showing how to get summary counts
$result = $client->bulkWrite($bulkWrite);
echo "=== Summary Counts ===\n";
echo "Inserted: " . $result->getInsertedCount() . "\n";
echo "Matched: " . $result->getMatchedCount() . "\n";
echo "Modified: " . $result->getModifiedCount() . "\n";
echo "Upserted: " . $result->getUpsertedCount() . "\n";
echo "Deleted: " . $result->getDeletedCount() . "\n";
echo "Acknowledged: " . ($result->isAcknowledged() ? 'Yes' : 'No') . "\n";
Cursor-based verbose results (new API advantage)
The new BulkWrite API supports returning results in batches using a cursor rather than only summary counts, so very large responses don’t hit BSON size limits. To receive detailed, per-operation results from the new BulkWrite API, you should set the verboseResults option to true when calling bulkWrite. You can enable verboseResults in the $options array parameter in the following ways.
- When creating the ClientBulkWrite instance:
$bulkWrite = ClientBulkWrite::createWithCollection($collection, [
    'ordered' => true,
    'verboseResults' => true // Enable detailed per-operation results
]);
- When calling the bulkWrite command with your bulk write operations on your ClientBulkWrite instance:
$result = $client->bulkWrite(
    $bulkWrite,
    ['verboseResults' => true]
);
With verboseResults enabled, the $result object exposes cursor-based results via the following methods:
- getInsertResults(): returns an array of detailed results for each insert operation (including insertedId)
- getUpdateResults(): returns an array of detailed results for each update (including matchedCount, modifiedCount, upsertedId)
- getDeleteResults(): returns an array of detailed results for each delete (including deletedCount)
Sample code showing how to loop and get the detailed results for each operation in your bulk write
$result = $client->bulkWrite($bulkWrite);
$insertResults = $result->getInsertResults();
if ($insertResults) {
    foreach ($insertResults as $index => $insertResult) {
        echo "Insert operation #$index:\n";
        echo "  Inserted ID: " . $insertResult->insertedId . "\n";
    }
}

$updateResults = $result->getUpdateResults();
if ($updateResults) {
    foreach ($updateResults as $index => $updateResult) {
        echo "Update operation #$index:\n";
        echo "  Matched: " . $updateResult->matchedCount . "\n";
        echo "  Modified: " . $updateResult->modifiedCount . "\n";
        if (isset($updateResult->upsertedId)) {
            echo "  Upserted ID: " . $updateResult->upsertedId . "\n";
        }
    }
}

$deleteResults = $result->getDeleteResults();
if ($deleteResults) {
    foreach ($deleteResults as $index => $deleteResult) {
        echo "Delete operation #$index:\n";
        echo "  Deleted: " . $deleteResult->deletedCount . "\n";
    }
}
Understanding WriteConcerns
WriteConcern controls how strictly MongoDB should confirm a write.
For example:
- "majority"—Wait for most nodes to acknowledge the write. This is the default WriteConcern. It ensures that write operations are acknowledged only after a majority of replica set members have written the data to their journals, providing strong guarantees.
- "1"—Only the primary node needs to confirm the write.
- "0"—Don’t wait for confirmation. Hence, this is faster. Your application is only notified that the primary server received the call, but it does not wait to get confirmation whether the write was successful. With (w: 0), the isAcknowledged() result method returns FALSE. The other summary count methods: getInsertedCount(), getModifiedCount(), and getDeletedCount() return NULL. Use only when performance is critical, and data loss is acceptable.
You can specify the WriteConcern option in the $options array parameter (just like verboseResults) of the bulkWrite command.
Example acknowledged write (w: ‘majority’)
use MongoDB\Driver\WriteConcern;
$writeConcern = new WriteConcern(WriteConcern::MAJORITY, 1000); // majority acknowledgement, 1 second timeout
$result = $client->bulkWrite(
    $bulkWrite,
    [
        'writeConcern' => $writeConcern,
        // You can also include other options, like 'verboseResults'
    ]
);
Note: When performing bulk imports or updates, use a higher write concern (majority) for critical data, and use lower write concerns for performance in non-critical writes. You can also use the wtimeout option (1000ms in the code snippet above) to specify how long to wait for acknowledgement before timing out.
Example unacknowledged write (w: 0)
$unacknowledgedWriteConcern = new WriteConcern(0); // w: 0
$unacknowledgedBulkWrite = ClientBulkWrite::createWithCollection($collection, [
    'writeConcern' => $unacknowledgedWriteConcern,
]);
$unacknowledgedBulkWrite->insertOne(['name' => 'Unacknowledged Product', 'price' => 5.99]);
$unackResult = $client->bulkWrite($unacknowledgedBulkWrite);
echo "Acknowledged: " . ($unackResult->isAcknowledged() ? 'Yes' : 'No') . "\n";
echo "Inserted Count: " . ($unackResult->getInsertedCount() ?? 'NULL (not available)') . "\n";
echo "\nNote: With w:0, result counts are unavailable - the server doesn't confirm the write.\n";
Error handling with try...catch
Bulk operations may fail due to various reasons, including:
- Invalid queries.
- Duplicate keys.
- Network timeouts.
- Schema validation rules.
- Write concern failures.
You should always wrap your bulk operations in a try...catch. For optimal debugging, specifically catch the MongoDB\Driver\Exception\BulkWriteCommandException. This tells you what went wrong while still providing a partial result via getPartialResult(). You can check specific failures using getWriteErrors() and identify any write concern issues using getWriteConcernErrors().
This code snippet shows how to read the bulkWrite error results. Please find the full source code from the GitHub repository, in the file called bulk_write_results.php:
<?php
require 'vendor/autoload.php';
use MongoDB\Client;
use MongoDB\ClientBulkWrite;
use Dotenv\Dotenv;
use MongoDB\Driver\Exception\BulkWriteCommandException;
use MongoDB\Driver\WriteConcern;
$dotenv = Dotenv::createImmutable(__DIR__);
$dotenv->load();
$uri = $_ENV['MONGODB_URI'] ?? null;
$writeConcern = new WriteConcern(WriteConcern::MAJORITY, 1000); // majority acknowledgement, 1 second timeout
try {
    $client = new Client($uri);
    $database = $client->tutorial;
    $collection = $database->bulk_results_demo;

    // Create ClientBulkWrite with verboseResults and custom WriteConcern
    $bulkWrite = ClientBulkWrite::createWithCollection($collection, [
        'ordered' => true,
        'verboseResults' => true, // Enable detailed per-operation results
        'writeConcern' => $writeConcern,
    ]);

    // Add as many operations as needed...
    $bulkWrite->insertOne(['name' => 'Product A', 'price' => 29.99, 'category' => 'electronics'], $id1);
    $bulkWrite->updateOne(
        ['name' => 'Product F'],
        ['$set' => ['name' => 'Product F', 'price' => 39.99, 'category' => 'clothing']],
        ['upsert' => true]
    );
    $bulkWrite->deleteOne(['name' => 'Product C']);

    // Execute the bulk write
    $result = $client->bulkWrite($bulkWrite);
} catch (BulkWriteCommandException $e) {
    // Get partial results (operations that succeeded before the error)
    $partialResult = $e->getPartialResult();
    if ($partialResult) {
        echo "\nPartial Results (before failure):\n";
        echo "  Inserted: " . $partialResult->getInsertedCount() . "\n";
        echo "  Modified: " . $partialResult->getModifiedCount() . "\n";
        echo "  Deleted: " . $partialResult->getDeletedCount() . "\n";
    }

    // Get write errors for specific operations that failed
    $writeErrors = $e->getWriteErrors();
    if ($writeErrors) {
        echo "\nWrite Errors:\n";
        foreach ($writeErrors as $index => $error) {
            echo "  Operation #$index: " . $error->getMessage() . "\n";
        }
    }

    // Get write concern errors if any
    $writeConcernErrors = $e->getWriteConcernErrors();
    if ($writeConcernErrors) {
        echo "\nWrite Concern Errors:\n";
        foreach ($writeConcernErrors as $index => $wcError) {
            echo "  Error #$index: " . $wcError->getMessage() . "\n";
        }
    }
    exit(1);
} catch (\Exception $e) {
    echo "Error: " . $e->getMessage() . "\n";
    exit(1);
}
You can run the script with the command:
php bulk_write_results.php
5. Realistic CSV import with BulkWrite
CSV imports are one of the most common real-world use cases for BulkWrite. Instead of inserting documents one at a time, you can group rows, say 5,000 at a time, into a single BulkWrite operation. While the new API can handle very large operations, batching is still recommended for:
- Memory efficiency (don't load entire CSV into memory).
- Network reliability (smaller retries on failure).
- Progress tracking and resumability.
- Server resource management.
Depending on your server resources and your application’s error tolerance, you can use higher batch sizes, since we are no longer limited by the 16MB BSON response document size. The bulk write logic also splits outgoing commands into multiple batches to stay below the maximum BSON message size. Essentially, you don’t have to worry about sizes as long as each individual document is under 16MB. Our code example uses batches of 5,000 rows.
In this example, you will learn how to use the new Client BulkWrite API to import data from multiple CSV files into different collections in a single bulk operation. We are using two CSV files of 10,000 rows each, both downloaded from Datablist. The CSV files are included in the repository.
The customers’ CSV has the following columns:
Index,Customer Id,First Name,Last Name,Company,City,Country,Phone 1,Phone 2,Email,Subscription Date,Website
The organizations’ CSV has the following columns:
Index,Organization Id,Name,Website,Country,Description,Founded,Industry,Number of employees
CSV row → MongoDB document mapping
A CSV row with these columns will become a document like:
[
    'customer_id' => 1001,
    'first_name' => 'Alice',
    'last_name' => 'Johnson',
    'email' => 'alice@example.com',
    'company' => '...',
    'city' => '...',
    'country' => '...',
    'phone_1' => '...',
    'phone_2' => '...',
    'subscription_date' => new MongoDB\BSON\UTCDateTime(...),
    'website' => '...',
]
You can either map all fields or only the ones your app needs. You can even add more fields, e.g., imported_at.
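As a sketch of that mapping, here is a hypothetical helper (the function name and the index-based column access are assumptions based on the header order shown above):
use MongoDB\BSON\UTCDateTime;

// Map one customers CSV row to a MongoDB document. Indices follow the header:
// 0=Index, 1=Customer Id, 2=First Name, 3=Last Name, 4=Company, 5=City,
// 6=Country, 7=Phone 1, 8=Phone 2, 9=Email, 10=Subscription Date, 11=Website
function mapCustomerRow(array $row): array
{
    return [
        'customer_id' => $row[1],
        'first_name' => $row[2],
        'last_name' => $row[3],
        'company' => $row[4],
        'city' => $row[5],
        'country' => $row[6],
        'phone_1' => $row[7],
        'phone_2' => $row[8],
        'email' => $row[9],
        // Convert the CSV date string to a BSON date (milliseconds since epoch)
        'subscription_date' => new UTCDateTime(strtotime($row[10]) * 1000),
        'website' => $row[11],
        'imported_at' => new UTCDateTime(),
    ];
}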
The organizations CSV is mapped like this:
[
    'organization_id' => '',
    'name' => '',
    'website' => '',
    'country' => '',
    'description' => '',
    'founded' => '',
    'industry' => '',
    'number_of_employees' => '',
    'imported_at' => new UTCDateTime(),
]
Full CSV import script
For brevity, we have not included the full import script in this guide, but you can access it from the repository, bulk_write_csv_import.php. Below is a code snippet from the script:
// Batch size configuration
$batchSize = 5000; // much larger than the legacy API's typical 500-1000; tune based on memory/network
// Open CSV files
$customersHandle = fopen('customers.csv', 'r');
$orgsHandle = fopen('organizations.csv', 'r');
// Skip headers
fgetcsv($customersHandle);
fgetcsv($orgsHandle);
$bulkWrite = null;
$operationCount = 0;
while (!feof($customersHandle) || !feof($orgsHandle)) {
    // Read and add customer
    if (!feof($customersHandle)) {
        $row = fgetcsv($customersHandle);
        if ($row) {
            $bulkWrite = $bulkWrite ?? ClientBulkWrite::createWithCollection($customersCollection, [
                'ordered' => false,
                'verboseResults' => false
            ]);
            $bulkWrite = $bulkWrite->withCollection($customersCollection);
            $bulkWrite->updateOne(
                ['customer_id' => $row[0]],
                ['$set' => ['name' => $row[1], 'email' => $row[2], 'imported_at' => new UTCDateTime()]],
                ['upsert' => true]
            );
            $operationCount++;
        }
    }

    // Read and add organization
    if (!feof($orgsHandle)) {
        $row = fgetcsv($orgsHandle);
        if ($row) {
            // Create the builder here too, in case the customers file is already exhausted
            $bulkWrite = $bulkWrite ?? ClientBulkWrite::createWithCollection($organizationsCollection, [
                'ordered' => false,
                'verboseResults' => false
            ]);
            $bulkWrite = $bulkWrite->withCollection($organizationsCollection);
            $bulkWrite->updateOne(
                ['org_id' => $row[0]],
                ['$set' => ['name' => $row[1], 'industry' => $row[2], 'imported_at' => new UTCDateTime()]],
                ['upsert' => true]
            );
            $operationCount++;
        }
    }

    // Execute batch when threshold reached
    if ($operationCount >= $batchSize) {
        $client->bulkWrite($bulkWrite);
        $bulkWrite = null;
        $operationCount = 0;
    }
}

// Execute remaining operations
if ($bulkWrite && $operationCount > 0) {
    $client->bulkWrite($bulkWrite);
}

// Close the file handles
fclose($customersHandle);
fclose($orgsHandle);
You can run the script from the repository with the command:
php bulk_write_csv_import.php
You should see a success message, “✓ Multi-collection CSV import completed successfully!”, in your terminal if everything went well.
You can also confirm the imported records in your MongoDB Atlas dashboard.
Why BulkWrite is ideal for CSV imports
- Faster imports: Instead of sending 10,000 inserts one by one, you send just two batches of 5,000.
- Fewer network calls: Each bulk write = one network round-trip.
- Order is preserved: Documents are inserted in the order you read them, unless you use the unordered flow.
- Consistent error handling: You can log failures per batch without stopping the entire import.
- Works well with validation rules: You still get detailed error messages from BulkWriteCommandException.
6. Best practices when working with MongoDB BulkWrite
To get optimum performance, reliability, and clarity from your bulk operations, follow these recommended best practices. These guidelines are commonly used in production systems that handle imports, data migrations, and large update operations.
1. Use unordered writes when order does not matter
MongoDB supports two bulk modes:
a. Ordered (default)
- Operations run in sequence.
- If one operation fails, the rest stop.
- Ideal when order is important (e.g., dependent updates).
b. Unordered
- MongoDB executes operations in parallel, in any order.
- Failures do not stop the entire batch.
- Much faster for large writes.
For example, we use unordered writes in the CSV import code sample, as shown in the sketch below.
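For reference, here is a minimal sketch of enabling unordered mode (assuming $collection is a MongoDB\Collection):
// Unordered: operations may execute in any order, and one failure
// does not stop the remaining operations in the batch
$bulkWrite = MongoDB\ClientBulkWrite::createWithCollection($collection, [
    'ordered' => false,
]);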
2. Verbose results
In the CSV import code, verboseResults is disabled. You can enable it if you are experiencing errors in your import and want to perform some debugging/auditing. Otherwise, for memory efficiency, keep it disabled for large imports.
3. Manage large batches (chunking strategy)
While BulkWrite can theoretically process thousands of operations, large batches can:
- Increase memory usage.
- Risk timeouts.
- Make error reporting harder.
Chunking helps you:
- Control memory usage.
- Handle partial successes.
- Ensure predictable performance.
- Build resumable import systems—in case of a failure, you can resume from the batch number that failed, skipping others.
Example: $batchSize = 5000; the sketch below shows the general pattern.
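A generic sketch of the chunking strategy (assuming $client is a MongoDB\Client, $collection a MongoDB\Collection, and $documents an iterable of documents; the CSV import script applies the same pattern to file streams):
$batchSize = 5000;
$bulkWrite = null;
$count = 0;

foreach ($documents as $doc) {
    $bulkWrite = $bulkWrite ?? MongoDB\ClientBulkWrite::createWithCollection($collection, ['ordered' => false]);
    $bulkWrite->insertOne($doc);

    if (++$count >= $batchSize) {
        $client->bulkWrite($bulkWrite); // flush this chunk
        $bulkWrite = null;              // start a fresh builder for the next chunk
        $count = 0;
    }
}

if ($bulkWrite !== null) {
    $client->bulkWrite($bulkWrite); // flush the remainder
}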
4. Log errors for debugging failed operations
BulkWrite exceptions provide detailed information about:
- Duplicate key errors.
- Validation failures.
- Network issues.
- WriteConcern failures.
- Operations that failed.
You should always wrap BulkWrite in a try...catch block and log failure details as seen in the example code blocks above, because:
- BulkWrite may succeed partially, and you need to know where it failed.
- Logging helps diagnose invalid data, duplicate keys, or corrupted CSV entries.
- In production pipelines, logs help users retry or fix only the failed items.
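A minimal logging sketch, assuming the BulkWriteCommandException handling shown earlier (error_log() writes to your configured PHP error log):
try {
    $client->bulkWrite($bulkWrite);
} catch (MongoDB\Driver\Exception\BulkWriteCommandException $e) {
    foreach ($e->getWriteErrors() as $index => $error) {
        // Record which operation failed and why, so you can retry or fix only those items
        error_log(sprintf('Bulk operation #%d failed: %s', $index, $error->getMessage()));
    }
}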
5. Working with transactions for atomicity
When you need atomic guarantees, where either all operations succeed or none are applied, wrap your bulk write in a transaction. The MongoDB\with_transaction() helper simplifies this by automatically retrying on transient errors. Here is a sample code snippet for working with transactions:
use MongoDB\Client;
use MongoDB\ClientBulkWrite;
use MongoDB\Driver\Session;
use function MongoDB\with_transaction;

$client = new Client($uri);

// Start a session for the transaction
$session = $client->startSession();

$customersCollection = $client->selectCollection('shop', 'customers');
$ordersCollection = $client->selectCollection('shop', 'orders');

// Use MongoDB\with_transaction() helper for automatic retry handling
with_transaction($session, function (Session $session) use ($client, $customersCollection, $ordersCollection) {
    // Create bulk write with the session
    $bulkWrite = ClientBulkWrite::createWithCollection($customersCollection, [
        'session' => $session,
        'ordered' => true
    ]);

    // Add operations that must all succeed together
    $bulkWrite->insertOne(['name' => 'Alice', 'email' => 'alice@example.com', 'balance' => 1000]);
    $bulkWrite->updateOne(
        ['name' => 'Bob'],
        ['$inc' => ['balance' => -500]]
    );

    // Switch to orders collection (same transaction)
    $bulkWrite = $bulkWrite->withCollection($ordersCollection);
    $bulkWrite->insertOne([
        'customer' => 'Bob',
        'recipient' => 'Alice',
        'amount' => 500,
        'type' => 'transfer'
    ]);

    // Execute all operations atomically
    $client->bulkWrite($bulkWrite);
});
Why use transactions with bulk writes?
- Atomicity: All operations commit together or roll back on failure.
- Automatic retries: The with_transaction() helper retries on transient errors (e.g., network issues), avoiding the need for manually implementing retry logic.
- Cross-collection consistency: Maintain data integrity across multiple collections.
When to use transactions
- Financial operations (transfers, payments)
- Related data that must stay consistent (e.g., inventory + orders)
- Multi-collection updates that depend on each other
- Any scenario where partial writes would leave data in an invalid state
Note: Multi-document transactions require a replica set or sharded cluster; they are not available on standalone MongoDB deployments, where only single-document writes are atomic. For bulk writes, if one operation fails, operations from the same bulk write that had already completed are rolled back only if you wrapped the bulk write in a transaction.
6. Additional best practices
- Optimize indexing: When performing bulk updates or deletes, an indexed filter greatly speeds up execution. Additionally, avoid over-indexing. While indexes improve query speed, they can hinder write operations and consume additional disk space. Regularly review and remove unused or unnecessary indexes.
- Validate data before adding to the batch: Catch “bad rows” early during CSV import rather than letting MongoDB reject them.
- Use upserts carefully: They help merge data, but ensure your match filter is accurate.
- Avoid mixing too many different operation types: Multiple inserts + updates + deletes are fine, but mixing extremely complex update operations may complicate debugging.
- Monitor your MongoDB logs: MongoDB will log slow bulk operations or write concern issues.
Conclusion
BulkWrite is one of the most powerful additions to the MongoDB PHP library. It proves useful when your application requires high-volume inserts, updates, and deletes. By combining multiple operations into a single request, you significantly reduce network round-trips and improve performance, especially when importing CSV data or large system logs, cleaning up large datasets, or syncing records.
In this tutorial, we explored:
- How to connect to MongoDB (Atlas or local).
- How to perform bulk inserts, updates, and deletes.
- How to combine multiple operations in a single batch.
- How to handle results and errors.
- How to build a realistic CSV importer using BulkWrite.
- Best practices for performance and reliability.
With this knowledge, you’re ready to build fast, scalable data-processing scripts in PHP using MongoDB.
Next steps
Here are helpful resources to deepen your understanding:
MongoDB client bulk write docs (official)
MongoDB PHP extension documentation
MongoDB error handling/WriteConcern docs