Introduction
Distributed transactions are a fundamental challenge in distributed systems. The rise of microservices architecture has amplified the importance of addressing this challenge effectively. Two-Phase Commit (2PC) is a well-established protocol designed to guarantee the reliable and consistent completion of transactions that involve multiple databases or systems. It remains a significant feature supported by many modern relational database management systems (RDBMS) such as Oracle and PostgreSQL. This post aims to demystify Two-Phase Commit (2PC) by implementing it using Node.js and PostgreSQL.
What is Two-Phase Commit (2PC)
Two-Phase Commit is a protocol for managing distributed transactions involving multiple databases or systems. It ensures data atomicity and consistency across multiple systems where a distributed transaction is necessary. It is primarily based on the RDBMS Prepared Transaction feature, which is supported by Oracle DBMS_TRANSACTION, PostgreSQL PREPARE TRANSACTION, and MS SQL Server Distributed Transaction Coordinator. 
2PC involves two distinct phases.
Prepare Phase
In this phase, a coordinator informs related services to prepare a transaction with relevant data. However, the transaction remains in the staging area of RDBMS without impacting the actual data. For PostgreSQL, it can be viewed in the pg_prepared_xacts table. If transaction preparation is successful related services inform the coordinator service with a YES response.
Decision Phase
This phase is also known as Commit/Rollback Phase. If all the related services for the transaction reply with affirmative. The coordinator service informs related services to commit the prepared transaction.
if any of the related services failed to prepare the transaction and reply with NO. then the coordinator informs other services to rollback the prepared transaction. Thus it maintains atomicity and consistency in the data for multiple databases spanning throughout the different services. 

Fig-1: Commit scenario for 2PC

Fig-2: Rollback scenario for 2PC
Preparing PostgreSQL for 2PC
In PostgreSQL, max_prepared_transactions parameter is used for controlling the number of prepared transactions that can be queued. Adjust it proportionally to your needs and resource availability in the database server. 
For example, you can set this parameter in the postgresql.conf file located in /etc/postgresql/<version>/main/postgresql.conf in Ubuntu for default apt based installation.
max_prepared_transactions = 50
PostgreSQL Syntax for 2PC
PostgreSQL supports two-phase commit (2PC) using the commands PREPARE TRANSACTION '<tx_name>' and COMMIT PREPARED '<tx_name>'. The transaction name acts as the anchor between the prepare and commit phases, ensuring transaction consistency across multiple nodes or systems. Below is an example demonstrating 2PC functionality in PostgreSQL, which can be tested in pgAdmin.
-- Table Creation
CREATE TABLE IF NOT EXISTS public.students (
    id SERIAL PRIMARY KEY, 
    name TEXT UNIQUE,
    cgpa NUMERIC
);
-- Begin the transaction
BEGIN;
-- Perform some operations
INSERT INTO students(name, cgpa) VALUES('Khan', 3.80);
-- Prepare the transaction for future commit/rollback
PREPARE TRANSACTION 'txn_001';
-- Either commit the transaction at a later point in time
COMMIT PREPARED 'txn_001';
-- Or rollback the transaction using tx_name
ROLLBACK PREPARED 'txn_001';
Implementation in Node.js
In this example, Express.js and node-postgres have been used for API development.
const express = require("express");
const { Pool } = require("pg");
const crypto = require("crypto");
const app = express();
app.use(express.json());
// PostgreSQL connection details
const pool = new Pool({
  user: "postgres",
  host: "localhost",
  database: "testdb",
  password: "1234",
  port: 5432, // Default PostgreSQL port
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});
/**
 * Health Check
 */
app.get("/health", (req, res) => {
  res.status(200).json({
    message: "Ok",
  });
});
/**
 * Prepare a Transaction for Commit
 */
app.post("/prepare", async (req, res) => {
  let result = null;
  let client = null;
  const { name, cgpa } = req.body;
  const transactionName = crypto.randomBytes(5).toString("hex");
  try {
    // Connect to the database
    client = await pool.connect();
    // Start the transaction
    await client.query("BEGIN");
    // Perform some operations
    result = await client.query(
      "INSERT INTO students (name, cgpa) VALUES ($1, $2) RETURNING id",
      [name, cgpa]
    );
    // Prepare the transaction
    await client.query(`PREPARE TRANSACTION '${transactionName}'`);
    console.log(`[INFO] Transaction '${transactionName}' prepared.`);
  } catch (error) {
    console.error("[ERROR] Error during transaction:", error.message);
    return res.status(500).json({
      message: error.message,
    });
  } finally {
    // Clean up and close the connection
    await client.release();
  }
  return res.status(201).json({
    id: result ? result.rows[0].id : null,
    txId: transactionName,
  });
});
/**
 * Commit the Transaction
 */
app.post("/commit", async (req, res) => {
  let client = null;
  const { txId } = req.body;
  try {
    // Connect to the database
    client = await pool.connect();
    // Commit the prepared transaction
    await client.query(`COMMIT PREPARED '${txId}'`);
    console.log(`[INFO] Transaction '${txId}' committed.`);
  } catch (error) {
    console.error("[ERROR] Error during transaction:", error.message);
    return res.status(500).json({
      message: error.message,
    });
  } finally {
    // Clean up and close the connection
    await client.release();
  }
  return res.status(200).json({
    txId: txId,
  });
});
/**
 * Rollback the Transaction
 */
app.post("/rollback", async (req, res) => {
  let client = null;
  const { txId } = req.body;
  try {
    // Connect to the database
    client = await pool.connect();
    // Commit the prepared transaction
    await client.query(`ROLLBACK PREPARED '${txId}'`);
    console.log(`[INFO] Transaction '${txId}' rollbacked.`);
  } catch (error) {
    console.error("[ERROR] Error during transaction:", error.message);
    return res.status(500).json({
      message: error.message,
    });
  } finally {
    // Clean up and close the connection
    await client.release();
  }
  return res.status(200).json({
    txId: txId,
  });
});
// Start the application
app.listen(3030, () => {
  console.log("[INFO] Server listening on port 3030");
});
- 
Prepare API: Prepare a transaction with insertoperation instudentstable. And returns the transaction name astxIdin response.
curl --location 'http://localhost:3030/prepare' \
--header 'Content-Type: application/json' \
--data '{
  "name": "Nahid Chowdhury",
  "cgpa": "3.9"
}'
- Commit API: Commit a staged transaction with transaction name (txId).
curl --location 'http://localhost:3030/commit' \
--header 'Content-Type: application/json' \
--data '{
    "txId": "7b8a1e6bc0"
}'
- Rollback API: Rollback a staged transaction with transaction name (txId).
curl --location 'http://localhost:3030/rollback' \
--header 'Content-Type: application/json' \
--data '{
    "txId": "7b8a1e6bc0"
}'
Github Repo: node-pg-2pc
Conclusion
Two-Phase Commit (2PC) in PostgreSQL is a critical protocol for ensuring data consistency and integrity in distributed systems. By coordinating transaction preparation and finalization across multiple participants, 2PC provides a robust mechanism to manage complex transactions spanning multiple databases or nodes. While its implementation requires careful consideration of performance and failure scenarios, it remains an invaluable tool for systems demanding strong consistency.
Thank you for reading! If you have any questions or suggestions, feel free to reach out.
 
 
              
 
    
Top comments (2)
As I understood correctly, in a real example, we have to have two instances of the database and two connections to them. Because 2pc doesn't make sense with one instance, and 2pc elaborated for a distributed system.
Secondly, "manager" or client has to implement a failure mechanism - something might go wrong, and we will have an open transaction with two different databases. Network never reliable 😊
This is a simplified example that demonstrates the code that needs to be implemented in a single service (i.e., Service A in Figure 2). I haven’t included the implementation of the coordinator or Service B here.
And yes, you’re right that the coordinator needs to implement a failover mechanism. However, since it does not connect to the database directly, it only needs to call the rollback API.