Written by Ibiyemi Adewakun✏️
A key part of a distributed system’s architecture is how data is transmitted between its different parts. Data transmission can be done using various methods, including HTTP requests, WebSockets, event emitters, and other protocols.
In this article, we will explore a crucial aspect of how data is transferred within distributed systems: the serialization of data. We will focus on one serialization protocol, Protobuf, and on serializing TypeScript objects using Protobuf messages.
Jump ahead:
- What is data serialization?
- What is Protobuf?
- Serializing data using Protobuf and TypeScript
- Building and serializing a Protobuf message from TypeScript objects
The full code for our demo project can be found in this GitHub repo. You can check it out as we get started to help you follow along with this tutorial.
What is data serialization?
Data serialization is a process that involves transforming a data object into a stream of bytes, facilitating easier storage and transmission.
Data deserialization is the opposite operation. It involves reconstructing the data object from the stream of bytes into a native structure. This allows the application receiving or retrieving the data object to understand it for read and write operations.
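To make the two operations concrete, here is a minimal sketch of a round trip from an object to bytes and back. It uses JSON as a familiar stand-in format rather than Protobuf, and the Contact interface and the serialize and deserialize helpers are hypothetical names chosen for illustration:

```typescript
// A minimal round trip: object -> bytes -> object.
// JSON is used here only as a familiar stand-in for a serialization format.
interface Contact {
  firstName: string;
  lastName: string;
}

// Serialize: turn the in-memory object into a stream of bytes.
function serialize(contact: Contact): Uint8Array {
  return new TextEncoder().encode(JSON.stringify(contact));
}

// Deserialize: rebuild a native object from those bytes.
function deserialize(bytes: Uint8Array): Contact {
  return JSON.parse(new TextDecoder().decode(bytes)) as Contact;
}

const original: Contact = { firstName: "Jane", lastName: "Doe" };
const bytes = serialize(original);
const restored = deserialize(bytes);

console.log(bytes.length);       // number of bytes that would be stored or sent
console.log(restored.firstName); // "Jane"
```

The receiving side only needs the bytes and knowledge of the format to rebuild an equivalent object.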
Serialized data can be stored in several formats, including XML, JSON, YAML, and Protobuf, which is our main focus in this article.
Common use cases of data serialization
Data serialization plays an essential role in storing and transferring data within distributed systems. Some common use cases include:
- Making requests to and receiving responses from REST APIs
- Storing in-memory data on disks or in databases
- Transporting data through messaging protocols like AMQP
- Putting items in a queue
- Sending event messages to a topic in a system like Kafka
Now that we understand data serialization and some of its common use cases, let’s introduce Protobuf.
What is Protobuf?
Protobuf is short for Protocol Buffers, a language- and platform-neutral mechanism developed by Google for serializing structured data. It allows for a structured way to define data schemas, making it easy to build and update your serialized data.
Unlike JSON and XML, Protobufs are not intended to be easily read by humans, as they use a binary format for data serialization. However, Protobuf encoding is smaller, faster, and simpler than JSON and XML.
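As a rough illustration of why the binary format tends to be compact, the sketch below hand-encodes a single integer field the way Protobuf's wire format does (a tag followed by a base-128 varint) and compares it with the JSON equivalent. The helper names encodeVarint and encodeVarintField are hypothetical, and this toy encoder only covers non-negative integers, so treat it as a sketch rather than a real encoder:

```typescript
// Encode a non-negative integer as a Protobuf base-128 varint.
function encodeVarint(value: number): number[] {
  const out: number[] = [];
  let v = value;
  while (v > 0x7f) {
    out.push((v & 0x7f) | 0x80); // low 7 bits, with the continuation bit set
    v = Math.floor(v / 128);
  }
  out.push(v); // final byte, continuation bit clear
  return out;
}

// Encode one varint-typed field: a tag followed by the value.
// The tag packs the field number and the wire type (0 = varint).
function encodeVarintField(fieldNumber: number, value: number): Uint8Array {
  const tag = fieldNumber * 8 + 0; // same as (fieldNumber << 3) | wireType
  return new Uint8Array([...encodeVarint(tag), ...encodeVarint(value)]);
}

// Field number 1 holding the value 150.
const protoBytes = encodeVarintField(1, 150);
// The same information as JSON text.
const jsonBytes = new TextEncoder().encode(JSON.stringify({ a: 150 }));

console.log(Array.from(protoBytes)); // [8, 150, 1], i.e. 0x08 0x96 0x01
console.log(protoBytes.length, jsonBytes.length); // 3 bytes versus 9 bytes
```

Field names never appear in the encoded bytes, only field numbers, which is a large part of the size difference and also why the result is not human-readable.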
Some pros of Protobufs include:
- Faster and smaller than most serialization encodings, like JSON or XML
- Handle schema changes well: fields are deprecated rather than removed outright, which helps older and newer versions of a message stay compatible
- Language- and platform-neutral, making them a good mechanism for transferring data between systems with different language implementations
- Support a wider range of data types than JSON, such as enums
Some cons of Protobufs include:
- Lack human readability
- Have limited support for some complex data types; maps, for example, only allow integral or string keys and cannot be repeated
- Place restrictions on changing structured data, making collaboration between multiple authors or teams somewhat challenging
Now, let’s take a look at using Protobufs and TypeScript for data serialization.
Serializing data using Protobuf and TypeScript
In this section, we will explore how to serialize and deserialize structured data in TypeScript using Protobufs.
TypeScript is a great option for Protobuf serialization because it’s strongly typed. This strict typing is a good match for Protobuf’s message structures and allows us to work with clearly defined data models and easier-to-maintain code that is less prone to runtime errors.
To follow along with this tutorial, you’ll need to install:
- Node.js v16 or newer
- A JavaScript package manager — I’ll use Yarn
- An IDE or text editor of your choice
Let’s jump right in.
Creating our TypeScript project
To understand how Protobuf serializes and deserializes to TypeScript, we will create a TypeScript project and model the data for serialization based on a phone book with contacts.
First, let’s create our project directory by typing the following into our terminal:
$ mkdir phonebook
$ cd phonebook
Inside the new phonebook project directory, we will run npm init to initialize the Node.js project with a package.json file:
$ npm init
Follow the prompts to fill in the metadata. Next, we will add TypeScript to our project using our package installer — in this case, Yarn:
$ yarn add -D typescript ts-node @types/node
Note that we’ve added a flag, -D, to our installation command. This flag tells Yarn to add these libraries as dev dependencies, meaning these libraries are only needed when the project is in development.
Now let’s configure TypeScript and create our tsconfig.json configuration file by running the following command:
$ npx tsc --init
Our project should be all set up now. Let’s move on to defining our data in TypeScript.
Defining our TypeScript data structure
For Protobuf to correctly serialize and deserialize our data objects, we must define the message shapes in a .proto format for Protobuf to use.
For us to correctly build our .proto files, we will create a sample data object. In this object, we can decide what attributes to include and how they are defined. This will guide what attributes and possible values we would like to support in our Protobuf messages.
To start, we’ll create a src directory where all our code will reside:
$ mkdir src
As discussed earlier, we want our app to be able to serialize data for a phone book containing multiple contacts. Let’s say each contact has some required basic information such as first name and last name, as well as some optional information like email and phone number.
We can also decide to support some optional, more complex attributes describing the contact’s relationship to the phone book owner.
With these attributes in mind, we’ll create a file to hold our data object:
// src/phoneBook.ts
export const myPhoneBook = {
  contacts: [
    {
      firstName: "Jane",
      lastName: "Doe",
      email: "jane.d@mymail.com",
      phoneNumber: "213-999-0876",
      address: {
        addressLine1: "111 Cherry Blossom Rd",
        city: "Plateau",
        state: "Philadelphia",
        country: "US",
        postCode: "90210"
      },
      socialPlatForms: [
        {
          platform: "WHATSAPP",
          profile: "2139990876",
          profileUrl: "https://api.whatsapp.com/+12139990876"
        }
      ],
      emergencyContact: {
        relationship: "FRIEND",
      },
      isBlocked: false,
      isFavorite: true,
      createdAt: "2021-03-04",
      updatedAt: "2023-10-10"
    },
    {
      firstName: "Hannah",
      lastName: "Buree",
      email: "hburee@nomail.com",
      phoneNumber: "390-123-7654",
      socialPlatForms: [
        {
          platform: "INSTAGRAM",
          profile: "h_buree"
        }
      ],
      isBlocked: false,
      isFavorite: false,
      createdAt: "2011-02-09",
      updatedAt: "2023-10-10"
    }
  ]
}
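Since the object above is untyped, it can help to write down its shape explicitly before translating it into Protobuf messages. The following is a sketch only: the interface and type names (PhoneBookData, ContactEntry, and so on) are hypothetical and not part of the demo project, and the string unions simply list the platform and relationship values this article works with:

```typescript
// Hypothetical types describing the sample phone book object.
type SocialPlatformName = "WHATSAPP" | "FACEBOOK" | "INSTAGRAM";
type Relationship = "BROTHER" | "MOTHER" | "SISTER" | "FATHER" | "FRIEND" | "COUSIN";

interface SocialPlatformEntry {
  platform: SocialPlatformName;
  profile: string;
  profileUrl?: string; // optional: not every platform has a URL
}

interface AddressEntry {
  addressLine1: string;
  city?: string;
  state?: string;
  country?: string;
  postCode?: string;
}

interface ContactEntry {
  firstName: string; // required basic information
  lastName: string;
  email?: string;    // optional information
  phoneNumber?: string;
  address?: AddressEntry;
  socialPlatForms: SocialPlatformEntry[];
  emergencyContact?: { relationship: Relationship };
  isBlocked: boolean;
  isFavorite: boolean;
  createdAt: string; // ISO date strings in the sample data
  updatedAt: string;
}

interface PhoneBookData {
  contacts: ContactEntry[];
}

// The second sample contact, now checked by the compiler.
const typedPhoneBook: PhoneBookData = {
  contacts: [
    {
      firstName: "Hannah",
      lastName: "Buree",
      email: "hburee@nomail.com",
      phoneNumber: "390-123-7654",
      socialPlatForms: [{ platform: "INSTAGRAM", profile: "h_buree" }],
      isBlocked: false,
      isFavorite: false,
      createdAt: "2011-02-09",
      updatedAt: "2023-10-10",
    },
  ],
};

console.log(typedPhoneBook.contacts[0].socialPlatForms[0].platform);
```

Marking which attributes are optional at this stage makes it easier to decide which Protobuf fields should carry the optional keyword.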
Now that we have a real-world object to reference, we can move on to defining the Protobuf messages.
Defining Protobuf messages for our data
In this section, we will define the .proto files containing the message structures for our data serialization and deserialization. These files define a data structure that is language-independent and statically typed.
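To start, let’s create a directory to contain our Protobuf messages in our src directory. The -p flag also creates the nested phonebook/v1 folders used by the file paths below:
$ mkdir -p src/proto/phonebook/v1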
Next, we’ll create our Protobuf message definitions:
// src/proto/phonebook/v1/phonebook.proto
syntax = "proto3";
package phonebook.v1;
import "google/protobuf/timestamp.proto";

message PhoneBook {
  repeated Contact contact = 1;
}

message Contact {
  string first_name = 1;
  string last_name = 2;
  optional string email = 3;
  optional string phone_number = 4;
  repeated SocialPlatform social_platforms = 5;
  optional EmergencyContactDetails emergency_contact = 6;
  optional Address address = 7;
  bool is_blocked = 8;
  bool is_favorite = 9;
  google.protobuf.Timestamp created_at = 10;
  google.protobuf.Timestamp updated_at = 11;
  optional google.protobuf.Timestamp deleted_at = 12;

  message SocialPlatform {
    SocialPlatformOptions platform = 1;
    string profile = 2;
    optional string profile_url = 3;
  }

  message EmergencyContactDetails {
    Relationships relationship = 1;
    optional string custom_label = 2;
  }

  message Address {
    string address_line_1 = 1;
    optional string address_line_2 = 2;
    optional string postal_code = 3;
    optional string city = 4;
    optional string state = 5;
    optional string country = 6;
  }

  enum Relationships {
    BROTHER = 0;
    MOTHER = 1;
    SISTER = 2;
    FATHER = 3;
    FRIEND = 4;
    COUSIN = 5;
  }

  enum SocialPlatformOptions {
    WHATSAPP = 0;
    FACEBOOK = 1;
    INSTAGRAM = 2;
  }
}
In our above Protobuf message definition, we’ve created a representation of the phone book data we want to handle. Let’s highlight a few important keywords and definitions in our Protobuf messages:
- optional: Used to specify that the message attribute is not required and can be used to serialize null or undefined attributes in our TypeScript object
- repeated: Used to define that an attribute contains a list or an array, such as:
  - Lists of primitive data types, like strings
  - More complex embedded messages, such as our PhoneBook message that has repeated contacts, indicating a list of contacts
- google.protobuf.Timestamp: A message type provided by Protobuf for serializing date and time data
- enum: Defines a set of predefined constants that can be selected for an attribute
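Our sample data stores dates as strings such as "2021-03-04", while google.protobuf.Timestamp represents a point in time as whole seconds since the Unix epoch plus a nanosecond remainder. The sketch below shows one way that conversion could look in plain TypeScript. The TimestampParts interface and both helper functions are hypothetical names for illustration and do not come from any Protobuf library:

```typescript
// Mirrors the two fields of google.protobuf.Timestamp:
// whole seconds since the Unix epoch plus a non-negative nanosecond remainder.
interface TimestampParts {
  seconds: number;
  nanos: number;
}

// Convert an ISO date string into seconds and nanos.
function toTimestampParts(isoDate: string): TimestampParts {
  const ms = Date.parse(isoDate);
  if (isNaN(ms)) {
    throw new Error("Invalid date: " + isoDate);
  }
  const seconds = Math.floor(ms / 1000);
  // Remainder in milliseconds is always 0..999, even for dates before 1970.
  const nanos = (ms - seconds * 1000) * 1e6;
  return { seconds, nanos };
}

// Convert back into a JavaScript Date (millisecond precision only).
function fromTimestampParts(ts: TimestampParts): Date {
  return new Date(ts.seconds * 1000 + Math.floor(ts.nanos / 1e6));
}

const createdAt = toTimestampParts("2021-03-04");
console.log(createdAt); // { seconds: 1614816000, nanos: 0 }
console.log(fromTimestampParts(createdAt).toISOString()); // "2021-03-04T00:00:00.000Z"
```

Date-only ISO strings are parsed as UTC midnight, so the result does not depend on the local time zone of the machine running the code.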
So far, we have defined our Protobuf messages and reviewed the keywords we’re using to define those messages. Next, let’s put these messages to work serializing and deserializing our data.
Compiling the TypeScript objects from our Protobuf definitions
In this section, we will turn our Protobuf messages into TypeScript interfaces we can use in our code.
First, we need to install protoc, the Protobuf compiler. This process depends on what OS you’re using. You can find all the available options listed in the gRPC protoc installation docs.
In this article, I’ll focus on the macOS installation process using Homebrew:
$ brew install protobuf
$ protoc --version # Ensure compiler version is 3+
At the time this article was written, Protobuf v3.21.x — the latest version available via Homebrew — had an issue when trying to generate JavaScript objects. As a workaround, I installed the slightly older v3.20.3 by running the following commands:
$ brew install protobuf@3
$ brew link --overwrite protobuf@3
Protobuf does not support TypeScript out of the box, though it does support JavaScript. For TypeScript support, we’ll need to install a plugin.
There are various plugins available that support TypeScript generation from Protobuf messages. In this article, we’ll be using the ts-protoc-gen npm package, which generates TypeScript declaration files to provide typings for the corresponding JavaScript objects.
To install this package, we’ll run the following command:
$ yarn add ts-protoc-gen
Then, to compile our .proto message into TypeScript code, we need to run the Protobuf compiler with the path to our Protobuf definitions. We’ll also need to specify some options as follows:
- plugin: Path to the TypeScript Protobuf compiler plugin
- ts_opt: TypeScript configuration options
- js_out: Path to the directory to write the generated JavaScript code
- ts_out: Path to the directory to write the generated TypeScript code (TypeScript declarations)
After filling all of these in, our command should look like so:
$ protoc \
--plugin="protoc-gen-ts=./node_modules/.bin/protoc-gen-ts" \
--ts_opt=esModuleInterop=true \
--js_out="./src/generated" \
--ts_out="./src/generated" \
src/proto/phonebook/v1/phonebook.proto
When we run the above command, we will get a TypeScript declaration file containing our messages as objects, along with class functions for retrieving attributes and deserializing objects from a stream of bytes.
We also get a JavaScript file, but this TypeScript declaration file gives us the type restrictions we love about TypeScript.
Here is a sample of our generated TypeScript Contact class and object:
// src/generated/src/proto/phonebook/v1/phonebook_pb.d.ts
import * as jspb from "google-protobuf";
import * as google_protobuf_timestamp_pb from "google-protobuf/google/protobuf/timestamp_pb";

export class PhoneBook extends jspb.Message {
  clearContactList(): void;
  getContactList(): Array<Contact>;
  setContactList(value: Array<Contact>): void;
  addContact(value?: Contact, index?: number): Contact;
  serializeBinary(): Uint8Array;
  toObject(includeInstance?: boolean): PhoneBook.AsObject;
  static toObject(includeInstance: boolean, msg: PhoneBook): PhoneBook.AsObject;
  static extensions: {[key: number]: jspb.ExtensionFieldInfo<jspb.Message>};
  static extensionsBinary: {[key: number]: jspb.ExtensionFieldBinaryInfo<jspb.Message>};
  static serializeBinaryToWriter(message: PhoneBook, writer: jspb.BinaryWriter): void;
  static deserializeBinary(bytes: Uint8Array): PhoneBook;
  static deserializeBinaryFromReader(message: PhoneBook, reader: jspb.BinaryReader): PhoneBook;
}

export namespace PhoneBook {
  export type AsObject = {
    contactList: Array<Contact.AsObject>,
  }
}

export class Contact extends jspb.Message {
  getFirstName(): string;
  setFirstName(value: string): void;
  getLastName(): string;
  setLastName(value: string): void;
  hasEmail(): boolean;
  clearEmail(): void;
  getEmail(): string;
  setEmail(value: string): void;
  ...

export namespace Contact {
  export type AsObject = {
    firstName: string,
    lastName: string,
    email: string,
    phoneNumber: string,
    socialPlatformsList: Array<Contact.SocialPlatform.AsObject>,
    emergencyContact?: Contact.EmergencyContactDetails.AsObject,
    address?: Contact.Address.AsObject,
    isBlocked: boolean,
    isFavorite: boolean,
    createdAt?: google_protobuf_timestamp_pb.Timestamp.AsObject,
    updatedAt?: google_protobuf_timestamp_pb.Timestamp.AsObject,
    deletedAt?: google_protobuf_timestamp_pb.Timestamp.AsObject,
  }

  export class SocialPlatform extends jspb.Message {
    getPlatform(): Contact.SocialPlatformOptionsMap[keyof Contact.SocialPlatformOptionsMap];
    setPlatform(value: Contact.SocialPlatformOptionsMap[keyof Contact.SocialPlatformOptionsMap]): void;
    getProfile(): string;
    setProfile(value: string): void;
    ...
As we can see from our generated file above, our objects and classes are nested within each other. This isn’t an ideal structure for our objects because it makes them difficult to reuse throughout our project. Let’s see how to improve this in the next section.
Improving the TypeScript object structure in our Protobuf messages
We can improve our Protobuf messages by building the messages into separate files. This way we generate classes in individual files that can be referenced by import.
Here’s how we can split our Protobuf messages into multiple .proto files:
// src/proto/phonebook/v1/phonebook.proto
syntax = "proto3";
package phonebook.v1;
import "google/protobuf/timestamp.proto";
import "phonebook/v1/contact.proto";

message PhoneBook {
  repeated Contact contact = 1;
}

// src/proto/phonebook/v1/contact.proto
syntax = "proto3";
package phonebook.v1;
import "google/protobuf/timestamp.proto";
import "phonebook/v1/socialplatform.proto";
import "phonebook/v1/emergencycontactdetails.proto";
import "phonebook/v1/address.proto";

message Contact {
  string first_name = 1;
  string last_name = 2;
  optional string email = 3;
  optional string phone_number = 4;
  repeated SocialPlatform social_platforms = 5;
  optional EmergencyContactDetails emergency_contact = 6;
  optional Address address = 7;
  bool is_blocked = 8;
  bool is_favorite = 9;
  google.protobuf.Timestamp created_at = 10;
  google.protobuf.Timestamp updated_at = 11;
  optional google.protobuf.Timestamp deleted_at = 12;
}

// src/proto/phonebook/v1/socialplatform.proto
syntax = "proto3";
package phonebook.v1;

message SocialPlatform {
  SocialPlatformOptions platform = 1;
  string profile = 2;
  optional string profile_url = 3;

  enum SocialPlatformOptions {
    WHATSAPP = 0;
    FACEBOOK = 1;
    INSTAGRAM = 2;
  }
}

// src/proto/phonebook/v1/emergencycontactdetails.proto
syntax = "proto3";
package phonebook.v1;

message EmergencyContactDetails {
  Relationships relationship = 1;
  optional string custom_label = 2;

  enum Relationships {
    BROTHER = 0;
    MOTHER = 1;
    SISTER = 2;
    FATHER = 3;
    FRIEND = 4;
    COUSIN = 5;
  }
}

// src/proto/phonebook/v1/address.proto
syntax = "proto3";
package phonebook.v1;

message Address {
  string address_line_1 = 1;
  optional string address_line_2 = 2;
  optional string postal_code = 3;
  optional string city = 4;
  optional string state = 5;
  optional string country = 6;
}
Now, to build our TypeScript files, we pass all the .proto files into our protoc command, along with an additional argument, proto_path:
$ protoc \
--plugin="protoc-gen-ts=./node_modules/.bin/protoc-gen-ts" \
--ts_opt=esModuleInterop=true \
--js_out="./src/generated" \
--ts_out="./src/generated" \
--proto_path="./src/proto" \
src/proto/phonebook/v1/phonebook.proto src/proto/phonebook/v1/contact.proto \
src/proto/phonebook/v1/socialplatform.proto \
src/proto/phonebook/v1/emergencycontactdetails.proto \
src/proto/phonebook/v1/address.proto
The proto_path argument specifies the directory where the compiler should check for imports, as we are compiling multiple files that reference each other.
Improving our TypeScript generation build
As explained earlier, the protoc command takes our defined Protobuf schemas, compiles them into TypeScript object interfaces, and provides builder classes and functions for creating the objects.
To run this command, we have to provide a few arguments. We also want to make sure that our plugin does not have issues creating the necessary files whenever we run the command. We can ensure this by cleaning out our output directory.
To simplify running the protoc command with the needed arguments, as well as the cleanup tasks necessary to ensure a smooth build, we can move them into a Bash script. We can then run this Bash script as an npm script:
#!/usr/bin/env bash
# scripts/protoc-generate.sh

# Root directory of app
ROOT_DIR=$(git rev-parse --show-toplevel)

# Path to Protoc Plugin
PROTOC_GEN_TS_PATH="${ROOT_DIR}/node_modules/.bin/protoc-gen-ts"

# Directory holding all .proto files
SRC_DIR="${ROOT_DIR}/src/proto"

# Directory to write generated code (.d.ts files)
OUT_DIR="${ROOT_DIR}/src/generated"

# Clean all existing generated files (-f and -p avoid errors on the first run,
# when the output directory does not exist yet)
rm -rf "${OUT_DIR}"
mkdir -p "${OUT_DIR}"

# Generate all messages
protoc \
  --plugin="protoc-gen-ts=${PROTOC_GEN_TS_PATH}" \
  --ts_opt=esModuleInterop=true \
  --js_out="import_style=commonjs,binary:${OUT_DIR}" \
  --ts_out="${OUT_DIR}" \
  --proto_path="${SRC_DIR}" \
  $(find "${SRC_DIR}" -iname "*.proto")
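Now we can add a script to our package.json to run the protoc command and other cleanup tasks. The Bash script needs to be executable for this to work, which we can ensure by running chmod +x scripts/protoc-generate.sh once: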
// package.json
{
  "name": "myaddressbook",
  "version": "1.0.0",
  "description": "Data serialization with Protobuf and TypeScript",
  "scripts": {
    "build:proto": "scripts/protoc-generate.sh"
  },
  ...
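If you would rather keep the build step in TypeScript than in Bash, the argument list can be assembled by a small helper. This is only a sketch under stated assumptions: ProtocOptions and buildProtocArgs are hypothetical names, the flags are the same ones the Bash script passes, and the list of .proto files is supplied by the caller instead of being discovered with find:

```typescript
// Hypothetical options for assembling a protoc invocation.
interface ProtocOptions {
  pluginPath: string;   // path to the protoc-gen-ts binary
  protoDir: string;     // directory holding all .proto files
  outDir: string;       // directory to write generated code
  protoFiles: string[]; // the .proto files to compile
}

// Build the same argument list that the Bash script passes to protoc.
function buildProtocArgs(opts: ProtocOptions): string[] {
  return [
    `--plugin=protoc-gen-ts=${opts.pluginPath}`,
    "--ts_opt=esModuleInterop=true",
    `--js_out=import_style=commonjs,binary:${opts.outDir}`,
    `--ts_out=${opts.outDir}`,
    `--proto_path=${opts.protoDir}`,
    ...opts.protoFiles,
  ];
}

const protocArgs = buildProtocArgs({
  pluginPath: "./node_modules/.bin/protoc-gen-ts",
  protoDir: "src/proto",
  outDir: "src/generated",
  protoFiles: [
    "src/proto/phonebook/v1/phonebook.proto",
    "src/proto/phonebook/v1/contact.proto",
  ],
});

console.log(["protoc", ...protocArgs].join(" "));
// The arguments could then be handed to Node's child_process, for example:
//   execFileSync("protoc", protocArgs, { stdio: "inherit" });
```

Keeping the argument construction in a pure function makes it easy to check without actually invoking the compiler.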
Building and serializing a Protobuf message from TypeScript objects
Our ts-protoc-gen plugin provides these setter and getter functions:
- set functions to add an attribute
- get functions to retrieve an attribute
- clear functions to clear the value of a specific attribute
- add functions to add entries to list objects, i.e., arrays of objects
Using our generated TypeScript Message classes and their setter methods, we can create and set our PhoneBook message, serialize it into bytes, and deserialize it back into an object that we then log:
// src/index.ts
import { myPhoneBook } from "./phoneBook"
import { PhoneBook } from "./generated/phonebook/v1/phonebook_pb"
import { Contact } from "./generated/phonebook/v1/contact_pb"
import { SocialPlatform } from "./generated/phonebook/v1/socialplatform_pb"
import { EmergencyContactDetails } from "./generated/phonebook/v1/emergencycontactdetails_pb"

const [phoneBookContactOne] = myPhoneBook.contacts

const socialPlatForm = new SocialPlatform()
socialPlatForm.setPlatform(SocialPlatform.SocialPlatformOptions.WHATSAPP)
socialPlatForm.setProfile(phoneBookContactOne.socialPlatForms[0].profile)
if (phoneBookContactOne.socialPlatForms[0].profileUrl) {
  socialPlatForm.setProfileUrl(phoneBookContactOne.socialPlatForms[0].profileUrl)
}

const emergencyContact = new EmergencyContactDetails()
emergencyContact.setRelationship(EmergencyContactDetails.Relationships.FRIEND)

const contactOne = new Contact()
contactOne.setFirstName(phoneBookContactOne.firstName)
contactOne.setLastName(phoneBookContactOne.lastName)
contactOne.setEmail(phoneBookContactOne.email)
contactOne.setPhoneNumber(phoneBookContactOne.phoneNumber)
contactOne.setIsBlocked(phoneBookContactOne.isBlocked)
contactOne.setIsFavorite(phoneBookContactOne.isFavorite)
contactOne.addSocialPlatforms(socialPlatForm)
contactOne.setEmergencyContact(emergencyContact)

const phoneBook = new PhoneBook()
phoneBook.addContact(contactOne)

const serializedPhoneBook = phoneBook.serializeBinary()
const deserializedPhoneBook = PhoneBook.deserializeBinary(serializedPhoneBook)

console.log("\n Serialized Bytes of object: ", serializedPhoneBook)
console.log("\n Deserialized object: ", JSON.stringify(deserializedPhoneBook.toObject()))
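The script above hardcodes the WHATSAPP and FRIEND enum members instead of reading them from the sample data, where they are stored as strings. One possible way to bridge that gap is a small lookup helper. In this sketch, the SocialPlatformOptions constant is a local stand-in that mirrors the values in our .proto file (it is not the generated class), and enumValueFromName is a hypothetical helper name:

```typescript
// Local stand-in mirroring the SocialPlatformOptions enum from our .proto file.
// In the real project, the generated enum map would be used instead.
const SocialPlatformOptions = {
  WHATSAPP: 0,
  FACEBOOK: 1,
  INSTAGRAM: 2,
} as const;

// Look up the numeric enum value for a string name, failing loudly on unknown names.
function enumValueFromName<T extends Record<string, number>>(
  enumMap: T,
  name: string
): T[keyof T] {
  if (!Object.prototype.hasOwnProperty.call(enumMap, name)) {
    throw new Error("Unknown enum name: " + name);
  }
  return enumMap[name as keyof T];
}

// The string comes from the plain object; the number is what a setter would receive.
const platformValue = enumValueFromName(SocialPlatformOptions, "INSTAGRAM");
console.log(platformValue); // 2
```

Throwing on unknown names is a design choice: it surfaces data that the schema does not cover instead of silently falling back to the zero value.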
Note that you can add the following script to your package.json file to run our serialization and deserialization process:
"dev": "npx ts-node ./src/index.ts"
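Here’s our result. The deserialized object corresponds to the first phone book contact defined earlier in this article, although toObject() uses the generated property names (such as socialPlatformsList), reports enums by their numeric values, and has no address or timestamps because we never set them in src/index.ts. For reference, here is that contact as defined in our source object: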
// src/phoneBook.ts
...
{
  firstName: "Jane",
  lastName: "Doe",
  email: "jane.d@mymail.com",
  phoneNumber: "213-999-0876",
  socialPlatForms: [
    {
      platform: "WHATSAPP",
      profile: "2139990876",
      profileUrl: "https://api.whatsapp.com/+12139990876"
    }
  ],
  emergencyContact: {
    relationship: "FRIEND",
  },
  isBlocked: false,
  isFavorite: true,
  createdAt: "2021-03-04",
  updatedAt: "2023-10-10"
},
...
With that, we’re all done! We can now serialize and deserialize data with Protobuf and TypeScript.
Conclusion
In this article, we explored the powerful Protobuf data serialization protocol. Protobuf is language-independent and can be used to build objects in several supported languages.
We used Protobuf with TypeScript to serialize and deserialize data using the example of phone book contact information. While Protobuf supports building into JavaScript objects out of the box, for typing via TypeScript, we had to use ts-protoc-gen, one of several Protobuf plugins.
You can view the full code for this implementation of serializing and deserializing TypeScript objects using Protobuf in this GitHub repo. Let me know in the comments if you have any questions!