SSK


WTF is protobuf? And why do I, as a developer, need to worry?

What is Protobuf?

Protobuf, short for Protocol Buffers, is a binary serialization format developed by Google. It allows you to define a schema for your data using a simple, language-agnostic interface description language (IDL). This schema can then be used to generate code in multiple programming languages, making it easy to serialize and deserialize your data.

Why Use Protobuf?

Protobuf offers several advantages over traditional text-based formats like JSON or XML:

  • Efficiency: Protobuf is more space-efficient than text-based formats, leading to smaller payloads and faster data transmission over the network.
  • Performance: Binary encoding is typically faster to serialize and deserialize than text, which can reduce API response times.
  • Language-Agnostic: Protobuf supports multiple programming languages, allowing you to maintain consistency across different parts of your application.
  • Schema Validation: Every message is defined by a schema, so field names and types are checked when you build or parse a message, which catches many errors early. Semantic rules (for example, a well-formed email address) are still up to your application.
  • Versioning: Fields are identified by number, and readers ignore fields they don't recognize, so you can add fields over time without breaking existing clients.
  • Security: Protobuf itself does not encrypt or authenticate anything, but it pairs well with gRPC, which can run over TLS to provide confidentiality, integrity, and authentication.
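
To make the size claim concrete, here is a minimal sketch that hand-encodes the Person example used later in this post and compares it with compact JSON. The helper names (encode_varint, encode_tag, encode_person) are made up for illustration and only cover non-negative integers and strings; real code would use the classes generated by protoc, and the exact savings depend on your data.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A field tag packs the field number and the wire type into one varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_person(person_id: int, name: str, email: str) -> bytes:
    """Hand-encode Person {id = 1, name = 2, email = 3} (illustration only)."""
    name_bytes = name.encode("utf-8")
    email_bytes = email.encode("utf-8")
    return (
        encode_tag(1, 0) + encode_varint(person_id)  # wire type 0: varint
        + encode_tag(2, 2) + encode_varint(len(name_bytes)) + name_bytes  # type 2: length-delimited
        + encode_tag(3, 2) + encode_varint(len(email_bytes)) + email_bytes
    )


record = {"id": 123, "name": "John Doe", "email": "john.doe@example.com"}
proto_bytes = encode_person(record["id"], record["name"], record["email"])
json_bytes = json.dumps(record, separators=(",", ":")).encode("utf-8")

print(len(proto_bytes), "bytes as protobuf vs", len(json_bytes), "bytes as JSON")
```

Most of the saving comes from replacing field names with one-byte tags; the string contents themselves take the same space in both formats.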

Getting Started with Protobuf:

To get started with protobuf, you'll need to install the protobuf compiler (protoc) and the protobuf runtime library for your programming language. You can find installation instructions and documentation for various languages on the official protobuf website.

Defining a Schema:

Once you have protobuf set up, you can define a schema for your data using the protobuf IDL. Here's an example of a simple schema that defines a Person message with id, name, and email fields:

syntax = "proto3";

message Person {
  int32 id = 1;
  string name = 2;
  string email = 3;
}

Serializing and Deserializing Data:

Once you have a schema, you can use the protoc compiler to generate code for your chosen programming language. This code will include functions for serializing and deserializing data to and from the protobuf binary format.
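
As a rough sketch of that step, the snippet below builds and runs the protoc command for a Python target. It assumes protoc is on your PATH and that the schema above is saved as person.proto (an assumed file name, chosen because protoc would then generate the person_pb2 module used in this post). build_protoc_command and generate_python_code are hypothetical helper names.

```python
import shutil
import subprocess


def build_protoc_command(proto_file: str, out_dir: str = ".") -> list:
    """Build the protoc invocation that generates Python code for proto_file."""
    return [
        "protoc",
        "--proto_path=.",            # where protoc looks for .proto files
        f"--python_out={out_dir}",   # where the generated *_pb2.py file goes
        proto_file,
    ]


def generate_python_code(proto_file: str, out_dir: str = ".") -> None:
    """Run protoc, failing early with a clear message if it is not installed."""
    if shutil.which("protoc") is None:
        raise RuntimeError("protoc was not found on PATH")
    subprocess.run(build_protoc_command(proto_file, out_dir), check=True)


print(" ".join(build_protoc_command("person.proto")))
```

Calling generate_python_code("person.proto") in a directory that contains the schema should leave a person_pb2.py file next to it.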

Here's an example of how you might use protobuf in Python:

import person_pb2

person = person_pb2.Person()
person.id = 123
person.name = "John Doe"
person.email = "john.doe@example.com"

# Serialize the data to a binary string
serialized_data = person.SerializeToString()

# Deserialize the data back into a Person object
deserialized_person = person_pb2.Person()
deserialized_person.ParseFromString(serialized_data)

print(deserialized_person)

Schema Validation:

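Protobuf enforces the field types declared in your schema, which can help catch errors early. Here's an example of how the generated Python code rejects a value of the wrong type, preserves data across a round trip, and refuses input that is not a valid message:

import person_pb2
from google.protobuf.message import DecodeError

person = person_pb2.Person()
person.id = 123
person.name = "John Doe"
person.email = "john.doe@example.com"

# The generated class enforces the schema's field types on assignment
try:
    person.id = "not a number"  # id is declared as int32
except TypeError as err:
    print("Rejected:", err)

# Serialize the data, then deserialize it back into a Person object
serialized_data = person.SerializeToString()
deserialized_person = person_pb2.Person()
deserialized_person.ParseFromString(serialized_data)

# The round trip preserves every field
assert deserialized_person.id == 123
assert deserialized_person.name == "John Doe"
assert deserialized_person.email == "john.doe@example.com"

# Bytes that are not a valid message fail to parse
try:
    person_pb2.Person().ParseFromString(b"\xff\xff\xff")
except DecodeError as err:
    print("Malformed input:", err)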
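
A schema only checks structure and types: proto3 has no required fields, and an unset field simply reads as its default (0 or an empty string), so rules such as "name must not be empty" belong in your own code. A minimal sketch of such a check follows. validate_person and its rules are hypothetical, and a plain dataclass stands in for the generated Person class so the snippet runs on its own.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Person:
    """Stand-in for the generated message; proto3 defaults are 0 and ""."""
    id: int = 0
    name: str = ""
    email: str = ""


def validate_person(person) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors = []
    if person.id <= 0:
        errors.append("id must be a positive integer")
    if not person.name.strip():
        errors.append("name must not be empty")
    # Deliberately loose check: something@domain.tld
    local, sep, domain = person.email.partition("@")
    if not (local and sep and "." in domain):
        errors.append("email must look like user@example.com")
    return errors


valid = Person(id=123, name="John Doe", email="john.doe@example.com")
print(validate_person(valid))     # no problems found
print(validate_person(Person()))  # every field is still at its default
```

Because the function only reads attributes, the same check should work unchanged on a generated person_pb2.Person instance.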

Versioning:

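Protobuf supports schema evolution, allowing you to change your API over time without breaking existing clients. Fields are matched by number, so old data still parses with a newer schema, and older code ignores fields it does not know. Here's an example in Python. It assumes person.proto also defines PersonV1 (id and name) and PersonV2 (the same fields plus email = 3); in practice you would usually add the new field to the existing message instead of creating a second one:

import person_pb2

# NOTE: PersonV1 and PersonV2 are assumed to exist in person.proto.
# PersonV1 has id = 1 and name = 2; PersonV2 adds email = 3.
person_v1 = person_pb2.PersonV1()
person_v1.id = 123
person_v1.name = "John Doe"

# Serialize the data to bytes, as an old client would
serialized_data_v1 = person_v1.SerializeToString()

# A new reader can parse old data; the missing email reads as its default
upgraded_person = person_pb2.PersonV2()
upgraded_person.ParseFromString(serialized_data_v1)
assert upgraded_person.email == ""

print(upgraded_person)

# Now a new client sends data that includes the added field
person_v2 = person_pb2.PersonV2()
person_v2.id = 123
person_v2.name = "John Doe"
person_v2.email = "john.doe@example.com"

# Serialize the data to bytes
serialized_data_v2 = person_v2.SerializeToString()

# An old reader can still parse new data; it ignores the field it does not know
downgraded_person = person_pb2.PersonV1()
downgraded_person.ParseFromString(serialized_data_v2)
assert downgraded_person.name == "John Doe"

print(downgraded_person)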
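
Under the hood, this kind of compatibility comes from the wire format: every field is written as a numbered tag followed by its value, so a reader can skip tags it does not recognize. Below is a minimal sketch of that idea using a hand-written decoder instead of the generated classes. read_varint and decode_known_fields are hypothetical helpers, and only the two wire types used by the Person example (varint and length-delimited) are handled.

```python
def read_varint(data: bytes, pos: int):
    """Return (value, next_position) for the varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: this was the last byte
            return result, pos
        shift += 7


def decode_known_fields(data: bytes, known_fields: dict) -> dict:
    """Decode the fields listed in known_fields and skip everything else."""
    decoded = {}
    pos = 0
    while pos < len(data):
        tag, pos = read_varint(data, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(data, pos)
        elif wire_type == 2:  # length-delimited; assumed to be a UTF-8 string here
            length, pos = read_varint(data, pos)
            value = data[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} is not handled in this sketch")
        if field_number in known_fields:
            decoded[known_fields[field_number]] = value
        # Otherwise the value has already been consumed and is simply dropped
    return decoded


# Wire bytes for a message with id = 1, name = 2, and the newer email = 3
new_message = b"\x08\x7b\x12\x08John Doe\x1a\x14john.doe@example.com"

old_reader = decode_known_fields(new_message, {1: "id", 2: "name"})
new_reader = decode_known_fields(new_message, {1: "id", 2: "name", 3: "email"})
print(old_reader)
print(new_reader)
```

This is also why field numbers should never be reused for a different meaning: the number, not the name, is what ends up on the wire.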

Security:

Protobuf is the default message format for gRPC, which can run over TLS to provide confidentiality, integrity, and authentication. Here's an example of how you might use gRPC with protobuf in Python. It assumes person.proto also defines a PersonService with a GetPerson method and a GetPersonRequest message, and it uses an insecure port to keep the local demo short:

from concurrent import futures

import grpc
import person_pb2
import person_pb2_grpc

# Implement the service: GetPerson takes a GetPersonRequest and returns a Person
class PersonService(person_pb2_grpc.PersonServiceServicer):
    def GetPerson(self, request, context):
        return person_pb2.Person(id=request.id, name="John Doe", email="john.doe@example.com")

# Create a gRPC server and add the service to it
server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
person_pb2_grpc.add_PersonServiceServicer_to_server(PersonService(), server)

# Start the server. This port is unencrypted and only suitable for local testing;
# in production, use server.add_secure_port() with grpc.ssl_server_credentials().
server.add_insecure_port("[::]:50051")
server.start()

# Create a gRPC channel and stub. The secure counterpart is
# grpc.secure_channel() with grpc.ssl_channel_credentials().
channel = grpc.insecure_channel("localhost:50051")
stub = person_pb2_grpc.PersonServiceStub(channel)

# Call the GetPerson method on the service
response = stub.GetPerson(person_pb2.GetPersonRequest(id=123))

# Print the response
print(response)

# Shut the server down once the demo call has finished
server.stop(grace=None)

Mock API: you can use a mock API to speed up your development for free. Check out fakend.fyi

Conclusion:

In this blog post, we covered the basics of Protocol Buffers (protobuf) and how they can be used to optimize your API for performance, efficiency, and maintainability. We looked at how to define a schema, serialize and deserialize data, check data against the schema, version your API, and use protobuf with gRPC, which can add transport security through TLS.

