Many data formats can be used for communication between applications, such as XML and JSON. These two formats are commonly used in application development, but they have drawbacks in performance and data size. One alternative that addresses these drawbacks is Protocol Buffers.
Protocol Buffers
Protocol Buffers is a mechanism developed by Google for serializing structured data. Using Protocol Buffers has several advantages:
- The data structure is tidy and more manageable.
- Can be used for RPC communication.
- The schema declares a type for every field, which gives basic validation of the data structure.
- Serialized data is usually smaller and faster to process than XML and JSON.
Protocol Buffers also has a disadvantage:
- Official support covers only a fixed set of programming languages, such as Java, JavaScript, Go, and C++. Other languages depend on third-party plugins or on support being added in the future.
Setup
To use Protocol Buffers in a programming language, the Protocol Buffers compiler (`protoc`) is needed. The compiler can be downloaded from the Protocol Buffers releases page; choose the archive that matches your operating system. For example, on Windows the file to download is `protoc-3.17.3-win64.zip`.
For Windows users, follow these steps:
1. Download the Protocol Buffers compiler from the releases page and choose the archive for Windows (for example, `protoc-3.17.3-win64.zip`).
2. Extract the archive into a folder called `proto3`. This folder can be placed in any location.
3. Open the Start menu, search for "environment variables", and choose "Edit the system environment variables".
4. Choose "Environment Variables...".
5. Select `Path` under "System variables" and click "Edit".
6. Click "New" and add the location of the `bin` folder inside the `proto3` folder (for example, `D:\proto3\bin`).
7. Click "OK". After the path variable is added, open a new terminal and check the installation with the `protoc --version` command. If the version is printed, the compiler is installed correctly.
Create a Protocol Buffers File
This tutorial uses Protocol Buffers version 3 (proto3).
A message definition in Protocol Buffers has the following structure:
message message_name {
  data_type field_name = tag;
  ...
}
In this example, Protocol Buffers is used to define the data structure of a car entity.
// define the syntax type of protocol buffers
// the syntax type is proto3
syntax = "proto3";

// create a message
message Car {
  int32 id = 1;
  string manufacturer = 2;
  string name = 3;
  float mileage = 4;
  bool is_new = 5;
}
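The tag numbers matter because they, rather than the field names, identify each field in the serialized bytes. The Go sketch below is a hand-written illustration, not code produced by the compiler. It assumes the documented proto3 wire format, in which each field starts with a key computed as `(tag << 3) | wire_type`, and the helper name `encodeVarintField` is made up for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// wireVarint is the wire type used for int32, uint32, bool, and enum fields.
const wireVarint = 0

// appendUvarint appends v to buf as a base-128 varint.
func appendUvarint(buf []byte, v uint64) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], v)
	return append(buf, tmp[:n]...)
}

// encodeVarintField is a hypothetical helper (not part of any library) that
// encodes one varint field: first the key, then the value.
func encodeVarintField(tag int, value uint64) []byte {
	var buf []byte
	// The key combines the tag number and the wire type.
	buf = appendUvarint(buf, uint64(tag)<<3|wireVarint)
	// The value itself is also stored as a varint.
	buf = appendUvarint(buf, value)
	return buf
}

func main() {
	// Encode the field "int32 id = 1" of Car with the value 150.
	fmt.Printf("% x\n", encodeVarintField(1, 150)) // prints "08 96 01"
}
```

The field name `id` never appears in the output, only the tag number 1, which is one reason the serialized data stays small.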
The naming convention for message, enum, and service names is to capitalize the first letter of each word, like `SearchRequest`. Field names use lowercase words separated by underscores (`_`), like `zip_code`.
These are some of the basic data types that can be used in Protocol Buffers.
Data Type | Description |
---|---|
int32 | 32-bit signed integer |
string | Text (UTF-8 or 7-bit ASCII) |
bool | Either `true` or `false` |
float | 32-bit floating-point number |
uint32 | 32-bit unsigned integer (non-negative values only) |
All supported data types are listed in the official Protocol Buffers documentation.
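An enum can be defined when a field should only take one of a fixed set of values. This is the basic syntax for an enum:
enum enum_name {
  enum_value = tag;
}
In this example, an enum called `CarType` is created inside the `Car` message. In proto3, the first enum value must have the tag 0, and it serves as the default value.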
// define the syntax type of protocol buffers
// the syntax type is proto3
syntax = "proto3";

// create a message
message Car {
  int32 id = 1;
  string manufacturer = 2;
  string name = 3;
  float mileage = 4;
  bool is_new = 5;
  uint32 vin = 6;

  // create an enum
  enum CarType {
    UNKNOWN = 0;
    RACE_CAR = 1;
    ROAD_CAR = 2;
  }

  // using enum
  CarType car_type = 7;
}
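In Go, an enum like this maps naturally to a named integer type with one constant per value. The sketch below is a simplified, hand-written stand-in and not the compiler's output (the real generated code uses different names and more helpers). It illustrates why `UNKNOWN = 0` acts as the default: a field that was never set holds the zero value.

```go
package main

import "fmt"

// CarType mirrors the CarType enum. The type and constant names here are
// assumptions chosen for this illustration.
type CarType int32

const (
	CarTypeUnknown CarType = 0
	CarTypeRaceCar CarType = 1
	CarTypeRoadCar CarType = 2
)

// String returns the name used in the .proto file for each value.
func (t CarType) String() string {
	switch t {
	case CarTypeUnknown:
		return "UNKNOWN"
	case CarTypeRaceCar:
		return "RACE_CAR"
	case CarTypeRoadCar:
		return "ROAD_CAR"
	default:
		return fmt.Sprintf("CarType(%d)", int32(t))
	}
}

// Car holds only the fields needed for this illustration.
type Car struct {
	Name    string
	CarType CarType
}

func main() {
	var car Car              // no field has been set
	fmt.Println(car.CarType) // prints "UNKNOWN"

	car.CarType = CarTypeRoadCar
	fmt.Println(car.CarType) // prints "ROAD_CAR"
}
```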
Another supported data type is `map`, which stores many key-value pairs. This is an example of map usage.
syntax = "proto3";

message Student {
  string name = 1;
  uint32 student_id = 2;
  // using map
  // key: string
  // value: string
  map<string, string> courses = 3;
}
A message that has already been defined can also be used as a data type. This is an example of using another message as a data type.
syntax = "proto3";

message Student {
  string name = 1;
  uint32 student_id = 2;
  // using Course message as a data type
  repeated Course courses = 3;
}

message Course {
  string code = 1;
  string name = 2;
}
To import another Protocol Buffers file, use its path relative to the root folder that stores the Protocol Buffers files. The following example shows the import mechanism. The imported file is `person.proto`, which is located in the `models` folder.
syntax = "proto3";

message Person {
  string name = 1;
  string address = 2;
  int32 age = 3;
}
The `car.proto` file uses the imported `Person` message.
// using proto3 syntax
syntax = "proto3";

// import person.proto
import "models/person.proto";

// create a message
message Car {
  int32 id = 1;
  string manufacturer = 2;
  string name = 3;
  float mileage = 4;
  bool is_new = 5;
  uint32 vin = 6;

  // create an enum
  enum CarType {
    UNKNOWN = 0;
    RACE_CAR = 1;
    ROAD_CAR = 2;
  }

  CarType car_type = 7;

  // using person that is imported
  Person owner = 8;
}
Rules in Protocol Buffers
Field rules change how a field stores its values.
Rule | Description |
---|---|
repeated | The field can contain many values |
oneof | At most one of the listed fields can be set at a time |
By default in proto3, a field without a specified rule holds a single value and may be left unset, in which case it reads back as the default value of its data type.
The `repeated` rule creates a field that can store many values, like a list or an array. For example, the field `repeated int32 numbers = 1;` means that `numbers` can store many values of the `int32` data type.
This is an example of rule usage in Protocol Buffers.
syntax = "proto3";

message Student {
  string name = 1;
  string student_id = 2;
  repeated Course courses = 3;
  uint32 member_id = 4;
  oneof card_number {
    string student_card_number = 5;
    string id_card_number = 6;
  }
}

message Course {
  string code = 1;
  string name = 2;
}
Based on the code above, the `courses` field can contain many values of the `Course` data type. The `oneof` rule is applied to `card_number`, which means that at most one of `student_card_number` and `id_card_number` can be set at a time. Setting one of them clears the other.
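To make the `oneof` behavior concrete, here is a small Go sketch. It is a hand-written stand-in that only imitates the general shape of a `oneof` in Go (an interface with one wrapper type per option); the type names and the `describeCard` helper are assumptions for illustration and are not the compiler's output.

```go
package main

import "fmt"

// cardNumber groups the two options of the oneof.
type cardNumber interface{ isCardNumber() }

// StudentCardNumber and IDCardNumber wrap the two possible values.
type StudentCardNumber struct{ Value string }
type IDCardNumber struct{ Value string }

func (StudentCardNumber) isCardNumber() {}
func (IDCardNumber) isCardNumber()      {}

// Student holds at most one card number, because a single interface
// field can only contain one value at a time.
type Student struct {
	Name       string
	CardNumber cardNumber
}

// describeCard reports which option of the oneof is currently set.
func describeCard(s Student) string {
	switch c := s.CardNumber.(type) {
	case StudentCardNumber:
		return "student card " + c.Value
	case IDCardNumber:
		return "id card " + c.Value
	default:
		return "no card number set"
	}
}

func main() {
	s := Student{Name: "Nathan Mckane", CardNumber: IDCardNumber{Value: "12345321"}}
	fmt.Println(describeCard(s)) // prints "id card 12345321"

	// Assigning the other option replaces the previous one.
	s.CardNumber = StudentCardNumber{Value: "RVN2021"}
	fmt.Println(describeCard(s)) // prints "student card RVN2021"

	fmt.Println(describeCard(Student{})) // prints "no card number set"
}
```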
Changes in Protocol Buffers
If a Protocol Buffers definition that is already used by an application needs to change, the following rules apply.
- The tag number of an existing field must not be changed.
- A field name can be changed, because the serialized binary data identifies fields by tag number rather than by name.
- When a new field is added, data written before the change does not contain it, so the field takes the default value of its data type when that data is read. The default values for each data type are listed in the official documentation.
In this example, a new field is added to the `Blog` message.
Before the new field is added:
syntax = "proto3";

message Blog {
  string title = 1;
  string author = 2;
  string content = 3;
}
After the new field is added:
syntax = "proto3";

message Blog {
  string title = 1;
  string author = 2;
  string content = 3;
  // add new field
  string category = 4;
}
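The effect of adding a field can be sketched in Go without any library. The code below is a simplified illustration and not the generated code: `Blog`, `appendString`, and `decodeBlog` are hand-written names, and the decoder assumes the proto3 wire format for string fields (a key, a length, then the bytes) and handles nothing else.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// Blog mirrors the newer Blog message, including the category field.
type Blog struct {
	Title, Author, Content, Category string
}

// appendString encodes one string field: key, length, then the bytes.
func appendString(buf []byte, tag int, s string) []byte {
	var tmp [binary.MaxVarintLen64]byte
	n := binary.PutUvarint(tmp[:], uint64(tag)<<3|2) // wire type 2: length-delimited
	buf = append(buf, tmp[:n]...)
	n = binary.PutUvarint(tmp[:], uint64(len(s)))
	buf = append(buf, tmp[:n]...)
	return append(buf, s...)
}

// decodeBlog reads string fields by tag number and ignores unknown tags.
func decodeBlog(data []byte) (Blog, error) {
	var b Blog
	for len(data) > 0 {
		key, n := binary.Uvarint(data)
		if n <= 0 || key&7 != 2 {
			return b, errors.New("unsupported or malformed field")
		}
		data = data[n:]
		size, n := binary.Uvarint(data)
		if n <= 0 || uint64(len(data)-n) < size {
			return b, errors.New("truncated field")
		}
		value := string(data[n : n+int(size)])
		data = data[n+int(size):]
		switch key >> 3 {
		case 1:
			b.Title = value
		case 2:
			b.Author = value
		case 3:
			b.Content = value
		case 4:
			b.Category = value
		}
	}
	return b, nil
}

func main() {
	// Bytes written by an older program that only knows tags 1 to 3.
	var old []byte
	old = appendString(old, 1, "Intro")
	old = appendString(old, 2, "Jane")
	old = appendString(old, 3, "Hello")

	blog, err := decodeBlog(old)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	// The category field is missing from the data, so it stays empty.
	fmt.Printf("title=%q category=%q\n", blog.Title, blog.Category)
}
```

Because the old data contains no field with tag 4, `Category` keeps the zero value of a Go string, which matches the proto3 default for `string`.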
- If a field is no longer used, either rename it with an `OBSOLETE_` prefix or remove it and mark its tag number with the `reserved` keyword. Using `reserved` is recommended to avoid bugs, because the compiler then rejects any later attempt to reuse the tag number or name.
This is an example of the `reserved` keyword, applied to the removed `student_id` and `lecturer_name` fields.
syntax = "proto3";

message Course {
  string code = 1;
  string name = 2;
  // string student_id = 3;
  // string lecturer_name = 4;

  // student_id field with tag number 3 is not used
  reserved 3;
  // field called lecturer_name is not used
  reserved "lecturer_name";
}
- In an enum, values can be added, changed, and removed. A removed value should be marked with `reserved` so that its name and number are not reused.
In this example, the enum is changed.
Before the change:
syntax = "proto3";

message User {
  string name = 1;

  // using enum
  enum Role {
    UNKNOWN = 0;
    USER = 1;
  }
}
After the change:
syntax = "proto3";

message User {
  string name = 1;

  // using enum
  enum Role {
    // default value for enum
    UNKNOWN = 0;
    // USER = 1;
    STAFF = 2;
    // add new value inside enum
    ADMIN = 3;

    // remove "USER" value from enum
    reserved "USER";
    reserved 1;
  }

  Role role = 2;
}
Using Protocol Buffers
Protocol Buffers can be used together with programming languages such as Java, JavaScript, Go, C++, and Dart. In this example, Protocol Buffers is used with the Go programming language.
The Go project is created with this command. Make sure the module path matches your own repository.
go mod init github.com/nadirbasalamah/protodemo
Add the dependencies needed to use Protocol Buffers.
go get github.com/golang/protobuf
go get google.golang.org/protobuf
The compiler also needs the Go plugin `protoc-gen-go` to generate Go code. It can be installed with this command, and the directory where Go installs binaries must be on the `PATH`.
go install google.golang.org/protobuf/cmd/protoc-gen-go@latest
Create a new Protocol Buffers file called `student.proto` in the `src/model` directory. The `go_package` option is added so that the generated code belongs to the `student` package.
syntax = "proto3";

package tutorial;

// only for Golang
option go_package = "model;student";

message Student {
  string name = 1;
  string student_id = 2;
  repeated Course courses = 3;
  uint32 member_id = 4;
  oneof card_number {
    string student_card_number = 5;
    string id_card_number = 6;
  }
}

message Course {
  string code = 1;
  string name = 2;
}
Generate Go code from the Protocol Buffers file. In this command, `src/` is the root directory used to resolve Protocol Buffers files (`-I`), the generated code is also written to `src/` (`--go_out`), and the file to compile is `src/model/student.proto`.
protoc -I=src/ --go_out=src/ src/model/student.proto
If a file called `student.pb.go` exists afterwards, the code was generated successfully.
The generated code is used in the `main.go` file.
package main

import (
	"fmt"

	student "github.com/nadirbasalamah/protodemo/src/model"
)

func main() {
	// create a student entity (a pointer, so the message is never copied)
	newStudent := &student.Student{
		Name:      "Nathan Mckane",
		StudentId: "RVN2021",
		CardNumber: &student.Student_IdCardNumber{
			IdCardNumber: "12345321",
		},
	}

	// print some values from the fields using the generated getters
	fmt.Println("Student Data")
	fmt.Println("Name:", newStudent.GetName())
	fmt.Println("Student ID:", newStudent.GetStudentId())
	fmt.Println("Card Number:", newStudent.GetCardNumber())
}
Output
Student Data
Name: Nathan Mckane
Student ID: RVN2021
Card Number: &{12345321}
Based on the code above, the code generated from the Protocol Buffers file is used through the `student` package. A value of the `Student` struct is created, and then its field values are read through the generated getter methods.
Protocol Buffers data can also be converted into another format such as JSON, and vice versa.
package main

import (
	"fmt"
	"log"

	"github.com/golang/protobuf/jsonpb"
	"github.com/golang/protobuf/proto"
	student "github.com/nadirbasalamah/protodemo/src/model"
)

func main() {
	// create a student entity
	newStudent := &student.Student{
		Name:      "Nathan Mckane",
		StudentId: "RVN2021",
		CardNumber: &student.Student_IdCardNumber{
			IdCardNumber: "12345321",
		},
	}

	// add some courses
	newStudent.Courses = []*student.Course{
		{
			Code: "C001",
			Name: "Algorithm",
		},
		{
			Code: "C002",
			Name: "Data Structure",
		},
	}

	// print courses
	fmt.Println("Courses")
	for _, v := range newStudent.Courses {
		fmt.Println(v)
	}

	// convert to JSON
	result := convertToJSON(newStudent)
	fmt.Println("to JSON:", result)

	// convert to protocol buffers from JSON data
	jsonData := `{"name":"Ryan Cooper","studentId":"RPC2007","idCardNumber":"98753443"}`
	protoResult := &student.Student{}

	// call the function
	convertToProto(jsonData, protoResult)
	fmt.Println("to proto:", protoResult)
}

// convertToJSON serializes a message into a JSON string.
func convertToJSON(pb proto.Message) string {
	marshaler := jsonpb.Marshaler{}
	result, err := marshaler.MarshalToString(pb)
	if err != nil {
		// log.Fatalln exits the program, so no return is needed here
		log.Fatalln("Can't convert to JSON:", err)
	}
	return result
}

// convertToProto fills the given message from a JSON string.
func convertToProto(data string, pb proto.Message) {
	err := jsonpb.UnmarshalString(data, pb)
	if err != nil {
		log.Fatalln("Can't convert to proto:", err)
	}
}
Output
Courses
code:"C001" name:"Algorithm"
code:"C002" name:"Data Structure"
to JSON: {"name":"Nathan Mckane","studentId":"RVN2021","courses":[{"code":"C001","name":"Algorithm"},{"code":"C002","name":"Data Structure"}],"idCardNumber":"12345321"}
to proto: name:"Ryan Cooper" student_id:"RPC2007" id_card_number:"98753443"
Notes
Protocol Buffers can be used by following these steps:
1. Create a Protocol Buffers file.
2. Generate code from the Protocol Buffers file for the programming language that is used.
3. Use the generated code in the application.
The code example of using protocol buffers in Go can be checked here.
Sources
- Protocol Buffers Version 3 Documentation.
- Comparison Between JSON and Protocol Buffers.
- Protocol Buffers Writing Conventions.
I hope this article helps you learn Protocol Buffers. If you have any thoughts, you can write them in the discussion section below.