Introduction

Protocol Buffers (protobuf) is an object serialization format developed by Google. It is popular because the encoded data is small and fast to transmit. Protobuf is platform-independent and language-independent: from a single protobuf definition file you can generate implementations for multiple languages, which is very convenient.

Today I will introduce the basic use of protobuf and a concrete example of using it with Java.

Why use protobuf

Data travels over the network in binary form, usually represented as bytes, where one byte is 8 bits. To transmit an object over the network, you generally need to serialize it first, that is, convert it into a byte array. When the receiver gets the byte array, it deserializes it and turns it back into an object, in our case a Java object.

There are several ways to serialize Java objects:

  1. Use the JDK's built-in object serialization. It has known problems of its own, and it only works between Java programs: a non-Java program, such as one written in PHP or Go, cannot read the serialized data, so it is not universal.
  2. Define a custom serialization protocol. This is more flexible, but it is not universal either, it is more complicated to implement, and unexpected problems are likely to occur.
  3. Convert the data into XML or JSON for transmission. The advantage of XML and JSON is that both have delimiters that mark where an object begins and ends, so a complete object can be read by locating these symbols. The disadvantage is that the converted data is relatively large and deserializing it consumes more resources.
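
To make options 1 and 3 from the list above a little more concrete, here is a minimal sketch that uses only the JDK. PlainStudent is a hypothetical class invented for this illustration, and the JSON is assembled by hand without escaping, so treat it as a rough size comparison rather than a benchmark.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class JdkSerializationDemo {

    // Hypothetical plain Java class, used only for this illustration.
    static final class PlainStudent implements Serializable {
        private static final long serialVersionUID = 1L;
        final String name;
        final int id;

        PlainStudent(String name, int id) {
            this.name = name;
            this.id = id;
        }
    }

    // Option 1: JDK built-in serialization into a byte array.
    static byte[] toBytes(PlainStudent s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reverse direction: only a Java program with the same class can do this.
    static PlainStudent fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (PlainStudent) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Option 3: a hand-written JSON rendering (no escaping, illustration only).
    static byte[] toJson(PlainStudent s) {
        String json = "{\"name\":\"" + s.name + "\",\"id\":" + s.id + "}";
        return json.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        PlainStudent s = new PlainStudent("Li", 1234);
        System.out.println("JDK serialization: " + toBytes(s).length + " bytes");
        System.out.println("JSON:              " + toJson(s).length + " bytes");
        System.out.println("Round trip name:   " + fromBytes(toBytes(s)).name);
    }
}
```

The JDK form carries class metadata along with the two values, which is part of why it is both larger and tied to Java.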

So we need a different serialization method, and that is protobuf: a flexible, efficient, and automated solution.

You write a .proto file that defines the data structure and then run the protobuf compiler, which generates classes that automatically encode and parse protobuf data in an efficient binary format. The generated classes provide getters and setters for the fields in the definition file and take care of the details of reading and writing. Importantly, the protobuf format can be extended over time in a compatible way: code generated from the latest definition can still read data that was encoded with an older version.

Define .proto file

The .proto file defines the message objects you want to serialize. Let's start with a basic student.proto file, which defines the basic properties of a student object.

First look at a relatively simple .proto file:

syntax = "proto3";

package com.flydean;

option java_multiple_files = true;
option java_package = "com.flydean.tutorial.protos";
option java_outer_classname = "StudentListProtos";

message Student {
  optional string name = 1;
  optional int32 id = 2;
  optional string email = 3;

  enum PhoneType {
    MOBILE = 0;
    HOME = 1;
  }

  message PhoneNumber {
    optional string number = 1;
    optional PhoneType type = 2;
  }

  repeated PhoneNumber phones = 4;
}

message StudentList {
  repeated Student student = 1;
}

The first line declares the syntax version used by the file. If it is omitted, the compiler assumes proto2. Since proto3 is the newer version, we use proto3 in this example.

Then we declare the package. It acts as a namespace for the definitions in this file. Although we set java_package later, it is still worth declaring package to avoid name conflicts when the protocol is used from non-Java languages.

Then there are three options specific to Java: java_multiple_files, java_package, and java_outer_classname.

java_multiple_files controls how many Java files are generated. If it is true, each top-level message gets its own .java file; if it is false, all generated classes are nested inside a single wrapper class in one file.

java_package specifies the Java package name that the generated class should use. If it is not explicitly specified, the previously defined package value will be used.

The java_outer_classname option defines the name of the wrapper class that represents this file. If java_outer_classname is not set, the name is generated by converting the file name to upper camel case. For example, by default, "student.proto" would use "Student" as the wrapper class name (if that name collides with a message defined in the file, as it would here, the compiler appends "OuterClass" to it).

The next part is the definition of the message. For simple types, you can use bool, int32, float, double, and string to define the type of the field.

The example above also uses composite fields and nested types: PhoneNumber is a message nested inside Student, and PhoneType is an enumeration.

Each field is assigned a number. This number is the unique tag that identifies the field in the binary encoding. Field numbers 1 to 15 take one byte to encode, while numbers 16 and above take at least two bytes. As an optimization, the numbers 1 to 15 are therefore usually reserved for frequently used or repeated elements, and numbers 16 and higher are used for less common optional elements.
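
The one-byte versus two-byte difference follows from how a field's tag is encoded. As a small sketch (the class and method names are made up for this article): the tag is the field number shifted left by three bits, combined with a three-bit wire type, and written as a varint that carries 7 bits per byte.

```java
class FieldTagSize {

    // A field's tag is (fieldNumber << 3) | wireType, written as a varint:
    // 7 payload bits per byte, so every started group of 7 bits costs one byte.
    static int tagSizeInBytes(int fieldNumber, int wireType) {
        int tag = (fieldNumber << 3) | wireType;
        int size = 1;
        while ((tag >>>= 7) != 0) {
            size++;
        }
        return size;
    }

    public static void main(String[] args) {
        int varintWireType = 0; // wire type used for int32, bool, enum, ...
        for (int fieldNumber : new int[] {1, 15, 16, 2047, 2048}) {
            System.out.println("field " + fieldNumber + " -> "
                    + tagSizeInBytes(fieldNumber, varintWireType) + " byte(s)");
        }
        // field 1 -> 1 byte(s)
        // field 15 -> 1 byte(s)
        // field 16 -> 2 byte(s)
        // field 2047 -> 2 byte(s)
        // field 2048 -> 3 byte(s)
    }
}
```

With three bits taken by the wire type, only four bits of the first byte are left for the field number, which is exactly the range 1 to 15.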

Next, look at the field modifiers. There are three: optional, repeated, and required.

optional means that the field may or may not be set. If it is not set, a default value is used. In proto2 you can specify your own default for simple types; otherwise (and always in proto3) the system default applies: 0 for numbers, the empty string for strings, and false for booleans.

repeated means that the field can occur any number of times. You can think of it as a dynamically sized array.

required means that the field must be provided. If the field has no value, the message is considered uninitialized. Trying to build an uninitialized message throws a RuntimeException, and parsing an uninitialized message throws an IOException.

Note that required fields are not supported in proto3.

Compile the protocol file

After defining the proto file, you can use the protoc command to compile it.

protoc is the compiler provided by protobuf. Generally, you can download it directly from the GitHub releases page. If you don't want to download a binary, or the official releases do not include the version you need, you can build it from source.

The protoc command looks like this:

protoc --experimental_allow_proto3_optional -I=$SRC_DIR --java_out=$DST_DIR $SRC_DIR/student.proto

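The --experimental_allow_proto3_optional option is needed here because this proto3 file uses the optional keyword. It was required by protoc 3.12 through 3.14; from protoc 3.15 on, proto3 optional is enabled by default and the flag can be dropped.

Let's run the above command. You will find that 5 files are generated in the com.flydean.tutorial.protos package. They are: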

Student.java              
StudentList.java          
StudentListOrBuilder.java 
StudentListProtos.java    
StudentOrBuilder.java

Among them, StudentListOrBuilder and StudentOrBuilder are interfaces, and StudentList and Student are their respective implementations.

Explain the generated files in detail

In the proto file we defined two messages, Student and StudentList; each generated class contains an inner class named Builder. Taking Student as an example, look at the declarations of the message class and its Builder:

public final class Student extends
    com.google.protobuf.GeneratedMessageV3 implements
    StudentOrBuilder

  public static final class Builder extends
      com.google.protobuf.GeneratedMessageV3.Builder<Builder> implements
      com.flydean.tutorial.protos.StudentOrBuilder

You can see that both implement the same interface, so they expose the same read methods. The Builder is the mutable counterpart of the message: every change to a Student is made through its Builder.

For the fields in Student, the Student class only has get methods for these fields, while the Builder has both get and set methods.

For Student, the methods for fields are:

// optional string name = 1;
public boolean hasName();
public String getName();

// optional int32 id = 2;
public boolean hasId();
public int getId();

// optional string email = 3;
public boolean hasEmail();
public String getEmail();

// repeated Student.PhoneNumber phones = 4;
public List<PhoneNumber> getPhonesList();
public int getPhonesCount();
public PhoneNumber getPhones(int index);

For Builder, each singular field gets two additional methods, and the repeated field gets set, add, addAll, and clear methods:

// optional string name = 1;
public boolean hasName();
public java.lang.String getName();
public Builder setName(String value);
public Builder clearName();

// optional int32 id = 2;
public boolean hasId();
public int getId();
public Builder setId(int value);
public Builder clearId();

// optional string email = 3;
public boolean hasEmail();
public String getEmail();
public Builder setEmail(String value);
public Builder clearEmail();

// repeated Student.PhoneNumber phones = 4;
public List<PhoneNumber> getPhonesList();
public int getPhonesCount();
public PhoneNumber getPhones(int index);
public Builder setPhones(int index, PhoneNumber value);
public Builder addPhones(PhoneNumber value);
public Builder addAllPhones(Iterable<PhoneNumber> value);
public Builder clearPhones();

The two additional methods for singular fields are set and clear. clear resets the field to its initial state.

We also define an enumeration class PhoneType:

  public enum PhoneType
      implements com.google.protobuf.ProtocolMessageEnum

The implementation of this class is not much different from ordinary enumeration classes.

Builders and Messages

As shown in the previous section, the message class has only get and has methods, so it is immutable: once a message object is constructed, it cannot be modified. To build a message, you first create a builder, set the fields you need to the values of your choice, and then call the builder's build() method.

Each Builder method that modifies the message returns a Builder. The returned Builder is the same object the method was called on; it is returned only so that calls can be chained.

The following code shows how to create a Student instance:
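
The relationship between an immutable message and its builder can be sketched without protobuf at all. The following is a hand-written stand-in, not the code that protoc generates; StudentSketch and its two fields are assumptions made for this illustration.

```java
class BuilderSketch {

    // Hand-written stand-in for a generated message: immutable, getters only.
    static final class StudentSketch {
        private final String name;
        private final int id;

        private StudentSketch(Builder b) {
            this.name = b.name;
            this.id = b.id;
        }

        String getName() { return name; }
        int getId() { return id; }

        static Builder newBuilder() { return new Builder(); }

        // Mutable companion: every setter returns this so calls can be chained.
        static final class Builder {
            private String name = "";
            private int id = 0;

            Builder setName(String value) { this.name = value; return this; }
            Builder setId(int value) { this.id = value; return this; }
            Builder clearName() { this.name = ""; return this; }

            // Copies the current state into a new immutable message.
            StudentSketch build() { return new StudentSketch(this); }
        }
    }

    public static void main(String[] args) {
        StudentSketch.Builder builder = StudentSketch.newBuilder();
        StudentSketch.Builder returned = builder.setName("Li").setId(1234);
        System.out.println(returned == builder); // true: the same Builder instance

        StudentSketch s = returned.build();
        builder.setId(1);                        // later changes do not affect s
        System.out.println(s.getName() + " " + s.getId()); // Li 1234
    }
}
```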

        Student xiaoming =
                Student.newBuilder()
                        .setId(1234)
                        .setName("小明")
                        .setEmail("flydean@163.com")
                        .addPhones(
                                Student.PhoneNumber.newBuilder()
                                        .setNumber("010-1234567")
                                        .setType(Student.PhoneType.HOME))
                        .build();

The Student class also provides some commonly used methods, such as isInitialized(), which checks whether all required fields are set, and toString(), which converts the object into a string. Its Builder additionally offers clear(), which resets everything that has been set, and mergeFrom(Message other), which merges another message into it.

Serialization and deserialization

The generated classes provide serialization and deserialization methods; we only need to call them when needed:

  • byte[] toByteArray();: Serialize the message and return a byte array containing its raw bytes.
  • static Student parseFrom(byte[] data);: Parse a message from the given byte array.
  • void writeTo(OutputStream output);: Serialize the message and write it to the OutputStream.
  • static Student parseFrom(InputStream input);: Read and parse a message from an InputStream.

By using the above methods, you can easily serialize and deserialize objects.
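
To give a feel for the bytes that toByteArray() deals with, here is a hand-rolled sketch of the protobuf wire format for just the name and id fields of Student. It is an illustration under assumptions, not a replacement for the generated code: WireFormatSketch and its methods are invented names, and the varint writer only handles non-negative values.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class WireFormatSketch {

    static final int WIRETYPE_VARINT = 0;
    static final int WIRETYPE_LENGTH_DELIMITED = 2;

    // Varint: 7 bits per byte, least significant group first,
    // high bit set on every byte except the last. Non-negative values only.
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    static void writeTag(ByteArrayOutputStream out, int fieldNumber, int wireType) {
        writeVarint(out, (fieldNumber << 3) | wireType);
    }

    // Encodes only name (field 1) and id (field 2) of the Student message.
    static byte[] encodeStudent(String name, int id) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        writeTag(out, 1, WIRETYPE_LENGTH_DELIMITED); // string: tag, length, bytes
        writeVarint(out, utf8.length);
        out.write(utf8, 0, utf8.length);
        writeTag(out, 2, WIRETYPE_VARINT);           // int32: tag, varint value
        writeVarint(out, id);
        return out.toByteArray();
    }

    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex(encodeStudent("Li", 1234)));
        // 0a 02 4c 69 10 d2 09
    }
}
```

The two fields fit in 7 bytes because field names never appear in the output, only field numbers. In real code you would call the generated toByteArray() and parseFrom() methods rather than writing bytes by hand.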

Protocol extension

After a proto file is in use, we may want to change it later, and we want the new protocol to stay compatible with existing data. To achieve that, keep the following rules in mind:

  1. The ID number of an existing field cannot be changed.
  2. You cannot add or delete any required fields.
  3. You can delete optional or repeated fields.
  4. You can add new optional or repeated fields, but you must use a new ID number.
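
The reason rule 4 is safe can be sketched with a tiny hand-written reader. Each encoded field carries its number and wire type, so a reader built from the old definition can step over a field it does not know. The code below is an illustration only: CompatibilitySketch is an invented name, it handles just the two wire types used here, and the parser that protoc generates is far more complete.

```java
import java.nio.charset.StandardCharsets;

class CompatibilitySketch {

    // What a reader built from the old definition knows: name = 1, id = 2.
    static final class OldStudent {
        String name = "";
        int id = 0;
        int skippedFields = 0;
    }

    // Minimal cursor over the input bytes.
    static final class Cursor {
        final byte[] data;
        int pos = 0;

        Cursor(byte[] data) { this.data = data; }

        // Reads a varint that fits in 32 bits (enough for this sketch).
        int readVarint() {
            int result = 0;
            int shift = 0;
            while (true) {
                int b = data[pos++] & 0xFF;
                result |= (b & 0x7F) << shift;
                if ((b & 0x80) == 0) {
                    return result;
                }
                shift += 7;
            }
        }
    }

    static OldStudent parseWithOldSchema(byte[] data) {
        OldStudent s = new OldStudent();
        Cursor in = new Cursor(data);
        while (in.pos < data.length) {
            int tag = in.readVarint();
            int fieldNumber = tag >>> 3;
            int wireType = tag & 0x7;
            if (fieldNumber == 1 && wireType == 2) {
                int len = in.readVarint();
                s.name = new String(data, in.pos, len, StandardCharsets.UTF_8);
                in.pos += len;
            } else if (fieldNumber == 2 && wireType == 0) {
                s.id = in.readVarint();
            } else if (wireType == 0) {          // unknown varint field: read and ignore
                in.readVarint();
                s.skippedFields++;
            } else if (wireType == 2) {          // unknown length-delimited field: jump over it
                int len = in.readVarint();
                in.pos += len;
                s.skippedFields++;
            } else {
                throw new IllegalArgumentException("wire type not handled in this sketch: " + wireType);
            }
        }
        return s;
    }

    public static void main(String[] args) {
        byte[] fromNewWriter = {
            0x0A, 0x02, 'L', 'i',       // field 1 (name) = "Li"
            0x10, (byte) 0xD2, 0x09,    // field 2 (id) = 1234
            0x2A, 0x01, 'x'             // field 5, added later, unknown to the old reader
        };
        OldStudent s = parseWithOldSchema(fromNewWriter);
        System.out.println(s.name + " " + s.id + " skipped=" + s.skippedFields);
        // Li 1234 skipped=1
    }
}
```

This also shows why rule 1 matters: if the number of an existing field changed, the old reader would either skip it or misinterpret it.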

Summary

That covers the basic usage of Protocol Buffers. In the next article, we will look at the proto language in more detail, so stay tuned.

For the examples in this article, please refer to: learn-java-base-9-to-20

This article has been included in http://www.flydean.com/01-protocolbuf-guide/


flydean

Welcome to visit my personal website: www.flydean.com