Dubbo study notes (1) basic concepts and simple use

I actually planned to write about Dubbo last week, but then found that Dubbo needs a registry. Since I also planned to learn Zookeeper, I am covering Zookeeper and Dubbo together.

What is it?

As I remember it, the last time I looked at Dubbo's official website, Dubbo defined itself as an RPC framework. With Dubbo3, the definition has become:

Apache Dubbo is a microservice development framework that provides two key capabilities, RPC communication and microservice governance. This means that the microservices developed by Dubbo will have the capability of remote discovery and communication with each other, and at the same time, by using the rich service governance capabilities provided by Dubbo, service governance demands such as service discovery, load balancing, and traffic scheduling can be realized. At the same time, Dubbo is highly extensible, and users can customize their implementation at almost any function point to change the default behavior of the framework to meet their business needs.

Let's first discuss what RPC is (this was already covered in the first chapter of "RPC study notes (1)"). Many articles introducing RPC start with one application that wants to call a function of another application, but I find that less intuitive than Wikipedia's definition:

In distributed computing, Remote Procedure Call (RPC) is a computer communication protocol. This protocol allows a program running on one computer to call a subroutine in another address space (usually on another computer on an open network) as if it were a local one, without the programmer having to explicitly code the details of this interaction.

So why start with functions? It is a form of abstraction and encapsulation. For two processes to communicate, a standard has to be agreed on top of TCP, that is, an application-layer protocol. You can choose HTTP (which is cross-language), or define a custom application-layer protocol directly on TCP. The protocol description in the Dubbo3 conceptual architecture section confirms this:

Dubbo3 provides Triple(Dubbo3) and Dubbo2 protocols, which are native protocols of the Dubbo framework. In addition, Dubbo3 also integrates many third-party protocols and incorporates them into Dubbo's programming and service governance system, including gRPC, Thrift, JsonRPC, Hessian2, REST, etc. The following focuses on the Triple and Dubbo2 protocols.
In the end, we chose to be compatible with gRPC and build a new protocol with HTTP2 as the transport layer, which is Triple.

That is, we can regard the HTTP protocol as one way of doing RPC. As for microservice governance, I will not repeat the discussion here; please refer to my Nuggets article "Spring Cloud Introductory Tutorial for Xiaobai". If HTTP can already serve as RPC, what is the point of Dubbo? I personally think it is about improving on plain HTTP. Before HTTP 2.0, HTTP messages were text-based, whereas a binary byte stream transmits data over the network more compactly and faster. In addition, when you use HTTP to transmit data you still have to serialize the data yourself, and if you need to communicate across languages you have to define even more rules. Dubbo does all of this for us, and more. That is why we are learning Dubbo.

To summarize: RPC is a computer communication protocol. Why, then, is it presented in the form of function calls? Because that is the most reasonable abstraction. We can roughly infer how it evolved:

First, two processes need to exchange information. TCP is chosen as the transport-layer protocol, and some people choose HTTP on top of it because it is simpler. When the information exchanged is relatively simple, the Socket API of any high-level language is enough.
When the exchange becomes more complicated, using raw TCP or HTTP for communication between applications leads to repetitive coding tasks, such as reading values and deserializing them into objects. With raw TCP you also have to handle message framing (packets being split or merged in transit), which adds to the programmer's burden. Can this process be simplified, the complex details of network programming hidden, and a simple model abstracted out? Every high-level language has functions built in, so functions are the natural starting point: exchanging information between processes then looks just like calling a function. This is why many RPC tutorials start with functions. My view is that RPC began as communication between processes, and the function form was chosen later to hide the complex details of network programming, rather than being there from the beginning. In other words, most programmers may not understand framing in Socket programming, but they certainly understand functions. This is a kind of encapsulation.
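To make the two ideas above concrete (length-prefixed framing and hiding the exchange behind a function call), here is a minimal sketch. It is not how Dubbo is implemented: the Greeter interface, the frame layout (a 4-byte length followed by UTF-8 bytes), and the in-memory hand-over in place of a real TCP connection are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.lang.reflect.Proxy;
import java.nio.charset.StandardCharsets;

public class TinyRpcSketch {
    /** Hypothetical interface that both sides agree on. */
    public interface Greeter {
        String sayHello(String name);
    }

    /** Writes one frame: a 4-byte length followed by the payload bytes. */
    public static void writeFrame(DataOutputStream out, String payload) throws IOException {
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    /** Reads exactly one frame, no matter how the bytes were chunked in transit. */
    public static String readFrame(DataInputStream in) throws IOException {
        int length = in.readInt();
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    /** "Server" side: decode the request frames, call the real object, encode the reply. */
    public static byte[] serve(byte[] request, Greeter impl) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(request));
        String method = readFrame(in);
        String arg = readFrame(in);
        if (!"sayHello".equals(method)) {
            throw new IOException("unknown method " + method);
        }
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        writeFrame(new DataOutputStream(buf), impl.sayHello(arg));
        return buf.toByteArray();
    }

    /** "Client" side: a proxy that turns a method call into frames and back. */
    public static Greeter stub(Greeter remoteImpl) {
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] {Greeter.class},
                (proxy, method, callArgs) -> {
                    ByteArrayOutputStream buf = new ByteArrayOutputStream();
                    DataOutputStream out = new DataOutputStream(buf);
                    writeFrame(out, method.getName());
                    writeFrame(out, (String) callArgs[0]);
                    // A real framework would send these bytes over TCP;
                    // this sketch hands them over in memory instead.
                    byte[] reply = serve(buf.toByteArray(), remoteImpl);
                    return readFrame(new DataInputStream(new ByteArrayInputStream(reply)));
                });
    }

    public static void main(String[] args) {
        Greeter greeter = stub(name -> "Hello " + name);
        // The caller sees only a function call; framing and encoding are hidden.
        System.out.println(greeter.sayHello("Dubbo"));
    }
}
```

The caller in main never touches a stream or a byte array, which is the encapsulation the paragraph above describes.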

Although Dubbo describes itself on the official website as a microservice development framework, in practice Apache Dubbo is generally used as the framework for RPC calls between back-end systems. We can compare it to the many HTTP clients that exist for the HTTP protocol. Dubbo provides multi-language support; currently it supports only three languages: Java, Go, and Erlang. A natural question follows: the built-in data types and method forms differ between languages, so how does an RPC implementation work across languages?

The answer, of course, is an intermediate layer: IDL

To communicate with computers we introduced programming languages, which are an intermediate layer. To let different high-level languages communicate with each other, Dubbo introduces IDL, and recommends using IDL to define cross-language services. What is IDL? Dubbo's documentation gives no explanation, so I went to Wikipedia:

An interface description language or interface definition language ( IDL ), is a generic term for a language that lets a program or object written in one language communicate with another program written in an unknown language. IDLs describe an interface in a language-independent way, enabling communication between software components that do not share one language, for example, between those written in C++ and those written in Java.
In other words, an interface definition language (or interface description language) is a language that lets programs written in different languages communicate. An IDL describes an interface in a language-independent way, so that, for example, an application written in C++ can communicate with one written in Java.
IDLs are commonly used in remote procedure call software. In these cases the machines at either end of the link may be using different operating systems and computer languages. IDLs offer a bridge between the two different systems.
That is, IDLs are typically used in RPC software, where the two ends of the link may run different operating systems and programming languages. The IDL provides a bridge between the two systems.

Why did I quote the English version when Wikipedia also has a Chinese article? Here is the Chinese Wikipedia explanation of IDL:

Interface description language (abbreviated IDL) is a computer language used to describe the interfaces of software components. IDL describes interfaces in a programming-language-independent way, so that objects running on different platforms and programs written in different languages can communicate with each other; for example, one component is written in C++ and another is written in Java.

The Chinese article uses an unfamiliar term for "interface" that puzzled me at first; I assume it is simply a different translation of the same word. Since IDL is a computer language, it is reasonable to expect it to have a grammar, and Dubbo provides an example of IDL:

 syntax = "proto3";

option java_multiple_files = true;
option java_package = "org.apache.dubbo.demo";
option java_outer_classname = "DemoServiceProto";
option objc_class_prefix = "DEMOSRV";

package demoservice;

// The demo service definition.
service DemoService {
  rpc SayHello (HelloRequest) returns (HelloReply) {}
}

// The request message containing the user's name.
message HelloRequest {
  string name = 1;
}

// The response message containing the greetings
message HelloReply {
  string message = 1;
}

This is how Dubbo describes it:

The above is a simple example of using IDL to define a service. We can name it DemoService.proto. The proto file defines the RPC service name DemoService and the method signature SayHello (HelloRequest) returns (HelloReply) {}, as well as the method's input and output structures HelloRequest and HelloReply. Services in IDL format rely on the Protobuf compiler to generate client-side and server-side programming APIs that users can call. On top of the native Protobuf compiler, Dubbo provides its own plug-ins for multiple languages, to adapt to the Dubbo framework's own API and programming model.

A new term has appeared: Protobuf. What is Protobuf? The Apache Dubbo documentation does not explain it, so I had to resort to a search engine again:

Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages. --《Protocol Buffers》official website
Protocol Buffers is a language- and platform-independent extensible mechanism designed by Google for serializing structured data, similar to XML, but smaller, simpler, and faster. You only need to define how the data is structured, and with the generated source code, you can read and write your structured data in different languages.

XML is a format for describing data, and since Protobuf is similar to XML, it is also a format for describing data. In the cross-language context above, this means that we run the Protobuf compiler for the target language on the proto file to generate the client and server programming APIs.

Figure: IDL call diagram

Getting Started with Protobuf

Since Protobuf describes data, it has data types. Protobuf declares a set of language-neutral data types, each of which maps to a type in every supported language. Here is a brief list of the mappings to Java types:

double ==> java double
float ==> java float
int64 ==> java long
uint32 ==> java int
bool ==> java boolean
string ==> java String
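One consequence of the uint32 ==> int mapping is worth a small illustration. Java has no unsigned 32-bit type, so a large uint32 value arrives as a negative int with the same bit pattern. The following sketch uses only the JDK; the class and method names are hypothetical.

```java
public class UnsignedDemo {
    /** Interprets the bits of a Java int as an unsigned 32-bit value. */
    public static long asUnsigned(int wireValue) {
        return Integer.toUnsignedLong(wireValue);
    }

    public static void main(String[] args) {
        // The uint32 value 4000000000 does not fit in a signed int,
        // so the same 32 bits show up as a negative number in Java.
        int raw = (int) 4_000_000_000L;
        System.out.println(raw);             // -294967296
        System.out.println(asUnsigned(raw)); // 4000000000
    }
}
```

If a field can exceed Integer.MAX_VALUE, either convert it this way on the Java side or declare it as int64 in the proto file.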

In the example above, each field of HelloRequest and HelloReply is followed by "= 1". This is not a default value but the field number, which identifies the field in the binary encoding. That leaves only the few options at the top of the file that we do not yet understand:

java_multiple_files

If true, each message and service is generated as a separate top-level class in its own file. If false, all messages and services are generated as nested classes inside a single outer class.

java_package

The Java package of the generated code. If it is not set, the name declared by the package statement is used as the Java package.

java_outer_classname

The name of the outer (wrapper) Java class generated for this proto file.

objc_class_prefix

It is odd that the official example includes this option. After checking a number of sources, I found that it applies to Objective-C: it sets a prefix for the Objective-C classes generated from this proto file, and has no effect on the generated Java code.

Dubbo also said:

The service defined by Dubbo3 IDL only allows one input and output. This form of service signature has two advantages, one is more friendly to multi-language implementation, and the other is to ensure the backward compatibility of the service, relying on Protobuf serialization compatibility, we can easily adjust the transmitted data structure such as adding and deleting fields, etc., without worrying about the compatibility of the interface
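The compatibility mentioned in the quote comes from the fact that field numbers, not field names, are what go on the wire. As a rough illustration, the sketch below hand-encodes a single string field following the published Protobuf wire format (tag byte = field number shifted left by 3, combined with the wire type; then a length; then the bytes). It is a simplified stand-in for the real Protobuf library, and it only handles field numbers 1 to 15 and values shorter than 128 bytes; the class and method names are hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class WireFormatSketch {
    /** Wire type 2 is "length-delimited": strings, bytes, nested messages. */
    static final int WIRE_TYPE_LEN = 2;

    /** Encodes one string field by hand (single-byte tag and length only). */
    public static byte[] encodeStringField(int fieldNumber, String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        if (fieldNumber < 1 || fieldNumber > 15 || utf8.length > 127) {
            throw new IllegalArgumentException("sketch supports fields 1-15 and values under 128 bytes");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write((fieldNumber << 3) | WIRE_TYPE_LEN); // tag: field number + wire type
        out.write(utf8.length);                        // payload length
        out.write(utf8, 0, utf8.length);               // payload
        return out.toByteArray();
    }

    /** Formats bytes as space-separated lowercase hex. */
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // HelloRequest { name = "Hello" } with "string name = 1;"
        System.out.println(hex(encodeStringField(1, "Hello"))); // 0a 05 48 65 6c 6c 6f
    }
}
```

The name "name" never appears in the output, only the number 1. A reader that does not know a field number can skip that field, which is why fields can be added later without breaking existing consumers.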

So far we've seen the official example, and now we're going to use it.

Basic usage example

Dubbo officially recommends using IDL, so we use the official example to define the service. The official example is available here:

Figure: The official example provided by Dubbo

I paste the instructions here:

git clone -b master https://github.com/apache/dubbo-samples.git
cd dubbo-samples/dubbo-samples-protobuf
# Requires the Maven environment variables to be configured
mvn clean package
# Run the provider
java -jar ./protobuf-provider/target/protobuf-provider-1.0-SNAPSHOT.jar
# Run the consumer
java -jar ./protobuf-consumer/target/protobuf-consumer-1.0-SNAPSHOT.jar

Then you will find that it does not run. This is what I got:

Figure: The official example (Zookeeper)

It cannot connect to Zookeeper. I have to criticize the official Apache Dubbo example documentation here: followed as written, the example does not run at all. Was it really written with care? Zookeeper was introduced in "Zookeeper Study Notes (1) Basic Concepts and Simple Use": it is a distributed coordination service that, among other things, provides a naming service. Why does the Dubbo example need to connect to Zookeeper? For decoupling. If we call the provider's service directly by IP and port, the consumer is coupled to the provider: if the machine changes in production, we have to change the code. With a cluster, I should only need to know the service name, not the specific IPs. This is the concept of a registry: the service provider registers its service with the registry, and the consumer only needs to provide the service name to consume the service. The common architecture of a Dubbo service:

Figure: Dubbo architecture
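The registry idea described above can be sketched in a few lines. This is an in-memory toy, not Zookeeper and not Dubbo's registry API; the class name, method names, and addresses are all hypothetical. It only shows the decoupling: providers register under a service name, and consumers look up addresses by that name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

public class RegistrySketch {
    // service name -> addresses of the providers currently offering it
    private final Map<String, List<String>> providers = new ConcurrentHashMap<>();

    /** Called by a provider when it starts. */
    public void register(String serviceName, String address) {
        providers.computeIfAbsent(serviceName, k -> new CopyOnWriteArrayList<>()).add(address);
    }

    /** Called when a provider shuts down or is replaced. */
    public void unregister(String serviceName, String address) {
        List<String> addresses = providers.get(serviceName);
        if (addresses != null) {
            addresses.remove(address);
        }
    }

    /** Called by a consumer, which knows only the service name. */
    public List<String> lookup(String serviceName) {
        return new ArrayList<>(providers.getOrDefault(serviceName, Collections.emptyList()));
    }

    public static void main(String[] args) {
        RegistrySketch registry = new RegistrySketch();
        String service = "org.apache.dubbo.demo.DemoService";
        registry.register(service, "192.168.0.10:20880"); // made-up addresses
        registry.register(service, "192.168.0.11:20880");
        System.out.println(registry.lookup(service));

        // Replacing a machine changes the registry content, not the consumer code.
        registry.unregister(service, "192.168.0.10:20880");
        System.out.println(registry.lookup(service));
    }
}
```

A real registry such as Zookeeper additionally notifies consumers when the provider list changes and removes providers whose sessions expire.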

Monitor is for monitoring service calls; we will not cover it here. In fact, the source code of the Dubbo example connects to a Zookeeper registry by default:

Figure: Dubbo source code

Fortunately, we have already installed Zookeeper, so we only need to change the address. Note that newer JDK versions have changed a lot, and the official Dubbo example may not run under JDK 17. If you get compilation errors, switch the environment to JDK 8. In my own tests JDK 11 also works, but it sometimes reports that Zookeeper cannot be connected. I start the provider from IDEA:

Figure: The Dubbo service started successfully

Then start the consumer:

Figure: The consumer called the service successfully

This time there was no problem. I looked into why Zookeeper could not be connected from my Windows PowerShell. The likely reason is that the JDK configured in my environment variables is JDK 11, whereas it runs successfully in IDEA because IDEA uses JDK 8.
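One quick way to check a mismatch like this is to print the runtime's own view of itself from both environments. The following is a small sketch using only standard system properties; the class name is hypothetical.

```java
public class JdkCheck {
    /** Describes the JDK that is actually running this code. */
    public static String describeRuntime() {
        return "java.version=" + System.getProperty("java.version")
                + ", java.home=" + System.getProperty("java.home");
    }

    public static void main(String[] args) {
        System.out.println(describeRuntime());
    }
}
```

Running it once from the shell and once from IDEA shows whether the two really use different JDKs.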

It feels like the old joke: release whatever new versions you like, I will keep using JDK 8.

Analyze from example

Figure: The proto file

The pom declares the Protobuf plugin:

<plugin>
    <groupId>org.xolstice.maven.plugins</groupId>
    <artifactId>protobuf-maven-plugin</artifactId>
    <version>0.5.1</version>
    <configuration>
        <protocArtifact>com.google.protobuf:protoc:3.7.1:exe:${os.detected.classifier}</protocArtifact>
        <!-- Write the code generated from the protobuf files to this directory -->
        <outputDirectory>build/generated/source/proto/main/java</outputDirectory>
        <clearOutputDirectory>false</clearOutputDirectory>
        <protocPlugins>
            <protocPlugin>
                <id>dubbo</id>
                <groupId>org.apache.dubbo</groupId>
                <artifactId>dubbo-compiler</artifactId>
                <version>${compiler.version}</version>
                <mainClass>org.apache.dubbo.gen.dubbo.Dubbo3Generator</mainClass>
            </protocPlugin>
        </protocPlugins>
    </configuration>
    <executions>
        <execution>
            <goals>
                <goal>compile</goal>
                <goal>test-compile</goal>
            </goals>
        </execution>
    </executions>
</plugin>

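The consumer, the provider bootstrap class, and the service implementation are as follows (each class lives in its own file):

// ConsumerApplication.java
import org.apache.dubbo.demo.DemoService;
import org.apache.dubbo.demo.HelloReply;
import org.apache.dubbo.demo.HelloRequest;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class ConsumerApplication {
    public static void main(String[] args) throws Exception {
        // Load the Spring context file
        ClassPathXmlApplicationContext context = new ClassPathXmlApplicationContext("spring/dubbo-consumer.xml");
        context.start();
        // Get the demoService proxy from the container
        DemoService demoService = context.getBean("demoService", DemoService.class);
        // Build the request argument
        HelloRequest request = HelloRequest.newBuilder().setName("Hello").build();
        // Make the RPC call
        HelloReply reply = demoService.sayHello(request);
        System.out.println("result: " + reply.getMessage());
        // Keep the process alive until a key is pressed
        System.in.read();
    }
}

// Application.java
import java.util.concurrent.CountDownLatch;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class Application {
    public static void main(String[] args) throws Exception {
        // Load the beans
        ClassPathXmlApplicationContext context = new ClassPathXmlApplicationContext("spring/dubbo-provider.xml");
        context.start();
        System.out.println("dubbo service started");
        // Block forever so that the application does not exit
        new CountDownLatch(1).await();
    }
}

// DemoServiceImpl.java
import java.util.concurrent.CompletableFuture;
import org.apache.dubbo.demo.DemoService;
import org.apache.dubbo.demo.HelloReply;
import org.apache.dubbo.demo.HelloRequest;
import org.apache.dubbo.rpc.RpcContext;
// Assumption: the logger is slf4j; the original listing does not show its imports
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * The actual service implementation.
 */
public class DemoServiceImpl implements DemoService {
    private static final Logger logger = LoggerFactory.getLogger(DemoServiceImpl.class);

    @Override
    public HelloReply sayHello(HelloRequest request) {
        logger.info("Hello " + request.getName() + ", request from consumer: " + RpcContext.getContext().getRemoteAddress());
        return HelloReply.newBuilder()
                .setMessage("Hello " + request.getName() + ", response from provider: "
                        + RpcContext.getContext().getLocalAddress())
                .build();
    }

    @Override
    public CompletableFuture<HelloReply> sayHelloAsync(HelloRequest request) {
        // The async variant simply wraps the synchronous result
        return CompletableFuture.completedFuture(sayHello(request));
    }
}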

Conclusion

Processes can communicate directly over an application-layer protocol such as HTTP, or over a custom application-layer protocol built on TCP. For object-oriented high-level languages, however, received data has to be deserialized into objects, and with raw TCP we also have to deal with message framing. Can these communication details be hidden, so that communication between two processes looks like calling a local function? That is RPC. If the two processes are written in different languages, we introduce IDL for language neutrality. The communication still needs an application-layer protocol, either a custom one on TCP or an existing one such as HTTP. Today, with mature RPC frameworks, you no longer need to care about the details of the HTTP protocol or of serialization: with some configuration, you can call a function in another process just as you would call a local one. Everything is highly encapsulated. RPC adopted the function as its carrier in the course of its evolution in order to hide the details of communication and serialization; it did not start out in the form of functions. In essence it is a communication protocol.

References

From principle to operation: proxying Dubbo services more conveniently in Apache APISIX https://dubbo.apache.org/zh/blog/2022/01/18/%E4%BB%8E%E5%8E%9F%E7%90%86%E5%88%B0%E6%93%8D%E4%BD%9C%E8%AE%A9%E4%BD%A0%E5%9C%A8-apache-apisix-%E4%B8%AD%E4%BB%A3%E7%90%86-dubbo-%E6%9C%8D%E5%8A%A1%E6%9B%B4%E4%BE%BF%E6%8D%B7/
Chinese version of gRPC official documentation https://doc.oschina.net/grpc?t=60140

北冥有只鱼
