Cloud-native event-driven engine (RocketMQ-EventBridge): application scenarios and technical analysis

Author: Luo Jing

At the recently concluded RocketMQ Summit 2022, we officially open-sourced a new product: RocketMQ-EventBridge, an event-driven engine.

RocketMQ has always been known as a messaging engine. So what is an event-driven engine? Why are we launching one now? What are its application scenarios, and what are the corresponding technical solutions?

This article looks at these questions in three parts:

Part one explains what an event is.

Part two looks at the "superpowers" that distinguish events, and what we can do with them.

Part three describes RocketMQ's solution for events, which is also our open source project: RocketMQ-EventBridge.

What is an event

Take a moment to think: what is an event? One definition is:

A thing that happens, especially one of importance.

This is easy to understand. For example: I took a nucleic acid test yesterday afternoon; I ate another ice cream this morning. These are all things that have already happened. But if I go on to ask what the difference is between an event and a message, does the definition of an event suddenly feel less clear?

Can the events just mentioned be understood as messages? If Lao Zhang sends me a text message, is that an event or a message? In day-to-day development, when should we use messages and when should we use events?

Before answering this question, let's look at a typical microservice.


A microservice's interactions with external systems fall into two parts: receiving external requests (the yellow part at the top of the figure) and calling external services (the green part at the bottom of the figure).

There are two ways to receive external requests: one is to expose an API that accepts external Query and Command requests; the other is to actively subscribe to external Command messages. Once either kind of operation enters the system, we often call other microservices to help complete it. When these operations change the state of the system, events are generated.

Here, we refer to both the Command messages received from outside and the events generated inside the system as messages.

To summarize the relationship between messages and events: messages consist of two kinds, Command messages and Event messages.

1. On the left half of the figure, a Command is an operation request sent to the system by an external system;

2. On the right half of the figure, an Event is what is generated when the system's state changes after it processes a Command.


Therefore, events and messages are not the same thing: an event can be understood as a special kind of message. It is special in four ways:

Already happened and immutable

An event must have already "happened". What does that imply? Immutability: we cannot change the past. This property is very important. When we process and analyze events, we can fully trust that any event we receive describes behavior that actually occurred in the system, and that it cannot be modified.

Compare this with Command and Query. A Command is an order. It clearly has not happened yet; it only expresses an expectation, and what is expected does not necessarily succeed.

For example: turn on the kitchen light, ring the doorbell, transfer 100,000 to account A...

These are Commands: expected behaviors. Did they actually happen in the end? We don't know.

An Event states what has definitely happened. For example: the kitchen light was turned on, someone rang the doorbell, account A received 100,000...

A Query, by contrast, is a request to read the current state of the system. For example: is the kitchen light on? Is the doorbell ringing? What is the account balance (it shows 110,000)?
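The distinction between Command, Event, and Query can be sketched in a few lines of Python. This is only an illustration: the class names (TurnOnLight, LightTurnedOn, LightSystem) are hypothetical and not part of RocketMQ-EventBridge. The command is a request that may fail, the event is a frozen record of something that did happen, and the query only reads state.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass
class TurnOnLight:
    """Command: a request that may or may not succeed."""
    room: str


@dataclass(frozen=True)
class LightTurnedOn:
    """Event: a fact that has already happened, so it is frozen."""
    room: str


class LightSystem:
    def __init__(self, broken_rooms=()):
        self.on = set()
        self.broken = set(broken_rooms)

    def handle(self, command: TurnOnLight):
        """Apply a command; return an event only if the state actually changed."""
        if command.room in self.broken or command.room in self.on:
            return None  # the command expressed a wish, but nothing happened
        self.on.add(command.room)
        return LightTurnedOn(command.room)

    def is_on(self, room: str) -> bool:
        """Query: read the current state without changing it."""
        return room in self.on


system = LightSystem(broken_rooms={"garage"})
kitchen_event = system.handle(TurnOnLight("kitchen"))  # LightTurnedOn(room='kitchen')
garage_event = system.handle(TurnOnLight("garage"))    # None: the command failed
```

Trying to assign to `kitchen_event.room` raises `FrozenInstanceError`, which mirrors the idea that a received event cannot be modified.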

No expectations

How should we understand this? An event objectively describes a change in the state or attribute values of something, but expresses no expectation about how the event should be handled.

In contrast, Commands and Queries carry expectations: they want the system to make a change or return a result. An Event only describes, objectively, a change in the system.

Consider a traffic light changing from green to red. The event itself does not require pedestrians or cars to stop; it is traffic law that gives the light its meaning and rules.


Therefore, a system generally does not send events to a specific target system. Instead, it reports them to a shared "event center", which holds the events reported by all systems. Each system declares to the event center which events it produces and what format those events have.

Other systems that are interested can actively subscribe to these events. It is the event consumer that really gives an event its value. A consumer wants to know what has changed in a certain system, so it subscribes to that system's events. In this sense, events are consumer-driven.

How is this different from messages? The sending and subscribing of Command messages is agreed between the two parties, usually through documents or code, and outsiders are unaware of it. Both sides send and consume according to the agreed protocol, so messages are producer-driven.

An analogy: events are like a market economy. Once a commodity is produced, whether it has value and how much depends largely on its consumers. We can see all kinds of events in the system, just as all kinds of goods are displayed in a shop window. Command messages are more like a planned economy: they are created with a definite purpose and are "allocated" to a specific consumer.

Natural order

The third property of events is natural order. For the same entity, two events A and B cannot occur at the same instant; one must come before the other. If two events on the same entity do occur simultaneously, they must belong to different event types.

For example, the same traffic light cannot turn both green and red at once. At any given moment it can change to only one state.


You may have noticed another property hidden here: because events are naturally ordered, each one is bound to a specific moment on the timeline and cannot coincide with another, so each event must be unique.

If we see two events with the same content, then the thing must have happened twice, one occurrence after the other. (This is very valuable for handling eventual consistency of data and for analyzing system behavior: what we see is not just the final result of the system, but the series of intermediate steps that led to it.)

Concrete

The fourth property of events is that they are concrete.

An event records the scene as completely as possible. Because it does not know how consumers will use it, it is as detailed as possible, for example:

● Who generated the event? Subject

● What type of event is it? Type

● Who sent the event? Source

● What uniquely identifies the event? Id

● When did it happen? Time

● What is the content of the event? Data

● What describes the structure of that content? Dataschema

Let's take the example of traffic lights:
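As a rough illustration of these attributes, a traffic light event could be built as follows. The attribute names follow CloudEvents; the source path, type name, schema URL, and the helper make_light_event are made-up values for this sketch, not taken from the original figure.

```python
import json
import uuid
from datetime import datetime, timezone


def make_light_event(intersection: str, old: str, new: str) -> dict:
    """Build an event dict using CloudEvents attribute names."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),                      # unique per occurrence
        "source": "/traffic/city-a/signal-system",    # hypothetical source
        "type": "com.example.traffic.light.changed",  # hypothetical type
        "subject": intersection,
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "dataschema": "https://example.com/schemas/light-changed.json",  # hypothetical
        "data": {"from": old, "to": new},
    }


event = make_light_event("intersection-42", "green", "red")
print(json.dumps(event, indent=2))
```

Calling the helper twice with the same arguments yields the same data but different id values, which matches the point that two events with identical content are still two distinct occurrences.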


Ordinary messages are different: because upstream and downstream are generally fixed, messages are usually kept as lean as possible for performance and transmission efficiency, as long as they meet the needs of the consumers designated by the "planned economy".

To sum up, these four properties give events "superpowers" that ordinary messages lack. As a result, events are commonly used in four typical scenarios: event notification, event sourcing, inter-system integration, and CQRS.


Let's look at these application scenarios one by one.

Typical application scenarios for events

Event notification

Event notification is a very common scenario. For example: a user's order event is notified to the payment system; a user's payment event is notified to the transaction system.

Let's return to the traffic light example from earlier. When a traffic light changes from red to green, many systems may need this information.

Method 1: The sender actively calls, and adapts to, each receiver

The simplest approach is to call each system in turn and pass the information along. For example, the traffic light system calls the API of the map navigation service, the API of the traffic police central control, and the API of the city brain, sending each the light-change signal.


But as we all know, this design is terrible, and it becomes a disaster as the number of systems grows. Not only is the development cost high, but if one of the downstream systems has a problem, the whole service may hang and calls to the other systems are affected.

Method 2: The receiver actively subscribes, and adapts to the sender

A natural solution is to send these messages to an intermediate message Broker, so that other systems can subscribe to them as needed.


Now the traffic light system has no direct call dependency on the other systems. The traffic police central control, map navigation, and city brain services only need to subscribe to the traffic light's messages and parse them according to the agreed protocol.

However, there is still a problem: in this architecture the traffic light is the center. Consumers must understand the sender's business domain and add an adaptation layer (the white boomerang-shaped part in the figure) to convert messages into the language of their own domain. Yet every microservice wants high cohesion and low coupling.

Suppose the traffic police central control needs traffic light data for the whole country, but each region uses a different message format. The central control then has to adapt to every region's protocol and add a conversion layer. And what if a format changes later? Imagine the operation and maintenance cost.


Can the traffic police central control require every traffic light system in the country to send data in one common protocol? Unfortunately not: the same traffic light data is also used by the map service and the city brain, so the format cannot be changed for one consumer.

Method 3: Introduce events, and let the Broker adapt flexibly to each receiver's protocol

With events, things are different. Because events carry no expectations and are concrete, they naturally retain as much information about the scene as possible and are more standardized. A consumer (here, the traffic police central control) can easily tell events from different provinces apart, collect them, and assemble them into the format its own business requires.


Moreover, this assembly happens in the intermediate Broker. The traffic police central control only needs to provide an API for receiving events, designed around its own business domain; events are then actively delivered to this API through the Broker. From start to finish, the central control system contains not a single line of code that adapts to an external business.

This approach has three distinct advantages:

1. Each system focuses only on its own business domain and needs no code to adapt to external systems;

2. All changes to the system converge on the API, which becomes the only entry point; the same API can be used both for receiving events and for console operations;

3. Because events are pushed, there is no need to integrate an SDK to connect to the Broker and fetch messages, which reduces system complexity.

Our initial diagram then becomes this: traffic lights generate events and deliver them to the event center. Consumers that need these events subscribe at the event center, and the event center actively delivers the events in the format each consumer expects.
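The flow just described can be sketched as a toy in-memory event center. This is a minimal sketch under stated assumptions, not EventBridge's implementation: EventCenter, the "light.changed" type, and the two inbox lists (standing in for each consumer's receiving API) are all hypothetical. The point is that the reshaping happens in the middle, so neither consumer contains code that knows the producer's format.

```python
from typing import Any, Callable, Dict, List, Tuple

Event = Dict[str, Any]
Transform = Callable[[Event], dict]
Target = Callable[[dict], None]


class EventCenter:
    """Toy in-memory event center: producers publish, the center pushes."""

    def __init__(self) -> None:
        self._subs: List[Tuple[str, Transform, Target]] = []

    def subscribe(self, event_type: str, transform: Transform, target: Target) -> None:
        self._subs.append((event_type, transform, target))

    def publish(self, event: Event) -> None:
        for event_type, transform, target in self._subs:
            if event["type"] == event_type:
                target(transform(event))  # reshape, then push to the consumer's API


police_inbox: List[dict] = []  # stands in for the traffic police API
nav_inbox: List[dict] = []     # stands in for the map navigation API

center = EventCenter()
center.subscribe(
    "light.changed",
    lambda e: {"junction": e["subject"], "state": e["data"]["to"]},
    police_inbox.append,
)
center.subscribe(
    "light.changed",
    lambda e: {"node": e["subject"], "passable": e["data"]["to"] == "green"},
    nav_inbox.append,
)

center.publish({"type": "light.changed", "subject": "intersection-42",
                "data": {"from": "red", "to": "green"}})
```

After the publish call, each inbox holds one payload in that consumer's own format, produced from the same source event.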


Let's review the whole process again:

Figure 1: At first, the traffic light system sends information to each system through direct, strongly dependent calls. Here the downstream services are the center, and the traffic light system adapts to each of them.

Figure 2: Next, we decoupled the call chain with traditional messaging. The two sides no longer depend on each other directly, but a business dependency remains: consumers must understand the producers' message format and convert it within their own systems. Here the producer is the center.

Figure 3: Finally, we introduce event notification. Producers and consumers each need to care only about their own system: which events to produce and which data format to consume are both decided around their own business, with no adaptation to the other side. This truly achieves high cohesion and low coupling, and complete decoupling.

Returning to the typical microservice model from the beginning: in some scenarios we can route all change operations on a microservice through the API entry point and remove the Command message entry. Converging entry points is often very helpful for maintaining microservices and keeping the system stable.


Event sourcing

What is event sourcing? Put simply, it lets the system return to any moment in the past. How? First, record every change in the system as an event; then, replay the events to return to any past moment.

Why can only events do this, and not ordinary messages? The answer lies in the properties we just described: events have already happened and are immutable, they are naturally ordered and unique, and they record the scene in full detail. For event sourcing, events are therefore first-class citizens of the system.

For example, if we can collect complete event information from the road, including traffic lights, traffic volume, weather, and congestion, then we can "travel through time": return to the traffic scene as it was and make a decision again. In a smart traffic scenario, when we want to verify a scheduling algorithm, we can replay all the events from that time to reproduce the scene.

This may sound remarkable, but in fact we use it all the time. Can you guess where? In our everyday version control systems, such as Git (for example, on GitHub).

Some may ask: if a system has accumulated a great many events, does replay take a long time? In some transaction scenarios, for example, a large number of events is generated every day. How do we handle that? Typically the system takes a snapshot every night. If the system crashes unexpectedly, or you want to return to a certain moment, you load the previous day's snapshot and replay that day's events to recover. During the day, all events are processed in memory without touching the database, so the system is very fast; only the events themselves are persisted to disk.
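The replay-from-snapshot idea can be sketched as follows. This is a simplified illustration with a hypothetical event shape (sequence number, account, amount delta) and hypothetical helper names; a real system would persist the log and snapshots rather than keep them in memory. The sketch assumes the snapshot was taken at or before the moment being restored.

```python
from typing import Dict, Iterable, List, Optional, Tuple

Event = Tuple[int, str, int]  # (sequence number, account, amount delta)
Snapshot = Tuple[int, Dict[str, int]]  # (sequence number, state at that point)


def apply_event(state: Dict[str, int], event: Event) -> Dict[str, int]:
    """Return a new state with one event applied; events are never edited."""
    _, account, delta = event
    new_state = dict(state)
    new_state[account] = new_state.get(account, 0) + delta
    return new_state


def replay(events: Iterable[Event], upto: int,
           snapshot: Optional[Snapshot] = None) -> Dict[str, int]:
    """Rebuild the state as of sequence number `upto`.

    If a snapshot is given, only events after it are replayed.
    """
    start_seq, state = snapshot if snapshot else (0, {})
    for event in events:
        if start_seq < event[0] <= upto:
            state = apply_event(state, event)
    return state


log: List[Event] = [(1, "A", 100), (2, "B", 50), (3, "A", -30), (4, "A", 10)]

full = replay(log, upto=4)                      # replay everything
snap = (2, replay(log, upto=2))                 # e.g. taken by a nightly job
from_snap = replay(log, upto=4, snapshot=snap)  # snapshot plus later events
past = replay(log, upto=2)                      # "go back" to sequence 2
```

Replaying from the snapshot gives the same state as replaying the full log, but only touches the events recorded after the snapshot.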

Of course, event sourcing does not suit every scenario. It has advantages and disadvantages; see the figure above for details.

System integration

The first scenario, event notification, generally involves collaboration between two teams, upstream and downstream. The second, event sourcing, generally happens within a single team. Integration between systems, however, often involves three business teams. How should we understand this?

It is actually very common. For example, a company purchases an ERP system, an external attendance system, external marketing services, and so on. What do these systems have in common? We did not develop them ourselves; we bought them.


What if we want to synchronize personnel information from the ERP system to the attendance system automatically and in real time? This is somewhat troublesome, because we did not develop either system:

1. We cannot modify the ERP system's code so that it calls the attendance system and sends personnel changes;

2. We cannot modify the attendance system's code so that it calls the external ERP system's API.

However, with an event bus we can collect the personnel-change events generated by the upstream ERP system, using webhooks or standard APIs, then filter and transform them and push them to the downstream attendance system. The downstream could equally be a system developed in-house.

The development model therefore becomes: the event center manages the events of all SaaS services as well as those generated by in-house systems. We only need to find the events we need in the event center, subscribe to them, and do simple service orchestration across the SaaS services and in-house systems to complete development.

CQRS

The C in CQRS stands for Command, an explicit order that generally covers Create, Update, and Delete; the Q stands for Query. So the essence of CQRS is the separation of reads and writes: all write operations are handled by the system on the left of the figure, and the events generated when Commands change that system are then synchronized to the query system on the right.


You may ask how this differs from read/write separation in a database, which also provides a write DB and a read DB with synchronization between them.

One big difference is that database read/write separation is database-centric: the databases on both sides are identical, down to the data storage structure.

CQRS read/write separation, by contrast, is business-centric. The data structures stored on the two sides often differ, and even the databases may differ. Each side chooses the technology best suited to its own read or write business logic. For writes, we may use a relational database to guarantee transactions; for reads, we may use NoSQL stores such as Redis or HBase to improve performance.
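A minimal sketch of this separation, with hypothetical class names and plain dictionaries standing in for the two stores: the write side validates commands and records events, and the read side keeps a differently shaped view that is updated only from those events.

```python
from typing import Dict, List


class OrderWriteModel:
    """Write side: validates commands and records the resulting events."""

    def __init__(self) -> None:
        self.orders: Dict[str, dict] = {}  # stands in for a relational database
        self.outbox: List[dict] = []       # events to synchronize to the read side

    def create_order(self, order_id: str, user: str, amount: int) -> None:
        if order_id in self.orders:
            raise ValueError("duplicate order")
        self.orders[order_id] = {"user": user, "amount": amount}
        self.outbox.append({"type": "OrderCreated", "order_id": order_id,
                            "user": user, "amount": amount})


class UserSpendReadModel:
    """Read side: a denormalized view shaped for one specific query."""

    def __init__(self) -> None:
        self.total_by_user: Dict[str, int] = {}  # stands in for a key-value store

    def on_event(self, event: dict) -> None:
        if event["type"] == "OrderCreated":
            user = event["user"]
            self.total_by_user[user] = self.total_by_user.get(user, 0) + event["amount"]

    def total_spent(self, user: str) -> int:
        return self.total_by_user.get(user, 0)


write_side = OrderWriteModel()
read_side = UserSpendReadModel()

write_side.create_order("o-1", "alice", 30)
write_side.create_order("o-2", "alice", 12)
for event in write_side.outbox:  # in practice this synchronization is asynchronous
    read_side.on_event(event)
```

The write store is keyed by order, while the read store is keyed by user; the two structures differ because each is designed around its own side of the business.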

Of course, CQRS does not suit every scenario. It tends to fit best when:

● High-concurrency writes and high-concurrency reads must both be supported;

● The write model and the read model differ significantly;

● The read/write ratio is very high.

We have covered four application scenarios for events, but events are not a cure-all. Just as there is no silver bullet in software development, there are many scenarios where events are a poor fit, including:

● Synchronous call scenarios that depend strongly on a response;

● Scenarios where service calls must maintain strong transactional consistency.

RocketMQ's solution for events

What capabilities are needed?

First, based on the application scenarios above, let's work out what capabilities a system needs in order to do event-driven architecture well.


First, we need an event standard. Events are not for ourselves or for one particular party, but for everyone. As mentioned, an event carries no expectations: it has no designated consumer, and everyone is a potential consumer. We therefore need a standardized event definition that anyone can understand at a glance.

Second, we need an event center that holds all registered systems and their events. (Messages are different: there is no message center, because messages are generally directed and are an agreement between producer and consumer. That resembles a planned economy, where a message is produced for a definite purpose and a definite consumer.) The event center resembles a market-economy hypermarket, stocked with all kinds of events by category. Anyone can come in and browse without buying, and take away any events that turn out to be useful.

Third, we need an event format (schema) to describe the specific content of an event. This is like a sales contract in a market economy. The format of events sent by producers must be fixed and cannot change arbitrarily; the format of events received by consumers must also be fixed. Otherwise the whole market falls into chaos.

Fourth, we need to give consumers the ability to have events delivered to a target. Before delivery, events can be filtered and transformed to fit the parameter format of the target API. We call this a subscription rule.

Fifth, we need a place to store events: the event bus in the middle.

Event standard

For the event standard, we chose CloudEvents, an open source project under the CNCF that has been widely integrated and is a de facto standard.


The specification is simple. It mainly standardizes four required fields (id, source, type, specversion) and several optional fields (subject, time, dataschema, datacontenttype, and data). The right side of the figure above shows a simple example; we will not go into detail here.

Event transport also needs a defined protocol so that different systems can communicate. For HTTP, three content modes are supported by default: Binary Content Mode, Structured Content Mode, and Batched Content Mode, distinguished by the HTTP Content-Type header. The first two deliver a single event; the third delivers a batch of events.
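As a small sketch of how a receiver might tell the three modes apart, the function below classifies a request by its Content-Type. The function name content_mode is hypothetical; the media type prefixes (application/cloudevents for structured mode, application/cloudevents-batch for batched mode, anything else treated as binary mode) follow the CloudEvents HTTP protocol binding as I understand it, so check the specification before relying on the details.

```python
def content_mode(content_type: str) -> str:
    """Classify a CloudEvents HTTP message by its Content-Type header."""
    # Drop parameters such as "; charset=utf-8" and compare case-insensitively.
    media_type = content_type.split(";")[0].strip().lower()
    # Check the batch prefix first: it also starts with "application/cloudevents".
    if media_type.startswith("application/cloudevents-batch"):
        return "batched"
    if media_type.startswith("application/cloudevents"):
        return "structured"
    return "binary"  # event attributes travel in ce-* headers, data in the body


print(content_mode("application/cloudevents+json; charset=utf-8"))
print(content_mode("application/cloudevents-batch+json"))
print(content_mode("application/json"))
```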


Event schema

The event schema describes the attributes in an event, their meanings, constraints, and other information. We currently support JSON Schema and OpenAPI 3.0. The schema can be used to validate events. Changes to the schema itself must also follow compatibility rules, which are not detailed here.

Event filtering and transformation

For event filtering and transformation, we provide 7 filtering methods and 4 transformation methods, which the accompanying figure describes in detail.
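To give a feel for what a filter and a transformation do, here is a small sketch. The pattern syntax (a list of exact values, or a {"prefix": ...} rule) and the template-based transform are invented for illustration; they are not EventBridge's actual rule syntax, and they cover only two of the filtering ideas.

```python
from string import Template
from typing import Any, Dict


def matches(event: Dict[str, Any], pattern: Dict[str, Any]) -> bool:
    """Return True if every field in the pattern matches the event.

    A pattern value is either a list of allowed exact values or a dict
    {"prefix": "..."}; both forms are illustrative only.
    """
    for field, rule in pattern.items():
        value = event.get(field)
        if isinstance(rule, dict):
            if not (isinstance(value, str) and value.startswith(rule["prefix"])):
                return False
        elif value not in rule:
            return False
    return True


def transform(event: Dict[str, Any], template: str) -> str:
    """Fill a text template with the event's top-level string attributes."""
    return Template(template).substitute(
        {k: v for k, v in event.items() if isinstance(v, str)})


event = {"source": "/traffic/city-a", "type": "light.changed",
         "subject": "intersection-42"}
pattern = {"source": {"prefix": "/traffic/"}, "type": ["light.changed"]}

if matches(event, pattern):
    body = transform(event, "Light at $subject changed ($type)")
    print(body)
```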


Technical architecture

RocketMQ's event-driven product is called EventBridge; it is the new product we are open-sourcing this time.

Its architecture can be divided into two parts: the control plane at the top and the data plane at the bottom.


At the top of the control plane, EventSource is the event source registered by each system. Events can be sent to the event bus through the APIGateway, or a SourceRunner can be generated from the configured EventSource to actively pull events from our systems. Once events arrive on the event bus (EventBus), we can configure a subscription rule (EventRule), which specifies how to filter events and what transformations to apply before delivery to the target. Based on the rules created, the system generates a TargetRunner, which pushes events to the specified target.

So what are SourceRunner and TargetRunner? Which upstream Sources and downstream Targets can we connect to?

These can be registered in advance in the SourceRegister and TargetRegister at the bottom.

The data plane of EventBridge is therefore an open architecture: it defines an SPI for event processing, which can have multiple implementations underneath. For example, if we register RocketMQ's HTTP Connector with EventBridge, we can push events to an HTTP server.

If we register Kafka's JDBC Connector with EventBridge, we can push events to a database.

Of course, if your system does not use a common protocol such as HTTP or JDBC, you can develop your own Connector to synchronize events to EventBridge in real time or to receive events from EventBridge.
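The idea of an SPI with pluggable implementations can be illustrated with a small registry. Everything here (TargetConnector, InMemoryTarget, TargetRegistry, and the put method) is a hypothetical sketch of the pattern; it is not the actual EventBridge or connector interface.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List


class TargetConnector(ABC):
    """Hypothetical SPI: anything that can receive an event."""

    @abstractmethod
    def put(self, event: dict) -> None:
        ...


class InMemoryTarget(TargetConnector):
    """Stand-in for an HTTP or JDBC connector; it just stores events."""

    def __init__(self) -> None:
        self.received: List[dict] = []

    def put(self, event: dict) -> None:
        self.received.append(event)


class TargetRegistry:
    """Maps a connector name to a factory, so new kinds can be plugged in."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], TargetConnector]] = {}

    def register(self, name: str, factory: Callable[[], TargetConnector]) -> None:
        self._factories[name] = factory

    def create(self, name: str) -> TargetConnector:
        if name not in self._factories:
            raise KeyError(f"no connector registered for {name!r}")
        return self._factories[name]()


registry = TargetRegistry()
registry.register("memory", InMemoryTarget)

target = registry.create("memory")
target.put({"type": "light.changed"})
```

A custom connector for another protocol would only need to implement the same interface and be registered under its own name.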

In addition, we will provide further operation and maintenance capabilities, including event tracing, event replay, event analysis, and event archiving.

RocketMQ-EventBridge and the cloud

Among the open source connectors for integrating with upstream and downstream systems, there is a special one called EventBridgeConnector, which integrates easily with the event bus on Alibaba Cloud. Here are two typical application scenarios:

In the first scenario, events generated inside an IDC system are used not only for decoupling internal systems but are also synchronized to the cloud in real time to drive computing services there. For example, the events can be analyzed offline, or an image recognition service on the cloud can be driven to analyze, in real time, the images referenced in the events.

In the second scenario, if the IDC runs a self-built MQ internally, MQConnector and EventBridgeConnector can synchronize events to the cloud in real time, allowing the self-built MQ to be migrated gradually to MQ on the cloud.


Ecosystem development

As for EventBridge's future direction, we hope that through open source we can build an event bus ecosystem that supports multi-cloud architectures. What does this mean? Put simply, we hope events can break down the walls between different cloud vendors, and between cloud vendors and internal IDC systems, so that they can interoperate. Although cloud computing has developed rapidly in recent years, some very large customers do not want to be tightly bound to a single cloud vendor. This is both a result of healthy market competition and a way for large customers to reduce risk. Being able to interact flexibly, and even migrate flexibly, between different cloud vendors and between cloud systems and internal IDC systems is therefore a very important requirement for enterprises.


Of course, this is hard to achieve. But if the enterprise architecture is designed and built on an event-driven architecture, with the interactions between systems revolving around events, it becomes much easier.

Here, events act as a universal language through which different systems can communicate. For example, events in an IDC system can drive services on Alibaba Cloud, and events on Alibaba Cloud can even drive services running on AWS.

To achieve this goal, we need to develop corresponding connectors when integrating with the systems of different cloud vendors and SaaS providers.

You are welcome to join us in building the RocketMQ-EventBridge ecosystem.

Source address:

https://github.com/apache/rocketmq-eventbridge

If you are interested, you are welcome to join the DingTalk group for discussion (group number: 44552972).

