Blog recommendation | Implement Exactly-once semantics based on Pulsar transactions

Translator Profile
The original article was published by Li Penghui on the StreamNative English site: https://streamnative.io/en/blog/release/2021-06-14-exactly-once-semantics-with-transactions-in-pulsar

Translator: Jialing, China Mobile Cloud Capability Center; product lead for Mobile Cloud Pulsar and Apache Pulsar Contributor, active in Apache Pulsar and other open source projects and communities.

The Apache Pulsar community delivered a milestone feature in the newly released Pulsar 2.8.0: Exactly-once semantics. Previously, Exactly-once semantics could be guaranteed only on a single topic, by enabling message deduplication on the Broker side. With Pulsar 2.8.0, the transaction API makes message production and acknowledgment atomic across topics. Below, I explain what these two approaches mean and how they are implemented, and how to use Pulsar transactions to achieve Exactly-once semantics in real-time messaging and stream computing systems.

Before digging into Pulsar transactions, let's review the concept of message semantics.

What is Exactly-once semantics?

In a distributed system, any node may misbehave or go down, and Apache Pulsar is no exception. While a Producer is producing messages, a Broker or Bookie may crash and become unavailable, or the network may suddenly be interrupted. Depending on how the Producer handles messages when such a failure occurs, the system provides one of the following three message semantics.

At-least-once semantics

The Producer confirms that a message has been written to the Pulsar Topic by receiving an ACK (message acknowledgment) from the Broker. If the ACK times out, or the Producer receives an error from the Broker, the Producer resends the message. If the Broker had in fact written the message to the Topic but failed to deliver the ACK, the resent message is written to the Topic a second time, and the Consumer eventually receives it more than once.

At-most-once semantics

If the Producer does not resend a message when the ACK times out or the Broker returns an error, the message may be lost: it is never written to the Topic and never consumed by the Consumer. In some scenarios, occasional message loss is tolerable as the price of avoiding duplicate consumption.

Exactly-once semantics

Exactly-once semantics ensures that even if the Producer sends the same message to the server multiple times, the server records it only once. It is the strongest of the three guarantees and also the hardest to understand, because it requires cooperation among the messaging server, the producer, and the consumer application. For example, if a consumer application successfully consumes and ACKs a message and then rewinds its consumption position to an earlier message ID, every message from that ID onward is delivered to the application again.

Difficulties in implementing Exactly-once semantics

There are many challenges in implementing Exactly-once semantics in a distributed messaging system. A simple example illustrates them.

Suppose a Producer sends a message with the content "Hello StreamNative" to the Topic "Greetings" on Pulsar, and a Consumer receives messages from this Topic and prints them. In the ideal case, nothing goes wrong: the message "Hello StreamNative" is written to the Topic "Greetings" exactly once, the Consumer receives and processes it, and then sends an ACK to tell Pulsar that processing is complete. Even if the Consumer later crashes or restarts, it does not receive this message again.

In practice, however, failures can occur anywhere.

Bookie may be down

Pulsar uses BookKeeper to store messages. BookKeeper is a highly available, persistent log storage system. Data written to a Ledger (a segment of a Pulsar Topic) is stored on N Bookie nodes, so BookKeeper can tolerate the failure of N-1 Bookie nodes: as long as at least one Bookie is available, the data in the Ledger is not lost. BookKeeper's replication protocol, which draws on the Zab protocol and the Paxos algorithm, guarantees that once data has been successfully written to a Bookie, it is automatically replicated to the other Bookie nodes in the same group and stored durably.

Broker may be down, or the network connection to the Producer may be interrupted

The Producer confirms that a message was sent successfully by receiving an ACK from the Broker. However, a missing ACK does not always mean that the send failed. The Broker may fail after writing the message to the Topic but before sending the ACK, or it may fail before writing the message at all. Because the Producer cannot tell which of these happened, by default it assumes the send failed and resends the message. As a result, Pulsar may in some cases write duplicate messages, which the Consumer then consumes more than once.
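The ambiguity described above can be made concrete with a small simulation. The sketch below does not use the Pulsar client; `LostAckDemo`, `Failure`, `publish`, and `send` are hypothetical names invented for illustration. It models a Broker that fails either before or after persisting a message, and a Producer that either retries or gives up.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a Producer talking to a Broker that fails on the first attempt. */
class LostAckDemo {
    /** When the simulated Broker fails, relative to persisting the message. */
    enum Failure { NONE, BEFORE_WRITE, AFTER_WRITE }

    /** Simulated Broker write: returns true only if an ACK reaches the Producer. */
    static boolean publish(List<String> topic, String payload, Failure failure) {
        if (failure == Failure.BEFORE_WRITE) {
            return false;                      // crashed before persisting: nothing stored
        }
        topic.add(payload);                    // message is persisted
        return failure != Failure.AFTER_WRITE; // AFTER_WRITE: stored, but the ACK is lost
    }

    /** Sends one message and returns the resulting topic contents. */
    static List<String> send(String payload, Failure firstAttempt, boolean retry) {
        List<String> topic = new ArrayList<>();
        boolean acked = publish(topic, payload, firstAttempt);
        if (!acked && retry) {
            // The Producer cannot tell the two failures apart, so it simply resends.
            publish(topic, payload, Failure.NONE);
        }
        return topic;
    }

    public static void main(String[] args) {
        for (Failure f : new Failure[] {Failure.BEFORE_WRITE, Failure.AFTER_WRITE}) {
            System.out.println(f + ", retry:    " + send("Hello StreamNative", f, true).size() + " stored");
            System.out.println(f + ", no retry: " + send("Hello StreamNative", f, false).size() + " stored");
        }
    }
}
```

With retries, a failure after the write leaves two copies in the topic (at-least-once). Without retries, a failure before the write leaves none (at-most-once). Neither policy alone yields exactly one copy in both cases.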

Pulsar client may be down

The possibility that the Pulsar client becomes unavailable must also be considered when implementing Exactly-once semantics. It is hard to tell reliably whether a client has failed permanently or is only temporarily unavailable, yet the Broker needs to be able to make that distinction. The Broker must reject messages sent by a client in an abnormal state. Once the client restarts, it must be able to learn its state at the time of the failure and resume processing from the appropriate position.

The Pulsar community implemented Exactly-once semantics in two stages. In Pulsar 1.20.0-incubating, we used the idempotent Producer to guarantee Exactly-once semantics on a single topic. In the latest version, Pulsar 2.8.0, we introduced the transaction API to make message operations atomic across topics.

Idempotent Producer: Implement Exactly-once semantics of a single topic

We start with the idempotent Producer, which has guaranteed Exactly-once semantics on a single topic since Pulsar 1.20.0-incubating.

What is an idempotent Producer? Idempotence means that performing the same operation once or many times yields the same result. If message deduplication is enabled at the cluster or namespace level and an idempotent Producer is configured on the producing side, then when a failure causes the Producer to resend a message, the Broker writes that message only once.

With this feature, a single topic loses no messages, contains no duplicates, and keeps all messages in order. To enable it:

Enable message deduplication at the cluster level (applies to all topics in the cluster), the namespace level (applies to all topics in the namespace), or the topic level (applies to a single topic)
Give the Producer a name of your choice and set its send timeout to 0

How is this feature implemented? Put simply, it resembles the deduplication mechanism of the TCP protocol: every message sent to Pulsar carries a unique sequence number, and the Pulsar Broker uses this number to detect and discard duplicates. The difference is that TCP can deduplicate only within a live connection, whereas Pulsar stores the sequence number in the Topic together with each message and records the latest sequence number it has received. So even if a Broker node crashes, the Broker that takes over the topic can still determine whether a message is a duplicate. The mechanism is simple, and its performance overhead compared with a non-idempotent Producer is almost negligible.
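The sequence-number mechanism described above can be sketched without the Pulsar client. In this toy model, `DedupBroker`, `Entry`, and the producer name "greeter" are hypothetical and are not Pulsar classes. The Broker keeps the highest sequence ID per Producer name, and a Broker that takes over rebuilds that state from the stored log.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class SequenceIdDedupDemo {
    /** One stored message, kept together with its producer name and sequence ID. */
    static final class Entry {
        final String producerName;
        final long sequenceId;
        final String payload;

        Entry(String producerName, long sequenceId, String payload) {
            this.producerName = producerName;
            this.sequenceId = sequenceId;
            this.payload = payload;
        }
    }

    /** Toy Broker that drops any message whose sequence ID was already persisted. */
    static final class DedupBroker {
        private final List<Entry> log;
        private final Map<String, Long> lastSequenceId = new HashMap<>();

        /** Starts a Broker over an existing log and rebuilds the dedup state from it. */
        DedupBroker(List<Entry> log) {
            this.log = log;
            for (Entry e : log) {
                lastSequenceId.merge(e.producerName, e.sequenceId, Long::max);
            }
        }

        /** Returns true if the message was appended, false if it was a duplicate. */
        boolean publish(String producerName, long sequenceId, String payload) {
            long last = lastSequenceId.getOrDefault(producerName, -1L);
            if (sequenceId <= last) {
                return false; // already persisted, e.g. a resend after a lost ACK
            }
            log.add(new Entry(producerName, sequenceId, payload));
            lastSequenceId.put(producerName, sequenceId);
            return true;
        }
    }

    public static void main(String[] args) {
        List<Entry> sharedLog = new ArrayList<>();
        DedupBroker first = new DedupBroker(sharedLog);
        first.publish("greeter", 0, "Hello");            // stored; assume the ACK is lost
        DedupBroker second = new DedupBroker(sharedLog); // another Broker takes over
        second.publish("greeter", 0, "Hello");           // resend is recognised and dropped
        second.publish("greeter", 1, "StreamNative");
        System.out.println("stored messages: " + sharedLog.size());
    }
}
```

Because the sequence IDs live in the log rather than in the connection, the second Broker can reject the resend even though it never saw the first attempt.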

Pulsar 1.20.0-incubating and later versions support this feature. You can find an introduction to it here.

However, the idempotent Producer guarantees Exactly-once semantics only in certain scenarios. Consider a Producer that needs to send a message to several Topics at once when the Broker serving some of those Topics goes down. If the Producer does not resend the message, it is missing from some Topics. If the Producer does resend it, the message is written twice to the Topics that had already received it.

On the consumer side, the ACK request that the Consumer sends to the Broker is best-effort: the request may be lost, the Consumer cannot know whether the Broker received it, and a lost ACK is not resent. This, too, causes the Consumer to receive duplicate messages.

Transaction API: Achieve the atomic operation of cross-topic message production and confirmation

To solve these problems, we introduced the transaction API, which makes message sending and acknowledgment atomic across topics. With it, a Producer can send messages to multiple topics as a unit: either all of the messages are written and can be consumed on every topic, or none of them can be consumed. It also allows messages on multiple topics to be acknowledged within a single transaction, which makes end-to-end Exactly-once semantics possible.

The following example code demonstrates how to use the transaction API:

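import java.util.concurrent.TimeUnit;
import org.apache.pulsar.client.api.*;
import org.apache.pulsar.client.api.transaction.Transaction;

PulsarClient pulsarClient = PulsarClient.builder()
        .serviceUrl("pulsar://localhost:6650")
        .enableTransaction(true)
        .build();
// Transactional producers must disable the send timeout. Topic and subscription names are illustrative.
Producer<byte[]> producer = pulsarClient.newProducer().topic("output-topic")
        .sendTimeout(0, TimeUnit.SECONDS).create();
Consumer<byte[]> consumer = pulsarClient.newConsumer().topic("input-topic")
        .subscriptionName("my-subscription").subscribe();
Transaction txn = pulsarClient
        .newTransaction()
        .withTransactionTimeout(1, TimeUnit.MINUTES)
        .build()
        .get();
// Produce one message and acknowledge another inside the same transaction.
producer.newMessage(txn).value("Hello Pulsar Transaction".getBytes()).send();
Message<byte[]> message = consumer.receive();
consumer.acknowledgeAsync(message.getMessageId(), txn).get();
// Committing makes both operations take effect atomically.
txn.commit().get();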

This code shows how to use the transaction API to send a message and acknowledge another message atomically, within the same transaction.

Note the following:

Within the same topic, some messages may belong to a transaction while others belong to no transaction.
The Pulsar client allows multiple concurrent uncommitted transactions. This is the most fundamental difference from other messaging systems that support transactions, and it greatly increases the throughput of transactional messages.
The current Pulsar transaction API supports only the READ_COMMITTED isolation level. The Consumer receives only messages that belong to no transaction and messages in committed transactions; it does not receive messages in uncommitted or rolled-back transactions.
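The visibility rule in the last point can be illustrated with a small in-memory model. This is only a sketch of the READ_COMMITTED filter: `ToyTopic`, `TxnState`, and the transaction IDs are hypothetical, and the model deliberately ignores how a real Broker handles ordering and dispatch around open transactions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ReadCommittedDemo {
    enum TxnState { OPEN, COMMITTED, ABORTED }

    /** One stored message; txnId is null for non-transactional messages. */
    static final class Entry {
        final String payload;
        final String txnId;

        Entry(String payload, String txnId) {
            this.payload = payload;
            this.txnId = txnId;
        }
    }

    /** Toy topic that mixes transactional and non-transactional messages. */
    static final class ToyTopic {
        private final List<Entry> log = new ArrayList<>();
        private final Map<String, TxnState> txns = new HashMap<>();

        void begin(String txnId) { txns.put(txnId, TxnState.OPEN); }
        void commit(String txnId) { txns.put(txnId, TxnState.COMMITTED); }
        void abort(String txnId) { txns.put(txnId, TxnState.ABORTED); }

        void publish(String payload) { log.add(new Entry(payload, null)); }
        void publish(String payload, String txnId) { log.add(new Entry(payload, txnId)); }

        /** Payloads that a READ_COMMITTED consumer is allowed to see right now. */
        List<String> visible() {
            List<String> out = new ArrayList<>();
            for (Entry e : log) {
                if (e.txnId == null || txns.get(e.txnId) == TxnState.COMMITTED) {
                    out.add(e.payload);
                }
            }
            return out;
        }
    }

    public static void main(String[] args) {
        ToyTopic topic = new ToyTopic();
        topic.publish("plain");              // belongs to no transaction
        topic.begin("txn-1");
        topic.begin("txn-2");
        topic.begin("txn-3");                // several transactions open concurrently
        topic.publish("a", "txn-1");
        topic.publish("b", "txn-2");
        topic.publish("c", "txn-3");
        topic.commit("txn-1");
        topic.abort("txn-2");
        System.out.println(topic.visible()); // prints [plain, a]
    }
}
```

Only the non-transactional message and the message from the committed transaction are visible; the aborted message and the one in the still-open transaction are not.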

No additional configuration or dependencies are required to use the transaction API on the Pulsar client.

End-to-end Exactly-once stream computing becomes simpler: an example of Pulsar+Flink

Through the Pulsar transaction API, we can already implement Exactly-once semantics in stream computing scenarios.

In stream computing systems, one question comes up again and again: "If some intermediate nodes go down during processing, how can we make sure the final result is still correct?" The key to answering it is how, after the failed node recovers, processing resumes from the state that existed before the failure.

Stream computing on Apache Pulsar is essentially a Read-Process-Write operation over messages on multiple topics. The Source node consumes messages from one or more input topics, the Process node performs a series of computations and state updates on them, and the Sink node sends the results to the Topic that records them. Exactly-once in the stream computing scenario means that the complete Read-Process-Write sequence conforms to Exactly-once semantics: no message on the input Topic is lost, and no message is written twice to the result Topic. This is the Exactly-once behavior that users expect from a stream computing system.

Let's look at an example of Pulsar combined with Flink for stream computing.

Before Pulsar 2.8.0, the Pulsar and Flink integration offered only an Exactly-once Source Connector and an At-least-once Sink Connector. An end-to-end stream computing system built on Pulsar and Flink could therefore achieve At-least-once semantics at best; in other words, messages sent to the result Topic could be duplicated.

With the transaction API introduced in Pulsar 2.8.0, the Pulsar-Flink Sink Connector can support Exactly-once semantics after a simple modification. Flink uses the two-phase commit protocol to guarantee end-to-end Exactly-once semantics, so we can implement TwoPhaseCommitSinkFunction and embed Pulsar's transaction API in it. When the Pulsar-Flink Sink Connector calls beginTransaction, we create a Pulsar transaction and save its transaction ID. All subsequent messages written to the Sink Connector carry this transaction ID. These messages are written to Pulsar when the Connector calls preCommit. When the Connector calls recoverAndCommit or recoverAndAbort, the Pulsar transaction API is called to commit or roll back the Pulsar transaction. The change is small: the Connector only needs to save the mapping between Pulsar transaction IDs and Flink checkpoints, so that the right Pulsar transaction ID can be looked up when Flink commits or rolls back.
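The flow described above can be sketched in plain Java. The classes below are hypothetical stand-ins and use neither Flink nor Pulsar: `ToyTxnStore` plays the role of Pulsar transactions, and `ToySink` borrows the hook names preCommit, commit, and abort from TwoPhaseCommitSinkFunction with simplified signatures. The point is only to show the mapping from checkpoints to transaction IDs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TwoPhaseCommitSinkSketch {
    /** In-memory stand-in for Pulsar transactions: writes become visible on commit. */
    static final class ToyTxnStore {
        final List<String> committed = new ArrayList<>();
        private final Map<Long, List<String>> open = new HashMap<>();
        private long nextTxnId = 0;

        long begin() {
            long id = nextTxnId++;
            open.put(id, new ArrayList<>());
            return id;
        }

        void write(long txnId, String value) { open.get(txnId).add(value); }

        void commit(long txnId) {
            List<String> buffered = open.remove(txnId);
            if (buffered != null) {
                committed.addAll(buffered); // committing the same ID twice is a no-op
            }
        }

        void abort(long txnId) { open.remove(txnId); }
    }

    /** Toy sink that ties each checkpoint to the transaction that collected its writes. */
    static final class ToySink {
        private final ToyTxnStore store;
        private final Map<Long, Long> txnByCheckpoint = new HashMap<>();
        private long currentTxn;

        ToySink(ToyTxnStore store) {
            this.store = store;
            this.currentTxn = store.begin(); // corresponds to beginTransaction
        }

        /** Every record is written under the currently open transaction. */
        void invoke(String value) { store.write(currentTxn, value); }

        /** Phase 1: remember which transaction belongs to this checkpoint, open the next. */
        void preCommit(long checkpointId) {
            txnByCheckpoint.put(checkpointId, currentTxn);
            currentTxn = store.begin();
        }

        /** Phase 2: the checkpoint completed, so make its writes visible. */
        void commit(long checkpointId) {
            Long txn = txnByCheckpoint.remove(checkpointId);
            if (txn != null) {
                store.commit(txn);
            }
        }

        /** The checkpoint failed, so discard its writes. */
        void abort(long checkpointId) {
            Long txn = txnByCheckpoint.remove(checkpointId);
            if (txn != null) {
                store.abort(txn);
            }
        }
    }

    public static void main(String[] args) {
        ToyTxnStore store = new ToyTxnStore();
        ToySink sink = new ToySink(store);
        sink.invoke("a");
        sink.invoke("b");
        sink.preCommit(1);
        sink.invoke("c");   // collected by the transaction for the next checkpoint
        sink.commit(1);
        sink.preCommit(2);
        sink.abort(2);      // checkpoint 2 failed: "c" never becomes visible
        System.out.println(store.committed); // prints [a, b]
    }
}
```

In a real connector the checkpoint-to-transaction mapping would be part of the checkpointed state, so that a recovered job can commit or abort the transactions that were pending when it failed.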

With the idempotent and atomic operations provided by Pulsar transactions and the globally consistent checkpoint mechanism provided by Apache Flink, we can easily build a stream computing system on Pulsar and Flink that satisfies end-to-end Exactly-once semantics.

Follow-up

If you want to know more about how Exactly-once is implemented, we recommend reading the Pulsar improvement proposal PIP-31. For more design details, we also recommend the design document.

This article introduces the new transaction API in Apache Pulsar 2.8.0 mainly from the user's perspective, and shows how to use it to implement Exactly-once semantics. In the next article, we will describe the design and implementation of the transaction API in more detail.

The recent Pulsar Summit North America included a related talk, "Exactly-Once Made Easy: Transactional Messaging in Apache Pulsar"; a video of the talk is available.

Thanks

Over the past year, many Pulsar Committers and Contributors took part in developing this milestone feature. I would like to thank them: Li Penghui, Gao Ran, Cong Bo, Addison Higham, Zhai Jia, Zhang Yong, Ran Xiaolong, Matteo Merli, Guo Sijie.

I would also like to thank the translator, Jialing of China Mobile Cloud Capability Center, for his excellent translation. Take a look at the Chinese version of this blog post.
