Apache Pulsar and Kafka performance comparison: latency (test process)


This series of articles focuses on the latency of Pulsar and Kafka; subsequent articles will discuss throughput.

This article introduces the details of the test (the blue part in the figure below). The next article will introduce the test method in detail (the green part in the figure below), and the test results were detailed in the previous article (the red part in the figure below).

Test details:

Benchmark setup

To set up the benchmark tests, we followed the steps documented on the OpenMessaging site. After applying the Terraform configuration, you get the following set of EC2 instances:

The i3.4xlarge instances used for the Pulsar/BookKeeper and Kafka brokers each contain two NVMe SSDs to improve performance. These powerful virtual machines have 16 vCPUs, 122 GiB of memory, and high-performance disks.

Two SSDs are an ideal setup for Pulsar, because Pulsar writes two data streams, and with two disks those streams can be written in parallel. Kafka can also make use of the two SSDs by allocating partitions across the two drives.

The Ansible playbooks for Pulsar and Kafka use the tuned-adm command (with the latency-performance profile) to tune the systems for low latency.
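To illustrate how partitions can end up spread over two drives, here is a minimal sketch. It assumes the placement rule described for Kafka's log.dirs setting (a new partition goes to the directory that currently holds the fewest partitions). The function name and the mount paths are hypothetical and are not taken from the benchmark's deployment files.

```python
def place_partitions(partitions, log_dirs):
    """Assign each partition to the directory that currently holds the fewest."""
    placement = {log_dir: [] for log_dir in log_dirs}
    for partition in partitions:
        # min() returns the first directory on a tie, so placement alternates.
        target = min(log_dirs, key=lambda d: len(placement[d]))
        placement[target].append(partition)
    return placement


# Hypothetical mount points for the two NVMe SSDs.
layout = place_partitions(range(6), ["/mnt/data-1", "/mnt/data-2"])
for log_dir, assigned in layout.items():
    print(log_dir, assigned)
```

With 6 partitions and two drives, each drive ends up holding three partitions, so writes are split evenly between the SSDs.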

For details of the tuned-adm command, please refer to https://linux.die.net/man/1/tuned-adm

Workload

Although the benchmark ships with some workloads that can be run immediately, we made some changes in order to get closer to the Kafka test results published on the LinkedIn Engineering blog. Defining a new workload is not difficult: just create a YAML file with the updated test parameters.
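As a rough illustration of what such a workload file might contain, the sketch below builds one in Python and prints it as YAML. The field names follow the style of the OpenMessaging Benchmark workload files, but treat them as assumptions and check them against the version you run. The message size and partition count come from this article; the workload name, producer and consumer counts, rate, and duration are hypothetical placeholders.

```python
# Hypothetical workload definition; field names are assumed, not verified
# against a specific release of the benchmark.
workload = {
    "name": "1-topic-6-partitions-100b",  # illustrative name
    "topics": 1,
    "partitionsPerTopic": 6,
    "messageSize": 100,             # bytes, matching the LinkedIn test
    "subscriptionsPerTopic": 1,     # one consumer group/subscription
    "consumerPerSubscription": 1,   # assumed value
    "producersPerTopic": 1,         # assumed value
    "producerRate": 50000,          # assumed value, messages per second
    "testDurationMinutes": 15,      # assumed value
}


def to_yaml(mapping):
    """Render a flat mapping of scalars as simple `key: value` YAML lines."""
    return "\n".join(f"{key}: {value}" for key, value in mapping.items()) + "\n"


workload_yaml = to_yaml(workload)
print(workload_yaml)
```

The helper only handles flat scalar values, which is enough for this sketch; a real workload file with nested settings would be better written by hand or with a YAML library.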

Reading the LinkedIn blog, you will find that they ran their tests with 100-byte messages. Generally speaking, if messages are too small (much less than 100 bytes), the comparison results are not clear-cut, and no message queue is good at handling "large messages" (much greater than 100 bytes). A compromise size of 100 bytes is therefore used, and it is the single-message size used in all of the messaging system tests.

This size is better suited to testing the performance of the messaging system itself. Whatever the size of each message, the total volume of messages used in the test is fixed, so the more efficiently the messaging system processes messages, the better it performs. At the same time, network or disk throughput limits are less likely to affect the test results. How messaging systems perform with "large messages" is also a topic worth discussing, but for now we only test "small messages".

In addition, we added a benchmark test with 6 partitions, because the LinkedIn test made heavy use of 6-partition topics.

The LinkedIn blog contains producer-only and consumer-only workloads, whereas the workloads we used in our tests include both producers and consumers. There are two reasons.

First, the benchmark currently does not support producer-only or consumer-only workloads. Second, in real deployments a messaging system serves producers and consumers at the same time. We therefore decided to test with a realistic scenario that both produces and consumes messages.
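The point about network and disk limits can be checked with simple arithmetic: payload bandwidth is the message rate multiplied by the message size. The sketch below uses a hypothetical rate of 100,000 messages per second, which is not a figure from these tests, and ignores protocol and replication overhead.

```python
def bandwidth_mb_per_s(messages_per_second, message_size_bytes):
    """Payload bandwidth in MB/s (1 MB = 1,000,000 bytes), without overhead."""
    return messages_per_second * message_size_bytes / 1_000_000


rate = 100_000  # hypothetical message rate, not a measured value

small = bandwidth_mb_per_s(rate, 100)      # 100-byte messages
large = bandwidth_mb_per_s(rate, 10_000)   # 10 KB messages, for contrast

print(small)  # 10.0
print(large)  # 1000.0
```

At the same message rate, 100-byte messages need only a small fraction of the bandwidth that 10 KB messages do, so the per-message processing cost of the messaging system is more likely to be the limiting factor than the network or the disks.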

In summary, the set of workloads we used for testing is as follows:

A Kafka consumer group and a Pulsar subscription are very similar. Both allow one or more consumers to receive all messages on a topic. When a topic is associated with multiple consumer groups/subscriptions, the messaging system delivers a copy of each message to every one of them, that is, it "fans out" the messages.

Every message published on the topic is sent to all consumer groups/subscriptions. If all messages are sent to the same topic, and there is only one consumer group/subscription on this topic, the producer rate is equal to the consumer rate.

If there are two consumer groups/subscriptions on a single topic, the consumer rate is twice the producer rate. To keep the test as simple as possible, we adopted the former setup: a single consumer group/subscription on the topic, whose consumers receive all the messages on that topic.

The previous article detailed the test results of Pulsar and Kafka. Whether fsync is enabled is one variable in the test. We also varied the number of partitions to better compare the latency of Pulsar and Kafka.
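The relationship between producer rate and consumer rate under fan-out can be written as a one-line calculation. The function name and the example rate below are illustrative only.

```python
def total_consumer_rate(producer_rate, subscriptions):
    """Aggregate consume rate when every subscription receives every message."""
    return producer_rate * subscriptions


producer_rate = 50_000  # hypothetical messages per second

print(total_consumer_rate(producer_rate, 1))  # 50000: equals the producer rate
print(total_consumer_rate(producer_rate, 2))  # 100000: twice the producer rate
```

With a single consumer group/subscription, as in these tests, the system consumes exactly as many messages per second as it produces.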
