Editor's Note:

Cloud native was born to solve the problems that traditional applications face in architecture, fault handling, system iteration, and more, while open source provides the backbone on which enterprises build cloud-native architectures. Through daily work on open source and cloud native, the author of this article has formed his own thinking and practice around the open source industry and cloud-native streaming system solutions.

The following article comes from CSDN and originally appeared in "New Programmer: Cloud Native and Comprehensive Digital Practice". The author, Li Penghui, is a member of the Apache Pulsar PMC and chief architect at StreamNative. The CSDN editor in charge is Tang Xiaoyin.

With changes in business and environment, the cloud-native trend is becoming ever clearer: enterprises are now moving from cloud computing to cloud native. After several years of practice, the cloud-native concept has been widely accepted, and cloud application management has become a must for enterprise digital transformation. It is fair to say that today's developers either use products and tools derived from cloud-native technology architecture, or are the people building those products and tools.

Cloud native as a foundational strategy

So, what is cloud native? Everyone interprets it differently. In my view, cloud native is, first of all, applications developed to run on the cloud: a solution that lets enterprises continuously deliver business quickly, reliably, and at scale. The usual cloud-native keywords, such as containerization, continuous delivery, DevOps, and microservices, all describe its features and capabilities as a solution, and Kubernetes, with its pioneering declarative API and controller (reconciler) pattern, laid the foundation for cloud native.

Second, cloud native is a strategy. Cloud native was born to solve the problems traditional applications have with architecture, fault handling, and system iteration. Moving from traditional applications to the cloud is not so much a technology upgrade as a strategic transformation: when migrating to the cloud, enterprises face the comprehensive integration of application development, system architecture, organizational structure, and even commercial products. Whether to join the cloud-native wave is a strategic decision that will affect every aspect of an enterprise's long-term development.

Cloud native with an open source foundation

Most of the architecture-related open source projects born in recent years adopt cloud-native designs, and open source has provided the backbone for enterprises building cloud-native architectures.

Open source technology and its ecosystem are trustworthy, and the cloud gives users good scalability while reducing wasted resources. The relationship between cloud native and open source can also be seen in how open source foundations such as the CNCF continuously push cloud-native development forward. Many open source projects are born with cloud-native architecture, which has become a property of basic software that users prioritize when moving to the cloud.

Take the Apache Software Foundation, a neutral platform for incubating and governing open source software, as an example. Over its long history of open source governance, the community-summarized Apache Way has become the standard, and among its tenets, "community over code" is the most widely quoted: a project without a community cannot last. An open source project with a highly active community and codebase, polished by developers around the world across all kinds of scenarios, can be continuously improved, upgraded, and iterated quickly, and a rich ecosystem can grow around it to meet different user needs. The combination of the cloud-native wave and today's open source environment lets excellent technologies that keep pace with the evolving technical landscape emerge and stand out. As I said above, cloud native is a strategic decision, and an enterprise's strategic decisions will naturally favor the most advanced and reliable technology.

A messaging and streaming data system for the cloud

The preceding sections described the importance of open source in a cloud-native environment. How should a cloud-native open source project be designed, planned, and evolved? How should an enterprise choose a messaging and streaming system for digital transformation in the cloud-native era? Below, I dissect the design and planning of Apache Pulsar, the open source cloud-native messaging and streaming data system I work on full time. I hope it offers you reference points and inspires your own search for messaging and streaming data solutions.

A Look Back at History: The Dual-Track System of Messages and Streams

Message queues are typically used to build core business services, while streams are typically used to build real-time data services such as data pipelines. Message queues have a longer history than streams: they are the message middleware familiar to developers, focused on communication between applications, with RabbitMQ and ActiveMQ as common examples. Streaming systems are a relatively new concept, mostly used where large volumes of data must be moved and processed; operational data such as logs and click events flow through them as streams. Common streaming systems include Apache Kafka and AWS Kinesis.

For historical technical reasons, people have treated messages and streams as two separate models, and enterprises have had to build different systems to support the two kinds of business scenario (see Figure 1). The result is a widespread "dual-track" phenomenon in infrastructure: data is isolated into silos and cannot flow smoothly, governance becomes much harder, and both architectural complexity and operations cost rise.

Figure 1 The "dual-track system" that results from building separate systems for different business scenarios

Because of this, we urgently need a unified real-time data infrastructure that combines message queue and stream semantics, and Apache Pulsar was born for it. A message is stored only once on an Apache Pulsar topic, but it can be consumed in different ways through different subscription modes (see Figure 2), which resolves many of the problems caused by the traditional "dual-track system" of messages and streams.

Figure 2 Apache Pulsar integrates message queue and stream semantics
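
As a minimal sketch of this idea with the Pulsar Java client (the service URL and topic name are placeholders), the same stored topic can be consumed with queue semantics through a Shared subscription and with stream semantics through an Exclusive subscription:

```java
import org.apache.pulsar.client.api.*;

public class SubscriptionModes {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // placeholder broker URL
                .build();

        // Queue semantics: a Shared subscription spreads messages across
        // many consumers; each message is delivered to exactly one of them.
        Consumer<byte[]> queueConsumer = client.newConsumer()
                .topic("persistent://public/default/orders")
                .subscriptionName("work-queue")
                .subscriptionType(SubscriptionType.Shared)
                .subscribe();

        // Stream semantics: an Exclusive (or Failover) subscription reads
        // the same stored messages in order, like a stream consumer.
        Consumer<byte[]> streamConsumer = client.newConsumer()
                .topic("persistent://public/default/orders")
                .subscriptionName("stream-reader")
                .subscriptionType(SubscriptionType.Exclusive)
                .subscribe();

        // ... consume from both, then clean up.
        queueConsumer.close();
        streamConsumer.close();
        client.close();
    }
}
```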

Key Elements of Being Natively Cloud Native

As mentioned above, what the cloud-native era gives developers is the ability to scale out and in rapidly, waste fewer resources, and speed up business development. With a natively cloud-native messaging and streaming data infrastructure like Apache Pulsar, developers can focus on applications and microservices instead of spending time maintaining complex underlying systems.

Why do we call Apache Pulsar "natively cloud native"? It comes down to the architecture it was designed with from the start. Its cloud-native architecture of storage-compute separation plus layering and sharding greatly reduces the scaling and operations difficulties users encounter in messaging systems, lets cloud platforms deliver high-quality service at lower cost, and fits well with what users require of messaging and streaming data systems in the cloud-native era.

Biology has a principle that "structure fits function": from single-celled protists to mammals, life's structures grow more complex and its functions more advanced. The same holds for basic software, and on Apache Pulsar this fit between architecture and function shows up in the following ways:

  • The storage-compute separation architecture guarantees high scalability and takes full advantage of the cloud's elasticity.
  • Cross-region replication meets the need for multiple copies of data across clouds.
  • Tiered storage can make full use of cloud-native storage such as AWS S3 to cut data storage costs (see the sketch after this list).
  • Pulsar Functions, a lightweight function compute framework similar to the AWS Lambda platform, brings FaaS to Pulsar; Function Mesh is a Kubernetes Operator that lets users run Pulsar Functions and connectors natively on Kubernetes, taking full advantage of Kubernetes resource allocation, elastic scaling, and flexible scheduling.
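
For example, once an offload driver (such as S3) has been configured on the broker or namespace, older data can be pushed to tiered storage through the Java admin client. A hedged sketch, with placeholder URL and topic name:

```java
import org.apache.pulsar.client.admin.PulsarAdmin;
import org.apache.pulsar.client.api.MessageId;

public class OffloadToTieredStorage {
    public static void main(String[] args) throws Exception {
        PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://localhost:8080") // placeholder admin URL
                .build();
        String topic = "persistent://public/default/clickstream";

        // Ask the broker to move everything up to the current latest
        // message onto the configured tiered storage (e.g. AWS S3).
        admin.topics().triggerOffload(topic, MessageId.latest);

        // Poll the offload progress.
        System.out.println(admin.topics().offloadStatus(topic));
        admin.close();
    }
}
```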

Infrastructure: storage-compute separation, layering and sharding

As mentioned above, Pulsar adopted a cloud-native design from the day it was born, namely an architecture that separates storage from computing. The storage layer is based on BookKeeper, an Apache Software Foundation open source project. BookKeeper is a strongly consistent, distributed append-only log abstraction; messaging and streaming workloads, where new messages are continuously appended, match that abstraction naturally, so it applies directly to both domains.

In the Pulsar architecture, data serving and data storage are two separate layers (see Figure 3): the data serving layer consists of stateless Broker nodes, the data storage layer consists of Bookie nodes, and the nodes within each layer are peers. Brokers only serve messages and store no data, which gives the serving layer and the storage layer independent scalability and high availability and greatly shortens any window of unavailability. BookKeeper's peer storage nodes allow multiple replicas to be read concurrently, and the system can keep serving even if only a single copy of the data remains available.

Figure 3 Pulsar architecture

In this layered architecture, the serving layer and the storage layer each scale independently, providing flexible, elastic capacity expansion. Especially in elastic environments such as clouds and containers, the system can scale automatically and adapt dynamically to traffic spikes. At the same time, the complexity of cluster expansion and upgrades drops significantly, and availability and manageability improve. The design is also very container friendly.

Pulsar stores topic partitions at a finer shard (segment) granularity (see Figure 4). These shards are spread evenly across the Bookie nodes of the storage layer. This shard-centric storage model treats the topic partition as a logical concept, splitting it into smaller shards that are distributed and stored evenly in the storage layer. The design yields better performance, more flexible scalability, and higher availability.

Figure 4 Sharded storage model

As Figure 5 shows, most message queuing and streaming systems (including Apache Kafka) use a monolithic architecture, in which message processing and message persistence (if provided) live on the same node in the cluster. Such designs suit small deployments, but at scale they run into performance, scalability, and flexibility problems. As network bandwidth grows and storage latency drops sharply, the architectural advantages of storage-compute separation become ever more apparent.

Figure 5 Traditional monolithic architecture vs storage computing layered architecture

Differences between writing and reading

Building on the above, let's look at how message writes and reads differ between the two architectures.

Look at writing first. On the left of Figure 6 is an application with a monolithic architecture: data is written to the leader, and the leader replicates it to the followers. This is the typical design when storage and computing are not separated; the node playing the leader role is limited by the resources of the machine it runs on, and a message can only be confirmed as written once the leader returns. On the right of Figure 6 is the storage-compute-separated design: data is written to a Broker, and the Broker writes to multiple storage nodes in parallel. If 3 replicas are required and strong consistency with low latency is selected, the write succeeds as soon as 2 replicas respond.

Figure 6 Comparison of monolithic architecture and layered architecture writing

In the peer-to-peer layered architecture on the right, a write succeeds once any two of the three nodes have acknowledged it. In performance tests we ran on AWS, the two architectures also differed by several milliseconds in flush latency: in a monolithic system, topics that land on a busy leader are delayed along with it, whereas the layered architecture is much less affected by any single slow node.
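
This "write 3, acknowledge on 2" behavior corresponds to BookKeeper's ensemble, write-quorum, and ack-quorum settings. A sketch against the raw BookKeeper client (the ZooKeeper address and password are placeholders; inside Pulsar these values come from broker settings such as managedLedgerDefaultEnsembleSize, managedLedgerDefaultWriteQuorum, and managedLedgerDefaultAckQuorum):

```java
import org.apache.bookkeeper.client.BookKeeper;
import org.apache.bookkeeper.client.LedgerHandle;

public class QuorumWrite {
    public static void main(String[] args) throws Exception {
        BookKeeper bk = new BookKeeper("localhost:2181"); // placeholder ZooKeeper address

        LedgerHandle ledger = bk.createLedger(
                3, // ensemble size: spread each entry across 3 bookies
                3, // write quorum: write 3 copies of every entry
                2, // ack quorum: confirm once any 2 bookies respond
                BookKeeper.DigestType.CRC32,
                "secret".getBytes());

        // Returns as soon as 2 of the 3 bookies have acknowledged the entry.
        ledger.addEntry("hello pulsar".getBytes());

        ledger.close();
        bk.close();
    }
}
```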

In real-time data processing, tail (real-time) reads account for 90% of scenarios (see Figure 7). In the layered architecture, real-time reads are served straight from the Broker's topic tail cache without touching the storage nodes, which greatly improves the efficiency and timeliness of reads.

Figure 7 Comparison of real-time data read between monolithic architecture and layered architecture

The architecture also makes a difference when reading historical data. As Figure 8 shows, in the monolithic architecture a replay goes straight to the leader, which reads the messages from disk. In the storage-compute-separated architecture, data is loaded into the Broker and then returned to the client, which preserves read ordering. When strict ordering is not required, Apache Pulsar can read segments from multiple storage nodes in parallel: even a single topic's data can be read using the resources of several storage nodes to raise read throughput. Pulsar SQL also reads data this way.

Figure 8 Comparison of reading historical data between monolithic architecture and layered architecture
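
A small sketch of the two read paths through the Pulsar Java Reader API (placeholder URL and topic): starting at MessageId.latest tails the broker cache, while starting at MessageId.earliest replays history from the storage layer:

```java
import org.apache.pulsar.client.api.*;

public class TailAndReplay {
    public static void main(String[] args) throws Exception {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650") // placeholder broker URL
                .build();
        String topic = "persistent://public/default/events";

        // Tail read: start at the latest message; served from the broker's
        // in-memory tail cache without touching the storage layer.
        Reader<byte[]> tail = client.newReader()
                .topic(topic)
                .startMessageId(MessageId.latest)
                .create();

        // Catch-up read: start from the earliest retained message; the
        // broker streams segments back from the BookKeeper storage layer.
        Reader<byte[]> replay = client.newReader()
                .topic(topic)
                .startMessageId(MessageId.earliest)
                .create();

        while (replay.hasMessageAvailable()) {
            Message<byte[]> msg = replay.readNext();
            // ... process historical data ...
        }
        tail.close();
        replay.close();
        client.close();
    }
}
```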

IO isolation

BookKeeper isolates the IO of writes and reads well. BookKeeper can be given two types of storage device: on the left of Figure 9 is the journal disk, which holds the write-ahead log, and on the right are the disks where data is actually stored. Even while historical data is being read, write latency is barely affected.

Figure 9 IO isolation of BookKeeper

On a cloud platform, Pulsar's IO isolation lets users pick different resource types for the two roles. Because the journal disk does not need to hold much data, many cloud users provision it to their own needs to get low cost and high service quality: for example, a small, low-latency, high-throughput volume for the journal disk, and throughput-optimized devices that can hold large amounts of data for the data disks.
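
As an illustration (the directory paths are placeholders), the two device types map onto BookKeeper's journal and ledger directory settings, shown here through the server configuration API; the same keys normally live in bookkeeper.conf as journalDirectory and ledgerDirectories:

```java
import org.apache.bookkeeper.conf.ServerConfiguration;

public class BookieIoIsolation {
    public static void main(String[] args) {
        ServerConfiguration conf = new ServerConfiguration();

        // Journal (write-ahead log): a small, low-latency volume.
        conf.setJournalDirName("/mnt/journal");

        // Ledger (data) directories: large, throughput-optimized volumes.
        conf.setLedgerDirNames(new String[] {"/mnt/data1", "/mnt/data2"});

        // A bookie started with this configuration writes its WAL and its
        // data to different devices, isolating write IO from read IO.
        System.out.println(conf.getJournalDirName());
    }
}
```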

Scaling

Storage-compute separation lets Brokers and BookKeeper scale out and in independently. Take topics as an example: suppose n topics are distributed across the brokers. When a new broker is added, topic ownership can be transferred within about a second; since brokers are stateless, this is effectively handing over a group of stateless topics, so some topics can quickly move to the new Broker.

On the storage side, data shards are scattered across the BookKeeper nodes, and expanding capacity just means adding a new BookKeeper node; doing so never triggers re-replication of historical data. After a period of writing, each topic switches shards, that is, rolls over to its next data shard; at the switch, a fresh set of Bookies is chosen to hold the data, so the cluster rebalances gradually. If a BookKeeper node dies, BookKeeper automatically restores the replica count, and topics are unaffected throughout.
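
As a hedged illustration with the Java admin client (placeholder URL and topic), handing ownership toward a newly added broker is just an unload: the load manager reassigns the topic, and no data moves because brokers hold none:

```java
import org.apache.pulsar.client.admin.PulsarAdmin;

public class HandOffTopic {
    public static void main(String[] args) throws Exception {
        PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://localhost:8080") // placeholder admin URL
                .build();

        // Release ownership; the load manager reassigns the topic to a
        // broker (possibly the newly added one) in well under a second.
        admin.topics().unload("persistent://public/default/orders");
        admin.close();
    }
}
```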

Cross-cloud data backup

Pulsar supports keeping multiple copies of data across clouds (see Figure 10), forming clusters that span data centers with bidirectional data synchronization. Many users outside China deploy cross-cloud clusters spanning different cloud vendors; when one cluster has a problem, they can switch to another quickly. Asynchronous replication leaves only a small synchronization gap while delivering a higher quality of service, and subscription state can be synchronized across clusters as well.

Figure 10 Cross-cloud data backup
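
Geo-replication is enabled per namespace once the clusters are connected at the cluster level. A sketch with the Java admin client, where the cluster names "aws-east" and "gcp-west" and the URL are placeholders:

```java
import java.util.HashSet;
import java.util.Set;
import org.apache.pulsar.client.admin.PulsarAdmin;

public class GeoReplication {
    public static void main(String[] args) throws Exception {
        PulsarAdmin admin = PulsarAdmin.builder()
                .serviceHttpUrl("http://localhost:8080") // placeholder admin URL
                .build();

        // Replicate every topic in the namespace bidirectionally
        // between the two (already registered) clusters.
        Set<String> clusters = new HashSet<>();
        clusters.add("aws-east");
        clusters.add("gcp-west");
        admin.namespaces().setNamespaceReplicationClusters("public/default", clusters);
        admin.close();
    }
}
```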

Entering the era of serverless architecture

Pulsar Functions and Function Mesh bring Pulsar into the era of serverless architecture. Pulsar Functions is a lightweight compute framework whose main goal is to offer a platform that is extremely simple to deploy and operate. Its emphasis is on being lightweight and simple: it handles straightforward ETL jobs (extract, transform, load), real-time aggregation, event routing, and the like, covering more than 90% of stream processing scenarios. Pulsar Functions draws on the ideas of serverless architecture and function as a service (FaaS), letting data be processed "close to where it lives" so that its value can be mined immediately (see Figure 11).

Figure 11 Message flow through a single Pulsar Function
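
A minimal Pulsar Function in Java: this toy transform simply uppercases each message; the input and output topics are bound at deployment time, not in the code:

```java
import org.apache.pulsar.functions.api.Context;
import org.apache.pulsar.functions.api.Function;

// Receives each message from the input topic(s); the returned value is
// published to the configured output topic.
public class UppercaseFunction implements Function<String, String> {
    @Override
    public String process(String input, Context context) {
        context.getLogger().info("processing {}", input);
        return input.toUpperCase();
    }
}
```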

A Pulsar Function is a single application function. To associate multiple functions and combine them toward a data-processing goal, Function Mesh (open source) was born. Function Mesh also follows a serverless architecture, and it is itself a Kubernetes Operator: with it, developers can use Pulsar Functions and the various Pulsar connectors natively on Kubernetes, taking full advantage of Kubernetes resource allocation, elastic scaling, and flexible scheduling. For example, Function Mesh relies on Kubernetes scheduling to make Functions resilient to failure, so functions can be correctly rescheduled at any time.

Function Mesh consists mainly of two components: a Kubernetes Operator and a Function Runner. The Kubernetes Operator watches Function Mesh CRDs, creates Kubernetes resources (i.e., StatefulSets), and runs Functions, Connectors, and Meshes in Kubernetes. The Function Runner invokes the Function and connector logic, processes events arriving on the input stream, and sends the results to the output stream. Currently, the Function Runner is implemented on top of the Pulsar Functions Runner.

When a user creates a Function Mesh CRD (see Figure 12), the Function Mesh controller receives the submitted CRD from the Kubernetes API server, processes it, and generates the corresponding Kubernetes resources. For example, when handling a Function CRD, the controller creates a StatefulSet, each of whose Pods starts a Runner that invokes the corresponding Function.

Figure 12 How Function Mesh processes a CRD
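
For illustration, here is a hedged sketch of a Function CRD manifest (the field names follow the open source Function Mesh docs, but treat all names, paths, and values as placeholders to be checked against your installed version); it deploys the hypothetical UppercaseFunction from the earlier sketch:

```yaml
apiVersion: compute.functionmesh.io/v1alpha1
kind: Function
metadata:
  name: uppercase-function
spec:
  className: example.UppercaseFunction   # hypothetical class from the sketch above
  replicas: 1
  input:
    topics:
      - persistent://public/default/in-topic
  output:
    topic: persistent://public/default/out-topic
  pulsar:
    pulsarConfig: pulsar-cluster-config  # ConfigMap holding broker/admin URLs
  java:
    jar: /pulsar/uppercase-function.jar  # placeholder path inside the image
```

Submitting it is a plain `kubectl apply -f uppercase-function.yaml`; from there the controller takes over as described below.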

The Function Mesh API is built on the existing Kubernetes API, so Function Mesh resources are compatible with other Kubernetes native resources, and cluster administrators can manage them with the Kubernetes tools they already have. Function Mesh uses Kubernetes Custom Resource Definitions (CRDs), through which administrators define the custom resources used to build event streaming applications.

Rather than using the pulsar-admin CLI to send Function requests to a Pulsar cluster, users submit CRDs directly to the Kubernetes cluster with the kubectl CLI. The Function Mesh controller watches the CRDs and creates the Kubernetes resources that run the custom Functions, Sources, Sinks, or Meshes. The advantage of this approach is that Kubernetes itself stores and manages the Functions' metadata and running state, avoiding the inconsistency between metadata and running state that can occur in Pulsar's existing scheme.

Epilogue

In this article I have shared my thoughts on the open source industry and the technical practice behind a cloud-native streaming platform solution. As a committed open source person, I am glad to see more and more people embrace open source ideas and become open source developers and contributors in recent years, with the industry booming as a result. Like countless other developers, I hope to keep walking the open source road and help more enterprises accelerate their cloud-native and digital journeys.

About the Author

Li Penghui: PMC member and Committer of Apache Pulsar, a top-level project of the Apache Software Foundation, currently chief architect at StreamNative. His work has long revolved around messaging systems, microservices, and Apache Pulsar. In 2019 he drove the adoption of Pulsar at Zhaopin, building the company's unified internal messaging service, and then joined StreamNative, the company commercializing Apache Pulsar, completing his own transition from open source project user to open source project developer. He and his team at StreamNative support users running Apache Pulsar in massive-scale messaging scenarios.

