The authors of this article are StreamNative engineers Li Penghui and Liu Yu.
About Apache Pulsar
Apache Pulsar is a top-level project of the Apache Software Foundation and a next-generation, cloud-native distributed messaging and streaming platform that integrates messaging, storage, and lightweight function-based computing. It separates compute from storage and supports multi-tenancy, persistent storage, and geo-replication across data centers, with streaming data storage characteristics such as strong consistency, high throughput, low latency, and high scalability.
GitHub address: http://github.com/apache/pulsar/
One of Apache Pulsar's major strengths is that its layered, segment-based architecture and hierarchical resource management provide a solid foundation for isolation. Users can isolate resources as required, avoid resource contention, and keep the system stable.
This series of blog posts discusses isolation in Pulsar. This first article introduces three ways to achieve isolation in Pulsar:
- Separate Pulsar cluster
- Shared BookKeeper cluster
- Single Pulsar cluster
Separate Pulsar cluster
In this approach, you create a separate Pulsar cluster for each isolation unit.
How it works
Figure 1 shows how isolation can be achieved by deploying a separate Pulsar cluster.
Key points:
- Each Pulsar cluster exposes its service through a DNS entry point, and clients must be able to reach the cluster through that entry point. A client can use one or more Pulsar URLs, which are the connection addresses of the Pulsar clusters.
- Each Pulsar cluster has one or more brokers and bookies.
- Each Pulsar cluster has a metadata store.
- The metadata store can be divided into a Pulsar metadata store and a BookKeeper metadata store. This article does not distinguish between the two; "metadata store" refers to both.
- Multiple Pulsar clusters share a configuration store.
- Pulsar's hierarchical resource management provides a solid foundation for isolation. To isolate a namespace, specify a cluster for the namespace. The cluster must be in the tenant's allowed-cluster list. Topics that belong to the namespace are then assigned to that cluster. For how to set up a cluster for a namespace, see "Set up a cluster for a namespace". For how to manage Pulsar clusters, see "Manage Pulsar Cluster".
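The allowed-cluster rule in the last point can be sketched in a few lines of Python. This is a minimal illustration, not Pulsar's implementation: the `Tenant` class, the `set_clusters_command` helper, and the tenant, namespace, and cluster names are all hypothetical. The only real interface it refers to is the `pulsar-admin namespaces set-clusters` command with its `--clusters` option, and the sketch only builds the argument list rather than running it.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Tenant:
    """Hypothetical stand-in for a Pulsar tenant and its allowed-cluster list."""
    name: str
    allowed_clusters: frozenset[str]


def set_clusters_command(tenant: Tenant, namespace: str, clusters: list[str]) -> list[str]:
    """Build the pulsar-admin argument list that assigns clusters to a namespace.

    Raises ValueError if any cluster is missing from the tenant's allowed list.
    """
    rejected = [c for c in clusters if c not in tenant.allowed_clusters]
    if rejected:
        raise ValueError(f"clusters not allowed for tenant {tenant.name}: {rejected}")
    return [
        "pulsar-admin", "namespaces", "set-clusters",
        "--clusters", ",".join(clusters),
        f"{tenant.name}/{namespace}",
    ]


# Example with made-up names: the tenant "billing" may only use "cluster-a".
billing = Tenant("billing", frozenset({"cluster-a"}))
print(" ".join(set_clusters_command(billing, "invoices", ["cluster-a"])))
```

Checking the list before building the command keeps a misconfigured namespace from being pointed at a cluster the tenant is not allowed to use.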
Migrate namespace
To migrate a namespace between clusters, enable geo-replication (cross-regional replication) for the namespace, and disable it again after all data has been replicated to the target cluster. For how to set up geo-replication, see "Set up cross-regional replication for a namespace".
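As a rough sketch of that sequence: at the namespace level, replication is driven by the namespace's cluster list, so one way to express the migration is as two `pulsar-admin namespaces set-clusters` calls with a manual check in between. The helper name and the namespace and cluster names below are hypothetical, and the code only builds the commands. Confirming that the target cluster has caught up before running the second step is left to the operator.

```python
from __future__ import annotations


def migration_commands(namespace: str, source: str, target: str) -> list[list[str]]:
    """Return the two set-clusters invocations for a replicate-then-cut-over migration."""
    base = ["pulsar-admin", "namespaces", "set-clusters", "--clusters"]
    return [
        # Step 1: list both clusters so data is replicated from source to target.
        base + [f"{source},{target}", namespace],
        # Step 2: once replication has caught up, keep only the target cluster.
        base + [target, namespace],
    ]


# Example with made-up names.
for step in migration_commands("billing/invoices", "cluster-a", "cluster-b"):
    print(" ".join(step))
```

Both clusters must be in the tenant's allowed-cluster list for the first step to be accepted.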
Scaling nodes
Brokers and bookies are scaled out or in within the corresponding cluster.
Shared BookKeeper cluster
In this approach, you deploy one BookKeeper cluster that is shared by multiple broker clusters.
How it works
Figure 2 shows how to achieve isolation by deploying a shared BookKeeper cluster.
Key points:
- Each Pulsar cluster exposes its service through a DNS entry point, and clients must be able to reach the cluster through that entry point. A client can use one or more Pulsar cluster connection addresses.
- Each Pulsar cluster has one or more brokers.
- Each Pulsar cluster has a metadata store.
- Multiple Pulsar clusters share a BookKeeper cluster.
- Pulsar's hierarchical resource management provides a solid foundation for isolation. To isolate a namespace, specify a cluster for the namespace. The cluster must be in the tenant's allowed-cluster list. Topics that belong to the namespace are then assigned to that cluster. For how to set up a cluster for a namespace, see "Set up a cluster for a namespace". For how to manage Pulsar clusters, see "Manage Pulsar Cluster".
Storage isolation is achieved through different bookie affinity groups, as shown in Figure 3.
- All bookie affinity groups share the BookKeeper cluster and the metadata store.
- Each bookie affinity group has one or more bookies.
- You can specify one or more primary groups and secondary groups for a namespace. Topics in the namespace are created on bookies of the primary group first, and then on bookies of the secondary group. For how to set up a bookie affinity group, see "Set bookie affinity group".
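The primary-then-secondary ordering can be illustrated with a toy model. The sketch below assumes that secondary-group bookies are used only when the primary groups cannot supply enough bookies for an ensemble; that reading, the `pick_ensemble` function, and the group and bookie names are assumptions made for illustration. Pulsar's actual placement policy also involves rack awareness and random selection, which this model leaves out.

```python
from __future__ import annotations


def pick_ensemble(
    bookies_by_group: dict[str, list[str]],
    primary_groups: list[str],
    secondary_groups: list[str],
    ensemble_size: int,
) -> list[str]:
    """Pick bookies for one ensemble, preferring primary-group bookies."""
    candidates: list[str] = []
    for group in primary_groups:
        candidates.extend(bookies_by_group.get(group, []))
    if len(candidates) < ensemble_size:
        # Not enough primary bookies: fall back to the secondary groups.
        for group in secondary_groups:
            candidates.extend(
                b for b in bookies_by_group.get(group, []) if b not in candidates
            )
    if len(candidates) < ensemble_size:
        raise RuntimeError("not enough bookies in the primary and secondary groups")
    return candidates[:ensemble_size]


# Example with made-up groups and bookie addresses.
groups = {"group-a": ["bk1:3181", "bk2:3181"], "group-b": ["bk3:3181"]}
print(pick_ensemble(groups, ["group-a"], ["group-b"], ensemble_size=2))
print(pick_ensemble(groups, ["group-a"], ["group-b"], ensemble_size=3))
```

With an ensemble size of 2 the primary group is sufficient; with 3 the model has to borrow a bookie from the secondary group.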
Migrate namespace
To move a namespace's messaging service to another broker cluster, change the cluster assigned to the namespace. To move the namespace to another bookie affinity group, change its bookie affinity group. For how to set up a bookie affinity group, see "Set bookie affinity group". Because all broker clusters share the same BookKeeper cluster, no data needs to be copied to a new BookKeeper cluster.
Scaling nodes
Broker
When scaling brokers out or in, note the following:
- When adding a broker, assign it to a broker isolation group as part of either the primary group or the secondary group.
- When removing brokers, make sure that the broker isolation group still has enough brokers.
Bookie
When scaling bookies out or in, note the following:
- When adding a bookie, specify the bookie affinity group for the new bookie.
- When removing bookies, make sure that the bookie affinity group still has enough bookies. For how to set up a bookie affinity group, see "Set bookie affinity group".
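What counts as "enough" depends on the deployment. As one possible pre-check, the sketch below assumes that a bookie affinity group needs at least as many bookies as the ensemble size used by the namespaces placed on it. That threshold, the `can_remove_bookie` function, and the bookie names are assumptions made for illustration, not a rule taken from Pulsar.

```python
from __future__ import annotations


def can_remove_bookie(group_bookies: set[str], bookie: str, ensemble_size: int) -> bool:
    """Return True if the group still has ensemble_size bookies after removal."""
    if bookie not in group_bookies:
        raise KeyError(f"{bookie} is not in this affinity group")
    return len(group_bookies) - 1 >= ensemble_size


# Example with made-up bookie addresses and an ensemble size of 2.
group = {"bk1:3181", "bk2:3181", "bk3:3181"}
print(can_remove_bookie(group, "bk3:3181", ensemble_size=2))
```

The same kind of check applies to brokers, with the minimum number of brokers required by the isolation group in place of the ensemble size.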
Single Pulsar cluster
In this approach, you manage only a single Pulsar cluster instead of deploying multiple broker clusters and multiple bookie clusters.
How it works
Figure 4 shows how to achieve isolation by deploying a single Pulsar cluster.
Key points:
- The Pulsar cluster exposes its service through a DNS entry point, and clients must be able to reach the cluster through that entry point. A client uses the connection address of the Pulsar cluster.
- Broker isolation is achieved through different broker isolation groups (Pulsar assigns topics to brokers in a specific broker isolation group). For how to set up a broker isolation group, see "Setting up a broker isolation group".
- Storage isolation is achieved through different bookie affinity groups. For how to set up a bookie affinity group, see "Set bookie affinity group".
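Broker isolation groups can be pictured as name patterns over the brokers of the cluster. Pulsar's namespace isolation policies select primary and secondary brokers with regular expressions; the toy model below mimics that idea and assumes that secondary brokers are added only when fewer than a minimum number of primary brokers are available. The function, the threshold handling, and the broker names are hypothetical simplifications, not Pulsar's fail-over logic.

```python
from __future__ import annotations

import re


def eligible_brokers(
    brokers: list[str],
    primary_patterns: list[str],
    secondary_patterns: list[str],
    min_available: int,
) -> list[str]:
    """Return the brokers a namespace may use under a simplified isolation policy."""

    def matches(broker: str, patterns: list[str]) -> bool:
        return any(re.fullmatch(p, broker) for p in patterns)

    primary = [b for b in brokers if matches(b, primary_patterns)]
    if len(primary) >= min_available:
        return primary
    # Too few primary brokers are available: add the secondary brokers.
    secondary = [
        b for b in brokers if b not in primary and matches(b, secondary_patterns)
    ]
    return primary + secondary


# Example with made-up broker names.
live = ["broker-a-1", "broker-a-2", "broker-b-1"]
print(eligible_brokers(live, ["broker-a-.*"], ["broker-b-.*"], min_available=1))
```

Namespaces bound to different patterns end up on disjoint sets of brokers, which is what keeps their traffic apart inside one cluster.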
Migrate namespace
To migrate a namespace to another broker isolation group, change the namespace isolation policy. For how to set the namespace isolation policy, see "Set the namespace isolation strategy".
To migrate a namespace to another bookie affinity group, change its bookie affinity group. Existing data is not moved to the target bookie affinity group. For how to set up a bookie affinity group, see "Set bookie affinity group".
Scaling nodes
Broker
When scaling brokers out or in, note the following:
- When adding a broker, assign it to a broker isolation group as part of either the primary group or the secondary group.
- When removing brokers, make sure that the broker isolation group still has enough brokers.
Bookie
When scaling bookies out or in, note the following:
- When adding a bookie, specify the bookie affinity group for the new bookie.
- When removing bookies, make sure that the bookie affinity group still has enough bookies. For how to set up a bookie affinity group, see "Set bookie affinity group".
Reference
In a production environment, you can combine the three isolation methods described in this article according to your needs, or choose other methods. In general, consider the following when choosing an isolation method:
- Critical services (such as billing and advertising) can use multiple small Pulsar clusters that do not share storage or local ZooKeeper with other clusters. Multiple small Pulsar clusters provide the highest level of isolation for important workloads.
- Large enterprises with multiple teams can deploy one large Pulsar cluster, use different namespaces for different isolation groups, and define isolation groups by capacity or workload. For example, a workload with heavy fan-out calls for different hardware than a workload that needs the lowest end-to-end latency.