Apache Pulsar has become very popular recently and is often described as next-generation messaging middleware. Let's take a look at what makes it stand out.
Overview
Apache Pulsar is a pub/sub messaging platform that uses Apache BookKeeper for persistence. It is server-to-server messaging middleware, originally developed at Yahoo and open sourced in 2016. It entered the Apache Incubator and graduated to a top-level Apache Software Foundation project in 2018. It provides the following features:
- Cross-regional replication
- Multi-tenant
- Zero data loss
- Zero rebalancing time
- Unified queuing and streaming model
- High scalability
- High throughput
- Pulsar Proxy
- Pulsar Functions
Architecture
Pulsar uses a layered architecture that separates the storage layer from the brokers. This architecture gives Pulsar the following benefits:
- Brokers can be scaled independently
- Storage (bookies) can be scaled independently
- ZooKeeper, brokers, and bookies are easier to containerize
- ZooKeeper provides cluster configuration and state storage
In a Pulsar cluster:
- One or more brokers handle and load balance incoming messages from producers, dispatch messages to consumers, communicate with the Pulsar configuration store to handle various coordination tasks, store messages in BookKeeper instances (also known as bookies), and rely on a cluster-specific ZooKeeper cluster for certain tasks.
- A BookKeeper cluster composed of one or more bookies handles the persistent storage of messages.
- A ZooKeeper cluster specific to that cluster handles coordination tasks between Pulsar clusters.
For more information about Pulsar's architecture, please refer to: https://pulsar.apache.org/docs/en/concepts-architecture-overview/
Four subscription modes
There are four subscription modes in Pulsar: exclusive, shared, failover and key_shared. These modes are shown in the figure below.
For a detailed introduction, refer to: https://pulsar.apache.org/docs/en/concepts-messaging/
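To make the difference between the four modes concrete, here is a minimal sketch in plain Python. It is a toy model, not Pulsar's implementation: the `dispatch` function, the consumer names, and the use of CRC32 to pick a consumer for a key are all assumptions made for illustration (the real broker has its own rules for choosing the active failover consumer and for assigning keys).

```python
from collections import defaultdict
from zlib import crc32


def dispatch(mode, consumers, messages):
    """Toy model: return {consumer: [payloads]} for one subscription.

    `messages` is a list of (key, payload) tuples; `consumers` is a list
    of consumer names in the order they connected.
    """
    if not consumers:
        raise ValueError("at least one consumer is required")
    received = defaultdict(list)
    if mode == "exclusive":
        # Only a single consumer may attach to the subscription.
        if len(consumers) > 1:
            raise RuntimeError("exclusive subscription already has a consumer")
        received[consumers[0]] = [payload for _, payload in messages]
    elif mode == "failover":
        # One active consumer gets everything; the others are standbys.
        received[consumers[0]] = [payload for _, payload in messages]
    elif mode == "shared":
        # Messages are spread across consumers round-robin.
        for i, (_, payload) in enumerate(messages):
            received[consumers[i % len(consumers)]].append(payload)
    elif mode == "key_shared":
        # Messages with the same key always go to the same consumer.
        for key, payload in messages:
            index = crc32(key.encode("utf-8")) % len(consumers)
            received[consumers[index]].append(payload)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return dict(received)


messages = [("k1", "m1"), ("k2", "m2"), ("k1", "m3"), ("k2", "m4")]
print(dispatch("failover", ["c1", "c2"], messages))
print(dispatch("shared", ["c1", "c2"], messages))
print(dispatch("key_shared", ["c1", "c2"], messages))
```

In this model, failover delivers all four messages to `c1`, shared alternates between `c1` and `c2`, and key_shared keeps `m1` and `m3` (both keyed `k1`) on the same consumer.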
Performance is better than Kafka
Pulsar's standout strength is performance. According to the benchmark cited below, Pulsar is much faster than Kafka: throughput is 2.5 times higher and latency is 40% lower.
Data source: https://streaml.io/pdf/Gigaom-Benchmarking-Streaming-Platforms.pdf
Note: The comparison is for 1 topic with 1 partition and 100-byte messages. Pulsar can send 220,000+ messages per second.
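As a rough sanity check, the message rate can be converted into data throughput. The helper below is only illustrative arithmetic on the two figures quoted in the note (220,000 messages per second, 100 bytes each); the function name is hypothetical and it counts payload bytes only, ignoring protocol overhead.

```python
def throughput_mb_per_s(messages_per_second, message_size_bytes):
    """Convert a message rate into payload throughput in MB/s (1 MB = 10**6 bytes)."""
    return messages_per_second * message_size_bytes / 1_000_000


print(throughput_mb_per_s(220_000, 100))  # 22.0
```

In other words, the quoted rate corresponds to roughly 22 MB of payload per second on a single partition.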
Installation
Install Pulsar from the binary release
# Download the official binary package
[root@centos7 ~]# wget https://archive.apache.org/dist/pulsar/pulsar-2.8.0/apache-pulsar-2.8.0-bin.tar.gz
# Extract the archive
[root@centos7 ~]# tar zxf apache-pulsar-2.8.0-bin.tar.gz
[root@centos7 ~]# cd apache-pulsar-2.8.0
[root@centos7 apache-pulsar-2.8.0]# ll
total 72
drwxr-xr-x 3 root root 225 Jan 22 2020 bin
drwxr-xr-x 5 root root 4096 Jan 22 2020 conf
drwxr-xr-x 3 root root 132 Jul 6 11:47 examples
drwxr-xr-x 4 root root 66 Jul 6 11:47 instances
drwxr-xr-x 3 root root 16384 Jul 6 11:47 lib
-rw-r--r-- 1 root root 31639 Jan 22 2020 LICENSE
drwxr-xr-x 2 root root 4096 Jan 22 2020 licenses
-rw-r--r-- 1 root root 6612 Jan 22 2020 NOTICE
-rw-r--r-- 1 root root 1269 Jan 22 2020 README
# The bin directory contains the commands used to start Pulsar directly
Docker installation (the focus of this article)
[root@centos7 ~]# docker run -it \
-p 6650:6650 \
-p 8080:8080 \
--mount source=pulsardata,target=/pulsar/data \
--mount source=pulsarconf,target=/pulsar/conf \
apachepulsar/pulsar:2.8.0 \
bin/pulsar standalone
HTTP access (for example, the admin REST API) uses port 8080, and access over the Pulsar binary protocol (Java, Python, and other clients) uses port 6650.
Pulsar Manager, the official visualization tool, can manage multiple Pulsar clusters through a web UI. https://pulsar.apache.org/docs/en/administration-pulsar-manager/
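The two ports map to two different URLs on the client side. The sketch below builds both and, optionally, queries the admin REST API over HTTP. The helper names are hypothetical, and `/admin/v2/clusters` is used here on the assumption that the standalone container exposes the standard admin API; `list_clusters` only works while the container is running.

```python
import json
import urllib.request


def service_urls(host="localhost", binary_port=6650, http_port=8080):
    """Return the URL used by clients (binary) and by HTTP tools (http)."""
    return {
        "binary": f"pulsar://{host}:{binary_port}",
        "http": f"http://{host}:{http_port}",
    }


def list_clusters(http_url, timeout=5):
    """Ask the admin REST API which clusters exist (needs a running broker)."""
    with urllib.request.urlopen(f"{http_url}/admin/v2/clusters", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


urls = service_urls()
print(urls["binary"])  # pulsar://localhost:6650
print(urls["http"])    # http://localhost:8080
```

With the container started as above, calling `list_clusters(urls["http"])` is a quick way to confirm that port 8080 is reachable.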
[root@centos7 ~]# docker pull apachepulsar/pulsar-manager:v0.2.0
[root@centos7 ~]# docker run -it \
-p 9527:9527 -p 7750:7750 \
-e SPRING_CONFIGURATION_FILE=/pulsar-manager/pulsar-manager/application.properties \
apachepulsar/pulsar-manager:v0.2.0
Set the admin user and password
[root@centos7 ~]# CSRF_TOKEN=$(curl http://localhost:7750/pulsar-manager/csrf-token)
curl \
-H "X-XSRF-TOKEN: $CSRF_TOKEN" \
-H "Cookie: XSRF-TOKEN=$CSRF_TOKEN;" \
-H "Content-Type: application/json" \
-X PUT http://localhost:7750/pulsar-manager/users/superuser \
-d '{"name": "admin", "password": "admin123", "description": "test", "email": "mingongge@test.org"}'
{"message":"Add super user success, please login"}
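The same two steps (fetch a CSRF token, then send it back in both a header and a cookie) can be scripted. This is a minimal sketch using only the Python standard library; the function names are hypothetical, and the URLs and JSON fields are taken from the curl commands above. Note that the token has to be substituted into the headers, which is why the shell version needs double quotes around `$CSRF_TOKEN`.

```python
import json
import urllib.request

MANAGER_URL = "http://localhost:7750/pulsar-manager"


def fetch_csrf_token(base_url=MANAGER_URL, timeout=5):
    """GET the CSRF token (needs a running Pulsar Manager)."""
    with urllib.request.urlopen(f"{base_url}/csrf-token", timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


def build_superuser_request(token, name, password, email, base_url=MANAGER_URL):
    """Build, but do not send, the PUT request that creates the superuser."""
    body = json.dumps({
        "name": name,
        "password": password,
        "description": "test",
        "email": email,
    }).encode("utf-8")
    headers = {
        "X-XSRF-TOKEN": token,
        "Cookie": f"XSRF-TOKEN={token};",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        f"{base_url}/users/superuser", data=body, headers=headers, method="PUT"
    )


# Built with a placeholder token so the example runs without a server.
req = build_superuser_request("example-token", "admin", "admin123", "mingongge@test.org")
print(req.get_method(), req.full_url)
```

Against a live Pulsar Manager, pass the result of `fetch_csrf_token()` as the token and send the request with `urllib.request.urlopen(req)`.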
Open http://server_ip:9527 in a browser and log in, as shown below.
Enter the user name and password you just created to configure and manage the server.
List
Topic list
Topic details
Client configuration
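The client examples in this section address a topic by its full name, such as persistent://public/default/mq-topic-1, which has the form domain://tenant/namespace/topic. The helper below is a hypothetical sketch (it is not part of any Pulsar client) that splits such a name into its parts and rejects malformed ones; it assumes the current four-part naming scheme only.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    domain: str
    tenant: str
    namespace: str
    topic: str


def parse_topic(full_name):
    """Split 'domain://tenant/namespace/topic' into its four parts."""
    domain, sep, rest = full_name.partition("://")
    if not sep or domain not in ("persistent", "non-persistent"):
        raise ValueError(f"unsupported or missing domain in {full_name!r}")
    parts = rest.split("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected tenant/namespace/topic in {full_name!r}")
    return TopicName(domain, *parts)


print(parse_topic("persistent://public/default/mq-topic-1"))
```

A check like this catches a missing slash between the namespace and the topic before the name ever reaches the broker.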
Java client
The following is an example of a Java consumer configuration that uses a shared subscription:
import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.PulsarClientException;
import org.apache.pulsar.client.api.SubscriptionType;

public class SharedConsumerExample {
    public static void main(String[] args) throws PulsarClientException {
        String serviceUrl = "pulsar://localhost:6650";
        String topic = "persistent://public/default/mq-topic-1";
        String subscription = "sub-1";

        // Connect to the broker over the Pulsar binary protocol
        PulsarClient client = PulsarClient.builder()
                .serviceUrl(serviceUrl)
                .build();

        // Without an explicit schema, newConsumer() yields a Consumer<byte[]>
        Consumer<byte[]> consumer = client.newConsumer()
                .topic(topic)
                .subscriptionName(subscription)
                .subscriptionType(SubscriptionType.Shared)
                // If you'd like to restrict the receiver queue size
                .receiverQueueSize(10)
                .subscribe();
    }
}
Python client
Here is an example of a Python consumer configuration that uses a shared subscription:
from pulsar import Client, ConsumerType
SERVICE_URL = "pulsar://localhost:6650"
TOPIC = "persistent://public/default/mq-topic-1"
SUBSCRIPTION = "sub-1"
client = Client(SERVICE_URL)
consumer = client.subscribe(
    TOPIC,
    SUBSCRIPTION,
    # If you'd like to restrict the receiver queue size
    receiver_queue_size=10,
    consumer_type=ConsumerType.Shared)
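Once the consumer is configured, messages still have to be received and acknowledged. The following is a minimal sketch of such a loop, not an official recipe: the `consume` function and its parameters are hypothetical, and it assumes the consumer object offers `receive(timeout_millis=...)`, `acknowledge(msg)` and `negative_acknowledge(msg)` as the Python client's consumer does. It does not import `pulsar` itself, so it can be tried with any object that has those methods.

```python
def consume(consumer, handler, max_messages, timeout_millis=1000):
    """Receive up to max_messages and acknowledge each one the handler accepts.

    Returns the number of messages processed successfully.
    """
    processed = 0
    for _ in range(max_messages):
        try:
            msg = consumer.receive(timeout_millis=timeout_millis)
        except Exception:
            # Assumption: the client raises when no message arrives in time.
            # A real application would distinguish timeouts from other errors.
            break
        try:
            handler(msg.data())
        except Exception:
            # Processing failed: ask the broker to redeliver the message later.
            consumer.negative_acknowledge(msg)
        else:
            # Processing succeeded: the message will not be redelivered.
            consumer.acknowledge(msg)
            processed += 1
    return processed
```

With the consumer created above, `consume(consumer, print, 10)` would print up to ten message payloads and acknowledge each one; remember to call `client.close()` when finished.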
C++ client
Here is an example of a C++ consumer configuration that uses a shared subscription:
#include <pulsar/Client.h>
#include <string>

using namespace pulsar;

int main() {
    std::string serviceUrl = "pulsar://localhost:6650";
    std::string topic = "persistent://public/default/mq-topic-1";
    std::string subscription = "sub-1";
    Client client(serviceUrl);
    ConsumerConfiguration consumerConfig;
    // ConsumerShared is an enumerator of pulsar::ConsumerType
    consumerConfig.setConsumerType(ConsumerShared);
    // If you'd like to restrict the receiver queue size
    consumerConfig.setReceiverQueueSize(10);
    Consumer consumer;
    Result result = client.subscribe(topic, subscription, consumerConfig, consumer);
    // Report failure through the exit code if the subscription was not created
    return result == ResultOk ? 0 : 1;
}
For more configuration and operation guides, see the official documentation, which is very clear: https://pulsar.apache.org/docs/
Summary
As a next-generation distributed message queue, Pulsar has many attractive features, such as geo-replication, multi-tenancy, scalability, and read-write isolation, which address some of the shortcomings of competing products.