
About StreamNative

StreamNative is an open-source infrastructure software company formed by the founding team of Apache Pulsar, a top-level project of the Apache Software Foundation, to build a next-generation cloud-native data platform around Pulsar that unifies batch and stream processing. As the commercial company behind Apache Pulsar, StreamNative focuses on the open-source ecosystem and community building and is committed to innovation in cutting-edge technologies. The founding team members previously worked at well-known companies such as Yahoo, Twitter, Splunk, and EMC.

Introduction: This article is a written version of the TGIP-CN 037 live stream by Li Penghui, Apache Pulsar PMC member and chief architect at StreamNative. With Pulsar 2.10.0 about to be released, this session walks through the main new features of Apache Pulsar 2.10.0 and answers questions about the technical details of the new version.

Click to view the review video

Pulsar 2.10.0 contains 1,000+ commits from 99 contributors, many of them from the Chinese community; thank you for your support of and contributions to Pulsar. This release is a new milestone, and the large number of commits has also driven an upgrade of the documentation. As part of the Apache Pulsar website upgrade, the Beta version of the new website has re-organized and improved the documentation. You are welcome to try it out and share your feedback.

New features in Apache Pulsar 2.10.0 include:

  • Remove strong dependence on ZooKeeper;
  • New consumption type TableView;
  • Multi-cluster automatic failover;
  • Producer Lazy Loading + Partial RoundRobin;
  • Redeliver Backoff;
  • Init Subscription for DLQ;
  • Introduce multi-cluster global topic policy setting support and topic-level cross-region replication configuration;
  • ChunkMessageId;
  • Batch operations for the metadata service, improving Pulsar's stability in scenarios with a large number of topics;
  • ...

Remove strong dependency on ZooKeeper API

The ZooKeeper API is used very widely in Pulsar. Older versions depended on it everywhere, which made it hard for users to choose other kinds of metadata services. To solve this problem, after extensive preparation and testing across multiple iterations, Pulsar 2.10.0 finally removes the strong dependency on ZooKeeper.

The new version currently supports three metadata services:

  • ZooKeeper
  • Etcd
  • RocksDB(standalone)

It should be noted that etcd does not currently have a mature Java client, so it should be adopted carefully after a comprehensive evaluation. In addition, benchmark results show that ZooKeeper and etcd perform similarly, so users can choose according to their own situation.

The proposal for this feature is PIP 45: Pluggable metadata interface (Metadata Store + Coordination Service). As the name suggests, the Metadata Store is the metadata storage layer, and the Coordination Service provides a centralized service for acquiring global locks.

Version 2.10 also adds support for batch metadata operations, reducing the number of round trips to the metadata service and greatly reducing the pressure of metadata operations.
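
As a rough sketch, the related broker.conf knobs look like the following; the names below are taken from the 2.10 configuration and should be verified against your release:

# Whether to batch metadata read/write operations (assumed broker.conf names; check your release)
metadataStoreBatchingEnabled=true
# Maximum delay to impose when grouping operations into a batch
metadataStoreBatchingMaxDelayMillis=5
# Maximum number of operations in a single batch
metadataStoreBatchingMaxOperations=1000
# Maximum size of a batch, in KB
metadataStoreBatchingMaxSizeKb=128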

# The metadata store URL
# Examples:
# * zk:my-zk-1:2181,my-zk-2:2181,my-zk-3:2181
# * my-zk-1:2181,my-zk-2:2181,my-zk-3:2181 (will default to ZooKeeper when the scheme is not specified)
# * zk:my-zk-1:2181,my-zk-2:2181,my-zk-3:2181/my-chroot-path (to add a ZK chroot path)
metadataStoreUrl=

# The metadata store URL for the configuration data. If empty, we fall back to use metadataStoreUrl
configurationMetadataStoreUrl=

The above shows the configuration for the new feature; as you can see, these parameters no longer mention ZooKeeper. However, removing the strong dependency on ZooKeeper does not mean removing ZooKeeper itself. Considering that ZooKeeper still has a large user base and is widely deployed, the community will not consider deleting the ZooKeeper implementation in the short term; it has simply been made pluggable for convenience.
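
For reference, switching the metadata store to etcd would look roughly like the following; the exact etcd URL scheme is an assumption here and should be verified against the release documentation:

# Assumed etcd example; verify the URL format for your release
metadataStoreUrl=etcd:http://my-etcd-1:2379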

For more details on reducing the ZooKeeper dependency, you can read the blog post Apache Pulsar Lightweight: Towards a Light ZooKeeper Era.

New consumption type TableView

Pulsar's consumption model is becoming more diverse, and version 2.10 introduces TableView, a KV-like table abstraction. It is a pure read-only view that does not support writes. It builds a table view directly in client memory and is suitable for generating views over small amounts of data; it is not suitable for scenarios where the data is too large to fit in the memory of a single machine.

TableView can be used together with Topic Compaction, which performs key compaction on the server side. The idea is to keep only the latest value of each key in a snapshot; a consumer only needs to read this snapshot when required, rather than paying the cost of reading the original backlog. TableView integrates seamlessly with this feature: when a TableView is restored, the snapshot generated by the broker can be used directly, reducing the recovery overhead. The mechanism and use cases of this compaction are covered in detail in the original video (around minute 22).
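
For illustration, compaction can be triggered manually on the topic used in the TableView example below, or configured to run automatically once a backlog threshold is reached at the namespace level:

# trigger compaction once on the topic
bin/pulsar-admin topics compact persistent://public/default/tableview-test
# or configure automatic compaction at the namespace level, e.g. once 100MB of backlog accumulates
bin/pulsar-admin namespaces set-compaction-threshold --threshold 100M public/default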

try (TableView<byte[]> tv = client.newTableViewBuilder(Schema.BYTES)
        .topic("public/default/tableview-test")
        .autoUpdatePartitionsInterval(60, TimeUnit.SECONDS)
        .create()) {
    System.out.println("start tv size: " + tv.size());
    tv.forEachAndListen((k, v) -> System.out.println(k + "->" + Arrays.toString(v)));

    while (true) {
        Thread.sleep(20000);
        System.out.println(tv.size());
        tv.forEach((k, v) -> System.out.println("checkpoint: " + k + "->" + Arrays.toString(v)));
    }

} catch (Exception ex) {
    System.out.println("Table view failed: " + ex.getCause());
}

Cluster automatic failover

 ServiceUrlProvider provider = AutoClusterFailover.builder()
        .primary(primary)
        .secondary(Collections.singletonList(secondary))
        .failoverDelay(failoverDelay, TimeUnit.SECONDS)
        .switchBackDelay(switchBackDelay, TimeUnit.SECONDS)
        .checkInterval(checkInterval, TimeUnit.MILLISECONDS)
        .build();

Pulsar supports multiple clusters, and data can be replicated between them, so users often need to fail over from one cluster to another; this is why the automatic cluster failover feature was introduced. In the past, failover required switching DNS records or building auxiliary components, but these approaches usually needed manual intervention and made the SLA hard to guarantee. The advantage of the new feature lies in automation and configurability: you can set primary and secondary clusters, configure parameters such as the failover delay, and let the client switch clusters automatically based on probing. At present the probe only checks whether the Pulsar port is reachable; future versions will continue to improve the probing method.
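
To complete the picture, the provider is then passed to the client builder. A minimal sketch, assuming primary and secondary service URLs are defined as in the snippet above:

PulsarClient pulsarClient = PulsarClient.builder()
        .serviceUrlProvider(provider)   // use the AutoClusterFailover provider built above
        .build();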

Producer Lazy loading + Partial RoundRobin

Currently, in a large-scale cluster, if a topic has many partitions, the producer polls all partitions when sending messages, and the partitions may be spread across different brokers, which can create huge connection pressure. To address this, the new feature implements lazy loading of producers: if a partition is not used, its internal producer is not created, reducing the burden on the system. With partial round-robin routing, all partitions are listed and shuffled first, so that different clients write to different subsets of partitions, reducing the number of connections between producer instances and brokers and further reducing system pressure. Note that the Shared consumer does not support a similar mechanism for the time being; the community still needs to explore whether something comparable can be done on the consumer side.

 
PartitionedProducerImpl<byte[]> producerImpl = (PartitionedProducerImpl<byte[]>) pulsarClient.newProducer()
        .topic(topic)
        .enableLazyStartPartitionedProducers(true)
        .enableBatching(false)
        .messageRoutingMode(MessageRoutingMode.CustomPartition)
        .messageRouter(new PartialRoundRobinMessageRouterImpl(3))
        .create();  

Redeliver Backoff

 client.newConsumer().negativeAckRedeliveryBackoff(MultiplierRedeliveryBackoff.builder()
        .minDelayMs(1000)
        .maxDelayMs(60 * 1000)
        .build()).subscribe();

client.newConsumer().ackTimeout(10, TimeUnit.SECONDS)
        .ackTimeoutRedeliveryBackoff(MultiplierRedeliveryBackoff.builder()
        .minDelayMs(1000)
        .maxDelayMs(60 * 1000)
        .build()).subscribe();

Pulsar already has an ackTimeout mechanism. With a shared subscription, a message may not be acknowledged within a certain period while consuming; ackTimeout ensures that if a message is not acknowledged in time, it is automatically redelivered and may be dispatched to other consumers.

The right time for message redelivery is hard to pin down: it varies, and the delay should grow as the number of processing failures for a message increases. This API was introduced for that purpose, allowing the delay to be increased gradually. Compared with existing approaches, the advantages of this feature are lower overhead, since it does not need to go through another topic, and more flexibility. The downside is that the backoff state is kept on the client, so once the client goes down the messages are redelivered immediately. In addition, the cost of using this feature is very low and the API is concise and easy to master. Note that currently only the Java client supports this feature, and it can also be used together with dead letter queues.
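
As a small sketch of how the delay grows: with a minimum delay of 1s, a maximum of 60s, and a multiplier of 2, redeliveries are delayed roughly 1s, 2s, 4s, and so on, capped at 60s. The multiplier(...) setter shown here is an assumption to verify against your client version (the default factor is 2):

// Assumed builder option; delays grow by the multiplier up to maxDelayMs
RedeliveryBackoff backoff = MultiplierRedeliveryBackoff.builder()
        .minDelayMs(1000)
        .maxDelayMs(60 * 1000)
        .multiplier(2.0)
        .build();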

Initialize dead letter queue subscription

 Consumer<byte[]> consumer = pulsarClient.newConsumer(Schema.BYTES)
    .topic(topic)
    .subscriptionName(subscriptionName)
    .subscriptionType(SubscriptionType.Shared)
    .ackTimeout(1, TimeUnit.SECONDS)
    .deadLetterPolicy(DeadLetterPolicy.builder()
        .maxRedeliverCount(maxRedeliveryCount)
        .initialSubscriptionName("my-sub")
        .build())
    .receiverQueueSize(100)
    .subscriptionInitialPosition(SubscriptionInitialPosition.Earliest)
    .subscribe();

The dead letter queue topic is created lazily, which presents a problem: before any dead letter message has been produced and the topic has been created, you cannot set a data retention policy on that topic; you can only set a policy at the namespace level, which is very coarse-grained. The new version introduces initialSubscriptionName: when configuring the dead letter queue, a subscription is created together with the topic, so the data is preserved. For a dead letter queue, most scenarios only need a single subscription to process the messages, and that subscription corresponds to initialSubscriptionName, so messages sent to the dead letter queue are retained without having to configure retention.
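
As an illustration, once a dead letter message has been produced you can verify that the initial subscription exists on the dead letter topic; by default the dead letter topic is named <topic>-<subscription>-DLQ, and the topic name below is a placeholder:

bin/pulsar-admin topics subscriptions persistent://public/default/my-topic-my-sub-DLQ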

Cross-cluster topic policy

 bin/pulsar-admin topics set-retention -s 1G -t 1d --global my-topic

producer.newMessage()
    ...
    .replicationClusters(restrictDatacenters)  // restrict the clusters this message is replicated to
    .send();

This feature allows a topic policy to be applied across clusters: with the --global parameter it takes effect on all clusters. On the surface a global flag looks easy to implement, but a lot of work has been done behind the scenes, mainly because the underlying layer needs to synchronize the schema to all clusters for the policy to apply across clusters. Note that the broker does not have a retry strategy here; choose one of the following two methods:

  • Actively tell the broker to retry;
  • Disconnect the client.
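
Coming back to the --global flag above: as a usage sketch, the policy set earlier can be read back with the same flag, assuming the get command accepts --global in your release:

bin/pulsar-admin topics get-retention --global my-topic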

New message ID type ChunkMessageId

public class ChunkMessageIdImpl extends MessageIdImpl implements MessageId {
    private final MessageIdImpl firstChunkMsgId;

    public ChunkMessageIdImpl(MessageIdImpl firstChunkMsgId, MessageIdImpl lastChunkMsgId) {
        super(lastChunkMsgId.getLedgerId(), lastChunkMsgId.getEntryId(), lastChunkMsgId.getPartitionIndex());
        this.firstChunkMsgId = firstChunkMsgId;
    }

    public MessageIdImpl getFirstChunkMessageId() {
        return firstChunkMsgId;
    }

    public MessageIdImpl getLastChunkMessageId() {
        return this;
    }
}

Chunked messages, introduced in a previous version, can effectively reduce system pressure when sending large payloads, but the problem was that only the message ID of the last chunk was returned to the client, so the client could not know the ID of the first chunk. This feature records both the first and last chunk IDs of a ChunkMessage, making it convenient for users to seek to a MessageId and consume from it. Currently this feature is only supported by the Java client.
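
A small sketch of how the ID might be used on the consumer side, assuming the received message is chunked and its ID is a ChunkMessageIdImpl (consumer and message setup omitted for brevity):

MessageId id = msg.getMessageId();
if (id instanceof ChunkMessageIdImpl) {
    // seek back to the first chunk so the whole chunked message can be re-read
    MessageId firstChunk = ((ChunkMessageIdImpl) id).getFirstChunkMessageId();
    consumer.seek(firstChunk);
}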

Other features

  • Topic properties: attach additional information (beyond the name) to a topic and store it together with the metadata;
  • Metadata batch operations: improve performance;
  • C++ client: adds chunked message support;
  • Broker graceful shutdown: a REST API to gracefully shut down a broker, first removing the broker from the cluster and then closing its topics, so that clients stop initiating connections to it;
  • Support creating a consumer in the paused state: a consumer can be created paused, so it does not fetch messages from the server until resumed;
  • ...

Featured Q&A

After introducing the new features, Li Penghui also answered questions from the live chat one by one. The following is a summary of selected Q&A; for details, watch the video from the 54-minute mark onward.

Q: Does Pulsar support failover?

  • Pulsar supports the failover subscription mode, and a partition can have multiple consumers. The overhead mainly lies in message acknowledgement: Pulsar maintains the acknowledgement state of individual messages, so there is some overhead;

Q: Is Pulsar's acknowledgement exactly-once?

  • Without transactions enabled, Pulsar provides at-least-once rather than exactly-once semantics by default;

Q: Does ChunkMessage support transactions?

  • Transactions are not currently supported;

Q: Can a message send in Pulsar fail in the middle while later sends succeed?

  • No; in Pulsar a send will not succeed after an earlier one has failed: once one message fails, the subsequent ones fail as well;

Q: How to solve the problem of cluster size limitation caused by metadata?

  • Version 2.10 has not yet solved the cluster size limitation caused by metadata; this will be considered in a future release;

Q: Does KoP support Kafka format?

  • KoP now supports both the Pulsar format and the Kafka format, avoiding serialization/deserialization on the server side and handing that work to the client. The client can parse Kafka-format data by loading a formatter, reducing the pressure on the broker.

About PPT

Please copy the link into your browser to download the PPT: https://pan.baidu.com/s/1sqt99KVF7n0jBS_aue2SXw
Password: 6wtk
