Author: Chen Yu

Chen Yu is currently the project manager for Aikesheng's South District, accountable for the overall quality, safety, schedule, and cost of its projects. He is passionate about open source technology, committed to his customers, and enjoys extreme sports and football.

Source of this article: original contribution

*The original content is produced by the open source community of Aikesheng, and the original content shall not be used without authorization. For reprinting, please contact the editor and indicate the source.


1. What is ClickHouse-Keeper:

The ClickHouse community introduced ClickHouse-Keeper in version 21.8, and the ClickHouse 21.12 release announcement stated that ClickHouse Keeper was essentially feature complete.

ClickHouse Keeper is a replacement for ZooKeeper. Unlike ZooKeeper, which is written in Java, ClickHouse Keeper is written in C++ and built on the Raft algorithm. Raft allows linearizable reads and writes and has open source implementations in many different languages.

2. Comparison of ZooKeeper and ClickHouse-Keeper

Why introduce ClickHouse-Keeper? Mainly because ClickHouse runs into several pain points when using ZooKeeper:

  • Written in Java
  • Inconvenient to operate and maintain
  • Must be deployed as a standalone service
  • Subject to the zxid overflow problem
  • Snapshots and logs are not compressed
  • Linearizable reads are not supported

ClickHouse-Keeper has the following advantages:

  • Written in C++, so the technology stack is unified with ClickHouse
  • Can be deployed standalone or embedded in ClickHouse
  • No zxid overflow problem
  • Better read performance, comparable write performance
  • Supports compression and checksum verification of snapshots and logs
  • Supports linearizable reads and writes

3. Configuration method

The configuration differs little from a ZooKeeper-based cluster configuration. ClickHouse-Keeper runs only if the <keeper_server> tag is present in the configuration.
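A minimal sketch of such a configuration, modeled on the example in the official documentation referenced at the end of this article, can be written out with the shell snippet below. The file name keeper_config.xml, the host names keeper1 to keeper3, the Raft port 9444, the storage paths, and the timeout values are illustrative assumptions, not values taken from the original template. The root element may differ by release (older configurations use <yandex>).

```shell
# Write a minimal three-node Keeper configuration sketch to ./keeper_config.xml.
# Host names, ports, paths, and timeouts are assumptions; adjust them per node.
cat > keeper_config.xml <<'EOF'
<clickhouse>
    <keeper_server>
        <!-- Port that clients (ClickHouse-Server) connect to -->
        <tcp_port>9181</tcp_port>
        <!-- Must be unique per node and match one <id> below -->
        <server_id>1</server_id>
        <log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>
        <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>

        <coordination_settings>
            <operation_timeout_ms>10000</operation_timeout_ms>
            <session_timeout_ms>30000</session_timeout_ms>
            <raft_logs_level>information</raft_logs_level>
        </coordination_settings>

        <raft_configuration>
            <server>
                <id>1</id>
                <hostname>keeper1</hostname>
                <port>9444</port>
            </server>
            <server>
                <id>2</id>
                <hostname>keeper2</hostname>
                <port>9444</port>
            </server>
            <server>
                <id>3</id>
                <hostname>keeper3</hostname>
                <port>9444</port>
            </server>
        </raft_configuration>
    </keeper_server>
</clickhouse>
EOF
echo "wrote keeper_config.xml"
```

On the other two nodes only <server_id> would change (2 and 3 in this sketch); the <raft_configuration> block stays identical everywhere.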

4. Start command

clickhouse-keeper --config /etc/your_path_to_config/config.xml

5. Parameter description

  • tcp_port: The port the client connects to (ZooKeeper's default value is 2181)
  • tcp_port_secure: Secure port for SSL connection between client and keeper-server
  • server_id: Unique ID of each node of the Keeper cluster
  • log_storage_path: path for the coordination logs, ideally on a device with fast I/O
  • snapshot_storage_path: snapshot path

<keeper_server>.<coordination_settings> section

  • operation_timeout_ms: timeout for a single client operation
  • min_session_timeout_ms: minimum timeout for client sessions
  • session_timeout_ms: maximum timeout for client sessions
  • dead_session_check_period_ms: how often expired sessions are checked and removed
  • heart_beat_interval_ms: how often the leader sends heartbeats to the followers
  • election_timeout_lower_bound_ms: if a follower does not receive a heartbeat from the leader within this interval, it can initiate a leader election
  • rotate_log_storage_interval: how many log records to store in a single file
  • reserved_log_items: how many log records to keep before compaction
  • snapshot_distance: how often to create snapshots, measured in number of log records
  • snapshots_to_keep: the number of snapshots to keep
  • max_requests_batch_size: maximum number of requests in a batch before it is sent to Raft
  • raft_logs_level: Raft logging level
  • auto_forwarding: allows followers to forward write requests to the leader
  • shutdown_timeout: time to wait for internal connections to finish before shutting down (ms)

<keeper_server>.<raft_configuration> section

  • id: ID of a node in the cluster (the same value as that node's server_id)
  • hostname: host name of the node
  • port: port on which the node listens for connections from the other Keeper nodes
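Because every node has to carry the same <raft_configuration> block, it can help to generate it rather than type it by hand. The following is a minimal sketch under stated assumptions: render_raft_configuration is a hypothetical helper (not part of ClickHouse), IDs are simply assigned 1, 2, 3 in argument order, and the host names and port 9444 are placeholders.

```shell
# Print a <raft_configuration> block for the given Raft port and host names.
# IDs are assigned 1, 2, 3, ... in argument order; each node's <server_id>
# must equal the <id> of its own entry.
render_raft_configuration() {
    raft_port="$1"
    shift
    id=1
    echo "<raft_configuration>"
    for host in "$@"; do
        printf '    <server>\n'
        printf '        <id>%d</id>\n' "$id"
        printf '        <hostname>%s</hostname>\n' "$host"
        printf '        <port>%d</port>\n' "$raft_port"
        printf '    </server>\n'
        id=$((id + 1))
    done
    echo "</raft_configuration>"
}

# Hypothetical three-node cluster; 9444 is an assumed inter-node port.
render_raft_configuration 9444 keeper1 keeper2 keeper3
```

The printed block can then be pasted into the <keeper_server> section of each node's configuration.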

6. Status check

6.1. ruok tests whether the Keeper server is running in a non-error state

The command is as follows:

echo ruok | nc 127.0.0.1 9181

If the server is healthy, it returns imok
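For a cluster, the same check can be looped over every node. This is a small sketch, assuming nc is installed and supports the -w timeout option; keeper_is_ok is a hypothetical helper name and KEEPER_NODES is an assumed environment variable holding space-separated host:port pairs.

```shell
# Return 0 if the Keeper at host ($1) and port ($2) answers "imok" to ruok.
keeper_is_ok() {
    reply="$(echo ruok | nc -w 2 "$1" "$2" 2>/dev/null)"
    [ "$reply" = "imok" ]
}

# KEEPER_NODES is a hypothetical list such as "keeper1:9181 keeper2:9181".
for node in ${KEEPER_NODES:-127.0.0.1:9181}; do
    host="${node%:*}"
    port="${node##*:}"
    if keeper_is_ok "$host" "$port"; then
        echo "$node: imok"
    else
        echo "$node: no valid reply"
    fi
done
```

A node that is down, unreachable, or in an error state is reported as "no valid reply" instead of aborting the loop.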

6.2. To confirm that ClickHouse-Server can reach the Keeper cluster, query the system.zookeeper table
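One way to run such a query from the shell is sketched below. It assumes clickhouse-client is available on the ClickHouse-Server host and that the server's zookeeper section already points at Keeper's tcp_port. The query text itself is an example chosen for illustration, not the one from the original article; system.zookeeper requires a condition on path, and '/' lists the root znodes.

```shell
# system.zookeeper must be filtered by path; '/' lists the root znodes.
QUERY="SELECT name, value FROM system.zookeeper WHERE path = '/'"

if command -v clickhouse-client >/dev/null 2>&1; then
    # Print the znodes, or a hint if the server cannot reach Keeper.
    clickhouse-client --query "$QUERY" \
        || echo "query failed: check the server's Keeper connection settings" >&2
else
    echo "clickhouse-client not found; run this on a ClickHouse-Server host" >&2
fi
```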

Note: If the query returns znode information rather than an error, the installation is successful

7. How to migrate ZooKeeper to ClickHouse-Keeper

Since the advantages of ClickHouse-Keeper are so clear, how can the data in ZooKeeper be migrated to ClickHouse-Keeper? The official migration tool, ClickHouse-Keeper-Converter, dumps the data in ZooKeeper into a snapshot that ClickHouse-Keeper can load.

The migration steps are as follows:

  • Stop all ZooKeeper nodes
  • Find the ZooKeeper leader node, then start it and stop it again (this forces the leader to generate a consistent snapshot)
  • Run ClickHouse-Keeper-Converter to generate the Keeper snapshot file
  • Start ClickHouse-Keeper so that it loads the snapshot from the previous step

Command reference for the converter step:

clickhouse-keeper-converter --zookeeper-logs-dir /var/lib/zookeeper/version-2 --zookeeper-snapshots-dir /var/lib/zookeeper/version-2 --output-dir /path/to/clickhouse/keeper/snapshots
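The steps above can be strung together as a dry-run script. This is only a sketch under stated assumptions: ZooKeeper is assumed to run as a systemd unit named zookeeper, the run and migrate_zookeeper_to_keeper helpers are hypothetical, and the paths are the placeholders used in this article. The extra start and stop of the former leader follows the recommendation in the official documentation. By default the script only prints the commands; it executes them only when DRY_RUN=0 is set.

```shell
# Placeholder paths from this article; override them through the environment.
ZK_DATA_DIR="${ZK_DATA_DIR:-/var/lib/zookeeper/version-2}"
KEEPER_SNAPSHOT_DIR="${KEEPER_SNAPSHOT_DIR:-/path/to/clickhouse/keeper/snapshots}"
KEEPER_CONFIG="${KEEPER_CONFIG:-/etc/your_path_to_config/config.xml}"

# Print the command in dry-run mode (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

migrate_zookeeper_to_keeper() {
    # 1. Stop ZooKeeper (repeat on every node; the unit name is an assumption).
    run systemctl stop zookeeper
    # 2. On the former leader only: start and stop once more to force a snapshot.
    run systemctl start zookeeper
    run systemctl stop zookeeper
    # 3. Convert ZooKeeper logs and snapshots into a Keeper snapshot.
    run clickhouse-keeper-converter \
        --zookeeper-logs-dir "$ZK_DATA_DIR" \
        --zookeeper-snapshots-dir "$ZK_DATA_DIR" \
        --output-dir "$KEEPER_SNAPSHOT_DIR"
    # 4. Start Keeper so that it loads the converted snapshot.
    run clickhouse-keeper --config "$KEEPER_CONFIG"
}

migrate_zookeeper_to_keeper
```

The output directory should be the directory configured as snapshot_storage_path, so that Keeper finds the converted snapshot on startup.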

References: https://clickhouse.com/docs/en/operations/clickhouse-keeper/


Aikesheng Open Source Community

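Founded in 2017, the community takes as its mission open-sourcing high-quality operations tools, regularly sharing practical technical content, and holding ongoing nationwide community events. Its current open source products are the SQL audit tool SQLE, the distributed middleware DBLE, and the data transfer component DTLE.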