ClickHouse logical cluster introduction

What is a logical cluster

As an OLAP database, ClickHouse naturally supports clustering. What we usually call a ClickHouse cluster is a physical cluster: every node in the cluster is managed by the same ZooKeeper cluster, and DDL operations take effect across the entire cluster.

In contrast to physical clusters, there is another kind of cluster, which we call a logical cluster. It is formed by grouping physical clusters that have no necessary physical relationship with one another. The following picture illustrates the relationship between physical clusters and logical clusters.

As shown in the figure above, the three physical clusters cluster1, cluster2 and cluster3 are independent of each other and cannot perceive one another's data operations. For the logical cluster, however, the data changes of any physical cluster can be obtained by querying the logical cluster.

Why do you need logical clusters

So, when you have physical clusters, why do you need logical clusters? What pain points can it solve for users? What benefits can it bring?

First, logical clusters relieve the pressure on ZooKeeper

We know that data consistency in a ClickHouse cluster is guaranteed by the ZooKeeper cluster. When the amount of data in the cluster is very large, or operations are very frequent, ZooKeeper ends up holding a large number of znodes that are updated frequently. The load ZooKeeper can bear has an upper limit: if the cluster is too large, or a single ZooKeeper cluster manages too many ClickHouse clusters, ZooKeeper can easily crash or hang.

It is precisely because "the world has suffered from ZooKeeper for a long time" that logical clusters arrive at just the right moment to solve this problem.

For example, suppose the amount of data in a physical cluster is so large that a single ZooKeeper cluster cannot handle its metadata management. We can then split that physical cluster into several physical clusters, manage each with a different ZooKeeper cluster, and query the data through a logical cluster. This spreads the load on ZooKeeper while preserving the business relationship between the physical clusters, so queries on business data are unaffected.

Second, logical clusters solve the multi-data center problem

Another common scenario is multiple data centers. As an enterprise's data grows, it becomes impossible to store all of it in one data center. If only one physical cluster is built across several data centers, the first problem is network latency and the second is high traffic cost. Both need to be considered.

It is therefore better to establish a separate physical cluster in each data center, so that both data insertion and queries depend far less on the network, which minimizes latency.

But consider the following scenario: a bank has one data center in Shanghai and one in Hefei, and has established two physical clusters called bench_shanghai and bench_hefei. Now the head office needs to combine the data of the two data centers for analysis. Without a logical cluster, we must query the Shanghai cluster and the Hefei cluster separately and then aggregate the results. With more data centers the number of queries grows accordingly, which is clearly inefficient.

If we build a logical cluster that includes the Shanghai cluster and the Hefei cluster, and call it the bench cluster, we only need to query the bench cluster to retrieve all the data, as shown in the following figure:

Third, logical clusters can save costs

The cost savings of logical clusters are real, chiefly in network traffic. Traffic is surprisingly expensive, especially for cloud servers. Imagine that the two data centers in Shanghai and Hefei share a single physical cluster, and that the replicas of a given shard happen to sit in different data centers. When several terabytes of data are written to the database every day, the traffic generated by synchronizing data between replicas is enormous.

What a logical cluster can do

We do not recommend performing mutation-related operations on logical clusters. Inserting or deleting data through a logical cluster is likewise not recommended. These operations should be done on the physical clusters.

In fact, logical clusters are inherently ill-suited to these operations, mainly because of the following layers of restrictions:

A logical cluster may contain multiple physical clusters, and these physical clusters are not necessarily on the same ZooKeeper. Therefore, whether creating a table or inserting data, consistency cannot be guaranteed.
Even if the physical clusters of a logical cluster all use the same ZooKeeper, it is still not possible to create a distributed table at the logical cluster level. Physical cluster A may use replicas, so its local table engine is ReplicatedMergeTree, while physical cluster B does not use replicas, so its local table engine is MergeTree. Even if the local table structures are exactly the same, a distributed table cannot be created on the logical cluster.
Taking one more step back: even if physical clusters A and B both use replicas, share the same table schema, and both use the ReplicatedMergeTree engine, creating distributed tables on the logical cluster is still an unavoidable problem. To prevent clusters from interfering with each other, we generally include the cluster name in the ZooKeeper path, as in the following:

 CREATE TABLE default.t1
(
    `@time` DateTime,
    `@item_guid` String,
    `@metric_name` LowCardinality(String),
    `@alg_name` LowCardinality(String)
)
ENGINE = ReplicatedReplacingMergeTree('/clickhouse/tables/{cluster}/{shard}/default/t1', '{replica}')
PARTITION BY toYYYYMMDD(`@time`)
ORDER BY (`@time`, `@item_guid`, `@metric_name`)
SETTINGS index_granularity = 8192

Therefore, the ZooKeeper paths of different physical clusters are actually different. A distributed table can be created with the ON CLUSTER xxx syntax mainly because the DDL statement is stored in a task queue in ZooKeeper. The task queue paths of different clusters are different, so the statements cannot be synchronized between them, and creating the distributed table will also fail.

Therefore, logical clusters exist mainly for querying. That is the only thing they can do.
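
To make the path difference concrete, here is a minimal Python sketch that expands the macros in the ZooKeeper path from the CREATE TABLE statement above. The shard value "1" is a hypothetical macro value chosen for illustration, and the expansion only imitates how the macros are substituted; it is not ClickHouse code.

```python
# Path template copied from the CREATE TABLE statement above.
ZK_PATH_TEMPLATE = "/clickhouse/tables/{cluster}/{shard}/default/t1"


def expand_zk_path(template, macros):
    """Substitute {name} placeholders with the given macro values."""
    return template.format(**macros)


# Hypothetical macro values for the first shard of each physical cluster.
shanghai_path = expand_zk_path(
    ZK_PATH_TEMPLATE, {"cluster": "bench_shanghai", "shard": "1"}
)
hefei_path = expand_zk_path(
    ZK_PATH_TEMPLATE, {"cluster": "bench_hefei", "shard": "1"}
)

print(shanghai_path)
print(hefei_path)
```

Because the cluster name is part of the path, the same table on the two physical clusters never shares a znode, even when both clusters point at the same ZooKeeper.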

So, since logical clusters have so many restrictions on creating distributed tables (almost impossible to create successfully), why can we still use logical clusters for queries? Isn't that contradicting itself?

In fact, we can create distributed tables that span the logical cluster on each physical cluster. As long as the table schemas are the same, queries work. They are written like this:

 -- Create on the Shanghai cluster:
 CREATE TABLE default.dist_logic_t123 on cluster bench_shanghai as t123 ENGINE = Distributed('bench', 'default', 't123', rand());
 
 -- Create on the Hefei cluster:
 CREATE TABLE default.dist_logic_t123 on cluster bench_hefei as t123 ENGINE = Distributed('bench', 'default', 't123', rand());

In this way, whether we are inserting data into the Shanghai cluster or the Hefei cluster, we can query it from the dist_logic_t123 table.
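
When a logical cluster has many physical clusters, the statements above differ only in the ON CLUSTER name, so they can be generated. The following is a minimal Python sketch that builds them as strings; the function name and the dist_logic_ prefix convention are taken from the example above for illustration, and the local table is database-qualified here as an assumption.

```python
def dist_table_ddl(logical_cluster, physical_cluster, database, table):
    """Build the CREATE TABLE statement for one physical cluster.

    The distributed table reads through the logical cluster, but the DDL
    itself is issued ON CLUSTER the physical cluster.
    """
    dist_name = f"dist_logic_{table}"
    return (
        f"CREATE TABLE {database}.{dist_name} ON CLUSTER {physical_cluster} "
        f"AS {database}.{table} "
        f"ENGINE = Distributed('{logical_cluster}', '{database}', '{table}', rand())"
    )


physical_clusters = ["bench_shanghai", "bench_hefei"]
statements = [
    dist_table_ddl("bench", name, "default", "t123") for name in physical_clusters
]

for statement in statements:
    print(statement + ";")
```

Each statement still has to be executed on a node of the corresponding physical cluster.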

How to create a logical cluster

The way to create a logical cluster is similar to that of an ordinary physical cluster. You only need to add the relevant information of the logical cluster to the metrika.xml configuration file. As follows:

Shanghai cluster metrika.xml configuration file:

Hefei cluster metrika.xml configuration file:
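
As a rough illustration of the idea, the Python sketch below generates a remote-servers fragment in which one node sees both its own physical cluster and the bench logical cluster. Everything specific in it is an assumption made for the example: the host names, the shard layout, port 9000, and the clickhouse_remote_servers wrapper element. Real files usually carry more settings.

```python
import xml.etree.ElementTree as ET

# Hypothetical topology: physical cluster -> shards, each shard a list of replica hosts.
PHYSICAL = {
    "bench_shanghai": [["sh-node1", "sh-node2"]],
    "bench_hefei": [["hf-node1"], ["hf-node2"]],
}


def add_cluster(parent, name, shards, port=9000):
    """Append a <name> cluster element containing the given shards."""
    cluster = ET.SubElement(parent, name)
    for replicas in shards:
        shard = ET.SubElement(cluster, "shard")
        for host in replicas:
            replica = ET.SubElement(shard, "replica")
            ET.SubElement(replica, "host").text = host
            ET.SubElement(replica, "port").text = str(port)
    return cluster


def build_remote_servers(local_cluster, logical_name, physical):
    """Describe the local physical cluster plus the logical cluster."""
    root = ET.Element("clickhouse_remote_servers")  # assumed wrapper name
    add_cluster(root, local_cluster, physical[local_cluster])
    # The logical cluster is simply the union of all physical shards.
    all_shards = [shard for shards in physical.values() for shard in shards]
    add_cluster(root, logical_name, all_shards)
    return root


root = build_remote_servers("bench_shanghai", "bench", PHYSICAL)
ET.indent(root)  # pretty-printing helper, available from Python 3.9
print(ET.tostring(root, encoding="unicode"))
```

The Hefei nodes would get the same bench definition, with bench_hefei as the local cluster instead.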

Creating a logical cluster is not complicated. What is complicated is the subsequent operation and maintenance of the logical cluster.

Suppose we now want to add a data center in Chengdu. We need to create a physical cluster in Chengdu, add it to the bench logical cluster, and then synchronously modify the configuration files of all nodes in the Shanghai and Hefei clusters.

Similar operations include adding nodes, deleting nodes, and destroying clusters. Even a small change in one physical cluster requires refreshing the configuration files of every node in the logical cluster.

There is no way around this, and nothing in this world is perfect. If you want the convenience that logical clusters bring, you have to accept the tedious configuration work.

ckman support for logical clusters

ckman is an operation and maintenance tool independently developed by Qingchuang Technology to manage and monitor ClickHouse clusters. Its full name is "ClickHouse Manager Console". Through an intuitive visual interface, it can easily create, start, stop and upgrade clusters, and add or delete nodes. Operation and maintenance personnel do not need to pay attention to the details of ClickHouse configuration file changes, because ckman handles all of them. It also provides monitoring at the data table, session and node level, so the usage of the cluster can be observed visually.

In the latest version, ckman2.0beta, ckman has added support for logical clusters.

As mentioned above, operation and maintenance actions on logical clusters, such as adding or deleting nodes and adding or destroying physical clusters, require the configuration of all nodes in the logical cluster to be refreshed synchronously. This work is time-consuming, labor-intensive and error-prone, and if it is done manually, efficiency is hard to guarantee. ckman solves exactly this problem. We only need a simple click in the interface, and ckman completes all the configuration file refreshes, which greatly simplifies the procedure.

Create a logical cluster

Using ckman to create a logical cluster is very simple, just enter the name of the logical cluster in the "logical name" input box.

After the logical cluster is successfully created, it will be displayed in the first cluster list.

At the same time, this information will be persisted to the hard disk. In the conf/clusters.json file, there are the following options:

 "logic_clusters": {
    "bench": [
      "bench_shanghai",
      "bench_hefei"
    ]
  }

This means that the bench logical cluster consists of the bench_shanghai and bench_hefei physical clusters, and the nodes of the logical cluster are all the nodes of its physical clusters.
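
The relationship between a logical cluster and its nodes can be sketched in a few lines of Python. Only the logic_clusters fragment comes from the file shown above; the host lists per physical cluster are hypothetical, since the rest of clusters.json is not shown here.

```python
import json

# The logic_clusters fragment shown above, wrapped into a complete JSON object.
CLUSTERS_JSON = """
{
  "logic_clusters": {
    "bench": ["bench_shanghai", "bench_hefei"]
  }
}
"""

# Hypothetical hosts of each physical cluster, for illustration only.
PHYSICAL_HOSTS = {
    "bench_shanghai": ["sh-node1", "sh-node2"],
    "bench_hefei": ["hf-node1", "hf-node2"],
}


def logical_cluster_nodes(config, physical_hosts, logical_name):
    """Return every node of every physical cluster in the logical cluster."""
    nodes = []
    for physical_name in config["logic_clusters"][logical_name]:
        nodes.extend(physical_hosts[physical_name])
    return nodes


config = json.loads(CLUSTERS_JSON)
bench_nodes = logical_cluster_nodes(config, PHYSICAL_HOSTS, "bench")
print(bench_nodes)
```

This is also the set of nodes whose configuration has to be refreshed whenever any member of the logical cluster changes.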

Other operation and maintenance operations

ckman supports operations such as adding nodes, deleting nodes, upgrading clusters, and destroying clusters in logical clusters. Although these operations target physical clusters, if a logical cluster is set, the metrika.xml configuration of all nodes in the logical cluster is refreshed synchronously, which keeps the nodes of the logical cluster in sync in real time.

Create distributed tables across logical clusters

As mentioned earlier, we can create a distributed table that spans the logical cluster in each physical cluster. When the logical cluster contains many physical clusters, however, repeating that table creation is still cumbersome. ckman therefore provides an API for creating such distributed tables, which is a POST interface.

URL : /api/v1/ck/dist_logic_table/{clusterName}
METHOD : POST
BODY :

 {
    "database":"default",
    "table_name":"t123"
}

This interface will create a distributed table across the logical cluster on each physical cluster of the logical cluster. Using the distributed table, data query can be performed on the logical cluster.
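
A call to this interface could be prepared as in the following Python sketch, which uses only the standard library. The base URL http://localhost:8808 and the cluster name bench are assumptions for illustration, and any authentication that a ckman deployment requires is left out. The sketch builds the request without sending it.

```python
import json
import urllib.request


def build_dist_logic_table_request(base_url, cluster_name, database, table_name):
    """Prepare the POST request described above without sending it."""
    url = f"{base_url}/api/v1/ck/dist_logic_table/{cluster_name}"
    body = json.dumps({"database": database, "table_name": table_name})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical ckman address and cluster name.
request = build_dist_logic_table_request(
    "http://localhost:8808", "bench", "default", "t123"
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually send it, pass the request to urllib.request.urlopen(request).
```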

Final words

This article has briefly introduced how logical clusters can be used. Undeniably, logical clusters solve real pain points for many customers, but they also bring considerable operation and maintenance overhead. Tools like ckman can help, but some advanced setups, such as a parent logical cluster composed of multiple logical clusters, are not supported by ckman. Still, as the ecosystem around ClickHouse matures, I believe more excellent ways of using clusters will emerge. Maybe the next bright idea will be yours. Who says it won't?

禹鼎侯
