Author: Da Rui
Darui, DBA, likes to lose weight, lost more than 30 pounds, and is responsible for the management and maintenance of the company's database cluster.
Source of this article: original contribution
*The original content is produced by the open source community of Aikesheng, and the original content shall not be used without authorization. For reprinting, please contact the editor and indicate the source.
The materialized view is written to the storage table using the method of to
, which is as follows:
CREATE MATERIALIZED VIEW
[IF NOT EXISTS]
[db.]table_name
[ON CLUSTER] TO
[db.]nameAS
SELECT ...
The storage table is specified, so the creation of the materialized view does not need to specify the engine. In the query, the materialized view and the actual storage table get the same data, because they all come from the same storage data.
The materialized view calculates the data written to the original table each time, and writes it to the target table after aggregation. For example, if there is a detailed table recorded at a time of 1s, and at the same time, the aggregated statistics of data need to be aggregated at the minute level (similar needs), the aggregated data can be written to the 1min table by creating a materialized view (this kind of Feels a bit like a trigger)
example
Schedule of 1s records
CREATE TABLE dba_test.t_1s
(
`ctime` DateTime64(0),
`pv` Int64
)
ENGINE = MergeTree
PARTITION BY toDate(ctime)
ORDER BY ctime
SETTINGS index_granularity = 8192
Aggregate data recorded in 1min
CREATE TABLE dba_test.t_1m
(
`ctime` DateTime64(0),
`pv` Int64
)
ENGINE = SummingMergeTree
PARTITION BY toDate(ctime)
ORDER BY ctime
SETTINGS index_granularity = 8192
Materialized view t_1m_mv , the query condition is to aggregate the query results from the 1s table (t_1s) according to the minute level (toStartOfMinute), and rewrite them to the 1min table (t_1m)
materialized view
CREATE MATERIALIZED VIEW dba_test.t_1m_mv TO dba_test.t_1m
(
`toStartOfMinute(ctime)` DateTime,
`pv` Int64
) AS
SELECT
toStartOfMinute(ctime),
sum(pv) AS pv
FROM dba_test.t_1s
GROUP BY ctime
write test
dba-clickhouse-001 :) insert into t_1s values('2022-01-01 00:10:01',1),('2022-01-01 00:10:01',1),('2022-01-01 00:20:01',2),('2022-01-01 00:20:01',2),('2022-01-01 00:30:01',3);
INSERT INTO t_1s VALUES
Query id: 0bf16844-0123-4e25-a3d4-f9b5a5c8db37
Ok.
5 rows in set. Elapsed: 0.003 sec.
dba-clickhouse-001 :) select * from t_1s;
SELECT *
FROM t_1s
Query id: cb442100-37a6-4de7-b6f3-f80f084710dc
┌───────────────ctime─┬─pv─┐
│ 2022-01-01 00:10:01 │ 1 │
│ 2022-01-01 00:10:01 │ 1 │
│ 2022-01-01 00:20:01 │ 2 │
│ 2022-01-01 00:20:01 │ 2 │
│ 2022-01-01 00:30:01 │ 3 │
└─────────────────────┴────┘
5 rows in set. Elapsed: 0.002 sec.
dba-clickhouse-001 :) select * from t_1m;
SELECT *
FROM t_1m
Query id: f9d2d05d-8ad7-44a4-b66a-ea8c3c758f1f
┌───────────────ctime─┬─pv─┐
│ 1970-01-01 08:00:00 │ 9 │
└─────────────────────┴────┘
1 rows in set. Elapsed: 0.002 sec.
The insertion time turned out to be 1970-01-01 08:00:00
Start to verify whether the query statement is wrong
Check that the query results in the materialized view are as expected
dba-clickhouse-001 :) SELECT
:-] toStartOfMinute(ctime),
:-] sum(pv) AS pv
:-] FROM dba_test.t_1s
:-] GROUP BY ctime;
SELECT
toStartOfMinute(ctime),
sum(pv) AS pv
FROM dba_test.t_1s
GROUP BY ctime
Query id: 1ecaf07e-c766-40b7-bfa2-0f87ee54abad
┌─toStartOfMinute(ctime)─┬─pv─┐
│ 2022-01-01 00:20:00 │ 4 │
│ 2022-01-01 00:30:00 │ 3 │
│ 2022-01-01 00:10:00 │ 2 │
└────────────────────────┴────┘
3 rows in set. Elapsed: 0.002 sec.
The query results are as expected
Directly through the insert ...select...
method to confirm whether the inserted data is as expected
dba-clickhouse-001 :) insert into t_1m SELECT
:-] toStartOfMinute(ctime),
:-] sum(pv) AS pv
:-] FROM dba_test.t_1s
:-] GROUP BY ctime;
INSERT INTO t_1m SELECT
toStartOfMinute(ctime),
sum(pv) AS pv
FROM dba_test.t_1s
GROUP BY ctime
Query id: 5db8279a-ffb1-4174-843c-80cee48b448c
Ok.
0 rows in set. Elapsed: 0.002 sec.
dba-clickhouse-001 :) select * from t_1m;
SELECT *
FROM t_1m
Query id: acd79ea7-dc82-49f1-bb71-430a05895f19
┌───────────────ctime─┬─pv─┐
│ 1970-01-01 08:00:00 │ 9 │
└─────────────────────┴────┘
┌───────────────ctime─┬─pv─┐
│ 2022-01-01 00:10:00 │ 2 │
│ 2022-01-01 00:20:00 │ 4 │
│ 2022-01-01 00:30:00 │ 3 │
└─────────────────────┴────┘
4 rows in set. Elapsed: 0.002 sec.
Directly inserted, the data is correct, the time is not converted.
It can be confirmed that there is no problem with the query part of the materialized view, it can only be a problem when writing The time is 1970-01-01 08:00:00 more than 8 hours, which means that the time is delayed due to the time zone. Could it be because the inserted data is not standardized, or the "empty" has been converted.
verify
dba-clickhouse-001 :) insert into t_1m values('',100);
INSERT INTO t_1m VALUES
Query id: af1785ef-dca1-467b-84c6-27f9da6547f6
Ok.
1 rows in set. Elapsed: 0.002 sec.
dba-clickhouse-001 :) select * from t_1m;
SELECT *
FROM t_1m
Query id: 34db2057-7274-4859-898e-6132f8df4465
┌───────────────ctime─┬─pv─┐
│ 1970-01-01 08:00:00 │ 9 │
└─────────────────────┴────┘
┌───────────────ctime─┬─pv─┐
│ 2022-01-01 00:10:00 │ 2 │
│ 2022-01-01 00:20:00 │ 4 │
│ 2022-01-01 00:30:00 │ 3 │
└─────────────────────┴────┘
┌───────────────ctime─┬──pv─┐
│ 1970-01-01 08:00:00 │ 100 │
└─────────────────────┴─────┘
5 rows in set. Elapsed: 0.002 sec.
Sure enough, when the inserted data is empty, the time is reset.
Compare the structure of the materialized view and the aggregate table of the target
dba-clickhouse-001 :) desc t_1m;
DESCRIBE TABLE t_1m
Query id: 96c6a5ca-e42a-47e1-8212-cbcfefa6ffa4
┌─name──┬─type──────────┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┬─ttl_expression─┐
│ ctime │ DateTime64(0) │ │ │ │ │ │
│ pv │ Int64 │ │ │ │ │ │
└───────┴───────────────┴──────────────┴────────────────────┴─────────┴──────────────────┴────────────────┘
2 rows in set. Elapsed: 0.001 sec.
dba-clickhouse-001 :) desc t_1m_mv;
DESCRIBE TABLE t_1m_mv
Query id: a258f6b5-f195-4386-a9bb-4ec86e7e9bd1
┌─name───────────────────┬─type─────┬─default_type─┬─default_expression─┬─comment─┬─codec_expression─┬─ttl_expression─┐
│ toStartOfMinute(ctime) │ DateTime │ │ │ │ │ │
│ pv │ Int64 │ │ │ │ │ │
└────────────────────────┴──────────┴──────────────┴────────────────────┴─────────┴──────────────────┴────────────────┘
2 rows in set. Elapsed: 0.001 sec.
The time field of the aggregate table is called ctime , and the materialized view is toStartOfMinute(ctime)
Readjust the writing of the materialized view and clean up the data in the t_1m table
dba-clickhouse-001 :) show create table t_1m_mv\G
statement: CREATE MATERIALIZED VIEW dba_test.t_1m_mv TO dba_test.t_1m
(
`ctime` DateTime,
`pv` Int64
) AS
SELECT
toStartOfTenMinutes(ctime) AS ctime,
sum(pv) AS pv
FROM dba_test.t_1s
GROUP BY ctime
dba-clickhouse-001 :) insert into t_1s values('2022-01-01 00:10:01',1),('2022-01-01 00:10:01',1),('2022-01-01 00:20:01',2),('2022-01-01 00:20:01',2),('2022-01-01 00:30:01',3);
INSERT INTO t_1s VALUES
Query id: 812d1bbd-55f3-4a8f-b9f7-bbbe93e694af
Ok.
5 rows in set. Elapsed: 0.003 sec.
dba-clickhouse-001 :) select * from t_1m;
SELECT *
FROM t_1m
Query id: 2d1a045a-4e53-4f94-bb6a-fe5e5d58f5c7
┌───────────────ctime─┬─pv─┐
│ 2022-01-01 00:10:00 │ 2 │
│ 2022-01-01 00:20:00 │ 4 │
│ 2022-01-01 00:30:00 │ 3 │
└─────────────────────┴────┘
3 rows in set. Elapsed: 0.002 sec.
in conclusion
The field (t_1m_mv) of the materialized view should be aligned with the field name of the target table (t_1m)
(The expression is not very strict, probably the meaning above)
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。