Editor's note


This article was written by Yu Rong and Wang Qingshuai of East China Normal University, and is the sixth article in the OceanBase academic series.

" Yu Rong, a master student at the School of Data Science and Engineering, East China Normal University. During his postgraduate period, he was engaged in technical research on evaluation of HTAP database systems in the DBHammer group of the School of Data. He is committed to defining new benchmarks and developing open source tools to serve the fairness and impartiality of HTAP databases. , Efficient evaluation work.”

" Wang Qingshuai, a doctoral student at the School of Data Science and Engineering, East China Normal University. During the doctoral period, he was devoted to the technical research of database systems in the DBHammer group of the School of Data, East China Normal University. Currently, he is working on load generation for OLAP applications, distributed, HTAP and other new databases. Some progress has been made in the evaluation direction of the database, and we will continue to explore optimization and evaluation technologies for new databases.”

The topic shared today is "Research Progress on Benchmark Evaluation Tools for HTAP Databases", which surveys the representative HTAP evaluation benchmarks of recent years, aiming to serve the evaluation needs of HTAP database systems. I hope that after reading this article you gain new insights into the topic; if you hold different views, please leave a comment below to discuss.

With the growing demand for online real-time analysis, HTAP (Hybrid Transactional and Analytical Processing) databases have emerged; they can efficiently process OLTP and OLAP workloads in the same system and provide the ability to analyze fresh data. In recent years a variety of HTAP database architectures have been proposed by industry and academia, so how to evaluate these new HTAP databases has attracted extensive attention from both.

This article mainly discusses benchmarking tools for HTAP databases and their research progress. As a distributed HTAP database system extended from an OLTP database system, OceanBase provides two resource-isolation schemes: when the OLAP share of the workload is low, analytical tasks run on the primary replica to read real-time data; when the OLAP share is high, analytical tasks run on read replicas to achieve explicit physical isolation. OceanBase therefore has the opportunity to make a better trade-off between isolation and performance, that is, it may support HTAP workloads better. In the future we will also release OceanBase's test report on HTAP workload support.

Introduction

The difficulty of implementing an HTAP database lies in TP/AP resource isolation and data synchronization. Therefore, in addition to evaluating TP performance under highly concurrent load and AP performance on complex queries, existing HTAP benchmarks pay particular attention to two issues:

Hybrid workload generation: generate TP and AP workloads, and control the intersection of the data accessed by the two.

Workload metrics: quantitatively evaluate the isolation of the mixed workload, that is, the degree of mutual interference.
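To make the first point concrete, here is a minimal, hypothetical sketch (not taken from any of the benchmarks discussed below; all names and the key-space layout are our own) of how a generator might control the data-access intersection between TP writes and AP scans with a single overlap parameter:

```python
import random

def gen_hybrid_workload(n_ops, overlap, n_keys=1000, seed=42):
    """Generate an interleaved TP/AP operation stream over a shared key space.

    `overlap` in [0, 1] controls how much of the AP scan range intersects
    the keys that TP transactions modify: 0 means the two loads touch
    disjoint halves of the key space, 1 means AP scans exactly the
    TP-modified half.
    """
    rng = random.Random(seed)
    tp_lo, tp_hi = 0, n_keys // 2                # TP writes land here
    ap_width = n_keys // 2
    ap_lo = int((1 - overlap) * (n_keys // 2))   # slide AP window toward TP range
    ops = []
    for i in range(n_ops):
        if i % 2 == 0:                           # alternate TP and AP operations
            ops.append(("tp_write", rng.randrange(tp_lo, tp_hi)))
        else:
            ops.append(("ap_scan", (ap_lo, ap_lo + ap_width)))
    return ops

def access_intersection(ops):
    """Fraction of TP-written keys that fall inside the AP scan range."""
    scans = [r for kind, r in ops if kind == "ap_scan"]
    writes = [k for kind, k in ops if kind == "tp_write"]
    lo, hi = scans[0]
    return sum(lo <= k < hi for k in writes) / len(writes)
```

With `overlap=0` the two loads never collide, so read-write conflicts and interference are minimized; with `overlap=1` every TP write lands inside the AP scan range.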

The current mainstream HTAP evaluation benchmarks (tools) are CH-benCHmark (2011) [1], HTAPBench (2017) [2], OLxPBench (2022) [3], and HATtrick (2022) [4]. The rest of this article analyzes and summarizes these four works in terms of table schema and workload generation, test method, control method, and test metrics.

1. CH-benCHmark

Proposed in 2011, CH-benCHmark is the first formally proposed benchmark for mixed workloads, defined on top of standard OLTP and OLAP benchmarks.

Figure 1 TP load and AP load operation mode of CH-benCHmark[1]

Table schema and workload: a simple stitching of the TPC-C and TPC-H schemas. The transactional workload uses TPC-C's 5 transactions, and the analytical workload uses TPC-H's 22 queries. However, the data accessed by AP scans and the data modified by TP transactions do not coincide well in this design: the smaller the access intersection, the lower the probability of read-write conflicts, and the lower the resource interference between the two load types.

Test method: run TPC-C transactions in the specified proportions together with the 22 modified TPC-H-like queries; the execution mode is shown in Figure 1. During the test, the number of OLAP streams, the initial mix of OLTP transactions, and the number of clients are specified, and the TP and AP capabilities under that mix are measured. To compare the mutual interference between the two load types, at least three runs are required: a pure AP run without TP streams, a pure TP run without AP streams, and a mixed run with the specified numbers of TP and AP streams. Isolation and interference are then analyzed manually from the results.
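The three-run comparison reduces to a simple calculation. The sketch below (illustrative only; the throughput numbers are made up) expresses each load type's interference as its relative slowdown in the mixed run versus its isolated baseline:

```python
def interference(pure_tp_tpmc, pure_ap_qphh, mixed_tpmc, mixed_qphh):
    """Relative throughput loss of each load type in the mixed run,
    compared with its isolated baseline (0.0 means no interference)."""
    tp_loss = 1 - mixed_tpmc / pure_tp_tpmc
    ap_loss = 1 - mixed_qphh / pure_ap_qphh
    return tp_loss, ap_loss

# Example: TP drops from 100,000 to 80,000 tpmC (20% loss),
# AP drops from 3,600 to 1,800 QphH (50% loss) under the mix.
tp_loss, ap_loss = interference(100_000, 3_600, 80_000, 1_800)
```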

Test metrics: CH-benCHmark was the first to adopt

$$ tpmC/QphH@tpmC $$

and

$$ tpmC/QphH@QphH $$

Although these metrics are objective, they are not suitable for horizontal comparison across databases; they are better suited to presenting the performance of a single database.

2. HTAPBench

HTAPBench [2], proposed in 2017, was the first to propose an evaluation process anchored on TP throughput.

Table schema and workload: the same as those used by CH-benCHmark.

Test method: HTAPBench specifies a lower bound on the OLTP throughput that the application can tolerate, starts enough TP threads to meet that initial bound, and then decides whether to add OLAP streams based on real-time feedback on TP throughput during execution, so as to find the maximum OLAP capacity achievable under the guaranteed TP throughput; the execution mode is shown in Figure 2. This method already accounts for TP/AP mutual interference and requires only a single run, which is relatively simple.
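The feedback loop can be sketched as below. `run_mixed` is a stand-in for actually executing the mixed workload with a given number of OLAP streams and measuring tpmC, and the step-by-one search is a simplification of HTAPBench's real admission logic:

```python
def max_olap_streams(run_mixed, tpmc_floor, max_streams=64):
    """Add OLAP streams one at a time while the measured TP throughput
    stays at or above the tolerated floor; return the largest stream
    count that still met the floor."""
    best = 0
    for n in range(1, max_streams + 1):
        if run_mixed(n) >= tpmc_floor:
            best = n
        else:
            break
    return best
```

For example, under a toy cost model where each extra OLAP stream costs 5,000 tpmC from a 100,000 tpmC baseline, a floor of 70,000 tpmC admits 6 streams.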

Test metrics: HTAPBench uses

$$ QphH/(OLAPworkers)@tpmC $$

to compare per-worker analytical performance.

Distribution control: HTAPBench proposes how to control the complexity of analytical queries and their access patterns, with the main purpose of steering AP queries toward the data being generated by TP. It also proposes a density-estimation method to determine the current data distribution of the database, so that analytical queries can be generated dynamically according to the database's current state.
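As an illustration of such state-aware query generation (a simplified quantile-based stand-in, not HTAPBench's actual density-estimation technique), one can sample the current values of a column and derive range-predicate bounds that hit a target selectivity:

```python
def predicate_bounds(column_values, target_selectivity):
    """Pick [lo, hi] bounds for a range predicate whose estimated
    selectivity matches the target, using empirical quantiles of a
    sample of the column's current values."""
    vals = sorted(column_values)
    n = len(vals)
    span = int(target_selectivity * n)   # number of sampled rows to cover
    lo_idx = (n - span) // 2             # center the window in the distribution
    return vals[lo_idx], vals[min(lo_idx + span, n - 1)]
```

Because the bounds are recomputed from the live data, the query keeps the same estimated cost even as TP transactions shift the distribution.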

Figure 2 Schematic diagram of HTAPBench operation mode [2]

3. OLxPBench

OLxPBench [3] is an HTAP benchmark and evaluation tool developed by the Institute of Computing Technology, Chinese Academy of Sciences (the tool's architecture is shown in Figure 3). Analyzing the task of evaluating HTAP databases, its authors conclude that the workload should satisfy three properties: real-time queries, semantic consistency, and domain specificity. Semantic consistency requires that all data modified by TP be accessible to AP; real-time queries, alongside batch queries, simulate user behavior and customer decision-making requirements. The paper points out that CH-benCHmark and HTAPBench simply stitch together existing benchmarks, and that the TPC-H queries fail to truly expose the interference between TP and AP.

Table schema and workload: OLxPBench designs three workloads, for general scenarios (Subenchmark), financial scenarios (Fibenchmark), and telecom scenarios (Tabenchmark), including query logic designed for real-time queries and semantic consistency. As the general workload, Subenchmark derives its table schema from TPC-C and uses 5 transactions + 9 analytical queries + 5 hybrid transactions; Fibenchmark derives its schema from SmallBank and uses 6 transactions + 4 analytical queries + 6 hybrid transactions; Tabenchmark derives its schema from TATP and uses 7 transactions + 5 analytical queries + 6 hybrid transactions.

Test method: Same as HTAPBench.

Test metrics: combining HTAPBench and CH-benCHmark, OLxPBench uses

$$ QphH/(OLAPworkers)@tpmC $$

to present its results.

Figure 3 OLxPBench architecture [3]

4. HATtrick

HATtrick [4] is an HTAP benchmark proposed by the University of Wisconsin in 2022. It argues that isolation between different task types and access to fresh data are the main challenges in implementing HTAP databases.

Table schema and workload: HATtrick extends the SSB schema, adding a history table, a freshness-recording table, and some fields. There are two workload types: the transactional workload, inspired by TPC-C, uses three self-designed transactions (order, payment, and order-count transactions), while the analytical workload uses 13 adjusted SSB queries.

Test method: given the numbers of TP and AP clients, the transactions and the 13 SSB queries are executed simultaneously. Queries are executed continuously in batches, with the query order within each batch randomized.

Test metrics: two new metrics are proposed, for isolation and freshness. First, the concept of a throughput frontier is introduced to evaluate isolation through two-dimensional visualization, as shown in Figure 4. In the per-client-count plot, the more parallel the frontier runs to the coordinate axes as the number of clients changes, the better the isolation. In the composite plot, which varies with the scale factor, a frontier above the proportional line indicates isolation that improves the farther the frontier lies above it; a frontier close to the proportional line indicates a trade-off between transactional and analytical throughput; and a frontier below the proportional line, bending toward the axes, indicates stronger interference between the two loads and fiercer competition for resources. Second, a metric function is given for freshness (the time gap between the version a query reads and the first TP version invisible to it):

$$ f_{A_q} = \max\left(0,\ t^{s}_{A_q} - t^{f_{ns}}_{A_q}\right) $$
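In code, the freshness metric is a one-liner. Here `t_query_start` stands for the timestamp of the snapshot the analytical query reads from, and `t_first_nonvisible` for the commit time of the earliest TP version it cannot see (the parameter names are ours, following the definition above):

```python
def freshness(t_query_start, t_first_nonvisible):
    """Time by which the analytical query's snapshot lags the first
    TP-written version it cannot see; 0.0 means fully fresh."""
    return max(0.0, t_query_start - t_first_nonvisible)
```

A query starting at t=10.0s that cannot see a version committed at t=7.5s has a freshness lag of 2.5s; if every committed version is visible, the lag is 0.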

Figure 4 Schematic diagram of each curve[4]

Conclusion

Our survey shows that, apart from the early CH-benCHmark, the three more recent benchmarks all explicitly require a guaranteed OLTP throughput during evaluation. In an HTAP database system, OLTP and OLAP access the "same data", and transaction processing capacity is likely to be affected by synchronization (freshness); how to balance resource sharing against resource isolation [5] is a hard problem. As Yang Chuanhui said in "What does real HTAP mean to users and developers?", a real HTAP database system requires high-performance OLTP first, and then supports real-time analysis on the fresh data generated by OLTP [5].

Analyzing these typical HTAP evaluation benchmarks shows that each has its own characteristics in table schema and workload generation, test method, distribution control method, and test metrics, all designed to serve the evaluation of HTAP characteristics. Specifically, HTAPBench considers controlling computational cost, OLxPBench introduces real-time queries, and HATtrick proposes a freshness metric; all are worth our reference and study.


References:

[1] Cole R, Funke F, Giakoumakis L, et al. The mixed workload CH-benCHmark[C]//Proceedings of the Fourth International Workshop on Testing Database Systems. 2011: 1-6.

[2] Coelho F, Paulo J, Vilaça R, et al. Htapbench: Hybrid transactional and analytical processing benchmark[C]//Proceedings of the 8th ACM/SPEC on International Conference on Performance Engineering. 2017: 293-304.

[3] Kang G, Wang L, Gao W, et al. OLxPBench: Real-time, Semantically Consistent, and Domain-specific are Essential in Benchmarking, Designing, and Implementing HTAP Systems[J]. arXiv preprint arXiv:2203.16095, 2022.

[4] Milkai E, Chronis Y, Gaffney KP, et al. How Good is My HTAP System? [C]//Proceedings of the 2022 International Conference on Management of Data. 2022: 1810-1824.

[5] Yang Chuanhui. "What does real HTAP mean to users and developers?"

[6] Weakly consistent read, https://open.oceanbase.com/docs/observer-cn/V3.1.4/10000000000449449

