
TiDB 5.4, the first major release of 2022, includes many useful new features along with continued performance and stability improvements. This document focuses on the new experience and value that the major new features bring to users, organized into the following chapters:

Basic performance optimizations and improvements

Feature expansion for cloud environments

Product ease of use and operations support

New system configuration switches for utility functions

Major bug fixes and stability improvements

For more details, please refer to the Release Notes and the user manual.

Basic performance optimizations and improvements
TiDB 5.4 delivers several important performance improvements. Query plans can now use indexes on multiple columns for efficient condition filtering: the index merge query optimization is officially supported and can improve the performance of such queries by orders of magnitude, with stable response times and low system resource usage. For analytical scenarios with large data volumes and frequent reads, writes, and updates, optimizations in the TiFlash storage engine significantly reduce CPU usage and thereby indirectly improve the overall performance of concurrent queries. Finally, TiDB 5.4 significantly improves the performance of the TiDB Lightning import tool in large-scale data import scenarios, making large-scale import tasks easier and faster.

TiFlash storage layer greatly optimizes row-to-column transcoding efficiency
User Scenarios and Challenges
In HTAP platforms and applications, data updates and large table scans are intertwined, and the performance of the storage layer is a key factor for overall performance and stability. Heavy users always expect the system to perform better and carry more services. TiDB HTAP adopts an architecture that physically separates transaction processing from analytical queries: the same data is automatically converted in the background from row storage in TiKV to columnar storage in TiFlash, so that OLTP and OLAP workloads are served separately. TiDB 5.4 greatly optimizes this conversion from the TiKV row format to the TiFlash columnar format, substantially improving the CPU efficiency of the row-to-column step and freeing up more CPU resources for other work under high load.
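For context, this row-to-column replication is configured per table through TiFlash replicas. A minimal sketch (the table name t is hypothetical) of enabling a columnar replica and checking its progress:

-- Ask TiFlash to maintain one columnar replica of table t;
-- row data in TiKV is then converted to the columnar format in the background.
ALTER TABLE t SET TIFLASH REPLICA 1;

-- Check whether the columnar replica is available and how far replication has progressed.
SELECT TABLE_SCHEMA, TABLE_NAME, REPLICA_COUNT, AVAILABLE, PROGRESS
FROM information_schema.tiflash_replica
WHERE TABLE_NAME = 't';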

Solutions and Effects
TiDB 5.4 reorganizes the code structure of the row-to-column conversion module in TiFlash, greatly simplifying redundant data structures and judgment logic and adopting CPU-cache-friendly and vectorization-friendly data structures and algorithms, which substantially improves efficiency. In addition, the new version also tunes the default configuration of TiFlash's storage engine, DeltaTree, accordingly. The combined effect is verified below.

Columnar storage engine write performance: throughput improves by 60% to 90% at different concurrency levels

Test environment:

6 TiKV nodes (1 replica), 1 TiFlash node (1 replica), 40 logical CPU cores per node, CH-benCHmark (1,500 warehouses).


CH-benCHmark test results: the performance of some queries (Q1, Q6) at 10 concurrency improves by about 20%

Test environment:

3 TiKV nodes (3 replicas), 2 TiFlash nodes (2 replicas), 40 logical CPU cores per node, CH-benCHmark (1,500 warehouses).


Summary
Data conversion and synchronization between TiDB's row storage and columnar storage is commonly assumed to carry a large overhead. After this optimization work, however, even with the overall row-to-column conversion architecture largely unchanged, the conversion no longer shows an obvious performance bottleneck at runtime.

TiDB officially supports index merge query optimization
User Scenarios and Challenges
In the past, some queries logically needed to scan multiple columns at once, but when processing range scans, earlier versions of TiDB could only choose the index on a single column (or a single composite index). Even if every relevant column was indexed, overall performance was not ideal. TiDB 5.4 officially provides the index merge feature, which allows the optimizer to use indexes on multiple columns at the same time during query processing, reducing table lookups and achieving a filtering effect of one to two orders of magnitude. For queries that used to become performance bottlenecks, this helps greatly with both query latency and performance stability. In the example below, query response time is shortened about 30 times, from 600 milliseconds to 20 milliseconds, and query concurrency improves by an order of magnitude.

Solution
TiDB adds a fourth data read operator, IndexMergeReader, to the existing three:

TableReader

IndexReader

IndexLookUpReader

IndexMergeReader

The first two are straightforward: they read data directly from the main table or from an index. IndexLookUpReader (the table-lookup operator) first reads the index to obtain row_ids, and then fetches the corresponding rows from the main table by row_id.

IndexMergeReader executes in a similar way to IndexLookUpReader, except that it can use multiple indexes for the table lookup. In scenarios like the following, query performance can be greatly improved:

SELECT * FROM t1 WHERE c1 < 10 OR c2 < 100;
In the query above, even if both c1 and c2 are indexed, neither index can be used alone because the filter condition is an OR, so the query falls back to a full table scan. Index merge solves this: it uses the c1 and c2 indexes separately to obtain row_ids, performs a UNION on the two row_id sets, and fetches the actual rows from the main table using the result of the UNION, as sketched below.
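As a concrete sketch (the table t1 and the index names idx_c1 and idx_c2 are hypothetical), index merge can be controlled through the tidb_enable_index_merge system variable or forced with the USE_INDEX_MERGE optimizer hint, and EXPLAIN shows whether an IndexMerge plan is chosen:

CREATE TABLE t1 (id INT PRIMARY KEY, c1 INT, c2 INT, KEY idx_c1 (c1), KEY idx_c2 (c2));

-- Index merge can be toggled per session (it is enabled by default in TiDB 5.4).
SET SESSION tidb_enable_index_merge = ON;

-- An IndexMerge operator in the plan indicates that both idx_c1 and idx_c2 are used.
EXPLAIN SELECT * FROM t1 WHERE c1 < 10 OR c2 < 100;

-- The choice can also be forced explicitly with an optimizer hint.
EXPLAIN SELECT /*+ USE_INDEX_MERGE(t1, idx_c1, idx_c2) */ * FROM t1 WHERE c1 < 10 OR c2 < 100;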

Index Merge Advantage Scenarios
The data source is TPC-H SF1 (the lineitem table has 6 million rows). The following query fetches rows from lineitem where the price is 930, or where the orderkey is 10000 and the comment matches a specific string:

SELECT l_partkey, l_orderkey
FROM lineitem
WHERE ( l_extendedprice = 930
OR l_orderkey = 10000 AND Substring(Lpad(l_comment, 100, 'b'), 50, 1) = 'b' )
AND EXISTS(SELECT 1
FROM part
WHERE l_partkey = p_partkey);
When not using index merge, the execution time is 3.38s+


When using index merge, the execution time is 8.3ms:


Scenarios where index merge is not suitable
If the filter conditions have poor selectivity, a direct full table scan performs better than index merge:

SELECT l_partkey, l_orderkey
FROM lineitem
WHERE ( l_extendedprice >= 930
OR l_orderkey >= 10000 )
AND EXISTS(SELECT 1
FROM part
WHERE l_partkey = p_partkey);
Without index merge, the query above executes in only 2.34 s.

Forcing the optimizer to choose index merge with a hint increases the execution time to 19.79 s because of the extra table-lookup cost.


Scenarios with no obvious benefit
In the following scenarios, index merge does not bring obvious benefits. When the data volume is small, the data can be scanned and filtered quickly anyway, so index merge does not help much. When the filter conditions are cheap to evaluate (for example, only integer comparisons), index merge does not improve much over a full table scan even with relatively large data volumes (such as millions of rows).

In the following query, although the conditions are highly selective, the computation is lightweight (only integer comparisons) and is performed entirely in TiKV, so the performance difference between using index merge (26 ms) and not using it (30 ms) is not significant.

SELECT l_partkey, l_orderkey
FROM lineitem
WHERE ( l_extendedprice <= 930
OR l_orderkey = 10000 )
AND EXISTS(SELECT 1
FROM part
WHERE l_partkey = p_partkey);
In the index merge advantage scenario, most rows are already filtered out in TiKV, so the CPU and memory consumption is essentially negligible.

For the query that does not use index merge, the filter conditions cannot be pushed down, so all rows have to be passed to TiDB for the selection operation, which consumes a lot of memory. With 10 concurrent queries, TiDB needs 2 GB of memory to filter the 6 million rows.


Summary
When the filter conditions are highly selective, index merge is worth considering. In scenarios with large data volumes and selective conditions, it can shorten query response time by up to two orders of magnitude (400x in the example above), turning responses measured in seconds into milliseconds.
Index merge can also greatly reduce system resource consumption during concurrent queries. For queries whose filter conditions cannot be pushed down (and therefore consume a lot of resources), the effect can be dramatic: a query that originally consumed 2 GB of memory can, with index merge, consume an almost negligible amount.
TiDB Lightning adds an efficient duplicate data detection feature
User Scenarios and Challenges
In production environments, for historical reasons, the data in a user's sharded MySQL instances may contain duplicates. Duplicate primary keys or unique index values create conflicting data, so conflicts must be detected and handled when multiple MySQL shards are merged and imported into a downstream TiDB cluster. Single tables of up to tens of terabytes also pose a great challenge to detection efficiency.

Solution
TiDB Lightning is a tool for importing large-scale data from CSV/SQL files into a new TiDB cluster at high speed. Lightning's local backend mode encodes source data into ordered key-value pairs and inserts them directly into TiKV storage. Import efficiency is extremely high, and multiple hosts can import a single table in parallel, making it a common way to initialize data in a TiDB cluster.

In local backend mode, data import does not go through the transactional write interface, so conflicting data cannot be detected at insert time. In earlier versions, Lightning could only compare KV-level checksums, which revealed that an error had occurred but could not locate the conflicting data.

The new duplicate data detection feature enables Lightning to detect conflicting data accurately, and a conflict resolution algorithm is built in to handle it automatically. When conflicting data is detected, Lightning can save it so that users can filter it and insert it again.

Duplicate data detection is disabled by default; different handling strategies such as record or remove can be configured for different scenarios, as sketched below.
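For reference, a minimal Lightning configuration sketch (assuming the local backend; the directory path is a placeholder) that turns the feature on with the remove strategy:

[tikv-importer]
# Physical (local backend) import mode, which is required for duplicate detection.
backend = "local"
sorted-kv-dir = "/mnt/ssd/sorted-kv-dir"
# How to handle duplicates: "none" (default), "record", or "remove".
duplicate-resolution = "remove"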

For detailed performance verification data, please refer to the following table:

Summary
With Lightning's parallel import feature and duplicate data detection enabled, conflicting data in the data source can be located precisely. After several rounds of optimization, the detection accounts for about 20% of the total import time.

Feature expansion for cloud environments
TiDB 5.4 attaches great importance to integration with the cloud ecosystem. It introduces the Raft Engine log storage engine, which can greatly reduce storage costs, saving users more than 30% of data transfer costs in some scenarios while bringing additional performance benefits. Backup has also been enhanced: on top of the existing support for Amazon S3 and Google Cloud Storage, support for Azure Blob Storage has been added, completing integration with the storage services of the major global cloud providers.

Support using Raft Engine as TiKV's log storage engine
User Scenarios and Challenges
Cost is one of the biggest concerns for users in cloud environments, and the overhead of data storage and IO requests cannot be underestimated. Generally speaking, a distributed database must replicate and persist a large number of logs when processing user writes, which undoubtedly increases the cost of deploying the service. On the other hand, because hardware resources in cloud environments have fairly clear limits, when the business load fluctuates beyond what the provisioned hardware can satisfy, service quality is also noticeably affected.

Solution
TiDB 5.4 introduces an experimental feature: Raft Engine, a self-developed open source log storage engine. Compared with the default RocksDB engine, its biggest advantage is that it saves disk bandwidth. First, Raft Engine compresses the TiDB cluster's write-ahead logs and stores them directly in a file queue, without an additional full dump. In addition, Raft Engine has a more efficient garbage collection mechanism that exploits the spatial locality of expired data to clean it up in batches, so in most scenarios it does not occupy extra background resources.

These optimizations can greatly reduce the disk bandwidth usage of storage nodes in a TiDB cluster under the same load. This means that a cluster of the same size can serve higher peak business traffic, and a business of the same intensity can use fewer storage nodes to obtain a similar level of service.

In addition, as a self-developed engine, Raft Engine has a lighter execution path, which significantly reduces the tail latency of write requests in some scenarios.
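For reference, a minimal sketch of enabling it in the TiKV configuration (the directory path is a placeholder; the feature is experimental in 5.4):

[raft-engine]
# Use Raft Engine instead of RocksDB for TiKV's Raft log storage.
enable = true
# Optional: where Raft Engine stores its log files.
dir = "/path/to/raft-engine"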

In the experimental environment, under the load of TPC-C 5000 Warehouse, using Raft Engine can reduce the total write bandwidth of the cluster by about 30%, and improve the foreground throughput to a certain extent.

See the user manual for details.

Summary
Using Raft Engine to store cluster logs effectively reduces write bandwidth usage, saves cloud deployment costs, and improves service stability.

Support for Azure Blob Storage as a backup destination
Backup & Restore (BR) now supports Azure Blob Storage as a remote backup destination. Users who deploy TiDB in the Azure cloud can use this feature to easily back up cluster data to the Azure Blob Storage service, with both Azure AD and access-key authentication supported for backup and restore. Due to space limitations, please refer to the Azure Blob Storage backup support and external storage chapters of the user manual for details.

Product ease of use and operations support
TiDB 5.4 continues to improve product usability and operational efficiency in the following ways. To improve the quality of the execution plans generated by the optimizer, statistics collection and management have been enhanced: different collection configurations can be set for different tables, partitions, and indexes, and they are persisted by default so that subsequent statistics collection reuses the existing configuration. The data backup process sometimes affects normal business, a problem that has long troubled some users; in 5.4, BR adds automatic adjustment of backup threads, which significantly reduces the negative impact of backups. Mishandling the large amount of log data generated while TiDB is running affects performance and stability; version 5.4 provides the new experimental Raft Engine for storing logs, which significantly reduces write traffic and CPU usage, improves foreground throughput, and reduces tail latency. In addition, TiDB supports the GBK character set starting from 5.4.

Statistics collection configuration persistence
See the user manual for details.
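As a brief illustration (a sketch; the table and column names t, c1, and c2 are hypothetical), options passed to ANALYZE are persisted and reused by later statistics collection:

-- Persisting ANALYZE options is controlled by this variable (ON by default in 5.4).
SET GLOBAL tidb_persist_analyze_options = ON;

-- The column list and sample rate given here are remembered for this table.
ANALYZE TABLE t COLUMNS c1, c2 WITH 0.1 SAMPLERATE;

-- The persisted per-table options can be inspected in this system table.
SELECT * FROM mysql.analyze_options;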

Optimizing the impact of backups on the cluster
See the user manual for details.
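For reference, a minimal sketch of the TiKV-side configuration involved (assuming the backup.enable-auto-tune and backup.num-threads options; values are illustrative):

[backup]
# Automatically adjust the number of backup threads based on the current load.
enable-auto-tune = true
# Upper bound on backup worker threads.
num-threads = 4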

New system configuration switches for utility functions
TiDB 5.4 includes a lot of work on optimizing system configuration parameters. For complete information, please refer to the Release Notes and the related user manual; this article highlights only one representative improvement.

Support for setting bounded stale reads via session variables
TiDB is a multi-replica distributed database based on the Raft protocol. For high-concurrency, high-throughput business scenarios, follower nodes can be used to scale out read performance and build a read/write separation architecture. Followers provide the following two read modes for different business scenarios:

Strongly consistent read: guarantees that the data read from all nodes is consistent in real time, which suits business scenarios with strict consistency requirements. However, because of the replication delay between leaders and followers, strongly consistent reads have lower throughput and higher latency, and in cross-data-center architectures the latency problem is magnified further.

Stale read: reads possibly stale historical data from a snapshot, without a real-time consistency guarantee. This avoids the replication delay between leader and follower nodes and improves throughput. It suits business scenarios with lower requirements on real-time consistency or with infrequent data updates.

TiDB already supports enabling stale reads on followers by explicitly declaring read-only transactions or by using dedicated SQL statements. Both approaches support exact stale reads at a specified point in time and inexact stale reads within a specified time bound. For details, please refer to the official Stale Read documentation; an example of the explicit syntax is sketched below.
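For illustration, a minimal sketch of the explicit forms (the table t is hypothetical):

-- Exact stale read at a specific point in time, for a single statement.
SELECT * FROM t AS OF TIMESTAMP '2022-02-14 10:00:00';

-- Bounded stale read: read the freshest data available within the last 10 seconds.
SELECT * FROM t AS OF TIMESTAMP TIDB_BOUNDED_STALENESS(NOW() - INTERVAL 10 SECOND, NOW());

-- Or declare a whole read-only transaction as of a historical timestamp.
START TRANSACTION READ ONLY AS OF TIMESTAMP NOW() - INTERVAL 5 SECOND;
SELECT * FROM t;
COMMIT;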

Starting from TiDB 5.4.0, bounded stale reads can also be configured through session variables, further improving ease of use. For example:

set @@tidb_replica_read = 'leader-and-follower';
set @@tidb_read_staleness = "-5";
With these settings, TiDB can quickly enable stale reads at the session level, avoiding the need to explicitly declare read-only transactions or add stale read syntax to every SQL statement. It also makes it easy to read from the nearest leader or follower node and to read the latest data that is at most 5 seconds stale, meeting the business requirements of low-latency, high-throughput data access in near-real-time scenarios.

Major bug fixes and stability improvements
Since the release of TiDB 5.1, improving stability by "catching bugs" has been a top priority for the R&D and QA teams. Version 5.4 includes more than 200 optimizations and fixes in total; the main ones are recorded in the Release Notes, and further details can be found in the PR records on GitHub. TiDB's R&D and QA teams remain accountable to users, improving the product day by day, and look forward to earning users' trust.

Message from PM
We hope 2022 will be a prosperous year for TiDB users, and that TiDB 5.4 brings users more comfort and convenience and helps all kinds of applications and businesses develop healthily and efficiently. Please follow TiDB's official website, GitHub, and social media accounts. On behalf of the TiDB product manager (PM), R&D, and quality teams, we look forward to creating an even better year together with our users. If you have any suggestions for TiDB, you are welcome to discuss them with us at internals.tidb.io.

Check out the TiDB 5.4.0 Release Notes, download it, and start your TiDB 5.4.0 journey.

