Since its release, TiDB 5.0 has been deployed in production by users in finance, Internet and new economy, logistics, and other industries, and has received positive feedback from many of them:

TiDB serves complex read and join queries for data warehouse reports at 58 Finance, Anjuke, and others. In multi-table join queries, performance is up to 90% higher than in version 4.0;

In tests on NetEase Interactive Entertainment workloads, TiDB 5.0 is more stable overall than 4.0, with no obvious jitter;

In Autohome's big-data join and aggregation scenarios, TiDB 5.0's MPP shows clear advantages: compared with MySQL, overall performance is 20 to 50 times higher.

"User feedback motivates us to keep moving forward. Our mission is to continuously improve the experience of developers and DBAs, so that users can work with ease," said Huang Dongxu, co-founder and CTO of PingCAP. "Every TiDB release is built around solving DBAs' pain points. Real scenarios are the best architects. Starting with version 5.0, TiDB has shortened its release cycle and adopted a more flexible and agile release-train model. When users bring us their real scenarios, those scenarios may become features delivered in the next version within a two-month period."

Thanks to rapid feedback from a large number of users in real application scenarios, TiDB 5.1 has been released on an accelerated schedule to deliver a smoother enterprise-grade database experience. TiDB 5.1 offers more stable response latency, better MPP performance and stability, and more convenient operation and maintenance. Developers and DBAs can build business-critical applications of any scale on TiDB 5.1.

TiDB 5.1 feature highlights and user value

  • Supports Common Table Expressions from the ANSI SQL 99 standard, so users can write more concise, easier-to-maintain SQL, handle complex business logic with ease, and improve development efficiency.
  • Further improves MPP performance and stability, helping users make real-time decisions faster. By supporting partitioned tables in MPP mode and adding multiple function expressions and operator optimizations, version 5.1 improves real-time analysis performance by more than an order of magnitude; enhanced memory management and load balancing make analytical queries faster and more stable.
  • In scenarios such as sudden bursts of write traffic, cluster scale-out and scale-in, and online data import and backup, version 5.1 stabilizes long-tail query latency and reduces latency by 20% to 70% depending on the workload. For latency-sensitive, business-critical applications such as those in the financial industry, query stability under high load is greatly improved.
  • Supports column type changes for higher MySQL compatibility. Version 5.1 adds a Stale Read mode that greatly improves read throughput by breaking up read hotspots in read-write separation scenarios; introduces new system tables for quickly locating lock conflicts in high-concurrency transaction scenarios; and improves the statistics analysis engine so that the optimizer selects indexes more accurately, ensuring efficient and stable business queries.
  • Provides a friendlier operation and maintenance experience for large clusters, further reducing DBA workload. Version 5.1 speeds up cluster scaling and data migration by 40%, improves the reliability of operating large clusters, and reduces overall backup and recovery time for large clusters. It also optimizes automatic recovery after a temporary interruption of the CDC data link, further improving the reliability of the data synchronization link.

Common Table Expression makes SQL simple

In financial transaction scenarios, the inherent complexity of the business sometimes leads to a single SQL statement of up to 2,000 lines, containing many aggregations and multiple levels of nested subqueries. Maintaining such SQL can be a developer's nightmare. Version 5.1 supports Common Table Expressions (CTEs) from the ANSI SQL 99 standard, including recursive CTEs, which greatly improves the efficiency of developers and DBAs writing SQL for complex business logic and makes the code easier to maintain.
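As a rough illustration of what a recursive CTE looks like, the sketch below walks a reporting chain. The employees table, its columns, and its rows are hypothetical and exist only for this example; they are not taken from any TiDB sample schema.

```sql
-- Hypothetical table: each employee row points at its manager.
CREATE TABLE employees (
    id         INT PRIMARY KEY,
    name       VARCHAR(64),
    manager_id INT
);
INSERT INTO employees VALUES (1, 'Ann', NULL), (2, 'Bob', 1), (3, 'Cid', 2);

-- The anchor part selects the top of the hierarchy; the recursive part
-- repeatedly joins employees to the rows found so far.
WITH RECURSIVE reports AS (
    SELECT id, name, 1 AS depth
    FROM employees
    WHERE manager_id IS NULL
    UNION ALL
    SELECT e.id, e.name, r.depth + 1
    FROM employees e
    JOIN reports r ON e.manager_id = r.id
)
SELECT id, name, depth FROM reports ORDER BY depth;
```

Naming the intermediate result (reports) is what lets a long statement be split into readable steps, instead of nesting the same subquery several levels deep.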

HTAP's real-time analysis capabilities are upgraded

Further improve the performance and stability of MPP

Version 5.1 further strengthens the TiFlash MPP computation engine and helps users speed up business decision-making:

  • MPP supports partitioned tables, which, combined with business logic, can reduce the resources consumed by analytical queries over massive data and improve query speed;
  • Support is added for multiple commonly used SQL functions, and operators are optimized so that queries can take full advantage of MPP acceleration;
  • A convenient switch for forcing MPP mode is provided, so the user can decide whether to turn MPP mode on;
  • The cluster's load distribution and balancing mechanism is optimized to eliminate hotspots and improve the overall carrying capacity of the system;
  • Engine memory usage problems are fixed, providing a smoother experience.
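The MPP switch mentioned in the list above can be sketched as follows. This assumes the switch refers to the tidb_allow_mpp and tidb_enforce_mpp session variables; the orders table and its columns are hypothetical, and the table needs a TiFlash replica before MPP can be used.

```sql
-- Hypothetical table; create a TiFlash replica so MPP plans are possible.
ALTER TABLE orders SET TIFLASH REPLICA 1;

-- Allow the optimizer to consider MPP plans in this session.
SET @@session.tidb_allow_mpp = 1;
-- Ask the optimizer to prefer MPP regardless of its own cost estimate.
SET @@session.tidb_enforce_mpp = 1;

-- Inspect the plan; an MPP plan typically contains ExchangeSender and
-- ExchangeReceiver operators.
EXPLAIN SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id;
```

Setting the variables at session scope keeps the experiment local to one connection, which is usually safer than changing the behavior for the whole cluster.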

Improve the stability of query analysis under high pressure load

In financial business scenarios, technicians run high-pressure batch computations on data every day and generate the latest market and marketing analysis reports to support business decisions. The batch process demands very high continuity and cannot tolerate errors partway through. For this scenario, version 5.1 optimizes TiDB's request retry mechanism and TiKV's request processing mechanism, which significantly reduces the probability of Region Unavailable errors caused by lagging TiFlash data synchronization under high load.

Seamless integration of TiSpark

TiSpark 5.1 adds read and write support for tables with clustered indexes, with no additional performance overhead. The change is completely transparent to users, who can migrate to the new version of TiSpark right away for seamless integration with TiDB 5.1.

Reduce read and write delay jitter

In latency-sensitive applications, bursts of online write traffic, TiDB scale-out and scale-in, background statistics tasks, and online data import and backup can cause jitter in the database's P99 and P999 latency and affect long-tail queries. TiDB 5.1 strengthens the management of the disk read and write path and limits the disk resources used by background tasks, which greatly reduces the interference of these scenarios with online business and improves the efficiency and stability of the read and write path. On AWS EC2 r5b.4xlarge instances with EBS gp3 disks, a TPC-C benchmark (10k warehouses) gave the following results:

  • When the cluster is scaled in from 6 TiKV nodes to 3, P99 response time is reduced by 20% and P999 response time by 15%;
  • During an online import of 200GB of data, P99 response time is reduced by 71% and P999 response time by 70%.

Enhance business development flexibility

Support column type changes

In typical TiDB deployments, binlog is often used to aggregate data from multiple upstream MySQL instances into one TiDB cluster. Previously, TiDB did not support changing a column's type, so if an upstream MySQL instance modified a column type, synchronization to TiDB was interrupted. Version 5.1 adds support for DDL statements that change column types, which resolves this problem and further improves MySQL compatibility.
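A column type change of this kind might look like the sketch below. The payments table and its columns are hypothetical; the statement itself is ordinary MySQL-style DDL.

```sql
-- Hypothetical table replicated from an upstream MySQL instance.
CREATE TABLE payments (
    id     INT PRIMARY KEY,
    amount INT
);

-- Change the column from an integer type to a string type. This is a
-- change that requires rewriting the stored data, not just the metadata.
ALTER TABLE payments MODIFY COLUMN amount VARCHAR(20);
```

As with any type change, values that do not fit the new type can cause the statement to fail, so it is worth trying it on a copy of the data first.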

Stale Read

Stale Read suits read-heavy, write-light scenarios that can tolerate reading slightly old data. For example, after a Twitter user posts a message, the system generates thousands or even hundreds of millions of reads, and it is acceptable for the new message to become readable only after a short delay. Such a scenario puts considerable concurrent read pressure on the database and may create read hotspots, which leave the load unevenly distributed across nodes and make overall throughput a bottleneck. With Stale Read, users can specify a past point in time and read data from any replica (not only the leader), which significantly spreads the load across nodes and nearly doubles overall read throughput.

/* For example, enable Stale Read by setting the current transaction to query the data as it was 5 seconds ago */
> SET TRANSACTION READ ONLY AS OF TIMESTAMP NOW() - INTERVAL 5 SECOND;
> SELECT * FROM T;
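Besides the SET TRANSACTION form, the AS OF TIMESTAMP clause can also be attached to a single query or to an explicit read-only transaction. The sketch below assumes a hypothetical table t; check the AS OF TIMESTAMP documentation for your TiDB version before relying on it.

```sql
-- Single statement: read table t as it was 5 seconds ago.
SELECT * FROM t AS OF TIMESTAMP NOW() - INTERVAL 5 SECOND;

-- Explicit read-only transaction pinned to a point 5 seconds in the past;
-- every read inside it sees the same historical snapshot.
START TRANSACTION READ ONLY AS OF TIMESTAMP NOW() - INTERVAL 5 SECOND;
SELECT * FROM t;
COMMIT;
```

The per-statement form is handy when only one hot query should tolerate stale data, while the transaction form keeps several related reads consistent with each other.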

Quickly locate the lock conflict (experimental feature)

Business development requires careful handling of concurrent database transactions. Once transactions are blocked on locks, online business can be severely affected, and the DBA needs to quickly locate the cause of the blocking so that the business can return to normal. The new Lock View system tables in TiDB 5.1 make it possible to quickly locate the transaction and related SQL statements that caused the blocking, improving the efficiency of handling lock conflicts.
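A minimal way to explore Lock View is to query its system tables directly. The sketch below assumes the INFORMATION_SCHEMA tables DATA_LOCK_WAITS, TIDB_TRX, and DEADLOCKS; the transaction ID in the second query is a made-up placeholder, and because the feature is experimental, the available columns may differ between versions.

```sql
-- Which transactions are currently waiting for a lock, and which
-- transaction holds the lock they are waiting for?
SELECT * FROM information_schema.data_lock_waits;

-- Inspect the blocking transaction. Replace the placeholder ID with a
-- CURRENT_HOLDING_TRX_ID value returned by the previous query.
SELECT * FROM information_schema.tidb_trx WHERE id = 426789913200689153;

-- Deadlocks recently recorded on this TiDB node.
SELECT * FROM information_schema.deadlocks;
```

Starting from the waiting side and following the holder's transaction ID is usually the quickest path to the session and SQL statement that need attention.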

Faster and more accurate statistical information analysis

As business data keeps changing, a table's statistics become stale, which makes the optimizer's execution plans less accurate and slows down queries. The DBA runs ANALYZE to rebuild the table's statistics. TiDB 5.1 optimizes the performance of the ANALYZE sampling algorithm, reducing the average time to generate statistics to one-third of what it was. It also adds a new type of statistics that lets the optimizer select indexes more accurately.
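For reference, rebuilding and checking statistics might look like the sketch below. The orders table is hypothetical, and the example shows only the general workflow, not the new 5.1 sampling algorithm or statistics type.

```sql
-- Rebuild statistics for a hypothetical table named orders.
ANALYZE TABLE orders;

-- Check how fresh the statistics are; a lower Healthy value means the
-- statistics have drifted further from the actual data.
SHOW STATS_HEALTHY WHERE table_name = 'orders';
```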

Improve the reliability of large cluster operation and maintenance and data transmission

Backup optimization of a large number of tables

Version 5.1 optimizes the backup of large numbers of tables. At the scale of 50k tables, a full backup of a TiDB cluster takes 30% to 40% of the time it took previously. In addition, version 5.1 optimizes how the backup module organizes its metadata files (v2 for short). When starting BR, you can enable v2 by specifying the parameter "--backupmeta-version=2", which reduces the amount of data per write and lowers memory consumption, effectively avoiding abnormal exits in environments with little memory (≤8GB).

Improve the reliability of large-scale cluster operation and maintenance

The larger a TiDB cluster is, the longer routine operations such as scaling the production cluster out or in, upgrading hardware, and relocating nodes take. TiDB 5.1 significantly improves data migration performance during scale-out and scale-in. Two sets of test results follow:

  • At a scale of 100 nodes, the time to migrate all data in the cluster across data centers is reduced by 20%;
  • Adding a new node, or migrating the data off a node, takes about 40% less time.

Optimize memory usage

Out of Memory (OOM) has long been a typical problem plaguing the database industry. Version 5.1 makes a series of optimizations to TiDB's memory usage to reduce the risk of OOM:

  • The window function row_number now uses a fixed amount of memory regardless of data volume;
  • Reads of partitioned tables are optimized to use less memory;
  • A configurable memory limit is added to the storage layer. When the limit is reached, the system releases part of its cache to reduce memory usage;
  • TiFlash uses 80% less memory than the previous version.

Improve CDC synchronization link reliability

TiCDC 5.1 improves the reliability of the synchronization link without manual intervention: when environmental disturbances or hardware failures occur, TiCDC keeps synchronizing; even if synchronization is interrupted, TiCDC automatically retries according to the actual situation.

Finally, we would like to thank Xiaomi, Qihoo 360, Zhihu, iQiyi, Li Auto, Sina, Huya, Xiaodian, Leap Express, Yima Technology, and other companies and community developers for their contributions to the design, development, and testing of TiDB 5.1. Your continued support helps TiDB keep improving the experience of developers and DBAs in real scenarios and makes TiDB easier to use.

See the TiDB 5.1 Release Notes, download it, and start exploring TiDB 5.1.

