How to quickly retrieve data from a table with hundreds of billions of rows


Background

Since the relational data model was invented 50 years ago, its expressive power and its clear, easy-to-understand structure have let it quickly take over the database market and become the mainstream data storage system across industries. Over those 50 years, database architecture, expressions, storage structures, and optimizers have made great progress, but index structures have developed relatively slowly. Most of the work has gone into optimizing queries and the optimizer on top of the existing index foundation.

After 30 years of development, we entered the era of the Internet and the mobile Internet. Data volumes exploded, and non-relational (NoSQL) databases emerged. NoSQL made up for what the original databases lacked in scale, but the index structures of these NoSQL systems still work on much the same principle as those of traditional relational databases: both build secondary indexes on top of the original table structure.

Whether in a relational database or a NoSQL database, a secondary index is essentially the original table re-sorted by a different set of key columns. This limits what the index can do in two ways. First, the leftmost matching principle restricts which queries the index can serve. Second, the query columns must be decided in advance, and multiple secondary indexes must be built for the column combinations. If a query cannot be matched to an index, performance drops sharply and the result is the dreaded slow query. Slow queries seriously harm both the user experience and the stability of the database itself.
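The leftmost matching restriction can be seen in a minimal Python sketch. The table, column names, and helper functions below are hypothetical and only illustrate the principle: a secondary index on (user_id, status) is the same data re-sorted by those columns, so a lookup on user_id can binary-search a contiguous range, while a lookup on status alone has to examine every entry.

```python
from bisect import bisect_left, bisect_right

# Hypothetical orders table: (order_id, user_id, status)
rows = [
    (1, "u2", "paid"),
    (2, "u1", "paid"),
    (3, "u1", "refunded"),
    (4, "u3", "paid"),
    (5, "u2", "refunded"),
]

# A secondary index on (user_id, status): the rows re-sorted by those columns.
index = sorted((user_id, status, order_id) for order_id, user_id, status in rows)


def lookup_by_user(user_id):
    """Leftmost column known: binary search finds one contiguous range."""
    lo = bisect_left(index, (user_id,))
    # chr(0x10FFFF) is the largest code point, so it sorts after any status.
    hi = bisect_right(index, (user_id, chr(0x10FFFF)))
    return [order_id for _, _, order_id in index[lo:hi]]


def lookup_by_status(status):
    """Leftmost column unknown: matches are scattered, so every entry is checked."""
    return [order_id for _, s, order_id in index if s == status]


print(lookup_by_user("u1"))
print(lookup_by_status("refunded"))
```

Both functions return correct results; the difference is cost. The first touches only the matching range, the second is a full scan of the index, which is the slow-query case described above.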

To solve these problems, some architectures introduce a search engine, such as Elasticsearch, Solr (now in decline), or a cloud search service, and use its inverted index to serve reads. This supports free combination queries on any columns, as well as geographic location queries and full-text search. Because search engines are optimized for queries, query performance is also more stable. However, this approach has problems of its own, some of which many people do not even realize:

Data reliability: For a database, not losing data is a core requirement; for a search engine it is not. Most search engines can lose data.
Completeness of query results: The core goal of a search engine is query performance, so it prioritizes performance over the completeness of results. Some search engines may therefore terminate a query early to keep latency down.
Hidden stability risks: To attract customers, most open source and commercial products are keen to keep releasing new features. Some features work fine at small data volumes but become serious stability hazards as the data grows: they can exhaust memory, saturate the CPU, or freeze the entire cluster. Worse, unless you are a real expert, these hazards are hard to predict in advance. They surface one by one, you learn about them slowly through failures, and you never know how many pitfalls you have not yet hit.
Operational complexity: Search is a highly specialized field. Some products are easy to start using, but operating them demands real expertise, and many people are only getting started after several years of exploration. When a problem occurs, they still cannot locate and fix it quickly, or even tell which component is at fault (which is impossible without fine-grained monitoring metrics). It is also hard to discover risks before the business goes live. The end result can be the worst of both worlds: the operations staff are exhausted, and the business still has many problems.
Architectural complexity: Synchronizing data from the database to the search engine requires a synchronization system. That means at least three systems to manage, plus the synchronization status and latency to monitor, which adds a great deal of complexity and cost. The alternative is to double-write to the database and the search engine, but that raises even harder consistency issues. Either way, developers have to read from and write to two different systems.

The above architecture recognizes the shortcomings of traditional databases and offers a solution, but that solution still has major shortcomings of its own.

Why not go one step further and bring the capabilities of a search engine into the database system itself? If that is possible, the problems above go away.

Following this line of thinking, Alibaba Cloud introduced the core capabilities of search engines, including the inverted index and the BKD index, into Tablestore, its self-developed non-relational (NoSQL) structured data storage service with ten years of development behind it. Search engine capabilities are now fully embedded in the NoSQL database.

In Tablestore, this capability is called SearchIndex (referred to below as the multi-index).

Tablestore now has two major capabilities: wide tables and the multi-index. The wide table engine is similar to Bigtable; it mainly provides highly reliable storage and solves the problems of data scale and scalability. The multi-index solves the efficiency and flexibility of data query and retrieval.

The current complete structure and capabilities of Table Store are as follows:

Value

Compared with the traditional solution, Tablestore's multi-index not only makes up for the shortcomings of the database-plus-search-engine solution described above, but also has significant advantages in other respects:

One system, multiple capabilities: it provides both database-level data reliability and the rich query capabilities of a search engine.
Simpler application-layer architecture: a single system handles both data storage and query, so operations, development, and even financial management become simpler.
Rich query capabilities: it supports a wide range of query functions, sorting, and statistical aggregation, which covers most online query and lightweight analysis scenarios.
Better performance: both storage and query performance are better than those of open source products in the industry. For example, Count is more than 10 times faster than in Elasticsearch, the best of them.
Complex analysis in combination with DLA: Alibaba Cloud's data lake analysis product, DLA, can now push most SQL operators down into the multi-index, which greatly improves the performance of SQL analysis in DLA. Currently, Tablestore is the only data system to which DLA can push down operators such as limit, agg, and sort. Combined with DLA, it can provide more complex analysis capabilities.
The value of ALL IN ONE: one copy of the data supports online writes and queries, offline import and export, lightweight analysis, and full SQL analysis through DLA. These workloads are isolated from one another inside Tablestore so that they do not interfere. Because it is a single system, development, operations, and financial management are all easier for the customer.
Higher development efficiency: beyond the obvious advantages above, there is another major one. No additional system has to be deployed, there is no need to learn the interfaces and behaviors of several different systems, no need to watch the latency of a synchronization link, and no need to think about operations. According to customer feedback, after adopting the multi-index the development cycle of a basic feature can drop from one month to one week, which greatly speeds up product launches.

Functions and capabilities

Tablestore is a distributed NoSQL product in which Alibaba Cloud has invested heavily. Its core goal is to build a massive data platform that supports online, offline, and lightweight analysis workloads. Following the ALL IN ONE design concept, it aims to meet customers' one-stop requirements for storing and querying large-scale structured data.

Within Tablestore, the core role of the multi-index is data value discovery. It provides query and analysis capabilities:

Query capabilities

The multi-index offers rich query capabilities. It is not bound by the leftmost matching principle of traditional databases and other NoSQL systems: once columns are indexed, they can be queried in any combination, which greatly improves the user experience.
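A minimal sketch of why an inverted index allows free column combinations, assuming an in-memory table and exact-match conditions only. The row data and function names are hypothetical, and this is not a description of Tablestore's internals. It only shows that each (column, value) pair maps to a set of row IDs, so any combination of conditions becomes an intersection of those sets, with no dependence on column order.

```python
from collections import defaultdict

# Hypothetical rows keyed by row ID.
rows = {
    1: {"city": "Hangzhou", "status": "paid", "channel": "app"},
    2: {"city": "Beijing", "status": "paid", "channel": "web"},
    3: {"city": "Hangzhou", "status": "refunded", "channel": "app"},
    4: {"city": "Hangzhou", "status": "paid", "channel": "web"},
}


def build_inverted_index(rows):
    """Map every (column, value) pair to the set of row IDs containing it."""
    index = defaultdict(set)
    for row_id, row in rows.items():
        for column, value in row.items():
            index[(column, value)].add(row_id)
    return index


def query(index, **conditions):
    """AND together any combination of column=value conditions."""
    postings = [index.get((c, v), set()) for c, v in conditions.items()]
    if not postings:
        return []
    return sorted(set.intersection(*postings))


index = build_inverted_index(rows)
print(query(index, status="paid", channel="web"))
print(query(index, city="Hangzhou", status="paid"))
```

No column plays the role of a leftmost key here: a query on channel alone, or on status and channel without city, uses the same lookup path as any other combination.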

It also supports array types (Array) and JSON-like nested types, so it adapts more easily to various application-layer models and improves development efficiency.

In addition, it offers something traditional databases lack: rich word segmentation (tokenization) and full-text search. Full-text search results can be sorted by relevance score or by any column. Relevance is computed with the BM25 algorithm.
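For readers unfamiliar with BM25, the scoring can be sketched as follows. The article states only that BM25 is used; the parameter values k1 = 1.2 and b = 0.75 are common defaults assumed here, the IDF uses the non-negative log(1 + ...) variant, and the documents are assumed to be already tokenized. None of this is confirmed as Tablestore's exact configuration.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query with Okapi BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: how many documents contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Rare terms get a higher weight than common ones.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term frequency saturates with k1 and is normalized by length with b.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    ["table", "store", "index"],
    ["search", "index", "index", "engine"],
    ["wide", "table"],
]
print(bm25_scores(["index"], docs))
```

The second document mentions the query term twice and therefore scores highest, while the third does not contain it and scores zero. Sorting by this score gives the relevance ordering; sorting by a column value instead simply ignores it.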

With the rapid development of the mobile Internet, the Internet of Things, and the Internet of Vehicles, many applications and businesses need geographic location queries, such as finding nearby people or implementing geofences. The multi-index provides this as well: declare the column as the GeoPoint type when writing, and rich geographic location queries become available at query time. A geographic location query can also be combined with conditions on other indexed columns, such as time.
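What a "within distance, and newer than a given time" query computes can be sketched as below. This is a brute-force filter over an in-memory list using the haversine formula, which is an assumption for illustration. The field names and the function nearby_since are hypothetical, and a real index would avoid scanning every point.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearby_since(points, center, radius_m, since_ts):
    """Combine a geo-distance condition with a time condition."""
    lat0, lon0 = center
    return [
        p["id"]
        for p in points
        if p["ts"] >= since_ts
        and haversine_m(lat0, lon0, p["lat"], p["lon"]) <= radius_m
    ]


# Hypothetical device positions with a last-seen timestamp.
points = [
    {"id": "a", "lat": 30.001, "lon": 120.0, "ts": 100},   # about 111 m away, recent
    {"id": "b", "lat": 30.1, "lon": 120.0, "ts": 100},     # about 11 km away, recent
    {"id": "c", "lat": 30.0005, "lon": 120.0, "ts": 10},   # about 56 m away, stale
]
print(nearby_since(points, (30.0, 120.0), radius_m=1000, since_ts=50))
```

Only the point that satisfies both conditions is returned, which is the behavior a combined geo and time query is expected to have.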

The multi-index covers essentially the most complete set of query functions currently available. Because it is a self-developed system, new features can be developed and launched quickly when new business scenarios or query requirements arise.

Real-time analysis capabilities

The multi-index also provides lightweight real-time analysis for online scenarios. It is mainly suited to scenarios that require query latency in the range of milliseconds to seconds.

Support basic statistical aggregation: Min, Max, Sum, Avg, Count, DistinctCount, GroupBy, etc.
Support advanced statistical aggregation: histogram statistics, percentile statistics, etc.
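The semantics of the basic aggregations listed above can be sketched in a few lines of Python. The function aggregate, the field names, and the sample rows are hypothetical; this shows what the operators compute, not how the multi-index executes them.

```python
from collections import defaultdict
from statistics import mean


def aggregate(rows, group_by, field):
    """GroupBy one column, then compute the basic aggregations on another."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row[field])
    return {
        key: {
            "count": len(values),
            "min": min(values),
            "max": max(values),
            "sum": sum(values),
            "avg": mean(values),
            "distinct_count": len(set(values)),
        }
        for key, values in groups.items()
    }


# Hypothetical order rows.
orders = [
    {"city": "Hangzhou", "amount": 10},
    {"city": "Hangzhou", "amount": 30},
    {"city": "Beijing", "amount": 20},
    {"city": "Hangzhou", "amount": 10},
]
for city, stats in aggregate(orders, "city", "amount").items():
    print(city, stats)
```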

Compared with open source systems, some of our lightweight analysis functions are more than 10 times faster.

More importantly, these lightweight analysis requests are isolated internally from other online requests when executed, so they do not affect the availability of online requests.

If a scenario only needs totals, grouping, or similar statistics, you can use the multi-index directly without introducing another system.

SQL analysis capabilities

Some scenarios need SQL analysis but are not latency-sensitive; results returned within minutes are acceptable. In that case, you can combine the multi-index with Alibaba Cloud Data Lake Analytics (DLA) to get full analysis capabilities. DLA is a serverless analysis system that supports standard SQL and can push operators down to the underlying storage system or database. The multi-index implements most of the operators in DLA SQL, and Tablestore is currently the only data storage system that lets operators such as Limit, Sort, Min, Max, Sum, Avg, Count, DistinctCount, and GroupBy be pushed down to the storage layer.

The combination of the multi-index and DLA suits complex analysis requests with latency from seconds to minutes. The lightweight analysis capabilities of the multi-index itself suit simple analysis scenarios with latency from milliseconds to seconds.

For more details on using DLA with the multi-index, see the article "Tablestore calculation push down".

High-concurrency export capability

In some scenarios, customers need to quickly export the data that matches certain conditions to an external system for further processing. For example, device data may be exported so that notifications can be sent to those devices, and analysis data may be exported to an external computing system for more complex analysis, processing, and report generation. If useless data can be filtered out inside the storage system before export, so that the final data set is produced quickly, both performance and cost improve.

To meet these needs, we developed a concurrent export function, ParallelScan. The interface has three basic capabilities:

Complete query support: it includes every Query function supported by the Search interface. Useless data can be filtered out at the storage layer in advance, which reduces the volume and cost of the data transferred and improves performance.
High throughput: online filtering and export of up to 10 million rows per second is supported.
Resumable export: if an error occurs while reading, the read can resume from the position where the error occurred instead of starting over.

These features let ParallelScan show its greatest advantage in the following scenarios:

Device center: applications sometimes need to select the devices or apps that meet certain conditions and push notifications or upgrade information to them. The system must support free combination of arbitrary conditions and must also be able to pull a large number of devices out of the database quickly.
Computing systems: when computing systems such as Spark, Presto, or DLA run complex SQL queries, ParallelScan can take over some of the pushed-down operators. The rows that remain after filtering are pulled quickly into the computing system's memory for secondary computation, which significantly reduces cost and improves performance.

Dynamic schema modification and A/B testing

Beyond features, we also keep investing in ease of use, aiming to simplify the customer experience and improve development and operations efficiency. Because the multi-index has a strong schema, a later business need to change a field (adding or deleting a column, changing a type, renaming a column, and so on) requires the customer to create a new index, wait for its data to catch up, verify that nothing is wrong, and then make the online change that replaces the online index with the new one. This process works, but it has two serious problems:

Prone to failure: the wrong index may be selected during the switch, or the new index may have a problem. Either can break online services, cause an outage, and result in losses.
Very inefficient: the whole process is manual and takes a long time. Because it is an online change, every step must be handled carefully; a moment of inattention can cause a mistake and force the process to be repeated.

Almost every system with a strong schema faces this problem. It looks minor, but it is a real pain point for users. If each user goes through it once a month and there are thousands of customers, the total time and energy spent every month is enormous. To truly ease this pain and simplify usage, we launched dynamic schema modification.

Dynamic schema modification currently provides the following three capabilities:

Adding columns, deleting columns, modifying column types, modifying column names, and modifying routing keys.
A/B testing of the old and new indexes. Traffic from the original index can be shifted to the new index to verify its availability and latency.
Intelligent reminders when switching to the new index, to avoid data rollback problems caused by customers switching too early.
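The A/B test described above amounts to sending a configurable share of query traffic to the new index while the rest stays on the old one. A minimal sketch of such a weighted router follows; make_router, the index names, and the 10% share are hypothetical, and the article does not describe how Tablestore implements the split.

```python
import random


def make_router(new_index_share, seed=None):
    """Return a function that picks an index name for each request.

    new_index_share is the fraction of traffic (0.0 to 1.0) sent to the new index.
    """
    rng = random.Random(seed)

    def route():
        return "new_index" if rng.random() < new_index_share else "old_index"

    return route


# Start by shifting 10% of the traffic to the new index.
route = make_router(0.10, seed=42)
sample = [route() for _ in range(10_000)]
print(sample.count("new_index") / len(sample))
```

Raising the share step by step while watching the new index's error rate and latency gives the verification described above; setting it back to 0.0 returns all traffic to the old index.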

These functions are currently online in trial. In just one month, dozens of customers have started using them; they greatly simplify usage and reduce risk, and the feedback has been consistently positive. They are expected to be fully open to all users in June. A dedicated article on the capabilities and use of dynamic schema modification will follow.

Scenarios

With the addition of the multi-index, Tablestore has become a very good fit for a number of scenarios.

Orders

For a small volume of order data, such as fewer than 20 million rows, you can use MySQL directly. For larger volumes, up to billions or tens of billions of rows of order data, Tablestore's multi-index works better.

One Internet company has tens of billions of historical orders, and with business growth the order volume is expected to double every year. Its architecture was based on MySQL with sharded databases and tables, which had several pain points: 1) the sharding was becoming more and more complex and the operations burden kept growing; 2) slow requests were increasing and user complaints never stopped; 3) queries from major customers often timed out. To solve these pain points, the customer keeps the latest day's orders in MySQL, synchronizes the full order data to Tablestore in real time through DTS, and queries it with the multi-index. The benefits exceeded expectations: first, future scaling is no longer a concern; second, operations work is no longer needed, so the team can focus on business development and is far more efficient; third, query performance improved by up to 55 times; fourth, slow requests were eliminated entirely and user complaints stopped; fifth, the data can be combined directly with DLA or MaxCompute for more complex analysis.

For a more detailed introduction to the order scenario, see "Large-scale Order System Interpretation-Architecture".

Device metadata

Last year the multi-index launched a new concurrent export function. Combined with its existing features, this makes Tablestore highly competitive for managing device metadata.

One company holds tens of billions of records of device and app information. The records are updated in real time, peaking at 500,000 rows per second. During campaigns or emergencies, the company needs to quickly select the target apps for message pushes, which means selecting 200 million devices out of tens of billions within one minute. The previous architecture combined multiple systems and had several pain points: 1) there were many systems, including multiple storage and query systems and big data computing systems, which were complicated and costly to manage; 2) data freshness and large-scale selection were both at the hour level, which could not keep up with growing operational needs; 3) as business volume and update volume grew, the bottlenecks of the original system became more and more severe. After half a year of research, the customer moved the entire system to Tablestore and used the query and export capabilities of the multi-index for real-time queries and online selection. The results exceeded expectations: 1) the number of systems dropped to one, which greatly reduced development and operations complexity and improved stability; 2) selection latency dropped from the hour level to the minute level; 3) the update rate scales linearly and is no longer a bottleneck.
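As a quick sanity check on the numbers in this case, the required export rate can be compared with the ParallelScan throughput figure quoted earlier in the article (up to 10 million rows per second). The variable names are illustrative, and the calculation assumes the one-minute budget is spent entirely on export.

```python
devices_to_select = 200_000_000   # devices to pull out, from the case above
time_budget_s = 60                # "within one minute"
stated_peak_rows_per_s = 10_000_000  # ParallelScan figure quoted in the article

required_rows_per_s = devices_to_select / time_budget_s
headroom = stated_peak_rows_per_s / required_rows_per_s

print(f"required: {required_rows_per_s:,.0f} rows/s")
print(f"headroom vs stated peak: {headroom:.1f}x")
```

The requirement works out to roughly 3.3 million rows per second, about one third of the stated peak, which is consistent with the minute-level selection reported for this customer.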

Messages

Message storage (IM, feed streams, notifications, and so on) is one of the scenarios with the most customers on Tablestore. Tablestore's highly reliable storage, real-time scalability, and auto-increment column greatly simplify the architecture of the storage store and the synchronization store, and the multi-index provides a full range of query capabilities, so that all storage, synchronization, and search requirements for message data can be met in one place.

Because of these advantages, the storage, synchronization, and search systems of most of Alibaba Group's internal IM systems, such as DingTalk, are built on Tablestore, as are those of many external Internet and Internet of Things customers.

The following figure is a classic message architecture diagram:

Finally

A multi-index can currently be created through the console on the Alibaba Cloud official website or through the SDK. If you are using it for the first time, refer to the forthcoming article on getting started with the multi-index.

We have a public DingTalk group for discussion, and you are welcome to join it. DingTalk group number: 23307953.

For important customers, we provide a dedicated expert service group free of charge. The group includes core R&D experts for each Tablestore module, who respond to technical or stability problems as soon as they arise.

Copyright statement: The content of this article was contributed voluntarily by a real-name registered user of Alibaba Cloud, and the copyright belongs to the original author. The Alibaba Cloud Developer Community does not own the copyright and assumes no corresponding legal responsibility. For the specific rules, see the "Alibaba Cloud Developer Community User Service Agreement" and the "Alibaba Cloud Developer Community Intellectual Property Protection Guidelines". If you find suspected plagiarism in this community, fill in the infringement complaint form to report it. Once verified, the community will immediately delete the suspected infringing content.


Alibaba Cloud Developer (阿里云开发者)
