Author: Li Xin
In the past ten years, Internet technology has developed rapidly, and more and more industries have joined the Internet matrix, which has brought richer and more complex business scenarios, which is undoubtedly a huge challenge to the performance of data application systems. .
The relational database MySQL is the most widely used database product in the application system, with powerful data query and strong transaction processing capabilities. In today's era of cloud, applications to be built gradually evolved into cloud-based native Serverless architecture, because it has low cost, high elasticity advantages . However, MySQL-based data storage still has some obvious shortcomings under the Serverless architecture system:
- The elastic expansion ability is poor. An important feature in serverless scenarios is that the application load has significant peaks and valleys. When faced with a traffic peak, the DBA needs to manually expand the cluster to prevent the cluster from being blown up; and when the traffic is low, it needs to shrink the cluster to save costs.
- The complexity of operation and maintenance is high. To build MySQL, you need to purchase clusters, install services, and manage connections. After the business goes online, we must also pay attention to data security, service availability, response time, etc. The proportion of time spent on cluster operation and maintenance will increase, and it is impossible to focus more energy on business research and development.
- high cost. Generally, a DBA needs to estimate the business scale to set the initial capacity of the database in advance. When the amount of business requests does not reach the estimated value, the resources in the cluster will always be idle, resulting in a waste of resources.
Serverless DataBase
The relational model supported by MySQL and its strong transaction characteristics make it a very important position in the application system, and it is one of the storage components that cannot be completely replaced at present. However, blindly relying on MySQL will make the application system unable to be fully serverless and unable to enjoy the extreme flexibility brought by serverless.
We have some new architecture practices within Alibaba. Those data that require strong transaction processing still use relational table storage, and for non-strong transaction table data storage, we have designed a serverless table storage with extreme flexibility.
Regarding serverless database products, our design requirements must have the following characteristics:
- completely elastic. can automatically and flexibly expand and contract according to the application load. This feature can bring users a more economical billing model and a smoother experience.
- is billed by volume. cost of using the 161d57a37caeef Serverless database mainly comes from calculation costs and storage costs. Users only need to pay for the storage units and response units actually generated by the business, saving costs.
- zero operation and maintenance. is ready to use, no need to manage capacity, water level, software upgrades, kernel optimization and other operation and maintenance matters, really let R&D focus on business development.
Serverless architecture has a wide range of application practices in many business scenarios. For example, in its core e-commerce business, Century Lianhua Group, in response to the difficulties of resource budgeting and system deployment difficulties encountered by self-built IDC computer rooms, has realized full cloudification of the business and gradually transformed it into a mid-office model with a full serverless architecture. .
Century Lianhua Group adopted the functional computing + API gateway + Tablestore solution, which easily supported major promotion activities such as 6.18 and Double 11. Among them, Tablestore, as the core storage in the Cloud Serverlesss architecture of the Century Lianhua e-commerce system, has the advantages of extreme flexibility, free operation and maintenance, and low cost.
Introduction to Tablestore
Table Store Tablestore was established in 2009 when Alibaba Cloud was founded. It is built from scratch based on the underlying Feitian platform. It is a multi-model, multi-engine Serverless table storage. It has exported more than 30 regions at home and abroad on the public cloud, with a scale of 15,000 servers and a storage scale of 200PB. It is the underlying core storage for many commercial products of Alibaba Cloud.
At the same time, it has been exported offline to finance, energy, power, logistics, medical, government and enterprise industries, serving 1,000+ public cloud corporate customers and 500+ offline projects.
Tablestore has the fusion function of HBase and ElasticSearch, has the characteristics of extremely flexible experience, free of operation and maintenance, and ready to use. It supports flexible storage from GB to PB and unconscious expansion of 100,000-level TPS service capabilities. supports massive table data while providing rich data retrieval and analysis capabilities. It is a one-stop structured data storage platform integrating storage, search and analysis.
The overall architecture of Tablestore is shown in the following figure:
Tablestore architecture diagram
Table Store provides a variety of data models, including wide table model (Widecolumn), message model (Timeline) and time series model (Timeseries).
- The wide table model mainly carries table structure data storage, such as e-commerce order data.
- The message model mainly carries message data storage, such as IM/Feeds messages.
- The time series model mainly carries the storage of time series data, such as the time series data of IoT devices.
Below we will take the e-commerce order scenario as an example to show you how to build a serverless order storage system based on the wide table model of Tablestore.
Tablestore experience
Ready to work
Before experiencing the extreme flexibility brought by Tablestore, you need to prepare the following steps:
(1) Create an Alibaba Cloud account and obtain the AK of the Alibaba Cloud account. (The cloud account AK is the key to access all cloud services including Tablestore, and you need to use AK to access the Tablestore service in the future).
(2) Download and start the command-line tool Tablestore CLI provided by Tablestore. The command-line tool provides some simple instructions to manage the table storage service.
First configure the connection key through the config command and activate the tablestore service through the enable_service command:
config --id accessKeyID --key accessKeySecret
enable_service
(3) Create an instance through the create_instance command:
create_instance -d "order storage" -n serverless-db -r cn-hangzhou
The instance is equivalent to the concept of the MySQL database. After the instance is created, there is no need to worry about the water level of the physical machine cluster where the instance is located, just focus on developing business logic. At the same time, the read and write and storage on the instance are billed according to the amount. If there is no read, write and no storage, no fees will actually be incurred.
At this point, a fully flexible Serverless DataBase that can support GB to PB storage, no concurrency restrictions, zero operation and maintenance, and complete flexibility has been created.
Create table
The wide table model is (Widecolumn) is a kind of Schema-free data table. Different from the relational database MySQL, the creation of a table Widecolumn model data table only needs to define the primary key structure, and does not need to define the attribute column structure.
For example, the table structure of an order table order is as shown in the table below:
To create an order table with a wide table model, the attribute column information does not need to be defined, just define the primary key id of the order table. The command is as follows:
create_instance -d "order storage" -n serverless-db -r cn-hangzhou
After executing the create command, an order width table is successfully created, and the newly created order width table will be initialized with 1 data partition.
As the amount of order data increases or the amount of access increases, the wide table model will split and expand into multiple data partitions according to the distribution range of the first primary key (the order ID in the above data model) is evenly distributed to multiple physical machines to support With larger data scale (TB or even PB) and read and write throughput (above 100,000 TPS), the entire expansion process is completely automated by the server without manual intervention.
data import
One million pieces of sample order data were simulated and imported into the order table in batches through the import command. The write speed of a single data partition can reach tens of thousands of rows/s. As the partition expands, the write throughput can be further improved.
import -i orderDataFile -l 1000000
Current speed is: 10000 rows/s. Total succeed count 10000, failed count 0.
Current speed is: 12600 rows/s. Total succeed count 22600, failed count 0.
......
Current speed is: 9200 rows/s. Total succeed count 1000000, failed count 0.
Import finished, total count is 1000000, failed 0 rows.
Order Tracking
Use the get command to query the wide table model in a single row according to the order number (id) to obtain a piece of order data. The get command can only perform a single row query based on rowKey.
Example of querying an order:
id = "0000005be2b43dd134eae18ebe079774"
get --pk '["0000005be2b43dd134eae18ebe079774"]
+----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+----------+--------+---------+-------+-------+--------+------------+
| order_id | cId | cName | hasPaid | oId | orderTime | pBrand | pCount | pId | pName | pPrice | pType | sId | sName | totalPrice |
+----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+----------+--------+---------+-------+-------+--------+------------+
| 0000005be2b43dd134eae18ebe079774 | c0015 | 消周五 | false | o0035062633 | 1507519847532 | 小米 | 3 | p0005003 | 小米 6 | 2299.21 | 手机 | s0017 | 售郑七 | 6897.63 |
+----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+----------+--------+---------+-------+-------+--------+------------+
Order retrieval and statistics
Relying on multiple criteria combination filtering often occurs in order scenarios. In this case, you need to rely on the multi-index feature of Tablestore. multi-index is a table data index similar to Elasticsearch provided by Tablestore. It supports rich query methods and data aggregation capabilities, and can be indexed separately on multiple columns. difference between 161d57a37cb5bf and the joint index in MySQL is that the multiple index can be queried according to any combination of fields, and it will not be matched according to the leftmost prefix of multiple columns.
For example, we build indexes on fields such as id, pName, totalPrice, etc., and use data structures such as inverted index, word segmentation, and BKDTree to provide query capabilities such as precise query, full-text search, and range query. In addition, the multiple index also supports grouping by field, sorting by multiple fields, and statistical aggregation capabilities.
Use the create_search_index command to create a multiple index on a wide table to speed up the query.
create_search_index -t order -n order_index
{
"IndexSetting": null,
"FieldSchemas": [{
"FieldName": "id",
"FieldType": "KEYWORD",
"Index": true,
"EnableSortAndAgg": true,
"Store": true
},{
"FieldName": "pName",
"FieldType": "TEXT",
"Index": true,
"EnableSortAndAgg": false,
"Store": true
},{
"FieldName": "totalPrice",
"FieldType": "DOUBLE",
"Index": true,
"EnableSortAndAgg": true,
"Store": true
}
...//其他字段
]
}
Tablestore supports SQL query capabilities, is compatible with MySQL's query syntax, and keeps the usage habits of relational databases as much as possible. SQL can automatically select indexes and perform query acceleration. Through the query acceleration of multiple indexes, it also has the ability to delay queries in milliseconds under the scale of tens of billions of data.
Retrieve orders based on the three field conditions of sName, pBrand, and pName:
select * from `order` where
sName = "售周五" and
pBrand = "小米" and
pName like "红米%"
limit 3;
+----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+----------+---------+--------+-------+---------------+-------+--------+------------+
| id | cId | cName | hasPaid | oId | orderTime | pBrand | pCount | pId | pName | pPrice | pType | payTime | sId | sName | totalPrice |
+----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+----------+---------+--------+-------+---------------+-------+--------+------------+
| 00001c760c04126da067e90409467c4e | c0022 | 消赵一 | true | o0009999792 | 1494976931954 | 小米 | 3 | p0005004 | 红米 5s | 499.01 | 手机 | 1494977189780 | s0005 | 售周五 | 1497.03 |
+----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+----------+---------+--------+-------+---------------+-------+--------+------------+
| 0000d89f46952ac03da71a33c8e83eef | c0012 | 消钱二 | false | o0024862442 | 1502415559707 | 小米 | 2 | p0005004 | 红米 5s | 499.01 | 手机 | null | s0015 | 售周五 | 998.02 |
+----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+----------+---------+--------+-------+---------------+-------+--------+------------+
| 0000f560b62779285e86947f8e8d0e4c | c0008 | 消冯八 | false | o0000826505 | 1490386088808 | 小米 | 1 | p0005004 | 红米 5s | 499.01 | 手机 | null | s0015 | 售周五 | 499.01 |
+----------------------------------+-------+--------+---------+-------------+---------------+--------+--------+----------+---------+--------+-------+---------------+-------+--------+------------+
Count the number of orders for each product brand in all orders:
select pBrand,count(*) from `order` group bypBrand;
+--------+----------+
| pBrand | count(*) |
+--------+----------+
| vivo | 162539 |
+--------+----------+
| 联想 | 304252 |
+--------+----------+
| oppo | 242513 |
+--------+----------+
| 苹果 | 96153 |
+--------+----------+
| 小米 | 194543 |
+--------+----------+
Summarize
As a widely used Serverless DataBase, Tablestore provides an economical billing model that can greatly reduce business costs. The above order scenario is an example. With a data level of 100 million orders and an average read and write volume of 2000TPS, using Tablestore to store Tablestore only costs less than 400 yuan/month. At the same time, Tablestore has extremely flexible service capabilities and completely zero operation and maintenance features, which can bring users a smoother experience.
If you have any questions about what is described in this article or want to know more about Table Storage, you can pin the search group number: "23307953". Free online expert services are provided in the group. Welcome to join.
Click here , you can view a table Memory details!
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。