Author | Li Xin

In the past ten years, Internet technology has developed rapidly, and more and more industries have joined the Internet matrix, which has brought more abundant and complex business scenario requirements, which is undoubtedly a huge challenge to the performance of data application systems. .​

The relational database MySQL is the most widely used database product in application systems, with powerful data query and transaction processing capabilities. In today's era of cloud, applications to be built gradually evolved into cloud-based native Serverless architecture, because it has low cost, high elasticity advantages . However, MySQL-based data storage still has some obvious deficiencies under the Serverless architecture:

  1. Poor elastic scalability. An important feature of serverless scenarios is that the application load has significant peaks and valleys. When faced with traffic peaks, DBAs need to manually expand the cluster to avoid the cluster being exploded; and when traffic is low, the cluster needs to be scaled down to save costs.
  2. Operation and maintenance complexity is high. To build MySQL, you need to purchase clusters, install services, and manage connections. After the business is launched, it is necessary to pay attention to data security, service availability, response time, etc. The proportion of time spent on cluster operation and maintenance will become higher, and more energy cannot be devoted to business research and development.
  3. high cost. Usually, the DBA needs to estimate the business scale to set the initial database capacity in advance. When the business request volume does not reach the estimated value, the resources in the cluster will remain idle all the time, resulting in a waste of resources.​

    Serverless DataBase

The relational model supported by MySQL and its strong transaction characteristics make it a very important position in the application system, and it is one of the storage components that cannot be completely replaced at present. However, blindly relying on MySQL will make the application system unable to be completely serverless and unable to enjoy the extreme flexibility brought by serverless.

In Alibaba, we have some new architectural practices. Those data that require strong transaction processing still use relational table storage, while for non-strong transaction table data storage, we have designed a serverless table storage with extreme flexibility.

Regarding serverless database products, our design requirements must have the following characteristics:

  • fully elastic. can automatically expand and shrink the capacity according to the application load. This feature can bring users a more economical billing model and a smoother experience.
  • as-you-go. The usage cost of serverless database mainly comes from computing cost and storage cost. Users only need to pay for the storage units and response units actually generated by the business, saving costs.
  • Zero operation and maintenance. be used out of the box, without the need to manage capacity, water level, software upgrade, kernel optimization and other operation and maintenance matters, so that R&D can really focus on business development.

Serverless architecture has a wide range of application practices in many business scenarios. For example, in its core e-commerce business, Century Lianhua Group, in view of the difficulties encountered in the self-built IDC room, such as resource budget difficulties and system deployment difficulties, has fully implemented the cloud and gradually transformed its business into a middle-office model with a full serverless architecture. .

Century Lianhua Group adopted the function computing + API gateway + Tablestore solution to easily support the promotion activities such as 6.18 and Double 11. Among them, Tablestore, as the core storage in the cloud serverlesss architecture of Century Lianhua e-commerce system, has the advantages of extreme flexibility, free operation and maintenance, and low cost.

Table Store Introduction to Tablestore

Tablestore Tablestore was established as a research and development project at the beginning of Alibaba Cloud in 2009. It was built from scratch based on the underlying Apsara platform. It is a multi-model, multi-engine serverless table store. It has exported more than 30 regions at home and abroad on the public cloud, with a scale of 15,000 servers and a storage scale of 200PB. It is the underlying core storage for many commercial products of Alibaba Cloud.

At the same time, it has been exported offline to finance, energy, power, logistics, medical care, government and enterprises and other industries, serving 1000+ enterprise customers and 500+ offline projects in public cloud.

Tablestore Tablestore has the fusion function of HBase and ElasticSearch. It has the characteristics of extreme elastic experience, free operation and maintenance, and out-of-the-box use. It supports elastic storage from GB to PB and non-perceptual expansion of 100,000-level TPS service capabilities. supports massive table data and provides rich data retrieval and analysis capabilities. It is a one-stop structured data storage platform that integrates storage, search and analysis functions.

The overall architecture of Tablestore is shown in the following figure:


Tablestore Architecture Diagram

Table Store provides a variety of data models, including the wide table model (Widecolumn), the message model (Timeline), and the time series model (Timeseries).

  • The wide table model mainly carries table structure data storage, such as e-commerce order data.
  • The message model mainly carries the message data store, such as IM/Feeds messages.
  • The time series model mainly carries time series data storage, such as the time series data of IoT devices.

Next, we will take the e-commerce order scenario as an example to show you how to build a serverless order storage system based on the wide table model of Tablestore.​

Tablestore experience

1. Preparation

Before experiencing the extreme flexibility brought by Tablestore, you need to prepare the following steps:

(1) Create an Alibaba Cloud account and obtain the AK of the Alibaba Cloud account. _ (The cloud account AK is the key to access all cloud services including Tablestore, and you need to use AK to access the Tablestore service later). _

(2) Download and start the command-line tool Tablestore CLI provided by Tablestore. The command-line tool provides some simple commands to manage the Tablestore service.

First configure the connection key through the config command and enable the Tablestore service through the enable_service command:

config --id accessKeyID --key accessKeySecretenable_service

(3) Create an instance through the create_instance command:

create_instance -d "order storage" -n serverless-db -r cn-hangzhou

An instance is equivalent to the concept of a MySQL database. After an instance is created, you don't need to worry about the water level of the physical machine cluster where the instance is located. You only need to focus on developing business logic. At the same time, the reading and writing and storage on the instance are billed according to the amount. If there is no reading and writing and no storage, there will be no actual charges.

So far, a Serverless DataBase that can support GB to PB storage, has no concurrency limit, zero operation and maintenance, and is fully elastic has been created.

2. Create table

Widecolumn is a Schema-free data table. Different from relational database MySQL, creating a data table of Widecolumn model only needs to define the primary key structure, and does not need to define the attribute column structure.

For example, the table structure of an order table order is shown in the following figure:

|

id(PrimaryKey)cNamepTypesIdtotal_Price......
type of dataSTRINGSTRINGSTRINGSTRINGDOUBLE......
Business Implicationsorder idconsumer nameOrder item typeSalesperson IDThe total amount of orders......

Create an order table with a wide table model. There is no need to define the attribute column information, just define the primary key id of the order table. The command is as follows:

create_instance -d "order storage" -n serverless-db -r cn-hangzhou

After executing the create command, an order-wide table is successfully created, and the newly-created order-wide table will be initialized with one data partition.

As the amount of order data increases or the number of visits increases, the wide table model will split and expand into multiple data partitions and evenly distribute them to multiple physical machines according to the distribution range of the first primary key _ (the order ID in the above data model) _ To support larger data scale_(TB or even PB)_and read/write throughput_(above 100,000 TPS)_, the entire expansion process is completely automated by the server without manual intervention.

3. Data import

The simulation generated 1 million sample order data, and imported them into the order table in batches through the import command. The write speed of a single data partition can reach tens of thousands of rows/s, and the write throughput can be further improved as the partition expands.

import -i orderDataFile -l 1000000
Current speed is: 10000 rows/s. Total succeed count 10000, failed count 0.Current speed is: 12600 rows/s. Total succeed count 22600, failed count 0.......Current speed is: 9200 rows/s. Total succeed count 1000000, failed count 0.Import finished, total count is 1000000, failed 0 rows.
4. Order inquiry

Use the get command to query the wide table model according to the order number (id) in a single row, and get an order data. The get command can only query a single row based on rowKey.

An example of querying an order:
id = "0000005be2b43dd134eae18ebe079774"

get --pk '["0000005be2b43dd134eae18ebe079774"]'
+--------------------------------------+-------+------ --+---------+-------------+---------------+------- -+------------+----------+--------+---------+-------+- ------+--------+------------+| order_id | cId | cName | hasPaid | oId | orderTime | pBrand | pCount | pId | pName | pPrice | pType | sId | sName | totalPrice |+------------------------------------------------+---- ---+--------+---------+-------------+------------- --+------------+--------+----------+--------+--------- +-------+-------+--------+------------+| 0000005be2b43dd134eae18ebe079774 | c0015 | Eliminate Friday | false | o0035062633 | 1507519847532 | Xiaomi | 3 | p0005003 | Xiaomi 6 | 2299.21 | ----------+-------+------------+---------+------------ -+---------------+--------+--------+----------+--- -----+---------+-------+-------+--------+--------- ---+
5. Order retrieval and statistics

Order scenarios often rely on multi-condition combination filtering. In this case, it is necessary to rely on the multi-index feature of Tablestore. multi-index is an Elasticsearch-like table data index provided by Tablestore, which supports rich query methods and data aggregation capabilities, and can create indexes on multiple columns respectively. difference between 161fb4538340f1 and the joint index in MySQL is that the multiple index can be queried based on any combination of fields, and will not be matched according to the leftmost prefix of multiple columns.

For example, we build indexes on fields such as id, pName, and totalPrice, and use data structures such as inverted index, word segmentation, and BKDTree to provide query capabilities such as precise query, full-text search, and range query. In addition, multivariate indexes also support grouping by field, sorting by multiple fields, and statistical aggregation capabilities.

Use the create_search_index command to create multiple indexes on wide tables to speed up queries.

create_search_index -t order -n order_index{ "IndexSetting": null, "FieldSchemas": [{ "FieldName": "id", "FieldType": "KEYWORD", "Index": true, "EnableSortAndAgg": true, "Store ": true },{ "FieldName": "pName", "FieldType": "TEXT", "Index": true, "EnableSortAndAgg": false, "Store": true },{ "FieldName": "totalPrice", "FieldType": "DOUBLE", "Index": true, "EnableSortAndAgg": true, "Store": true } ..._//Other fields_ ] }​

Tablestore supports SQL query capability, is compatible with MySQL query syntax, and keeps the usage habits of relational databases as far as possible. SQL can automatically select indexes and perform query acceleration. Through multi-index query acceleration, it also has the ability to delay query in milliseconds under the scale of tens of billions of data.

Order retrieval based on the three field conditions of sName, pBrand and pName:

select * from order where sName = "Sale Friday" and pBrand = "Xiaomi" and pName like "Redmi%" limit 3;
+--------------------------------------+-------+------ --+---------+-------------+---------------+------- -+------------+----------+---------+--------+-------+- --------------+-------+--------+------------+| id | cId | cName | hasPaid | oId | orderTime | pBrand | pCount | pId | pName | pPrice | pType | payTime | sId | sName | totalPrice |+--------------------------------- -------------+-------+--------+---------+--------- ----+---------------+--------+--------+----------+ ---------+--------+-------+---------------+------- +------------+------------+| 00001c760c04126da067e90409467c4e | c0022 | Xiao Zhaoyi | true | o0009999792 | 1494976931954 | Xiaomi | 3 | p0005004 | Mobile | 1494977189780 | s0005 | Sale Friday | 1497.03 |+------------------------------------------------+-- -----+------------+---------+-------------+------------ ----+--------+--------+----------+---------+------ --+-------+---------------+-------+--------+------ ------+| 0000d89f46952ac03da71a33c8e83eef | c0012 | Money 2 | false | o0024862442 | 1502415559707 | Xiaomi | 2 | p0005004 | .01 | Mobile | null | s0015 | Sale Friday | 998.02 |+------------------------------------------------ +-------+--------+---------+-------------+-------- -------+--------+--------+----------+---------+--- -----+-------+---------------+-------+--------+--- ---------+| 0000f560b62779285e86947f8e8d0e4c | c0008 | Xiao Feng Ba | false | o0000826505 | 1490386088808 | Xiaomi | 1 | p0005004 | ----------------------------------+-------+------- -+---------+-------------+---------------+-------- +--------+----------+---------+--------+-------+-- -------------+-------+--------+------------+

Count the number of orders for each product brand in all orders:

select pBrand,count(*) from order group bypBrand;
+-------+------------+| pBrand | count(*) |+--------+----------+ | vivo | 162539 |+--------+----------+| Lenovo| 304252 |+--------+--------- -+| oppo | 242513 |+--------+----------+| apple| 96153 |+--------+------- ---+| Xiaomi| 194543 |+--------+------------+

Summary

Table Store Tablestore, as a widely used Serverless DataBase, provides an economical billing model and can greatly reduce business costs. above order scenario as an example, with a data volume of 100 million orders and an average read and write volume of 2000 TPS, the cost of using Tablestore is less than 400 yuan/month. At the same time, Tablestore has the ultimate elastic service capability and zero operation and maintenance features, which can bring users a smoother user experience.

described in this article or want to know more about Table Store, you can search for the group number: "23307953". The group provides free online expert services, welcome to join.

For more content, pay attention to the Serverless WeChat official account (ID: serverlessdevs), which brings together the most comprehensive content of serverless technology, regularly holds serverless events, live broadcasts, and user best practices.


Serverless
69 声望265 粉丝