From October 18th to 20th, the 12th China Database Technology Conference (DTCC 2021) was held at the Beijing International Convention Center. Huang Dongxu, co-founder and CTO of PingCAP, delivered a talk at the main venue titled "TiDB Cloud: from Product to Platform", discussing why database products must become platforms in the cloud-native era and sharing TiDB's experience going from DB to DBaaS. The following is a transcript of the talk.

In the database industry's recent development, a more prominent question has emerged than the engineering problem of "whether the code is well written": one thing has profoundly changed the entire industry, namely that the foundation underneath the database has changed.
In the past, when we thought about database software or any system software, everyone started from an assumption: the software runs on concrete hardware, a computer. Even for a distributed database, each node was still an ordinary computer. That assumption has now changed. By the time the next generation is old enough to learn programming, they will no longer see CPUs, disks, and networks the way we do; what they see may be an S3 API provided by AWS. This change is not just a change in the carrier that software runs on; more importantly, it is a change in the underlying logic of architecture and programming.
The cloud's impact on infrastructure and software is profound. At PingCAP, the most striking sign is that our investment in the TiDB Cloud service may now be far larger than our investment in the database kernel itself. That is also the topic I want to share today: From Product to Platform, from DB to DBaaS, the present and future of database technology.

Why PingCAP was founded


The picture above shows my understanding of how databases have evolved. More than ten years ago, we started out with standalone MySQL; at that time, simple CRUD operations were all we needed. From around 2010 onward, the explosion of data made standalone databases unsustainable, and the only options were sharding (splitting databases and tables) or middleware to achieve distributed deployment.

However, sharding is too intrusive to the business. Could there be a database that is as easy to use as standalone MySQL, yet requires no sharding decisions when scaling out, achieving elasticity through the mechanisms of the system itself and expanding without intruding on the business? That question was PingCAP's starting point.
In the more than six years since PingCAP was founded, we have accumulated several lessons while pursuing this goal:

Ease of use first: the protocol matters more than the implementation

The MySQL protocol is more important than any specific MySQL implementation. If a database is compatible with the MySQL protocol, it can reach the largest possible user group, and users do not have to worry about the impact on their applications and business when selecting it. We do not need to invent a new way of using the database, just as electric cars are still driven with a steering wheel and a pedal, even though what sits underneath is completely different from a gasoline car.
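To make protocol compatibility concrete, here is a minimal sketch: an ordinary Go program using the standard MySQL driver pointed at a TiDB endpoint. The host, credentials, and database name are placeholders for illustration only.

```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql" // the ordinary MySQL driver; no TiDB-specific client is needed
)

func main() {
	// Placeholder DSN; any MySQL client or ORM would connect the same way.
	db, err := sql.Open("mysql", "user:password@tcp(tidb.example.com:4000)/test")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	var version string
	if err := db.QueryRow("SELECT VERSION()").Scan(&version); err != nil {
		log.Fatal(err)
	}
	fmt.Println("connected over the MySQL protocol, server reports:", version)
}
```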

User experience first

Performance indicators such as TPS and QPS matter, but user experience is the key to a database's success, so TiDB makes all of its technical decisions around user experience (usability matters). In my past experience, many Internet companies have to maintain a wide variety of databases, and every new database adds yet another data silo. So beyond meeting data processing needs, simplifying the technology stack may be the real pain point. Whether it is OLTP, OLAP, or HTAP, what TiDB hopes to do is make everyone's life easier.

Open source first

PingCAP has always adhered to an open source strategy and has benefited a lot from it.
From an ecosystem perspective, the open source development model accumulates users quickly. TiDB 1.0 was released in November 2017, and since then we know of more than 2,000 users and more than 1,500 contributors. In the CNCF contribution ranking, PingCAP ranks sixth worldwide.

From a technical point of view, open source accelerates product iteration. In this chart, the vertical axis is the amount of code, the horizontal axis is time, and the colored blocks represent the code written in each year. You can see that TiDB's code is essentially rewritten every year; almost no year's code looks the same as the previous year's. This pace of iteration is achieved through the open source community, an evolution speed that no single team or company building a database from scratch could match.

Why DBaaS

TiDB's product capabilities are not the focus of today's talk. What I want to discuss is how important it is to turn a product into a cloud service. Let me state the conclusion up front: in this era, for CIOs, and especially for overseas customers, a database product's fit with the cloud has become a must.

We are standing at a turning point. Technically, databases have been evolving from Standalone to Cloud-Native, and we are now at the second red line, the boundary between Shared-Nothing and Cloud-Native. From a business perspective, the business model of the entire database and infrastructure software industry is also changing dramatically: in the past, vendors sold licenses for on-premises deployment; now they pursue scale, which is the shift from On-Prem to DBaaS.
MongoDB, a company that has successfully commercialized its database, has taken a very representative path. Its market capitalization has been roughly doubling every year and has now exceeded 30 billion US dollars. Its financial reports show that the DBaaS product MongoDB Atlas has maintained a compound annual growth rate of over 100%; that is the value of cloud services.

The road to platformization of TiDB on the cloud

In the past two years, we have also redefined our vision and mission: let developers all over the world enjoy our service, Anywhere with Any Scale.
To achieve this goal, going from DB to DBaaS is a must: only services on the cloud can break through geographic limits and provide virtually unlimited computing power. But going from DB to DBaaS involves far more than swapping the underlying resources for cloud ones. Technically, it requires cost reduction and efficiency, automated operations, and multi-tenant management; on the compliance side, data security must be addressed; on the business side, pricing models, commercialization strategies, and more all need to be worked out. Next, I will talk about TiDB's work on DBaaS from a technical perspective.

Cost savings: disaggregated architecture design

What cloud-native technology ultimately solves is the problem of cost.

In the past, TiDB's architecture was a TiDB + TiKV coprocessing engine in which the boundary between compute and storage was blurred, making it hard to handle workloads with different ratios of compute to storage. In an on-premises deployment, if you need more storage capacity you have to add storage nodes, and because of how hardware is packaged, CPU and network bandwidth grow along with the disks, which wastes resources. This is a problem faced by all Shared-Nothing databases.

On the cloud, everything is different. Take AWS's block storage service EBS, especially the gp3 volume type: it can be attached to different machines while delivering the same IOPS at the same cost, with good performance and good cloud-native integration. To take advantage of gp3, we can push the compute/storage boundary downward: where TiKV used to be the storage layer, now most of TiDB and TiKV can act as compute units, which is far more flexible.
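As a rough illustration of why gp3 changes the calculus, the sketch below provisions an EBS volume whose IOPS and throughput are set independently of its size, using the AWS SDK for Go v2. The availability zone and the numbers are arbitrary example values, not our actual configuration.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/ec2"
	"github.com/aws/aws-sdk-go-v2/service/ec2/types"
)

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}
	client := ec2.NewFromConfig(cfg)

	// With gp3, IOPS and throughput are provisioned independently of capacity,
	// so storage can grow without dragging other resources along with it.
	out, err := client.CreateVolume(ctx, &ec2.CreateVolumeInput{
		AvailabilityZone: aws.String("us-west-2a"), // example AZ
		VolumeType:       types.VolumeTypeGp3,
		Size:             aws.Int32(500),  // GiB
		Iops:             aws.Int32(6000), // independent of size
		Throughput:       aws.Int32(400),  // MiB/s, also independent of size
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("created volume:", aws.ToString(out.VolumeId))
}
```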

The cost savings from the cloud don't stop there. The truly expensive resource on the cloud is CPU; the bottleneck is compute, not capacity. Clusters and instances can be optimized around shared resource pools (Spot instances and clusters built on shared pools), storage service types can be chosen on demand, different EC2 instance types can be combined for specific scenarios, and serverless, elastic computing resources all become possible.
Beyond compute and storage, my judgment is that the network, memory, and even CPU caches will eventually be disaggregated as well, because different applications, and especially distributed programs, need hardware resources in different proportions. Whatever the business is, it is like cooking: with a single ingredient you cannot make anything interesting, but with many raw ingredients you can combine them to your own taste. The cloud brings exactly this opportunity.

Security

Beyond cost, security on the cloud is another important issue. The public clouds officially supported by TiDB are AWS and GCP. On the cloud, each user's network lives in their own VPC, connected to ours via VPC peering, so we cannot see user data while users can still access their own services with high performance. How do we ensure security?

The picture shows TiDB's security architecture. Security thinking on the cloud is completely different from off the cloud. A simple example: off the cloud, you only need to think about the database's internal RBAC permissions, but on the cloud it is far more complicated, and you need a whole system that keeps users safe all the way from the network down to storage. The key to cloud security is to never reinvent things yourself, because home-grown mechanisms almost always have vulnerabilities. So we now make full use of the security mechanisms the cloud itself provides, such as key management and access rules. The nice part is that these services come with clearly marked prices, so you only need to work them into the billing model.
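As an illustration of leaning on the cloud's own key management rather than rolling our own, here is a minimal sketch using the AWS KMS API. The key alias and the idea of encrypting a backup credential are assumptions for the example, not TiDB Cloud's actual implementation.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/kms"
)

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}
	client := kms.NewFromConfig(cfg)

	// Encrypt a secret with a cloud-managed key instead of a home-grown scheme.
	// "alias/tidb-cloud-demo" is a hypothetical key alias used only for illustration.
	out, err := client.Encrypt(ctx, &kms.EncryptInput{
		KeyId:     aws.String("alias/tidb-cloud-demo"),
		Plaintext: []byte("backup-credential"),
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("ciphertext is %d bytes\n", len(out.CiphertextBlob))
}
```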

Operation and maintenance automation

Another important part of building a DBaaS, and one that is also related to cost, is operations automation. The cloud is a business of scale, and one of the most troublesome parts of the domestic database business today is delivery: for a big client you might wish you could send twenty people on site, but is that sustainable? What we want is a 10-person delivery team able to support 1,000 customers; that is a prerequisite for scale.

These are the technology choices behind TiDB's own cloud service: deployment on the cloud via Kubernetes, federated management of multiple Kubernetes clusters via Gardener, and Pulumi as the infrastructure-as-code automation tool.

Kubernetes

Turning TiDB into a cloud service takes several steps. The first is to turn manual operations into code. When TiDB needs to scale out, can the system scale itself instead of a human doing it? When TiDB needs failure recovery, can machines handle it without human involvement? We have turned all of TiDB's operations into a Kubernetes Operator, which effectively gives TiDB automated operations. Kubernetes also shields us from the interface differences between cloud vendors, since every cloud vendor offers a managed Kubernetes service.
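The real TiDB Operator is open source and far richer than this, but a minimal sketch of the reconcile-loop idea, written against controller-runtime with a plain StatefulSet standing in for the actual custom resource, looks roughly like the following.

```go
package controller

import (
	"context"

	appsv1 "k8s.io/api/apps/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// StoreReconciler is a simplified stand-in for an operator's controller.
type StoreReconciler struct {
	client.Client
	DesiredReplicas int32 // in a real operator this would come from the custom resource spec
}

func (r *StoreReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var sts appsv1.StatefulSet
	if err := r.Get(ctx, req.NamespacedName, &sts); err != nil {
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Converge observed state toward desired state: scale out or in automatically
	// when the replica count drifts, instead of a person running commands by hand.
	if sts.Spec.Replicas == nil || *sts.Spec.Replicas != r.DesiredReplicas {
		sts.Spec.Replicas = &r.DesiredReplicas
		if err := r.Update(ctx, &sts); err != nil {
			return ctrl.Result{}, err
		}
	}
	return ctrl.Result{}, nil
}
```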

Pulumi

As I just said, if the logic of deployment, operations, and scheduling lives in hand-written scripts, it is both unstable and unmaintainable. Our philosophy is that anything that can be turned into code should be codified rather than depending on people; even opening a server or buying a virtual machine is expressed as a Pulumi program.
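A minimal sketch of what "buying a virtual machine as code" looks like with Pulumi's Go SDK is shown below; the AMI ID, instance type, and name are placeholders, not our actual configuration.

```go
package main

import (
	"github.com/pulumi/pulumi-aws/sdk/v5/go/aws/ec2"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		// "Buying a virtual machine" expressed as code instead of a manual console click.
		_, err := ec2.NewInstance(ctx, "tidb-node", &ec2.InstanceArgs{
			Ami:          pulumi.String("ami-0123456789abcdef0"), // placeholder AMI ID
			InstanceType: pulumi.String("m5.xlarge"),
			Tags:         pulumi.StringMap{"Name": pulumi.String("tidb-node")},
		})
		return err
	})
}
```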

Gardener

Through Gardener's API, TiDB manages multiple Kubernetes clusters across different regions, and each Kubernetes cluster is partitioned into TiDB clusters for different tenants, forming one large multi-cloud, multi-region, multi-AZ system. The advantage of this architecture is that users can enable TiDB on demand with the cloud provider and in the geographic region where their application lives, while keeping a unified technology stack.

Commercial SLA

There are many things to consider in the SLA; this is work TiDB has done and is still doing.
TiDB has a large number of overseas customers, and their database needs differ greatly from domestic users': cross-data-center deployment is a hard requirement. Given the data security requirements of various countries, data transfer is heavily restricted, so compliant cross-data-center capabilities are very important for a database. Facing GDPR in Europe, for example, it saves a great deal of trouble if the data that must stay in Europe is kept there, and only data outside the scope of regulation ever leaves. I believe this capability will also become a hard requirement for Chinese vendors and customers, whether for manufacturers going overseas or for domestic compliance.
This capability is much easier to implement on the cloud. AWS, for example, is itself a multi-AZ, multi-region architecture, so there is no need to worry about the underlying layer: bringing up a few machines in another data center is a matter of a few clicks in the console, and the data flows. For a database that cannot be deployed on the cloud, handling global data distribution or global versus local transactions involves far more work.
TiDB is preparing for this now, and the feature will be released soon.
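Purely as an illustration of what keeping European data in Europe could look like, here is a sketch driven from Go against a placement-policy style SQL interface. The policy name, region labels, table, and exact syntax are assumptions modeled on TiDB's placement-rules-in-SQL design, which was not yet released at the time of this talk.

```go
package main

import (
	"database/sql"
	"log"

	_ "github.com/go-sql-driver/mysql"
)

func main() {
	// Placeholder DSN; the statements below assume a placement-policy style
	// SQL interface and EU region labels, used here only for illustration.
	db, err := sql.Open("mysql", "user:password@tcp(tidb.example.com:4000)/test")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	stmts := []string{
		// Pin a policy to European regions so regulated data never leaves them.
		`CREATE PLACEMENT POLICY stay_in_eu PRIMARY_REGION="eu-west-1" REGIONS="eu-west-1,eu-central-1"`,
		// Attach the policy to the table that holds GDPR-scoped data.
		`ALTER TABLE customer_profiles PLACEMENT POLICY=stay_in_eu`,
	}
	for _, s := range stmts {
		if _, err := db.Exec(s); err != nil {
			log.Fatal(err)
		}
	}
}
```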

To provide services on the cloud, technology matters, but compliance is a prerequisite. Ecosystem integration on the cloud follows one main thread: follow the data. Data's upstream, downstream, and governance are the three most important points. TiDB's upstream is MySQL and data files in S3; downstream, it only needs to support synchronization to Kafka or other message queue services. For data governance, cloud users, especially overseas, would rather integrate with platforms like Datadog and Confluent than have the database vendor do all the governance itself.
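To make the downstream point concrete, here is an assumed sketch of an application consuming change events that a TiDB cluster streams into Kafka, using the segmentio/kafka-go client. The broker address, topic name, and consumer group are placeholders for illustration.

```go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/segmentio/kafka-go"
)

func main() {
	// Placeholder broker/topic/group; the topic is assumed to carry change
	// events replicated out of TiDB into Kafka.
	r := kafka.NewReader(kafka.ReaderConfig{
		Brokers: []string{"kafka.example.com:9092"},
		Topic:   "tidb-changefeed",
		GroupID: "downstream-consumer",
	})
	defer r.Close()

	for {
		msg, err := r.ReadMessage(context.Background())
		if err != nil {
			log.Fatal(err)
		}
		fmt.Printf("offset %d: %s\n", msg.Offset, string(msg.Value))
	}
}
```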
Finally, a quick advertisement: in Q4, TiDB will launch a 12-month free trial tier for developers. It deploys quickly, supports HTAP by default, isolates compute through containers, and comes with dedicated block storage, so you can use it freely on the cloud. Our website is tidbcloud.com, and we will support domestic clouds in the future as well. We look forward to your trying it out and giving us feedback.
I hope PingCAP can truly deliver on it: let developers all over the world enjoy our service, Anywhere with Any Scale.

