On August 10th, the 2022 OceanBase annual conference was held simultaneously in Beijing, Shanghai and Shenzhen. Yang Zhenkun, founder and chief scientist of OceanBase, delivered the keynote speech "Interpretation of OceanBase 4.0 Core Technology" and shared the core technological breakthroughs of 4.0: the industry's first stand-alone and distributed integrated database, with RTO reduced from 30 seconds to 8 seconds, entering the era of true second-level disaster recovery.

At the press conference, Yang Zhenkun traced OceanBase's path from the 0.1 release to today's 4.0 release and explained the thinking behind 12 years of technological evolution. "From the first version of OceanBase in 2010 to today, more than 12 years have passed, and we have developed many versions. It can be said that each version is an iteration and upgrade of the entire OceanBase dream," Yang Zhenkun said.

The 0.1 era: quasi-distributed <br>Baseline + increment, centralized writes + distributed reads

The 0.5 era: high availability <br>Three-replica Paxos officially enabled

The 1.0 era: crossing the distributed threshold <br>Solving distributed transactions

The 2.0 era: native distributed <br>Horizontal scaling, storage compression, topping the TPC-C list twice

The 3.0 era: Oracle compatibility + open source <br>Stepping out of Ant Group

The 4.0 era: stand-alone and distributed integration <br>The industry's first stand-alone and distributed integrated database, RTO<8s

The following is the transcript of the speech:

From 0.1 to 4.0, the thinking behind twelve years of technological evolution

In the 0.1 era of OceanBase, we had to solve the problem of Taobao Favorites, which was under huge business pressure at the time. Technically, we adopted two strategies in response.

The first strategy was to use a data structure that had never been used before: divide the data into a baseline and modification increments, with the baseline placed on disk and the increments placed in memory. Many people questioned this solution, because no one had ever done this in a relational database, but to this day it is the only way we have found to solve the Favorites problem.

Looking back today, this technical solution did bring us a lot of challenges. A traditional database works on fixed-length pages, and execution only needs to target that one kind of object. In OceanBase, the baseline consists of fixed-length data blocks, while the modifications held in memory are variable-length, and every read must fuse the two. Although this made evaluation in our optimizer much more difficult, it also brought us great benefits.
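The "baseline plus increment" read path described above can be sketched in a few lines. This is only a toy illustration under stated assumptions, not OceanBase's actual storage engine: the class name `BaselinePlusIncrement`, the dictionary standing in for on-disk data blocks, and the tombstone used for deletes are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical tombstone: marks a row that was deleted in the in-memory increment.
_DELETED = object()


class BaselinePlusIncrement:
    """Toy table: a read-only baseline plus a mutable in-memory increment."""

    def __init__(self, baseline: Dict[str, dict]) -> None:
        # Stands in for the immutable baseline kept on disk.
        self._baseline = dict(baseline)
        # All modifications land here and never touch the baseline.
        self._increment: Dict[str, object] = {}

    def put(self, key: str, row: dict) -> None:
        """Record an insert or update in memory only."""
        self._increment[key] = dict(row)

    def delete(self, key: str) -> None:
        """Record a delete in memory only."""
        self._increment[key] = _DELETED

    def get(self, key: str) -> Optional[dict]:
        """Every read fuses the two layers: the increment wins over the baseline."""
        if key in self._increment:
            row = self._increment[key]
            return None if row is _DELETED else dict(row)
        row = self._baseline.get(key)
        return None if row is None else dict(row)


table = BaselinePlusIncrement({"item-1": {"title": "old"}, "item-2": {"title": "keep"}})
table.put("item-1", {"title": "new"})   # update is visible immediately
table.delete("item-2")                  # so is the delete
print(table.get("item-1"), table.get("item-2"))
```

Because writes only touch the in-memory layer, an update is visible to the very next read even though the baseline itself is never modified in place.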

This matters especially today. Everyone used to think that running transactions and analysis on one system was impossible: column stores in all systems are updated in batches, while transactional databases are updated in real time. OceanBase unifies the two. Both can be updated in real time because we have memory-based increments.

The second strategy was a semi-distributed structure. At that time, the team had just been formed and growing it would take time, but the business could not wait long. Therefore, we decided to start with one central write node plus multiple read nodes: reads were distributed and writes were centralized.

In 2013, the Alipay transaction database and other core systems decided to upgrade from IOE to OceanBase. The biggest challenge at that time was high availability of both the data and the system. To this end, we decided to adopt a three-replica architecture for the first time in a relational database, requiring agreement from two of the three replicas. Once two replicas are consistent, the data as a whole is consistent.
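The two-out-of-three rule can be illustrated with a small sketch. It shows only majority acknowledgement, not the full Paxos protocol and not OceanBase's implementation; the `Replica` class and the `replicate` function are hypothetical names introduced for illustration.

```python
from typing import List


class Replica:
    """Hypothetical replica holding an append-only log."""

    def __init__(self, name: str, up: bool = True) -> None:
        self.name = name
        self.up = up
        self.log: List[str] = []

    def append(self, entry: str) -> bool:
        """Persist the entry and acknowledge it, unless the replica is down."""
        if not self.up:
            return False
        self.log.append(entry)
        return True


def replicate(replicas: List[Replica], entry: str) -> bool:
    """Return True once a majority of replicas acknowledged the entry."""
    acks = sum(1 for r in replicas if r.append(entry))
    majority = len(replicas) // 2 + 1   # 2 when there are 3 replicas
    return acks >= majority


replicas = [Replica("r1"), Replica("r2"), Replica("r3", up=False)]
print(replicate(replicas, "tx-1"))   # one replica down: still a majority

replicas[1].up = False
print(replicate(replicas, "tx-2"))   # two replicas down: no majority
```

With three replicas, any single failure still leaves a majority, which is why a committed write survives the loss of one replica.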

It was in that year that we proposed RPO=0 for the first time in the relational database industry, something traditional centralized databases had never achieved. It was also the first time that RTO<30s was proposed and achieved in the entire database industry. Today, RTO<30s has become the standard in the Chinese database industry.

In 2014, when the Alipay transaction database went live, we started developing 1.0, because OceanBase 0.5 still kept a single central write node, and as the business grew this bottleneck became more and more prominent. Therefore, in the two full years from 2014 to 2016, most of our resources were invested in turning "semi-distributed" into "fully distributed", so that every node could accept writes. This was a big challenge, but once it was solved, we gradually migrated Alipay's accounting database to OceanBase after 2016 and truly achieved multi-point writes in a relational database.

The 2.0 era was a further step forward after OceanBase crossed the distributed threshold. 1.0 was our first truly distributed version, and among the earliest in relational databases worldwide, but it had many shortcomings in performance and functionality, so a lot of refinement work went into 2.0. With these improvements, we reached the top of the global database performance benchmark for the first time. Our first, small-scale test achieved 60 million tpmC, twice Oracle's result at that time. A few months later, a larger-scale test achieved 700 million tpmC. To this day, OceanBase holds first and second place on the TPC-C list.

After 2.0, we gradually began to step out of Alibaba and Ant Group and to provide database services to society at large. In version 3.0, we took part in the TPC-H benchmark and also won first place, with 15.26 million QphH.

Earlier versions were optimized mainly for Ant Group scenarios and ran on very highly configured servers. After stepping out, we found that many of the scenarios we encountered were not like this, and we also met a lot of analytical requirements. Therefore, on the one hand, we worked on miniaturization, laying the foundation for today's 4.0; on the other hand, we have been gradually strengthening our analytical capabilities to meet the needs of more businesses and more customers.

OceanBase 4.0 Xiaoyu is born, and the era of stand-alone and distributed integration is coming

Twelve years of growth, training and accumulation have produced today's OceanBase 4.0 Xiaoyu. As Yang Bing, Han Fusheng and others have shown, we have made many new breakthroughs in 4.0.

Today I will show you two aspects. The first is our stand-alone and distributed integrated architecture. Everyone knows that OceanBase is a distributed system that can support many large-scale, high-concurrency businesses. Building on the efforts of the past few years, we can now support businesses of every size, from small to medium to large. As far as we know, today's OceanBase 4.0 is the first in the industry to integrate stand-alone and distributed deployment. Not only a Raspberry Pi but also ordinary PCs will be able to run OceanBase 4.0 smoothly.

Second, we have gone further on a goal we proposed earlier. At the beginning of 2014, we proposed RPO=0 for the first time in the relational database industry, meaning no data loss, with the business recovering from failure automatically within 30 seconds. To improve on this, we have now achieved recovery within 8 seconds for failures from the single-machine level up to the machine-room level. In the early days, changing the tires on an F1 car took a few minutes, which is a bit like our databases: in past years, everyone's recovery time was minutes or even half an hour. We want to provide better availability to the business.

MySQL compatibility is opened up further, and the Community and Enterprise editions have the same performance

The last topic is open source. With the release of OceanBase 4.0, the Enterprise and Community editions will have the same performance. Except for a few management and security capabilities, all other MySQL-compatible capabilities will be open-sourced. We hope that OceanBase can provide greater impetus to the whole of society and the entire industry.

Over the past 12 years, many OceanBase colleagues have kept striving forward in databases, the most fundamental of industries, driven by their love and their dreams. Just now a guest asked us what the main difference is between us and other databases. I said the main difference is that every line of code in the OceanBase database was typed out, line by line, with their passion and their dreams. That is why we are able to solve the various problems you encounter when using it today. Besides this passion and these dreams, we also have the support of so many partners, customers and users. I believe that OceanBase will see even greater development in the future. Thank you all!

