After nearly 12 hours of lively live streaming, the DC 2021 Distributed Database Developer Conference came to a successful close at 21:00 on January 6. Themed "Digital Gathering in the Future", the conference was held under the guidance of the China Electronics Standardization Institute, hosted by CSDN, organized by OceanBase, and co-organized by the Mulan Open Source Community, Open Source China, 51CTO, SegmentFault (Sifu), Geekbang Technology, and Juejin (Rare Earth Nuggets).

The conference was opened by Ms. Yang Liyun, Director of the Research Office of the China Electronics Standardization Institute. Michael "Monty" Widenius, the father of MySQL and founder of MariaDB, and Bruce Momjian, co-founder of the PostgreSQL Global Development Group, were specially invited to share in-depth industry analysis. OceanBase founder Yang Zhenkun, CEO Yang Bing, and CTO Yang Chuanhui, SequoiaDB (Jushan) Chief Architect and Vice President of R&D Chen Yuanxi, PingCAP Vice President Liu Song, Tencent distributed database TDSQL Chief Architect Li Haixiang, Huawei Cloud Database Chief Architect Feng Ke, and many other distinguished guests also joined the live broadcast, serving developers a technical "feast" in the field of distributed databases.

The conference was packed with substantive content, and its guest lineup may well rank first in 2022. To help readers appreciate the highlights of this developer conference, the editor has selected 8 keywords from it to share with everyone.

Distributed - Keyword 1

Yang Liyun, Director of the Research Office of China Electronics Standardization Institute: Against the background of the rapid development of new application scenarios such as the Internet in China, distributed databases capable of large-scale horizontal scaling have matured, and they do not lag behind the world's leading products. New database types such as distributed and cloud databases do not carry the legacy burden of the traditional installed-base market, so they have sprung up in China in recent years. Driven by national science and technology development, and having grown rapidly in large-scale Internet scenarios, distributed databases are now moving into a broader market: in enterprise scenarios such as finance, communications, government affairs, and the Internet of Things, they are taking on innovative business workloads and gradually entering core systems.

Objectively speaking, distributed databases still trail traditional centralized databases in product maturity and technology adoption. So even as they develop rapidly, distributed databases are continuously responding to challenges and polishing their products. I believe that, under the national science and technology development strategy and with the in-depth application of cloud computing and AI, China's distributed database software will meet the needs of digital development and achieve rapid innovation and growth.

Liu Song, Vice President of PingCAP: A distributed database combines database technology with distributed architecture. The new generation of distributed databases therefore offers not only the online transaction and online analysis capabilities of classic databases, but also the high scalability and automated operation and maintenance of the new generation of distributed architecture, including the ability to support the new generation of cloud-native workloads.

Feng Ke, Chief Architect of Huawei Cloud Database: On the road to building root-technology capabilities, there are six key technical directions for distributed databases: global multi-active deployment and high availability, deep software-hardware collaboration, enterprise-grade mixed workloads, cloud native, data security and trustworthiness, and AI-native.

OceanBase CTO Yang Chuanhui: We have been believers in and pioneers of native distributed databases for 11 years. I think the core features of a native distributed database are unlimited scaling, being always online, and support for mixed TP and AP workloads in a single engine with strong consistency guaranteed.

OceanBase's native distributed database has gone through three technical iterations. The first generation was the earliest NoSQL system. The second generation was assembled like building blocks: SQL support was layered on top of NoSQL to provide basic SQL functions, but often at the expense of single-machine performance and cost. The current third generation, a native distributed database that pursues the extreme, supports complete enterprise-level functions, and its single-machine performance is basically on par with that of a centralized database.

Open Source, Ecology - Keyword 2

Bruce Momjian, co-founder of PostgreSQL Global Development Group: He believes that open source is a great opportunity for developers around the world. In an open source environment, developers' work can be recognized on a global scale, and they gain opportunities to speak at international conferences. On the development of distributed data, he believes that as the market matures and its value becomes apparent, more and more people will turn their attention to it; practitioners should invest more in innovation while safeguarding the overall health of the project, so as to truly lead the market.

Liu Song, Vice President of PingCAP: The trend toward open source distributed databases is unstoppable. The biggest mission of databases in the future is to digitize all walks of life, which is also the biggest application requirement. The technological evolution driven by this demand relies on open source, which continuously supplies new technology engines. At the same time, serving enterprise customers requires a new generation of cloud infrastructure, especially cross-cloud, cloud-native infrastructure. Application requirements, open source, and cloud infrastructure form a triangle. The architecture of distributed databases has evolved within this triangle from the mobile Internet era to today, and it may well continue to do so over the next ten years.

Feng Ke, Chief Architect of Huawei Cloud Database: Distributed databases fit China's current stage of development; they are a new form of database born of the high-traffic applications driven by China's demographic dividend. A distributed database is like high-speed rail, and a single-machine database is like a car. Distributed development is complicated, and we cannot make high-speed rail as convenient and flexible as a car, but both lead to the same intelligent goal.

Cloud, Openness - Keyword 3

Jiang Tao, founder and chairman of CSDN and founding partner of Geekbang Venture Capital: One of the core values of distributed architecture is scalability, which our original technical architectures struggle to deliver. The second is high availability: whether in the cloud or in a hybrid cloud, multi-site, multi-center deployment has become the norm. So what lies at the heart of these core values? In Jiang Tao's view, it is openness, which every distributed database developer should keep in mind.

Liu Song, Vice President of PingCAP: We are entering the next era of distributed databases. The shift from the initial Internet demand to the digitalization demand at the top of the pyramid is one of the biggest forces driving society as a whole to pay attention to the distributed database industry. Today, many cloud databases do not necessarily meet the needs of high concurrency and high scalability, and cross-cloud issues remain unresolved, yet the new generation of cloud-native application scenarios has a very strong demand for distributed databases. The biggest mission of distributed databases in the future is to help thousands of industries achieve their digitalization goals.

Consistency - Keyword 4

Li Haixiang, Chief Architect of Tencent Distributed Database TDSQL: In his speech, he reviewed how data anomalies have been defined and generalized since database systems first appeared, and elaborated on the relationship between data anomalies, isolation levels, and consistency across the transaction processing field. By defining conflict relationships, constructing conflict graphs, mapping graphs to anomalies, and further classifying data anomalies, the TDSQL research team has established a systematic framework for studying data anomalies and given a preliminary description of concurrent access algorithms. Taking a directed cycle in the graph as an example of a data anomaly, the number of vertices and edges is unbounded, which means there are infinitely many data anomalies. How can we make sense of the infinite? By classifying data anomalies. The classification can be summarized in a table that covers all data anomalies. Once all data anomalies are distinguished, we can define what an isolation level is and what consistency is. Simply put, if a data anomaly exists, consistency is not satisfied; if consistency is satisfied, there are no data anomalies.
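
To make the idea of a conflict graph concrete, here is a minimal Python sketch of the textbook conflict-graph test: two operations conflict when they come from different transactions, touch the same item, and at least one is a write, and a cycle in the resulting directed graph signals an anomaly. This is an illustration under those assumptions only, not the TDSQL team's framework; the function names and the schedule format are made up for the example.

```python
from collections import defaultdict


def build_conflict_graph(schedule):
    """Build a directed conflict graph from a schedule.

    schedule is a list of (transaction, operation, item) tuples in execution
    order, where operation is "r" (read) or "w" (write). This format is a
    hypothetical one chosen for the sketch.
    """
    edges = defaultdict(set)
    for i, (t1, op1, x1) in enumerate(schedule):
        for t2, op2, x2 in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if t1 != t2 and x1 == x2 and "w" in (op1, op2):
                edges[t1].add(t2)  # t1's operation precedes t2's
    return edges


def has_cycle(edges):
    """Return True if the directed graph contains a cycle (depth-first search)."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = defaultdict(int)  # every node starts WHITE

    def visit(node):
        color[node] = GREY
        for nxt in edges.get(node, ()):
            if color[nxt] == GREY:
                return True  # back edge: a cycle exists
            if color[nxt] == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[n] == WHITE and visit(n) for n in list(edges))


# A lost update: both transactions read x before either writes it.
lost_update = [("T1", "r", "x"), ("T2", "r", "x"), ("T1", "w", "x"), ("T2", "w", "x")]
# A serial schedule: T1 finishes before T2 starts.
serial = [("T1", "r", "x"), ("T1", "w", "x"), ("T2", "r", "x"), ("T2", "w", "x")]

print(has_cycle(build_conflict_graph(lost_update)))
print(has_cycle(build_conflict_graph(serial)))
```

In the lost-update schedule the graph has edges in both directions between T1 and T2, so a cycle is found; the serial schedule only has an edge from T1 to T2, so there is none.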

HTAP Mixed Load - Keyword 5

Yang Zhenkun, founder and chief scientist of OceanBase: He believes that a distributed database is "one" horizontally scalable data store that performs both transaction processing and analytical processing. Having one database handle both OLTP and OLAP, that is, HTAP, is a big challenge because there is a huge difference between OLTP and OLAP, and that difference will always exist.

Yang Zhenkun listed four aspects of the challenge. The first is distributed transaction processing. Why must it be distributed? Because analytical processing involves a huge amount of data and computation, the entire system has to be distributed. The second is giving priority to transactions: large analytical queries consume a lot of CPU, memory, and I/O resources, which may leave small transactional queries unable to obtain the resources they need and cause them to time out while waiting. The third is storage: row storage is friendly to transaction processing and column storage is friendly to analytical processing, so an HTAP system needs both, that is, hybrid row-column storage. The fourth is performance evaluation of HTAP: today's benchmarks each measure a single kind of performance, either transaction processing or analytical processing, but HTAP requires both. Yang Zhenkun firmly believes that human wisdom is infinite. Some of these HTAP challenges have already been overcome, and in the near future all of them will be.
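
The row-versus-column point can be illustrated with a toy Python sketch. It only shows why the two layouts suit different access patterns; it says nothing about how OceanBase or any real engine stores data, and the table, column names, and helper functions are all hypothetical.

```python
# The same small table held in a row-oriented layout: one record per row.
rows = [
    {"id": 1, "customer": "a", "amount": 30},
    {"id": 2, "customer": "b", "amount": 50},
    {"id": 3, "customer": "a", "amount": 20},
]


def to_columns(rows):
    """Pivot a row-oriented table into a column-oriented one."""
    return {name: [r[name] for r in rows] for name in rows[0]}


def update_amount_row_store(rows, row_id, amount):
    """OLTP-style point update: locate one record and change it in place."""
    for r in rows:
        if r["id"] == row_id:
            r["amount"] = amount
            return True
    return False


def total_amount_column_store(columns):
    """OLAP-style aggregate: scan a single column and ignore the others."""
    return sum(columns["amount"])


columns = to_columns(rows)
print(total_amount_column_store(columns))
```

A point update touches one whole record, which the row layout keeps together, while the aggregate reads only the `amount` values, which the column layout keeps together. A hybrid row-column store aims to serve both patterns from one system.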

Integrated Architecture - Keyword 6

OceanBase CTO Yang Chuanhui: OceanBase is a representative native distributed database, and the core technology behind it is the integrated architecture. On the one hand, the native distributed architecture enjoys the unlimited scaling of distributed technology; on the other hand, it appears externally as a traditional database and is fully compatible with one. The integrated architecture thus combines the technical advantages of both distributed and centralized databases, while its bottom layer remains a native distributed architecture that fully enjoys the technical dividends of unlimited scaling and being always online.

In 2021, OceanBase achieved five core product and technology breakthroughs: overall performance from OLTP to HTAP, single-core cost performance, batch processing capability, smooth migration from Oracle, and ease of use. At this conference, Yang Chuanhui also officially announced OceanBase's new 3.X tool family (the operation and maintenance monitoring tool OCP, the developer tool ODC, and the migration and synchronization tools OMA and OMB) and released OceanBase Community Edition 3.1.2.

First Choice for Core Systems - Keyword 7

OceanBase CEO Yang Bing: Among the various database types, native distributed databases are leading the development of database management technology thanks to their many strengths. According to a report recently released by Gartner, native distributed databases, represented by OceanBase, offer high availability, scalability, multi-region, multi-deployment, mixed load, multi-tenancy, and transparent compatibility, and are becoming the first choice for enterprise core system upgrades. As a representative enterprise-level native distributed database, OceanBase has developed rapidly over the past year.

Yang Bing revealed at the conference that the number of OceanBase customers doubled to more than 400 in 2021. OceanBase continues to deepen its presence in core advantage scenarios such as finance: currently, a quarter of the country's top 200 financial institutions have chosen OceanBase as their first choice for core system upgrades. Among regional banks, insurers, securities firms, and fund companies, OceanBase has the largest market share in the industry. Beyond finance, OceanBase has spread into all walks of life and is used in important fields bearing on the national economy and people's livelihood, such as government affairs, energy, and communications.

According to Yang Bing, revenue from non-financial customers has reached 35% of OceanBase's total revenue and is growing rapidly. It is worth mentioning that, as its products continue to iterate, OceanBase's customer structure keeps improving, and a large number of small and medium-sized customers have begun to favor native distributed databases. At present, nearly 70% of OceanBase's customers are small and medium-sized customers.

"OceanBase's mission is to use technology to make the management and use of massive data easier. We believe in long-termism and adhere to the 'product-driven growth' business model. We look forward to working with partners, customers, industry colleagues, and developers to create the best era, contribute to the development of the database industry, and keep creating the future of data management technology," Yang Bing said.

Customer Value - Keyword 8

Michael "Monty" Widenius, the father of MySQL and the founder of MariaDB: He believes that a huge user base is an important guide for the direction of database development. When creating MariaDB, it was by analyzing user needs and solving problems together with users that he was able to face challenges with ease. He said that a distributed database can perform basic calculations on different nodes, so it has great advantages in processing large amounts of data and in grouped calculations, but it is slower at transaction processing. There is therefore no absolutely perfect technology, only tradeoffs based on needs.

Chen Yuanxi, Chief Architect and Vice President of R&D at SequoiaDB (Jushan): I would like to say that distributed databases are really driven by customers and application scenarios. Using computing power to solve the problems encountered in real customer scenarios is a huge challenge we face in distributed development. The development of distributed technology comes from data, and China has the best data market, but which distributed architecture to choose still has to be decided. SequoiaDB has been developing distributed databases since 2011. Although we build on native distributed database technology, when we assess customers' capabilities and introduce our products, we still tailor the introduction to each customer's scenario so that it is more effective and efficient.


A few words cannot capture all the highlights of this Distributed Database Conference, but what remains unchanged is our respect for "technology" and "developers". This conference is rooted in "development", and its climax also comes down to "development".

Behind ever-changing technology lies the dedicated research of countless developers, day and night. For this reason, at this conference CSDN, together with Geekbang, SegmentFault (Sifu), Open Source China, 51CTO, Juejin (Nuggets), and the Mulan Open Source Community, jointly launched the Haina Award and selected the "2021 Haina Award | Top 10 Practitioners of Distributed Database" in the field of distributed technology (*a list of winners is attached, in no particular order).

[Image: list of Haina Award winners]

While congratulating these unsung developers, we hope that the stories behind them will set an example for, and promote the development of, the distributed database industry. Of course, the highlights of this conference go far beyond that. The two four-hour technical sub-forums in the afternoon and the "Geek Supper" in the evening were equally exciting (*stay tuned for follow-up coverage).

Under the national science and technology development strategy, basic software such as databases is gradually taking center stage in the IT industry. At this DC 2021 Distributed Database Developer Conference, we witnessed what is called "the future of database technology": the elegance of distributed database technology. We believe that, with the joint efforts of government, industry, academia, and research, database technology will enter a new chapter.

The DC 2021 Distributed Database Developer Conference has come to a successful close. We look forward to seeing you again next year, in even better form!


