At the DTCC 2021 conference, Wang Weimin (nickname: Weimin), general manager of the Product and Solutions Department of the Alibaba Cloud Database Business Unit, delivered a keynote speech titled "Cloud Native Database 2.0, One-stop Full Link Data Management and Service". He also gave an exclusive interview to Lao Yu, executive editor of IT168 Enterprise & ITPUB. This article is compiled from the on-site interview.
Interviewed guest: Wang Weimin (nickname: Weimin), General Manager of the Product and Solutions Department of the Alibaba Cloud Database Business Unit
Interviewer: IT168 Enterprise & ITPUB Executive Editor-in-Chief Lao Yu
(This article is based on the on-site interview of the DTCC 2021 conference)
Reporter: Please introduce your current daily work at Alibaba Cloud.
Wang Weimin: Let me first introduce myself. I have worked at Oracle, Microsoft, Huawei, and other companies, always in database-related work, and have held a variety of positions.
I joined Alibaba Cloud this year and I am currently responsible for four aspects of work.
The first part is product management, which is exactly the same as my previous job. It is mainly market-oriented and customer demand-oriented, to see which products we need to develop.
The second part is the solution, responsible for the commercial success of the product, including industry, regional and international markets.
The third part is product experience, including documentation, experience design, etc., mainly to allow users to have a better experience and use our products and services more efficiently.
The fourth part is brand and ecosystem. The brand side is about continuously building our products' influence in the industry. On the ecosystem side, databases are basic software that sits at the PaaS layer. They are unlike compute, networking, storage, and security, which are general capabilities that everyone needs, and they also have no traffic portal of their own the way SaaS applications do. Therefore, it is not particularly efficient to promote a database on its own. We need to do marketing and promotion through the ecosystem, working with partners to bring the database to major application scenarios and replicate it there more quickly.
The above is my current work in Alibaba Cloud.
Reporter: So for OLAP, OLTP, and NoSQL, the R&D teams develop their respective products, and your side is responsible for productizing them.
Wang Weimin: Yes, all customer needs come to our team. We plan products, drive R&D to meet the demand, invest resources, and push product launches according to the product roadmap. In addition, go-to-market (GTM) work such as the product business model and pricing is also handled by our team.
Reporter: The topic of your speech today is "Cloud Native Database 2.0, a one-stop full-link data management and service". Why did you choose this theme, and what do you think participants can gain from it?
Wang Weimin: We first emphasize "one-stop". The "stop" here is the "stop" in "One Stop", not the "stack" in "technology stack". What we are doing now is cloud services, and we have put a variety of engines on the shelf to provide a diversified supply. Each product or engine is customized and optimized for specific user scenarios, which reflects the classic trade-off between General Purpose (GP) and Special Purpose (SP). Many companies try to solve all problems with a General Purpose product, but so far such a product may be optimal in some scenarios and sub-optimal in all the others. Of course, General Purpose has an advantage: for the product R&D team its ROI is the highest, because the team keeps investing in a single product that tries to cover every scenario.
But in practice there is a problem. We see many products designed for special scenarios, such as caching, embedded IoT, edge, and multi-model scenarios, which GP products do not handle well. Thousands of industries are going through digital transformation and moving to the cloud, and they would like one product to solve all their problems. From a philosophical point of view, this is an overly ideal state that is impossible to achieve. Our approach is to combine products to meet the needs of various business scenarios, which amounts to a GP+SP model. That is why we chose this theme.
Secondly, we must emphasize the "full link". We have done research. In most companies, various products and solutions are very complicated, but there are still big problems in connection and coordination between them. We may be able to solve the "chimney" problem by using the cloud, but in fact it only solves the "chimney" of resources, and the business and data are still not connected. Since data is a new means of production, a data operating system is needed. We want to use DMS to implement DataOS. This is our idea and the direction we have been working hard on.
So far, the rich business scenarios and different business load characteristics make it difficult to use one product to finally solve all problems. This is our biggest insight. So we hope to bring actual changes.
Reporter: The view you just mentioned relates to the trend I wanted to ask about next. In terms of dedicated versus multi-model databases, your dedicated-database approach is clearly closer to Amazon's than to Microsoft's "one database to conquer the world" approach, in which one set of products is applied to all scenarios. All vendors now talk about cloud native. How has your cloud-native architecture differed from your peers' as it evolved?
Wang Weimin: First of all, Microsoft and Oracle do look more focused and unified, but they too actually combine GP and SP. Oracle has more brands, such as Oracle TimesTen for caching scenarios and Oracle RAC for high-availability scenarios, so Oracle has first-tier and second-tier brands. Similarly, Microsoft has SSIS, SSAS, and SSRS for integration, analysis, and reporting, and Azure Cosmos DB for multi-model workloads.
So I think the core reason everyone has independently chosen the path "towards cloud native 2.0" is that the diversity of workloads is too challenging for the design of the underlying basic software. It is impossible to cover everything with one architecture and one code base.
As for the differences from our peers: in the cloud native 1.0 era, everyone essentially completed productization and servitization, pooled the infrastructure, and achieved multi-tenancy, on-demand provisioning and control, and usage-based metering, with the ability to scale out and in to varying degrees.
In the cloud native 2.0 era, the linkage between products and services matters more. In my talk (at the 12th China Database Technology Conference), I mentioned seven customer cases, none of which was handled with a single product. To support their transformation, customers typically have many needs at once, such as transactional processing, analytics, real-time data warehousing, and warehouse integration. All of them need to achieve data discovery, insight, and value realization.
In our opinion, the cloud native 2.0 era may evolve in two directions. First, continue to consolidate and enhance the differentiated competitiveness of single products. Second, efficient collaboration and experience enhancement among multiple products.
Reporter: So for single products, you continue to strengthen your advantages, and overall you also emphasize the strength of the complete solution. Whether it is OLTP, OLAP, or NoSQL, all products seek better collaboration.
Wang Weimin: We now have an idea: turning solutions into products. For example, if a customer wants an online e-commerce system, we can directly provide a system with caching and transactions, with the links in between already in place, and with something like a small data warehouse ready to use. Because there may not be much data at the beginning, this data warehouse is scalable.
Although it runs on the cloud, it is still vertical: the business side is a chimney, while the resources are connected horizontally. This is an attempt we are making.
Many companies in the industry have in fact done solution productization. We may only be able to do it for general solutions, and it may be more difficult to do it in combination with industry know-how.
Reporter: If you had to describe "Cloud Native Database 2.0" simply, in one or two sentences, so that others can understand the concept, what would you say?
Wang Weimin: "One-stop full link" is still a relatively technical term. In business terms, it means multiple engines to meet the diverse business workloads of customers. Other terms are "full scenario" and "full data life cycle", but these are more like marketing terms.
Reporter: Yes, for non-technical people it is still a bit obscure. The next question comes from a view expressed by Huang Dongxu of TiDB. He believes that the current so-called cloud-native databases are not truly cloud-native: anything that can go back to offline deployment should not be called cloud-native. What do you think of this view?
Wang Weimin: This view is quite new, and I had not heard it before. We have encountered a type of cloud customer who requires that data can flow back and that they are ready to move off the cloud at any time. The core reason for this demand is that they worry the cloud is a super lock-in: "Once I am on your cloud, unless my business dies, I am tied to you for life."
In communicating with customers, you will find that some of them resist differentiated features, because they worry that such features will make them dependent on a particular vendor. Of course, there are many different technical schools here. Some customers do not use database-specific syntax when writing SQL. They put an abstraction layer in the middle, so the underlying database can be MySQL or can be replaced with SQL Server or Oracle without the application noticing. In fact, there are products in the industry, such as Source Pro, that shield upper-level applications from the differences between underlying databases, so I do not agree with this thesis.
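To make the "abstraction layer in the middle" concrete, here is a minimal sketch in Python. The class names (`Dialect`, `OrderRepository`) and the `orders` table are hypothetical and are not taken from any product mentioned in the interview. The sketch only shows the idea that application code asks for "a page of rows" while a per-database dialect object emits the vendor-specific pagination syntax.

```python
class Dialect:
    """Hypothetical base class: one subclass per underlying database."""

    def paginate(self, sql: str, limit: int, offset: int) -> str:
        raise NotImplementedError


class MySQLDialect(Dialect):
    def paginate(self, sql: str, limit: int, offset: int) -> str:
        # MySQL uses LIMIT ... OFFSET ...
        return f"{sql} LIMIT {int(limit)} OFFSET {int(offset)}"


class SQLServerDialect(Dialect):
    def paginate(self, sql: str, limit: int, offset: int) -> str:
        # SQL Server 2012+ syntax; the query must contain an ORDER BY.
        return f"{sql} OFFSET {int(offset)} ROWS FETCH NEXT {int(limit)} ROWS ONLY"


class OracleDialect(Dialect):
    def paginate(self, sql: str, limit: int, offset: int) -> str:
        # Oracle 12c+ row-limiting clause.
        return f"{sql} OFFSET {int(offset)} ROWS FETCH NEXT {int(limit)} ROWS ONLY"


class OrderRepository:
    """Application-facing code: it never mentions a specific database."""

    BASE_QUERY = "SELECT id, total FROM orders ORDER BY id"

    def __init__(self, dialect: Dialect) -> None:
        self.dialect = dialect

    def list_orders_sql(self, page: int, page_size: int) -> str:
        offset = (page - 1) * page_size
        return self.dialect.paginate(self.BASE_QUERY, page_size, offset)


if __name__ == "__main__":
    for dialect in (MySQLDialect(), SQLServerDialect(), OracleDialect()):
        print(OrderRepository(dialect).list_orders_sql(page=2, page_size=10))
```

Swapping the database then means swapping the dialect object passed to `OrderRepository`. The cost of this design is that the application can only use features every dialect can express, which is exactly the trade-off against differentiated features discussed above.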
Productizing a database and moving it to the cloud should be treated as two stages. A database can be a product that is provisioned either on-premises or on the cloud. On the cloud it can take advantage of many cloud-specific capabilities, such as elastic resource scaling, but it does not have to run only on the cloud.
For example, can RDS MySQL not be taken off the cloud? It certainly can, so by the standard just mentioned it would not be a cloud-native database. But once compute and storage are decoupled, as in AWS Aurora, it becomes difficult to take Aurora off the cloud.
Whether something is "cloud native" should be judged by the customer's business scenarios and needs rather than by the technical implementation. After all, customers have different requirements. It is like checking in to a hotel with your luggage: you stay at this hotel today and that hotel tomorrow, and the hotel has to let you leave.
Reporter: Do you think his statement is a conclusion drawn from foreign markets? Is the ability to move both onto and off the cloud, as you described, a demand unique to the domestic market, or is it a global demand? In fact, I recently read an article whose point of view was "Distributed is not used in foreign countries, so distributed is used in China."
Wang Weimin: You ask whether this is a situation specific to the domestic market. I think that once public cloud becomes mainstream, users will certainly have to consider platform neutrality and cloud neutrality. We already see many such requirements. For example, a customer's main site is in Cloud A and its disaster recovery site is in Cloud B. Another pattern is a cross arrangement: part of the business runs in production in A with its backup or standby in B, and the other part runs in B with its standby in A. Customer needs may keep changing in the future. Our philosophy is "customers first", so this is something we must consider. Dongxu's view is relatively new at present, but can TiDB be deployed offline? It definitely can.
Regarding the recent article, my understanding is that InfoQ has only recently split its underlying database into sub-databases, and that it was all a single database before. As far as products and architectures are concerned, cloud vendors first of all choose their own target markets. For example, AWS does not serve the offline market and does not do on-premises deployment. At most it has Outposts for edge deployment, which provides only limited capabilities. So it may be a matter of which customers he has chosen to serve.
Reporter: Dongxu's point of view is relatively forward-looking. Many of their customers are abroad, and foreign markets have embraced public cloud to a high degree. In China, however, offline deployment, hybrid cloud, and private cloud are clearly embraced more than public cloud, at least among large customers in industries such as finance and telecommunications.
Wang Weimin: On the general trend toward public cloud, my judgment is the same as Dongxu's. As for the domestic financial and telecommunications industries, their large business workloads have not yet been placed on the public cloud. I think there may be two reasons. First, domestic policy requirements on information security, industry supervision, and compliance. Second, customers' requirements for independent control over the operation, maintenance, and management of system software.
In fact, is the public cloud unsafe? Not necessarily. The US Department of Defense's US$10 billion JEDI project is entirely on public cloud. Capital One, the largest bank in the United States, is 100% on public cloud and uses no private cloud. I am rather optimistic about the future of public cloud. Many industries need time. Going to the cloud may take more than ten years, and it is a very long track.
Reporter: If foreign countries do not use distributed systems, does that mean distributed systems have no meaning or application value?
Wang Weimin: This is another issue that is particularly worth discussing. I think complexity is conserved. Foreign companies such as GitHub did not use distributed databases before, but that does not mean the work was not done. It may have been done at the application layer and the middleware layer. From the database perspective, we hope to leave the complexity to the database and the simplicity to the application. We hope that a distributed database can be developed against, managed, and operated like a single machine, so that applications do not need to care about the underlying database's capacity, transactions, scalability, or consistent backups.
In fact, foreign databases do not face concurrency and user scale as extreme as domestic ones do. After all, we have a demographic dividend. Take Ali's Double Eleven (the "Hand-Chopping Festival"): could the payment system be supported by an Oracle database? It certainly could. It would also be possible to replace the OceanBase underneath with Oracle, but the TCO would be too high.
Therefore, whether to solve this problem with a distributed product or with a distributed solution is a matter of approach. For example, applications can connect to distributed middleware such as ShardingSphere, with single-machine databases underneath, and that can do the same thing.
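The middleware approach described above can be illustrated with a small Python sketch. It is not ShardingSphere's API. The `ShardRouter` class, the `orders` table, and the choice of CRC32 hashing are all assumptions made for illustration, and in-memory SQLite databases stand in for the single-machine databases underneath.

```python
import sqlite3
import zlib


class ShardRouter:
    """Hypothetical sharding layer in front of several single-node databases."""

    def __init__(self, shard_count: int) -> None:
        # Each in-memory SQLite database stands in for one standalone instance.
        self.shards = [sqlite3.connect(":memory:") for _ in range(shard_count)]
        for conn in self.shards:
            conn.execute("CREATE TABLE orders (user_id TEXT, amount INTEGER)")

    def shard_index(self, user_id: str) -> int:
        # CRC32 is stable across processes, unlike Python's built-in hash().
        return zlib.crc32(user_id.encode("utf-8")) % len(self.shards)

    def insert_order(self, user_id: str, amount: int) -> None:
        conn = self.shards[self.shard_index(user_id)]
        conn.execute("INSERT INTO orders VALUES (?, ?)", (user_id, amount))
        conn.commit()

    def orders_for_user(self, user_id: str) -> list:
        # A query on the sharding key touches exactly one shard.
        conn = self.shards[self.shard_index(user_id)]
        rows = conn.execute(
            "SELECT amount FROM orders WHERE user_id = ? ORDER BY rowid",
            (user_id,),
        )
        return [row[0] for row in rows]

    def total_amount(self) -> int:
        # A query without the sharding key must scatter to every shard
        # and merge the partial results in the middleware.
        return sum(
            conn.execute("SELECT COALESCE(SUM(amount), 0) FROM orders").fetchone()[0]
            for conn in self.shards
        )


if __name__ == "__main__":
    router = ShardRouter(shard_count=4)
    router.insert_order("alice", 10)
    router.insert_order("bob", 20)
    router.insert_order("alice", 5)
    print(router.orders_for_user("alice"), router.total_amount())
```

Even in this toy version, routing, scatter-gather, and (not shown) cross-shard transactions live outside the database. That is the complexity the interviewee would rather move into a distributed database.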
I do not think it is meaningless just because foreign countries do not use it. It depends on the level at which the investment is made. From the product dimension, for example, the marginal cost keeps declining as usage increases, so I think it is very meaningful to do this. Applications then do not need to care about flexible transactions, business corrections (reversals), and so on, because this complexity is handled at the database level. So I think distributed databases are still very meaningful. In foreign countries you may not need a distributed database, but you can degenerate a distributed database into a single instance. With Oracle RAC you can also use only one node, and that is perfectly fine.
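As an illustration of the "business corrections" an application has to carry when the database does not provide distributed transactions, here is a minimal compensation sketch in Python. The saga-style pattern is a stand-in chosen for illustration and is not described in the interview. The function names and the buyer/seller balances are hypothetical.

```python
class SagaFailed(Exception):
    """Raised after a failed step has been compensated."""


def run_saga(steps) -> None:
    """Run (action, compensation) pairs; undo completed steps on failure."""
    completed = []
    try:
        for action, compensate in steps:
            action()
            completed.append(compensate)
    except Exception as exc:
        # Business correction: reverse the completed steps, newest first.
        for compensate in reversed(completed):
            compensate()
        raise SagaFailed(str(exc)) from exc


def transfer(balances: dict, amount: int, credit_fails: bool = False) -> None:
    """Move `amount` from buyer to seller as two separately committed steps."""

    def debit():
        balances["buyer"] -= amount

    def undo_debit():
        balances["buyer"] += amount

    def credit():
        if credit_fails:
            raise RuntimeError("seller account unavailable")
        balances["seller"] += amount

    def undo_credit():
        balances["seller"] -= amount

    run_saga([(debit, undo_debit), (credit, undo_credit)])


if __name__ == "__main__":
    balances = {"buyer": 100, "seller": 0}
    try:
        transfer(balances, 30, credit_fails=True)
    except SagaFailed as err:
        print("rolled back:", err, balances)
```

With a database that offers transactions across nodes, the two steps could be wrapped in a single transaction and the compensation code would not be needed in the application.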
Reporter: I have always felt that among domestic cloud vendors, Huawei has built the best database ecosystem. I often see new partners joining its ecosystem circle. Alibaba was the first mover in the cloud, so Alibaba Cloud has maintained a certain first-mover advantage. Alibaba also has extreme scenarios such as Double Eleven, which have given it technical accumulation. Of course, this is just my personal opinion. From your point of view, what advantages does Alibaba Cloud Database have over its competitors in terms of technology, products, business, ecosystem, and so on?
Wang Weimin: First of all, for our product and R&D team, I think the first advantage is a high degree of organizational unity and consistency, which lets us pull in one direction. Li Feifei is not only the head of the Alibaba Cloud Database Division but also the head of the DAMO Academy's database and storage laboratory, so he in effect leads the product, R&D, and pre-research teams for all databases in the entire Alibaba system. Product strategy, communication, marketing, and other tactics are highly consistent. That is the first advantage.
Second, Ali does have a first-mover advantage. Many of our customers did not come to the cloud later, for example by migrating from an IDC. They have been on the cloud since day one: born in the cloud and grown in the cloud.
Third, we are more focused in terms of products. We do not have many products. They fall into three categories: open source hosting, commercial product hosting, and self-developed products. The core is focus, and we have a different strategy for each category.
For example, the open source hosting category focuses on simplicity, ease of use, security, reliability, and cost-effectiveness. For commercial products we mainly rely on their open ecosystems. Here we mainly address the need for a diverse, commercially compliant ecosystem, and we have direct business cooperation for SQL Server and MongoDB, which requires continuous compliance on Alibaba Cloud's part. For self-developed products, we focus on four major lines, PolarDB, AnalyticDB, Lindorm, and Tair, and we do not spread ourselves thin.
When it comes to the ecosystem, each vendor builds it in a different way. Some of our peers make databases but do not intend to treat databases as an independent business. Solving the "chokepoint" problem by creating a second plane is also very remarkable, and we need this kind of technology too.
At Ali, we hope to build the ecosystem in an "integrated" way. The most important thing for an open and prosperous ecosystem is being able to share benefits with partners. If that is not possible, I do not think the ecosystem can work. So we have been exploring how to let ecosystem participants join us in making the pie bigger, and at the same time how to better take care of all parties.
We have also been continuously investing in ecology, such as distribution partners, SaaS integration, ISV, delivery partners, training certification, etc., but it is true that our investment in this area needs to be continuously strengthened.
Reporter: At the Ali media communication meeting last month, I interviewed Li Feifei. He mentioned that you are going to enter the offline market. For high-end customers in the database field in the offline market, such as finance, government, and enterprises, do some of your peers have more advantages than Ali? Now that Ali is going into the offline market, what is it going to do, and where is your differentiated advantage?
Wang Weimin: First of all, demand in the offline market is rigid and exists objectively. So far we have accumulated many customers, including telecom operators, the financial industry, and government. Everyone is discussing one question: if we do it again, should we swap the platform as it is, or transform and upgrade on the existing basis?
We see that customers, including operators, have a common demand. Today I also shared an operator's case. Its business systems (including the BOSS system) were moved directly to the cloud. In that process they had to combine many things, including the cloud itself, "swim lane" partitioning of upper-layer applications, microservices, and the entire operations and development workflow. I think this kind of replacement is not simply swapping in a product. It is replacing with a solution.
Second, in the government and enterprise markets, some go head-to-head with Oracle, for example on performance. We think this may be misleading, because performance is not the only factor when users select a product. There are many others. We hope DBStack can be used for agile delivery and seamless integration: a lightweight cloud that can be deployed on the customer's premises and that carries all the data engines users may need. Customers such as Guangdong Mobile and Anhui Mobile are now using these products.
Compared with us, our peers may have their own flexibility, and we have our own differentiated features. For example, if one vendor has made it clear that it will not do private cloud, a large part of the market is released, and there are bound to be many vendors that fill the gap.
Regarding the point mentioned by Li Feifei, as I understand it there are two core reasons why we are going to do this. First, this is the high-end market. It is also a benchmark, and the rigid demand in it is plain to see. Second, these business scenarios and workloads are very helpful for driving product development, testing the product, and continuously improving its stability and reliability. Combining these two factors, I think we will definitely pursue this unswervingly.
Reporter: As you said about "one-stop full-link data management and service in the era of cloud native database 2.0", this is also the reality we see. Different scenarios need different databases, and there is no such thing as "one database to conquer the world". The trends we see, such as cloud native plus distributed, software and hardware integration, lake warehouse integration, and HTAP, are themselves a kind of integration, driven by users' need for simplification. For users, the complexity and difficulty of managing one database and of managing four databases at the same time are definitely different. Everyone wants to simplify, yet we see that everyone still uses dedicated databases for dedicated scenarios. Is this a bit contradictory?
Wang Weimin: Many of the problems we have to solve are actually multivariate. First of all, it is very challenging to consider all variables, and how to prioritize these variables, such as reliability, development, management, operation and maintenance, etc., must be considered comprehensively.
Secondly, since general-purpose products cannot meet the demand and we use a "general + special" combination instead, what should be done to reduce the complexity? We use DAS, DMS, and other technical means to solve the problem. We hope to automate the daily work that traditional DBAs used to do, such as patch upgrades, daily inspections, parameter configuration and tuning, backups, and alarms. There is now a particularly good trend: we have accumulated a huge amount of data on database load and water levels, along with best practices such as slow SQL governance, and we can use these for such tasks. It is like deep learning. When Yann LeCun first proposed it in the 1990s, it could only be used to recognize postal codes for the US Postal Service. Today it has a very wide range of uses, mainly because of the growth in data and computing power.
Going back to the question, this contradiction may indeed have been insurmountable in the past. But today we see an opportunity to solve it with AI4DB. People can then focus on high-value tasks such as application DBA work, logical design, and data placement, while routine tasks are automated, and the two can be organically combined.
Reporter: We mentioned in a previous press release that Alibaba Cloud will fully enter the offline market. In the offline market, everyone is paying close attention to the financial industry, especially banks. I have observed that distributed transformation is an inevitable trend for banks. Li Feifei also mentioned that at this Apsara Conference you open sourced a distributed database, PolarDB-X. Can I understand this as one of your combined moves for entering the offline market?
Wang Weimin: Right, you can understand it as one of the key actions. Domestic database participants are very enthusiastic, and there are many vendors. But many customer companies worry about questions such as: how large is your company, and will we end up contributing 50%-60% of your revenue? They are really worried about the sustainability of the database company, which is a question of trust. Open source, in my opinion, has two aspects. First, it uses an open approach and an open mind to solve open problems and challenges, and it attracts more people to participate. Second, open source is equivalent to intellectual crowdfunding.
In addition, open source also expresses a kind of self-confidence, which is a way for users to check the quality, which can greatly reduce the challenges in the process of trust building.
Finally, of course, we also hope to do ecology in this way, such as the ecology of applications and the ecology of talents.
Reporter: So open source is just a strategy, not a necessary condition for whether database companies and vendors can stand out from the competition in the future?
Wang Weimin: Correct, it is not. Many open source products win acclaim but are not necessarily commercially successful. There are also many products that win neither acclaim nor sales, cannot go on, and are open sourced as a last resort. We hope to keep doing open source, but that does not mean we will succeed just by doing it, because there are indeed many open source projects whose commercialization was unsuccessful in the end. This is true not only in the database field but in other fields as well, for example OpenOffice and OpenSolaris. Even if we are serious about open source, there is still much to learn and there are pitfalls to avoid. There is a lot to do along the way.
Reporter: More and more domestic databases are being deployed in practice, but most of them serve peripheral business, and relatively few serve core business. Many companies are still renewing their Oracle databases while deploying domestic databases, and even now the two lines run side by side. Is this because customers fear that domestic databases will not be able to carry the load, or do domestic databases really have problems in taking on high-end and core business?
Wang Weimin: Both factors apply, and there are more than these two. Indeed, many customers are trying out more suppliers, including domestic database vendors, while continuing to use commercial database products. This is the status quo. There are several reasons.
First, after decades of development and long-term testing under the real business workloads of a large number of users (millions), commercial databases have been proven in every respect, including enterprise-level features, stability, business model, and services. Whichever of these products is used, the user's business system can operate efficiently.
At present, many domestic vendors still lag behind commercial databases in R&D investment, product maturity, product documentation, ecosystem, and talent. This is an objective fact, and we should face it squarely. Others have been at it for 50 years and we have been at it for 10, with much less investment. It is not realistic to expect to surpass them already. That would not conform to objective laws.
Second, there are indeed many business scenarios that do not require such high-end products. This has two meanings. One is that many products currently lag behind commercial databases in enterprise functions, features, completeness, performance, stability, and so on, but the gap is not insurmountable. It just takes time to close. The other is that in many business scenarios, many commercial database functions and features are not used and do not need to be used. Statistical data shows this.
Reporter: I learned earlier that most companies using Oracle may not use even half of its functions and performance. For many companies, performance problems can be solved at the architecture level in many ways.
Wang Weimin: To a large extent, our products are sufficient for many core business scenarios and typical business workloads, but their stability and other aspects may still have a long way to go. They also need to go through long-term, continuous load testing by a large number of users.
Reporter: Can I understand it this way: domestic databases are currently usable, but there is still some distance before they are easy to use? Earlier I saw the goals of domestic databases divided into technical goals and market goals. The market goals are, first, to achieve independent and controllable localization, and second, to go global. The technical goals have four stages: others have it and we do not, others have it and so do we, others lack it and we have it, and others have it but ours is better. In your opinion, what stage has the domestic database market reached in terms of technical goals and market goals?
Wang Weimin: From the perspective of technical goals and functional features, I think there may be some features that we have and no one else does, but for many commonly required features we are still catching up and filling gaps. We are at "others have it and we have part of it" or "others have it and we have a small part of it", and different vendors differ in how complete their list is.
The market is even worse. If we count online and offline markets, Alibaba Cloud has achieved the first place in China, surpassing Oracle. But if we return to the global perspective, according to Gartner statistics in 2020, Microsoft is the first, Oracle is the second, and Alibaba Cloud is the seventh in the world. In the To B, cloud and database markets, we still have a big gap with leading companies.
For example, Oracle's net profit in a single quarter is close to or more than 10 billion U.S. dollars. Even if the database accounts for only 40% of that, it may still come to several billion U.S. dollars, excluding its standard services. This is an incredible figure. The net profits of China's top 100 software companies combined may not add up to that much.
Reporter: When you help customers migrate away from Oracle, MySQL, Teradata, and so on, what challenges do you face, and how do you usually help them?
Wang Weimin: The first is the ecosystem level. No customer's business system can achieve a drop-in replacement. There may be the application side to the north, data integration horizontally, and a series of adaptations such as the operating system and hardware to the south. When you want to replace a product, many customers will say, "We want not only to replace this product but also to keep the existing control, development, operations, application, and other process integrations." Therefore, the ecosystem is a big challenge.
The second is talent. Many companies have built up a certain degree of professional expertise in their people. Changing products may require a learning process, so the slope of the learning curve should be reduced and the learning cost should not be too high. That is why we have to do a lot of work on compatibility. Compatibility work is also a moving target: once a certain point has been reached, the product you are compatible with has moved forward, and you have to keep going. In compatibility work, the difference between 80 points and 90 points is not big, but the workload from 90 points to 95 points is huge, and it is almost impossible to get from 95 points to 100 points. So we are thinking about whether to follow, which aspects to follow, which aspects to replace with other methods, or whether to simply ignore compatibility in some places. These are all issues to consider, and of course the choices among them are very difficult.
Reporter: That's it for today's interview. Thank you, Mr. Wang, for accepting the interview.