In recent years, the digital economy has grown rapidly, and "digital power" has repeatedly surfaced behind enterprise transformation. The deep integration of cloud computing, big data, and artificial intelligence has formed a new infrastructure for the digital economy and opened up new opportunities for its development.
On May 20, Jia Yangqing, vice president of Alibaba and head of the Alibaba Cloud computing platform, gave a speech titled "Digital Power in the Age of Technological Innovation" at a media communication meeting. This article presents a condensed and edited version of his speech.
01 Digital Power in the Era of Technological Innovation
Let's get to know a construction company first.
The reason for starting with a construction company is that, behind every industrial revolution, the most important question is how existing industries innovate their own productivity. The construction industry is a very typical example. With so much talk about big data and AI today, what value can they actually bring to it?
This company is China Construction Third Engineering Bureau First Company. It is a core force in national infrastructure construction and has long been known for its construction speed and efficiency.
More than 30 years ago, in 1985, it built Shenzhen's first super-high-rise landmark, the Shenzhen International Trade Building, then the tallest building in China, at the pace of "one story every three days."
In 1996, at the pace of "four structural floors every nine days," it completed the Shenzhen Diwang Building, at the time the tallest building in Asia and the fourth tallest in the world, pushing the Chinese construction industry from super high-rises up to skyscrapers on par with the world's leading level.
Looking across the country and the world, their work is everywhere, and they have built many benchmark buildings we are all familiar with: the National Stadium (the Bird's Nest), the new CCTV headquarters building... Beyond landmark buildings, they have also built airports, subways, highways, hospitals (Leishenshan Hospital), schools (the Tsinghua Academy of Fine Arts), and office buildings (for Alibaba, Tencent, Sina, and others)...
The efficient construction capabilities of China Construction Third Engineering Bureau First Company have created great value for us.
Decades have passed, architectural designs have kept evolving, brick structures have given way to reinforced concrete, and the company's understanding of the construction industry has kept moving forward as well. More than 30 years ago, it relied on a race between people and time; today, it relies on the flow of data. Last year, China Construction Third Engineering Bureau First Company teamed up with Alibaba Cloud to jointly build a data middle platform.
To build a tall building, an enormous amount of material has to circulate, from a grain of sand to bricks, glass, steel bars, screws, and all kinds of construction machinery. Making that circulation more efficient is a problem every construction company encounters. Beyond that, they also need to consider how to improve construction processes, innovate construction methods, and use digital capabilities to manage a whole series of issues around construction workflows and building materials.
Based on DataWorks, its one-stop data development and governance platform, Alibaba Cloud built a "digital twin" for China Construction Third Engineering Bureau First Company, using data and algorithms to predict when to replenish sand, when to deploy construction machinery, and other operational and management decisions.
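As a toy illustration of the prediction idea (the consumption numbers, safety threshold, and function are invented for this sketch, not the actual DataWorks model), a replenishment date can be forecast from recent consumption rates:

```python
def days_until_replenish(stock_tons, daily_usage_history, safety_stock_tons):
    """Estimate in how many days stock hits the safety threshold,
    assuming future usage equals the recent average usage."""
    avg_daily = sum(daily_usage_history) / len(daily_usage_history)
    surplus = stock_tons - safety_stock_tons
    if surplus <= 0:
        return 0  # already at or below the threshold: replenish now
    return int(surplus // avg_daily)

# Sand on site: 500 t; last week's daily usage; keep at least 100 t on hand.
print(days_until_replenish(500.0, [38, 42, 40, 41, 39, 40, 40], 100.0))  # → 10
```

A real digital twin folds in many more signals (orders, weather, construction schedule), but the shape of the decision, "predict consumption, trigger replenishment," is the same.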
Today there are about 100,000 construction companies in China's construction market. Besides large benchmark enterprises like China Construction Third Engineering Bureau First Company, there are many small and medium-sized construction companies, employing more than 50 million people in total. Helping these small and medium-sized enterprises move from a traditional, small-workshop, slash-and-burn model to operating like China Construction Third Engineering Bureau First Company is what Alibaba Cloud hopes to do with data.
We believe that by combining Alibaba Cloud's core capabilities in building data middle platforms with the expertise of each industry, we can help more companies achieve digital transformation, just as China Construction Third Engineering Bureau First Company has.
02 "One body and two sides" to help enterprises make good use of data
Although everyone talks about big data, and everyone feels they are using big data, hardly anyone actually knows how to use it well.
Alibaba Cloud has built a series of "weapons" for using data, hoping to empower enterprises with digital power through comprehensive governance and intelligent use of data on the cloud.
The challenge companies often face is that they have built many fragmented data systems. Heterogeneous data such as spreadsheets, Word documents, photos, and videos ends up scattered across different stores such as Excel files and data warehouses, eventually forming "data islands."
Therefore, when building a data middle platform, companies typically encounter challenges on three fronts: technology, business, and organization. On the technology side, how to connect the data; on the business side, how to reconcile data computed under different calibers; on the organization side, how to manage data stored in different places in a unified way.
A challenge commercial companies often run into is that calculating revenue involves different calibers, such as those required by finance or by the China Securities Regulatory Commission, while operations staff need to look at turnover under different scenarios. Each of these definitions eventually boils down to a SQL statement or a data task. If those tasks are written inconsistently, the data and the results come out inconsistent, the calibers no longer match, and a whole series of problems follows.
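As an illustrative sketch (the order records and caliber rules here are invented), the fix is to centralize each metric's definition in one place, so two teams cannot silently compute "revenue" differently:

```python
# Hypothetical order records for illustration only.
orders = [
    {"amount": 100.0, "refunded": False, "channel": "online"},
    {"amount": 50.0,  "refunded": True,  "channel": "online"},
    {"amount": 80.0,  "refunded": False, "channel": "store"},
]

def revenue(orders, caliber):
    """Single shared definition of 'revenue' under each reporting caliber.

    The 'finance' caliber excludes refunded orders; the 'gross' caliber
    counts everything. (Both rules are made up for this sketch.)
    """
    if caliber == "finance":
        rows = [o for o in orders if not o["refunded"]]
    elif caliber == "gross":
        rows = orders
    else:
        raise ValueError(f"unknown caliber: {caliber}")
    return sum(o["amount"] for o in rows)

# Every report calls the same definition, so numbers cannot quietly diverge.
print(revenue(orders, "finance"))  # → 180.0
print(revenue(orders, "gross"))    # → 230.0
```

In a platform like DataWorks the same role is played by governed, shared metric and task definitions rather than ad-hoc SQL scattered across teams.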
From a technical point of view, we have gradually built a complete data processing system called "one body and two sides."
"One" refers to the integrated data development and data comprehensive management platform DataWorks. Various industry applications are built on this platform.
DataWorks has accumulated about 80,000 users so far. Every day, about a quarter of Alibaba's employees do data development and application on DataWorks.
Beneath the integrated development platform sit two different forms of data organization, the data warehouse and the data lake: the so-called "two sides."
The concept of the "data warehouse" has been around for a long time; it can be understood as one huge Excel table, or a pile of huge Excel tables. Alibaba built its own data warehouse, MaxCompute, long ago. It is an important part of Apsara (Feitian), Alibaba Cloud's core system, and has accumulated very strong large-scale data warehousing capabilities.
As MaxCompute evolved, the need for real-time data analysis emerged. For example, during Double 11, promotion strategies must be adjusted in time according to users' buying behavior. So a few years ago we invested in the real-time computing engine Flink. Flink was originally created by a German team, and today Alibaba and that team continue to advance Flink as an open-source standard for stream computing.
In the past, we only aggregated data and produced reports; but more and more data now needs to power real-time services. "Guess You Like" recommendations, for example, require analyzing users' historical behavior in real time and quickly serving related products.
In the past few years, on top of the offline "T+1" data warehouse, we built an integrated real-time warehouse and serving application: the interactive analytics product Hologres, which supported a great deal of real-time decision-making during Double 11. Decision makers at Taobao and Tmall can use Hologres to see, in real time, the sales of each product category in each region; when the sales or reach rate deviates from expectations, they can adjust strategies in time.
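As a toy illustration of the idea behind such real-time aggregation (this is not Hologres's actual engine; the events and window size are invented), a tumbling-window rollup over a stream of sales events can be sketched in a few lines:

```python
from collections import defaultdict

def tumbling_window_sales(events, window_seconds):
    """Aggregate (timestamp, category, amount) events into fixed-size windows.

    Returns {(window_start, category): total_amount}. A real streaming
    engine does this incrementally as events arrive; we batch for clarity.
    """
    totals = defaultdict(float)
    for ts, category, amount in events:
        window_start = (ts // window_seconds) * window_seconds
        totals[(window_start, category)] += amount
    return dict(totals)

events = [
    (0, "apparel", 10.0),
    (5, "apparel", 20.0),
    (12, "electronics", 99.0),  # falls into the second 10-second window
]
print(tumbling_window_sales(events, 10))
```

A dashboard polling these window totals is the essence of "seeing real-time sales per category per region" described above.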
As data becomes more and more heterogeneous, the services we build no longer consume only neatly tabular data; data may arrive as logs, pictures, videos, or voice, forms that traditional data warehouses do not handle well. I remember when we first started doing machine learning at Google in 2013, we stored a pile of pictures in the data warehouse, and they all came out as strings of bytes; you could not see the content of the pictures at all.
Thus the concept of the "data lake" emerged: don't bother forcing everything into an Excel table. Word stays Word, a picture stays a picture, a video stays a video. Regardless of source and format, put all the data into one lake first.
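A minimal sketch of the idea (the file names, fields, and catalog shape are invented): a lake stores raw objects untouched and keeps a small metadata catalog so the data can be found again later:

```python
lake = {}     # object store: path -> raw bytes, kept in the original format
catalog = []  # metadata records describing what is in the lake

def ingest(path, raw_bytes, fmt, source):
    """Store raw data as-is and record where it came from."""
    lake[path] = raw_bytes
    catalog.append({"path": path, "format": fmt, "source": source})

ingest("logs/2021-05-20.jsonl", b'{"user": 1}\n', "jsonl", "web")
ingest("images/cat.jpg", b"\xff\xd8...", "jpeg", "app-upload")

# Heterogeneous formats live side by side; the catalog answers "what do we have?"
jpegs = [r["path"] for r in catalog if r["format"] == "jpeg"]
print(jpegs)  # → ['images/cat.jpg']
```

The key contrast with a warehouse is that schema is applied only when the data is read, not when it is stored.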
But then part of the business's data sits in the lake and part in the warehouse. How can they be analyzed and processed together? Last year we proposed the integration of lake and warehouse (the "lakehouse"), building the data middle platform on top of the traditional data lake and data warehouse.
This is straightforward for brand-new businesses. But many companies already have data warehouses; how can they make use of those existing resources?
We have done a lot of work on the technical side. By connecting the underlying storage and compute resources, it becomes easier to access information in the data lake from the warehouse side, or to run a series of open-source engines on the data lake and analyze data in the lake and the warehouse at the same time.
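The effect can be sketched as a toy federated query (the table contents, file contents, and helper names are invented): one query path reads a structured warehouse table and raw lake files through the same interface:

```python
import json

# A "warehouse" table: schema enforced, already structured.
warehouse_orders = [
    {"order_id": 1, "amount": 100.0},
    {"order_id": 2, "amount": 80.0},
]

# A "lake" file: raw JSON lines, schema applied only when read.
lake_file = '{"order_id": 3, "amount": 55.0}\n{"order_id": 4, "amount": 20.0}\n'

def scan_lake(raw):
    """Parse raw lake data into rows on demand (schema-on-read)."""
    return [json.loads(line) for line in raw.splitlines() if line.strip()]

def total_amount(sources):
    """One query over rows from both the warehouse and the lake."""
    return sum(row["amount"] for rows in sources for row in rows)

print(total_amount([warehouse_orders, scan_lake(lake_file)]))  # → 255.0
```

Real lakehouse engines do this with shared metadata and pushed-down scans rather than in-memory lists, but the query-time unification is the same idea.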
03 AI empowerment: mining the value of data, turning "cost" into "asset"
While governing data, we found that as data volumes grow larger and larger, the unit value of data gets lower and lower.
Therefore, we began to think about how to tap the value of data, help companies innovate their businesses, improve efficiency, and turn data from costs into assets.
AI can make data smarter. AI algorithms can not only summarize data, but also analyze and make decisions.
But not every company is able to turn AI into productivity for its own use. Gartner's research found that only 53% of projects make it from artificial intelligence (AI) prototype to production. For AI to become enterprise productivity, engineering technology must solve full-lifecycle problems across the whole chain: model development, deployment, management, prediction, and inference.
We concluded that three major things need to be advanced in AI engineering: making data and computing power cloud-native, making scheduling and the programming paradigm scale, and making development and services standardized and universal.
First, from the supply side, AI engineering means making data and computing power cloud-native.
The intelligent age is driven by data and computing power. Whether computer vision, natural language processing, or any other AI system, none can do without large amounts of data.
In the 1990s, AI could already recognize handwritten postal codes; back then, the data used to train the model amounted to only about 10 MB. M6, the ultra-large-scale Chinese multimodal pre-training model jointly released not long ago by Alibaba and Tsinghua University, was pre-trained on 2 TB of images and 300 GB of text corpus. Across the industry today, the amount of data required to train an AI model keeps growing.
OpenAI has published a statistic: from AlexNet in 2012 to DeepMind's AlphaGo Zero in 2018, the compute demanded by the largest AI training runs grew by roughly 300,000 times.
According to Moore's Law, the computing power of a single CPU core doubles every 18 months. But around 2008, Moore's Law began to "fail," and the growth of computing power gradually slowed.
As data volumes grow, models become more accurate, more efficient, and more complex; on both the data side and the compute side, an ever-larger base is needed to support the AI workloads above it. Cloud computing can provide stronger support in both data and computing power.
Second, from the perspective of core technology, AI engineering means making scheduling and the programming paradigm scale.
That is because, behind a large-scale base, there are often two cost problems:
One is resource cost. Training a large model often requires a pile of GPUs doing large-scale computation. Nvidia's latest DGX-2 costs about US $200,000 per unit, which is genuinely expensive. Training a model like OpenAI's requires roughly 512 GPUs across 64 machines; building a cluster dedicated to large-scale training can easily cost no less than a hundred million yuan. Imagine going to a company, a research institute, or the government and saying: I need a hundred million to build a cluster, the cluster exists to train one model, and I don't yet know whether the model will work; I have to train it first and see. That is obviously a very hard sell.
To manage large clusters and large systems, a very typical approach is "peak shaving and valley filling": break AI computing jobs into small task shards and place them on machines with free resources. Behind this sits a huge training task, and AI engineers need to do a great deal of work to make it happen.
When we trained the M6 model, we did not buy a single new machine; we simply exploited the "tidal effect" on the existing production cluster, squeezing out its idle computation to train the model.
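As a toy sketch of "peak shaving and valley filling" (the machine capacities, shard cost, and greedy policy are invented for illustration): split a big job into equal shards and repeatedly place each shard on the machine with the most free capacity:

```python
import heapq

def place_shards(free_capacity, shard_cost, num_shards):
    """Greedily assign equal-cost shards to the machines with the most
    free capacity, returning {machine: shard_count}. Shards that fit
    nowhere are left unplaced (a real scheduler would queue them)."""
    # Max-heap of (negative free capacity, machine name).
    heap = [(-cap, name) for name, cap in free_capacity.items()]
    heapq.heapify(heap)
    placement = {name: 0 for name in free_capacity}
    for _ in range(num_shards):
        neg_cap, name = heapq.heappop(heap)
        cap = -neg_cap
        if cap < shard_cost:  # even the freest machine has no room left
            heapq.heappush(heap, (neg_cap, name))
            break
        placement[name] += 1
        heapq.heappush(heap, (-(cap - shard_cost), name))
    return placement

# Three machines with different idle ("valley") capacity, in GPU-hours.
free = {"m1": 4.0, "m2": 1.0, "m3": 3.0}
print(place_shards(free, shard_cost=1.0, num_shards=6))
```

Production schedulers also handle preemption, data locality, and failure recovery, which is where most of the engineering effort mentioned above goes.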
The other is human cost. AI work is not as clear and goal-directed as SQL, where writing one statement drives a computing engine such as MaxCompute to pull up a pile of machines and run the calculation. Nor is AI like online services, where one machine is simply duplicated into several, the machines do not interact, and operations stay simple.
An AI program has to shuttle data among all kinds of machines and resources (between GPU and GPU, or between GPU and CPU), place an algorithm (a mathematical formula) on a parameter server, and tell machine A when to talk to machine B and machine B when to talk to machine C, preferably as fast as possible. As a result, AI engineers end up writing piles of extremely complex code that few people can understand.
AI engineers have all heard of concepts like data parallelism and model parallelism. Putting them into practice calls for a relatively simple programming paradigm, one that makes it easier to slice up clusters and computing needs and to distribute computation and communication well. But today's programming paradigms have not yet reached the point where everyone can readily understand one another's code, so the labor cost is very high.
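As an illustrative sketch of data parallelism (toy numbers, no real ML framework; the model `y = w * x` and all values are invented): each worker computes a gradient on its own slice of the batch, and the results are averaged as if synchronized through a parameter server or all-reduce:

```python
def local_gradient(w, samples):
    """Gradient of mean squared error for y = w * x on one worker's slice."""
    return sum(2 * (w * x - y) * x for x, y in samples) / len(samples)

def data_parallel_step(w, batch, num_workers, lr):
    """Split the batch across workers, average their gradients (the
    all-reduce / parameter-server step), and update the shared weight."""
    shards = [batch[i::num_workers] for i in range(num_workers)]
    grads = [local_gradient(w, shard) for shard in shards]
    avg = sum(grads) / len(grads)
    return w - lr * avg

batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # data for y = 2x
w = 0.0
for _ in range(50):
    w = data_parallel_step(w, batch, num_workers=2, lr=0.05)
print(round(w, 3))  # → 2.0
```

Model parallelism, by contrast, splits the model itself across machines, which is the case the Whale framework below is designed to simplify.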
In other words, on top of large amounts of data and computing power, an obvious need is to do resource scheduling and allocation better, and to make it easier for engineers to write distributed programs, especially ones that scale. This is the second facet of AI engineering.
For this we designed Whale, a relatively simple and clean programming framework that lets developers move more easily from the single-machine programming paradigm to the distributed one. For example, simply tell Whale to divide the model into 4 stages, and Whale automatically places those stages on different machines for computation.
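Whale's real API is not shown here; as a conceptual sketch of what "divide the model into stages and place them on machines" means (the layer names, machine names, and function are invented), pipeline-style stage assignment looks like this:

```python
def assign_stages(layers, num_stages, machines):
    """Split a list of model layers into contiguous stages and map each
    stage to a machine, pipeline-parallel style. Purely illustrative."""
    assert num_stages <= len(machines)
    per_stage = -(-len(layers) // num_stages)  # ceiling division
    plan = {}
    for s in range(num_stages):
        stage_layers = layers[s * per_stage:(s + 1) * per_stage]
        if stage_layers:
            plan[machines[s]] = stage_layers
    return plan

layers = ["embed", "block1", "block2", "block3",
          "block4", "block5", "block6", "head"]
print(assign_stages(layers, num_stages=4, machines=["gpu0", "gpu1", "gpu2", "gpu3"]))
```

The value of a framework is that the developer states only the intent ("4 stages") while placement, inter-stage communication, and pipelining are handled automatically.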
Third, from the demand side, AI engineering means standardizing and popularizing development and services.
AI has produced a lot of interesting models. Applying these models closely in real scenarios takes a great deal of work, and not everyone has the time to learn how to build, train, and deploy AI models.
Therefore, we have been thinking about how to make these sophisticated AI technologies easier for everyone to use.
The Alibaba Cloud machine learning platform PAI team, building on Alibaba Cloud's IaaS products, has built a complete AI development lifecycle management system on the cloud, from writing a model at the start, to training it, to deploying it. Within PAI, the Studio platform provides visual modeling, DLC (Deep Learning Containers) provides cloud-native one-stop deep learning training, DSW (Data Science Workshop) provides interactive modeling, and EAS (Elastic Algorithm Service) provides easier, worry-free model inference serving. Our goal is for an AI engineer to be able to start writing the first line of AI code within a few minutes.
To date, Alibaba Cloud's big data and AI platforms have served customers across many industries, such as Baosteel, Sany Group, Sichuan Rural Credit Union, Pacific Insurance, Xiaohongshu, VIPKID, Douyu, and Qinbao. We hope that our big data and AI capabilities can give enterprises the momentum to upgrade.
Copyright statement: content of this article is contributed spontaneously by Alibaba Cloud real-name registered users. The copyright belongs to the original author. The Alibaba Cloud Developer Community does not own the copyright, and does not assume corresponding legal responsibilities. For specific rules, please refer to the "Alibaba Cloud Developer Community User Service Agreement" and the "Alibaba Cloud Developer Community Intellectual Property Protection Guidelines". If you find suspected plagiarism in this community, fill in the infringement complaint form to report it. Once verified, the community will immediately delete the suspected infringing content.