Abstract: The outside world is full of curiosity about the Pangu large models. Two Huawei Cloud experts who participated in their development are here to answer your questions.

This article is shared from the HUAWEI CLOUD community post "Expert Answers | About the HUAWEI CLOUD Pangu large models, everything you want to ask is here~", original author: HWCloudAI.

On April 25th, HUAWEI CLOUD released the Pangu series of ultra-large-scale pre-trained models, including the world's largest visual (CV) pre-trained model, with 3 billion parameters, and the world's largest Chinese language (NLP) pre-trained model, with 100 billion parameters and 40 TB of training data, jointly developed with Recurrent AI and Pengcheng Lab.

Among them, the Pangu NLP model, jointly developed by Huawei Cloud, Recurrent AI, and Pengcheng Lab, has industry-leading language understanding and generation capabilities: on CLUE, the authoritative Chinese language understanding benchmark, it ranked first on the overall, classification, and reading-comprehension leaderboards, setting new records on all three. Its overall score of 83.046, with several sub-task scores leading the industry, is a big step toward the human level (85.61).

The outside world is full of curiosity about the Pangu large models. During the Huawei Developer Conference (Cloud), two Huawei Cloud experts who participated in their development answered the questions everyone cares about most.

Interview with Dr. Lingxi Xie


Q: As a developer, how easy are these pre-trained models to use, and how high is the cost of using them?

Lingxi: Pre-trained models are designed precisely to reduce everyone's cost of use. The pre-training process itself is expensive, but that cost does not need to be borne by developers. When you use the large models, their ease of use drives the cost down further, to a reasonable level. For example, we will provide some easy-to-understand pipelines: developers with a certain foundation can do more customized development on top of these pipelines to better unlock the capabilities of the pre-trained models, while developers who simply want to use a large model for basic AI development will get an even simpler interface, so that everyone can use the Pangu models in a drag-and-drop manner. Overall, when you use a pre-trained model, the computation time and the repeated cost of tuning drop to a very low level, so it is very developer-friendly.
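To make the low tuning cost concrete, here is a minimal sketch of the usual way a pre-trained vision model is reused: freeze the expensively pre-trained backbone and train only a small task head. It uses a generic torchvision backbone as a stand-in, since this article does not document the Pangu model's actual API; all names and sizes here are illustrative.

```python
# Minimal fine-tuning sketch with a generic pre-trained backbone.
# NOTE: torchvision's ResNet-50 stands in for a Pangu CV model here;
# the actual Pangu interfaces are not documented in this article.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # illustrative downstream task size

# Load weights whose pre-training cost was paid by someone else.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)

# Freeze the backbone: the developer only trains a small task head,
# which is what keeps tuning time and compute cost low.
for param in model.parameters():
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # new head, trainable

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One optimization step on a batch from the downstream dataset."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage with random stand-in data.
loss = train_step(torch.randn(4, 3, 224, 224), torch.randint(0, NUM_CLASSES, (4,)))
```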

Q: For someone new to computer vision, what knowledge do they need to master to get started quickly with study and research?

Lingxi: After decades of development, artificial intelligence and computer vision now have a very large body of knowledge. If a beginner tries to understand all of it before starting research, the efficiency will be low. My advice is to first find a problem during the learning process. At the beginning it may be a fairly rudimentary problem, but it must come with a concrete scenario. For example, if you want to do weakly supervised learning, you usually start from a practical problem that genuinely requires a weakly supervised algorithm. Do you have to master fully supervised learning first in order to do weakly supervised learning? Not at all. You can first read up on current weakly supervised methods: what the baselines are and where the frontier is. Then you can start running some simple experiments. In the course of those experiments you will usually hit difficulties or doubts, and resolving them will naturally lead you back to the foundations, such as how fully supervised learning is done. With more of that foundation in place, when you look back you will find you also understand the algorithm you are currently working on better.

So my suggestion is to find a textbook that introduces machine learning and computer vision in depth, but don't limit yourself to it: you will be more efficient working on concrete topics and learning the underlying knowledge at the same time.

Interview with Dr. Xiaopeng Zhang


Q: Where has the Pangu CV model been successfully deployed? Where does it stand compared with the rest of the industry?

Xiaopeng: The Pangu CV pre-trained model, combined with the related development process, has already been put into production inside Huawei and in cooperative projects. These directions cover many industries, including industrial vision, online content review, retail, and medical scenarios, and have achieved better results than were possible before without a pre-trained large model. In some scenarios, such as the remote-sensing image segmentation just mentioned, we designed a pre-training algorithm for remote-sensing images that improves segmentation accuracy by up to 12% without adding any extra annotation cost. There is another interesting phenomenon: a model pre-trained on super-large-scale image data transfers better. We took such a model and directly ran inference on industrial quality-inspection defect data, and we were very pleased to find that, without any fine-tuning on the downstream dataset, it beat our previous models, including those fine-tuned with the downstream data, generally by 3 to 4 percentage points. This tells us that once a model has seen enough data, its generalization ability is much better guaranteed.
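The "transfer without fine-tuning" result described above amounts to using the pre-trained network purely as a frozen feature extractor. The sketch below illustrates one common way to do that, classifying defect crops with a k-nearest-neighbour classifier over frozen embeddings; the backbone and the random stand-in data are assumptions, not Huawei's actual pipeline.

```python
# Sketch: transfer with zero fine-tuning via frozen features + k-NN.
# The backbone is a stand-in; Pangu's real model and data are not public here.
import torch
from torchvision import models
from sklearn.neighbors import KNeighborsClassifier

backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = torch.nn.Identity()  # drop the classification head
backbone.eval()

@torch.no_grad()
def embed(images: torch.Tensor) -> torch.Tensor:
    """Map a batch of images to frozen feature vectors."""
    return backbone(images)

# Random tensors stand in for a small labeled industrial-inspection set.
train_images = torch.randn(64, 3, 224, 224)
train_labels = torch.randint(0, 2, (64,))   # 0 = normal, 1 = defect
test_images = torch.randn(16, 3, 224, 224)

knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(embed(train_images).numpy(), train_labels.numpy())
pred = knn.predict(embed(test_images).numpy())  # predictions, no fine-tuning
```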

Second, we are one of the first companies in China to build large-scale visual pre-trained models. Abroad, Facebook and Google have been applying this to images since 2019; our own visual pre-training work also started at the end of 2019. Through a series of self-developed algorithmic improvements, our unsupervised pre-trained model was the first to match the fully supervised baseline in linear classification accuracy on ImageNet, and its small-sample (few-shot) learning performance is far ahead of existing techniques. These are industry-leading results.

Q: What kind of data and learning tasks does Huawei use for pre-training? How do you ensure performance when large models are deployed on the device side?

Xiaopeng: For the different viewing angles of visual images and the variations across scenes, the idea is actually very simple. First, we have a massive dataset, on the scale of a hundred million or even a billion images; we believe a dataset of this scale can model all aspects of the images in our real scenarios. Second is the learning method we adopted: its core idea is the contrastive self-supervised learning approach, based on global features, that became popular around 2019.
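For readers unfamiliar with the family of methods mentioned here, below is a minimal sketch of the InfoNCE-style loss at the core of contrastive self-supervised learning (as popularized by methods such as SimCLR and MoCo around 2019). It is illustrative only; Huawei's improved algorithm is not published in this article.

```python
# Minimal InfoNCE-style contrastive loss, as used by SimCLR-like methods.
# Illustrative only; not Huawei's proprietary algorithm.
import torch
import torch.nn.functional as F

def info_nce_loss(z1: torch.Tensor, z2: torch.Tensor,
                  temperature: float = 0.1) -> torch.Tensor:
    """z1, z2: (N, D) embeddings of two augmented views of the same N images."""
    z1 = F.normalize(z1, dim=1)
    z2 = F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature   # (N, N) cosine-similarity matrix
    targets = torch.arange(z1.size(0))   # positives sit on the diagonal
    # Each view must identify its counterpart among all other samples.
    return F.cross_entropy(logits, targets)

# Toy usage: random embeddings of two views of a batch of 32 images.
z1, z2 = torch.randn(32, 128), torch.randn(32, 128)
loss = info_nce_loss(z1, z2)
```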

Of course, we made many improvements on top of this, including how to use weak label information and how to extend global information to the local level to better model local correlations. This also echoes what I just mentioned: how to handle images from different viewpoints and at different scales, and how to model them efficiently. The approach is to apply different data augmentations. In our pre-training algorithm we integrate more than ten data augmentation methods, so that the model is trained across all of them and ends up invariant to each of these augmentations.
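As an illustration of training for invariance through many augmentations, a typical two-view augmentation pipeline looks like the torchvision sketch below. The specific transforms and parameters are assumptions (and fewer than the ten-plus methods mentioned above), not the ones used in Pangu.

```python
# Sketch of a multi-augmentation pipeline producing two views per image,
# the standard input to a contrastive objective. Transform choices are
# illustrative; Pangu's actual augmentation set is not listed in the article.
from torchvision import transforms

augment = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.2, 1.0)),  # scale invariance
    transforms.RandomHorizontalFlip(),                    # viewpoint change
    transforms.RandomApply([transforms.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8),
    transforms.RandomGrayscale(p=0.2),
    transforms.RandomApply([transforms.GaussianBlur(23)], p=0.5),
    transforms.RandomRotation(degrees=15),                # angle invariance
    transforms.ToTensor(),
])

def two_views(image):
    """Two independently augmented views of one image for contrastive learning."""
    return augment(image), augment(image)
```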

On top of the large model, we have built model distillation and extraction pipelines to produce industry-specific models. So far we have adapted more than a dozen pre-trained models, all obtained by extracting and distilling our one large model. They have brought significant improvements in their respective industries, while greatly reducing annotation cost and the model iteration cycle.
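Deriving many smaller industry models from one large model by distillation generally follows the classic soft-target recipe, in which a student network matches the temperature-softened outputs of the large teacher. The sketch below is that generic recipe, not Huawei's internal extraction/distillation pipeline; all parameter values are illustrative.

```python
# Generic knowledge-distillation loss (soft-target style):
# a small student matches the softened outputs of a frozen large teacher.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      labels: torch.Tensor,
                      T: float = 4.0,
                      alpha: float = 0.7) -> torch.Tensor:
    """Weighted sum of a soft-target KL term and ordinary cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)                      # rescale gradients for temperature T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage with random logits for a 10-class task.
s, t = torch.randn(8, 10), torch.randn(8, 10)
y = torch.randint(0, 10, (8,))
loss = distillation_loss(s, t, y)
```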

Q: How does Huawei's pre-trained model incorporate knowledge from different industries and solve the problem of needing large amounts of labeled data?

Xiaopeng: Let me give the example of the intelligent inspection of State Grid electric power that we presented at HDC.Cloud. It is a very typical case of using the Pangu CV large model to capture industry knowledge.

In developing the State Grid power inspection model, the data volume is huge and labeling is very difficult. What did we do? Using our visual pre-training algorithm, we pre-trained on the massive inspection data: dozens of terabytes of drone inspection images, millions of them in total. Through pre-training, the model sees a great deal of our data and its internal distribution. For a large model, the more parameters it has and the more data it has seen, the better it can model the subtle differences between the pictures taken during drone inspections.

Our large visual pre-trained model provides stronger representations of both defects and normal samples, so we reduced labeling cost by more than 80%, which is a very large saving in manpower. Beyond reducing annotation, a single one of our models can adapt to more than one hundred defect types in the power industry, which greatly shortens the model iteration cycle; the efficiency of the whole iteration improved by roughly 10 times, and with each iteration the annotators need to do less and less work. Through these two modes, using our visual pre-trained model has greatly improved development efficiency in the electric power industry.


