Abstract: VEGA is a full-pipeline AutoML algorithm suite developed in-house by Huawei's Noah's Ark Lab. It provides core capabilities for end-to-end machine learning automation, including architecture search, hyperparameter optimization, data augmentation, and model compression.
This article is shared from the Huawei Cloud Community post "VEGA: Introduction to Noah's AutoML High-performance Open Source Algorithm Set", by kourei.
Overview
VEGA is a full-pipeline AutoML algorithm suite developed in-house by Huawei's Noah's Ark Lab, providing core capabilities for end-to-end machine learning automation such as architecture search, hyperparameter optimization, data augmentation, and model compression. Most of the integrated algorithms have already been incorporated into Huawei's DaVinci full-stack AI solution (CANN + MindSpore), and preliminary tests show considerable advantages over GPUs. The next version of Vega is expected to add support for DaVinci.
As an automated machine learning tool tailored for researchers and algorithm engineers, VEGA was first released inside Huawei in December 2019. It has supported automated machine learning research across multiple teams in Noah's Ark Lab (computer vision, recommendation and search, and fundamental AI research) and has produced 20+ algorithms published at top AI conferences (CVPR/ICCV/ECCV/AAAI/ICLR/NeurIPS). The following introduces the representative AutoML algorithms open-sourced in this release:
Automated Network Architecture Search (NAS)
Efficient classification network search scheme based on hardware constraints (CARS)
Different application scenarios impose different computing-resource constraints, which naturally leads to different requirements on the search results. In addition, although evolutionary NAS methods achieve good performance, every generation of sampled networks must be retrained before it can be evaluated, which severely limits search efficiency. To address these shortcomings, this work proposes CARS, an efficient multi-objective neural architecture search method based on continuous evolution. CARS maintains a set of optimal models and uses them to update the parameters of a supernetwork; whenever the evolutionary algorithm generates the next-generation population, the networks inherit their parameters directly from the supernetwork, which greatly improves evolution efficiency. A single CARS search yields a series of models with different sizes and accuracies, and users can pick the model that fits the resource constraints of their application. Related work was published at CVPR 2020: https://arxiv.org/abs/1909.04977 .
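To make the continuous-evolution idea concrete, here is a minimal, self-contained Python sketch of the loop: candidate architectures share one set of "supernet" weights (a plain dictionary here), offspring reuse those shared weights instead of retraining from scratch, and only non-dominated accuracy/cost trade-offs survive each generation. The search space, the proxy accuracy and cost functions, and all names are illustrative assumptions, not the actual CARS or Vega implementation.

```python
# Toy sketch of CARS-style continuous evolution (not the real CARS/Vega code).
import random

NUM_LAYERS, CHOICES = 4, [0, 1, 2]      # toy search space: one choice per layer
# Shared "supernet" weights; in CARS these are real network parameters that the
# sampled sub-networks inherit and keep updating, so offspring need no retraining.
supernet = {(l, c): random.random() for l in range(NUM_LAYERS) for c in CHOICES}

def sample_arch():
    return tuple(random.choice(CHOICES) for _ in range(NUM_LAYERS))

def accuracy(arch):
    # Proxy objective; a real search evaluates the weight-inheriting sub-network.
    return sum(supernet[(l, c)] for l, c in enumerate(arch))

def cost(arch):
    return sum(arch) + 1                 # proxy for parameters / FLOPs

def dominates(b, a):
    """b dominates a: no worse in both objectives, strictly better in one."""
    return (accuracy(b) >= accuracy(a) and cost(b) <= cost(a)
            and (accuracy(b) > accuracy(a) or cost(b) < cost(a)))

def pareto_front(pop):
    return [a for a in pop if not any(dominates(b, a) for b in pop)]

population = [sample_arch() for _ in range(8)]
for _ in range(5):
    offspring = [sample_arch() for _ in range(8)]   # children reuse shared weights
    population = pareto_front(population + offspring)

# The surviving set spans different size/accuracy trade-offs, as in CARS.
print(sorted(set(population)))
```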
Lightweight super-resolution network architecture search (ESR-EA)
Noah's Ark Lab proposed a lightweight super-resolution architecture search algorithm that builds efficient super-resolution building blocks from three perspectives: channels, convolutions, and feature scales. Based on these efficient modules, the algorithm takes model parameters, computation, and accuracy as objectives and uses a multi-objective evolutionary algorithm to search for lightweight super-resolution architectures, comprehensively compressing the redundancy of the network along the channel, convolution, and feature-scale dimensions. Experiments show that, with the same number of parameters or amount of computation, the searched lightweight super-resolution network (ESRN) achieves better results on the standard test sets (Set5, Set14, B100, Urban100) than manually designed architectures such as CARN. The algorithm can also reduce computation while preserving accuracy, meeting the latency and power constraints of mobile devices. The related paper was published at AAAI 2020: https://www.aaai.org/Papers/AAAI/2020GB/AAAI-SongD.4016.pdf .
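As a rough illustration of what "compressing redundancy along the channel, convolution, and feature-scale dimensions" can look like inside an evolutionary encoding, the sketch below describes each super-resolution block by its convolution type, channel width, and feature scale, and estimates the parameter count of a candidate network. The gene format and cost model are assumptions made for illustration, not the ESR-EA implementation.

```python
# Illustrative per-block encoding for a lightweight super-resolution candidate.
from collections import namedtuple

Block = namedtuple("Block", ["conv_type", "channels", "scale"])   # per-block genes

def block_params(block, in_channels, k=3):
    """Rough 3x3-conv parameter count under the chosen convolution type."""
    if block.conv_type == "standard":
        return in_channels * block.channels * k * k
    if block.conv_type == "group":                 # grouped conv, assume 4 groups
        return in_channels * block.channels * k * k // 4
    if block.conv_type == "depthwise_separable":   # depthwise + pointwise
        return in_channels * k * k + in_channels * block.channels
    raise ValueError(block.conv_type)

def network_params(blocks, input_channels=3):
    total, c_in = 0, input_channels
    for b in blocks:
        total += block_params(b, c_in)
        c_in = b.channels
    return total

candidate = [Block("depthwise_separable", 32, 1.0),
             Block("group", 48, 0.5),              # reduced feature scale saves FLOPs
             Block("standard", 64, 1.0)]
print(network_params(candidate), "parameters (rough estimate)")
```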
End-to-end detection network architecture search solution (SM-NAS)
Existing object detection models can be decoupled into several main parts: the backbone, the feature fusion network (neck), the RPN, and the RCNN head. Each part may use different modules and structural designs, and how to trade off the computational cost and accuracy of different combinations is an important question. Existing object detection NAS methods (NAS-FPN, DetNAS, etc.) only search for better designs of individual modules, such as the backbone or the feature fusion network, and ignore the system as a whole. To address this, we propose a two-stage, structural-to-modular neural architecture search strategy named SM-NAS. Specifically, the structural stage performs a coarse search over model architectures to determine the best overall structure for the task (e.g., a one-stage or two-stage detector, which type of backbone, etc.) together with the matching input image size; the modular stage then fine-tunes the backbone modules to further improve performance. The search uses an evolutionary algorithm that jointly optimizes model efficiency and model performance, building a Pareto front with non-dominated sorting to obtain a series of architectures that are simultaneously optimal across multiple objectives. We also explore an effective training strategy that, without ImageNet pre-training, converges faster than training from a pre-trained model, so that the performance of any backbone can be evaluated more quickly and accurately. On the COCO dataset, the searched models are significantly ahead of traditional detection architectures in both speed and accuracy: our E2 model is twice as fast as Faster R-CNN with an mAP of 40% (a 1% improvement), and our E5 model matches the speed of Mask R-CNN with an mAP of 46% (a 6% improvement). Related work was published at AAAI 2020: https://arxiv.org/abs/1911.09929 .
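The sketch below illustrates the two-level (structural, then modular) encoding idea: a coarse structural candidate fixes the detector family, backbone family, neck, and input size, and a modular pass then enumerates backbone variants for that structure. The search space and all names are invented for illustration and are not the actual SM-NAS code.

```python
# Sketch of a structural-to-modular search encoding in the spirit of SM-NAS.
import random

# Stage 1: coarse, structure-level choices for the whole detector.
structural_space = {
    "detector":   ["one_stage", "two_stage"],
    "backbone":   ["resnet", "mobilenet"],
    "neck":       ["fpn", "none"],
    "input_size": [512, 800],
}

def sample_structural():
    # In practice this stage keeps a Pareto front of cost/accuracy, not one sample.
    return {k: random.choice(v) for k, v in structural_space.items()}

# Stage 2: modular refinement of the backbone for one chosen structure.
def modular_candidates(structure, depth_options=((2, 2, 2, 2), (3, 4, 6, 3))):
    for stage_depths in depth_options:
        yield {**structure, "backbone_depths": stage_depths}

best_structure = sample_structural()
for cand in modular_candidates(best_structure):
    print(cand)
```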
Efficient detection network backbone architecture search solution (SP-NAS)
We use neural architecture search (NAS) to automatically design task-specific backbone networks, bridging the domain gap between classification and detection tasks. Common deep learning object detectors usually reuse a backbone that was designed and trained for ImageNet classification. The existing algorithm DetNAS turns the search for a detection backbone into pre-training a weight-sharing supernetwork and selecting the best sub-network from it; however, this pre-defined supernetwork cannot reflect the actual performance of the sampled substructures, and its search space is very small. To design a flexible, task-oriented detection backbone with NAS, we propose a two-stage search algorithm called SP-NAS (serial-to-parallel search). Specifically, the serial search stage efficiently finds, via "swap, expand, and focus" search operations, the serial sequence with the best receptive-field ratios and output channels across the feature hierarchy; the parallel search stage then automatically searches for and assembles several substructures together with the previously generated backbone into a more powerful parallel-structured backbone. We verified SP-NAS on multiple detection datasets, and the searched architectures achieve SOTA results, including first place on the public EuroCityPersons pedestrian detection leaderboard (LAMR: 0.042), with accuracy and speed better than DetNAS and Auto-FPN. Related work was published at CVPR 2020: https://openaccess.thecvf.com/content_CVPR_2020/papers/Jiang_SP-NAS_Serial-to-Parallel_Backbone_Search_for_Object_Detection_CVPR_2020_paper.pdf .
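As a toy illustration of the serial-stage operations, the sketch below encodes a serial backbone as a string of stage indices and applies "swap / expand / focus"-style mutations to it. The encoding and the exact behavior of each operator are assumptions made for illustration; the real SP-NAS operators act on full network definitions.

```python
# Toy "swap / expand / focus" mutations on a serial backbone encoding.
import random

def swap(arch):
    """Exchange two adjacent blocks, shifting where resolution/receptive field grows."""
    i = random.randrange(len(arch) - 1)
    a = list(arch)
    a[i], a[i + 1] = a[i + 1], a[i]
    return "".join(a)

def expand(arch):
    """Insert an extra block to deepen the backbone."""
    i = random.randrange(len(arch))
    return arch[:i] + arch[i] + arch[i:]

def focus(arch, stage="3"):
    """Bias capacity toward one chosen stage (here: duplicate the stage-'3' blocks)."""
    return "".join(c * 2 if c == stage else c for c in arch)

serial = "1223334"    # each digit marks which resolution stage a block belongs to
print(swap(serial), expand(serial), focus(serial))
```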
AutoTrain
A training regularization method that goes beyond Google's Dropout (Disout)
To extract important features from a given dataset, deep neural networks usually contain a large number of trainable parameters. On the one hand, these parameters boost performance; on the other hand, they cause overfitting. Dropout-based methods therefore disable certain elements of the output feature maps during training to reduce co-adaptation between neurons. Although such methods improve the generalization of the resulting model, simply deciding whether or not to discard an element is not the best solution. We therefore studied the empirical Rademacher complexity of the intermediate layers of deep neural networks and proposed a feature-map perturbation method (Disout). During training, guided by the upper bound of the generalization error, randomly selected elements of the feature maps are replaced with specific values. Experiments show that the proposed perturbation method achieves higher test accuracy on multiple image datasets. Related work was published at AAAI 2020: https://arxiv.org/abs/2002.11022 .
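A minimal PyTorch-style sketch of the core idea, perturbing feature-map elements rather than simply dropping them, is shown below: during training, randomly selected activations are replaced with a perturbation value instead of being zeroed. The module name, hyperparameters, and the specific perturbation formula are assumptions, not the released Disout implementation.

```python
# Sketch of feature-map perturbation in the spirit of Disout (not the real code).
import torch
import torch.nn as nn

class FeaturePerturbation(nn.Module):
    def __init__(self, prob=0.1, alpha=1.0):
        super().__init__()
        self.prob = prob      # fraction of activations to perturb
        self.alpha = alpha    # scale of the perturbation value

    def forward(self, x):
        if not self.training or self.prob == 0.0:
            return x
        mask = (torch.rand_like(x) < self.prob).float()
        # Replace selected elements with a value tied to the feature range,
        # instead of zeroing them as Dropout would.
        disturbance = self.alpha * (x.max() - x.min()) * (torch.rand_like(x) - 0.5)
        return x * (1 - mask) + disturbance * mask

features = torch.randn(4, 16, 8, 8)
layer = FeaturePerturbation(prob=0.1)
layer.train()
print(layer(features).shape)
```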
Using knowledge distillation to suppress the noise of automatic data augmentation (KD+AA)
This algorithm addresses some inherent drawbacks of automatic data augmentation (AA). AA searches for the best augmentation policy for the whole dataset; globally this makes the data more diverse and improves the final model, but AA is relatively coarse and does not optimize individual images, so it has certain drawbacks. When the augmentation is strong, it can cause semantic confusion for some images, i.e., the image semantics change because too much discriminative information is removed. For example, if heavy augmentation crops the fox out of an image labeled "fox", it is clearly inappropriate to keep using the original "fox" label to guide training. To solve this problem, we use knowledge distillation (KD): a pre-trained model generates soft labels, and these labels indicate what the best label for an AA-augmented image should be. The algorithm is simple and effective; combined with a large model it reaches 85.8% on ImageNet, the best performance at the time. The related paper was published at ECCV 2020: https://arxiv.org/abs/2003.11342v1 .
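The sketch below shows the distillation side of this idea in PyTorch: a pre-trained teacher produces soft labels for the augmented batch, and the student is trained on a mix of that soft target and the original hard label. The tiny stand-in networks, temperature, and loss weighting are assumptions for illustration only.

```python
# Sketch of soft-label supervision on augmented images (KD+AA spirit).
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Linear(32, 10)   # stand-in for a pre-trained, high-accuracy teacher
student = nn.Linear(32, 10)
temperature = 4.0

def kd_aa_loss(augmented_batch, hard_labels, alpha=0.5):
    with torch.no_grad():
        # Soft labels reflect what is actually visible in the augmented image,
        # unlike the original hard label, which may no longer match its content.
        soft = F.softmax(teacher(augmented_batch) / temperature, dim=1)
    logits = student(augmented_batch)
    kd = F.kl_div(F.log_softmax(logits / temperature, dim=1), soft,
                  reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(logits, hard_labels)
    return alpha * kd + (1 - alpha) * ce

x = torch.randn(8, 32)                    # pretend these are augmented images
y = torch.randint(0, 10, (8,))
print(kd_aa_loss(x, y).item())
```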
Automatic data generation (AutoData)
Low-cost image enhancement data acquisition scheme based on generative models (CycleSR)
In specific image enhancement tasks (e.g., super-resolution), paired data from real scenes is hard to obtain, so most academic work studies algorithms on synthetic paired data; however, models trained on synthetic data often perform poorly in real scenes. To address this, we propose a novel algorithm that uses synthetic low-quality images as a bridge: an unsupervised image translation network maps the synthetic image domain to the real-scene image domain, and the translated images are then used to supervise the training of the image enhancement network. The algorithm is flexible enough to integrate any unsupervised translation model and image enhancement model. The translation network and the supervised enhancement network are trained jointly and cooperate with each other, achieving better degradation learning and super-resolution performance. The proposed method achieves very good results on the NTIRE 2017 and NTIRE 2018 datasets, even comparable to supervised methods; it was adopted by the AITA-Noah team in the NTIRE 2020 Real-World Super-Resolution challenge, achieving first place in LPIPS and second place in MOS on track 1. The related paper was published in the CVPR 2020 Workshop on NTIRE: https://openaccess.thecvf.com/content_CVPRW_2020/papers/w31/Chen_Unsupervised_Image_Super-Resolution_With_an_Indirect_Supervised_Path_CVPRW_2020_paper.pdf .
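A conceptual PyTorch sketch of the joint training loop follows: a translation network maps synthetic low-quality images toward the real-image domain, and the super-resolution network is supervised on the translated images, with both trained together. The stand-in modules, the dummy domain loss, and the loss weights are assumptions, not the CycleSR implementation.

```python
# Conceptual joint training of a translation network and an SR network.
import torch
import torch.nn as nn
import torch.nn.functional as F

translate = nn.Conv2d(3, 3, 3, padding=1)           # synthetic-LR -> realistic-LR (stand-in)
sr_net = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1),
                       nn.Upsample(scale_factor=2, mode="bilinear", align_corners=False))
optimizer = torch.optim.Adam(list(translate.parameters()) + list(sr_net.parameters()), lr=1e-4)

def train_step(synthetic_lr, hr, unsupervised_domain_loss):
    realistic_lr = translate(synthetic_lr)           # bridge: synthetic -> real-looking LR
    sr = sr_net(realistic_lr)                        # SR supervised via the translated image
    loss = F.l1_loss(sr, hr) + 0.1 * unsupervised_domain_loss(realistic_lr)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

lr_batch, hr_batch = torch.randn(2, 3, 32, 32), torch.randn(2, 3, 64, 64)
print(train_step(lr_batch, hr_batch, lambda img: img.abs().mean()))  # dummy domain loss
```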
Automatic network compression (AutoCompress)
Automatic neural network compression based on evolutionary strategies
This technology targets automatic neural network compression. Taking the compressed model's recognition accuracy, computation, storage, and running speed as objectives, it uses a multi-objective evolutionary algorithm to search for the optimal per-layer compression hyperparameters for techniques such as mixed-bit quantization and sparse pruning, producing a non-dominated set of compressed models with excellent performance that can meet users' different requirements on different metrics. The technology suits both high-performance cloud servers and mobile devices with limited compute: for cloud servers it provides models with high accuracy and bounded computation and memory consumption; for mobile devices it reduces computation and memory consumption while preserving accuracy, meeting their latency and power constraints. The related paper was published at KDD 2018: https://www.kdd.org/kdd2018/accepted-papers/view/towards-evolutionary-compression .
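To illustrate what per-layer compression hyperparameters might look like as an evolutionary chromosome, the sketch below encodes a quantization bit-width and a pruning ratio for each layer and estimates the resulting model size; a real system would also evaluate accuracy and latency. The layer sizes, gene choices, and cost model are illustrative assumptions.

```python
# Toy chromosome of per-layer compression hyperparameters (bits + pruning ratio).
import random

layer_params = [4_000, 16_000, 64_000, 2_000]      # toy per-layer parameter counts

def random_chromosome():
    return [{"bits": random.choice([2, 4, 8]), "prune": random.choice([0.0, 0.3, 0.5])}
            for _ in layer_params]

def model_size_bits(genes):
    """Storage cost after pruning and quantizing each layer."""
    return sum(int(n * (1 - g["prune"])) * g["bits"]
               for n, g in zip(layer_params, genes))

def mutate(genes):
    child = [dict(g) for g in genes]
    i = random.randrange(len(child))
    child[i]["bits"] = random.choice([2, 4, 8])    # re-sample one layer's bit-width
    return child

parent = random_chromosome()
child = mutate(parent)
print(model_size_bits(parent), "->", model_size_bits(child), "bits")
```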
This open-source release is a preliminary stable version. Going forward, the most cutting-edge algorithms will be added, along with support for new algorithms and for DaVinci. The open-source address is https://github.com/huawei-noah/vega ; please try it out and give feedback.
Vega has the following advantages:
• High-performance Model Zoo: presets a large number of Noah's leading deep learning models and provides the best-performing models on datasets such as ImageNet, MS COCO, NuScenes, and NTIRE. These models represent Noah's latest AutoML research results and can be used directly: https://github.com/huawei-noah/vega/blob/master/docs/en/model_zoo/ .
• Hardware-affinity model optimization: Vega defines an Evaluator module that can be deployed directly on devices for inference, and supports running on multiple devices at the same time, such as mobile phones and DaVinci chips.
• Benchmark reproduction: provides a benchmark tool to help you reproduce the algorithms provided by Vega.
• Support for multiple stages of the deep learning life cycle, flexibly invoked through pipeline orchestration: built-in components for architecture search, hyperparameter optimization, loss-function design, data augmentation, full training, and more. Each component is called a Step, and multiple Steps can be connected in series to form an end-to-end solution, which makes it easy to experiment with different ideas, enlarge the search space, and find better models (see the conceptual sketch after this list).
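To convey the pipeline-of-Steps idea, here is a purely conceptual Python sketch in which steps run in series and pass a shared context from one to the next. It is not the actual Vega API or configuration format; Vega's own documentation describes the real Step and pipeline definitions.

```python
# Conceptual pipeline of steps run in series (illustration only, not the Vega API).
def nas_step(context):
    context["architecture"] = "searched-arch"              # placeholder search result
    return context

def hpo_step(context):
    context["hyperparameters"] = {"lr": 0.1, "batch_size": 256}
    return context

def fully_train_step(context):
    context["model"] = f"{context['architecture']} trained with {context['hyperparameters']}"
    return context

pipeline = [nas_step, hpo_step, fully_train_step]          # steps share one context
state = {}
for step in pipeline:
    state = step(state)
print(state["model"])
```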
Finally, VEGA provides a large number of example documents to help developers get started quickly. For the complete Chinese and English documentation, please refer to: https://github.com/huawei-noah/vega/tree/master/docs .