
Author: Lu You

Background

With the rapid development of the mobile internet, artificial intelligence is applied ever more widely on mobile devices, and on-device intelligence plays an important role in core scenarios such as image recognition, video detection, and data computation. Python is the preferred language for on-device algorithm development. Alibaba has built an ecosystem for on-device AI R&D that includes a Python virtual machine, a series of data/vision Python runtime extension libraries, a Python task scheduling library, and supporting facilities such as a task release system.

The main scenarios for on-device intelligence fall into two fields: data computation and visual content understanding. If algorithm development happened entirely on the PC, things would be much simpler, because the PC environment has mature official support for Python. On mobile, however, the deployment, debugging, and verification of algorithms are still in a "slash-and-burn" era. Algorithm engineers currently rely on two methods:

  • Log debugging: completely detached from the running code; logs are inserted into the code to verify the program's logic and results.
  • Simulating the device environment on the PC: to enable breakpoint debugging, all the libraries that the device-side Python code depends on are built and installed on the PC, so the Python program can run independently and be debugged in a general-purpose IDE (such as PyCharm) on the PC, with data inputs mocked.

Logging can of course verify results and locate problems, but once a project becomes even slightly complex, productivity drops sharply. Running independently on the PC, detached from the mobile environment, does not reflect the real operating conditions (live data, runtime performance, and so on), and consistency between PC-side and device-side data cannot be guaranteed.

The common problem with both approaches is that they are detached from the environment in which the code actually runs, which leads to a lack of debugging information. The popularity of any language among developers is inseparable from its IDE and debugging tools. Just as Xcode can debug the Objective-C code of a mobile application, we need real on-device deployment and debugging tools for intelligent algorithms to improve the efficiency of AI R&D and production.

MNN workbench

MNN is a lightweight deep learning inference engine open-sourced by Alibaba. At its core it solves the problem of on-device inference for deep neural network models, covering model optimization, conversion, and inference.

Open source address: https://github.com/alibaba/MNN

Beyond building the MNN engine, the MNN team is also deeply involved in Alibaba's internal on-device AI practice. The high barrier to applying AI and the long deployment pipeline for algorithm models are long-standing problems that have plagued us. To solve them, we distilled the solutions accumulated across hundreds of production cases into a one-stop AI R&D workflow: the MNN workbench (download it at www.mnn.zone).

The MNN workbench provides an R&D workflow for on-device AI. It is also a general-purpose debugging editor for mobile Python: besides standard device-side Python debugging, it covers the common development and deployment scenarios of intelligent algorithms inside Alibaba Group, serving as an efficiency tool for algorithm R&D.

The workbench provides a VSCode-style Python integrated development environment. Using it is very simple, generally three steps:

  1. Establish a connection: scan the QR code shown on the workbench with the application to connect the phone and the PC over the local network
  2. Push code: the project on the workbench is pushed to the device-side application over the LAN
  3. Run: the device-side application triggers execution of the device-side Python code; if breakpoints are set, the code can be debugged


End computing research and development

The MNN workbench also provides an efficient local debugging and deployment solution for end computing, an important on-device scenario at Alibaba, covering its main scenarios: Walle/Jarvis and CV.

R&D process

Before the workbench, end computing development relied on platform releases: during development, debug logs were added to the code, the task was released to the platform through pre-release/beta, and the developer scanned a code to connect to the platform, read the logs, and located and analyzed problems from them, iterating back and forth. This approach is primitive and the develop-debug cycle is long: each modification or verification costs a round of uploading and publishing, and the only debugging information is the log, so overall development efficiency is low.

The figure above shows the general process of using the workbench for end computing R&D:

  1. Create a new end computing project, or open an end computing project cloned from the platform
  2. Scan the code to connect the device to the workbench; from then on, algorithm logs are output in the workbench in real time
  3. Develop code, modify configuration, and add resources in the workbench IDE
  4. Run the task: it is deployed to the device via local push. Either the online or the pre-release environment can be used, and no local code change affects what is online, so there is no need to worry about stability
  5. Debug with breakpoints; inspect the stack, variables, and so on
  6. Once local testing on the workbench passes, commit the code and proceed with the normal platform release

The workbench effectively solves the algorithm's debugging and deployment problems during the development phase. Its main advantages are:

  • Local deployment: the development phase can be fully decoupled from the platform
  • Real-time Python debugging of end computing projects
  • Device-side deployment is not restricted by the pre-release/online environment or by release windows
  • The runtime performance of the device-side algorithm can be viewed in real time

Persistent debugging

An end computing task is usually triggered by UT events or called actively by the host project. Whichever way it is triggered, the code only needs to be pushed once: it runs, and can be debugged, on every trigger, and needs to be pushed again only after being modified. This is the workbench's persistent debugging capability.

Real-time log

The workbench prints various types of logs in real time once the device has successfully connected to it. The supported log types are (a trivial example follows the list):

  • Current Task: logs of the current project, including its ordinary print output and its run logs
  • Python: all Python logs in the application
  • C: logs printed through the C interface provided by end computing
  • OC/Java: logs printed through the Java/Objective-C interface in end computing
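
To illustrate the first two categories, here is a minimal, hypothetical task script; it uses nothing but plain Python, and its print output would appear in the workbench's real-time log stream once the device is connected.

```python
# Minimal hypothetical task script: plain Python only.
# Its print output surfaces in the workbench's real-time log.
def run_task():
    values = [1, 2, 3]
    print("task input:", values)        # ordinary print log
    print("task output:", sum(values))  # trace intermediate results

run_task()
```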

File browser

End computing development often requires viewing or processing files local to the device, such as database files, deep learning models, image resources, and scripts. The workbench provides a visual device file browser: on iOS devices it can browse the application sandbox directory, and on Android devices browsing works as in Android Studio. It also supports saving commonly used directories/files locally and copying their paths.

Joint debugging of second-party libraries

Sometimes the algorithm needs to inspect the running state of code inside a dependent library. The workbench also supports developing Python second-party libraries: on local deployment, the dependent libraries specified by the algorithm are pushed to the device together with the project, and the debugger can step through breakpoints into the library code.

The figure above shows a Jarvis project jointly debugging code in the Jarvis base library; log.log is a printing method in that base library. Joint debugging of a second-party library affects only the current project and has no impact on other projects.

Visual content understanding research and development

Walle/Jarvis are the data scenarios of end computing. Beyond those, a large share of algorithms process videos and images, most commonly camera/video-stream input and in-memory images. Typical examples such as image classification and object detection are applications of algorithms in computer vision (CV) scenarios, referred to here as vision algorithms.

CV deployment method

The end computing framework provides a rich set of Python runtime extension libraries for data algorithms; on top of these libraries, algorithms develop various Python computing tasks and deploy them to run on the device. For vision scenarios, processing images and video streams generally involves the computer vision library OpenCV and the numerical computing library NumPy. The earlier device-side Python runtime did not include them, so a vision algorithm developed in Python on the PC could not run directly on the device, and deployment by SDK integration was generally used instead:

The algorithm is developed in a PC environment using Python, usually on top of base libraries such as OpenCV/NumPy. When porting to the device, although these base libraries were not yet supported there, they could be replaced with similar C/C++ implementations: for example, format conversion written with OpenCV could be re-implemented with MNN CV, together with performance optimizations for the device-side environment. The later (blue) part of the pipeline requires deep involvement from engineering: packaging the SDK for Android/iOS, building a local demo application for verification, and finally integrating it into the application.
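
For a sense of what must be ported, here is a minimal sketch of the kind of PC-side preprocessing a vision algorithm typically writes with OpenCV and NumPy; the image path and input size are placeholders, and the normalization details depend on the model.

```python
import cv2
import numpy as np

def preprocess(image_path, size=(224, 224)):
    """Typical OpenCV/NumPy preprocessing that previously had to be
    re-implemented in C/C++ before running on the device."""
    img = cv2.imread(image_path)                 # BGR image from disk
    img = cv2.resize(img, size)                  # scale to the model input size
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)   # assume the model expects RGB
    img = img.astype(np.float32) / 255.0         # normalize to [0, 1]
    return np.transpose(img, (2, 0, 1))[None]    # HWC -> NCHW with batch dim

blob = preprocess("test.jpg")
print(blob.shape)  # (1, 3, 224, 224)
```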

The main disadvantages are:

  1. The device-side part of the vision algorithm must be re-implemented and optimized, which is costly
  2. The algorithm depends heavily on engineering collaboration, and the engineering cost of maintaining the SDK (an SDK centered on the algorithm) is also high
  3. Testing and verification require engineering to build a demo app; the algorithm generally cannot be tested directly inside the target application, such as Mobile Taobao
  4. Once a defect appears after release, whether in the algorithm or in the engineering, the iteration cycle depends on the application's release schedule, generally more than a week

MNNKit is the most typical example of deploying vision algorithms by SDK integration.

In the past, vision algorithms could only be integrated into applications as C/C++ code. The device-side runtime has now been extended with CV capabilities (OpenCV/NumPy/MNN/MNNRuntime), so a vision algorithm can be dynamically deployed to the device through the workbench or the publishing platform, just like an ordinary Walle task. Typical services deployed this way include white-screen detection in the Tmall app and smart highlights in Taobao Live.
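
As a rough illustration of what such a Python task can now do on the device, here is a sketch of image-classification inference with MNN's Python API, following the pattern of MNN's public Python demos; the model file and the input/output shapes are placeholder assumptions, and API details may vary across MNN releases.

```python
import MNN
import numpy as np

# Placeholder model: a classifier with a 1x3x224x224 float input
# and a 1x1001 score output.
interpreter = MNN.Interpreter("model.mnn")
session = interpreter.createSession()
input_tensor = interpreter.getSessionInput(session)

data = np.random.rand(1, 3, 224, 224).astype(np.float32)  # stand-in for a preprocessed frame
tmp_input = MNN.Tensor((1, 3, 224, 224), MNN.Halide_Type_Float,
                       data, MNN.Tensor_DimensionType_Caffe)
input_tensor.copyFrom(tmp_input)

interpreter.runSession(session)
output_tensor = interpreter.getSessionOutput(session)

# Copy to a host tensor in case the output uses an internal device layout
tmp_output = MNN.Tensor((1, 1001), MNN.Halide_Type_Float,
                        np.zeros((1, 1001), dtype=np.float32),
                        MNN.Tensor_DimensionType_Caffe)
output_tensor.copyToHostTensor(tmp_output)
print("predicted class:", np.argmax(tmp_output.getData()))
```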

The main advantages of runtime deployment of CV algorithms are:

  • The algorithm team can build its own visual verification Playground demo (see "Three ends in one" below)
  • Like ordinary computing tasks, CV algorithms are developed in Python and released dynamically, with a short iteration cycle
  • The workbench can be used for local deployment and for debugging the algorithm inside the application, in both pre-release and online environments

CV R&D process

The figure above shows the complete process of developing a CV algorithm with the workbench. After the algorithm is first produced, it goes through three stages of development and testing:

  1. Unit test: the algorithm verifies the correctness of its visual effects in a standalone demo;
  2. Pre-integration test: the algorithm is tested locally inside the target application and its related business. Because the algorithm is not yet deployed through the platform at this point, this is called the pre-integration test;
  3. Integration test: the algorithm is deployed through the platform and regression-tested in pre-release, followed by online grayscale/formal release;

Integration testing happens on the platform before launch, while unit testing and pre-integration testing can both be performed in the workbench environment.

Three ends in one

Unlike end computing data algorithms, a newly developed CV model is generally verified locally first, independent of the integrating application; this is the unit test in a demo. For example, when developing an image algorithm such as object detection, you want to see on the device whether the detection results are correct and the localization accurate, which requires an Android/iOS application that integrates the model and the code, plus upper-layer code on top (typically an MNNKit demo). Algorithm engineers usually do not know Android/iOS development and often need help from engineering, so the collaboration cost is high.

Is there a way for the algorithm team, independent of engineering, to write an Android/iOS Playground verification application themselves?

  • Cross-platform effect debugging library DebugUI

To address the algorithm's pain points in verifying model effects, the MNN workbench provides a "three ends in one" solution. Simply put, we provide DebugUI, a cross-platform Python extension library for effect debugging: with DebugUI's Python API, the algorithm can quickly build a visual verification application that is written on one end and runs on three ends.

The DebugUI extension library abstracts the core steps of visual effect verification into a streamlined, practical Python interface that covers the actual verification scenarios of today's vision algorithms (a deliberately hypothetical sketch follows the list):

  • Input modes: the usual data sources are just images and videos
  • Interactive components: simple controls, e.g. adjusting a threshold with a slider, switching models with a selector, toggling a feature with a switch
  • Rendering components: display the results, e.g. classification scores rendered as text, face positions rendered as key points, portrait segmentation rendered as an image
  • Data callbacks: the camera, album selection, and interactive-component events all produce data callbacks handed to the algorithm for processing
  • Environment variables: the algorithm can read variables set by the project, such as a local file path
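
The sketch below is purely illustrative: every identifier in it (the debugui module, slider, camera, draw_rects, run) is a hypothetical placeholder, not the actual DebugUI API; only the structure (input source, interactive control, data callback, rendering) follows the list above.

```python
import debugui  # HYPOTHETICAL module name, not the real DebugUI import

def detect_faces(frame, score_threshold):
    # Stand-in for the user's vision algorithm
    return []

# Interactive component: a slider controlling the detection threshold
threshold = debugui.slider("score threshold", min=0.0, max=1.0, value=0.5)

# Data callback: invoked with each camera frame
def on_frame(frame):
    boxes = detect_faces(frame, score_threshold=threshold.value)
    debugui.draw_rects(frame, boxes)  # rendering component: overlay results

debugui.camera(callback=on_frame)  # input mode: live camera stream
debugui.run()
```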

The workbench model market ships with some commonly used, out-of-the-box algorithms, such as comic face, face detection, OCR, portrait segmentation, and cartoon style; all of them are Playground sample applications that algorithm teams built with DebugUI. The figure above shows the Playground example of the face detection algorithm: running it on Mac, Android, and iOS gives the same detection results. This is "written on one end, run on three ends".

  • Playground code debugging

A Playground built with DebugUI also supports breakpoint debugging of its Python code, which is very helpful during the model unit-test phase.

CV algorithm debugging

After using "three ends in one" to quickly build a visual Playground application and verify the algorithm's effect, the next step is to deploy the algorithm module locally through the workbench into the integrating application (such as Mobile Taobao). In the application environment, business modules actively call and trigger the algorithm, so it can be debugged inside the integrating application. This is the pre-integration test stage.

Once local debugging on the workbench passes, pre-release/beta testing is carried out on the platform. This is platform deployment, but into the application's pre-release environment, i.e. the integration test stage. Once the integration test passes, the algorithm is formally released.

As shown in the example, Taobao Live uses the workbench for local deployment of a CV algorithm during the pre-integration test phase. The host project actively initializes the algorithm, and the camera runs inference every few seconds, hitting the debugging breakpoints in the corresponding methods; this makes it convenient to debug the CV algorithm inside the application.

Performance evaluation

Algorithms often need to evaluate the runtime performance of their code for targeted optimization, especially real-time CV algorithms. The MNN workbench displays the performance of Python code running on the device in real time: running a Python project in "profile" mode shows the execution path of the code, along with the execution time and hit count of each line on that path. This is very helpful for evaluating an algorithm's performance in the real application environment and for locating optimization bottlenecks.
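
The workbench's profile mode is its own on-device tooling; as a rough standalone analogue, Python's built-in cProfile reports per-function call counts and timings (per-line resolution like the workbench's would need a tool such as line_profiler):

```python
import cProfile
import pstats

def hot_loop():
    # Deliberately heavy function so it shows up in the profile
    total = 0
    for i in range(1_000_000):
        total += i * i
    return total

cProfile.run("hot_loop()", "stats.out")  # record call counts and timings
pstats.Stats("stats.out").sort_stats("cumulative").print_stats(5)
```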


As shown in the figure, profile mode is used to evaluate the performance of a CV algorithm in Taobao Live: the running time and hit count are displayed in real time next to the corresponding code lines. Profile mode applies to every Python project that can be deployed through the workbench, so Walle, Jarvis, CV, and Playground projects can all be profiled. In the future we will add more real-time device metrics, such as memory and CPU usage, to help algorithms analyze code performance better.

Concluding remarks

In on-device intelligence R&D, developing the algorithm is only a small part of the work; most of it lies outside the algorithm. The MNN workbench not only lowers the barrier to AI development for ordinary developers, but also serves as an efficiency tool for algorithm R&D. It effectively solves the difficulty of developing and deploying Python on the device, lets algorithm teams build their own demo applications independently of engineering, and supports the main on-device intelligence R&D scenarios within Alibaba Group, so that algorithm engineers can focus on the algorithm itself, improving the efficiency of AI R&D and production.

We are hiring!

Welcome to join the End Intelligence Team of Alibaba's Taobao Technology Department, responsible for building the industry-leading open-source inference engine MNN and the one-stop machine learning suite MNN Workbench. Within Alibaba, we are responsible for the core e-commerce AR platform and new forms of product navigation, as well as innovative applications and systems at huge scale, such as search and recommendation, user reach, and live-stream content understanding.

Open positions:

Algorithms: CV/CG/recommendation/search/machine learning/model compression
Engineering: iOS/Android/Java server/C++/high-performance computing
How to submit your resume

Send your resume to: luyou.cy@alibaba-inc.com


