The PaddlePaddle deep learning framework adopts a programming-logic-based approach to network construction, making it easier for ordinary developers to use. It supports both declarative and imperative programming, combining development flexibility with high performance.


PaddlePaddle grew out of Baidu's years of deep learning research and business applications. It is China's first self-developed, fully functional, open-source industrial deep learning platform, integrating a core training and inference framework, basic model libraries, end-to-end development kits, and a rich set of tool components.

The PaddlePaddle framework adopts a programming-logic-based approach to network construction, which is easier for ordinary developers to use; it supports both declarative and imperative programming, combining development flexibility with high performance. In addition, PaddlePaddle is compatible with deploying models trained in third-party open-source frameworks and provides complete inference engines for different production scenarios: Paddle Inference, a native inference library for high-performance server and cloud inference; Paddle Serving, a service-oriented inference framework for distributed, pipelined production environments, with high-level features such as automatic cloud access and A/B testing; Paddle Lite, a lightweight inference engine for mobile and IoT scenarios; and Paddle.js, a front-end inference engine for environments such as browsers and mini programs. Through highly adaptive optimization for mainstream hardware across scenarios and support for heterogeneous computing, PaddlePaddle's inference performance is also ahead of most mainstream implementations.

Installing PaddlePaddle

PaddlePaddle can be treated as a Python dependency library, and the official distribution provides pip, conda, source compilation, and other installation methods. Taking pip as an example, PaddlePaddle offers both a CPU and a GPU installation:

  • CPU version installation method:
pip install paddlepaddle
  • GPU version installation method:
pip install paddlepaddle-gpu

Practice: Handwritten digit recognition task

MNIST is a very famous handwritten digit recognition dataset; both the official TensorFlow tutorials and PaddlePaddle beginner guides use it for hands-on explanations. It consists of images of handwritten digits and their corresponding labels, for example:

[Figure: sample handwritten digit images with their labels]

The MNIST dataset is divided into training and test images: 60,000 training images and 10,000 test images. Each image represents a digit from 0-9 and is a 28*28 pixel matrix. This section takes the MNIST handwritten digit recognition task officially provided by PaddlePaddle as an example for learning the basics of the framework. As with other deep learning tasks, a relatively complete deep learning task in PaddlePaddle consists of the following four steps:

  1. Preparation and loading of data sets;
  2. Model building;
  3. Model training;
  4. Model evaluation.

Load the built-in data set

The PaddlePaddle framework has a number of common datasets built in. In this example, developers can load a built-in dataset of the framework, such as the handwritten digits dataset used in this case. Two datasets are loaded here: one for training the model and one for evaluating it.

import paddle
import paddle.vision.transforms as T

transform = T.Normalize(mean=[127.5], std=[127.5], data_format='CHW')
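The Normalize transform maps raw pixel values from [0, 255] into [-1, 1]. A minimal pure-Python sketch of the per-pixel arithmetic (the function name here is illustrative, not a Paddle API):

```python
# Per-pixel effect of Normalize(mean=[127.5], std=[127.5]):
# output = (pixel - mean) / std, mapping 0..255 to -1.0..1.0
def normalize_pixel(pixel, mean=127.5, std=127.5):
    return (pixel - mean) / std

print(normalize_pixel(0))      # -1.0
print(normalize_pixel(255))    # 1.0
print(normalize_pixel(127.5))  # 0.0
```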

# Download the datasets

train_dataset = paddle.vision.datasets.MNIST(mode='train', transform=transform)
val_dataset = paddle.vision.datasets.MNIST(mode='test', transform=transform)

Model building

A layer-by-layer network structure is built with Sequential. Note that a Flatten operation must be applied to the data first, reshaping image data of shape [1, 28, 28] into [1, 784].

mnist = paddle.nn.Sequential(
    paddle.nn.Flatten(),
    paddle.nn.Linear(784, 512),
    paddle.nn.ReLU(),
    paddle.nn.Dropout(0.2),
    paddle.nn.Linear(512, 10))
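As a sanity check on the structure above, the trainable parameter count of the two Linear layers can be computed by hand (a pure-Python sketch, not part of the Paddle API):

```python
# Each Linear(n_in, n_out) layer holds an n_in x n_out weight matrix plus n_out biases
def linear_params(n_in, n_out):
    return n_in * n_out + n_out

total = linear_params(784, 512) + linear_params(512, 10)
print(total)  # 407050
```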

Model training

Before training the model, you need to configure how the loss is calculated and which optimization method is used. Developers can do this with the prepare interface provided by the framework, and then start training with the fit interface.

# Wrap the network structure in a Model object for subsequent configuration, training and validation
model = paddle.Model(mnist)

# Training configuration: optimizer, loss calculation method and accuracy metric
model.prepare(paddle.optimizer.Adam(parameters=model.parameters()),
              paddle.nn.CrossEntropyLoss(),
              paddle.metric.Accuracy())

# Start model training
model.fit(train_dataset,
          epochs=5,
          batch_size=64,
          verbose=1)

Training results:

The loss value printed in the log is for the current step; the metric is the average over previous steps.

Epoch 1/5
step 938/938 [==============================] - loss: 0.1801 - acc: 0.9032 - 8ms/step
Epoch 2/5
step 938/938 [==============================] - loss: 0.0544 - acc: 0.9502 - 8ms/step
Epoch 3/5
step 938/938 [==============================] - loss: 0.0069 - acc: 0.9595 - 7ms/step
Epoch 4/5
step 938/938 [==============================] - loss: 0.0094 - acc: 0.9638 - 7ms/step
Epoch 5/5
step 938/938 [==============================] - loss: 0.1414 - acc: 0.9670 - 8ms/step
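The 938 steps per epoch in the log follow directly from the dataset size and batch size; the final, partial batch still counts as a step:

```python
import math

# 60,000 training images split into batches of 64, rounding up for the last partial batch
steps_per_epoch = math.ceil(60000 / 64)
print(steps_per_epoch)  # 938
```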

Model evaluation

Developers can use pre-defined validation data sets to evaluate the accuracy of the model trained in the previous step.

model.evaluate(val_dataset, verbose=0)

The results are as follows:

{'loss': [2.145765e-06], 'acc': 0.9751}

As can be seen, the accuracy of this initial model is around 97.5%. After becoming more familiar with PaddlePaddle, developers can improve the accuracy by tuning the training parameters.

Combining with a serverless architecture

The PaddlePaddle team open-sourced PaddleOCR, the first text recognition model suite of its kind, with the goal of creating a rich, leading, and practical library of text recognition models and tools. The suite is a practical ultra-lightweight OCR system composed mainly of three parts: DB text detection, detection-box rectification, and CRNN text recognition. The system adopts 19 effective strategies across 8 aspects, including backbone network selection and adjustment, prediction head design, data augmentation, learning rate scheduling, regularization parameter selection, pre-trained model use, and automatic model pruning and quantization, to optimize and slim the models, ultimately producing an ultra-lightweight Chinese-and-English OCR model with an overall size of 3.5M and a 2.8M English-and-digits OCR model.

Local development

# index.py
import base64
import os
import random

import bottle
from paddleocr import PaddleOCR

ocr = PaddleOCR(use_gpu=False)
os.makedirs('./temp', exist_ok=True)  # make sure the temp directory exists

@bottle.route('/ocr', method='POST')
def login():
    # Write the uploaded image to a randomly named temp file
    filePath = './temp/' + (''.join(random.sample('zyxwvutsrqponmlkjihgfedcba', 5)))
    with open(filePath, 'wb') as f:
        # The body is a data URL; the part after ',' is the Base64 image data
        f.write(base64.b64decode(bottle.request.body.read().decode("utf-8").split(',')[1]))
    ocrResult = ocr.ocr(filePath, cls=False)
    return {'result': [[line[1][0], float(line[1][1])] for line in ocrResult]}

bottle.run(host='0.0.0.0', port=8080)

After the development is complete, run the project:

python index.py

You can see that the service has been started:

[Screenshot: the service has started]

Then test with the Postman tool. First prepare a picture (here, the built-in test image of the PaddleOCR project is used as an example):

[Figure: PaddleOCR built-in test image]

Convert the picture to Base64 encoding and send it to the newly started web service with a POST request; the execution result of PaddleOCR can then be seen:

[Screenshot: PaddleOCR result returned via Postman]
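A sketch of how such a client request body can be built: the service splits the body on ',' and Base64-decodes the part after it, matching the 'data:...;base64,&lt;payload&gt;' data-URL convention. The function name and MIME type here are illustrative assumptions:

```python
import base64

# Build a data-URL-style body; the part after the comma is the Base64 image data,
# which matches the split(',')[1] decoding in index.py
def build_payload(image_bytes, mime='image/jpeg'):
    encoded = base64.b64encode(image_bytes).decode('utf-8')
    return 'data:' + mime + ';base64,' + encoded

body = build_payload(b'\x89fake-image-bytes')
# POST `body` as the raw request body to http://127.0.0.1:8080/ocr,
# e.g. with requests.post('http://127.0.0.1:8080/ocr', data=body)
print(body.split(',')[0])  # data:image/jpeg;base64
```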

Deploy to serverless architecture

At present, the FaaS platforms of major cloud vendors have gradually added support for container image deployment. The project can therefore be packaged into an image and deployed to Alibaba Cloud Function Compute through Serverless Devs.

Preparation before deployment

First, you need to complete the Dockerfile:

FROM python:3.7-slim

RUN apt update && apt install gcc libglib2.0-dev libgl1-mesa-glx libsm6 libxrender1 -y && pip install paddlepaddle bottle scikit-build paddleocr

# Create app directory
WORKDIR /usr/src/app

# Bundle app source
COPY . .

Then write a YAML file that complies with the Serverless Devs specification:

# s.yaml
edition: 1.0.0
name: paddle-ocr
access: default
services:
  paddle-ocr:
    component: fc
    props:
      region: cn-shanghai
      service:
        name: paddle-ocr
        description: paddle-ocr service
      function:
        name: paddle-ocr-function
        runtime: custom-container
        caPort: 8080
        codeUri: ./
        timeout: 60
        customContainerConfig:
          image: 'registry.cn-shanghai.aliyuncs.com/custom-container/paddle-ocr:0.0.1'
          command: '["python"]'
          args: '["index.py"]'
      triggers:
        - name: httpTrigger
          type: http
          config:
            authType: anonymous
            methods:
              - GET
              - POST
      customDomains:
        - domainName: auto
          protocol: HTTP
          routeConfigs:
            - path: /*

Project deployment

First build the image; here it can be built through Serverless Devs:

s build --use-docker

[Screenshot: image build output]

After the build is complete, you can deploy directly through the tool:

s deploy --push-registry acr-internet --use-local -y

After the deployment is complete, you can see the test address returned by the system:

[Screenshot: test address returned after deployment]

Project test

At this point, you can test through the returned address, and the expected result is obtained:

[Screenshot: test result]

Project optimization

By sending requests to the project deployed on the serverless architecture, you can compare the time consumed by cold starts and warm starts:

[Screenshot: cold start vs. warm start timing]

Comparing cold starts with warm starts shows that the system performs quite well when instances are warm. When a cold start occurs, however, the response time of the whole project is often unpredictable. The following optimizations can then be considered:

  1. Reduce the size of the container image: remove unnecessary dependencies and files, and clean up caches left over from installing dependencies, since the cold start of Function Compute includes the image pull time;
  2. Optimize parts of the workflow. For example, the PaddleOCR project clearly states: "paddleocr will automatically download the ppocr lightweight model as the default model". This means that in a serverless architecture, a cold start additionally incurs this model download and decompression compared with a warm start; where necessary, the model can be baked into the container image in advance to reduce the impact of cold starts;
  3. Enable image acceleration, which effectively reduces container image cold starts. The official Alibaba Cloud Function Compute documentation includes performance tests showing that with image acceleration, image pulls are shortened to the second level;
  4. Use instance reservation to minimize the cold-start rate. Through instance reservation, instances can be pre-warmed and pre-started using a variety of algorithms/strategies, minimizing the impact of cold starts in the serverless architecture.
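To put rough numbers on the cold/warm gap, one can time repeated requests. A minimal generic probe sketch follows; the lambda workload is a stand-in for an actual HTTP POST to the deployed endpoint:

```python
import time

# Time a callable n times; against a freshly deployed function, the first
# timing typically includes cold-start overhead and the rest are warm
def probe(call, n=3):
    timings = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        timings.append(time.perf_counter() - start)
    return timings

timings = probe(lambda: sum(range(100000)))
print(len(timings))  # 3
```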

Author: 阿里云开发者 (Alibaba Cloud Developer), Alibaba's official technology account, sharing the technical innovation, practical experience, and growth insights of engineers across the Alibaba ecosystem.