Author: KubeVela Community

In the current wave of machine learning, AI engineers not only need to train and debug their models, but also need to deploy them online to verify how they perform (although sometimes this part of the work falls to AI system engineers). This work is tedious and costs AI engineers extra effort.

In the cloud-native era, model training and model serving are usually performed in the cloud. Doing so improves both scalability and resource utilization, which matters for machine learning workloads that consume large amounts of computing resources.

But cloud-native capabilities are often difficult for AI engineers to use. The cloud-native landscape has grown more complex over time: to deploy even a simple model service, an AI engineer may need to learn several additional concepts, such as Deployment, Service, and Ingress.

As a simple, easy-to-use, and highly extensible cloud-native application management tool, KubeVela lets developers quickly and easily define and deliver applications on Kubernetes without knowing any details of the underlying cloud-native infrastructure. Thanks to this extensibility, KubeVela's AI addons provide capabilities such as model training, model serving, and A/B testing, covering the basic needs of AI engineers and helping them quickly run model training and model serving in a cloud-native environment.

This article introduces how to use KubeVela's AI addons to help engineers complete model training and model serving more easily.

KubeVela AI Addons

The KubeVela AI addon is split into two addons: model training and model serving. The model-training addon is built on KubeFlow's training-operator and supports distributed model training with frameworks such as TensorFlow, PyTorch, and MXNet. The model-serving addon is built on Seldon Core; it makes it easy to launch a service from a trained model, and also supports advanced features such as traffic splitting and A/B testing.

With the KubeVela AI addons, deploying model-training jobs and model services becomes much simpler. In addition, model training and serving can be combined with KubeVela's own features, such as workflows and multi-cluster deployment, to roll out production-ready services.

Note: You can find all source code and YAML files in KubeVela Samples [1]. If you want to use the models pretrained in this example, applying style-model.yaml and color-model.yaml from that folder will copy the models into a PVC.
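For example, assuming you have cloned the samples repository and are working from that folder, the two files can be applied with the vela CLI (a minimal sketch):

 # hedged sketch: apply the sample Applications that copy the pretrained models into PVCs
 vela up -f color-model.yaml
 vela up -f style-model.yaml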

Model Training

First, enable the two addons for model training and model serving:

 vela addon enable model-training
vela addon enable model-serving

The model-training addon provides two component types, model-training and jupyter-notebook, while the model-serving addon provides the model-serving component type. The specific parameters of these three components can be viewed with the vela show command.

You can also consult the KubeVela AI addon documentation [2] for more information.
 vela show model-training
vela show jupyter-notebook
vela show model-serving

Let's use the TensorFlow framework to train a simple model that turns grayscale images into color. Deploy the following YAML file:

Note: The source code for model training comes from emilwallner/Coloring-greyscale-images [3].
 apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: training-serving
  namespace: default
spec:
  components:
  # train the model
  - name: demo-training
    type: model-training
    properties:
      # image used for model training
      image: fogdong/train-color:v1
      # framework used for model training
      framework: tensorflow
      # declare storage to persist the model; the cluster's default storage class is used to create a PVC
      storage:
        - name: "my-pvc"
          mountPath: "/model"

At this point, KubeVela spins up a TFJob to run the model training.
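If you want to inspect the training job itself, you can query the TFJob with kubectl. This is a minimal sketch; the exact resource name and label key depend on your cluster and training-operator version:

 # the addon creates a TFJob; the name is assumed here to follow the component name
 kubectl get tfjobs -n default
 # follow the training logs (the label key may vary across training-operator versions)
 kubectl logs -n default -l training.kubeflow.org/job-name=demo-training -f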

It's hard to see the effect from training alone, so let's modify this YAML file and add a model service after the training step. Because the model service serves the model directly, and the model's raw input and output (an ndarray or Tensor) are not intuitive, we also deploy a test service that calls the model service and converts the result into an image.

Deploy the following YAML file:

 apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: training-serving
  namespace: default
spec:
  components:
  # train the model
  - name: demo-training
    type: model-training
    properties:
      image: fogdong/train-color:v1
      framework: tensorflow
      storage:
        - name: "my-pvc"
          mountPath: "/model"

  # launch the model service
  - name: demo-serving
    type: model-serving
    # the model service starts after model training completes
    dependsOn:
      - demo-training
    properties:
      # protocol used by the model service; if omitted, Seldon's own protocol is used by default
      protocol: tensorflow
      predictors:
        - name: model
          # number of replicas of the model service
          replicas: 1
          graph:
            # model name
            name: my-model
            # model framework
            implementation: tensorflow
            # model location; the previous step saves the trained model into the my-pvc PVC, so pvc://my-pvc points to it
            modelUri: pvc://my-pvc

  # test the model service
  - name: demo-rest-serving
    type: webservice
    # the test service starts after the model service is ready
    dependsOn:
      - demo-serving
    properties:
      image: fogdong/color-serving:v1
      # expose an external address with a LoadBalancer for easy access
      exposeType: LoadBalancer
      env:
        - name: URL
          # address of the model service
          value: http://ambassador.vela-system.svc.cluster.local/seldon/default/demo-serving/v1/models/my-model:predict
      ports:
        # port of the test service
        - port: 3333
          expose: true

After deployment, use vela ls to view the status of the application:

 $ vela ls

training-serving        demo-training        model-training           running  healthy  Job Succeeded  2022-03-02 17:26:40 +0800 CST
├─                    demo-serving         model-serving            running  healthy  Available      2022-03-02 17:26:40 +0800 CST
└─                    demo-rest-serving    webservice               running  healthy  Ready:1/1      2022-03-02 17:26:40 +0800 CST

As you can see, the application has started normally. Use vela status <app-name> --endpoint to view the service address of the application.

 $ vela status training-serving --endpoint

+---------+-----------------------------------+---------------------------------------------------+
| CLUSTER |     REF(KIND/NAMESPACE/NAME)      |                     ENDPOINT                      |
+---------+-----------------------------------+---------------------------------------------------+
|         | Service/default/demo-rest-serving | tcp://47.251.10.177:3333                          |
|         | Service/vela-system/ambassador    | http://47.251.36.228/seldon/default/demo-serving  |
|         | Service/vela-system/ambassador    | https://47.251.36.228/seldon/default/demo-serving |
+---------+-----------------------------------+---------------------------------------------------+

The application has three service addresses: the first is the address of our test service, and the second and third are the addresses of the model service itself. We can call the test service to see the model in action: the test service reads the image content, converts it into a Tensor, sends it to the model service, and finally converts the Tensor returned by the model service back into an image.
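Since we set protocol: tensorflow above, the model endpoint speaks the standard TensorFlow Serving REST protocol and can also be called directly. A minimal sketch follows; the payload is a placeholder, as the real model expects an image tensor of the shape it was trained on:

 # direct call to the model endpoint; replace the placeholder with a real image tensor
 curl -X POST \
   -H "Content-Type: application/json" \
   -d '{"instances": [...]}' \
   http://47.251.36.228/seldon/default/demo-serving/v1/models/my-model:predict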

We choose a black-and-white portrait of a woman as input:

After the request, you can see that a color image is output:

Model Serving: Canary Testing

In addition to launching a model service directly, we can also serve multiple versions of a model within a single model service and assign different traffic weights to them for canary testing.

Deploy the following YAML; note that the v1 and v2 versions of the model are each assigned 50% of the traffic. Again, we deploy a test service behind the model service:

 apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: color-serving
  namespace: default
spec:
  components:
  - name: color-model-serving
    type: model-serving
    properties:
      protocol: tensorflow
      predictors:
        - name: model1
          replicas: 1
          # the v1 model receives 50% of the traffic
          traffic: 50
          graph:
            name: my-model
            implementation: tensorflow
            # model location; the v1 model is stored under /model/v1 in the color-model PVC, so pvc://color-model/model/v1 points to it
            modelUri: pvc://color-model/model/v1
        - name: model2
          replicas: 1
          # the v2 model receives 50% of the traffic
          traffic: 50
          graph:
            name: my-model
            implementation: tensorflow
            # model location; the v2 model is stored under /model/v2 in the color-model PVC, so pvc://color-model/model/v2 points to it
            modelUri: pvc://color-model/model/v2
  - name: color-rest-serving
    type: webservice
    dependsOn:
      - color-model-serving
    properties:
      image: fogdong/color-serving:v1
      exposeType: LoadBalancer
      env:
        - name: URL
          value: http://ambassador.vela-system.svc.cluster.local/seldon/default/color-model-serving/v1/models/my-model:predict
      ports:
        - port: 3333
          expose: true

When the model deployment is complete, view the address of the model service through vela status <app-name> --endpoint:

 $ vela status color-serving --endpoint

+---------+------------------------------------+----------------------------------------------------------+
| CLUSTER |      REF(KIND/NAMESPACE/NAME)      |                         ENDPOINT                         |
+---------+------------------------------------+----------------------------------------------------------+
|         | Service/vela-system/ambassador     | http://47.251.36.228/seldon/default/color-model-serving  |
|         | Service/vela-system/ambassador     | https://47.251.36.228/seldon/default/color-model-serving |
|         | Service/default/color-rest-serving | tcp://47.89.194.94:3333                                  |
+---------+------------------------------------+----------------------------------------------------------+

Request the model with a black-and-white image of a city:

The result of the first request is shown below. While the sky and the ground are rendered in color, the city itself remains black and white:

Request again, and this time the sky, the ground, and the city are all rendered in color:

Splitting traffic across different versions of the model helps us better evaluate their results.
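To observe the split yourself, you can simply repeat the same request; with a 50/50 split, both model versions should respond over enough requests. A sketch, reusing the placeholder payload from above:

 # with traffic set to 50/50, responses should come from v1 and v2 in roughly equal measure
 for i in $(seq 1 10); do
   curl -s -X POST \
     -H "Content-Type: application/json" \
     -d '{"instances": [...]}' \
     http://47.251.36.228/seldon/default/color-model-serving/v1/models/my-model:predict
 done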

Model Serving: A/B Testing

For the same black-and-white image, we can either colorize it with the model or upload an image in another style and transfer that style onto the original.

Which do users prefer: colorized images or restyled ones? We can explore this question with an A/B test.

Deploy the following YAML. By setting customRouting, requests that carry the header style: transfer are forwarded to the style-transfer model, and the style-transfer model shares the same address as the colorization model.

Note: The model for style transfer comes from TensorFlow Hub [4].
 apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  name: color-style-ab-serving
  namespace: default
spec:
  components:
  - name: color-ab-serving
    type: model-serving
    properties:
      protocol: tensorflow
      predictors:
        - name: model1
          replicas: 1
          graph:
            name: my-model
            implementation: tensorflow
            modelUri: pvc://color-model/model/v2
  - name: style-ab-serving
    type: model-serving
    properties:
      protocol: tensorflow
      # style transfer takes longer, so set a timeout so that requests are not cut off
      timeout: "10000"
      customRouting:
        # specify the custom header
        header: "style: transfer"
        # specify the custom route
        serviceName: "color-ab-serving"
      predictors:
        - name: model2
          replicas: 1
          graph:
            name: my-model
            implementation: tensorflow
            modelUri: pvc://style-model/model
  - name: ab-rest-serving
    type: webservice
    dependsOn:
      - color-ab-serving
      - style-ab-serving
    properties:
      image: fogdong/style-serving:v1
      exposeType: LoadBalancer
      env:
        - name: URL
          value: http://ambassador.vela-system.svc.cluster.local/seldon/default/color-ab-serving/v1/models/my-model:predict
      ports:
        - port: 3333
          expose: true

After successful deployment, view the address of the model service through vela status <app-name> --endpoint:

 $ vela status color-style-ab-serving --endpoint

+---------+---------------------------------+-------------------------------------------------------+
| CLUSTER |    REF(KIND/NAMESPACE/NAME)     |                       ENDPOINT                        |
+---------+---------------------------------+-------------------------------------------------------+
|         | Service/vela-system/ambassador  | http://47.251.36.228/seldon/default/color-ab-serving  |
|         | Service/vela-system/ambassador  | https://47.251.36.228/seldon/default/color-ab-serving |
|         | Service/vela-system/ambassador  | http://47.251.36.228/seldon/default/style-ab-serving  |
|         | Service/vela-system/ambassador  | https://47.251.36.228/seldon/default/style-ab-serving |
|         | Service/default/ab-rest-serving | tcp://47.251.5.97:3333                                |
+---------+---------------------------------+-------------------------------------------------------+

In this application, the two model services each have two addresses, but the addresses of the second service, style-ab-serving, are invalid, since that service is already routed to the address of color-ab-serving. Again, we observe the models' behavior by requesting the test service.

First, without the header, the image changes from black and white to color:

Let's use an image of an ocean wave as the style reference:

We add the style: transfer header to this request, and the city is rendered in the style of the wave:
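Under the hood, the only difference between the two requests is the custom header. A sketch of calling the shared endpoint directly, with the payload again elided:

 # without the "style: transfer" header, the colorization model responds;
 # with it, customRouting forwards the request to the style-transfer model
 curl -X POST \
   -H "Content-Type: application/json" \
   -H "style: transfer" \
   -d '{"instances": [...]}' \
   http://47.251.36.228/seldon/default/color-ab-serving/v1/models/my-model:predict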

We can also use an ink-wash painting as the style reference:

This time, the city is rendered in ink-painting style:

Summary

KubeVela's AI addons help you perform model training and model serving more conveniently.

In addition, by combining them with KubeVela's multi-environment features, we can deliver a validated model to different environments, enabling flexible model deployment.

Related Links

[1] KubeVela Samples
https://github.com/oam-dev/samples/tree/master/11.Machine_Learning_Demo

[2] KubeVela AI addon documentation
https://kubevela.io/en/docs/next/reference/addons/ai

[3] emilwallner/Coloring-greyscale-images
https://github.com/emilwallner/Coloring-greyscale-images

[4] TensorFlow Hub
https://tfhub.dev/google/magenta/arbitrary-image-stylization-v1-256/2


You can learn more about KubeVela and the OAM project in the following materials:

  • Project code base: https://github.com/oam-dev/kubevela. Welcome to Star/Watch/Fork!
  • Project homepage and documentation: kubevela.io. Since version 1.1, documentation has been available in both Chinese and English, and developers are welcome to contribute translations into more languages.
  • Project DingTalk group: 23310022; Slack: CNCF #kubevela channel.
  • WeChat group: add the maintainer WeChat account and note that you would like to join the KubeVela user group.

Click " ​​here ​​​" to view the official website of the KubeVela project.​

