Packaging a Helm Chart with Operator Technology and Deploying It to a K8S Cluster

Preface

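In the earlier article "Packaging a Complex Application with Helm and Deploying It to a K8S Cluster", we packaged an application as a Helm chart and used the chart to simplify deployment. Helm, however, is mainly good at Basic Install. It also supports Seamless Upgrades for stateless applications, where replacing the image version triggers the rolling-update feature of a Kubernetes controller such as a Deployment, but it can do nothing for stateful applications.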

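Take a MySQL upgrade as an example. The database is stateful, so it cannot be upgraded by simply swapping the program version. The upgrade requires a series of more involved steps: exporting the data with mysqldump and importing it into the new version, or upgrading in place by running scripts that update the data dictionary. Helm cannot express this kind of logic, so we wrap it in an operator instead.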

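Operators provide capabilities at five levels: Basic Install, Seamless Upgrades, Full Lifecycle, Deep Insights, and Auto Pilot. They cover "day 1" work such as installing the application, as well as "day 2" work such as upgrades and maintenance, full lifecycle management including backup, deep insights, and auto pilot. With operator-sdk we can turn a Helm chart into an operator, build an operator with Ansible, or develop one in Go.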

Creating an Operator for a Helm Chart

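Turning a Helm chart into an operator does not give it capabilities beyond those of the chart itself. In other words, if the chart supports Basic Install and Seamless Upgrades, the resulting operator does not gain Full Lifecycle features such as backup. The operator does, however, bring some additional abilities of its own.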

First install the SDK CLI; see the document "Install the Operator SDK CLI".

$ RELEASE_VERSION=v0.18.1
$ curl -LO https://github.com/operator-framework/operator-sdk/releases/download/${RELEASE_VERSION}/operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
$ chmod +x operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu && \
  sudo mkdir -p /usr/local/bin/ && \
  sudo cp operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu \
       /usr/local/bin/operator-sdk && \
  rm operator-sdk-${RELEASE_VERSION}-x86_64-linux-gnu
$ operator-sdk version
operator-sdk version: "v0.18.1"

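If you followed the article "Packaging a Complex Application with Helm and Deploying It to a K8S Cluster" to create the Helm chart and chart repository, run the following command to create an operator for the hello chart: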

$ operator-sdk new hello-operator \
  --api-version=charts.helm.k8s.io/v1alpha1 \
  --kind=Hello --type=helm \
  --helm-chart-repo=http://chartmuseum.app.zyl.io \
  --helm-chart=hello

Otherwise, clone the chart locally and run operator-sdk new against the local directory, as shown below.

$ git clone https://github.com/zylpsrs/helm-example.git
$ operator-sdk new hello-operator \
  --api-version=charts.helm.k8s.io/v1alpha1 \
  --kind=Hello --type=helm \
  --helm-chart=helm-example/helm/hello

The command above generates the hello-operator directory, which has the following structure:

$ tree hello-operator/
hello-operator/
├── build                    # image build directory, contains the Dockerfile
│   └── Dockerfile
├── deploy                   # operator deployment directory
│   ├── crds
│   │   ├── charts.helm.k8s.io_hellos_crd.yaml
│   │   └── charts.helm.k8s.io_v1alpha1_hello_cr.yaml
│   ├── operator.yaml        # operator deployment manifest
│   ├── role_binding.yaml    # role binding, i.e. binds the role to the service account
│   ├── role.yaml            # role definition
│   └── service_account.yaml # service account
├── helm-charts              # Helm charts directory
│   └── hello                # hello chart
└── watches.yaml             # which resources the operator watches

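The resources the operator watches are defined in watches.yaml, whose content comes from the arguments passed to operator-sdk new. In this example, the operator watches resources of kind Hello in the API charts.helm.k8s.io/v1alpha1 and handles requests for that kind with the Helm chart helm-charts/hello.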

---
- group: charts.helm.k8s.io
  version: v1alpha1
  kind: Hello
  chart: helm-charts/hello
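
As a quick sanity check, the chart path recorded in watches.yaml should point at a directory that exists in the project. The sketch below recreates a minimal, made-up project layout in a temporary directory and verifies the path. The temporary paths and the awk-based parsing are assumptions for illustration and are not part of operator-sdk.

```shell
#!/bin/sh
set -eu

# Recreate a throwaway project layout; in a real project, cd into hello-operator instead.
proj=$(mktemp -d)
mkdir -p "$proj/helm-charts/hello"
cat > "$proj/watches.yaml" <<'EOF'
---
- group: charts.helm.k8s.io
  version: v1alpha1
  kind: Hello
  chart: helm-charts/hello
EOF

# Extract the value of the "chart:" key (simple line-based parsing, enough for this file).
chart=$(awk '$1 == "chart:" { print $2 }' "$proj/watches.yaml")

# The chart path is interpreted relative to the project root.
if [ -d "$proj/$chart" ]; then
  result="chart directory found: $chart"
else
  result="chart directory missing: $chart"
fi
echo "$result"
```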

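Because the operator is built from an existing Helm chart, we do not need to modify the chart under the helm-charts directory. Run the following commands to build the operator image and push it to the image registry set up in the earlier article: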

$ operator-sdk build registry.zyl.io:5000/hello-operator \
    --image-builder podman
$ podman push registry.zyl.io:5000/hello-operator

After the image is built, run the following command to replace the REPLACE_IMAGE string in deploy/operator.yaml with the actual image name.

$ cat deploy/operator.yaml 
...
      containers:
        - name: hello-operator
          # Replace this with the built image name
          image: REPLACE_IMAGE
...
$ perl -i -ne 's#REPLACE_IMAGE#registry.zyl.io:5000/hello-operator#;print' \
       deploy/operator.yaml
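
If perl is not available, the same substitution can be done with sed. A minimal sketch follows. It runs against a made-up copy of the relevant lines in a temporary directory, so it does not touch the real deploy/operator.yaml; the temporary file names are assumptions.

```shell
#!/bin/sh
set -eu

image="registry.zyl.io:5000/hello-operator"

# Throwaway stand-in for deploy/operator.yaml containing only the relevant lines.
work=$(mktemp -d)
cat > "$work/operator.yaml" <<'EOF'
      containers:
        - name: hello-operator
          # Replace this with the built image name
          image: REPLACE_IMAGE
EOF

# Write to a new file and move it into place; this avoids the non-portable "sed -i".
sed "s#REPLACE_IMAGE#${image}#" "$work/operator.yaml" > "$work/operator.yaml.new"
mv "$work/operator.yaml.new" "$work/operator.yaml"

# Show the rewritten image line.
grep 'image:' "$work/operator.yaml"
```

The `#` delimiter is used because the image name contains `/` and `:`.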

Deploying the Operator to the Cluster

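A K8S cluster lets us register our own APIs through custom resource definitions (CRDs). An operator's controller watches a specific API and responds to requests on it. The SDK generated the following CRD for this operator: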

$ cat deploy/crds/charts.helm.k8s.io_hellos_crd.yaml 
apiVersion: apiextensions.k8s.io/v1
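kind: CustomResourceDefinition              # kind used to register a custom resource
metadata:
  name: hellos.charts.helm.k8s.io           # name of the custom resource definition
spec:
  group: charts.helm.k8s.io                 # API group of the custom resource
  names:
    kind: Hello                             # kind of the custom resource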
    listKind: HelloList
    plural: hellos
    singular: hello
  scope: Namespaced
...

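In this example we chose an API group under *.k8s.io, but that is a group reserved by Kubernetes, so registering the CRD with the cluster fails. The fix is to add the annotation described in the GitHub PR that the error message points to.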

# CRDs are cluster-scoped, so there is no need to pass -n <namespace>
$ kubectl create -f deploy/crds/charts.helm.k8s.io_hellos_crd.yaml 
The CustomResourceDefinition "hellos.charts.helm.k8s.io" is invalid: metadata.annotations[api-approved.kubernetes.io]: Required value: protected groups must have approval annotation "api-approved.kubernetes.io", see https://github.com/kubernetes/enhancements/pull/1111

# Edit deploy/crds/charts.helm.k8s.io_hellos_crd.yaml and add the following annotation
$ vi deploy/crds/charts.helm.k8s.io_hellos_crd.yaml
...
metadata:
  annotations:
    "api-approved.kubernetes.io": "https://github.com/kubernetes/kubernetes/pull/78458"
...

# Then run the following command to register the CRD:
$ kubectl create -f deploy/crds/charts.helm.k8s.io_hellos_crd.yaml 
customresourcedefinition.apiextensions.k8s.io/hellos.charts.helm.k8s.io created

$ kubectl get crd | grep hello
hellos.charts.helm.k8s.io                             2020-06-19T10:24:13Z
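
Right after registration, a CRD may take a moment before the API server starts serving it. The following is a hedged sketch of waiting for the Established condition with kubectl wait. It assumes kubectl is installed and pointed at the cluster; the 60-second timeout is an arbitrary choice.

```shell
#!/bin/sh
crd="crd/hellos.charts.helm.k8s.io"

if command -v kubectl >/dev/null 2>&1; then
  # Blocks until the CRD reports the Established condition, or fails after the timeout.
  kubectl wait --for condition=established --timeout=60s "$crd" \
    || echo "CRD not established (or cluster unreachable)"
else
  echo "kubectl not found; skipping the wait"
fi
```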

Next we deploy the operator into the demo namespace by applying the manifests in the deploy directory.

$ kubectl -n demo apply -f deploy/
deployment.apps/hello-operator unchanged
role.rbac.authorization.k8s.io/hello-operator configured
rolebinding.rbac.authorization.k8s.io/hello-operator unchanged
serviceaccount/hello-operator unchanged

The command above creates the following objects in the demo namespace:

$ kubectl -n demo get pod,svc,deployment
NAME                                  READY   STATUS    RESTARTS   AGE
pod/hello-operator-756bb58dc5-4g88j   1/1     Running   1          152m

NAME                             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S) 
service/hello-operator-metrics   ClusterIP   10.106.213.245   <none>        8383/TCP,8686/TCP 

NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/hello-operator   1/1     1            1           152m

Requesting Services from the Operator via a CR

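By default the SDK generates a sample custom resource (CR) request file under the deploy directory. Its content is shown below. The default parameters are taken from the hello chart's values.yaml, which means the configurable parameters are the same as those of the hello chart.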

$ cat deploy/crds/charts.helm.k8s.io_v1alpha1_hello_cr.yaml 
apiVersion: charts.helm.k8s.io/v1alpha1
kind: Hello
metadata:
  name: example-hello
spec:
  # Default values copied from <project_dir>/helm-charts/hello/values.yaml
  affinity: {}
  autoscaling:
  ...

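Run the following command to request an instance named example through a CR. Once the operator has invoked the Helm chart in response to the request, the following objects appear in the demo namespace: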

$ kubectl -n demo apply -f - <<'EOF'
apiVersion: charts.helm.k8s.io/v1alpha1
kind: Hello
metadata:
  name: example
spec:
  greeter:
    replicaCount: 1
EOF

$ kubectl -n demo get pod
NAME                               READY   STATUS    RESTARTS   AGE
example-greeter-76745b98b8-nswcq   1/1     Running   0          3m28s
example-hello-7bc5d74c5b-5tstd     1/1     Running   0          3m28s
hello-operator-756bb58dc5-4g88j    1/1     Running   1          173m

$ kubectl -n demo get hello
NAME      AGE
example   5m59s

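Now run the following command to update the CR, this time requesting an ingress for the hello application. The operator log shows the following error, which indicates a permission problem.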

$ kubectl -n demo apply -f - <<'EOF'
apiVersion: charts.helm.k8s.io/v1alpha1
kind: Hello
metadata:
  name: example
spec:
  greeter:
    replicaCount: 1
  ingress:
    enabled: true    
EOF

$ kubectl -n demo logs hello-operator-756bb58dc5-4g88j 
Unable to continue with install: could not get information about the resource: ingresses.networking.k8s.io "example-hello" is forbidden: User "system:serviceaccount:demo:hello-operator" cannot get resource "ingresses" in API group "networking.k8s.io" in the namespace "demo""

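To fix this, add the required permission to role.yaml and then delete the operator's current pod. Once the new pod starts, it has the right permissions and can create the ingress, as shown below: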

$ cat >> deploy/role.yaml <<'EOF'
- apiGroups:
  - networking.k8s.io
  resources:
  - ingresses
  verbs:
  - '*'
EOF
$ kubectl -n demo apply -f deploy/role.yaml
$ kubectl -n demo delete pod hello-operator-756bb58dc5-4g88j
$ kubectl -n demo get ingress
NAME            CLASS    HOSTS              ADDRESS         PORTS   AGE
example-hello   <none>   hello.app.zyl.io   192.168.120.6   80      41s
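
One way to confirm that the role change took effect is to ask the API server directly with kubectl auth can-i, impersonating the operator's service account. This is a minimal sketch. It assumes kubectl is installed, the current user is allowed to impersonate service accounts, and the service account name matches the one in the error message above.

```shell
#!/bin/sh
# Service account identity, taken from the error message in the operator log.
sa="system:serviceaccount:demo:hello-operator"

if command -v kubectl >/dev/null 2>&1; then
  # Prints "yes" or "no"; "|| true" keeps a "no" answer from failing the script.
  kubectl -n demo auth can-i get ingresses.networking.k8s.io --as="$sa" || true
else
  echo "kubectl not found; skipping the check"
fi
```

A "no" answer after applying the role usually means the rule was not added to the role bound to this service account.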

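Run the following command to delete the deployment and ingress objects that the operator created. Check again after a while and the objects have been recreated. The reason is that the operator watches the CR and, based on the configuration requested in it, keeps the managed backend objects such as deployments and ingresses consistent with that configuration.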

$ kubectl -n demo delete deploy,ingress -l app.kubernetes.io/instance=example
deployment.apps "example-greeter" deleted
deployment.apps "example-hello" deleted
ingress.extensions "example-hello" deleted

$ kubectl -n demo get deploy,ingress -l app.kubernetes.io/instance=example
NAME                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/example-greeter   1/1     1            1           4m30s
deployment.apps/example-hello     1/1     1            1           4m30s

NAME                               CLASS    HOSTS              ADDRESS         PORTS   AGE
ingress.extensions/example-hello   <none>   hello.app.zyl.io   192.168.120.6   80      4m30s

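In the same way, the operator acts like a steward that enforces the replica count defined in the CR, so we cannot manually scale the backend deployment.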

$ kubectl -n demo scale deploy example-greeter --replicas=5
deployment.apps/example-greeter scaled

$ kubectl -n demo get deploy example-greeter 
NAME              READY   UP-TO-DATE   AVAILABLE   AGE
example-greeter   1/1     1            1           7m23s
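
Because the operator reverts manual changes, the replica count has to be changed through the CR itself. A minimal sketch follows, assuming kubectl access to the cluster and the greeter.replicaCount value used in the CR earlier; the target value 3 is arbitrary.

```shell
#!/bin/sh
# Merge patch that changes only spec.greeter.replicaCount in the CR.
patch='{"spec":{"greeter":{"replicaCount":3}}}'

if command -v kubectl >/dev/null 2>&1; then
  # Custom resources do not support strategic merge patches, so use --type merge.
  kubectl -n demo patch hello example --type merge -p "$patch" \
    || echo "patch failed (CR missing or cluster unreachable)"
else
  echo "kubectl not found; skipping the patch"
fi
```

After the operator reconciles the updated CR, the example-greeter deployment should follow the new value.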

Conclusion

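In this article we used operator-sdk to turn a Helm chart into an operator. Once the operator is installed in the cluster, there is no need to deploy the helm client; applications can be deployed with kubectl, the standard cluster command-line tool. Helm cannot guarantee configuration consistency: after an application is deployed with Helm, changing a deployment or another object makes the running configuration diverge from the release that Helm stored. An operator handles this problem well.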

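Packaging the Helm chart as an operator did not add much capability. An operator implemented with Ansible, however, can support all five capability levels described in the preface, and I will cover that in a later article.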


我是读书人
