1. Introduction
A Kubernetes operator is a method of packaging, deploying, and managing a Kubernetes application. It extends Kubernetes with custom resources to manage applications and their components.
Everything an operator does goes through the interface of the Kubernetes API server, so in essence it is just client software of the API server.
This article is a getting-started tutorial on Kubernetes operator development. It aims to give newcomers who are interested in operator development a glimpse of the basic workflow.
2. Preparation
- First of all, you need an available Kubernetes test cluster. If you do not yet have a solid grasp of Kubernetes concepts and cluster setup, I suggest you learn those first.
- This tutorial uses Go, which is also the language Kubernetes itself is developed in, and assumes a basic understanding of its syntax. Other languages work too: at the bottom everything is HTTP requests, and official and community SDKs are available for a variety of languages. Once you understand the principles, you should be able to develop in any of them.
- We will use the official k8s.io/client-go library, which encapsulates most operations on Kubernetes.
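To follow along, you can pull the library into your Go module first (the version below is only an assumption; pick one that matches your cluster):
go get k8s.io/client-go@v0.22.2  # hypothetical version; match your cluster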
The sample code directory is as follows:
├── Dockerfile
├── go.mod
├── go.sum
├── k8s //client wrapper
│ └── client.go
├── LICENSE
├── main.go
├── Makefile
├── utils //helper components
│ ├── errs
│ │ └── errs.go
│ └── logs
│ └── slog.go
└── yaml
├── Deployment.yaml //operator deployment manifest
└── ServiceAccount.yaml //permission binding manifest; the role, binding, and account are defined separately below, but keeping them in one file also works
As a demonstration, in this tutorial we mainly focus on the following operations:
- List all Nodes/Namespaces
- List Deployments/Services in a specified namespace
- Create a Deployment/Service
- Delete a Deployment/Service
Operator development is no different from the programs you usually write. The most important concern is permissions: Kubernetes has a very strict and fine-grained permission design, down to each resource and each operation.
So it is not critical that the operator runs in a container inside the Kubernetes cluster. As long as permissions are configured properly, you can run the binary produced by go build, or even go run the code directly in your development environment. For the convenience of development and debugging, we usually run it exactly this way.
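For example, assuming main.go is the entry point as in the sample repo, either of the following works during development:
# run directly while developing and debugging
go run main.go
# or build a binary and run it
go build -o k8s-operator . && ./k8s-operator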
If you are not yet familiar with Kubernetes permission management, I suggest you run your code on a master node of your test cluster. The master node holds the cluster's highest permissions, which saves you the trouble of configuring them and lets you focus your energy on business logic.
3. Getting started
0x01 Initialize the client object
First of all, we need to instantiate a k8s.io/client-go/kubernetes.Clientset in the code, which is the client object our entire operator application operates through.
It can be instantiated by one of two functions:
func NewForConfig(c *rest.Config) (*Clientset, error)
func NewForConfigOrDie(c *rest.Config) *Clientset
The difference between the two: the first returns an error when instantiation fails, while the second panics. The former is generally recommended, so the program handles errors instead of panicking.
Both functions take a rest.Config object as a parameter, and the most important configuration items of rest.Config are the credentials for accessing the cluster.
The SDK provides us with the func BuildConfigFromFlags(masterUrl, kubeconfigPath string) (*restclient.Config, error) method to instantiate the rest.Config object.
The masterUrl parameter is the server URL of the master node; the kubeconfigPath parameter is the path of the credentials file.
The credentials file on the master node is usually /etc/kubernetes/admin.conf.
After you deploy a master node, kubernetes suggests copying the /etc/kubernetes/admin.conf file to $HOME/.kube/config, which is why the contents of these two files are identical.
When passing the parameter we usually recommend the $HOME/.kube/config file, to avoid file-permission problems adding complexity.
The BuildConfigFromFlags method can actually take empty values. If our operator program runs in a container inside a Kubernetes cluster, passing empty values makes it fall back to the default credentials inside the container (we will look at how that works below). A container outside a Kubernetes cluster has no such default configuration, so there we must explicitly pass the path of the credentials file.
Enough talk; let's go straight to the code:
import "k8s.io/client-go/kubernetes"
//调用之前请确认文件存在,如果不存在使用/etc/kubernetes/admin.conf
cfg, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
if err != nil {
log.Fatalln(err)
}
k8sClient, err := kubernetes.NewForConfig(cfg)
if err != nil {
log.Fatalln(err)
}
This k8sClient is the client object we will be using throughout.
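A common variant, sketched below, assumes the same binary may run either inside or outside the cluster: try the in-cluster configuration first and fall back to a kubeconfig file (NewClient is our own helper, not part of the SDK):
import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
)
//NewClient tries the in-cluster config first, then falls back to the given kubeconfig file
func NewClient(kubeconfigPath string) (*kubernetes.Clientset, error) {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		//not running inside a cluster: fall back to the kubeconfig file
		cfg, err = clientcmd.BuildConfigFromFlags("", kubeconfigPath)
		if err != nil {
			return nil, err
		}
	}
	return kubernetes.NewForConfig(cfg)
}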
The code repo for this tutorial is linked at the end of the article; the final code has been adjusted and polished to make sure it works.
Now, let's get down to the real work.
0x02 List all Nodes/Namespaces
//ListNodes lists all nodes
func ListNodes(g *gin.Context) {
nodes, err := k8sClient.CoreV1().Nodes().List(context.Background(), metav1.ListOptions{})
if err != nil {
g.Error(err)
return
}
g.JSON(200, nodes)
}
//ListNamespace lists all namespaces
func ListNamespace(g *gin.Context) {
ns, err := k8sClient.CoreV1().Namespaces().List(context.Background(),metav1.ListOptions{})
if err != nil {
g.Error(err)
return
}
g.JSON(200, ns)
}
For simplicity, we return the data from the interface as-is, without further processing.
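These handlers use the gin framework; a minimal main that wires them up might look like the sketch below (the route paths and port are our own choices, not necessarily those of the sample repo):
package main
import "github.com/gin-gonic/gin"
func main() {
	r := gin.Default()
	r.GET("/nodes", ListNodes)
	r.GET("/namespaces", ListNamespace)
	//namespace-scoped handlers shown in the following sections
	r.GET("/deployments", ListDeployment)
	r.GET("/services", ListService)
	r.Run(":8080") //listen on 0.0.0.0:8080
}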
The response is too long to paste in full. From it we can see that the node information includes:
- System info
- Node status
- Node events
- Resource usage
- Node labels, annotations, creation time, etc.
- Local images and container groups on the node
I will not go through them one by one; interested readers can run it in their own environment and inspect the output.
Below is the namespace output, truncated to the data of a single namespace:
{
"metadata": {
"resourceVersion": "190326"
},
"items": [
{
"metadata": {
"name": "default",
"uid": "acf4b9e4-b1ae-4b7a-bbdc-b65f088e14ec",
"resourceVersion": "208",
"creationTimestamp": "2021-09-24T11:17:29Z",
"labels": {
"kubernetes.io/metadata.name": "default"
},
"managedFields": [
{
"manager": "kube-apiserver",
"operation": "Update",
"apiVersion": "v1",
"time": "2021-09-24T11:17:29Z",
"fieldsType": "FieldsV1",
"fieldsV1": {
"f:metadata": {
"f:labels": {
".": {},
"f:kubernetes.io/metadata.name": {}
}
}
}
}
]
},
"spec": {
"finalizers": [
"kubernetes"
]
},
"status": {
"phase": "Active"
}
},
... ...
]
}
0x03 List the Deployments/Services in a specified namespace
//list Deployments in the specified namespace
func ListDeployment(g *gin.Context) {
ns := g.Query("ns")
dps, err := k8sClient.AppsV1().Deployments(ns).List(context.Background(), metav1.ListOptions{})
if err != nil {
g.Error(err)
return
}
g.JSON(200, dps)
}
//list Services in the specified namespace
func ListService(g *gin.Context) {
ns := g.Query("ns")
svc, err := k8sClient.CoreV1().Services(ns).List(context.Background(), metav1.ListOptions{})
if err != nil {
g.Error(err)
return
}
g.JSON(200, svc)
}
The namespace is specified via the ns query parameter.
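For example, assuming the routes from the router sketch above, the handlers can be exercised like this:
curl "http://localhost:8080/deployments?ns=testing"
curl "http://localhost:8080/services?ns=testing"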
Let's take a look at the returned results:
# deployment
{
... ...
"items": [
{
"metadata": {
"name": "nginx",
"namespace": "testing",
"labels": {
"k8s.kuboard.cn/layer": "web",
"k8s.kuboard.cn/name": "nginx"
},
... ...
},
"spec": {
"replicas": 2,
"selector": {
"matchLabels": {
"k8s.kuboard.cn/layer": "web",
"k8s.kuboard.cn/name": "nginx"
}
},
"template": {
"metadata": {
"labels": {
"k8s.kuboard.cn/layer": "web",
"k8s.kuboard.cn/name": "nginx"
}
},
"spec": {
"containers": [
{
"name": "nginx",
"image": "nginx:alpine",
... ...
}
],
}
},
"strategy": {
"type": "RollingUpdate",
"rollingUpdate": {
"maxUnavailable": "25%",
"maxSurge": "25%"
}
},
},
"status": ...
}
... ...
]
}
# Services
{
"items": [
{
"metadata": {
"name": "nginx",
"namespace": "testing",
"labels": {
"k8s.kuboard.cn/layer": "web",
"k8s.kuboard.cn/name": "nginx"
},
"managedFields": [...]
},
"spec": {
"ports": [
{
"name": "nkcers",
"protocol": "TCP",
"port": 8080,
"targetPort": 80
}
],
"selector": {
"k8s.kuboard.cn/layer": "web",
"k8s.kuboard.cn/name": "nginx"
},
"clusterIP": "10.96.55.66",
"clusterIPs": [
"10.96.55.66"
],
"type": "ClusterIP",
"sessionAffinity": "None",
"ipFamilies": [
"IPv4"
],
"ipFamilyPolicy": "SingleStack"
},
"status": ...
}
... ...
]
}
From the results we can see that the testing namespace has a Deployment named nginx that uses the nginx:alpine image, and a Service of the same name whose ClusterIP maps port 8080 to port 80 of that Deployment.
0x04 Create a Deployment/Service
func CreateDeployment(g *gin.Context) {
var replicas int32 = 2
var AutomountServiceAccountTokenYes bool = true
deployment := &apiAppv1.Deployment{
TypeMeta: metav1.TypeMeta{
Kind: "Deployment",
APIVersion: "apps/v1",
},
ObjectMeta: metav1.ObjectMeta{
Name: "k8s-test-stub",
Namespace: "testing",
Labels: map[string]string{
"app": "k8s-test-app",
},
Annotations: map[string]string{
"creator":"k8s-operator-test",
},
},
Spec: apiAppv1.DeploymentSpec{
Selector: &metav1.LabelSelector{
MatchLabels: map[string]string{
"app": "k8s-test-app",
},
},
Replicas: &replicas,
Template: apiCorev1.PodTemplateSpec{
ObjectMeta: metav1.ObjectMeta{
Labels: map[string]string{
"app": "k8s-test-app",
},
},
Spec: apiCorev1.PodSpec{
Containers: []apiCorev1.Container{
{
Name: "nginx",
Image: "nginx:alpine",
},
},
RestartPolicy: "Always",
DNSPolicy: "ClusterFirst",
NodeSelector: nil,
ServiceAccountName: "",
AutomountServiceAccountToken: &AutomountServiceAccountTokenYes,
},
},
Strategy: apiAppv1.DeploymentStrategy{
Type: "RollingUpdate",
RollingUpdate: &apiAppv1.RollingUpdateDeployment{
MaxUnavailable: &intstr.IntOrString{
Type: intstr.String,
IntVal: 0,
StrVal: "25%",
},
MaxSurge: &intstr.IntOrString{
Type: intstr.String,
IntVal: 0,
StrVal: "25%",
},
},
},
},
}
dp, err := k8sClient.AppsV1().Deployments("testing").Create(context.Background(), deployment, metav1.CreateOptions{})
if err != nil {
g.AbortWithStatusJSON(500, err)
return
}
g.JSON(200, dp)
}
The above code creates a Deployment named k8s-test-stub in the testing namespace. The container uses the nginx:alpine image, and replicas is set to 2. The configuration leaves out many non-essential items.
After the call succeeds, we can see that two pods have been started:
root@main ~# kubectl get pods -n testing --selector=app=k8s-test-app
NAME READY STATUS RESTARTS AGE
k8s-test-stub-7bcdb4f5ff-bmcgf 1/1 Running 0 16m
k8s-test-stub-7bcdb4f5ff-cmng8 1/1 Running 0 16m
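Instead of checking with kubectl, the operator itself could poll the Deployment status until the desired replicas are ready. Below is a simplified sketch reusing the k8sClient from earlier (WaitForDeploymentReady is our own helper; real operators usually prefer informers/watches over polling):
import (
	"context"
	"time"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
//WaitForDeploymentReady polls until the Deployment reports the desired number of ready replicas
func WaitForDeploymentReady(ctx context.Context, ns, name string, want int32) error {
	for {
		dp, err := k8sClient.AppsV1().Deployments(ns).Get(ctx, name, metav1.GetOptions{})
		if err != nil {
			return err
		}
		if dp.Status.ReadyReplicas == want {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(2 * time.Second):
		}
	}
}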
Next, we create a Service for the Deployment so that it can serve traffic externally. The code is as follows:
func CreateService(g *gin.Context) {
svc := &apiCorev1.Service{
TypeMeta: metav1.TypeMeta{
Kind: "Service",
APIVersion: "v1",
},
ObjectMeta: metav1.ObjectMeta{
Name: "k8s-test-stub",
Namespace: "testing",
Labels: map[string]string{
"app": "k8s-test-app",
},
Annotations: map[string]string{
"creator":"k8s-test-operator",
},
},
Spec:apiCorev1.ServiceSpec{
Ports: []apiCorev1.ServicePort{
{
Name: "http",
Protocol: "TCP", //note: this must be uppercase
Port: 80,
TargetPort: intstr.IntOrString{
Type: intstr.Int,
IntVal: 80,
StrVal: "",
},
NodePort: 0,
},
},
Selector: map[string]string{
"app": "k8s-test-app",
},
Type: "NodePort",
},
}
svs, err := k8sClient.CoreV1().Services("testing").Create(context.Background(), svc, metav1.CreateOptions{})
if err != nil {
g.AbortWithStatusJSON(500, err)
return
}
g.JSON(200, svs)
}
The above code creates a Service with the same name as the k8s-test-stub Deployment, exposing it externally as a NodePort service:
root@main ~# kubectl get svc -n testing --selector=app=k8s-test-app
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
k8s-test-stub NodePort 10.96.138.143 <none> 80:30667/TCP 113s
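The service is now reachable from outside the cluster through any node's IP (30667 is the randomly assigned NodePort from our run; substitute your own node IP and port):
curl http://<node-ip>:30667/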
0x05 Delete Deployment/Service
func DeleteDeploymentAndService(g *gin.Context) {
//delete the Deployment
err := k8sClient.AppsV1().Deployments("testing").Delete(context.Background(), "k8s-test-stub", metav1.DeleteOptions{})
if err != nil {
g.AbortWithStatusJSON(500, err)
return
}
//delete the Service
err = k8sClient.CoreV1().Services("testing").Delete(context.Background(), "k8s-test-stub", metav1.DeleteOptions{})
if err != nil {
g.AbortWithStatusJSON(500, err)
return
}
g.JSON(200, nil)
}
The above code deletes the Deployment named k8s-test-stub in the testing namespace, along with the Service of the same name.
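One detail worth knowing: by default the API server garbage-collects the Deployment's ReplicaSets and pods asynchronously. To make the delete call cascade in the foreground, a propagation policy can be set, as in this sketch:
propagation := metav1.DeletePropagationForeground
err := k8sClient.AppsV1().Deployments("testing").Delete(
	context.Background(),
	"k8s-test-stub",
	metav1.DeleteOptions{PropagationPolicy: &propagation},
)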
root@main ~# kubectl get deployment,svc -n testing --selector=app=k8s-test-app
No resources found in testing namespace.
4. Let your Operator run in the Kubernetes cluster
The previous code examples demonstrated basic operations: listing nodes and namespaces, and creating and deleting Deployments and Services. As a starting point that is enough; more operations are left for readers to explore and share.
The previous examples ran directly in the host environment of the master node, which made it convenient to borrow the master node's credentials.
Our operator will eventually run inside the k8s cluster. If we do not set up the necessary permissions, we will most likely get errors similar to the following:
{
"ErrStatus": {
"metadata": {},
"status": "Failure",
"message": "nodes is forbidden: User \"system:serviceaccount:testing:default\" cannot list resource \"nodes\" in API group \"\" at the cluster scope",
"reason": "Forbidden",
"details": {
"kind": "nodes"
},
"code": 403
}
}
The result above shows that the list operation on nodes is forbidden, because the operator does not have sufficient permissions.
So how do we give the operator sufficient permissions to meet our needs?
As mentioned earlier, k8s has a strict and fine-grained permission design. For security reasons, ordinary containers in the cluster are not given many permissions, and the default permissions of a container cannot satisfy the functional requirements of most operators.
Let's first take a look at how an operator obtains its credentials inside a container.
We start from the SDK, where we can find the following code:
func BuildConfigFromFlags(masterUrl, kubeconfigPath string) (*restclient.Config, error) {
if kubeconfigPath == "" && masterUrl == "" {
klog.Warning("Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.")
kubeconfig, err := restclient.InClusterConfig()
if err == nil {
return kubeconfig, nil
}
klog.Warning("error creating inClusterConfig, falling back to default config: ", err)
}
return NewNonInteractiveDeferredLoadingClientConfig(
&ClientConfigLoadingRules{ExplicitPath: kubeconfigPath},
&ConfigOverrides{ClusterInfo: clientcmdapi.Cluster{Server: masterUrl}}).ClientConfig()
}
This code is where the client configuration is built. Earlier we called it with the kubeconfigPath parameter pointing to the master node's credentials file, so our operator had full super-administrator permissions. That is convenient, but it also brings great security risks: with all those permissions, an operator can do a lot of damage.
From the code you can also see that the BuildConfigFromFlags function allows its parameters to be empty. When both parameters are empty, restclient.InClusterConfig() is called. Let's step into this method:
func InClusterConfig() (*Config, error) {
const (
tokenFile = "/var/run/secrets/kubernetes.io/serviceaccount/token"
rootCAFile = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
)
host, port := os.Getenv("KUBERNETES_SERVICE_HOST"), os.Getenv("KUBERNETES_SERVICE_PORT")
if len(host) == 0 || len(port) == 0 {
return nil, ErrNotInCluster
}
token, err := ioutil.ReadFile(tokenFile)
if err != nil {
return nil, err
}
tlsClientConfig := TLSClientConfig{}
if _, err := certutil.NewPool(rootCAFile); err != nil {
klog.Errorf("Expected to load root CA config from %s, but got err: %v", rootCAFile, err)
} else {
tlsClientConfig.CAFile = rootCAFile
}
return &Config{
Host: "https://" + net.JoinHostPort(host, port),
TLSClientConfig: tlsClientConfig,
BearerToken: string(token),
BearerTokenFile: tokenFile,
}, nil
}
We see that the code references the following two files in the container:
/var/run/secrets/kubernetes.io/serviceaccount/token
/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
These two files are the default credentials the k8s cluster provides to every container. They correspond to the ServiceAccount named default in the current namespace (when a namespace is created, a default ServiceAccount is automatically created along with it, together with a token Secret named something like default-token-xxxx and a ConfigMap named kube-root-ca.crt). The two files above are mounted from those two objects.
For more about ServiceAccount, please refer to the official documentation.
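You can inspect these defaults yourself, for example:
# show the default ServiceAccount of a namespace and its associated secrets
kubectl get serviceaccount default -n testing -o yaml
kubectl get secrets -n testing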
The default ServiceAccount cannot meet the needs of an operator; we need to create a new ServiceAccount and grant it sufficient permissions.
First we need to define a ClusterRole:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: k8s-operator
annotations:
app: k8s-operator-test
rules:
- apiGroups:
- apps
resources:
- daemonsets
- deployments
- replicasets
- statefulsets
verbs:
- create
- delete
- get
- list
- update
- watch
- patch
- apiGroups:
- ''
resources:
- nodes
- namespaces
- pods
- services
- serviceaccounts
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
Create a new ServiceAccount named k8s-test-operator:
apiVersion: v1
kind: ServiceAccount
metadata:
name: k8s-test-operator
namespace: testing
annotations:
app: k8s-operator-test
secrets:
- name: k8s-test-operator-token-2hfbn
Bind the ClusterRole to the ServiceAccount:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: k8s-test-operator-cluster
annotations:
app: k8s-operator-test
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: k8s-operator
subjects:
- kind: ServiceAccount
name: k8s-test-operator
namespace: testing
Execute kubectl apply -f *.yaml to make the permission binding take effect, then specify the new ServiceAccount name at the following location in the operator's Deployment manifest:
deployment.Spec.Template.Spec.ServiceAccountName: "k8s-test-operator"
Alternatively, we can directly execute kubectl edit deployment operator-test -n testing, find spec.template.spec, and add serviceAccountName: k8s-test-operator there for the binding to take effect.
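Before rerunning anything, you can verify the new ServiceAccount's permissions with kubectl auth can-i, for example:
kubectl auth can-i list nodes --as=system:serviceaccount:testing:k8s-test-operator
kubectl auth can-i create deployments -n testing --as=system:serviceaccount:testing:k8s-test-operator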
Now let's run the earlier operations again in turn:
- List all Nodes/Namespaces
- List Deployments/Services in the specified namespace
- Create a Deployment/Service
- Delete a Deployment/Service
All of them now execute normally.
5. Summary
Developing a kubernetes operator is no different from the software development you usually do; in the end it calls the HTTP interface of the API server. The one thing you need to pay attention to is permissions: only with sufficient permissions can an operator achieve all the functions you can imagine!
demo repo: https://gitee.com/longmon/k8s-operator-tester.git