1. Introduction
A Kubernetes operator is a method of packaging, deploying, and managing a Kubernetes application. It extends Kubernetes with custom resources to manage applications and their components.
Everything an operator does goes through the interface of the Kubernetes API server, so in essence it is just client software of the API server.
This article is a getting-started tutorial on Kubernetes operator development. It aims to give newcomers who are interested in operator development a glimpse of the basic workflow.
2. Preparation
- First of all, you need an available Kubernetes test cluster. If you do not yet have a solid grasp of Kubernetes concepts and cluster setup, I suggest you learn those first.
- This tutorial uses Go, which is also the language Kubernetes itself is developed in, and assumes a basic understanding of its syntax. Other languages work too: at the bottom everything is HTTP requests, and official and community SDKs are available for a variety of languages. Once you understand the principles, you should be able to develop in any of them.
- We will use the official k8s.io/client-go library, which encapsulates most operations on Kubernetes.
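To follow along, you can pull the library into your Go module first (the version below is only an assumption; pick one that matches your cluster):
go get k8s.io/client-go@v0.22.2  # hypothetical version; match your cluster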
The sample code directory is as follows:
├── Dockerfile
├── go.mod
├── go.sum
├── k8s //client wrapper
│ └── client.go
├── LICENSE
├── main.go
├── Makefile
├── utils //helper components
│ ├── errs
│ │ └── errs.go
│ └── logs
│ └── slog.go
└── yaml
├── Deployment.yaml //operator deployment manifest
└── ServiceAccount.yaml //permission binding manifest; the role, binding, and account are defined separately below, but keeping them in one file also works
As a demonstration, in this tutorial we mainly focus on the following operations:
- List all Nodes/Namespaces
- List Deployments/Services in a specified namespace
- Create a Deployment/Service
- Delete a Deployment/Service
Operator development is no different from the programs you usually write. The most important concern is permissions: Kubernetes has a very strict and fine-grained permission design, down to each resource and each operation.
So it is not critical that the operator runs in a container inside the Kubernetes cluster. As long as permissions are configured properly, you can run the binary produced by go build, or even go run the code directly in your development environment. For the convenience of development and debugging, we usually run it exactly this way.
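For example, assuming main.go is the entry point as in the sample repo, either of the following works during development:
# run directly while developing and debugging
go run main.go
# or build a binary and run it
go build -o k8s-operator . && ./k8s-operator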
If you are not yet familiar with Kubernetes permission management, I suggest you run your code on a master node of your test cluster. The master node holds the cluster's highest permissions, which saves you the trouble of configuring them and lets you focus your energy on business logic.
3. Getting started
0x01 Initialize the client object
First of all, we need to instantiate a k8s.io/client-go/kubernetes.Clientset in the code, which is the client object our entire operator application operates through.
It can be instantiated by one of two functions:
func NewForConfig(c *rest.Config) (*Clientset, error)
func NewForConfigOrDie(c *rest.Config) *Clientset
The difference between the two: the first returns an error when instantiation fails, while the second panics. The former is generally recommended, so the program handles errors instead of panicking.
Both functions take a rest.Config object as a parameter, and the most important configuration items of rest.Config are the credentials for accessing the cluster.
The SDK provides us with the func BuildConfigFromFlags(masterUrl, kubeconfigPath string) (*restclient.Config, error) method to instantiate the rest.Config object.
The masterUrl parameter is the server URL of the master node; the kubeconfigPath parameter is the path of the credentials file.
The credentials file on the master node is usually /etc/kubernetes/admin.conf.
After you deploy a master node, kubernetes suggests copying the /etc/kubernetes/admin.conf file to $HOME/.kube/config, which is why the contents of these two files are identical.
When passing the parameter we usually recommend the $HOME/.kube/config file, to avoid file-permission problems adding complexity.
The BuildConfigFromFlags method can actually take empty values. If our operator program runs in a container inside a Kubernetes cluster, passing empty values makes it fall back to the default credentials inside the container (we will look at how that works below). A container outside a Kubernetes cluster has no such default configuration, so there we must explicitly pass the path of the credentials file.
Enough talk; let's go straight to the code:
import "k8s.io/client-go/kubernetes"
//调用之前请确认文件存在,如果不存在使用/etc/kubernetes/admin.conf
cfg, err := clientcmd.BuildConfigFromFlags("", "/root/.kube/config")
if err != nil {
log.Fatalln(err)
}
k8sClient, err := kubernetes.NewForConfig(cfg)
if err != nil {
log.Fatalln(err)
}
This k8sClient is the client object we will be using throughout.
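A common variant, sketched below, assumes the same binary may run either inside or outside the cluster: try the in-cluster configuration first and fall back to a kubeconfig file (NewClient is our own helper, not part of the SDK):
import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
)
//NewClient tries the in-cluster config first, then falls back to the given kubeconfig file
func NewClient(kubeconfigPath string) (*kubernetes.Clientset, error) {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		//not running inside a cluster: fall back to the kubeconfig file
		cfg, err = clientcmd.BuildConfigFromFlags("", kubeconfigPath)
		if err != nil {
			return nil, err
		}
	}
	return kubernetes.NewForConfig(cfg)
}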
The code repo for this tutorial is linked at the end of the article; the final code has been adjusted and polished to make sure it works.
Now, let's get down to the real work.
0x02 List all Nodes/Namespaces
//ListNodes lists all nodes
func ListNodes(g *gin.Context) {
nodes, err := k8sClient.CoreV1().Nodes().List(context.Background(), metav1.ListOptions{})
if err != nil {
g.Error(err)
return
}
g.JSON(200, nodes)
}
//ListNamespace lists all namespaces
func ListNamespace(g *gin.Context) {
ns, err := k8sClient.CoreV1().Namespaces().List(context.Background(),metav1.ListOptions{})
if err != nil {
g.Error(err)
return
}
g.JSON(200, ns)
}
For simplicity, we return the data from the interface as-is, without further processing.
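These handlers use the gin framework; a minimal main that wires them up might look like the sketch below (the route paths and port are our own choices, not necessarily those of the sample repo):
package main
import "github.com/gin-gonic/gin"
func main() {
	r := gin.Default()
	r.GET("/nodes", ListNodes)
	r.GET("/namespaces", ListNamespace)
	//namespace-scoped handlers shown in the following sections
	r.GET("/deployments", ListDeployment)
	r.GET("/services", ListService)
	r.Run(":8080") //listen on 0.0.0.0:8080
}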
The response is too long to paste in full. From it we can see that the node information includes:
- System info
- Node status
- Node events
- Resource usage
- Node labels, annotations, creation time, etc.
- Local images and container groups on the node
I will not go through them one by one; interested readers can run it in their own environment and inspect the output.
Below is the namespace output, truncated to the data of a single namespace:
{
"metadata": {
"resourceVersion": "190326"
},
"items": [
{
"metadata": {
"name": "default",
"uid": "acf4b9e4-b1ae-4b7a-bbdc-b65f088e14ec",
"resourceVersion": "208",
"creationTimestamp": "2021-09-24T11:17:29Z",
"labels": {
"kubernetes.io/metadata.name": "default"
},
"managedFields": [
{
"manager": "kube-apiserver",
"operation": "Update",
"apiVersion": "v1",
"time": "2021-09-24T11:17:29Z",
"fieldsType": "FieldsV1",
"fieldsV1": {
"f:metadata": {
"f:labels": {
".": {},
"f:kubernetes.io/metadata.name": {}
}
}
}
}
]
},
"spec": {
"finalizers": [
"kubernetes"
]
},
"status": {
"phase": "Active"
}
},
... ...
]
}
0x03 List the Deployments/Services in a specified namespace
//list Deployments in the specified namespace
func ListDeployment(g *gin.Context) {
ns := g.Query("ns")
dps, err := k8sClient.AppsV1().Deployments(ns).List(context.Background(), metav1.ListOptions{})
if err != nil {
g.Error(err)
return
}
g.JSON(200, dps)
}
//list Services in the specified namespace
func ListService(g *gin.Context) {
ns := g.Query("ns")
svc, err := k8sClient.CoreV1().Services(ns).List(context.Background(), metav1.ListOptions{})
if err != nil {
g.Error(err)
return
}
g.JSON(200, svc)
}
The namespace is specified via the ns query parameter.
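For example, assuming the routes from the router sketch above, the handlers can be exercised like this:
curl "http://localhost:8080/deployments?ns=testing"
curl "http://localhost:8080/services?ns=testing"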
Let's take a look at the returned results:
# deployment
{
... ...
"items": [
{
"metadata": {
"name": "nginx",
"namespace": "testing",
"labels": {
"k8s.kuboard.cn/layer": "web",
"k8s.kuboard.cn/name": "nginx"
},
... ...
},
"spec": {
"replicas": 2,
"selector": {
"matchLabels": {
"k8s.kuboard.cn/layer": "web",
"k8s.kuboard.cn/name": "nginx"
}
},
"template": {
"metadata": {
"labels": {
"k8s.kuboard.cn/layer": "web",
"k8s.kuboard.cn/name": "nginx"
}
},
"spec": {
"containers": [
{
"name": "nginx",
"image": "nginx:alpine",
... ...
}
],
}
},
"strategy": {
"type": "RollingUpdate",
"rollingUpdate": {
"maxUnavailable": "25%",
"maxSurge": "25%"
}
},
},
"status": ...
}
... ...
]
}
# Services
{
"items": [
{
"metadata": {
"name": "nginx",
"namespace": "testing",
"labels": {
"k8s.kuboard.cn/layer": "web",
"k8s.kuboard.cn/name": "nginx"
},
"managedFields": [...]
},
"spec": {
"ports": [
{
"name": "nkcers",
"protocol": "TCP",
"port": 8080,
"targetPort": 80
}
],
"selector": {
"k8s.kuboard.cn/layer": "web",
"k8s.kuboard.cn/name": "nginx"
},
"clusterIP": "10.96.55.66",
"clusterIPs": [
"10.96.55.66"
],
"type": "ClusterIP",
"sessionAffinity": "None",
"ipFamilies": [
"IPv4"
],
"ipFamilyPolicy": "SingleStack"
},
"status": ...
}
... ...
]
}
From the results we can see that the testing namespace has a Deployment named nginx that uses the nginx:alpine image, and a Service of the same name whose ClusterIP maps port 8080 to port 80 of that Deployment.
0x04 Create a Deployment/Service
func CreateDeployment(g *gin.Context) {
var replicas int32 = 2
var AutomountServiceAccountTokenYes bool = true
deployment := &apiAppv1.Deployment{
TypeMeta: metav1.TypeMeta{
Kind: "Deployment",
APIVersion: "apps/v1",
},
ObjectMeta: metav1.ObjectMeta{
Name: "k8s-test-stub",
Namespace: "testing",
Labels: map[string]string{
"app": "k8s-test-app",
},
Annotations: map[string]string{
"creator":"k8s-operator-test",
},
},
Spec: apiAppv1.DeploymentSpec{
Selector: &metav1.LabelSelector{
MatchLabels: map[string]string{
"app": "k8s-test-app",
},
},
Replicas: &replicas,
Template: apiCorev1.PodTemplateSpec{
ObjectMeta: metav1.ObjectMeta{
Labels: map[string]string{
"app": "k8s-test-app",
},
},
Spec: apiCorev1.PodSpec{
Containers: []apiCorev1.Container{
{
Name: "nginx",
Image: "nginx:alpine",
},
},
RestartPolicy: "Always",
DNSPolicy: "ClusterFirst",
NodeSelector: nil,
ServiceAccountName: "",
AutomountServiceAccountToken: &AutomountServiceAccountTokenYes,
},
},
Strategy: apiAppv1.DeploymentStrategy{
Type: "RollingUpdate",
RollingUpdate: &apiAppv1.RollingUpdateDeployment{
MaxUnavailable: &intstr.IntOrString{
Type: intstr.String,
IntVal: 0,
StrVal: "25%",
},
MaxSurge: &intstr.IntOrString{
Type: intstr.String,
IntVal: 0,
StrVal: "25%",
},
},
},
},
}
dp, err := k8sClient.AppsV1().Deployments("testing").Create(context.Background(), deployment, metav1.CreateOptions{})
if err != nil {
g.AbortWithStatusJSON(500, err)
return
}
g.JSON(200, dp)
}
The above code creates a Deployment named k8s-test-stub in the testing namespace. The container uses the nginx:alpine image, and replicas is set to 2. The configuration leaves out many non-essential items.
After the call succeeds, we can see that two pods have been started:
root@main ~# kubectl get pods -n testing --selector=app=k8s-test-app
NAME READY STATUS RESTARTS AGE
k8s-test-stub-7bcdb4f5ff-bmcgf 1/1 Running 0 16m
k8s-test-stub-7bcdb4f5ff-cmng8 1/1 Running 0 16m
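Instead of checking with kubectl, the operator itself could poll the Deployment status until the desired replicas are ready. Below is a simplified sketch reusing the k8sClient from earlier (WaitForDeploymentReady is our own helper; real operators usually prefer informers/watches over polling):
import (
	"context"
	"time"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
//WaitForDeploymentReady polls until the Deployment reports the desired number of ready replicas
func WaitForDeploymentReady(ctx context.Context, ns, name string, want int32) error {
	for {
		dp, err := k8sClient.AppsV1().Deployments(ns).Get(ctx, name, metav1.GetOptions{})
		if err != nil {
			return err
		}
		if dp.Status.ReadyReplicas == want {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(2 * time.Second):
		}
	}
}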
Next, we create a Service for the Deployment so that it can serve traffic externally. The code is as follows:
func CreateService(g *gin.Context) {
svc := &apiCorev1.Service{
TypeMeta: metav1.TypeMeta{
Kind: "Service",
APIVersion: "v1",
},
ObjectMeta: metav1.ObjectMeta{
Name: "k8s-test-stub",
Namespace: "testing",
Labels: map[string]string{
"app": "k8s-test-app",
},
Annotations: map[string]string{
"creator":"k8s-test-operator",
},
},
Spec:apiCorev1.ServiceSpec{
Ports: []apiCorev1.ServicePort{
{
Name: "http",
Protocol: "TCP", //note: this must be uppercase
Port: 80,
TargetPort: intstr.IntOrString{
Type: intstr.Int,
IntVal: 80,
StrVal: "",
},
NodePort: 0,
},
},
Selector: map[string]string{
"app": "k8s-test-app",
},
Type: "NodePort",
},
}
svs, err := k8sClient.CoreV1().Services("testing").Create(context.Background(), svc, metav1.CreateOptions{})
if err != nil {
g.AbortWithStatusJSON(500, err)
return
}
g.JSON(200, svs)
}
The above code creates a Service with the same name as the k8s-test-stub Deployment, exposing it externally as a NodePort service:
root@main ~# kubectl get svc -n testing --selector=app=k8s-test-app
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
k8s-test-stub NodePort 10.96.138.143 <none> 80:30667/TCP 113s
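The service is now reachable from outside the cluster through any node's IP (30667 is the randomly assigned NodePort from our run; substitute your own node IP and port):
curl http://<node-ip>:30667/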
0x05 Delete Deployment/Service
func DeleteDeploymentAndService(g *gin.Context) {
//delete the Deployment
err := k8sClient.AppsV1().Deployments("testing").Delete(context.Background(), "k8s-test-stub", metav1.DeleteOptions{})
if err != nil {
g.AbortWithStatusJSON(500, err)
return
}
//delete the Service
err = k8sClient.CoreV1().Services("testing").Delete(context.Background(), "k8s-test-stub", metav1.DeleteOptions{})
if err != nil {
g.AbortWithStatusJSON(500, err)
return
}
g.JSON(200, nil)
}
The above code deletes the Deployment named k8s-test-stub in the testing namespace, along with the Service of the same name.
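One detail worth knowing: by default the API server garbage-collects the Deployment's ReplicaSets and pods asynchronously. To make the delete call cascade in the foreground, a propagation policy can be set, as in this sketch:
propagation := metav1.DeletePropagationForeground
err := k8sClient.AppsV1().Deployments("testing").Delete(
	context.Background(),
	"k8s-test-stub",
	metav1.DeleteOptions{PropagationPolicy: &propagation},
)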
root@main ~# kubectl get deployment,svc -n testing --selector=app=k8s-test-app
No resources found in testing namespace.
4. Let your Operator run in the Kubernetes cluster
The previous code examples demonstrated basic operations: listing nodes and namespaces, and creating and deleting Deployments and Services. As a starting point that is enough; more operations are left for readers to explore and share.
The previous examples ran directly in the host environment of the master node, which made it convenient to borrow the master node's credentials.
Our operator will eventually run inside the k8s cluster. If we do not set up the necessary permissions, we will most likely get errors similar to the following:
{
"ErrStatus": {
"metadata": {},
"status": "Failure",
"message": "nodes is forbidden: User \"system:serviceaccount:testing:default\" cannot list resource \"nodes\" in API group \"\" at the cluster scope",
"reason": "Forbidden",
"details": {
"kind": "nodes"
},
"code": 403
}
}
The result above shows that the list operation on nodes is forbidden, because the operator does not have sufficient permissions.
So how do we give the operator sufficient permissions to meet our needs?
As mentioned earlier, k8s has a strict and fine-grained permission design. For security reasons, ordinary containers in the cluster are not given many permissions, and the default permissions of a container cannot satisfy the functional requirements of most operators.
Let's first take a look at how an operator obtains its credentials inside a container.
We start from the SDK, where we can find the following code:
func BuildConfigFromFlags(masterUrl, kubeconfigPath string) (*restclient.Config, error) {
if kubeconfigPath == "" && masterUrl == "" {
klog.Warning("Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.")
kubeconfig, err := restclient.InClusterConfig()
if err == nil {
return kubeconfig, nil
}
klog.Warning("error creating inClusterConfig, falling back to default config: ", err)
}
return NewNonInteractiveDeferredLoadingClientConfig(
&ClientConfigLoadingRules{ExplicitPath: kubeconfigPath},
&ConfigOverrides{ClusterInfo: clientcmdapi.Cluster{Server: masterUrl}}).ClientConfig()
}
This code is where the client configuration is built. Earlier we called it with the kubeconfigPath parameter pointing to the master node's credentials file, so our operator had full super-administrator permissions. That is convenient, but it also brings great security risks: with all those permissions, an operator can do a lot of damage.
From the code you can also see that the BuildConfigFromFlags function allows its parameters to be empty. When both parameters are empty, restclient.InClusterConfig() is called. Let's step into this method:
func InClusterConfig() (*Config, error) {
const (
tokenFile = "/var/run/secrets/kubernetes.io/serviceaccount/token"
rootCAFile = "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt"
)
host, port := os.Getenv("KUBERNETES_SERVICE_HOST"), os.Getenv("KUBERNETES_SERVICE_PORT")
if len(host) == 0 || len(port) == 0 {
return nil, ErrNotInCluster
}
token, err := ioutil.ReadFile(tokenFile)
if err != nil {
return nil, err
}
tlsClientConfig := TLSClientConfig{}
if _, err := certutil.NewPool(rootCAFile); err != nil {
klog.Errorf("Expected to load root CA config from %s, but got err: %v", rootCAFile, err)
} else {
tlsClientConfig.CAFile = rootCAFile
}
return &Config{
Host: "https://" + net.JoinHostPort(host, port),
TLSClientConfig: tlsClientConfig,
BearerToken: string(token),
BearerTokenFile: tokenFile,
}, nil
}
We see that the code references the following two files in the container:
/var/run/secrets/kubernetes.io/serviceaccount/token
/var/run/secrets/kubernetes.io/serviceaccount/ca.crt
These two files are the default credentials the k8s cluster provides to every container. They correspond to the ServiceAccount named default in the current namespace (when a namespace is created, a default ServiceAccount is automatically created along with it, together with a token Secret named something like default-token-xxxx and a ConfigMap named kube-root-ca.crt). The two files above are mounted from those two objects.
For more about ServiceAccount, please refer to the official documentation.
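You can inspect these defaults yourself, for example:
# show the default ServiceAccount of a namespace and its associated secrets
kubectl get serviceaccount default -n testing -o yaml
kubectl get secrets -n testing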
The default ServiceAccount cannot meet the needs of an operator; we need to create a new ServiceAccount and grant it sufficient permissions.
First we need to define a ClusterRole:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: k8s-operator
annotations:
app: k8s-operator-test
rules:
- apiGroups:
- apps
resources:
- daemonsets
- deployments
- replicasets
- statefulsets
verbs:
- create
- delete
- get
- list
- update
- watch
- patch
- apiGroups:
- ''
resources:
- nodes
- namespaces
- pods
- services
- serviceaccounts
verbs:
- create
- delete
- get
- list
- patch
- update
- watch
Create a new ServiceAccount named k8s-test-operator:
apiVersion: v1
kind: ServiceAccount
metadata:
name: k8s-test-operator
namespace: testing
annotations:
app: k8s-operator-test
secrets:
- name: k8s-test-operator-token-2hfbn
Bind the ClusterRole to the ServiceAccount:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: k8s-test-operator-cluster
annotations:
app: k8s-operator-test
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: k8s-operator
subjects:
- kind: ServiceAccount
name: k8s-test-operator
namespace: testing
Execute kubectl apply -f *.yaml to make the permission binding take effect, then specify the new ServiceAccount name at the following location in the operator's Deployment manifest:
deployment.Spec.Template.Spec.ServiceAccountName: "k8s-test-operator"
Alternatively, we can directly execute kubectl edit deployment operator-test -n testing, find spec.template.spec, and add serviceAccountName: k8s-test-operator there for the binding to take effect.
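Before rerunning anything, you can verify the new ServiceAccount's permissions with kubectl auth can-i, for example:
kubectl auth can-i list nodes --as=system:serviceaccount:testing:k8s-test-operator
kubectl auth can-i create deployments -n testing --as=system:serviceaccount:testing:k8s-test-operator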
Now let's run the earlier operations again in turn:
- List all Nodes/Namespaces
- List Deployments/Services in the specified namespace
- Create a Deployment/Service
- Delete a Deployment/Service
All of them now execute normally.
5. Summary
Developing a kubernetes operator is no different from the software development you usually do; in the end it calls the HTTP interface of the API server. The one thing you need to pay attention to is permissions: only with sufficient permissions can an operator achieve all the functions you can imagine!
demo repo: https://gitee.com/longmon/k8s-operator-tester.git