This tutorial shows how to get a pod's CPU usage with the Kubernetes Python client SDK, rather than with the kubectl command.
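Motivation
What am I trying to do?
I recently had a pod, a RabbitMQ consumer, that frequently hangs. I need to detect when the pod is stuck and then restart it.
An ordinary health check cannot detect this condition.
Criterion: if CPU usage is below 20m (20 millicores), the pod is considered stuck and is deleted. After the pod is deleted, k8s creates a new one (provided the pod is managed by a controller such as a Deployment).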
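Technical approach
Option 1: shell + kubectl. I don't like shell, and I don't like parsing unstructured output, so I ruled this option out.
Option 2: Python + the Kubernetes SDK. I like Python, and this way I get structured data such as JSON, which is easy to parse.
So I went with option 2.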
Listing all pods in a namespace
First, we list all the pods in a namespace.
This is similar to kubectl get pod -n vddb,
where vddb is the name of the namespace.
from kubernetes import client, config
from kubernetes.client.models.v1_object_meta import V1ObjectMeta
from kubernetes.client.models.v1_pod import V1Pod
from kubernetes.client.models.v1_pod_list import V1PodList
from loguru import logger

# Load credentials from the local kubeconfig (the same file kubectl uses)
config.load_kube_config()
v1 = client.CoreV1Api()
namespaced_name = 'vddb'
# Equivalent to: kubectl get pod -n vddb
pod_list: V1PodList = v1.list_namespaced_pod(namespaced_name)
for pod in pod_list.items:
    pod: V1Pod
    metadata: V1ObjectMeta = pod.metadata
    pod_name = metadata.name
    logger.info(pod_name)
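Pods that are not running yet (for example, ones still in the Pending phase) typically have no metrics, so it can help to narrow the list before asking for usage. The following is a minimal sketch, assuming each item exposes metadata.name and status.phase the way V1Pod does. The names running_pod_names and name_fragment are hypothetical and introduced here only for illustration.

```python
def running_pod_names(pods, name_fragment=""):
    """Return the names of pods that are Running and whose name contains name_fragment.

    `pods` is any iterable of objects shaped like V1Pod
    (with .metadata.name and .status.phase).
    """
    return [
        pod.metadata.name
        for pod in pods
        if pod.status.phase == "Running" and name_fragment in pod.metadata.name
    ]
```

With the listing code above, this could be called as running_pod_names(pod_list.items, 'svddb').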
Getting a pod's metrics
Once we have the pod names, we fetch the metrics for each pod, such as its CPU and memory usage.
import json

from kubernetes import client, config
from kubernetes.client import ApiClient
from kubernetes.client.models.v1_object_meta import V1ObjectMeta
from kubernetes.client.models.v1_pod import V1Pod
from kubernetes.client.models.v1_pod_list import V1PodList
from kubernetes.client.rest import RESTResponse
from loguru import logger

config.load_kube_config()
v1 = client.CoreV1Api()
api_client = ApiClient()
namespaced_name = 'vddb'
pod_list: V1PodList = v1.list_namespaced_pod(namespaced_name)
for pod in pod_list.items:
    pod: V1Pod
    metadata: V1ObjectMeta = pod.metadata
    pod_name = metadata.name
    # Query the metrics API (metrics.k8s.io) directly for this pod
    rest_response: RESTResponse = api_client.request(
        url=api_client.configuration.host +
        f'/apis/metrics.k8s.io/v1beta1/namespaces/{namespaced_name}/pods/{pod_name}',
        method='GET'
    )
    data: dict = json.loads(rest_response.data)
    # Only the first container is read; the value is assumed to be in
    # nanocores, e.g. "2575748239n" (str.removesuffix needs Python 3.9+)
    _cpu: str = data['containers'][0]['usage']['cpu']
    # nanocores -> millicores
    cpu = int(_cpu.removesuffix('n')) // 1_000_000
    logger.info(f'{pod_name}: {cpu}m')
The response body looks like this:
{
  "kind": "PodMetrics",
  "apiVersion": "metrics.k8s.io/v1beta1",
  "metadata": {
    "name": "svddb-servixxxxxxxxxxxxxxx4b4-bzs84",
    "namespace": "vddb",
    "selfLink": "/apis/metrics.k8s.io/v1beta1/namespaces/vddb/pods/svddbxxxxxxxxxxxxxxxx-bzs84",
    "creationTimestamp": "2022-12-16T14:40:46Z"
  },
  "timestamp": "2022-12-16T14:40:09Z",
  "window": "30s",
  "containers": [
    {
      "name": "svdxxxxxxxxrators",
      "usage": { "cpu": "2575748239n", "memory": "1257180Ki" }
    }
  ]
}
Note that containers is a list with one entry per container in the pod; the code above reads only the first entry.
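Because containers is a list and usage.cpu is a Kubernetes quantity string, a pod's total CPU is the sum over its containers after unit conversion. The sketch below is one possible way to do that, under stated assumptions: it handles only the n (nano), u (micro), and m (milli) suffixes and plain core counts, and raises ValueError for anything else. The helper names cpu_to_millicores, pod_cpu_millicores, and cpu_by_pod are hypothetical. custom_api is assumed to be a kubernetes.client.CustomObjectsApi instance, whose list_namespaced_custom_object call returns the metrics of a whole namespace as a dict with an items list.

```python
# Divisors that turn a suffixed CPU quantity into millicores.
_DIVISOR_TO_MILLICORES = {"n": 1_000_000, "u": 1_000, "m": 1}


def cpu_to_millicores(quantity: str) -> float:
    """Convert a CPU quantity such as '2575748239n', '20m' or '2' to millicores."""
    suffix = quantity[-1:]
    if suffix in _DIVISOR_TO_MILLICORES:
        return float(quantity[:-1]) / _DIVISOR_TO_MILLICORES[suffix]
    # No known suffix: treat the value as whole cores (float() raises
    # ValueError for suffixes this sketch does not handle).
    return float(quantity) * 1000.0


def pod_cpu_millicores(pod_metrics: dict) -> float:
    """Sum the CPU usage of every container in one PodMetrics object."""
    return sum(
        (cpu_to_millicores(c["usage"]["cpu"]) for c in pod_metrics.get("containers", [])),
        0.0,
    )


def cpu_by_pod(custom_api, namespace: str) -> dict:
    """Map pod name -> CPU usage in millicores for every pod in the namespace.

    Assumption: custom_api behaves like kubernetes.client.CustomObjectsApi.
    """
    response = custom_api.list_namespaced_custom_object(
        group="metrics.k8s.io", version="v1beta1", namespace=namespace, plural="pods"
    )
    return {
        item["metadata"]["name"]: pod_cpu_millicores(item)
        for item in response.get("items", [])
    }
```

Against a real cluster this would be called as cpu_by_pod(client.CustomObjectsApi(), 'vddb') after config.load_kube_config(). Summing the containers avoids misjudging a multi-container pod whose first container happens to be idle.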
Deleting the pod
import json

from kubernetes import client, config
from kubernetes.client import ApiClient
from kubernetes.client.models.v1_object_meta import V1ObjectMeta
from kubernetes.client.models.v1_pod import V1Pod
from kubernetes.client.models.v1_pod_list import V1PodList
from kubernetes.client.rest import RESTResponse
from loguru import logger

config.load_kube_config()
v1 = client.CoreV1Api()
api_client = ApiClient()
namespaced_name = 'vddb'
pod_list: V1PodList = v1.list_namespaced_pod(namespaced_name)
for pod in pod_list.items:
    pod: V1Pod
    metadata: V1ObjectMeta = pod.metadata
    pod_name = metadata.name
    # Query the metrics API (metrics.k8s.io) directly for this pod
    rest_response: RESTResponse = api_client.request(
        url=api_client.configuration.host +
        f'/apis/metrics.k8s.io/v1beta1/namespaces/{namespaced_name}/pods/{pod_name}',
        method='GET'
    )
    data: dict = json.loads(rest_response.data)
    # Only the first container is read; the value is assumed to be in nanocores
    _cpu: str = data['containers'][0]['usage']['cpu']
    # nanocores -> millicores
    cpu = int(_cpu.removesuffix('n')) // 1_000_000
    # Below 20m the consumer is considered stuck: delete the pod so that
    # k8s creates a replacement
    if 'svddb-service-generators-server-prod' in pod_name and cpu < 20:
        logger.warning(f'{pod_name} is using {cpu}m CPU, deleting it')
        v1.delete_namespaced_pod(pod_name, namespaced_name)
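Since deleting pods is destructive, it may be worth separating the decision from the action and adding a dry-run mode. The sketch below is one way to arrange that, not the original script: select_stuck_pods, delete_pods, and the dry_run flag are hypothetical names, cpu_millicores_by_pod is assumed to be a dict from pod name to CPU usage in millicores built from the metrics responses, and core_api is assumed to behave like kubernetes.client.CoreV1Api.

```python
def select_stuck_pods(cpu_millicores_by_pod, name_fragment, threshold_m=20):
    """Return the sorted names of matching pods whose CPU usage is below threshold_m."""
    return sorted(
        name
        for name, cpu in cpu_millicores_by_pod.items()
        if name_fragment in name and cpu < threshold_m
    )


def delete_pods(core_api, namespace, pod_names, dry_run=True):
    """Delete the given pods and return the names actually deleted.

    With dry_run=True (the default here) nothing is deleted; the names
    are only printed so the selection can be checked first.
    """
    deleted = []
    for name in pod_names:
        if dry_run:
            print(f"[dry-run] would delete {namespace}/{name}")
            continue
        core_api.delete_namespaced_pod(name, namespace)
        deleted.append(name)
    return deleted
```

A pod that has no metrics yet is simply absent from the mapping, so it is never selected. Once the dry-run output looks right, the same call with dry_run=False and core_api=client.CoreV1Api() performs the deletion.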
References:
Get cpu and memory usage through in cluster config
Does the library support "kubectl top pod" api?
https://kubernetes.io/docs/tasks/debug/debug-cluster/resource-metrics-pipeline/