
Monitoring container resources in k8s

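How to configure Prometheus to scrape container metrics

  • Use the node-level role of Prometheus's kubernetes_sd_configs, so that every node's kubelet becomes a scrape target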
- job_name: kubernetes-nodes-cadvisor
  honor_timestamps: false
  scrape_interval: 30s
  scrape_timeout: 10s
  metrics_path: /metrics
  scheme: https
  kubernetes_sd_configs:
  - role: node
  bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  tls_config:
    ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    insecure_skip_verify: true
  relabel_configs:
  - separator: ;
    regex: __meta_kubernetes_node_label_(.+)
    replacement: $1
    action: labelmap
  - separator: ;
    regex: (.*)
    target_label: __metrics_path__
    replacement: /metrics/cadvisor
    action: replace
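
The last relabel rule above is what points the scrape at cAdvisor: it has no source_labels, so the regex (.*) matches the empty string and __metrics_path__ is always overwritten. A minimal Go sketch of that behavior follows. The helper applyReplace and the sample address 10.0.0.1:10250 are assumptions made for illustration, and the logic is a simplification of a "replace" action, not Prometheus's actual relabel code.

```go
package main

import (
	"fmt"
	"regexp"
)

// applyReplace mimics a simplified "replace" relabel action: if the fully
// anchored regex matches the source value, the target label is set to the
// expanded replacement. Hypothetical helper, not Prometheus code.
func applyReplace(labels map[string]string, source, pattern, target, replacement string) {
	re := regexp.MustCompile("^(?:" + pattern + ")$")
	idx := re.FindStringSubmatchIndex(source)
	if idx == nil {
		return // no match: labels are left untouched
	}
	labels[target] = string(re.ExpandString(nil, replacement, source, idx))
}

func main() {
	// Assumed starting labels for one discovered node target.
	labels := map[string]string{
		"__address__":      "10.0.0.1:10250",
		"__scheme__":       "https",
		"__metrics_path__": "/metrics",
	}
	// No source_labels are configured, so the source value is the empty string.
	applyReplace(labels, "", "(.*)", "__metrics_path__", "/metrics/cadvisor")
	fmt.Println(labels["__scheme__"] + "://" + labels["__address__"] + labels["__metrics_path__"])
	// prints https://10.0.0.1:10250/metrics/cadvisor
}
```

Anchoring the pattern as ^(?:...)$ means a rule fires only when the whole source value matches, which is why (.*) is the usual way to write an unconditional rule.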

cAdvisor architecture diagram

[Figure: cAdvisor architecture diagram]

POD in cAdvisor


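  • While reading the cAdvisor code I noticed containers with container_name=="POD". It turns out these are the pause containers (the pod sandbox) that k8s creates for every pod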

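Next, let's trace how the labels get attached, using pod CPU usage as the example

Final labels exposed by the kubelet

  • The corresponding cAdvisor metric is container_cpu_usage_seconds_total, and the query result carries the following labels
  • The obvious question is how the labels that identify an app or service (pod, pod_name, container, container_name) get attached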

[Screenshot: query result for container_cpu_usage_seconds_total]

Labels seen when querying the cAdvisor integrated into the kubelet

  • curl localhost:4194/metrics


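  • Apart from cpu, which is specific to the container_cpu_usage_seconds_total metric, the output also carries the labels id, name, namespace, pod_name, container_name, and image
  • These are cAdvisor's common labels and are attached to every metric
  • In fact, bare (standalone) cAdvisor attaches only three labels: id, image, and name. That is the original cAdvisor labeling
  • namespace, pod_name, container_name and similar attributes exist only in k8s. cAdvisor itself has no knowledge of pods, so these labels must be injected by k8s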
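
To make that split concrete, here is a small Go sketch that parses one sample line in the Prometheus text exposition format and classifies each label by origin, following the bullets above. The sample line and its values are made up for illustration, the parser ignores escaped quotes, and the helper names parseLabels and origin are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
)

// labelRe matches one key="value" pair inside the braces of a sample line.
var labelRe = regexp.MustCompile(`(\w+)="([^"]*)"`)

// parseLabels extracts the label pairs from one exposition-format line.
// Simplification: escaped quotes inside label values are not handled.
func parseLabels(line string) map[string]string {
	out := map[string]string{}
	for _, m := range labelRe.FindAllStringSubmatch(line, -1) {
		out[m[1]] = m[2]
	}
	return out
}

// origin classifies a label according to the bullets above.
func origin(label string) string {
	switch label {
	case "id", "image", "name":
		return "cadvisor"
	case "namespace", "pod_name", "container_name":
		return "kubelet"
	default:
		return "metric-specific"
	}
}

func main() {
	// Made-up sample line; real values differ per cluster.
	line := `container_cpu_usage_seconds_total{container_name="app",cpu="total",` +
		`id="/kubepods/pod1/c1",image="nginx:1.17",name="k8s_app_web-0_default",` +
		`namespace="default",pod_name="web-0"} 12.5`

	labels := parseLabels(line)
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random, so sort for stable output
	for _, k := range keys {
		fmt.Printf("%-15s %-16s %s\n", k, origin(k), labels[k])
	}
}
```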


The cAdvisor built into the kubelet uses a custom PrometheusLabelsFunc

Taking k8s 1.15 as the example
The code is in: E:\go_path\src\github.com\kubernetes\kubernetes\pkg\kubelet\server\server.go

func containerPrometheusLabelsFunc(s stats.Provider) metrics.ContainerLabelsFunc {
    // containerPrometheusLabels maps cAdvisor labels to prometheus labels.
    return func(c *cadvisorapi.ContainerInfo) map[string]string {
        // Prometheus requires that all metrics in the same family have the same labels,
        // so we arrange to supply blank strings for missing labels
        var name, image, podName, namespace, containerName string
        if len(c.Aliases) > 0 {
            name = c.Aliases[0]
        }
        image = c.Spec.Image
        if v, ok := c.Spec.Labels[kubelettypes.KubernetesPodNameLabel]; ok {
            podName = v
        }
        if v, ok := c.Spec.Labels[kubelettypes.KubernetesPodNamespaceLabel]; ok {
            namespace = v
        }
        if v, ok := c.Spec.Labels[kubelettypes.KubernetesContainerNameLabel]; ok {
            containerName = v
        }
        // Associate pod cgroup with pod so we have an accurate accounting of sandbox
        if podName == "" && namespace == "" {
            if pod, found := s.GetPodByCgroupfs(c.Name); found {
                podName = pod.Name
                namespace = pod.Namespace
            }
        }
        set := map[string]string{
            metrics.LabelID:    c.Name,
            metrics.LabelName:  name,
            metrics.LabelImage: image,
            "pod_name":         podName,
            "pod":              podName,
            "namespace":        namespace,
            "container_name":   containerName,
            "container":        containerName,
        }
        return set
    }
}
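
The core of that function can be exercised in isolation. The sketch below is a simplified stand-in, not the kubelet's code: containerInfo is a hypothetical trimmed-down replacement for cadvisorapi.ContainerInfo, the cgroup-based pod lookup is left out, and the sample containers are invented. The three label keys are the values behind the kubelettypes constants used above.

```go
package main

import "fmt"

// Label keys the kubelet sets on the containers it creates.
const (
	podNameLabel       = "io.kubernetes.pod.name"
	podNamespaceLabel  = "io.kubernetes.pod.namespace"
	containerNameLabel = "io.kubernetes.container.name"
)

// containerInfo is a hypothetical, trimmed-down stand-in for
// cadvisorapi.ContainerInfo, holding only the fields used here.
type containerInfo struct {
	Name    string
	Aliases []string
	Image   string
	Labels  map[string]string
}

// promLabels mirrors the mapping in the listing above, without the
// cgroupfs fallback. Missing keys yield "", so every series carries
// the same label set.
func promLabels(c containerInfo) map[string]string {
	name := ""
	if len(c.Aliases) > 0 {
		name = c.Aliases[0]
	}
	podName := c.Labels[podNameLabel] // reading a nil map is safe in Go
	containerName := c.Labels[containerNameLabel]
	return map[string]string{
		"id":             c.Name,
		"name":           name,
		"image":          c.Image,
		"pod_name":       podName,
		"pod":            podName,
		"namespace":      c.Labels[podNamespaceLabel],
		"container_name": containerName,
		"container":      containerName,
	}
}

func main() {
	// Invented pause container: its container name label is "POD".
	pause := containerInfo{
		Name:    "/kubepods/pod1/abc",
		Aliases: []string{"k8s_POD_web-0_default"},
		Labels: map[string]string{
			podNameLabel:       "web-0",
			podNamespaceLabel:  "default",
			containerNameLabel: "POD",
		},
	}
	// Invented non-k8s cgroup: no Kubernetes labels at all.
	raw := containerInfo{Name: "/system.slice/docker.service"}

	fmt.Println(promLabels(pause)["container_name"], promLabels(pause)["pod"])
	fmt.Println(len(promLabels(raw)), promLabels(raw)["pod"] == "")
}
```

The second container shows why blank strings matter: a cgroup that is not part of any pod still gets all eight labels, just with empty values.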

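Changes to pod and pod_name between k8s 1.15 and 1.16

  • Observation shows that 1.15 exposes both pod and pod_name, while 1.16 exposes only pod (container_name was likewise dropped in favor of container)
  • The reason is that k8s 1.16 changed the labels to unify them between the cAdvisor metrics and kube-state-metrics, via this PR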
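
For tooling that has to work against both versions, one option is to normalize label sets before using them. The Go sketch below is only an illustration of that idea: normalizePodLabels is a hypothetical helper, and treating container_name the same way as pod_name is an assumption based on the 1.15 listing above, which emits both pairs.

```go
package main

import "fmt"

// normalizePodLabels returns a copy of in that uses only the new label
// names ("pod", "container"). A legacy name fills in the new one when the
// new one is missing or empty, and is then dropped. Hypothetical helper.
func normalizePodLabels(in map[string]string) map[string]string {
	out := make(map[string]string, len(in))
	for k, v := range in {
		out[k] = v
	}
	legacy := map[string]string{"pod_name": "pod", "container_name": "container"}
	for oldName, newName := range legacy {
		if v, ok := out[oldName]; ok {
			if out[newName] == "" {
				out[newName] = v
			}
			delete(out, oldName)
		}
	}
	return out
}

func main() {
	// Invented label sets as they might look on 1.15 and on 1.16.
	v115 := map[string]string{"pod": "web-0", "pod_name": "web-0", "container": "app", "container_name": "app"}
	v116 := map[string]string{"pod": "web-0", "container": "app"}

	a, b := normalizePodLabels(v115), normalizePodLabels(v116)
	fmt.Println(len(a), a["pod"], a["container"])
	fmt.Println(len(b), b["pod"], b["container"])
}
```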


At kubelet startup, --node-labels can be used to inject node-level labels

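--node-labels=os.name=xxxx,os.version=xxxx,os.architecture=amd64
  • Service discovery converts these label names to the Prometheus naming style xxx_xxx (characters such as "." become "_") and exposes them as __meta_kubernetes_node_label_*
  • The labelmap rule in the scrape config then appends them to every series as os_version=xxx,os_architecture=amd64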
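
The two steps can be sketched in Go as follows. This is an illustration under assumptions, not Prometheus code: parseNodeLabels and toTargetLabels are hypothetical helpers, the flag values linux and 7.6 are invented, and the sanitizing rule (every character outside [a-zA-Z0-9_] becomes "_") is a simplified rendering of how label names are made valid.

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

var (
	// Characters that are not allowed in a Prometheus label name.
	invalidChar = regexp.MustCompile(`[^a-zA-Z0-9_]`)
	// The labelmap regex from the scrape config, fully anchored.
	labelmapRe = regexp.MustCompile(`^(?:__meta_kubernetes_node_label_(.+))$`)
)

// parseNodeLabels splits a --node-labels value into key/value pairs,
// skipping empty or malformed entries such as a trailing comma.
func parseNodeLabels(flag string) map[string]string {
	out := map[string]string{}
	for _, kv := range strings.Split(flag, ",") {
		parts := strings.SplitN(kv, "=", 2)
		if len(parts) == 2 && parts[0] != "" {
			out[parts[0]] = parts[1]
		}
	}
	return out
}

// toTargetLabels applies both steps: build the sanitized meta label name,
// then let the labelmap rule copy it to a label named after the capture group.
func toTargetLabels(nodeLabels map[string]string) map[string]string {
	out := map[string]string{}
	for k, v := range nodeLabels {
		meta := "__meta_kubernetes_node_label_" + invalidChar.ReplaceAllString(k, "_")
		if m := labelmapRe.FindStringSubmatch(meta); m != nil {
			out[m[1]] = v
		}
	}
	return out
}

func main() {
	labels := toTargetLabels(parseNodeLabels("os.name=linux,os.version=7.6,os.architecture=amd64"))
	keys := make([]string, 0, len(labels))
	for k := range labels {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Printf("%s=%s\n", k, labels[k])
	}
}
```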

ning1875