Why is Prometheus said to be born for cloud-native monitoring? Four major adaptations of Prometheus for k8s monitoring

foreword

Why prometheus was born for cloud-native monitoring

You have probably heard more than once that Prometheus was born for cloud-native monitoring. Have you ever wondered what that actually means?
In a cloud-native setup, containers and k8s serve as the basic runtime environment.
A monolithic architecture is split into many scattered microservices, and those microservices change and scale out and in very frequently, so the set of scrape targets changes frequently as well. This places three requirements on a time-series monitoring system:
- It must collect from a massive number of pods and containers running on many hosts
- It must detect changes to those targets promptly
- It must be able to build a complete k8s monitoring ecosystem
Put bluntly, monitoring in a cloud-native environment has become harder and more complex. What is needed is a system designed for cloud-native monitoring scenarios from the start, and Prometheus was designed that way.

What improvements has Prometheus made to be a good fit for k8s?

What metrics should be paid attention to in k8s

The following table briefly lists the four major categories of metrics that we need to pay attention to in k8s:

Metric type	Source	Application examples	Discovery type
Container basic resource metrics	cadvisor metrics interface built into the kubelet	View container CPU and memory utilization, etc.	k8s_sd node level, accessing node_ip directly
k8s resource metrics	kube-state-metrics (ksm for short)	View pod status, such as the number of pods in the waiting state broken down by reason, or the distribution of pods across nodes by namespace. For details, see the article on the many components of k8s as seen from kube-state-metrics container monitoring.	Access the domain name through coredns
k8s service component metrics	Metrics interface of each service component	View request latency of apiserver, scheduler, coredns, etc.	k8s_sd endpoint level
Business instrumentation metrics of applications deployed in pods	pod metrics interface	Scenarios driven by business metrics	k8s_sd pod level, accessing the metrics path on the pod IP

Adaptation 1. SDK + self-exposed metrics + pull model: building the entire k8s monitoring ecosystem

In the table above we saw the four major categories of metrics that need attention in k8s.
In practice we can divide k8s users into two roles, cluster administrators and ordinary users, and each role focuses on different metrics.

Isn't it tiring to collect everything yourself?

With so many requirements, if the monitoring system had to collect everything by itself, it would be exhausting, and such a complete ecosystem could never be built.

The secret

Prometheus collects with a pull model. Each monitored source only needs to expose its own metrics on a local HTTP port, and Prometheus scrapes that endpoint to collect them.
The same holds for Prometheus in k8s: components expose their own metrics. For example, the kubelet's built-in cadvisor metrics mentioned under container basic resource metrics are exposed at /metrics/cadvisor on port 10250.
Prometheus discovers these metric sources through k8s service discovery and then scrapes them.
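
To make the self-exposure half of the pull model concrete, here is a minimal sketch in Python using only the standard library. It is not how any k8s component is actually implemented, and real applications would normally use an official client SDK. The metric name `demo_requests_total`, the port 9102, and all helper names are assumptions chosen for illustration. The sketch renders a counter in the Prometheus text exposition format and serves it at /metrics so that a scraper can pull it.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-process counters; a real application would update these
# from its business logic (or use an official client SDK instead).
COUNTERS = {"demo_requests_total": 0}


def render_metrics(counters):
    """Render counters in the Prometheus text exposition format."""
    lines = []
    for name, value in sorted(counters.items()):
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the metrics path is served; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(COUNTERS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=9102):
    """Block and serve /metrics; port 9102 is an assumed example value."""
    HTTPServer(("0.0.0.0", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint. The application never pushes anything; it only answers when the monitoring side comes to pull.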

Adaptation 2. k8s service discovery

Example

1. Service discovery at the endpoints level: for example, when collecting apiserver, kube-controller-manager, etc.

kubernetes_sd_configs:
  - role: endpoints

2. Service discovery at the node level: for example, when collecting cadvisor and the kubelet's own metrics

kubernetes_sd_configs:
  - role: node

3. Service discovery at the pod level: for example, when collecting custom metrics instrumented in pods

kubernetes_sd_configs:
  - role: pod
    follow_redirects: true

Interpretation: real-time updates through watch

Discovering resource changes in real time through watch addresses one of the cloud-native monitoring challenges raised at the beginning: changes in scrape targets must be detected promptly.
In addition, because k8s provides a large flat layer-2 network environment, Prometheus can directly reach the endpoints, nodes, and pods it discovers.
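
The effect of watch on the target list can be sketched as follows. This is a simplified model in Python, not Prometheus's actual discovery code. The event types ADDED, MODIFIED, and DELETED are the ones a Kubernetes watch stream delivers; the object fields `name` and `address`, and the pod names and IPs, are hypothetical stand-ins for illustration.

```python
def apply_watch_event(targets, event):
    """Apply one watch event to the in-memory target map and return it.

    `event` mirrors the shape of a Kubernetes watch event: a type of
    ADDED, MODIFIED, or DELETED plus the object. The object fields used
    here (`name`, `address`) are simplified, hypothetical stand-ins.
    """
    obj = event["object"]
    if event["type"] in ("ADDED", "MODIFIED"):
        targets[obj["name"]] = obj["address"]
    elif event["type"] == "DELETED":
        targets.pop(obj["name"], None)
    return targets


# A pod is created, a second one is created, the first is rescheduled
# to a new IP, and the second is deleted.
events = [
    {"type": "ADDED", "object": {"name": "pod-a", "address": "10.244.1.5:9102"}},
    {"type": "ADDED", "object": {"name": "pod-b", "address": "10.244.2.7:9102"}},
    {"type": "MODIFIED", "object": {"name": "pod-a", "address": "10.244.3.9:9102"}},
    {"type": "DELETED", "object": {"name": "pod-b", "address": "10.244.2.7:9102"}},
]

targets = {}
for event in events:
    apply_watch_event(targets, event)
```

After the four events, only pod-a remains, at its new address. The point is that the target list follows the cluster without any polling of a static inventory.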

Adaptation 3. Collection authentication: token & certificate

Many interfaces in k8s require authentication, and some even require mutual TLS authentication

Many interfaces in k8s are protected by access authentication. For example, if we directly access the /metrics/cadvisor interface of the kubelet on a k8s node, it returns Unauthorized, as shown below:

 [root@Kubernetes-node01 logs]# curl -k https://localhost:10250/metrics/cadvisor
Unauthorized

Prometheus faces the same authentication problem when collecting cadvisor metrics.

Solution

The Prometheus developers solved this by letting the scrape configuration carry the relevant token and certificates. For example, the following configuration specifies a token file and a CA certificate file at the same time.

bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
tls_config:
  ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
  insecure_skip_verify: true # skips verification of the server certificate
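
To show what these settings amount to on the wire, here is a rough Python sketch using only the standard library. It is an illustration of the idea, not Prometheus's implementation: the token is sent as an `Authorization: Bearer` header, and the CA file is used to verify the server unless verification is skipped. The function names are hypothetical, and `scrape` only works inside a pod where the service account files are mounted.

```python
import ssl
import urllib.request

SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def build_scrape_request(url, token):
    """Build a GET request carrying the service account token as a bearer token."""
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token.strip()}"}
    )


def build_tls_context(ca_file=f"{SA_DIR}/ca.crt", insecure_skip_verify=False):
    """Trust the cluster CA, or skip verification like insecure_skip_verify."""
    if insecure_skip_verify:
        ctx = ssl.create_default_context()
        ctx.check_hostname = False  # must be disabled before CERT_NONE
        ctx.verify_mode = ssl.CERT_NONE
        return ctx
    return ssl.create_default_context(cafile=ca_file)


def scrape(url):
    """Read the mounted token and fetch the metrics page (inside a pod only)."""
    with open(f"{SA_DIR}/token") as f:
        token = f.read()
    req = build_scrape_request(url, token)
    with urllib.request.urlopen(req, context=build_tls_context(), timeout=10) as resp:
        return resp.read().decode("utf-8")
```

Inside a suitably authorized pod, `scrape("https://localhost:10250/metrics/cadvisor")` corresponds to the curl call shown earlier, with the token and CA added.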

Anyone who uses k8s knows that it solves token and certificate mounting through service accounts and ClusterRoleBindings.
Prometheus takes advantage of this as well. When the Prometheus container is created, it is associated with a service account. An example of the ClusterRole, ServiceAccount, and ClusterRoleBinding configuration is as follows:

apiVersion: rbac.authorization.k8s.io/v1 # API version
kind: ClusterRole # resource kind
metadata:
  name: prometheus
rules:
- apiGroups: [""]
  resources: # resources this role may read
  - nodes
  - nodes/proxy
  - services
  - endpoints
  - pods
  verbs: ["get", "list", "watch"]
- apiGroups:
  - extensions # on Kubernetes 1.22 and later, ingresses are served from networking.k8s.io
  resources:
  - ingresses
  verbs: ["get", "list", "watch"]
- nonResourceURLs: ["/metrics"]
  verbs: ["get"]
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prometheus # custom name
  namespace: kube-system # namespace
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: prometheus
roleRef: # the role to bind
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin # note: this binds the built-in cluster-admin role, not the prometheus ClusterRole defined above
subjects: # who receives the role
- kind: ServiceAccount
  name: prometheus
  namespace: kube-system

When we create the Prometheus StatefulSet, we need to set serviceAccountName: prometheus in the Prometheus yaml.
Once this is configured, Kubernetes mounts the corresponding token and certificate files into the pod.
When we exec into the Prometheus pod, we can see the relevant files under /var/run/secrets/kubernetes.io/serviceaccount/ , as shown below:

 / # ls /var/run/secrets/kubernetes.io/serviceaccount/ -l
total 0
lrwxrwxrwx 1 root root 13 Jan 7 20:54 ca.crt -> ..data/ca.crt
lrwxrwxrwx 1 root root 16 Jan 7 20:54 namespace -> ..data/namespace
lrwxrwxrwx 1 root root 12 Jan 7 20:54 token -> ..data/token
/ # df -h |grep service
tmpfs 7.8G 12.0K 7.8G 0% /var/run/secrets/kubernetes.io/serviceaccount
/ #

When collecting etcd metrics, you need to create a secret that holds the relevant certificates:

 kubectl create secret generic etcd-certs --from-file=/etc/kubernetes/pki/etcd/healthcheck-client.crt --from-file=/etc/kubernetes/pki/etcd/healthcheck-client.key --from-file=/etc/kubernetes/pki/etcd/ca.crt -n kube-system

Adaptation 4. Powerful relabel capability for label truncation, transformation, and static sharding

prometheus relabel description

Documentation address https://prometheus.io/docs/prometheus/latest/configuration/configuration/#relabel_config

Application 1: labelmap trims the key names of service discovery labels when collecting cadvisor metrics

When collecting cadvisor, you can see that service discovery has added many labels starting with __meta_kubernetes_node_label_

These label names are too long and need to be shortened. We use the following relabel configuration:

relabel_configs:
- separator: ;
  regex: __meta_kubernetes_node_label_(.+)
  replacement: $1
  action: labelmap

Take the label __meta_kubernetes_node_label_kubernetes_io_os="linux" as an example. The above configuration matches keys that start with __meta_kubernetes_node_label_ and keeps only the part after the prefix, so the label you see in the end is kubernetes_io_os="linux" .
labelmap means that the value of each matched label is copied to a label whose name is given by the replacement.
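
The behavior of labelmap can be imitated in a few lines of Python. This is a sketch of the semantics described above, not Prometheus's code. The helper name `labelmap` and the node name `node01` are assumptions; Python's `re` module writes the capture group as `\1` where the Prometheus configuration writes `$1`.

```python
import re


def labelmap(labels, regex, replacement=r"\1"):
    """Copy every label whose name fully matches `regex` to a new name.

    Prometheus anchors relabel regexes at both ends, so fullmatch is used.
    The Prometheus-style `$1` is written as a backslash group in Python.
    """
    pattern = re.compile(regex)
    result = dict(labels)
    for name, value in labels.items():
        match = pattern.fullmatch(name)
        if match:
            result[match.expand(replacement)] = value
    return result


discovered = {
    "__meta_kubernetes_node_label_kubernetes_io_os": "linux",
    "__meta_kubernetes_node_name": "node01",  # hypothetical node name
}
mapped = labelmap(discovered, r"__meta_kubernetes_node_label_(.+)")

# Labels starting with "__" are dropped once relabeling is finished.
final = {k: v for k, v in mapped.items() if not k.startswith("__")}
```

Here `final` holds only `kubernetes_io_os="linux"`: the long discovery label was copied under the short name, and the original `__meta_` labels disappear after relabeling.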

Application 2: replace assigns values to labels when collecting pod custom metrics

To have pod custom metrics collected, we define three annotations starting with prometheus.io in spec.template.metadata.annotations of the pod yaml. They respectively indicate whether Prometheus should scrape the pod, the port on which metrics are exposed, and the HTTP path of the metrics. The detailed configuration is as follows:
```
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/port: '9102'
        prometheus.io/path: 'metrics'
```

When collecting pod custom metrics, the following relabel configuration is used:

 relabel_configs:
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
  separator: ;
  regex: "true"
  replacement: $1
  action: keep
- source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
  separator: ;
  regex: (.+)
  target_label: __metrics_path__
  replacement: $1
  action: replace
- source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
  separator: ;
  regex: ([^:]+)(?::\d+)?;(\d+)
  target_label: __address__
  replacement: $1:$2
  action: replace

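The first rule keeps only pods whose prometheus.io/scrape annotation is "true".
The second rule assigns the value of __meta_kubernetes_pod_annotation_prometheus_io_path to __metrics_path__.
The third rule replaces the port in __address__ with the value of __meta_kubernetes_pod_annotation_prometheus_io_port.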
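
The port rewrite is the least obvious of these rules, so here is a small Python sketch of how a replace action behaves. It follows the semantics described above (join the source labels with the separator, match the anchored regex, write the expanded replacement to the target label) and is not Prometheus's code. The helper name and the pod IP are hypothetical; `$1:$2` from the configuration is written as `\1:\2` in Python.

```python
import re


def relabel_replace(labels, source_labels, regex, target_label, replacement,
                    separator=";"):
    """Sketch of the `replace` action: join sources, match, write target."""
    joined = separator.join(labels.get(name, "") for name in source_labels)
    match = re.fullmatch(regex, joined)
    if match is None:
        return dict(labels)  # no match: the target label is left untouched
    result = dict(labels)
    result[target_label] = match.expand(replacement)
    return result


ADDRESS_RULE = dict(
    source_labels=["__address__",
                   "__meta_kubernetes_pod_annotation_prometheus_io_port"],
    regex=r"([^:]+)(?::\d+)?;(\d+)",
    target_label="__address__",
    replacement=r"\1:\2",
)

annotated = {
    "__address__": "10.244.1.5:8080",  # hypothetical pod IP and container port
    "__meta_kubernetes_pod_annotation_prometheus_io_port": "9102",
}
rewritten = relabel_replace(annotated, **ADDRESS_RULE)

# Without the port annotation the joined value is "10.244.1.5:8080;",
# which the regex does not match, so __address__ stays as discovered.
unannotated = relabel_replace({"__address__": "10.244.1.5:8080"}, **ADDRESS_RULE)
```

In this sketch `rewritten["__address__"]` becomes `10.244.1.5:9102`, while the pod without the annotation keeps `10.244.1.5:8080`.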

Application 3: keep for filtering, when collecting service component endpoints

The endpoints resource is the list of IP addresses and ports that back a service.
Discovering endpoints through k8s service discovery returns a large number of them, so we need to filter down to the apiserver.

kubernetes_sd_configs:
- role: endpoints

The filter requires that the label __meta_kubernetes_namespace matches default, __meta_kubernetes_service_name matches kubernetes, and __meta_kubernetes_endpoint_port_name matches https; the action is keep:

relabel_configs:
- source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
  separator: ;
  regex: default;kubernetes;https
  replacement: $1
  action: keep
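
As a sketch of what keep does with this rule, the following Python joins the three source labels with `;` and drops every target whose joined value does not fully match the regex. It models the semantics only; the helper name, the addresses, and the kube-dns example target are assumptions for illustration.

```python
import re

SOURCE_LABELS = [
    "__meta_kubernetes_namespace",
    "__meta_kubernetes_service_name",
    "__meta_kubernetes_endpoint_port_name",
]


def keep(targets, source_labels, regex, separator=";"):
    """Sketch of the `keep` action: drop targets whose joined values do not match."""
    pattern = re.compile(regex)
    kept = []
    for labels in targets:
        joined = separator.join(labels.get(name, "") for name in source_labels)
        if pattern.fullmatch(joined):
            kept.append(labels)
    return kept


# Hypothetical discovered endpoints; only the first is the apiserver.
discovered = [
    {"__address__": "192.168.0.10:6443",
     "__meta_kubernetes_namespace": "default",
     "__meta_kubernetes_service_name": "kubernetes",
     "__meta_kubernetes_endpoint_port_name": "https"},
    {"__address__": "10.244.0.3:9153",
     "__meta_kubernetes_namespace": "kube-system",
     "__meta_kubernetes_service_name": "kube-dns",
     "__meta_kubernetes_endpoint_port_name": "metrics"},
]

apiserver_targets = keep(discovered, SOURCE_LABELS, r"default;kubernetes;https")
```

Only the endpoint of the kubernetes service in the default namespace survives the filter.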

k8s creates the apiserver service, named kubernetes, in the default namespace:

 $ kubectl get svc -A |grep  443
default         kubernetes                                     ClusterIP   10.96.0.1       <none>        443/TCP                  9d

The discovered endpoint is converted into the scrape URL https://masterip:6443/metrics
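
How the final URL is assembled can be sketched in Python as below. The scrape URL is built from the target's `__scheme__`, `__address__`, and `__metrics_path__` labels, with http and /metrics as the defaults when a job does not set them. The function name and the IP address are hypothetical; 192.168.0.10 stands in for masterip.

```python
def scrape_url(labels):
    """Assemble the scrape URL from the special target labels."""
    scheme = labels.get("__scheme__", "http")
    path = labels.get("__metrics_path__", "/metrics")
    return f"{scheme}://{labels['__address__']}{path}"


apiserver = {
    "__scheme__": "https",
    "__address__": "192.168.0.10:6443",  # hypothetical master IP
    "__metrics_path__": "/metrics",
}
url = scrape_url(apiserver)
```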

The adaptation work done by prometheus for k8s monitoring

Adaptation	Description	Example
Self-exposed component metrics	All components expose their own metrics on their respective service ports, and Prometheus collects them with the pull model	apiserver:6443/metrics
k8s service discovery	Discover resource changes in real time through watch	`kubernetes_sd_configs:- role: node`
Authentication	The k8s component interfaces all require authentication, so a collector for k8s must support configurable authentication	Supports configuring a token and TLS certificates
Relabel capability	Filter and rewrite service discovery targets	`labelmap` strips the long prefix from service discovery labels; `replace` rewrites label values; `keep` filters targets; `hashmod` does static sharding
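
The table mentions `hashmod` for static sharding without showing it, so here is a rough Python sketch of the idea: hash a label value (here the target address), take it modulo the number of Prometheus instances, and let each instance keep only the targets that land on its own shard number. As far as I know Prometheus hashes the joined source label values with MD5; the exact byte handling below is an assumption and may not reproduce Prometheus's shard numbers. The helper names and addresses are hypothetical.

```python
import hashlib


def hashmod(value, modulus):
    """Hash a label value to a shard number in [0, modulus).

    Assumption: MD5 digest, low 8 bytes read as a big-endian integer.
    """
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[8:], "big") % modulus


def targets_for_shard(addresses, shard, total_shards):
    """Return the addresses that a given Prometheus shard would keep."""
    return [a for a in addresses if hashmod(a, total_shards) == shard]


# Hypothetical pod addresses split statically across three Prometheus instances.
addresses = [f"10.244.1.{i}:9102" for i in range(1, 21)]
shards = [targets_for_shard(addresses, i, 3) for i in range(3)]
```

Each address lands in exactly one of the three lists, and the assignment stays the same as long as the address and the shard count do not change, which is why this is called static sharding.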

ning1875
