Problem Description

  • Suppose we build a pipeline metric, pipeline_step_duration, with one label named step
  • The steps in each pipeline run may differ

    # e.g. the 1st run of pipeline a contains the steps clone and build
    pipeline_step_duration{step="clone"}
    pipeline_step_duration{step="build"}
    # the 2nd run contains the steps build and push
    pipeline_step_duration{step="build"}
    pipeline_step_duration{step="push"}
  • So here is the question: after the second run, should the pipeline_step_duration{step="clone"} series left over from the first run be deleted?
  • In this scenario it should be, because the latest run no longer contains the clone step

The problem can be summarized as: label values that were collected earlier no longer exist, and their series must be cleaned up promptly. The question is how to clean them up.

Before discussing this, run an experiment: compare how two common self-instrumentation approaches handle the removal of inactive metrics

Experimental method: Prometheus client_golang SDK

  • Define one metric, rand_metrics
  • It has a label rand_key whose value is different every time; observe the result of requesting the /metrics endpoint

     var (
      T1 = prometheus.NewGaugeVec(prometheus.GaugeOpts{
          Name: "rand_metrics",
          Help: "rand_metrics",
      }, []string{"rand_key"})
    )

Implementation 01: instrument directly in business code, without implementing the Collector interface

  • The code is as follows. It simulates an extreme case by generating a random key and value every 0.1 seconds and setting the metric

     package main
    
    import (
      "fmt"
      "github.com/prometheus/client_golang/prometheus"
      "github.com/prometheus/client_golang/prometheus/promhttp"
      "math/rand"
      "net/http"
      "time"
    )
    
    var (
      T1 = prometheus.NewGaugeVec(prometheus.GaugeOpts{
          Name: "rand_metrics",
          Help: "rand_metrics",
      }, []string{"rand_key"})
    )
    
    func init() {
      prometheus.DefaultRegisterer.MustRegister(T1)
    }
    func RandStr(length int) string {
      str := "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
      bytes := []byte(str)
      result := []byte{}
      rand.Seed(time.Now().UnixNano() + int64(rand.Intn(100)))
      for i := 0; i < length; i++ {
          result = append(result, bytes[rand.Intn(len(bytes))])
      }
      return string(result)
    }
    
    func push() {
      for {
          randKey := RandStr(10)
          rand.Seed(time.Now().UnixNano() + int64(rand.Intn(100)))
          T1.With(prometheus.Labels{"rand_key": randKey}).Set(rand.Float64())
          time.Sleep(100 * time.Millisecond)
    
      }
    }
    
    func main() {
      go push()
      addr := ":8081"
      http.Handle("/metrics", promhttp.Handler())
      srv := http.Server{Addr: addr}
      err := srv.ListenAndServe()
      fmt.Println(err)
    }
  • After starting the service, requesting the :8081/metrics endpoint shows that expired rand_key series remain and are never cleaned up

     # HELP rand_metrics rand_metrics
    # TYPE rand_metrics gauge
    rand_metrics{rand_key="00DsYGkd6x"} 0.02229735291486387
    rand_metrics{rand_key="017UBn8S2T"} 0.7192676436571013
    rand_metrics{rand_key="01Ar4ca3i1"} 0.24131184816722678
    rand_metrics{rand_key="02Ay5kqsDH"} 0.11462075954697458
    rand_metrics{rand_key="02JZNZvMng"} 0.9874169937518104
    rand_metrics{rand_key="02arsU5qNT"} 0.8552103362564516
    rand_metrics{rand_key="02nMy3thfh"} 0.039571420204118024
    rand_metrics{rand_key="032cyHjRhP"} 0.14576779289125183
    rand_metrics{rand_key="03DPDckbfs"} 0.6106184905871918
    rand_metrics{rand_key="03lbtLwFUO"} 0.936911945555629
    rand_metrics{rand_key="03wqYiguP2"} 0.20167059771916385
    rand_metrics{rand_key="04uG2s3X0C"} 0.3324314184499403

Implementation 02: implement the Collector interface

  • Implement the Collector interface from the Prometheus SDK, that is, attach Collect and Describe methods to a struct
  • In Collect, set the label values and assign the metric value
  • In Describe, send the desc

     package main
    
    import (
      "fmt"
      "github.com/prometheus/client_golang/prometheus"
      "github.com/prometheus/client_golang/prometheus/promhttp"
      "log"
      "math/rand"
      "net/http"
      "time"
    )
    
    var (
      T1 = prometheus.NewDesc(
          "rand_metrics",
          "rand_metrics",
          []string{"rand_key"},
          nil)
    )
    
    type MyCollector struct {
      Name string
    }
    
    func (mc *MyCollector) Collect(ch chan<- prometheus.Metric) {
      log.Printf("MyCollector.collect.called")
      ch <- prometheus.MustNewConstMetric(T1,
          prometheus.GaugeValue, rand.Float64(), RandStr(10))
    }
    func (mc *MyCollector) Describe(ch chan<- *prometheus.Desc) {
      log.Printf("MyCollector.Describe.called")
      ch <- T1
    }
    
    func RandStr(length int) string {
      str := "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
      bytes := []byte(str)
      result := []byte{}
      rand.Seed(time.Now().UnixNano() + int64(rand.Intn(100)))
      for i := 0; i < length; i++ {
          result = append(result, bytes[rand.Intn(len(bytes))])
      }
      return string(result)
    }
    
    func main() {
      //go push()
      mc := &MyCollector{Name: "abc"}
      prometheus.MustRegister(mc)
      addr := ":8082"
      http.Handle("/metrics", promhttp.Handler())
      srv := http.Server{Addr: addr}
      err := srv.ListenAndServe()
      fmt.Println(err)
    }
  • Testing the result: requesting the :8082/metrics endpoint shows that rand_metrics always has exactly 1 series

     # HELP rand_metrics rand_metrics
    # TYPE rand_metrics gauge
    rand_metrics{rand_key="e1JU185kE4"} 0.12268247569586412
  • The log also shows that MyCollector.collect.called is printed every time the /metrics endpoint is requested, so Collect runs on each request

     2022/06/21 11:46:40 MyCollector.Describe.called
    2022/06/21 11:46:44 MyCollector.collect.called
    2022/06/21 11:46:47 MyCollector.collect.called
    2022/06/21 11:46:47 MyCollector.collect.called
    2022/06/21 11:46:47 MyCollector.collect.called
    2022/06/21 11:46:47 MyCollector.collect.called

Summary of Observations

  • Implementing the Collector interface meets the need to clean up expired metrics; the instrumentation code is triggered by each request to the /metrics endpoint
  • Not implementing the Collector interface cannot meet that need; series keep accumulating as the business code keeps instrumenting

Explaining the cause from the source code

01 Both approaches serve metrics through a web request, so start with the /metrics endpoint

  • The entry point is http.Handle("/metrics", promhttp.Handler())
  • Tracing it leads to D:\go_path\pkg\mod\github.com\prometheus\client_golang@v1.12.2\prometheus\promhttp\http.go
  • The main logic is:

    • Call the Gather method of reg to get the slice of MetricFamily
    • Then encode them and write them to the HTTP response
  • The pseudocode is as follows

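    func HandlerFor(reg prometheus.Gatherer, opts HandlerOpts) http.Handler {
      h := http.HandlerFunc(func(rsp http.ResponseWriter, req *http.Request) {
          // Gather runs on every request to /metrics.
          mfs, err := reg.Gather()
          // ... error handling and encoder setup omitted ...
          for _, mf := range mfs {
              if handleError(enc.Encode(mf)) {
                  return
              }
          }
      })
      return h
    }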

reg.Gather: iterate over the collectors registered in reg and call their Collect methods

  • First, call their Collect methods to obtain the metrics

     collectWorker := func() {
          for {
              select {
              case collector := <-checkedCollectors:
                  collector.Collect(checkedMetricChan)
              case collector := <-uncheckedCollectors:
                  collector.Collect(uncheckedMetricChan)
              default:
                  return
              }
              wg.Done()
          }
      }
  • Then consume the data from the channels and process the metrics

     cmc := checkedMetricChan
      umc := uncheckedMetricChan
    
      for {
          select {
          case metric, ok := <-cmc:
              if !ok {
                  cmc = nil
                  break
              }
              errs.Append(processMetric(
                  metric, metricFamiliesByName,
                  metricHashes,
                  registeredDescIDs,
              ))
          case metric, ok := <-umc:
              if !ok {
                  umc = nil
                  break
              }
              errs.Append(processMetric(
                  metric, metricFamiliesByName,
                  metricHashes,
                  nil,
              ))

processMetric handles both cases the same way, so the difference between approaches 1 and 2 lies in their Collect methods
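The Gather flow described above can be reduced to a small standard-library model. This is only a sketch: sample, collector, funcCollector and gather are hypothetical names invented for illustration and are not part of client_golang, and the real Registry.Gather additionally validates and deduplicates metrics. What it shows is that every call to gather invokes Collect on each registered collector again.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// sample is a simplified stand-in for prometheus.Metric (hypothetical type).
type sample struct {
	name  string
	value float64
}

// collector mirrors only the Collect half of prometheus.Collector.
type collector interface {
	Collect(ch chan<- sample)
}

// funcCollector adapts a plain function to the collector interface.
type funcCollector func(ch chan<- sample)

func (f funcCollector) Collect(ch chan<- sample) { f(ch) }

// gather calls Collect on every collector and drains the shared channel.
func gather(collectors []collector) []sample {
	ch := make(chan sample)
	var wg sync.WaitGroup
	wg.Add(len(collectors))
	for _, c := range collectors {
		go func(c collector) {
			defer wg.Done()
			c.Collect(ch)
		}(c)
	}
	// Close the channel once every collector has finished sending.
	go func() {
		wg.Wait()
		close(ch)
	}()
	var out []sample
	for s := range ch {
		out = append(out, s)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].name < out[j].name })
	return out
}

func main() {
	calls := 0
	cs := []collector{
		funcCollector(func(ch chan<- sample) { calls++; ch <- sample{"a", 1} }),
		funcCollector(func(ch chan<- sample) { ch <- sample{"b", 2} }),
	}
	gather(cs)        // first request
	out := gather(cs) // second request
	fmt.Println(len(out), calls) // prints "2 2": two samples, Collect ran twice
}
```

Since the output of each gather call is built only from what the collectors send during that call, whatever a collector stops sending simply stops appearing.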

02 Tracing Collect when the Collector interface is not implemented

  • What we registered in reg is the *GaugeVec pointer returned by prometheus.NewGaugeVec
  • So the method that runs is the Collect method of *GaugeVec
  • And GaugeVec embeds *MetricVec

     type GaugeVec struct {
      *MetricVec
    }
  • MetricVec in turn embeds a *metricMap, so in the end it is the Collect method of metricMap that runs

     type MetricVec struct {
      *metricMap
    
      curry []curriedLabelValue
    
      // hashAdd and hashAddByte can be replaced for testing collision handling.
      hashAdd     func(h uint64, s string) uint64
      hashAddByte func(h uint64, b byte) uint64
    }

Look at the metricMap struct and its methods

  • metricMap holds a map named metrics
  • Its Collect method iterates over every metricWithLabelValues stored in the map and sends each one to ch for processing

     // metricVecs.
    type metricMap struct {
      mtx       sync.RWMutex // Protects metrics.
      metrics   map[uint64][]metricWithLabelValues
      desc      *Desc
      newMetric func(labelValues ...string) Metric
    }
    
    // Describe implements Collector. It will send exactly one Desc to the provided
    // channel.
    func (m *metricMap) Describe(ch chan<- *Desc) {
      ch <- m.desc
    }
    
    // Collect implements Collector.
    func (m *metricMap) Collect(ch chan<- Metric) {
      m.mtx.RLock()
      defer m.mtx.RUnlock()
    
      for _, metrics := range m.metrics {
          for _, metric := range metrics {
              ch <- metric.metric
          }
      }
    }
  • This makes it clear: as long as an element in the metrics map is not explicitly deleted, its data will exist forever
  • Some exporters belong to this explicit-deletion school, such as event_exporter
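The persistence behavior can be illustrated with a deliberately simplified model. The gaugeVec type below is hypothetical and uses only the standard library; the real metricMap hashes label values and guards the map with a mutex, both of which are left out here. It reuses the clone/build/push steps from the problem description.

```go
package main

import "fmt"

// gaugeVec is a hypothetical, stripped-down model of a label-keyed vector.
type gaugeVec struct {
	series map[string]float64
}

func newGaugeVec() *gaugeVec { return &gaugeVec{series: map[string]float64{}} }

// Set creates the series on first use and overwrites it afterwards.
func (g *gaugeVec) Set(label string, v float64) { g.series[label] = v }

// Delete removes one series and reports whether it existed.
func (g *gaugeVec) Delete(label string) bool {
	_, ok := g.series[label]
	delete(g.series, label)
	return ok
}

// Collect returns a copy of every series currently held in the map.
func (g *gaugeVec) Collect() map[string]float64 {
	out := make(map[string]float64, len(g.series))
	for k, v := range g.series {
		out[k] = v
	}
	return out
}

func main() {
	g := newGaugeVec()
	g.Set("clone", 1.5) // run 1
	g.Set("build", 9.0)
	g.Set("build", 8.0) // run 2 has no clone step
	g.Set("push", 2.0)
	fmt.Println(len(g.Collect())) // 3: clone is still exported
	g.Delete("clone")
	fmt.Println(len(g.Collect())) // 2: gone only after the explicit delete
}
```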

03 Tracing Collect when the Collector interface is implemented

  • Our collector implements the Collect method itself
  • So Gather calls our Collect method directly to get the result

     func (mc *MyCollector) Collect(ch chan<- prometheus.Metric) {
      log.Printf("MyCollector.collect.called")
      ch <- prometheus.MustNewConstMetric(T1,
          prometheus.GaugeValue, rand.Float64(), RandStr(10))
    }
  • Nothing is written to a metricMap, so each scrape exposes only the 1 value produced by that call

Summary

  • The two instrumentation approaches differ in their Collect methods
  • In practice, mainstream exporters also behave this way, with inactive metrics disappearing:

    • For example, process-exporter monitors processes. If a process no longer exists, its curve disappears and shows as a break in the Grafana graph. Otherwise, a series would exist forever once collected.
    • For example, node-exporter monitors mount points, among other things. When a mount point disappears, the related curve disappears too.
  • This is because mainstream exporters implement the Collect method themselves
  • In addition, kube-state-metrics in k8s uses metrics-store as the informer's store and handles the delete events it receives through the watch: when a pod is deleted, the related curve also disappears
  • Alternatively, you can call the delete method explicitly to remove expired series from the map, but then you must keep the previous label set and diff it against the current one
  • In short, there are two schools: explicitly deleting from the map vs. implementing the Collector interface
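For the explicit-deletion school, the diff between the previous and the current label set can be computed with a small helper. This is a minimal sketch: staleSteps is a hypothetical function, not part of any library. With client_golang, each returned value would then be passed to the DeleteLabelValues method of the GaugeVec.

```go
package main

import "fmt"

// staleSteps returns the label values present in prev but absent from cur,
// preserving the order in which they appear in prev.
func staleSteps(prev, cur []string) []string {
	seen := make(map[string]struct{}, len(cur))
	for _, s := range cur {
		seen[s] = struct{}{}
	}
	var stale []string
	for _, s := range prev {
		if _, ok := seen[s]; !ok {
			stale = append(stale, s)
		}
	}
	return stale
}

func main() {
	prev := []string{"clone", "build"} // steps of run 1
	cur := []string{"build", "push"}   // steps of run 2
	for _, s := range staleSteps(prev, cur) {
		// With client_golang, a vec.DeleteLabelValues(s) call would go here.
		fmt.Println("delete", s) // prints "delete clone"
	}
}
```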

ning1875