Using Kubernetes Health Checks

Preface

I have recently seen a lot of questions about Kubernetes health checks and how to use them. In this post I will explain the different types of health checks and how each one affects your application.

Liveness Probes

Kubernetes health checks come in two kinds: liveness probes and readiness probes. The purpose of a liveness probe is to check whether your application is running. Normally, if your application crashes, Kubernetes sees that the process has terminated and restarts it. A liveness probe is meant to catch the cases where the application has crashed or deadlocked without terminating. For that reason, a simple HTTP response is sufficient.

Here is a simple liveness check that I often use in Go applications.

// Liveness endpoint: any successful response tells Kubernetes the process is alive.
http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
    w.Write([]byte("OK"))
})
http.ListenAndServe(":8080", nil)

The corresponding section of the Deployment manifest is:

livenessProbe:
  # an http probe
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 15
  timeoutSeconds: 1

This simply tells Kubernetes that the application is up and running. initialDelaySeconds tells Kubernetes to wait this many seconds after the container starts before running the first health check. If your application takes a while to start, adjust this setting accordingly. timeoutSeconds tells Kubernetes how long to wait for a health check response. For liveness probes this shouldn't be very long, but you should give your application enough time to respond, even under load.

If the application never starts, or responds with an HTTP error code, Kubernetes will restart the Pod's container. Do your best not to do anything fancy in the liveness probe, because a failing liveness probe can disrupt your application.

Readiness Probes

Readiness probes are very similar to liveness probes. The difference lies in what happens when the probe fails. A readiness probe is designed to check whether the application is ready to serve traffic, which is subtly different from being alive. For example, if your application depends on a database and a memory cache, and both need to be up and running before it can serve requests, then both are required for the application to be "ready".

If the readiness probe for your application fails, the Pod is removed from the endpoints that make up the Service. This prevents the Kubernetes service discovery mechanism from routing traffic to Pods that are not yet ready. This is very helpful when starting new services, scaling, performing rolling updates, and so on. The readiness probe ensures that no traffic is sent to the Pod between the time it starts and the time it is ready to serve.

Readiness probes are defined in the same way as liveness probes, as part of the Deployment, as shown below:

readinessProbe:
  # an http probe
  httpGet:
    path: /readiness
    port: 8080
  initialDelaySeconds: 20
  timeoutSeconds: 5

Your readiness probe should check that the application can reach all of its dependencies. Continuing the example of an application that relies on a database and a memory cache, the probe should verify that both are reachable.
Something like the following works: it checks memcached and the database, and returns 503 if either is unavailable.

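// mc (*memcache.Client) and db (*sql.DB) are assumed to be package-level
// variables initialized at startup.
http.HandleFunc("/readiness", func(w http.ResponseWriter, r *http.Request) {
  ok := true
  errMsg := ""

  // Check memcache: a nil client or a failed write means it is unavailable.
  var err error
  if mc != nil {
    err = mc.Set(&memcache.Item{Key: "healthz", Value: []byte("test")})
  }
  if mc == nil || err != nil {
    ok = false
    errMsg += "Memcached not ok.\n"
  }

  // Check database: run a trivial query and close the result set.
  err = nil
  if db != nil {
    var rows *sql.Rows
    rows, err = db.Query("SELECT 1;")
    if err == nil {
      rows.Close()
    }
  }
  if db == nil || err != nil {
    ok = false
    errMsg += "Database not ok.\n"
  }

  if ok {
    w.Write([]byte("OK"))
  } else {
    // Send 503
    http.Error(w, errMsg, http.StatusServiceUnavailable)
  }
})
http.ListenAndServe(":8080", nil)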

More robust applications

Liveness and readiness probes really help improve the stability of your applications. They ensure that traffic flows only to instances that are ready for it, and they let the system heal itself when an application becomes unresponsive. They are a better solution to the problems my colleague Kelsey Hightower described as 12 Fractured Apps. With proper health checks in place, you can deploy applications in any order without worrying about dependencies or complex entrypoint scripts. Applications start serving traffic once they are ready, so autoscaling and rolling updates proceed smoothly.

EngineerLeo
