
Hi everyone, this is Zhang Jintao.

A while back I wrote an article, "A More Elegant Kubernetes Cluster Event Measurement Scheme", which used Jaeger's tracing to collect and display the events in a Kubernetes cluster.

When I wrote that article, I promised a follow-up introducing the underlying principles in detail, and then kept putting it off. Now that the year is coming to an end, it's time to finally deliver it.

Events overview

Let's start with a simple example to see what events in a Kubernetes cluster are.

Create a new namespace called moelove, create a Deployment called redis in it, and then look at all the events in this namespace:

(MoeLove) ➜ kubectl create ns moelove
namespace/moelove created
(MoeLove) ➜ kubectl -n moelove create deployment redis --image=ghcr.io/moelove/redis:alpine 
deployment.apps/redis created
(MoeLove) ➜ kubectl -n moelove get deploy
NAME    READY   UP-TO-DATE   AVAILABLE   AGE
redis   1/1     1            1           11s
(MoeLove) ➜ kubectl -n moelove get events
LAST SEEN   TYPE     REASON              OBJECT                        MESSAGE
21s         Normal   Scheduled           pod/redis-687967dbc5-27vmr    Successfully assigned moelove/redis-687967dbc5-27vmr to kind-worker3
21s         Normal   Pulling             pod/redis-687967dbc5-27vmr    Pulling image "ghcr.io/moelove/redis:alpine"
15s         Normal   Pulled              pod/redis-687967dbc5-27vmr    Successfully pulled image "ghcr.io/moelove/redis:alpine" in 6.814310968s
14s         Normal   Created             pod/redis-687967dbc5-27vmr    Created container redis
14s         Normal   Started             pod/redis-687967dbc5-27vmr    Started container redis
22s         Normal   SuccessfulCreate    replicaset/redis-687967dbc5   Created pod: redis-687967dbc5-27vmr
22s         Normal   ScalingReplicaSet   deployment/redis              Scaled up replica set redis-687967dbc5 to 1

However, you will notice that by default the output of kubectl get events is not ordered by when the events occurred, so we often have to add the --sort-by='{.metadata.creationTimestamp}' parameter to sort it by time.

This is also one of the reasons Kubernetes v1.23 added the kubectl alpha events command. I gave a detailed introduction in a previous article, "K8S Ecological Weekly | Kubernetes v1.23.0 Official Release, List of New Features", so I won't expand on it here.
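
If you are on v1.23, you can try it directly; it lists events sorted by time without any extra flags (assuming your kubectl build ships the alpha command):

(MoeLove) ➜ kubectl -n moelove alpha events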

After sorting by time, you can see the following results:

(MoeLove) ➜ kubectl -n moelove get events --sort-by='{.metadata.creationTimestamp}'
LAST SEEN   TYPE     REASON              OBJECT                        MESSAGE
2m12s       Normal   Scheduled           pod/redis-687967dbc5-27vmr    Successfully assigned moelove/redis-687967dbc5-27vmr to kind-worker3
2m13s       Normal   SuccessfulCreate    replicaset/redis-687967dbc5   Created pod: redis-687967dbc5-27vmr
2m13s       Normal   ScalingReplicaSet   deployment/redis              Scaled up replica set redis-687967dbc5 to 1
2m12s       Normal   Pulling             pod/redis-687967dbc5-27vmr    Pulling image "ghcr.io/moelove/redis:alpine"
2m6s        Normal   Pulled              pod/redis-687967dbc5-27vmr    Successfully pulled image "ghcr.io/moelove/redis:alpine" in 6.814310968s
2m5s        Normal   Created             pod/redis-687967dbc5-27vmr    Created container redis
2m5s        Normal   Started             pod/redis-687967dbc5-27vmr    Started container redis

From the operations above, we can see that an Event is itself a resource in the Kubernetes cluster; when the state of resources in the cluster changes, new events are generated.
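
You can confirm this with kubectl api-resources. Note that on a recent cluster there are actually two event APIs: the core v1 Event used throughout this article, and the newer events.k8s.io group (output abridged):

(MoeLove) ➜ kubectl api-resources | grep -i event
events    ev    v1                  true    Event
events    ev    events.k8s.io/v1    true    Event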

In-depth Events

A single Event object

Since an Event is a resource in the Kubernetes cluster, it normally carries its own name in metadata.name, which can be used to operate on it individually. So we can use the following command to output the names of all events:

(MoeLove) ➜ kubectl -n moelove get events --sort-by='{.metadata.creationTimestamp}' -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'
redis-687967dbc5-27vmr.16c4fb7bde8c69d2
redis-687967dbc5.16c4fb7bde6b54c4
redis.16c4fb7bde1bf769
redis-687967dbc5-27vmr.16c4fb7bf8a0ab35
redis-687967dbc5-27vmr.16c4fb7d8ecaeff8
redis-687967dbc5-27vmr.16c4fb7d99709da9
redis-687967dbc5-27vmr.16c4fb7d9be30c06

Select any one of the event records and output it in YAML format for viewing:

(MoeLove) ➜ kubectl -n moelove get events redis-687967dbc5-27vmr.16c4fb7bde8c69d2 -o yaml
action: Binding
apiVersion: v1
eventTime: "2021-12-28T19:31:13.702987Z"
firstTimestamp: null
involvedObject:
  apiVersion: v1
  kind: Pod
  name: redis-687967dbc5-27vmr
  namespace: moelove
  resourceVersion: "330230"
  uid: 71b97182-5593-47b2-88cc-b3f59618c7aa
kind: Event
lastTimestamp: null
message: Successfully assigned moelove/redis-687967dbc5-27vmr to kind-worker3
metadata:
  creationTimestamp: "2021-12-28T19:31:13Z"
  name: redis-687967dbc5-27vmr.16c4fb7bde8c69d2
  namespace: moelove
  resourceVersion: "330235"
  uid: e5c03126-33b9-4559-9585-5e82adcd96b0
reason: Scheduled
reportingComponent: default-scheduler
reportingInstance: default-scheduler-kind-control-plane
source: {}
type: Normal

You can see that it contains a lot of information; we won't go through all of it here. Let's look at another example first.

Events in kubectl describe

Run kubectl describe on the Deployment object and the Pod object respectively, and you get the following results (intermediate output omitted):

  • Describing the Deployment
(MoeLove) ➜ kubectl -n moelove describe deploy/redis                
Name:                   redis
Namespace:              moelove
...
Events:
  Type    Reason             Age   From                   Message
  ----    ------             ----  ----                   -------
  Normal  ScalingReplicaSet  15m   deployment-controller  Scaled up replica set redis-687967dbc5 to 1
  • Describing the Pod
(MoeLove) ➜ kubectl -n moelove describe pods redis-687967dbc5-27vmr
Name:         redis-687967dbc5-27vmr                                                                 
Namespace:    moelove
Priority:     0
Events:
  Type    Reason     Age   From               Message
  ----    ------     ----  ----               -------
  Normal  Scheduled  18m   default-scheduler  Successfully assigned moelove/redis-687967dbc5-27vmr to kind-worker3
  Normal  Pulling    18m   kubelet            Pulling image "ghcr.io/moelove/redis:alpine"
  Normal  Pulled     17m   kubelet            Successfully pulled image "ghcr.io/moelove/redis:alpine" in 6.814310968s
  Normal  Created    17m   kubelet            Created container redis
  Normal  Started    17m   kubelet            Started container redis

We can see that when we describe different resource objects, the events shown are directly related to the object being described; when describing the Deployment, you don't see the Pod's events.

This shows that each Event object carries information about the resource object it relates to; the two are directly linked.

Combined with the single Event object we saw earlier, it becomes clear that the involvedObject field records exactly which resource object an Event is associated with.
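
This linkage can also be queried directly: events support field selectors on involvedObject, so you can pull the events for one specific object, which is essentially what kubectl describe shows in its Events section. The Pod name below is from my environment; substitute your own:

(MoeLove) ➜ kubectl -n moelove get events --field-selector involvedObject.kind=Pod,involvedObject.name=redis-687967dbc5-27vmr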

Learn more about Events

Let's look at the following example: create a Deployment, but use an image that does not exist:

(MoeLove) ➜ kubectl -n moelove create deployment non-exist --image=ghcr.io/moelove/non-exist
deployment.apps/non-exist created
(MoeLove) ➜ kubectl -n moelove get pods
NAME                        READY   STATUS         RESTARTS   AGE
non-exist-d9ddbdd84-tnrhd   0/1     ErrImagePull   0          11s
redis-687967dbc5-27vmr      1/1     Running        0          26m

We can see that the Pod is in the ErrImagePull state. Now view the events in the current namespace (I've omitted the earlier records for deploy/redis):

(MoeLove) ➜ kubectl -n moelove get events --sort-by='{.metadata.creationTimestamp}'                                                           
LAST SEEN   TYPE      REASON              OBJECT                           MESSAGE
35s         Normal    SuccessfulCreate    replicaset/non-exist-d9ddbdd84   Created pod: non-exist-d9ddbdd84-tnrhd
35s         Normal    ScalingReplicaSet   deployment/non-exist             Scaled up replica set non-exist-d9ddbdd84 to 1
35s         Normal    Scheduled           pod/non-exist-d9ddbdd84-tnrhd    Successfully assigned moelove/non-exist-d9ddbdd84-tnrhd to kind-worker3
17s         Warning   Failed              pod/non-exist-d9ddbdd84-tnrhd    Error: ErrImagePull
17s         Warning   Failed              pod/non-exist-d9ddbdd84-tnrhd    Failed to pull image "ghcr.io/moelove/non-exist": rpc error: code = Unknown desc = failed to pull and unpack image "ghcr.io/moelove/non-exist:latest": failed to resolve reference "ghcr.io/moelove/non-exist:latest": failed to authorize: failed to fetch anonymous token: unexpected status: 403 Forbidden
18s         Normal    Pulling             pod/non-exist-d9ddbdd84-tnrhd    Pulling image "ghcr.io/moelove/non-exist"
4s          Warning   Failed              pod/non-exist-d9ddbdd84-tnrhd    Error: ImagePullBackOff
4s          Normal    BackOff             pod/non-exist-d9ddbdd84-tnrhd    Back-off pulling image "ghcr.io/moelove/non-exist"

Run kubectl describe on this Pod:

(MoeLove) ➜ kubectl -n moelove describe pods non-exist-d9ddbdd84-tnrhd
...
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  4m                     default-scheduler  Successfully assigned moelove/non-exist-d9ddbdd84-tnrhd to kind-worker3
  Normal   Pulling    2m22s (x4 over 3m59s)  kubelet            Pulling image "ghcr.io/moelove/non-exist"
  Warning  Failed     2m21s (x4 over 3m59s)  kubelet            Failed to pull image "ghcr.io/moelove/non-exist": rpc error: code = Unknown desc = failed to pull and unpack image "ghcr.io/moelove/non-exist:latest": failed to resolve reference "ghcr.io/moelove/non-exist:latest": failed to authorize: failed to fetch anonymous token: unexpected status: 403 Forbidden
  Warning  Failed     2m21s (x4 over 3m59s)  kubelet            Error: ErrImagePull
  Warning  Failed     2m9s (x6 over 3m58s)   kubelet            Error: ImagePullBackOff
  Normal   BackOff    115s (x7 over 3m58s)   kubelet            Back-off pulling image "ghcr.io/moelove/non-exist"

We can see that this output differs from that of the previous, correctly running Pod. The main difference is in the Age column, where we now see entries like 115s (x7 over 3m58s).

This means: this type of event occurred 7 times over the past 3m58s, and the most recent occurrence was 115s ago.

But when we ran kubectl get events directly, we did not see 7 repeated entries. This shows that Kubernetes automatically merges duplicate events into one.

Select the last Event (using the method described earlier) and output its content in YAML format:

(MoeLove) ➜ kubectl -n moelove get events non-exist-d9ddbdd84-tnrhd.16c4fce570cfba46 -o yaml
apiVersion: v1
count: 43
eventTime: null
firstTimestamp: "2021-12-28T19:57:06Z"
involvedObject:
  apiVersion: v1
  fieldPath: spec.containers{non-exist}
  kind: Pod
  name: non-exist-d9ddbdd84-tnrhd
  namespace: moelove
  resourceVersion: "333366"
  uid: 33045163-146e-4282-b559-fec19a189a10
kind: Event
lastTimestamp: "2021-12-28T18:07:14Z"
message: Back-off pulling image "ghcr.io/moelove/non-exist"
metadata:
  creationTimestamp: "2021-12-28T19:57:06Z"
  name: non-exist-d9ddbdd84-tnrhd.16c4fce570cfba46
  namespace: moelove
  resourceVersion: "334638"
  uid: 60708be0-23b9-481b-a290-dd208fed6d47
reason: BackOff
reportingComponent: ""
reportingInstance: ""
source:
  component: kubelet
  host: kind-worker3
type: Normal

Here we can see a count field, which indicates how many times an event of the same kind has occurred, while firstTimestamp and lastTimestamp record the first and most recent occurrences of this event. Together they explain the durations shown in the previous output.
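
If you want these fields at a glance without dumping full YAML, a custom-columns query works; this is just a sketch against the namespace used in this article:

(MoeLove) ➜ kubectl -n moelove get events -o custom-columns='NAME:.metadata.name,COUNT:.count,FIRST:.firstTimestamp,LAST:.lastTimestamp,REASON:.reason'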

Understand Events thoroughly

The following is one Event picked at random, from which we can see some of the fields it contains:

apiVersion: v1
count: 1
eventTime: null
firstTimestamp: "2021-12-28T19:31:13Z"
involvedObject:
  apiVersion: apps/v1
  kind: ReplicaSet
  name: redis-687967dbc5
  namespace: moelove
  resourceVersion: "330227"
  uid: 11e98a9d-9062-4ccb-92cb-f51cc74d4c1d
kind: Event
lastTimestamp: "2021-12-28T19:31:13Z"
message: 'Created pod: redis-687967dbc5-27vmr'
metadata:
  creationTimestamp: "2021-12-28T19:31:13Z"
  name: redis-687967dbc5.16c4fb7bde6b54c4
  namespace: moelove
  resourceVersion: "330231"
  uid: 8e37ec1e-b3a1-420c-96d4-3b3b2995c300
reason: SuccessfulCreate
reportingComponent: ""
reportingInstance: ""
source:
  component: replicaset-controller
type: Normal

The meanings of the main fields are as follows:

  • count: Indicates how many times the current similar event has occurred (described earlier)
  • involvedObject: The resource object directly related to this event (introduced above), the structure is as follows:
type ObjectReference struct {
    Kind string
    Namespace string
    Name string
    UID types.UID
    APIVersion string
    ResourceVersion string
    FieldPath string
}
  • source: the component the event originates from, with the following structure:
type EventSource struct {
    Component string
    Host string
}
  • reason: a short summary (essentially a fixed code) of why the event happened, well suited for use as a filter condition; it is mainly intended to be machine readable. There are currently more than 50 such codes (see the filtering example after this list);
  • message: a detailed description that is easier for humans to understand;
  • type: currently only Normal and Warning; their meanings are also spelled out in the source code:
// staging/src/k8s.io/api/core/v1/types.go
const (
    // Information only and will not cause any problems
    EventTypeNormal string = "Normal"
    // These events are to warn that something might go wrong
    EventTypeWarning string = "Warning"
)
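
Since type and reason are fixed codes, they work well as field selectors. For example, to show only warnings, or only the image pull back-off events from the failing Deployment above:

(MoeLove) ➜ kubectl -n moelove get events --field-selector type=Warning
(MoeLove) ➜ kubectl -n moelove get events --field-selector reason=BackOff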

Therefore, when we collect these Events as a tracing source, we can group them by involvedObject and sort them by time.
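
As a minimal sketch of this idea using client-go (assuming a kubeconfig at the default location and the moelove namespace used in this article; an illustration only, not the implementation behind the tracing article mentioned at the top):

package main

import (
	"context"
	"fmt"
	"sort"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Assumes a kubeconfig at the default location (~/.kube/config).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	clientset, err := kubernetes.NewForConfig(config)
	if err != nil {
		panic(err)
	}

	// List all events in the namespace used throughout this article.
	events, err := clientset.CoreV1().Events("moelove").List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}

	// Sort by creation time, mirroring --sort-by='{.metadata.creationTimestamp}'.
	sort.Slice(events.Items, func(i, j int) bool {
		return events.Items[i].CreationTimestamp.Time.Before(events.Items[j].CreationTimestamp.Time)
	})

	// Group events by the resource object they relate to (involvedObject),
	// which is how we would assemble per-object traces.
	byObject := map[string][]string{}
	for _, e := range events.Items {
		key := fmt.Sprintf("%s/%s", e.InvolvedObject.Kind, e.InvolvedObject.Name)
		byObject[key] = append(byObject[key], fmt.Sprintf("%s: %s", e.Reason, e.Message))
	}
	for obj, lines := range byObject {
		fmt.Println(obj)
		for _, l := range lines {
			fmt.Println("  ", l)
		}
	}
}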

Summary

In this article, I used two examples, a Deployment that deploys correctly and one that uses a non-existent image, to dig into what Event objects actually do and what each of their fields means.

For Kubernetes, Events contain a lot of useful information, but the cluster does not act on them, and they are not Kubernetes logs either. By default, events are cleaned up after 1 hour (the kube-apiserver's --event-ttl flag, which defaults to 1h) to reduce the resources they occupy in etcd.

So, to give cluster administrators better visibility into what has happened, we usually collect a Kubernetes cluster's events in production environments. The tool I personally recommend is https://github.com/opsgenie/kubernetes-event-exporter.

Of course, you can also follow my earlier article, "A More Elegant Kubernetes Cluster Event Measurement Scheme", and use Jaeger tracing to collect and display the events in your Kubernetes cluster.


Welcome to follow my WeChat public account [MoeLove].

