Author

Wang Cheng is an R&D engineer at Tencent Cloud and a Kubernetes contributor. He works on containerizing database products and on resource management and control, with a focus on Kubernetes, Go, and cloud native technology.

Overview

Step into the world of K8s and you will find many Controllers. Each one reconciles a certain type of resource (for example, pods are managed through DeploymentController and ReplicaSetController), and the goal is always to maintain the state the user expects.

K8s has dozens of resource types. Letting internal and external K8s users obtain changes to a given resource type easily and efficiently is the job of the Informer, the subject of this article. The article walks through Reflector, DeltaFIFO (delta queue), Indexer, Controller, SharedInformer, processorListener, workqueue (work queue), and related components.

This article and subsequent related articles are based on K8s v1.22.

(K8s-informer)

Starting with Reflector

The main responsibility of Reflector is to list resources of a given type from the apiserver and then keep watching them (ListAndWatch) for Add/Update/Delete events, storing the results in the local store implemented by DeltaFIFO.

First, look at the Reflector struct definition:

// staging/src/k8s.io/client-go/tools/cache/reflector.go
type Reflector struct {
    // name uniquely identifies this reflector, by file:line
    name string

    // the next three fields identify the expected type
    expectedTypeName string
    expectedType     reflect.Type
    expectedGVK      *schema.GroupVersionKind

    // storage interface; the concrete store is implemented by DeltaFIFO
    store Store
    // used to list (full) and watch (incremental) resources from the apiserver
    listerWatcher ListerWatcher

    // the next two fields handle retries after failures
    backoffManager         wait.BackoffManager
    initConnBackoffManager wait.BackoffManager

    // resync period requested by informer users
    resyncPeriod time.Duration
    // reports whether the conditions for a resync are met
    ShouldResync func() bool

    clock clock.Clock

    // whether List calls should be paginated
    paginatedResult bool

    // resource version seen at the last sync; watch only receives changes newer than it
    lastSyncResourceVersion string
    // whether the last synced resource version is unavailable
    isLastSyncResourceVersionUnavailable bool
    // lock guarding the resource version
    lastSyncResourceVersionMutex sync.RWMutex

    // page size
    WatchListPageSize int64
    // callback invoked when the watch fails
    watchErrorHandler WatchErrorHandler
}

As the struct definition shows, ListAndWatch targets a specified resource type, and paging can be configured.

After the first full List of the target resource type, the syncWith function replaces (Replace) the contents of the DeltaFIFO queue/items with the full set. Reflector then keeps watching the target resource type for incremental events and writes them into the DeltaFIFO queue/items, where they wait to be consumed.

The watch verifies the target type with Go's reflect package, as follows:

// staging/src/k8s.io/client-go/tools/cache/reflector.go
// watchHandler watches w and keeps *resourceVersion up to date.
func (r *Reflector) watchHandler(start time.Time, w watch.Interface, resourceVersion *string, errc chan error, stopCh <-chan struct{}) error {

    ...
    if r.expectedType != nil {
        if e, a := r.expectedType, reflect.TypeOf(event.Object); e != a {
            utilruntime.HandleError(fmt.Errorf("%s: expected type %v, but watch event object had type %v", r.name, e, a))
            continue
        }
    }
    if r.expectedGVK != nil {
        if e, a := *r.expectedGVK, event.Object.GetObjectKind().GroupVersionKind(); e != a {
            utilruntime.HandleError(fmt.Errorf("%s: expected gvk %v, but watch event object had gvk %v", r.name, e, a))
            continue
        }
    }
    ...
}

The target resource type is confirmed through reflection, which is why the name Reflector fits. The target resource type for List/Watch is already determined by the ListerWatcher passed to NewSharedIndexInformer, but the watch compares the type once more in watchHandler.

Meet DeltaFIFO

Let's first look at the definition of the DeltaFIFO structure:

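// staging/src/k8s.io/client-go/tools/cache/delta_fifo.go
type DeltaFIFO struct {
    // read-write lock and condition variable
    lock sync.RWMutex
    cond sync.Cond

    // key-value store: objKey1 -> Deltas[obj1-Added, obj1-Updated...]
    items map[string]Deltas

    // holds only the objKeys
    queue []string

    // populated is true once the first batch of objects has been inserted via Replace(),
    // or once Add/Update/Delete has been called for the first time
    populated bool
    // number of objects inserted by the first call to Replace()
    initialPopulationCount int

    // keyFunc derives the objKey from an obj
    keyFunc KeyFunc

    // known objects; in practice this is the Indexer
    knownObjects KeyListerGetter

    // whether the queue has been closed
    closed bool

    // emit the Replaced type from Replace() instead of Sync (Sync is kept for backward compatibility)
    emitDeltaTypeReplaced bool
}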

DeltaType takes one of the following values:

// staging/src/k8s.io/client-go/tools/cache/delta_fifo.go
type DeltaType string

const (
    Added   DeltaType = "Added"
    Updated DeltaType = "Updated"
    Deleted DeltaType = "Deleted"
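    Replaced DeltaType = "Replaced" // emitted for the initial list or a relist
    Sync DeltaType = "Sync" // emitted for a periodic resync; older versions also used Sync where Replaced is used now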
)

As the Reflector analysis above shows, DeltaFIFO takes each incoming event, enqueues it under the lock (queueActionLocked), deduplicates it (dedupDeltas), and stores it locally in two structures: queue, which holds only objKeys, and items, which maps each objKey to its Deltas (the incremental changes). Consumers then take entries off continuously with Pop and handle them through Process(item).
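The interplay of queue and items is easier to see in a stripped-down model. The sketch below is an assumption-laden simplification, not the client-go implementation: miniFIFO, enqueue, and pop are hypothetical names, there is no locking or condition variable, objects are plain strings, and deduplication is reduced to dropping a second consecutive Deleted delta.

```go
package main

import "fmt"

type DeltaType string

const (
	Added   DeltaType = "Added"
	Updated DeltaType = "Updated"
	Deleted DeltaType = "Deleted"
)

// Delta pairs a change type with the object state (a string here for brevity).
type Delta struct {
	Type   DeltaType
	Object string
}

// miniFIFO is a simplified, single-goroutine stand-in for DeltaFIFO.
type miniFIFO struct {
	items map[string][]Delta // objKey -> pending deltas
	queue []string           // objKeys in arrival order, each key at most once
}

func newMiniFIFO() *miniFIFO {
	return &miniFIFO{items: map[string][]Delta{}}
}

// enqueue appends a delta for key. Two consecutive Deleted deltas are
// collapsed into one, and a key is queued only when it has no pending deltas.
func (f *miniFIFO) enqueue(key string, d Delta) {
	deltas, exists := f.items[key]
	if n := len(deltas); n > 0 && deltas[n-1].Type == Deleted && d.Type == Deleted {
		return // duplicate deletion: keep the delta already queued
	}
	f.items[key] = append(deltas, d)
	if !exists {
		f.queue = append(f.queue, key)
	}
}

// pop removes the oldest key and hands all of its deltas to process.
// It returns false when the queue is empty.
func (f *miniFIFO) pop(process func(key string, deltas []Delta)) bool {
	if len(f.queue) == 0 {
		return false
	}
	key := f.queue[0]
	f.queue = f.queue[1:]
	deltas := f.items[key]
	delete(f.items, key)
	process(key, deltas)
	return true
}

func main() {
	f := newMiniFIFO()
	f.enqueue("ns1/pod1", Delta{Type: Added, Object: "pod1-v1"})
	f.enqueue("ns1/pod2", Delta{Type: Added, Object: "pod2-v1"})
	f.enqueue("ns1/pod1", Delta{Type: Updated, Object: "pod1-v2"})
	f.enqueue("ns1/pod1", Delta{Type: Deleted, Object: "pod1-v2"})
	f.enqueue("ns1/pod1", Delta{Type: Deleted, Object: "pod1-v2"}) // dropped as a duplicate

	for f.pop(func(key string, deltas []Delta) {
		fmt.Println(key, deltas)
	}) {
	}
}
```

The point of the two structures is that a key appears in queue only once, however many changes pile up for it; the consumer receives all pending deltas for that key in a single Pop.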

(K8s-DeltaFIFO)

Indexer

The resources obtained by ListAndWatch in the previous step are now in DeltaFIFO, and Pop is called to consume them from the queue. In practice the Process function is sharedIndexInformer.HandleDeltas. HandleDeltas performs Add/Update/Delete on the Indexer according to the DeltaType described above, and creates, updates, or deletes the corresponding index entries at the same time.

The specific index is implemented as follows:

// staging/src/k8s.io/client-go/tools/cache/index.go
// map: index name => index function
type Indexers map[string]IndexFunc

// map: index name => index
type Indices map[string]Index

// index: indexed value computed by the index function (indexedValue) => [objKey1, objKey2...]
type Index map[string]sets.String

Index function (IndexFunc): the function that computes the index values. Custom index functions can be plugged in; the default and most commonly used one is MetaNamespaceIndexFunc.

Indexed value (indexedValue): called indexKey in some places; the value computed by the index function (IndexFunc), such as ns1.

Object key (objKey): the unique key of an object obj (such as ns1/pod1); it corresponds one-to-one with a resource object.

(K8s-indexer)

As the figure shows, Indexer is backed by the ThreadSafeStore interface, which is ultimately implemented by threadSafeMap.

Two distinctions are worth keeping in mind. IndexFunc (such as MetaNamespaceIndexFunc) versus KeyFunc (such as MetaNamespaceKeyFunc): the former computes index values, while the latter derives the object key (objKey). Index key (indexKey, or indexedValue in some places) versus object key (objKey): the former is the value computed by the index function (such as ns1), while the latter is the unique key of obj (such as ns1/pod1).
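To make the relationship between index name, indexed value, and object key concrete, here is a small sketch under simplifying assumptions. The types pod, indexFunc, index, and store are hypothetical and much reduced compared with client-go: objects are a tiny struct, sets are plain maps, and there is no update, delete, or locking.

```go
package main

import (
	"fmt"
	"sort"
)

// pod is a hypothetical stand-in for a resource object.
type pod struct{ Namespace, Name string }

// indexFunc computes the indexed values for an object.
type indexFunc func(p pod) []string

// index maps an indexed value (such as "ns1") to a set of object keys.
type index map[string]map[string]struct{}

// store keeps one index per index name.
type store struct {
	indexers map[string]indexFunc // index name -> index function
	indices  map[string]index     // index name -> index
}

func newStore(indexers map[string]indexFunc) *store {
	return &store{indexers: indexers, indices: map[string]index{}}
}

// keyFunc derives the object key, in the "namespace/name" form.
func keyFunc(p pod) string { return p.Namespace + "/" + p.Name }

// namespaceIndexFunc indexes an object by its namespace.
func namespaceIndexFunc(p pod) []string { return []string{p.Namespace} }

// add records the object's key under every indexed value of every index.
func (s *store) add(p pod) {
	key := keyFunc(p)
	for name, fn := range s.indexers {
		idx := s.indices[name]
		if idx == nil {
			idx = index{}
			s.indices[name] = idx
		}
		for _, value := range fn(p) {
			if idx[value] == nil {
				idx[value] = map[string]struct{}{}
			}
			idx[value][key] = struct{}{}
		}
	}
}

// byIndex returns the sorted object keys stored under an indexed value.
func (s *store) byIndex(name, value string) []string {
	set := s.indices[name][value]
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	s := newStore(map[string]indexFunc{"namespace": namespaceIndexFunc})
	s.add(pod{Namespace: "ns1", Name: "pod1"})
	s.add(pod{Namespace: "ns1", Name: "pod2"})
	s.add(pod{Namespace: "ns2", Name: "pod3"})

	fmt.Println(s.byIndex("namespace", "ns1"))
	fmt.Println(s.byIndex("namespace", "ns2"))
}
```

Looking up by indexed value returns object keys rather than objects; in the real Indexer those keys are then resolved against the underlying store.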

Controller

As the core hub, Controller ties together the components above (Reflector, DeltaFIFO, Indexer, and Store) and serves as the bridge to downstream consumers.

By K8s convention, an interface with an uppercase name is implemented by a struct with the corresponding lowercase name. Accordingly, Controller is implemented by the controller struct:

// staging/src/k8s.io/client-go/tools/cache/controller.go
type controller struct {
    config         Config
    reflector      *Reflector // the component analyzed above
    reflectorMutex sync.RWMutex
    clock          clock.Clock
}

type Config struct {
    // implemented by DeltaFIFO in practice
    Queue

    // needed to construct the Reflector
    ListerWatcher

    // function that processes each obj returned by Pop
    Process ProcessFunc

    // target object type
    ObjectType runtime.Object

    // full resync period
    FullResyncPeriod time.Duration

    // function that decides whether to resync
    ShouldResync ShouldResyncFunc

    // if true, an obj is re-queued when Process() returns an error
    RetryOnError bool

    // callback invoked when Watch returns an error
    WatchErrorHandler WatchErrorHandler

    // page size for the initial and resync lists
    WatchListPageSize int64
}

Controller's Run method is started in a goroutine. It starts Reflector's ListAndWatch(), which lists the full set of resources from the apiserver, watches for incremental changes, and stores both in DeltaFIFO. It then starts processLoop, which keeps consuming from DeltaFIFO with Pop. In sharedIndexInformer, the function that handles each popped item is HandleDeltas: on one hand it applies Add/Update/Delete to the Indexer, and on the other it calls the downstream sharedProcessor to dispatch to handlers.

Start SharedInformer

The SharedIndexInformer interface embeds SharedInformer and is implemented by sharedIndexInformer (again, an uppercase interface implemented by the corresponding lowercase struct).

Take a look at the definitions:

// staging/src/k8s.io/client-go/tools/cache/shared_informer.go
type SharedIndexInformer interface {
    SharedInformer
    // AddIndexers add indexers to the informer before it starts.
    AddIndexers(indexers Indexers) error
    GetIndexer() Indexer
}

type sharedIndexInformer struct {
    indexer    Indexer
    controller Controller

    // the processor that dispatches to handlers; this is the key part
    processor *sharedProcessor

    // detects mutations of cached objects; mainly for debugging, disabled by default
    cacheMutationDetector MutationDetector

    // needed to construct the Reflector
    listerWatcher ListerWatcher

    // target type, used by the Reflector to check the resource type
    objectType runtime.Object

    // period at which the Reflector checks whether a resync is needed
    resyncCheckPeriod time.Duration

    // default resync period used when a handler does not specify one
    defaultEventHandlerResyncPeriod time.Duration
    clock                           clock.Clock

    // two bools encode three states: before the controller starts, started, and stopped
    started, stopped bool
    startedLock      sync.Mutex

    // held while Pop is consuming the queue, so that a listener added at that moment does not disturb delivery
    blockDeltas sync.Mutex

    // callback invoked when Watch returns an error
    watchErrorHandler WatchErrorHandler
}

type sharedProcessor struct {
    listenersStarted bool
    listenersLock    sync.RWMutex
    listeners        []*processorListener
    syncingListeners []*processorListener // listeners that need a resync
    clock            clock.Clock
    wg               wait.Group
}

As the definitions show, sharedIndexInformer uses the controller it holds (analyzed above) to run Reflector's ListAndWatch, store the results in DeltaFIFO, and start consuming the queue with Pop. The function that handles popped items in sharedIndexInformer is HandleDeltas.

Every listener is added to the processorListener slice through sharedIndexInformer.AddEventHandler, which behaves differently depending on whether the controller has already started:

// staging/src/k8s.io/client-go/tools/cache/shared_informer.go
func (s *sharedIndexInformer) AddEventHandlerWithResyncPeriod(handler ResourceEventHandler, resyncPeriod time.Duration) {
    ...

    // not started yet: just add the listener and return
    if !s.started {
        s.processor.addListener(listener)
        return
    }

    // take the lock so that delta processing is paused
    s.blockDeltas.Lock()
    defer s.blockDeltas.Unlock()

    s.processor.addListener(listener)

    // replay every object already in the indexer to the newly added listener
    for _, item := range s.indexer.List() {
        listener.add(addNotification{newObj: item})
    }
}

Then, in HandleDeltas, sharedProcessor.distribute is called according to the Delta type of obj (Added/Updated/Deleted/Replaced/Sync) to deliver the notification to the registered listeners.

Register SharedInformerFactory

SharedInformerFactory is the factory for SharedInformer; the factory pattern keeps the design highly cohesive and loosely coupled. It is defined as follows:

// staging/src/k8s.io/client-go/informers/factory.go
type SharedInformerFactory interface {
    internalinterfaces.SharedInformerFactory // the key internal interface
    ForResource(resource schema.GroupVersionResource) (GenericInformer, error)
    WaitForCacheSync(stopCh <-chan struct{}) map[reflect.Type]bool

    Admissionregistration() admissionregistration.Interface
    Internal() apiserverinternal.Interface
    Apps() apps.Interface
    Autoscaling() autoscaling.Interface
    Batch() batch.Interface
    Certificates() certificates.Interface
    Coordination() coordination.Interface
    Core() core.Interface
    Discovery() discovery.Interface
    Events() events.Interface
    Extensions() extensions.Interface
    Flowcontrol() flowcontrol.Interface
    Networking() networking.Interface
    Node() node.Interface
    Policy() policy.Interface
    Rbac() rbac.Interface
    Scheduling() scheduling.Interface
    Storage() storage.Interface
}

// staging/src/k8s.io/client-go/informers/internalinterfaces/factory_interfaces.go
type SharedInformerFactory interface {
    Start(stopCh <-chan struct{}) // starts SharedIndexInformer.Run
    InformerFor(obj runtime.Object, newFunc NewInformerFunc) cache.SharedIndexInformer // initializes the informer for the target type
}

Take PodInformer as an example to illustrate how users build their own Informer. PodInformer is defined as follows:

// staging/src/k8s.io/client-go/informers/core/v1/pod.go
type PodInformer interface {
    Informer() cache.SharedIndexInformer
    Lister() v1.PodLister
}

It is implemented by the lowercase podInformer (once again, the K8s style of an uppercase interface with a lowercase implementation):

type podInformer struct {
    factory          internalinterfaces.SharedInformerFactory
    tweakListOptions internalinterfaces.TweakListOptionsFunc
    namespace        string
}

func (f *podInformer) defaultInformer(client kubernetes.Interface, resyncPeriod time.Duration) cache.SharedIndexInformer {
    return NewFilteredPodInformer(client, f.namespace, resyncPeriod, cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc}, f.tweakListOptions)
}

func (f *podInformer) Informer() cache.SharedIndexInformer {
    return f.factory.InformerFor(&corev1.Pod{}, f.defaultInformer)
}

func (f *podInformer) Lister() v1.PodLister {
    return v1.NewPodLister(f.Informer().GetIndexer())
}

The user passes in the target type (&corev1.Pod{}) and a constructor (defaultInformer), and calls SharedInformerFactory.InformerFor to register the target Informer. Calling SharedInformerFactory.Start then runs it, which starts the SharedIndexInformer -> Controller -> Reflector -> DeltaFIFO flow analyzed above.

Because users register an Informer simply by supplying a target type and a constructor, SharedInformerFactory achieves a highly cohesive, loosely coupled design.
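The registration step can be pictured as a map keyed by the object's type. The sketch below is a simplified illustration, assuming a single goroutine and hypothetical types (pod, informer, factory); the real factory also takes a lock, applies resync settings, and tracks which informers have been started.

```go
package main

import (
	"fmt"
	"reflect"
)

// pod and informer are hypothetical stand-ins, not client-go types.
type pod struct{}

type informer struct{ kind string }

// factory keeps at most one informer per object type, which is what
// makes the informer "shared" among all callers.
type factory struct {
	informers map[reflect.Type]*informer
}

func newFactory() *factory {
	return &factory{informers: map[reflect.Type]*informer{}}
}

// informerFor returns the informer already registered for obj's type,
// or builds one with newFunc and registers it.
func (f *factory) informerFor(obj interface{}, newFunc func() *informer) *informer {
	t := reflect.TypeOf(obj)
	if inf, ok := f.informers[t]; ok {
		return inf
	}
	inf := newFunc()
	f.informers[t] = inf
	return inf
}

func main() {
	f := newFactory()
	built := 0
	newPodInformer := func() *informer {
		built++
		return &informer{kind: "Pod"}
	}

	a := f.informerFor(&pod{}, newPodInformer)
	b := f.informerFor(&pod{}, newPodInformer)

	// Both callers get the same instance, and the constructor ran once.
	fmt.Println(a == b, built)
}
```

Sharing one informer per type means that several controllers interested in pods reuse a single List/Watch connection and a single local cache.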

Callback processorListener

All listeners are implemented by processorListener and are kept in two groups, listeners and syncingListeners. distribute traverses the listeners in the relevant group and hands the data to each processorListener for processing.

Because each listener may set a different resyncPeriod, the two groups serve different purposes: listeners holds every registered listener, while syncingListeners holds those that are due for a resync, and a listener with no resync configured (resyncPeriod = 0) never takes part in a resync round. If resyncPeriod is set in more than one place (sharedIndexInformer.resyncCheckPeriod, sharedIndexInformer.AddEventHandlerWithResyncPeriod), the values are reconciled, and a period below minimumResyncPeriod is raised to that minimum.

// staging/src/k8s.io/client-go/tools/cache/shared_informer.go
func (p *sharedProcessor) distribute(obj interface{}, sync bool) {
    p.listenersLock.RLock()
    defer p.listenersLock.RUnlock()

    if sync {
        for _, listener := range p.syncingListeners {
            listener.add(obj)
        }
    } else {
        for _, listener := range p.listeners {
            listener.add(obj)
        }
    }
}

The code shows that processorListener makes clever use of two channels (addCh, nextCh) and pendingNotifications (a growing ring buffer implemented on a slice) for buffering, with a default initialBufferSize = 1024. This gives efficient data transfer without blocking upstream or downstream processing, and is worth learning from.
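The two-channel pattern can be reproduced in a few lines. The following sketch is an approximation rather than the processorListener code: relay is a hypothetical function, the pending buffer is a plain slice instead of a ring buffer, and, unlike a faithful copy, it drains pending values after the input closes so that the example terminates deterministically.

```go
package main

import "fmt"

// relay forwards values from in to out without ever blocking the sender:
// values the receiver has not taken yet wait in an unbounded pending slice.
func relay(in <-chan string, out chan<- string) {
	defer close(out)
	var pending []string
	for in != nil || len(pending) > 0 {
		// A nil channel disables its select case, so the send case is
		// only active while there is something to send.
		var sendCh chan<- string
		var next string
		if len(pending) > 0 {
			sendCh = out
			next = pending[0]
		}
		select {
		case sendCh <- next:
			pending = pending[1:]
		case v, ok := <-in:
			if !ok {
				in = nil // input closed: stop receiving, keep draining
				continue
			}
			pending = append(pending, v)
		}
	}
}

func main() {
	in := make(chan string)
	out := make(chan string)
	go relay(in, out)

	// The producer finishes all sends before anyone reads from out.
	for i := 0; i < 3; i++ {
		in <- fmt.Sprintf("event-%d", i)
	}
	close(in)

	for v := range out {
		fmt.Println(v)
	}
}
```

The producer only ever waits for the relay goroutine to accept a value, never for the consumer, which is the property that keeps a slow handler from stalling distribution to other listeners.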

(K8s-processorListener)

workqueue gets busy

The processorListener callback from the previous step hands each notification to the registered ResourceEventHandler for the actual create, update, and delete (CUD) processing, calling the registered OnAdd/OnUpdate/OnDelete functions respectively.

To process quickly without blocking the processorListener callback, a workqueue is generally used to decouple the work asynchronously. It is structured as follows:

(K8s-workqueue)

As the figure shows, workqueue.RateLimitingInterface embeds DelayingInterface, and DelayingInterface embeds Interface. The final implementation is rateLimitingType, which provides three core capabilities: rate limiting (rateLimit), delayed enqueue (delay, implemented with a min-heap priority queue), and queue processing (queue).

The code also shows that K8s implements three RateLimiters: BucketRateLimiter, ItemExponentialFailureRateLimiter, and ItemFastSlowRateLimiter. By default, the Controller combines the first two as follows:

// staging/src/k8s.io/client-go/util/workqueue/default_rate_limiters.go
func DefaultControllerRateLimiter() RateLimiter {
    return NewMaxOfRateLimiter(
        NewItemExponentialFailureRateLimiter(5*time.Millisecond, 1000*time.Second),
        // 10 qps, 100 bucket size.  This is only for retry speed and its only the overall factor (not per item)
        &BucketRateLimiter{Limiter: rate.NewLimiter(rate.Limit(10), 100)},
    )
}

On the user side, calling the workqueue methods then allows flexible queue handling, such as deciding whether to retry after a failure, controlling the delay before re-enqueue, and limiting queue throughput (QPS), so that the logic runs asynchronously without blocking.

Summary

This article analyzed the Informer implementation in K8s through its components: Reflector, DeltaFIFO (delta queue), Indexer, Controller, SharedInformer, processorListener, and workqueue (work queue). It walked through the related flows with source code and diagrams to give a clearer picture of how the K8s Informer operates.

To keep the core flow efficient and non-blocking, K8s relies heavily on goroutines, channels, queues, indexes, and map-based deduplication. Its interface design opens many extension points to users, and it names interfaces and implementations in a consistent way. All of this is worth studying in depth and borrowing from.

PS: For more content, see the k8s-club repository on GitHub: https://github.com/k8s-club/k8s-club

