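Kubernetes can use a lock object to elect a leader among multiple instances:

  • the instance that wins the right to update the lock object becomes the leader.

The lock object can be one of:

  • Lease
  • ConfigMap
  • Endpoints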

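controller-runtime is commonly used to build operators. It wraps the leader-election details of client-go. Leader election serves the following scenario:

  • at any given moment only one instance reconciles, while the other instances remain on standby;
  • if the leader instance dies, a standby instance takes over and continues reconciling.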

I. Operator

In operator development, enabling leader election with controller-runtime is very simple: pass the election options when creating the manager:

func main() {
    // enableLeaderElection and metricsAddr are assumed to be package-level variables declared elsewhere.
    flag.BoolVar(&enableLeaderElection, "enableLeaderElection", false, "default false, if enabled the cronHPA would be in primary and standby mode.")
    flag.Parse()
    mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
        LeaderElection:     enableLeaderElection,            // turn leader election on or off
        LeaderElectionID:   "kubernetes-cronhpa-controller", // name of the lock object
        MetricsBindAddress: metricsAddr,
    })
    ...
}

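When multiple instances compete for the lock, the resource object used is a ConfigMap (this is the behaviour of the controller-runtime version examined here; newer releases default to a Lease):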

# kubectl get cm -n kube-system kubernetes-cronhpa-controller -oyaml
apiVersion: v1
kind: ConfigMap
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"kubernetes-cronhpa-controller-6bcbdf9844-d2z4q_c36ac963-3e9c-4e32-ac14-ab48b6f2633b","leaseDurationSeconds":15,"acquireTime":"2023-04-08T16:14:30Z","renewTime":"2023-04-10T07:55:15Z","leaderTransitions":126}'
  creationTimestamp: "2022-10-13T06:20:42Z"
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    ...
    manager: kubernetes-cronhpa-controller
    operation: Update
    time: "2022-10-13T06:20:42Z"
  name: kubernetes-cronhpa-controller
  namespace: kube-system
  resourceVersion: "76431408"
  uid: 5099216c-2fd6-452a-afeb-bd39047c2cbb

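II. controller-runtime

When using the lower-level client-go directly for leader election, the usual steps are:

  • first, create a resourcelock, specifying the resource type of the lock;
  • then, create a leaderElector object and call leaderElector.Run().

controller-runtime does exactly the same.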

1. Creating the resourcelock

The resourcelock is created when the ControllerManager is created:

// controller-runtime/pkg/manager/manager.go
func New(config *rest.Config, options Options) (Manager, error) {
    ...
    resourceLock, err := options.newResourceLock(leaderConfig, recorderProvider, leaderelection.Options{
        LeaderElection:          options.LeaderElection,
        LeaderElectionID:        options.LeaderElectionID,
        LeaderElectionNamespace: options.LeaderElectionNamespace,
    })
    if err != nil {
        return nil, err
    }
    return &controllerManager{
        config:                  config,
        ...
        resourceLock:            resourceLock,
    }, nil
}

Details of creating the resourcelock:

  • as the code shows, the resourcelock created here is of the ConfigMap type.

// controller-runtime/pkg/leaderelection/leader_election.go
func NewResourceLock(config *rest.Config, recorderProvider recorder.Provider, options Options) (resourcelock.Interface, error) {
    ...
    return resourcelock.New(resourcelock.ConfigMapsResourceLock,
        options.LeaderElectionNamespace,
        options.LeaderElectionID,
        client.CoreV1(),
        client.CoordinationV1(),
        resourcelock.ResourceLockConfig{
            Identity:      id,
            EventRecorder: recorderProvider.GetEventRecorderFor(id),
        })
}

2. Creating the LeaderElector object and calling LeaderElector.Run()

The starting point is ControllerManager.Start():

// controller-runtime/pkg/manager/internal.go
func (cm *controllerManager) Start(stop <-chan struct{}) (err error) {
    ...
    go func() {
        if cm.resourceLock != nil {        // leader election is enabled
            err := cm.startLeaderElection()
            if err != nil {
                cm.errChan <- err
            }
        } else {                        // leader election is disabled
            // Treat not having leader election enabled the same as being elected.
            close(cm.elected)
            go cm.startLeaderElectionRunnables()
        }
    }()
    select {
    case <-stop:
        // We are done
        return nil
    case err := <-cm.errChan:
        // Error starting or running a runnable
        return err
    }
}

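If leader election is enabled:

  • a LeaderElector object is created;
  • LeaderElector.Run() is called;
  • once the election is won, startLeaderElectionRunnables() is executed; the runnables here are what ultimately run Reconcile().

// controller-runtime/pkg/manager/internal.go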
func (cm *controllerManager) startLeaderElection() (err error) {
    ...
    l, err := leaderelection.NewLeaderElector(leaderelection.LeaderElectionConfig{
        Lock:          cm.resourceLock,
        LeaseDuration: cm.leaseDuration,
        RenewDeadline: cm.renewDeadline,
        RetryPeriod:   cm.retryPeriod,
        Callbacks: leaderelection.LeaderCallbacks{
            OnStartedLeading: func(_ context.Context) {
                close(cm.elected)
                cm.startLeaderElectionRunnables()    // once elected, start the runnables that run Reconcile()
            },
            OnStoppedLeading: cm.onStoppedLeading,
        },
    })
    ...
    // Start the leader elector process
    go l.Run(ctx)
    return nil
}

If leader election is disabled, startLeaderElectionRunnables() is executed directly, so Reconcile() runs right away.



a朋