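In the article "The Evolution of Multi-Cluster Support in Istio 1.8", we introduced four Istio multi-cluster deployment models and briefly walked through the deployment steps for the single-network Primary-Remote model. Today we look at the source code to see how Istio supports multi-cluster mode.

The discussion has two parts, the istioctl commands and the Pilot-discovery source code, and is based on Istio 1.8.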

Istioctl commands

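Istioctl provides several commands for multi-cluster support. The code lives under istioctl/pkg/multicluster and includes the following subcommands:

  • apply: updates the clusters in a multi-cluster mesh based on the mesh topology
  • describe: describes the status of the multi-cluster mesh's control plane
  • generate: generates a cluster-specific control plane configuration from the mesh description and runtime state

You can get help for these three commands with -h: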

$ istioctl x multicluster -h
Commands to assist in managing a multi-cluster mesh [Deprecated, it will be removed in Istio 1.9]

Usage:
  istioctl experimental multicluster [command]

Aliases:
  multicluster, mc

Available Commands:
  apply       Update clusters in a multi-cluster mesh based on mesh topology
  describe    Describe status of the multi-cluster mesh's control plane'
  generate    generate a cluster-specific control plane configuration based on the mesh description and runtime state

Flags:
  -h, --help   help for multicluster

Global Flags:
      --context string          The name of the kubeconfig context to use
  -i, --istioNamespace string   Istio system namespace (default "istio-system")
  -c, --kubeconfig string       Kubernetes configuration file
  -n, --namespace string        Config namespace

Use "istioctl experimental multicluster [command] --help" for more information about a command.
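  • create-remote-secret: creates a secret with credentials that allow Istio to access a remote Kubernetes API server.

For example, when deploying a multi-cluster model we always run the following command (in this demo the remote cluster is named sgt-base-sg1-prod):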

istioctl x create-remote-secret \
    --context="${CTX_REMOTE}" \
    --name=sgt-base-sg1-prod | \
    kubectl apply -f - --context="${CTX_CONTROL}"

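The command has two parts:

  • Operations on the remote cluster: it creates the istio-system namespace in the remote cluster, creates two service accounts in that namespace, istio-reader-service-account and istiod-service-account, together with the RBAC authorization for both, and on success returns the Secret that the control cluster needs.
  • Applying the Secret returned by the previous step to the control cluster.

Here is the Secret returned in practice: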

# This file is autogenerated, do not edit.
apiVersion: v1
kind: Secret
metadata:
  annotations:
    networking.istio.io/cluster: sgt-base-sg1-prod
  creationTimestamp: null
  labels:
    istio/multiCluster: "true"
  name: istio-remote-secret-sgt-base-sg1-prod
  namespace: istio-system
stringData:
  sgt-base-sg1-prod: |
    apiVersion: v1
    clusters:
    - cluster:
        certificate-authority-data: LS0tLS1CRUdJTiBDRVJ0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJd01Ea3lNekF5TlRJME0xb1hEVE13TURreU1UQXlOVEkwTTFvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTWFJCk5DcW1McGNjTENGNDNqTDZET1phNnhUMU5kbm9yNkpWR0w5a0FNNGMzVDZDZ1ZYOUpDbGxxdmVDQkRMclgremEKcGQwZ1orNFZqZUtHWk9jdklnc3p2dDV4TTJoWDBBZ1BQMFFDNnl2bnc5VXBrOHBNcDFLVkV1L3pUSXFPTlplcAp0NmlGcjIya1dUaWgwYmhIeDQwc3JoQXZjWXM2NStlb240QmhBYTBGR1dreWM4dUZqRmRnT2hYS3hzd01EdkRiCmUzenlMc3ZOb2NvT3V1U2JrR3hUNmtKeGhmdHI4dEZnWGllM2dYSFJnSitQUUN6UElCM1JZdEsxMGdROHB6T1UKOTAwb3p0TlllZGg4MUhZcjZSV0ZDb1FBMXlpN2xEL3BUWlo4UnRkZTZQWmt0bStFNnJkaEI2a0ZkZmFtY3U4MgptamlQZGxmYWVrSXFCTGxoa1NFQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFHUHEzVllkWmFJZFdOMDk5OW5TV1RIV0E0VkYKMzROZ1pEVEdHY3QvWUpNWmZGRnVnSjlqRVBMdTZiSklrZFVHcHNCbkhvNUFsTHJZTjU2dnFkL0MrVTlOc2R2NwpnQ0FBTlNDMVArYktUZmVmWGpQd1dhY0R2RCtTZWIrTHhGUmF3NWZyNDZJNEtTRE12RUZ0T3JaRmhWL3AvQkF5ClZJT01GMDF3aCtOa045OVlWMUZ0S1pLRnd6WGVaM3N3TXBCek50a2daYzlDMjhvdlR5TGNFT05ucGk0dDRmc28KSGpYdkJubUVvak5UcmZtL3F3M1l6Y3dBNXUzekRoRlFkTU5PWlFWVk1EVmhzOFZBOXhyRk1iUFhCSWRiZmZRSApva3QvWkJ0WHRwQm9qaGZmYlJSR0pRQTBFbTk0WTRGNEhhSlFMM2QwMGRoSy9mL1Fiak5BUVhFVFhqRT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=
        server: https://88876557684F299B0ED2.xxx.ap-southeast-1.eks.amazonaws.com
      name: sgt-base-sg1-prod
    contexts:
    - context:
        cluster: sgt-base-sg1-prod
        user: sgt-base-sg1-prod
      name: sgt-base-sg1-prod
    current-context: sgt-base-sg1-prod
    kind: Config
    preferences: {}
    users:
    - name: sgt-base-sg1-prod
      user:
        token: eyJhbGciOiJSUzI1NirbWFQRjFVeVI3WlZ2Qk9YQ0Qzb2FINl9xMkE5X0MzbXEwb2hVWFVnZjgifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJpc3Rpby1zeXN0ZW0iLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlY3JldC5uYW1lIjoiaXN0aW8tcmVhZGVyLXNlcnZpY2UtYWNjb3VudC10b2tlbi1wdHFmOSIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VydmljZS1hY2NvdW50Lm5hbWUiOiJpc3Rpby1yZWFkZXItc2VydmljZS1hY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiMzJlZDQwYzktNGNmNC00Y2EwLWI1YzYtZThhZTczNjFlMDI2Iiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50OmlzdGlvLXN5c3RlbTppc3Rpby1yZWFkZXItc2VydmljZS1hY2NvdW50In0.WbtOZc0390Yq147gvOFdWsaxhEwAC7vaNzhKtlKIf9JXRIGZhkt91zPU_fJLGAlMlj9RSc5QMzQokLSvA_69fGlXnZpdiPvVBrmWJtOQ_tUNJCAL-MfBerZ1y7Kp6Itaw3j1t2M2Ksj5h1SuqfWdiBbNAwb5ehyVJoGpAxppSGdrLGbMWHH1iZCCz6T3WnPPmMfFktcgFDJYlHuuwRaIsuNgD-nUOrUM7-PQiv2sOGVy8EYbObl9AvcvlklZz5KSHfk6GkJ_RYYObFpy-M8ZOYEA2lTpeg5Wer65nlOXo_FYUQ1It4jsZdsuj9cctIQautT6ExhrG30oAhpamzKs8A

We can see that the secret is marked with the label istio/multiCluster: "true". The pilot-discovery code discussed later processes secrets that carry this label.
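To make the role of the label concrete, here is a minimal sketch of selecting the labelled secrets and reading one kubeconfig per cluster ID from their data keys. The `secret` type and the `remoteKubeconfigs` function are hypothetical stand-ins written for this example; they are not Istio or Kubernetes types.

```go
package main

import "fmt"

// multiClusterSecretLabel is the label shown on the generated secret above.
const multiClusterSecretLabel = "istio/multiCluster"

// secret is a hypothetical, stripped-down stand-in for a Kubernetes Secret.
type secret struct {
	Name   string
	Labels map[string]string
	Data   map[string][]byte // key: cluster ID, value: kubeconfig bytes
}

// remoteKubeconfigs returns the kubeconfig of every cluster described by
// secrets that carry the multi-cluster label; other secrets are ignored.
func remoteKubeconfigs(secrets []secret) map[string][]byte {
	out := make(map[string][]byte)
	for _, s := range secrets {
		if s.Labels[multiClusterSecretLabel] != "true" {
			continue
		}
		for clusterID, kubeconfig := range s.Data {
			out[clusterID] = kubeconfig
		}
	}
	return out
}

func main() {
	secrets := []secret{
		{
			Name:   "istio-remote-secret-sgt-base-sg1-prod",
			Labels: map[string]string{multiClusterSecretLabel: "true"},
			Data:   map[string][]byte{"sgt-base-sg1-prod": []byte("apiVersion: v1\nkind: Config\n")},
		},
		// No multi-cluster label, so this secret is skipped.
		{Name: "unrelated", Data: map[string][]byte{"tls.crt": []byte("...")}},
	}
	for id, cfg := range remoteKubeconfigs(secrets) {
		fmt.Printf("cluster %s: %d bytes of kubeconfig\n", id, len(cfg))
	}
}
```

The data key doubles as the cluster ID, which matches the sgt-base-sg1-prod key under stringData in the secret above.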

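Why is creating this secret important?

  • It lets the control plane authenticate connection requests from workloads running in the remote cluster. Without API server access, the control plane rejects those requests.
  • It enables discovery of the service endpoints running in the remote cluster.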

Pilot-discovery

Pilot-discovery's server struct contains a multicluster object, defined as follows:

type Multicluster struct {
    WatchedNamespaces string
    DomainSuffix      string
    ResyncPeriod      time.Duration
    serviceController *aggregate.Controller
    XDSUpdater        model.XDSUpdater
    metrics           model.Metrics
    endpointMode      EndpointMode

    m                     sync.Mutex // protects remoteKubeControllers
    remoteKubeControllers map[string]*kubeController
    networksWatcher       mesh.NetworksWatcher

    // fetchCaRoot maps the certificate name to the certificate
    fetchCaRoot      func() map[string]string
    caBundlePath     string
    systemNamespace  string
    secretNamespace  string
    secretController *secretcontroller.Controller
    syncInterval     time.Duration
}

It holds the remote kube controllers and the multi-cluster specific attributes.

The object is instantiated while the pilot-discovery component bootstraps.

if err := s.initClusterRegistries(args); err != nil {
    return nil, fmt.Errorf("error initializing cluster registries: %v", err)
}

Based on the RegistryOptions passed in, it starts the secret controller to watch remote clusters and initializes the multi-cluster structure.

func (s *Server) initClusterRegistries(args *PilotArgs) (err error) {
    if hasKubeRegistry(args.RegistryOptions.Registries) {
        log.Info("initializing Kubernetes cluster registry")
        mc, err := controller.NewMulticluster(s.kubeClient,
            args.RegistryOptions.ClusterRegistriesNamespace,
            args.RegistryOptions.KubeOptions,
            s.ServiceController(),
            s.XDSServer,
            s.environment)

        if err != nil {
            log.Info("Unable to create new Multicluster object")
            return err
        }

        s.multicluster = mc
    }
    return nil
}

The core of this method is NewMulticluster:

func NewMulticluster(kc kubernetes.Interface, secretNamespace string, opts Options,
    serviceController *aggregate.Controller, xds model.XDSUpdater, networksWatcher mesh.NetworksWatcher) (*Multicluster, error) {

    remoteKubeController := make(map[string]*kubeController)
    if opts.ResyncPeriod == 0 {
        // make sure a resync time of 0 wasn't passed in.
        opts.ResyncPeriod = 30 * time.Second
        log.Info("Resync time was configured to 0, resetting to 30")
    }
    mc := &Multicluster{
        WatchedNamespaces:     opts.WatchedNamespaces,
        DomainSuffix:          opts.DomainSuffix,
        ResyncPeriod:          opts.ResyncPeriod,
        serviceController:     serviceController,
        XDSUpdater:            xds,
        remoteKubeControllers: remoteKubeController,
        networksWatcher:       networksWatcher,
        metrics:               opts.Metrics,
        fetchCaRoot:           opts.FetchCaRoot,
        caBundlePath:          opts.CABundlePath,
        systemNamespace:       opts.SystemNamespace,
        secretNamespace:       secretNamespace,
        endpointMode:          opts.EndpointMode,
        syncInterval:          opts.GetSyncInterval(),
    }
    mc.initSecretController(kc)

    return mc, nil
}

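The Multicluster struct implements three main methods:

  • AddMemberCluster: the callback invoked when a remote cluster is added. It sets up all the handlers needed to watch resources that are added, deleted, or changed on the remote cluster.
  • DeleteMemberCluster: the callback invoked when a remote cluster is deleted, that is, when the remote cluster is no longer part of the mesh. It also clears the cache to remove the remote cluster's resources.
  • UpdateMemberCluster: runs DeleteMemberCluster first and then AddMemberCluster.

These three methods are passed to the secret controller held by the Multicluster object.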

func (m *Multicluster) initSecretController(kc kubernetes.Interface) {
    m.secretController = secretcontroller.StartSecretController(kc,
        m.AddMemberCluster,
        m.UpdateMemberCluster,
        m.DeleteMemberCluster,
        m.secretNamespace,
        m.syncInterval)
}
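As a rough illustration of how a controller might decide which of the three callbacks to fire, the sketch below compares the clusters it saw last time with the clusters currently described by secrets. The `secretWatcher` type and its `reconcile` method are assumptions made for this example and are not the implementation of Istio's secretcontroller package.

```go
package main

import (
	"bytes"
	"fmt"
)

// callbacks mirrors the add/update/delete trio handed to the secret controller.
type callbacks struct {
	add    func(clusterID string, kubeconfig []byte)
	update func(clusterID string, kubeconfig []byte)
	remove func(clusterID string)
}

// secretWatcher is a hypothetical reconciler, not Istio's secretcontroller.
type secretWatcher struct {
	cb    callbacks
	known map[string][]byte // cluster ID -> kubeconfig seen last time
}

func newSecretWatcher(cb callbacks) *secretWatcher {
	return &secretWatcher{cb: cb, known: make(map[string][]byte)}
}

// reconcile receives the clusters currently described by labelled secrets and
// fires one callback for each cluster whose state changed since the last call.
func (w *secretWatcher) reconcile(desired map[string][]byte) {
	for id, cfg := range desired {
		old, seen := w.known[id]
		switch {
		case !seen:
			w.cb.add(id, cfg)
		case !bytes.Equal(old, cfg):
			w.cb.update(id, cfg)
		default:
			continue // unchanged: nothing to do
		}
		w.known[id] = cfg
	}
	for id := range w.known {
		if _, ok := desired[id]; !ok {
			w.cb.remove(id)
			delete(w.known, id)
		}
	}
}

func main() {
	w := newSecretWatcher(callbacks{
		add:    func(id string, _ []byte) { fmt.Println("add", id) },
		update: func(id string, _ []byte) { fmt.Println("update", id) },
		remove: func(id string) { fmt.Println("delete", id) },
	})
	w.reconcile(map[string][]byte{"sgt-base-sg1-prod": []byte("kubeconfig-v1")})
	w.reconcile(map[string][]byte{"sgt-base-sg1-prod": []byte("kubeconfig-v2")})
	w.reconcile(map[string][]byte{})
}
```

Passing the callbacks in as plain functions keeps the watcher unaware of what adding or removing a cluster involves, which is the same separation the initSecretController call above relies on.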

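The secret controller watches for secret changes, but it does not act on every secret. Only when a secret carries the istio/multiCluster: "true" label, which indicates that it represents a remote cluster, does the controller act, and the action is one of the three methods described above.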

secretsInformer := cache.NewSharedIndexInformer(
    &cache.ListWatch{
        ListFunc: func(opts meta_v1.ListOptions) (runtime.Object, error) {
            opts.LabelSelector = MultiClusterSecretLabel + "=true"
            return kubeclientset.CoreV1().Secrets(namespace).List(context.TODO(), opts)
        },
        WatchFunc: func(opts meta_v1.ListOptions) (watch.Interface, error) {
            opts.LabelSelector = MultiClusterSecretLabel + "=true"
            return kubeclientset.CoreV1().Secrets(namespace).Watch(context.TODO(), opts)
        },
    },
    &corev1.Secret{}, 0, cache.Indexers{},
)

This is how Istio discovers member clusters automatically.

So what does Istio do once it has discovered a remote cluster?

The Multicluster object contains two core objects: remoteKubeControllers and serviceController.

remoteKubeControllers is a map[string]*kubeController. The key is the remote cluster ID and the value is a pointer to a kubeController.

type kubeController struct {
    *Controller
    stopCh chan struct{}
}

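A kubeController can fetch the remote cluster's Service, Pod, and Node information, among other resources, and converts it into Istio's internal model objects.

serviceController is an aggregate.Controller. It aggregates data from the different registries and watches for changes, playing the role of the service registry we use in everyday work.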

type Controller struct {
    registries []serviceregistry.Instance
    storeLock  sync.RWMutex
    meshHolder mesh.Holder
}
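The aggregation idea can be sketched as a list of per-cluster registries guarded by a read-write lock, with a query that merges their contents. The `registry` and `aggregate` types below are simplified assumptions for illustration; the real aggregate.Controller works with serviceregistry.Instance values and Istio's service model rather than plain strings.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// registry is a hypothetical per-cluster registry holding service hostnames.
type registry struct {
	clusterID string
	services  []string
}

// aggregate merges several registries behind one query interface.
type aggregate struct {
	mu         sync.RWMutex // protects registries
	registries []registry
}

func (a *aggregate) addRegistry(r registry) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.registries = append(a.registries, r)
}

func (a *aggregate) deleteRegistry(clusterID string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	kept := a.registries[:0]
	for _, r := range a.registries {
		if r.clusterID != clusterID {
			kept = append(kept, r)
		}
	}
	a.registries = kept
}

// services returns the sorted, de-duplicated union of all hostnames.
func (a *aggregate) services() []string {
	a.mu.RLock()
	defer a.mu.RUnlock()
	seen := make(map[string]bool)
	var out []string
	for _, r := range a.registries {
		for _, s := range r.services {
			if !seen[s] {
				seen[s] = true
				out = append(out, s)
			}
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	agg := &aggregate{}
	agg.addRegistry(registry{clusterID: "control", services: []string{"reviews.default.svc.cluster.local"}})
	agg.addRegistry(registry{clusterID: "sgt-base-sg1-prod", services: []string{"ratings.default.svc.cluster.local"}})
	fmt.Println(agg.services())

	agg.deleteRegistry("sgt-base-sg1-prod")
	fmt.Println(agg.services())
}
```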

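When a cluster is added, AddMemberCluster instantiates a kubeController for the new cluster and adds it to remoteKubeControllers. It starts a goroutine to run the controller and then registers the remote cluster with serviceController, at which point the control cluster begins discovering resources in the remote cluster.

When a remote cluster is deleted, DeleteMemberCluster removes the target cluster's kubeController from remoteKubeControllers, signals the goroutine running the controller to exit, and then deregisters the remote cluster from serviceController.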
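The add and delete lifecycle can be sketched with a mutex-protected map, one goroutine per cluster, and a stop channel. Everything here is a simplified assumption: `remoteController` stands in for kubeController, its `done` channel exists only so the example can wait for the goroutine to exit, and registration with serviceController is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// remoteController is a hypothetical stand-in for kubeController.
type remoteController struct {
	clusterID string
	stopCh    chan struct{} // closed to ask run to exit
	done      chan struct{} // closed by run on exit; added for this sketch only
}

// run blocks until stopCh is closed; a real controller would run informers.
func (c *remoteController) run() {
	defer close(c.done)
	<-c.stopCh
}

type multicluster struct {
	mu      sync.Mutex // protects remotes
	remotes map[string]*remoteController
}

func newMulticluster() *multicluster {
	return &multicluster{remotes: make(map[string]*remoteController)}
}

// addMemberCluster records a controller for the cluster and starts it.
func (m *multicluster) addMemberCluster(clusterID string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if _, exists := m.remotes[clusterID]; exists {
		return // already a member; avoid leaking a second goroutine
	}
	c := &remoteController{
		clusterID: clusterID,
		stopCh:    make(chan struct{}),
		done:      make(chan struct{}),
	}
	m.remotes[clusterID] = c
	go c.run()
}

// deleteMemberCluster forgets the cluster and stops its goroutine.
func (m *multicluster) deleteMemberCluster(clusterID string) {
	m.mu.Lock()
	c, ok := m.remotes[clusterID]
	delete(m.remotes, clusterID)
	m.mu.Unlock()
	if !ok {
		return
	}
	close(c.stopCh)
	<-c.done // wait until the controller goroutine has exited
}

// updateMemberCluster is a delete followed by an add, as described above.
func (m *multicluster) updateMemberCluster(clusterID string) {
	m.deleteMemberCluster(clusterID)
	m.addMemberCluster(clusterID)
}

func (m *multicluster) memberCount() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	return len(m.remotes)
}

func main() {
	mc := newMulticluster()
	mc.addMemberCluster("sgt-base-sg1-prod")
	fmt.Println("members after add:", mc.memberCount())
	mc.updateMemberCluster("sgt-base-sg1-prod")
	fmt.Println("members after update:", mc.memberCount())
	mc.deleteMemberCluster("sgt-base-sg1-prod")
	fmt.Println("members after delete:", mc.memberCount())
}
```

Releasing the lock before waiting on the goroutine keeps other clusters from being blocked while one controller shuts down.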

Summary

This article gave a brief, source-level introduction to Istio's multi-cluster support.

