Building a Two-Site, Three-Center Disaster Recovery Solution for an Application System with ACK One

Author: Yuhui, Zhuanghuai, Pioneer

Overview

A two-site, three-center architecture deploys three business processing centers across two cities: a production center, a same-city disaster recovery center, and a remote disaster recovery center. Two environments are deployed in one city to form same-city dual centers. They process services at the same time, synchronize data over high-speed links, and can take over from each other. A third environment is deployed in another city as the remote disaster recovery center and is used for data backup. If both same-city centers fail at the same time, the remote disaster recovery center can take over service processing. This architecture goes a long way toward keeping the business running continuously.

The multi-cluster application distribution feature of ACK One helps enterprises manage three K8s clusters in a unified way: applications can be deployed and upgraded quickly across the three clusters, and each cluster can receive a differentiated configuration. Combined with GTM (Global Traffic Management), service traffic is switched automatically among the three K8s clusters when a fault occurs. Data replication at the RDS layer is not covered in detail in this practice; refer to DTS (Data Transmission Service).

Solution Architecture

[Figure: solution architecture diagram]

Prerequisites

Enable the multi-cluster management master instance [1].

Add three K8s clusters to the master instance by managing associated clusters [2] to form the two-site, three-center topology. In this practice, two K8s clusters (cluster1-beijing and cluster2-beijing) are deployed in Beijing, and one K8s cluster (cluster1-hangzhou) is deployed in Hangzhou.

Create a GTM instance [3].

Application deployment

Applications are distributed to the three K8s clusters through the application distribution feature [4] of the ACK One master instance. Compared with traditional script-based deployment, application distribution with ACK One offers the following benefits.

[Figure: benefits of application distribution with ACK One]

In this practice, the sample application is a web application made up of K8s Deployment, Service, Ingress, and ConfigMap resources. The Service and Ingress expose the application, and the Deployment reads configuration parameters from the ConfigMap. Application distribution rules distribute the application to the three K8s clusters (two in Beijing and one in Hangzhou) to form the two-site, three-center topology. During distribution, the Deployment and ConfigMap are configured differently for each cluster to suit its location. The distribution process also includes a manual approval gate for a gradual rollout, which limits the blast radius of errors.

Run the following command to create a namespace named demo.

 kubectl create namespace demo

Create an app-meta.yaml file with the following content.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: web-demo
  name: web-demo
  namespace: demo
spec:
  replicas: 5
  selector:
    matchLabels:
      app: web-demo
  template:
    metadata:
      labels:
        app: web-demo
    spec:
      containers:
      - image: acr-multiple-clusters-registry.cn-hangzhou.cr.aliyuncs.com/ack-multiple-clusters/web-demo:0.4.0
        name: web-demo
        env:
        - name: ENV_NAME
          value: cluster1-beijing
        volumeMounts:
        - name: config-file
          mountPath: "/config-file"
          readOnly: true
      volumes:
      - name: config-file
        configMap:
          items:
          - key: config.json
            path: config.json
          name: web-demo
---
apiVersion: v1
kind: Service
metadata:
  name: web-demo
  namespace: demo
  labels:
    app: web-demo
spec:
  selector:
    app: web-demo
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-demo
  namespace: demo
  labels:
    app: web-demo
spec:
  rules:
    - host: web-demo.example.com
      http:
        paths:
        - path: /
          pathType: Prefix
          backend:
            service:
              name: web-demo
              port:
                number: 80
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: web-demo
  namespace: demo
  labels:
    app: web-demo
data:
  config.json: |
    {
      database-host: "beijing-db.pg.aliyun.com"
    }

Run the following command to deploy the web-demo application on the master instance. Note: K8s resources created on the master instance are not distributed to the sub-clusters by themselves. They serve as source definitions that are referenced by the Application resource in app.yaml.

 kubectl apply -f app-meta.yaml

Create the application distribution rules.

a. Run the following command to list the associated clusters managed by the master instance and identify the distribution targets of the application.

 kubectl amc get managedcluster

Expected output:

 Name                  Alias               HubAccepted
managedcluster-cxxx   cluster1-hangzhou   true
managedcluster-cxxx   cluster2-beijing    true
managedcluster-cxxx   cluster1-beijing    true

b. Create an application distribution rule file named app.yaml with the following content. Replace <managedcluster-cxxx> in the example with the actual names of the clusters to publish to. Best practices for defining distribution rules are described in the comments.

app.yaml contains the following resource types: Policy (type: topology) for distribution targets, Policy (type: override) for differentiation rules, Workflow, and Application. For details, see application replication and distribution [5], differentiated configuration for application distribution [6], and gradual distribution across clusters [7].

apiVersion: core.oam.dev/v1alpha1
kind: Policy
metadata:
  name: cluster1-beijing
  namespace: demo
type: topology
properties:
  clusters: ["<managedcluster-cxxx>"] # Distribution target cluster 1: cluster1-beijing
---
apiVersion: core.oam.dev/v1alpha1
kind: Policy
metadata:
  name: cluster2-beijing
  namespace: demo
type: topology
properties:
  clusters: ["<managedcluster-cxxx>"] # Distribution target cluster 2: cluster2-beijing
---
apiVersion: core.oam.dev/v1alpha1
kind: Policy
metadata:
  name: cluster1-hangzhou
  namespace: demo
type: topology
properties:
  clusters: ["<managedcluster-cxxx>"] # Distribution target cluster 3: cluster1-hangzhou
---
apiVersion: core.oam.dev/v1alpha1
kind: Policy
metadata:
  name: override-env-cluster2-beijing
  namespace: demo
type: override
properties:
  components:
  - name: "deployment"
    traits:
    - type: env
      properties:
        containerName: web-demo
        env:
          ENV_NAME: cluster2-beijing # Override the environment variable of the Deployment for cluster2-beijing
---
apiVersion: core.oam.dev/v1alpha1
kind: Policy
metadata:
  name: override-env-cluster1-hangzhou
  namespace: demo
type: override
properties:
  components:
  - name: "deployment"
    traits:
    - type: env
      properties:
        containerName: web-demo
        env:
          ENV_NAME: cluster1-hangzhou # Override the environment variable of the Deployment for cluster1-hangzhou
---
apiVersion: core.oam.dev/v1alpha1
kind: Policy
metadata:
  name: override-replic-cluster1-hangzhou
  namespace: demo
type: override
properties:
  components:
  - name: "deployment"
    traits:
    - type: scaler
      properties:
        replicas: 1          # Override the replica count of the Deployment for cluster1-hangzhou
---
apiVersion: core.oam.dev/v1alpha1
kind: Policy
metadata:
  name: override-configmap-cluster1-hangzhou
  namespace: demo
type: override
properties:
  components:
  - name: "configmap"
    traits:
    - type: json-merge-patch  # Override the ConfigMap content for cluster1-hangzhou
      properties:
        data:
          config.json: |
            {
              database-address: "hangzhou-db.pg.aliyun.com"
            }
---
apiVersion: core.oam.dev/v1alpha1
kind: Workflow
metadata:
  name: deploy-demo
  namespace: demo
steps:       # Deploy to cluster1-beijing, cluster2-beijing, and cluster1-hangzhou in order.
  - type: deploy
    name: deploy-cluster1-beijing
    properties:
      policies: ["cluster1-beijing"]
  - type: deploy
    name: deploy-cluster2-beijing
    properties:
      auto: false   # Manual approval is required before deploying to cluster2-beijing
      policies: ["override-env-cluster2-beijing", "cluster2-beijing"] # Override the environment variable when deploying to cluster2-beijing
  - type: deploy
    name: deploy-cluster1-hangzhou
    properties:
      policies: ["override-env-cluster1-hangzhou", "override-replic-cluster1-hangzhou", "override-configmap-cluster1-hangzhou", "cluster1-hangzhou"]
      # Override the environment variable, replica count, and ConfigMap when deploying to cluster1-hangzhou
---
apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
  annotations:
    app.oam.dev/publishVersion: version8
  name: web-demo
  namespace: demo
spec:
  components:
    - name: deployment  # Reference the Deployment on its own so it can be overridden per cluster
      type: ref-objects
      properties:
        objects:
          - apiVersion: apps/v1
            kind: Deployment
            name: web-demo
    - name: configmap   # Reference the ConfigMap on its own so it can be overridden per cluster
      type: ref-objects
      properties:
        objects:
          - apiVersion: v1
            kind: ConfigMap
            name: web-demo
    - name: same-resource  # Resources that need no per-cluster overrides
      type: ref-objects
      properties:
        objects:
          - apiVersion: v1
            kind: Service
            name: web-demo
          - apiVersion: networking.k8s.io/v1
            kind: Ingress
            name: web-demo
  workflow:
    ref: deploy-demo
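
To see how the ConfigMap override for cluster1-hangzhou behaves, it can help to model it outside the cluster. The sketch below assumes that the json-merge-patch trait follows JSON Merge Patch (RFC 7396) semantics. The merge_patch helper and the dictionaries are hypothetical illustrations written for this article, not ACK One code.

```python
import copy


def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396) to target and return a new value."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target wholesale.
        return copy.deepcopy(patch)
    result = copy.deepcopy(target) if isinstance(target, dict) else {}
    for key, value in patch.items():
        if value is None:
            result.pop(key, None)  # null removes the key
        else:
            result[key] = merge_patch(result.get(key), value)
    return result


# Simplified stand-in for the ConfigMap in app-meta.yaml (assumption: only
# the fields relevant to the override are modeled here).
base_configmap = {
    "metadata": {"name": "web-demo", "namespace": "demo"},
    "data": {"config.json": '{\n  database-host: "beijing-db.pg.aliyun.com"\n}\n'},
}

# Stand-in for the properties of override-configmap-cluster1-hangzhou.
hangzhou_patch = {
    "data": {"config.json": '{\n  database-address: "hangzhou-db.pg.aliyun.com"\n}\n'},
}

patched = merge_patch(base_configmap, hangzhou_patch)
print(patched["data"]["config.json"])
```

Because the value of config.json is a single string, the patch replaces it as a whole instead of merging keys inside it. Under this assumption, the Hangzhou ConfigMap contains only the database-address entry, while metadata and any other keys are left untouched.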

Execute the following command to deploy the distribution rule app.yaml on the master instance.

 kubectl apply -f app.yaml

View the deployment status of the app.

 kubectl get app web-demo -n demo

Expected output (workflowSuspending means that the deployment workflow is suspended):

 NAME       COMPONENT    TYPE          PHASE                HEALTHY   STATUS   AGE
web-demo   deployment   ref-objects   workflowSuspending   true               47h

View the running status of the application on each cluster

 kubectl amc get deployment web-demo -n demo -m all

Expected output:

 Run on ManagedCluster managedcluster-cxxx (cluster1-hangzhou)
No resources found in demo namespace    # First deployment of the application: the workflow has not started deploying to cluster1-hangzhou yet
Run on ManagedCluster managedcluster-cxxx (cluster2-beijing)
No resources found in demo namespace     # First deployment of the application: the workflow has not started deploying to cluster2-beijing yet and is waiting for manual approval
Run on ManagedCluster managedcluster-cxxx (cluster1-beijing)
NAME       READY   UP-TO-DATE   AVAILABLE   AGE
web-demo   5/5     5            5           47h   # The Deployment is running normally on cluster1-beijing

After the manual review passes, resume the workflow to deploy to cluster2-beijing and cluster1-hangzhou.

 kubectl amc workflow resume web-demo -n demo
Successfully resume workflow: web-demo
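
The manual approval gate can be pictured as a small state machine. The sketch below is a simplified model, not ACK One code: the Step and Workflow classes are hypothetical names, and the behavior (a step with auto set to false suspends the workflow until it is resumed, after which the remaining steps run in order) is inferred from the outputs shown in this article.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    name: str
    auto: bool = True  # False means the step waits for manual approval


@dataclass
class Workflow:
    steps: list
    deployed: list = field(default_factory=list)
    phase: str = "pending"
    _next: int = 0
    _approved: bool = False

    def run(self):
        """Deploy steps in order until finished or until a gated step is reached."""
        while self._next < len(self.steps):
            step = self.steps[self._next]
            if not step.auto and not self._approved:
                self.phase = "workflowSuspending"
                return
            self.deployed.append(step.name)
            self._approved = False  # an approval covers one gated step only
            self._next += 1
        self.phase = "running"

    def resume(self):
        """Approve the gated step that suspended the workflow and continue."""
        if self.phase == "workflowSuspending":
            self._approved = True
            self.run()


wf = Workflow([
    Step("deploy-cluster1-beijing"),
    Step("deploy-cluster2-beijing", auto=False),
    Step("deploy-cluster1-hangzhou"),
])
wf.run()
print(wf.phase, wf.deployed)
wf.resume()
print(wf.phase, wf.deployed)
```

Placing the gate on the second step means a faulty release reaches only cluster1-beijing before a person decides whether to continue.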

View the deployment status of the app.

 kubectl get app web-demo -n demo

Expected output (running means that the application is running normally):

 NAME       COMPONENT    TYPE          PHASE     HEALTHY   STATUS   AGE
web-demo   deployment   ref-objects   running   true               47h

View the running status of the application on each cluster

 kubectl amc get deployment web-demo -n demo -m all

Expected output:

 Run on ManagedCluster managedcluster-cxxx (cluster1-hangzhou)
NAME       READY   UP-TO-DATE   AVAILABLE   AGE
web-demo   1/1     1            1           47h
Run on ManagedCluster managedcluster-cxxx (cluster2-beijing)
NAME       READY   UP-TO-DATE   AVAILABLE   AGE
web-demo   5/5     5            5           2d
Run on ManagedCluster managedcluster-cxxx (cluster1-beijing)
NAME       READY   UP-TO-DATE   AVAILABLE   AGE
web-demo   5/5     5            5           47h

View the Ingress status of the application on each cluster.

 kubectl amc get ingress -n demo -m all

Expected output: the Ingress in each cluster is running normally, and a public IP address has been assigned.

 Run on ManagedCluster managedcluster-cxxx (cluster1-hangzhou)
NAME       CLASS   HOSTS                  ADDRESS           PORTS   AGE
web-demo   nginx   web-demo.example.com   47.xxx.xxx.xxx    80      47h
Run on ManagedCluster managedcluster-cxxx (cluster2-beijing)
NAME       CLASS   HOSTS                  ADDRESS           PORTS   AGE
web-demo   nginx   web-demo.example.com   123.xxx.xxx.xxx   80      2d
Run on ManagedCluster managedcluster-cxxx (cluster1-beijing)
NAME       CLASS   HOSTS                  ADDRESS           PORTS   AGE
web-demo   nginx   web-demo.example.com   182.xxx.xxx.xxx   80      2d

Traffic management

With Global Traffic Management configured, the running status of the application is detected automatically, and traffic is switched automatically to a healthy cluster when an exception occurs.

Configure a Global Traffic Management instance. web-demo.example.com is the domain name of the sample application; replace it with the domain name of your actual application, and point its DNS record to the CNAME access domain name of Global Traffic Management.


In the GTM instance you created, create two address pools:
1. pool-beijing: Contains the Ingress IP addresses of the two Beijing clusters. The load balancing policy is to return all addresses, which balances load between the two Beijing clusters. The Ingress IP addresses can be obtained by running "kubectl amc get ingress -n demo -m all" on the master instance.
2. pool-hangzhou: Contains the Ingress IP address of the Hangzhou cluster.


Enable health checks for the address pools. Addresses that fail the check are removed from the address pool and no longer receive traffic.


Configure an access policy that sets the Beijing address pool as the primary address pool and the Hangzhou address pool as the backup address pool. Under normal conditions, traffic is handled by the applications on the Beijing clusters. When the applications on all Beijing clusters are unavailable, traffic is switched automatically to the application on the Hangzhou cluster.
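
The primary and backup behavior described above can be sketched as a priority-ordered lookup over address pools. This is a simplified model rather than the GTM implementation: the resolve function is a hypothetical name, the IP addresses are documentation placeholders, and real resolution also depends on DNS TTLs and health check intervals that the model ignores.

```python
def resolve(pools, healthy):
    """Return (pool_name, addresses) for the first pool with a healthy address.

    pools:   list of (pool_name, [addresses]) in priority order.
    healthy: set of addresses that currently pass the health check.
    """
    for name, addresses in pools:
        alive = [addr for addr in addresses if addr in healthy]
        if alive:
            # "Return all addresses": every healthy address in the pool is served.
            return name, alive
    return None, []


# Placeholder addresses standing in for the Ingress IPs of each cluster.
CLUSTER1_BEIJING = "192.0.2.1"
CLUSTER2_BEIJING = "192.0.2.2"
CLUSTER1_HANGZHOU = "198.51.100.1"

pools = [
    ("pool-beijing", [CLUSTER1_BEIJING, CLUSTER2_BEIJING]),  # primary
    ("pool-hangzhou", [CLUSTER1_HANGZHOU]),                  # backup
]

all_up = {CLUSTER1_BEIJING, CLUSTER2_BEIJING, CLUSTER1_HANGZHOU}
print(resolve(pools, all_up))
print(resolve(pools, all_up - {CLUSTER1_BEIJING}))
print(resolve(pools, {CLUSTER1_HANGZHOU}))
```

The three calls correspond to the three scenarios checked in the deployment verification: both Beijing clusters healthy, only cluster2-beijing healthy, and both Beijing clusters down.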


Deployment verification

Under normal conditions, all traffic is handled by the applications on the two Beijing clusters, and each cluster handles about 50% of the traffic.

 for i in {1..50}; do curl web-demo.example.com; sleep 3;  done
This is env cluster1-beijing !
Config file is {
  database-host: "beijing-db.pg.aliyun.com"
}


This is env cluster1-beijing !
Config file is {
  database-host: "beijing-db.pg.aliyun.com"
}

This is env cluster2-beijing !
Config file is {
  database-host: "beijing-db.pg.aliyun.com"
}

This is env cluster1-beijing !
Config file is {
  database-host: "beijing-db.pg.aliyun.com"
}

This is env cluster2-beijing !
Config file is {
  database-host: "beijing-db.pg.aliyun.com"
}

This is env cluster2-beijing !
Config file is {
  database-host: "beijing-db.pg.aliyun.com"
}
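
To check the split without counting by eye, the responses can be tallied per cluster. The sketch below assumes the curl loop output has been saved as text; tally_by_cluster is a hypothetical helper, and the sample string is abbreviated from the output above.

```python
import re
from collections import Counter


def tally_by_cluster(output):
    """Count responses per cluster based on the 'This is env <name> !' lines."""
    return Counter(re.findall(r"This is env (\S+) !", output))


# Abbreviated sample of saved curl output (config lines omitted for brevity).
sample = """\
This is env cluster1-beijing !
This is env cluster1-beijing !
This is env cluster2-beijing !
This is env cluster1-beijing !
This is env cluster2-beijing !
This is env cluster2-beijing !
"""

counts = tally_by_cluster(sample)
total = sum(counts.values())
for cluster, n in sorted(counts.items()):
    print(f"{cluster}: {n} of {total} responses")
```

With only a few dozen requests the observed ratio will fluctuate around an even split, so treat small deviations as expected.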

When the application on cluster1-beijing is abnormal, GTM routes all traffic to cluster2-beijing for processing.

 for i in {1..50}; do curl web-demo.example.com; sleep 3;  done
...
<html>
<head><title>503 Service Temporarily Unavailable</title></head>
<body>
<center><h1>503 Service Temporarily Unavailable</h1></center>
<hr><center>nginx</center>
</body>
</html>

This is env cluster2-beijing !
Config file is {
  database-host: "beijing-db.pg.aliyun.com"
}

This is env cluster2-beijing !
Config file is {
  database-host: "beijing-db.pg.aliyun.com"
}

This is env cluster2-beijing !
Config file is {
  database-host: "beijing-db.pg.aliyun.com"
}

This is env cluster2-beijing !
Config file is {
  database-host: "beijing-db.pg.aliyun.com"
}

When the applications on cluster1-beijing and cluster2-beijing are abnormal at the same time, GTM routes the traffic to cluster1-hangzhou for processing.

 for i in {1..50}; do curl web-demo.example.com; sleep 3;  done
<head><title>503 Service Temporarily Unavailable</title></head>
<body>
<center><h1>503 Service Temporarily Unavailable</h1></center>
<hr><center>nginx</center>
</body>
</html>
<html>
<head><title>503 Service Temporarily Unavailable</title></head>
<body>
<center><h1>503 Service Temporarily Unavailable</h1></center>
<hr><center>nginx</center>
</body>
</html>
This is env cluster1-hangzhou !
Config file is {
  database-address: "hangzhou-db.pg.aliyun.com"
}

This is env cluster1-hangzhou !
Config file is {
  database-address: "hangzhou-db.pg.aliyun.com"
}

This is env cluster1-hangzhou !
Config file is {
  database-address: "hangzhou-db.pg.aliyun.com"
}

This is env cluster1-hangzhou !
Config file is {
  database-address: "hangzhou-db.pg.aliyun.com"
}

Summary

This article describes the multi-cluster application distribution feature of ACK One, which helps enterprises manage multi-cluster environments. The multi-cluster master instance provides a unified entry point for application distribution and supports distribution strategies such as multi-cluster distribution, differentiated configuration, and workflow management. Combined with GTM (Global Traffic Management), it lets you quickly build a two-site, three-center application disaster recovery system.

In addition to multi-cluster application distribution, ACK One can connect to and manage Kubernetes clusters in any region and on any infrastructure. It provides consistent management and community-compatible APIs, and supports unified operations, maintenance, and control of computing, networking, storage, security, monitoring, logging, jobs, applications, and traffic. Alibaba Cloud Distributed Cloud Container Platform (ACK One for short) is an enterprise-level cloud-native platform for scenarios such as hybrid cloud, multi-cluster, distributed computing, and disaster recovery. For more information, see the product introduction of the Distributed Cloud Container Platform ACK One [8].
