How CronHPA and HPA coexistence affects scaling in Kubernetes

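I. Background

In Kubernetes, the HPA (HorizontalPodAutoscaler) scales a workload automatically based on its metrics (CPU, memory, and so on), that is, it increases or decreases the number of pods.

CronHPA is a component for scheduled scaling: it scales a workload at fixed times according to Crontab-style expressions. This article uses Aliyun's kubernetes-cronhpa-controller.

An example CronHPA object: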

apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: CronHorizontalPodAutoscaler
metadata:
  name: cronhpa-sample
spec:
   scaleTargetRef:
      apiVersion: apps/v1
      kind: Deployment
      name: nginx            # scale the Deployment "nginx" on a schedule
   jobs:
   - name: "scale-down"        # at second 30 of every minute, scale the Deployment down to 1 replica
     schedule: "30 */1 * * * *"
     targetSize: 1
   - name: "scale-up"        # at second 0 of every minute, scale the Deployment up to 3 replicas
     schedule: "0 */1 * * * *"
     targetSize: 3

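Problem

If a CronHPA and an HPA both operate on the same scaleTargetRef, the replica count gets modified by each of them in turn. For example:

The HPA, based on metrics, wants to scale out to 5 replicas;
The CronHPA, based on its crontab rule, scales out to 3 replicas;
The two controllers work independently, so each scaling operation overwrites the other.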

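II. Solution

kubernetes-cronhpa-controller addresses this problem as follows:

Set the CronHPA's scaleTargetRef to the HPA;
Keep the Deployment as the HPA's target, so that the CronHPA is aware of the HPA's current state.

As a result:

The CronHPA can see the HPA's minReplicas/maxReplicas/desiredReplicas;
Through the HPA's target, the CronHPA knows the Deployment's currentReplicas.

The CronHPA then cooperates with the HPA by adjusting hpa.minReplicas, hpa.maxReplicas, and deploy.replicas.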

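Adjusting the strategy

Building on the kubernetes-cronhpa-controller approach, you can adapt the behavior to your own needs, for example:

Require that the CronHPA never modifies the HPA's parameters;
Require that the CronHPA only scales within [hpa.minReplicas, hpa.maxReplicas] and never goes outside that range.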

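III. Source code

Let's look at how kubernetes-cronhpa-controller implements scaling when a CronHPA and an HPA coexist.

1. Entry point

When a scheduled job runs, the case where the CronHPA's target kind (TargetRef.RefKind) is "HorizontalPodAutoscaler" is handled separately: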

// pkg/controller/cronjob.go
func (ch *CronJobHPA) Run() (msg string, err error) {
    startTime := time.Now()
    times := 0
    for {
        now := time.Now()
        ...
        // hpa compatible
        if ch.TargetRef.RefKind == "HorizontalPodAutoscaler" {    // target=HPA
            msg, err = ch.ScaleHPA()
            if err == nil {
                break
            }
        } else {
            msg, err = ch.ScalePlainRef()    // target=Deploy/Sts
            if err == nil {
                break
            }
        }
        ...
    }
    return msg, err
}

When the CronHPA scales through an HPA, it performs two operations:

Modify the HPA's minReplicas/maxReplicas;
Scale the HPA's target (Deployment/StatefulSet).

The parameters it bases its decision on:

min: hpa.minReplicas
max: hpa.maxReplicas
deploy: hpa.Status.CurrentReplicas
target: cronhpa.job.TargetSize

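2. Scaling logic

What ScaleHPA() does:

Update hpa.minReplicas and hpa.maxReplicas, depending on how the target compares with the parameters listed above;
Scale hpa.targetRef:
- If the replica count of hpa.targetRef >= cronhpa.target, no scaling is needed and the function returns immediately;
- Otherwise, scale hpa.targetRef to cronhpa.target replicas.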

// pkg/controller/cronjob.go
func (ch *CronJobHPA) ScaleHPA() (msg string, err error) {
    ...
    // Fetch the HPA object
    hpa := &autoscalingapi.HorizontalPodAutoscaler{}
    err = ch.client.Get(ctx, types.NamespacedName{Namespace: ch.HPARef.Namespace, Name: ch.TargetRef.RefName}, hpa)
    ...
    mappings, err := ch.mapper.RESTMappings(targetGK)
    // Fetch the HPA's target, i.e. its scale object
    for _, mapping := range mappings {
        targetGR = mapping.Resource.GroupResource()
        scale, err = ch.scaler.Scales(ch.TargetRef.RefNamespace).Get(context.Background(), targetGR, targetRef.Name, v1.GetOptions{})
        if err == nil {
            found = true
            break
        }
    }
    // Adjust hpa.minReplicas/maxReplicas depending on the scenario
    updateHPA := false
    if ch.DesiredSize > hpa.Spec.MaxReplicas {
        hpa.Spec.MaxReplicas = ch.DesiredSize
        updateHPA = true
    }
    if ch.DesiredSize < *hpa.Spec.MinReplicas {
        *hpa.Spec.MinReplicas = ch.DesiredSize
        updateHPA = true
    }
    if hpa.Status.CurrentReplicas == *hpa.Spec.MinReplicas && ch.DesiredSize < hpa.Status.CurrentReplicas {
        *hpa.Spec.MinReplicas = ch.DesiredSize
        updateHPA = true
    }
    if hpa.Status.CurrentReplicas < ch.DesiredSize {
        *hpa.Spec.MinReplicas = ch.DesiredSize
        updateHPA = true
    }
    // Persist the changes to the HPA
    if updateHPA {
        err = ch.client.Update(ctx, hpa)
        if err != nil {
            return "", err
        }
    }
    // If the target does not need scaling, return immediately
    if hpa.Status.CurrentReplicas >= ch.DesiredSize {
        // skip change replicas and exit
        return fmt.Sprintf("Skip scale replicas because HPA %s current replicas:%d >= desired replicas:%d.", hpa.Name, scale.Spec.Replicas, ch.DesiredSize), nil
    }
    // Otherwise, scale the target to the desired replica count
    scale.Spec.Replicas = int32(ch.DesiredSize)
    _, err = ch.scaler.Scales(ch.TargetRef.RefNamespace).Update(context.Background(), targetGR, scale, metav1.UpdateOptions{})
    // Do not swallow a failed scale update
    if err != nil {
        return "", err
    }
    return msg, nil
}

References

1. https://help.aliyun.com/document_detail/151557.html
2. https://github.com/AliyunContainerService/kubernetes-cronhpa-controller

a朋

引用和评论

alertmanager源码：整体架构和流程分析

Jenkins 企业级 CI/CD 实践：安装、配置与 Kubernetes & Docker 集成

k8s集群部署（一主两从）

k8s实战基础

使用kubeadm部署高可用IPV4/IPV6集群---V1.32

centos7使用yum网络安装

基于k3s部署Nginx、MySQL、PHP和Redis的详细教程