Understanding Eureka's self-preservation mechanism

Relevant parameters

eureka.instance.leaseRenewalIntervalInSeconds = 30 # heartbeat interval
eureka.server.renewalPercentThreshold = 0.85
eureka.server.renewalThresholdUpdateIntervalMs = 15 * 60 * 1000 # 15mins 

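leaseRenewalIntervalInSeconds

How often a client sends heartbeats

renewalPercentThreshold

The fraction of expected heartbeats below which self-preservation is triggered

renewalThresholdUpdateIntervalMs

How often the heartbeat threshold is recalculated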

Formulas

Number of heartbeats expected from one client instance / min

factor = 60/leaseRenewalIntervalInSeconds

Number of heartbeats expected from N instances / min

factor * N

Minimum expected heartbeat threshold / min

factor * N * renewalPercentThreshold
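The three formulas above can be sketched in Java as follows. This is only an illustration of the intended design, not Eureka code: the class name ThresholdFormula and its method names are made up for this example, and the final (int) cast is assumed to mirror the truncation used in the Eureka source.

```java
// Hypothetical helper that restates the formulas above; not part of Eureka.
final class ThresholdFormula {

    // Heartbeats expected from one client instance per minute (the "factor").
    static double factor(int leaseRenewalIntervalInSeconds) {
        return 60.0 / leaseRenewalIntervalInSeconds;
    }

    // Heartbeats expected from N instances per minute.
    static double expectedPerMinute(int leaseRenewalIntervalInSeconds, int n) {
        return factor(leaseRenewalIntervalInSeconds) * n;
    }

    // Minimum expected heartbeat threshold per minute, truncated to an int.
    static int minThreshold(int leaseRenewalIntervalInSeconds, int n,
                            double renewalPercentThreshold) {
        return (int) (expectedPerMinute(leaseRenewalIntervalInSeconds, n)
                * renewalPercentThreshold);
    }

    public static void main(String[] args) {
        // 3 instances, 30 second heartbeats, default 0.85 threshold
        System.out.println(minThreshold(30, 3, 0.85));
        // 16 instances, 10 second heartbeats
        System.out.println(minThreshold(10, 16, 0.85));
    }
}
```

With a 30 second interval the factor is 2, which is the value the source code hard-codes.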

Source code

this.expectedNumberOfRenewsPerMin = N * 2;
this.numberOfRenewsPerMinThreshold = (int) (this.expectedNumberOfRenewsPerMin * serverConfig.getRenewalPercentThreshold());

At Eureka startup

eureka-core-1.4.12-sources.jar!/com/netflix/eureka/registry/PeerAwareInstanceRegistryImpl.java

@Override
    public void openForTraffic(ApplicationInfoManager applicationInfoManager, int count) {
        // Renewals happen every 30 seconds and for a minute it should be a factor of 2.
        this.expectedNumberOfRenewsPerMin = count * 2;
        this.numberOfRenewsPerMinThreshold =
                (int) (this.expectedNumberOfRenewsPerMin * serverConfig.getRenewalPercentThreshold());
        logger.info("Got " + count + " instances from neighboring DS node");
        logger.info("Renew threshold is: " + numberOfRenewsPerMinThreshold);
        this.startupTime = System.currentTimeMillis();
        if (count > 0) {
            this.peerInstancesTransferEmptyOnStartup = false;
        }
        DataCenterInfo.Name selfName = applicationInfoManager.getInfo().getDataCenterInfo().getName();
        boolean isAws = Name.Amazon == selfName;
        if (isAws && serverConfig.shouldPrimeAwsReplicaConnections()) {
            logger.info("Priming AWS connections for all replicas..");
            primeAwsReplicas(applicationInfoManager);
        }
        logger.info("Changing status to UP");
        applicationInfoManager.setInstanceStatus(InstanceStatus.UP);
        super.postInit();
    }

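count is the number of Eureka instances (the log line reports it as the instances obtained from the neighboring node). With one registry, expectedNumberOfRenewsPerMin is 2 (heartbeats are sent every 30 seconds, so each client is expected to send 2 heartbeats per minute), and numberOfRenewsPerMinThreshold = (int) (2 * 0.85) = (int) 1.7 = 1.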

When a client registers

eureka-core-1.4.12-sources.jar!/com/netflix/eureka/registry/AbstractInstanceRegistry.java

public void register(InstanceInfo registrant, int leaseDuration, boolean isReplication) {
        try {
            read.lock();
            Map<String, Lease<InstanceInfo>> gMap = registry.get(registrant.getAppName());
            REGISTER.increment(isReplication);
            if (gMap == null) {
                final ConcurrentHashMap<String, Lease<InstanceInfo>> gNewMap = new ConcurrentHashMap<String, Lease<InstanceInfo>>();
                gMap = registry.putIfAbsent(registrant.getAppName(), gNewMap);
                if (gMap == null) {
                    gMap = gNewMap;
                }
            }
            Lease<InstanceInfo> existingLease = gMap.get(registrant.getId());
            // Retain the last dirty timestamp without overwriting it, if there is already a lease
            if (existingLease != null && (existingLease.getHolder() != null)) {
                Long existingLastDirtyTimestamp = existingLease.getHolder().getLastDirtyTimestamp();
                Long registrationLastDirtyTimestamp = registrant.getLastDirtyTimestamp();
                logger.debug("Existing lease found (existing={}, provided={}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);
                if (existingLastDirtyTimestamp > registrationLastDirtyTimestamp) {
                    logger.warn("There is an existing lease and the existing lease's dirty timestamp {} is greater" +
                            " than the one that is being registered {}", existingLastDirtyTimestamp, registrationLastDirtyTimestamp);
                    logger.warn("Using the existing instanceInfo instead of the new instanceInfo as the registrant");
                    registrant = existingLease.getHolder();
                }
            } else {
                // The lease does not exist and hence it is a new registration
                synchronized (lock) {
                    if (this.expectedNumberOfRenewsPerMin > 0) {
                        // Since the client wants to cancel it, reduce the threshold
                        // (1
                        // for 30 seconds, 2 for a minute)
                        this.expectedNumberOfRenewsPerMin = this.expectedNumberOfRenewsPerMin + 2;
                        this.numberOfRenewsPerMinThreshold =
                                (int) (this.expectedNumberOfRenewsPerMin * serverConfig.getRenewalPercentThreshold());
                    }
                }
                logger.debug("No previous lease information found; it is new registration");
            }
            Lease<InstanceInfo> lease = new Lease<InstanceInfo>(registrant, leaseDuration);
            if (existingLease != null) {
                lease.setServiceUpTimestamp(existingLease.getServiceUpTimestamp());
            }
            gMap.put(registrant.getId(), lease);
            synchronized (recentRegisteredQueue) {
                recentRegisteredQueue.add(new Pair<Long, String>(
                        System.currentTimeMillis(),
                        registrant.getAppName() + "(" + registrant.getId() + ")"));
            }
            // This is where the initial state transfer of overridden status happens
            if (!InstanceStatus.UNKNOWN.equals(registrant.getOverriddenStatus())) {
                logger.debug("Found overridden status {} for instance {}. Checking to see if needs to be add to the "
                                + "overrides", registrant.getOverriddenStatus(), registrant.getId());
                if (!overriddenInstanceStatusMap.containsKey(registrant.getId())) {
                    logger.info("Not found overridden id {} and hence adding it", registrant.getId());
                    overriddenInstanceStatusMap.put(registrant.getId(), registrant.getOverriddenStatus());
                }
            }
            InstanceStatus overriddenStatusFromMap = overriddenInstanceStatusMap.get(registrant.getId());
            if (overriddenStatusFromMap != null) {
                logger.info("Storing overridden status {} from map", overriddenStatusFromMap);
                registrant.setOverriddenStatus(overriddenStatusFromMap);
            }

            // Set the status based on the overridden status rules
            InstanceStatus overriddenInstanceStatus = getOverriddenInstanceStatus(registrant, existingLease, isReplication);
            registrant.setStatusWithoutDirty(overriddenInstanceStatus);

            // If the lease is registered with UP status, set lease service up timestamp
            if (InstanceStatus.UP.equals(registrant.getStatus())) {
                lease.serviceUp();
            }
            registrant.setActionType(ActionType.ADDED);
            recentlyChangedQueue.add(new RecentlyChangedItem(lease));
            registrant.setLastUpdatedTimestamp();
            invalidateCache(registrant.getAppName(), registrant.getVIPAddress(), registrant.getSecureVipAddress());
            logger.info("Registered instance {}/{} with status {} (replication={})",
                    registrant.getAppName(), registrant.getId(), registrant.getStatus(), isReplication);
        } finally {
            read.unlock();
        }
    }

The part to focus on is this block:

synchronized (lock) {
                    if (this.expectedNumberOfRenewsPerMin > 0) {
                        // Since the client wants to cancel it, reduce the threshold
                        // (1
                        // for 30 seconds, 2 for a minute)
                        this.expectedNumberOfRenewsPerMin = this.expectedNumberOfRenewsPerMin + 2;
                        this.numberOfRenewsPerMinThreshold =
                                (int) (this.expectedNumberOfRenewsPerMin * serverConfig.getRenewalPercentThreshold());
                    }
                }

Each time a new instance registers, the values are recalculated, that is:

this.expectedNumberOfRenewsPerMin = this.expectedNumberOfRenewsPerMin + 2;
this.numberOfRenewsPerMinThreshold = (int) (this.expectedNumberOfRenewsPerMin * serverConfig.getRenewalPercentThreshold());

Overall, once all N instances have registered, the result is:

this.expectedNumberOfRenewsPerMin = N * 2;
this.numberOfRenewsPerMinThreshold = (int) (this.expectedNumberOfRenewsPerMin * serverConfig.getRenewalPercentThreshold());
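To see how the two fields evolve from startup through successive registrations, here is a minimal model of the bookkeeping quoted above. It is a simplified sketch, not the Eureka implementation: the class RenewThresholdModel and the method registerNewInstance are hypothetical names, and locking, leases and replication are all left out.

```java
// Simplified, hypothetical model of the threshold bookkeeping; not Eureka code.
final class RenewThresholdModel {
    private final double renewalPercentThreshold;
    private int expectedNumberOfRenewsPerMin;
    private int numberOfRenewsPerMinThreshold;

    RenewThresholdModel(double renewalPercentThreshold) {
        this.renewalPercentThreshold = renewalPercentThreshold;
    }

    // Mirrors the first two assignments in openForTraffic.
    void openForTraffic(int count) {
        expectedNumberOfRenewsPerMin = count * 2;
        recomputeThreshold();
    }

    // Mirrors the synchronized block in register for a brand-new lease.
    void registerNewInstance() {
        if (expectedNumberOfRenewsPerMin > 0) {
            expectedNumberOfRenewsPerMin += 2; // hard-coded 2 heartbeats per minute
            recomputeThreshold();
        }
    }

    private void recomputeThreshold() {
        numberOfRenewsPerMinThreshold =
                (int) (expectedNumberOfRenewsPerMin * renewalPercentThreshold);
    }

    int expected() { return expectedNumberOfRenewsPerMin; }

    int threshold() { return numberOfRenewsPerMinThreshold; }

    public static void main(String[] args) {
        RenewThresholdModel model = new RenewThresholdModel(0.85);
        model.openForTraffic(1);
        System.out.println(model.expected() + " / " + model.threshold());
        model.registerNewInstance();
        System.out.println(model.expected() + " / " + model.threshold());
    }
}
```

Note the guard copied from the source: if the expected count is 0 after startup, new registrations do not raise it in this model.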

Worked examples

2 Eureka servers, counted in N, factor=6, N=16

leaseRenewalIntervalInSeconds: 10  --> 6 per minute
Note that the code still multiplies by the hard-coded 2 rather than by factor=6.
this.expectedNumberOfRenewsPerMin = N * 2 = 32 ;
this.numberOfRenewsPerMinThreshold = (int) (32 * 0.85) = (int)27.2 = 27;

2 Eureka servers, counted in N, factor=2, N=3

leaseRenewalIntervalInSeconds: 30  --> 2 per minute
this.expectedNumberOfRenewsPerMin = N * 2 = 6 ;
this.numberOfRenewsPerMinThreshold = (int) (6 * 0.85) = (int)5.1 = 5;

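The standalone Eureka case is special, because the openForTraffic method contains

this.expectedNumberOfRenewsPerMin = count * 2;

Here count=1, which counts the heartbeats of one extra instance, so the results are as follows: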

1 Eureka server, not registering itself, factor=2, N=1

leaseRenewalIntervalInSeconds: 30  --> 2 per minute
this.expectedNumberOfRenewsPerMin = N * 2 = 2 ;
this.numberOfRenewsPerMinThreshold = (int) (2 * 0.85) = (int)1.7 = 1;

The value actually displayed is 3, which corresponds to N=2:
this.expectedNumberOfRenewsPerMin = N * 2 = 4 ;
this.numberOfRenewsPerMinThreshold = (int) (4 * 0.85) = (int)3.4 = 3;

1 Eureka server, not registering itself, factor=2, N=2

leaseRenewalIntervalInSeconds: 30  --> 2 per minute
this.expectedNumberOfRenewsPerMin = N * 2 = 4 ;
this.numberOfRenewsPerMinThreshold = (int) (4 * 0.85) = (int)3.4 = 3;

The value actually displayed is 5, which corresponds to N=3:
this.expectedNumberOfRenewsPerMin = N * 2 = 6 ;
this.numberOfRenewsPerMinThreshold = (int) (6 * 0.85) = (int)5.1 = 5;
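The off-by-one in the standalone case can be written in closed form. The sketch below assumes, as described above, that startup contributes count=1 and that each registered client adds 2 to the expected count; StandaloneThreshold and its methods are hypothetical names used only for this comparison.

```java
// Hypothetical comparison of the intended and the displayed threshold
// for a standalone Eureka server; not part of Eureka.
final class StandaloneThreshold {

    // Threshold if only the registered clients were counted.
    static int intended(int clients, double renewalPercentThreshold) {
        return (int) (clients * 2 * renewalPercentThreshold);
    }

    // Threshold including the extra instance from count=1 in openForTraffic.
    static int displayed(int clients, double renewalPercentThreshold) {
        return (int) ((clients + 1) * 2 * renewalPercentThreshold);
    }

    public static void main(String[] args) {
        for (int clients = 1; clients <= 2; clients++) {
            System.out.println("clients=" + clients
                    + " intended=" + intended(clients, 0.85)
                    + " displayed=" + displayed(clients, 0.85));
        }
    }
}
```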

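Reflections on the source

In eureka-core-1.4.12, changing eureka.instance.leaseRenewalIntervalInSeconds does not adjust the factor in the code: it stays hard-coded at 60/30 = 2, so changing the interval breaks the assumption built into Eureka's design. Still, for small projects that do not span data centers and whose network is reasonably reliable, if you want to avoid the registry being frozen by self-preservation, you can try any one of the following:

  • Disable self-preservation: eureka.server.enableSelfPreservation=false

  • Lower eureka.instance.leaseRenewalIntervalInSeconds, for example to 10 seconds

  • Lower renewalPercentThreshold, for example to 0.49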
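To get a feel for why the last two options make self-preservation harder to trigger, the sketch below estimates how many instances can stop sending heartbeats before the renewals received per minute no longer exceed the threshold. It rests on several assumptions: the threshold uses the hard-coded factor of 2, self-preservation engages when renewals in the last minute are less than or equal to the threshold, the interval divides 60 evenly, and the standalone extra instance is ignored. SelfPreservationOptions and its methods are hypothetical names.

```java
// Hypothetical estimate of how the tuning options shift the trigger point;
// a simplified model, not Eureka code.
final class SelfPreservationOptions {

    // Threshold as computed with the hard-coded 2 heartbeats per minute.
    static int threshold(int n, double renewalPercentThreshold) {
        return (int) (n * 2 * renewalPercentThreshold);
    }

    // Largest number of silent instances for which the renewals per minute
    // still exceed the threshold; -1 means the threshold is not exceeded
    // even when every instance is healthy.
    static int toleratedDownInstances(int n, int intervalSeconds,
                                      double renewalPercentThreshold) {
        int perInstance = 60 / intervalSeconds; // actual heartbeats per minute
        int threshold = threshold(n, renewalPercentThreshold);
        for (int down = 0; down <= n; down++) {
            int actual = (n - down) * perInstance;
            if (actual <= threshold) {
                return down - 1;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(toleratedDownInstances(10, 30, 0.85)); // defaults
        System.out.println(toleratedDownInstances(10, 10, 0.85)); // 10s heartbeats
        System.out.println(toleratedDownInstances(10, 30, 0.49)); // lower threshold
    }
}
```

In this model, with 10 instances the defaults tolerate 1 silent instance, a 10 second interval tolerates 7, and a 0.49 threshold tolerates 5.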

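One more parameter can be tuned, namely the interval at which the heartbeat threshold is recalculated:
eureka.server.renewalThresholdUpdateIntervalMs = 15 * 60 * 1000 # 15mins
The default is 15 minutes; it can be lowered, for example to 5 minutes = 5 * 60 * 1000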
