In the early hours of the morning, the DingTalk alert group kept filling up with errors:

  

API exception alert: project: mp-rest, domain: inside-mp.01zhuanche.com, IP: 10.30.3.60, endpoint: /api/v3/driverLogin/driverType, method: POST, error: com.alibaba.dubbo.rpc.RpcException: Failed to invoke the method findByPhone in the service com.zhuanche.driver.service.DriverInfoService. Tried 1 times of the providers [10.30.3.72:8080] (1/8) from the registry 10.0.7.56:2181 on the consumer 10.30.3.60 using the dubbo version 2.5.8. Last error is: Failed to invoke remote method: findByPhone, provider: dubbo://10.30.3.72:8080/com.zhuanche.driver.service.DriverInfoService?anyhost=true&application=mp-rest&check=false&default.accesslog=/u01/mp-driver-provider/log/access.log&default.check=false&default.loadbalance=leastactive&default.retries=0&dispatcher=message&dubbo=2.5.8&generic=false&interface=com.zhuanche.driver.service.DriverInfoService&logger=slf4j&methods=queryReatImeiByDriverIdAndTime,updateImeiAndAppversion,findByIdcardOrPlateNum,updatePhoneByDriverId,findHistoryDriver,listByLicensePlates,sendDriverToMq,findDriverBaseInfoByPhone,findDriverInfoBySupplierId,findByPhone,getValidateLoginByDriverId,findChatUserIdByDriverId,findPhoneByCityIdOrCooperation,findHistoryBylicensePlates,findDriverIdByChatUserId,resetImei,findYOTHistoryDriver,findDriverBylicensePlates,findByDriverId,findYOTHistoryBylicensePlates,searchBlackDriver,updateIMEIByDriverId,updateGroupIdByDriverId,countReatImeiByDriverIdAndTime&organization=zhuanche&owner=mp&pid=8&register.ip=10.30.3.60&remote.timestamp=1561644769858&revision=0.5.1-20190505.093234-1&side=consumer&timeout=5000&timestamp=1563247918842, cause: com.alibaba.dubbo.remoting.RemotingException: Not found exported service: com.zhuanche.driver.service.DriverInfoService:8080 in [com.zhuanche.driver.service.DriverDutyService:8080, com.zhuanche.driver.service.driver.DriverJoinRecordService:8080], may be version or group mismatch , channel: consumer: /10.30.3.60:48174 --> provider: /10.30.3.72:8080, message:RpcInvocation [methodName=findByPhone, parameterTypes=[class java.lang.String], arguments=[13826557857], attachments={path=com.zhuanche.driver.service.DriverInfoService, input=256, dubbo=2.5.8, interface=com.zhuanche.driver.service.DriverInfoService, version=0.0.0, timeout=5000}]
com.alibaba.dubbo.remoting.RemotingException: Not found exported service: com.zhuanche.driver.service.DriverInfoService:8080 in [com.zhuanche.driver.service.DriverDutyService:8080, com.zhuanche.driver.service.driver.DriverJoinRecordService:8080], may be version or group mismatch , channel: consumer: /10.30.3.60:48174 --> provider: /10.30.3.72:8080, message:RpcInvocation [methodName=findByPhone, parameterTypes=[class java.lang.String], arguments=[13826557857], attachments={path=com.zhuanche.driver.service.DriverInfoService, input=256, dubbo=2.5.8, interface=com.zhuanche.driver.service.DriverInfoService, version=0.0.0, timeout=5000}]
at com.alibaba.dubbo.rpc.protocol.dubbo.DubboProtocol.getInvoker(DubboProtocol.java:205)
at com.alibaba.dubbo.rpc.protocol.dubbo.DubboProtocol$1.reply(DubboProtocol.java:76)
at com.alibaba.dubbo.remoting.exchange.support.header.HeaderExchangeHandler.handleRequest(HeaderExchangeHandler.java:98)
at com.alibaba.dubbo.remoting.exchange.support.header.HeaderExchangeHandler.received(HeaderExchangeHandler.java:170)
at com.alibaba.dubbo.remoting.transport.DecodeHandler.received(DecodeHandler.java:52)
at com.alibaba.dubbo.remoting.transport.dispatcher.ChannelEventRunnable.run(ChannelEventRunnable.java:81)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
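
The "Not found exported service" message already names both the service the consumer asked for and the services the provider had exported at that moment, so it can be read mechanically. The following is a minimal sketch, assuming the message keeps the shape shown in the alert above; `ExportedServiceParser` and its methods are hypothetical helpers and not part of dubbo.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not a dubbo class): extracts the requested service and the
// list of exported services from a "Not found exported service" message.
final class ExportedServiceParser {
    // Group 1: requested service key, group 2: comma-separated exported keys.
    private static final Pattern NOT_FOUND = Pattern.compile(
            "Not found exported service: (\\S+) in \\[([^\\]]*)\\]");

    // Returns the service key the consumer asked for, or null if the message does not match.
    static String requested(String message) {
        Matcher m = NOT_FOUND.matcher(message);
        return m.find() ? m.group(1) : null;
    }

    // Returns the service keys the provider reported as exported (empty if none or no match).
    static List<String> exported(String message) {
        List<String> result = new ArrayList<>();
        Matcher m = NOT_FOUND.matcher(message);
        if (m.find()) {
            for (String key : m.group(2).split(",")) {
                String trimmed = key.trim();
                if (!trimmed.isEmpty()) {
                    result.add(trimmed);
                }
            }
        }
        return result;
    }
}
```

Applied to the alert above, this yields DriverInfoService:8080 as the requested key and only two exported keys, neither of which is DriverInfoService. That is a hint that the provider at 10.30.3.72 was reachable but had not exported everything yet.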

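At one particular moment there was a burst of dubbo exceptions. First thing in the morning I checked ELK to see whether concurrency was the cause. Throughput at that moment was roughly 300-400 requests per second, not much different from normal. I then narrowed it down to the specific endpoint, and ELK's request_time showed that a few requests took more than 1 second to respond, while the dubbo service it calls is declared as

@Reference(timeout = 1000)
private DriverInfoService driverInfoService;

so those calls had exceeded the 1-second timeout. But why would that produce this error? Under normal circumstances it should not cause a problem.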

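After looking at the dubbo exceptions reported on the consumer side, I checked the provider. Its logs for that time window looked normal. But when I finally looked at the dubbo startup log: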

03:07:23,213 |-INFO in ch.qos.logback.classic.joran.action.LevelAction - RocketmqRemoting level set to INFO
03:07:23,213 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [RocketmqClientAppender] to Logger[RocketmqRemoting]
03:07:23,213 |-INFO in ch.qos.logback.classic.joran.action.LoggerAction - Setting additivity of logger [RocketmqClient] to false
03:07:23,213 |-INFO in ch.qos.logback.classic.joran.action.LevelAction - RocketmqClient level set to INFO
03:07:23,213 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [RocketmqClientAppender] to Logger[RocketmqClient]
03:07:23,213 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - End of configuration.
03:07:23,213 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@4b87074a - Registering current configuration as safe fallback point
[2019-07-18 03:07:28] Dubbo service server started!

The dubbo service had actually been restarted during this window.

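I then contacted ops. It turned out that at this time in the early morning they had restarted the top 5 services by swap usage. Request volume happened to be fairly high in that window, and the restarted provider had not yet finished registering its services with zk (the error above lists only DriverDutyService and DriverJoinRecordService as exported), so requests were still routed to the IP of the machine being restarted. That is what produced the flood of errors.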
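
The alert shows `default.retries=0` and "Tried 1 times", so a call that landed on the restarting node failed straight away instead of being retried on another provider. The sketch below is a plain-Java simulation of that effect, not dubbo code: `ProviderNode`, `FailoverSketch` and the node names are hypothetical, and providers are tried in list order, whereas dubbo selects them through its load balancer.

```java
import java.util.List;

// Hypothetical stand-in for one provider instance; not a dubbo class.
final class ProviderNode {
    final String address;
    final boolean serviceExported; // false while the node is still starting up

    ProviderNode(String address, boolean serviceExported) {
        this.address = address;
        this.serviceExported = serviceExported;
    }

    String findByPhone(String phone) {
        if (!serviceExported) {
            throw new IllegalStateException("Not found exported service on " + address);
        }
        return "driver-of-" + phone + "@" + address;
    }
}

// Simplified failover: at most (retries + 1) attempts, each on the next provider in the list.
final class FailoverSketch {
    static String invoke(List<ProviderNode> providers, String phone, int retries) {
        RuntimeException last = null;
        int attempts = Math.min(retries + 1, providers.size());
        for (int i = 0; i < attempts; i++) {
            try {
                return providers.get(i).findByPhone(phone);
            } catch (RuntimeException e) {
                last = e; // remember the failure and move on to the next provider
            }
        }
        throw new IllegalStateException("Tried " + attempts + " times", last);
    }
}
```

With a restarting node first in the list, `invoke(nodes, phone, 0)` throws, while `invoke(nodes, phone, 1)` falls through to the healthy node. Whether raising retries is acceptable in practice depends on the call being safe to repeat. For a read such as findByPhone that seems plausible, but this is an assumption rather than something the incident confirms.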


droxy