I'm looking into why RabbitMQ consumers need a heartbeat.
I found this thread: How does rabbitmq heartbeat work
One of the answers makes this point:
from the RMQ Heartbeat documentation:
Network can fail in many ways, sometimes pretty subtle (e.g. high ratio packet loss). Disrupted TCP connections take a moderately long time (about 11 minutes with default configuration on Linux, for example) to be detected by the operating system. AMQP 0-9-1 offers a heartbeat feature to ensure that the application layer promptly finds out about disrupted connections (and also completely unresponsive peers). Heartbeats also defend against certain network equipment which may terminate "idle" TCP connections.
This isn't a request to a queue or stubbed message. This is a TCP/IP connection with packets sent across in a specific format for the heartbeat.
If you want the real details, you can read the AMQP 0.9.1 Specification, section 4.2.1 and 4.2.7 with errata on how RabbitMQ corrects for errors in the specification, as well.
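
To make "packets sent across in a specific format" concrete, here is a minimal sketch of what a heartbeat frame looks like on the wire, as I understand AMQP 0-9-1 as RabbitMQ implements it: frame type 8, channel 0, an empty payload, and the frame-end octet 0xCE. The function and constant names are my own, purely for illustration; they are not from any client library.

```python
import struct

FRAME_HEARTBEAT = 8   # frame type used for heartbeats
FRAME_END = 0xCE      # every AMQP 0-9-1 frame ends with this octet


def build_heartbeat_frame() -> bytes:
    """Build the 8-byte heartbeat frame (illustrative helper, not a library API)."""
    # Header, big-endian: type (1 octet), channel (2 octets), payload size (4 octets).
    # Heartbeats travel on channel 0 and carry no payload, so size is 0.
    header = struct.pack(">BHI", FRAME_HEARTBEAT, 0, 0)
    return header + bytes([FRAME_END])


frame = build_heartbeat_frame()
print(frame.hex())  # 08000000000000ce
```

So a heartbeat is just these few bytes written to the existing TCP connection; no queue or message is involved.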
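To paraphrase:
Roughly, it says that if you rely only on TCP's own detection mechanism, the delay is too large: when something goes wrong, the peer needs ten-plus minutes to notice.
My knowledge of TCP is fairly thin.
What I don't understand is why it takes the peer ten-plus minutes to find out.
- If A starts the "four-way close" (the FIN exchange) and actively ends the connection, B should know right away rather than ten-plus minutes later, right?
- Is it only when the network cable between A and B gets cut that it takes ten-plus minutes to learn the other side is "unreachable"?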
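
The first case can be checked locally. Below is a minimal sketch, assuming a loopback connection stands in for A and B (the function name is made up for illustration): when A closes normally, B's next read returns end-of-stream at once. The cut-cable case cannot be reproduced on loopback; there no FIN ever reaches B, so the same `recv` would simply keep waiting.

```python
import socket


def peer_sees_graceful_close() -> bytes:
    """Return what B reads right after A closes the connection normally."""
    with socket.create_server(("127.0.0.1", 0)) as listener:
        port = listener.getsockname()[1]
        a = socket.create_connection(("127.0.0.1", port))
        b, _ = listener.accept()
        with b:
            b.settimeout(5.0)    # guard so this sketch can never hang
            a.close()            # A sends FIN: the graceful close
            return b.recv(1024)  # b"" means the peer closed the stream


result = peer_sees_graceful_close()
print(result)  # b''
```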
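The ten-odd minutes mentioned here is a transport-layer figure, not MQ's own heartbeat timeout: as the quoted documentation says, it is how long the operating system takes, with the default configuration on Linux, to detect a disrupted TCP connection. MQ's heartbeat timeout is a separate, much shorter, application-level setting.
For example, once a heartbeat has succeeded, the connection still counts as "alive" even if the heartbeats that follow all fail, until the heartbeat timeout runs out; only after that is it considered "dead". That timeout is configurable.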
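
To illustrate the "alive until the timeout expires" logic, here is a minimal sketch of a timeout tracker. The class and method names are hypothetical and the 60-second value is only an example; a real client library handles this internally.

```python
class HeartbeatMonitor:
    """Tracks when the peer was last heard from (illustrative, not a library API)."""

    def __init__(self, timeout: float, now: float) -> None:
        self.timeout = timeout   # seconds of silence tolerated
        self.last_seen = now     # timestamp of the last traffic from the peer

    def on_traffic(self, now: float) -> None:
        # Any frame from the peer, heartbeat or not, proves it is still there.
        self.last_seen = now

    def is_dead(self, now: float) -> bool:
        # "Dead" only once the silence has lasted longer than the timeout.
        return now - self.last_seen > self.timeout


monitor = HeartbeatMonitor(timeout=60.0, now=0.0)
monitor.on_traffic(now=10.0)            # one heartbeat gets through
alive_at_65 = not monitor.is_dead(65.0)  # 55 s of silence: still alive
dead_at_71 = monitor.is_dead(71.0)       # 61 s of silence: considered dead
```

Timestamps are passed in explicitly so the behaviour is easy to follow; a real implementation would read a monotonic clock instead.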