Do you feel the same?

When you are developing server-side code, do you often have the following questions?

  • How many connections does the MySQL connection pool hold?
  • How long does each connection live?
  • When a connection drops abnormally, did the server close it or did the client?
  • When there are no requests for a long time, does the underlying library send KeepAlive requests?

Handling complex network conditions has always been one of the key difficulties of back-end development. Does debugging them give you chills too?

So I wrote tproxy

When I'm doing backend development and writing go-zero, I often need to monitor network connections and analyze request content. For example:

  • Analyze when a gRPC connection is established and when it reconnects, and tune parameters such as MaxConnectionIdle accordingly
  • Analyze the MySQL connection pool: how many connections it currently holds and what policy governs their life cycle
  • Observe and analyze any TCP connection, for example to see whether the server or the client actively disconnects

Installation of tproxy

 $ GOPROXY=https://goproxy.cn/,direct go install github.com/kevwan/tproxy@latest

Or use a Docker image:

 $ docker run --rm -it -p <listen-port>:<listen-port> -p <remote-port>:<remote-port> kevinwan/tproxy:v1 tproxy -l 0.0.0.0 -p <listen-port> -r host.docker.internal:<remote-port>

On an arm64 system:

 $ docker run --rm -it -p <listen-port>:<listen-port> -p <remote-port>:<remote-port> kevinwan/tproxy:v1-arm64 tproxy -l 0.0.0.0 -p <listen-port> -r host.docker.internal:<remote-port>

Usage of tproxy

 $ tproxy --help
Usage of tproxy:
  -d duration
            the delay to relay packets
  -l string
            Local address to listen on (default "localhost")
  -p int
            Local port to listen on
  -q        Quiet mode, only prints connection open/close and stats, default false
  -r string
            Remote address (host:port) to connect
  -t string
            The type of protocol, currently support grpc

Analyzing gRPC connections

 tproxy -p 8088 -r localhost:8081 -t grpc -d 100ms
  • Listen on localhost, port 8088
  • Forward requests to localhost:8081
  • Decode the traffic as gRPC
  • Delay each packet by 100ms
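To make the idea behind these options concrete, here is a rough sketch of a delaying TCP relay that reports which side closed the connection first. It is not tproxy's actual implementation; the names `relay`, `handle`, and `runDemo` are hypothetical, and the one-shot echo backend exists only so the example runs on its own.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net"
	"time"
)

// relay forwards bytes from src to dst, waiting delay before each write.
// It returns srcName if reading from src ended, or dstName if writing failed.
func relay(dst, src net.Conn, srcName, dstName string, delay time.Duration) string {
	buf := make([]byte, 32*1024)
	for {
		n, err := src.Read(buf)
		if n > 0 {
			time.Sleep(delay)
			if _, werr := dst.Write(buf[:n]); werr != nil {
				return dstName
			}
		}
		if err != nil {
			return srcName
		}
	}
}

// handle proxies one client connection to remote and returns the side
// ("client" or "server") that went away first.
func handle(client net.Conn, remote string, delay time.Duration) (string, error) {
	server, err := net.Dial("tcp", remote)
	if err != nil {
		client.Close()
		return "", err
	}
	first := make(chan string, 2)
	go func() { first <- relay(server, client, "client", "server", delay) }()
	go func() { first <- relay(client, server, "server", "client", delay) }()
	closedBy := <-first
	// Close both sides so the other relay goroutine unblocks and exits.
	client.Close()
	server.Close()
	return closedBy, nil
}

// runDemo puts the relay in front of a backend that echoes 4 bytes and then
// disconnects, and returns which side the relay saw closing first.
func runDemo() (string, error) {
	backend, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer backend.Close()
	go func() {
		conn, err := backend.Accept()
		if err != nil {
			return
		}
		buf := make([]byte, 4)
		if _, err := io.ReadFull(conn, buf); err == nil {
			conn.Write(buf)
		}
		conn.Close() // the server disconnects actively
	}()

	front, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	defer front.Close()
	result := make(chan string, 1)
	go func() {
		client, err := front.Accept()
		if err != nil {
			result <- "accept error"
			return
		}
		closedBy, err := handle(client, backend.Addr().String(), 10*time.Millisecond)
		if err != nil {
			result <- "dial error"
			return
		}
		result <- closedBy
	}()

	conn, err := net.Dial("tcp", front.Addr().String())
	if err != nil {
		return "", err
	}
	defer conn.Close()
	conn.Write([]byte("ping"))
	io.ReadAll(conn) // read the echo, then wait until the relay closes our side
	return <-result, nil
}

func main() {
	closedBy, err := runDemo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("closed first by:", closedBy)
}
```

Watching both directions separately is what lets a relay like this tell an active server disconnect from a client one: whichever direction hits end-of-stream first identifies the side that hung up.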

img

Here we can see the initialization and the round trip of a gRPC request, and that the stream id of the first request is 1.

As another example, gRPC has a MaxConnectionIdle parameter, which sets how long a connection may stay idle before it is closed. We can directly observe the server sending an HTTP/2 GoAway frame once that time is up.

img

For example, if I set MaxConnectionIdle to 5 minutes and no request arrives within 5 minutes of the connection being established, the connection is closed automatically and a new one is then established.
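The stream id and the GoAway frame mentioned above both come from the 9-byte header that starts every HTTP/2 frame, which gRPC runs on. As a minimal sketch of how such a header can be decoded (this is not tproxy's own decoder, and `frameHeader`, `parseFrameHeader`, and `typeName` are hypothetical names), the layout from the HTTP/2 specification is a 24-bit length, an 8-bit type, 8 bits of flags, and a 31-bit stream id:

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// frameHeader holds the fields of the fixed 9-byte HTTP/2 frame header.
type frameHeader struct {
	Length   uint32 // payload length (24 bits on the wire)
	Type     uint8
	Flags    uint8
	StreamID uint32 // 31 bits; the top bit is reserved
}

// frameTypeNames lists the frame types defined by the HTTP/2 specification.
var frameTypeNames = map[uint8]string{
	0x0: "DATA", 0x1: "HEADERS", 0x2: "PRIORITY", 0x3: "RST_STREAM",
	0x4: "SETTINGS", 0x5: "PUSH_PROMISE", 0x6: "PING", 0x7: "GOAWAY",
	0x8: "WINDOW_UPDATE", 0x9: "CONTINUATION",
}

// typeName returns a readable name for a frame type.
func typeName(t uint8) string {
	if name, ok := frameTypeNames[t]; ok {
		return name
	}
	return fmt.Sprintf("UNKNOWN(0x%x)", t)
}

// parseFrameHeader decodes the first 9 bytes of b as an HTTP/2 frame header.
func parseFrameHeader(b []byte) (frameHeader, error) {
	if len(b) < 9 {
		return frameHeader{}, errors.New("need at least 9 bytes")
	}
	return frameHeader{
		Length:   uint32(b[0])<<16 | uint32(b[1])<<8 | uint32(b[2]),
		Type:     b[3],
		Flags:    b[4],
		StreamID: binary.BigEndian.Uint32(b[5:9]) & 0x7fffffff, // drop reserved bit
	}, nil
}

func main() {
	// Hand-built sample headers: a HEADERS frame on stream 1 and a
	// connection-level GOAWAY frame on stream 0.
	samples := [][]byte{
		{0x00, 0x00, 0x10, 0x01, 0x04, 0x00, 0x00, 0x00, 0x01},
		{0x00, 0x00, 0x08, 0x07, 0x00, 0x00, 0x00, 0x00, 0x00},
	}
	for _, s := range samples {
		h, err := parseFrameHeader(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s length=%d flags=0x%x stream=%d\n",
			typeName(h.Type), h.Length, h.Flags, h.StreamID)
	}
}
```

Client-initiated streams use odd ids starting at 1, which is why the first request on a fresh connection shows stream id 1, while GoAway always travels on stream 0 because it applies to the whole connection.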

Analyzing MySQL Connections

Let's analyze how the MySQL connection pool settings affect connection behavior. For example, I set the parameters to:

 maxIdleConns = 3
maxOpenConns = 8
maxLifetime  = time.Minute
...
conn.SetMaxIdleConns(maxIdleConns)
conn.SetMaxOpenConns(maxOpenConns)
conn.SetConnMaxLifetime(maxLifetime)

We set MaxIdleConns and MaxOpenConns to different values, and then we use hey to do a stress test:

 hey -c 10 -z 10s "http://localhost:8888/lookup?url=go-zero.dev"

We ran the stress test with a concurrency of 10 for a duration of 10 seconds. The connection results are as follows:

img

We can see that:

  • More than 2000 connections were established in 10 seconds
  • Throughout the test, existing connections kept being closed and new ones opened
  • Each time a connection is used and put back, the idle count may exceed MaxIdleConns, in which case the connection is closed
  • When the next request asks for a connection, the number of open connections is below MaxOpenConns but no idle connection is available, so a new connection is created

This churn is why we often see a lot of MySQL connections in the TIME_WAIT state.

Then we set MaxIdleConns and MaxOpenConns to the same value, and then do the same stress test again:

img

We can see that:

  • The pool holds steady at 8 connections
  • One minute after the stress test (ConnMaxLifetime), all connections are closed
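The contrast between the two runs can be approximated without a MySQL server. The sketch below is an illustration under assumptions, not the setup used for the screenshots: it plugs a fake, stdlib-only driver into `database/sql` that merely counts how many connections the pool opens and closes. The names `countingConnector`, `fakeConn`, and `simulate` are hypothetical, and the exact counts vary from run to run.

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// countingConnector is a fake driver.Connector that counts connections.
type countingConnector struct {
	opened int64
	closed int64
}

func (c *countingConnector) Connect(context.Context) (driver.Conn, error) {
	atomic.AddInt64(&c.opened, 1)
	return &fakeConn{owner: c}, nil
}

func (c *countingConnector) Driver() driver.Driver { return fakeDriver{} }

type fakeDriver struct{}

func (fakeDriver) Open(string) (driver.Conn, error) {
	return nil, errors.New("use sql.OpenDB with the connector")
}

// fakeConn does no real work; it only records when the pool closes it.
type fakeConn struct{ owner *countingConnector }

func (c *fakeConn) Prepare(string) (driver.Stmt, error) { return nil, errors.New("not supported") }
func (c *fakeConn) Begin() (driver.Tx, error)           { return nil, errors.New("not supported") }
func (c *fakeConn) Close() error {
	atomic.AddInt64(&c.owner.closed, 1)
	return nil
}

// simulate runs workers goroutines that each borrow and return a pooled
// connection iterations times, then reports how many connections the pool
// opened and closed while the load was running.
func simulate(maxIdle, maxOpen, workers, iterations int) (opened, closed int64) {
	connector := &countingConnector{}
	db := sql.OpenDB(connector)
	defer db.Close()
	db.SetMaxIdleConns(maxIdle)
	db.SetMaxOpenConns(maxOpen)
	db.SetConnMaxLifetime(time.Minute)

	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < iterations; i++ {
				conn, err := db.Conn(context.Background())
				if err != nil {
					return
				}
				time.Sleep(100 * time.Microsecond) // pretend to run a query
				conn.Close()                       // hand the connection back to the pool
			}
		}()
	}
	wg.Wait()
	closed = atomic.LoadInt64(&connector.closed)
	opened = atomic.LoadInt64(&connector.opened)
	return opened, closed
}

func main() {
	for _, idle := range []int{3, 8} {
		opened, closed := simulate(idle, 8, 10, 200)
		fmt.Printf("maxIdle=%d maxOpen=8: opened=%d closed=%d\n", idle, opened, closed)
	}
}
```

With the idle limit equal to the open limit, the pool never has to discard a returned connection, so the opened count cannot exceed 8. With the smaller idle limit, returned connections can overflow the idle list and get closed, which is the churn described in the first experiment.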

ConnMaxLifetime must be set to less than MySQL's wait_timeout. You can check the wait_timeout value as follows:

img

I recommend setting it to less than 5 minutes, because some switches clean up idle connections after 5 minutes. This is also why heartbeat intervals in social apps generally do not exceed 5 minutes. For the specific reasons, see:

https://github.com/zeromicro/go-zero/blob/master/core/stores/sqlx/sqlmanager.go#L65

Issue 257 of go-sql-driver also discusses ConnMaxLifetime, in the following passage:

14400 sec is too long. One minutes is enough for most use cases.

Even if you configure entire your DC (OS, switch, router, etc...), TCP connection may be lost from various reasons. (bug in router firmware, unstable power voltage, electric nose, etc...)

So if you don't know how to set the MySQL connection pool parameters, you can refer to the go-zero settings.

In addition, ConnMaxIdleTime has no effect on the stress test results above; in fact, you do not need to set it.

If you have any questions about the settings above, or think something is wrong, you are welcome to discuss it in the go-zero group.

Project address

tproxy: https://github.com/kevwan/tproxy

go-zero: https://github.com/zeromicro/go-zero

You are welcome to use them and to support us with a star!

WeChat exchange group

Follow the official account "Microservice Practice" and click "exchange group" to get the QR code of the community group.


kevinwan

Author of go-zero