Code version: go1.20.2
We know that when using Go's http client we do not need to close the Request Body ourselves. Below is the doc comment on Client.Do from the source:

//src/net/http/client.go
//...
// (Note: this states that the underlying Transport closes the request Body.)
// The request Body, if non-nil, will be closed by the underlying
// Transport, even on errors.
//
// ...
func (c *Client) Do(req *Request) (*Response, error) {
    return c.do(req)
}

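Our project relies on this behavior: we need the client to close request.Body for us. However, we found a goroutine leak and eventually traced it to request.Body possibly not being closed. Could the official documentation be wrong? We then found a GitHub issue reporting that request.Body is not closed in one specific scenario, which the Go team has since fixed.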

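Let's first look at this issue (https://github.com/golang/go/issues/49621).
The reporter also wrote a demonstration example (https://play.golang.com/p/lku8lEgiPu6).
The key point of the issue is that inside writeLoop(), both pc.writech and pc.closech can be ready at the same time, yet the select may take the <-pc.closech branch, so Request.Body is never closed.
Let's look at the writeLoop() source first, paying attention to the added comments: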

//src/net/http/transport.go
func (pc *persistConn) writeLoop() {
    defer close(pc.writeLoopDone)
    for {
        select {
        case wr := <-pc.writech:
            startBytesWritten := pc.nwrite
            // Request.write closes Request.Body internally; the details are omitted here
            err := wr.req.Request.write(pc.bw, pc.isProxy, wr.req.extra, pc.waitForContinue(wr.continueCh))
            if bre, ok := err.(requestBodyReadError); ok {
                //...
            }
            if err == nil {
                err = pc.bw.Flush()
            }
            if err != nil {
                if pc.nwrite == startBytesWritten {
                    err = nothingWrittenError{err}
                }
            }
            pc.writeErrCh <- err // to the body reader, which might recycle us
            wr.ch <- err         // to the roundTrip function
            if err != nil {
                pc.close(err)
                return
            }
        case <-pc.closech: // returns immediately, without handling any queued write request
            return
        }
    }
}

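In the normal path, the request is only processed, and request.Body only closed, after entering case wr := <-pc.writech. When several cases of a select are ready, Go picks one of them pseudo-randomly. So if both case wr := <-pc.writech and case <-pc.closech are ready but the select takes case <-pc.closech, request.Body is never closed. When can this happen?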
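Before tracing where the two channels get signalled, the branch choice itself can be modelled in isolation. The following is a minimal sketch, not the net/http code: trackedBody, toyWriteLoop and countLeaks are hypothetical names introduced only for illustration. It queues a "request" on a buffered channel, closes a second channel, and then runs a two-case select of the same shape as writeLoop many times.

```go
package main

import "fmt"

// trackedBody is a hypothetical stand-in for a request body that records Close.
type trackedBody struct{ closed bool }

func (b *trackedBody) Close() error {
	b.closed = true
	return nil
}

// toyWriteLoop mimics the shape of writeLoop's select: the body is closed
// only when the writech branch is taken.
func toyWriteLoop(writech chan *trackedBody, closech chan struct{}) {
	select {
	case body := <-writech:
		body.Close() // stands in for Request.write closing the body
	case <-closech:
		return // exits without touching the queued request
	}
}

// countLeaks runs n trials in which both channels are ready before the
// select executes, and reports how often the body was left open.
func countLeaks(n int) (leaked int) {
	for i := 0; i < n; i++ {
		writech := make(chan *trackedBody, 1) // buffered, so the send below does not block
		closech := make(chan struct{})
		body := &trackedBody{}
		writech <- body
		close(closech)
		toyWriteLoop(writech, closech)
		if !body.closed {
			leaked++
		}
	}
	return leaked
}

func main() {
	const trials = 1000
	fmt.Printf("body left open in %d of %d trials\n", countLeaks(trials), trials)
}
```

Because the choice among ready cases is pseudo-random, the exact count differs from run to run, but some trials should end with the body still open. That is the situation the issue describes.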

//src/net/http/transport.go
func (pc *persistConn) roundTrip(req *transportRequest) (resp *Response, err error) {
    // ...

    // Write the request concurrently with waiting for a response,
    // in case the server decides to reply before reading our full
    // request body.
    startBytesWritten := pc.nwrite
    writeErrCh := make(chan error, 1)
    // The write request is sent to pc.writech here
    pc.writech <- writeRequest{req, writeErrCh, continueCh}
    //...
}

The roundTrip() above sends to pc.writech, whereas pc.closech is closed from a different goroutine:

//src/net/http/transport.go
// close closes the underlying TCP connection and closes
// the pc.closech channel.
//
// The provided err is only for testing and debugging; in normal
// circumstances it should never be seen by users.
func (pc *persistConn) close(err error) {
    pc.mu.Lock()
    defer pc.mu.Unlock()
    pc.closeLocked(err)
}

func (pc *persistConn) closeLocked(err error) {
    if err == nil {
        panic("nil error")
    }
    pc.broken = true
    if pc.closed == nil {
        pc.closed = err
        pc.t.decConnsPerHost(pc.cacheKey)
        // Close HTTP/1 (pc.alt == nil) connection.
        // HTTP/2 closes its connection itself.
        if pc.alt == nil {
            if err != errCallerOwnsConn {
                pc.conn.Close()
            }
            close(pc.closech) // closing pc.closech wakes every receiver waiting on it
        }
    }
    pc.mutateHeaderFunc = nil
}

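As shown, pc.closech is closed mainly when the persistConn is closed via close(). The rough sequence is: a persistConn is obtained, then its Read/Write fails quickly; because these run in different goroutines, pc.writech and pc.closech can both be ready at the same time. The Go team fixed this bug (https://go-review.googlesource.com/c/go/+/461675); let's see how: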
https://go-review.googlesource.com/c/go/+/461675/4/src/net/ht...
The change makes (t *Transport) roundTrip(req *Request) try once more to close request.Body. The test case in the same change shows this clearly:
https://go-review.googlesource.com/c/go/+/461675/4/src/net/ht...
The important parts are explained below:

// https://go.dev/issue/49621
func TestConnClosedBeforeRequestIsWritten(t *testing.T) {
    run(t, testConnClosedBeforeRequestIsWritten, testNotParallel, []testMode{http1Mode})
}
func testConnClosedBeforeRequestIsWritten(t *testing.T, mode testMode) {
    ts := newClientServerTest(t, mode, HandlerFunc(func(w ResponseWriter, r *Request) {}),
        func(tr *Transport) {
            tr.DialContext = func(_ context.Context, network, addr string) (net.Conn, error) {
                // The connection returns errors immediately
                return &funcConn{
                    // A custom conn whose Read and Write both fail immediately
                    read: func([]byte) (int, error) {
                        return 0, errors.New("error")
                    },
                    write: func([]byte) (int, error) {
                        return 0, errors.New("error")
                    },
                }, nil
            }
        },
    ).ts
    // Install a hook that sleeps briefly before entering RoundTrip, giving closech enough time to be closed
    SetEnterRoundTripHook(func() {
        time.Sleep(1 * time.Millisecond)
    })
    defer SetEnterRoundTripHook(nil)
    var closes int
    _, err := ts.Client().Post(ts.URL, "text/plain", countCloseReader{&closes, strings.NewReader("hello")})
    if err == nil {
        t.Fatalf("expected request to fail, but it did not")
    }
    // closes should equal 1 here
    if closes != 1 {
        t.Errorf("after RoundTrip, request body was closed %v times; want 1", closes)
    }
}
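
The test relies on a helper, countCloseReader, whose definition is not shown in the excerpt. For checking the same property in application code, a similar wrapper can be sketched as follows. This is an illustration under assumptions: closeCounter and postWithFailingDial are hypothetical names, example.invalid is a placeholder host, and the sketch only exercises the simpler dial-failure path, where the body is expected to be closed. It does not reproduce the race described above.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net"
	"net/http"
	"strings"
)

// closeCounter is a hypothetical helper: it wraps a reader and counts Close calls.
type closeCounter struct {
	io.Reader
	closes *int
}

func (c closeCounter) Close() error {
	*c.closes++
	return nil
}

// postWithFailingDial sends a POST through a Transport whose dial always
// fails, and returns how many times the request body was closed.
func postWithFailingDial() (closes int, err error) {
	tr := &http.Transport{
		DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
			return nil, errors.New("dial refused for the demo")
		},
	}
	client := &http.Client{Transport: tr}
	body := closeCounter{Reader: strings.NewReader("hello"), closes: &closes}
	// The dial fails before any connection is made, so no network I/O happens.
	resp, err := client.Post("http://example.invalid/", "text/plain", body)
	if resp != nil {
		resp.Body.Close()
	}
	return closes, err
}

func main() {
	closes, err := postWithFailingDial()
	fmt.Printf("request failed: %v; body closed %d time(s)\n", err != nil, closes)
}
```

Wrapping the real request body this way and logging the counter after each failed call is one way to confirm whether a suspected leak really comes from an unclosed body.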

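The fix has been merged into master, but it is not yet known which release will include it.

Summary

In concurrent scenarios, when Read()/Write() on a net.Conn fails quickly, the Request Body may not be closed automatically. So do not take the official documentation entirely on faith: official code can have bugs too, and it pays to question it and investigate.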

AVOli