
Foreword

This is the second article in a series on ten common mistakes in Go: pitfalls in benchmark performance testing. The material comes from Teiva Harsanyi, a Go evangelist who is now a senior engineer at Docker.

All the source code used in this article is open source: see The source code of the top ten common errors in Go. You are welcome to follow the official account to get the latest updates to this series.

Scenario

go test supports benchmark performance tests, but did you know there are pitfalls here?

A common pitfall is the compiler's inlining optimization. Let's look at a concrete example:

func add(a int, b int) int {
    return a + b
}

Now we want to measure the performance of the add function, so we might write the following test code:

func BenchmarkWrong(b *testing.B) {
    b.ResetTimer()
    for i := 0; i < b.N; i++ {
        add(1000000000, 1000000001)
    }
}

What is the pitfall here? To the compiler, add is a leaf function, that is, it does not call any other function, so the compiler inlines the calls to add, and that can make the benchmark results inaccurate. We usually want to measure the execution efficiency of the program itself rather than the effect of a particular compiler optimization, so that we have a correct understanding of the program's performance. In addition, the optimizations the compiler applies under go test may not be the same as the ones it applies when building for the actual production environment.

So how do you know whether the compiler inlined a call when running go test? It is simple: pass the -gcflags="-m" flag to go test. -m tells the compiler to print its optimization decisions.
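
To make the effect concrete, the sketch below contrasts the loop as written with a hand-written approximation of what it may reduce to after inlining. loopAsWritten and loopAfterInlining are hypothetical names used only for illustration, and the code the compiler really generates depends on its version and settings; this is not actual compiler output.

```go
package main

// add is the function under test, copied from the article.
func add(a int, b int) int {
	return a + b
}

// loopAsWritten mirrors the benchmark loop in the source code:
// every iteration performs a call to add and discards the result.
func loopAsWritten(n int) {
	for i := 0; i < n; i++ {
		add(1000000000, 1000000001)
	}
}

// loopAfterInlining approximates the same loop once the call has been
// inlined: the body of add is substituted in place, leaving a constant
// expression whose result is never used.
func loopAfterInlining(n int) {
	for i := 0; i < n; i++ {
		_ = 1000000000 + 1000000001
	}
}
```

Once the body of add is substituted in place and its result is unused, very little work remains in each iteration, so the benchmark may end up measuring little more than the loop itself.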

$ go test -gcflags="-m" -v -bench=BenchmarkWrong -count 1
# example.com/benchmark [example.com/benchmark.test]
./go_util.go:3:6: can inline add
./go_bench_test.go:19:6: inlining call to add
./go_bench_test.go:16:21: b does not escape
# example.com/benchmark.test
/var/folders/pv/_x849j6n22x37xxd9cstgwkr0000gn/T/go-build2365344599/b001/_testmain.go:33:6: can inline init.0
/var/folders/pv/_x849j6n22x37xxd9cstgwkr0000gn/T/go-build2365344599/b001/_testmain.go:41:24: inlining call to testing.MainStart
/var/folders/pv/_x849j6n22x37xxd9cstgwkr0000gn/T/go-build2365344599/b001/_testmain.go:41:42: testdeps.TestDeps{} escapes to heap
/var/folders/pv/_x849j6n22x37xxd9cstgwkr0000gn/T/go-build2365344599/b001/_testmain.go:41:24: &testing.M{...} escapes to heap
goos: darwin
goarch: amd64
pkg: example.com/benchmark
cpu: Intel(R) Core(TM) i5-5250U CPU @ 1.60GHz
BenchmarkWrong
BenchmarkWrong-4        1000000000               0.4601 ns/op
PASS
ok      example.com/benchmark   0.605s

In the output above, ./go_bench_test.go:19:6: inlining call to add means that the compiler inlined the call to add inside BenchmarkWrong.

Note: all the flags that can be passed through -gcflags can be listed by running go tool compile --help.

Best Practices

So how do you disable inlining at compile time when benchmarking? There are two options:

-gcflags="-l"

The first option is to pass the -gcflags="-l" flag when running go test. -l disables the compiler's inlining optimization.

$ go test -gcflags="-m -l" -v -bench=BenchmarkWrong -count 3
# example.com/benchmark [example.com/benchmark.test]
./go_bench_test.go:16:21: b does not escape
# example.com/benchmark.test
/var/folders/pv/_x849j6n22x37xxd9cstgwkr0000gn/T/go-build2785655381/b001/_testmain.go:41:42: testdeps.TestDeps{} escapes to heap
goos: darwin
goarch: amd64
pkg: example.com/benchmark
cpu: Intel(R) Core(TM) i5-5250U CPU @ 1.60GHz
BenchmarkWrong
BenchmarkWrong-4        476215998                2.447 ns/op
BenchmarkWrong-4        492860170                2.404 ns/op
BenchmarkWrong-4        483547294                2.388 ns/op
PASS
ok      example.com/benchmark   4.568s

The output above no longer contains any "inlining call to" message, which shows that the compiler does not inline once -gcflags="-l" is used.

Comparing the results before and after inlining is disabled, the benchmark is roughly 5 times slower.

  • Inlining enabled: 0.4601 ns/op
  • Inlining disabled with -gcflags="-l": about 2.4 ns/op
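
The ratio between the two results can be checked with a little arithmetic on the numbers reported in the outputs above. The helper names mean and slowdown are made up for this sketch.

```go
package main

// mean returns the arithmetic mean of the given ns/op samples.
// It expects at least one sample.
func mean(samples ...float64) float64 {
	sum := 0.0
	for _, s := range samples {
		sum += s
	}
	return sum / float64(len(samples))
}

// slowdown reports how many times slower the non-inlined run is
// compared with the inlined run.
func slowdown(inlinedNsPerOp, noInlineNsPerOp float64) float64 {
	return noInlineNsPerOp / inlinedNsPerOp
}
```

With the three -gcflags="-l" runs (2.447, 2.404 and 2.388 ns/op) and the inlined result of 0.4601 ns/op, slowdown(0.4601, mean(2.447, 2.404, 2.388)) comes out at about 5.2.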

go:noinline

The second option is the //go:noinline compiler directive, placed directly above the function declaration. When the compiler sees this directive on a function, it does not inline calls to that function.

//go:noinline
func add(a int, b int) int {
    return a + b
}
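
For reference, a self-contained version of the test file with the directive applied might look like the sketch below. The package name main and keeping add in the same file as the benchmark are assumptions made so that the example stands alone; in the article's project add lives in go_util.go.

```go
package main

import "testing"

// add is the function under test. The directive sits in the comment
// directly above the declaration, with no space between // and go:noinline.
//
//go:noinline
func add(a int, b int) int {
	return a + b
}

// BenchmarkWrong is the benchmark from the article, unchanged.
func BenchmarkWrong(b *testing.B) {
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		add(1000000000, 1000000001)
	}
}
```

Saved in a file whose name ends in _test.go, this can be run with the same go test -bench command used throughout the article.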

With the code modified this way, we no longer need the -gcflags="-l" flag. Let's look at the benchmark results:

$ go test -gcflags="-m" -v -bench=BenchmarkWrong -count 3
# example.com/benchmark [example.com/benchmark.test]
./go_bench_test.go:16:21: b does not escape
# example.com/benchmark.test
/var/folders/pv/_x849j6n22x37xxd9cstgwkr0000gn/T/go-build1050705055/b001/_testmain.go:33:6: can inline init.0
/var/folders/pv/_x849j6n22x37xxd9cstgwkr0000gn/T/go-build1050705055/b001/_testmain.go:41:24: inlining call to testing.MainStart
/var/folders/pv/_x849j6n22x37xxd9cstgwkr0000gn/T/go-build1050705055/b001/_testmain.go:41:42: testdeps.TestDeps{} escapes to heap
/var/folders/pv/_x849j6n22x37xxd9cstgwkr0000gn/T/go-build1050705055/b001/_testmain.go:41:24: &testing.M{...} escapes to heap
goos: darwin
goarch: amd64
pkg: example.com/benchmark
cpu: Intel(R) Core(TM) i5-5250U CPU @ 1.60GHz
BenchmarkWrong
BenchmarkWrong-4        482026485                2.422 ns/op
BenchmarkWrong-4        495307399                2.413 ns/op
BenchmarkWrong-4        407674614                2.613 ns/op
PASS
ok      example.com/benchmark   4.439s

This output also shows that the compiler did not inline the call (there is no "inlining call to add" line), and the resulting performance is essentially the same as with the first option.

Test source code: benchmark performance test source code. You can download it and run the tests locally.

Remarks: Some articles online claim that assigning the result of the function call to a local variable, and then assigning that local variable to a global variable, prevents the compiler from inlining the call. This claim is actually wrong, and the original author, Teiva Harsanyi, makes the same mistake. To determine whether the compiler inlined a call, use the method described in this article to verify.
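
For context, the pattern that remark refers to typically looks like the sketch below; the names BenchmarkSink and global are hypothetical. Storing the result in a package-level variable keeps the result in use, but nothing here instructs the compiler not to inline add, so the -gcflags="-m" check described earlier in this article is still the way to confirm what actually happened.

```go
package main

import "testing"

// add is the function under test, copied from the article
// (without the //go:noinline directive).
func add(a int, b int) int {
	return a + b
}

// global is a package-level sink (hypothetical name).
var global int

// BenchmarkSink assigns each result to a local variable and copies it to
// the package-level variable after the loop. This keeps the result in use;
// it does not by itself prevent add from being inlined.
func BenchmarkSink(b *testing.B) {
	var local int
	for i := 0; i < b.N; i++ {
		local = add(1000000000, 1000000001)
	}
	global = local
}
```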

Open source address

Articles and sample code are open sourced on GitHub: Beginner, Intermediate, and Advanced Tutorials in Go.

Official account: coding advanced. Follow the official account to get the latest Go interview questions and technology stacks.

Personal website: Jincheng's Blog.

Zhihu: Wuji.
