Background
In addition to generics, the major design change in Go 1.18, the Go team also introduced fuzzing in the Go 1.18 toolchain.
The main developers of Go fuzzing are Katie Hockman, Jay Conrod and Roland Shoemaker.
Editor's note: Katie Hockman left Google on February 19, 2022, and Jay Conrod left Google in October 2021.
What is Fuzzing
Fuzzing, also called fuzz testing, is an automated testing technique that randomly generates test data and then calls the function under test with that data to check whether it behaves as expected.
Fuzz testing is a complement to unit testing, not a replacement for unit testing.
A unit test checks whether the result for a specified input matches the expected output, so its set of test data is relatively limited.
Fuzzing generates random test data, reaches scenarios that the unit tests do not cover, and can therefore uncover potential bugs and security vulnerabilities in the program.
How to use Go Fuzzing
Fuzzing is not a new concept in the Go language. Before the official Go team released Go Fuzzing, there was already a similar fuzzing tool go-fuzz on GitHub.
The Fuzzing implementation of the Go official team draws on the design ideas of go-fuzz.
Go 1.18 integrates fuzzing into the go test toolchain and the testing package.
Example
Here is an example to illustrate how Fuzzing is used.
Consider the following string reversal function, Reverse. Can you spot the potential problem in this code?
// main.go
package fuzz

func Reverse(s string) string {
	bs := []byte(s)
	length := len(bs)
	for i := 0; i < length/2; i++ {
		bs[i], bs[length-i-1] = bs[length-i-1], bs[i]
	}
	return string(bs)
}
Writing Fuzz Tests
If you did not spot a bug in the code above, a fuzz test can help find the potential problem.
The rules for writing a Go fuzz test are as follows:
- A fuzz test is defined in an xxx_test.go file, just like Go's existing unit tests and benchmarks.
- The function name starts with Fuzz and the function takes a single parameter of type *testing.F. The testing.F type has two important methods: Add and Fuzz.
- The Add method adds inputs to the seed corpus. The fuzzing engine generates random test data based on the seed corpus.
- The Fuzz method takes a function value as its argument. The first parameter of that function must be of type *testing.T, and the types of the remaining parameters must match the types of the arguments passed to Add, in the same order. For example, with a call such as f.Add(5, "hello"), the first argument 5 and the second argument "hello" correspond to the parameters i int and s string.
- The fuzzing engine generates new random test data based on the seed corpus given to Add. With f.Add(5, "hello"), it derives new values from 5 and "hello", assigns them to i and s, and repeatedly calls the function passed to f.Fuzz, that is, func(t *testing.T, i int, s string){...}.
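The rules above mention seeding with f.Add(5, "hello") for a fuzz function that takes i int and s string. A minimal sketch of such a fuzz test might look like the following. RepeatN is a hypothetical function invented for this illustration, and the size guard is an assumption added to keep each run cheap. For brevity everything sits in one file with a main function; for go test to discover FuzzRepeatN, it would have to live in a file ending in _test.go.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// RepeatN is a hypothetical function under test: it returns s repeated n
// times and treats a negative n as zero.
func RepeatN(n int, s string) string {
	if n < 0 {
		n = 0
	}
	return strings.Repeat(s, n)
}

// FuzzRepeatN seeds the corpus with (5, "hello"). The argument types passed
// to f.Add must match the parameters that follow *testing.T, in order.
func FuzzRepeatN(f *testing.F) {
	f.Add(5, "hello")
	f.Fuzz(func(t *testing.T, i int, s string) {
		if i > 64 || len(s) > 64 {
			t.Skip("skip large inputs to keep each run cheap")
		}
		want := 0
		if i > 0 {
			want = i * len(s)
		}
		if got := RepeatN(i, s); len(got) != want {
			t.Errorf("RepeatN(%d, %q) has length %d, want %d", i, s, len(got), want)
		}
	})
}

func main() {
	fmt.Println(RepeatN(2, "ab")) // abab
}
```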
With these rules in mind, let's write a fuzz test for the Reverse function.
// fuzz_test.go
package fuzz

import (
	"testing"
	"unicode/utf8"
)

func FuzzReverse(f *testing.F) {
	str_slice := []string{"abc", "bb"}
	for _, v := range str_slice {
		f.Add(v)
	}
	f.Fuzz(func(t *testing.T, str string) {
		rev_str1 := Reverse(str)
		rev_str2 := Reverse(rev_str1)
		if str != rev_str2 {
			t.Errorf("fuzz test failed. str:%s, rev_str1:%s, rev_str2:%s", str, rev_str1, rev_str2)
		}
		if utf8.ValidString(str) && !utf8.ValidString(rev_str1) {
			t.Errorf("reverse result is not utf8. str:%s, len: %d, rev_str1:%s", str, len(str), rev_str1)
		}
	})
}
Run Fuzzing Tests
Go 1.18beta1 or later is required. Run the fuzz test with the following command; the output is shown below:
$ go1.18beta1 test -v -fuzz=Fuzz
fuzz: elapsed: 0s, gathering baseline coverage: 0/111 completed
fuzz: minimizing 60-byte failing input file
fuzz: elapsed: 0s, gathering baseline coverage: 5/111 completed
--- FAIL: FuzzReverse (0.04s)
--- FAIL: FuzzReverse (0.00s)
fuzz_test.go:20: reverse result is not utf8. str:æ, len: 2, rev_str1:��
Failing input written to testdata/fuzz/FuzzReverse/ce9e8c80e2c2de2c96ab9e63b1a8cf18cea932b7d8c6c9c207d5978e0f19027a
To re-run:
go test -run=FuzzReverse/ce9e8c80e2c2de2c96ab9e63b1a8cf18cea932b7d8c6c9c207d5978e0f19027a
FAIL
exit status 1
FAIL example/fuzz 0.179s
Focus on this line of the output: fuzz_test.go:20: reverse result is not utf8. str:æ, len: 2, rev_str1:��
In this example, the randomly generated string æ is a valid UTF-8 string made up of 2 bytes. After it is reversed by the Reverse function, the result is the non-UTF-8 string ��.
So the Reverse function we implemented earlier, which reverses a string byte by byte, has a bug. It correctly reverses strings that consist only of ASCII characters, but for non-ASCII characters, simply reversing the bytes may produce an invalid string.
If you are interested, try calling the Reverse function on a non-ASCII string, such as the Chinese character for "eat", and see what the result is.
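One common way to avoid the bug is to reverse runes instead of bytes. The sketch below is one possible fix, not the code from the original article: ReverseRunes is a name chosen here for illustration, and rejecting invalid UTF-8 input with an error is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// ReverseRunes reverses s rune by rune instead of byte by byte, so
// multi-byte characters stay intact. Input that is not valid UTF-8 is
// rejected with an error (a design choice of this sketch).
func ReverseRunes(s string) (string, error) {
	if !utf8.ValidString(s) {
		return s, errors.New("input is not valid UTF-8")
	}
	rs := []rune(s)
	for i, j := 0, len(rs)-1; i < j; i, j = i+1, j-1 {
		rs[i], rs[j] = rs[j], rs[i]
	}
	return string(rs), nil
}

func main() {
	out, err := ReverseRunes("æb")
	fmt.Println(out, err, utf8.ValidString(out)) // bæ <nil> true
}
```

With this version, reversing a valid UTF-8 string twice returns the original string, which is the property the fuzz test checks.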
Note: If Go fuzzing finds a bug while running, it writes the corresponding input to the testdata/fuzz/FuzzXXX directory. In the example above, the output of go1.18beta1 test -v -fuzz=Fuzz contains the line "Failing input written to testdata/fuzz/FuzzReverse/ce9e8c80e2c2de2c96ab9e63b1a8cf18cea932b7d8c6c9c207d5978e0f19027a", which means that the failing input was written to the corpus file testdata/fuzz/FuzzReverse/xxx.
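The corpus files are small text files. To my understanding of the version 1 corpus encoding, a file starts with the header line "go test fuzz v1" followed by one line per fuzz argument, written as a Go literal. The sketch below renders that text for a single string argument; corpusFileFor is a hypothetical helper name, and the exact encoding should be checked against the Go documentation.

```go
package main

import (
	"fmt"
	"strconv"
)

// corpusFileFor is a hypothetical helper that renders the text of a corpus
// file holding one string argument, assuming the "go test fuzz v1" format.
func corpusFileFor(s string) string {
	return "go test fuzz v1\nstring(" + strconv.Quote(s) + ")\n"
}

func main() {
	// Prints the header line followed by: string("æ")
	fmt.Print(corpusFileFor("æ"))
}
```

Saving such a file under testdata/fuzz/FuzzReverse adds its input to the seed corpus of FuzzReverse.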
The underlying mechanism of Go Fuzzing
When go test runs, it first compiles an executable for each package under test, then runs that executable to obtain the results of the package's TestXXX and BenchmarkXXX functions. Go fuzzing follows a similar pattern, with a few differences.
When go test is run with the -fuzz flag, it builds the test executable with coverage instrumentation so that it can be used for fuzzing. Most of the fuzzing logic is implemented in internal/fuzz.
After go test builds the executable, it runs it; this running process is called the coordinator process. The coordinator is started with most of the flags that were passed to go test, including -fuzz=pattern, which identifies the fuzz test to run.
Currently, each go test -fuzz=pattern invocation supports only one matching fuzz test. If the pattern matches multiple FuzzXXX functions, the following error is reported:
$ go1.18beta1 test -v -fuzz=Fuzz
testing: will not fuzz, -fuzz matches more than one fuzz test: [FuzzReverse FuzzReverse2]
FAIL
exit status 1
FAIL example/fuzz 0.752s
Once the coordinator process has started, its main logic is in fuzz.CoordinateFuzzing, which initializes the fuzzing system and starts the coordinator event loop.
The coordinator process starts multiple worker processes. Each worker runs the same executable as the coordinator, and the actual fuzzing is done by the workers. A worker process is started with the flag -test.fuzzworker, which marks it as a worker. The number of worker processes is equal to GOMAXPROCS.
For example, while go test -fuzz=pattern is running, you can run ps aux | grep fuzz to see the fuzzing-related processes.
$ ps aux | grep fuzz
xxx 13913 84.3 1.0 5219184 85124 s001 R+ 10:12下午 0:03.90 /var/folders/pv/_x849j6n22x37xxd9cstgwkr0000gn/T/go-build1953131131/b001/fuzz.test -test.fuzzworker -test.paniconexit0 -test.fuzzcachedir=/Users/xxx/Library/Caches/go-build/fuzz/example/fuzz -test.timeout=10m0s -test.fuzz=Fuzz
xxx 13910 81.9 1.0 5221180 86200 s001 R+ 10:12下午 0:03.94 /var/folders/pv/_x849j6n22x37xxd9cstgwkr0000gn/T/go-build1953131131/b001/fuzz.test -test.fuzzworker -test.paniconexit0 -test.fuzzcachedir=/Users/xxx/Library/Caches/go-build/fuzz/example/fuzz -test.timeout=10m0s -test.fuzz=Fuzz
xxx 13912 78.3 1.0 5219964 84984 s001 R+ 10:12下午 0:03.86 /var/folders/pv/_x849j6n22x37xxd9cstgwkr0000gn/T/go-build1953131131/b001/fuzz.test -test.fuzzworker -test.paniconexit0 -test.fuzzcachedir=/Users/xxx/Library/Caches/go-build/fuzz/example/fuzz -test.timeout=10m0s -test.fuzz=Fuzz
xxx 13911 74.5 1.0 5219184 85132 s001 R+ 10:12下午 0:03.76 /var/folders/pv/_x849j6n22x37xxd9cstgwkr0000gn/T/go-build1953131131/b001/fuzz.test -test.fuzzworker -test.paniconexit0 -test.fuzzcachedir=/Users/xxx/Library/Caches/go-build/fuzz/example/fuzz -test.timeout=10m0s -test.fuzz=Fuzz
xxx 13907 43.3 2.3 5944576 191172 s001 R+ 10:12下午 0:01.90 /var/folders/pv/_x849j6n22x37xxd9cstgwkr0000gn/T/go-build1953131131/b001/fuzz.test -test.paniconexit0 -test.fuzzcachedir=/Users/xxx/Library/Caches/go-build/fuzz/example/fuzz -test.timeout=10m0s -test.fuzz=Fuzz
xxx 13923 0.0 0.0 4268176 420 s000 R+ 10:12下午 0:00.00 grep fuzz
xxx 13891 0.0 0.2 5014396 16868 s001 S+ 10:12下午 0:00.52 /Users/xxx/sdk/go1.18beta2/bin/go test -fuzz=Fuzz
xxx 13890 0.0 0.0 4989312 4008 s001 S+ 10:12下午 0:00.01 go1.18beta2 test -fuzz=Fuzz
If a worker process crashes while fuzzing, the coordinator process can record the test data that caused the crash. If the coordinator ran the fuzzing itself, an input that crashes the program would crash the coordinator too, and there would be no way to record the failing input. The model used by Go fuzzing looks like this:
The coordinator process and each worker process communicate through a pair of pipes, using a JSON-based RPC protocol. The protocol is deliberately minimal: a complex RPC system such as gRPC is not needed, and the Go team did not want to introduce any new dependencies into the standard library.
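The idea of exchanging JSON messages over a pair of pipes can be sketched in a single process. The following is only an illustration: it uses in-memory pipes from io.Pipe and a goroutine in place of a separate worker process, and the message types fuzzRequest and fuzzResponse are hypothetical, not the types used in internal/fuzz.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// fuzzRequest and fuzzResponse are hypothetical message shapes.
type fuzzRequest struct {
	Input []byte `json:"input"`
}

type fuzzResponse struct {
	Failed bool   `json:"failed"`
	Err    string `json:"err,omitempty"`
}

// serveWorker reads JSON requests from r, runs target on each input and
// writes one JSON response per request to w. It returns nil on EOF.
func serveWorker(r io.Reader, w io.Writer, target func([]byte) error) error {
	dec, enc := json.NewDecoder(r), json.NewEncoder(w)
	for {
		var req fuzzRequest
		if err := dec.Decode(&req); err != nil {
			if err == io.EOF {
				return nil
			}
			return err
		}
		var resp fuzzResponse
		if err := target(req.Input); err != nil {
			resp.Failed, resp.Err = true, err.Error()
		}
		if err := enc.Encode(resp); err != nil {
			return err
		}
	}
}

// roundTrip plays the coordinator: it creates a pair of pipes, starts a
// worker goroutine, sends one input and waits for the response.
func roundTrip(input []byte, target func([]byte) error) (fuzzResponse, error) {
	toWorkerR, toWorkerW := io.Pipe()
	fromWorkerR, fromWorkerW := io.Pipe()
	go func() {
		fromWorkerW.CloseWithError(serveWorker(toWorkerR, fromWorkerW, target))
	}()
	defer toWorkerW.Close() // signals EOF so the worker goroutine exits

	var resp fuzzResponse
	if err := json.NewEncoder(toWorkerW).Encode(fuzzRequest{Input: input}); err != nil {
		return resp, err
	}
	err := json.NewDecoder(fromWorkerR).Decode(&resp)
	return resp, err
}

// failOnBoom is an example fuzz target that fails on inputs containing "boom".
func failOnBoom(in []byte) error {
	if bytes.Contains(in, []byte("boom")) {
		return errors.New("input contains boom")
	}
	return nil
}

func main() {
	resp, err := roundTrip([]byte("kaboom"), failOnBoom)
	fmt.Println(resp.Failed, resp.Err, err) // true input contains boom <nil>
}
```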
Each worker process saves its state in a memory-mapped file that is shared with the coordinator process. For the most part, this shared memory holds only the iteration count and the state of the random number generator. If a worker crashes, the coordinator can recover the worker's state from shared memory without the worker having to send a message through the pipe.
The entire Fuzzing process is divided into 3 stages:
Stage 1: Baseline coverage
When the coordinator process starts, it launches the worker processes. The coordinator then sends the workers the seed corpus (the inputs added with f.Add and the test inputs in the testdata/fuzz directory) and the cached corpus (located in a subdirectory of $GOCACHE).
Each worker process runs the specified input, and then reports a snapshot of its coverage counter to the coordinator process, and the coordinator will combine the collected coverage data of the workers into a coverage array.
This phase is called the baseline coverage collection phase. The workers will only run the specified input sent to them by the coordinator, and will not generate random test data.
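Combining coverage snapshots into one array can be pictured as a bitwise OR over counters. The sketch below is an assumption about the general technique, not the code in internal/fuzz; mergeCoverage is a hypothetical name, and treating each byte as a set of coverage bits is a simplification.

```go
package main

import "fmt"

// mergeCoverage ORs snapshot into combined in place and reports whether
// snapshot contained any bit that combined did not have yet.
// Both slices are expected to have the same length.
func mergeCoverage(combined, snapshot []byte) bool {
	grew := false
	for i := range snapshot {
		if snapshot[i]&^combined[i] != 0 {
			grew = true
		}
		combined[i] |= snapshot[i]
	}
	return grew
}

func main() {
	combined := make([]byte, 4)
	fmt.Println(mergeCoverage(combined, []byte{1, 0, 0, 0})) // true
	fmt.Println(mergeCoverage(combined, []byte{1, 0, 0, 0})) // false
	fmt.Println(mergeCoverage(combined, []byte{0, 2, 0, 0})) // true
	fmt.Println(combined)                                    // [1 2 0 0]
}
```

The boolean result is the same kind of check that later decides whether an input produced new coverage compared to the baseline.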
Stage 2: Fuzzing
At this stage, the coordinator process will send the seed corpus and cache corpus to the worker process again for real fuzzing.
Each worker process receives a copy of the input data and of the baseline coverage array from the coordinator. The worker then randomly mutates the given input to obtain new test data. There are many ways to mutate an input: flipping a bit (changing 0 to 1 or 1 to 0), deleting or adding bytes, and so on. The mutated data is then passed as an argument to the fuzz target function.
To reduce the communication overhead between the coordinator and the workers, each worker can keep mutating inputs and calling the fuzz target for 100ms without further input from the coordinator.
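The mutations mentioned above (flipping a bit, deleting a byte, adding a byte) can be sketched as follows. This is a simplified illustration with hypothetical names; the real mutator in internal/fuzz has more strategies and its own random number generator.

```go
package main

import (
	"fmt"
	"math/rand"
)

// newRNG returns a deterministic random source for the sketch.
func newRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// mutate returns a mutated copy of in, leaving in untouched. It picks one of
// three simple strategies: flip a bit, delete a byte, or insert a byte.
func mutate(r *rand.Rand, in []byte) []byte {
	out := append([]byte(nil), in...)
	switch op := r.Intn(3); {
	case op == 0 && len(out) > 0: // flip one bit
		i := r.Intn(len(out))
		out[i] ^= 1 << uint(r.Intn(8))
	case op == 1 && len(out) > 0: // delete one byte
		i := r.Intn(len(out))
		out = append(out[:i], out[i+1:]...)
	default: // insert one random byte
		i := r.Intn(len(out) + 1)
		out = append(out, 0)
		copy(out[i+1:], out[i:])
		out[i] = byte(r.Intn(256))
	}
	return out
}

func main() {
	r := newRNG(1)
	in := []byte("hello")
	for i := 0; i < 3; i++ {
		fmt.Printf("%q\n", mutate(r, in))
	}
}
```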
After each call to the fuzz target with generated data, the worker process checks two things:
- Whether the input produced new coverage compared to the baseline coverage array.
- Whether an error occurred, that is, whether the code called T.Fail or T.FailNow. Note: T.Error and T.Errorf call T.Fail automatically, and T.Fatal and T.Fatalf call T.FailNow automatically.
If one of the two is satisfied, the worker process will immediately send the input data to the coordinator process.
Stage 3: Minimization
If the input that the coordinator receives from a worker falls under scenario 1, that is, it produced new coverage, the coordinator compares that worker's coverage data with the current combined coverage array.
Another worker may already have found an input that provides the same coverage; if so, the coordinator simply ignores the input. If the new input really does provide new coverage, the coordinator sends it to a worker (possibly a different one) for minimization.
Minimization is a bit like fuzzing, but workers mutate randomly to create a smaller input that still yields new coverage. Smaller inputs generally make fuzzing faster, so it's worth spending time up front to make the fuzzing process faster. The worker process will report to the coordinator when it finishes minimizing, even if it fails to find a smaller input. The coordinator process will add this minimized input to the cache corpus and continue fuzzing. Later, the coordinator may send this minimized input to all workers for further fuzzing. This is how the fuzzing system automatically adjusts to find new coverage.
If the input that the coordinator receives from a worker falls under scenario 2, that is, an input that triggers an error, the coordinator sends the input back to a worker for minimization. In this scenario, the worker tries to find a smaller input that still triggers an error, although not necessarily the same error. After minimization, the coordinator stores the minimized input in testdata/fuzz/$FuzzTarget, gracefully shuts down all worker processes, and then exits with a non-zero status.
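To make the idea of minimization concrete, here is a small sketch that shrinks a failing input while a predicate keeps reporting failure. It swaps in a simple deterministic, greedy byte-removal strategy for illustration, whereas the text above describes workers that mutate randomly; minimize and hasNonASCII are hypothetical names.

```go
package main

import "fmt"

// minimize greedily tries to remove one byte at a time. A removal is kept
// whenever stillFails reports that the smaller candidate still fails.
func minimize(in []byte, stillFails func([]byte) bool) []byte {
	cur := append([]byte(nil), in...)
	for i := 0; i < len(cur); {
		cand := append(append([]byte(nil), cur[:i]...), cur[i+1:]...)
		if stillFails(cand) {
			cur = cand // keep the smaller input and retry the same index
		} else {
			i++ // this byte is needed to reproduce the failure
		}
	}
	return cur
}

// hasNonASCII is an example failure predicate: an input "fails" if it
// contains any byte outside the ASCII range.
func hasNonASCII(b []byte) bool {
	for _, c := range b {
		if c >= 0x80 {
			return true
		}
	}
	return false
}

func main() {
	fmt.Printf("%q\n", minimize([]byte("abc\xc3\xa6def"), hasNonASCII)) // "\xa6"
}
```

Note that the minimized input still fails the predicate but is no longer the original two-byte character, which mirrors the remark that the smaller input does not necessarily trigger the same error.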
If the worker process crashes during fuzzing, the coordinator process can use the input sent to the worker, the worker's RNG state, and the number of iterations (left in shared memory) to recover the input that caused the worker process to crash. The input to a crash is usually not minimized, because minimizing is a highly stateful process, and every crash destroys this state. Minimizing the crash-causing input is theoretically possible, but has not been implemented yet.
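Recovering a crashing input from the base input, the RNG state and the iteration count works because pseudo-random mutation is deterministic for a given state. The sketch below illustrates that principle only: workerState, flipBit and replay are hypothetical names, math/rand stands in for the generator actually used by the fuzzing engine, and applying the mutations cumulatively is an assumption of this sketch.

```go
package main

import (
	"bytes"
	"fmt"
	"math/rand"
)

// workerState is a hypothetical stand-in for the state a worker keeps in
// shared memory.
type workerState struct {
	Seed  int64 // RNG seed in effect when the base input was received
	Count int   // number of mutations applied so far
}

// flipBit flips one pseudo-randomly chosen bit of b in place.
func flipBit(r *rand.Rand, b []byte) {
	if len(b) == 0 {
		return
	}
	i := r.Intn(len(b))
	bit := uint(r.Intn(8))
	b[i] ^= 1 << bit
}

// replay recomputes the input a worker was running by re-applying the same
// sequence of mutations to a copy of base.
func replay(base []byte, st workerState) []byte {
	r := rand.New(rand.NewSource(st.Seed))
	cur := append([]byte(nil), base...)
	for i := 0; i < st.Count; i++ {
		flipBit(r, cur)
	}
	return cur
}

func main() {
	base := []byte("seed")
	st := workerState{Seed: 42, Count: 7}
	a := replay(base, st)
	b := replay(base, st)
	fmt.Println(bytes.Equal(a, b)) // true
}
```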
Fuzzing keeps running until one of the following happens:
- Fuzzing finds an error, that is, an input triggers a failure condition in your fuzz test
- The user presses Ctrl-C to interrupt the program
- The running time reaches the limit set with -fuzztime
The fuzzing engine handles interrupts gracefully, regardless of whether the interrupt is sent to the coordinator process or the worker process. For example, if the worker process is interrupted while minimizing input, the coordinator process will save the input that was not minimized.
Notes
- FuzzXXX functions are also placed in Go files ending in _test.go.
- Seed corpus: contains both the inputs specified with f.Add and the inputs in the files under the testdata/fuzz/$FuzzTarget directory.
- Without the -fuzz flag, go test runs both the TestXXX and the FuzzXXX functions by default. For FuzzXXX functions it uses only the inputs from the seed corpus and does not generate random data. To generate random inputs, use go test -fuzz=pattern.
Open source address
The articles and sample code are open source on GitHub: Beginner, Intermediate, and Advanced Tutorials in Go.
Official account: coding advanced. Follow the official account to get the latest Go interview questions and technology stacks.
Personal website: Jincheng's Blog .
References
- Internals of Go's New Fuzzing System: https://jayconrod.com/posts/123/internals-of-go-s-new-fuzzing-system
- Introduction to Fuzzing: https://go.dev/doc/fuzz/
- Fuzzing Design Draft: https://go.googlesource.com/proposal/+/master/design/draft-fuzzing.md
- Fuzzing proposal: https://github.com/golang/go/issues/44551
- Fuzzing Tutorial: https://go.dev/doc/tutorial/fuzz
- testing.F documentation: https://pkg.go.dev/testing@go1.18#F
- Fuzzing Testing in Go in 8 Minutes: https://www.youtube.com/watch?v=w8STTZWdG9Y
- GitHub open source tool go-fuzz: https://github.com/dvyukov/go-fuz
- Go fuzzing bug finding example: https://julien.ponge.org/blog/playing-with-test-fuzzing-in-go/