Foreword
Go 1.18 added fuzzing to the Go toolchain. Fuzzing helps find bugs in Go code, including inputs that cause a program to crash. The Go team has also published an introductory fuzzing tutorial on the official website to help you get started quickly.
I translated the official Go tutorial and adjusted some of the wording to make it easier for readers to follow.
Note: fuzzing complements Go's existing unit testing and benchmarking frameworks; it does not replace them.
Tutorial content
This tutorial covers the basics of fuzzing in Go. Fuzzing runs your code against randomly generated data to find bugs or inputs that could crash a program. Vulnerabilities that fuzzing can uncover include SQL injection, buffer overflow, denial of service, and cross-site scripting (XSS) attacks.
In this tutorial, you will write a fuzz test for a function, run the go command to find problems in the code, and then debug and fix them.
For the terminology used in this article, see the Go Fuzzing glossary.
The tutorial proceeds through the following sections:
- Create a directory for your code
- Implement a function
- Add unit tests
- Add fuzzing
- Fix 2 bugs
- Summary
Prerequisites
- Go 1.18 Beta 1 or newer. For installation instructions, see the section below.
- A tool to edit your code. Any text editor will do.
- A command line terminal. Go works with any terminal on Linux and Mac, and with PowerShell or cmd on Windows.
- An environment that supports fuzzing. Currently, Go fuzzing is only supported on the AMD64 and ARM64 architectures.
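If you are unsure which architecture your Go installation targets, a quick check such as the following may help. This is only a sketch: fuzzingSupported is a hypothetical helper, and its list of architectures is taken from the statement above rather than from the Go toolchain itself.

```go
package main

import (
	"fmt"
	"runtime"
)

// fuzzingSupported reports whether arch is one of the two architectures
// that this tutorial lists as supported. The list is an assumption copied
// from the text, not queried from the toolchain.
func fuzzingSupported(arch string) bool {
	return arch == "amd64" || arch == "arm64"
}

func main() {
	// runtime.GOARCH is the architecture the running binary was built for.
	fmt.Printf("GOARCH=%s, fuzzing supported per this tutorial: %t\n",
		runtime.GOARCH, fuzzingSupported(runtime.GOARCH))
}
```

Running go env GOARCH in a terminal shows the same architecture value without writing any code.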
Install and use beta versions
This tutorial requires the fuzzing feature available in Go 1.18 Beta 1 or newer. Follow the steps below to install the beta version.
Install the beta version with the following command:
$ go install golang.org/dl/go1.18beta1@latest
Run the following command to download the update
$ go1.18beta1 download
Note: If running go1.18beta1 on Mac or Linux reports "command not found", set the environment variables in the profile file for bash or zsh. For bash, add the following to ~/.bash_profile:
export GOROOT=/usr/local/opt/go/libexec
export GOPATH=$HOME/go
export PATH=$PATH:$GOROOT/bin:$GOPATH/bin
The values of GOROOT and GOPATH can be looked up with the go env command. After editing the file, run source ~/.bash_profile to apply the settings; go1.18beta1 will then run without the error.
Use the beta version of the go command, not the released version. You can either invoke go1.18beta1 directly or give it a simple alias.
Invoke go1.18beta1 directly:
$ go1.18beta1 version
Or alias go1.18beta1 to go:
$ alias go=go1.18beta1
$ go version
The rest of this tutorial assumes that you have aliased go1.18beta1 to go.
Create a directory for your code
First, create a directory for the code you will write.
Open a command line terminal and change to your home directory.
On Linux or Mac, run the following command (cd with no arguments takes you to the home directory):
$ cd
On Windows, run the following command:
C:\> cd %HOMEPATH%
In the terminal, create a directory named fuzz and change into it:
$ mkdir fuzz
$ cd fuzz
Create a Go module
Run the go mod init command to set the module path for your project:
$ go mod init example/fuzz
Note: For production code, choose a module path that fits your project. To learn more, see Go Module Dependency Management.
Next, let's write some simple code to reverse a string, and then fuzz it.
Implement a function
In this section, you will implement a function that reverses a string.
Write code
- Open your text editor and create a main.go source file in the fuzz directory. Write the following code in main.go:
// main.go
package main

import "fmt"

// Reverse returns s with its bytes in reverse order.
func Reverse(s string) string {
	b := []byte(s)
	for i, j := 0, len(b)-1; i < len(b)/2; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
	return string(b)
}

func main() {
	input := "The quick brown fox jumped over the lazy dog"
	rev := Reverse(input)
	doubleRev := Reverse(rev)
	fmt.Printf("original: %q\n", input)
	fmt.Printf("reversed: %q\n", rev)
	fmt.Printf("reversed again: %q\n", doubleRev)
}
Run the code
Run go run . in the directory that contains main.go. The output is as follows:
$ go run .
original: "The quick brown fox jumped over the lazy dog"
reversed: "god yzal eht revo depmuj xof nworb kciuq ehT"
reversed again: "The quick brown fox jumped over the lazy dog"
Add unit tests
In this section, you will write a unit test for the Reverse function.
Write unit tests
- Create the file reverse_test.go in the fuzz directory. Write the following code in reverse_test.go:
package main

import (
	"testing"
)

func TestReverse(t *testing.T) {
	testcases := []struct {
		in, want string
	}{
		{"Hello, world", "dlrow ,olleH"},
		{" ", " "},
		{"!12345", "54321!"},
	}
	for _, tc := range testcases {
		rev := Reverse(tc.in)
		if rev != tc.want {
			t.Errorf("Reverse: %q, want %q", rev, tc.want)
		}
	}
}
Run unit tests
Run the unit tests with the go test command:
$ go test
PASS
ok example/fuzz 0.013s
Next, we add a fuzz test for the Reverse function.
Add fuzzing
Unit tests have a limitation: every test input must be chosen by the developer and added to the test cases by hand.
One advantage of fuzzing is that it takes the inputs supplied in the test code as a starting point and automatically generates new random inputs from them, which can reach edge cases that the hand-written inputs do not cover.
In this section, we will convert the unit test into a fuzz test, so that many more test inputs can be generated with little effort.
Note: You can keep unit tests, benchmarks, and fuzz tests in the same *_test.go file.
Write the fuzz test
In a text editor, replace the unit test TestReverse in reverse_test.go with the fuzz test FuzzReverse.
func FuzzReverse(f *testing.F) {
	testcases := []string{"Hello, world", " ", "!12345"}
	for _, tc := range testcases {
		f.Add(tc) // Use f.Add to provide a seed corpus
	}
	f.Fuzz(func(t *testing.T, orig string) {
		rev := Reverse(orig)
		doubleRev := Reverse(rev)
		if orig != doubleRev {
			t.Errorf("Before: %q, after: %q", orig, doubleRev)
		}
		if utf8.ValidString(orig) && !utf8.ValidString(rev) {
			t.Errorf("Reverse produced invalid UTF-8 string %q", rev)
		}
	})
}
Fuzzing also has limitations.
In the unit test, the inputs are fixed, so you know what the reversed string of each input should be and can check in the test code whether the result of Reverse matches the expectation. For example, for the test case Reverse("Hello, world"), the expected result is "dlrow ,olleH".
With fuzzing, we cannot predict the expected output, because apart from the cases listed in our code the inputs are randomly generated. There is no way to know in advance what the output for a generated input should be.
Even so, the Reverse function in this article has a few properties that we can still verify in a fuzz test:
- Reversing a string twice yields the original string
- The reversed string is still a valid UTF-8 string
Note: fuzzing complements Go's existing unit testing and benchmarking frameworks; it does not replace them.
For example, if our Reverse implementation were wrong and simply returned its input unchanged, it would pass the fuzz test above, but it would fail the unit test we wrote earlier. This is why unit testing and fuzzing complement each other rather than substitute for one another.
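The point about a wrong implementation slipping past the fuzz properties can be checked outside the test framework. The sketch below is illustrative only: identity and passesFuzzProperties are hypothetical names introduced here, and the property checks simply mirror the two conditions in FuzzReverse.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// identity is a deliberately wrong "reverse" that returns its input
// unchanged. It is a hypothetical stand-in, not code from the tutorial.
func identity(s string) string { return s }

// passesFuzzProperties checks, for one input, the two properties that
// FuzzReverse verifies: reversing twice restores the input, and valid
// UTF-8 input produces valid UTF-8 output.
func passesFuzzProperties(reverse func(string) string, orig string) bool {
	rev := reverse(orig)
	doubleRev := reverse(rev)
	if orig != doubleRev {
		return false
	}
	if utf8.ValidString(orig) && !utf8.ValidString(rev) {
		return false
	}
	return true
}

func main() {
	in, want := "Hello, world", "dlrow ,olleH"
	// The properties hold even though nothing was reversed.
	fmt.Println("fuzz properties hold:", passesFuzzProperties(identity, in))
	// The fixed expectation from the unit test catches the mistake.
	fmt.Println("unit expectation holds:", identity(in) == want)
}
```

The properties say nothing about what the reversed string should be, which is exactly the gap that the unit test's expected values fill.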
The syntax differences between a Go fuzz test and a unit test are as follows:
- A fuzz test function is named FuzzXxx, whereas a unit test function is named TestXxx
- A fuzz test function takes *testing.F as its parameter, whereas a unit test function takes *testing.T
- A fuzz test calls f.Add and f.Fuzz. f.Add adds the given input to the seed corpus, and fuzzing generates random inputs based on the seed corpus. f.Fuzz takes a fuzz target function as its argument. The first parameter of the fuzz target is *testing.T, and the remaining parameters are the types to be fuzzed (Note: only some built-in types can currently be fuzzed, as listed in the Go Fuzzing docs; more built-in types will be supported in the future).
The FuzzReverse function above uses the utf8 package, so import it at the top of reverse_test.go, as shown below:
package main

import (
	"testing"
	"unicode/utf8"
)
Run the fuzz test
Run the following command to run the fuzz test.
Run this way, the test only uses the seed corpus and does not generate random test data. This is useful for checking that the seed corpus entries pass the test (a fuzz test without fuzzing).
$ go test
PASS
ok example/fuzz 0.013s
If reverse_test.go contains other unit tests or fuzz tests and you only want to run the FuzzReverse fuzz test, run go test -run=FuzzReverse.
Note: By default, go test runs all unit test functions named TestXxx and all fuzz test functions named FuzzXxx. Benchmark functions named BenchmarkXxx are not run by default; to run them, add the -bench flag.
To generate random test data based on the seed corpus, add the -fuzz flag to the go test command (a fuzz test with fuzzing).
$ go test -fuzz=Fuzz
fuzz: elapsed: 0s, gathering baseline coverage: 0/3 completed
fuzz: elapsed: 0s, gathering baseline coverage: 3/3 completed, now fuzzing with 8 workers
fuzz: minimizing 38-byte failing input file...
--- FAIL: FuzzReverse (0.01s)
--- FAIL: FuzzReverse (0.00s)
reverse_test.go:20: Reverse produced invalid UTF-8 string "\x9c\xdd"
Failing input written to testdata/fuzz/FuzzReverse/af69258a12129d6cbba438df5d5f25ba0ec050461c116f777e77ea7c9a0d217a
To re-run:
go test -run=FuzzReverse/af69258a12129d6cbba438df5d5f25ba0ec050461c116f777e77ea7c9a0d217a
FAIL
exit status 1
FAIL example/fuzz 0.030s
The fuzz test above ended with FAIL, and the input that caused the failure was written to a corpus file. The next time you run go test, the test data in this corpus file will be used even without the -fuzz flag.
You can open the file in the testdata/fuzz/FuzzReverse directory with a text editor to see what the failing test data looks like. Below is an example file; the test data you get may differ, but the file format will be the same.
go test fuzz v1
string("泃")
The first line of the corpus file identifies the encoding version, that is, the version of the file format used by the corpus file. Only v1 exists so far, but the fuzzing designers added the version so that a new encoding can be introduced in the future.
From the second line on, each line holds one parameter value of the corpus entry, in the order of the fuzz target's parameters.
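To make the file layout concrete, here is a minimal sketch that renders a corpus entry for a fuzz target whose fuzzed parameters are all strings. formatStringEntry is a hypothetical helper written for this article; the go tool writes these files itself, and the sketch assumes that Go's %q quoting matches what the tool produces for the example shown above.

```go
package main

import (
	"fmt"
	"strings"
)

// formatStringEntry renders a corpus entry in the layout described above:
// a version header followed by one line per string parameter.
// Hypothetical helper for illustration; only string parameters are handled.
func formatStringEntry(args ...string) string {
	var b strings.Builder
	b.WriteString("go test fuzz v1\n")
	for _, a := range args {
		// %q produces a double-quoted Go string literal.
		fmt.Fprintf(&b, "string(%q)\n", a)
	}
	return b.String()
}

func main() {
	fmt.Print(formatStringEntry("泃"))
}
```

A fuzz target with two string parameters would yield a header line plus two string(...) lines, one per parameter.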
f.Fuzz(func(t *testing.T, orig string) {
	rev := Reverse(orig)
	doubleRev := Reverse(rev)
	if orig != doubleRev {
		t.Errorf("Before: %q, after: %q", orig, doubleRev)
	}
	if utf8.ValidString(orig) && !utf8.ValidString(rev) {
		t.Errorf("Reverse produced invalid UTF-8 string %q %q", orig, rev)
	}
})
The fuzz target func(t *testing.T, orig string) in FuzzReverse has only one fuzzed parameter, orig, so each corpus entry consists of a single input. That is why the example file under testdata/fuzz/FuzzReverse contains only the one line string("泃").
If each test input has N parameters, then every failing input found by fuzzing is stored as N lines after the version line in the file under the testdata directory, and the i-th of those lines corresponds to the i-th parameter.
Run go test again, this time without the -fuzz flag.
Even without the -fuzz flag, the failing test data found above is used when the test runs.
$ go test
--- FAIL: FuzzReverse (0.00s)
--- FAIL: FuzzReverse/af69258a12129d6cbba438df5d5f25ba0ec050461c116f777e77ea7c9a0d217a (0.00s)
reverse_test.go:20: Reverse produced invalid string
FAIL
exit status 1
FAIL example/fuzz 0.016s
Since the Go fuzzing test failed, we need to debug the code to find the problem.
Fix 2 bugs
In this chapter, we will debug the program and fix the bugs detected by Go fuzzing.
You can take some time to think for yourself and try to solve the problem yourself first.
Locate the problem
There are several ways to debug the bug found above.
If you use VS Code, you can set up its debugger and add breakpoints.
In this article, we will debug by printing logs.
The error message from the fuzz test is: reverse_test.go:20: Reverse produced invalid UTF-8 string "\x9c\xdd"
Based on this error, let's look at the documentation for utf8.ValidString.
ValidString reports whether s consists entirely of valid UTF-8-encoded runes.
Our Reverse function reverses the string byte by byte, and that is the problem.
For example, the Chinese character 泃 is made up of 3 bytes. If it is reversed byte by byte, the result is an invalid string.
Therefore, to make sure the reversed string is still valid UTF-8, we need to reverse the string rune by rune.
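The effect of byte-wise reversal on a multi-byte character can be reproduced in isolation. In the sketch below, reverseBytes is a hypothetical name for a function that behaves like the byte-wise Reverse from main.go; the program prints the bytes before and after reversal and whether each string is valid UTF-8.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// reverseBytes reverses s byte by byte, like the tutorial's first Reverse.
func reverseBytes(s string) string {
	b := []byte(s)
	for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
	return string(b)
}

func main() {
	orig := "泃"
	rev := reverseBytes(orig)
	// "% x" prints the bytes of a string in hex, separated by spaces.
	fmt.Printf("orig bytes: % x, valid UTF-8: %t\n", orig, utf8.ValidString(orig))
	fmt.Printf("rev  bytes: % x, valid UTF-8: %t\n", rev, utf8.ValidString(rev))
}
```

For ASCII-only strings such as the seed corpus entries, each character is one byte, so byte-wise and rune-wise reversal give the same result and the bug stays hidden.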
To make it easier to see how many runes the Chinese character 泃 contains and what the result looks like after byte-wise reversal, we make a small change to the code.
Write code
Modify the fuzz target in FuzzReverse as follows.
f.Fuzz(func(t *testing.T, orig string) {
	rev := Reverse(orig)
	doubleRev := Reverse(rev)
	t.Logf("Number of runes: orig=%d, rev=%d, doubleRev=%d", utf8.RuneCountInString(orig), utf8.RuneCountInString(rev), utf8.RuneCountInString(doubleRev))
	if orig != doubleRev {
		t.Errorf("Before: %q, after: %q", orig, doubleRev)
	}
	if utf8.ValidString(orig) && !utf8.ValidString(rev) {
		t.Errorf("Reverse produced invalid UTF-8 string %q", rev)
	}
})
Run the code
$ go test
--- FAIL: FuzzReverse (0.00s)
--- FAIL: FuzzReverse/28f36ef487f23e6c7a81ebdaa9feffe2f2b02b4cddaa6252e87f69863046a5e0 (0.00s)
reverse_test.go:16: Number of runes: orig=1, rev=3, doubleRev=1
reverse_test.go:21: Reverse produced invalid UTF-8 string "\x83\xb3\xe6"
FAIL
exit status 1
FAIL example/fuzz 0.598s
Every character in our seed corpus is a single byte. However, Chinese characters such as 泃 consist of multiple bytes, and reversing them byte by byte produces invalid results.
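The rune counts in the log line can be reproduced directly. The following sketch uses a hypothetical helper, byteAndRuneCount, to compare byte length and rune count for the original character and for its byte-reversed form; utf8.RuneCountInString treats each byte that cannot be decoded as a single rune of width one.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneCount returns the number of bytes and the number of runes in s.
// Hypothetical helper for illustration.
func byteAndRuneCount(s string) (bytes, runes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	orig := "泃"          // one character encoded as three bytes
	rev := "\x83\xb3\xe6" // the same three bytes in reverse order

	b, r := byteAndRuneCount(orig)
	fmt.Printf("orig: %d bytes, %d rune(s)\n", b, r)

	b, r = byteAndRuneCount(rev)
	fmt.Printf("rev:  %d bytes, %d rune(s)\n", b, r)
}
```

The byte count is the same for both strings; only the rune count changes, because none of the reversed bytes forms a valid encoding on its own.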
Note: If you are interested in how Go handles strings, the official blog post Strings, bytes, runes and characters in Go gives a deeper explanation.
Now that we have identified the problem, we can fix the bug.
Fix the problem
Reverse the string rune by rune.
Write code
Modify the implementation of the Reverse function as follows:
func Reverse(s string) string {
	r := []rune(s)
	for i, j := 0, len(r)-1; i < len(r)/2; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}
Run the code
Run the go test command:
$ go test
PASS
ok example/fuzz 0.016s
The test passed! (Don't celebrate too early: this run only covers the seed corpus and the previously failing input.)
Run go test -fuzz again to see whether there are any new bugs:
$ go test -fuzz=Fuzz
fuzz: elapsed: 0s, gathering baseline coverage: 0/37 completed
fuzz: minimizing 506-byte failing input file...
fuzz: elapsed: 0s, gathering baseline coverage: 5/37 completed
--- FAIL: FuzzReverse (0.02s)
--- FAIL: FuzzReverse (0.00s)
reverse_test.go:33: Before: "\x91", after: "�"
Failing input written to testdata/fuzz/FuzzReverse/1ffc28f7538e29d79fce69fef20ce5ea72648529a9ca10bea392bcff28cd015c
To re-run:
go test -run=FuzzReverse/1ffc28f7538e29d79fce69fef20ce5ea72648529a9ca10bea392bcff28cd015c
FAIL
exit status 1
FAIL example/fuzz 0.032s
We can see from the error that reversing a string twice yields a string that differs from the original.
This time the input itself is invalid Unicode. But why does reversing it twice produce a different string?
Let's continue debugging.
Fix the double-reversal bug
Locate the problem
A debugger with breakpoints would work well for this problem. For ease of explanation, this article debugs by adding log output.
We can locate the problem by looking closely at the result of the first reversal of the original string.
Write code
Modify the Reverse function as follows:
func Reverse(s string) string {
	fmt.Printf("input: %q\n", s)
	r := []rune(s)
	fmt.Printf("runes: %q\n", r)
	for i, j := 0, len(r)-1; i < len(r)/2; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r)
}
This helps us understand what happens when the original string is converted to a slice of runes.
Run the code
This time we only run the test data that failed the fuzz test, using the go test -run command.
To run a specific corpus entry in the testdata/fuzz/FuzzXxx directory, pass {FuzzTestName}/{filename} as the value of the -run flag. This lets us focus on the input that failed the fuzz test.
$ go test -run=FuzzReverse/28f36ef487f23e6c7a81ebdaa9feffe2f2b02b4cddaa6252e87f69863046a5e0
input: "\x91"
runes: ['�']
input: "�"
runes: ['�']
--- FAIL: FuzzReverse (0.00s)
--- FAIL: FuzzReverse/28f36ef487f23e6c7a81ebdaa9feffe2f2b02b4cddaa6252e87f69863046a5e0 (0.00s)
reverse_test.go:16: Number of runes: orig=1, rev=1, doubleRev=1
reverse_test.go:18: Before: "\x91", after: "�"
FAIL
exit status 1
FAIL example/fuzz 0.145s
First, we need to understand that in Go a string is a read-only slice of bytes, and the bytes in that slice are not necessarily valid UTF-8. For details, see "a string is a read only slice of bytes".
In the example above, the input string is a byte slice containing a single byte, \x91.
When we convert this input string to []rune, Go decodes the bytes as UTF-8 and replaces the invalid byte \x91 with the Unicode replacement character '�'. Reversing '�' still gives '�', so reversing the original string \x91 produces the string '�'.
Now the problem is clear: the input data is invalid Unicode. We can therefore modify the implementation of the Reverse function.
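The conversion that causes the mismatch can be observed without any reversal at all. In this sketch, roundTrip is a hypothetical helper that converts a string to a rune slice and back; for the invalid byte \x91 the result differs from the input, which is why the double reversal cannot restore the original.

```go
package main

import "fmt"

// roundTrip converts s to a slice of runes and back to a string,
// without reversing anything. Hypothetical helper for illustration.
func roundTrip(s string) string {
	return string([]rune(s))
}

func main() {
	orig := "\x91" // a single byte that is not valid UTF-8
	got := roundTrip(orig)
	fmt.Printf("orig: %q (bytes: % x)\n", orig, orig)
	fmt.Printf("got:  %q (bytes: % x)\n", got, got)
	fmt.Println("unchanged:", orig == got)
}
```

Valid UTF-8 input survives the same round trip unchanged, so the problem only appears for inputs that are not valid UTF-8 to begin with.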
Fix the problem
The fix is to check in Reverse whether the input is a valid UTF-8 string and to return an error if it is not.
Write code
Modify Reverse as follows:
func Reverse(s string) (string, error) {
	if !utf8.ValidString(s) {
		return s, errors.New("input is not valid UTF-8")
	}
	r := []rune(s)
	for i, j := 0, len(r)-1; i < len(r)/2; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r), nil
}
Since the Reverse function now returns an error, the corresponding code in main.go needs to be modified as follows:
func main() {
	input := "The quick brown fox jumped over the lazy dog"
	rev, revErr := Reverse(input)
	doubleRev, doubleRevErr := Reverse(rev)
	fmt.Printf("original: %q\n", input)
	fmt.Printf("reversed: %q, err: %v\n", rev, revErr)
	fmt.Printf("reversed again: %q, err: %v\n", doubleRev, doubleRevErr)
}
Because the input in main is a valid UTF-8 string, both calls to Reverse return a nil error.
Since the Reverse function uses the errors and utf8 packages, both must be imported at the top of main.go:
import (
	"errors"
	"fmt"
	"unicode/utf8"
)
Similarly, we need to modify reverse_test.go. For invalid string input, the test can simply be skipped.
func FuzzReverse(f *testing.F) {
	testcases := []string{"Hello, world", " ", "!12345"}
	for _, tc := range testcases {
		f.Add(tc) // Use f.Add to provide a seed corpus
	}
	f.Fuzz(func(t *testing.T, orig string) {
		rev, err1 := Reverse(orig)
		if err1 != nil {
			return
		}
		doubleRev, err2 := Reverse(rev)
		if err2 != nil {
			return
		}
		if orig != doubleRev {
			t.Errorf("Before: %q, after: %q", orig, doubleRev)
		}
		if utf8.ValidString(orig) && !utf8.ValidString(rev) {
			t.Errorf("Reverse produced invalid UTF-8 string %q", rev)
		}
	})
}
Instead of returning, you can also call t.Skip() to stop processing the current input and move on to the next one.
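To see both outcomes of the error-returning Reverse outside the test framework, a small program along the following lines may help. Reverse is the version from this section; the two inputs and the way the error is reported are illustrative choices, not part of the tutorial.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// Reverse reverses s rune by rune and rejects input that is not valid UTF-8.
func Reverse(s string) (string, error) {
	if !utf8.ValidString(s) {
		return s, errors.New("input is not valid UTF-8")
	}
	r := []rune(s)
	for i, j := 0, len(r)-1; i < len(r)/2; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r), nil
}

func main() {
	// One valid multi-byte input and one invalid single-byte input.
	for _, in := range []string{"泃abc", "\x91"} {
		rev, err := Reverse(in)
		if err != nil {
			fmt.Printf("Reverse(%q): error: %v\n", in, err)
			continue
		}
		fmt.Printf("Reverse(%q) = %q\n", in, rev)
	}
}
```

Returning an error leaves the decision to the caller: main prints it, while the fuzz target treats such inputs as uninteresting and moves on.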
Run the code
Run the tests:
$ go test
PASS
ok example/fuzz 0.019s
Run the fuzz test with go test -fuzz=Fuzz and, after a few seconds, stop it with ctrl-C.
$ go test -fuzz=Fuzz
fuzz: elapsed: 0s, gathering baseline coverage: 0/38 completed
fuzz: elapsed: 0s, gathering baseline coverage: 38/38 completed, now fuzzing with 4 workers
fuzz: elapsed: 3s, execs: 86342 (28778/sec), new interesting: 2 (total: 35)
fuzz: elapsed: 6s, execs: 193490 (35714/sec), new interesting: 4 (total: 37)
fuzz: elapsed: 9s, execs: 304390 (36961/sec), new interesting: 4 (total: 37)
...
fuzz: elapsed: 3m45s, execs: 7246222 (32357/sec), new interesting: 8 (total: 41)
^Cfuzz: elapsed: 3m48s, execs: 7335316 (31648/sec), new interesting: 8 (total: 41)
PASS
ok example/fuzz 228.000s
If the fuzz test encounters no errors, it keeps running indefinitely by default, and you need to press ctrl-C to stop it.
You can also pass the -fuzztime flag to limit how long fuzzing runs, so that ctrl-C is not needed.
With go test -fuzz=Fuzz -fuzztime 30s, fuzzing stops automatically after 30 seconds if no error is encountered.
$ go test -fuzz=Fuzz -fuzztime 30s
fuzz: elapsed: 0s, gathering baseline coverage: 0/5 completed
fuzz: elapsed: 0s, gathering baseline coverage: 5/5 completed, now fuzzing with 4 workers
fuzz: elapsed: 3s, execs: 80290 (26763/sec), new interesting: 12 (total: 12)
fuzz: elapsed: 6s, execs: 210803 (43501/sec), new interesting: 14 (total: 14)
fuzz: elapsed: 9s, execs: 292882 (27360/sec), new interesting: 14 (total: 14)
fuzz: elapsed: 12s, execs: 371872 (26329/sec), new interesting: 14 (total: 14)
fuzz: elapsed: 15s, execs: 517169 (48433/sec), new interesting: 15 (total: 15)
fuzz: elapsed: 18s, execs: 663276 (48699/sec), new interesting: 15 (total: 15)
fuzz: elapsed: 21s, execs: 771698 (36143/sec), new interesting: 15 (total: 15)
fuzz: elapsed: 24s, execs: 924768 (50990/sec), new interesting: 16 (total: 16)
fuzz: elapsed: 27s, execs: 1082025 (52427/sec), new interesting: 17 (total: 17)
fuzz: elapsed: 30s, execs: 1172817 (30281/sec), new interesting: 17 (total: 17)
fuzz: elapsed: 31s, execs: 1172817 (0/sec), new interesting: 17 (total: 17)
PASS
ok example/fuzz 31.025s
The fuzz test passed!
In addition to the -fuzz flag, several other new flags were added to go test. For details, see the documentation.
Summary
You have now learned how to use Go fuzzing.
Next, try using fuzzing to find bugs in code you have written.
If you do find a bug, consider submitting it to the trophy case.
If you run into any problems with Go fuzzing or want to request a feature, you can file an issue.
See the documentation at go.dev/doc/fuzz to learn more about Go fuzzing.
The complete code for this article is available in Go Fuzzing example code.
Open source
The article and sample code are open source on GitHub: Go language beginner, intermediate and advanced tutorials.
Personal website: Jincheng's Blog .
Zhihu: Wuji.
References
- Fuzzing Tutorial: https://go.dev/doc/tutorial/fuzz
- Fuzzing proposal: https://github.com/golang/go/issues/44551
- Introduction to Fuzzing: https://go.dev/doc/fuzz/