
Official tutorial: Go fuzzing

coding进阶

Foreword

Go 1.18 adds fuzzing to the Go toolchain. Fuzzing helps us find bugs in Go code and inputs that may cause a program to crash. The Go team has also published an introductory fuzzing tutorial on the official website to help you get started quickly.

This article is a translation of the official Go tutorial, with some of the wording adjusted to make it easier to read.

Note: fuzzing complements Go's existing unit testing and benchmarking frameworks; it does not replace them.

Tutorial content

This tutorial covers the basics of getting started with Go fuzzing. Fuzzing runs your code against randomly generated data to find bugs and inputs that could cause a program to crash. Vulnerabilities that can be found through fuzzing include SQL injection, buffer overflow, denial of service (DoS), and cross-site scripting (XSS) attacks.

In this tutorial, you will write a fuzz test for a function, run the go command to find problems in the code, and then debug and fix them.

For the terminology used in this article, see the Go Fuzzing glossary.

The tutorial is divided into the following sections:

  1. Create a directory for your code
  2. Implement a function
  3. Add unit tests
  4. Add fuzzing
  5. Fix 2 bugs
  6. Summary

Prerequisites

  • Install Go 1.18 Beta 1 or newer. For installation instructions, see the section below.
  • A tool to edit your code. Any text editor will do.
  • A command-line terminal. Go works with any terminal on Linux and Mac, and with PowerShell or cmd on Windows.
  • An environment that supports fuzzing. Currently, Go fuzzing with coverage instrumentation is only available on the AMD64 and ARM64 architectures.
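If you are not sure which architecture your Go installation targets, a small program can print it. This is only a minimal sketch: fuzzArchSupported is a hypothetical helper, and its list of architectures simply restates the requirement above rather than querying the toolchain.

```go
package main

import (
	"fmt"
	"runtime"
)

// fuzzArchSupported reports whether arch is one of the two architectures
// named in the prerequisites above. The list is an assumption copied from
// this article; it is not read from the Go toolchain.
func fuzzArchSupported(arch string) bool {
	return arch == "amd64" || arch == "arm64"
}

func main() {
	// runtime.GOOS and runtime.GOARCH describe the platform the running
	// binary was built for.
	fmt.Printf("GOOS=%s GOARCH=%s listed as supported: %t\n",
		runtime.GOOS, runtime.GOARCH, fuzzArchSupported(runtime.GOARCH))
}
```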

Install and use beta versions

This tutorial requires the fuzzing feature available in Go 1.18 Beta 1 or above. Use the following steps to install the beta version.

  1. Install the beta version using the command below

    $ go install golang.org/dl/go1.18beta1@latest
  2. Run the following command to download the update

    $ go1.18beta1 download

    Note: If running go1.18beta1 on Mac or Linux reports command not found, add the following environment variables to the profile file of your shell (bash or zsh). For bash, this is the ~/.bash_profile file:

    export GOROOT=/usr/local/opt/go/libexec
    export GOPATH=$HOME/go
    export PATH=$PATH:$GOROOT/bin:$GOPATH/bin

    You can check the values of GOROOT and GOPATH with the go env command. After editing the file, run source ~/.bash_profile for the settings to take effect; go1.18beta1 should then run without errors.

  3. Use the beta version of the go command instead of the release version

    You can either run the go1.18beta1 command directly or give it a simple alias.

    • Use the go1.18beta1 command directly

      $ go1.18beta1 version
    • Give an alias to the go1.18beta1 command

      $ alias go=go1.18beta1
      $ go version

    The rest of this tutorial assumes that you have aliased the go1.18beta1 command to go.

Create a directory for your code

First, create a directory to store the code you write.

  1. Open a command line terminal and change to your home directory

    • On Linux or Mac, run the following command (cd with no arguments changes to your home directory)

      cd
    • Execute the following command on Windows

      C:\> cd %HOMEPATH%
  2. In a command line terminal, create a directory named fuzz and enter that directory

    $ mkdir fuzz
    $ cd fuzz
  3. Create a Go module

    Run the go mod init command to set the module path for your project

    $ go mod init example/fuzz

    Note: For production code, choose a module path that fits your project. To learn more, see Go Module Dependency Management.

Next, let's write some simple code to reverse a string, and then use fuzzing to test it.

Implement a function

In this section, you will implement a function that reverses a string.

Write code

  1. Open your text editor and create a main.go source file in the fuzz directory.
  2. Write the following code in main.go :

    // main.go
    package main
    
    import "fmt"
    
    func Reverse(s string) string {
        b := []byte(s)
        for i, j := 0, len(b)-1; i < len(b)/2; i, j = i+1, j-1 {
            b[i], b[j] = b[j], b[i]
        }
        return string(b)
    }
    
    func main() {
        input := "The quick brown fox jumped over the lazy dog"
        rev := Reverse(input)
        doubleRev := Reverse(rev)
        fmt.Printf("original: %q\n", input)
        fmt.Printf("reversed: %q\n", rev)
        fmt.Printf("reversed again: %q\n", doubleRev)
    }

Run the code

Run go run . in the directory that contains main.go. The output is as follows:

$ go run .
original: "The quick brown fox jumped over the lazy dog"
reversed: "god yzal eht revo depmuj xof nworb kciuq ehT"
reversed again: "The quick brown fox jumped over the lazy dog"

Add unit tests

In this section, you will write unit tests for the Reverse function.

Write unit tests

  1. Create the file reverse_test.go in the fuzz directory.
  2. Write the following code in reverse_test.go :

    package main
    
    import (
        "testing"
    )
    
    func TestReverse(t *testing.T) {
        testcases := []struct {
            in, want string
        }{
            {"Hello, world", "dlrow ,olleH"},
            {" ", " "},
            {"!12345", "54321!"},
        }
        for _, tc := range testcases {
            rev := Reverse(tc.in)
            if rev != tc.want {
                t.Errorf("Reverse: %q, want %q", rev, tc.want)
            }
        }
    }

Run unit tests

Use the go test command to run the unit tests:

$ go test
PASS
ok      example/fuzz  0.013s

Next, we add a fuzz test for the Reverse function.

Add fuzzing

Unit tests have a limitation: every test input must be specified by the developer and added to the test cases.

One advantage of fuzzing is that it takes the inputs specified in your test code as base data and automatically generates new random inputs from them, which can uncover edge cases that the specified inputs do not cover.

In this section, we will convert the unit test into a fuzz test so that more test inputs can be generated with less effort.

Note: You can keep unit tests, benchmarks, and fuzz tests in the same *_test.go file.

Write fuzz tests

In a text editor, replace the unit test TestReverse in reverse_test.go with the following fuzz test FuzzReverse.

func FuzzReverse(f *testing.F) {
    testcases := []string{"Hello, world", " ", "!12345"}
    for _, tc := range testcases {
        f.Add(tc)  // Use f.Add to provide a seed corpus
    }
    f.Fuzz(func(t *testing.T, orig string) {
        rev := Reverse(orig)
        doubleRev := Reverse(rev)
        if orig != doubleRev {
            t.Errorf("Before: %q, after: %q", orig, doubleRev)
        }
        if utf8.ValidString(orig) && !utf8.ValidString(rev) {
            t.Errorf("Reverse produced invalid UTF-8 string %q", rev)
        }
    })
}

Fuzzing also has certain limitations.

In the unit test, the test inputs are fixed, so you know what the reversed string should be for each input and can check in the test code whether the result of Reverse matches the expectation. For example, for the test case Reverse("Hello, world"), the expected result is "dlrow ,olleH".

But when fuzzing, we can't predict the expected output, because apart from the cases specified in our code, the test inputs are randomly generated.

Even so, the Reverse function in this article has a few properties that we can still verify in a fuzz test:

  1. Reverse a string twice, the result is the same as the source string
  2. The reversed string is still a valid UTF-8 encoded string

Note: fuzzing complements Go's existing unit testing and benchmarking frameworks; it does not replace them.

For example, if our Reverse function were implemented incorrectly and simply returned its input unchanged, it would pass the fuzz test above but fail the unit test we wrote earlier. This is why unit testing and fuzzing are complementary, not alternatives.
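The point about a wrong implementation slipping past the fuzz test can be checked with a short program. This is a minimal sketch: brokenReverse and holdsFuzzProperties are hypothetical names introduced here for illustration, and holdsFuzzProperties checks the same two properties as FuzzReverse for a single input.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// brokenReverse is a deliberately wrong implementation (hypothetical):
// it returns its input unchanged instead of reversing it.
func brokenReverse(s string) string {
	return s
}

// holdsFuzzProperties checks the two properties verified in FuzzReverse
// for one input: reversing twice restores the original, and a valid
// UTF-8 input stays valid after one reversal.
func holdsFuzzProperties(reverse func(string) string, orig string) bool {
	rev := reverse(orig)
	doubleRev := reverse(rev)
	if orig != doubleRev {
		return false
	}
	if utf8.ValidString(orig) && !utf8.ValidString(rev) {
		return false
	}
	return true
}

func main() {
	in := "Hello, world"
	// The fuzz properties cannot tell the identity function from a real reversal.
	fmt.Println("fuzz properties hold:", holdsFuzzProperties(brokenReverse, in))
	// The unit test's expected value catches the mistake.
	fmt.Println("matches expected output:", brokenReverse(in) == "dlrow ,olleH")
}
```

The first line prints true and the second prints false, which is why the fixed expectations of the unit test are still needed alongside the fuzz test.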

The syntax differences between Go fuzzing and unit testing are as follows:

  • Fuzz test functions are named FuzzXxx, while unit test functions are named TestXxx.
  • A fuzz test function takes *testing.F as its parameter, while a unit test function takes *testing.T.
  • A fuzz test calls the functions f.Add and f.Fuzz.

    • f.Add adds the given input to the seed corpus of the fuzz test; fuzzing generates random inputs based on the seed corpus.
    • f.Fuzz takes a fuzz target function as its argument. The first parameter of the fuzz target is *testing.T, and the remaining parameters are the types to be fuzzed (Note: only some built-in types are currently supported, as listed in the Go Fuzzing docs; more built-in types will be supported in the future).

The FuzzReverse function above uses the utf8 package, so import it at the top of reverse_test.go as follows:

package main

import (
    "testing"
    "unicode/utf8"
)

Run fuzz tests

  1. Execute the following command to run the fuzz test.

    This run uses only the seed corpus and does not generate random test data. It is a way to verify that the seed corpus inputs pass the test (fuzz test without fuzzing).

    $ go test
    PASS
    ok      example/fuzz  0.013s

    If reverse_test.go contains other unit tests or fuzz tests and you only want to run FuzzReverse, run go test -run=FuzzReverse.

    Note: By default, go test runs all TestXxx unit test functions and all FuzzXxx fuzz test functions (seed corpus only); BenchmarkXxx benchmark functions run only when the -bench flag is given.

  2. To generate random test data from the seed corpus, add the -fuzz flag to the go test command (fuzz test with fuzzing).

    $ go test -fuzz=Fuzz
    fuzz: elapsed: 0s, gathering baseline coverage: 0/3 completed
    fuzz: elapsed: 0s, gathering baseline coverage: 3/3 completed, now fuzzing with 8 workers
    fuzz: minimizing 38-byte failing input file...
    --- FAIL: FuzzReverse (0.01s)
        --- FAIL: FuzzReverse (0.00s)
            reverse_test.go:20: Reverse produced invalid UTF-8 string "\x9c\xdd"
    
        Failing input written to testdata/fuzz/FuzzReverse/af69258a12129d6cbba438df5d5f25ba0ec050461c116f777e77ea7c9a0d217a
        To re-run:
        go test -run=FuzzReverse/af69258a12129d6cbba438df5d5f25ba0ec050461c116f777e77ea7c9a0d217a
    FAIL
    exit status 1
    FAIL    example/fuzz  0.030s

    The fuzz test above failed, and the input that caused the failure was written to a corpus file. The next time you run go test, the input in this corpus file will be used even without the -fuzz flag.

    To see the input that caused the failure, open the file in the testdata/fuzz/FuzzReverse directory in a text editor. The following is an example; your input may differ, but the file format will be the same.

    go test fuzz v1
    string("泃")

    The first line of the corpus file identifies the encoding version, that is, the version of the file format used by the corpus. Although v1 is currently the only version, the designers of fuzzing added this field in case a new encoding version is introduced in the future.

    Each line after the first holds the value of one parameter of the corpus entry, in the order in which the parameters are declared.

    f.Fuzz(func(t *testing.T, orig string) {
        rev := Reverse(orig)
        doubleRev := Reverse(rev)
        if orig != doubleRev {
            t.Errorf("Before: %q, after: %q", orig, doubleRev)
        }
        if utf8.ValidString(orig) && !utf8.ValidString(rev) {
            t.Errorf("Reverse produced invalid UTF-8 string %q %q", orig, rev)
        }
    })

    The fuzz target func(t *testing.T, orig string) of FuzzReverse in this article takes only one fuzzed parameter, orig, so each corpus entry consists of a single value. That is why the file in the testdata/fuzz/FuzzReverse directory in the example above contains only one such line, string("泃").

    If the fuzz target takes N fuzzed parameters, each failing input found by fuzzing is stored as N lines after the version header in its file under the testdata directory, where the i-th of those lines holds the value of parameter i.

  3. Run the go test command again, this time without the -fuzz parameter.

    We will find that the failing input found in step 2 is used in the test run even without the -fuzz flag.

    $ go test
    --- FAIL: FuzzReverse (0.00s)
        --- FAIL: FuzzReverse/af69258a12129d6cbba438df5d5f25ba0ec050461c116f777e77ea7c9a0d217a (0.00s)
            reverse_test.go:20: Reverse produced invalid string
    FAIL
    exit status 1
    FAIL    example/fuzz  0.016s

    Since the Go fuzzing test failed, we need to debug the code to find the problem.
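As step 2 explained, a corpus file stores a version header followed by one line per fuzzed parameter. The sketch below renders such a file for a hypothetical fuzz target func(t *testing.T, s string, n int). The corpusEntry helper is an assumption made for illustration, and the int(...) line follows the same type(value) form as the string line shown earlier; in practice go test writes these files for you.

```go
package main

import (
	"fmt"
	"strings"
)

// corpusEntry renders the text of a v1 corpus file for a hypothetical
// fuzz target with two fuzzed parameters: a string and an int.
// Each parameter gets its own line, in declaration order.
func corpusEntry(s string, n int) string {
	var b strings.Builder
	b.WriteString("go test fuzz v1\n")  // encoding version header
	fmt.Fprintf(&b, "string(%q)\n", s) // parameter 1
	fmt.Fprintf(&b, "int(%d)\n", n)    // parameter 2
	return b.String()
}

func main() {
	fmt.Print(corpusEntry("泃", 3))
}
```

With two fuzzed parameters, the file has three lines: the header, then one line for the string and one for the int.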

Fix 2 bugs

In this section, we will debug the program and fix the bugs found by fuzzing.

Feel free to spend some time thinking about the problem and try to solve it yourself first.

Locate the problem

There are several ways to debug the bug found above.

If you use VS Code, you can configure its debugger and set breakpoints.

In this article, we will debug by printing logs.

The error message when running the fuzz test is: reverse_test.go:20: Reverse produced invalid UTF-8 string "\x9c\xdd"

Based on this error, let's look at the documentation for utf8.ValidString:

ValidString reports whether s consists entirely of valid UTF-8-encoded runes.

The Reverse function we implemented reverses the string byte by byte, and that is the problem.

For example, the Chinese character 泃 is encoded as 3 bytes. If it is reversed byte by byte, the result is an invalid string.

Therefore, to make sure the reversed string is still valid UTF-8, we need to reverse the string rune by rune.

To make it easier to see how many runes the character 泃 contains and what a byte-wise reversal does to it, we make some modifications to the code.

Write code

Modify the code in FuzzReverse as follows.

f.Fuzz(func(t *testing.T, orig string) {
    rev := Reverse(orig)
    doubleRev := Reverse(rev)
    t.Logf("Number of runes: orig=%d, rev=%d, doubleRev=%d", utf8.RuneCountInString(orig), utf8.RuneCountInString(rev), utf8.RuneCountInString(doubleRev))
    if orig != doubleRev {
        t.Errorf("Before: %q, after: %q", orig, doubleRev)
    }
    if utf8.ValidString(orig) && !utf8.ValidString(rev) {
        t.Errorf("Reverse produced invalid UTF-8 string %q", rev)
    }
})

Run the code

$ go test
--- FAIL: FuzzReverse (0.00s)
    --- FAIL: FuzzReverse/28f36ef487f23e6c7a81ebdaa9feffe2f2b02b4cddaa6252e87f69863046a5e0 (0.00s)
        reverse_test.go:16: Number of runes: orig=1, rev=3, doubleRev=1
        reverse_test.go:21: Reverse produced invalid UTF-8 string "\x83\xb3\xe6"
FAIL
exit status 1
FAIL    example/fuzz    0.598s

Every character in our seed corpus is a single byte. But a Chinese character such as 泃 consists of multiple bytes, and reversing it byte by byte produces an invalid result.

Note: If you are interested in how Go handles strings, read Strings, bytes, runes and characters in Go on the official blog for a deeper understanding.
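The same effect can be reproduced outside the test with a standalone program. In this sketch, reverseBytes is a hypothetical helper that mirrors the byte-wise logic of the first version of Reverse, applied to the character 泃 from the corpus file.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// reverseBytes reverses s byte by byte, like the first version of Reverse.
func reverseBytes(s string) string {
	b := []byte(s)
	for i, j := 0, len(b)-1; i < j; i, j = i+1, j-1 {
		b[i], b[j] = b[j], b[i]
	}
	return string(b)
}

func main() {
	s := "泃"
	rev := reverseBytes(s)
	// One rune, but three bytes in UTF-8.
	fmt.Printf("bytes=%d runes=%d\n", len(s), utf8.RuneCountInString(s))
	fmt.Printf("original %q valid=%t\n", s, utf8.ValidString(s))
	// Reversing the three bytes breaks the multi-byte sequence.
	fmt.Printf("reversed %q valid=%t\n", rev, utf8.ValidString(rev))
}
```

The reversed value prints as "\x83\xb3\xe6", the same bytes reported in the test log above, and it is no longer valid UTF-8.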

Now that we have identified the problem, we can fix the bug.

Fix the problem

Reverse the string rune by rune.

Write code

Modify the implementation of the Reverse function as follows:

func Reverse(s string) string {
    r := []rune(s)
    for i, j := 0, len(r)-1; i < len(r)/2; i, j = i+1, j-1 {
        r[i], r[j] = r[j], r[i]
    }
    return string(r)
}

Run the code

  1. Run the command: go test

    $ go test
    PASS
    ok      example/fuzz  0.016s

    The test passed! (Don't celebrate too early: this run only covers the seed corpus and the previously failing input.)

  2. Run go test -fuzz again to see if we find new bugs

    $ go test -fuzz=Fuzz
    fuzz: elapsed: 0s, gathering baseline coverage: 0/37 completed
    fuzz: minimizing 506-byte failing input file...
    fuzz: elapsed: 0s, gathering baseline coverage: 5/37 completed
    --- FAIL: FuzzReverse (0.02s)
        --- FAIL: FuzzReverse (0.00s)
            reverse_test.go:33: Before: "\x91", after: "�"
    
        Failing input written to testdata/fuzz/FuzzReverse/1ffc28f7538e29d79fce69fef20ce5ea72648529a9ca10bea392bcff28cd015c
        To re-run:
        go test -run=FuzzReverse/1ffc28f7538e29d79fce69fef20ce5ea72648529a9ca10bea392bcff28cd015c
    FAIL
    exit status 1
    FAIL    example/fuzz  0.032s

    The error above shows that a string reversed twice differs from the original.

    This time the input itself is invalid Unicode. But why does the string come out different after two reversals?

    Let's continue debugging.

Fix the double reversal bug

Locate the problem

Setting breakpoints in a debugger works well for this kind of problem. For ease of explanation, this article debugs by adding logs.

We can locate the problem by looking closely at the result of the first reversal of the original string.

Write code

  1. Modify Reverse function.

    func Reverse(s string) string {
        fmt.Printf("input: %q\n", s)
        r := []rune(s)
        fmt.Printf("runes: %q\n", r)
        for i, j := 0, len(r)-1; i < len(r)/2; i, j = i+1, j-1 {
            r[i], r[j] = r[j], r[i]
        }
        return string(r)
    }

    This can help us understand what happens after we convert the original string into rune slices.

Run the code

This time, we run only the failing input, using the go test -run command.

To run a specific corpus entry in the testdata/fuzz/FuzzXxx directory, pass {FuzzTestName}/{filename} to the -run flag. This lets us focus on the input that failed the fuzz test.

$ go test -run=FuzzReverse/28f36ef487f23e6c7a81ebdaa9feffe2f2b02b4cddaa6252e87f69863046a5e0
input: "\x91"
runes: ['�']
input: "�"
runes: ['�']
--- FAIL: FuzzReverse (0.00s)
    --- FAIL: FuzzReverse/28f36ef487f23e6c7a81ebdaa9feffe2f2b02b4cddaa6252e87f69863046a5e0 (0.00s)
        reverse_test.go:16: Number of runes: orig=1, rev=1, doubleRev=1
        reverse_test.go:18: Before: "\x91", after: "�"
FAIL
exit status 1
FAIL    example/fuzz    0.145s
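Because the corpus file names are long hashes, it can be convenient to generate the -run arguments instead of copying them by hand. The following is a minimal sketch under stated assumptions: rerunCommand and rerunCommands are hypothetical helpers, and the program only prints commands in the {FuzzTestName}/{filename} form described above; it does not run them.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// rerunCommand builds the go test invocation for one corpus file,
// using the {FuzzTestName}/{filename} pattern for the -run flag.
func rerunCommand(fuzzName, file string) string {
	return fmt.Sprintf("go test -run=%s/%s", fuzzName, file)
}

// rerunCommands lists the files under root/testdata/fuzz/fuzzName and
// returns one command per corpus file.
func rerunCommands(root, fuzzName string) ([]string, error) {
	entries, err := os.ReadDir(filepath.Join(root, "testdata", "fuzz", fuzzName))
	if err != nil {
		return nil, err
	}
	var cmds []string
	for _, e := range entries {
		if e.IsDir() {
			continue // corpus entries are plain files
		}
		cmds = append(cmds, rerunCommand(fuzzName, e.Name()))
	}
	return cmds, nil
}

func main() {
	cmds, err := rerunCommands(".", "FuzzReverse")
	if err != nil {
		fmt.Println("no corpus found:", err)
		return
	}
	for _, c := range cmds {
		fmt.Println(c)
	}
}
```

Run it from the module directory (the fuzz directory in this tutorial) so that the relative testdata path resolves.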

First, we need to understand that in Go, a string is a read-only slice of bytes, and the bytes in it are not necessarily valid UTF-8. See "a string is a read only slice of bytes" for details.

In the example above, the input string is a byte slice that contains a single byte, \x91.

When we convert this input string to []rune, Go decodes the bytes as UTF-8 and replaces the invalid byte \x91 with the Unicode replacement character '�'. Reversing '�' still gives '�', so reversing the original string \x91 twice yields '�' instead of \x91.

Now the problem is clear: the input data is invalid Unicode. We can fix this by changing the implementation of Reverse.
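The conversion can be observed in isolation, without any reversal at all. In this sketch, roundTrip is a hypothetical helper that only converts a string to []rune and back, which is enough to show the lossy step.

```go
package main

import "fmt"

// roundTrip converts s to a rune slice and back to a string.
// Invalid UTF-8 bytes are decoded as the replacement character U+FFFD,
// so the result can differ from the input.
func roundTrip(s string) string {
	return string([]rune(s))
}

func main() {
	in := "\x91" // a single byte that is not valid UTF-8 on its own
	out := roundTrip(in)
	fmt.Printf("in:  %q (% x)\n", in, in)
	fmt.Printf("out: %q (% x)\n", out, out)
	fmt.Println("equal:", in == out)
}
```

The input is the single byte 91, while the output is the three bytes ef bf bd that encode U+FFFD, so the two strings are not equal even though each contains exactly one rune.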

Fix the problem

The fix is to check in Reverse whether the input is a valid UTF-8 string and to return an error if it is not.

Write code

  1. Modify the implementation of Reverse as follows:

    func Reverse(s string) (string, error) {
        if !utf8.ValidString(s) {
            return s, errors.New("input is not valid UTF-8")
        }
        r := []rune(s)
        for i, j := 0, len(r)-1; i < len(r)/2; i, j = i+1, j-1 {
            r[i], r[j] = r[j], r[i]
        }
        return string(r), nil
    }
  2. Since the Reverse function will now return an error, the corresponding code in main.go should be modified as follows:

    func main() {
        input := "The quick brown fox jumped over the lazy dog"
        rev, revErr := Reverse(input)
        doubleRev, doubleRevErr := Reverse(rev)
        fmt.Printf("original: %q\n", input)
        fmt.Printf("reversed: %q, err: %v\n", rev, revErr)
        fmt.Printf("reversed again: %q, err: %v\n", doubleRev, doubleRevErr)
    }

    Because the strings passed to Reverse in main are all valid UTF-8, each call to Reverse returns a nil error.

  3. Since Reverse now uses the errors and utf8 packages, import them at the top of main.go.

    import (
        "errors"
        "fmt"
        "unicode/utf8"
    )
  4. Similarly, we need to modify the reverse_test.go file. For invalid string inputs, the fuzz target can simply return early.

    func FuzzReverse(f *testing.F) {
        testcases := []string{"Hello, world", " ", "!12345"}
        for _, tc := range testcases {
            f.Add(tc)  // Use f.Add to provide a seed corpus
        }
        f.Fuzz(func(t *testing.T, orig string) {
            rev, err1 := Reverse(orig)
            if err1 != nil {
                return
            }
            doubleRev, err2 := Reverse(rev)
            if err2 != nil {
                return
            }
            if orig != doubleRev {
                t.Errorf("Before: %q, after: %q", orig, doubleRev)
            }
            if utf8.ValidString(orig) && !utf8.ValidString(rev) {
                t.Errorf("Reverse produced invalid UTF-8 string %q", rev)
            }
        })
    }

    Instead of using return, you can also call t.Skip() to skip the current input and move on to the next one.
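To see the new error path outside the test harness, the final version of Reverse can be called directly with one valid and one invalid input. This is a small sketch: the Reverse function is the one from step 1, and the sample inputs are chosen here for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// Reverse is the final version from this article: it rejects invalid
// UTF-8 and otherwise reverses the string rune by rune.
func Reverse(s string) (string, error) {
	if !utf8.ValidString(s) {
		return s, errors.New("input is not valid UTF-8")
	}
	r := []rune(s)
	for i, j := 0, len(r)-1; i < len(r)/2; i, j = i+1, j-1 {
		r[i], r[j] = r[j], r[i]
	}
	return string(r), nil
}

func main() {
	// The first input is valid UTF-8; the second is the failing input
	// found by the fuzzer earlier.
	for _, in := range []string{"Hello, 泃", "\x91"} {
		rev, err := Reverse(in)
		if err != nil {
			fmt.Printf("Reverse(%q) failed: %v\n", in, err)
			continue
		}
		fmt.Printf("Reverse(%q) = %q\n", in, rev)
	}
}
```

Callers now have to decide what to do with invalid input, which is the same decision the fuzz target makes when it returns early.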

Run the code

  1. Run the tests

    $ go test
    PASS
    ok      example/fuzz  0.019s
  2. Run the fuzz test with go test -fuzz=Fuzz. After a while, stop it with ctrl-C.

    $ go test -fuzz=Fuzz
    fuzz: elapsed: 0s, gathering baseline coverage: 0/38 completed
    fuzz: elapsed: 0s, gathering baseline coverage: 38/38 completed, now fuzzing with 4 workers
    fuzz: elapsed: 3s, execs: 86342 (28778/sec), new interesting: 2 (total: 35)
    fuzz: elapsed: 6s, execs: 193490 (35714/sec), new interesting: 4 (total: 37)
    fuzz: elapsed: 9s, execs: 304390 (36961/sec), new interesting: 4 (total: 37)
    ...
    fuzz: elapsed: 3m45s, execs: 7246222 (32357/sec), new interesting: 8 (total: 41)
    ^Cfuzz: elapsed: 3m48s, execs: 7335316 (31648/sec), new interesting: 8 (total: 41)
    PASS
    ok      example/fuzz  228.000s

    If the fuzz test finds no failures, it keeps running by default until you stop it with ctrl-C.

    You can also pass the -fuzztime flag to limit the running time, so that ctrl-C is not needed.

  3. Specify the running time. go test -fuzz=Fuzz -fuzztime 30s stops automatically after 30 seconds if no failure is found.

    $ go test -fuzz=Fuzz -fuzztime 30s
    fuzz: elapsed: 0s, gathering baseline coverage: 0/5 completed
    fuzz: elapsed: 0s, gathering baseline coverage: 5/5 completed, now fuzzing with 4 workers
    fuzz: elapsed: 3s, execs: 80290 (26763/sec), new interesting: 12 (total: 12)
    fuzz: elapsed: 6s, execs: 210803 (43501/sec), new interesting: 14 (total: 14)
    fuzz: elapsed: 9s, execs: 292882 (27360/sec), new interesting: 14 (total: 14)
    fuzz: elapsed: 12s, execs: 371872 (26329/sec), new interesting: 14 (total: 14)
    fuzz: elapsed: 15s, execs: 517169 (48433/sec), new interesting: 15 (total: 15)
    fuzz: elapsed: 18s, execs: 663276 (48699/sec), new interesting: 15 (total: 15)
    fuzz: elapsed: 21s, execs: 771698 (36143/sec), new interesting: 15 (total: 15)
    fuzz: elapsed: 24s, execs: 924768 (50990/sec), new interesting: 16 (total: 16)
    fuzz: elapsed: 27s, execs: 1082025 (52427/sec), new interesting: 17 (total: 17)
    fuzz: elapsed: 30s, execs: 1172817 (30281/sec), new interesting: 17 (total: 17)
    fuzz: elapsed: 31s, execs: 1172817 (0/sec), new interesting: 17 (total: 17)
    PASS
    ok      example/fuzz  31.025s

    The fuzz test passed!

    In addition to -fuzz, several other new flags have been added to the go test command. For details, see the documentation.

Summary

You have now learned how to use Go fuzzing.

Next, try using fuzzing to find bugs in code you have written.

If you do find a bug, consider submitting it to the trophy case.

If you run into any problems with Go fuzzing or want to request a feature, you can file an issue.

See the documentation at go.dev/doc/fuzz to learn more about Go fuzzing.

For the complete code for this article, see Go Fuzzing example code.

Open source address

The article and sample code are open source on GitHub: Go language beginner, intermediate and advanced tutorials.

Official account: coding进阶. Follow the official account for the latest Go interview questions and technology articles.

Personal website: Jincheng's Blog.

Zhihu: Wuji.
