Introduction
Go has a lot of useful tools, pprof can be used to analyze the performance of a program. There are 4 types of pprof:
- CPU profiling: This is the most commonly used type. It is used to analyze the execution time of the function or method;
- Memory profiling: This type is also often used. Used to analyze the memory usage of the program;
- Block profiling: This is unique to Go and is used to record the time spent by goroutines waiting for shared resources;
- Mutex profiling: Similar to Block profiling, but only records waits or delays due to lock contention.
We mainly introduce the first two types. The pprof related functions in Go are in the package runtime/pprof
.
CPU profiling
pprof is very simple to use. First call pprof.StartCPUProfile()
enable CPU profiling. It accepts a io.Writer
type 060c7ed8edacc0, and pprof
will write the analysis result into this io.Writer
. In order to facilitate post-mortem analysis, we write to a file.
pprof.StopCPUProfile()
after the code to be analyzed. Then the code execution between StartCPUProfile()
and StopCPUProfile()
Convenience can be directly StartCPUProfile()
after, with defer
call StopCPUProfile()
, that is, all code analysis after this.
We now implement a function to calculate the number n
of the Fibonacci sequence:
func fib(n int) int {
if n <= 1 {
return 1
}
return fib(n-1) + fib(n-2)
}
Then use pprof to analyze the operation:
func main() {
f, _ := os.OpenFile("cpu.profile", os.O_CREATE|os.O_RDWR, 0644)
defer f.Close()
pprof.StartCPUProfile(f)
defer pprof.StopCPUProfile()
n := 10
for i := 1; i <= 5; i++ {
fmt.Printf("fib(%d)=%d\n", n, fib(n))
n += 3 * i
}
}
Executing go run main.go
will generate a cpu.profile
file. This file records the running status of the program. Use the go tool pprof
command to analyze this file:
Above, use the top
command to view the 10 most time-consuming functions. It can be seen that the fib
function takes the highest time, with a cumulative time of 390ms, which accounts for 90.70% of the total time. We can also use top5
and top20
to view the 5 and 20 most time-consuming functions, respectively.
When finding the time-consuming function, we can also use the list
command to see how the function is called, and what is the time-consuming on each call path. list
command is followed by a pattern representing the method name:
We know that using recursion to solve Fibonacci numbers has a lot of repetitive calculations. Let's optimize this function:
func fib2(n int) int {
if n <= 1 {
return 1
}
f1, f2 := 1, 1
for i := 2; i <= n; i++ {
f1, f2 = f2, f1+f2
}
return f2
}
How long does it take to switch to iteration? Let's test it. First execute go run main.go
generate cpu.profile
file, and then use go tool pprof
analyze:
The list seen by top here is empty. Because after CPU profiling is enabled, the runtime will be interrupted every 10ms, and the stack currently executed by each goroutine will be recorded to analyze the time-consuming. The code after our optimization is executed before it can be interrupted at runtime, so there is no information.
All commands executed by go tool pprof can be viewed through help
:
Memory profiling
Memory analysis is different. We can check the heap memory at any time during the running of the program. Below we write a function that generates a random string and repeats the string n
times:
const Letters = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
func generate(n int) string {
var buf bytes.Buffer
for i := 0; i < n; i++ {
buf.WriteByte(Letters[rand.Intn(len(Letters))])
}
return buf.String()
}
func repeat(s string, n int) string {
var result string
for i := 0; i < n; i++ {
result += s
}
return result
}
Write a program, call the above function, and record the memory usage:
func main() {
f, _ := os.OpenFile("mem.profile", os.O_CREATE|os.O_RDWR, 0644)
defer f.Close()
for i := 0; i < 100; i++ {
repeat(generate(100), 100)
}
pprof.Lookup("heap").WriteTo(f, 0)
}
Here after the end of the loop, pprof.Lookup("heap")
, and write the result to the file mem.profile
.
Run go run main.go
generate mem.profile
file, and then use go tool pprof mem.profile
to analyze:
Of course, you can also use the list
command to check which line of memory is allocated:
The result is as expected, because string splicing will take up a lot of temporary space.
pkg/profile
runtime/pprof
somewhat inconvenient to use, because it is necessary to repeatedly write the code to open the file, start the analysis, and end the analysis. So there have been packed runtime/pprof
library: pkg/profile
. pkg/profile
GitHub repository address is: https://github.com/pkg/profile. pkg/profile
just runtime/pprof
to make it more usable. Use pkg/profile
to simplify the code into one line. You need to use go get github.com/pkg/profile
get this library before use.
defer profile.Start().Stop()
CPU profiling is enabled by default, and data is written to file cpu.pprof
. Use it to analyze the performance of fib
$ go run main.go
2021/06/09 21:10:36 profile: cpu profiling enabled, C:\Users\ADMINI~1\AppData\Local\Temp\profile594431395\cpu.pprof
fib(10)=89
fib(13)=377
fib(19)=6765
fib(28)=514229
fib(40)=165580141
2021/06/09 21:10:37 profile: cpu profiling disabled, C:\Users\ADMINI~1\AppData\Local\Temp\profile594431395\cpu.pprof
The console will output the file path where the analysis results are written.
If you want to enable Memory profiling, you can pass in the function option MemProfile
:
defer profile.Start(profile.MemProfile).Stop()
In addition, the memory sampling rate can be controlled by function options, the default is 4096. We can change to 1:
defer profile.Start(profile.MemProfile, profile.MemProfileRate(1)).Stop()
Flame graph
Viewing the CPU or memory status through the command line is not intuitive enough. Bredan Gregg invented the flame graph (Flame Graph) can be very intuitive to see the memory and CPU consumption. The new version of the go tool pprof tool has integrated the flame graph (I use Go1.16). To generate flame graphs, graphviz must be installed.
On Mac:
brew install graphviz
On Ubuntu:
apt install graphviz
On Windows, the official website download page http://www.graphviz.org/download/ has executable installation files, just download and install. Pay attention to setting the PATH path.
The cpu.profile and mem.profile generated by the above program can be viewed directly on the web page. Execute the following command:
go tool pprof -http :8080 cpu.profile
By default, a browser window will open, displaying the following page:
We can switch to display the flame graph in the VIEW menu bar:
You can hover and click on the flame graph with the mouse to view a specific call.
net/http/pprof
What should I do if I encounter high CPU or memory usage online? It is impossible to compile the above Profile code to the production environment, which will undoubtedly greatly affect the performance. net/http/pprof
provides a method that will not cause any impact when not in use. When you encounter a problem, you can turn on profiling to help us troubleshoot the problem. We only need to use the import
package, and then call http.ListenAndServe()
in a new goroutine to start a default HTTP server on a certain port:
import (
_ "net/http/pprof"
)
func NewProfileHttpServer(addr string) {
go func() {
log.Fatalln(http.ListenAndServe(addr, nil))
}()
}
Let's write an HTTP server to move the Fibonacci number calculation and repeated strings in the previous example to the Web. In order to make the test result more obvious, I executed the function that was executed once 1000 times:
func fibHandler(w http.ResponseWriter, r *http.Request) {
n, err := strconv.Atoi(r.URL.Path[len("/fib/"):])
if err != nil {
responseError(w, err)
return
}
var result int
for i := 0; i < 1000; i++ {
result = fib(n)
}
response(w, result)
}
func repeatHandler(w http.ResponseWriter, r *http.Request) {
parts := strings.SplitN(r.URL.Path[len("/repeat/"):], "/", 2)
if len(parts) != 2 {
responseError(w, errors.New("invalid params"))
return
}
s := parts[0]
n, err := strconv.Atoi(parts[1])
if err != nil {
responseError(w, err)
return
}
var result string
for i := 0; i < 1000; i++ {
result = repeat(s, n)
}
response(w, result)
}
Create HTTP server, register processing function:
func main() {
mux := http.NewServeMux()
mux.HandleFunc("/fib/", fibHandler)
mux.HandleFunc("/repeat/", repeatHandler)
s := &http.Server{
Addr: ":8080",
Handler: mux,
}
NewProfileHttpServer(":9999")
if err := s.ListenAndServe(); err != nil {
log.Fatal(err)
}
}
We also started an HTTP server to handle pprof related requests.
In addition, for testing purposes, I wrote a program that always sends HTTP requests to this server:
func doHTTPRequest(url string) {
resp, err := http.Get(url)
if err != nil {
fmt.Println("error:", err)
return
}
data, _ := ioutil.ReadAll(resp.Body)
fmt.Println("ret:", len(data))
resp.Body.Close()
}
func main() {
var wg sync.WaitGroup
wg.Add(2)
go func() {
defer wg.Done()
for {
doHTTPRequest(fmt.Sprintf("http://localhost:8080/fib/%d", rand.Intn(30)))
time.Sleep(500 * time.Millisecond)
}
}()
go func() {
defer wg.Done()
for {
doHTTPRequest(fmt.Sprintf("http://localhost:8080/repeat/%s/%d", generate(rand.Intn(200)), rand.Intn(200)))
time.Sleep(500 * time.Millisecond)
}
}()
wg.Wait()
}
Use the command go run main.go
start the server. Running the above program has been sending requests to the server. After a while, we can open http://localhost:9999/debug/pprof/
with a browser:
go tool pprof
also supports remote acquisition of profile files:
$ go tool pprof -http :8080 localhost:9999/debug/pprof/profile?seconds=120
Among them, seconds=120
means sampling for 120s, and the default is 30s. The results are as follows:
It can be seen that in addition to the runtime consumption, the main processing consumption fibHandler
and repeatHandler
Of course, it is generally impossible to open this port online, because there is a great security risk. Therefore, we generally generate files on the online machine profile and download the files to the local for analysis. Above we see that go tool pprof
will generate a file and save it locally. For example, it is C:\Users\Administrator\pprof\pprof.samples.cpu.001.pb.gz
on my machine. Download this file to the local, then:
$ go tool pprof -http :8888 pprof.samples.cpu.001.pb.gz
net/http/pprof
implementation
net/http/pprof
is nothing mysterious about the implementation of 060c7ed8edb283. It is nothing more than registering some processing functions in the init()
net/http/pprof
// src/net/http/pprof/pprof.go
func init() {
http.HandleFunc("/debug/pprof/", Index)
http.HandleFunc("/debug/pprof/cmdline", Cmdline)
http.HandleFunc("/debug/pprof/profile", Profile)
http.HandleFunc("/debug/pprof/symbol", Symbol)
http.HandleFunc("/debug/pprof/trace", Trace)
}
http.HandleFunc()
will register the processing function to the default ServeMux
:
// src/net/http/server.go
var DefaultServeMux = &defaultServeMux
var defaultServeMux ServeMux
func HandleFunc(pattern string, handler func(ResponseWriter, *Request)) {
DefaultServeMux.HandleFunc(pattern, handler)
}
This DefaultServeMux
is net/http
, there is only one instance. In order to avoid path conflicts, we usually do not recommend using the default DefaultServeMux
when writing HTTP servers by ourselves. Generally, first call http.NewServeMux()
create a new ServeMux
, see the HTTP sample code above.
Let's look at the processing function registered by the net/http/pprof
// src/net/http/pprof/pprof.go
func Profile(w http.ResponseWriter, r *http.Request) {
// ...
if err := pprof.StartCPUProfile(w); err != nil {
serveError(w, http.StatusInternalServerError,
fmt.Sprintf("Could not enable CPU profiling: %s", err))
return
}
sleep(r, time.Duration(sec)*time.Second)
pprof.StopCPUProfile()
}
Deleted unrelated to the previous code, this function is invoked runtime/pprof
of StartCPUProfile(w)
method begins CPU profiling, then sleep for some time (this time is the sampling interval), the last call pprof.StopCPUProfile()
discontinued. StartCPUProfile()
method passes in a http.ResponseWriter
, so the sampling result is directly written back to the HTTP client.
The realization of memory profiling uses a little trick. First of all, we did not find a processing function for memory profiling init()
In terms of implementation, the /debug/pprof/heap
path will go to the Index()
function:
// src/net/http/pprof/pprof.go
func Index(w http.ResponseWriter, r *http.Request) {
if strings.HasPrefix(r.URL.Path, "/debug/pprof/") {
name := strings.TrimPrefix(r.URL.Path, "/debug/pprof/")
if name != "" {
handler(name).ServeHTTP(w, r)
return
}
}
// ...
}
Will eventually go to handler(name).ServeHTTP(w, r)
. handler
is just a new type based on the string
type definition. It defines the ServeHTTP()
method:
type handler string
func (name handler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
p := pprof.Lookup(string(name))
// ...
p.WriteTo(w, debug)
}
Delete other irrelevant code, leaving the above two lines. Statistics will be written to http.ResponseWriter
.
Benchmark
In fact, Benchmark
can also be generated when cpu.profile
, mem.profile
these analyzes files. bench_test.go
file in the directory of the first example:
func BenchmarkFib(b *testing.B) {
for i := 0; i < b.N; i++ {
fib(30)
}
}
Then execute the command go test -bench . -test.cpuprofile cpu.profile
:
Then you can analyze the cpu.profile
file.
to sum up
This article introduces the use of the pprof tool, as well as the more convenient library pkg/profile
, and also introduces how to use net/http/pprof
to add insurance to online programs, so that problems can be diagnosed at any time. No problem will not have any impact on performance.
reference
- pkg/profile GitHub:https://github.com/pkg/profile
- Go GitHub you don’t know: https://github.com/darjun/you-dont-know-go
I
My blog: https://darjun.github.io
Welcome to follow my WeChat public account [GoUpUp], learn together and make progress together~
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。