
In the previous article, we covered Go's native resource embedding solution. In this article, let's look at the best-known of the open source implementations: go-bindata.

We cover it first because, although it is no longer the most popular option, its reach and longevity are unmatched. For historical reasons it also has the most hard forks, which makes it the most complicated to use and to reason about.

Origins of the various open source projects

Let's start with where these projects came from. Four go-bindata projects are in common use today and come up in this article: jteeuwen/go-bindata, go-bindata/go-bindata, elazarl/go-bindata-assetfs, and kevinburke/go-bindata.

They share a common origin in jteeuwen/go-bindata, whose first line of code was committed ten years ago, in June 2011.

On February 7, 2018, however, the author deleted all of his repositories for some reason, and the account itself was abandoned as well. At the time, a kind user abroad raised the alarm on Twitter.

Image: a reminder from a kind stranger

What followed, naturally, was the same kind of fallout seen when the author of faker.js deleted his library, and earlier when left-pad was removed from npm: a large amount of software could no longer be built.

In some legacy projects we can still see exactly when this happened, for example in Twitter's archived fork of go-bindata.

Image: the commit in Twitter's fork that changed the upstream repository address also records the event

On February 8, other members of the open source community appealed to take over the abandoned account and restored the code as it stood before the deletion. To make clear that the repository exists purely for restoration purposes, these kind people archived it as read-only and added a statement to that effect.

Image: the remedy provided by other kind members of the community

In the years since, this repository has had no maintenance from its original author. But the Go language and its community ecosystem have kept growing, and the demand for embedding static resources remains strong. That is how the other three repositories above came about, along with some lesser-known ones that I haven't mentioned yet.

Differences between versions of software

The origins between the various open source projects have been discussed above. Let's take a look at the differences between these repositories.

Among these repositories, go-bindata/go-bindata is the best-known version. elazarl/go-bindata-assetfs provides what the original software lacks: an FS wrapper that works with net/http. Remember the previous article? That is mainly what this project does. In addition, with the rapid growth of front-end technology in recent years, especially SPA-style front-end applications, elazarl/go-bindata-assetfs has found a practical niche as a solution focused on single-file distribution of SPA applications. So if you have similar needs, you can still use this repository to pack your front-end SPA project into an executable for quick distribution.

Of course, development in the open source community often overlaps. While elazarl/go-bindata-assetfs provided the FS wrapper, go-bindata/go-bindata also gained the -fs parameter to support use with net/http. So if you want to minimize dependencies and still have embedded resources work with net/http, consider using just this repository.

In addition, some programmers with a taste for clean code created a new fork, kevinburke/go-bindata. Compared with the original version and with go-bindata/go-bindata, its code is more robust, it fixes problems that community users reported against go-bindata/go-bindata, and it adds some new features that community users asked for. Like the original, however, it does not include the FS wrapper needed for use with net/http. So if you want to serve static resources through net/http with this fork, you need to pair it with elazarl/go-bindata-assetfs or write a simple FS wrapper yourself.
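As a rough idea of what "a simple wrapper of your own" could look like, here is a minimal sketch. It assumes only that the generated package exposes a lookup function shaped like go-bindata's Asset (name in, bytes out). The names lookupFunc, demoAsset, assetName and assetHandler are made up for this illustration, and demoAsset merely stands in for the generated function so the sketch runs on its own. It is not how elazarl/go-bindata-assetfs is implemented.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"path"
	"strings"
	"time"
)

// lookupFunc mirrors the shape of the Asset function that go-bindata
// generates: asset name in, raw bytes out.
type lookupFunc func(name string) ([]byte, error)

// demoAsset is a hypothetical stand-in for the generated Asset function,
// so that this sketch runs without a generated package.
func demoAsset(name string) ([]byte, error) {
	files := map[string]string{"assets/example.txt": "hello from bindata\n"}
	if data, ok := files[name]; ok {
		return []byte(data), nil
	}
	return nil, errors.New("asset not found: " + name)
}

// assetName turns a URL path into an asset name: it normalizes the path
// and drops the leading slash.
func assetName(urlPath string) string {
	return strings.TrimPrefix(path.Clean("/"+urlPath), "/")
}

// assetHandler serves whatever lookup returns and answers 404 otherwise.
func assetHandler(lookup lookupFunc) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		name := assetName(r.URL.Path)
		data, err := lookup(name)
		if err != nil {
			http.NotFound(w, r)
			return
		}
		// A zero modification time makes ServeContent skip Last-Modified.
		http.ServeContent(w, r, path.Base(name), time.Time{}, bytes.NewReader(data))
	})
}

func main() {
	// Exercise the handler in memory instead of opening a real port.
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/assets/example.txt", nil)
	assetHandler(demoAsset).ServeHTTP(rec, req)
	fmt.Println(rec.Code, rec.Body.String())
}
```

A handler like this skips directory listings entirely, which is often acceptable for a handful of static files. For anything more involved, the ready-made wrapper is probably the safer choice.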

Differences between these tools and the official implementation

Compared with the official implementation, go-bindata actually has some additional functions:

  • Allows users to read static resources in two different modes (such as using reflection and unsafe.Pointer to read data directly, or using Golang program variables to interact with data)
  • Relatively lower resource storage footprint in some scenarios (based on GZip compression at build time)
  • The ability to dynamically adjust or preprocess the reference path of static resources
  • A more open resource import mode, which supports importing resources from the upper-level directory (the official implementation only supports the current directory)

Of course, compared with the official implementation in the previous article, go-bindata's approach is relatively "dirty": it packs the static resources into a Go source file, and we must run the resource generation step before the program can run. The official implementation, by contrast, adds nothing and pollutes nothing: a single go run or go build takes care of everything.

Next, let's talk about the basic usage and performance of go-bindata.

Basic usage: go-bindata default configuration

As in the previous article, let's get the basic usage working before looking at the performance differences.

mkdir basic-go-bindata && cd basic-go-bindata
go mod init solution-embed

One small detail: because the latest changes to go-bindata/go-bindata have not been formally released, we need to install it in the following way to get the version with the latest features and fixes:

# go get -u -v github.com/go-bindata/go-bindata@latest

go get: added github.com/go-bindata/go-bindata v3.1.2+incompatible

In the previous article, when we used the official go:embed feature to embed resources, our program looked like this:

package main

import (
    "embed"
    "log"
    "net/http"
)

// Embed the whole assets directory into the binary at build time.
//
//go:embed assets
var assets embed.FS

func main() {
    // Serve the embedded files through the standard file server.
    mux := http.NewServeMux()
    mux.Handle("/", http.FileServer(http.FS(assets)))
    err := http.ListenAndServe(":8080", mux)
    if err != nil {
        log.Fatal(err)
    }
}

With go-bindata, we depend on an additional generated source file, so the program needs to change to something like the following, and a go:generate directive needs to be added:

package main

import (
    "log"
    "net/http"

    // Generated by the go:generate directive below.
    "solution-embed/pkg/assets"
)

//go:generate go-bindata -fs -o=pkg/assets/assets.go -pkg=assets ./assets

func main() {
    // AssetFile returns an http.FileSystem because -fs was passed to go-bindata.
    mux := http.NewServeMux()
    mux.Handle("/", http.FileServer(assets.AssetFile()))
    err := http.ListenAndServe(":8080", mux)
    if err != nil {
        log.Fatal(err)
    }
}

Here we use a go:generate directive to declare the command that has to run before the program is built. Besides programs installed globally in the environment, it can also run executables installed through go get. If you have used npx, this will feel familiar, except that it is tied more closely to the program's context: directives can be scattered across different source files, next to the code they relate to.

After running go generate for the first time, a source file containing the resources we need appears at pkg/assets/assets.go in the project directory. Because bindata encodes the data as \x00-style escape sequences, the generated code is 4 to 5 times larger than the original static resources, but this does not affect the size of the binary we get after compiling (which is consistent with the official implementation).

du -hs *
 17M    assets
4.0K    go.mod
4.0K    go.sum
4.0K    main.go
 83M    pkg
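Where does the roughly fourfold growth come from? A small sketch makes it visible: if every byte is written into the source file as a \xNN escape, one byte of data becomes four characters of Go source. The function escapeBytes is made up for this illustration and is not taken from go-bindata.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeBytes writes every byte as a four-character \xNN escape, the style
// of string literal described above for the generated file.
func escapeBytes(data []byte) string {
	var sb strings.Builder
	sb.Grow(len(data) * 4)
	for _, b := range data {
		fmt.Fprintf(&sb, "\\x%02x", b)
	}
	return sb.String()
}

func main() {
	raw := []byte{0x00, 0x1f, 0x8b, 0xff}
	literal := escapeBytes(raw)
	// prints: 4 raw bytes -> 16 source characters: \x00\x1f\x8b\xff
	fmt.Printf("%d raw bytes -> %d source characters: %s\n", len(raw), len(literal), literal)
}
```

The compiler turns such a literal back into the original bytes, which is why the source file balloons while the compiled binary does not.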

Whether we use go run main.go or go build main.go, once the program is running we can verify that it works by visiting http://localhost:8080/assets/example.txt.

The code for this part is available at https://github.com/soulteary/awesome-golang-embed/tree/main/go-bindata-related/basic-go-bindata if you are interested.

In addition, the official implementation cannot use resources outside the current program directory (you have to work around this, for example with a go:generate directive that runs cp -r ../originPath ./destPath), whereas go-bindata can reference external resources directly when generating the resource file. Before serving them, use the -prefix parameter to adjust the reference paths in the generated file.
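As a sketch of how this might look, the directive below points go-bindata at a directory outside the module and uses -prefix so that the leading path does not end up in the asset names. The ../static directory is hypothetical, and stripPrefix only illustrates the intended renaming; it is not the tool's own code, so check the generated file to confirm the names you actually get.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical directive: embed a directory that lives outside the module
// root and strip its leading path from the asset names.
//go:generate go-bindata -fs -prefix=../static/ -o=pkg/assets/assets.go -pkg=assets ../static

// stripPrefix illustrates the intended renaming only; it is not the tool's code.
func stripPrefix(name, prefix string) string {
	return strings.TrimPrefix(name, prefix)
}

func main() {
	// The file ../static/example.txt is meant to be exposed as example.txt.
	fmt.Println(stripPrefix("../static/example.txt", "../static/"))
}
```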

Test preparation: go-bindata default configuration

The test code differs little from the one in the previous article and can be reused with a small adjustment:

package main

import (
    "log"
    "net/http"
    "net/http/pprof"
    "runtime"

    "solution-embed/pkg/assets"
)

//go:generate go-bindata -fs -o=pkg/assets/assets.go -pkg=assets ./assets

// registerRoute serves the generated assets from the root path.
func registerRoute() *http.ServeMux {
    mux := http.NewServeMux()
    mux.Handle("/", http.FileServer(assets.AssetFile()))
    return mux
}

// enableProf limits the scheduler to two threads, samples every mutex and
// blocking event, and exposes the pprof endpoints on the given mux.
func enableProf(mux *http.ServeMux) {
    runtime.GOMAXPROCS(2)
    runtime.SetMutexProfileFraction(1)
    runtime.SetBlockProfileRate(1)

    mux.HandleFunc("/debug/pprof/", pprof.Index)
    mux.HandleFunc("/debug/pprof/cmdline", pprof.Cmdline)
    mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
    mux.HandleFunc("/debug/pprof/symbol", pprof.Symbol)
    mux.HandleFunc("/debug/pprof/trace", pprof.Trace)
}

func main() {
    mux := registerRoute()
    enableProf(mux)

    err := http.ListenAndServe(":8080", mux)
    if err != nil {
        log.Fatal(err)
    }
}

Performance test: go-bindata default configuration

Except for the main program and test program that need to be adjusted, the rest of the project content can directly use the code in the previous article. After executing the benchmark.sh script, the same performance sampling data as the previous article can be obtained.

Looking back at the previous article, the execution results of our test samples did not take long:

=== RUN   TestSmallFileRepeatRequest
--- PASS: TestSmallFileRepeatRequest (0.04s)
PASS
ok      solution-embed    0.813s
=== RUN   TestLargeFileRepeatRequest
--- PASS: TestLargeFileRepeatRequest (1.14s)
PASS
ok      solution-embed    1.331s
=== RUN   TestStaticRoute
--- PASS: TestStaticRoute (0.00s)
=== RUN   TestSmallFileRepeatRequest
--- PASS: TestSmallFileRepeatRequest (0.04s)
=== RUN   TestLargeFileRepeatRequest
--- PASS: TestLargeFileRepeatRequest (1.12s)
PASS
ok      solution-embed    1.509s

After executing the go-bindata sampling script in this article, we can see that the overall test time has become much longer:

=== RUN   TestSmallFileRepeatRequest
--- PASS: TestSmallFileRepeatRequest (1.47s)
PASS
ok      solution-embed    2.260s
=== RUN   TestLargeFileRepeatRequest
--- PASS: TestLargeFileRepeatRequest (29.43s)
PASS
ok      solution-embed    29.808s

The code used in this part is available at https://github.com/soulteary/awesome-golang-embed/tree/main/go-bindata-related/benchmark if you need it.

Performance of embedding large files

Here we again use go tool pprof -http=:8090 cpu-large.out to show the resource consumption of the program's call process (there are so many calls that we only look at the directly related parts). Open http://localhost:8090/ui/ in a browser and you will see a call graph similar to the following:

Image: reading embedded resources and the relatively time-consuming calls

In the official go:embed implementation, the embed-related calls cost only 0.07s and io.copy only 0.88s. go-bindata spent 12.99 to 13.08s on embed processing and 26.06 to 27.03s on io.copy. The cost of the former grew by more than 180 times and that of the latter by nearly 30 times.
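The ratios quoted above can be checked with a few lines of Go. This is plain arithmetic on the figures from the paragraph; the function name slowdown is made up for the illustration.

```go
package main

import "fmt"

// slowdown reports how many times larger the go-bindata figure is
// than the official one.
func slowdown(bindataSeconds, officialSeconds float64) float64 {
	return bindataSeconds / officialSeconds
}

func main() {
	// Figures quoted in the paragraph above, in seconds.
	fmt.Printf("embed processing: %.1fx to %.1fx\n", slowdown(12.99, 0.07), slowdown(13.08, 0.07))
	fmt.Printf("io.copy:          %.1fx to %.1fx\n", slowdown(26.06, 0.88), slowdown(27.03, 0.88))
}
```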

Next, use go tool pprof -http=:8090 mem-large.out to view the memory usage:

Image: memory consumption when reading embedded resources

go-bindata's consumption looks wildly excessive, both in the complexity of the call chain and in the amount of resources used. After the same 100 quick calls, a total of 19180 MB of memory had been used, which is 3 times the official implementation and more than 1000 times the size of the original resource. On average, each request costs 10 times the size of the original file, which is very uneconomical.

So a simple conclusion is easy to draw here: do not embed very large resources with go-bindata, because it wastes resources badly. If you have such a need, use the official solution described in the previous article.

Resource usage for embedding small files

With the large file covered, let's look at resource usage for a small file. After running go tool pprof -http=:8090 cpu-small.out, we see a truly spectacular call graph. (Given how simple our code is, this level of call complexity is outrageous.)

Image: CPU calls when reading embedded resources (small files)

Among the top calls of the official implementation, there are no embed-related function calls at all. In go-bindata, a large number of data reads and memory copies took 0.88 to 0.95s. In addition, GZip decompression of the resources took another 0.85s in total.

Image: CPU call details when reading embedded resources (small files)

Note, however, that this test consists of thousands of small-file fetches, so the average cost per request is actually acceptable. Of course, if you have similar requirements, the native implementation is still more efficient.

Image: memory call details when reading embedded resources (small files)

Next, let's look at memory usage. go-bindata consumes about 4 times as much as the official implementation, and about 6 times more than the original file. With many small files or a large volume of requests, go-bindata is unlikely to be the best solution. For temporary use or a small number of files, though, occasional use is not a problem.

Throughput testing with Wrk

As in the previous article, we first run go build main.go to build the program, then run ./main to start the service and test the throughput for small files:

# wrk -t16 -c 100 -d 30s http://localhost:8080/assets/vue.min.js
Running 30s test @ http://localhost:8080/assets/vue.min.js
  16 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    89.61ms   73.12ms 701.06ms   74.80%
    Req/Sec    74.17     25.40   210.00     68.65%
  35550 requests in 30.05s, 3.12GB read
Requests/sec:   1182.98
Transfer/sec:    106.43MB

Compared with the official implementation in the previous article, throughput is nearly 20 times lower. Still, it sustains more than 1,000 requests per second, which is not a big problem for ordinary small projects.

Let's take a look at the throughput for large files:

# wrk -t16 -c 100 -d 30s http://localhost:8080/assets/chip.jpg 

Running 30s test @ http://localhost:8080/assets/chip.jpg
  16 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     0.00us    0.00us   0.00us     nan%
    Req/Sec     1.66      2.68    10.00     91.26%
  106 requests in 30.10s, 1.81GB read
  Socket errors: connect 0, read 0, write 0, timeout 106
Requests/sec:      3.52
Transfer/sec:     61.46MB

The official implementation handles nearly 300 requests per second, while go-bindata manages only about 3.5. This further confirms the earlier verdict that go-bindata is not recommended for large files.

Performance test: go-bindata with GZip compression off and reduced memory usage on

By default, go-bindata enables GZip compression (at Go's default compression level). Will test performance improve if we turn GZip off? And will it improve further if we turn on the unsafe.Pointer mode?

To turn off GZip and turn on reduced memory usage, just add the following flags to the go:generate directive:

-nocompress -nomemcopy

After re-running go generate, we check the size of the generated file and find that it is smaller than with GZip enabled (some resources really are not suited to GZip):

du -hs *   
 17M    assets
4.0K    benchmark.sh
4.0K    go.mod
4.0K    go.sum
 24M    main
4.0K    main.go
 68M    pkg

After adjusting the test program accordingly, we run benchmark.sh again. The execution time has changed dramatically and even approaches the official implementation (differences of only 0.01s and 0.07s).

bash benchmark.sh 
=== RUN   TestSmallFileRepeatRequest
--- PASS: TestSmallFileRepeatRequest (0.05s)
PASS
ok      solution-embed    1.246s
=== RUN   TestLargeFileRepeatRequest
--- PASS: TestLargeFileRepeatRequest (1.19s)
PASS
ok      solution-embed    1.336s

Next, let's see what has changed in the program's calls.

The code for this part is available at https://github.com/soulteary/awesome-golang-embed/tree/main/go-bindata-related/benchmark-no-compress if you would like to run the experiments yourself.

Performance of embedding large files

As before, first use go tool pprof -http=:8090 cpu-large.out to show the resource consumption of the program's call process. The call complexity of resource processing is now similar to the official implementation. Compared with the official call chain, the program with reduced memory usage on and GZip compression off is even better at parallel computation than the official calls shown in the previous article.

Image: reading embedded resources and the relatively time-consuming calls

This is why the overall response time of the service is basically the same, even though the resource-processing calls, at similar call complexity, take 0.91s, more than double the official 0.42s.

Then use go tool pprof -http=:8090 mem-large.out to check the memory usage:

Image: memory consumption when reading embedded resources

If you compare with the previous article, you will find that with the "reduce memory consumption" feature turned on, go-bindata's memory footprint is even 3MB smaller than the official implementation's. Of course, even with resource consumption on par with the official implementation, each request still costs about 3.6 times the size of the original file.

Resource usage for embedding small files

The test results of small files seem to be not much different from the official implementation, so I will not waste too much space here. Let's go directly to the stress test to see the throughput of the program.

Throughput testing with Wrk

As in the previous article, we first run go build main.go to build the program, then run ./main to start the service, and begin with the small-file throughput test:

# wrk -t16 -c 100 -d 30s http://localhost:8080/assets/vue.min.js

Running 30s test @ http://localhost:8080/assets/vue.min.js
  16 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     4.22ms    2.55ms  47.38ms   70.90%
    Req/Sec     1.46k   128.35     1.84k    77.00%
  699226 requests in 30.02s, 61.43GB read
Requests/sec:  23292.03
Transfer/sec:      2.05GB

The results are surprising: the number of requests handled per second even exceeds the official implementation by a few hundred. Next, let's look at the throughput for large files:

# wrk -t16 -c 100 -d 30s http://localhost:8080/assets/chip.jpg 

Running 30s test @ http://localhost:8080/assets/chip.jpg
  16 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   340.98ms  138.47ms   1.60s    81.04%
    Req/Sec    18.24      9.33    60.00     73.75%
  8478 requests in 30.10s, 141.00GB read
Requests/sec:    281.63
Transfer/sec:      4.68GB

For large files there is almost no difference from the official implementation; the numbers differ by only a few requests per second.

Other

Due to space limitations, I won't cover the use of the "homebrew" version of go-bindata here. Interested readers can use this article as a reference and run their own tests.

Besides the implementations mentioned above, there are some other interesting implementations, although they are not well known:

Finally

At this point in the testing, we can reach a simple verdict on go-bindata. If you want to avoid or minimize the use of reflection and unsafe.Pointer, go-bindata works as long as there are few files and no large ones.

Once the amount of data grows, the official implementation is the better choice. Of course, if you are comfortable with reflection and unsafe.Pointer, go-bindata can give you performance comparable to the official go:embed implementation, along with more customization options.


This article is licensed under Attribution 4.0 International (CC BY 4.0). You are welcome to reprint or adapt it, provided you credit the source.

Author of this article: Su Yang

Created: January 16, 2022
Link to this article: https://soulteary.com/2022/01/16/explain-the-golang-resource-embedding-solution-go-bindata.html

