Introduction
In software development, we often need to create and destroy similar objects frequently, and such churn can hurt performance. A common optimization is to use an object pool. When we need an object, we first look for a free one in the pool; if one is available, it is taken out of the pool and handed to the caller. Only when the pool has no free objects is a new object actually created. Conversely, when an object is no longer needed, it is not destroyed but returned to the pool for later reuse. In scenarios where objects are created and destroyed frequently, an object pool can improve performance considerably. At the same time, to keep pooled objects from occupying too much memory, an object pool usually comes with a cleanup strategy. sync.Pool in the Go standard library is one such example: objects cached in sync.Pool are cleaned up by the garbage collector.
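As a quick illustration of the general pattern, here is a minimal sync.Pool sketch (plain standard-library usage; the point type and pool variable are made up just for this example):

    package main

    import (
        "fmt"
        "sync"
    )

    // point is just a placeholder type for the example.
    type point struct{ x, y int }

    // pool hands out *point objects; New is only called when the
    // pool has no free object to reuse.
    var pool = sync.Pool{
        New: func() interface{} { return new(point) },
    }

    func main() {
        p := pool.Get().(*point) // reuse a free object or create a new one
        p.x, p.y = 1, 2
        fmt.Println(*p)

        *p = point{} // reset before returning it to the pool
        pool.Put(p)
    }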
Among these objects, a special case is the byte buffer (usually backed by a byte slice). When concatenating strings, we usually store the intermediate results in a byte buffer for efficiency and generate the resulting string from the buffer once the concatenation is complete. When sending and receiving network packets, incomplete packets also need to be stored temporarily in a byte buffer.

bytes.Buffer in the Go standard library wraps a byte slice and provides a set of methods for working with it. We know that a slice's capacity is limited and the slice must grow when the capacity is insufficient; frequent growth easily causes performance jitter. bytebufferpool implements its own Buffer type and uses a simple algorithm to reduce the performance loss caused by growth. bytebufferpool is used in the well-known web framework fasthttp and the flexible Go templating library quicktemplate. In fact, all three libraries have the same author: valyala 😀.
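For reference, a minimal bytes.Buffer example of the kind of string concatenation described above (ordinary standard-library usage, not part of bytebufferpool):

    package main

    import (
        "bytes"
        "fmt"
    )

    func main() {
        // bytes.Buffer wraps a byte slice; the slice grows as data is written.
        var buf bytes.Buffer
        buf.WriteString("hello")
        buf.WriteByte(',')
        buf.WriteString(" world!")

        fmt.Println(buf.String())
    }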
Quick start
The code in this article uses Go Modules.
Create a directory and initialize:
$ mkdir bytebufferpool && cd bytebufferpool
$ go mod init github.com/darjun/go-daily-lib/bytebufferpool
Install the bytebufferpool library:
$ go get -u github.com/valyala/bytebufferpool
Typical usage: first obtain a bytebufferpool.ByteBuffer object with the Get() function provided by bytebufferpool, then call the object's write methods to write data, and after use return the object to the pool with bytebufferpool.Put(). Example:
package main

import (
    "fmt"

    "github.com/valyala/bytebufferpool"
)

func main() {
    b := bytebufferpool.Get()
    b.WriteString("hello")
    b.WriteByte(',')
    b.WriteString(" world!")

    fmt.Println(b.String())

    bytebufferpool.Put(b)
}
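Note that once the buffer has been handed back with Put(), it must not be used again: as we will see below, Put() resets the buffer and (normally) returns it to the pool, so a later Get() may hand the same object to another caller.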
The Get() and Put() functions called directly here are provided by the bytebufferpool package, and under the hood they operate on the package's default object pool:
// bytebufferpool/pool.go
var defaultPool Pool
func Get() *ByteBuffer { return defaultPool.Get() }
func Put(b *ByteBuffer) { defaultPool.Put(b) }
Of course, we can also create new pools as needed and group objects with the same purpose together (for example, one pool for receiving network packets and another for string concatenation):
func main() {
    joinPool := new(bytebufferpool.Pool)
    b := joinPool.Get()
    b.WriteString("hello")
    b.WriteByte(',')
    b.WriteString(" world!")

    fmt.Println(b.String())

    joinPool.Put(b)
}
bytebufferpool does not provide a dedicated constructor for Pool; a pool can simply be created with new, as above.
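In fact, since the package-level default pool shown earlier is declared as var defaultPool Pool, a zero-value Pool is usable directly as well. A small sketch (the splicePool variable and spliceHello function are only for illustration):

    var splicePool bytebufferpool.Pool // the zero value is ready to use

    func spliceHello() string {
        b := splicePool.Get()
        defer splicePool.Put(b) // runs after the return value has been built
        b.WriteString("hello, world!")
        return b.String()
    }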
Optimization details
When an object is put back into the pool, it is handled according to the capacity of its underlying slice. bytebufferpool divides sizes into 20 intervals:

| < 2^6 | 2^6 ~ 2^7-1 | ... | > 2^25 |

If the capacity is less than 2^6, the object belongs to the first interval; if it is between 2^6 and 2^7-1, it falls into the second interval, and so on (for example, a 100-byte buffer falls into the second interval). After enough objects have been put back, bytebufferpool recalibrates: it determines which interval received the most objects and sets defaultSize to the upper-limit capacity of that interval. The upper-limit capacity of the first interval is 2^6, that of the second is 2^7, and that of the last is 2^26. On a later Get(), if there are no free objects in the pool and a new object has to be created, its capacity is set directly to defaultSize. This largely avoids slice growth during use and thus improves performance. Let's look at the code:
// bytebufferpool/pool.go
const (
    minBitSize = 6 // 2**6=64 is a CPU cache line size
    steps      = 20

    minSize = 1 << minBitSize
    maxSize = 1 << (minBitSize + steps - 1)

    calibrateCallsThreshold = 42000
    maxPercentile           = 0.95
)

type Pool struct {
    calls       [steps]uint64
    calibrating uint64

    defaultSize uint64
    maxSize     uint64

    pool sync.Pool
}
We can see that bytebufferpool internally uses sync.Pool from the standard library as the underlying object pool.

Here steps is the number of intervals mentioned above, 20 in total. The calls array records how many times returned objects fell into each interval.

When Pool.Put() is called to put an object back, the interval that the buffer's current length (len(b.B)) falls into is computed first, and the corresponding element of the calls array is incremented:
// bytebufferpool/pool.go
func (p *Pool) Put(b *ByteBuffer) {
    idx := index(len(b.B))

    if atomic.AddUint64(&p.calls[idx], 1) > calibrateCallsThreshold {
        p.calibrate()
    }

    maxSize := int(atomic.LoadUint64(&p.maxSize))
    if maxSize == 0 || cap(b.B) <= maxSize {
        b.Reset()
        p.pool.Put(b)
    }
}
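The index function is not shown above; here is a sketch of how the interval index can be derived from the buffer length, consistent with minBitSize and steps (an illustration rather than a verbatim copy of the library code):

    // index maps a buffer length n to one of the steps intervals:
    // lengths up to minSize map to 0, up to minSize*2 map to 1, and so on,
    // clamped to the last interval.
    func index(n int) int {
        n--
        n >>= minBitSize
        idx := 0
        for n > 0 {
            n >>= 1
            idx++
        }
        if idx >= steps {
            idx = steps - 1
        }
        return idx
    }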
If the count in an element of the calls array exceeds calibrateCallsThreshold = 42000 (meaning that since the last calibration the number of objects put back into that interval has reached the threshold; 42000 is presumably an empirical value), Pool.calibrate() is called to perform the calibration:
// bytebufferpool/pool.go
func (p *Pool) calibrate() {
    // Avoid concurrent Puts triggering calibrate at the same time.
    if !atomic.CompareAndSwapUint64(&p.calibrating, 0, 1) {
        return
    }

    // step 1. Count and sort.
    a := make(callSizes, 0, steps)
    var callsSum uint64
    for i := uint64(0); i < steps; i++ {
        calls := atomic.SwapUint64(&p.calls[i], 0)
        callsSum += calls
        a = append(a, callSize{
            calls: calls,
            size:  minSize << i,
        })
    }
    sort.Sort(a)

    // step 2. Compute defaultSize and maxSize.
    defaultSize := a[0].size
    maxSize := defaultSize

    maxSum := uint64(float64(callsSum) * maxPercentile)
    callsSum = 0
    for i := 0; i < steps; i++ {
        if callsSum > maxSum {
            break
        }
        callsSum += a[i].calls
        size := a[i].size
        if size > maxSize {
            maxSize = size
        }
    }

    // step 3. Store the computed values.
    atomic.StoreUint64(&p.defaultSize, defaultSize)
    atomic.StoreUint64(&p.maxSize, maxSize)

    atomic.StoreUint64(&p.calibrating, 0)
}
step 1. Count and sort

The calls array records how many times objects were put back into each interval. The entries are sorted by this count in descending order. Note: minSize << i is the upper-limit capacity of interval i.
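calibrate() relies on the callSize and callSizes types, which are not shown above. Roughly, they are a (calls, size) pair plus a sort.Interface implementation that orders entries by calls in descending order; a sketch consistent with the code above:

    type callSize struct {
        calls uint64
        size  uint64
    }

    type callSizes []callSize

    func (ci callSizes) Len() int           { return len(ci) }
    func (ci callSizes) Less(i, j int) bool { return ci[i].calls > ci[j].calls }
    func (ci callSizes) Swap(i, j int)      { ci[i], ci[j] = ci[j], ci[i] }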
step 2. Compute defaultSize and maxSize

defaultSize is easy to understand: it is simply the size of the first entry after sorting, i.e. the upper-limit capacity of the most frequently used interval. maxSize records the largest interval capacity among the most-used intervals that together account for 95% (maxPercentile) of the Put calls. Its purpose is to keep rarely used large-capacity objects from being put back into the pool and occupying too much memory. This explains the second half of Pool.Put():
// If the capacity of the object being returned exceeds maxSize, do not put it back.
maxSize := int(atomic.LoadUint64(&p.maxSize))
if maxSize == 0 || cap(b.B) <= maxSize {
    b.Reset()
    p.pool.Put(b)
}
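For example (hypothetical numbers, purely to illustrate calibrate()): suppose that out of 42000 Puts, 30000 fell into the interval whose upper limit is 2^10, 11000 into the interval whose upper limit is 2^12, and the remaining 1000 into the interval whose upper limit is 2^20. After sorting, defaultSize becomes 2^10 = 1024. Accumulating the counts until 95% of the total (39900) is exceeded takes only the first two entries, so maxSize becomes 2^12 = 4096, and any buffer whose capacity has grown beyond 4096 is dropped instead of being pooled.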
step 3. Store the computed values

On a later Pool.Get(), if there are no free objects in the pool, the newly created object is given a default capacity of defaultSize. Such a capacity is enough in most cases and avoids slice growth during use.
// bytebufferpool/pool.go
func (p *Pool) Get() *ByteBuffer {
    v := p.pool.Get()
    if v != nil {
        return v.(*ByteBuffer)
    }
    return &ByteBuffer{
        B: make([]byte, 0, atomic.LoadUint64(&p.defaultSize)),
    }
}
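For completeness, ByteBuffer itself is essentially a thin wrapper around a byte slice with an exported field B. A simplified sketch of what it looks like (the real type has more methods, such as Write and ReadFrom):

    type ByteBuffer struct {
        // B is the underlying byte slice; it may also be used directly.
        B []byte
    }

    func (b *ByteBuffer) WriteString(s string) (int, error) {
        b.B = append(b.B, s...)
        return len(s), nil
    }

    func (b *ByteBuffer) WriteByte(c byte) error {
        b.B = append(b.B, c)
        return nil
    }

    func (b *ByteBuffer) String() string { return string(b.B) }

    // Reset empties the buffer but keeps the underlying capacity.
    func (b *ByteBuffer) Reset() { b.B = b.B[:0] }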
Some other details:
- The minimum capacity is 2^6 = 64, because this is the CPU cache line size on 64-bit machines: data of this size can be loaded into a cache line in one go, so there is little point in making it any smaller.
- Atomic operations (the atomic package) are used throughout the code to avoid the performance cost of locking.
Of course, the library's drawback is also obvious: since most buffers actually use less capacity than defaultSize, there is some memory waste.
Summary
Excluding comments and blank lines, bytebufferpool implements a high-performance ByteBuffer object pool in only about 150 lines of code. The details are well worth savoring. Reading high-quality code helps improve your own coding skills and teaches you the finer points of programming. It is strongly recommended to take the time to read it carefully!
If you find a fun and useful Go library, you are welcome to submit an issue on the Go Daily Library GitHub 😄
Reference
- bytebufferpool GitHub: https://github.com/valyala/bytebufferpool
- Go daily library GitHub: https://github.com/darjun/go-daily-lib
About me

My blog: https://darjun.github.io

Welcome to follow my WeChat official account [GoUpUp]; let's learn and make progress together~