
Introduction

The slice is a data structure provided by the Go language that is simple and convenient to use. However, because of how it is implemented, slicing often produces confusing results. Understanding the underlying structure and principles of slices helps avoid many common misunderstandings.

Underlying structure

The slice structure is defined in the slice.go file of the runtime package:

// src/runtime/slice.go
type slice struct {
  array unsafe.Pointer
  len int
  cap int
}
  • array: a pointer to the underlying array that stores the data
  • len: the length of the slice; in code, the len() function returns this value
  • cap: the capacity of the slice, that is, the maximum number of elements it can hold without expansion; in code, the cap() function returns this value

We can output the underlying structure of the slice with the following code:

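package main

import (
  "fmt"
  "unsafe"
)

// slice mirrors the field layout of the unexported runtime.slice.
type slice struct {
  array unsafe.Pointer
  len   int
  cap   int
}

func printSlice() {
  s := make([]uint32, 1, 10)
  // Reinterpret the slice header as our local struct to read its fields.
  fmt.Printf("%#v\n", *(*slice)(unsafe.Pointer(&s)))
}

func main() {
  printSlice()
}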

Run output:

main.slice{array:(unsafe.Pointer)(0xc0000d6030), len:1, cap:10}

Note one detail here: since the runtime.slice structure is unexported, we cannot use it directly. So I manually defined a slice structure in the code with the same fields as runtime.slice.

With the underlying structure in mind, we first review the basics of slices and then look at common slice problems one by one.

Basic knowledge

Create slice

There are 4 ways to create slices:

  1. var

var declares a variable of the slice type; the value of the slice is then nil.

var s []uint32

For slices created in this way, the array field is a null pointer, and the len and cap fields are both equal to 0.

  2. Slice literal

Use the slice literal to enumerate all the elements. At this time, the slice length and capacity are equal to the number of specified elements.

s := []uint32{1, 2, 3}

After s is created, its array field points to an underlying array holding the three elements, and the len and cap fields are both equal to 3.

  3. make

Use make to create a slice, specifying the length and optionally the capacity. The format is make([]type, len[, cap]): you can specify only the length, or both length and capacity. The length argument is required:

s1 := make([]uint32, 0)     // length 0, capacity 0
s2 := make([]uint32, 1)     // length 1, capacity 1
s3 := make([]uint32, 1, 10) // length 1, capacity 10

  4. Slice operator

Use the slice operator to cut a part of an existing slice or array to create a new slice. The format of the slice operator is [low:high] , for example:

var arr [10]uint32
s1 := arr[0:5]
s2 := arr[:5]
s3 := arr[5:]
s4 := arr[:]

The interval is closed on the left and open on the right, that is, [low, high): it includes index low and excludes index high. The length of the resulting slice is high-low.

In addition, low and high have default values. low defaults to 0, and high defaults to the length of the original slice or array. All of them can be omitted. When omitted, it is equivalent to the default value.

A slice created this way shares its underlying data with the original slice or array, so writing through one may overwrite data seen through the other. Be extra careful.
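The sharing described above can be seen in a minimal sketch. The function name shareDemo and the chosen indexes are only illustrative; they are not from the original article.

```go
package main

import "fmt"

// shareDemo slices one array twice, writes through the first slice,
// and returns what the array and the second slice observe afterwards.
func shareDemo() (arrVal, s2Val uint32) {
	var arr [10]uint32
	s1 := arr[0:5] // covers arr[0]..arr[4]
	s2 := arr[3:8] // covers arr[3]..arr[7], overlapping s1 at arr[3] and arr[4]

	s1[3] = 42 // writes to arr[3], which is also s2[0]
	return arr[3], s2[0]
}

func main() {
	arrVal, s2Val := shareDemo()
	fmt.Println(arrVal, s2Val) // 42 42
}
```

A single write through s1 is visible both in the array and in s2, because all three refer to the same memory.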

Add element

You can use the append() function to add elements to a slice, zero or more at a time. If the remaining space (that is, cap-len) is enough to store the new elements, they are written directly after the existing ones and the len field is increased. Otherwise the slice must be expanded: a larger array is allocated, the elements of the old array are copied over, and then the new elements are added.

package main

import "fmt"

func main() {
  s := make([]uint32, 0, 4)

  s = append(s, 1, 2, 3)
  fmt.Println(len(s), cap(s)) // 3 4

  s = append(s, 4, 5, 6)
  fmt.Println(len(s), cap(s)) // 6 8
}

Slice you don't know

  1. Is the empty slice equal to nil?

What is the output of the following code?

func main() {
  var s1 []uint32
  s2 := make([]uint32, 0)

  fmt.Println(s1 == nil)
  fmt.Println(s2 == nil)
  fmt.Println("nil slice:", len(s1), cap(s1))
  fmt.Println("cap slice:", len(s2), cap(s2))
}

Analysis:

First of all, the length and capacity of both s1 and s2 are 0, which is easy to understand. Comparing a slice with nil actually checks whether the array field of the slice structure is a null pointer. So s1 == nil returns true and s2 == nil returns false: although the length of s2 is 0, make() still sets up a non-nil array pointer for it. Therefore, a slice with a length of 0 is generally declared with var.
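As a small illustration of the point above, the sketch below compares a nil slice with an empty one. The helper describe is a hypothetical name introduced only for this example.

```go
package main

import "fmt"

// describe reports whether s is nil, along with its length and capacity.
func describe(s []uint32) (isNil bool, length, capacity int) {
	return s == nil, len(s), cap(s)
}

func main() {
	var nilSlice []uint32           // array pointer is nil
	emptySlice := make([]uint32, 0) // array pointer is non-nil, length 0

	fmt.Println(describe(nilSlice))   // true 0 0
	fmt.Println(describe(emptySlice)) // false 0 0

	// Both can be appended to without any special handling.
	nilSlice = append(nilSlice, 1)
	emptySlice = append(emptySlice, 1)
	fmt.Println(nilSlice, emptySlice) // [1] [1]
}
```

Apart from the comparison with nil, the two slices behave the same for len(), cap() and append(), which is why declaring with var is usually enough.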

  2. By value or by reference?

What is the output of the following code?

func main() {
  s1 := []uint32{1, 2, 3}
  s2 := append(s1, 4)

  fmt.Println(s1)
  fmt.Println(s2)
}

Analysis:

Why does the append() function have a return value? Because when we pass a slice to append(), what is actually passed is the runtime.slice structure. This structure is passed by value, so changing the array/len/cap fields inside the function does not affect the slice structure outside it. In the above code, the len and cap of s1 remain unchanged after append() executes, so the output is:

[1 2 3]
[1 2 3 4]

So when calling append() we write s = append(s, elem), assigning the return value back to the original slice and thereby overwriting its array/len/cap fields.

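Beginners may also make the mistake of ignoring the return value of append():

append(s, elem)

This is even more wrong. The Go compiler rejects a bare append() call whose result is not used, and even if the result is discarded explicitly (for example by assigning it to _), the added elements are lost, because the internal fields of the slice outside the function have not changed.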

We can see that although a slice is often said to be passed by reference, what is actually passed is the runtime.slice value. Only modifications to existing elements are reflected outside the function, because the underlying array is shared.
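The difference between modifying an element and appending inside a function can be sketched as follows. The function names setFirst, appendLocal and appendAndReturn are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// setFirst modifies an existing element. The caller sees the change
// because both slice headers point at the same underlying array.
func setFirst(s []uint32) {
	s[0] = 100
}

// appendLocal appends to its own copy of the slice header. The caller's
// len and cap are untouched, so the caller does not see the new element.
func appendLocal(s []uint32) {
	s = append(s, 4)
}

// appendAndReturn returns the updated header so the caller can store it.
func appendAndReturn(s []uint32) []uint32 {
	return append(s, 4)
}

func main() {
	s := []uint32{1, 2, 3}

	setFirst(s)
	fmt.Println(s) // [100 2 3]

	appendLocal(s)
	fmt.Println(s) // [100 2 3]

	s = appendAndReturn(s)
	fmt.Println(s) // [100 2 3 4]
}
```

Returning the slice, as append() itself does, is the usual way to let the caller pick up a changed len, cap or array pointer.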

  3. Slice expansion strategy

What is the output of the following code?

func main() {
  var s1 []uint32
  s1 = append(s1, 1, 2, 3)
  s2 := append(s1, 4)
  fmt.Println(&s1[0] == &s2[0])
}

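This involves the expansion strategy of slices. When expanding (in Go releases before 1.18):

  • If the required capacity is more than double the current capacity, the required capacity is used directly;
  • Otherwise, if the current capacity is less than 1024, the capacity is doubled;
  • Otherwise, the capacity is repeatedly increased by 0.25 times until the required capacity is met.

Since Go 1.18 the threshold is 256 instead of 1024, and the growth factor moves smoothly from 2 toward 1.25 as the slice gets larger.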

Looking at runtime/slice.go, after applying the rules above the runtime further adjusts the result according to the size of the slice element and the word size of the platform. The whole process is more complicated, so you can study it on your own if you are interested.

We only need to know that while the capacity is small, it is doubled to reduce the frequency of later expansions as elements are added. Once the capacity has grown large, doubling it would waste a relatively large amount of memory, so a smaller growth factor is used.

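In the above example, s1 = append(s1, 1, 2, 3) requires a capacity of 3, and after the runtime's adjustment the capacity ends up as 4. When s2 := append(s1, 4) executes there is enough space, so the underlying array of s2 is not reallocated. The address of the first element of s1 and s2 is therefore the same, and the program prints true.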
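Rather than memorizing thresholds, the growth pattern can be observed directly. The sketch below records every capacity a slice takes on while elements are appended one at a time; capacityChanges is a hypothetical helper, and no expected output is listed because the exact sequence depends on the Go version and the element size.

```go
package main

import "fmt"

// capacityChanges appends n elements one at a time and records each
// new capacity value the slice takes on along the way.
func capacityChanges(n int) []int {
	var s []uint32
	var caps []int
	last := cap(s)
	for i := 0; i < n; i++ {
		s = append(s, uint32(i))
		if cap(s) != last {
			last = cap(s)
			caps = append(caps, last)
		}
	}
	return caps
}

func main() {
	// The printed sequence varies between Go releases and platforms.
	fmt.Println(capacityChanges(2000))
}
```

Running this on your own toolchain shows where doubling stops and the slower growth begins.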

  4. The slice operator can cut strings

The slice operator can also be applied to strings, but unlike with slices and arrays, cutting a string returns a string, not a slice. This is because strings are immutable: if a slice were returned, it would share the underlying data with the string, and the string could then be modified through the slice.

func main() {
  str := "hello, world"
  fmt.Println(str[:5])
}

Output hello.
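When a modifiable copy of part of a string is needed, one common approach is to convert it to []byte, which copies the data. The sketch below assumes a non-empty ASCII string, and replaceFirst is a hypothetical helper written only for this example.

```go
package main

import "fmt"

// replaceFirst returns a copy of str whose first byte is replaced by c.
// It assumes str is non-empty. The conversion []byte(str) copies the
// bytes, so the original string is never modified.
func replaceFirst(str string, c byte) string {
	b := []byte(str) // copies the string's bytes into a new array
	b[0] = c
	return string(b)
}

func main() {
	str := "hello, world"
	sub := str[:5] // still a string
	// sub[0] = 'H' would not compile: strings are immutable.

	fmt.Println(replaceFirst(sub, 'H')) // Hello
	fmt.Println(str)                    // hello, world
}
```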

  5. Slices share underlying data

What is the output of the following code?

func main() {
  array := [10]uint32{1, 2, 3, 4, 5}
  s1 := array[:5]

  s2 := s1[5:10]
  fmt.Println(s2)

  s1 = append(s1, 6)
  fmt.Println(s1)
  fmt.Println(s2)
}

Analysis:

First, notice that the upper bound 10 in s2 := s1[5:10] is greater than the length of s1. Remember: when the slice operator is applied to a slice, the upper bound is limited by the capacity of the slice, not its length. At this point the two slices share the same underlying array: s1 covers array[0] through array[4], and s2 covers array[5] through array[9], which lies inside the spare capacity of s1.

At this time, the output s2 is:

[0 0 0 0 0]

Then element 6 is appended to slice s1. Because s1 still has spare capacity, the element is written to array[5], which is also s2[0], so slices s1 and s2 share element 6.

The output s1 and s2 at this time are:

[1 2 3 4 5 6]
[6 0 0 0 0]

It can be seen that because slices share underlying data, modifying one slice may also modify other slices. This can sometimes cause bugs that are difficult to debug. To alleviate this problem to some extent, Go 1.2 introduced an extended slice operator, [low:high:max], which limits the capacity of the new slice. The capacity of a slice produced this way is max-low.

func main() {
  array := [10]uint32{1, 2, 3, 4, 5}
  s1 := array[:5:5]

  s2 := array[5:10:10]
  fmt.Println(s2)

  s1 = append(s1, 6)
  fmt.Println(s1)
  fmt.Println(s2)
}

By executing s1 := array[:5:5] we limit the capacity of s1 to 5, so s1 covers array[0] through array[4] with no spare capacity, while s2 covers array[5] through array[9].

When executing s1 = append(s1, 6), there is no free capacity (because len == cap == 5), so a new underlying array is allocated, the existing elements are copied over, and then the element is added. From this point on, s1 and s2 do not interfere with each other.
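The effect of the capacity limit can be checked with a short sketch. The function appendWithLimit is a hypothetical wrapper around the same steps as the example above, returning the values so they can be inspected.

```go
package main

import "fmt"

// appendWithLimit slices the first five elements with the capacity
// capped at 5, appends one element, and returns the resulting slice
// together with the value left in array[5].
func appendWithLimit() (s1 []uint32, sixth uint32) {
	array := [10]uint32{1, 2, 3, 4, 5}
	s1 = array[:5:5]   // len 5, cap 5
	s1 = append(s1, 6) // no spare capacity, so a new array is allocated
	return s1, array[5]
}

func main() {
	s1, sixth := appendWithLimit()
	fmt.Println(s1, sixth) // [1 2 3 4 5 6] 0
}
```

array[5] is still 0 after the append, so a slice over array[5:10] would keep printing [0 0 0 0 0].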

Summary

Knowing the underlying data structure of the slice, and knowing that what gets passed around is the runtime.slice structure, we can solve more than 90% of slice problems. Combined with diagrams, it is very intuitive to see how the underlying data of a slice is operated on.

The name of this series is my imitation of "JavaScript You Don't Know" 😀.


About me

My blog: https://darjun.github.io

Welcome to follow my WeChat public account [GoUpUp], learn together and make progress together~


darjun