
The content of this article is organized as follows. After reading it, you should be able to answer the following Golang map interview questions.

interview questions

  1. The underlying implementation principle of map
  2. Why is iterating over the map unordered?
  3. How to implement in-order traversal of map?
  4. Why are Go maps not thread safe?
  5. How is a thread safe map implemented?
  6. Which performs better, sync.Map or a locked native map, and why?
  7. Why is the load factor of Go map 6.5?
  8. What is the map expansion strategy?

Implementation principle

A map in Go is a pointer occupying 8 bytes that points to an hmap structure; the underlying implementation can be found in src/runtime/map.go.

The underlying structure of every map is an hmap, and an hmap contains an array of buckets whose element type is bmap. When a bucket overflows, additional buckets are chained to it like a linked list. Next, let's take a closer look at these structures.

hmap structure

// A header for a Go map.
type hmap struct {
    count     int
    // Number of elements in the hash table; len(map) returns this field.
    flags     uint8
    // Status flags (iterator, oldIterator, hashWriting, sameSizeGrow).
    B         uint8
    // log_2 of the number of buckets.
    // If B = 5, the buckets array has length 2^5 = 32, i.e. 32 buckets.
    noverflow uint16
    // Approximate number of overflow buckets.
    hash0     uint32
    // Hash seed.

    buckets    unsafe.Pointer
    // Pointer to the buckets array, of size 2^B; nil if the element count is 0.
    oldbuckets unsafe.Pointer
    // During growth, points to the old buckets array, which is half the size
    // of the new one; nil when the map is not growing.
    nevacuate  uintptr
    // Evacuation progress counter; buckets below this index have been evacuated.

    extra *mapextra
    // Designed to optimize GC scanning. Used when neither key nor value
    // contains pointers and both can be inlined. extra points to a mapextra.
 }

bmap structure

bmap is what we often call a "bucket". A bucket holds at most 8 keys; keys land in the same bucket because the low-order bits of their hashes are the same (key positioning is explained in detail in the lookup and insertion sections of the map below). Within a bucket, the upper 8 bits of each key's hash (the tophash) are stored and used to quickly locate the key among the at most 8 slots.

// A bucket for a Go map.
type bmap struct {
    tophash [bucketCnt]uint8
    // Array of length 8, used to quickly check whether a key is in this bmap.
    // One entry per slot: if a key's tophash appears in this array, the key
    // may be in this bucket (the full key is then compared to confirm).
}
// Constants defined by the runtime
const (
    bucketCntBits = 3
    bucketCnt     = 1 << bucketCntBits
    // at most 8 slots per bucket
)

But this is only the surface structure (src/runtime/map.go); the compiler augments it at compile time, dynamically creating a new structure:

type bmap struct {
  topbits  [8]uint8
  keys     [8]keytype
  values   [8]valuetype
  pad      uintptr
  overflow uintptr
  // pointer to the overflow bucket
}

The in-memory layout of a bucket is: the tophash array first, then the 8 keys stored contiguously, then the 8 values, and finally the overflow pointer.

Note that the keys and values are stored in separate groups, not interleaved as key/value/key/value/... The source comments note that this allows padding between key and value to be omitted in some cases, saving memory. For example, in map[int64]int8, interleaving would require 7 padding bytes after every 1-byte value; grouping all values together avoids that.

mapextra structure

When a map's keys and values contain no pointers and are at most 128 bytes each, the bmap is marked as pointer-free, which lets the GC avoid scanning the whole hmap. However, bmap has an overflow field, which is a pointer, and that would break the pointer-free assumption; in this case the overflow pointers are moved into the extra field.

// mapextra holds fields that are not present on all maps.
type mapextra struct {
    // If neither key nor value contains pointers and both can be inlined
    // (<= 128 bytes), the hmap's extra field stores the overflow buckets,
    // so the GC can avoid scanning the whole map.
    // However, bmap.overflow is itself a pointer, so those overflow
    // pointers are kept in hmap.extra.overflow and hmap.extra.oldoverflow instead.
    // overflow holds the overflow buckets of hmap.buckets;
    // oldoverflow holds the overflow buckets of hmap.oldbuckets during growth.
    overflow    *[]*bmap
    oldoverflow *[]*bmap

    nextOverflow *bmap
    // pointer to the next free overflow bucket
}

Main features

reference type

A map is a pointer whose underlying data is an hmap, so it behaves as a reference type.

Go has three commonly used composite types that behave like references: slice, map, and channel. When one of them is passed as a function argument, the callee may modify the underlying data seen by the caller.

There is no pass-by-reference in Go; everything is passed by value (pointers included). So when a map is passed as a function argument, it is passed by value, but because the map's underlying data structure is reached through a pointer, modifications made inside the callee are visible to the caller. This is why passing a map shows the effect of pass-by-reference.

Therefore, if a function only needs to modify the contents of a map rather than replace the map itself, the parameter does not need to be a pointer:

func TestSliceFn(t *testing.T) {
    m := map[string]int{"a": 1}
    t.Log(m, len(m))
    // map[a:1] 1
    mapAppend(m, "b", 2)
    t.Log(m, len(m))
    // map[a:1 b:2] 2
}

func mapAppend(m map[string]int, key string, val int) {
    m[key] = val
}
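Conversely, rebinding the map parameter itself inside the callee is invisible to the caller, since only the callee's copy of the map pointer changes. A minimal sketch (mapReset is a hypothetical helper, not from the original article):

func mapReset(m map[string]int) {
    m = make(map[string]int) // rebinds only the callee's copy of the map pointer
    m["x"] = 100             // invisible to the caller
}

func TestMapReset(t *testing.T) {
    m := map[string]int{"a": 1}
    mapReset(m)
    t.Log(m, len(m))
    // map[a:1] 1 -- the caller's map is unchanged
}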

shared storage

The underlying data structure of a map points to the actual element storage through a pointer. Assigning one map variable to another therefore makes both refer to the same storage, so changes made through one are visible through the other:

func TestMapShareMemory(t *testing.T) {
    m1 := map[string]int{}
    m2 := m1
    m1["a"] = 1
    t.Log(m1, len(m1))
    // map[a:1] 1
    t.Log(m2, len(m2))
    // map[a:1] 1
}

Traversal order is random

Even when a map is not modified, ranging over it multiple times may yield the keys and values in different orders. This is deliberate: the Go designers randomize the iteration order to remind developers that the runtime does not guarantee a stable traversal order, so code must not rely on the order of range results.

A map itself is unordered and its traversal order is randomized. To traverse a map in order, sort the keys first, then iterate over the map in key order.

func TestMapRange(t *testing.T) {
    m := map[int]string{1: "a", 2: "b", 3: "c"}
    t.Log("first range:")
    // unordered traversal by default
    for i, v := range m {
        t.Logf("m[%v]=%v ", i, v)
    }
    t.Log("\nsecond range:")
    for i, v := range m {
        t.Logf("m[%v]=%v ", i, v)
    }

    // ordered traversal
    var sl []int
    // collect the keys into a slice
    for k := range m {
        sl = append(sl, k)
    }
    // sort the slice
    sort.Ints(sl)
    // iterating the map in the sorted key order is now deterministic
    for _, k := range sl {
        t.Log(k, m[k])
    }
}

Not thread safe

Maps are concurrency-unsafe by default for the following reasons:

After a long discussion, the Go team concluded that the map should fit its typical usage (no concurrent access from multiple goroutines) rather than make the majority of programs pay the locking cost for the minority of cases that need concurrent access, so the built-in map does not support concurrent use.

Scenario: two goroutines read and write the same map concurrently. The following program crashes with a fatal error such as fatal error: concurrent map read and map write.

func main() {
    m := make(map[int]int)

    go func() {
        // one goroutine writes the map
        for i := 0; i < 10000; i++ {
            m[i] = i
        }
    }()

    go func() {
        // one goroutine reads the map
        for i := 0; i < 10000; i++ {
            fmt.Println(m[i])
        }
    }()

    // block forever so both goroutines keep running
    select {}
}

If you want to achieve map thread safety, there are two ways:

Method 1: guard the map with a lock (map + sync.Mutex or sync.RWMutex)

func BenchmarkMapConcurrencySafeByMutex(b *testing.B) {
    var lock sync.Mutex // mutual-exclusion lock
    m := make(map[int]int, 0)
    var wg sync.WaitGroup
    for i := 0; i < b.N; i++ {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            lock.Lock()
            defer lock.Unlock()
            m[i] = i
        }(i)
    }
    wg.Wait()
    b.Log(len(m), b.N)
}

Method 2: use the sync.Map provided by the sync package

sync.Map is implemented with read/write separation, trading space for time. Compared with map+RWMutex, it adds an optimization: the read-only map is accessed without locking and is always tried first. Only when an operation (insert, delete, update, lookup, traverse) cannot be satisfied by the read map does it fall back to the dirty map, whose reads and writes require locking. In certain scenarios (many reads, few writes), lock contention is therefore far lower than with the map+RWMutex implementation.

func BenchmarkMapConcurrencySafeBySyncMap(b *testing.B) {
    var m sync.Map
    var wg sync.WaitGroup
    for i := 0; i < b.N; i++ {
        wg.Add(1)
        go func(i int) {
            defer wg.Done()
            m.Store(i, i)
        }(i)
    }
    wg.Wait()
    b.Log(b.N)
}
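For reference, here is a short sketch of the basic sync.Map API; its methods replace the built-in syntax of a plain map (m[k] = v, v, ok := m[k], delete(m, k), range):

func TestSyncMapUsage(t *testing.T) {
    var m sync.Map
    m.Store("a", 1) // write
    if v, ok := m.Load("a"); ok { // read, comma-ok style
        t.Log("a =", v)
    }
    v, loaded := m.LoadOrStore("b", 2) // insert if absent
    t.Log(v, loaded)                   // 2 false (newly stored)
    m.Delete("a") // delete
    m.Range(func(k, v interface{}) bool { // iterate; return false to stop early
        t.Log(k, v)
        return true
    })
}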

hash collision

A Go map is a collection of key/value pairs, implemented as a hash table that resolves collisions by chaining. Unlike a classic chained hash table that allocates one linked-list node per key, Go chains at the granularity of a bmap: each bucket holds up to 8 key/value pairs, and only a full bucket chains an overflow bucket. As for the hash function, the runtime detects at startup whether the CPU supports AES instructions; if so it uses an AES-based hash, otherwise memhash.
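To make "chaining at bmap granularity" concrete, here is a toy sketch, not the runtime code, that ignores hashing, tophash, and updates: a bucket holds up to 8 entries, and an overflow bucket is chained only once the current one is full:

type entry struct {
    key string
    val int
}

type toyBucket struct {
    entries  [8]entry   // up to 8 kv pairs per bucket, like bmap
    n        int        // number of used slots
    overflow *toyBucket // overflow bucket chained off this one
}

func (b *toyBucket) insert(k string, v int) {
    for cur := b; ; cur = cur.overflow {
        if cur.n < 8 {
            cur.entries[cur.n] = entry{k, v}
            cur.n++
            return
        }
        if cur.overflow == nil {
            cur.overflow = &toyBucket{}
        }
    }
}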

Common operations

create

A map can be initialized in 3 ways and is usually created with make:

func TestMapInit(t *testing.T) {
    // Initialization 1: plain declaration
    // var m1 map[string]int
    // m1["a"] = 1
    // t.Log(m1, unsafe.Sizeof(m1))
    // panic: assignment to entry in nil map
    // Writing to a map must be done carefully: writing to an uninitialized
    // (nil) map panics, so check for nil before writing.

    // Initialization 2: literal
    m2 := map[string]int{}
    m2["a"] = 2
    t.Log(m2, unsafe.Sizeof(m2))
    // map[a:2] 8

    // Initialization 3: make
    m3 := make(map[string]int)
    m3["a"] = 3
    t.Log(m3, unsafe.Sizeof(m3))
    // map[a:3] 8
}

Generating the assembly shows that map creation calls runtime.makemap. If the map's initial capacity hint is at most 8, the compiler emits runtime.makemap_small instead, which only seeds the hash with runtime.fastrand: a single bucket can hold up to 8 entries, so no bucket array needs to be allocated up front.

Creation process

The makemap function creates a random hash seed with fastrand, computes from hint the minimum number of buckets needed (the B calculation below), and calls makeBucketArray to allocate one contiguous block of memory for the bucket array of size 2^B. When B >= 4, makeBucketArray additionally allocates about 2^(B-4) overflow buckets in the same allocation. Initialization then completes and the hmap pointer is returned.

Calculate the initial value of B

Find a B such that the map's load factor stays within the normal range:

B := uint8(0)
for overLoadFactor(hint, B) {
    B++
}
h.B = B

// overLoadFactor reports whether count items placed in 1<<B buckets is over loadFactor.
func overLoadFactor(count int, B uint8) bool {
    return count > bucketCnt && uintptr(count) > loadFactorNum*(bucketShift(B)/loadFactorDen)
}
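This loop can be reproduced standalone. Below is a minimal sketch with the runtime's constants copied in (bucketShift(B) is simply 1<<B), plus a few worked values:

const (
    bucketCnt     = 8  // max entries per bucket
    loadFactorNum = 13 // 13/2 = 6.5, the load-factor threshold
    loadFactorDen = 2
)

func overLoadFactor(count int, B uint8) bool {
    return count > bucketCnt && uintptr(count) > loadFactorNum*((uintptr(1)<<B)/loadFactorDen)
}

func pickB(hint int) uint8 {
    B := uint8(0)
    for overLoadFactor(hint, B) {
        B++
    }
    return B
}

// pickB(8) == 0: one bucket holds up to 8 entries.
// pickB(20) == 2: 4 buckets, load factor 20/4 = 5 <= 6.5.
// pickB(100) == 4: 16 buckets, since 100 <= 6.5*16 = 104.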

find

Go has two syntaxes for reading from a map: with and without the comma-ok form. When the queried key is absent, the comma-ok form additionally returns a bool reporting whether the key exists, while the plain form returns the value type's zero value: 0 if the value is an int, an empty string if it is a string, and so on.

// without comma-ok
value := m["name"]
fmt.Printf("value:%s", value)

// with comma-ok
value, ok := m["name"]
if ok {
    fmt.Printf("value:%s", value)
}

Generating the assembly shows that, depending on the key type, the compiler replaces the generic lookup function with a specialized variant to optimize efficiency:

key type | lookup function
---------|----------------
uint32   | mapaccess1_fast32(t *maptype, h *hmap, key uint32) unsafe.Pointer
uint32   | mapaccess2_fast32(t *maptype, h *hmap, key uint32) (unsafe.Pointer, bool)
uint64   | mapaccess1_fast64(t *maptype, h *hmap, key uint64) unsafe.Pointer
uint64   | mapaccess2_fast64(t *maptype, h *hmap, key uint64) (unsafe.Pointer, bool)
string   | mapaccess1_faststr(t *maptype, h *hmap, ky string) unsafe.Pointer
string   | mapaccess2_faststr(t *maptype, h *hmap, ky string) (unsafe.Pointer, bool)
Lookup process

Concurrent write detection

The function first checks the map's flags. If the hashWriting bit is set, another goroutine is currently writing the map, and the runtime panics. This again shows that maps are not goroutine-safe.

if h.flags&hashWriting != 0 {
    throw("concurrent map read and map write")
}
Calculate the hash value
hash := t.hasher(noescape(unsafe.Pointer(&ky)), uintptr(h.hash0))

The key is run through the hash function, producing a hash value like the following (64 bits in total on a mainstream 64-bit machine):

 10010111 | 000011110110110010001111001010100010010110010101010 | 01010
Find the bucket corresponding to the hash

m: the bucket mask, i.e. the number of buckets minus 1

The bucket is selected from buckets via hash & m. If the map is growing and that bucket has not been evacuated yet, the lookup uses the corresponding bucket in oldbuckets instead:

m := bucketMask(h.B)
b := (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize))) // m = 2^B - 1: the low B bits select the bucket
if c := h.oldbuckets; c != nil {
    if !h.sameSizeGrow() {
        // before the doubling, the mask was half its current size
        m >>= 1
    }
    oldb := (*bmap)(add(c, (hash&m)*uintptr(t.bucketsize)))
    if !evacuated(oldb) {
        b = oldb
    }
}

Calculate the bucket number where the hash is located:

Take the low 5 bits of the hash above (B = 5), which are 01010, i.e. 10 in decimal: the key belongs to bucket 10 (buckets are numbered 0 through 31).

traverse bucket

Calculate the tophash used to locate the key's slot:

top := tophash(hash)

func tophash(hash uintptr) uint8 {
    top := uint8(hash >> (goarch.PtrSize*8 - 8))
    if top < minTopHash {
        top += minTopHash
    }
    return top
}

Take the upper 8 bits of the hash above, 10010111, which is 151 in decimal, and look for the tophash value (HOB hash) 151 in the bucket's tophash array. Suppose it is found in slot 2 and the full key stored there matches: the search is then complete.
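Both derivations can be verified with a few lines of arithmetic. A sketch assuming B = 5 and a 64-bit hash, where goarch.PtrSize*8 - 8 = 56:

func TestHashToBucketAndTophash(t *testing.T) {
    const B = 5
    // a hash whose top 8 bits are 10010111 (151) and low 5 bits are 01010 (10)
    hash := uintptr(0b10010111)<<56 | uintptr(0b01010)
    bucket := hash & ((1 << B) - 1) // low B bits select the bucket
    top := uint8(hash >> 56)        // high 8 bits form the tophash
    t.Log(bucket, top)              // 10 151 (151 >= minTopHash, so no adjustment needed)
}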


If the key is not found in this bucket and the bucket's overflow pointer is not nil, the search continues through the overflow-bucket chain, until the key is found or every slot, including those in all overflow buckets, has been examined.
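Putting these steps together, the whole search is roughly the following loop, a simplified sketch of runtime.mapaccess1 (the real code also short-circuits on emptyRest markers; the addressing formulas are explained in the next section):

for ; b != nil; b = b.overflow(t) {
    for i := uintptr(0); i < bucketCnt; i++ {
        if b.tophash[i] != top {
            continue // cheap 1-byte filter before comparing full keys
        }
        k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
        if t.key.equal(key, k) {
            // found: return the address of the corresponding value
            return add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.valuesize))
        }
    }
}
return unsafe.Pointer(&zeroVal[0]) // not found: pointer to a shared zero value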

Returns the pointer corresponding to the key

The corresponding slot was found above; now let's analyze in detail how the key/value addresses are obtained:

// key addressing formula
k := add(unsafe.Pointer(b), dataOffset+i*uintptr(t.keysize))
// value addressing formula
v := add(unsafe.Pointer(b), dataOffset+bucketCnt*uintptr(t.keysize)+i*uintptr(t.valuesize))
// dataOffset is the offset of the data area from the start of a bmap:
dataOffset = unsafe.Offsetof(struct {
    b bmap
    v int64
}{}.v)

The keys start at address unsafe.Pointer(b)+dataOffset; the address of the i-th key adds i key sizes on top of that. The values are stored after all the keys, so the address of the i-th value must additionally skip the space occupied by all 8 keys.
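As a worked example, assume map[int64]int64 on a 64-bit platform, so the key and value are 8 bytes each and dataOffset is 8 (the size of the tophash array). The offsets for slot i are then:

const (
    dataOffset = 8 // size of the tophash array ([8]uint8), already aligned
    keysize    = 8 // e.g. int64 keys
    valuesize  = 8 // e.g. int64 values
    bucketCnt  = 8
)

func slotOffsets(i uintptr) (keyOff, valOff uintptr) {
    keyOff = dataOffset + i*keysize
    valOff = dataOffset + bucketCnt*keysize + i*valuesize
    return // for i = 2: keyOff = 8+16 = 24, valOff = 8+64+16 = 88
}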

assign

The assembly shows that inserting or modifying a key in a map ultimately calls the mapassign function.

In fact, the syntax for inserting or modifying a key is the same, except that the key of the former operation does not exist in the map, while the key of the latter operation exists in the map.

mapassign has a family of functions that the compiler optimizes into corresponding "fast functions" depending on the type of key.

key type | insert function
---------|----------------
uint32   | mapassign_fast32(t *maptype, h *hmap, key uint32) unsafe.Pointer
uint64   | mapassign_fast64(t *maptype, h *hmap, key uint64) unsafe.Pointer
string   | mapassign_faststr(t *maptype, h *hmap, ky string) unsafe.Pointer

We only study the most general assignment function mapassign .

Assignment process

Assignment may trigger expansion and migration of the map. Expansion merely allocates a new underlying bucket array (doubled, or same-sized) without moving the data; the data is then migrated gradually after expansion: during growth, every assignment or deletion performs at least one evacuation step.

Checks and initialization

  1. Check whether the map is nil (writing to a nil map panics)
  2. Check whether the map is being read and written concurrently; if so, throw an exception
  3. Check whether buckets is nil; if so, call newobject to allocate a bucket of the current bucket size
migrate

On every assignment/deletion, if oldbuckets != nil, the map is considered to be growing and one migration step is performed. The migration process is described in detail below.

Find & Update

Following the lookup process above, locate the key: if found, update its value; if not found, remember a vacant slot and insert there.

Expansion

If, after the iterative search above, no insertable position was found, the map must grow before inserting. The expansion process is described in detail below.

delete

The assembly shows that deleting a key from a map ultimately calls the mapdelete function:

func mapdelete(t *maptype, h *hmap, key unsafe.Pointer)

Deletion logic is relatively simple; most of it reuses the assignment machinery. The core is locating the key, and the search is similar: proceed cell by cell through the bucket (and its overflow chain). Once found, the key and value are zeroed, the count is decremented, and the slot's tophash is set to emptyOne:

e := add(unsafe.Pointer(b), dataOffset+bucketCnt*2*goarch.PtrSize+i*uintptr(t.elemsize))
if t.elem.ptrdata != 0 {
    memclrHasPointers(e, t.elem.size)
} else {
    memclrNoHeapPointers(e, t.elem.size)
}
b.tophash[i] = emptyOne

Expansion

When expansion is triggered

Map expansion is triggered when a new key is inserted: a condition check runs, and if either of the following two conditions holds, the map grows:

if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) {
    hashGrow(t, h)
    goto again // Growing the table invalidates everything, so try again
}

1. The load factor exceeds the threshold

The threshold defined in the source is 6.5 (loadFactorNum/loadFactorDen = 13/2), a value chosen empirically through testing.

Each bucket has 8 slots; if every bucket were full with no overflow buckets, the load factor would be 8. So when the load factor exceeds 6.5, many buckets are close to full and both lookup and insertion become slower; it is time to grow.

Condition 1 means there are too many elements for too few buckets. The fix is simple: increment B, doubling the bucket count from 2^B to 2^(B+1). At this point there are new and old buckets; note that all elements still live in the old buckets and have not yet been migrated; the new bucket array merely has twice the capacity.
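A quick check of the numbers, assuming B = 5 (32 buckets):

func TestLoadFactorThreshold(t *testing.T) {
    // loadFactorNum/loadFactorDen = 13/2 = 6.5
    count, B := 208, uint(5)
    threshold := 13 * ((1 << B) / 2) // 13 * 16 = 208, i.e. 6.5 * 32
    t.Log(count+1 > threshold)       // true: the 209th insert triggers doubling to B = 6
}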

2. Too many overflow buckets

When the load factor is relatively small, lookups and insertions can still be slow, and condition 1 cannot catch this situation. The symptom: the load-factor numerator (the total element count) is small, yet the number of allocated buckets, including a large number of overflow buckets, is large.

The cause is not hard to imagine: elements are repeatedly inserted and deleted. Many insertions create many overflow buckets, but the load factor never reaches the condition-1 threshold, so no growth is triggered. Then deletions shrink the total element count, and further insertions create still more overflow buckets, and condition 1 still never fires. The overflow buckets pile up, the keys become scattered, and lookup and insertion become frighteningly inefficient, which is why condition 2 exists. It's like a ghost town: many houses, few residents, all scattered, so finding anyone is hard.

Under condition 2, there aren't actually many elements, but there are unusually many overflow buckets, meaning many buckets are only partly full. The solution is to allocate a new bucket array of the same size and move the elements from the old buckets into it, packing the keys in each bucket more densely. Keys that used to sit in overflow buckets move into the main buckets, saving space, improving bucket utilization, and naturally speeding up map lookups and insertions (this is the sameSizeGrow seen in the code below).

Expansion function
func hashGrow(t *maptype, h *hmap) {
    bigger := uint8(1)
    if !overLoadFactor(h.count+1, h.B) {
        bigger = 0
        h.flags |= sameSizeGrow
    }
    oldbuckets := h.buckets
    newbuckets, nextOverflow := makeBucketArray(t, h.B+bigger, nil)
    flags := h.flags &^ (iterator | oldIterator)
    if h.flags&iterator != 0 {
        flags |= oldIterator
    }
    // commit the grow (atomic wrt gc)
    h.B += bigger
    h.flags = flags
    h.oldbuckets = oldbuckets
    h.buckets = newbuckets
    h.nevacuate = 0
    h.noverflow = 0
    if h.extra != nil && h.extra.overflow != nil {
        // Promote current overflow buckets to the old generation.
        if h.extra.oldoverflow != nil {
            throw("oldoverflow is not nil")
        }
        h.extra.oldoverflow = h.extra.overflow
        h.extra.overflow = nil
    }
    if nextOverflow != nil {
        if h.extra == nil {
            h.extra = new(mapextra)
        }
        h.extra.nextOverflow = nextOverflow
    }
    // the actual copying of the hash table data is done incrementally
    // by growWork() and evacuate().
}

Since growing requires relocating existing key/value pairs to new memory, relocating a large number of them all at once would badly hurt performance. Go therefore grows maps "progressively": the original keys are not all moved at once; at most 2 buckets are evacuated per operation.

The hashGrow() function above does not actually relocate anything; it only allocates the new buckets and hangs the old ones off the oldbuckets field. The real evacuation happens in growWork(), which is called from the mapassign and mapdelete functions. That is, whenever a key is inserted, modified, or deleted, the runtime checks whether evacuation is still in progress, specifically, whether oldbuckets is nil, and if so relocates some buckets.

migrate

Migration timing

When growth is triggered, only the new bucket array is allocated; no data is moved at that moment. Instead the map grows incrementally: each assignment or deletion that touches a specific bucket migrates that bucket (from oldbuckets to buckets) as a side effect:

if h.growing() {
    growWork(t, h, bucket)
}
transfer function
func growWork(t *maptype, h *hmap, bucket uintptr) {
    // first evacuate the old bucket corresponding to the one about to be used
    evacuate(t, h, bucket&h.oldbucketmask())
    // then evacuate one more bucket to make progress
    if h.growing() {
        evacuate(t, h, h.nevacuate)
    }
}

nevacuate records the current evacuation progress; when evacuation is complete, it equals the length of oldbuckets.

The implementation of the evacuate method is to transfer the bucket corresponding to this location and the data on its conflict chain to the new bucket.

  1. First, determine whether the current bucket has been transferred. (oldbucket identifies the location corresponding to the bucket that needs to be relocated)
b := (*bmap)(add(h.oldbuckets, oldbucket*uintptr(t.bucketsize)))
// check
if !evacuated(b) {
    // do the transfer
}

Whether a bucket has been transferred can be determined directly from its tophash, by inspecting the first tophash value:

func evacuated(b *bmap) bool {
    h := b.tophash[0]
    // tophash values strictly between emptyOne (1) and minTopHash (5) mark evacuated slots
    return h > emptyOne && h < minTopHash
}
  2. If it hasn't been moved, migrate the data. A bucket may be migrated into a same-size table or into one twice as large. Here xy marks the two possible destinations: x is the same position as before, y is the position offset by the old table size (used only when doubling). First, determine the destinations:
var xy [2]evacDst
x := &xy[0]
x.b = (*bmap)(add(h.buckets, oldbucket*uintptr(t.bucketsize)))
x.k = add(unsafe.Pointer(x.b), dataOffset)
x.v = add(x.k, bucketCnt*uintptr(t.keysize))
if !h.sameSizeGrow() {
    // when doubling, the y destination must also be computed
    y := &xy[1]
    y.b = (*bmap)(add(h.buckets, (oldbucket+newbit)*uintptr(t.bucketsize)))
    y.k = add(unsafe.Pointer(y.b), dataOffset)
    y.v = add(y.k, bucketCnt*uintptr(t.keysize))
}
  3. With the destination buckets determined, migrate the key/value pairs one by one.
  4. If the bucket just evacuated matches the overall progress pointer, advance the progress marker nevacuate:
// newbit is the length of oldbuckets, i.e. the target value of nevacuate
func advanceEvacuationMark(h *hmap, t *maptype, newbit uintptr) {
    // first advance the marker
    h.nevacuate++
    // look ahead at most 2^10 buckets
    stop := h.nevacuate + 1024
    if stop > newbit {
        stop = newbit
    }
    // skip over buckets that have already been evacuated; resume next time otherwise
    for h.nevacuate != stop && bucketEvacuated(t, h, h.nevacuate) {
        h.nevacuate++
    }
    // if everything has been evacuated, the grow is done: drop oldbuckets
    if h.nevacuate == newbit {
        h.oldbuckets = nil
        if h.extra != nil {
            h.extra.oldoverflow = nil
        }
        h.flags &^= sameSizeGrow
    }
}

traverse

The traversal process is to traverse the buckets in order, and traverse the keys in the bucket in order.

Map traversal is unordered. If you want to achieve ordered traversal, you can sort the keys first.

Why is iterating over the map unordered?

If a migration has happened, key positions change substantially: some keys move to new, higher bucket indices while others stay put. Traversal results therefore cannot keep their original order.

For a map whose contents are fixed, with no inserts or deletes, one might expect each traversal to return the same key/value sequence. Go deliberately avoids this, because it would mislead novice programmers into believing the order is guaranteed, which in some situations would be a serious bug.

Go goes one step further: traversal does not always start from bucket 0. Each traversal starts from a randomly chosen bucket, and within that bucket from a randomly chosen cell. This way, even a map that is never modified will not return a fixed key/value sequence when simply traversed.

// runtime.mapiterinit: picks the starting bucket for an iteration
func mapiterinit(t *maptype, h *hmap, it *hiter) {
  ...
  it.t = t
  it.h = h
  it.B = h.B
  it.buckets = h.buckets
  if t.bucket.kind&kindNoPointers != 0 {
    h.createOverflow()
    it.overflow = h.extra.overflow
    it.oldoverflow = h.extra.oldoverflow
  }
  r := uintptr(fastrand())
  if h.B > 31-bucketCntBits {
    r += uintptr(fastrand()) << 31
  }
  it.startBucket = r & bucketMask(h.B)
  it.offset = uint8(r >> h.B & (bucketCnt - 1))
  it.bucket = it.startBucket
  ...
  mapiternext(it)
}

Summary

  1. map is a reference type
  2. map traversal is unordered
  3. map is not thread-safe
  4. map resolves hash collisions with the chaining (linked-list) method, at bucket granularity
  5. growing a map does not necessarily allocate more space; it may only compact the existing memory (same-size grow)
  6. map migration is incremental: during growth, every assignment performs at least one evacuation step
  7. deleting map keys can leave many empty slots behind, which may trigger migration; avoid it when possible

