LSM-Tree - The LRU Cache of LevelDB

Introduction

The LRU cache appears in many open source components, typically to separate hot and cold data and to implement an eviction policy. Three things make LRU attractive:

  1. The implementation is simple.
  2. The amount of code required is small.
  3. The data structures involved are classic.

LevelDB contains a classic LRU cache implementation. This section describes how it uses LRU to build its cache.

LeetCode has a matching LRU cache problem if you want to try implementing one yourself:
lru-cache

Theory

The Wikipedia introduction to the LRU cache structure explains the basic eviction process, as in the node-eviction example in the following figure:

The read order is A B C D E D F and the cache size is 4. The number in parentheses indicates recency: the smaller the number, the closer the entry is to least recently used.

Following the arrows, when E is read the cache is found to be full, so the oldest entry A(0) is evicted.

Reading then continues and the ordering is updated. At the second-to-last step, D has the highest number and B the lowest. After F is read and added to the cache, the cache is full again; B is the least recently used entry, so it is evicted and replaced.

Implementing LRU according to the least-recently-used principle requires two data structures (a minimal sketch follows the list):

  1. HashTable: provides O(1) lookups.
  2. Doubly linked list: maintains least-recently-used order, used to evict old data.
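
Before diving into LevelDB, here is a minimal generic sketch of this two-structure design in C++. It is illustrative only, not LevelDB code: std::list and std::unordered_map stand in for the hand-rolled list and hash table LevelDB uses.

 #include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Minimal LRU cache built from the two structures above (sketch only).
class SimpleLRU {
 public:
  explicit SimpleLRU(size_t capacity) : capacity_(capacity) {}

  // On a hit, move the entry to the most-recently-used (front) end.
  bool Get(const std::string& key, int* value) {
    auto it = index_.find(key);
    if (it == index_.end()) return false;
    order_.splice(order_.begin(), order_, it->second);  // move to front
    *value = it->second->second;
    return true;
  }

  void Put(const std::string& key, int value) {
    auto it = index_.find(key);
    if (it != index_.end()) {  // update: overwrite and promote
      it->second->second = value;
      order_.splice(order_.begin(), order_, it->second);
      return;
    }
    if (index_.size() >= capacity_ && !order_.empty()) {
      index_.erase(order_.back().first);  // back of the list is oldest
      order_.pop_back();
    }
    order_.emplace_front(key, value);
    index_[key] = order_.begin();
  }

 private:
  size_t capacity_;
  std::list<std::pair<std::string, int>> order_;  // front = most recent
  std::unordered_map<std::string,
                     std::list<std::pair<std::string, int>>::iterator> index_;
};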

LevelDB implementation

Let's look directly at how LevelDB applies these data structures.

LevelDB's LRUCache design consists of 4 data structures, built up progressively:

  • LRUHandle (the LRU node, also called LRUNode)
  • HandleTable (the hash table)
  • LRUCache (the core cache implementation)
  • ShardedLRUCache (a sharding wrapper for better concurrency)

The data structures of the whole cache are composed as follows:

The following are the relevant interface definitions:

 // Insert a mapping from key->value into the cache and assign it
// the specified charge against the total cache capacity.
//
// Returns a handle that corresponds to the mapping. The caller
// must call this->Release(handle) when the returned mapping is no
// longer needed.
//
// When the inserted entry is no longer needed, the key and
// value will be passed to "deleter".
virtual Handle* Insert(const Slice& key, void* value, size_t charge,
                       void (*deleter)(const Slice& key, void* value)) = 0;

// If the cache has no mapping for "key", returns nullptr.
//
// Else return a handle that corresponds to the mapping. The caller
// must call this->Release(handle) when the returned mapping is no
// longer needed.
virtual Handle* Lookup(const Slice& key) = 0;

// Release a handle returned by Insert/Lookup.
// REQUIRES: the handle must not have been released yet.
// REQUIRES: the handle must have been returned by this same instance.
virtual void Release(Handle* handle) = 0;

// Return the value held in a handle, as a void* (any user-defined type).
// REQUIRES: the handle must not have been released yet.
// REQUIRES: the handle must have been returned by this same instance.
virtual void* Value(Handle* handle) = 0;

// If the cache contains an entry for the given key, erase it.
// Note that the underlying entry is only truly freed once all
// handles to it have been released.
virtual void Erase(const Slice& key) = 0;

// Return a new auto-incrementing numeric id. When a cache instance is
// shared by multiple clients, each client may want its own id to prefix
// its keys with, to avoid key collisions. This effectively gives each
// client a separate namespace.
virtual uint64_t NewId() = 0;

// Evict all entries that are not actively in use.
// Memory-constrained applications may wish to call this periodically
// to reduce memory usage.
// The base-class Prune() does nothing by default, but subclasses are
// strongly encouraged to implement it. A future release may add a
// default implementation.
virtual void Prune() {}

// Return an estimate of the combined charges of all entries in the cache.
virtual size_t TotalCharge() const = 0;
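
To make the contract concrete, here is a brief usage sketch of this interface. It assumes values are heap-allocated std::strings owned by the cache via the deleter; NewLRUCache is the factory function LevelDB declares alongside this interface.

 #include <string>

#include "leveldb/cache.h"

// Deleter invoked by the cache once an entry is both erased/evicted
// and no longer referenced by any handle.
static void DeleteValue(const leveldb::Slice& /*key*/, void* value) {
  delete static_cast<std::string*>(value);
}

void Demo() {
  leveldb::Cache* cache = leveldb::NewLRUCache(1024);  // capacity in charge units

  // Insert hands ownership of the value to the cache via the deleter;
  // this entry is assigned a charge of 1 unit.
  leveldb::Cache::Handle* h =
      cache->Insert("k", new std::string("v"), 1, &DeleteValue);
  cache->Release(h);  // done with the handle; the entry stays cached

  if ((h = cache->Lookup("k")) != nullptr) {
    std::string* v = static_cast<std::string*>(cache->Value(h));
    // ... use *v ...
    cache->Release(h);  // every Lookup/Insert handle needs a Release
  }
  delete cache;
}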

From the interface definition we can work out the requirements LevelDB's cache has to satisfy:

  1. Multithreading support
  2. Performance
  3. Lifecycle management of data entries

cache.cc

cache.cc contains essentially all of LevelDB's LRU cache implementation.

HandleTable - Hash table

HandleTable is simply LevelDB's name for a hash table. Its key internal fields are as follows.

 private:
  // The table consists of an array of buckets where each bucket is
  // a linked list of cache entries that hash into the bucket.

  // Number of buckets.
  uint32_t length_;
  // Total number of elements stored in the whole table.
  uint32_t elems_;
  // Array of pointers; each slot points to the head of one bucket's list.
  LRUHandle** list_;

To keep lookups fast, the table aims to hold at most one element per bucket on average. The insertion code below uses a double-pointer (pointer-to-pointer) technique, which is less complicated than it looks: it locates the slot that points at the insertion position, i.e. the next_hash pointer of the predecessor node (or the bucket head), and splices the new node in through it:

 // First read next_hash to find the next link, chain it behind the node
// being inserted, then repoint the predecessor's next_hash at the new node.
LRUHandle* Insert(LRUHandle* h) {
  // Locate the slot for this key/hash (routing).
  LRUHandle** ptr = FindPointer(h->key(), h->hash);
  LRUHandle* old = *ptr;
  h->next_hash = (old == nullptr ? nullptr : old->next_hash);
  *ptr = h;
  if (old == nullptr) {
    ++elems_;
    if (elems_ > length_) {
      // Since each cache entry is fairly large, we aim for a small
      // average linked list length (<= 1).
      Resize();
    }
  }
  return old;
}
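
The double-pointer idiom is worth pausing on. Here is a tiny standalone sketch (a hypothetical Node type, not LevelDB code) showing why holding the slot that points at a node lets insertion and removal treat the bucket head and mid-chain positions uniformly:

 // `slot` may be the bucket-head slot or some node's next field;
// assigning through it splices the chain the same way in both cases.
struct Node {
  int key;
  Node* next;
};

// Insert `n` at the front of the chain headed by *slot.
void PushFront(Node** slot, Node* n) {
  n->next = *slot;
  *slot = n;
}

// Remove the first node with `key`; no special case for the head,
// because we always hold the slot that points at the match.
Node* RemoveKey(Node** slot, int key) {
  Node** ptr = slot;
  while (*ptr != nullptr && (*ptr)->key != key) {
    ptr = &(*ptr)->next;
  }
  Node* found = *ptr;
  if (found != nullptr) *ptr = found->next;
  return found;
}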

In addition, when the number of elements in the table exceeds the number of buckets, the Resize function is called: the bucket count is doubled while existing elements are rehashed into their appropriate new buckets.

Note that this resize operation comes from the paper [[Dynamic-sized NonBlocking Hash table]]; the author designed the whole hash table with reference to it. The most critical part is the resize itself, which deserves a separate article.

If you want to understand the underlying algorithm theory, you will need to read the paper itself:

Document download: p242-liu.pdf

 // The Resize operation: double the bucket array until it can hold
// elems_ with an average chain length <= 1.
void Resize() {
  uint32_t new_length = 4;
  while (new_length < elems_) {
    new_length *= 2;  // expand
  }
  LRUHandle** new_list = new LRUHandle*[new_length];
  memset(new_list, 0, sizeof(new_list[0]) * new_length);

  // Migrate the hash table: rehash every element into its new bucket.
  uint32_t count = 0;
  for (uint32_t i = 0; i < length_; i++) {
    LRUHandle* h = list_[i];
    while (h != nullptr) {
      LRUHandle* next = h->next_hash;
      uint32_t hash = h->hash;
      LRUHandle** ptr = &new_list[hash & (new_length - 1)];
      h->next_hash = *ptr;
      *ptr = h;
      h = next;
      count++;
    }
  }
  assert(elems_ == count);
  delete[] list_;
  list_ = new_list;
  length_ = new_length;
}

This kind of pre-emptive expansion, doubling the capacity before chains grow, combined with a good hash function and the chain-length target mentioned above, means lookups can effectively be treated as O(1).
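
A side note on the bit operation this relies on: because the bucket count is always a power of two, bucket selection can use a mask instead of a modulo. A worked example (illustrative helper, not LevelDB code):

 // With length = 8, length - 1 = 0b111, so hash & 0b111 keeps the low
// 3 bits of the hash -- the same result as hash % 8, but cheaper.
uint32_t BucketIndex(uint32_t hash, uint32_t length) {
  return hash & (length - 1);  // valid only when length is a power of two
}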

Let's expand on FindPointer, the routing method. The bucket is located quickly from the node's hash value with a bit operation; since the chain inside a bucket is unsorted, the whole chain is then traversed to find the node.

One more detail: just as the insertion above works through the predecessor's next_hash, it is that slot itself which FindPointer finds and returns.

The node search process is:

  1. If a node's hash and key both match, return the double pointer to that node (the address of the predecessor node's next_hash).
  2. Otherwise return a double pointer to the trailing slot of the list (edge case: for an empty bucket, that slot is the bucket head itself).
 // Return a pointer to the slot that points to a cache entry that
// matches key/hash. If there is no such cache entry, return a
// pointer to the trailing slot in the corresponding linked list.
LRUHandle** FindPointer(const Slice& key, uint32_t hash) {
  LRUHandle** ptr = &list_[hash & (length_ - 1)];
  while (*ptr != nullptr && ((*ptr)->hash != hash || key != (*ptr)->key())) {
    ptr = &(*ptr)->next_hash;
  }
  return ptr;
}

Removal modifies next_hash in the same way as insertion, so only a brief note is needed: the slot that pointed at the doomed node is redirected to that node's successor, and the removed node is returned to the caller.

 LRUHandle* Remove(const Slice& key, uint32_t hash) {
  LRUHandle** ptr = FindPointer(key, hash);
  LRUHandle* result = *ptr;
  if (result != nullptr) {
    *ptr = result->next_hash;
    --elems_;
  }
  return result;
}

Summary

The main difficulty in the hash table is understanding how the double pointer is maintained. Of these methods, the one with the biggest side effect is Resize: while it runs, the whole hash table is locked and other threads must wait for the redistribution to finish. And although each bucket aims to hold a single node, an uneven hash distribution can still hurt performance.

Note, too, that although blocking is required, the minimum lock granularity is a single bucket, so in most cases threads reading other buckets are unaffected.

Again, to solve the problem of concurrent reads and writes to a hash table during resize, the paper mentioned here, Dynamic-sized NonBlocking Hash table, proposes an incremental migration that amortizes the migration cost evenly, somewhat like the evolution of Go's GC.

"Dynamic-Sized Nonblocking Hash Tables", by Yujie Liu, Kunlong Zhang, and Michael Spear. ACM Symposium on Principles of Distributed Computing, Jul 2014.

In theory, an LRU cache's eviction policy removes the least recently accessed entry first; the hash table alone clearly cannot do that, so we need to look at the design of the cache nodes and lists.

LRUCache - LRU cache

The general structure of the LRU cache is as follows:

Like most LRU caches, it separates hot and cold data with two linked lists and maintains a hash table internally:

  • lru_: the cold list; entries whose only remaining reference is the cache's are banished here, and eviction happens from here.
  • in_use_: the hot list; entries move here from the cold list when a client takes a reference.

 LRUHandle lru_;      // Cold list: entries referenced only by the cache.
LRUHandle in_use_;   // Hot list: entries currently referenced by clients.
HandleTable table_;  // The hash table, covered above.

Omitting the methods of the exported Cache interface analyzed earlier, the LRUCache class simplifies to the following:

 class LRUCache {
 public:
  LRUCache();
  ~LRUCache();

  // Capacity is set explicitly rather than in the constructor.
  void SetCapacity(size_t capacity) { capacity_ = capacity; }

 private:
  // Helper: unlink entry e from its doubly linked list.
  void LRU_Remove(LRUHandle* e);
  // Helper: append entry e just before the list head (newest position).
  void LRU_Append(LRUHandle* list, LRUHandle* e);
  // Helper: increment the reference count of entry e.
  void Ref(LRUHandle* e);
  // Helper: decrement the reference count of entry e.
  void Unref(LRUHandle* e);
  // Helper: finish removing a single entry e from the cache.
  bool FinishErase(LRUHandle* e) EXCLUSIVE_LOCKS_REQUIRED(mutex_);

  // Must be initialized before the LRUCache is used.
  size_t capacity_;

  // mutex_ protects the state below.
  mutable port::Mutex mutex_;
  size_t usage_ GUARDED_BY(mutex_);

  // Dummy head of the LRU (cold) list.
  // lru.prev is the newest entry, lru.next is the oldest entry.
  // All entries here satisfy refs==1 and in_cache==true, meaning they
  // are referenced only by the cache and by no client.
  LRUHandle lru_ GUARDED_BY(mutex_);

  // Dummy head of the in-use (hot) list.
  // Holds all entries still referenced by clients; since such entries
  // are also referenced by the cache, they satisfy refs >= 2 and
  // in_cache==true.
  LRUHandle in_use_ GUARDED_BY(mutex_);

  // Hash table index over all entries.
  HandleTable table_ GUARDED_BY(mutex_);
};

The hot and cold lists are maintained by these methods:

  • Ref: a caller wants to use the entry; if it currently sits on the cold list, it is promoted to the hot list.
  • Unref: the opposite of Ref; the client no longer uses the entry, so its reference count is decremented. If the count reaches 0, the entry is completely unused and can be deleted; if it reaches 1 (only the cache holds it), the entry is banished to the cold list.
 void LRUCache::Ref(LRUHandle* e) {
  if (e->refs == 1 && e->in_cache) {
    // If on lru_ list, move to in_use_ list (promote to the hot list).
    LRU_Remove(e);
    LRU_Append(&in_use_, e);
  }
  e->refs++;
}

void LRUCache::Unref(LRUHandle* e) {
  assert(e->refs > 0);
  e->refs--;
  if (e->refs == 0) {
    // Deallocate.
    assert(!e->in_cache);
    (*e->deleter)(e->key(), e->value);
    free(e);
  } else if (e->in_cache && e->refs == 1) {
    // No longer in use; move to the lru_ (cold) list.
    LRU_Remove(e);
    LRU_Append(&lru_, e);
  }
}
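
Putting Ref and Unref together, an entry's possible states can be summarized as follows (derived from the code and the author's comments above):

  • refs >= 2 and in_cache == true: referenced by clients, lives on the in_use_ list.
  • refs == 1 and in_cache == true: referenced only by the cache, lives on the lru_ list and is eligible for eviction.
  • refs >= 1 and in_cache == false: erased from the cache but still held by clients, on neither list.
  • refs == 0: the deleter runs and the node is freed.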

Adding and removing list nodes is straightforward, as in any doubly linked list:

 void LRUCache::LRU_Append(LRUHandle* list, LRUHandle* e) {
  // Make "e" newest entry by inserting just before *list.
  e->next = list;
  e->prev = list->prev;
  e->prev->next = e;
  e->next->prev = e;
}

void LRUCache::LRU_Remove(LRUHandle* e) {
  e->next->prev = e->prev;
  e->prev->next = e->next;
}

Because this is a cache, there is a capacity limit: once usage exceeds capacity, the least recently used entries must be evicted from the cold list.

 Cache::Handle* LRUCache::Insert(const Slice& key, uint32_t hash, void* value,
                                size_t charge,
                                void (*deleter)(const Slice& key,
                                                void* value)) {
  MutexLock l(&mutex_);

  // LRUHandle ends in key_data[1]; allocate extra bytes to hold the key.
  LRUHandle* e =
      reinterpret_cast<LRUHandle*>(malloc(sizeof(LRUHandle) - 1 + key.size()));
  e->value = value;
  e->deleter = deleter;
  e->charge = charge;
  e->key_length = key.size();
  e->hash = hash;
  e->in_cache = false;
  e->refs = 1;  // for the returned handle.
  std::memcpy(e->key_data, key.data(), key.size());

  if (capacity_ > 0) {
    e->refs++;  // for the cache's reference.
    e->in_cache = true;
    // Link into the hot list.
    LRU_Append(&in_use_, e);
    // Account for the new entry's charge.
    usage_ += charge;
    // If this is an update, the old element must be reclaimed.
    FinishErase(table_.Insert(e));
  } else {  // don't cache. (capacity_==0 is supported and turns off caching.)
    // next is read by key() in an assert, so it must be initialized
    e->next = nullptr;
  }
  // While usage exceeds the designed capacity and the cold list is
  // non-empty, evict entries from the cold end.
  while (usage_ > capacity_ && lru_.next != &lru_) {
    LRUHandle* old = lru_.next;
    assert(old->refs == 1);
    bool erased = FinishErase(table_.Remove(old->key(), old->hash));
    if (!erased) {  // to avoid unused variable when compiled NDEBUG
      assert(erased);
    }
  }

  return reinterpret_cast<Cache::Handle*>(e);
}

Pay particular attention to this part of the code:

 if (capacity_ > 0) {
  e->refs++;  // for the cache's reference.
  e->in_cache = true;
  // Link into the hot list.
  LRU_Append(&in_use_, e);
  // Account for the new entry's charge.
  usage_ += charge;
  // If this is an update, the old element must be reclaimed.
  FinishErase(table_.Insert(e));
} else {  // don't cache. (capacity_==0 is supported and turns off caching.)
  // next is read by key() in an assert, so it must be initialized
  e->next = nullptr;
}

The call FinishErase(table_.Insert(e)) needs to be read together with the HandleTable's LRUHandle* Insert(LRUHandle* h) method introduced earlier: if an element with the same key already exists, HandleTable's Insert returns the old element.

LRUCache's Insert therefore doubles as an implicit update: the new node is added to the cache, and the old element, if any, is passed to FinishErase, which unlinks it and drops the cache's reference, freeing it immediately if no client still holds a handle to it.

 // If e != nullptr, finish removing *e from the cache; it has already
// been removed from the hash table. Return whether e != nullptr.
bool LRUCache::FinishErase(LRUHandle* e) {
  if (e != nullptr) {
    assert(e->in_cache);
    LRU_Remove(e);
    e->in_cache = false;
    usage_ -= e->charge;
    // State change: may free the node, or leave it to client handles.
    Unref(e);
  }
  return e != nullptr;
}
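
To make the implicit update concrete, here is a short trace (my own walkthrough of the code above, for a hypothetical key k inserted twice):

 // Hypothetical trace: cache->Insert(k, v1, ...) then cache->Insert(k, v2, ...)
//
// 1. The second Insert allocates a new handle e2 (refs=2: handle + cache),
//    appends e2 to in_use_, and calls table_.Insert(e2).
// 2. table_.Insert finds the old handle e1 under the same key/hash,
//    splices e2 into its bucket slot, and returns e1.
// 3. FinishErase(e1) unlinks e1 from its LRU list, clears in_cache,
//    subtracts e1->charge from usage_, and calls Unref(e1).
// 4. If no client still holds e1 (refs drops to 0), its deleter runs and
//    the handle is freed; otherwise e1 lives on, off both lists, until
//    the last client calls Release() on it.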
Extension: MySQL's buffer pool pages are likewise managed with hot/cold separation, to balance page reuse against cache hits. Its query cache, by contrast, was removed outright in MySQL 8.0, because for any moderately write-heavy OLTP workload its hit rate is very low.

Summary

The LRU cache implementation in LevelDB has the following characteristics:

  1. Hot and cold data live on separate linked lists that never intersect: the in_use_ list holds entries referenced by clients, and the lru_ list holds entries not referenced by any client.
  2. The lists are doubly linked, with a dummy head used for boundary checks: the head's prev pointer points to the newest entry and its next pointer to the oldest, forming a circular doubly linked list.
  3. usage_ tracks the current amount of cache in use, and capacity_ the total capacity.
  4. Several basic operations are abstracted as reusable helpers: LRU_Remove, LRU_Append, Ref, Unref.
  5. Each LRUCache is guarded by a mutex_ lock.

LRUHandle - LRU node

An LRU node tracks whether it is still in the cache through its state; once its reference count reaches 0, the erased node is removed from the hash table and the LRU lists and released.

The following is a specific schematic diagram:

LRUHandle is the actual cache node, LRUNode. From the author's comments we can see that the cache maintains two doubly linked lists and one hash table over these nodes, all of which the previous sections have covered.

The classic hash table problem is hash collision, and chaining collided nodes in a linked list is the classic way to resolve it. LevelDB is no exception, though its design is a bit more elaborate than the textbook one.

 // The author's own comments, key to understanding the design:
//
// LRU cache implementation
//
// Cache entries have an "in_cache" boolean indicating whether the cache has a
// reference on the entry. The only ways that this can become false without the
// entry being passed to its "deleter" are via Erase(), via Insert() when
// an element with a duplicate key is inserted, or on destruction of the cache.
//
// The cache keeps two linked lists of items in the cache. All items in the
// cache are in one list or the other, and never both. Items still referenced
// by clients but erased from the cache are in neither list. The lists are:
//
// - in-use: contains the items currently referenced by clients, in no
//   particular order. (This list is used for invariant checking. If we
//   removed the check, elements that would otherwise be on this list could be
//   left as disconnected singleton lists.)
//
// - LRU: contains the items not currently referenced by clients, in LRU order
//
// Elements are moved between these lists by the Ref() and Unref() methods,
// when they detect an element in the cache acquiring or losing its only
// external reference.

// An entry is a variable length heap-allocated structure. Entries
// are kept in a circular doubly linked list ordered by access time.
struct LRUHandle {
  void* value;
  void (*deleter)(const Slice&, void* value);  // User callback to free key/value.
  LRUHandle* next_hash;  // Chains entries within a HandleTable bucket.
  LRUHandle* next;       // Doubly linked list maintaining LRU order.
  LRUHandle* prev;
  size_t charge;         // TODO(opt): Only allow uint32_t?
  size_t key_length;
  bool in_cache;         // Whether this handle is present in the cache table.
  uint32_t refs;         // Number of references to this handle.
  uint32_t hash;         // Hash of key(); used for routing and comparison.
  char key_data[1];      // Beginning of key.

  Slice key() const {
    // next is only equal to this if the LRU handle is the list head of an
    // empty list. List heads never have meaningful keys.
    assert(next != this);
    return Slice(key_data, key_length);
  }
};
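
One more detail worth noting is key_data[1]: the key bytes are stored inline at the end of the struct. A sketch of just the allocation step, mirroring what LRUCache::Insert does (NewHandle is a hypothetical helper name):

 // key_data[1] already accounts for one byte inside sizeof(LRUHandle),
// so we allocate key.size() - 1 extra bytes and copy the key in place.
LRUHandle* NewHandle(const Slice& key) {
  LRUHandle* e = reinterpret_cast<LRUHandle*>(
      malloc(sizeof(LRUHandle) - 1 + key.size()));
  e->key_length = key.size();
  std::memcpy(e->key_data, key.data(), key.size());
  return e;
}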

ShardedLRUCache

This class also implements the Cache interface and uses locking like every LRUCache; the difference is that ShardedLRUCache holds multiple LRUCache shards and routes each key to one of them by hash modulo.

 class ShardedLRUCache : public Cache {
 private:
  LRUCache shard_[kNumShards];
  port::Mutex id_mutex_;
  uint64_t last_id_;

  static inline uint32_t HashSlice(const Slice& s) {
    return Hash(s.data(), s.size(), 0);
  }

  static uint32_t Shard(uint32_t hash) { return hash >> (32 - kNumShardBits); }

 public:
  explicit ShardedLRUCache(size_t capacity) : last_id_(0) {
    const size_t per_shard = (capacity + (kNumShards - 1)) / kNumShards;
    for (int s = 0; s < kNumShards; s++) {
      shard_[s].SetCapacity(per_shard);
    }
  }

  ~ShardedLRUCache() override {}

  Handle* Insert(const Slice& key, void* value, size_t charge,
                 void (*deleter)(const Slice& key, void* value)) override {
    const uint32_t hash = HashSlice(key);
    return shard_[Shard(hash)].Insert(key, hash, value, charge, deleter);
  }

  Handle* Lookup(const Slice& key) override {
    const uint32_t hash = HashSlice(key);
    return shard_[Shard(hash)].Lookup(key, hash);
  }

  void Release(Handle* handle) override {
    LRUHandle* h = reinterpret_cast<LRUHandle*>(handle);
    shard_[Shard(h->hash)].Release(handle);
  }

  void Erase(const Slice& key) override {
    const uint32_t hash = HashSlice(key);
    shard_[Shard(hash)].Erase(key, hash);
  }

  void* Value(Handle* handle) override {
    return reinterpret_cast<LRUHandle*>(handle)->value;
  }

  // Globally unique auto-incrementing id.
  uint64_t NewId() override {
    MutexLock l(&id_mutex_);
    return ++(last_id_);
  }

  void Prune() override {
    for (int s = 0; s < kNumShards; s++) {
      shard_[s].Prune();
    }
  }

  size_t TotalCharge() const override {
    size_t total = 0;
    for (int s = 0; s < kNumShards; s++) {
      total += shard_[s].TotalCharge();
    }
    return total;
  }
};

ShardedLRUCache holds 16 internal LRUCache shards. To look up a key, it first computes which shard the key belongs to, then locks and searches within that LRUCache only. This strategy is not uncommon, but the core code lives elsewhere, which is why it is introduced last.

The Shard method uses the top kNumShardBits = 4 bits of the key's hash as the shard route, which supports kNumShards = 1 << kNumShardBits = 16 shards:

 static uint32_t Shard(uint32_t hash) { return hash >> (32 - kNumShardBits); }
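
A quick worked example of that routing (assuming kNumShardBits = 4):

 // hash >> (32 - 4) keeps the top 4 bits of the 32-bit hash.
// e.g. hash = 0xA1B2C3D4  ->  0xA1B2C3D4 >> 28 = 0xA = 10  ->  shard_[10]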

Since ShardedLRUCache implements the Cache interface, all it needs to do is route each Cache operation to the proper shard. There is little logic beyond that, so no more needs to be said here.

Summary

Apart from some unconventional naming, LevelDB's LRU cache implementation follows the standard LRU design.

Its core is the hash table and hash function: a hash table that supports concurrent reads and writes, with the resize logic at its heart, is well worth studying.

Hash tables have been optimized continuously ever since they appeared, and LevelDB's implementation is a good reference.

Final words

This completes the basic introduction to LevelDB's more important components; its LRU cache can be regarded as a solid, textbook-style implementation.

