Guava cache expiration schemes in practice

Expiration mechanism

Every cache needs an expiration mechanism. Guava cache offers the following three expiration settings:

expireAfterAccess: An entry that is not accessed (read or written) within the specified time is expired. When a missing or expired entry is read, only one thread is allowed to load the new value; the other threads block until that thread finishes and then get the latest value.

Builder method:

public CacheBuilder<K, V> expireAfterAccess(long duration, TimeUnit unit) {
  ...
  this.expireAfterAccessNanos = unit.toNanos(duration);
  return this;
}

This builder method sets the value of expireAfterAccessNanos.

expireAfterWrite: An entry that is not updated (written) within the specified time is expired. When a missing or expired entry is read, only one thread is allowed to load the new value; the other threads block until that thread finishes and then get the latest value.

Builder method:

public CacheBuilder<K, V> expireAfterWrite(long duration, TimeUnit unit) {
  ...
  this.expireAfterWriteNanos = unit.toNanos(duration);
  return this;
}

This builder method sets the value of expireAfterWriteNanos.

refreshAfterWrite: An entry that is not updated (written) within the specified time becomes eligible for refresh. While one thread is loading the new value, the other threads return the old value.

Builder method:

public CacheBuilder<K, V> refreshAfterWrite(long duration, TimeUnit unit) {
  ...
  this.refreshNanos = unit.toNanos(duration);
  return this;
}

This builder method sets the value of refreshNanos.

Problem

expireAfterAccess and expireAfterWrite:
When an entry reaches its expiration time, only one thread can load the new value, and all other requests block until that load completes, which costs performance.
refreshAfterWrite:
When an entry reaches its refresh time, only one thread loads the new value, and the other threads return the old value immediately. This reduces waiting and lock contention, so refreshAfterWrite performs better than expireAfterWrite. By default, however, one calling thread still has to perform the refresh itself. Guava cache also supports asynchronous refresh: when it is enabled, that thread submits the refresh task and returns the old value as well, which performs better still.
However, Guava cache does not clean up on a timer (actively); it checks expiration and cleans up (passively) when data is queried. This causes the following problem: if an entry is queried again only after a long idle period, the old value that is returned may be very stale, which can lead to serious errors in scenarios with strict timeliness requirements.

When the requested data is not in the cache at all, every thread blocks regardless of which mode is set. A lock ensures that only one thread loads the data.

Principle

To solve this problem, we first need to understand how Guava cache decides that data has expired.

1. Holistic approach

get method:

class LocalCache<K, V> extends AbstractMap<K, V> implements ConcurrentMap<K, V> {

    V get(K key, int hash, CacheLoader<? super K, V> loader) throws ExecutionException {
      checkNotNull(key);
      checkNotNull(loader);
      try {
        if (count != 0) { // read-volatile
          // don't call getLiveEntry, which would ignore loading values
          ReferenceEntry<K, V> e = getEntry(key, hash);
          if (e != null) {
            long now = map.ticker.read();
            V value = getLiveValue(e, now);
            if (value != null) {
              recordRead(e, now);
              statsCounter.recordHits(1);
              return scheduleRefresh(e, key, hash, value, now, loader);
            }
            ValueReference<K, V> valueReference = e.getValueReference();
            if (valueReference.isLoading()) {
              return waitForLoadingValue(e, key, valueReference);
            }
          }
        }

        // at this point e is either null or expired;
        return lockedGetOrLoad(key, hash, loader);
      } catch (ExecutionException ee) {
        Throwable cause = ee.getCause();
        if (cause instanceof Error) {
          throw new ExecutionError((Error) cause);
        } else if (cause instanceof RuntimeException) {
          throw new UncheckedExecutionException(cause);
        }
        throw ee;
      } finally {
        postReadCleanup();
      }
    }
    
}

It can be seen that LocalCache extends AbstractMap and implements ConcurrentMap. It does not inherit from ConcurrentHashMap; rather, to handle concurrent access, its core data structure is modeled on ConcurrentHashMap.

2. Simplified method

Here the method is simplified into a few key steps related to this topic:

if (count != 0) {    // does the cache currently hold any entries?
  ReferenceEntry<K, V> e = getEntry(key, hash);    // fetch the entry
  if (e != null) {
    V value = getLiveValue(e, now);    // check expiration and filter out expired entries; only the times set by expireAfterAccess or expireAfterWrite are checked
    if (value != null) {
      return scheduleRefresh(e, key, hash, value, now, loader);    // refresh if needed; only takes effect in refreshAfterWrite mode
    }
    ValueReference<K, V> valueReference = e.getValueReference();
    if (valueReference.isLoading()) {   // another thread is loading/refreshing the entry
      return waitForLoadingValue(e, key, valueReference);    // wait for that thread to finish loading/refreshing
    }
  }
}
return lockedGetOrLoad(key, hash, loader);    // load/refresh the entry

count is a volatile field (volatile int count) that stores the number of entries currently in the cache.

If count == 0 (nothing is cached) or no entry is found for the key, the cache locks and loads the value (lockedGetOrLoad).
If an entry is found, getLiveValue checks whether it has expired and filters out expired data.

3. getLiveValue

V getLiveValue(ReferenceEntry<K, V> entry, long now) {
  if (entry.getKey() == null) {
    tryDrainReferenceQueues();
    return null;
  }
  V value = entry.getValueReference().get();
  if (value == null) {
    tryDrainReferenceQueues();
    return null;
  }

  if (map.isExpired(entry, now)) {
    tryExpireEntries(now);
    return null;
  }
  return value;
}

isExpired determines whether the current entry has expired:

boolean isExpired(ReferenceEntry<K, V> entry, long now) {
  checkNotNull(entry);
  if (expiresAfterAccess() && (now - entry.getAccessTime() >= expireAfterAccessNanos)) {
    return true;
  }
  if (expiresAfterWrite() && (now - entry.getWriteTime() >= expireAfterWriteNanos)) {
    return true;
  }
  return false;
}

isExpired only checks the two durations expireAfterAccessNanos and expireAfterWriteNanos. Comparing this with the three builder methods expireAfterAccess, expireAfterWrite and refreshAfterWrite, you can see that this method does not look at the refreshAfterWrite time at all: if the time set by expireAfterAccess or expireAfterWrite has elapsed, the entry is expired; otherwise it is not. When an entry is found to be expired, the cache also checks whether other entries have expired (lazy deletion):

void tryExpireEntries(long now) {
  if (tryLock()) {
    try {
      expireEntries(now);
    } finally {
      unlock();
      // don't call postWriteCleanup as we're in a read
    }
  }
}
  
void expireEntries(long now) {
  drainRecencyQueue();
  ReferenceEntry<K, V> e;
  while ((e = writeQueue.peek()) != null && map.isExpired(e, now)) {
    if (!removeEntry(e, e.getHash(), RemovalCause.EXPIRED)) {
      throw new AssertionError();
    }
  }
  while ((e = accessQueue.peek()) != null && map.isExpired(e, now)) {
    if (!removeEntry(e, e.getHash(), RemovalCause.EXPIRED)) {
      throw new AssertionError();
    }
  }
}  

void drainRecencyQueue() {
  ReferenceEntry<K, V> e;
  while ((e = recencyQueue.poll()) != null) {
    if (accessQueue.contains(e)) {
      accessQueue.add(e);
    }
  }
}

This takes entries from the heads of the write queue and the access queue and checks them one by one, removing each entry that has expired and stopping at the first one that has not.
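The passive cleanup described above can be illustrated with a much smaller model. The following is a minimal sketch using only the JDK, not Guava's implementation: the class LazyExpiryModel, its Entry type and the explicit now parameter are all hypothetical names introduced for illustration. It assumes each key is written once and that time never goes backwards, and for simplicity it sweeps on every read.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;

/** Simplified model of passive (lazy) expiration; not Guava's implementation. */
class LazyExpiryModel {
    /** A key together with the time at which it was written. */
    static final class Entry {
        final String key;
        final long writeTime;

        Entry(String key, long writeTime) {
            this.key = key;
            this.writeTime = writeTime;
        }
    }

    private final long expireAfterWriteNanos;
    private final Map<String, String> values = new HashMap<>();
    // Oldest write sits at the head, which is the role writeQueue plays in expireEntries.
    private final ArrayDeque<Entry> writeQueue = new ArrayDeque<>();

    LazyExpiryModel(long expireAfterWriteNanos) {
        this.expireAfterWriteNanos = expireAfterWriteNanos;
    }

    /** Assumption: each key is written once and 'now' never decreases, so the queue stays ordered. */
    void put(String key, String value, long now) {
        values.put(key, value);
        writeQueue.add(new Entry(key, now));
    }

    /** Removes expired entries from the head of the queue, then looks up the key. */
    String get(String key, long now) {
        Entry head;
        while ((head = writeQueue.peek()) != null
                && now - head.writeTime >= expireAfterWriteNanos) {
            writeQueue.poll();
            values.remove(head.key);
        }
        return values.get(key);
    }

    /** Number of entries still held, including expired ones that no read has swept yet. */
    int size() {
        return values.size();
    }
}
```

Nothing is removed until some caller reads: an entry written at time 0 with a 10-unit expiration still occupies memory at time 1000 if no get has happened in between. Because the queue is ordered by write time, the sweep can stop at the first entry that is still live.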

4. scheduleRefresh

V value = getLiveValue(e, now);
if (value != null) {
  return scheduleRefresh(e, key, hash, value, now, loader);
}

If getLiveValue returns a non-null result, the entry has not expired under the expireAfterAccess and expireAfterWrite modes (or neither of those times is set). That does not mean the data will not be refreshed, because getLiveValue does not check the refreshAfterWrite time; the scheduleRefresh method does.

V scheduleRefresh(
    ReferenceEntry<K, V> entry,
    K key,
    int hash,
    V oldValue,
    long now,
    CacheLoader<? super K, V> loader) {
  if (map.refreshes()
      && (now - entry.getWriteTime() > map.refreshNanos)
      && !entry.getValueReference().isLoading()) {
    V newValue = refresh(key, hash, loader, true);
    if (newValue != null) {
      return newValue;
    }
  }
  return oldValue;
}

When all of the following conditions are met, the data is refreshed (in synchronous refresh mode the refreshing thread returns the new value; in asynchronous refresh mode it may return the old value). Otherwise the old value is returned directly:

A refreshAfterWrite time (refreshNanos) is set.
The time since the entry was last written exceeds refreshNanos.
No other thread is refreshing the data (!entry.getValueReference().isLoading()).
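The three conditions above can be modeled for a single entry as follows. This is a minimal sketch under stated assumptions, not Guava's code: RefreshModel and its fields are hypothetical names, the time is passed in explicitly, and an AtomicBoolean stands in for the isLoading() check.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Simplified single-entry model of the scheduleRefresh decision; not Guava's code. */
class RefreshModel {
    private final long refreshNanos;
    // Stands in for valueReference.isLoading(): true while some caller is refreshing.
    private final AtomicBoolean loading = new AtomicBoolean(false);
    private volatile String value;
    private volatile long writeTime;

    RefreshModel(long refreshNanos, String initialValue, long now) {
        this.refreshNanos = refreshNanos;
        this.value = initialValue;
        this.writeTime = now;
    }

    String get(long now, Supplier<String> loader) {
        boolean due = now - writeTime > refreshNanos;
        // compareAndSet lets exactly one caller win the right to refresh.
        if (due && loading.compareAndSet(false, true)) {
            try {
                value = loader.get();
                writeTime = now;
            } finally {
                loading.set(false);
            }
        }
        // The winner returns the value it just loaded. Callers that lost the race,
        // or arrived before the refresh time, return whatever is currently stored.
        return value;
    }
}
```

A caller that arrives while a refresh is in flight does not wait: it fails the compareAndSet and returns the stored value, which is still the old one until the refresh completes.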

5. waitForLoadingValue

When the data has expired under expireAfterAccess or expireAfterWrite, so that getLiveValue returns null:

If another thread is already loading the value, block and wait for it (future.get()).

If no other thread is loading it, lock and load the data.

ValueReference<K, V> valueReference = e.getValueReference();
if (valueReference.isLoading()) {
  return waitForLoadingValue(e, key, valueReference);
}  

V waitForLoadingValue(ReferenceEntry<K, V> e, K key, ValueReference<K, V> valueReference)
    throws ExecutionException {
  if (!valueReference.isLoading()) {
    throw new AssertionError();
  }

  checkState(!Thread.holdsLock(e), "Recursive load of: %s", key);
  // don't consider expiration as we're concurrent with loading
  try {
    V value = valueReference.waitForValue();
    if (value == null) {
      throw new InvalidCacheLoadException("CacheLoader returned null for key " + key + ".");
    }
    // re-read ticker now that loading has completed
    long now = map.ticker.read();
    recordRead(e, now);
    return value;
  } finally {
    statsCounter.recordMisses(1);
  }
}

public V waitForValue() throws ExecutionException {
  return getUninterruptibly(futureValue);
}

public static <V> V getUninterruptibly(Future<V> future) throws ExecutionException {
  boolean interrupted = false;
  try {
    while (true) {
      try {
        return future.get();
      } catch (InterruptedException e) {
        interrupted = true;
      }
    }
  } finally {
    if (interrupted) {
      Thread.currentThread().interrupt();
    }
  }
}

6. Load data

Data is ultimately loaded either by the lockedGetOrLoad method or by the refresh method called from scheduleRefresh; both end up calling the load/reload method of CacheLoader. When the requested data is not in the cache, lockedGetOrLoad is entered regardless of which mode is set:

Threads compete for the lock to win the right to load the data.
The thread that gets the lock sets the entry's state to loading and loads the data.
Threads that do not get the lock enter the same waitForLoadingValue method as in the previous step and block until loading completes.

lock();
try {
  LoadingValueReference<K, V> loadingValueReference =
                new LoadingValueReference<K, V>(valueReference);
  e.setValueReference(loadingValueReference);

  if (createNewEntry) {
    loadingValueReference = new LoadingValueReference<K, V>();

    if (e == null) {
      e = newEntry(key, hash, first);
      e.setValueReference(loadingValueReference);
      table.set(index, e);
    } else {
      e.setValueReference(loadingValueReference);
    }
  }
} finally {
  unlock();
  postWriteCleanup();
}

if (createNewEntry) {
  try {
    // Synchronizes on the entry to allow failing fast when a recursive load is
    // detected. This may be circumvented when an entry is copied, but will fail fast most
    // of the time.
    synchronized (e) {
      return loadSync(key, hash, loadingValueReference, loader);
    }
  } finally {
    statsCounter.recordMisses(1);
  }
} else {
  // The entry already exists. Wait for loading.
  return waitForLoadingValue(e, key, valueReference);
}
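The "one thread loads, the rest wait on the same result" pattern above can be sketched with JDK classes alone. This is a simplified model, not Guava's implementation: SingleFlightModel and its loads counter are hypothetical, a CompletableFuture plays the role of the loading value reference, and expiration is left out entirely.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Simplified model of "one loader, everyone else waits"; not Guava's code. */
class SingleFlightModel {
    private final ConcurrentHashMap<String, CompletableFuture<String>> entries =
            new ConcurrentHashMap<>();
    // Counts how many times the loader actually ran (for illustration only).
    final AtomicInteger loads = new AtomicInteger();

    String get(String key, Function<String, String> loader) {
        CompletableFuture<String> mine = new CompletableFuture<>();
        CompletableFuture<String> existing = entries.putIfAbsent(key, mine);
        if (existing != null) {
            // Another caller owns the load (or has finished it): wait for its result.
            return existing.join();
        }
        try {
            loads.incrementAndGet();
            String loaded = loader.apply(key);
            mine.complete(loaded);
            return loaded;
        } catch (RuntimeException ex) {
            // Drop the failed entry so that a later call can retry the load.
            entries.remove(key, mine);
            mine.completeExceptionally(ex);
            throw ex;
        }
    }
}
```

Here putIfAbsent takes the place of the segment lock: whichever caller installs its future first becomes the loader, and every other caller for the same key blocks in join() until the value is available.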

Solution

From the analysis above, we can see that Guava cache makes two independent checks when deciding whether cached data is out of date:

It checks expireAfterAccess and expireAfterWrite (in getLiveValue).
It checks refreshAfterWrite (in scheduleRefresh).

Back to the problem: refreshAfterWrite improves performance, but in synchronous loading mode every thread other than the one performing the refresh may get data that expired long ago. We can solve this by combining expireAfterWrite (or expireAfterAccess) with refreshAfterWrite: set both, and make the expireAfterWrite/expireAfterAccess time greater than the refreshAfterWrite time. For example, with refreshAfterWrite set to 5 minutes and expireAfterWrite set to 30 minutes, when out-of-date data is accessed:

If the data is more than 5 but less than 30 minutes old, the call enters the scheduleRefresh method, and threads other than the refresh thread return the old value directly.
If the cached data has not been accessed for a long time and is more than 30 minutes old, it is filtered out by the getLiveValue method. All threads other than the loading thread block and wait.
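The two independent checks can be combined into one small decision function. The sketch below is a JDK-only illustration with hypothetical names (ExpiryPathModel, Path, classify); it reuses the comparison operators from isExpired (>=) and scheduleRefresh (>) shown earlier and assumes the age is measured from the last write. With Guava itself, the configuration from the example would chain refreshAfterWrite(5, TimeUnit.MINUTES) and expireAfterWrite(30, TimeUnit.MINUTES) on CacheBuilder.newBuilder() before calling build with a CacheLoader.

```java
/** Classifies a read by entry age; a simplified model, not Guava's code. */
class ExpiryPathModel {
    enum Path {
        FRESH,                 // value returned as-is
        REFRESH_RETURN_OLD,    // one thread refreshes, the others return the old value
        EXPIRED_BLOCKING_LOAD  // one thread loads, the others block until it finishes
    }

    /**
     * @param ageNanos     time since the entry was last written
     * @param refreshNanos the refreshAfterWrite duration
     * @param expireNanos  the expireAfterWrite duration (assumed larger than refreshNanos)
     */
    static Path classify(long ageNanos, long refreshNanos, long expireNanos) {
        if (ageNanos >= expireNanos) {
            return Path.EXPIRED_BLOCKING_LOAD; // same >= comparison as isExpired
        }
        if (ageNanos > refreshNanos) {
            return Path.REFRESH_RETURN_OLD;    // same > comparison as scheduleRefresh
        }
        return Path.FRESH;
    }
}
```

The expiration check comes first, mirroring the order in get: getLiveValue runs before scheduleRefresh, so an entry older than the expiration time never reaches the refresh path and can never be served as a stale old value.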
