CopyOnWriteArrayList Principle Analysis

Introduction

CopyOnWriteArrayList is the only concurrent List implementation in the java.util.concurrent package. It is a thread-safe counterpart of ArrayList in which every modification is performed on a copy of the underlying array (a snapshot) rather than on the array itself. This is the copy-on-write strategy.

[Figure: CopyOnWriteArrayList class diagram]

As the class diagram shows, each CopyOnWriteArrayList holds an array field that stores the actual elements and a ReentrantLock exclusive lock that ensures only one thread at a time can modify the array.

If we were to make our own copy-on-write thread-safe list, what would we do, and what are the points to consider?

  • When is the list initialized, how many elements does it start with, and is its size bounded?
  • How is thread safety ensured, for example when multiple threads read and write at the same time?
  • How to ensure data consistency when traversing a list using an iterator?

Let's take a look at how CopyOnWriteArrayList is implemented.

Main method analysis

Initialization

In the no-argument constructor, an Object array of size 0 is created by default as the initial value.

public CopyOnWriteArrayList() {
        setArray(new Object[0]);
}

Constructors with arguments:

// Creates a list holding a copy of the given array toCopyIn
public CopyOnWriteArrayList(E[] toCopyIn) {
    setArray(Arrays.copyOf(toCopyIn, toCopyIn.length, Object[].class));
}
// Takes a collection and copies its elements into the list
public CopyOnWriteArrayList(Collection<? extends E> c) {
        Object[] elements;
        if (c.getClass() == CopyOnWriteArrayList.class)
            elements = ((CopyOnWriteArrayList<?>)c).getArray();
        else {
            elements = c.toArray();
            // c.toArray might (incorrectly) not return Object[] (see 6260652)
            if (elements.getClass() != Object[].class)
                elements = Arrays.copyOf(elements, elements.length, Object[].class);
        }
        setArray(elements);
}
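
The copying done by these constructors can be observed from the outside. The following is a minimal sketch (the class and method names are made up for illustration) showing that changes made to the source array or collection after construction do not reach the list:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, not part of the JDK.
class ConstructorCopyDemo {
    // Builds a list from an array, then mutates the source array.
    static String fromArray() {
        String[] source = {"a", "b", "c"};
        CopyOnWriteArrayList<String> list = new CopyOnWriteArrayList<>(source);
        source[0] = "changed"; // the list works on its own copy of the array
        return list.get(0);
    }

    // Builds a list from a collection, then grows the source collection.
    static int fromCollection() {
        List<String> source = new ArrayList<>(Arrays.asList("a", "b"));
        CopyOnWriteArrayList<String> list = new CopyOnWriteArrayList<>(source);
        source.add("c"); // does not affect the already constructed list
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(new CopyOnWriteArrayList<String>().size()); // 0
        System.out.println(fromArray());      // a
        System.out.println(fromCollection()); // 2
    }
}
```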

Add elements

The methods used to add elements in CopyOnWriteArrayList are:

  • add(E e)
  • add(int index,E e)
  • addIfAbsent(E e)
  • addAllAbsent(Collection<? extends E> c) etc.

These methods all work on similar principles, so we take add(E e) as the example to analyze.

public boolean add(E e) {
        // Acquire the exclusive lock
        final ReentrantLock lock = this.lock;
        lock.lock();
        try {
            // Get the current array
            Object[] elements = getArray();
            int len = elements.length;
            // Copy the array into a new array one slot longer and append the new element
            Object[] newElements = Arrays.copyOf(elements, len + 1);
            newElements[len] = e;
            // Replace the old array with the new one
            setArray(newElements);
            return true;
        } finally {
            // Release the exclusive lock
            lock.unlock();
        }
}

In the above code, the exclusive lock will be acquired first. If multiple threads call the add method at the same time, only one thread can acquire the lock, and other threads will be blocked until the lock is released.

Inside the lock, the current array is copied into a new array that is one element longer, the new element is written to the last slot, and the new array replaces the original before the lock is released. Note that the element is added to the copy, never directly to the original array.
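
As a rough illustration of the effect of the exclusive lock, the sketch below (class name, thread count, and element count are arbitrary choices for this example) lets several threads call add concurrently and checks that no element is lost:

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, not part of the JDK.
class ConcurrentAddDemo {
    // Starts `threads` workers that each add `perThread` elements, then returns the final size.
    static int addFromThreads(int threads, int perThread) {
        List<Integer> list = new CopyOnWriteArrayList<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    list.add(i); // each call copies the array while holding the lock
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until every worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return list.size();
    }

    public static void main(String[] args) {
        System.out.println(addFromThreads(4, 1000)); // 4000
    }
}
```

Because every add copies the whole array, a loop like this does a lot of copying. That is acceptable for a demonstration but is the reason the class is meant for lists that are rarely written.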

Get the element at a specified position

Use the get(int index) method to get the element at position index. An IndexOutOfBoundsException is thrown if the index is out of range.

public E get(int index) {
        return get(getArray(), index);
}
final Object[] getArray() {
    return array;
}
private E get(Object[] a, int index) {
    return (E) a[index];
}

In the code above, when a thread calls get, it first obtains the array and then reads the element at the given index. These are two separate steps, and neither of them is protected by a lock.

Suppose there are elements 1, 2, and 3 in the array.

[Figure: contents of the array]

Because neither the first step (obtaining the array) nor the second step (reading the element at the given index) takes a lock, another thread y may perform a remove between the two steps of thread x. Suppose thread y removes element 1. The remove operation first acquires the exclusive lock and then performs copy-on-write: it creates a copy of the current array without element 1, the very element thread x is about to read through get, and then makes array point to the new array.

At this point the old array is still reachable, because thread x holds a reference to it, so it cannot be garbage-collected. When thread x executes the second step, it operates on the array as it was before thread y removed the element.

[Figure: weak consistency]

Summary: although thread y has already removed the element at that index, the second step of thread x still returns it. This is the weak consistency that results from the copy-on-write strategy of CopyOnWriteArrayList.
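
The interleaving described above is hard to reproduce reliably with real threads, so the following sketch replays it in a single thread. StaleReadSketch is a made-up, heavily simplified stand-in for the list (a volatile array field plus a copy-on-write removal, with locking omitted), not the JDK implementation:

```java
import java.util.Arrays;

// Hypothetical, simplified stand-in for the list; not the JDK class.
class StaleReadSketch {
    // Plays the role of the list's array field.
    private volatile Object[] array = {1, 2, 3};

    // Copy-on-write removal of the first element (lock omitted: this sketch is single-threaded).
    void removeFirst() {
        Object[] old = array;
        array = Arrays.copyOfRange(old, 1, old.length);
    }

    // Replays the two steps of get(0) with a removal in between.
    static Object[] demo() {
        StaleReadSketch list = new StaleReadSketch();
        Object[] seenByX = list.array;  // step 1: "thread x" obtains the array
        list.removeFirst();             // "thread y" removes element 1 in between
        Object fromOld = seenByX[0];    // step 2: x indexes the array it already holds
        Object fromNew = list.array[0]; // a get(0) that starts now sees the new array
        return new Object[] {fromOld, fromNew};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // [1, 2]
    }
}
```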

Modify the specified element

Use set(int index, E element) to replace the element at the specified position in the list. An IndexOutOfBoundsException is thrown if the index is out of range.

public E set(int index, E element) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        Object[] elements = getArray();
        E oldValue = get(elements, index);

        if (oldValue != element) {
            int len = elements.length;
            Object[] newElements = Arrays.copyOf(elements, len);
            newElements[index] = element;
            setArray(newElements);
        } else {
            // Not quite a no-op; ensures volatile write semantics
            setArray(elements);
        }
        return oldValue;
    } finally {
        lock.unlock();
    }
}

This method also acquires the exclusive lock first, then obtains the current array and calls get to read the element at the specified position. If that element is not the same object as the new value (the comparison is by reference, using !=), the array is copied, the element at index is replaced in the copy, and the copy is installed as the new array.

If the element at the specified position is the same object as the new value, the array is still set again even though its contents have not changed, in order to preserve volatile write semantics.

Because array is a volatile field, this write keeps the memory-visibility guarantee of a real modification: whatever the writing thread did before the call is visible to threads that subsequently read the array.
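
From the caller's point of view, set behaves as sketched below. The class and method names are invented for this example, which only relies on the public set, get, and size methods:

```java
import java.util.Arrays;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, not part of the JDK.
class SetDemo {
    // Replaces the first element and reports the old value, the new value, and the size.
    static String[] replaceFirst() {
        CopyOnWriteArrayList<String> list =
                new CopyOnWriteArrayList<>(new String[] {"a", "b", "c"});
        String old = list.set(0, "z"); // returns the element that was replaced
        return new String[] {old, list.get(0), String.valueOf(list.size())};
    }

    // Calling set with an index that is out of range fails instead of growing the list.
    static boolean setOutOfRangeThrows() {
        try {
            new CopyOnWriteArrayList<String>().set(0, "x");
            return false;
        } catch (IndexOutOfBoundsException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(replaceFirst())); // [a, z, 3]
        System.out.println(setOutOfRangeThrows());           // true
    }
}
```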

Remove elements

To remove a specified element from the list, you can use the following methods.

  • E remove(int index)
  • boolean remove(Object o)
  • boolean remove(Object o, Object[] snapshot, int index) etc.

They all work on roughly the same principle, so only the remove(int index) method is explained here.

public E remove(int index) {
        // Acquire the exclusive lock
        final ReentrantLock lock = this.lock;
        lock.lock();
        try {
            Object[] elements = getArray();
            int len = elements.length;
            E oldValue = get(elements, index);
            int numMoved = len - index - 1;
            // If the element to remove is the last one
            if (numMoved == 0)
                setArray(Arrays.copyOf(elements, len - 1));
            else {
                // Copy the remaining elements into the new array in two passes
                Object[] newElements = new Object[len - 1];
                System.arraycopy(elements, 0, newElements, 0, index);
                System.arraycopy(elements, index + 1, newElements, index, numMoved);
                setArray(newElements);
            }
            return oldValue;
        } finally {
            lock.unlock();
        }
}

The method first acquires the exclusive lock so that no other thread can modify the array during the removal. It then reads the element to be removed, copies the remaining elements into a new array, replaces the original array with the new one, and finally releases the lock before returning.
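
The two-pass copy is easier to follow in isolation. Below is a standalone sketch of just the copying step; withoutIndex is a hypothetical helper written for this article (with an explicit bounds check added), not a JDK method:

```java
import java.util.Arrays;

// Hypothetical helper class, not part of the JDK.
class RemoveCopySketch {
    // Returns a new array without the element at `index`; the input array is left untouched.
    static Object[] withoutIndex(Object[] elements, int index) {
        int len = elements.length;
        if (index < 0 || index >= len) {
            throw new IndexOutOfBoundsException("index " + index + ", length " + len);
        }
        int numMoved = len - index - 1; // number of elements after the removed one
        Object[] newElements = new Object[len - 1];
        // First pass: everything before the removed element.
        System.arraycopy(elements, 0, newElements, 0, index);
        // Second pass: everything after it, shifted one slot to the left.
        System.arraycopy(elements, index + 1, newElements, index, numMoved);
        return newElements;
    }

    public static void main(String[] args) {
        Object[] original = {"a", "b", "c", "d"};
        System.out.println(Arrays.toString(withoutIndex(original, 1))); // [a, c, d]
        System.out.println(Arrays.toString(original));                  // [a, b, c, d]
    }
}
```

When the last element is removed, numMoved is 0 and the second pass copies nothing, which is why the original code can take the Arrays.copyOf shortcut in that case.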

Iterator

Let's take a look at the weak consistency of the iterator in CopyOnWriteArrayList. Weak consistency here means that once the iterator has been returned, additions, removals, and modifications made to the list by other threads are invisible to the iterator. Let's see how this is done.

public Iterator<E> iterator() {
    return new COWIterator<E>(getArray(), 0);
}
static final class COWIterator<E> implements ListIterator<E> {
    // Snapshot of the array
    private final Object[] snapshot;
    // Array index of the next element to return
    private int cursor;

    private COWIterator(Object[] elements, int initialCursor) {
        cursor = initialCursor;
        snapshot = elements;
    }

    // Whether any elements remain to be traversed
    public boolean hasNext() {
        return cursor < snapshot.length;
    }

    // Return the next element
    public E next() {
        if (! hasNext())
            throw new NoSuchElementException();
        return (E) snapshot[cursor++];
    }

}

Calling the iterator method actually returns a COWIterator object. Its snapshot field holds the contents of the list at that moment, and cursor is the index of the next element to return during traversal.

Why call snapshot a snapshot of the list? After all, it is just a reference to the list's array, not a copy.

If no other thread adds to, removes from, or modifies the list while the iterator is being traversed, snapshot simply is the list's array, because both refer to the same object.

However, if another thread adds, removes, or modifies an element during the traversal, snapshot becomes a true snapshot: the modification replaces the list's array with a new one, while the old array remains referenced by snapshot. This means that once the iterator has been obtained, additions, removals, and modifications made to the list by other threads are invisible while iterating, because the two sides operate on different arrays. This is the weak consistency of the iterator.
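
The snapshot behaviour can be seen even in a single thread. The sketch below (class and method names are invented for this example) modifies the list after obtaining an iterator and then traverses it. It also shows that the iterator itself is read-only:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, not part of the JDK.
class SnapshotIteratorDemo {
    // Modifies the list after the iterator was created and returns what the iterator sees.
    static List<String> iterateAfterModifying() {
        CopyOnWriteArrayList<String> list =
                new CopyOnWriteArrayList<>(new String[] {"a", "b", "c"});
        Iterator<String> it = list.iterator(); // captures the current array
        list.add("d");                         // the list now points to a new array
        list.remove("a");                      // and to yet another one
        List<String> seen = new ArrayList<>();
        while (it.hasNext()) {
            seen.add(it.next());               // no ConcurrentModificationException here
        }
        return seen;
    }

    // The iterator works on a snapshot, so it does not support remove().
    static boolean iteratorRemoveUnsupported() {
        Iterator<String> it =
                new CopyOnWriteArrayList<String>(new String[] {"a"}).iterator();
        it.next();
        try {
            it.remove();
            return false;
        } catch (UnsupportedOperationException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(iterateAfterModifying());     // [a, b, c]
        System.out.println(iteratorRemoveUnsupported()); // true
    }
}
```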

Summary

CopyOnWriteArrayList uses the copy-on-write strategy to keep the list consistent. Because the three-step read, modify, write sequence is not atomic, an exclusive lock is held during additions, removals, and modifications so that only one thread at a time can modify the list's array.

In addition, CopyOnWriteArrayList provides a weakly consistent iterator to ensure that after the iterator is acquired, other threads' modifications to the list are invisible, and the array traversed by the iterator is a snapshot.

CopyOnWrite concurrent containers suit concurrent scenarios with many reads and few writes. They have two drawbacks: memory usage (every write copies the whole array) and data consistency (only eventual consistency of the data is guaranteed, not real-time consistency).


神秘杰克