1. Introduction to Disruptor
- Disruptor is a low-latency, high-performance, lock-free bounded ring buffer (circular array) developed by LMAX, a British foreign exchange trading company. A system built on Disruptor can process 6 million orders per second on a single thread. Disruptor is now an open-source concurrency framework, and Log4j2 uses it under the hood for asynchronous logging.
Disruptor design features
- Ring data structure: the underlying storage is an array rather than a linked list.
- Element positioning: the array length is 2^n and the sequence number only increases, so a slot can be located quickly with a bit operation (sequence & (length - 1)).
- Lock-free design: a producer or consumer must first claim a position and may read or write only after the claim succeeds. CAS guarantees thread safety during the claim.
- It solves data exchange between multiple threads on a single machine. It is not a distributed queue like Kafka.
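The bit-operation positioning described above can be sketched as follows. This is a minimal illustration rather than Disruptor's own code; the class and method names (`RingIndex`, `indexOf`) are hypothetical. Because the capacity is a power of two, `sequence & (capacity - 1)` gives the same result as `sequence % capacity` for non-negative sequences, without a division.

```java
/** Hypothetical helper that maps an ever-increasing sequence to an array slot. */
final class RingIndex {
    private final int mask;

    RingIndex(int capacity) {
        // bitCount == 1 means exactly one bit is set, i.e. a power of two
        if (capacity < 1 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        this.mask = capacity - 1;
    }

    /** Equivalent to sequence % capacity for non-negative sequences. */
    int indexOf(long sequence) {
        return (int) (sequence & mask);
    }

    public static void main(String[] args) {
        RingIndex ring = new RingIndex(8);
        // Sequences 6 and 7 map to slots 6 and 7; 8 and 9 wrap around to slots 0 and 1
        for (long seq = 6; seq <= 9; seq++) {
            System.out.println("sequence " + seq + " -> slot " + ring.indexOf(seq));
        }
    }
}
```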
2. Queue options in the JDK
Queue | Bounded | Lock | Underlying structure |
---|---|---|---|
ArrayBlockingQueue | Bounded | Locked | Array |
LinkedBlockingQueue | Optionally bounded | Locked | Linked list |
ConcurrentLinkedQueue | Unbounded | Lock-free | Linked list |
In a system with high concurrency and strict stability requirements, the queue must be bounded so that a fast producer cannot exhaust memory. To reduce the impact of Java garbage collection on performance, the queue should also be backed by an array. Only one candidate meets both criteria: ArrayBlockingQueue.
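As a small illustration of why boundedness matters, the sketch below offers more items than a bounded queue can hold and counts how many are accepted. The class and method names (`BoundedQueueDemo`, `offerAll`) are made up for this example; `ArrayBlockingQueue.offer` is the standard JDK call that returns `false` instead of blocking when the queue is full.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class BoundedQueueDemo {
    /** Offers `attempts` items to a queue of the given capacity; returns how many were accepted. */
    static int offerAll(int capacity, int attempts) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(capacity);
        int accepted = 0;
        for (int i = 0; i < attempts; i++) {
            // offer() returns false when the queue is full, so the producer learns it is too fast
            if (queue.offer(i)) {
                accepted++;
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        System.out.println("accepted " + offerAll(4, 10) + " of 10");
    }
}
```

With no consumer running, only as many items as the capacity are accepted; an unbounded queue would keep growing instead.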
2.1 Problems with ArrayBlockingQueue
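2.1.1 Locking
- Performance ranks as follows: no synchronization > CAS operations > locks. ArrayBlockingQueue guards its put and take operations with a lock, so it sits in the slowest category.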
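For readers unfamiliar with CAS, here is a minimal sketch of a lock-free counter built on `AtomicLong.compareAndSet`. The class `CasCounter` and its methods are hypothetical names for this example. The point is the retry loop: a thread never blocks, it simply tries again when another thread has changed the value first.

```java
import java.util.concurrent.atomic.AtomicLong;

final class CasCounter {
    private final AtomicLong value = new AtomicLong();

    /** Lock-free increment: read, compute, and retry if another thread won the race. */
    long increment() {
        while (true) {
            long current = value.get();
            long next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    long get() {
        return value.get();
    }

    /** Runs `threads` threads that each increment `perThread` times; returns the final count. */
    static long run(int threads, int perThread) {
        CasCounter counter = new CasCounter();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.increment();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 100_000));
    }
}
```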
2.1.2 False sharing
False sharing: caches store data in units of cache lines. When multiple threads modify independent variables that happen to share one cache line, they unintentionally degrade each other's performance.
- There are several levels of cache between the CPU and main memory. The closer a cache is to the CPU, the smaller and faster it is.
- When computing, the CPU looks for data in the nearest cache first and moves to the next level, farther from the CPU, only on a miss.
- Caches store data in units of cache lines. A cache line is typically 64 bytes and can hold 8 values of type long. When the CPU reads one element of a long array, the 7 neighboring values on the same line are loaded with it. When one of those values is modified by another core, the entire cache line is invalidated, and the other 7 values are invalidated along with it.
ArrayBlockingQueue has three member variables:
- takeIndex: the index of the element that needs to be taken
- putIndex: the index of the position where the element can be inserted
- count: the number of elements in the queue
These three variables are likely to land on the same cache line, yet they are modified largely independently of one another. Each modification therefore invalidates the copies of the other two that other cores have cached, so the cache cannot be shared effectively.
// Fields excerpted from java.util.concurrent.ArrayBlockingQueue
public class ArrayBlockingQueue<E> {
    /** The queued items */
    final Object[] items;
    /** items index for next take, poll, peek or remove */
    int takeIndex;
    /** items index for next put, offer, or add */
    int putIndex;
    /** Number of elements in the queue */
    int count;
}
- False sharing solution: increase the spacing between array elements (or insert padding between fields) so that data accessed by different threads sits on different cache lines, trading space for time.
// value1 and value2 may suffer from false sharing
class ValueNoPadding {
    protected volatile long value1 = 0L;
    protected volatile long value2 = 0L;
}
// Unused fields p1~p7 and p9~p15 are placed around value1 to separate it from value2.
// The JVM may reorder fields declared in one class, so this layout is not guaranteed.
class ValuePadding {
    protected long p1, p2, p3, p4, p5, p6, p7;
    protected volatile long value1 = 0L;
    protected long p9, p10, p11, p12, p13, p14, p15;
    protected volatile long value2 = 0L;
}
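The same space-for-time idea can be applied to an array by giving each thread a slot that is a full cache line away from its neighbors. The sketch below is only an illustration: `StridedCounters` is a hypothetical class, and the stride of 8 assumes 64-byte cache lines and 8-byte longs. Whether it actually improves throughput on a given machine has to be measured.

```java
import java.util.concurrent.atomic.AtomicLongArray;

final class StridedCounters {
    // Assumption: 64-byte cache lines, so 8 longs fit on one line
    static final int STRIDE = 8;
    private final AtomicLongArray slots;

    StridedCounters(int counters) {
        // One extra stride in front keeps counter 0 away from the array header
        slots = new AtomicLongArray((counters + 1) * STRIDE);
    }

    private static int slot(int counter) {
        return (counter + 1) * STRIDE;
    }

    void increment(int counter) {
        slots.incrementAndGet(slot(counter));
    }

    long get(int counter) {
        return slots.get(slot(counter));
    }

    /** Each of `threads` threads increments its own counter `perThread` times. */
    static long[] run(int threads, int perThread) {
        StridedCounters counters = new StridedCounters(threads);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counters.increment(id);
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long[] result = new long[threads];
        for (int t = 0; t < threads; t++) {
            result[t] = counters.get(t);
        }
        return result;
    }

    public static void main(String[] args) {
        long[] totals = run(2, 1_000_000);
        System.out.println(totals[0] + " " + totals[1]);
    }
}
```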
3. Disruptor
RingBuffer
- The RingBuffer is a ring-shaped array used to pass data between threads.
- The RingBuffer has a sequence number that only increases and points to the next available element.
- The capacity is fixed at creation and never changes, and the slots are allocated up front, which reduces GC pressure.
Usage example
- Prepare the data container
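// Data container that holds the content being produced and consumed
public class LongEvent {
    private long value;
    public long getValue() { return value; }
    public void setValue(long value) { this.value = value; }
}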
- Prepare the factory that creates the data containers; it is used to pre-fill the RingBuffer when it is initialized
import com.lmax.disruptor.EventFactory;
// Factory for the data container
public class LongEventFactory implements EventFactory<LongEvent> {
    @Override
    public LongEvent newInstance() {
        return new LongEvent();
    }
}
- Prepare consumers
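import com.lmax.disruptor.EventHandler;
// Consumer
public class LongEventConsumer implements EventHandler<LongEvent> {
    /**
     * Called for every event published to the RingBuffer.
     *
     * @param longEvent  the event to consume
     * @param sequence   the sequence number of this event
     * @param endOfBatch whether this is the last event of the current batch
     * @throws Exception if handling fails
     */
    @Override
    public void onEvent(LongEvent longEvent, long sequence, boolean endOfBatch) throws Exception {
        String str = String.format("long event : %s l:%s b:%s", longEvent.getValue(), sequence, endOfBatch);
        System.out.println(str);
    }
}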
- Production thread, main thread
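import com.lmax.disruptor.RingBuffer;
import com.lmax.disruptor.dsl.Disruptor;
import java.util.concurrent.ThreadFactory;

public class Main {
    public static void main(String[] args) throws Exception {
        // Thread factory used to create the consumer threads
        ThreadFactory threadFactory = (r) -> new Thread(r);
        // Create a Disruptor from the event factory, the RingBuffer size
        // (must be a power of two), and the consumer thread factory
        Disruptor<LongEvent> disruptor = new Disruptor<>(new LongEventFactory(), 8, threadFactory);
        // Register the consumer
        disruptor.handleEventsWith(new LongEventConsumer());
        disruptor.start();
        // Get the Disruptor's RingBuffer
        RingBuffer<LongEvent> ringBuffer = disruptor.getRingBuffer();
        // The main thread starts producing
        for (long l = 0; l <= 8; l++) {
            long nextIndex = ringBuffer.next();
            try {
                LongEvent event = ringBuffer.get(nextIndex);
                event.setValue(l);
            } finally {
                // Always publish a claimed sequence, otherwise consumers stall on it
                ringBuffer.publish(nextIndex);
            }
            Thread.sleep(1000);
        }
        // Wait for pending events to be handled, then stop the consumer thread
        disruptor.shutdown();
    }
}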
Implementation principle
How a single producer writes data
- The producer thread requests space to write M elements.
- Starting from the current cursor, Disruptor looks for M consecutive writable slots and returns the highest sequence number of the range it finds.
- Disruptor checks whether the claimed range would overwrite elements that have not been read yet. If it would not, the producer writes the data directly and then publishes the sequence. With a single producer there is no competing writer, so the claim itself does not have to retry a CAS.
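The single-producer flow can be sketched with a small self-contained ring. This is an illustrative simplification under stated assumptions, not Disruptor's implementation: `SingleProducerRing`, `tryPublish`, and `poll` are hypothetical names, it claims one slot at a time, and it supports exactly one producer thread and one consumer thread.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical single-producer, single-consumer ring; a sketch, not Disruptor's code. */
final class SingleProducerRing {
    private final long[] slots;
    private final int mask;
    private final AtomicLong cursor = new AtomicLong(-1);   // last published sequence
    private final AtomicLong consumed = new AtomicLong(-1); // last sequence the consumer finished
    private long nextToClaim = 0; // touched only by the single producer, so no CAS is needed

    SingleProducerRing(int capacity) {
        if (capacity < 1 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        slots = new long[capacity];
        mask = capacity - 1;
    }

    /** Tries to publish one value; returns false if that would overwrite an unread slot. */
    boolean tryPublish(long value) {
        long sequence = nextToClaim;
        long wrapPoint = sequence - slots.length; // sequence that currently occupies the slot
        if (wrapPoint > consumed.get()) {
            return false; // the consumer has not read that slot yet
        }
        slots[(int) (sequence & mask)] = value;
        cursor.set(sequence); // volatile write makes the slot content visible to the consumer
        nextToClaim = sequence + 1;
        return true;
    }

    /** Returns the next unread value, or null when nothing has been published. */
    Long poll() {
        long next = consumed.get() + 1;
        if (next > cursor.get()) {
            return null;
        }
        long value = slots[(int) (next & mask)];
        consumed.set(next); // frees the slot for the producer
        return value;
    }

    public static void main(String[] args) {
        SingleProducerRing ring = new SingleProducerRing(4);
        for (int i = 0; i < 5; i++) {
            System.out.println("publish " + i + ": " + ring.tryPublish(i));
        }
        System.out.println("poll: " + ring.poll());
        System.out.println("publish 4 again: " + ring.tryPublish(4));
    }
}
```

In `main`, the fifth publish is rejected because the ring of four slots is full; after one `poll` frees a slot, the retry succeeds.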
How multiple producers write data
- Disruptor introduces a buffer of the same size as the RingBuffer, the availableBuffer, which records the state of each slot in the RingBuffer. When a producer writes data, the corresponding availableBuffer position is marked as written. When a consumer reads the data, the corresponding availableBuffer position is marked as free.
- When multiple producers claim space, CAS hands each thread a different slot of the array to work on.
- When consumers read data, they scan the availableBuffer in order for a contiguous readable range, take the highest sequence number of that range, read the data, and mark the corresponding availableBuffer positions as free.
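The multi-producer steps above can be sketched as follows. This is a simplified model of the description in this section, not Disruptor's actual code: the names (`MultiProducerRing`, `tryPublish`, `drain`) are hypothetical, the availability buffer holds a plain 0/1 flag, and the sketch assumes a single consumer thread.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical multi-producer, single-consumer ring; a sketch, not Disruptor's code. */
final class MultiProducerRing {
    private final long[] slots;
    private final AtomicIntegerArray available; // 1 = written, 0 = free
    private final int mask;
    private final AtomicLong claimed = new AtomicLong(-1);  // highest sequence handed to a producer
    private final AtomicLong consumed = new AtomicLong(-1); // highest sequence read by the consumer

    MultiProducerRing(int capacity) {
        if (capacity < 1 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        slots = new long[capacity];
        available = new AtomicIntegerArray(capacity);
        mask = capacity - 1;
    }

    /** Claims a slot with CAS, writes the value, then marks the slot as written. */
    boolean tryPublish(long value) {
        long sequence;
        while (true) {
            long current = claimed.get();
            sequence = current + 1;
            if (sequence - slots.length > consumed.get()) {
                return false; // ring is full
            }
            if (claimed.compareAndSet(current, sequence)) {
                break; // this thread now owns the slot for `sequence`
            }
        }
        int index = (int) (sequence & mask);
        slots[index] = value;
        available.set(index, 1);
        return true;
    }

    /** Reads the longest contiguous run of written slots and marks them free. */
    long[] drain() {
        long first = consumed.get() + 1;
        long last = first - 1;
        long limit = claimed.get();
        while (last + 1 <= limit && available.get((int) ((last + 1) & mask)) == 1) {
            last++;
        }
        long[] out = new long[(int) (last - first + 1)];
        for (int i = 0; i < out.length; i++) {
            int index = (int) ((first + i) & mask);
            out[i] = slots[index];
            available.set(index, 0);
        }
        consumed.set(last);
        return out;
    }

    /** Demo: `producers` threads each publish 1..perProducer; returns the sum the consumer saw. */
    static long run(int producers, int perProducer) {
        MultiProducerRing ring = new MultiProducerRing(8);
        for (int p = 0; p < producers; p++) {
            new Thread(() -> {
                for (long v = 1; v <= perProducer; v++) {
                    while (!ring.tryPublish(v)) {
                        Thread.yield(); // ring full, wait for the consumer
                    }
                }
            }).start();
        }
        long expected = (long) producers * perProducer;
        long seen = 0;
        long sum = 0;
        while (seen < expected) {
            long[] batch = ring.drain();
            for (long v : batch) {
                sum += v;
            }
            seen += batch.length;
            if (batch.length == 0) {
                Thread.yield();
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000));
    }
}
```

The flag is cleared before `consumed` advances, so a producer can reuse a slot only after its flag is already back to 0. That ordering is what makes the simple 0/1 flag sufficient in this single-consumer sketch.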
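How Disruptor solves false sharing and thread visibility
// Excerpt from com.lmax.disruptor.Sequence (Disruptor 3.x).
// Unused fields on both sides isolate the real value on its own cache line.
class LhsPadding
{
    protected long p1, p2, p3, p4, p5, p6, p7;
}
class Value extends LhsPadding
{
    protected volatile long value;
}
class RhsPadding extends Value
{
    protected long p9, p10, p11, p12, p13, p14, p15;
}
public class Sequence extends RhsPadding
{
    static final long INITIAL_VALUE = -1L;
    private static final Unsafe UNSAFE;
    private static final long VALUE_OFFSET;
    static
    {
        // Look up the memory offset of the value field once, at class load time
        UNSAFE = Util.getUnsafe();
        try
        {
            VALUE_OFFSET = UNSAFE.objectFieldOffset(Value.class.getDeclaredField("value"));
        }
        catch (final Exception e)
        {
            throw new RuntimeException(e);
        }
    }
    public Sequence(final long initialValue)
    {
        UNSAFE.putOrderedLong(this, VALUE_OFFSET, initialValue);
    }
    // Volatile read: always sees the latest published value
    public long get()
    {
        return value;
    }
    // Writes the value directly to memory through UNSAFE with an ordered store
    public void set(final long value)
    {
        UNSAFE.putOrderedLong(this, VALUE_OFFSET, value);
    }
}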
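`sun.misc.Unsafe` is not part of the public Java API, so the same idea can be tried out with `AtomicLongFieldUpdater`, whose `lazySet` is the public counterpart of an ordered store. The sketch below is an assumption-laden rewrite for experimentation, not Disruptor's code: all class names (`PaddedSequence`, `SeqLhsPadding`, `SeqValue`, `SeqRhsPadding`) are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLongFieldUpdater;

/** Hypothetical padded sequence that uses only public JDK APIs. */
final class PaddedSequence extends SeqRhsPadding {
    PaddedSequence(long initialValue) {
        UPDATER.lazySet(this, initialValue);
    }

    /** Volatile read of the current value. */
    long get() {
        return value;
    }

    /** Ordered store: cheaper than a full volatile write, visible to later volatile reads. */
    void set(long newValue) {
        UPDATER.lazySet(this, newValue);
    }

    /** CAS on the same field, as a producer would use to claim a sequence. */
    boolean compareAndSet(long expected, long updated) {
        return UPDATER.compareAndSet(this, expected, updated);
    }

    public static void main(String[] args) {
        PaddedSequence sequence = new PaddedSequence(-1L);
        sequence.set(41L);
        System.out.println(sequence.compareAndSet(41L, 42L) + " " + sequence.get());
    }
}

// Padding before the value; subclass fields are laid out after superclass fields
class SeqLhsPadding {
    protected long p1, p2, p3, p4, p5, p6, p7;
}

class SeqValue extends SeqLhsPadding {
    protected volatile long value;
    // Created in the declaring class so the updater is allowed to access the field
    static final AtomicLongFieldUpdater<SeqValue> UPDATER =
            AtomicLongFieldUpdater.newUpdater(SeqValue.class, "value");
}

// Padding after the value
class SeqRhsPadding extends SeqValue {
    protected long p9, p10, p11, p12, p13, p14, p15;
}
```

Splitting the padding across a class hierarchy matters because the JVM lays out superclass fields before subclass fields, which keeps the padding on both sides of `value`.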