An analysis of the principles behind Apache Flink CDC batch-stream fusion

This article is reproduced from the "Good Future Technology" public account. It uses a Flink SQL case to introduce the use of Flink CDC 2.0 and then interprets the core design of CDC. The main contents are:
Case study
Core design
Detailed code

In August, Flink CDC released version 2.0.0. Compared with version 1.0, it supports distributed reading and checkpointing in the full read phase, and it guarantees data consistency across the full + incremental read without locking the table. For a detailed introduction, refer to "Flink CDC 2.0 is officially released, and the core improvement".

The data reading logic of Flink CDC 2.0 is not complicated. What makes it harder to follow is the design of FLIP-27: Refactor Source Interface and unfamiliarity with the Debezium API. This article focuses on the processing logic of Flink CDC itself; FLIP-27 and the Debezium API calls are not explained in depth.

This article is based on CDC 2.0.0. It first shows how to use Flink CDC 2.0 with a Flink SQL case, then introduces the core design of CDC (slice division, slice reading, and incremental reading), and finally explains the code behind the flink-mysql-cdc interface calls and their implementation in the data processing flow.

1. Case

Read a MySQL table in full + incremental mode, write it out in changelog-json format, and observe the RowKind types and the number of records affected.

public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        EnvironmentSettings envSettings = EnvironmentSettings.newInstance()
                .useBlinkPlanner()
                .inStreamingMode()
                .build();
        env.setParallelism(3);
        // note: incremental synchronization requires checkpointing to be enabled
        env.enableCheckpointing(10000);
        StreamTableEnvironment tableEnvironment = StreamTableEnvironment.create(env, envSettings);
            
        tableEnvironment.executeSql(" CREATE TABLE demoOrders (\n" +
                "         `order_id` INTEGER ,\n" +
                "          `order_date` DATE ,\n" +
                "          `order_time` TIMESTAMP(3),\n" +
                "          `quantity` INT ,\n" +
                "          `product_id` INT ,\n" +
                "          `purchaser` STRING,\n" +
                "           primary key(order_id)  NOT ENFORCED" +
                "         ) WITH (\n" +
                "          'connector' = 'mysql-cdc',\n" +
                "          'hostname' = 'localhost',\n" +
                "          'port' = '3306',\n" +
                "          'username' = 'cdc',\n" +
                "          'password' = '123456',\n" +
                "          'database-name' = 'test',\n" +
                "          'table-name' = 'demo_orders'," +
                // full + incremental synchronization
                "          'scan.startup.mode' = 'initial'      " +
                " )");

              tableEnvironment.executeSql("CREATE TABLE sink (\n" +
                "         `order_id` INTEGER ,\n" +
                "          `order_date` DATE ,\n" +
                "          `order_time` TIMESTAMP(3),\n" +
                "          `quantity` INT ,\n" +
                "          `product_id` INT ,\n" +
                "          `purchaser` STRING,\n" +
                "          primary key (order_id)  NOT ENFORCED " +
                ") WITH (\n" +
                "    'connector' = 'kafka',\n" +
                "    'properties.bootstrap.servers' = 'localhost:9092',\n" +
                "    'topic' = 'mqTest02',\n" +
                "    'format' = 'changelog-json' "+
                ")");

        tableEnvironment.executeSql("insert into sink select * from demoOrders");
}

Full data output:

{"data":{"order_id":1010,"order_date":"2021-09-17","order_time":"2021-09-22 10:52:12.189","quantity":53,"product_id":502,"purchaser":"flink"},"op":"+I"}
{"data":{"order_id":1009,"order_date":"2021-09-17","order_time":"2021-09-22 10:52:09.709","quantity":31,"product_id":500,"purchaser":"flink"},"op":"+I"}
{"data":{"order_id":1008,"order_date":"2021-09-17","order_time":"2021-09-22 10:52:06.637","quantity":69,"product_id":503,"purchaser":"flink"},"op":"+I"}
{"data":{"order_id":1007,"order_date":"2021-09-17","order_time":"2021-09-22 10:52:03.535","quantity":52,"product_id":502,"purchaser":"flink"},"op":"+I"}
{"data":{"order_id":1002,"order_date":"2021-09-17","order_time":"2021-09-22 10:51:51.347","quantity":69,"product_id":503,"purchaser":"flink"},"op":"+I"}
{"data":{"order_id":1001,"order_date":"2021-09-17","order_time":"2021-09-22 10:51:48.783","quantity":50,"product_id":502,"purchaser":"flink"},"op":"+I"}
{"data":{"order_id":1000,"order_date":"2021-09-17","order_time":"2021-09-17 17:40:32.354","quantity":30,"product_id":500,"purchaser":"flink"},"op":"+I"}
{"data":{"order_id":1006,"order_date":"2021-09-17","order_time":"2021-09-22 10:52:01.249","quantity":31,"product_id":500,"purchaser":"flink"},"op":"+I"}
{"data":{"order_id":1005,"order_date":"2021-09-17","order_time":"2021-09-22 10:51:58.813","quantity":69,"product_id":503,"purchaser":"flink"},"op":"+I"}
{"data":{"order_id":1004,"order_date":"2021-09-17","order_time":"2021-09-22 10:51:56.153","quantity":50,"product_id":502,"purchaser":"flink"},"op":"+I"}
{"data":{"order_id":1003,"order_date":"2021-09-17","order_time":"2021-09-22 10:51:53.727","quantity":30,"product_id":500,"purchaser":"flink"},"op":"+I"}

Modify the table data and capture the incremental changes:

## Update the row with order_id 1005
{"data":{"order_id":1005,"order_date":"2021-09-17","order_time":"2021-09-22 02:51:58.813","quantity":69,"product_id":503,"purchaser":"flink"},"op":"-U"}
{"data":{"order_id":1005,"order_date":"2021-09-17","order_time":"2021-09-22 02:55:43.627","quantity":80,"product_id":503,"purchaser":"flink"},"op":"+U"}

## Delete the row with order_id 1000
{"data":{"order_id":1000,"order_date":"2021-09-17","order_time":"2021-09-17 09:40:32.354","quantity":30,"product_id":500,"purchaser":"flink"},"op":"-D"}
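To check the RowKind counts without reading the output by eye, the `op` field of each changelog-json line can be tallied. The following is a minimal sketch and not part of Flink CDC: the class `ChangelogOpCounter` and its method are hypothetical names, and the regular expression assumes the flat JSON layout shown above rather than parsing JSON properly.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that counts changelog-json records per op code (+I, -U, +U, -D). */
final class ChangelogOpCounter {

    // Assumption: every line carries exactly one "op":"<code>" pair, as in the output above.
    private static final Pattern OP = Pattern.compile("\"op\"\\s*:\\s*\"([+-][IUD])\"");

    static Map<String, Integer> countOps(Iterable<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            Matcher matcher = OP.matcher(line);
            if (matcher.find()) {
                // Increment the counter for this op code, starting at 1.
                counts.merge(matcher.group(1), 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

For the run above, one would expect eleven `+I` records from the full phase, one `-U`/`+U` pair for the update, and one `-D` for the delete.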

2. Core design

1. Slicing

In the full phase, data is read in a distributed fashion. The table is first divided into multiple chunks by primary key, and each subtask then reads the data within its chunk range. Depending on whether the primary key column is an auto-increment integer type, the table is divided into evenly distributed chunks or unevenly distributed chunks.

1.1 Uniform distribution

This applies when the primary key column is auto-incremented and of an integer type (int, bigint, decimal). Query the minimum and maximum values of the primary key column and divide the range evenly by chunkSize. Because the primary key is an integer, the end position of each chunk is computed directly from its start position plus chunkSize.

Note: later versions no longer depend on whether the primary key column is auto-incremented. They only require the primary key column to be an integer type and compute a data distribution factor as (max(id) - min(id)) / rowcount. The table is divided evenly only when this factor is <= the configured distribution factor (evenly-distribution.factor, 1000.0d by default).
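The distribution check described in the note can be sketched as follows. This is an illustration only: the class and method names are hypothetical, `long` stands in for the key type, and the formula is the one quoted in the note, so the connector's exact arithmetic may differ.

```java
/** Hypothetical sketch of the distribution-factor check; not the connector's code. */
final class DistributionFactorSketch {

    /** Default upper bound quoted in the note above. */
    static final double DEFAULT_UPPER_BOUND = 1000.0d;

    /** (max - min) / rowCount: how many key values each row "occupies" on average. */
    static double distributionFactor(long min, long max, long rowCount) {
        if (rowCount <= 0) {
            throw new IllegalArgumentException("rowCount must be positive");
        }
        return (double) (max - min) / rowCount;
    }

    /** Keys are treated as evenly distributed when the factor does not exceed the bound. */
    static boolean isEvenlyDistributed(long min, long max, long rowCount) {
        return distributionFactor(min, max, rowCount) <= DEFAULT_UPPER_BOUND;
    }
}
```

A dense key range such as ids 1 to 1001 over 1000 rows gives a factor of 1.0 and is split evenly, while 100 rows spread over ids 0 to 5,000,000 give a factor of 50,000 and fall back to the uneven strategy.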

//  Compute the value range of the primary key column
select min(`order_id`), max(`order_id`) from demo_orders;

//  Divide the data into slices of chunkSize
chunk-0: [min, min + chunkSize)
chunk-1: [min + chunkSize, min + 2 * chunkSize)
.......
chunk-last: [end of the previous chunk, null)
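The even split can be sketched in a few lines of Java. This is a minimal sketch under assumed names (`EvenChunkSketch`, `Chunk`, `split`), with `long` keys standing in for the primary key. A `null` bound means unbounded on that side, which is how the splitting code shown later in this article leaves the first and last chunks open.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of evenly sized chunk splitting over long keys. */
final class EvenChunkSketch {

    /** Half-open range [start, end); a null bound means "unbounded" on that side. */
    static final class Chunk {
        final Long start;
        final Long end;

        Chunk(Long start, Long end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + ", " + end + ")";
        }
    }

    static List<Chunk> split(long min, long max, long chunkSize) {
        List<Chunk> chunks = new ArrayList<>();
        if (min + chunkSize > max) {
            // Everything fits into one chunk: read the whole table at once.
            chunks.add(new Chunk(null, null));
            return chunks;
        }
        Long chunkStart = null;
        long chunkEnd = min + chunkSize;
        while (chunkEnd <= max) {
            chunks.add(new Chunk(chunkStart, chunkEnd));
            chunkStart = chunkEnd;
            chunkEnd += chunkSize;
        }
        // The last chunk is open-ended so rows above max are still covered.
        chunks.add(new Chunk(chunkStart, null));
        return chunks;
    }

    public static void main(String[] args) {
        // prints [[null, 11), [11, 21), [21, null)]
        System.out.println(split(1, 25, 10));
    }
}
```

No query is needed per chunk here, which is why this path is much cheaper than the uneven split described next.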

1.2 Non-uniform distribution

This applies when the primary key column is not auto-incremented or is of a non-integer type. Because the chunk boundaries cannot be computed arithmetically, each division sorts the not-yet-divided data in ascending primary key order and takes the maximum value among the first chunkSize rows as the end position of the current chunk.

Note: in later versions this applies when the primary key column is a non-integer type, or when the computed distribution factor (distributionFactor) > the configured distribution factor (evenly-distribution.factor).

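// After sorting the not-yet-split data, take the maximum of the first chunkSize rows as the end position of the slice.
chunkend = SELECT MAX(`order_id`) FROM (
        SELECT `order_id`  FROM `demo_orders` 
        WHERE `order_id` >= [end position of the previous slice] 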
        ORDER BY `order_id` ASC 
        LIMIT   [chunkSize]  
    ) AS T
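The uneven split can be simulated without a database. In the sketch below a sorted in-memory set plays the role of the primary key index, and `queryNextChunkMax` emulates the SQL above. All names are hypothetical, and the boundary handling follows the splitting code shown later in this article.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;

/** Hypothetical in-memory simulation of unevenly sized chunk splitting. */
final class UnevenChunkSketch {

    /** Emulates: SELECT MAX(k) FROM (SELECT k ... WHERE k >= from ORDER BY k LIMIT chunkSize). */
    static Long queryNextChunkMax(NavigableSet<Long> keys, long from, int chunkSize) {
        Long last = null;
        int taken = 0;
        for (Long key : keys.tailSet(from, true)) {
            if (taken == chunkSize) {
                break;
            }
            last = key; // keys are sorted, so the last one taken is the maximum
            taken++;
        }
        return last;
    }

    static Long nextChunkEnd(NavigableSet<Long> keys, long previousEnd, int chunkSize, long max) {
        Long chunkEnd = queryNextChunkMax(keys, previousEnd, chunkSize);
        if (chunkEnd != null && chunkEnd == previousEnd) {
            // A chunk must not start and end at the same key: move to the next larger key.
            chunkEnd = keys.higher(previousEnd);
        }
        // Reaching max means the remaining rows belong to the final open-ended chunk.
        return (chunkEnd == null || chunkEnd >= max) ? null : chunkEnd;
    }

    static List<String> split(NavigableSet<Long> keys, int chunkSize) {
        List<String> chunks = new ArrayList<>();
        if (keys.isEmpty()) {
            chunks.add("[null, null)");
            return chunks;
        }
        long max = keys.last();
        Long chunkStart = null;
        Long chunkEnd = nextChunkEnd(keys, keys.first(), chunkSize, max);
        while (chunkEnd != null) {
            chunks.add("[" + chunkStart + ", " + chunkEnd + ")");
            chunkStart = chunkEnd;
            chunkEnd = nextChunkEnd(keys, chunkEnd, chunkSize, max);
        }
        chunks.add("[" + chunkStart + ", null)");
        return chunks;
    }
}
```

For the keys 1, 2, 5, 9, 40, 41, 100 and a chunkSize of 3, this produces the chunks [null, 5), [5, 40) and [40, null). Each boundary costs one query, which is why this path is slower than the even split.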

2. Full slice data read

Flink divides the table data into multiple chunks, and the subtasks read the chunks in parallel without locks. Because no lock is held while a slice is being read, other transactions may modify data within the slice range, so a plain read cannot guarantee consistency. Flink therefore combines snapshot record reading with binlog data correction to guarantee data consistency in the full phase.

2.1 Snapshot read

Execute SQL through JDBC to query the data records of the slice range.

## SQL for reading the snapshot records
SELECT * FROM `test`.`demo_orders` 
WHERE order_id >= [chunkStart] 
AND NOT (order_id = [chunkEnd]) 
AND order_id <= [chunkEnd]
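The query above selects the half-open range [chunkStart, chunkEnd). A statement like it could be assembled per slice as in the sketch below. The class and method names are hypothetical, and the way the open-ended first and last slices are rendered is an assumption made for illustration, not a copy of the connector's code.

```java
/** Hypothetical builder for the per-slice scan statement (uses JDBC '?' placeholders). */
final class SplitScanQuerySketch {

    static String build(String table, String key, boolean isFirst, boolean isLast) {
        StringBuilder sql = new StringBuilder("SELECT * FROM ").append(table);
        if (isFirst && isLast) {
            // Single slice covering the whole table: no range predicate at all.
            return sql.toString();
        }
        sql.append(" WHERE ");
        if (isFirst) {
            // No lower bound: everything strictly below chunkEnd.
            sql.append(key).append(" <= ? AND NOT (").append(key).append(" = ?)");
        } else if (isLast) {
            // No upper bound: everything from chunkStart upwards.
            sql.append(key).append(" >= ?");
        } else {
            // Middle slice: chunkStart <= key < chunkEnd, written as in the SQL above.
            sql.append(key).append(" >= ? AND NOT (").append(key).append(" = ?) AND ")
                    .append(key).append(" <= ?");
        }
        return sql.toString();
    }
}
```

Calling `build("t", "order_id", false, false)` yields the same predicate shape as the snapshot SQL shown above, with the bounds left as parameters.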

2.2 Data correction

Before and after the snapshot read, execute SHOW MASTER STATUS to query the current offset of the binlog file. After the snapshot has been read, query the binlog data between those two offsets and use it to correct the snapshot records that were read.

Data organization structure when reading snapshot + Binlog data:

Rules for correcting SnapshotEvents with BinlogEvents:

If no binlog data is read, that is, no other transaction ran during the select phase, all snapshot records are emitted directly.
If binlog data is read but the changed records do not belong to the current slice, the snapshot records are emitted unchanged.
If binlog data is read and the changed records belong to the current slice: a delete removes the record from the snapshot in memory, an insert adds the new record to the snapshot in memory, and an update adds the change record to the snapshot in memory. Finally, the two records before and after the update are output downstream.
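The correction rules above amount to replaying the slice's binlog events on top of the snapshot rows held in memory. The following is a simplified sketch with hypothetical names: rows are plain strings keyed by primary key, the slice range is half-open, and the result keeps only the final image of each key rather than emitting the before and after records of an update.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of correcting snapshot rows with the binlog events of the same slice. */
final class SnapshotCorrectionSketch {

    /** A simplified change event: op is "c" (create), "u" (update) or "d" (delete). */
    static final class Change {
        final String op;
        final long key;
        final String row; // row image after the change; unused for deletes

        Change(String op, long key, String row) {
            this.op = op;
            this.key = key;
            this.row = row;
        }
    }

    static Map<Long, String> correct(
            Map<Long, String> snapshot, List<Change> binlog, long chunkStart, long chunkEnd) {
        // Work on a copy so the caller's snapshot stays untouched.
        Map<Long, String> corrected = new LinkedHashMap<>(snapshot);
        for (Change change : binlog) {
            if (change.key < chunkStart || change.key >= chunkEnd) {
                continue; // the change belongs to another slice: ignore it here
            }
            switch (change.op) {
                case "c":
                case "u":
                    corrected.put(change.key, change.row);
                    break;
                case "d":
                    corrected.remove(change.key);
                    break;
                default:
                    throw new IllegalArgumentException("Unknown op: " + change.op);
            }
        }
        return corrected;
    }
}
```

With an empty event list the snapshot is returned as read, which matches the first rule. Events for keys outside the slice leave it unchanged, which matches the second.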

Revised data organization structure:

![image.png](https://img.alicdn.com/imgextra/i1/O1CN01POsozI1WMXPFo2J0W_!!6000000002774-2-tps-1080-93.png)

Take reading the data in the range of slice [1,11] as an example to describe the processing of slice data. c, d, u represent the add, delete, and update operations captured by Debezium.

Data and structure before correction:

Revised data and structure:

After a single slice has been processed, the reader sends the range of the completed slice (ChunkStart, ChunkEnd) and the maximum binlog offset (high watermark) to the SplitEnumerator, which uses them to determine the starting offset for incremental reading.

3. Incremental slice data reading

After all slices of the full phase have been read, the SplitEnumerator issues a BinlogSplit for incremental reading. The most important attribute of a BinlogSplit is its starting offset: if the offset is set too small, downstream may receive duplicate data; if it is set too large, downstream may miss changes and be left with stale data. Flink CDC starts the incremental read from the minimum binlog offset among all completed full-phase slices, and emits a record downstream only if it satisfies the following condition:

The offset of the captured binlog record > the maximum binlog offset of the slice to which the record belongs.

For example, the completed slice information retained by SplitEnumerator is:

Slice index	Chunk data range	Maximum Binlog offset read by slice
0	[1,100]	1000
1	[101,200]	800
2	[201,300]	1500

During incremental reading, the binlog is read starting from offset 800. When the record <data:123, offset:1500> is captured, first find the snapshot slice to which key 123 belongs (slice 1) and its maximum binlog offset, 800. Because the record's offset is greater than the maximum offset read by that snapshot slice, the record is emitted; otherwise it would be discarded.
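The example in the table can be reproduced with a small filter. This is a sketch under simplifying assumptions: the names are hypothetical, a binlog offset is modelled as a single `long` (a real offset also involves the binlog file), and slice ranges are the closed ranges from the table.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the starting offset and the emit condition for binlog records. */
final class BinlogFilterSketch {

    static final class FinishedSplit {
        final long start;         // inclusive lower key bound
        final long end;           // inclusive upper key bound
        final long highWatermark; // maximum binlog offset read by this slice

        FinishedSplit(long start, long end, long highWatermark) {
            this.start = start;
            this.end = end;
            this.highWatermark = highWatermark;
        }
    }

    /** The incremental read starts at the smallest high watermark of all finished slices. */
    static long startingOffset(List<FinishedSplit> splits) {
        return splits.stream()
                .mapToLong(s -> s.highWatermark)
                .min()
                .orElseThrow(() -> new IllegalArgumentException("no finished splits"));
    }

    /** Emit only records newer than what the owning slice has already seen. */
    static boolean shouldEmit(List<FinishedSplit> splits, long key, long offset) {
        for (FinishedSplit split : splits) {
            if (key >= split.start && key <= split.end) {
                return offset > split.highWatermark;
            }
        }
        // Assumption of this sketch: keys outside every listed range are dropped.
        return false;
    }

    public static void main(String[] args) {
        List<FinishedSplit> splits = Arrays.asList(
                new FinishedSplit(1, 100, 1000),
                new FinishedSplit(101, 200, 800),
                new FinishedSplit(201, 300, 1500));
        System.out.println(startingOffset(splits));        // prints 800
        System.out.println(shouldEmit(splits, 123, 1500)); // prints true
        System.out.println(shouldEmit(splits, 50, 900));   // prints false
    }
}
```

The last call shows why the per-slice check matters: offset 900 is past the starting offset, but slice 0 already covers changes up to offset 1000, so emitting the record would duplicate data.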

3. Detailed code

The design of FLIP-27: Refactor Source Interface is not introduced in detail here. This section focuses on the flink-mysql-cdc interface calls and their implementation.

1. MySqlSourceEnumerator initialization

SourceCoordinator, the OperatorCoordinator implementation for sources, runs on the Master node. At startup it creates MySqlSourceEnumerator by calling MySqlParallelSource#createEnumerator and then calls its start method to do some initialization work.
[Figure: SourceCoordinator startup]

Create MySqlSourceEnumerator, use MySqlHybridSplitAssigner to slice the full + incremental data, and use MySqlValidator to verify the MySQL version and configuration.
MySqlValidator verification:
1. The MySQL version must be greater than or equal to 5.7.
2. The binlog_format configuration must be ROW.
3. The binlog_row_image configuration must be FULL.
MySqlSplitAssigner initialization:
1. Create ChunkSplitter to divide slices.
2. Select the names of the tables to be read.
Start the periodic scheduling thread, which asks each SourceReader to send the SourceEnumerator the information about slices that have finished but for which no ACK event has been sent yet.

private void syncWithReaders(int[] subtaskIds, Throwable t) {
    if (t != null) {
        throw new FlinkRuntimeException("Failed to list obtain registered readers due to:", t);
    }
    // when the SourceEnumerator restores or the communication failed between
    // SourceEnumerator and SourceReader, it may missed some notification event.
    // tell all SourceReader(s) to report there finished but unacked splits.
    if (splitAssigner.waitingForFinishedSplits()) {
        for (int subtaskId : subtaskIds) {
            // note: send FinishedSnapshotSplitsRequestEvent
            context.sendEventToSourceReader(
                    subtaskId, new FinishedSnapshotSplitsRequestEvent());
        }
    }
}
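The three MySqlValidator checks listed earlier in this step can be expressed as a pure function. The sketch below is hypothetical and is not the connector's validator: it assumes the server version string and the two variables have already been fetched (for example with SELECT VERSION() and SHOW VARIABLES) and only shows the comparison logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the version and binlog settings checks. */
final class MySqlSettingsCheckSketch {

    /** Returns a list of violated requirements; an empty list means the server is usable. */
    static List<String> validate(String version, Map<String, String> variables) {
        List<String> problems = new ArrayList<>();
        if (!isAtLeast57(version)) {
            problems.add("MySQL version must be >= 5.7 but is " + version);
        }
        if (!"ROW".equalsIgnoreCase(variables.get("binlog_format"))) {
            problems.add("binlog_format must be ROW");
        }
        if (!"FULL".equalsIgnoreCase(variables.get("binlog_row_image"))) {
            problems.add("binlog_row_image must be FULL");
        }
        return problems;
    }

    /** Parses the leading "major.minor" part of strings such as "5.7.30-log". */
    static boolean isAtLeast57(String version) {
        String[] parts = version.split("[.-]");
        int major = Integer.parseInt(parts[0]);
        int minor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        return major > 5 || (major == 5 && minor >= 7);
    }
}
```

Collecting all violations instead of failing on the first one makes a misconfigured server easier to fix in one pass; that choice belongs to this sketch and is not a statement about how MySqlValidator reports errors.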

2. MySqlSourceReader initialization

SourceOperator integrates SourceReader and interacts with SourceCoordinator through OperatorEventGateway.
[Figure: SourceOperator startup]

When SourceOperator is initialized, MySqlSourceReader is created through MySqlParallelSource. MySqlSourceReader creates Fetcher through SingleThreadFetcherManager to pull shard data, and the data is written to elementsQueue in MySqlRecords format.

MySqlParallelSource#createReader

public SourceReader<T, MySqlSplit> createReader(SourceReaderContext readerContext) throws Exception {
    // note: queue that stores the fetched records
    FutureCompletingBlockingQueue<RecordsWithSplitIds<SourceRecord>> elementsQueue =
            new FutureCompletingBlockingQueue<>();
    final Configuration readerConfiguration = getReaderConfig(readerContext);

    // note: factory (supplier) for the SplitReader
    Supplier<MySqlSplitReader> splitReaderSupplier =
            () -> new MySqlSplitReader(readerConfiguration, readerContext.getIndexOfSubtask());

    return new MySqlSourceReader<>(
            elementsQueue,
            splitReaderSupplier,
            new MySqlRecordEmitter<>(deserializationSchema),
            readerConfiguration,
            readerContext);
}

The created MySqlSourceReader is registered with SourceCoordinator by sending an event. After SourceCoordinator receives the registration event, it saves the reader's address and index.

SourceCoordinator#handleReaderRegistrationEvent
// note: SourceCoordinator handles the reader registration event
private void handleReaderRegistrationEvent(ReaderRegistrationEvent event) {
    context.registerSourceReader(new ReaderInfo(event.subtaskId(), event.location()));
    enumerator.addReader(event.subtaskId());
}

After MySqlSourceReader is started, it sends a split request event to MySqlSourceEnumerator to obtain the slices assigned to it.
After SourceOperator is initialized, emitNext is called: SourceReaderBase takes the record collection from the elementsQueue and passes it to MySqlRecordEmitter. Schematic diagram of the interface calls:

3. MySqlSourceEnumerator handles split requests

When MySqlSourceReader starts, it sends a RequestSplitEvent to MySqlSourceEnumerator and reads the data of the range described by the returned slice. The following describes how MySqlSourceEnumerator processes split requests in the full read phase, which ends with a MySqlSnapshotSplit being returned.

Process the split request event: allocate a slice to the requesting reader and deliver the MySqlSplit (MySqlSnapshotSplit in the full phase, MySqlBinlogSplit in the incremental phase) by sending an AddSplitEvent.

MySqlSourceEnumerator#handleSplitRequest
public void handleSplitRequest(int subtaskId, @Nullable String requesterHostname) {
    if (!context.registeredReaders().containsKey(subtaskId)) {
        // reader failed between sending the request and now. skip this request.
        return;
    }
    // note: store the reader's subtaskId in a TreeSet so that task-0 is preferred when the binlog split is assigned
    readersAwaitingSplit.add(subtaskId);

    assignSplits();
}

// note: assign slices
private void assignSplits() {
    final Iterator<Integer> awaitingReader = readersAwaitingSplit.iterator();
    while (awaitingReader.hasNext()) {
        int nextAwaiting = awaitingReader.next();
        // if the reader that requested another split has failed in the meantime, remove
        // it from the list of waiting readers
        if (!context.registeredReaders().containsKey(nextAwaiting)) {
            awaitingReader.remove();
            continue;
        }

        // note: the slice is assigned by MySqlSplitAssigner
        Optional<MySqlSplit> split = splitAssigner.getNext();
        if (split.isPresent()) {
            final MySqlSplit mySqlSplit = split.get();
            // note: send AddSplitEvent to return the slice information to the reader
            context.assignSplit(mySqlSplit, nextAwaiting);
            awaitingReader.remove();

            LOG.info("Assign split {} to subtask {}", mySqlSplit, nextAwaiting);
        } else {
            // there is no available splits by now, skip assigning
            break;
        }
    }
}

MySqlHybridSplitAssigner handles the logic for both the full slices and the incremental slice.
1. When the task has just started, remainingTables is not empty, noMoreSplits returns false, and SnapshotSplits are created.
2. After the slices of the full phase have all been read, noMoreSplits returns true and the BinlogSplit is created.

MySqlHybridSplitAssigner#getNext
@Override
public Optional<MySqlSplit> getNext() {
    if (snapshotSplitAssigner.noMoreSplits()) {
        // binlog split assigning
        if (isBinlogSplitAssigned) {
            // no more splits for the assigner
            return Optional.empty();
        } else if (snapshotSplitAssigner.isFinished()) {
            // we need to wait snapshot-assigner to be finished before
            // assigning the binlog split. Otherwise, records emitted from binlog split
            // might be out-of-order in terms of same primary key with snapshot splits.
            isBinlogSplitAssigned = true;

            // note: once the snapshot splits are finished, create the BinlogSplit
            return Optional.of(createBinlogSplit());
        } else {
            // binlog split is not ready by now
            return Optional.empty();
        }
    } else {
        // note: the SnapshotSplit is created by MySqlSnapshotSplitAssigner
        // snapshot assigner still have remaining splits, assign split from it
        return snapshotSplitAssigner.getNext();
    }
}

MySqlSnapshotSplitAssigner handles the full-phase slicing logic: it generates slices through ChunkSplitter, keeps them in remainingSplits, and hands them out one at a time through an iterator.

@Override
public Optional<MySqlSplit> getNext() {
    if (!remainingSplits.isEmpty()) {
        // return remaining splits firstly
        Iterator<MySqlSnapshotSplit> iterator = remainingSplits.iterator();
        MySqlSnapshotSplit split = iterator.next();
        iterator.remove();
        
        // note: store the assigned slice in the assignedSplits collection
        assignedSplits.put(split.splitId(), split);

        return Optional.of(split);
    } else {
        // note: during initialization, remainingTables holds the names of the tables to read
        TableId nextTable = remainingTables.pollFirst();
        if (nextTable != null) {
            // split the given table into chunks (snapshot splits)
            // note: ChunkSplitter was created during initialization; call generateSplits to divide the table
            Collection<MySqlSnapshotSplit> splits = chunkSplitter.generateSplits(nextTable);
            // note: keep all slice information
            remainingSplits.addAll(splits);
            // note: tables that have already been split
            alreadyProcessedTables.add(nextTable);
            // note: call this method recursively
            return getNext();
        } else {
            return Optional.empty();
        }
    }
}

ChunkSplitter divides the table into evenly distributed or unevenly distributed slices. The table read must contain a physical primary key.

public Collection<MySqlSnapshotSplit> generateSplits(TableId tableId) {

    Table schema = mySqlSchema.getTableSchema(tableId).getTable();
    List<Column> primaryKeys = schema.primaryKeyColumns();
    // note: the table must have a primary key
    if (primaryKeys.isEmpty()) {
        throw new ValidationException(
                String.format(
                        "Incremental snapshot for tables requires primary key,"
                                + " but table %s doesn't have primary key.",
                        tableId));
    }
    // use first field in primary key as the split key
    Column splitColumn = primaryKeys.get(0);

    final List<ChunkRange> chunks;
    try {
        // note: divide the data into multiple slices by the primary key column
        chunks = splitTableIntoChunks(tableId, splitColumn);
    } catch (SQLException e) {
        throw new FlinkRuntimeException("Failed to split chunks for table " + tableId, e);
    }
    // note: convert the primary key data type and wrap each ChunkRange into a MySqlSnapshotSplit
    // convert chunks into splits
    List<MySqlSnapshotSplit> splits = new ArrayList<>();
    RowType splitType = splitType(splitColumn);
 
    for (int i = 0; i < chunks.size(); i++) {
        ChunkRange chunk = chunks.get(i);
        MySqlSnapshotSplit split =
                createSnapshotSplit(
                        tableId, i, splitType, chunk.getChunkStart(), chunk.getChunkEnd());
        splits.add(split);
    }
    return splits;
}

splitTableIntoChunks divides the slices according to the physical primary key.

private List<ChunkRange> splitTableIntoChunks(TableId tableId, Column splitColumn)
        throws SQLException {
    final String splitColumnName = splitColumn.name();
    //  select min, max
    final Object[] minMaxOfSplitColumn = queryMinMax(jdbc, tableId, splitColumnName);
    final Object min = minMaxOfSplitColumn[0];
    final Object max = minMaxOfSplitColumn[1];
    if (min == null || max == null || min.equals(max)) {
        // empty table, or only one row, return full table scan as a chunk
        return Collections.singletonList(ChunkRange.all());
    }

    final List<ChunkRange> chunks;
    if (splitColumnEvenlyDistributed(splitColumn)) {
        // use evenly-sized chunks which is much efficient
        // note: divide evenly by primary key
        chunks = splitEvenlySizedChunks(min, max);
    } else {
        // note: divide unevenly by primary key
        // use unevenly-sized chunks which will request many queries and is not efficient.
        chunks = splitUnevenlySizedChunks(tableId, splitColumnName, min, max);
    }

    return chunks;
}

/** Checks whether split column is evenly distributed across its range. */
private static boolean splitColumnEvenlyDistributed(Column splitColumn) {
    // only column is auto-incremental are recognized as evenly distributed.
    // TODO: we may use MAX,MIN,COUNT to calculate the distribution in the future.
    if (splitColumn.isAutoIncremented()) {
        DataType flinkType = MySqlTypeUtils.fromDbzColumn(splitColumn);
        LogicalTypeRoot typeRoot = flinkType.getLogicalType().getTypeRoot();
        // currently, we only support split column with type BIGINT, INT, DECIMAL
        return typeRoot == LogicalTypeRoot.BIGINT
                || typeRoot == LogicalTypeRoot.INTEGER
                || typeRoot == LogicalTypeRoot.DECIMAL;
    } else {
        return false;
    }
}


/**
 * Split table into evenly sized chunks based on the numeric min and max value of split column,
 * and tumble chunks in {@link #chunkSize} step size.
 */
private List<ChunkRange> splitEvenlySizedChunks(Object min, Object max) {
    if (ObjectUtils.compare(ObjectUtils.plus(min, chunkSize), max) > 0) {
        // there is no more than one chunk, return full table as a chunk
        return Collections.singletonList(ChunkRange.all());
    }

    final List<ChunkRange> splits = new ArrayList<>();
    Object chunkStart = null;
    Object chunkEnd = ObjectUtils.plus(min, chunkSize);
    //  chunkEnd <= max
    while (ObjectUtils.compare(chunkEnd, max) <= 0) {
        splits.add(ChunkRange.of(chunkStart, chunkEnd));
        chunkStart = chunkEnd;
        chunkEnd = ObjectUtils.plus(chunkEnd, chunkSize);
    }
    // add the ending split
    splits.add(ChunkRange.of(chunkStart, null));
    return splits;
}

/** Split table into unevenly sized chunks by continuously calculating next chunk max value. */
private List<ChunkRange> splitUnevenlySizedChunks(
        TableId tableId, String splitColumnName, Object min, Object max) throws SQLException {
    final List<ChunkRange> splits = new ArrayList<>();
    Object chunkStart = null;

    Object chunkEnd = nextChunkEnd(min, tableId, splitColumnName, max);
    int count = 0;
    while (chunkEnd != null && ObjectUtils.compare(chunkEnd, max) <= 0) {
        // we start from [null, min + chunk_size) and avoid [null, min)
        splits.add(ChunkRange.of(chunkStart, chunkEnd));
        // may sleep a while to avoid DDOS on MySQL server
        maySleep(count++);
        chunkStart = chunkEnd;
        chunkEnd = nextChunkEnd(chunkEnd, tableId, splitColumnName, max);
    }
    // add the ending split
    splits.add(ChunkRange.of(chunkStart, null));
    return splits;
}

private Object nextChunkEnd(
        Object previousChunkEnd, TableId tableId, String splitColumnName, Object max)
        throws SQLException {
    // chunk end might be null when max values are removed
    Object chunkEnd =
            queryNextChunkMax(jdbc, tableId, splitColumnName, chunkSize, previousChunkEnd);
    if (Objects.equals(previousChunkEnd, chunkEnd)) {
        // we don't allow equal chunk start and end,
        // should query the next one larger than chunkEnd
        chunkEnd = queryMin(jdbc, tableId, splitColumnName, chunkEnd);
    }
    if (ObjectUtils.compare(chunkEnd, max) >= 0) {
        return null;
    } else {
        return chunkEnd;
    }
}

4. MySqlSourceReader handles the slice allocation request

[Figure: MySqlSourceReader handling a slice allocation request]
After MySqlSourceReader receives the slice allocation request, it first creates a SplitFetcher thread and adds an AddSplitsTask to the taskQueue, which handles adding the slices. It then executes FetchTask, which reads the data through the Debezium API and stores it in the elementsQueue. SourceReaderBase takes data from this queue and passes it to MySqlRecordEmitter.

When processing slice allocation events, create SplitFetcher and add AddSplitsTask to taskQueue.

SingleThreadFetcherManager#addSplits
public void addSplits(List<SplitT> splitsToAdd) {
    SplitFetcher<E, SplitT> fetcher = getRunningFetcher();
    if (fetcher == null) {
        fetcher = createSplitFetcher();
        // Add the splits to the fetchers.
        fetcher.addSplits(splitsToAdd);
        startFetcher(fetcher);
    } else {
        fetcher.addSplits(splitsToAdd);
    }
}

// Create the SplitFetcher
protected synchronized SplitFetcher<E, SplitT> createSplitFetcher() {
    if (closed) {
        throw new IllegalStateException("The split fetcher manager has closed.");
    }
    // Create SplitReader.
    SplitReader<E, SplitT> splitReader = splitReaderFactory.get();

    int fetcherId = fetcherIdGenerator.getAndIncrement();
    SplitFetcher<E, SplitT> splitFetcher =
            new SplitFetcher<>(
                    fetcherId,
                    elementsQueue,
                    splitReader,
                    errorHandler,
                    () -> {
                        fetchers.remove(fetcherId);
                        elementsQueue.notifyAvailable();
                    });
    fetchers.put(fetcherId, splitFetcher);
    return splitFetcher;
}

// SplitFetcher#addSplits: enqueue an AddSplitsTask and wake up the fetcher
public void addSplits(List<SplitT> splitsToAdd) {
    enqueueTask(new AddSplitsTask<>(splitReader, splitsToAdd, assignedSplits));
    wakeUp(true);
}

Run the SplitFetcher thread: the first iteration executes the AddSplitsTask to add the slices, and later iterations execute the FetchTask to pull data.

SplitFetcher#runOnce
void runOnce() {
    try {
        if (shouldRunFetchTask()) {
            runningTask = fetchTask;
        } else {
            runningTask = taskQueue.take();
        }
        
        if (!wakeUp.get() && runningTask.run()) {
            LOG.debug("Finished running task {}", runningTask);
            runningTask = null;
            checkAndSetIdle();
        }
    } catch (Exception e) {
        throw new RuntimeException(
                String.format(
                        "SplitFetcher thread %d received unexpected exception while polling the records",
                        id),
                e);
    }

    maybeEnqueueTask(runningTask);
    synchronized (wakeUp) {
        // Set the running task to null. It is necessary for the shutdown method to avoid
        // unnecessarily interrupt the running task.
        runningTask = null;
        // Set the wakeUp flag to false.
        wakeUp.set(false);
        LOG.debug("Cleaned wakeup flag.");
    }
}

AddSplitsTask calls the handleSplitsChanges method of MySqlSplitReader to add the allocated slice information to the slice queue. On the next fetch() call, get the slice from the queue and read the slice data.

AddSplitsTask#run
public boolean run() {
    for (SplitT s : splitsToAdd) {
        assignedSplits.put(s.splitId(), s);
    }
    splitReader.handleSplitsChanges(new SplitsAddition<>(splitsToAdd));
    return true;
}

MySqlSplitReader#handleSplitsChanges
public void handleSplitsChanges(SplitsChange<MySqlSplit> splitsChanges) {
    if (!(splitsChanges instanceof SplitsAddition)) {
        throw new UnsupportedOperationException(
                String.format(
                        "The SplitChange type of %s is not supported.",
                        splitsChanges.getClass()));
    }

    // note: add the slices to the queue
    splits.addAll(splitsChanges.splits());
}

MySqlSplitReader executes fetch(): the DebeziumReader reads data into the event queue, and after the data has been corrected it is returned in MySqlRecords format.

MySqlSplitReader#fetch
@Override
public RecordsWithSplitIds<SourceRecord> fetch() throws IOException {
    // note: create the reader and read the data
    checkSplitOrStartNext();

    Iterator<SourceRecord> dataIt = null;
    try {
        // note: correct the data that was read
        dataIt = currentReader.pollSplitRecords();
    } catch (InterruptedException e) {
        LOG.warn("fetch data failed.", e);
        throw new IOException(e);
    }

    // note: the returned data is wrapped as MySqlRecords for transfer
    return dataIt == null
            ? finishedSnapshotSplit()   
            : MySqlRecords.forRecords(currentSplitId, dataIt);
}

private void checkSplitOrStartNext() throws IOException {
    // the binlog reader should keep alive
    if (currentReader instanceof BinlogSplitReader) {
        return;
    }

    if (canAssignNextSplit()) {
        // note: take a MySqlSplit from the slice queue
        final MySqlSplit nextSplit = splits.poll();
        if (nextSplit == null) {
            throw new IOException("Cannot fetch from another split - no split remaining");
        }

        currentSplitId = nextSplit.splitId();
        // note: distinguish between full slice reading and incremental slice reading
        if (nextSplit.isSnapshotSplit()) {
            if (currentReader == null) {
                final MySqlConnection jdbcConnection = getConnection(config);
                final BinaryLogClient binaryLogClient = getBinaryClient(config);

                final StatefulTaskContext statefulTaskContext =
                        new StatefulTaskContext(config, binaryLogClient, jdbcConnection);
                // note: create SnapshotSplitReader, which uses the Debezium API to read the assigned data and the binlog of the interval
                currentReader = new SnapshotSplitReader(statefulTaskContext, subtaskId);
            }

        } else {
            // point from snapshot split to binlog split
            if (currentReader != null) {
                LOG.info("It's turn to read binlog split, close current snapshot reader");
                currentReader.close();
            }

            final MySqlConnection jdbcConnection = getConnection(config);
            final BinaryLogClient binaryLogClient = getBinaryClient(config);
            final StatefulTaskContext statefulTaskContext =
                    new StatefulTaskContext(config, binaryLogClient, jdbcConnection);
            LOG.info("Create binlog reader");
            // note: create a BinlogSplitReader, which uses the Debezium API for incremental reading
            currentReader = new BinlogSplitReader(statefulTaskContext, subtaskId);
        }
        // note: run the reader to read the data
        currentReader.submitSplit(nextSplit);
    }
}
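
The reader switch in checkSplitOrStartNext can be pictured as a small state machine: one snapshot reader is reused for every snapshot split, and it is replaced by a binlog reader that then stays alive. The following is only a sketch under simplifying assumptions. The class ReaderSwitchSketch, the ReaderKind enum, and the convention that snapshot split IDs start with "snapshot" are all hypothetical, and unlike the connector code it returns null instead of throwing when the queue is empty.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch of the reader-switching logic; not the connector's classes.
final class ReaderSwitchSketch {
    enum ReaderKind { NONE, SNAPSHOT, BINLOG }

    final Queue<String> splits = new ArrayDeque<>();
    ReaderKind current = ReaderKind.NONE;
    int readersCreated = 0;

    /** Returns the ID of the split that was submitted, or null if nothing was submitted. */
    String startNext() {
        if (current == ReaderKind.BINLOG) {
            return null; // the binlog reader keeps running, no new split is taken
        }
        String next = splits.poll();
        if (next == null) {
            return null; // simplification: the connector throws an IOException here
        }
        if (next.startsWith("snapshot")) { // assumed ID convention for this sketch
            if (current == ReaderKind.NONE) {
                current = ReaderKind.SNAPSHOT; // create the snapshot reader once, then reuse it
                readersCreated++;
            }
        } else {
            current = ReaderKind.BINLOG; // the snapshot reader is closed and replaced
            readersCreated++;
        }
        return next;
    }

    public static void main(String[] args) {
        ReaderSwitchSketch reader = new ReaderSwitchSketch();
        reader.splits.add("snapshot-0");
        reader.splits.add("snapshot-1");
        reader.splits.add("binlog-split");
        for (int i = 0; i < 4; i++) {
            System.out.println(reader.startNext() + " handled by " + reader.current);
        }
    }
}
```

In this sketch only two readers are ever created for the three splits, which mirrors the reuse of currentReader in the listing above.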

5. DebeziumReader data processing

DebeziumReader covers two stages: full slice reading and incremental slice reading. The data it reads is stored in the ChangeEventQueue and is corrected when pollSplitRecords is executed.

SnapshotSplitReader full slice reading. In the full phase, the table data within the slice range is queried with a SELECT statement. Before and after the slice data is written to the queue, SHOW MASTER STATUS is executed and the current binlog offset is written as well.

public void submitSplit(MySqlSplit mySqlSplit) {
    ......
    executor.submit(
            () -> {
                try {
                    currentTaskRunning = true;
                    // note: read the data, inserting the current binlog offset before and after it
                    // 1. execute snapshot read task.
                    final SnapshotSplitChangeEventSourceContextImpl sourceContext =
                            new SnapshotSplitChangeEventSourceContextImpl();
                    SnapshotResult snapshotResult =
                            splitSnapshotReadTask.execute(sourceContext);

                    // note: prepare for the incremental read; this split carries the starting offset
                    final MySqlBinlogSplit appendBinlogSplit = createBinlogSplit(sourceContext);
                    final MySqlOffsetContext mySqlOffsetContext =
                            statefulTaskContext.getOffsetContext();
                    mySqlOffsetContext.setBinlogStartPoint(
                            appendBinlogSplit.getStartingOffset().getFilename(),
                            appendBinlogSplit.getStartingOffset().getPosition());

                    // note: start reading from the starting offset
                    // 2. execute binlog read task
                    if (snapshotResult.isCompletedOrSkipped()) {
                        // we should only capture events for the current table,
                        Configuration dezConf =
                                statefulTaskContext
                                        .getDezConf()
                                        .edit()
                                        .with(
                                                "table.whitelist",
                                                currentSnapshotSplit.getTableId())
                                        .build();

                        // task to read binlog for current split
                        MySqlBinlogSplitReadTask splitBinlogReadTask =
                                new MySqlBinlogSplitReadTask(
                                        new MySqlConnectorConfig(dezConf),
                                        mySqlOffsetContext,
                                        statefulTaskContext.getConnection(),
                                        statefulTaskContext.getDispatcher(),
                                        statefulTaskContext.getErrorHandler(),
                                        StatefulTaskContext.getClock(),
                                        statefulTaskContext.getTaskContext(),
                                        (MySqlStreamingChangeEventSourceMetrics)
                                                statefulTaskContext
                                                        .getStreamingChangeEventSourceMetrics(),
                                        statefulTaskContext
                                                .getTopicSelector()
                                                .getPrimaryTopic(),
                                        appendBinlogSplit);

                        splitBinlogReadTask.execute(
                                new SnapshotBinlogSplitChangeEventSourceContextImpl());
                    } else {
                        readException =
                                new IllegalStateException(
                                        String.format(
                                                "Read snapshot for mysql split %s fail",
                                                currentSnapshotSplit));
                    }
                } catch (Exception e) {
                    currentTaskRunning = false;
                    LOG.error(
                            String.format(
                                    "Execute snapshot read task for mysql split %s fail",
                                    currentSnapshotSplit),
                            e);
                    readException = e;
                }
            });
}

SnapshotSplitReader incremental slice reading. The key point of the incremental read for a slice is deciding when BinlogSplitReadTask stops: it terminates once it has read up to the offset recorded at the end of the slice's full phase.

MySqlBinlogSplitReadTask#handleEvent
protected void handleEvent(Event event) {
    // note: dispatch the event to the queue
    super.handleEvent(event);
    // note: in the full read phase the binlog read must be terminated
    // check do we need to stop for read binlog for snapshot split.
    if (isBoundedRead()) {
        final BinlogOffset currentBinlogOffset =
                new BinlogOffset(
                        offsetContext.getOffset().get(BINLOG_FILENAME_OFFSET_KEY).toString(),
                        Long.parseLong(
                                offsetContext
                                        .getOffset()
                                        .get(BINLOG_POSITION_OFFSET_KEY)
                                        .toString()));
        // note: stop reading when currentBinlogOffset > HW
        // reach the high watermark, the binlog reader should finished
        if (currentBinlogOffset.isAtOrBefore(binlogSplit.getEndingOffset())) {
            // send binlog end event
            try {
                signalEventDispatcher.dispatchWatermarkEvent(
                        binlogSplit,
                        currentBinlogOffset,
                        SignalEventDispatcher.WatermarkKind.BINLOG_END);
            } catch (InterruptedException e) {
                logger.error("Send signal event error.", e);
                errorHandler.setProducerThrowable(
                        new DebeziumException("Error processing binlog signal event", e));
            }
            // note: terminate the binlog read
            // tell reader the binlog task finished
            ((SnapshotBinlogSplitChangeEventSourceContextImpl) context).finished();
        }
    }
}
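
The stop condition above amounts to comparing (binlog file name, position) pairs against the slice's ending offset. Below is a minimal sketch of a bounded read, not the connector's BinlogOffset class. The Offset type and the readUntil helper are hypothetical, and comparing file names as strings assumes zero-padded names such as mysql-bin.000003.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a bounded binlog read that stops at a high watermark.
final class BoundedBinlogReadSketch {
    /** Stand-in for a binlog offset: file name plus byte position. */
    static final class Offset implements Comparable<Offset> {
        final String file;
        final long pos;

        Offset(String file, long pos) {
            this.file = file;
            this.pos = pos;
        }

        @Override
        public int compareTo(Offset other) {
            int byFile = file.compareTo(other.file); // assumes zero-padded file names
            return byFile != 0 ? byFile : Long.compare(pos, other.pos);
        }
    }

    /** Forwards events in order and stops after the first one at or past the high watermark. */
    static List<Offset> readUntil(List<Offset> events, Offset highWatermark) {
        List<Offset> forwarded = new ArrayList<>();
        for (Offset event : events) {
            forwarded.add(event); // the event is dispatched first, as in handleEvent
            if (event.compareTo(highWatermark) >= 0) {
                break; // high watermark reached: the bounded read is finished
            }
        }
        return forwarded;
    }

    public static void main(String[] args) {
        List<Offset> events = List.of(
                new Offset("mysql-bin.000003", 120),
                new Offset("mysql-bin.000003", 450),
                new Offset("mysql-bin.000003", 900));
        Offset highWatermark = new Offset("mysql-bin.000003", 450);
        System.out.println(readUntil(events, highWatermark).size()); // prints 2
    }
}
```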

SnapshotSplitReader corrects the raw data in the queue when pollSplitRecords is executed. For the specific processing logic, see RecordUtils#normalizedSplitRecords.

public Iterator<SourceRecord> pollSplitRecords() throws InterruptedException {
    if (hasNextElement.get()) {
        // data input: [low watermark event][snapshot events][high watermark event][binlogevents][binlog-end event]
        // data output: [low watermark event][normalized events][high watermark event]
        boolean reachBinlogEnd = false;
        final List<SourceRecord> sourceRecords = new ArrayList<>();
        while (!reachBinlogEnd) {
            // note: process the DataChangeEvent events written to the queue
            List<DataChangeEvent> batch = queue.poll();
            for (DataChangeEvent event : batch) {
                sourceRecords.add(event.getRecord());
                if (RecordUtils.isEndWatermarkEvent(event.getRecord())) {
                    reachBinlogEnd = true;
                    break;
                }
            }
        }
        // snapshot split return its data once
        hasNextElement.set(false);
        //  ************   correct the data  ***********
        return normalizedSplitRecords(currentSnapshotSplit, sourceRecords, nameAdjuster)
                .iterator();
    }
    // the data has been polled, no more data
    reachEnd.compareAndSet(false, true);
    return null;
}
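
The idea behind the correction step can be illustrated by replaying the binlog events captured between the low and high watermarks on top of the snapshot rows of the slice. This is a simplified sketch, not a reproduction of RecordUtils#normalizedSplitRecords. The NormalizeSplitSketch class, the Change type, the integer keys, and the half-open key range are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: apply binlog changes to the snapshot rows of one slice.
final class NormalizeSplitSketch {
    enum Op { CREATE, UPDATE, DELETE }

    static final class Change {
        final Op op;
        final int key;
        final String value;

        Change(Op op, int key, String value) {
            this.op = op;
            this.key = key;
            this.value = value;
        }
    }

    /** Returns the slice contents as of the high watermark; the input snapshot is not modified. */
    static Map<Integer, String> normalize(
            Map<Integer, String> snapshot, List<Change> binlog, int start, int end) {
        Map<Integer, String> result = new LinkedHashMap<>(snapshot);
        for (Change change : binlog) {
            if (change.key < start || change.key >= end) {
                continue; // the change belongs to another slice
            }
            if (change.op == Op.DELETE) {
                result.remove(change.key);
            } else {
                result.put(change.key, change.value); // CREATE and UPDATE both upsert
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> snapshot = new LinkedHashMap<>();
        snapshot.put(1, "a");
        snapshot.put(2, "b");
        List<Change> binlog = List.of(
                new Change(Op.UPDATE, 2, "b2"),
                new Change(Op.DELETE, 1, null),
                new Change(Op.CREATE, 3, "c"),
                new Change(Op.CREATE, 2000, "x")); // outside [0, 1024), ignored
        System.out.println(normalize(snapshot, binlog, 0, 1024)); // prints {2=b2, 3=c}
    }
}
```

The result is what a consistent read of the slice at the high watermark would have returned, which is why the output stream only needs the normalized events between the two watermark events.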

BinlogSplitReader data read. The reading logic is relatively simple; the key point is the starting offset, which is set to the minimum HW of all slices.
When BinlogSplitReader executes pollSplitRecords, it corrects the raw data in the queue to ensure data consistency. Binlog reading in the incremental phase is unbounded, so every event is sent to the event queue, and BinlogSplitReader uses shouldEmit() to decide whether each record is emitted.

BinlogSplitReader#pollSplitRecords
public Iterator<SourceRecord> pollSplitRecords() throws InterruptedException {
    checkReadException();
    final List<SourceRecord> sourceRecords = new ArrayList<>();
    if (currentTaskRunning) {
        List<DataChangeEvent> batch = queue.poll();
        for (DataChangeEvent event : batch) {
            if (shouldEmit(event.getRecord())) {
                sourceRecords.add(event.getRecord());
            }
        }
    }
    return sourceRecords.iterator();
}

Conditions for emitting an event:

The position of the newly received event is greater than maxwm;
The key of the record belongs to a snapshot split and the event offset is greater than that split's HWM.

/**
 *
 * Returns the record should emit or not.
 *
 * <p>The watermark signal algorithm is the binlog split reader only sends the binlog event that
 * belongs to its finished snapshot splits. For each snapshot split, the binlog event is valid
 * since the offset is after its high watermark.
 *
 * <pre> E.g: the data input is :
 *    snapshot-split-0 info : [0,    1024) highWatermark0
 *    snapshot-split-1 info : [1024, 2048) highWatermark1
 *  the data output is:
 *  only the binlog event belong to [0,    1024) and offset is after highWatermark0 should send,
 *  only the binlog event belong to [1024, 2048) and offset is after highWatermark1 should send.
 * </pre>
 */
private boolean shouldEmit(SourceRecord sourceRecord) {
    if (isDataChangeRecord(sourceRecord)) {
        TableId tableId = getTableId(sourceRecord);
        BinlogOffset position = getBinlogPosition(sourceRecord);
        // aligned, all snapshot splits of the table has reached max highWatermark
       
        // note: the position of the new event is greater than maxwm, emit it directly
        if (position.isAtOrBefore(maxSplitHighWatermarkMap.get(tableId))) {
            return true;
        }
        Object[] key =
                getSplitKey(
                        currentBinlogSplit.getSplitKeyType(),
                        sourceRecord,
                        statefulTaskContext.getSchemaNameAdjuster());

        for (FinishedSnapshotSplitInfo splitInfo : finishedSplitsInfo.get(tableId)) {
            /**
             *  note: the record's key belongs to a snapshot split and its offset is greater than that split's HWM, emit it
             */
            if (RecordUtils.splitKeyRangeContains(
                            key, splitInfo.getSplitStart(), splitInfo.getSplitEnd())
                    && position.isAtOrBefore(splitInfo.getHighWatermark())) {
                return true;
            }
        }
        // not in the monitored splits scope, do not emit
        return false;
    }

    // always send the schema change event and signal event
    // we need record them to state of Flink
    return true;
}
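
The two emit conditions can be checked with a small worked example that uses the key ranges from the Javadoc above. This is a sketch under stated assumptions: ShouldEmitSketch and SplitInfo are hypothetical, offsets are reduced to plain long values, keys are integers, and the boundary is treated as strictly greater than, following the prose description.

```java
import java.util.List;

// Hypothetical sketch of the emit decision for a binlog event.
final class ShouldEmitSketch {
    /** Finished snapshot split: key range [start, end) and its high watermark. */
    static final class SplitInfo {
        final int start;
        final int end;
        final long highWatermark;

        SplitInfo(int start, int end, long highWatermark) {
            this.start = start;
            this.end = end;
            this.highWatermark = highWatermark;
        }
    }

    static boolean shouldEmit(int key, long offset, long maxHighWatermark, List<SplitInfo> finished) {
        if (offset > maxHighWatermark) {
            return true; // past every split's high watermark: always emit
        }
        for (SplitInfo split : finished) {
            boolean inRange = key >= split.start && key < split.end;
            if (inRange && offset > split.highWatermark) {
                return true; // the owning split has not seen this change yet
            }
        }
        return false; // already covered by the snapshot phase, or not in a monitored split
    }

    public static void main(String[] args) {
        List<SplitInfo> finished = List.of(
                new SplitInfo(0, 1024, 100L),     // snapshot-split-0, highWatermark0 = 100
                new SplitInfo(1024, 2048, 200L)); // snapshot-split-1, highWatermark1 = 200
        long maxHighWatermark = 200L;
        System.out.println(shouldEmit(10, 150L, maxHighWatermark, finished));   // true
        System.out.println(shouldEmit(1500, 150L, maxHighWatermark, finished)); // false
        System.out.println(shouldEmit(1500, 250L, maxHighWatermark, finished)); // true
    }
}
```

An event for key 1500 at offset 150 is dropped because snapshot-split-1 already absorbed every change up to offset 200 during its own correction step.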

6. MySqlRecordEmitter data delivery

SourceReaderBase takes from the queue the collection of DataChangeEvent data read for a slice, and the data is converted from Debezium's DataChangeEvent type to Flink's RowData type.

SourceReaderBase processes the slice data as follows.

org.apache.flink.connector.base.source.reader.SourceReaderBase#pollNext
public InputStatus pollNext(ReaderOutput<T> output) throws Exception {
    // make sure we have a fetch we are working on, or move to the next
    RecordsWithSplitIds<E> recordsWithSplitId = this.currentFetch;
    if (recordsWithSplitId == null) {
        recordsWithSplitId = getNextFetch(output);
        if (recordsWithSplitId == null) {
            return trace(finishedOrAvailableLater());
        }
    }

    // we need to loop here, because we may have to go across splits
    while (true) {
        // Process one record.
        // note: read a single record from the iterator through MySqlRecords
        final E record = recordsWithSplitId.nextRecordFromSplit();
        if (record != null) {
            // emit the record.
            recordEmitter.emitRecord(record, currentSplitOutput, currentSplitContext.state);
            LOG.trace("Emitted record: {}", record);

            // We always emit MORE_AVAILABLE here, even though we do not strictly know whether
            // more is available. If nothing more is available, the next invocation will find
            // this out and return the correct status.
            // That means we emit the occasional 'false positive' for availability, but this
            // saves us doing checks for every record. Ultimately, this is cheaper.
            return trace(InputStatus.MORE_AVAILABLE);
        } else if (!moveToNextSplit(recordsWithSplitId, output)) {
            // The fetch is done and we just discovered that and have not emitted anything, yet.
            // We need to move to the next fetch. As a shortcut, we call pollNext() here again,
            // rather than emitting nothing and waiting for the caller to call us again.
            return pollNext(output);
        }
        // else fall through the loop
    }
}

private RecordsWithSplitIds<E> getNextFetch(final ReaderOutput<T> output) {
    splitFetcherManager.checkErrors();

    LOG.trace("Getting next source data batch from queue");
    // note: fetch data from elementsQueue
    final RecordsWithSplitIds<E> recordsWithSplitId = elementsQueue.poll();
    if (recordsWithSplitId == null || !moveToNextSplit(recordsWithSplitId, output)) {
        return null;
    }

    currentFetch = recordsWithSplitId;
    return recordsWithSplitId;
}

MySqlRecords returns the records of the collection one at a time.

com.ververica.cdc.connectors.mysql.source.split.MySqlRecords#nextRecordFromSplit

public SourceRecord nextRecordFromSplit() {
    final Iterator<SourceRecord> recordsForSplit = this.recordsForCurrentSplit;
    if (recordsForSplit != null) {
        if (recordsForSplit.hasNext()) {
            return recordsForSplit.next();
        } else {
            return null;
        }
    } else {
        throw new IllegalStateException();
    }
}

MySqlRecordEmitter converts the data to RowData through RowDataDebeziumDeserializeSchema.

com.ververica.cdc.connectors.mysql.source.reader.MySqlRecordEmitter#emitRecord
public void emitRecord(SourceRecord element, SourceOutput<T> output, MySqlSplitState splitState)
        throws Exception {
    if (isWatermarkEvent(element)) {
        BinlogOffset watermark = getWatermark(element);
        if (isHighWatermarkEvent(element) && splitState.isSnapshotSplitState()) {
            splitState.asSnapshotSplitState().setHighWatermark(watermark);
        }
    } else if (isSchemaChangeEvent(element) && splitState.isBinlogSplitState()) {
        HistoryRecord historyRecord = getHistoryRecord(element);
        Array tableChanges =
                historyRecord.document().getArray(HistoryRecord.Fields.TABLE_CHANGES);
        TableChanges changes = TABLE_CHANGE_SERIALIZER.deserialize(tableChanges, true);
        for (TableChanges.TableChange tableChange : changes) {
            splitState.asBinlogSplitState().recordSchema(tableChange.getId(), tableChange);
        }
    } else if (isDataChangeRecord(element)) {
        // note: handle the data record
        if (splitState.isBinlogSplitState()) {
            BinlogOffset position = getBinlogPosition(element);
            splitState.asBinlogSplitState().setStartingOffset(position);
        }
        debeziumDeserializationSchema.deserialize(
                element,
                new Collector<T>() {
                    @Override
                    public void collect(final T t) {
                        output.collect(t);
                    }

                    @Override
                    public void close() {
                        // do nothing
                    }
                });
    } else {
        // unknown element
        LOG.info("Meet unknown element {}, just skip.", element);
    }
}

RowDataDebeziumDeserializeSchema deserialization process.

com.ververica.cdc.debezium.table.RowDataDebeziumDeserializeSchema#deserialize
public void deserialize(SourceRecord record, Collector<RowData> out) throws Exception {
    Envelope.Operation op = Envelope.operationFor(record);
    Struct value = (Struct) record.value();
    Schema valueSchema = record.valueSchema();
    if (op == Envelope.Operation.CREATE || op == Envelope.Operation.READ) {
        GenericRowData insert = extractAfterRow(value, valueSchema);
        validator.validate(insert, RowKind.INSERT);
        insert.setRowKind(RowKind.INSERT);
        out.collect(insert);
    } else if (op == Envelope.Operation.DELETE) {
        GenericRowData delete = extractBeforeRow(value, valueSchema);
        validator.validate(delete, RowKind.DELETE);
        delete.setRowKind(RowKind.DELETE);
        out.collect(delete);
    } else {
        GenericRowData before = extractBeforeRow(value, valueSchema);
        validator.validate(before, RowKind.UPDATE_BEFORE);
        before.setRowKind(RowKind.UPDATE_BEFORE);
        out.collect(before);

        GenericRowData after = extractAfterRow(value, valueSchema);
        validator.validate(after, RowKind.UPDATE_AFTER);
        after.setRowKind(RowKind.UPDATE_AFTER);
        out.collect(after);
    }
}
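
The branching in deserialize boils down to a fixed mapping from the Debezium operation to one or two row kinds, which is also what the RowKind values observed in the case study reflect. The sketch below restates that mapping with stand-in enums. Op and Kind are hypothetical substitutes for Envelope.Operation and RowKind, so that the example runs without Flink or Debezium on the classpath.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the operation-to-row-kind mapping used during deserialization.
final class RowKindMappingSketch {
    enum Op { CREATE, READ, UPDATE, DELETE }

    enum Kind { INSERT, UPDATE_BEFORE, UPDATE_AFTER, DELETE }

    static List<Kind> kindsFor(Op op) {
        switch (op) {
            case CREATE:
            case READ: // snapshot rows are read events and also become inserts
                return Collections.singletonList(Kind.INSERT);
            case DELETE:
                return Collections.singletonList(Kind.DELETE);
            default: // an update is split into a before row and an after row
                return Arrays.asList(Kind.UPDATE_BEFORE, Kind.UPDATE_AFTER);
        }
    }

    public static void main(String[] args) {
        for (Op op : Op.values()) {
            System.out.println(op + " -> " + kindsFor(op));
        }
    }
}
```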

7. MySqlSourceReader reports the slice read completion event

After MySqlSourceReader has finished processing a full slice, it sends the finished slice information, including the slice ID and HighWatermark, to MySqlSourceEnumerator, and then continues to send slice requests.

com.ververica.cdc.connectors.mysql.source.reader.MySqlSourceReader#onSplitFinished
protected void onSplitFinished(Map<String, MySqlSplitState> finishedSplitIds) {
    for (MySqlSplitState mySqlSplitState : finishedSplitIds.values()) {
        MySqlSplit mySqlSplit = mySqlSplitState.toMySqlSplit();

        finishedUnackedSplits.put(mySqlSplit.splitId(), mySqlSplit.asSnapshotSplit());
    }
    // note: send the split-finished event
    reportFinishedSnapshotSplitsIfNeed();

    // note: after the previous split is finished, continue to send a split request
    context.sendSplitRequest();
}

private void reportFinishedSnapshotSplitsIfNeed() {
    if (!finishedUnackedSplits.isEmpty()) {
        final Map<String, BinlogOffset> finishedOffsets = new HashMap<>();
        for (MySqlSnapshotSplit split : finishedUnackedSplits.values()) {
            // note: send the split ID and its high watermark offset
            finishedOffsets.put(split.splitId(), split.getHighWatermark());
        }
        FinishedSnapshotSplitsReportEvent reportEvent =
                new FinishedSnapshotSplitsReportEvent(finishedOffsets);

        context.sendSourceEventToCoordinator(reportEvent);
        LOG.debug(
                "The subtask {} reports offsets of finished snapshot splits {}.",
                subtaskId,
                finishedOffsets);
    }
}

8. MySqlSourceEnumerator allocates incremental slices

After all slices in the full phase have been read, MySqlHybridSplitAssigner creates a BinlogSplit for the subsequent incremental reading. When creating the BinlogSplit, it selects the minimum BinlogOffset among all finished full slices. Note: in the 2.0.0 branch, the minimum offset in createBinlogSplit always starts from 0. This bug has been fixed on the latest master branch.

private MySqlBinlogSplit createBinlogSplit() {
    final List<MySqlSnapshotSplit> assignedSnapshotSplit =
            snapshotSplitAssigner.getAssignedSplits().values().stream()
                    .sorted(Comparator.comparing(MySqlSplit::splitId))
                    .collect(Collectors.toList());

    Map<String, BinlogOffset> splitFinishedOffsets =
            snapshotSplitAssigner.getSplitFinishedOffsets();
    final List<FinishedSnapshotSplitInfo> finishedSnapshotSplitInfos = new ArrayList<>();
    final Map<TableId, TableChanges.TableChange> tableSchemas = new HashMap<>();

    BinlogOffset minBinlogOffset = null;
    // note: select the minimum offset among all assignedSnapshotSplit
    for (MySqlSnapshotSplit split : assignedSnapshotSplit) {
        // find the min binlog offset
        BinlogOffset binlogOffset = splitFinishedOffsets.get(split.splitId());
        if (minBinlogOffset == null || binlogOffset.compareTo(minBinlogOffset) < 0) {
            minBinlogOffset = binlogOffset;
        }
        finishedSnapshotSplitInfos.add(
                new FinishedSnapshotSplitInfo(
                        split.getTableId(),
                        split.splitId(),
                        split.getSplitStart(),
                        split.getSplitEnd(),
                        binlogOffset));
        tableSchemas.putAll(split.getTableSchemas());
    }

    final MySqlSnapshotSplit lastSnapshotSplit =
            assignedSnapshotSplit.get(assignedSnapshotSplit.size() - 1).asSnapshotSplit();
       
    return new MySqlBinlogSplit(
            BINLOG_SPLIT_ID,
            lastSnapshotSplit.getSplitKeyType(),
            minBinlogOffset == null ? BinlogOffset.INITIAL_OFFSET : minBinlogOffset,
            BinlogOffset.NO_STOPPING_OFFSET,
            finishedSnapshotSplitInfos,
            tableSchemas);
}
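
Choosing the minimum offset means the binlog reader starts early enough that no slice misses a change; events a slice has already absorbed are filtered out later by shouldEmit. A minimal sketch of the selection, assuming offsets are reduced to long values and using a hypothetical INITIAL_OFFSET constant as the fallback when no slice has finished:

```java
import java.util.Map;

// Hypothetical sketch: pick the starting offset of the binlog split.
final class MinOffsetSketch {
    static final long INITIAL_OFFSET = 0L; // assumed fallback, for illustration only

    /** Returns the smallest high watermark among the finished slices. */
    static long startingOffset(Map<String, Long> splitFinishedOffsets) {
        return splitFinishedOffsets.values().stream()
                .min(Long::compare)
                .orElse(INITIAL_OFFSET);
    }

    public static void main(String[] args) {
        Map<String, Long> finished = Map.of(
                "demo_orders:0", 300L,
                "demo_orders:1", 120L,
                "demo_orders:2", 450L);
        System.out.println(startingOffset(finished)); // prints 120
    }
}
```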
