Data volumes are growing rapidly, and more and more application scenarios demand timelier data processing. With the rapid development of real-time computing in recent years, architectures such as real-time OLAP, real-time data lakes, and real-time data warehouses have emerged, and they go a long way toward making lakes and warehouses real-time. However, real-time processing needs an end-to-end solution. Beyond real-time lakes and warehouses, we also urgently need real-time data integration.
Real-time data integration means synchronizing and centralizing the data from each data silo into the data warehouse in real time, so that it can then be analyzed in a unified, real-time way. It is an important part of the real-time data technology stack and a mainstream trend in the industry. Unlike offline data integration, real-time data integration has to deal with data and schemas that may change at any time. Besides ensuring low-latency synchronization to the target storage, it must also ensure data consistency and correctness in a variety of scenarios.
Flink CDC is a representative open source framework for real-time data integration. Its technical advantages include unified full and incremental synchronization, lock-free reading, parallel reading, and a distributed architecture, and it is very popular in the open source community. In addition to ingesting data into lakes and warehouses in real time, Flink CDC offers powerful data processing capabilities: database data can be joined, aggregated, and widened (denormalized) in real time through SQL.
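To make the idea of unified full and incremental synchronization concrete, here is a minimal conceptual sketch in Python. It is not Flink CDC's API or implementation: the event shape (`op`, `before`, `after`), the `id` primary key, and the function names are all assumptions made for illustration. It only shows the general pattern of loading a snapshot first and then applying change events to keep a target in step with the source.

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]


def apply_snapshot(target: Dict[Any, Row], snapshot: Iterable[Row], key: str = "id") -> None:
    """Full phase: copy every existing source row into the target, keyed by primary key."""
    for row in snapshot:
        target[row[key]] = dict(row)


def apply_change(target: Dict[Any, Row], event: Dict[str, Any], key: str = "id") -> None:
    """Incremental phase: apply a single change event to the target.

    Assumed (hypothetical) event shape:
        {"op": "c" | "u" | "d", "before": Row or None, "after": Row or None}
    """
    op = event["op"]
    if op in ("c", "u"):
        after = event["after"]
        # Upsert by primary key, so replaying the same event leaves the same state.
        target[after[key]] = dict(after)
    elif op == "d":
        # Deleting a key that is already absent is a no-op.
        target.pop(event["before"][key], None)
    else:
        raise ValueError(f"unknown op: {op!r}")


def sync(snapshot: Iterable[Row], changes: Iterable[Dict[str, Any]], key: str = "id") -> Dict[Any, Row]:
    """Run the full phase, then the incremental phase, and return the target state."""
    target: Dict[Any, Row] = {}
    apply_snapshot(target, snapshot, key)
    for event in changes:
        apply_change(target, event, key)
    return target


if __name__ == "__main__":
    snapshot = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    changes = [
        {"op": "u", "before": {"id": 1, "name": "a"}, "after": {"id": 1, "name": "a2"}},
        {"op": "d", "before": {"id": 2, "name": "b"}, "after": None},
        {"op": "c", "before": None, "after": {"id": 3, "name": "c"}},
    ]
    print(sync(snapshot, changes))
```

Keying every write by primary key makes both phases idempotent in this sketch, which is one simple way to stay correct if events are replayed after a failure. A real framework also has to handle schema changes, ordering, and parallel reads, which are left out here.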
Flink CDC Meetup Online
May 21 | Online
To promote the exchange and development of Flink CDC technology, we will hold the Flink CDC Meetup online on May 21. For this Meetup, Wu Chong (Yunxie), Alibaba technical expert and Apache Flink PMC Member & Committer, serves as producer and has invited experts from Alibaba, XTransfer, SF Express, OceanBase, and Dajian Cloud Warehouse to share Flink CDC best practices, production experience, and technical principles across a range of scenarios.
【Event Highlights】
• Plenty of practical, hands-on content, such as the technical principles behind Flink CDC's real-time synchronization and conversion of massive data, and practical optimizations in various business scenarios.
• Every speaker has a Q&A session. Ask questions through the community DingTalk group, the WeChat group, or the live video channel for a chance to get an answer from the speaker during the stream.
• Watch the live stream on the ApacheFlink video channel for a chance to win a custom Flink CDC T-shirt!
【Agenda】
Speakers and topics
Wu Chong
Alibaba Technical Expert, Apache Flink PMC Member & Committer
【Producer Profile】
Wu Chong (Yunxie) is an Apache Flink PMC Member & Committer. He works on Alibaba Cloud's open source big data platform, where he is mainly responsible for R&D on Flink CDC and Flink SQL. He has long focused on stream processing and batch processing.
"Real-time synchronization and conversion of massive data based on Flink CDC"
Xu Bangjiang
Senior Development Engineer at Alibaba, Apache Flink Committer & Flink CDC Maintainer
【Guest Profile】
Xu Bangjiang works at Alibaba and currently focuses on the field of data integration.
【Introduction to the speech】
- Pain points of massive data integration;
- Real-time synchronization and conversion of massive data based on Flink CDC;
- Demo: a real-time dashboard;
- Summary and Outlook.
【Audience benefits】
Understand the technical principles behind Flink CDC's real-time synchronization and conversion of massive data, and learn how to deliver fresher data to the business.
"The Implementation Principle and Use Practice of Flink CDC MongoDB Connector"
Sun Jiabao
XTransfer Senior Java Developer, Flink CDC Maintainer
【Guest Profile】
Sun Jiabao works in the XTransfer Infrastructure Department, where he is responsible for building the infrastructure of the big data platform. He is a maintainer of the Flink CDC project and a contributor to open source projects such as Debezium and Zeppelin.
【Introduction to the speech】
- Introduction to MongoDB ChangeStream technology;
- MongoDB CDC Connector usage practice;
- MongoDB CDC Connector parallelized Snapshot improvements.
【Audience benefits】
This talk is aimed at users and developers of the Flink CDC MongoDB connector.
"Production Practice of Flink CDC in SF Express"
Qin Lihui
SF Big Data R&D Engineer
【Guest Profile】
Qin Lihui works on the big data infrastructure team at SF Express, mainly on research and development related to ingesting data into lakes and warehouses.
【Introduction to the speech】
- SF data integration background
- Flink CDC practical problems and optimization
- Future plans
【Audience benefits】
The audience can learn about the problems and challenges encountered when running Flink CDC in production, and about the optimizations we made to Flink CDC to solve them: parallel reading of the full data and the incremental log stream, hybrid splitting in the full phase to address data skew, and synchronization of sharded databases and tables across multiple DB instances.
"Flink CDC + OceanBase Full Incremental Integrated Data Integration Solution"
Wang He
OceanBase Technical Expert
【Guest Profile】
Wang He (Chuanfen), OceanBase technical expert.
【Introduction to the speech】
This talk presents the Flink CDC + OceanBase solution for unified full and incremental data integration in four parts:
- A brief introduction to CDC technology
- Introduction to OceanBase CDC Components
- Introduction to Flink CDC
- Introduction to Flink CDC OceanBase Connector
【Audience benefits】
Learn about Flink CDC and the data migration tools for OceanBase Community Edition, understand the principles and usage of the Flink CDC OceanBase Connector, and master the integration of the distributed database OceanBase Community Edition with the big data processing engine Flink.
"The Practice of Flink CDC in Dajian Cloud Warehouse"
Gong Zhongqiang
Head of Dajian Cloud Warehouse Infrastructure Department
【Guest Profile】
Gong Zhongqiang works in the infrastructure department of Dajian Cloud Warehouse, where he is mainly responsible for the design and development of the company's system architecture. He currently focuses on big data and cloud native technology, and brings practical experience and personal insights in both areas.
【Introduction to the speech】
- Background on why the company adopted Flink CDC;
- Current business scenarios where Flink CDC is deployed internally;
- Future plans for internal rollout and platform building around Flink CDC.
【Audience benefits】
- Understand the business scenarios and production practices of Flink CDC in the company;
- Broaden your view of the business scenarios where Flink CDC can be applied.
Event details
Time: May 21st 9:00-12:25
Live stream on PC: https://developer.aliyun.com/live/248997
On mobile, we recommend following the ApacheFlink video channel and reserving a spot to watch.