1. System requirements
Linux
JDK (1.8 or later; 1.8 recommended)
Python (either 2 or 3)
Apache Maven 3.x (to compile DataX)
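A quick way to confirm these prerequisites before building is to look for the tools on PATH. The following is a minimal sketch: the executable names (java, mvn, python3) are assumptions about how the tools are installed on your machine, and the script does not check versions.

```python
import shutil

# Executable names are assumptions; adjust them to match your installation.
REQUIRED_TOOLS = ("java", "mvn", "python3")


def find_missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


missing = find_missing_tools()
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("All required tools were found on PATH.")
```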
2. Building from source
1. Download the code. The GitHub repository has been mirrored to Gitee:
git clone https://gitee.com/qzw2015/DataX.git
2. Check out the latest release tag:
git checkout datax_v202309
3. Edit DataX/hdfsreader/pom.xml so that it declares the following dependency:
<dependency>
<groupId>org.apache.parquet</groupId>
<artifactId>parquet-format</artifactId>
<version>2.4.0</version>
</dependency>
4. Download shade-ob-partition-calculator-1.0-SNAPSHOT.jar from https://github.com/alibaba/DataX/tree/master/oceanbasev10writer/src/main/libs and place it in DataX/oceanbasev10writer/src/main/libs
5. Package
mvn -U clean package assembly:assembly -Dmaven.test.skip=true
The output looks like this:
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] datax-all 0.0.1-SNAPSHOT ........................... SUCCESS [01:51 min]
[INFO] datax-common 0.0.1-SNAPSHOT ........................ SUCCESS [ 0.772 s]
[INFO] datax-transformer 0.0.1-SNAPSHOT ................... SUCCESS [ 0.579 s]
[INFO] datax-core 0.0.1-SNAPSHOT .......................... SUCCESS [ 1.407 s]
[INFO] plugin-rdbms-util 0.0.1-SNAPSHOT ................... SUCCESS [ 0.452 s]
[INFO] mysqlreader 0.0.1-SNAPSHOT ......................... SUCCESS [ 0.568 s]
[INFO] drdsreader 0.0.1-SNAPSHOT .......................... SUCCESS [ 0.566 s]
[INFO] sqlserverreader 0.0.1-SNAPSHOT ..................... SUCCESS [ 0.644 s]
[INFO] postgresqlreader 0.0.1-SNAPSHOT .................... SUCCESS [ 0.585 s]
[INFO] kingbaseesreader 0.0.1-SNAPSHOT .................... SUCCESS [ 0.568 s]
[INFO] oraclereader 0.0.1-SNAPSHOT ........................ SUCCESS [ 0.582 s]
[INFO] cassandrareader 0.0.1-SNAPSHOT ..................... SUCCESS [ 1.134 s]
[INFO] oceanbasev10reader 0.0.1-SNAPSHOT .................. SUCCESS [ 0.969 s]
[INFO] rdbmsreader 0.0.1-SNAPSHOT ......................... SUCCESS [ 0.694 s]
[INFO] odpsreader 0.0.1-SNAPSHOT .......................... SUCCESS [ 1.614 s]
[INFO] otsreader 0.0.1-SNAPSHOT ........................... SUCCESS [ 1.390 s]
[INFO] otsstreamreader 0.0.1-SNAPSHOT ..................... SUCCESS [ 1.172 s]
[INFO] hbase11xreader 0.0.1-SNAPSHOT ...................... SUCCESS [ 3.303 s]
[INFO] hbase094xreader 0.0.1-SNAPSHOT ..................... SUCCESS [ 2.317 s]
[INFO] hbase11xsqlreader 0.0.1-SNAPSHOT ................... SUCCESS [ 4.120 s]
[INFO] hbase20xsqlreader 0.0.1-SNAPSHOT ................... SUCCESS [ 0.616 s]
[INFO] plugin-unstructured-storage-util 0.0.1-SNAPSHOT .... SUCCESS [ 0.480 s]
[INFO] hdfsreader 0.0.1-SNAPSHOT .......................... SUCCESS [ 4.232 s]
[INFO] ossreader 0.0.1-SNAPSHOT ........................... SUCCESS [ 4.550 s]
[INFO] ftpreader 0.0.1-SNAPSHOT ........................... SUCCESS [ 2.044 s]
[INFO] txtfilereader 0.0.1-SNAPSHOT ....................... SUCCESS [ 1.887 s]
[INFO] streamreader 0.0.1-SNAPSHOT ........................ SUCCESS [ 0.481 s]
[INFO] clickhousereader 0.0.1-SNAPSHOT .................... SUCCESS [ 0.952 s]
[INFO] mongodbreader 0.0.1-SNAPSHOT ....................... SUCCESS [ 1.983 s]
[INFO] tdenginewriter 0.0.1-SNAPSHOT ...................... SUCCESS [ 1.949 s]
[INFO] tdenginereader 0.0.1-SNAPSHOT ...................... SUCCESS [ 0.881 s]
[INFO] gdbreader 0.0.1-SNAPSHOT ........................... SUCCESS [ 1.641 s]
[INFO] tsdbreader 0.0.1-SNAPSHOT .......................... SUCCESS [ 0.797 s]
[INFO] opentsdbreader 0.0.1-SNAPSHOT ...................... SUCCESS [ 1.155 s]
[INFO] loghubreader 0.0.1-SNAPSHOT ........................ SUCCESS [ 0.856 s]
[INFO] datahubreader 0.0.1-SNAPSHOT ....................... SUCCESS [ 1.044 s]
[INFO] starrocksreader 0.0.1-SNAPSHOT ..................... SUCCESS [ 0.537 s]
[INFO] mysqlwriter 0.0.1-SNAPSHOT ......................... SUCCESS [ 0.557 s]
[INFO] starrockswriter 1.1.0 .............................. SUCCESS [ 2.075 s]
[INFO] drdswriter 0.0.1-SNAPSHOT .......................... SUCCESS [ 0.509 s]
[INFO] databendwriter 0.0.1-SNAPSHOT ...................... SUCCESS [ 0.947 s]
[INFO] oraclewriter 0.0.1-SNAPSHOT ........................ SUCCESS [ 0.557 s]
[INFO] sqlserverwriter 0.0.1-SNAPSHOT ..................... SUCCESS [ 0.565 s]
[INFO] postgresqlwriter 0.0.1-SNAPSHOT .................... SUCCESS [ 0.661 s]
[INFO] kingbaseeswriter 0.0.1-SNAPSHOT .................... SUCCESS [ 0.567 s]
[INFO] odpswriter 0.0.1-SNAPSHOT .......................... SUCCESS [ 1.590 s]
[INFO] adswriter 0.0.1-SNAPSHOT ........................... SUCCESS [ 1.804 s]
[INFO] oceanbasev10writer 0.0.1-SNAPSHOT .................. SUCCESS [ 0.891 s]
[INFO] adbpgwriter 0.0.1-SNAPSHOT ......................... SUCCESS [ 1.058 s]
[INFO] hologresjdbcwriter 0.0.1-SNAPSHOT .................. SUCCESS [ 1.014 s]
[INFO] rdbmswriter 0.0.1-SNAPSHOT ......................... SUCCESS [ 0.639 s]
[INFO] hdfswriter 0.0.1-SNAPSHOT .......................... SUCCESS [ 4.426 s]
[INFO] osswriter 0.0.1-SNAPSHOT ........................... SUCCESS [ 4.573 s]
[INFO] otswriter 0.0.1-SNAPSHOT ........................... SUCCESS [ 1.490 s]
[INFO] hbase11xwriter 0.0.1-SNAPSHOT ...................... SUCCESS [ 2.284 s]
[INFO] hbase094xwriter 0.0.1-SNAPSHOT ..................... SUCCESS [ 1.864 s]
[INFO] hbase11xsqlwriter 0.0.1-SNAPSHOT ................... SUCCESS [ 3.953 s]
[INFO] hbase20xsqlwriter 0.0.1-SNAPSHOT ................... SUCCESS [ 0.623 s]
[INFO] kuduwriter 0.0.1-SNAPSHOT .......................... SUCCESS [ 0.697 s]
[INFO] ftpwriter 0.0.1-SNAPSHOT ........................... SUCCESS [ 1.880 s]
[INFO] txtfilewriter 0.0.1-SNAPSHOT ....................... SUCCESS [ 1.756 s]
[INFO] streamwriter 0.0.1-SNAPSHOT ........................ SUCCESS [ 0.508 s]
[INFO] elasticsearchwriter 0.0.1-SNAPSHOT ................. SUCCESS [ 1.047 s]
[INFO] mongodbwriter 0.0.1-SNAPSHOT ....................... SUCCESS [ 1.808 s]
[INFO] ocswriter 0.0.1-SNAPSHOT ........................... SUCCESS [ 0.962 s]
[INFO] tsdbwriter 0.0.1-SNAPSHOT .......................... SUCCESS [ 0.800 s]
[INFO] gdbwriter 0.0.1-SNAPSHOT ........................... SUCCESS [ 2.024 s]
[INFO] oscarwriter 0.0.1-SNAPSHOT ......................... SUCCESS [ 0.562 s]
[INFO] loghubwriter 0.0.1-SNAPSHOT ........................ SUCCESS [ 0.741 s]
[INFO] datahubwriter 0.0.1-SNAPSHOT ....................... SUCCESS [ 0.998 s]
[INFO] cassandrawriter 0.0.1-SNAPSHOT ..................... SUCCESS [ 1.028 s]
[INFO] clickhousewriter 0.0.1-SNAPSHOT .................... SUCCESS [ 1.045 s]
[INFO] doriswriter 0.0.1-SNAPSHOT ......................... SUCCESS [ 0.921 s]
[INFO] selectdbwriter 0.0.1-SNAPSHOT ...................... SUCCESS [ 1.013 s]
[INFO] adbmysqlwriter 0.0.1-SNAPSHOT ...................... SUCCESS [ 0.515 s]
[INFO] neo4jwriter 0.0.1-SNAPSHOT ......................... SUCCESS [ 1.229 s]
[INFO] gaussdbreader 0.0.1-SNAPSHOT ....................... SUCCESS [ 0.549 s]
[INFO] gaussdbwriter 0.0.1-SNAPSHOT ....................... SUCCESS [ 0.589 s]
[INFO] datax-example 0.0.1-SNAPSHOT ....................... SUCCESS [ 0.002 s]
[INFO] datax-example-core 0.0.1-SNAPSHOT .................. SUCCESS [ 0.255 s]
[INFO] datax-example-streamreader 0.0.1-SNAPSHOT .......... SUCCESS [ 0.006 s]
[INFO] datax-example-neo4j 0.0.1-SNAPSHOT ................. SUCCESS [ 0.006 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 03:34 min
[INFO] Finished at: 2024-07-17T16:47:23+08:00
[INFO] ------------------------------------------------------------------------
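With this many modules, scanning the Reactor Summary by eye is error-prone. The sketch below parses a saved build log and lists every module whose status is not SUCCESS. The function name and the idea of saving the log to a file are illustrative assumptions. The regular expression is written against the line format shown above, and Maven typically prints FAILURE or SKIPPED for the other cases.

```python
import re

# Matches reactor lines such as:
# [INFO] datax-core 0.0.1-SNAPSHOT .......................... SUCCESS [  1.407 s]
REACTOR_LINE = re.compile(
    r"^\[INFO\] (?P<module>\S+) (?P<version>\S+) \.+ (?P<status>[A-Z]+)(?: \[.*\])?\s*$"
)


def failed_modules(log_text):
    """Return (module, status) pairs for reactor entries that are not SUCCESS."""
    failures = []
    for line in log_text.splitlines():
        match = REACTOR_LINE.match(line.strip())
        if match and match.group("status") != "SUCCESS":
            failures.append((match.group("module"), match.group("status")))
    return failures
```

For example, after redirecting the Maven output to a file (say build.log, a hypothetical name), `failed_modules(open("build.log").read())` returns an empty list when every module built successfully.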
After a successful build, the DataX package is located at {DataX_source_code_home}/target/datax/datax/, with the following structure:
$ cd {DataX_source_code_home}
$ ls ./target/datax/datax/
bin conf job lib log log_perf plugin
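The same check can be scripted. The sketch below compares a build output directory against the directory names from the ls output above. The function name is hypothetical, and the expected list comes only from this one build, so other DataX versions may differ.

```python
from pathlib import Path

# Directory names taken from the `ls` output above.
EXPECTED_DIRS = ("bin", "conf", "job", "lib", "log", "log_perf", "plugin")


def missing_dirs(datax_home):
    """Return the expected subdirectories that are absent under `datax_home`."""
    home = Path(datax_home)
    return [name for name in EXPECTED_DIRS if not (home / name).is_dir()]
```

Call it with the real path that stands in for {DataX_source_code_home}/target/datax/datax. An empty list means the layout matches.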
3. Example (MySQL → MySQL)
3.1. Preparation (MySQL target table)
create database journey;
use journey;
create table t_ds_user like dolphinscheduler.t_ds_user;
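3.2. Preparing the JSON
An annotated example of a full DataX JSON configuration file (the // comments are annotations and are not valid in strict JSON):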
{
  "job": {
    "content": [
      {
        "reader": {
          "name": "mysqlreader",
          "parameter": {
            "column": ["id", "name", "age"], // columns to read
            "connection": [
              {
                "table": ["source_table"], // source table name
                "jdbcUrl": ["jdbc:mysql://localhost:3306/source_db"] // source database connection URL
              }
            ]
          }
        },
        "writer": {
          "name": "postgresqlwriter",
          "parameter": {
            "column": [
              {"name": "id", "type": "INTEGER"}, // target table columns and their types
              {"name": "name", "type": "VARCHAR"},
              {"name": "age", "type": "INTEGER"}
            ],
            "table": "target_table", // target table name
            "jdbcUrl": "jdbc:postgresql://localhost:5432/target_db", // target database connection URL
            "username": "your_username", // target database username
            "password": "your_password" // target database password
          }
        },
        "transformer": [
          {
            "name": "columntransformer",
            "parameter": {
              "column": [
                {"name": "age", "type": "INTEGER"} // data transformation, e.g. convert the age column to INTEGER
              ]
            }
          }
        ]
      }
    ],
    "setting": {
      "speed": {
        "channel": 5 // number of concurrent read channels
      },
      "errorLimit": {
        "record": 1000, // maximum number of error records
        "percentage": 0.02 // maximum percentage of error records
      }
    }
  }
}
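Because the annotated example carries // comments, a strict JSON parser such as Python's json module rejects it as written. The sketch below strips such comments before parsing, leaving string literals alone so that the // inside JDBC URLs survives. It is a helper for inspecting annotated configs in Python and makes no claim about how DataX itself parses job files. The function names are hypothetical.

```python
import json
import re

# Either a complete JSON string literal (kept as-is) or a // comment (dropped).
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|//[^\n]*')


def strip_line_comments(text):
    """Remove // comments that appear outside of JSON string literals."""
    return TOKEN.sub(
        lambda m: m.group(0) if m.group(0).startswith('"') else "", text
    )


def load_annotated_json(text):
    """Parse JSON text that may carry // annotations."""
    return json.loads(strip_line_comments(text))
```

Matching string literals first is the key design choice: a plain search for // would cut every jdbcUrl value in half.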
mysql2mysql.json is as follows:
{
  "job": {
    "content": [{
      "reader": {
        "name": "mysqlreader",
        "parameter": {
          "username": "root",
          "password": "root@123",
          "connection": [{
            "querySql": ["select * from t_ds_user;"],
            "jdbcUrl": ["jdbc:mysql://127.0.0.1:3306/dolphinscheduler?useSSL=false"]
          }]
        }
      },
      "writer": {
        "name": "mysqlwriter",
        "parameter": {
          "username": "root",
          "password": "root@123",
          "column": ["`id`", "`user_name`", "`user_password`", "`user_type`", "`email`", "`phone`", "`tenant_id`", "`create_time`", "`update_time`", "`queue`", "`state`", "`time_zone`"],
          "connection": [{
            "table": ["t_ds_user"],
            "jdbcUrl": "jdbc:mysql://127.0.0.1:3306/journey?useSSL=false"
          }]
        }
      }
    }],
    "setting": {
      "speed": {
        "channel": 1,
        "record": 1000
      },
      "errorLimit": {
        "record": 0,
        "percentage": 0
      }
    }
  },
  "core": {
    "transport": {
      "channel": {
        "speed": {
          "channel": 1,
          "record": 1000
        }
      }
    }
  }
}
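For tables with many columns, writing the backtick-quoted column list by hand is tedious. The sketch below generates a job dictionary with the same shape as mysql2mysql.json. The function name and its parameters are hypothetical; the field layout simply mirrors the file above and is not taken from the plugin documentation.

```python
import json


def build_mysql2mysql_job(columns, table, source_url, target_url, username, password):
    """Build a job dict shaped like mysql2mysql.json above."""
    reader = {
        "name": "mysqlreader",
        "parameter": {
            "username": username,
            "password": password,
            "connection": [{
                "querySql": [f"select * from {table};"],
                "jdbcUrl": [source_url],
            }],
        },
    }
    writer = {
        "name": "mysqlwriter",
        "parameter": {
            "username": username,
            "password": password,
            # Backtick-quote each column, as in the hand-written file.
            "column": [f"`{name}`" for name in columns],
            "connection": [{"table": [table], "jdbcUrl": target_url}],
        },
    }
    speed = {"channel": 1, "record": 1000}
    return {
        "job": {
            "content": [{"reader": reader, "writer": writer}],
            "setting": {
                "speed": dict(speed),
                "errorLimit": {"record": 0, "percentage": 0},
            },
        },
        "core": {"transport": {"channel": {"speed": dict(speed)}}},
    }
```

The result can be written out with `json.dump(job, f, indent=2)` and passed to datax.py like any hand-written job file.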
3.3. Run
python3 /Users/qiaozhanwei/IdeaProjects/DataX/target/datax/datax/bin/datax.py /Users/qiaozhanwei/IdeaProjects/DataX/target/mysql2mysql.json
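When the job is launched from another script, for example a scheduler task, the same command can be assembled with subprocess. This is a minimal sketch: the function names are hypothetical, and it assumes that datax.py signals failure through a non-zero exit code.

```python
import subprocess
from pathlib import Path


def build_datax_command(datax_home, job_file, python="python3"):
    """Return the argument list for running one job with datax.py."""
    launcher = Path(datax_home) / "bin" / "datax.py"
    return [python, str(launcher), str(job_file)]


def run_job(datax_home, job_file):
    """Run the job; raise CalledProcessError if datax.py exits non-zero."""
    subprocess.run(build_datax_command(datax_home, job_file), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a path contains spaces.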
3.4. Result
mysql> select * from journey.t_ds_user;
+----+-----------+----------------------------------+-----------+------------+-------+-----------+---------------------+---------------------+-------+-------+-----------+
| id | user_name | user_password                    | user_type | email      | phone | tenant_id | create_time         | update_time         | queue | state | time_zone |
+----+-----------+----------------------------------+-----------+------------+-------+-----------+---------------------+---------------------+-------+-------+-----------+
|  1 | admin     | a0e29abb026840908b372ecdb1231766 |         0 | xxx@qq.com |       |        -1 | 2024-06-19 16:56:24 | 2024-06-19 16:56:24 | NULL  |     1 | NULL      |
+----+-----------+----------------------------------+-----------+------------+-------+-----------+---------------------+---------------------+-------+-------+-----------+
1 row in set (0.00 sec)
mysql>