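Preface
This article looks at how to use JDBC batching and how to configure batching in JPA.
batch
A statement's batch operations let you send many insert or update operations together. This improves performance, especially when inserting or updating large volumes of data.
Usage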
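// Assumes a test class with a javax.sql.DataSource field named dataSource;
// DbUtils is taken to be org.apache.commons.dbutils.DbUtils.
@Test
public void testSqlInjectSafeBatch() {
    String sql = "insert into employee (name, city, phone) values (?, ?, ?)";
    Connection conn = null;
    PreparedStatement pstmt = null;
    try {
        conn = dataSource.getConnection();
        conn.setAutoCommit(false); // run the whole batch in one transaction
        pstmt = conn.prepareStatement(sql);
        for (int i = 0; i < 3; i++) {
            pstmt.setString(1, "name" + i);
            pstmt.setString(2, "city" + i);
            pstmt.setString(3, "iphone" + i);
            pstmt.addBatch(); // queue this row's parameters
        }
        pstmt.executeBatch(); // send all queued rows to the database
        conn.commit();
    } catch (SQLException e) {
        e.printStackTrace();
        if (conn != null) { // getConnection() itself may have failed
            try {
                conn.rollback();
            } catch (SQLException e1) {
                e1.printStackTrace();
            }
        }
    } finally {
        DbUtils.closeQuietly(pstmt);
        DbUtils.closeQuietly(conn);
    }
}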
The pattern is simple: after setting the parameters for each operation, call pstmt.addBatch(); once every operation has been added, call pstmt.executeBatch().
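executeBatch() returns an int[] with one entry per queued operation. Each entry is either an update count or the constant Statement.SUCCESS_NO_INFO, and if the batch fails, the array from BatchUpdateException.getUpdateCounts() may also contain Statement.EXECUTE_FAILED on drivers that keep processing after an error. The following is only a minimal sketch of how such an array could be inspected; the class BatchResults and its methods are hypothetical names, not part of the original example.

```java
import java.sql.Statement;

/** Hypothetical helper for inspecting the int[] produced by a JDBC batch. */
final class BatchResults {
    private BatchResults() {}

    /** Counts the entries that the driver marked as failed. */
    static int countFailed(int[] results) {
        int failed = 0;
        for (int r : results) {
            if (r == Statement.EXECUTE_FAILED) {
                failed++;
            }
        }
        return failed;
    }

    /** Sums the concrete update counts; SUCCESS_NO_INFO and EXECUTE_FAILED are negative and skipped. */
    static int knownAffectedRows(int[] results) {
        int rows = 0;
        for (int r : results) {
            if (r >= 0) {
                rows += r;
            }
        }
        return rows;
    }
}
```

In the test method above, the array could be captured with int[] results = pstmt.executeBatch(); and passed to these helpers before committing.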
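Queuing every row into a single batch has a drawback: with a large data set it can consume a lot of memory. It is therefore better to execute the batch in smaller chunks: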
@Test
public void testSqlInjectSafeAndOOMSafeBatch() {
    String sql = "insert into employee (name, city, phone) values (?, ?, ?)";
    Connection conn = null;
    PreparedStatement pstmt = null;
    final int batchSize = 1000;
    int count = 0;
    try {
        // Unlike the first example, auto-commit stays at its default here,
        // so there is no single transaction around all 10000 rows.
        conn = dataSource.getConnection();
        pstmt = conn.prepareStatement(sql);
        for (int i = 0; i < 10000; i++) {
            pstmt.setString(1, "name" + i);
            pstmt.setString(2, "city" + i);
            pstmt.setString(3, "iphone" + i);
            pstmt.addBatch();
            // execute in small chunks to avoid running out of memory
            if (++count % batchSize == 0) {
                pstmt.executeBatch();
            }
        }
        pstmt.executeBatch(); // execute whatever rows remain
    } catch (SQLException e) {
        e.printStackTrace();
    } finally {
        DbUtils.closeQuietly(pstmt);
        DbUtils.closeQuietly(conn);
    }
}
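The chunking logic in the test above can be separated from JDBC to make it easier to see. The sketch below is illustrative only: ChunkedFlush and forEachChunk are hypothetical names, and the flush callback stands in for pstmt.executeBatch(). Unlike the test above, it flushes the remainder only when there is one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper that hands rows to a callback in groups of at most batchSize. */
final class ChunkedFlush {
    private ChunkedFlush() {}

    /** Calls flush once per chunk and returns how many times it was called. */
    static <T> int forEachChunk(List<T> rows, int batchSize, Consumer<List<T>> flush) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<T> pending = new ArrayList<>();
        int flushes = 0;
        for (T row : rows) {
            pending.add(row); // plays the role of pstmt.addBatch()
            if (pending.size() == batchSize) {
                flush.accept(new ArrayList<>(pending)); // plays the role of pstmt.executeBatch()
                pending.clear();
                flushes++;
            }
        }
        if (!pending.isEmpty()) { // flush the remainder only when there is one
            flush.accept(new ArrayList<>(pending));
            flushes++;
        }
        return flushes;
    }
}
```

With 10 rows and a batch size of 4, the callback receives chunks of 4, 4 and 2 rows.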
JPA batch settings
spring:
  jpa:
    database-platform: org.hibernate.dialect.PostgreSQLDialect
    hibernate:
      ddl-auto: update
      naming:
        implicit-strategy: org.springframework.boot.orm.jpa.hibernate.SpringImplicitNamingStrategy
        physical-strategy: org.springframework.boot.orm.jpa.hibernate.SpringPhysicalNamingStrategy
    show-sql: true
    properties:
      hibernate:
        format_sql: true
        jdbc:
          batch_size: 5000
          batch_versioned_data: true
        order_inserts: true
        order_updates: true
The batch size is set through spring.jpa.properties.hibernate.jdbc.batch_size.
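Spring Boot passes everything under spring.jpa.properties to the JPA provider, so the nested YAML keys correspond to flat Hibernate property names. As a rough sketch, the same batch settings can be written as a plain map; HibernateBatchProps and batchProperties are hypothetical names used only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for the flat Hibernate property names behind the YAML keys. */
final class HibernateBatchProps {
    private HibernateBatchProps() {}

    /** Builds the batch-related Hibernate properties for the given batch size. */
    static Map<String, String> batchProperties(int batchSize) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("hibernate.jdbc.batch_size", String.valueOf(batchSize));
        props.put("hibernate.jdbc.batch_versioned_data", "true");
        // ordering groups statements for the same entity so they can share a batch
        props.put("hibernate.order_inserts", "true");
        props.put("hibernate.order_updates", "true");
        return props;
    }
}
```

Outside Spring Boot, a map like this could be handed to Persistence.createEntityManagerFactory(unitName, props), assuming a persistence unit is defined.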
Test example
@Test
public void testJpaBatch() {
    // build the entities to be saved in one call
    List<DemoUser> demoUsers = new ArrayList<>();
    for (int i = 0; i < 10; i++) {
        DemoUser demoUser = new DemoUser();
        demoUser.setPrincipal("demo");
        demoUser.setAccessToken(UUID.randomUUID().toString());
        demoUser.setAuthType(UUID.randomUUID().toString());
        demoUser.setDeptName(UUID.randomUUID().toString());
        demoUser.setOrgName(UUID.randomUUID().toString());
        demoUsers.add(demoUser);
    }
    StopWatch stopWatch = new StopWatch("jpa batch");
    stopWatch.start();
    // assuming demoUserDao is a Spring Data repository: save(Iterable) is the
    // 1.x method name, newer versions call it saveAll
    demoUserDao.save(demoUsers);
    stopWatch.stop();
    System.out.println(stopWatch.prettyPrint());
}
Test results for different batch_size values
Batching not configured
StopWatch 'jpa batch': running time (millis) = 21383
-----------------------------------------
ms     %     Task name
-----------------------------------------
21383  100%
Batch size 500
StopWatch 'jpa batch': running time (millis) = 16790
-----------------------------------------
ms     %     Task name
-----------------------------------------
16790  100%
Batch size 1000
StopWatch 'jpa batch': running time (millis) = 12317
-----------------------------------------
ms     %     Task name
-----------------------------------------
12317  100%
Batch size 5000
StopWatch 'jpa batch': running time (millis) = 13190
-----------------------------------------
ms     %     Task name
-----------------------------------------
13190  100%
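To put the timings in relation to each other, the reduction against the unbatched run can be computed directly. This is just the arithmetic on the numbers reported above; BatchTimings and reductionPercent are hypothetical names.

```java
/** Hypothetical helper for comparing a measured run against a baseline run. */
final class BatchTimings {
    private BatchTimings() {}

    /** Returns by how many percent measuredMillis is lower than baselineMillis. */
    static double reductionPercent(long baselineMillis, long measuredMillis) {
        return 100.0 * (baselineMillis - measuredMillis) / baselineMillis;
    }
}
```

Against the 21383 ms baseline, this gives a reduction of about 21.5% for batch size 500, about 42.4% for 1000, and about 38.3% for 5000.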
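Summary
For inserts and updates over large data sets, JDBC batching is very useful and improves the efficiency of bulk operations. In the JPA test above, the run time fell from 21383 ms without batching to 12317 ms with a batch size of 1000, while raising the batch size to 5000 brought no further gain (13190 ms).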