Don’t panic when GaussDB (DWS) data skew occurs, this article teaches you to easily get the table skew rate

Abstract: GaussDB (DWS) is an MPP parallel architecture. If the table data is skewed, it will cause a series of performance problems, affect user experience, and may cause system failures in severe cases. Therefore, it is a matter of concern for GaussDB (DWS) operation and maintenance managers to quickly obtain and correct sloping tables.

This article is shared from the HUAWEI cloud community " GaussDB (DWS) data skew do not panic, ", the original author: SeqList.

GaussDB (DWS) is an MPP parallel architecture. If the table data is skewed, it will cause a series of performance problems, affect user experience, and may cause system failures in severe cases. Therefore, it is a matter of concern for GaussDB (DWS) operation and maintenance managers to quickly obtain and correct sloping tables.

Description of Requirement

GaussDB (DWS) itself provides the pgxc_get_table_skewness view to query the skewness, but in actual practice, this view has performance problems and the calculation of the skew rate of the view has problems. In practice, the tilt rate of a table obtained by this view is not high (for example, 0.03), but in fact the table is tilted.

At the same time, in many cases we need to obtain the skew rate of all tables under a schema to troubleshoot the skew problem. pgxc_get_table_skewness is also described in the product documentation as a view with poor performance.

Therefore, there is an urgent need for a function or view that has good performance and can express tilt in the project practice process.

Design ideas

GaussDB (DWS) has a table_distribution function to obtain the space size of each DN. Through this function, we can quickly obtain the size of each DN. At the same time, we can obtain the tilt of the table according to the size of each DN:

skewness = (max(dnsize) - avg(dnsize))*100/max(dnsize)

The ratio of the maximum DN space size to the average DN space size of the slope rate formula calculation table can accurately reflect the slope rate, and multiply by 100 as the performance percentage.

Implementation process

According to the skew formula, we get the following SQL, which can quickly get the skew of all tables in the schema. Take public as an example below:

select schemaname,tablename,sum(dnsize)/1024^3 dnsize_gb,(max(dnsize) - avg(dnsize))*100/nullif(max(dnsize),0) skewness_factor
from (
select schemaname
,tablename
,(regexp_split_to_array(tbl_dis,'[\,\(\)]+'))[4]::bigint as vprocname
,(regexp_split_to_array(tbl_dis,'[\,\(\)]+'))[5]::bigint as dnsize
from (
select nspname as schemaname
,relname as tablename
,table_distribution(nspname,relname)::text as tbl_dis
from pg_class a
inner join pg_namespace b
on a.relnamespace = b.oid
and a.relkind = 'r'
and b.oid not in (100)
)
) 
where schemaname= 'public' group by 1,2 order by 3 desc;

The results are as follows. Through an example, it can be seen that the test13 table is 2GB and has a serious skew of 97%, while store_sales1, a large 70GB table, also has a skew of 58%.

Compare with the result of GaussDB (DWS) pgxc_get_table_skewness view

Using GaussDB (DWS) system view pgxc_get_table_skewness, it is difficult to see that store_sales1 is skewed.

Here we are using the system view pgxc_get_table_skewness to obtain

select * from  PGXC_GET_TABLE_SKEWNESS where schemaname = 'public' and tablename in ('store_sales1','test13');

From the results, the skewratio field, test13 table can see that there is a serious skew, while the skewratio value of store_sales1 is only 0.031, and there is no skew. But in fact, the table is slanted

We look at the data distribution verification of each DN through table_skewness and find that store_sales1 does have a certain skew.

to sum up:

The skew rate of GaussDB (DWS) is the result of the view pgxc_get_table_skewness. Although it can reflect the severely skewed table, it is difficult to see the large skewed table. At the same time, this function has certain performance problems, and it can't be executed when there are many tables.

The slope rate acquisition method provided in this article can more accurately reflect the slope of the table and can be called a method for quickly obtaining the slope rate of all tables in the entire schema; in the test process, the larger the amount of data, the more tables, and the execution time will be reduced. The slower, test a schema about 3800 tables, a total of about 40TB of data, and obtain the space size and skew rate of all tables in about 5 minutes.

However, the method provided in this article can only operate on a single schema to obtain the table space size and tilt rate of the entire database, and the actual measurement cannot be performed successfully. If you do not require timeliness, you can fix a time every day, in the form of batches, to obtain a list of all tables in a library, use the table_distribution function to obtain table space information one table at a time, and use multiple concurrent executions. This way can be The space situation of all tables is executed within a certain period of time.

For example: for the case of 100,000 tables in the entire database, you can use 100 concurrent executions to execute insert into table_size_info select * from table_distribution('schema.table'); in this way, the DN space information of 100,000 tables can be obtained. Then use the formulas in this article to summarize the tilt rate and total space size of each table.

you want to know more about GuassDB (DWS), please search for "GaussDB DWS" on WeChat and follow the WeChat public account, and share with you the latest and most complete PB series warehouse black technology. You can also get a lot of learning materials in the background~

Click to follow, and get to know the fresh technology of Huawei Cloud for the first time~

Don’t panic when GaussDB (DWS) data skew occurs, this article teaches you to easily get the table skew rate

Description of Requirement

Design ideas

Implementation process

Compare with the result of GaussDB (DWS) pgxc_get_table_skewness view

to sum up:

华为云开发者联盟

引用和评论

华为云开发者联盟入选 2023 中国技术品牌影响力企业榜，深耕开发者生态

MySQL慢查询日志：性能优化的终极指南

做到真正0丢失、0重复：Apache SeaTunnel 实现万亿级数据一致性全解密

Devin 发布 DeepWiki，2 星的项目直接装出万星的气场

好用的开源埋点方案-ClkLog埋点用户分析系统

DNS服务器地址大全

实战分享：DolphinScheduler 中 Shell 任务环境变量最佳配置方式