PostgreSQL profiling

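Scenario

Analyze performance data for a PostgreSQL backend (BE) process. This approach applies to builds compiled with GCC.

How it works

The approach relies on gprof. When a program is compiled with the -pg option, GCC inserts instrumentation code into the object code. While the program runs, this code records profiling data, and when the program exits normally the data is written to a file (gmon.out). gprof then decodes that file for viewing and analysis. For example: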

gcc mytest.c -g -pg

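Notes:
. External libraries that the program links against, such as libc, are not built with -pg by default, so their functions do not get profiling data of their own. You can install and link a profiling build of the library, such as libc_p.a. (Time spent blocked in system calls, such as sleep, is not counted either.) For example: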

gcc mytest.c -g -pg -lc_p

. By default, gprof does not support multithreaded programs or shared libraries.

Procedure

. Build

Enable profiling when configuring the build:

./configure --enable-profiling

. Run

$ psql
postgres=# create table t1(id int);
postgres=# insert into t1 values(1);
postgres=# insert into t1 values(2);
postgres=# \q

When psql exits, its backend process exits as well. On exit, the backend writes a gmon.out file, by default under gprof/$PID in the data directory:

data/dn6/gprof/26516/gmon.out

. Convert

Convert gmon.out into a readable report:

gprof ./install/bin/postgres ./data/dn6/gprof/26516/gmon.out > gp.out
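
Because every backend writes its own gmon.out under gprof/$PID, a data directory can hold many of them after a test run. The following is a minimal sketch, not part of PostgreSQL or gprof, that lists those files and builds the matching gprof command line for each. The helper names find_gmon_files and gprof_command are hypothetical, and the paths in the usage section are simply the ones used in this article.

```python
from pathlib import Path


def find_gmon_files(data_dir):
    """Return gmon.out paths under <data_dir>/gprof/<pid>/, sorted by numeric PID."""
    root = Path(data_dir) / "gprof"
    # Keep only directories whose name is a PID (all digits).
    files = [p for p in root.glob("*/gmon.out") if p.parent.name.isdigit()]
    return sorted(files, key=lambda p: int(p.parent.name))


def gprof_command(postgres_binary, gmon_file):
    """Build the argv for one gprof run; the caller redirects stdout to a report file."""
    return ["gprof", str(postgres_binary), str(gmon_file)]


if __name__ == "__main__":
    # Assumed locations, taken from the example in this article.
    for gmon in find_gmon_files("./data/dn6"):
        print(" ".join(gprof_command("./install/bin/postgres", gmon)))
```

The sketch only prints the commands. Running them, and choosing where to store each report, is left to the caller.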

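Output format

. Part 1 (flat profile)

Functions ranked by their own execution time.

column	meaning
% time	Share of the total run time spent in this function itself.
self seconds	Time spent in this function itself, excluding the functions it calls. As noted above, external library functions (such as memset) and system calls are typically not listed as separate entries.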

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 32.34      2.61     2.61    75262     0.00     0.00  handle_sync
 11.15      3.51     0.90   109412     0.00     0.00  base_yyparse
  4.71      3.89     0.38  2083921     0.00     0.00  SearchCatCacheInternal
  4.58      4.26     0.37  2940109     0.00     0.00  core_yylex
  4.21      4.60     0.34 13442073     0.00     0.00  AllocSetAlloc
  2.73      4.82     0.22  1627890     0.00     0.00  hash_search_with_hash_value
  2.60      5.03     0.21  4517373     0.00     0.00  MemoryContextAllocZeroAligned
  1.24      5.13     0.10  1297178     0.00     0.00  hash_bytes
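
To process the flat profile with a script instead of reading it by eye, the rows can be split into fields. This is a minimal sketch under the assumption that the report uses the seven-column layout shown above. The name parse_flat_profile is hypothetical, and rows that lack a call count are skipped rather than handled.

```python
import re

# One flat-profile row:
# % time, cumulative seconds, self seconds, calls, self s/call, total s/call, name.
ROW = re.compile(
    r"^\s*(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+)"
    r"\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\S.*?)\s*$"
)


def parse_flat_profile(text):
    """Return one dict per function row that carries a call count."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m is None:
            continue  # header lines, blank lines, rows without call counts
        pct, cumulative, self_s, calls, _self_call, _total_call, name = m.groups()
        rows.append({
            "name": name,
            "percent": float(pct),
            "cumulative_seconds": float(cumulative),
            "self_seconds": float(self_s),
            "calls": int(calls),
        })
    return rows


# The first rows of the report shown above.
SAMPLE = """\
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 32.34      2.61     2.61    75262     0.00     0.00  handle_sync
 11.15      3.51     0.90   109412     0.00     0.00  base_yyparse
  4.71      3.89     0.38  2083921     0.00     0.00  SearchCatCacheInternal
"""

for row in parse_flat_profile(SAMPLE):
    print(row["name"], row["self_seconds"], row["calls"])
```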

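. Part 2 (call graph)

The call graph of each function, ordered by time. Unlike Part 1, the time used here includes the time spent in the functions it calls.

column	meaning
index	Rank of the function (by time including callees).
% time	Share of the total run time spent in this function and the functions it calls.
self	Same as 'self seconds' in Part 1: time spent in this function itself.
children	Time spent in the functions it calls.
called	Number of times the function was called (recursive calls not included).

The example below shows the entry for handle_sync. The lines above the handle_sync line are its callers; the lines below it are the functions it calls, sorted by time.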

index % time    self  children    called     name
-----------------------------------------------
                2.61    0.46   75262/75262       PostgresMain [6]
[8]     38.0    2.61    0.46   75262         handle_sync [8]
                0.01    0.33   71509/307849      dist_extended_msg [15]
                0.00    0.06   71509/71509       handle_result [81]
                0.05    0.00   75262/82932       IsTransactionBlock [90]
                0.01    0.00   75262/75262       dlist_is_empty [206]
                0.00    0.00   75262/1496363     errstart [106]
                0.00    0.00    3836/7670        cl_bind_dist_info [526]
-----------------------------------------------
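
The 38.0 in the % time column of this entry can be checked against Part 1. The sketch below works through the arithmetic. It assumes the total sampled time can be inferred from the flat profile, where the 2.61 seconds of handle_sync make up 32.34% of the run. Both inputs are rounded in the report, so the result is approximate. The function name call_graph_percent is hypothetical.

```python
def call_graph_percent(self_s, children_s, total_s):
    """Percentage of total run time spent in a function plus its callees."""
    return 100.0 * (self_s + children_s) / total_s


# Total sampled time inferred from the flat profile:
# 2.61 s of self time corresponds to 32.34 % of the run.
total = 2.61 / 0.3234

# handle_sync: self 2.61 s, children 0.46 s (from the call graph entry).
pct = call_graph_percent(2.61, 0.46, total)
print(f"total is about {total:.2f} s, handle_sync with callees is about {pct:.1f} %")
```

The gap between 32.34% in Part 1 and 38.0% in Part 2 is the 0.46 seconds spent in the functions that handle_sync calls.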

. Part 3 (index)

The index number of each function, listed by name.

Index by function name

 [1385] AbortBufferIO        [121] SearchSysCache1         [8] handle_sync
 [726] AbortOutOfAnyTransaction [1010] SearchSysCache2  [1039] hash_combine64

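Summary

The profile shows which functions account for most of the run time, so optimization can target them. You can also compare the profiles of two versions to find newly introduced performance regressions and fix them.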
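
The version-to-version comparison can be sketched as a diff of self time per function. This is only an illustration: the name self_time_regressions and the 0.05 second threshold are hypothetical, the "old" numbers are taken from the flat profile above, and the "new" numbers are made up for the example, not measured.

```python
def self_time_regressions(old, new, min_delta=0.05):
    """Return (name, old_s, new_s, delta) for functions whose self time grew
    by at least min_delta seconds, largest increase first."""
    out = []
    for name, new_s in new.items():
        old_s = old.get(name, 0.0)  # a function absent from the old profile counts as 0
        delta = new_s - old_s
        if delta >= min_delta:
            out.append((name, old_s, new_s, delta))
    out.sort(key=lambda r: r[3], reverse=True)
    return out


# Self seconds per function. "old" comes from the report in this article;
# "new" is hypothetical data invented for illustration.
old = {"handle_sync": 2.61, "base_yyparse": 0.90, "AllocSetAlloc": 0.34}
new = {"handle_sync": 2.60, "base_yyparse": 1.40, "AllocSetAlloc": 0.36,
       "new_helper": 0.25}

for name, old_s, new_s, delta in self_time_regressions(old, new):
    print(f"{name}: {old_s:.2f} s -> {new_s:.2f} s (+{delta:.2f} s)")
```

Comparing absolute seconds only makes sense when both profiles come from the same workload. Otherwise the % time column is the safer basis.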

