强烈建议OB日志盘大小是内存规格的3倍或以上。

作者:郑增权,爱可生 DBA 团队成员,OceanBase 和 MySQL 数据库技术爱好者。

爱可生开源社区出品,原创内容未经授权不得随意使用,转载请联系小编并注明来源。

本文约 1300 字,预计阅读需要 5 分钟。

背景

某客户基于节约资源的想法,将日志盘设置的比较小,日志盘大小约为集群内存规格的1.5倍,当创建租户时,CPU和内存都充足的情况下,却存在报错"LOG_DISK resource not enough",我们尝试复现问题并定位原因。

环境信息

  • 架构:单节点集群
  • 版本:OceanBase:4.2.1.4
MySQL [oceanbase]> select svr_ip,status,build_version from __all_server;
+--------------+--------+-------------------------------------------------------------------------------------------+
| svr_ip       | status | build_version                                                                             |
+--------------+--------+-------------------------------------------------------------------------------------------+
| 10.186.64.61 | ACTIVE | 4.2.1.4_104010012024030714-c4f3400ad2839e337bc9dab5d1bfe1d01134a1d7(Mar  7 2024 14:32:22) |
+--------------+--------+-------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

查看集群可分配的 CPU、内存、日志盘容量。

  • MEM_CAPACITY:observer 进程可用的内存大小
  • LOG_DISK_CAPACITY:日志盘空间总大小,41.8GB
  • LOG_DISK_ASSIGNED:日志盘已分配大小,6GB
  • 剩余可分配 CPU 数量:18-2=16C
  • 剩余可分配内存大小:24-2=22GB
MySQL [oceanbase]> SELECT SVR_IP, SVR_PORT , ZONE , SQL_PORT , CPU_CAPACITY , CPU_CAPACITY_MAX, CPU_ASSIGNED , CPU_ASSIGNED_MAX ,MEMORY_LIMIT/1024/1024/1024 MEMORY_LIMIT_GB , MEM_CAPACITY/1024/1024/1024 MEM_CAPACITY_GB, MEM_ASSIGNED/1024/1024/1024 MEM_ASSIGNED_GB ,LOG_DISK_CAPACITY/1024/1024/1024 LOG_DISK_CAPACITY_GB,LOG_DISK_ASSIGNED/1024/1024/1024 LOG_DISK_ASSIGNED_GB,LOG_DISK_IN_USE/1024/1024/1024 LOG_DISK_IN_USE_GB  FROM GV$OB_SERVERS;
+--------------+----------+-------+----------+--------------+------------------+--------------+------------------+-----------------+-----------------+-----------------+----------------------+----------------------+--------------------+
| SVR_IP       | SVR_PORT | ZONE  | SQL_PORT | CPU_CAPACITY | CPU_CAPACITY_MAX | CPU_ASSIGNED | CPU_ASSIGNED_MAX | MEMORY_LIMIT_GB | MEM_CAPACITY_GB | MEM_ASSIGNED_GB | LOG_DISK_CAPACITY_GB | LOG_DISK_ASSIGNED_GB | LOG_DISK_IN_USE_GB |
+--------------+----------+-------+----------+--------------+------------------+--------------+------------------+-----------------+-----------------+-----------------+----------------------+----------------------+--------------------+
| 10.186.64.61 |     2882 | zone1 |     2881 |           18 |               18 |            2 |                2 | 30.000000000000 | 24.000000000000 |  2.000000000000 |      41.875000000000 |       6.000000000000 |     0.125000000000 |
+--------------+----------+-------+----------+--------------+------------------+--------------+------------------+-----------------+-----------------+-----------------+----------------------+----------------------+--------------------+
1 row in set (0.00 sec)

可以看到当前集群仅 sys 租户占用了 2C2G 的资源。

MySQL [oceanbase]> SELECT a.tenant_name,a.tenant_id,b.name unit_config,c.name pool_name,b.max_cpu,b.min_cpu,MEMORY_SIZE/1024/1024/1024 as MEMORY_SIZE
-> FROM
-> OCEANBASE.DBA_OB_TENANTS a,
-> OCEANBASE.DBA_OB_UNIT_CONFIGS b,
-> OCEANBASE.DBA_OB_RESOURCE_POOLS c
-> WHERE a.tenant_id=c.tenant_id
-> AND b.unit_config_id = c.unit_config_id
-> ORDER BY a.tenant_id desc;
+-------------+-----------+-------------------------------+-----------+---------+---------+----------------+
| tenant_name | tenant_id | unit_config                   | pool_name | max_cpu | min_cpu | MEMORY_SIZE    |
+-------------+-----------+-------------------------------+-----------+---------+---------+----------------+
| sys         |         1 | config_sys_zone1_twoctwog_xio | sys_pool  |       2 |       2 | 2.000000000000 |
+-------------+-----------+-------------------------------+-----------+---------+---------+----------------+
1 row in set (0.01 sec)

报错复现及疑问

创建租户

尝试创建 1 个规格为 4C12G 的租户。

MySQL [oceanbase]> CREATE RESOURCE UNIT mem_test_unit MEMORY_SIZE = '12G',MAX_CPU = 4, MIN_CPU = 4;
Query OK, 0 rows affected (0.02 sec)

创建资源池

存在报错:LOG_DISK resource not enough

MySQL [oceanbase]> CREATE RESOURCE POOL pool_evan UNIT='mem_test_unit', UNIT_NUM=1, ZONE_LIST=('zone1');
ERROR 4733 (HY000): zone 'zone1' resource not enough to hold 1 unit. You can check resource info by views: DBA_OB_UNITS, GV$OB_UNITS, GV$OB_SERVERS.
server '"10.186.64.61:2882"' LOG_DISK resource not enough

问题 1

剩余资源为 16C 22GB,为何创建一个 4C 12GB 的资源池会失败?

问题 2

报错有关 LOG_DISK ,日志盘容量与内存规格存在何种关联?

日志盘大小与租户内存大小的关系

尝试新建规格为 1C1G 的租户,分析租户内存大小与日志盘容量分配的规律。

  1. 新建规格为 1C1G 的租户。
  2. LOG_DISK_ASSIGNED_GB 增长至 9G,相较之前增加了 3G。
MySQL [oceanbase]> CREATE RESOURCE UNIT unit_1g MEMORY_SIZE = '1G',MAX_CPU = 1, MIN_CPU = 1;
Query OK, 0 rows affected (0.01 sec)
MySQL [oceanbase]> CREATE RESOURCE POOL pool_1g UNIT='unit_1g', UNIT_NUM=1, ZONE_LIST=('zone1');
Query OK, 0 rows affected (0.02 sec)
MySQL [oceanbase]> CREATE TENANT IF NOT EXISTS tenant_1g
    ->                 PRIMARY_ZONE = 'zone1',
    ->                 RESOURCE_POOL_LIST=('pool_1g')
    ->                 set OB_TCP_INVITED_NODES='%';
Query OK, 0 rows affected (26.05 sec)
MySQL [oceanbase]> 
MySQL [oceanbase]> SELECT SVR_IP, SVR_PORT , ZONE , SQL_PORT , CPU_CAPACITY ,CPU_ASSIGNED ,MEMORY_LIMIT/1024/1024/1024 MEMORY_LIMIT_GB , MEM_CAPACITY/1024/1024/1024 MEM_CAPACITY_GB, MEM_ASSIGNED/1024/1024/1024 MEM_ASSIGNED_GB ,LOG_DISK_CAPACITY/1024/1024/1024 LOG_DISK_CAPACITY_GB,LOG_DISK_ASSIGNED/1024/1024/1024 LOG_DISK_ASSIGNED_GB  FROM GV$OB_SERVERS;
+--------------+----------+-------+----------+--------------+--------------+-----------------+-----------------+-----------------+----------------------+----------------------+
| SVR_IP       | SVR_PORT | ZONE  | SQL_PORT | CPU_CAPACITY | CPU_ASSIGNED | MEMORY_LIMIT_GB | MEM_CAPACITY_GB | MEM_ASSIGNED_GB | LOG_DISK_CAPACITY_GB | LOG_DISK_ASSIGNED_GB |
+--------------+----------+-------+----------+--------------+--------------+-----------------+-----------------+-----------------+----------------------+----------------------+
| 10.186.64.61 |     2882 | zone1 |     2881 |           18 |            3 | 30.000000000000 | 24.000000000000 |  3.000000000000 |      41.875000000000 |       9.000000000000 |
+--------------+----------+-------+----------+--------------+--------------+-----------------+-----------------+-----------------+----------------------+----------------------+
1 row in set (0.00 sec)

小结

每为租户分配 1GB 内存,则对应的分配 3GB 日志盘容量。

代入本文初始环境计算

计算公式

本文初始背景集群剩余内存规格计算:

(LOG_DISK_CAPACITY_GB - LOG_DISK_ASSIGNED_GB) / 3 =(41.875 - 6.00)/ 3 ≈ 11.958 GB

即,集群剩余可用的内存上限为 11.958GB,取整数为 11GB。

验证

释放资源

删掉租户 tenant_1g 和对应的 RESOURCE POOL 释放资源。

MySQL [oceanbase]> DROP TENANT tenant_1g;
Query OK, 0 rows affected (35.04 sec)
MySQL [oceanbase]> DROP RESOURCE POOL pool_1g;
Query OK, 0 rows affected (0.01 sec)
MySQL [oceanbase]> SELECT SVR_IP, SVR_PORT , ZONE , SQL_PORT , CPU_CAPACITY ,CPU_ASSIGNED ,MEMORY_LIMIT/1024/1024/1024 MEMORY_LIMIT_GB , MEM_CAPACITY/1024/1024/1024 MEM_CAPACITY_GB, MEM_ASSIGNED/1024/1024/1024 MEM_ASSIGNED_GB ,LOG_DISK_CAPACITY/1024/1024/1024 LOG_DISK_CAPACITY_GB,LOG_DISK_ASSIGNED/1024/1024/1024 LOG_DISK_ASSIGNED_GB  FROM GV$OB_SERVERS;
+--------------+----------+-------+----------+--------------+--------------+-----------------+-----------------+-----------------+----------------------+----------------------+
| SVR_IP       | SVR_PORT | ZONE  | SQL_PORT | CPU_CAPACITY | CPU_ASSIGNED | MEMORY_LIMIT_GB | MEM_CAPACITY_GB | MEM_ASSIGNED_GB | LOG_DISK_CAPACITY_GB | LOG_DISK_ASSIGNED_GB |
+--------------+----------+-------+----------+--------------+--------------+-----------------+-----------------+-----------------+----------------------+----------------------+
| 10.186.64.61 |     2882 | zone1 |     2881 |           18 |            2 | 30.000000000000 | 24.000000000000 |  2.000000000000 |      41.875000000000 |       6.000000000000 |
+--------------+----------+-------+----------+--------------+--------------+-----------------+-----------------+-----------------+----------------------+----------------------+
1 row in set (0.00 sec)       

重新创建

创建一个规格为 16C11G 的租户。

MySQL [oceanbase]> CREATE RESOURCE UNIT unit_11g MEMORY_SIZE = '11G',MAX_CPU = 16, MIN_CPU = 16;
Query OK, 0 rows affected (0.01 sec)
MySQL [oceanbase]>  CREATE RESOURCE POOL pool_11g UNIT='unit_11g', UNIT_NUM=1, ZONE_LIST=('zone1');
Query OK, 0 rows affected (0.02 sec)

MySQL [oceanbase]> CREATE TENANT IF NOT EXISTS tenant_11g
    ->                 PRIMARY_ZONE = 'zone1',
    ->                 RESOURCE_POOL_LIST=('pool_11g')
    ->                 set OB_TCP_INVITED_NODES='%';
Query OK, 0 rows affected (25.99 sec)
MySQL [oceanbase]> SELECT SVR_IP, SVR_PORT , ZONE , SQL_PORT , CPU_CAPACITY ,CPU_ASSIGNED ,MEMORY_LIMIT/1024/1024/1024 MEMORY_LIMIT_GB , MEM_CAPACITY/1024/1024/1024 MEM_CAPACITY_GB, MEM_ASSIGNED/1024/1024/1024 MEM_ASSIGNED_GB ,LOG_DISK_CAPACITY/1024/1024/1024 LOG_DISK_CAPACITY_GB,LOG_DISK_ASSIGNED/1024/1024/1024 LOG_DISK_ASSIGNED_GB  FROM GV$OB_SERVERS;
+--------------+----------+-------+----------+--------------+--------------+-----------------+-----------------+-----------------+----------------------+----------------------+
| SVR_IP       | SVR_PORT | ZONE  | SQL_PORT | CPU_CAPACITY | CPU_ASSIGNED | MEMORY_LIMIT_GB | MEM_CAPACITY_GB | MEM_ASSIGNED_GB | LOG_DISK_CAPACITY_GB | LOG_DISK_ASSIGNED_GB |
+--------------+----------+-------+----------+--------------+--------------+-----------------+-----------------+-----------------+----------------------+----------------------+
| 10.186.64.61 |     2882 | zone1 |     2881 |           18 |           18 | 30.000000000000 | 24.000000000000 | 13.000000000000 |      41.875000000000 |      39.000000000000 |
+--------------+----------+-------+----------+--------------+--------------+-----------------+-----------------+-----------------+----------------------+----------------------+
1 row in set (0.01 sec)

疑问解答

下面我们来解答上面提出的两个疑问。

问题 1

剩余资源为 16C 22GB,为何创建一个 4C 12GB 的资源池会失败?

答:

  1. 由于日志盘规格为 41.875GB,且 sys 租户已占用 6GB 日志盘份额,经前文计算可得:集群剩余可用的内存上限为 11GB。
  2. 建租户内存规格超过 11GB 会因申请不到对应份额的日志盘容量而引发报错 LOG_DISK resource not enough

问题 2

报错有关 LOG_DISK ,日志盘容量与内存规格存在何种关联?

答:LOG_DISK_SIZE:默认值为内存规格值的 3 倍,最小值为 2G。

建议

日志盘大小尽量设置为内存上限的 3 或 4 倍(生产环境至少是 3 倍),避免因日志盘不足导致集群已有的内存无法进行分配。

参考资料

  1. 《GV$OB_SERVERS》:https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000001051773
  2. 《ALTER RESOURCE UNIT》:https://www.oceanbase.com/docs/common-oceanbase-database-cn-1000000000220301

爱可生开源社区
426 声望209 粉丝

成立于 2017 年,以开源高质量的运维工具、日常分享技术干货内容、持续的全国性的社区活动为社区己任;目前开源的产品有:SQL审核工具 SQLE,分布式中间件 DBLE、数据传输组件DTLE。