初始化GP集群

类似于postgreSQL中的initdb,只不过是集群版本。对应init gp

主要通过gpinitsystem工具实现,需要实现做好集群配置gpinitsystem_config。主要为指定master、segment节点、segment节点上的segment数量、数据库端口号等信息。gpinitsystem也会对之前配置的一些系统设置进行检查,如ulimit。

可以指定一个数据库,也可以不指定只创建默认库,和postgreSQL的initdb不同,最终会将所有数据库实例拉起来。

步骤如下:

设置环境变量

master节点root用户下

# 切换到gpadmin用户
su - gpadmin
source /usr/local/gpdb/greenplum_path.sh

创建segment节点列表文件

mkdir -p ~/gpconfigs/
vi ~/gpconfigs/hostfile_gpinitsystem
注意不要添加多余的空行

dw1-seg1
dw1-seg2

创建gp的配置文件

cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config ~/gpconfigs/
vi ~/gpconfigs/gpinitsystem_config

ARRAY_NAME="dw1_array"
SEG_PREFIX=gpseg
PORT_BASE=6000
declare -a DATA_DIRECTORY=(/data/primary /data/primary)
MASTER_HOSTNAME=dw1-cn
MASTER_DIRECTORY=/data/master
MASTER_PORT=5432
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE

DATABASE_NAME=dw1

说明:

DATA_DIRECTORY指定一个segment host上的各个segment对应的数据目录。
个数决定了在一个host上起的segment数量,每个segment对应一个postgreSQL数据库实例。如(/data/primary /data/primary)对应两个segment实例。会在该目录下新建目录作为数据目录,如
[gpadmin@dw1-seg1 vagrant]$ ls /data/primary/
gpseg0  gpseg1
还会针对同一个host上指定多个网口的情形进行合理绑定。前提这些网口对应IP在hostfile_gpinitsystem中暴露出来。

[segment数量设置](https://developer.aliyun.com/ask/54601?spm=a2c6h.13159736)
和CPU核数,块设备数量,RAID卡通道数量,网卡数量都有关系。
要做到均衡配比,资源的利用才能最合理。
你可以这样来分配,例如CPU核数乘以0.8作为segment数量,然后再均匀分配SEG对应的块设备,网卡,RAID卡,

PORT_BASE和[系统配置](https://segmentfault.com/a/1190000043041096)中的net.ipv4.ip_local_port_range对应

DATABASE_NAME默认不设置,只会创建默认的gpadmin库。指定后会自动创建指定的库,如dw1.

MIRROR_DATA_DIRECTORY指定mirror,默认没有打开

gpinitsystem -c ~/gpconfigs/gpinitsystem_config -h ~/gpconfigs/hostfile_gpinitsystem

问题(GP6下,GP7.0 beta无此问题):
抱怨找不到ifconfig和netstat
解决
yum install net-tools.x86_64
问题:
出现warning:
-[WARN]:-Host dw1-cn open files limit is 1024 should be >= 65535

而ulimit -n显示没有问题
[gpadmin@dw1-cn ~]$ ulimit -n
65536

[gpadmin@dw1-cn ~]$ ssh gpadmin@172.28.128.6 'ulimit -n'
1024

原因:
gpinitsystem是ssh到每个host再执行ulimit -n,如上这样得到的是系统默认的1024。而ssh在load limit方面可能有问题,和openssl库使用PAM库有关(参照)ssh ulimit问题
但是实际上gp对ulimit的要求是为对本地运行的数据库服务有效即可,因此可以忽略。

执行后的输出:

[gpadmin@dw1-cn ~]$ gpinitsystem -c ~/gpconfigs/gpinitsystem_config -h ~/gpconfigs/hostfile_gpinitsystem
[INFO]:-Checking configuration parameters, please wait...
[INFO]:-Reading Greenplum configuration file /home/gpadmin/gpconfigs/gpinitsystem_config
[INFO]:-Locale has not been set in /home/gpadmin/gpconfigs/gpinitsystem_config, will set to default value
[INFO]:-Locale set to en_US.utf8
[INFO]:-MASTER_MAX_CONNECT not set, will set to default value 250
[INFO]:-Checking configuration parameters, Completed
[INFO]:-Commencing multi-home checks, please wait...

[INFO]:-Configuring build for standard array
[INFO]:-Commencing multi-home checks, Completed
[INFO]:-Building primary segment instance array, please wait...

[INFO]:-Checking Master host
[INFO]:-Checking new segment hosts, please wait...
[WARN]:-Host dw1-cn open files limit is 1024 should be >= 65535
[WARN]:-Host dw1-cn open files limit is 1024 should be >= 65535

[INFO]:-Checking new segment hosts, Completed
[INFO]:-Greenplum Database Creation Parameters
[INFO]:---------------------------------------
[INFO]:-Master Configuration
[INFO]:---------------------------------------
[INFO]:-Master instance name       = dw1_array
[INFO]:-Master hostname            = dw1-cn
[INFO]:-Master port                = 5432
[INFO]:-Master instance dir        = /data/master/gpseg-1
[INFO]:-Master LOCALE              = en_US.utf8
[INFO]:-Greenplum segment prefix   = gpseg
[INFO]:-Master Database            = dw1
[INFO]:-Master connections         = 250
[INFO]:-Master buffers             = 128000kB
[INFO]:-Segment connections        = 750
[INFO]:-Segment buffers            = 128000kB
[INFO]:-Checkpoint segments        = 8
[INFO]:-Encoding                   = UNICODE
[INFO]:-Postgres param file        = Off
[INFO]:-Initdb to be used          = /usr/local/gpdb/bin/initdb
[INFO]:-GP_LIBRARY_PATH is         = /usr/local/gpdb/lib
[INFO]:-HEAP_CHECKSUM is           = on
[INFO]:-HBA_HOSTNAMES is           = 0
[WARN]:-Ulimit check               = Warnings generated, see log file <<<<<
[INFO]:-Array host connect type    = Single hostname per node
[INFO]:-Master IP address [1]      = ::1
[INFO]:-Master IP address [2]      = x.x.x.x
[INFO]:-Master IP address [3]      = 172.28.128.6
[INFO]:-Master IP address [4]      = x::x:x:x:x
[INFO]:-Master IP address [5]      = x::x:x:x:x
[INFO]:-Standby Master             = Not Configured
[INFO]:-Number of primary segments = 2
[INFO]:-Total Database segments    = 4
[INFO]:-Trusted shell              = ssh
[INFO]:-Number segment hosts       = 2
[INFO]:-Mirroring config           = OFF
[INFO]:----------------------------------------
[INFO]:-Greenplum Primary Segment Configuration
[INFO]:----------------------------------------
[INFO]:-dw1-seg1   6000    dw1-seg1        /data/primary/gpseg0    2
[INFO]:-dw1-seg1   6001    dw1-seg1        /data/primary/gpseg1    3
[INFO]:-dw1-seg2   6000    dw1-seg2        /data/primary/gpseg2    4
[INFO]:-dw1-seg2   6001    dw1-seg2        /data/primary/gpseg3    5

选择y后成功

Continue with Greenplum creation Yy|Nn (default=N):
> y
[INFO]:-Building the Master instance database, please wait...
[INFO]:-Starting the Master in admin mode
[INFO]:-Commencing parallel build of primary segment instances
[INFO]:-Spawning parallel processes    batch [1], please wait...

[INFO]:-Waiting for parallel processes batch [1], please wait...

[INFO]:------------------------------------------------
[INFO]:-Parallel process exit status
[INFO]:------------------------------------------------
[INFO]:-Total processes marked as completed           = 4
[INFO]:-Total processes marked as killed              = 0
[INFO]:-Total processes marked as failed              = 0
[INFO]:------------------------------------------------
[INFO]:-Removing back out file
[INFO]:-No errors generated from parallel processes
[INFO]:-Restarting the Greenplum instance in production mode
[INFO]:-Starting gpstop with args: -a -l /home/gpadmin/gpAdminLogs -m -d /data/master/gpseg-1
[INFO]:-Gathering information and validating the environment...
[INFO]:-Obtaining Greenplum Master catalog information
[INFO]:-Obtaining Segment details from master...
[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 6.0.0-beta.1 build dev'
[INFO]:-Commencing Master instance shutdown with mode='smart'
[INFO]:-Master segment instance directory=/data/master/gpseg-1
[INFO]:-Stopping master segment and waiting for user connections to finish ...

[INFO]:-Attempting forceful termination of any leftover master process
[INFO]:-Terminating processes for segment /data/master/gpseg-1
[INFO]:-Starting gpstart with args: -a -l /home/gpadmin/gpAdminLogs -d /data/master/gpseg-1
[INFO]:-Gathering information and validating the environment...
[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 6.0.0-beta.1 build dev'
[INFO]:-Greenplum Catalog Version: '301908232'
[INFO]:-Starting Master instance in admin mode
[INFO]:-Obtaining Greenplum Master catalog information
[INFO]:-Obtaining Segment details from master...
[INFO]:-Setting new master era
[INFO]:-Master Started...
[INFO]:-Shutting down master
[INFO]:-Commencing parallel segment instance startup, please wait...
[INFO]:-Process results...
[INFO]:-----------------------------------------------------
[INFO]:-   Successful segment starts                                            = 4
[INFO]:-   Failed segment starts                                                = 0
[INFO]:-   Skipped segment starts (segments are marked down in configuration)   = 0
[INFO]:-----------------------------------------------------
[INFO]:-Successfully started 4 of 4 segment instances
[INFO]:-----------------------------------------------------
[INFO]:-Starting Master instance dw1-cn directory /data/master/gpseg-1
[INFO]:-Command pg_ctl reports Master dw1-cn instance active
[INFO]:-Connecting to dbname='template1' connect_timeout=15
[INFO]:-No standby master configured.  skipping...
[INFO]:-Database successfully started
[INFO]:-Completed restart of Greenplum instance in production mode
[INFO]:-Scanning utility log file for any warning messages
[WARN]:-*******************************************************
[WARN]:-Scan of log file indicates that some warnings or errors
[WARN]:-were generated during the array creation
[INFO]:-Please review contents of log file
[INFO]:-/home/gpadmin/gpAdminLogs/gpinitsystem_20221215.log
[INFO]:-To determine level of criticality
[INFO]:-These messages could be from a previous run of the utility
[INFO]:-that was called today!
[WARN]:-*******************************************************
[INFO]:-Greenplum Database instance successfully created
[INFO]:-------------------------------------------------------
[INFO]:-To complete the environment configuration, please
[INFO]:-update gpadmin .bashrc file with the following
[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/data/master/gpseg-1"
[INFO]:-   to access the Greenplum scripts for this instance:
[INFO]:-   or, use -d /data/master/gpseg-1 option for the Greenplum scripts
[INFO]:-   Example gpstate -d /data/master/gpseg-1
[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20221215.log
[INFO]:-To remove instance, run gpdeletesystem utility
[INFO]:-To initialize a Standby Master Segment for this Greenplum instance
[INFO]:-Review options for gpinitstandby
[INFO]:-------------------------------------------------------
[INFO]:-The Master /data/master/gpseg-1/pg_hba.conf post gpinitsystem
[INFO]:-has been configured to allow all hosts within this new
[INFO]:-array to intercommunicate. Any hosts external to this
[INFO]:-new array must be explicitly added to this file
[INFO]:-Refer to the Greenplum Admin support guide which is
[INFO]:-located in the /usr/local/gpdb/docs directory
[INFO]:-------------------------------------------------------

配置环境变量

master(包括standby)的gpadmin用户下
可以放到.bashrc中,或者单独的文件中

source /usr/local/gpdb/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master/gpseg-1
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=gpadmin
# 如果DATABASE_NAME指定了数据库名,如dw1
# export PGDATABASE=dw1

在master、segment上ps看数据库已经启动起来
master:

[gpadmin@dw1-cn ~]$ ps -ef|grep postgres
gpadmin    15626       1  0 06:54 ?        00:00:00 /usr/local/gpdb/bin/postgres -D /data/master/gpseg-1 -p 5432 -E
gpadmin    15627   15626  0 06:54 ?        00:00:00 postgres:  5432, master logger process
gpadmin    15629   15626  0 06:54 ?        00:00:00 postgres:  5432, checkpointer process
gpadmin    15630   15626  0 06:54 ?        00:00:00 postgres:  5432, writer process
gpadmin    15631   15626  0 06:54 ?        00:00:00 postgres:  5432, wal writer process
gpadmin    15632   15626  0 06:54 ?        00:00:00 postgres:  5432, stats collector process
gpadmin    15633   15626  0 06:54 ?        00:00:00 postgres:  5432, bgworker: dtx recovery process
gpadmin    15634   15626  0 06:54 ?        00:00:00 postgres:  5432, bgworker: ftsprobe process
gpadmin    15637   15626  0 06:54 ?        00:00:00 postgres:  5432, bgworker: sweeper process
gpadmin    15850   10619  0 07:11 pts/0    00:00:00 grep --color=auto postgres

master:

[gpadmin@dw1-cn ~]$ netstat -anp|grep 15626
(Not all processes could be identified, non-owned process info
 will not be shown, you would have to be root to see it all.)
tcp        0      0 0.0.0.0:5432            0.0.0.0:*               LISTEN      15626/postgres
tcp6       0      0 :::5432                 :::*                    LISTEN      15626/postgres
udp6       0      0 ::1:48929               ::1:48929               ESTABLISHED 15626/postgres
unix  2      [ ACC ]     STREAM     LISTENING     73165    15626/postgres       /tmp/.s.PGSQL.5432
[gpadmin@dw1-cn ~]$ netstat |head
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State

segment:

[gpadmin@dw1-seg1 vagrant]$ ps -ef|grep postgres
gpadmin     8241       1  0 06:54 ?        00:00:00 /usr/local/gpdb/bin/postgres -D /data/primary/gpseg1 -p 6001
gpadmin     8242       1  0 06:54 ?        00:00:00 /usr/local/gpdb/bin/postgres -D /data/primary/gpseg0 -p 6000
gpadmin     8243    8241  0 06:54 ?        00:00:00 postgres:  6001, logger process
gpadmin     8244    8242  0 06:54 ?        00:00:00 postgres:  6000, logger process
gpadmin     8249    8241  0 06:54 ?        00:00:00 postgres:  6001, checkpointer process
gpadmin     8250    8241  0 06:54 ?        00:00:00 postgres:  6001, writer process
gpadmin     8251    8241  0 06:54 ?        00:00:00 postgres:  6001, wal writer process
gpadmin     8252    8241  0 06:54 ?        00:00:00 postgres:  6001, stats collector process
gpadmin     8253    8241  0 06:54 ?        00:00:00 postgres:  6001, bgworker: sweeper process
gpadmin     8254    8242  0 06:54 ?        00:00:00 postgres:  6000, checkpointer process
gpadmin     8255    8242  0 06:54 ?        00:00:00 postgres:  6000, writer process
gpadmin     8256    8242  0 06:54 ?        00:00:00 postgres:  6000, wal writer process
gpadmin     8257    8242  0 06:54 ?        00:00:00 postgres:  6000, stats collector process
gpadmin     8258    8242  0 06:54 ?        00:00:00 postgres:  6000, bgworker: sweeper process
gpadmin     8410    5515  0 07:12 pts/0    00:00:00 grep --color=auto postgres

segment:

[root@dw1-seg1 vagrant]# netstat |head
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State

[root@dw1-seg1 vagrant]# netstat -anp|grep 8241
tcp        0      0 0.0.0.0:6001            0.0.0.0:*               LISTEN      8241/postgres
tcp6       0      0 :::6001                 :::*                    LISTEN      8241/postgres
udp6       0      0 ::1:51463               ::1:51463               ESTABLISHED 8241/postgres
unix  2      [ ACC ]     STREAM     LISTENING     68927    8241/postgres        /tmp/.s.PGSQL.6001
[root@dw1-seg1 vagrant]# netstat -anp|grep 8242
tcp        0      0 0.0.0.0:6000            0.0.0.0:*               LISTEN      8242/postgres
tcp6       0      0 :::6000                 :::*                    LISTEN      8242/postgres
udp6       0      0 ::1:39652               ::1:39652               ESTABLISHED 8242/postgres
unix  2      [ ACC ]     STREAM     LISTENING     68928    8242/postgres        /tmp/.s.PGSQL.6000

登陆数据库
master节点上gpadmin下
需要首先让“配置环境变量”配置的环境变量生效

[gpadmin@dw1-cn ~]$ psql -d dw1
psql (9.4.26)
Type "help" for help.
# stop所有实例
gpstop -M fast -a

# start所有实例
gpstart -a

psql -d dw2
dw1=# CREATE TABLE products(name varchar(40),prod_id integer,supplier_id integer) DISTRIBUTED BY (prod_id);
CREATE TABLE
dw1=# insert into  products values('a', 1, 1);
INSERT 0 1
dw1=# insert into  products values('b', 2, 2);
INSERT 0 1
dw1=# insert into  products values('c', 3, 3);
INSERT 0 1
dw1=# select * from products;
 name | prod_id | supplier_id
------+---------+-------------
 c    |       3 |           3
 a    |       1 |           1
 b    |       2 |           2
(3 rows)

黑暗森林
12 声望2 粉丝

弱小和无知不是生存的障碍,傲慢才是!


引用和评论

1 篇内容引用
0 条评论