初始化GP集群
类似于postgreSQL中的initdb,只不过是集群版本。对应init gp。
主要通过gpinitsystem工具实现,需要实现做好集群配置gpinitsystem_config。主要为指定master、segment节点、segment节点上的segment数量、数据库端口号等信息。gpinitsystem也会对之前配置的一些系统设置进行检查,如ulimit。
可以指定一个数据库,也可以不指定只创建默认库,和postgreSQL的initdb不同,最终会将所有数据库实例拉起来。
步骤如下:
设置环境变量
master节点root用户下
# 切换到gpadmin用户
su - gpadmin
source /usr/local/gpdb/greenplum_path.sh
创建segment节点列表文件
mkdir -p ~/gpconfigs/
vi ~/gpconfigs/hostfile_gpinitsystem
注意不要添加多余的空行
dw1-seg1
dw1-seg2
创建gp的配置文件
cp $GPHOME/docs/cli_help/gpconfigs/gpinitsystem_config ~/gpconfigs/
vi ~/gpconfigs/gpinitsystem_config
ARRAY_NAME="dw1_array"
SEG_PREFIX=gpseg
PORT_BASE=6000
declare -a DATA_DIRECTORY=(/data/primary /data/primary)
MASTER_HOSTNAME=dw1-cn
MASTER_DIRECTORY=/data/master
MASTER_PORT=5432
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODE
DATABASE_NAME=dw1
说明:
DATA_DIRECTORY指定一个segment host上的各个segment对应的数据目录。
个数决定了在一个host上起的segment数量,每个segment对应一个postgreSQL数据库实例。如(/data/primary /data/primary)对应两个segment实例。会在该目录下新建目录作为数据目录,如
[gpadmin@dw1-seg1 vagrant]$ ls /data/primary/
gpseg0 gpseg1
还会针对同一个host上指定多个网口的情形进行合理绑定。前提这些网口对应IP在hostfile_gpinitsystem中暴露出来。
[segment数量设置](https://developer.aliyun.com/ask/54601?spm=a2c6h.13159736)
和CPU核数,块设备数量,RAID卡通道数量,网卡数量都有关系。
要做到均衡配比,资源的利用才能最合理。
你可以这样来分配,例如CPU核数乘以0.8作为segment数量,然后再均匀分配SEG对应的块设备,网卡,RAID卡,
PORT_BASE和[系统配置](https://segmentfault.com/a/1190000043041096)中的net.ipv4.ip_local_port_range对应
DATABASE_NAME默认不设置,只会创建默认的gpadmin库。指定后会自动创建指定的库,如dw1.
MIRROR_DATA_DIRECTORY指定mirror,默认没有打开
gpinitsystem -c ~/gpconfigs/gpinitsystem_config -h ~/gpconfigs/hostfile_gpinitsystem
问题(GP6下,GP7.0 beta无此问题):
抱怨找不到ifconfig和netstat
解决
yum install net-tools.x86_64
问题:
出现warning:
-[WARN]:-Host dw1-cn open files limit is 1024 should be >= 65535
而ulimit -n显示没有问题
[gpadmin@dw1-cn ~]$ ulimit -n
65536
[gpadmin@dw1-cn ~]$ ssh gpadmin@172.28.128.6 'ulimit -n'
1024
原因:
gpinitsystem是ssh到每个host再执行ulimit -n,如上这样得到的是系统默认的1024。而ssh在load limit方面可能有问题,和openssl库使用PAM库有关(参照)ssh ulimit问题
但是实际上gp对ulimit的要求是为对本地运行的数据库服务有效即可,因此可以忽略。
执行后的输出:
[gpadmin@dw1-cn ~]$ gpinitsystem -c ~/gpconfigs/gpinitsystem_config -h ~/gpconfigs/hostfile_gpinitsystem
[INFO]:-Checking configuration parameters, please wait...
[INFO]:-Reading Greenplum configuration file /home/gpadmin/gpconfigs/gpinitsystem_config
[INFO]:-Locale has not been set in /home/gpadmin/gpconfigs/gpinitsystem_config, will set to default value
[INFO]:-Locale set to en_US.utf8
[INFO]:-MASTER_MAX_CONNECT not set, will set to default value 250
[INFO]:-Checking configuration parameters, Completed
[INFO]:-Commencing multi-home checks, please wait...
[INFO]:-Configuring build for standard array
[INFO]:-Commencing multi-home checks, Completed
[INFO]:-Building primary segment instance array, please wait...
[INFO]:-Checking Master host
[INFO]:-Checking new segment hosts, please wait...
[WARN]:-Host dw1-cn open files limit is 1024 should be >= 65535
[WARN]:-Host dw1-cn open files limit is 1024 should be >= 65535
[INFO]:-Checking new segment hosts, Completed
[INFO]:-Greenplum Database Creation Parameters
[INFO]:---------------------------------------
[INFO]:-Master Configuration
[INFO]:---------------------------------------
[INFO]:-Master instance name = dw1_array
[INFO]:-Master hostname = dw1-cn
[INFO]:-Master port = 5432
[INFO]:-Master instance dir = /data/master/gpseg-1
[INFO]:-Master LOCALE = en_US.utf8
[INFO]:-Greenplum segment prefix = gpseg
[INFO]:-Master Database = dw1
[INFO]:-Master connections = 250
[INFO]:-Master buffers = 128000kB
[INFO]:-Segment connections = 750
[INFO]:-Segment buffers = 128000kB
[INFO]:-Checkpoint segments = 8
[INFO]:-Encoding = UNICODE
[INFO]:-Postgres param file = Off
[INFO]:-Initdb to be used = /usr/local/gpdb/bin/initdb
[INFO]:-GP_LIBRARY_PATH is = /usr/local/gpdb/lib
[INFO]:-HEAP_CHECKSUM is = on
[INFO]:-HBA_HOSTNAMES is = 0
[WARN]:-Ulimit check = Warnings generated, see log file <<<<<
[INFO]:-Array host connect type = Single hostname per node
[INFO]:-Master IP address [1] = ::1
[INFO]:-Master IP address [2] = x.x.x.x
[INFO]:-Master IP address [3] = 172.28.128.6
[INFO]:-Master IP address [4] = x::x:x:x:x
[INFO]:-Master IP address [5] = x::x:x:x:x
[INFO]:-Standby Master = Not Configured
[INFO]:-Number of primary segments = 2
[INFO]:-Total Database segments = 4
[INFO]:-Trusted shell = ssh
[INFO]:-Number segment hosts = 2
[INFO]:-Mirroring config = OFF
[INFO]:----------------------------------------
[INFO]:-Greenplum Primary Segment Configuration
[INFO]:----------------------------------------
[INFO]:-dw1-seg1 6000 dw1-seg1 /data/primary/gpseg0 2
[INFO]:-dw1-seg1 6001 dw1-seg1 /data/primary/gpseg1 3
[INFO]:-dw1-seg2 6000 dw1-seg2 /data/primary/gpseg2 4
[INFO]:-dw1-seg2 6001 dw1-seg2 /data/primary/gpseg3 5
选择y后成功
Continue with Greenplum creation Yy|Nn (default=N):
> y
[INFO]:-Building the Master instance database, please wait...
[INFO]:-Starting the Master in admin mode
[INFO]:-Commencing parallel build of primary segment instances
[INFO]:-Spawning parallel processes batch [1], please wait...
[INFO]:-Waiting for parallel processes batch [1], please wait...
[INFO]:------------------------------------------------
[INFO]:-Parallel process exit status
[INFO]:------------------------------------------------
[INFO]:-Total processes marked as completed = 4
[INFO]:-Total processes marked as killed = 0
[INFO]:-Total processes marked as failed = 0
[INFO]:------------------------------------------------
[INFO]:-Removing back out file
[INFO]:-No errors generated from parallel processes
[INFO]:-Restarting the Greenplum instance in production mode
[INFO]:-Starting gpstop with args: -a -l /home/gpadmin/gpAdminLogs -m -d /data/master/gpseg-1
[INFO]:-Gathering information and validating the environment...
[INFO]:-Obtaining Greenplum Master catalog information
[INFO]:-Obtaining Segment details from master...
[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 6.0.0-beta.1 build dev'
[INFO]:-Commencing Master instance shutdown with mode='smart'
[INFO]:-Master segment instance directory=/data/master/gpseg-1
[INFO]:-Stopping master segment and waiting for user connections to finish ...
[INFO]:-Attempting forceful termination of any leftover master process
[INFO]:-Terminating processes for segment /data/master/gpseg-1
[INFO]:-Starting gpstart with args: -a -l /home/gpadmin/gpAdminLogs -d /data/master/gpseg-1
[INFO]:-Gathering information and validating the environment...
[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 6.0.0-beta.1 build dev'
[INFO]:-Greenplum Catalog Version: '301908232'
[INFO]:-Starting Master instance in admin mode
[INFO]:-Obtaining Greenplum Master catalog information
[INFO]:-Obtaining Segment details from master...
[INFO]:-Setting new master era
[INFO]:-Master Started...
[INFO]:-Shutting down master
[INFO]:-Commencing parallel segment instance startup, please wait...
[INFO]:-Process results...
[INFO]:-----------------------------------------------------
[INFO]:- Successful segment starts = 4
[INFO]:- Failed segment starts = 0
[INFO]:- Skipped segment starts (segments are marked down in configuration) = 0
[INFO]:-----------------------------------------------------
[INFO]:-Successfully started 4 of 4 segment instances
[INFO]:-----------------------------------------------------
[INFO]:-Starting Master instance dw1-cn directory /data/master/gpseg-1
[INFO]:-Command pg_ctl reports Master dw1-cn instance active
[INFO]:-Connecting to dbname='template1' connect_timeout=15
[INFO]:-No standby master configured. skipping...
[INFO]:-Database successfully started
[INFO]:-Completed restart of Greenplum instance in production mode
[INFO]:-Scanning utility log file for any warning messages
[WARN]:-*******************************************************
[WARN]:-Scan of log file indicates that some warnings or errors
[WARN]:-were generated during the array creation
[INFO]:-Please review contents of log file
[INFO]:-/home/gpadmin/gpAdminLogs/gpinitsystem_20221215.log
[INFO]:-To determine level of criticality
[INFO]:-These messages could be from a previous run of the utility
[INFO]:-that was called today!
[WARN]:-*******************************************************
[INFO]:-Greenplum Database instance successfully created
[INFO]:-------------------------------------------------------
[INFO]:-To complete the environment configuration, please
[INFO]:-update gpadmin .bashrc file with the following
[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/data/master/gpseg-1"
[INFO]:- to access the Greenplum scripts for this instance:
[INFO]:- or, use -d /data/master/gpseg-1 option for the Greenplum scripts
[INFO]:- Example gpstate -d /data/master/gpseg-1
[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20221215.log
[INFO]:-To remove instance, run gpdeletesystem utility
[INFO]:-To initialize a Standby Master Segment for this Greenplum instance
[INFO]:-Review options for gpinitstandby
[INFO]:-------------------------------------------------------
[INFO]:-The Master /data/master/gpseg-1/pg_hba.conf post gpinitsystem
[INFO]:-has been configured to allow all hosts within this new
[INFO]:-array to intercommunicate. Any hosts external to this
[INFO]:-new array must be explicitly added to this file
[INFO]:-Refer to the Greenplum Admin support guide which is
[INFO]:-located in the /usr/local/gpdb/docs directory
[INFO]:-------------------------------------------------------
配置环境变量
master(包括standby)的gpadmin用户下
可以放到.bashrc中,或者单独的文件中
source /usr/local/gpdb/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master/gpseg-1
export PGPORT=5432
export PGUSER=gpadmin
export PGDATABASE=gpadmin
# 如果DATABASE_NAME指定了数据库名,如dw1
# export PGDATABASE=dw1
在master、segment上ps看数据库已经启动起来
master:
[gpadmin@dw1-cn ~]$ ps -ef|grep postgres
gpadmin 15626 1 0 06:54 ? 00:00:00 /usr/local/gpdb/bin/postgres -D /data/master/gpseg-1 -p 5432 -E
gpadmin 15627 15626 0 06:54 ? 00:00:00 postgres: 5432, master logger process
gpadmin 15629 15626 0 06:54 ? 00:00:00 postgres: 5432, checkpointer process
gpadmin 15630 15626 0 06:54 ? 00:00:00 postgres: 5432, writer process
gpadmin 15631 15626 0 06:54 ? 00:00:00 postgres: 5432, wal writer process
gpadmin 15632 15626 0 06:54 ? 00:00:00 postgres: 5432, stats collector process
gpadmin 15633 15626 0 06:54 ? 00:00:00 postgres: 5432, bgworker: dtx recovery process
gpadmin 15634 15626 0 06:54 ? 00:00:00 postgres: 5432, bgworker: ftsprobe process
gpadmin 15637 15626 0 06:54 ? 00:00:00 postgres: 5432, bgworker: sweeper process
gpadmin 15850 10619 0 07:11 pts/0 00:00:00 grep --color=auto postgres
master:
[gpadmin@dw1-cn ~]$ netstat -anp|grep 15626
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
tcp 0 0 0.0.0.0:5432 0.0.0.0:* LISTEN 15626/postgres
tcp6 0 0 :::5432 :::* LISTEN 15626/postgres
udp6 0 0 ::1:48929 ::1:48929 ESTABLISHED 15626/postgres
unix 2 [ ACC ] STREAM LISTENING 73165 15626/postgres /tmp/.s.PGSQL.5432
[gpadmin@dw1-cn ~]$ netstat |head
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
segment:
[gpadmin@dw1-seg1 vagrant]$ ps -ef|grep postgres
gpadmin 8241 1 0 06:54 ? 00:00:00 /usr/local/gpdb/bin/postgres -D /data/primary/gpseg1 -p 6001
gpadmin 8242 1 0 06:54 ? 00:00:00 /usr/local/gpdb/bin/postgres -D /data/primary/gpseg0 -p 6000
gpadmin 8243 8241 0 06:54 ? 00:00:00 postgres: 6001, logger process
gpadmin 8244 8242 0 06:54 ? 00:00:00 postgres: 6000, logger process
gpadmin 8249 8241 0 06:54 ? 00:00:00 postgres: 6001, checkpointer process
gpadmin 8250 8241 0 06:54 ? 00:00:00 postgres: 6001, writer process
gpadmin 8251 8241 0 06:54 ? 00:00:00 postgres: 6001, wal writer process
gpadmin 8252 8241 0 06:54 ? 00:00:00 postgres: 6001, stats collector process
gpadmin 8253 8241 0 06:54 ? 00:00:00 postgres: 6001, bgworker: sweeper process
gpadmin 8254 8242 0 06:54 ? 00:00:00 postgres: 6000, checkpointer process
gpadmin 8255 8242 0 06:54 ? 00:00:00 postgres: 6000, writer process
gpadmin 8256 8242 0 06:54 ? 00:00:00 postgres: 6000, wal writer process
gpadmin 8257 8242 0 06:54 ? 00:00:00 postgres: 6000, stats collector process
gpadmin 8258 8242 0 06:54 ? 00:00:00 postgres: 6000, bgworker: sweeper process
gpadmin 8410 5515 0 07:12 pts/0 00:00:00 grep --color=auto postgres
segment:
[root@dw1-seg1 vagrant]# netstat |head
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
[root@dw1-seg1 vagrant]# netstat -anp|grep 8241
tcp 0 0 0.0.0.0:6001 0.0.0.0:* LISTEN 8241/postgres
tcp6 0 0 :::6001 :::* LISTEN 8241/postgres
udp6 0 0 ::1:51463 ::1:51463 ESTABLISHED 8241/postgres
unix 2 [ ACC ] STREAM LISTENING 68927 8241/postgres /tmp/.s.PGSQL.6001
[root@dw1-seg1 vagrant]# netstat -anp|grep 8242
tcp 0 0 0.0.0.0:6000 0.0.0.0:* LISTEN 8242/postgres
tcp6 0 0 :::6000 :::* LISTEN 8242/postgres
udp6 0 0 ::1:39652 ::1:39652 ESTABLISHED 8242/postgres
unix 2 [ ACC ] STREAM LISTENING 68928 8242/postgres /tmp/.s.PGSQL.6000
登陆数据库
master节点上gpadmin下
需要首先让“配置环境变量”配置的环境变量生效
[gpadmin@dw1-cn ~]$ psql -d dw1
psql (9.4.26)
Type "help" for help.
# stop所有实例
gpstop -M fast -a
# start所有实例
gpstart -a
psql -d dw2
dw1=# CREATE TABLE products(name varchar(40),prod_id integer,supplier_id integer) DISTRIBUTED BY (prod_id);
CREATE TABLE
dw1=# insert into products values('a', 1, 1);
INSERT 0 1
dw1=# insert into products values('b', 2, 2);
INSERT 0 1
dw1=# insert into products values('c', 3, 3);
INSERT 0 1
dw1=# select * from products;
name | prod_id | supplier_id
------+---------+-------------
c | 3 | 3
a | 1 | 1
b | 2 | 2
(3 rows)
**粗体** _斜体_ [链接](http://example.com) `代码` - 列表 > 引用
。你还可以使用@
来通知其他用户。