This article was first published on the Nebula Graph WeChat official account, NebulaGraphCommunity.
Background
In the daily testing of Nebula Graph, we often need to deploy it on servers. To improve efficiency, we need a tool that helps us deploy quickly. The main requirements are:
- Nebula Graph can be deployed with a non-root account, so that we can set cgroups for that user to limit resources.
- The configuration files can be changed on the operating machine and then distributed to the deployed cluster, which makes it convenient to run various parameter-tuning tests.
- It can be called from scripts, which makes it convenient to integrate into our testing platform or tools later.
In terms of tool selection, Fabric and Puppet are the early tools, while Ansible and SaltStack are relatively new.
Ansible has 40K+ stars on GitHub and was acquired by Red Hat in 2015. Its community is quite active, and many open source projects provide Ansible deployment methods, such as tidb-ansible.
In summary, we use Ansible to deploy Nebula Graph.
Introduction to Ansible
Features
Ansible is an open source automated deployment tool (Ansible Tower is the commercial version). It has the following characteristics:
- The default transport is SSH, which, compared with SaltStack, requires no extra agent deployment.
- Playbooks, roles, and modules define the deployment process, which is quite flexible.
- Operations are idempotent.
- Development is modular, with a rich set of modules.
Advantages and disadvantages
- Using the SSH protocol means that, by default, most machines can be deployed through Ansible as long as you have an account and password; the downside is that performance is worse.
- Using playbooks to define the deployment process, with Python's Jinja2 as the template rendering engine, is convenient for people familiar with them, but raises the learning cost for those who are not.
In summary, Ansible suits scenarios with small batches of machines where no extra agent should be deployed, which matches our needs.
Deployment logic
For a typical offline deployment, the machines can be divided into 3 roles:
- Ansible execution machine: the machine running Ansible, which needs to be able to connect to all machines via SSH.
- External network machine: runs the tasks that need internet access, such as downloading RPM packages.
- Servers: the machines running the services; they can be network-isolated and deployed through the execution machine.
Task logic
In Ansible, there are mainly three levels of tasks:
- Module
- Role
- Playbook
Modules are divided into core modules and custom modules; they are the basic units of Ansible tasks.
When running a task, Ansible first substitutes the parameters into the module's code to generate a new Python file, uploads it to a temporary folder on the remote machine via SSH, then executes the Python file remotely via SSH and collects the output, and finally deletes the remote temporary directory.
```
# keep the remote tmp files instead of deleting them
export ANSIBLE_KEEP_REMOTE_FILES=1
# -vvv prints debug information
ansible -m ping all -vvv
```

```
<192.168.8.147> SSH: EXEC ssh -o ControlMaster=auto -o ControlPersist=30m -o ConnectionAttempts=100 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="nebula"' -o ConnectTimeout=10 -o ControlPath=/home/vesoft/.ansible/cp/d94660cf0d -tt 192.168.8.147 '/bin/sh -c '"'"'/usr/bin/python /home/nebula/.ansible/tmp/ansible-tmp-1618982672.9659252-5332-61577192045877/AnsiballZ_ping.py && sleep 0'"'"''
```
In log output like this, AnsiballZ_ping.py is the Python file generated by the module. You can log in to that machine and execute the file to see the result:
```
python3 AnsiballZ_ping.py
# {"ping": "pong", "invocation": {"module_args": {"data": "pong"}}}
```
The standard output of the Python file is returned, and Ansible does some additional processing on the result.
A role is a series of modules executed serially as tasks; context parameters can be passed between tasks through register.
A typical example (see the sketch after this list):
- Create a directory.
- If the directory is created successfully, continue the installation; otherwise, exit the entire deployment.
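A minimal sketch of that pattern, where the deploy path and package name are illustrative assumptions: register captures the result of the first task, and a later task checks it before continuing. Note that by default a failed task already stops the run for that host; register mainly lets subsequent tasks inspect the outcome.

```
# Illustrative sketch: pass the result of one task to the next via register.
# The deploy path and package name are assumptions.
- name: create the deploy directory
  file:
    path: /home/nebula/nebula-install
    state: directory
  register: mkdir_result

- name: continue the installation only when the directory was created
  copy:
    src: nebula.rpm
    dest: /home/nebula/nebula-install/
  when: mkdir_result is succeeded
```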
A playbook associates the deployment machines with the roles.
By grouping machines in the inventory and applying different roles to different groups, a very flexible installation and deployment task can be completed.
Once the playbook is defined, the same deployment process can be reproduced in different environments simply by changing the machines configured in the inventory.
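For example, an inventory might group the machines like this (the addresses are illustrative assumptions); switching environments is then just a matter of editing these entries:

```
# inventory/hosts — illustrative grouping; addresses are assumptions
[metad]
192.168.8.101
192.168.8.102
192.168.8.103

[storaged]
192.168.8.101
192.168.8.102
192.168.8.103

[graphd]
192.168.8.103
```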
Module customization
Custom filter
Ansible uses Jinja2 as its template rendering engine, so you can use the filters that come with Jinja2, for example:
```
# use the built-in default filter; prints 5 by default
ansible -m debug -a 'msg={{ hello | default(5) }}' all
```
Sometimes, we need a custom filter to manipulate variables. A typical scenario is generating the --meta_server_addrs parameter for nebula-metad:

- When there is only one metad, the format is metad1:9559.
- When there are three metads, the format is metad1:9559,metad2:9559,metad3:9559.
In the Ansible playbook project directory, create a filter_plugins directory and, inside it, a Python file map_format.py with the following content:
```
# -*- encoding: utf-8 -*-
from jinja2.utils import soft_unicode


def map_format(value, pattern):
    """
    e.g.
    "{{ groups['metad']|map('map_format', '%s:9559')|join(',') }}"
    """
    return soft_unicode(pattern) % (value)


class FilterModule(object):
    """ jinja2 filters """

    def filters(self):
        return {
            'map_format': map_format,
        }
```
Then `{{ groups['metad']|map('map_format', '%s:9559')|join(',') }}` renders the value we want.
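For instance, in the metad configuration template the rendered address list could be used like this (the template file name is an assumption):

```
# templates/nebula-metad.conf.j2 — illustrative usage of the custom filter
--meta_server_addrs={{ groups['metad']|map('map_format', '%s:9559')|join(',') }}
```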
Custom module
A custom module needs to conform to the format of the Ansible framework, including how it gets parameters, its standard return, its error return, and so on.
The finished custom module must be placed on the module search path, configured through the library setting in ansible.cfg (or the ANSIBLE_LIBRARY environment variable), so that Ansible can find it.
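A minimal sketch of such a module, assuming a hypothetical hello module with a single name parameter (not part of Nebula's playbooks):

```
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Minimal custom-module sketch; the module name and logic are illustrative.
from ansible.module_utils.basic import AnsibleModule


def main():
    module = AnsibleModule(
        argument_spec=dict(
            name=dict(type='str', required=True),  # how parameters are declared
        ),
    )
    name = module.params['name']  # getting parameters
    if not name.strip():
        module.fail_json(msg='name must not be empty')  # error return
    module.exit_json(changed=False, msg='hello %s' % name)  # standard return


if __name__ == '__main__':
    main()
```

With the library path configured, it can be called like any other module, e.g. `ansible -m hello -a 'name=nebula' all`.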
Ansible practice of Nebula Graph
Because Nebula Graph itself is not complicated to start, completing its deployment with Ansible is very simple. The steps are as follows (a sketch follows the list):
- Download the RPM packages.
- Copy the RPM packages to the deploy machines, extract them, and put them into the destination folder.
- Update the configuration files.
- Start the services via shell.
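A sketch of these steps as role tasks, under the assumption of a non-root install; the URL, paths, and variable names such as nebula_rpm_url and deploy_dir are illustrative, not Nebula's official playbook:

```
# Illustrative install tasks; all names and paths are assumptions.
- name: download the RPM package (on the machine with internet access)
  get_url:
    url: "{{ nebula_rpm_url }}"
    dest: /tmp/nebula.rpm
  delegate_to: localhost
  run_once: true

- name: copy the RPM package to the deploy machine
  copy:
    src: /tmp/nebula.rpm
    dest: "{{ deploy_dir }}/nebula.rpm"

- name: extract the RPM into the destination folder (works without root)
  shell: cd {{ deploy_dir }} && rpm2cpio nebula.rpm | cpio -idmu

- name: update the configuration file
  template:
    src: "{{ playbook_dir }}/templates/{{ module }}.conf.j2"
    dest: "{{ deploy_dir }}/etc/{{ module }}.conf"

- name: start by shell
  shell: "{{ deploy_dir }}/scripts/nebula.service start {{ module }}"
```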
Use a generic role
Nebula Graph has three components: graphd, metad, and storaged. The three components are named and started in the same way, so a common role can be used, with graphd, metad, and storaged each referencing it.
On the one hand this is easier to maintain; on the other hand the deployed services are more fine-grained. For example, if machines A, B, and C all deploy storaged but only machine C deploys graphd, then machines A and B will not have a graphd configuration file.
```
# the common role, which uses the variable: install/task/main.yml
- name: config {{ module }}.conf
  template:
    src: "{{ playbook_dir }}/templates/{{ module }}.conf.j2"
    dest: "{{ deploy_dir }}/etc/{{ module }}.conf"
```

```
# the graphd role, which passes the variable in: nebula-graphd/task/main.yml
- name: install graphd
  include_role:
    name: install
  vars:
    module: nebula-graphd
```
In the playbook, the graphd machine group runs graphd's role. If A and B are not in the graphd machine group, the graphd configuration file will not be uploaded to them.
After deploying this way, you cannot use Nebula Graph's `nebula.service start all` to start everything, because some machines will not have a nebula-graphd.conf configuration file. Instead, in the playbook you can target different machine groups and pass them different parameters:
```
# playbook start.yml
- hosts: metad
  roles:
    - op
  vars:
    - module: metad
    - op: start

- hosts: storaged
  roles:
    - op
  vars:
    - module: storaged
    - op: start

- hosts: graphd
  roles:
    - op
  vars:
    - module: graphd
    - op: start
```
This is equivalent to executing the startup script over several separate SSH connections. Although it is less efficient than start all, starting and stopping services becomes much more flexible.
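The op role itself can be a single task. A sketch, assuming Nebula's bundled nebula.service script, which takes an operation and a component name:

```
# op/task/main.yml — illustrative; relies on the bundled service script
- name: "{{ op }} {{ module }}"
  shell: "{{ deploy_dir }}/scripts/nebula.service {{ op }} {{ module }}"
```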
Use vars_prompt to end the playbook
When you only want to update the binaries without deleting the data directory, you can add vars_prompt to the remove playbook as a second confirmation. If you confirm, the binaries are removed; otherwise the playbook exits.
```
# playbook remove.yml
- hosts: all
  vars_prompt:
    - name: confirmed
      prompt: "Are you sure you want to remove the Nebula-Graph? Will delete binary only (yes/no)"
  roles:
    - remove
```
In the role, the value of the second confirmation is checked:
```
# remove/task/main.yml
---
- name: Information
  debug:
    msg: "Must input 'yes', otherwise abort the playbook"
  when:
    - confirmed != 'yes'

- meta: end_play
  when:
    - confirmed != 'yes'
```
With this in place, the playbook asks for a second confirmation before removal. If the input is not yes, the playbook run is cancelled, so you can delete only the binaries and keep the data of the Nebula cluster.