This article was first published on the Nebula Graph WeChat official account, NebulaGraphCommunity.
Background
In the daily testing of Nebula Graph, we often need to deploy it on servers. To improve efficiency, we need a tool that helps us deploy quickly. The main requirements are:
- Nebula Graph can be deployed with a non-root account, so that we can set cgroups for that user to limit resources.
- The configuration files can be changed on the operating machine and then distributed to the deployed cluster, which makes it convenient to run various parameter-tuning tests.
- It can be called from scripts, which makes it convenient to integrate into our testing platform or tools later.
In terms of tool selection, Fabric and Puppet are the early tools, while Ansible and SaltStack are relatively new.
Ansible has 40K+ stars on GitHub and was acquired by Red Hat in 2015. Its community is quite active, and many open source projects provide Ansible deployment methods, such as tidb-ansible.
In summary, we use Ansible to deploy Nebula Graph.
Introduction to Ansible
Features
Ansible is an open source automated deployment tool (Ansible Tower is the commercial version). It has the following characteristics:
- The default transport is SSH, which, compared with SaltStack, requires no extra agent deployment.
- Playbooks, roles, and modules define the deployment process, which is quite flexible.
- Operations are idempotent.
- Development is modular, with a rich set of modules.
Advantages and disadvantages
- Using the SSH protocol means that, by default, most machines can be deployed through Ansible as long as you have an account and password; the downside is that performance is worse.
- Using playbooks to define the deployment process, with Python's Jinja2 as the template rendering engine, is convenient for people familiar with them, but raises the learning cost for those who are not.
In summary, Ansible suits scenarios with small batches of machines where no extra agent should be deployed, which matches our needs.
Deployment logic
For a typical offline deployment, the machines can be divided into 3 roles:
- Ansible execution machine: the machine running Ansible, which needs to be able to connect to all machines via SSH.
- External network machine: runs the tasks that need internet access, such as downloading RPM packages.
- Servers: the machines running the services; they can be network-isolated and deployed through the execution machine.
Task logic
In Ansible, there are mainly three levels of tasks:
- Module
- Role
- Playbook
Modules are divided into core modules and custom modules; they are the basic units of Ansible tasks.
When running a task, Ansible first substitutes the parameters into the module's code to generate a new Python file, uploads it to a temporary folder on the remote machine via SSH, then executes the Python file remotely via SSH and collects the output, and finally deletes the remote temporary directory.
```
# keep the remote tmp files instead of deleting them
export ANSIBLE_KEEP_REMOTE_FILES=1
# -vvv prints debug information
ansible -m ping all -vvv
```

```
<192.168.8.147> SSH: EXEC ssh -o ControlMaster=auto -o ControlPersist=30m -o ConnectionAttempts=100 -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="nebula"' -o ConnectTimeout=10 -o ControlPath=/home/vesoft/.ansible/cp/d94660cf0d -tt 192.168.8.147 '/bin/sh -c '"'"'/usr/bin/python /home/nebula/.ansible/tmp/ansible-tmp-1618982672.9659252-5332-61577192045877/AnsiballZ_ping.py && sleep 0'"'"''
```
In log output like this, AnsiballZ_ping.py is the Python file generated by the module. You can log in to that machine and execute the file to see the result:
```
python3 AnsiballZ_ping.py
# {"ping": "pong", "invocation": {"module_args": {"data": "pong"}}}
```
The standard output of the Python file is returned, and Ansible does some additional processing on the result.
A role is a series of modules executed serially as tasks; context parameters can be passed between tasks through register.
A typical example (see the sketch after this list):
- Create a directory.
- If the directory is created successfully, continue the installation; otherwise, exit the entire deployment.
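A minimal sketch of that pattern, where the deploy path and package name are illustrative assumptions: register captures the result of the first task, and a later task checks it before continuing. Note that by default a failed task already stops the run for that host; register mainly lets subsequent tasks inspect the outcome.

```
# Illustrative sketch: pass the result of one task to the next via register.
# The deploy path and package name are assumptions.
- name: create the deploy directory
  file:
    path: /home/nebula/nebula-install
    state: directory
  register: mkdir_result

- name: continue the installation only when the directory was created
  copy:
    src: nebula.rpm
    dest: /home/nebula/nebula-install/
  when: mkdir_result is succeeded
```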
A playbook associates the deployment machines with the roles.
By grouping machines in the inventory and applying different roles to different groups, a very flexible installation and deployment task can be completed.
Once the playbook is defined, the same deployment process can be reproduced in different environments simply by changing the machines configured in the inventory.
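For example, an inventory might group the machines like this (the addresses are illustrative assumptions); switching environments is then just a matter of editing these entries:

```
# inventory/hosts — illustrative grouping; addresses are assumptions
[metad]
192.168.8.101
192.168.8.102
192.168.8.103

[storaged]
192.168.8.101
192.168.8.102
192.168.8.103

[graphd]
192.168.8.103
```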
Module customization
Custom filter
Ansible uses Jinja2 as its template rendering engine, so you can use the filters that come with Jinja2, for example:
```
# use the built-in default filter; prints 5 by default
ansible -m debug -a 'msg={{ hello | default(5) }}' all
```
Sometimes, we need a custom filter to manipulate variables. A typical scenario is generating the --meta_server_addrs parameter for nebula-metad:

- When there is only one metad, the format is metad1:9559.
- When there are three metads, the format is metad1:9559,metad2:9559,metad3:9559.
In the Ansible playbook project directory, create a filter_plugins directory and, inside it, a Python file map_format.py with the following content:
```
# -*- encoding: utf-8 -*-
from jinja2.utils import soft_unicode


def map_format(value, pattern):
    """
    e.g.
    "{{ groups['metad']|map('map_format', '%s:9559')|join(',') }}"
    """
    return soft_unicode(pattern) % (value)


class FilterModule(object):
    """ jinja2 filters """

    def filters(self):
        return {
            'map_format': map_format,
        }
```
Then `{{ groups['metad']|map('map_format', '%s:9559')|join(',') }}` renders the value we want.
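For instance, in the metad configuration template the rendered address list could be used like this (the template file name is an assumption):

```
# templates/nebula-metad.conf.j2 — illustrative usage of the custom filter
--meta_server_addrs={{ groups['metad']|map('map_format', '%s:9559')|join(',') }}
```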
Custom module
A custom module needs to conform to the format of the Ansible framework, including how it gets parameters, its standard return, its error return, and so on.
The finished custom module must be placed on the module search path, configured through the library setting in ansible.cfg (or the ANSIBLE_LIBRARY environment variable), so that Ansible can find it.
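A minimal sketch of such a module, assuming a hypothetical hello module with a single name parameter (not part of Nebula's playbooks):

```
#!/usr/bin/python
# -*- coding: utf-8 -*-
# Minimal custom-module sketch; the module name and logic are illustrative.
from ansible.module_utils.basic import AnsibleModule


def main():
    module = AnsibleModule(
        argument_spec=dict(
            name=dict(type='str', required=True),  # how parameters are declared
        ),
    )
    name = module.params['name']  # getting parameters
    if not name.strip():
        module.fail_json(msg='name must not be empty')  # error return
    module.exit_json(changed=False, msg='hello %s' % name)  # standard return


if __name__ == '__main__':
    main()
```

With the library path configured, it can be called like any other module, e.g. `ansible -m hello -a 'name=nebula' all`.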
Ansible practice of Nebula Graph
Because Nebula Graph itself is not complicated to start, completing its deployment with Ansible is very simple. The steps are as follows (a sketch follows the list):
- Download the RPM packages.
- Copy the RPM packages to the deploy machines, extract them, and put them into the destination folder.
- Update the configuration files.
- Start the services via shell.
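A sketch of these steps as role tasks, under the assumption of a non-root install; the URL, paths, and variable names such as nebula_rpm_url and deploy_dir are illustrative, not Nebula's official playbook:

```
# Illustrative install tasks; all names and paths are assumptions.
- name: download the RPM package (on the machine with internet access)
  get_url:
    url: "{{ nebula_rpm_url }}"
    dest: /tmp/nebula.rpm
  delegate_to: localhost
  run_once: true

- name: copy the RPM package to the deploy machine
  copy:
    src: /tmp/nebula.rpm
    dest: "{{ deploy_dir }}/nebula.rpm"

- name: extract the RPM into the destination folder (works without root)
  shell: cd {{ deploy_dir }} && rpm2cpio nebula.rpm | cpio -idmu

- name: update the configuration file
  template:
    src: "{{ playbook_dir }}/templates/{{ module }}.conf.j2"
    dest: "{{ deploy_dir }}/etc/{{ module }}.conf"

- name: start by shell
  shell: "{{ deploy_dir }}/scripts/nebula.service start {{ module }}"
```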
Use a generic role
Nebula Graph has three components: graphd, metad, and storaged. The three components are named and started in the same way, so a common role can be used, with graphd, metad, and storaged each referencing it.
On the one hand this is easier to maintain; on the other hand the deployed services are more fine-grained. For example, if machines A, B, and C all deploy storaged but only machine C deploys graphd, then machines A and B will not have a graphd configuration file.
```
# the common role, which uses the variable: install/task/main.yml
- name: config {{ module }}.conf
  template:
    src: "{{ playbook_dir }}/templates/{{ module }}.conf.j2"
    dest: "{{ deploy_dir }}/etc/{{ module }}.conf"
```

```
# the graphd role, which passes the variable in: nebula-graphd/task/main.yml
- name: install graphd
  include_role:
    name: install
  vars:
    module: nebula-graphd
```
In the playbook, the graphd machine group runs graphd's role. If A and B are not in the graphd machine group, the graphd configuration file will not be uploaded to them.
After deploying this way, you cannot use Nebula Graph's `nebula.service start all` to start everything, because some machines will not have a nebula-graphd.conf configuration file. Instead, in the playbook you can target different machine groups and pass them different parameters:
```
# playbook start.yml
- hosts: metad
  roles:
    - op
  vars:
    - module: metad
    - op: start

- hosts: storaged
  roles:
    - op
  vars:
    - module: storaged
    - op: start

- hosts: graphd
  roles:
    - op
  vars:
    - module: graphd
    - op: start
```
This is equivalent to executing the startup script over several separate SSH connections. Although it is less efficient than start all, starting and stopping services becomes much more flexible.
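The op role itself can be a single task. A sketch, assuming Nebula's bundled nebula.service script, which takes an operation and a component name:

```
# op/task/main.yml — illustrative; relies on the bundled service script
- name: "{{ op }} {{ module }}"
  shell: "{{ deploy_dir }}/scripts/nebula.service {{ op }} {{ module }}"
```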
Use vars_prompt to end the playbook
When you only want to update the binaries without deleting the data directory, you can add vars_prompt to the remove playbook as a second confirmation. If you confirm, the binaries are removed; otherwise the playbook exits.
```
# playbook remove.yml
- hosts: all
  vars_prompt:
    - name: confirmed
      prompt: "Are you sure you want to remove the Nebula-Graph? Will delete binary only (yes/no)"
  roles:
    - remove
```
In the role, the value of the second confirmation is checked:
```
# remove/task/main.yml
---
- name: Information
  debug:
    msg: "Must input 'yes', otherwise abort the playbook"
  when:
    - confirmed != 'yes'

- meta: end_play
  when:
    - confirmed != 'yes'
```
With this in place, the playbook asks for a second confirmation before removal. If the input is not yes, the playbook run is cancelled, so you can delete only the binaries and keep the data of the Nebula cluster.