The preface to this series explained where the source code interpretation series came from. This overview article walks through Nebula Graph's architecture, the layout of its code repositories, its code structure, and its module organization.
1. Architecture
Nebula Graph is an open source distributed graph database. It separates storage from computation, so the two layers are decoupled and can evolve independently. Beyond the database kernel, we also provide many peripheral tools for data import, monitoring, deployment, visualization, graph computation, and more.
For Nebula's design, please refer to "Overview of Graph Databases and Nebula's Practice in Graph Database Design".
The overall architecture design is shown in the figure below:
The query engine is stateless, which makes horizontal scaling straightforward. It consists of several main parts: syntax analysis, semantic analysis, the optimizer, and the execution engine.
For the detailed design, please refer to "Query Engine Design for Graph Databases" and "Nebula Graph 2.0 Query Engine".
The query engine architecture design is shown in the following figure:
Storage consists of two parts: metadata storage, which we call the Meta Service, and data storage, which we call the Storage Service.
The Storage Service has three layers: the bottom layer is the Store Engine; the middle layer is the Consensus layer, which implements Multi Group Raft; the top layer is the Storage interfaces, which define a series of graph-related APIs.
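The three-layer split can be pictured with a toy sketch. Everything in it is hypothetical: the class names `StoreEngine`, `ConsensusLayer`, and `StorageInterface` are illustrative, integer vertex IDs are assumed, and the middle layer does not run Raft at all. It simply applies each write to every replica of a partition, which is enough to show where replication sits between the local store and the graph API.

```python
class StoreEngine:
    """Bottom layer: a local key-value store (a dict stands in for a real engine)."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)


class ConsensusLayer:
    """Middle layer: one replica group per partition.

    Real Raft elects a leader and replicates a log. This toy version only
    applies every write to all replicas of the partition, in order.
    """

    def __init__(self, num_partitions, replicas_per_partition):
        self.groups = [
            [StoreEngine() for _ in range(replicas_per_partition)]
            for _ in range(num_partitions)
        ]

    def _group_for(self, partition_key):
        # Assumption: integer keys, mapped to a partition by modulo.
        return self.groups[partition_key % len(self.groups)]

    def replicated_put(self, partition_key, key, value):
        for replica in self._group_for(partition_key):
            replica.put(key, value)

    def read(self, partition_key, key):
        # Read from the first replica, standing in for the group leader.
        return self._group_for(partition_key)[0].get(key)


class StorageInterface:
    """Top layer: a graph-shaped API built on the replicated key-value layer."""

    def __init__(self, consensus):
        self._consensus = consensus

    def add_vertex(self, vid, props):
        self._consensus.replicated_put(vid, ("vertex", vid), props)

    def get_vertex(self, vid):
        return self._consensus.read(vid, ("vertex", vid))


storage = StorageInterface(ConsensusLayer(num_partitions=3, replicas_per_partition=3))
storage.add_vertex(7, {"name": "Tim"})
print(storage.get_vertex(7))
```

The caller only sees `add_vertex` and `get_vertex`; partitioning and replication stay hidden behind the top layer, which is the point of the layering.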
For detailed design, please refer to "Graph Database Storage Design" .
The storage engine architecture design is shown in the following figure:
2. Overview of Code Repository
Welcome to the vesoft code repositories (vesoft is the developer of the Nebula Graph graph database).
The current Nebula product line covers the graph database kernel, clients, tools, test frameworks, compilation, visualization, monitoring, and more.
This article briefly introduces the code structure of Nebula Graph's main repositories and explains the basic function of each module. More detailed design notes will follow in later articles. We hope it helps community readers understand Nebula Graph better and contribute to the Nebula community, for example by submitting features, fixing bugs, or improving documentation.
The following lists most of the code repositories under vesoft-inc:
- nebula : Nebula 1.0 kernel code
- nebula-graph : Nebula 2.0 query engine
- nebula-storage : Nebula 2.0 storage engine
- nebula-common : Nebula 2.0 kernel toolkit
Nebula Clients
- nebula-java : Java client
- nebula-cpp : CPP client
- nebula-go : Go client
- nebula-python : Python client
Nebula Tools
- nebula-importer : a high-performance data import tool based on Go client
- nebula-spark-utils : a collection of Spark-based tools: Spark Connector, Exchange, and Algorithm
- nebula-br : backup and recovery tool
- nebula-ansible , nebula-operator : deployment tools
Nebula Test
- nebula-bench : stress and performance test project
- nebula-chaos : chaos testing project
Compilation
- nebula-third-party : third-party packages that the Nebula Graph kernel depends on
- nebula-gears : toolchain for building the Nebula Graph kernel
Visualization
- nebula-graph-studio : Nebula Graph visualization tool
3. Code structure and module description
3.1 Nebula Graph
├── cmake
├── conf
├── LICENSES
├── package
├── resources
├── scripts
├── src
│   ├── context
│   ├── daemons
│   ├── executor
│   ├── optimizer
│   ├── parser
│   ├── planner
│   ├── scheduler
│   ├── service
│   ├── session
│   ├── stats
│   ├── util
│   ├── validator
│   └── visitor
└── tests
    ├── admin
    ├── bench
    ├── common
    ├── data
    ├── job
    ├── maintain
    ├── mutate
    ├── query
    └── tck
- conf/: Query engine configuration file directory
- package/: graph packaging script
- resources/: resource files
- scripts/: startup script
src/: Query engine source code directory
- src/context/: query context information, including the AST (abstract syntax tree), the execution plan, execution results, and other computation-related resources
- src/daemons/: main process of the query engine
- src/executor/: executors, the implementation of each operator
- src/optimizer/: RBO (rule-based optimization) implementation and the optimization rules
- src/parser/: lexical analysis, syntax analysis, and AST structure definitions
- src/planner/: operator definitions and execution plan generation
- src/scheduler/: the scheduler that runs the execution plan
- src/service/: query engine service layer, which provides authentication and the interface for executing queries
- src/session/: session management
- src/stats/: execution statistics, such as P99 and slow-query statistics
- src/util/: utility functions
- src/validator/: semantic analysis implementation, which checks for semantic errors and performs some simple rewrite optimizations
- src/visitor/: expression visitors, used to extract information from expressions or to optimize them
- tests/: BDD-based integration testing framework that tests all functions provided by Nebula Graph
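To make the division of labor between these directories concrete, here is a minimal sketch of how a query might flow through parser, validator, planner, optimizer, and executor. It is an illustration under stated assumptions, not Nebula Graph's implementation: the `SHOW <field> LIMIT <n>` syntax is invented (it is not nGQL), the operators are plain tuples, and the single optimization rule (moving `Limit` ahead of `Project`) is a made-up example of a rule-based rewrite.

```python
from dataclasses import dataclass

# Hypothetical in-memory data that stands in for the storage layer.
DATA = [
    {"name": "Ann", "age": 30},
    {"name": "Bob", "age": 25},
    {"name": "Cid", "age": 41},
]


@dataclass
class Ast:
    field: str
    limit: int


def parse(text):
    """Parser stage: turn 'SHOW <field> LIMIT <n>' into an AST."""
    tokens = text.split()
    if len(tokens) != 4 or tokens[0].upper() != "SHOW" or tokens[2].upper() != "LIMIT":
        raise SyntaxError(f"cannot parse: {text!r}")
    return Ast(field=tokens[1], limit=int(tokens[3]))


def validate(ast):
    """Validator stage: semantic checks that the parser cannot do."""
    if ast.field not in DATA[0]:
        raise ValueError(f"unknown field: {ast.field}")
    if ast.limit < 0:
        raise ValueError("LIMIT must be non-negative")
    return ast


def plan(ast):
    """Planner stage: build a list of operators, run first to last."""
    return [("Scan",), ("Project", ast.field), ("Limit", ast.limit)]


def optimize(operators):
    """Optimizer stage: one rule that moves Limit ahead of Project."""
    ops = list(operators)
    for i in range(len(ops) - 1):
        if ops[i][0] == "Project" and ops[i + 1][0] == "Limit":
            ops[i], ops[i + 1] = ops[i + 1], ops[i]
    return ops


def execute(operators):
    """Executor stage: each operator transforms the current row set."""
    rows = []
    for op in operators:
        if op[0] == "Scan":
            rows = list(DATA)
        elif op[0] == "Project":
            rows = [{op[1]: row[op[1]]} for row in rows]
        elif op[0] == "Limit":
            rows = rows[: op[1]]
    return rows


def run(text):
    return execute(optimize(plan(validate(parse(text)))))


print(run("SHOW name LIMIT 2"))
```

Because each stage only consumes the output of the previous one, a rule such as the `Limit` rewrite can be added or removed without touching the parser or the executor.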
3.2 Nebula Storage
├── cmake
├── conf
├── docker
├── docs
├── LICENSES
├── package
├── scripts
└── src
    ├── codec
    ├── daemons
    ├── kvstore
    ├── meta
    ├── mock
    ├── storage
    ├── tools
    ├── utils
    └── version
- conf/: storage engine configuration file directory
- package/: storage packaging script
- scripts/: startup script
src/: storage engine source code directory
- src/codec/: serialization and deserialization tools
- src/daemons/: main processes of the storage engine and the metadata engine
- src/kvstore/: implementation of the Raft-based distributed KV store
- src/meta/: KVStore-based implementation of the metadata management service, which handles metadata, cluster management, long-running job management, and so on
- src/storage/: implementation of the graph data storage engine on top of KVStore
- src/tools/: implementations of some small tools
- src/utils/: utility functions
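The idea of building a graph storage engine on top of a key-value store can be sketched as follows. This is only an illustration: the key layout (`("V", vid)` for vertices, `("E", src, edge_type, dst)` for edges) is a hypothetical choice made for this example and is not Nebula Graph's actual encoding, and `OrderedKV` is a toy in-memory store with no replication.

```python
import bisect


class OrderedKV:
    """A tiny ordered key-value store; keys are tuples kept in sorted order."""

    def __init__(self):
        self._keys = []
        self._values = {}

    def put(self, key, value):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def scan_prefix(self, prefix):
        """Yield (key, value) for every key that starts with `prefix`."""
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if key[: len(prefix)] != prefix:
                break
            yield key, self._values[key]


class GraphStore:
    """Graph operations expressed purely as key-value puts and prefix scans."""

    def __init__(self, kv):
        self._kv = kv

    def add_vertex(self, vid, props):
        # Hypothetical vertex key layout: ("V", vid)
        self._kv.put(("V", vid), props)

    def add_edge(self, src, edge_type, dst, props=None):
        # Hypothetical edge key layout: ("E", src, edge_type, dst)
        self._kv.put(("E", src, edge_type, dst), props or {})

    def out_neighbors(self, src, edge_type):
        # All edges leaving `src` with this type share a key prefix,
        # so they sit next to each other in the ordered store.
        return [key[3] for key, _ in self._kv.scan_prefix(("E", src, edge_type))]


g = GraphStore(OrderedKV())
g.add_edge(1, "follow", 2)
g.add_edge(1, "follow", 3)
g.add_edge(1, "like", 9)
g.add_edge(2, "follow", 1)
print(g.out_neighbors(1, "follow"))
```

Putting the source vertex first in the edge key is the design choice that matters here: a neighbor lookup becomes one contiguous range scan instead of a search over all edges.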
3.3 Nebula Common
├── cmake
│   └── nebula
├── LICENSES
├── src
│   └── common
│       ├── algorithm
│       ├── base
│       ├── charset
│       ├── clients
│       ├── concurrent
│       ├── conf
│       ├── context
│       ├── cpp
│       ├── datatypes
│       ├── encryption
│       ├── expression
│       ├── fs
│       ├── function
│       ├── graph
│       ├── hdfs
│       ├── http
│       ├── interface
│       ├── meta
│       ├── network
│       ├── plugin
│       ├── process
│       ├── session
│       ├── stats
│       ├── test
│       ├── thread
│       ├── thrift
│       ├── time
│       ├── version
│       └── webservice
└── third-party
The Nebula Common repository is a toolkit for the Nebula kernel code and provides efficient implementations of common utilities. Most of these utilities will be familiar to any engineer, so only the directories closely related to the graph database are described here.
- src/common/clients/: C++ implementations of the meta and storage clients
- src/common/datatypes/: definitions of the data types used in Nebula Graph and their operations, such as string, int, bool, float, Vertex, and Edge
- src/common/expression/: definitions of expressions in nGQL
- src/common/function/: definitions of functions in nGQL
- src/common/interface/: interface definitions of the graph, meta, and storage services
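As a rough picture of how expression definitions and expression visitors (src/visitor/ in the query engine) relate, the sketch below defines a small expression tree and two visitors: one that extracts which properties an expression reads, and one that evaluates it against a row. All class names here (`Constant`, `Property`, `Binary`, `PropertyCollector`, `Evaluator`) are hypothetical and chosen for illustration; they are not the classes found in Nebula Common.

```python
import operator
from dataclasses import dataclass


@dataclass
class Constant:
    value: object

    def accept(self, visitor):
        return visitor.visit_constant(self)


@dataclass
class Property:
    name: str

    def accept(self, visitor):
        return visitor.visit_property(self)


@dataclass
class Binary:
    op: str
    left: object
    right: object

    def accept(self, visitor):
        return visitor.visit_binary(self)


class PropertyCollector:
    """Visitor that extracts the names of all properties an expression reads."""

    def __init__(self):
        self.names = set()

    def visit_constant(self, expr):
        pass

    def visit_property(self, expr):
        self.names.add(expr.name)

    def visit_binary(self, expr):
        expr.left.accept(self)
        expr.right.accept(self)


class Evaluator:
    """Visitor that computes the value of an expression for one row."""

    OPS = {"+": operator.add, ">": operator.gt, "AND": lambda a, b: a and b}

    def __init__(self, row):
        self._row = row

    def visit_constant(self, expr):
        return expr.value

    def visit_property(self, expr):
        return self._row[expr.name]

    def visit_binary(self, expr):
        return self.OPS[expr.op](expr.left.accept(self), expr.right.accept(self))


# Represents: (age + 1) > 30 AND active
expr = Binary(
    "AND",
    Binary(">", Binary("+", Property("age"), Constant(1)), Constant(30)),
    Property("active"),
)

collector = PropertyCollector()
expr.accept(collector)
print(sorted(collector.names))
print(expr.accept(Evaluator({"age": 30, "active": True})))
```

Keeping the tree definition separate from the visitors means new analyses can be added as new visitor classes without changing the expression classes themselves.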
That concludes this overview.
Want to discuss graph database technology? Join the Nebula community group by filling in your contact card, and the Nebula assistant will add you to the group.