Nebula Graph Source Code Walkthrough Series | The Client's Communication Secret: fbthrift

Overview

Nebula Clients provide APIs in multiple programming languages for interacting with Nebula Graph, and wrap the data structures returned by the server so that they are easier to use.

The languages currently supported by Nebula Clients are C++, Java, Python, Golang and Rust.

Communication framework

Nebula Clients use fbthrift (https://github.com/facebook/fbthrift) as the RPC communication framework between the server and the client, which enables cross-language interaction.

fbthrift provides three functions:

  1. Code generation: fbthrift generates serializable data structures for each target language
  2. Serialization: it serializes the generated data structures
  3. Communication: it transfers messages between the client and the server, and calls the corresponding server function when a request arrives from a client written in any of the supported languages

Example

Take the Golang client as an example to show the application of fbthrift in Nebula Graph.

  1. The definition of the Vertex structure on the server side:

    struct Vertex {
     Value vid;
     std::vector<Tag> tags;
    
     Vertex() = default;
    };
  2. The corresponding data structures are defined in src/interface/common.thrift:

    struct Tag {
        1: binary name,
        // List of <prop_name, prop_value>
        2: map<binary, Value> (cpp.template = "std::unordered_map") props,
    } (cpp.type = "nebula::Tag")

    struct Vertex {
        1: Value     vid,
        2: list<Tag> tags,
    } (cpp.type = "nebula::Vertex")

Here we define a Vertex structure, in which (cpp.type = "nebula::Vertex") indicates that this structure corresponds to nebula::Vertex of the server.

  3. fbthrift automatically generates the Golang data structures for us:

    // Attributes:
    //  - Vid
    //  - Tags
    type Vertex struct {
     Vid *Value `thrift:"vid,1" db:"vid" json:"vid"`
     Tags []*Tag `thrift:"tags,2" db:"tags" json:"tags"`
    }
    
    func NewVertex() *Vertex {
     return &Vertex{}
    }
    
    ...
    
    func (p *Vertex) Read(iprot thrift.Protocol) error { // deserialize
     ...
    }
    
    func (p *Vertex) Write(oprot thrift.Protocol) error { // serialize
     ...
    }
  4. Consider the statement MATCH (v:Person) WHERE id(v) == "ABC" RETURN v: the client requests a vertex (nebula::Vertex) from the server. After the server finds this vertex, it serializes it and sends it to the client through the transport of the RPC communication framework. When the client receives the data, it deserializes it into the corresponding data structure (type Vertex struct) defined in the client.

Client modules

In this section, we take nebula-go as an example to introduce each module of the client and its main interfaces.

  1. The configuration class Configs provides global configuration options.

    type PoolConfig struct {
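     // Timeout in ms; 0 means no timeout. Default: 0
     TimeOut time.Duration
     // Maximum idle time of each connection; a connection that has been unused for longer than this is closed and removed. 0 means connections may stay idle forever and are never closed. Default: 0
     IdleTime time.Duration
     // max_connection_pool_size: maximum number of connections in the pool. Default: 10
     MaxConnPoolSize int
     // Minimum number of idle connections. Default: 0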
     MinConnPoolSize int
    }
  2. The client session Session provides the interface that users call directly.

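    // Manages Session-specific information
    type Session struct {
     // Used for identity verification or message retry when executing commands
     sessionID  int64
     // The connection currently held
     connection *connection
     // The connection pool currently in use
     connPool   *ConnectionPool
     // Logging tool
     log        Logger
     // Stores the time zone used by the current Session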
     timezoneInfo
    }
  3. The Session interface is defined as follows:

     // Executes nGQL and returns a ResultSet. This interface is not thread-safe.
     func (session *Session) Execute(stmt string) (*ResultSet, error) {...}
     // Re-acquires a connection from the connection pool for the current Session
     func (session *Session) reConnect() error {...}
     // Signs out, releases the Session ID, and returns the connection to the pool
     func (session *Session) Release() {...}
  4. The connection pool ConnectionPool manages all connections. Its main interfaces are as follows:

    // Creates a new connection pool and initializes it with the given service addresses
    func NewConnectionPool(addresses []HostAddress, conf PoolConfig, log Logger) (*ConnectionPool, error) {...}
    // Authenticates the user and returns a Session instance
    func (pool *ConnectionPool) GetSession(username, password string) (*Session, error) {...}
  5. The connection Connection wraps the thrift network layer and provides the following interfaces:

    // Establishes a connection to the specified IP and port
    func (cn *connection) open(hostAddress HostAddress, timeout time.Duration) error {...}
    // Verifies the username and password
    func (cn *connection) authenticate(username, password string) (*graph.AuthResponse, error) {...}
    // Executes a query
    func (cn *connection) execute(sessionID int64, stmt string) (*graph.ExecutionResponse, error) {...}
    // Checks whether the connection is usable by sending "YIELD 1" with SessionId 0
    func (cn *connection) ping() bool {...}
    // Releases the sessionId on graphd
    func (cn *connection) signOut(sessionID int64) error {...}
    // Closes the connection
    func (cn *connection) close() {...}
  6. Load balancing LoadBalance is used by the connection pool.

    • Strategy: round-robin
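
The round-robin strategy can be illustrated with a short sketch. The roundRobin type, its pick method, and the host addresses below are hypothetical and are not taken from nebula-go; the sketch also assumes a non-empty address list.

```go
package main

import "fmt"

// roundRobin is a hypothetical picker that cycles through service addresses.
type roundRobin struct {
	addresses []string
	next      int // index of the address to hand out next
}

// pick returns the next address and advances the cursor, wrapping around
// at the end of the list. It assumes addresses is non-empty.
func (r *roundRobin) pick() string {
	addr := r.addresses[r.next]
	r.next = (r.next + 1) % len(r.addresses)
	return addr
}

func main() {
	// Hypothetical Graph service addresses.
	rr := &roundRobin{addresses: []string{"graphd-1:9669", "graphd-2:9669", "graphd-3:9669"}}

	// Opening ten connections spreads them almost evenly over the three hosts.
	counts := map[string]int{}
	for i := 0; i < 10; i++ {
		counts[rr.pick()]++
	}
	fmt.Println(counts)
}
```

With this scheme the number of connections per address differs by at most one, which is the "nearly equal" distribution a round-robin pool aims for.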

Module Interaction Analysis

[Figure: module interaction diagram]

  1. Connection pool

    • Initialization:

      • To use the client, the user first creates and initializes a connection pool (ConnectionPool). During initialization, the connection pool establishes connections to the Nebula service addresses specified by the user. If multiple Graph services are deployed in a cluster, the connection pool uses a round-robin strategy to balance the load and establishes a nearly equal number of connections to each address.
    • Managing connections:

      • The connection pool maintains two queues: the idle connection queue idleConnectionQueue and the in-use connection queue activeConnectionQueue. The pool periodically detects expired idle connections and closes them. When elements are added or removed, both queues are protected by a read-write lock to ensure correctness under multi-threaded execution.
      • When a Session requests a connection from the connection pool, the pool checks whether the idle connection queue holds an available connection; if so, that connection is returned directly to the Session. If there is no available connection and the total number of connections has not reached the maximum defined in the configuration, a new connection is created for the Session. If the maximum has been reached, an error is returned.
    • Generally, the connection pool needs to be closed only when the program exits; closing it disconnects all connections in the pool.
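
The acquisition logic described above (reuse an idle connection, otherwise create one up to the maximum, otherwise fail) can be sketched as follows. This is a simplified model, not the nebula-go implementation: the pool and conn types are hypothetical, a plain sync.Mutex replaces the read-write lock, and the periodic cleanup of expired idle connections is left out.

```go
package main

import (
	"container/list"
	"errors"
	"fmt"
	"sync"
)

// conn is a hypothetical placeholder for a real network connection.
type conn struct{ id int }

var errPoolExhausted = errors.New("connection pool: maximum number of connections reached")

// pool keeps idle and in-use connections in two separate queues.
type pool struct {
	mu      sync.Mutex
	idle    list.List
	active  list.List
	maxSize int
	nextID  int
}

// get returns an idle connection if one exists, creates a new one while the
// pool is below maxSize, and returns an error otherwise.
func (p *pool) get() (*conn, error) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if e := p.idle.Front(); e != nil {
		c := p.idle.Remove(e).(*conn)
		p.active.PushBack(c)
		return c, nil
	}
	if p.idle.Len()+p.active.Len() >= p.maxSize {
		return nil, errPoolExhausted
	}
	p.nextID++
	c := &conn{id: p.nextID}
	p.active.PushBack(c)
	return c, nil
}

// release moves a connection from the in-use queue back to the idle queue.
func (p *pool) release(c *conn) {
	p.mu.Lock()
	defer p.mu.Unlock()
	for e := p.active.Front(); e != nil; e = e.Next() {
		if e.Value.(*conn) == c {
			p.active.Remove(e)
			p.idle.PushBack(c)
			return
		}
	}
}

func main() {
	p := &pool{maxSize: 2}
	a, _ := p.get()
	b, _ := p.get()
	_, err := p.get() // the pool is full, so this fails
	fmt.Println(a.id, b.id, err)

	p.release(a)
	c, _ := p.get() // reuses the connection that was just released
	fmt.Println(c == a)
}
```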
  2. Client session

    • A client Session is generated by the connection pool. The user provides a username and password for verification; after verification succeeds, the user gets a Session instance and communicates with the server through the connection held by the Session. The most commonly used interface is Execute(). If an error occurs during execution, the client checks the error type; if it is a network error, the client automatically reconnects and tries to execute the statement again.
    • Note that a Session must not be used by multiple threads at the same time. The correct approach is to request one Session per thread, so that each thread uses its own Session.
    • When a Session is released, the connection it holds is put back into the idle connection queue of the connection pool so that it can be reused by other Sessions later.
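
The reconnect-and-retry behaviour of a Session can be modelled with a small sketch. The session type, the connection interface, the errNetwork sentinel, and the fakeConn test double are all hypothetical; the sketch retries exactly once, which is an assumption rather than a statement about nebula-go.

```go
package main

import (
	"errors"
	"fmt"
)

// errNetwork is a hypothetical sentinel marking transport-level failures.
var errNetwork = errors.New("network error")

// connection is a hypothetical minimal view of a pooled connection.
type connection interface {
	execute(stmt string) (string, error)
}

// session holds one connection and knows how to obtain a replacement.
type session struct {
	conn      connection
	reconnect func() (connection, error) // e.g. asks the pool for another connection
}

// Execute runs the statement. On a network error it swaps in a fresh
// connection and retries once; any other error is returned unchanged.
func (s *session) Execute(stmt string) (string, error) {
	res, err := s.conn.execute(stmt)
	if err == nil || !errors.Is(err, errNetwork) {
		return res, err
	}
	newConn, rerr := s.reconnect()
	if rerr != nil {
		return "", fmt.Errorf("reconnect after %v failed: %w", err, rerr)
	}
	s.conn = newConn
	return s.conn.execute(stmt)
}

// fakeConn simulates a connection that is either healthy or broken.
type fakeConn struct{ broken bool }

func (c *fakeConn) execute(stmt string) (string, error) {
	if c.broken {
		return "", errNetwork
	}
	return "ok: " + stmt, nil
}

func main() {
	s := &session{
		conn:      &fakeConn{broken: true},
		reconnect: func() (connection, error) { return &fakeConn{}, nil },
	}
	fmt.Println(s.Execute("YIELD 1"))
}
```

Because the session replaces its connection field during the retry, the sketch also shows why a single Session should not be shared between threads without external synchronization.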
  3. Connection

    • All connection instances are equivalent and can be held by any Session. The purpose of this design is that connections can be reused by different Sessions, which reduces the overhead of repeatedly opening and closing the Transport.
    • The connection sends the client's request to the server and returns the result to the Session.
  4. Usage example

    // Initialize connection pool
    pool, err := nebula.NewConnectionPool(hostList, testPoolConfig, log)
    if err != nil {
     log.Fatal(fmt.Sprintf("Fail to initialize the connection pool, host: %s, port: %d, %s", address, port, err.Error()))
    }
    // Close all connections in the pool when program exits
    defer pool.Close()
    
    // Create session
    session, err := pool.GetSession(username, password)
    if err != nil {
     log.Fatal(fmt.Sprintf("Fail to create a new session from connection pool, username: %s, password: %s, %s",
         username, password, err.Error()))
    }
    // Release session and return connection back to connection pool when program exits
    defer session.Release()
    
    // Execute a query
    resultSet, err := session.Execute(query)
    if err != nil {
     fmt.Print(err.Error())
    }

Returned data structures

The client encapsulates some complex query results returned by the server and adds an interface for the convenience of users.

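+----------------------------+----------------------------------------+
| Basic type of query result | Wrapped type                           |
+----------------------------+----------------------------------------+
| Null                       |                                        |
| Bool                       |                                        |
| Int64                      |                                        |
| Double                     |                                        |
| String                     |                                        |
| Time                       | TimeWrapper                            |
| Date                       |                                        |
| DateTime                   | DateTimeWrapper                        |
| List                       |                                        |
| Set                        |                                        |
| Map                        |                                        |
| Vertex                     | Node                                   |
| Edge                       | Relationship                           |
| Path                       | PathWrapper                            |
| DataSet                    | ResultSet                              |
| -                          | Record (row operations for ResultSet)  |
+----------------------------+----------------------------------------+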

For nebula::Value, the client wraps it as a ValueWrapper, which can be converted to other structures through its interfaces (e.g. node = ValueWrapper.asNode()).
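
The wrapper idea can be sketched as a small tagged union with typed accessors. The types below are local, simplified stand-ins written for this illustration (only an integer and a node variant are modelled); they are not the definitions used by any Nebula client.

```go
package main

import (
	"errors"
	"fmt"
)

// Node is a simplified stand-in for the client's wrapped vertex type.
type Node struct{ ID string }

// rawValue is a hypothetical stand-in for nebula::Value: at most one field is set.
type rawValue struct {
	intVal  *int64
	nodeVal *Node
}

// ValueWrapper hides the raw union behind typed accessors.
type ValueWrapper struct{ raw rawValue }

// IsNode reports whether the wrapped value is a vertex.
func (w ValueWrapper) IsNode() bool { return w.raw.nodeVal != nil }

// AsNode converts the value to a Node, or fails if it holds another type.
func (w ValueWrapper) AsNode() (*Node, error) {
	if w.raw.nodeVal == nil {
		return nil, errors.New("value is not a vertex")
	}
	return w.raw.nodeVal, nil
}

// AsInt converts the value to an int64, or fails if it holds another type.
func (w ValueWrapper) AsInt() (int64, error) {
	if w.raw.intVal == nil {
		return 0, errors.New("value is not an int")
	}
	return *w.raw.intVal, nil
}

func main() {
	age := int64(42)
	cells := []ValueWrapper{
		{raw: rawValue{nodeVal: &Node{ID: "Tim Duncan"}}},
		{raw: rawValue{intVal: &age}},
	}
	for _, c := range cells {
		if node, err := c.AsNode(); err == nil {
			fmt.Println("node:", node.ID)
		} else if n, err := c.AsInt(); err == nil {
			fmt.Println("int:", n)
		}
	}
}
```

Returning an error from each accessor, instead of panicking, lets callers probe a cell whose type they do not know in advance.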

Analysis of data structures

For the statement MATCH p= (v:player{name:"Tim Duncan"})-[]->(v2) RETURN p, the returned result is:

+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| p                                                                                                                                                                                                                         |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| <("Tim Duncan" :bachelor{name: "Tim Duncan", speciality: "psychology"} :player{age: 42, name: "Tim Duncan"})<-[:teammate@0 {end_year: 2016, start_year: 2002}]-("Manu Ginobili" :player{age: 41, name: "Manu Ginobili"})> |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Got 1 rows (time spent 11550/12009 us)

We can see that the returned result contains one row, and its type is a path. To get the properties of the path's end node (v2), you can do the following:

// Execute a query
resultSet, _ := session.Execute("MATCH p= (v:player{name:\"Tim Duncan\"})-[]->(v2) RETURN p")

// Get the first row of the result; the index of the first row is 0
record, err := resultSet.GetRowValuesByIndex(0)
if err != nil {
    t.Fatalf(err.Error())
}

// Take the value of the first column's cell from the first row
// valInCol0 is of type ValueWrapper
valInCol0, err := record.GetValueByIndex(0)

// Convert the ValueWrapper into a PathWrapper object
pathWrap, err := valInCol0.AsPath()

// Get the end node directly through PathWrapper's GetEndNode() interface
node, err := pathWrap.GetEndNode()

// Get all properties through node's Properties()
// props is of type map[string]*ValueWrapper
props, err := node.Properties()

Client addresses

GitHub address of each language client:

Recommended reading

"Complete Guide to Open Source Distributed Graph Database Nebula Graph", also known as the Nebula Book, covers graph database concepts and the specific usage of Nebula Graph in detail. Read it at: https://docs.nebula-graph.com.cn/site/pdf/NebulaGraph-book.pdf

