This article was first published on the Nebula Graph Community WeChat official account.

KV Separation in Nebula Graph Storage: Principles and Performance Evaluation

1 Overview

Over the past decade, graph computing has continued to gain popularity in both academia and industry. At the same time, the world's data is growing exponentially, and the demands on data storage and querying keep rising. Graph databases have attracted considerable attention in this context. According to the statistics of the well-known database ranking site DB-Engines.com, graph databases have been the fastest-growing database category since 2013. Although their share is still small compared with relational databases, their graph-native data model and query optimizations targeted at relationship traversal make them a category that relational databases cannot replace. Moreover, as data volumes keep exploding, people care more and more about the relationships between data, hoping to mine those relationships for business value and for a deeper understanding of human society. We therefore believe that graph databases, which are designed from the ground up for storing and mining data relationships, will continue to grow rapidly.


Figure 1: Performance comparison between relational and graph databases in relational queries

Figure 1 shows the performance gap between relational and graph databases when querying multi-hop relationships between data (depth-first or breadth-first graph traversal). (Neo4j is used as the example here, but Nebula outperforms Neo4j in both performance and scalability.)
Although graph databases beat relational databases on relationship queries, the performance of today's mainstream graph databases on multi-hop queries (deep graph traversals) is still poor, especially at large data volumes and in distributed deployments, and storage performance is often the bottleneck. In a simple test of ours (single machine, 1 GB/s SSD, 73 GB of data), a three-hop query that returns attributes already has a P50 latency of 3.3 s and a P99 latency of 9.0 s, and as query complexity grows the system can easily become unusable.

Currently, Nebula uses RocksDB as its underlying storage engine. The main workload of a graph database is multi-level traversal of relationships, and a single query often issues many reads to the storage engine. Any performance deficit of a single RocksDB read is therefore magnified many times in a graph query, which is why multi-hop query latency is so high. Conversely, any performance gain on a single RocksDB read is amplified just as many times in a Nebula query. Adapting RocksDB to the workload of a graph database is thus the top priority for optimizing storage-side performance.

The core data structure of RocksDB is the LSM-Tree, and all keys and values are stored in it. This design has a drawback: since values are usually larger than keys, most of the space in the LSM-Tree is spent on values. Once values are particularly large, the LSM-Tree needs more levels to hold the data, and its read and write performance is directly tied to the number of levels: the deeper the tree, the more read and write amplification, and the worse the performance. We therefore propose storing graph data with KV separation: data with small values stays in the LSM-Tree, while data with large values is stored in a log. With the large values out of the LSM-Tree, the tree becomes shallower. Since adjacent RocksDB levels differ in size by a factor of 10, shaving off even one level greatly increases the chance that the whole tree fits in cache, which improves read performance. In addition, the reduction in read and write amplification brought by KV separation speeds up both reads and writes.

In our tests, KV separation gives graph queries a huge boost. For small values, query latency drops by as much as 80.7%. For large-value queries the reduction is smaller, but still reaches 52.5%. Since large-value data, although it makes up the bulk of the data volume, is mostly cold and rarely queried, the overall improvement will be closer to that of the small-value queries.

It is worth noting that Nebula has provided KV separation since version 3.0.0. Users can enable and configure it through the nebula-storaged configuration file.
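As a reference, the relevant part of the configuration might look like the following minimal sketch. The flag names and values here are assumptions for illustration, so check the configuration reference of your exact Nebula version before relying on them:

```
# nebula-storaged.conf (illustrative excerpt; verify flag names against your version's docs)

# Whether to enable RocksDB key-value separation (BlobDB).
--rocksdb_enable_kv_separation=true

# Values no smaller than this threshold (in bytes) are written to blob files
# instead of being stored inline in the SSTs.
--rocksdb_kv_separation_threshold=100
```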

2. KV separation technology

As mentioned above, the LSM-Tree suffers from serious read and write amplification, because every compaction has to rewrite both keys and values, and the amplification factor grows as the database grows (for 100 GB of data, write amplification can exceed 300x). For LevelDB this was not such a serious problem, because LevelDB was designed for HDDs, where random reads and writes are vastly slower than sequential ones (roughly a 1,000x difference); as long as the amplification factor stays below 1,000, the trade-off pays off. For SSDs, however, the gap between random and sequential I/O is much smaller, especially for NVMe SSDs (as shown in the figure below).


Figure 2. NVMe SSD performance

A lot of read and write amplification therefore just wastes bandwidth. To address this, Lu et al. proposed a KV-separated storage design (WiscKey) in 2016: https://www.usenix.org/system/files/conference/fast16/fast16-papers-lu.pdf . The core idea is to keep the relatively small keys in the LSM-Tree and put the values in a log. Each level of the LSM-Tree can then hold many more entries, so the same amount of data needs fewer levels, which makes both reads and writes more efficient. The write bandwidth saved during compaction also leaves more disk headroom for reads, which can reduce read P99 latency. Of course, the design has downsides. Unlike the strictly sorted KV pairs in an SSTable, the values in the log are only roughly ordered by key: every insertion simply appends the value to the log, and it is up to the subsequent Garbage Collection (hereinafter GC) to restore a rough ordering. As a result, large range queries over small values (for example 64 B) may perform poorly.
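To make the idea concrete, here is a minimal, purely illustrative sketch of such a write/read path. It is not Nebula's or RocksDB's actual implementation; the in-memory containers and the "@offset:size" pointer encoding are stand-ins chosen only for readability:

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// A minimal, in-memory sketch of a KV-separated store (WiscKey-style).
// std::map stands in for the LSM-Tree and std::vector<char> for the value log;
// a real engine uses on-disk structures and needs GC to reclaim stale log space.
class SeparatedStore {
public:
    explicit SeparatedStore(std::size_t threshold) : threshold_(threshold) {}

    void Put(const std::string& key, const std::string& value) {
        if (value.size() >= threshold_) {
            // Large value: append it to the value log and keep only a small
            // "pointer" (offset and size) in the tree, so the tree stays shallow.
            std::size_t offset = log_.size();
            log_.insert(log_.end(), value.begin(), value.end());
            tree_[key] = "@" + std::to_string(offset) + ":" + std::to_string(value.size());
        } else {
            // Small value: store it inline, exactly as a plain LSM-Tree would.
            tree_[key] = value;
        }
    }

    std::string Get(const std::string& key) const {
        auto it = tree_.find(key);
        if (it == tree_.end()) return "";
        const std::string& stored = it->second;
        if (!stored.empty() && stored[0] == '@') {
            // Indirect value: decode the pointer and read the bytes from the log.
            // (The '@' tag is a simplification; real systems mark indirection in metadata.)
            std::size_t colon = stored.find(':');
            std::size_t offset = std::stoul(stored.substr(1, colon - 1));
            std::size_t size = std::stoul(stored.substr(colon + 1));
            return std::string(log_.begin() + offset, log_.begin() + offset + size);
        }
        return stored;  // inline small value
    }

private:
    std::size_t threshold_;                     // e.g. 100 bytes, as used in the tests below
    std::map<std::string, std::string> tree_;   // stands in for the LSM-Tree
    std::vector<char> log_;                     // stands in for the value log / blob file
};
```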

Recent versions of RocksDB support KV separation natively (the integrated BlobDB): http://rocksdb.org/blog/2021/05/26/integrated-Blob-db.html . Separated values are stored in multiple log files (.blob files), each SST can reference multiple blob files, and blob GC is triggered during SST compaction.
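At the RocksDB API level, the integrated BlobDB is turned on through column-family options. The following is a minimal sketch; the path, blob file size, and GC settings are illustrative values, not the configuration used in the tests below:

```cpp
#include <cassert>
#include <string>

#include <rocksdb/db.h>
#include <rocksdb/options.h>

int main() {
    rocksdb::Options options;
    options.create_if_missing = true;

    // Integrated BlobDB: values at or above min_blob_size go to .blob files;
    // only the key plus a small blob reference remains in the SSTs.
    options.enable_blob_files = true;
    options.min_blob_size = 100;                        // e.g. the 100 B threshold used later
    options.blob_file_size = 256ULL << 20;              // target size of each blob file
    options.enable_blob_garbage_collection = true;      // rewrite live blobs during compaction
    options.blob_garbage_collection_age_cutoff = 0.25;  // GC the oldest 25% of blob files

    rocksdb::DB* db = nullptr;
    rocksdb::Status s = rocksdb::DB::Open(options, "/tmp/blobdb_example", &db);
    assert(s.ok());

    // A 4 KB value exceeds min_blob_size, so it is written to a blob file.
    s = db->Put(rocksdb::WriteOptions(), "comment:1", std::string(4096, 'v'));
    assert(s.ok());

    delete db;
    return 0;
}
```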

3. Nebula KV separation performance test

Here, we test the performance of Nebula's KV separation under different datasets and queries, including:

  1. Topology queries that do not read properties
  2. Attribute queries on vertices with small values and with large values, respectively
  3. Attribute queries on edges
  4. Data insertion
  5. The effect of KV separation when the dataset contains no large values

3.1 Test environment

The test uses a single physical machine (node A): 56-core Intel(R) Xeon(R) CPU E5-2697 v3 @ 2.60GHz, 256 GB RAM, with the graph data stored on a 1.5 TB NVMe SSD.

All test data is generated with LDBC ( https://github.com/ldbc ). To evaluate the effect of KV separation, we prepared two kinds of data: one with only small values and one with a mixture of large and small values. For the former we use the default LDBC settings, shown in Figure 3.


Figure 3. LDBC default Text and Comment value settings

For the latter, we set text, comment, large post, and large comment to a minimum size of 4 KB and a maximum size of 64 KB. These are the large values, and they are connected by REPLY_OF edges. We keep the default settings for Person, which is a small value connected by KNOW edges; Table 1 shows the distribution of Person value sizes.

Bucket | Percentage
0-32B | 2.67%
32B-128B | 0.04%
128B-160B | 96.60%
160B-192B | 0.69%

Table 1. Person value size distribution

Accordingly, we prepared two datasets; see Table 2.

Dataset | Value size | Dataset size | Comment
Data1 | < 200B for Person; LDBC defaults for others | 37GB | scale * 30
Data2 | < 200B for Person; min 4K, max 64K for others | 73GB | scale * 3

Table 2. Dataset Details

We use the test statements from nebula-bench, including 1-hop, 2-hop, and 3-hop queries, FindShortestPath (abbreviated FSP below), and INSERT statements. In addition, we added 5-hop queries to cover more extreme cases. Each test is repeated 5 times; below, NoSep and Sep denote the configurations without and with KV separation, respectively.

3.2 Internal structure of RocksDB after KV separation

We import the dataset Data2 both with and without KV separation. With KV separation, data whose value is larger than a threshold is stored in blob files (separated), while data below the threshold stays in the SSTs (not separated); here we use a 100 B threshold as the example.
After the import, NoSep ends up with 13 GB of data in total and 217 SSTs, while Sep ends up with 16 GB of data in total and 39 SSTs. Figures 4 and 5 show the LSM-Tree internals for NoSep and Sep, respectively.


Figure 4. Without KV separation, RocksDB has a three-level LSM-Tree


Figure 5. With KV separation, RocksDB has a two-level LSM-Tree

NoSep ends up with a three-level LSM-Tree, while Sep needs only two. Given that we configure RocksDB with L1 = 256 MB, L2 = 2.5 GB, and L3 = 25 GB, going from three levels to two can be the difference between the LSM-Tree fitting entirely in cache or not.
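A rough capacity check makes this concrete (using the level sizes configured above and the data sizes from this section; treating the first two levels as the cacheable working set is a simplification):

```latex
L_1 + L_2 = 256\,\mathrm{MB} + 2.5\,\mathrm{GB} \approx 2.75\,\mathrm{GB}
\;<\; 13\,\mathrm{GB}\ (\text{NoSep LSM-Tree})
\;\Rightarrow\; \text{NoSep must spill into } L_3\ \text{(three levels)}
```

For Sep, the SST-resident part (keys plus values below the threshold) fits within roughly those 2.75 GB, so two levels suffice and the whole LSM-Tree sits comfortably inside the 80 GB Block Cache or the OS Page Cache.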

3.3 Performance of topology query

Graph queries fall into attribute queries and topology queries. We first test the effect of KV separation on topology queries, which do not read values: a topology query issues no GET operations to RocksDB, only SEEK and NEXT. The topology tests are split into walks over the KNOW relationship and over the REPLY_OF relationship. Note that topology queries only touch edges, even though the attribute sizes of the vertices connected by these two kinds of edges differ greatly. We import the same Data2 dataset into Nebula with and without KV separation and generate the query statements from the same source data.
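In RocksDB terms, the two access patterns look roughly like the sketch below. The key layout shown ("e:"/"v:" prefixes) is a simplification for illustration, not Nebula's actual key encoding:

```cpp
#include <memory>
#include <string>

#include <rocksdb/db.h>

// Topology query: walk the out-edges of a vertex by Seek/Next over a key prefix.
// Only the keys are needed, so no Get (point lookup) is issued.
void ScanOutEdges(rocksdb::DB* db, const std::string& vertex_id) {
    const std::string prefix = "e:" + vertex_id + ":";  // simplified edge-key prefix
    std::unique_ptr<rocksdb::Iterator> it(db->NewIterator(rocksdb::ReadOptions()));
    for (it->Seek(prefix); it->Valid() && it->key().starts_with(prefix); it->Next()) {
        // Decode the destination vertex from the key and continue the traversal.
    }
}

// Attribute query: a point lookup of one record with Get. This is where LSM-Tree
// read amplification (and hence the benefit of KV separation) shows up most.
rocksdb::Status ReadVertexProps(rocksdb::DB* db, const std::string& vertex_id,
                                std::string* props) {
    return db->Get(rocksdb::ReadOptions(), "v:" + vertex_id, props);
}
```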

Query performance depends on the caching setup. RocksDB mainly relies on its own LRU cache (the Block Cache) plus the OS Page Cache: the Block Cache holds decompressed blocks, while the Page Cache holds compressed blocks. The Block Cache is faster to read from but less space-efficient. In production we recommend setting the Block Cache to about 1/3 of memory. This test compares three setups: no Block Cache (OS Page Cache only), an 80 GB Block Cache (1/3 of memory) plus the OS Page Cache, and direct I/O (neither Block Cache nor OS Page Cache).
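For reference, the three setups correspond roughly to the RocksDB options sketched below (an illustration of the knobs involved, not Nebula's actual option wiring):

```cpp
#include <memory>

#include <rocksdb/cache.h>
#include <rocksdb/options.h>
#include <rocksdb/table.h>

// Build options for the three cache configurations compared in this test.
rocksdb::Options MakeOptions(bool use_block_cache, bool direct_io) {
    rocksdb::Options options;

    rocksdb::BlockBasedTableOptions table_options;
    if (use_block_cache) {
        // ~1/3 of this machine's 256 GB RAM as an LRU Block Cache for
        // uncompressed blocks; the OS Page Cache still caches compressed blocks.
        table_options.block_cache = rocksdb::NewLRUCache(80ULL << 30);
    } else {
        table_options.no_block_cache = true;  // rely on the OS Page Cache only
    }
    options.table_factory.reset(rocksdb::NewBlockBasedTableFactory(table_options));

    if (direct_io) {
        // Direct I/O bypasses the OS Page Cache as well, so reads hit the SSD
        // directly and raw read amplification becomes fully visible.
        options.use_direct_reads = true;
        options.use_direct_io_for_flush_and_compaction = true;
    }
    return options;
}
```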


Figure 6. Performance impact of KV separation on KNOW relational topology queries

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 0.9% | 1.0% | 0.7%
Go 2 Step | 4.7% | 5.3% | 5.1%
Go 3 Step | 15.9% | 21.4% | 9.4%
Go 5 Step | 11.2% | 12.2% | 7.9%
FSP | 6.8% | 10.1% | 2.7%

Table 3a. KV separation for KNOW relation topology query latency reduction (Page Cache)

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 1.2% | 1.1% | 1.5%
Go 2 Step | 8.3% | 10.0% | 7.1%
Go 3 Step | 26.8% | 43.2% | 29.6%
Go 5 Step | 31.7% | 38.3% | 21.6%
FSP | 8.6% | 12.4% | 3.9%

Table 3b. KV separation for KNOW relation topology query latency reduction (Block Cache + Page Cache)

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 1.1% | 1.0% | 1.8%
Go 2 Step | 14.3% | 15.0% | 12.4%
Go 3 Step | 17.4% | 17.8% | 16.0%
Go 5 Step | 10.5% | 3.0% | 17.2%
FSP | 15.8% | 21.7% | -3.6%

Table 3c. KV separation latency reduction (direct I/O) for KNOW relational topology queries

Figure 6 and Table 3 show Nebula's results for the KNOW relationship queries. Given the data distribution of Data2, we tested both 100 B and 4 KB as the KV separation threshold. KV separation clearly improves query performance on Data2. For shallow traversals (Go 1 Step) the latency reduction is small, about 1% at P50 with the 100 B threshold, but for deeper queries the improvement is substantial: for Go 3 Step with the large Block Cache (Table 3b), P50 latency drops by 43.2% at the 100 B threshold. The traversal depth of FindShortestPath usually falls between a 1-hop and a deeper multi-hop query, so its improvement falls between theirs as well.

Note that Go 5 Step shows no further gain from KV separation, probably because for such deep traversals the main bottleneck is no longer storage. Compared with the gap between separation and non-separation, the difference between the 100 B and 4 KB thresholds is small, so Table 3 lists the latency reductions for the 100 B threshold only.

In addition, when a cache is available (Block Cache or Page Cache), all of the data can be cached. Even then, KV separation outperforms non-separation, which shows that in-memory data access is also more efficient after separation. A likely reason is that the SST is not an ideal data structure for in-memory access, so a smaller LSM-Tree is still noticeably faster to access than a larger one.


Figure 7. Performance of KV separation for REPLY_OF topology queries (note that Go 3 Step and Go 5 Step perform almost identically here: only 0.6% of the Go 5 Step queries returned non-zero results, so Go 5 Step walks essentially no more edges than Go 3 Step)

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 1.1% | 1.8% | 0.1%
Go 2 Step | 1.2% | 1.3% | 1.1%
Go 3 Step | 1.2% | 1.4% | 0.8%
Go 5 Step | 1.1% | 1.2% | 1.0%
FSP | 6.4% | 9.1% | 3.3%

Table 4a. KV separation for REPLY_OF relational topology query latency reduction (Page Cache)

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 0.8% | 0.9% | 0.2%
Go 2 Step | 1.2% | 1.4% | 0.2%
Go 3 Step | 1.3% | 1.6% | 0.4%
Go 5 Step | 1.3% | 1.6% | 0.7%
FSP | 5.4% | 8.0% | 2.4%

Table 4b. KV separation for REPLY_OF relational topology query latency reduction (Block Cache + Page Cache)

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 1.2% | 1.4% | 0.2%
Go 2 Step | 1.3% | 1.7% | 0.0%
Go 3 Step | 1.3% | 1.5% | 0.6%
Go 5 Step | 1.8% | 2.0% | 0.7%
FSP | 9.1% | 13.3% | -4.3%

Table 4c. Latency reduction (direct I/O) of KV separation for REPLY_OF relational topology queries

Figure 7 and Table 4 show the query results for the REPLY_OF relationship. Although the improvement from KV separation is modest here, there is still a measurable gain. Note that Go 3 Step and Go 5 Step perform almost identically: only 0.6% of the Go 5 Step queries returned non-zero results, so Go 5 Step walks essentially no more edges than Go 3 Step. Because the differences are small, we only show the comparison between separation and non-separation at the 4 KB threshold.

3.4 Attribute query performance of vertices

In this section, we test the impact of KV separation on attribute (value) queries. In theory, the read-performance penalty of a traditional LSM-Tree comes mainly from read amplification, which only manifests in GET operations, and attribute queries are exactly the queries that exercise GET.
We first test attribute queries on vertices. As before, the tests are split into walks over small values and over large values.


Figure 8. Performance of KV separation for small value attribute queries

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 36.2% | 37.6% | 32.4%
Go 2 Step | 34.0% | 35.5% | 28.6%
Go 3 Step | 37.4% | 40.3% | 29.3%
Go 5 Step | 16.3% | 19.0% | 14.3%

Table 5a. KV separation for small value attribute query latency reduction (Page Cache)

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 35.6% | 37.8% | 31.6%
Go 2 Step | 29.6% | 31.1% | 23.4%
Go 3 Step | 33.8% | 36.0% | 25.4%
Go 5 Step | 19.1% | 33.5% | 9.9%

Table 5b. Latency reduction for small value attribute queries with KV separation (Block Cache + Page Cache)

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 78.6% | 78.9% | 77.9%
Go 2 Step | 80.8% | 81.3% | 78.7%
Go 3 Step | 75.8% | 77.3% | 72.1%
Go 5 Step | 53.2% | 46.3% | 37.5%

Table 5c. Latency reduction (direct I/O) for small-value attribute queries with KV separation

Figure 8 and Table 5 show the impact of KV separation on queries that walk over small values and return them (attribute queries), implemented as queries over the KNOW relationship that return the vertex properties. With the 4 KB threshold, KV separation reduces P50 latency by up to 81.3% and P99 latency by up to 78.7%. This comes mainly from the huge improvement KV separation brings to point lookups, which is consistent with RocksDB's own KV-separation benchmarks. The gain is largest under direct I/O: KV separation improves reads by reducing read amplification and by raising the cache hit rate, but since disk I/O is far slower than memory access, the reduction in read amplification dominates, and read amplification only materializes when real I/O happens. The direct I/O numbers therefore mostly reflect the benefit of reduced read amplification.


Figure 9. Performance of KV separation for large-value attribute queries

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 7.1% | 6.7% | 8.8%
Go 2 Step | 3.8% | 4.2% | 3.8%
Go 3 Step | 1.9% | 1.8% | 1.6%
Go 5 Step | 0.5% | 1.0% | -0.6%

Table 6a. KV separation for large-value attribute query latency reduction (Page Cache)

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 0.7% | 0.5% | 1.3%
Go 2 Step | 0.9% | 0.9% | 1.0%
Go 3 Step | 1.5% | 1.6% | 1.6%
Go 5 Step | 1.4% | 1.7% | 1.0%

Table 6b. KV separation for large-value attribute query latency reduction (Block Cache + Page Cache)

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 52.3% | 52.5% | 51.1%
Go 2 Step | 26.4% | 22.9% | 24.3%
Go 3 Step | 7.6% | 1.0% | 16.8%
Go 5 Step | 2.4% | 2.4% | 3.8%

Table 6c. Latency reduction (direct I/O) for large-value attribute queries with KV separation

Figure 9 and Table 6 show the impact of KV separation on reading large values, by walking the REPLY_OF relationship and returning the attributes of the destination vertices. Here KV separation helps more for shallow traversals and has little effect on deeper ones. The main reason is the shape of the REPLY_OF relationship: 99.99% of the queries find results at 1 hop or more, but only 50.7% at 2 hops or more, and only 16.8% at 3 hops or more. As the hop count grows, the share of queries that actually have to GET an attribute shrinks. Consequently, Go 1 Step, which has the highest share of actual attribute GETs, benefits the most from KV separation, while the other queries behave more like the REPLY_OF topology queries of Section 3.3.

3.5 Attribute query performance of edges

Next, we test query performance for edge attributes. From the previous results we know that REPLY_OF edges form relatively shallow chains, so deeper queries degenerate into shallow ones. We therefore only look at edge attribute queries over the KNOW relationship.


Figure 10. Performance impact of KV separation on edge attribute query

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 0.5% | 0.6% | -0.5%
Go 2 Step | 3.3% | 4.2% | 2.8%
Go 3 Step | 11.4% | 25.8% | 1.2%
Go 5 Step | 15.0% | 24.3% | 6.6%

Table 7a. KV separation for edge attribute query latency reduction (Page Cache)

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 0.0% | 0.2% | 0.1%
Go 2 Step | -0.3% | -0.1% | -0.6%
Go 3 Step | 4.2% | 13.0% | -1.7%
Go 5 Step | 0.7% | 4.7% | -0.1%

Table 7b. KV separation for edge attribute query latency reduction (Block Cache + Page Cache)

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 0.5% | 0.2% | 1.4%
Go 2 Step | 4.0% | 4.9% | 2.1%
Go 3 Step | 12.5% | 12.9% | 10.2%
Go 5 Step | 6.9% | 1.2% | 13.4%

Table 7c. KV separation for edge attribute query latency reduction (direct I/O)

Figure 10 and Table 7 show the results. The improvement from KV separation is much smaller here: in the LDBC dataset the only edge attribute is creationDate, which is below the 100 B separation threshold used in the test, so edge attributes live in the LSM-Tree whether or not KV separation is enabled. The only difference is that after separation the LSM-Tree is smaller and caching is more effective. Hence KV separation brings only a modest improvement for edge attribute queries on the LDBC dataset.

3.6 Performance of Data Insertion

Here we also test the impact of KV separation on data insertion. Since the cache has little effect on RocksDB PUT performance, only the OS Page Cache setup is tested. We continue to use the dataset Data2, and each insert workload runs for 10 minutes to make sure compaction is triggered.


Figure 11. Performance impact of KV separation on data insertion

Figure 11 shows insertion performance for two workloads: inserting Person data (small values) and inserting Comment data (at least 4 KB, i.e. large values). KV separation clearly improves insertion of large values, mainly because it reduces compaction work. For small values, the write amplification caused by PUTs is not a major factor in RocksDB; instead, the extra write that KV separation adds when flushing to L0 (both an SST and a blob file must be written) becomes more noticeable. As a result, when values are small, KV separation slightly hurts write performance.

3.7 What if my dataset is full of small values?

We already know that when the dataset contains large values, KV separation brings large performance gains for both topology queries and value queries. What if there are no large values? Are there any side effects?

In this section we test the performance impact of KV separation without large values.

First, note that KV separation does not always improve RocksDB performance; in particular, range query performance over small values can get worse. The tests here do not cover every small-value scenario; we use the dataset Data1, generated with the default LDBC settings.


Figure 12. Performance impact of KV separation on small-value topologies and attribute queries

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 3.2% | 3.5% | 2.5%
Go 2 Step | 5.1% | 7.0% | -1.4%
Go 3 Step | 2.1% | 3.5% | 0.4%
FSP | 2.6% | 3.7% | 1.1%

Table 9a. Latency reduction for all-small-value queries (Page Cache)

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 0.2% | -1.5% | 7.7%
Go 2 Step | 11.2% | 9.3% | 19.2%
Go 3 Step | 3.4% | 1.6% | 8.2%
FSP | 5.9% | 8.3% | 1.9%

Table 9b. Latency reduction for all-small-value queries (Block Cache + Page Cache)

Test | Avg reduction | P50 reduction | P99 reduction
Go 1 Step | 17.8% | 19.1% | 15.1%
Go 2 Step | 22.2% | 24.9% | 19.1%
Go 3 Step | 58.1% | 68.1% | 28.1%
FSP | 26.2% | 35.6% | 3.2%

Table 9c. Latency reduction for all-small-value queries (direct I/O)

Figure 12 shows the results with the KV separation threshold set to 100 B on this all-small-value dataset. In Go 1 Step, Go 2 Step, and Go 3 Step we return the attributes of the destination vertices (attribute queries), while FindShortestPath is a topology query that returns no values. Although the improvement is smaller than on Data2, latency is still reduced, especially under direct I/O. Go 5 Step is not shown because its stress test hit timeouts, making those results unreliable.

In conclusion, our results show that KV separation improves performance whether the values are large or small, and whether the query is a topology query or a value query.

4 Conclusion

We propose storing graph data with KV separation: data with small values stays in the LSM-Tree, while data with large values is stored in a log. With the large values out of the LSM-Tree, the tree may become shallower, which increases the chance that it fits in cache and thus improves read performance. Even without considering the cache, the reduction in read and write amplification brought by KV separation speeds up both reads and writes. In our tests, KV separation gives graph queries a huge boost: it reduces query latency by as much as 80.7% for small values, and by up to 52.5% for large values. Since large-value data, although it makes up the bulk of the data volume, is mostly cold and rarely queried, the overall improvement will be closer to that of the small-value queries. Starting from Nebula Graph 3.0.0, KV separation can be configured through the nebula-storaged configuration file.




