可能的树 | Dolt 文档

  • Prolly Tree: Short for "Probabilistic B-tree", coined by Noms developers. It's closely related to B-trees and useful, especially as the storage engine for version controlled databases.
  • Properties: Has B-tree like performance on reads and writes, fast diffs (can compare two versions efficiently), and structural sharing (shared portions stored once).
  • Construction: Sorts the map by key, determines chunk boundaries using a seed and hash function, hashes each chunk, and builds a new map if needed. Modifications involve directly modifying values in chunks and recalculating content addresses.
  • Inserts and Deletes: Work by walking the tree, editing chunks, and recalculating content addresses. Probability of chunk boundary shift depends on chunk size and added key/value.
  • History Independence: Enables fast diff and structural sharing. No matter the order of inserts, updates, or deletes, the tree is the same.
  • Dolt's Implementation: Fixes Noms' chunk size problem by considering chunk size in boundary decisions. Only considers keys for chunking, improving update performance. Dolt's block store is built for SQL databases with fixed schema.
  • Performance: On standard sysbench tests, Dolt is about 10% slower than MySQL. The difference comes from being implemented in Golang, MySQL's more mature SQL analyzer, and fewer data transformations.
  • Data Storage: In Dolt, all database data is stored in Prolly Trees, including table data, secondary indexes, and schemas.
  • Diff and Sharing: Can compute diffs in time proportional to the size of differences and structurally share data across versions.
阅读 8
0 条评论