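Preface
- This article applies to Elasticsearch 7.10
- Elasticsearch 7.10 is built on Lucene 8.7
Official Lucene 8.7 documentation on file extensions: https://lucene.apache.org/cor...
Related reading
- Elasticsearch: a case study and the principles behind searching tens of billions of records
- Day 7 - How data is stored in Elasticsearch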
- A Dive into the Elasticsearch Storage
Excerpt of the index file list for one shard
/data/nodes/0/indices/wLxsr8mrTfq1ZVro5eAKig/3/index$ ll -ah
-rw-r--r-- 1 elasticsearch elasticsearch 746 Mar 5 09:10 _as2.fdm
-rw-r--r-- 1 elasticsearch elasticsearch 3.1G Mar 5 09:10 _as2.fdt
-rw-r--r-- 1 elasticsearch elasticsearch 63K Mar 5 09:10 _as2.fdx
-rw-r--r-- 1 elasticsearch elasticsearch 8.7K Mar 5 09:14 _as2.fnm
-rw-r--r-- 1 elasticsearch elasticsearch 67M Mar 5 09:14 _as2.kdd
-rw-r--r-- 1 elasticsearch elasticsearch 207K Mar 5 09:14 _as2.kdi
-rw-r--r-- 1 elasticsearch elasticsearch 508 Mar 5 09:14 _as2.kdm
-rw-r--r-- 1 elasticsearch elasticsearch 14M Mar 5 09:10 _as2.nvd
-rw-r--r-- 1 elasticsearch elasticsearch 463 Mar 5 09:10 _as2.nvm
-rw-r--r-- 1 elasticsearch elasticsearch 614 Mar 5 09:14 _as2.si
-rw-r--r-- 1 elasticsearch elasticsearch 535M Mar 5 09:14 _as2_Lucene80_0.dvd
-rw-r--r-- 1 elasticsearch elasticsearch 11K Mar 5 09:14 _as2_Lucene80_0.dvm
-rw-r--r-- 1 elasticsearch elasticsearch 440M Mar 5 09:13 _as2_Lucene84_0.doc
-rw-r--r-- 1 elasticsearch elasticsearch 119M Mar 5 09:13 _as2_Lucene84_0.pos
-rw-r--r-- 1 elasticsearch elasticsearch 350M Mar 5 09:13 _as2_Lucene84_0.tim
-rw-r--r-- 1 elasticsearch elasticsearch 6.7M Mar 5 09:13 _as2_Lucene84_0.tip
-rw-r--r-- 1 elasticsearch elasticsearch 5.2K Mar 5 09:13 _as2_Lucene84_0.tmd
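The listing above can also be summarized per extension to see which file types dominate a shard. The following is a minimal sketch, not an Elasticsearch or Lucene tool: the helper names `parse_segment_file` and `size_by_extension` are hypothetical, and the regular expression assumes file names shaped like the ones in the listing (segment name, optional per-format suffix, extension).

```python
import os
import re
from collections import defaultdict

# Assumed naming pattern, taken from the listing: _as2.fdt or _as2_Lucene84_0.tim
# (segment name, optional per-format suffix, extension).
SEGMENT_FILE = re.compile(r"^(_[0-9a-z]+)(?:_(.+))?\.([a-z]+)$")


def parse_segment_file(name):
    """Split a segment file name into (segment, suffix, extension)."""
    match = SEGMENT_FILE.match(name)
    if match is None:
        return None  # not a per-segment file, e.g. segments_N or write.lock
    return match.group(1), match.group(2), match.group(3)


def size_by_extension(index_dir):
    """Sum file sizes in bytes per extension for one shard's index directory."""
    totals = defaultdict(int)
    for entry in os.scandir(index_dir):
        parsed = parse_segment_file(entry.name)
        if entry.is_file() and parsed is not None:
            totals[parsed[2]] += entry.stat().st_size
    return dict(totals)
```

Calling `size_by_extension` on a shard's `index` directory returns a dictionary such as `{"fdt": ..., "tim": ...}`, which can then be sorted by value to find the largest file types.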
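Excerpt of file sizes for several indices
Files used during an ES query
- The figure is from "A chat about query latency spikes in Elasticsearch"
- The red circles in the figure mark the larger files; larger files have a greater impact on disk I/O
File-by-file explanation
.tim (large)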
- Name: Term Dictionary
- Brief Description: The term dictionary, stores term info
- Holds the terms together with pointers into their postings lists, i.e. the metadata of the inverted index
.doc (large)
- Name: Frequencies
- Brief Description: Contains the list of docs which contain each term along with frequency
- For each term, stores the list of docids of the documents that contain it, plus the term's frequency in each of those documents
- Note that a Lucene docid is not the ES _id; the ES _id is stored in the .fdt file.
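The relationship between terms, docids and term frequencies can be sketched with a toy in-memory inverted index. This is only an illustration of the idea, not Lucene's on-disk encoding; `build_postings` is a hypothetical helper, tokenization is a plain whitespace split, and the docid is simply the document's position in the input list.

```python
from collections import Counter, defaultdict


def build_postings(docs):
    """Map each term to a list of (docid, term frequency), ordered by docid."""
    postings = defaultdict(list)
    for docid, text in enumerate(docs):  # docid is just the position in the list
        for term, freq in sorted(Counter(text.lower().split()).items()):
            postings[term].append((docid, freq))
    return dict(postings)


docs = ["quick brown fox", "quick quick dog", "lazy dog"]
postings = build_postings(docs)
print(postings["quick"])  # [(0, 1), (1, 2)]
print(postings["dog"])    # [(1, 1), (2, 1)]
```

In this picture the dictionary keys play the role of the term dictionary, and each value plays the role of a postings list holding docids and frequencies.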
.fdt (large)
- Name: Field Data
- Brief Description: The stored fields for documents
- Holds the stored fields of each document; _source is a special stored field.
.pos (large)
- Name: Positions
- Brief Description: Stores position information about where a term occurs in the index
- Present for full-text indexed fields; stores the positions at which each term occurs within a document.
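Position information is what makes phrase matching possible: knowing that two terms occur in the same document is not enough, they also have to be adjacent. The sketch below is a toy model under simple assumptions (whitespace tokenization, two-term phrases only); `build_positions` and `phrase_match` are hypothetical names, not Lucene APIs.

```python
from collections import defaultdict


def build_positions(docs):
    """Map term -> {docid: [positions of the term in that document]}."""
    index = defaultdict(dict)
    for docid, text in enumerate(docs):
        for position, term in enumerate(text.lower().split()):
            index[term].setdefault(docid, []).append(position)
    return dict(index)


def phrase_match(index, first, second):
    """Return docids where `second` occurs immediately after `first`."""
    hits = []
    first_docs = index.get(first, {})
    second_docs = index.get(second, {})
    # Only documents containing both terms can match the phrase.
    for docid in sorted(first_docs.keys() & second_docs.keys()):
        following = set(second_docs[docid])
        if any(pos + 1 in following for pos in first_docs[docid]):
            hits.append(docid)
    return hits


docs = ["new york city", "york is new", "a new york minute"]
index = build_positions(docs)
print(phrase_match(index, "new", "york"))  # [0, 2]
```

Document 1 contains both terms but in the wrong order, so only the position lists can rule it out.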
.dvd (large), .dvm
- Name: Per-Document Values
- Brief Description: Encodes additional scoring factors or other per-document information.
- .dvd: DocValues data
- .dvm: DocValues metadata. DocValues are a forward index in columnar storage and are used for aggregations
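Why a columnar forward index suits aggregations can be shown with a small sketch. The field names and sample rows below are made up for illustration, and `to_columns` is a hypothetical helper; real DocValues use compressed on-disk encodings rather than Python lists.

```python
from collections import Counter

# Row-oriented view, similar in spirit to stored fields: one record per document.
rows = [
    {"status": "ok", "bytes": 120},
    {"status": "error", "bytes": 300},
    {"status": "ok", "bytes": 80},
]


def to_columns(rows):
    """Pivot rows into one list per field, indexed by docid."""
    columns = {}
    for docid, row in enumerate(rows):
        for field, value in row.items():
            columns.setdefault(field, [None] * len(rows))[docid] = value
    return columns


columns = to_columns(rows)
# A terms-style aggregation only needs to scan a single column.
print(Counter(columns["status"]))  # Counter({'ok': 2, 'error': 1})
# A sum aggregation likewise touches only the "bytes" column.
print(sum(columns["bytes"]))       # 500
```

With the row layout, the same aggregations would have to load every full document just to read one field.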
segments_N
- Name: Segments File
- Brief Description: Stores information about a commit point
write.lock
- Name: Lock File
- Brief Description: The Write lock prevents multiple IndexWriters from writing to the same file.
.si
- Name: Segment Info
- Brief Description: Stores metadata about a segment
.cfs, .cfe
- Name: Compound File
- Brief Description: An optional "virtual" file consisting of all the other index files for systems that frequently run out of file handles.
.fnm
- Name: Fields
- Brief Description: Stores information about the fields
- Field metadata
.fdx
- Name: Field Index
- Brief Description: Contains pointers to field data
.tip
- Name: Term Index
- Brief Description: The index into the Term Dictionary
- Index over the term dictionary, used to locate a term's block in the .tim file
.pay
- Name: Payloads
- Brief Description: Stores additional per-position metadata information such as character offsets and user payloads
.nvd, .nvm
- Name: Norms
- Brief Description: Encodes length and boost factors for docs and fields
- .nvd: Norms data
- .nvm: Norms metadata
.tvx
- Name: Term Vector Index
- Brief Description: Stores offset into the document data file
.tvd
- Name: Term Vector Data
- Brief Description: Contains term vector data.
.liv
- Name: Live Documents
- Brief Description: Info about what documents are live
.dii, .dim
- Name: Point values
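- Brief Description: Holds indexed points, if any
- The shard listing above shows .kdd, .kdi and .kdm rather than .dii and .dim; these appear to be the data, index and metadata files written by the points format used since Lucene 8.6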
This article is from qbit snap