Preface

Excerpt of the index file listing for one shard

/data/nodes/0/indices/wLxsr8mrTfq1ZVro5eAKig/3/index$ ll -ah
-rw-r--r-- 1 elasticsearch elasticsearch  746 Mar  5 09:10 _as2.fdm
-rw-r--r-- 1 elasticsearch elasticsearch 3.1G Mar  5 09:10 _as2.fdt
-rw-r--r-- 1 elasticsearch elasticsearch  63K Mar  5 09:10 _as2.fdx
-rw-r--r-- 1 elasticsearch elasticsearch 8.7K Mar  5 09:14 _as2.fnm
-rw-r--r-- 1 elasticsearch elasticsearch  67M Mar  5 09:14 _as2.kdd
-rw-r--r-- 1 elasticsearch elasticsearch 207K Mar  5 09:14 _as2.kdi
-rw-r--r-- 1 elasticsearch elasticsearch  508 Mar  5 09:14 _as2.kdm
-rw-r--r-- 1 elasticsearch elasticsearch  14M Mar  5 09:10 _as2.nvd
-rw-r--r-- 1 elasticsearch elasticsearch  463 Mar  5 09:10 _as2.nvm
-rw-r--r-- 1 elasticsearch elasticsearch  614 Mar  5 09:14 _as2.si
-rw-r--r-- 1 elasticsearch elasticsearch 535M Mar  5 09:14 _as2_Lucene80_0.dvd
-rw-r--r-- 1 elasticsearch elasticsearch  11K Mar  5 09:14 _as2_Lucene80_0.dvm
-rw-r--r-- 1 elasticsearch elasticsearch 440M Mar  5 09:13 _as2_Lucene84_0.doc
-rw-r--r-- 1 elasticsearch elasticsearch 119M Mar  5 09:13 _as2_Lucene84_0.pos
-rw-r--r-- 1 elasticsearch elasticsearch 350M Mar  5 09:13 _as2_Lucene84_0.tim
-rw-r--r-- 1 elasticsearch elasticsearch 6.7M Mar  5 09:13 _as2_Lucene84_0.tip
-rw-r--r-- 1 elasticsearch elasticsearch 5.2K Mar  5 09:13 _as2_Lucene84_0.tmd
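
To see at a glance which file types dominate a shard, a listing like the one above can be summed per extension. The following is a minimal Python sketch, assuming read access to the shard's index directory; the helper names `size_by_extension` and `largest_first` are made up for this illustration.

```python
import os
from collections import defaultdict


def size_by_extension(index_dir):
    """Sum file sizes (in bytes) per extension in one Lucene index directory."""
    totals = defaultdict(int)
    for entry in os.scandir(index_dir):
        if not entry.is_file():
            continue
        # "_as2_Lucene84_0.doc" -> ".doc"; a name without a dot
        # (for example segments_N) is keyed by its full name.
        _, ext = os.path.splitext(entry.name)
        totals[ext or entry.name] += entry.stat().st_size
    return dict(totals)


def largest_first(totals):
    """Return (extension, bytes) pairs sorted by size, largest first."""
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
```

Calling `largest_first(size_by_extension(path))` on a shard's index directory should put extensions such as .fdt, .dvd, .doc and .tim near the top for a shard like the one listed above.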

File sizes for several indexes (excerpt)

[image]

Files used during an ES query

File-by-file explanation

.tim (large)

  • Name: Term Dictionary
  • Brief Description: The term dictionary, stores term info
  • Pointers into the postings lists; the metadata of the inverted index

.doc (large)

  • Name: Frequencies
  • Brief Description: Contains the list of docs which contain each term along with frequency
  • The document list for each term: stores each term's docid list and the term's frequency in each doc
  • Note that Lucene's docid is not ES's _id; ES's _id is stored in the .fdt file.
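
To make the pairing of docid and term frequency concrete, here is a toy in-memory model of a postings list. It only illustrates the idea: the real .doc file is a compressed binary encoding, and the function name `build_postings`, the whitespace tokenizer and the sample IDs are assumptions made for this sketch.

```python
from collections import Counter, defaultdict


def build_postings(docs):
    """Map each term to a list of (docid, term_frequency) pairs.

    `docs` is a list of (external_id, text). The list position plays the
    role of Lucene's internal docid; external_id stands in for ES's _id.
    """
    postings = defaultdict(list)
    for docid, (_external_id, text) in enumerate(docs):
        for term, freq in sorted(Counter(text.lower().split()).items()):
            postings[term].append((docid, freq))
    return dict(postings)


docs = [("order-17", "red apple red"), ("order-42", "green apple")]
postings = build_postings(docs)
print(postings["apple"])  # [(0, 1), (1, 1)]
print(postings["red"])    # [(0, 2)]
# Turning docid 0 back into its _id needs the stored document, not the postings.
print(docs[0][0])         # order-17
```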

.fdt (large)

  • Name: Field Data
  • Brief Description: The stored fields for documents
  • Holds the stored fields of each document; _source is a special stored field.

.pos (large)

  • Name: Positions
  • Brief Description: Stores position information about where a term occurs in the index
  • Present for full-text indexed fields; stores the positions of each term within a doc.
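
Positions are what make phrase matching possible: it is not enough to know that two terms occur in a doc, they must occur next to each other. The toy model below sketches that idea in memory; it is not the .pos on-disk format, and `build_positions` and `phrase_match` are hypothetical helper names.

```python
from collections import defaultdict


def build_positions(docs):
    """Map term -> {docid: [positions]} for a list of document texts."""
    index = defaultdict(dict)
    for docid, text in enumerate(docs):
        for pos, term in enumerate(text.lower().split()):
            index[term].setdefault(docid, []).append(pos)
    return index


def phrase_match(index, first, second):
    """Return docids where `second` occurs directly after `first`."""
    hits = []
    for docid, positions in index.get(first, {}).items():
        following = set(index.get(second, {}).get(docid, []))
        if any(p + 1 in following for p in positions):
            hits.append(docid)
    return sorted(hits)


index = build_positions(["quick brown fox", "brown quick fox"])
print(phrase_match(index, "quick", "brown"))  # [0]
```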

.dvd (large), .dvm

  • Name: Per-Document Values
  • Brief Description: Encodes additional scoring factors or other per-document information.
  • .dvd: DocValues data
  • .dvm: DocValues metadata. DocValues are used for aggregations; they are a forward index stored column by column.
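
Why a column layout helps aggregations can be shown with a toy example: once values are stored per field and indexed by docid, an aggregation scans a single column instead of loading whole documents. This is only a conceptual sketch with made-up sample data and a hypothetical `to_columns` helper, not the DocValues encoding.

```python
from collections import Counter

# Row-oriented view: one dict per document (similar in spirit to stored fields).
rows = [
    {"status": "ok", "bytes": 120},
    {"status": "error", "bytes": 300},
    {"status": "ok", "bytes": 80},
]


def to_columns(rows):
    """Pivot rows into one list per field, indexed by docid."""
    columns = {}
    for docid, row in enumerate(rows):
        for field, value in row.items():
            columns.setdefault(field, [None] * len(rows))[docid] = value
    return columns


columns = to_columns(rows)
# A terms-style aggregation only has to scan one column.
print(Counter(columns["status"]))  # Counter({'ok': 2, 'error': 1})
# A sum-style aggregation scans another column without touching "status".
print(sum(columns["bytes"]))       # 500
```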

segments_N

  • Name: Segments File
  • Brief Description: Stores information about a commit point

write.lock

  • Name: Lock File
  • Brief Description: The Write lock prevents multiple IndexWriters from writing to the same file.

.si

  • Name: Segment Info
  • Brief Description: Stores metadata about a segment

.cfs, .cfe

  • Name: Compound File
  • Brief Description: An optional "virtual" file consisting of all the other index files for systems that frequently run out of file handles.

.fnm

  • Name: Fields
  • Brief Description: Stores information about the fields
  • Metadata about the fields

.fdx

  • Name: Field Index
  • Brief Description: Contains pointers to field data
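
The relationship between a field index and field data can be sketched as an offsets array pointing into one contiguous blob. The example below is a simplified illustration under that assumption (JSON instead of Lucene's binary encoding, no compression or chunking); `write_stored_fields` and `read_doc` are hypothetical names.

```python
import json


def write_stored_fields(docs):
    """Serialize docs back to back; return (data_bytes, offsets).

    `data_bytes` plays the role of the field data and `offsets` the role of
    the field index: offsets[docid] is where that document starts.
    """
    data = bytearray()
    offsets = []
    for doc in docs:
        offsets.append(len(data))
        data += json.dumps(doc, sort_keys=True).encode("utf-8")
    offsets.append(len(data))  # sentinel: end of the last document
    return bytes(data), offsets


def read_doc(data, offsets, docid):
    """Fetch one document by slicing between two neighbouring offsets."""
    start, end = offsets[docid], offsets[docid + 1]
    return json.loads(data[start:end].decode("utf-8"))


data, offsets = write_stored_fields([{"_id": "a1", "msg": "hello"},
                                     {"_id": "b2", "msg": "world"}])
print(read_doc(data, offsets, 1))  # {'_id': 'b2', 'msg': 'world'}
```

The index stays tiny compared with the data, which matches the listing above (63K for .fdx against 3.1G for .fdt).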

.tip

  • Name: Term Index
  • Brief Description: The index into the Term Dictionary
  • Index of the term dictionary
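
How a small index in front of a large dictionary speeds up lookups can be sketched with sorted blocks and a binary search over each block's first term. Note this is a deliberate simplification: Lucene's actual term index is a more compact structure, and `build_blocks`, `lookup` and the block size are assumptions made for this example.

```python
from bisect import bisect_right


def build_blocks(terms, block_size=3):
    """Split sorted terms into blocks; return (first_terms, blocks)."""
    ordered = sorted(set(terms))
    blocks = [ordered[i:i + block_size]
              for i in range(0, len(ordered), block_size)]
    first_terms = [block[0] for block in blocks]  # the small "term index"
    return first_terms, blocks


def lookup(term, first_terms, blocks):
    """Return (block_number, offset_in_block), or None if the term is absent."""
    block_no = bisect_right(first_terms, term) - 1
    if block_no < 0:
        return None
    block = blocks[block_no]  # only this block of the "dictionary" is scanned
    return (block_no, block.index(term)) if term in block else None


first_terms, blocks = build_blocks(
    ["apple", "banana", "cherry", "date", "fig", "grape", "kiwi"])
print(first_terms)                         # ['apple', 'date', 'kiwi']
print(lookup("fig", first_terms, blocks))  # (1, 1)
print(lookup("egg", first_terms, blocks))  # None
```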

.pay

  • Name: Payloads
  • Brief Description: Stores additional per-position metadata information such as character offsets and user payloads

.nvd, .nvm

  • Name: Norms
  • Brief Description: Encodes length and boost factors for docs and fields
  • .nvd: Norms data
  • .nvm: Norms metadata

.tvx

  • Name: Term Vector Index
  • Brief Description: Stores offset into the document data file

.tvd

  • Name: Term Vector Data
  • Brief Description: Contains term vector data.

.liv

  • Name: Live Documents
  • Brief Description: Info about what documents are live
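
Conceptually, live-document information is one bit per docid: a delete clears the bit instead of rewriting the segment, and searches skip docs whose bit is cleared. A minimal sketch of that idea, using a Python integer as the bitset (the helper names and the sample docids are made up for illustration):

```python
def mark_deleted(live, docid):
    """Clear the bit for docid in an integer used as a bitset."""
    return live & ~(1 << docid)


def is_live(live, docid):
    """True if the bit for docid is still set."""
    return ((live >> docid) & 1) == 1


max_doc = 5
live = (1 << max_doc) - 1   # all five docs start out live
live = mark_deleted(live, 2)
hits = [0, 2, 4]            # docids a query matched in this segment
print([d for d in hits if is_live(live, d)])  # [0, 4]
```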

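.dii, .dim

  • Name: Point values
  • Brief Description: Holds indexed points, if any
  • Newer Lucene versions write point values as .kdd, .kdi and .kdm files instead, which is what appears in the listing above.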

This article is from qbit snap