# Parsing Cirru's Indentation Syntax with Parser Combinators

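Indentation parsing for Cirru has since been implemented, with a few bugs fixed along the way, so the actual implementation may differ from what is described here: https://github.com/Cirru/parser-combinator.clj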

### Overview

The earlier Cirru parser was written with some rather tricky functional programming, somewhat like a State Monad.

### Resources on LL, LR, and Parser Combinators

[The difference between top-down parsing and bottom-up parsing](http://qntm.org/top)

Parser combinators explained

Parsing CSS with Parsec

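LL starts from the root of the parse tree, predicts every structure the program could have, and rules out the predictions that turn out to be wrong.

LR starts from the individual characters, combines them step by step into larger pieces, and checks whether a single program comes out at the end.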

### Parser Combinator Basics

```clojure
(defn read-a [code]
  ;; Return the rest of `code` after a leading "a", or nil on failure.
  (if (and (seq code) (= (subs code 0 1) "a"))
    (subs code 1)
    nil))
```

```clojure
(def initial-state
  {:code "code"
   :failed false
   :value nil
   :indentation 0
   :msg "initial"})
```

`:code` stores the part of the string that has not been parsed yet, and `:msg` holds an error message used for debugging.
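To make the state map concrete, here is a minimal sketch of a primitive parser that works on it. `fail` is called throughout the code in this post but its definition is not shown, so the version below is an assumption; `parse-char` is a hypothetical helper written only for this illustration.

```clojure
;; Assumed shape of `fail`: mark the state as failed and record a message.
(defn fail [state msg]
  (assoc state :failed true :msg msg))

;; Hypothetical primitive: build a parser that consumes one expected
;; character from :code and stores it as a string in :value.
(defn parse-char [expected]
  (fn [state]
    (let [code (:code state)]
      (if (and (seq code) (= (first code) expected))
        (assoc state
               :code (subs code 1)
               :value (str expected)
               :msg "matched")
        (fail state (str "expected " expected))))))

(def initial-state
  {:code "code" :failed false :value nil :indentation 0 :msg "initial"})

(def result ((parse-char \c) initial-state))
;; (:code result)  is "ode"
;; (:value result) is "c"
```

A parser is therefore just a function from state to state, which is what lets the combinators further down wrap and compose parsers freely.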

`many` can likewise collect the `:value` of each parser run into a list and return that list as the result.

(Note that "log", "just", and "wrap" in the code exist only to generate debugging messages and can be ignored for now.)

```clojure
(defn helper-chain [state parsers]
  (log-nothing "helper-chain" state)
  (if (> (count parsers) 0)
    (let [parser (first parsers)
          result (parser state)]
      (if (:failed result)
        (fail state "failed apply chaining")
        ;; append this parser's :value to the values collected so far
        (recur
          (assoc result :value
            (conj (into [] (:value state)) (:value result)))
          (rest parsers))))
    state))

;; run the parsers in sequence, collecting their values into a vector
(defn combine-chain [& parsers]
  (just "combine-chain"
    (fn [state]
      (helper-chain (assoc state :value []) parsers))))

;; apply the same parser exactly n times
(defn combine-times [parser n]
  (just "combine-times"
    (fn [state]
      (let [method (apply combine-chain (repeat n parser))]
        (method state)))))
```

https://gist.github.com/jiyinyiyong/0568487a4ab31716186f
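As a usage sketch, the chaining idea can be exercised with a stripped-down version that leaves out the logging wrappers. `fail`, `parse-char`, and `chain` below are hypothetical stand-ins written for this example and are not the repository's definitions.

```clojure
(defn fail [state msg]
  (assoc state :failed true :msg msg))

;; Hypothetical primitive: consume one expected character.
(defn parse-char [expected]
  (fn [state]
    (let [code (:code state)]
      (if (and (seq code) (= (first code) expected))
        (assoc state :code (subs code 1) :value (str expected))
        (fail state (str "expected " expected))))))

;; Simplified chain: run the parsers in order, collect each :value into
;; a vector, and stop at the first failure.
(defn chain [& parsers]
  (fn [state]
    (loop [current (assoc state :value [])
           remaining parsers]
      (if (empty? remaining)
        current
        (let [result ((first remaining) current)]
          (if (:failed result)
            (fail current "failed apply chaining")
            (recur (assoc result :value
                          (conj (:value current) (:value result)))
                   (rest remaining))))))))

(def start
  {:code "abc" :failed false :value nil :indentation 0 :msg "initial"})

(def ok ((chain (parse-char \a) (parse-char \b)) start))
;; (:value ok) is ["a" "b"] and (:code ok) is "c"

(def bad ((chain (parse-char \a) (parse-char \x)) start))
;; (:failed bad) is true
```

Repeating one parser `n` times, as `combine-times` does, then amounts to passing `(repeat n parser)` to the chain.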

`peek` means looking ahead at the upcoming content without actually making its `:value` part of the parse result.

```clojure
(defn combine-peek [parser]
  (just "combine-peek"
    (fn [state]
      (let [result (parser state)]
        (if (:failed result)
          (fail state "peek failed")
          ;; success: return the original state, so nothing is consumed
          state)))))
```
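The effect of peeking is easiest to see by comparing states before and after. This is a small sketch with the same logic as `combine-peek` minus the `just` wrapper; `fail`, `parse-char`, and `peek-ahead` are hypothetical names used only here.

```clojure
(defn fail [state msg]
  (assoc state :failed true :msg msg))

;; Hypothetical primitive: consume one expected character.
(defn parse-char [expected]
  (fn [state]
    (let [code (:code state)]
      (if (and (seq code) (= (first code) expected))
        (assoc state :code (subs code 1) :value (str expected))
        (fail state (str "expected " expected))))))

;; Look ahead with `parser`, but hand back the untouched input state.
(defn peek-ahead [parser]
  (fn [state]
    (if (:failed (parser state))
      (fail state "peek failed")
      state)))

(def start
  {:code "abc" :failed false :value nil :indentation 0 :msg "initial"})

(def peeked ((peek-ahead (parse-char \a)) start))
;; peeked equals start: the "a" is still waiting in :code

(def missed ((peek-ahead (parse-char \x)) start))
;; (:failed missed) is true, and :code is still "abc"
```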

```clojure
;; run `parser`, then replace its :value with (handler value failed?)
(defn combine-value [parser handler]
  (just "combine-value"
    (fn [state]
      (let [result (parser state)]
        (assoc result :value
          (handler (:value result) (:failed result)))))))
```

### About Indentation

```clojure
;; two spaces count as one level of indentation
(defn parse-two-blanks [state]
  ((just "parse-two-blanks"
     (combine-value
       (combine-times parse-whitespace 2)
       (fn [value is-failed] 1))) state))
```

`star` follows the regular-expression convention and means zero or more; here it means zero or more empty lines.

```clojure
(defn parse-line-breaks [state]
  ((just "parse-line-breaks"
     (combine-value
       (combine-chain
         (combine-star parse-empty-line)
         parse-newline)
       (fn [value is-failed] nil))) state))
```
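`combine-star` itself is not shown in this post. A minimal sketch of a zero-or-more combinator might look like the following, assuming the same state shape; `fail`, `parse-char`, and `star` are hypothetical names, and the check for progress is an extra safeguard of this sketch rather than something taken from the repository.

```clojure
(defn fail [state msg]
  (assoc state :failed true :msg msg))

;; Hypothetical primitive: consume one expected character.
(defn parse-char [expected]
  (fn [state]
    (let [code (:code state)]
      (if (and (seq code) (= (first code) expected))
        (assoc state :code (subs code 1) :value (str expected))
        (fail state (str "expected " expected))))))

;; Zero or more: apply `parser` until it fails, collecting every :value.
;; The combinator itself never fails; at worst it returns an empty vector.
(defn star [parser]
  (fn [state]
    (loop [current (assoc state :value [])]
      (let [result (parser current)]
        (if (or (:failed result)
                ;; stop if the parser succeeded without consuming input,
                ;; otherwise the loop would never end
                (= (:code result) (:code current)))
          current
          (recur (assoc result :value
                        (conj (:value current) (:value result)))))))))

(def start
  {:code "\n\n\nabc" :failed false :value nil :indentation 0 :msg "initial"})

(def lines ((star (parse-char \newline)) start))
;; (count (:value lines)) is 3 and (:code lines) is "abc"

(def none ((star (parse-char \newline)) (assoc start :code "abc")))
;; (:value none) is [] and (:failed none) is false
```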

```clojure
;; skip the line breaks, then count the two-space units that follow;
;; the resulting :value is the indentation level of the next line
(defn parse-indentation [state]
  ((just "parse-indentation"
     (combine-value
       (combine-chain
         (combine-value parse-line-breaks (fn [value is-failed] nil))
         (combine-value (combine-star parse-two-blanks)
           (fn [value is-failed] (count value))))
       (fn [value is-failed]
         (if is-failed 0 (last value))))) state))
```

```clojure
(def parse-indent
  (just "parse-indent"
    (fn [state]
      (let [result (parse-indentation state)]
        (if (> (:value result) (:indentation result))
          ;; return the original state one level deeper; no input is consumed
          (assoc state
            :indentation (+ (:indentation result) 1)
            :value nil)
          (fail result "no indent"))))))
```

```clojure
(def parse-unindent
  (just "parse-unindent"
    (fn [state]
      (let [result (parse-indentation state)]
        (if (< (:value result) (:indentation result))
          ;; return the original state one level shallower; no input is consumed
          (assoc state
            :indentation (- (:indentation result) 1)
            :value nil)
          (fail result "no unindent"))))))
```

```clojure
(def parse-align
  (just "parse-align"
    (fn [state]
      (let [result (parse-indentation state)]
        (if (= (:value result) (:indentation state))
          ;; same level: keep `result`, so the line break and spaces are consumed
          (assoc result :value nil)
          (fail result "not aligned"))))))
```
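The three parsers above all compare the indentation level of the upcoming line with the level recorded in `:indentation`. Stripped of the parser machinery, the comparison can be sketched as below; `indent-level` and `classify` are hypothetical helpers for illustration, assuming two spaces per level as in `parse-two-blanks`.

```clojure
;; Count the indentation level of a line, two spaces per level.
(defn indent-level [line]
  (quot (count (take-while #(= % \space) line)) 2))

;; Compare the next line's level with the current one, mirroring the
;; cases handled by parse-indent, parse-unindent and parse-align.
(defn classify [current-level line]
  (let [level (indent-level line)]
    (cond
      (> level current-level) :indent
      (< level current-level) :unindent
      :else :align)))

(classify 0 "  b c") ;; :indent
(classify 1 "  d")   ;; :align
(classify 1 "e")     ;; :unindent
```

In the parsers above, `parse-indent` and `parse-unindent` return the original `state` with `:indentation` moved by one, so the line break stays in `:code` until `parse-align` finally matches and consumes it. A jump of several levels is thereby handled as several one-level steps.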

The indented lines nested inside a `block-line` are called an `inner-block`.

```clojure
;; the two parsers are mutually recursive, so declare the later one first
(declare parse-block-line)

(defn parse-inner-block [state]
  ((just "parse-inner-block"
     (combine-value
       (combine-chain parse-indent
         (combine-value
           (combine-optional parse-indentation)
           (fn [value is-failed] nil))
         (combine-alternate parse-block-line parse-align)
         parse-unindent)
       (fn [value is-failed]
         (if is-failed nil
           ;; index 2 holds the block lines; drop the nils left by parse-align
           (filter some? (nth value 2)))))) state))

(defn parse-block-line [state]
  ((just "parse-block-line"
     (combine-value
       (combine-chain
         (combine-alternate parse-item parse-whitespace)
         (combine-optional parse-inner-block))
       (fn [value is-failed]
         (let [main (into [] (filter some? (first value)))
               nested (into [] (last value))]
           (if (some? nested)
             (concat main nested)
             main))))) state))
```

```clojure
(defn parse-program [state]
  ((just "parse-program"
     (combine-value
       (combine-chain
         (combine-optional parse-line-breaks)
         (combine-alternate parse-block-line parse-align)
         parse-line-eof)
       (fn [value is-failed]
         (if is-failed nil
           (filter some? (nth value 1)))))) state))
```

### Closing

##### 题叶
ClojureScript enthusiast.