Background
While reading section 11.8.4 of the ECMAScript® 2019 Language Specification, I came across the following description of the StringLiteral rule:
StringLiteral ::
  " DoubleStringCharacters_opt "
  ' SingleStringCharacters_opt '
DoubleStringCharacters ::
  DoubleStringCharacter DoubleStringCharacters_opt
DoubleStringCharacter ::
  SourceCharacter but not one of " or \ or LineTerminator
  <LS>
  <PS>
  LineContinuation
  \ EscapeSequence
EscapeSequence ::
  CharacterEscapeSequence
  HexEscapeSequence
  UnicodeEscapeSequence
  0 [lookahead ∉ DecimalDigit] /* this is the alternative whose meaning confuses me */
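For the description of lookahead, see section 5.1.5 (page 19):
“if the phrase “[lookahead ∈ set]” appears in the right-hand side of a production, it indicates that the production may only be used if the immediately following input token sequence is a member of the given set.”
(The ∉ form used in the grammar above is the negated counterpart: the production may not be used if the immediately following input is a member of the set.)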
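My question
What exactly is this following token? Is it any Unicode character, or a character constrained by the context of the enclosing rule? And how should I write a regular expression for it?
Thanks for your attention and answers.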
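I think I now understand what “the immediately following input token sequence” refers to in a production of the form
characterA[lookahead<condition>]
If the production is
characterA[lookahead<condition>]characterB
then the [lookahead<condition>] restriction applies to characterB; that much is clear. If the production is
characterA[lookahead<condition>]
then the condition is a lookahead assertion attached to characterA, that is, part of characterA's matching rule: it checks the input that follows characterA without consuming it. I think I simply misread the specification. Until now I could not work out how to write a matching rule for the token after [lookahead<condition>]: whether it should be any Unicode character or a character constrained by the surrounding rules.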
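To make this concrete, here is a minimal sketch of just the alternative 0 [lookahead ∉ DecimalDigit], written as a JavaScript regular expression with a zero-width negative lookahead. The names NULL_ESCAPE and matchNullEscape are made up for illustration and do not come from the specification, and the sketch assumes the leading backslash has already been consumed by the enclosing rule.

```javascript
// Hypothetical names for illustration; only the grammar alternative is from the spec.
// "0 [lookahead ∉ DecimalDigit]": the character 0, then a zero-width check
// that the next character is not one of 0-9 (DecimalDigit).
const NULL_ESCAPE = /^0(?![0-9])/;

// Returns how many characters the alternative consumes (1),
// or 0 when the alternative may not be used.
function matchNullEscape(afterBackslash) {
  const m = NULL_ESCAPE.exec(afterBackslash);
  return m === null ? 0 : m[0].length;
}

console.log(matchNullEscape("0"));    // 1: nothing follows, so the lookahead holds
console.log(matchNullEscape("0abc")); // 1: "a" is not a digit and is not consumed
console.log(matchNullEscape("01"));   // 0: "1" is a DecimalDigit, alternative rejected
```

Because the lookahead consumes nothing, the character after 0 is left for whichever rule applies next (another DoubleStringCharacter or the closing quote), so no separate pattern is needed for that following token.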
Finally, to close out the question, here is the regular expression for “EscapeSequence”: