
Flex and Bison are tools specially designed for programmers of compilers and interpreters:

  • Flex is used for lexical analysis (or scanning), which divides the input into meaningful chunks called tokens.
  • Bison is used for syntax analysis (or parsing) to determine how these tokens relate to each other.

For example, the following code snippet:

 alpha = beta + gamma;

Lexical analysis breaks this code into the following tokens: alpha , = , beta , + , gamma , ; . Parsing then determines that beta + gamma is an expression, and that this expression is assigned to alpha .

Flex and Bison were originally built for writing compilers, but they have since proven very effective in other application areas as well. Any application that looks for specific patterns in its input, especially a text-processing one, or that takes a command language as input, is a good candidate for Flex and Bison.

Parsing SQL statements is one such example.

Within a compiler, the lexical analyzer and the parser are the main components of the front end. Most compilers are organized into three main stages: front end, optimizer, and back end. The front end focuses on understanding the source program and converting it into some form of intermediate representation (IR). Flex and Bison are tools for building this front end.

Origin

Bison is derived from yacc, a parser generator developed by Stephen C. Johnson at Bell Labs between 1975 and 1978. As its name suggests (yacc is short for "yet another compiler compiler"), many people were writing parser generators at the time. Johnson's tool combined the parsing theory developed by D. E. Knuth, which made yacc very reliable, with a convenient input syntax. This made yacc very popular among Unix users, although the restrictive license Unix was under at the time limited its use to academia and the Bell System. Around 1985, Bob Corbett, a graduate student at the University of California, Berkeley, reimplemented yacc using improved internal algorithms; this version became Berkeley yacc. Because it was faster than Bell Labs' yacc and was released under the flexible Berkeley license, it quickly became the most popular yacc. Richard Stallman of the Free Software Foundation adapted Corbett's work for use in the GNU project, where it gained numerous new features and evolved into the current Bison. Bison is now maintained as a project of the FSF and is distributed under the GNU General Public License.

In 1975, Mike Lesk and summer intern Eric Schmidt wrote lex, a lexical analyzer generator, with Schmidt doing most of the programming. They found that lex could be used both as a standalone tool and as a companion to Johnson's yacc. lex therefore also became very popular, even though it was somewhat slow and buggy. (Schmidt nevertheless went on to a very successful career in the computer industry: he was Google's CEO when the book was written in 2009, and he stepped down as CEO in 2011 to become executive chairman.)

Around 1987, Vern Paxson of Lawrence Berkeley Lab took a version of lex written in ratfor (an extended Fortran language popular at the time) and translated it into C, calling it flex, short for "Fast Lexical Analyzer Generator". Because it was faster and more reliable than AT&T's lex and, like Berkeley yacc, was available under the Berkeley license, it eventually superseded the original lex. When the book was written, flex was a SourceForge project, still under the Berkeley license.

Install

Most Linux and BSD systems come with flex and bison as a base part of the system. Installing them is also easy if your system doesn't include them.

For example, on Ubuntu/Debian systems, you can install directly with apt:

 # Ubuntu 20
$ sudo apt install flex bison -y

$ flex -V
flex 2.6.4
$ bison -V
bison (GNU Bison) 3.5.1

Example

See https://github.com/ikuokuo/start-ai-compiler/tree/main/books/flex_bison for the examples, all of which come from the book Flex & Bison mentioned in the epilogue.

The examples walk through building a calculator with Flex and Bison. The finished calculator supports variables, procedures, loops, and conditional expressions, and it offers both built-in and user-defined functions.

Compile all examples as follows:

 cd books/flex_bison/

# Build release
make
# Build debug
make debug

# Clean up
make clean

The example programs are built into the _build directory. Run them as follows:

 $ ./_build/linux-x86_64/release/1-5_calc/bin/1-5_calc
> (1+2)*3 + 4/2
= 11

$ ./_build/linux-x86_64/release/3-5_calc/bin/3-5_calc
> let sq(n)=e=1; while |((t=n/e)-e)>.001 do e=avg(e,t);;
Defined sq
> let avg(a,b)=(a+b)/2;
Defined avg
> sq(10)
= 3.162
> sqrt(10)
= 3.162
> sq(10)-sqrt(10)
= 0.000178

To compile only one example:

 cd ch01/1-1_wc/

# Build release
make -j8
# Build debug
make -j8 args="debug"

# Clean up
make clean

Program

Both Flex and Bison programs consist of three parts: the definition part, the rules part, and the user subroutines.

 ... definition section ...
%%
... rules section ...
%%
... user subroutines section ...

Flex rules are built from regular expressions, while Bison rules are written as a grammar in BNF (Backus-Naur Form). For detailed usage, refer to the Flex & Bison book mentioned in the epilogue and to the examples.

Without going into too much detail here, this article aims to give you an idea of what tools like Flex and Bison are and what they can do for us.

Epilogue

Flex and Bison are generators for lexical analyzers (scanners) and parsers, respectively, and they put the results of formal language theory into practice. They are also useful for text search, website filtering, word processing, and command-language interpreters.

This article draws mainly on the book Flex & Bison.

GoCoding shares personal hands-on experience; follow the official account for more.
