Introduction

Nginx's high performance is widely recognized in the industry, and its share of the global web server market has grown year after year. It is widely used by well-known Internet companies in China, and Alibaba extended Nginx to create Tengine. OpenResty is a dynamic web platform based on Nginx and LuaJIT, created by Zhang Yichun. LuaJIT is a just-in-time compiler for the Lua programming language, and Lua is a powerful, dynamic, lightweight language designed to be embedded in applications so that they can be extended and customized flexibly. In other words, OpenResty is an extensible web platform built by using Lua to extend Nginx. At present, OpenResty is mostly used to build API gateways, and it can also replace Nginx in reverse proxy and load balancing scenarios.

The architecture of OpenResty

As mentioned earlier, OpenResty is built on Nginx and LuaJIT, so it inherits Nginx's multi-process architecture. Each Worker process is forked from the Master process, and the LuaJIT virtual machine in the Master process is copied along with it. All coroutines in the same Worker share this LuaJIT virtual machine, and all Lua code runs inside it. At any given moment, each Worker process runs Lua code for only one request, that is, only one coroutine is running.

Nginx

Because Nginx uses an event-driven model to process requests, each Worker process should ideally have a CPU core to itself. In practice, the number of Worker processes is usually set to the number of CPU cores, and each Worker process is bound to a specific core. This makes better use of the per-core CPU cache and lowers the cache miss rate, which improves request processing performance.
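
As a minimal sketch of the setup described above, the two standard Nginx directives below size the Worker pool to the CPU count and pin each Worker to a core. The auto value for worker_cpu_affinity assumes Nginx 1.9.10 or later; older versions need explicit CPU masks.

```nginx
# Start one Worker process per available CPU core.
worker_processes auto;

# Bind each Worker to its own core (automatic binding needs Nginx 1.9.10+).
worker_cpu_affinity auto;
```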

LuaJIT

OpenResty originally used standard Lua by default and has used LuaJIT by default since version 1.5.8.1, because LuaJIT has a large performance advantage over standard Lua.

First of all, in addition to a Lua interpreter implemented in assembly, the LuaJIT runtime has a JIT compiler that can generate machine code directly. At first, LuaJIT behaves like standard Lua: Lua code is compiled into bytecode, and the bytecode is interpreted by the LuaJIT interpreter. The difference is that LuaJIT's interpreter records runtime statistics while it executes the bytecode, such as how many times each Lua function entry is actually called and how many times each Lua loop is actually executed. When one of these counts exceeds a certain threshold, the corresponding function entry or loop is considered hot enough, and the JIT compiler is triggered. The JIT compiler tries to compile the Lua code path that starts at the entry of the hot function or at a position in the hot loop. Compilation converts the LuaJIT bytecode into LuaJIT's own intermediate representation (IR) and then generates machine code for the target machine. This is similar to how the JIT compiler in Java works. Both are the same kind of optimization for making programs run faster, so understanding one helps in understanding the other.
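
One way to watch this process, sketched here under assumptions, is to enable the jit.v module bundled with LuaJIT so that it logs each trace the JIT compiler records. The /tmp/jit.log path and the /hot location are placeholders chosen for illustration, and the fragment omits the events block a full nginx.conf needs.

```nginx
http {
    init_by_lua_block {
        -- jit.v ships with LuaJIT and writes one line per recorded or aborted trace.
        local v = require "jit.v"
        v.on("/tmp/jit.log")   -- placeholder output path
    }

    server {
        listen 8080;

        location /hot {
            content_by_lua_block {
                -- A plain numeric loop: after enough iterations it becomes
                -- hot and the JIT compiler records a trace for it.
                local sum = 0
                for i = 1, 1e6 do
                    sum = sum + i
                end
                ngx.say(sum)
            }
        }
    }
}
```

After a request to /hot, the log file should contain trace entries that point at the loop's line in the configuration.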

Secondly, LuaJIT tightly integrates FFI (Foreign Function Interface; it is not available as a separate module), which lets you call external C functions and use C data structures directly in Lua code. FFI does the Lua/C binding work by parsing ordinary C declarations. The code that the JIT compiler generates for accessing C data structures from Lua is on par with the code a C compiler would generate. Unlike function calls bound through the classic Lua/C API, calls to C functions can be inlined in JIT-compiled code, so FFI is not only simpler but also faster than the traditional Lua/C API.

The following is a simple call example:

local ffi = require("ffi")
ffi.cdef[[
int printf(const char *fmt, ...);
]]
ffi.C.printf("Hello %s!", "world")

With just a few lines of code, you can call C's printf function directly from Lua and print Hello world!. Similarly, FFI can be used to call C functions in Nginx and OpenSSL to do more.

How OpenResty works

OpenResty is a high-performance web platform based on Nginx, so its efficient operation is inseparable from Nginx.

Nginx processes HTTP requests in 11 execution stages, as we can see from the source code of ngx_http_core_module.h:

typedef enum {
    NGX_HTTP_POST_READ_PHASE = 0,

    NGX_HTTP_SERVER_REWRITE_PHASE,

    NGX_HTTP_FIND_CONFIG_PHASE,
    NGX_HTTP_REWRITE_PHASE,
    NGX_HTTP_POST_REWRITE_PHASE,

    NGX_HTTP_PREACCESS_PHASE,

    NGX_HTTP_ACCESS_PHASE,
    NGX_HTTP_POST_ACCESS_PHASE,

    NGX_HTTP_PRECONTENT_PHASE,

    NGX_HTTP_CONTENT_PHASE,

    NGX_HTTP_LOG_PHASE
} ngx_http_phases;

Coincidentally, OpenResty also has 11 *_by_lua directives, which are closely related to the 11 execution phases of Nginx. Directives are the basic building blocks for scripting Nginx with Lua. They specify when the user's Lua code runs and how its result is used. The following figure shows the order in which the different directives are executed, which helps clarify the logic of the scripts we write.

Among them, init_by_lua runs only when the Master process is created, and init_worker_by_lua runs only when each Worker process is created. The other *_by_lua directives are triggered by client requests and run repeatedly.

The execution timing and use of each OpenResty directive are described below.

Embed Lua code during Nginx startup

init_by_lua*: Lua code run at the Lua VM level as soon as Nginx parses the configuration file (in the Master process). This phase is typically used to preload Lua modules and shared read-only data, so that the operating system's COW (copy-on-write) feature saves some memory. However, HTTP requests cannot be made in the init_by_lua phase to fetch remote configuration, which is somewhat inconvenient for initialization work.
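
A minimal sketch of preloading in this phase, assuming the cjson module bundled with OpenResty; the app_defaults table and its fields are hypothetical and only illustrate shared read-only data.

```nginx
http {
    init_by_lua_block {
        -- Load the module once in the Master process; forked Workers
        -- share the loaded code through copy-on-write pages.
        require "cjson"

        -- Hypothetical read-only table, stored as a global on purpose so
        -- that request handlers in every Worker can read it.
        app_defaults = { region = "default", timeout_ms = 3000 }
    }
}
```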

init_worker_by_lua*: Called when each Nginx Worker process starts. This phase is typically used to set up timed tasks, such as dynamic discovery of upstream service nodes and health checks. The HTTP requests that cannot be made in the init_by_lua phase can also be performed in timed tasks created at this stage.
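
The timed-task pattern can be sketched as follows, using ngx.timer.every from lua-nginx-module. The 5-second interval and the body of the check function are assumptions; a real task would put its health check or configuration fetch where the log call is.

```nginx
http {
    init_worker_by_lua_block {
        -- Timer callbacks receive a "premature" flag that is true
        -- when the Worker is shutting down.
        local function check(premature)
            if premature then
                return
            end
            -- Cosockets are available inside timers, so remote calls
            -- that init_by_lua cannot make would go here.
            ngx.log(ngx.INFO, "periodic check in worker ", ngx.worker.id())
        end

        -- Run the check every 5 seconds in this Worker.
        local ok, err = ngx.timer.every(5, check)
        if not ok then
            ngx.log(ngx.ERR, "failed to create timer: ", err)
        end
    }
}
```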

Embed Lua code when OpenSSL handles SSL protocol

ssl_certificate_by_lua*: Uses the SSL_CTX_set_cert_cb feature of the OpenSSL library (requires version 1.0.2e or above) to run Lua code when Nginx is about to start the SSL handshake for a downstream client connection. It can be used to set the SSL certificate chain and the corresponding private key for each request, and SSL handshake traffic control can be done in this context without blocking.
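
A hedged sketch of per-connection certificate selection with the ngx.ssl module from lua-resty-core. The my_cert_store module and its lookup function are hypothetical stand-ins for wherever the PEM data really lives, and the host name and file paths are placeholders.

```nginx
server {
    listen 443 ssl;
    server_name example.com;

    # Nginx still requires static certificates at startup; they act as a fallback.
    ssl_certificate     /path/to/fallback.crt;
    ssl_certificate_key /path/to/fallback.key;

    ssl_certificate_by_lua_block {
        local ssl = require "ngx.ssl"

        -- Hypothetical application module that returns PEM strings for a host name.
        local store = require "my_cert_store"

        local name = ssl.server_name()   -- SNI name sent by the client, may be nil
        local pem_cert, pem_key = store.lookup(name)
        if not pem_cert then
            return   -- keep the fallback certificate
        end

        -- Convert PEM to DER; assert aborts the handshake on failure.
        local der_cert = assert(ssl.cert_pem_to_der(pem_cert))
        local der_key  = assert(ssl.priv_key_pem_to_der(pem_key))

        assert(ssl.clear_certs())            -- drop the fallback certificate
        assert(ssl.set_der_cert(der_cert))
        assert(ssl.set_der_priv_key(der_key))
    }
}
```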

Embed Lua code in the 11 HTTP stages

set_by_lua*: Adds Lua code to the command list of the official Nginx ngx_http_rewrite_module. Because ngx_http_rewrite_module does not support non-blocking I/O in its commands, Lua APIs that need to yield the current Lua "light thread" cannot work in this phase. The Nginx event loop is blocked while code in this phase runs, so time-consuming operations must be avoided here. It is generally used to run short, fast code that sets variables.
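
A small sketch of setting a variable this way; the /sum location and the variable names are made up for illustration.

```nginx
location /sum {
    set $a 32;
    set $b 56;

    # Compute $sum with short, non-blocking Lua code.
    set_by_lua_block $sum {
        return tonumber(ngx.var.a) + tonumber(ngx.var.b)
    }

    content_by_lua_block {
        ngx.say("sum = ", ngx.var.sum)
    }
}
```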

rewrite_by_lua*: Adds Lua code to the rewrite phase of the 11 phases and runs it for each request as an independent module. Lua code in this phase can make API calls and runs as a newly created coroutine in an independent global environment (that is, a sandbox). Many things can be done in this phase, such as calling external services, forwarding, and redirecting.
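
For instance, a redirect in the rewrite phase might look like the sketch below; the /old and /new paths are placeholders.

```nginx
location /old {
    rewrite_by_lua_block {
        -- Send a permanent redirect before any later phase runs.
        return ngx.redirect("/new", 301)
    }
}
```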

access_by_lua*: Adds Lua code to the access phase of the 11 phases. Like rewrite_by_lua, it runs the Lua code for each request as an independent module. Lua code in this phase can make API calls and runs as a newly created coroutine in an independent global environment (that is, a sandbox). It is generally used for access control, permission checks, and similar tasks.
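
A minimal access-control sketch. The X-Token header and the hard-coded secret are assumptions for illustration; a real check would validate the token against a trusted source.

```nginx
location /admin {
    access_by_lua_block {
        -- Reject the request unless it carries the expected token.
        local token = ngx.req.get_headers()["X-Token"]
        if token ~= "secret" then
            return ngx.exit(ngx.HTTP_FORBIDDEN)
        end
    }

    content_by_lua_block {
        ngx.say("access granted")
    }
}
```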

content_by_lua*: Runs the Lua code for each request as the sole content handler in the content phase of the 11 phases, to generate the response content. Note that this directive must not be used together with other content handler directives in the same location. For example, it should not be combined with proxy_pass in the same location.

log_by_lua*: Adds Lua code to the log phase of the 11 phases. It does not replace the access log of the current request but runs before it. It is generally used for request statistics and logging.
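
Request statistics in the log phase can be sketched with a shared dictionary. The stats zone name, its 1m size, and the key name are assumptions.

```nginx
http {
    # Shared memory zone visible to all Workers.
    lua_shared_dict stats 1m;

    server {
        listen 8080;

        location / {
            content_by_lua_block {
                ngx.say("hello")
            }

            log_by_lua_block {
                -- Count the request; the third argument initializes
                -- the key to 0 when it does not exist yet.
                ngx.shared.stats:incr("requests", 1, 0)
            }
        }
    }
}
```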

Embed Lua code during load balancing

balancer_by_lua*: Adds Lua code to the init_upstream callback of the reverse proxy module to choose the upstream server address, for upstream load balancing control. This Lua execution context does not support yielding, so Lua APIs that may yield (such as cosockets and "light threads") are disabled here. This restriction can usually be worked around by doing such operations in an earlier phase (such as access_by_lua) and passing the result to this context through ngx.ctx.
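
The ngx.ctx hand-off described above can be sketched as follows, using the ngx.balancer module from lua-resty-core. The peer addresses, the ports, and the canary query argument are all hypothetical.

```nginx
upstream backend {
    server 0.0.0.1;   # placeholder entry; the real peer is chosen in Lua

    balancer_by_lua_block {
        local balancer = require "ngx.balancer"

        -- Read the decision made earlier in access_by_lua_block.
        local port = ngx.ctx.backend_port or 8081

        local ok, err = balancer.set_current_peer("127.0.0.1", port)
        if not ok then
            ngx.log(ngx.ERR, "failed to set the current peer: ", err)
            return ngx.exit(500)
        end
    }
}

server {
    listen 8080;

    location / {
        access_by_lua_block {
            -- Yielding APIs such as cosockets are allowed here, so any
            -- lookup can happen in this phase. The rule below is made up.
            if ngx.var.arg_canary == "1" then
                ngx.ctx.backend_port = 8082
            else
                ngx.ctx.backend_port = 8081
            end
        }

        proxy_pass http://backend;
    }
}
```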

Embed Lua code when filtering responses

header_filter_by_lua*: Embeds Lua code in the response header filter phase to process response headers.

body_filter_by_lua*: Embeds Lua code in the response body filter phase to process the response body. Note that this phase may be invoked several times for one request, because the response body may be delivered in chunks. The Lua code in this directive may therefore run multiple times during the lifetime of a single HTTP request.
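
The two filter directives are often used together, as in this sketch. The /upper location and the X-Filtered header are illustrative names.

```nginx
location /upper {
    content_by_lua_block {
        ngx.say("hello world")
    }

    header_filter_by_lua_block {
        -- The body filter below may change the body, so drop any
        -- Content-Length header and add a marker header.
        ngx.header.content_length = nil
        ngx.header["X-Filtered"] = "1"
    }

    body_filter_by_lua_block {
        -- ngx.arg[1] is the current chunk; ngx.arg[2] is true on the last chunk.
        -- This code runs once per chunk.
        ngx.arg[1] = string.upper(ngx.arg[1])
    }
}
```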

OpenResty quick experience

Now that we understand OpenResty's architecture and basic working principles, let's get started with a simple example on macOS, the system we work with.

Install OpenResty

$ brew tap openresty/brew
$ brew install openresty

Create working directory

$ mkdir ordemo
$ cd ordemo
$ mkdir logs/ conf/

Create nginx configuration file

In the conf directory, create the Nginx configuration file nginx.conf with the following content:

error_log logs/error.log debug;
pid logs/nginx.pid;

events {
    worker_connections 1024;
}

http {
    access_log logs/access.log;

    server {
        listen 8080;
        location / {
            content_by_lua_block {
                ngx.say("Welcome to OpenResty!")
            }
        }
    }
}

Start service

$ cd ordemo
$ openresty -p `pwd` -c conf/nginx.conf

# Stop the service
$ openresty -p `pwd` -c conf/nginx.conf -s stop

If no error is reported, OpenResty has started successfully. You can send a request from a browser or with curl:

$ curl -i 127.0.0.1:8080
HTTP/1.1 200 OK
Server: openresty/1.19.3.1
Date: Tue, 29 Jun 2021 08:55:51 GMT
Content-Type: text/plain
Transfer-Encoding: chunked
Connection: keep-alive

Welcome to OpenResty!

This is the simplest development flow for a service based on OpenResty: Lua code is embedded in the content phase, one of the 11 phases of Nginx HTTP request processing, and generates the response body directly.

Application of OpenResty in Dewu

The infrastructure team has developed a traffic routing component (API-ROUTE) based on OpenResty for remote multi-active deployment and small items. This component mainly identifies the user ID in a request and routes the request dynamically according to routing rules. It also implements gray-release traffic splitting based on client IP and user ID. According to the plan, it will take on more roles in the future.

Does the demo above look very simple? Does it remind you of Hello World, the first program in any programming language? Hello World may look simple, but the execution process hidden behind it is not. Likewise, OpenResty is not as simple as it looks here. There are many cultural and technical details hidden behind it.

Finally, anyone interested in OpenResty is welcome to share and discuss their learning progress.

Text/words
