A 20,000-word summary: a systematic, comprehensive guide to Nginx

Preface

As a developer, have you ever been asked by your lead to log in to a server and change the Nginx configuration, only to dodge the request with "I'm a developer, I don't know how"? Today, let's leave that embarrassment behind and take a step toward being a "real" programmer!

Nginx overview

Nginx is an open source, high-performance, highly reliable web and reverse proxy server that supports hot deployment. It can run 24/7 almost without interruption, going for months without a restart, and the software itself can be upgraded without interrupting service. Performance is Nginx's most important consideration: it uses little memory, handles concurrency well, and can support up to 50,000 concurrent connections. Best of all, Nginx is free and can be used commercially, and it is relatively simple to configure and use.

Nginx features

High concurrency and high performance;
Modular architecture makes it very scalable;
Asynchronous, non-blocking, event-driven model, similar to Node.js;
Highly reliable: compared with other servers, it can run for months or even longer without a restart;
Hot deployment and smooth upgrades;
Fully open source, with a thriving ecosystem;

Nginx role

The most important usage scenarios of Nginx:

Static resource serving, delivering files from the local file system;
Reverse proxy services, which extend to caching, load balancing, and so on;
API services, for example with OpenResty;

Front-end developers are no strangers to Node.js. Nginx and Node.js share many concepts (HTTP server, event-driven, asynchronous non-blocking, and so on), and most of what Nginx does can also be implemented with Node.js. But the two do not conflict; each has its own area of expertise. Nginx is good at handling low-level server-side resources (serving and forwarding static resources, reverse proxying, load balancing, and so on), while Node.js is better at upper-level business logic. The two combine very well.

Use a picture to show:

Nginx installation

This article demonstrates installing Nginx on Linux CentOS 7.x. Installation on other operating systems is simple, and instructions are easy to find online.

Use yum to install Nginx:

yum install nginx -y

After the installation is complete, use the rpm -ql nginx command to list the files installed by the Nginx package:

# Nginx configuration files
/etc/nginx/nginx.conf # main Nginx configuration file
/etc/nginx/nginx.conf.default

# Executable files
/usr/bin/nginx-upgrade
/usr/sbin/nginx

# Nginx library files
/usr/lib/systemd/system/nginx.service # used to configure the system daemon
/usr/lib64/nginx/modules # Nginx module directory

# Documentation
/usr/share/doc/nginx-1.16.1
/usr/share/doc/nginx-1.16.1/CHANGES
/usr/share/doc/nginx-1.16.1/README
/usr/share/doc/nginx-1.16.1/README.dynamic
/usr/share/doc/nginx-1.16.1/UPGRADE-NOTES-1.6-to-1.10

# Static resource directory
/usr/share/nginx/html/404.html
/usr/share/nginx/html/50x.html
/usr/share/nginx/html/index.html

# Nginx log directory
/var/log/nginx

Two directories deserve particular attention:

/etc/nginx/conf.d/ is where sub-configuration files are stored; the main configuration file /etc/nginx/nginx.conf includes every sub-configuration file in this folder by default;
/usr/share/nginx/html/ is where static files are usually placed, although you can put them elsewhere if you prefer;

Nginx common commands

systemctl commands:

# Start on boot
systemctl enable nginx # start Nginx automatically on boot
systemctl disable nginx # do not start Nginx automatically on boot

# Start Nginx
systemctl start nginx # once Nginx has started, visiting the host IP shows the default Nginx page

# Stop Nginx
systemctl stop nginx

# Restart Nginx
systemctl restart nginx

# Reload Nginx
systemctl reload nginx

# Check the running status of Nginx
systemctl status nginx

# List the Nginx processes
ps -ef | grep nginx

# Kill an Nginx process
kill -9 pid # use the process ID found above; -9 forces the process to terminate

Nginx application commands:

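nginx -s reload  # send a signal to the master process to reload the configuration file (hot reload)
nginx -s reopen  # reopen the log files
nginx -s stop    # fast shutdown
nginx -s quit    # shut down after the worker processes finish handling current requests
nginx -T         # test the configuration and print the final, fully expanded configuration
nginx -t         # check the configuration for errors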

Nginx core configuration

Configuration file structure

A typical configuration example of Nginx

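# main section
user  nginx;                        # user the workers run as; nginx is the default here, so this can be omitted
worker_processes  auto;             # number of worker processes, usually equal to the number of CPU cores
error_log  /var/log/nginx/error.log warn;   # location of the Nginx error log
pid        /var/run/nginx.pid;      # where the pid is stored when the Nginx service starts

# events section
events {
    use epoll;     # use the epoll I/O model (if omitted, Nginx picks the method best suited to your operating system)
    worker_connections 1024;   # maximum number of concurrent connections per worker process
}

# http section
# The most frequently used part: proxying, caching, log definitions, and most other features and third-party modules are configured here
http {
    # log format
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;   # location of the Nginx access log

    sendfile            on;   # enable efficient file transfer
    tcp_nopush          on;   # reduce the number of network segments sent
    tcp_nodelay         on;
    keepalive_timeout   65;   # how long a connection is kept alive (the timeout), in seconds
    types_hash_max_size 2048;

    include             /etc/nginx/mime.types;      # mapping of file extensions to MIME types
    default_type        application/octet-stream;   # default MIME type

    include /etc/nginx/conf.d/*.conf;   # load the sub-configuration files

    # server section
    server {
        listen       80;           # port to listen on
        server_name  localhost;    # domain name

        # location section
        location / {
            root   /usr/share/nginx/html;  # site root directory
            index  index.html index.htm;   # default index files
            deny 172.168.22.11;    # IP address that is denied access; can also be all
            allow 172.168.33.44;   # IP address that is allowed access; can also be all
        }

        error_page 500 502 503 504 /50x.html;  # page served for 50x errors
        error_page 400 404 /error.html;        # same as above, for 400 and 404
    }
}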

main: global configuration that applies to the whole server;
events: configuration that affects the network connections between the Nginx server and its users;
http: configures proxying, caching, log definitions, and most other features, including third-party modules;
server: configures the parameters of a virtual host; one http block can contain multiple server blocks;
location: configures URI matching;
upstream: configures the addresses of the back-end servers; an indispensable part of load balancing configuration;

Use a picture to clearly show its hierarchical structure:
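
In text form, the nesting of these sections can be sketched as a bare skeleton. This is only an illustration of the hierarchy described above; the upstream name app_backend, the address 127.0.0.1:8081, and the domain example.com are made up for the sketch.

```nginx
# main context: directives written outside of any block
worker_processes auto;

events {
    # connection handling settings
    worker_connections 1024;
}

http {
    # upstream blocks sit directly inside http
    upstream app_backend {              # hypothetical name
        server 127.0.0.1:8081;          # hypothetical back-end address
    }

    # one virtual host; an http block may contain several server blocks
    server {
        listen 80;
        server_name example.com;        # hypothetical domain

        # URI matching happens inside a server block
        location / {
            proxy_pass http://app_backend;
        }
    }
}
```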

Core parameters of the main section of the configuration file

user

Specifies the user and group that own the Nginx worker processes.

user USERNAME [GROUP];

user nginx lion; # the user is nginx; the group is lion

pid

Specifies where the pid file of the Nginx master process is stored.

pid /opt/nginx/logs/nginx.pid; # the pid of the master process is written to the nginx.pid file

worker_rlimit_nofile

Specifies the maximum number of file handles that a worker process can open.

worker_rlimit_nofile 20480; # can be thought of as the upper limit on connections per worker process

worker_rlimit_core

Specifies the size limit of the core file written when a worker process terminates abnormally; the core file is used to record and analyze the problem.

worker_rlimit_core 50M; # size limit of the core file
working_directory /opt/nginx/tmp; # directory where the core file is stored

worker_processes

Specifies the number of worker processes that Nginx starts.

worker_processes 4; # a specific number of worker processes
worker_processes auto; # match the number of CPU cores on the machine

worker_cpu_affinity

Binds each worker process to a CPU core.

worker_cpu_affinity 0001 0010 0100 1000; # 4 cores, 4 worker processes

Binding each worker process to a specific CPU core keeps the same worker from migrating between cores, which would invalidate CPU caches and reduce performance. It does not, however, prevent context switches between processes.
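
As a small sketch of two other ways this binding is commonly written: the auto form assumes Nginx 1.9.10 or later on Linux or FreeBSD, and the bitmask form follows the same convention as the example above, where the rightmost bit stands for CPU0. The second option is commented out because a configuration should contain only one of the two.

```nginx
# Option 1: let Nginx bind the workers to the available cores by itself
worker_processes auto;
worker_cpu_affinity auto;

# Option 2 (alternative): 2 workers on a 4-core machine.
# Each bitmask selects the cores one worker may run on:
# worker 1 -> CPU0 and CPU2, worker 2 -> CPU1 and CPU3.
# worker_processes 2;
# worker_cpu_affinity 0101 1010;
```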

worker_priority

Specifies the nice value of the worker processes, which adjusts the priority at which Nginx runs. It is usually set to a negative value so that Nginx is scheduled first.

worker_priority -10; # 120 - 10 = 110, so 110 is the final priority

The default priority value of a Linux process is 120; the smaller the value, the higher the priority. The nice value ranges from -20 to +19.

[Remark] The final priority of an application is the default value of 120 plus its nice value. The smaller the value, the higher the priority.

worker_shutdown_timeout

Specifies the timeout for the graceful shutdown of worker processes.

worker_shutdown_timeout 5s;

timer_resolution

Sets the precision of the timer used inside worker processes. The larger the interval, the fewer system calls are made, which helps performance; the smaller the interval, the more system calls are made and the lower the performance.

timer_resolution 100ms;

On Linux, a program has to ask the operating system kernel whenever it wants the current time. Every such request carries overhead, so the larger the interval, the smaller the overhead.

daemon

Specifies whether Nginx runs in the foreground or the background. The foreground is used for debugging and the background for production.

daemon off; # the default is on, which runs Nginx in the background

Core parameters of the events section of the configuration file

use

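Specifies which event-driven model Nginx uses.

use method; # setting this is not recommended; let Nginx choose the method itself

Possible values for method: select, poll, kqueue, epoll, /dev/poll, eventport.

worker_connections

The maximum number of concurrent connections that a single worker process can handle.

worker_connections 1024; # each worker process can handle at most 1024 connections

accept_mutex

Whether to enable the mutex that makes worker processes accept new connections in turn, which balances the load among them.

accept_mutex on; # off by default; the author recommends turning it on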

server_name directive

Specify the virtual host domain name.

server_name name1 name2 name3

# Example:
server_name www.nginx.com;

Four ways to write domain name matching:

Exact match: server_name www.nginx.com;
Leading wildcard: server_name *.nginx.com;
Trailing wildcard: server_name www.nginx.*;
Regular expression: server_name ~^www\.nginx\..*$;
Matching priority: exact match > leading wildcard match > trailing wildcard match > regular expression match

server_name configuration example:

1. Configure local DNS resolution: vim /etc/hosts (macOS)

# Add the following entries, where 121.42.11.34 is the IP address of the Alibaba Cloud server
121.42.11.34 www.nginx-test.com
121.42.11.34 mail.nginx-test.com
121.42.11.34 www.nginx-test.org
121.42.11.34 doc.nginx-test.com
121.42.11.34 www.nginx-test.cn
121.42.11.34 fe.nginx-test.club

[Note] Made-up domain names are used here for testing, so local DNS resolution has to be configured. If you use a domain name purchased on Alibaba Cloud, set up its DNS resolution on Alibaba Cloud instead.

2. Configure Nginx on the Alibaba Cloud server: vim /etc/nginx/nginx.conf

# Only the server blocks inside the http block are listed here

# Leading-wildcard match
server {
 listen 80;
 server_name *.nginx-test.com;
 root /usr/share/nginx/html/nginx-test/left-match/;
 location / {
  index index.html;
 }
}

# Regular-expression match
server {
 listen 80;
 server_name ~^.*\.nginx-test\..*$;
 root /usr/share/nginx/html/nginx-test/reg-match/;
 location / {
  index index.html;
 }
}

# Trailing-wildcard match
server {
 listen 80;
 server_name www.nginx-test.*;
 root /usr/share/nginx/html/nginx-test/right-match/;
 location / {
  index index.html;
 }
}

# Exact match
server {
 listen 80;
 server_name www.nginx-test.com;
 root /usr/share/nginx/html/nginx-test/all-match/;
 location / {
  index index.html;
 }
}

3. Access analysis

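Visiting www.nginx-test.com matches every server block, so the highest-priority "exact match" is chosen;
Visiting mail.nginx-test.com uses the "leading-wildcard match";
Visiting www.nginx-test.org uses the "trailing-wildcard match";
Visiting doc.nginx-test.com uses the "leading-wildcard match";
Visiting www.nginx-test.cn uses the "trailing-wildcard match";
Visiting fe.nginx-test.club uses the "regular-expression match";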
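
The analysis above only covers host names that match at least one server_name. When nothing matches, Nginx falls back to the default server for that port: the first server block listed for it, unless one is explicitly flagged with default_server. A minimal sketch of a catch-all block follows; using server_name _ and closing the connection with the Nginx-specific code 444 are illustrative choices, not part of the original setup.

```nginx
# Catch-all virtual host for requests whose Host matches no other server_name
server {
    listen 80 default_server;   # explicitly mark this block as the default for port 80
    server_name _;              # placeholder name that never matches a real host
    return 444;                 # Nginx-specific code: close the connection without a response
}
```

Without such a block, an unexpected host name is simply served by whichever server block happens to come first in the configuration.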

root

Specifies the directory where static resources are located. It can be used in the http, server, and location contexts, among others.

root path;

For example:

location /image {
 root /opt/nginx/static;
}

When a user visits www.test.com/image/1.png, the file actually looked up on the server is /opt/nginx/static/image/1.png.

[Note] root appends the request URI to the configured path, whereas alias uses only the configured path.

alias

alias also specifies the directory where static resources are located, but it can only be used inside location.

location /image {
 alias /opt/nginx/static/image/;
}

When the user visits www.test.com/image/1.png, the path actually found on the server is /opt/nginx/static/image/1.png

[Note] When using alias, end the path with /. alias can only be used inside location.

location

Configures how request URIs are matched.

location [ = | ~ | ~* | ^~ ] uri {
 ...
}

Matching rules:

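= exact match;
~ regular expression match, case-sensitive;
~* regular expression match, case-insensitive;
^~ prefix match; once it matches, the search stops and regular expressions are not checked;
Matching priority: = > ^~ > ~ and ~* (tried in the order they appear in the configuration) > no modifier.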

Examples:

server {
  listen 80;
  server_name www.nginx-test.com;
  
  # Matches only when visiting www.nginx-test.com/match_all/, which serves /usr/share/nginx/html/match_all/index.html
  location = /match_all/ {
      root /usr/share/nginx/html;
      index index.html;
  }
  
  # Requests such as www.nginx-test.com/1.jpg are served from /usr/share/nginx/images/1.jpg
  location ~ \.(jpeg|jpg|png|svg)$ {
   root /usr/share/nginx/images;
  }
  
  # Visiting www.nginx-test.com/bbs/ matches /usr/share/nginx/html/bbs/index.html
  location ^~ /bbs/ {
   root /usr/share/nginx/html;
   index index.html index.htm;
  }
}

Trailing slash in location

location /test {
 ...
}

location /test/ {
 ...
}

Without the trailing /: when visiting www.nginx-test.com/test, Nginx first looks for a directory named test and, if it exists, serves index.html from it; if there is no test directory, Nginx then looks for a file named test.
With the trailing /: when visiting www.nginx-test.com/test, Nginx first looks for a directory named test and, if it exists, serves index.html from it; if there is no test directory, Nginx does not look for a file named test.

return

Stops processing the request and either returns a response code directly or redirects to another URL. Once return has executed, the remaining directives in the location are not executed.

return code [text];
return code URL;
return URL;

For example:

location / {
 return 404; # return a status code directly
}

location / {
 return 404 "pages not found"; # return a status code plus a text body
}

location / {
 return 302 /bbs; # return a status code plus a redirect URL
}

location / {
 return https://www.baidu.com; # return a redirect URL
}

rewrite

Rewrites the URL according to the specified regular expression.

Syntax: rewrite regex replacement [flag];
Context: server, location, if

Example: rewrite /images/(.*\.jpg)$ /pics/$1; # $1 is a backreference to the preceding parenthesized group (.*\.jpg)

Meaning of the optional flag values:

last: the rewritten URL starts a new round of processing: it enters the server block again and location matching is retried;
break: the rewritten URL is used directly and is not matched against other locations;
redirect: returns a 302 temporary redirect;
permanent: returns a 301 permanent redirect;

server{
  listen 80;
  server_name fe.lion.club; # must be configured in the local hosts file
  root html;
  location /search {
   rewrite ^/(.*) https://www.baidu.com redirect;
  }
  
  location /images {
   rewrite /images/(.*) /pics/$1;
  }
  
  location /pics {
   rewrite /pics/(.*) /photos/$1;
  }
  
  location /photos {
  
  }
}

Let's analyze this configuration:

Visiting fe.lion.club/search automatically redirects us to https://www.baidu.com.
Visiting fe.lion.club/images/1.jpg first rewrites the URL to fe.lion.club/pics/1.jpg, which matches the /pics location; there the URL is rewritten again to fe.lion.club/photos/1.jpg, which matches the /photos location, and Nginx finally looks for the static resource 1.jpg in the html/photos directory.
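
To see what the break flag changes in this chain, here is a hedged variation of the /images location from the configuration above. It is a sketch for comparison, not part of the original demo.

```nginx
server {
  listen 80;
  server_name fe.lion.club;
  root html;

  location /images {
    # break: stop rewriting and serve the new URI from inside this location,
    # so /images/1.jpg is looked up as html/pics/1.jpg
    rewrite /images/(.*) /pics/$1 break;
  }

  location /pics {
    # not reached for /images/... requests any more, because break
    # prevents a new location search after the rewrite above
    rewrite /pics/(.*) /photos/$1;
  }
}
```

With last, or with no flag as in the original example, the rewritten URI goes through location matching again. Nginx allows at most 10 such rounds per request and returns a 500 error after that, so rewrites that keep feeding each other are worth avoiding.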

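if directive

Syntax: if (condition) {...}

Context: server, location

Example:

if ($http_user_agent ~ Chrome) {
  rewrite /(.*) /browser/$1 break;
}

Conditions that can be used in condition:

$variable: when the condition is just a variable, an empty string or "0" is treated as false (before Nginx 1.0.1, any string starting with 0 was treated as false);
= or !=: equal or not equal;
~: regular expression match, case-sensitive;
!~: negated regular expression match;
~*: regular expression match, case-insensitive;
-f or !-f: tests whether a file exists or does not exist;
-d or !-d: tests whether a directory exists or does not exist;
-e or !-e: tests whether a file, directory, or symbolic link exists or does not exist;
-x or !-x: tests whether a file is executable or not executable;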

Examples:

server {
  listen 8080;
  server_name localhost;
  root html;
  
  location / {
    if ( $uri = "/images/" ) {
      rewrite (.*) /pics/ break;
    }
  }
}

When accessing localhost:8080/images/, it will enter the if judgment and execute the rewrite command.

autoindex

When a user request ends with /, Nginx lists the contents of the directory. This can be used to quickly set up a download site for static resources.

autoindex.conf configuration:

server {
  listen 80;
  server_name fe.lion-test.club;
  
  location /download/ {
    root /opt/source;
    
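    autoindex on; # enable autoindex; possible values: on | off
    autoindex_exact_size on; # set to off to show file sizes in KB, MB, or GB; the default, on, shows the exact size in bytes
    autoindex_format html; # format the listing as html; possible values: html | json | xml
    autoindex_localtime off; # set to on to show file times in the server's local time; the default, off, shows them in GMT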
  }
}

When accessing fe.lion-test.club/download/, the files under /opt/source/download/ on the server are listed, as shown in the following figure:

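Variables

Nginx exposes many variables to users. In the end they are all data produced while a request is being processed, which Nginx makes available in the form of variables.

Here are some variables commonly used in projects:

remote_addr # client IP address
remote_port # client port
server_addr # server IP address
server_port # server port
server_protocol # protocol of the request, for example HTTP/1.1
binary_remote_addr # client IP address in binary form
connection # serial number of the TCP connection, incrementing
connection_requests # number of requests made so far over the current TCP connection
uri # requested URI, without arguments
request_uri # requested URI, including arguments
scheme # protocol name, http or https
request_method # request method
request_length # total length of the request, including the request line, headers, and body
args # the full argument string
arg_NAME # value of the argument called NAME
is_args # "?" if the URL has arguments, otherwise empty
query_string # same as args
host # host name from the request line; if absent, the Host request header; failing that, the server_name matching the request
http_user_agent # the user's browser (User-Agent header)
http_referer # the page the request came from
http_via # every proxy server the request passes through adds its information here
http_cookie # the user's cookies
request_time # time spent processing the request so far
https # "on" if HTTPS is in use, otherwise empty
request_filename # full file system path of the file being requested
document_root # value of the root or alias directive that applies to the current request
limit_rate # upper limit on the speed at which the response is returned

Example, var.conf: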

server{
 listen 8081;
 server_name var.lion-test.club;
 root /usr/share/nginx/html;
 location / {
  return 200 "
remote_addr: $remote_addr
remote_port: $remote_port
server_addr: $server_addr
server_port: $server_port
server_protocol: $server_protocol
binary_remote_addr: $binary_remote_addr
connection: $connection
uri: $uri
request_uri: $request_uri
scheme: $scheme
request_method: $request_method
request_length: $request_length
args: $args
arg_pid: $arg_pid
is_args: $is_args
query_string: $query_string
host: $host
http_user_agent: $http_user_agent
http_referer: $http_referer
http_via: $http_via
request_time: $request_time
https: $https
request_filename: $request_filename
document_root: $document_root
";
 }
}

When we visit http://var.lion-test.club:8081/test/?pid=121414&cid=sadasd, Chrome downloads a file by default, because the Nginx configuration answers with the return directive. The contents of the downloaded file are shown below:

remote_addr: 27.16.220.84
remote_port: 56838
server_addr: 172.17.0.2
server_port: 8081
server_protocol: HTTP/1.1
binary_remote_addr: 
connection: 126
uri: /test/
request_uri: /test/?pid=121414&cid=sadasd
scheme: http
request_method: GET
request_length: 518
args: pid=121414&cid=sadasd
arg_pid: 121414
is_args: ?
query_string: pid=121414&cid=sadasd
host: var.lion-test.club
http_user_agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.182 Safari/537.36
http_referer: 
http_via: 
request_time: 0.000
https: 
request_filename: /usr/share/nginx/html/test/
document_root: /usr/share/nginx/html
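
The download most likely happens because the text body produced by return is sent with the default_type of the http block, which is application/octet-stream in the typical configuration shown earlier. A small sketch, assuming that is the cause, of making the browser display the text instead (only two of the variables are kept to keep it short):

```nginx
server {
  listen 8081;
  server_name var.lion-test.club;

  location / {
    # send the body as plain text so the browser renders it
    # instead of offering it as a download
    default_type text/plain;
    return 200 "uri: $uri\nargs: $args\n";
  }
}
```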

Nginx has a great many configuration options. The above lists only some of the commonly used ones; in real projects you still need to learn to consult the documentation.

Nginx application core concepts

A proxy is an intermediate server placed between the server and the client. It receives the client's request and forwards it to the server, then forwards the server's response back to the client.

Both forward proxies and reverse proxies perform this function.

Forward proxy

A forward proxy is a server that sits between the client and the origin server. To obtain content from the origin server, the client sends a request to the proxy and names the target (the origin server); the proxy then forwards the request to the origin server and returns the content it receives to the client.

A forward proxy serves us, that is, the client. Through a forward proxy, the client can reach server resources that it cannot access on its own.

The client is aware of the forward proxy, but the server is not: the server does not know whether a request comes from a proxy or from the real client.

Reverse proxy

A reverse proxy is a proxy server that accepts connection requests from the internet, forwards them to servers on the internal network, and returns the servers' results to the clients on the internet that made the requests. Seen from the outside, the proxy server then acts as the server itself. A reverse proxy serves the server: it receives client requests on the server's behalf and helps with request forwarding, load balancing, and so on.

The server is aware of the reverse proxy, but the client is not: we do not know that we are talking to a proxy server, while the server knows that the reverse proxy is working on its behalf.

Advantages of reverse proxy:

It hides the real servers;
Load balancing makes it easy to scale back-end dynamic services horizontally;
Dynamic and static separation improves the robustness of the system;

So what is "dynamic and static separation", and what is load balancing?

Dynamic and static separation

Dynamic and static separation is an architectural approach in which static pages are separated from dynamic pages, or static content interfaces from dynamic content interfaces, so that they are served by different systems, which improves the access performance and maintainability of the whole service.

In general, dynamic and static resources should be separated. Because Nginx offers high concurrency and features such as static resource caching, static resources are often deployed on Nginx. A request for a static resource is served directly from the static resource directory; a request for a dynamic resource is forwarded, via reverse proxying, to the corresponding back-end application. This is how dynamic and static separation is achieved.

Once static and dynamic content are separated, access to static resources becomes much faster, and it is not affected even when the dynamic services are unavailable.
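
A minimal sketch of what such a split can look like in a server block. The domain, the /static/ and /api/ prefixes, the 7-day expiry, and the back-end address 127.0.0.1:3000 are all assumptions made for the illustration.

```nginx
server {
  listen 80;
  server_name www.example.com;          # hypothetical domain

  # static resources: served straight from disk by Nginx
  location /static/ {
    root /usr/share/nginx/html;         # /static/app.js -> /usr/share/nginx/html/static/app.js
    expires 7d;                         # let browsers cache static files for 7 days
  }

  # dynamic requests: forwarded to the back-end application
  location /api/ {
    proxy_pass http://127.0.0.1:3000;   # hypothetical application server
  }
}
```

If the application on port 3000 goes down, requests under /api/ fail, but files under /static/ are still served.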

Load balancing

In general, the client sends multiple requests to the server and the server processes them. Some requests may need to work with resources such as databases or static files. When the server has finished, it returns the results to the client.

For early systems, with simple functional requirements and relatively few concurrent requests, this model was adequate and cheap. As the amount of information keeps growing, traffic and data volumes rise rapidly, and business logic becomes more complex, this approach can no longer keep up, and under very high concurrency the server is likely to crash.

This is clearly a problem caused by a server performance bottleneck. Apart from simply throwing more hardware at it, the most important remedy is load balancing.

When requests grow explosively, a single machine cannot cope no matter how powerful it is. This is where clusters come in: if one server cannot solve the problem, use several and distribute the requests among them. Spreading the load over different servers is load balancing, and its core idea is "sharing the pressure". When Nginx implements load balancing, it generally means forwarding requests to a cluster of servers.

To give a concrete example: at the subway during the evening rush hour, there is often a staff member with a loudspeaker at the station entrance calling out "Please use entrance B, it is less crowded and the trains are emptier...". That staff member is doing load balancing.

Nginx's load balancing strategies:

Round-robin (polling) strategy: the default; client requests are distributed to the servers in turn. It works well in general, but if one server is under heavy pressure and responds slowly, every user assigned to that server is affected.
Least-connections strategy: requests go first to the servers under less pressure. This evens out the length of each queue and avoids sending more requests to servers that are already heavily loaded.
Fastest-response-time strategy: requests go first to the server with the shortest response time.
Client IP binding strategy: requests from the same IP are always sent to the same server, which effectively solves the session sharing problem of dynamic web pages.

Nginx actual configuration

Before configuring reverse proxying, load balancing, and other features, there are two core pieces we must master: upstream and proxy_pass. They are arguably the heart of Nginx application configuration.

upstream

Defines the upstream servers (the application servers provided by the back end).

Syntax: upstream name {...}

Context: http

Example:

upstream back_end_server {
  server 192.168.100.33:8081;
}

Directives that can be used in upstream:

server: defines the address of an upstream server;
zone: defines shared memory, shared across worker processes;
keepalive: enables long (keepalive) connections to the upstream servers;
keepalive_requests: the maximum number of HTTP requests per long connection;
keepalive_timeout: how long an idle long connection is kept open;
hash: hash-based load balancing algorithm;
ip_hash: load balancing algorithm that hashes the client IP;
least_conn: least-connections load balancing algorithm;
least_time: shortest-response-time load balancing algorithm (available in the commercial NGINX Plus);
random: random load balancing algorithm;

server

Defines the address of an upstream server.

Syntax: server address [parameters];
Context: upstream

Optional values for parameters:

weight=number: the weight, 1 by default;
max_conns=number: the maximum number of concurrent connections to the upstream server;
fail_timeout=time: the time window in which failed attempts are counted, and how long the server is then considered unavailable;
max_fails=number: the number of failed attempts after which the server is considered unavailable;
backup: marks a backup server, which is used only when the other servers are unavailable;
down: marks the server as unavailable for a long time, for example during offline maintenance;

keepalive

Limits the maximum number of idle long connections that each worker process keeps open to the upstream servers.

Syntax: keepalive connections;
Context: upstream

Example: keepalive 16;

keepalive_requests

The maximum number of HTTP requests that a single long connection can handle.

Syntax: keepalive_requests number;
Default: keepalive_requests 100; (1000 since Nginx 1.19.10)
Context: upstream

keepalive_timeout

The longest time an idle long connection is kept open.

Syntax: keepalive_timeout time;
Default: keepalive_timeout 60s;
Context: upstream

Configuration example

upstream back_end {
  server 127.0.0.1:8081 weight=3 max_conns=1000 fail_timeout=10s max_fails=2;
  keepalive 32;
  keepalive_requests 50;
  keepalive_timeout 30s;
}

proxy_pass

Forwards requests to a proxied server.

Syntax: proxy_pass URL;
Context: location, if, limit_except

Example:

proxy_pass http://127.0.0.1:8081;
proxy_pass http://127.0.0.1:8081/proxy;

URL parameter principles

The URL must start with http or https;
Variables can be used in the URL;
Whether the URL includes a URI part directly affects the URL of the request sent upstream;

Next, let's look at two common forms of the URL:

proxy_pass http://192.168.100.33:8081;
proxy_pass http://192.168.100.33:8081/;

The only difference between the two is the trailing /, but it matters a great deal when configuring a proxy:

Without /, Nginx does not modify the user's URL and passes it to the upstream application server as is;
With /, Nginx modifies the user's URL by removing the part that matches the location;

Usage without /:

location /bbs/ {
  proxy_pass http://127.0.0.1:8080;
}

Analysis:

User request URL: /bbs/abc/test.html
URL of the request when it reaches Nginx: /bbs/abc/test.html
URL of the request when it reaches the upstream application server: /bbs/abc/test.html

Usage with /:

location /bbs/ {
  proxy_pass http://127.0.0.1:8080/;
}

Analysis:

User request URL: /bbs/abc/test.html
URL of the request when it reaches Nginx: /bbs/abc/test.html
URL of the request when it reaches the upstream application server: /abc/test.html
The /bbs prefix is not passed on, which mirrors the difference between root and alias.
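
The same replacement rule applies when the URI part of proxy_pass is longer than a single /. A small sketch, in which the /forum/ path on the upstream is a made-up example:

```nginx
location /bbs/ {
  # the part of the request URI that matched the location (/bbs/)
  # is replaced by the URI given here (/forum/)
  proxy_pass http://127.0.0.1:8080/forum/;
}
# /bbs/abc/test.html reaches the upstream as /forum/abc/test.html
#
# Caution: with "proxy_pass http://127.0.0.1:8080/forum;" (no trailing /)
# the same request would become /forumabc/test.html, so keep the trailing
# slashes of the location and of the proxy_pass URI consistent.
```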

Configure reverse proxy

To make the demonstration closer to a real setup, the author prepared two cloud servers with the public IPs 121.42.11.34 and 121.5.180.193.

We use the 121.42.11.34 server as the upstream server and configure it as follows:

# /etc/nginx/conf.d/proxy.conf
server{
  listen 8080;
  server_name localhost;
  
  location /proxy/ {
    root /usr/share/nginx/html;  # root appends the URI: /proxy/index.html -> /usr/share/nginx/html/proxy/index.html
    index index.html;
  }
}

# /usr/share/nginx/html/proxy/index.html
<h1> 121.42.11.34 proxy html </h1>

After the configuration is complete, reload Nginx with nginx -s reload.

Use the 121.5.180.193 server as the proxy server and configure it as follows:

# /etc/nginx/conf.d/proxy.conf
upstream back_end {
  server 121.42.11.34:8080 weight=2 max_conns=1000 fail_timeout=10s max_fails=3;
  keepalive 32;
  keepalive_requests 80;
  keepalive_timeout 20s;
}

server {
  listen 80;
  server_name proxy.lion.club;
  location /proxy {
   proxy_pass http://back_end/proxy;
  }
}
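
One detail worth knowing about the keepalive settings in this upstream block: in the Nginx versions used in this article, proxied requests go out over HTTP/1.0 with a Connection: close header by default, so the idle connection pool is only used once the location switches to HTTP/1.1 and clears that header. A hedged sketch of the same server block with the two extra directives from the proxy module:

```nginx
server {
  listen 80;
  server_name proxy.lion.club;

  location /proxy {
    proxy_pass http://back_end/proxy;
    proxy_http_version 1.1;           # keepalive connections to the upstream need HTTP/1.1
    proxy_set_header Connection "";   # clear the header so that "close" is not sent upstream
  }
}
```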

The local machine needs to access the proxy.lion.club domain name, so the local hosts file has to be configured. Open it with vim /etc/hosts and add the following:

121.5.180.193 proxy.lion.club

Analysis:

When proxy.lion.club/proxy is accessed, the upstream configuration resolves to 121.42.11.34:8080;
The access address therefore becomes http://121.42.11.34:8080/proxy;
Nginx connects to the 121.42.11.34 server and finds the server block listening on port 8080;
That server block locates the /usr/share/nginx/html/proxy/index.html resource, which is finally displayed.

Configure load balancing

Load balancing is configured mainly with the upstream directive.

We use the 121.42.11.34 server as the upstream server and configure it as follows (/etc/nginx/conf.d/balance.conf):

server{
  listen 8020;
  location / {
   return 200 'return 8020 \n';
  }
}

server{
  listen 8030;
  location / {
   return 200 'return 8030 \n';
  }
}

server{
  listen 8040;
  location / {
   return 200 'return 8040 \n';
  }
}

After the configuration is complete:

Run nginx -t to check that the configuration is correct;
Run nginx -s reload to reload the Nginx configuration;
Run ss -nlt to check that the ports are being listened on, which confirms that the Nginx service started correctly.

Use the 121.5.180.193 server as the proxy server and configure it as follows (/etc/nginx/conf.d/balance.conf):

upstream demo_server {
  server 121.42.11.34:8020;
  server 121.42.11.34:8030;
  server 121.42.11.34:8040;
}

server {
  listen 80;
  server_name balance.lion.club;
  
  location /balance/ {
   proxy_pass http://demo_server;
  }
}

After the configuration is complete, reload Nginx, then configure the mapping between IP and domain name on the client machine that will access it.

# /etc/hosts
121.5.180.193 balance.lion.club

Run the curl http://balance.lion.club/balance/ command on the client machine:

It is easy to see that the load balancing configuration has taken effect: each request is distributed to a different upstream server, using the simple round-robin strategy.
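
Round-robin can also be skewed with the server parameters introduced earlier. The following sketch reuses the three demo ports; the weight of 3 and the choice of which port acts as the backup are arbitrary values picked for illustration.

```nginx
upstream demo_server {
  server 121.42.11.34:8020 weight=3;   # receives roughly 3 of every 4 requests
  server 121.42.11.34:8030;            # default weight of 1
  server 121.42.11.34:8040 backup;     # used only when the other two are unavailable
}
```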

Next, let's learn about other distribution strategies of Nginx.

hash algorithm

A key is specified as the hash key, and the hash algorithm maps it to a specific upstream server. The key can contain variables and strings.

upstream demo_server {
  hash $request_uri;
  server 121.42.11.34:8020;
  server 121.42.11.34:8030;
  server 121.42.11.34:8040;
}

server {
  listen 80;
  server_name balance.lion.club;
  
  location /balance/ {
   proxy_pass http://demo_server;
  }
}

hash $request_uri uses the request_uri variable as the hash key. As long as the requested URI stays the same, the request is always distributed to the same server.
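As a rough illustration of key-based distribution, the Python sketch below checksums the key and takes the result modulo the number of servers. CRC32 and the helper name pick_by_hash are choices made for this sketch only; it is not meant to reproduce the exact function Nginx uses internally.

```python
import zlib

# Upstream addresses taken from the example above.
UPSTREAMS = ["121.42.11.34:8020", "121.42.11.34:8030", "121.42.11.34:8040"]


def pick_by_hash(key, servers):
    """Map a key to one server: checksum the key, then take it modulo the server count."""
    digest = zlib.crc32(key.encode("utf-8"))
    return servers[digest % len(servers)]


for uri in ("/balance/a", "/balance/b", "/balance/a"):
    print(uri, "->", pick_by_hash(uri, UPSTREAMS))
```

The same URI always produces the same checksum, so it always selects the same server, while different URIs may or may not share one.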

ip_hash

Requests are distributed according to the client's IP address: as long as the IP address does not change, the client is always assigned to the same host. This effectively solves the problem of keeping sessions on the same back-end server.

upstream demo_server {
  ip_hash;
  server 121.42.11.34:8020;
  server 121.42.11.34:8030;
  server 121.42.11.34:8040;
}

server {
  listen 80;
  server_name balance.lion.club;
  
  location /balance/ {
   proxy_pass http://demo_server;
  }
}
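A minimal Python sketch of the idea behind ip_hash follows. According to the Nginx documentation, the first three octets of an IPv4 address serve as the hashing key; the CRC32 checksum and the helper names ip_hash_key and pick_by_ip are assumptions made for illustration, not Nginx internals.

```python
import zlib

# Upstream addresses taken from the example above.
UPSTREAMS = ["121.42.11.34:8020", "121.42.11.34:8030", "121.42.11.34:8040"]


def ip_hash_key(client_ip):
    """Use only the first three octets of an IPv4 address as the hash key."""
    return ".".join(client_ip.split(".")[:3])


def pick_by_ip(client_ip, servers):
    """Pick a server from the checksum of the client's /24 network prefix."""
    digest = zlib.crc32(ip_hash_key(client_ip).encode("ascii"))
    return servers[digest % len(servers)]


for ip in ("10.0.0.1", "10.0.0.200", "172.16.5.9"):
    print(ip, "->", pick_by_ip(ip, UPSTREAMS))
```

Because only the first three octets are used, clients in the same /24 network end up on the same upstream server.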

Least connections algorithm

Each worker process reads the state of the back-end servers from shared memory and selects the server with the fewest currently established connections to handle the request.

Syntax: least_conn;
Context: upstream;

Example:

upstream demo_server {
  zone test 10M; # zone sets the name and size of the shared memory zone
  least_conn;
  server 121.42.11.34:8020;
  server 121.42.11.34:8030;
  server 121.42.11.34:8040;
}

server {
  listen 80;
  server_name balance.lion.club;
  
  location /balance/ {
   proxy_pass http://demo_server;
  }
}
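The selection rule can be sketched in a few lines of Python. The connection counts below are invented for illustration, and the sketch ignores server weights; pick_least_conn is a hypothetical helper, not an Nginx API.

```python
def pick_least_conn(active):
    """Return the server with the fewest active connections (first one wins ties)."""
    return min(active, key=active.get)


# Hypothetical snapshot of active connections per upstream server.
active = {
    "121.42.11.34:8020": 3,
    "121.42.11.34:8030": 1,
    "121.42.11.34:8040": 2,
}

chosen = pick_least_conn(active)
active[chosen] += 1  # the new request now counts as an active connection
print(chosen, active)
```

In Nginx the counters live in the shared memory zone, so every worker process sees the same numbers when it makes this choice.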

Finally, you will find that the configuration of load balancing is not complicated at all.

Configure caching

Caching is a very effective way to improve performance, which is why the client (browser), the proxy server (Nginx), and even the upstream server all use it. Let us learn how to set the caching strategy in Nginx.

proxy_cache

Store some resources that have been accessed before and may be accessed again, so that users can obtain them directly from the proxy server, thereby reducing the pressure on the upstream server and speeding up the overall access speed.

Syntax: proxy_cache zone | off; # zone is the name of the shared memory zone
Default: proxy_cache off;
Context: http, server, location

proxy_cache_path

Set the storage path of the cache file.

Syntax: proxy_cache_path path [levels=levels] ... (optional parameters are omitted here and listed below)
Default: none
Context: http

Parameter meaning:

path: the storage path of the cache files;
levels: the directory hierarchy levels under path;
keys_zone: the name and size of the shared memory zone;
inactive: cached data that is not accessed within this time is removed; the default is 10 minutes.

proxy_cache_key

Set the key of the cache file.

Syntax: proxy_cache_key string;
Default: proxy_cache_key $scheme$proxy_host$request_uri;
Context: http, server, location
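To see how the cache key and the levels parameter fit together, here is a hedged Python sketch of the on-disk layout described in the Nginx documentation: the file name is the MD5 of the key, and each level takes characters from the end of that digest as a directory name. The function names are made up for this sketch.

```python
import hashlib


def path_for_digest(cache_dir, levels, digest):
    """Build directory names from the tail of the digest, one per level."""
    parts = []
    end = len(digest)
    for width in (int(n) for n in levels.split(":")):
        parts.append(digest[end - width:end])
        end -= width
    return "/".join([cache_dir.rstrip("/")] + parts + [digest])


def cache_file_path(cache_dir, levels, key):
    """Hash the cache key with MD5 and place the file under the level directories."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return path_for_digest(cache_dir, levels, digest)


# Example: a key equal to the request URI, with a two-level 2:2 layout.
print(cache_file_path("/etc/nginx/cache_temp", "2:2", "/index.html"))
```

With levels=1:2 and a digest ending in ...029c, the file ends up under c/29/, which matches the example given in the Nginx documentation.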

proxy_cache_valid

Configure which response status codes are cached and for how long.

Syntax: proxy_cache_valid [code ...] time;
Context: http, server, location

Configuration example:

proxy_cache_valid 200 304 2m; # responses with status 200 and 304 are cached for 2 minutes

proxy_no_cache

Define the conditions for saving to the cache. If at least one value of the string parameter is not empty and not equal to "0", the response will not be saved to the cache.

Syntax: proxy_no_cache string ...;
Context: http, server, location

Example:

proxy_no_cache $http_pragma    $http_authorization;

proxy_cache_bypass

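Defines the conditions under which the response is not taken from the cache. If at least one value of the string parameters is not empty and not equal to "0", the request bypasses the cache.

Syntax: proxy_cache_bypass string ...;
Context: http, server, location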

Example:

proxy_cache_bypass $http_pragma    $http_authorization;
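Both proxy_no_cache and proxy_cache_bypass use the same test on their parameters. The Python sketch below shows that test applied to the Pragma and Authorization request headers from the examples; the function names and the dictionary-based request are assumptions made for illustration.

```python
def condition_met(*values):
    """True if at least one value is non-empty and not equal to "0"."""
    return any(v not in ("", "0") for v in values)


def cache_decision(headers):
    """Apply the same two variables to both directives, as in the examples above."""
    pragma = headers.get("Pragma", "")                # stands in for $http_pragma
    authorization = headers.get("Authorization", "")  # stands in for $http_authorization
    skip = condition_met(pragma, authorization)
    return {"bypass_cache": skip, "store_response": not skip}


print(cache_decision({}))
print(cache_decision({"Pragma": "no-cache"}))
```

A request without either header is served from and stored in the cache as usual, while a request carrying Pragma or Authorization goes straight to the upstream and its response is not saved.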

upstream_cache_status variable

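It records whether the request hit the cache. It can be added to the response headers, which is very useful for debugging.

MISS: the cache was not hit
HIT: the cache was hit
EXPIRED: the cached entry has expired
STALE: a stale cached entry was served
REVALIDATED: Nginx verified that the stale cached entry is still valid
UPDATING: the content is stale but is being updated
BYPASS: the response was fetched from the origin server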
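The three most common statuses can be reproduced with a toy cache in Python. This is a deliberately simplified model, not how Nginx stores entries: TinyCache is a hypothetical class, and the current time is passed in explicitly so the behaviour is easy to follow.

```python
class TinyCache:
    """A toy proxy cache that reports MISS, HIT or EXPIRED for each lookup."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (body, time the entry was stored)

    def fetch(self, key, now, load_from_upstream):
        entry = self.store.get(key)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0], "HIT"
        status = "MISS" if entry is None else "EXPIRED"
        body = load_from_upstream(key)  # go to the upstream server
        self.store[key] = (body, now)   # refresh the cached copy
        return body, status


cache = TinyCache(ttl_seconds=300)  # comparable to proxy_cache_valid 200 5m
statuses = [
    cache.fetch("/index.html", now, lambda key: "page body")[1]
    for now in (0, 10, 400)
]
print(statuses)
```

The first request misses, the second arrives within five minutes and hits, and the third arrives after the entry's lifetime has passed and is reported as expired.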

Configuration example

We use the 121.42.11.34 server as the upstream server and configure it as follows (/etc/nginx/conf.d/cache.conf):

server {
  listen 1010;
  root /usr/share/nginx/html/1010;
  location / {
   index index.html;
  }
}

server {
  listen 1020;
  root /usr/share/nginx/html/1020;
  location / {
   index index.html;
  }
}

Use the 121.5.180.193 server as the proxy server and configure it as follows (/etc/nginx/conf.d/cache.conf):

proxy_cache_path /etc/nginx/cache_temp levels=2:2 keys_zone=cache_zone:30m max_size=2g inactive=60m use_temp_path=off;

upstream cache_server{
  server 121.42.11.34:1010;
  server 121.42.11.34:1020;
}

server {
  listen 80;
  server_name cache.lion.club;
  location / {
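    proxy_cache cache_zone; # use the cache zone already defined in the configuration above
    proxy_cache_valid 200 5m; # cache responses with status 200 for 5 minutes
    proxy_cache_key $request_uri; # use the request URI as the cache key
    add_header Nginx-Cache-Status $upstream_cache_status; # send the cache status to the client as a response header
    proxy_pass http://cache_server; # proxy the request to the upstream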
  }
}

With the cache configured this way, the corresponding cache files can be found under the /etc/nginx/cache_temp path.

For some pages or data with very high real-time requirements, you should not set the cache. Let's see how to configure the content not to be cached.

...

server {
  listen 80;
  server_name cache.lion.club;
  # If the URI ends in .txt or .text, set the variable to "no cache"
  if ($request_uri ~ \.(txt|text)$) {
   set $cache_name "no cache";
  }
  
  location / {
    proxy_no_cache $cache_name; # if the variable has a value the response is not cached; if it is empty the response is cached
    proxy_cache cache_zone; # use the cache zone
    proxy_cache_valid 200 5m; # cache responses with status 200 for 5 minutes
    proxy_cache_key $request_uri; # use the request URI as the cache key
    add_header Nginx-Cache-Status $upstream_cache_status; # send the cache status to the client as a response header
    proxy_pass http://cache_server; # proxy the request to the upstream
  }
}

HTTPS

Before learning how to configure HTTPS, let's briefly review the HTTPS workflow: how does it encrypt traffic to ensure security?

HTTPS workflow

Configure certificate

Download the certificate archive, which contains an Nginx folder. Copy the xxx.crt and xxx.key files from it to a directory on the server, then configure the following:

server {
  listen 443 ssl http2 default_server;   # the SSL access port is 443
  server_name lion.club;         # the domain name bound to the certificate (an arbitrary example here)
  ssl_certificate /etc/nginx/https/lion.club_bundle.crt;   # certificate path
  ssl_certificate_key /etc/nginx/https/lion.club.key;      # private key path
  ssl_session_timeout 10m;
  ssl_protocols TLSv1 TLSv1.1 TLSv1.2; # supported SSL/TLS protocol versions (these three are the default); the mainstream version is TLSv1.2
 
  location / {
    root         /usr/share/nginx/html;
    index        index.html index.htm;
  }
}

After this configuration, the HTTPS version of the website can be accessed normally.

Configure cross-domain CORS

Let's first briefly review what cross-domain access means.

Cross-domain definition

The same-origin policy restricts how documents or scripts loaded from one origin can interact with resources from another origin. It is an important security mechanism for isolating potentially malicious files. Read operations between different origins are normally not allowed.

Definition of same origin

Two pages have the same origin if their protocol, port (if specified), and domain name are all the same.

The following URLs are compared against the origin of http://store.company.com/dir/page.html:

http://store.company.com/dir2/other.html    same origin
https://store.company.com/secure.html    different origin, the protocol differs
http://store.company.com:81/dir/etc.html    different origin, the port differs
http://news.company.com/dir/other.html    different origin, the host differs
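The comparison can be expressed as a short Python check. It is a simplified sketch that treats an origin as the (protocol, host, port) triple and fills in the default port when none is given; the helper names origin and same_origin are made up for illustration.

```python
from urllib.parse import urlsplit

# Ports implied by the protocol when the URL does not specify one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (protocol, host, port) triple that identifies an origin."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def same_origin(a, b):
    """Two URLs share an origin when protocol, host and port all match."""
    return origin(a) == origin(b)


BASE = "http://store.company.com/dir/page.html"
for other in (
    "http://store.company.com/dir2/other.html",
    "https://store.company.com/secure.html",
    "http://store.company.com:81/dir/etc.html",
    "http://news.company.com/dir/other.html",
):
    print(other, same_origin(BASE, other))
```

Only the first URL matches on all three parts; the path plays no role in the comparison.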

Different origins are subject to the following restrictions:

At the web data level, the same-origin policy prevents sites from a different origin from reading the current site's Cookie, IndexedDB, LocalStorage and other data.
At the DOM level, the same-origin policy prevents JavaScript from a different origin from reading or writing the current DOM objects.
At the network level, the same-origin policy restricts sending the site's data to a different origin through XMLHttpRequest and similar methods.

How Nginx solves the cross-domain problem

For example:

The domain name of the front-end server is: fe.server.com
The domain name of the back-end service is: dev.server.com
A request sent from a page on fe.server.com to dev.server.com is therefore a cross-domain request.

We only need to start an Nginx server, set server_name to fe.server.com, set a location that intercepts the requests that would otherwise be cross-domain, and proxy those requests to dev.server.com, as in the following configuration:

server {
 listen      80;
 server_name  fe.server.com;
 location / {
  proxy_pass http://dev.server.com;
 }
}

This neatly sidesteps the browser's same-origin policy: a page on fe.server.com requesting fe.server.com through Nginx is a same-origin request, and the request that Nginx forwards to the back-end server is not subject to the browser's same-origin policy.

Configure to enable gzip compression

gzip is one of the three standard compression formats specified for HTTP. Most websites currently use gzip to transfer resource files such as HTML, CSS, and JavaScript.

For text files the effect of gzip is very noticeable: once it is enabled, the traffic required for transmission drops to about 1/4 to 1/3 of the original.

Not every browser supports gzip. How do you know whether the client supports it? The Accept-Encoding request header lists the compression formats the client supports.

Enabling gzip requires support from both the client and the server. If the client can decode gzip, then gzip works as long as the server can return gzip-compressed content. We can make the server support gzip through the Nginx configuration. A Content-Encoding: gzip header in the response means that the server has applied gzip compression.
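The negotiation can be sketched in Python with the standard gzip module. The header parsing is intentionally rough (it ignores q-values), and the function names client_accepts_gzip and encode_body are hypothetical; the sketch only illustrates the idea of compressing when the client advertises support.

```python
import gzip


def client_accepts_gzip(accept_encoding):
    """Rough check of the Accept-Encoding request header; q-values are ignored."""
    codings = [c.split(";")[0].strip().lower() for c in accept_encoding.split(",")]
    return "gzip" in codings


def encode_body(body, accept_encoding):
    """Compress the body only if the client says it can decode gzip."""
    if client_accepts_gzip(accept_encoding):
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}


# Repetitive text, like typical HTML, compresses very well.
html = b"<li class='item'>hello nginx</li>\n" * 200
compressed, headers = encode_body(html, "gzip, deflate, br")
print(len(html), len(compressed), headers)
```

A client that does not list gzip in Accept-Encoding receives the original bytes with no Content-Encoding header.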

Create a new configuration file gzip.conf in the /etc/nginx/conf.d/ folder:

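# off by default; whether to enable gzip
gzip on; 
# MIME types to compress with gzip; text/html is always compressed;
gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;

# ---- The two directives above are enough to enable gzip compression ---- #

# off by default; when enabled, Nginx first checks whether a file ending in .gz exists for the requested static file and, if so, returns that .gz file directly;
gzip_static on;

# off by default; used when Nginx acts as a reverse proxy, to enable or disable gzip compression of responses to proxied requests;
gzip_proxied any;

# adds Vary: Accept-Encoding to the response headers so that proxy servers can tell from the Accept-Encoding request header whether to serve the gzip-compressed version;
gzip_vary on;

# gzip compression level, 1-9; 1 is the lowest and 9 the highest; the higher the level, the better the compression ratio and the longer the compression time; 4-6 is recommended;
gzip_comp_level 6;

# number and size of the buffers used to hold the compressed result; 16 8k means 16 buffers of 8k each;
gzip_buffers 16 8k;

# minimum response size in bytes that is compressed, taken from the Content-Length response header. The default is 20. A value above 1k is recommended, since compressing responses smaller than 1k may make them larger;
# gzip_min_length 1k;

# 1.1 by default; the minimum HTTP version required to enable gzip;
gzip_http_version 1.1;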

In fact, you can also let front-end build tools such as webpack or rollup produce the gzip-compressed files when building the production bundle and then place them on the Nginx server, which reduces server overhead and speeds up access.

That concludes the practical applications of Nginx. Having mastered both its core configuration and these practical configurations, we should be able to handle most needs we encounter later. Next, let us learn a little more about the architecture of Nginx.

Nginx architecture

Process structure

Process model diagram of Nginx with multi-process structure:

The multi-process architecture of Nginx is shown in the figure below. There is one parent process (the master process), and it has many child processes (the worker processes).

The master process manages the worker processes and does not handle user requests itself.
If a worker process goes down, the master process is notified that it is unavailable and starts a new worker process.
When the configuration file is modified, the master process notifies the worker processes to pick up the new configuration, which is what we call hot deployment.
The worker processes communicate with each other through shared memory.

The principle of configuration file reloading

The steps of reloading the configuration file:

A HUP signal (the reload command) is sent to the master process;
The master process checks whether the configuration syntax is correct;
The master process opens the listening ports;
The master process starts new worker processes with the new configuration file;
The master process sends a QUIT signal to the old worker processes;
The old worker processes close their listening handles and exit after finishing their current connections;
Nginx keeps running throughout the whole process, so the upgrade is smooth and users notice nothing.
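The reload sequence can be modelled with a small Python simulation. It uses plain objects instead of real processes and signals, so it only illustrates the ordering of the steps; the Master and Worker classes and the is_valid callback are hypothetical names for this sketch.

```python
class Worker:
    """Stands in for a worker process that serves requests with one configuration."""

    def __init__(self, config):
        self.config = config
        self.accepting = True

    def quit_gracefully(self):
        # Stop accepting new connections; in-flight requests would finish first.
        self.accepting = False


class Master:
    """Stands in for the master process, which only manages workers."""

    def __init__(self, config, worker_count=2):
        self.worker_count = worker_count
        self.workers = [Worker(config) for _ in range(worker_count)]

    def reload(self, new_config, is_valid):
        """Handle a reload request; returns True if the new config took effect."""
        if not is_valid(new_config):
            return False  # keep serving with the old workers
        old_workers = self.workers
        # Start the new workers before asking the old ones to quit.
        self.workers = [Worker(new_config) for _ in range(self.worker_count)]
        for worker in old_workers:
            worker.quit_gracefully()
        return True


master = Master("v1")
old = master.workers
print(master.reload("broken", lambda cfg: cfg != "broken"))
print(master.reload("v2", lambda cfg: cfg != "broken"))
print([w.config for w in master.workers], [w.accepting for w in old])
```

An invalid configuration leaves the old workers untouched, and a valid one brings up the new workers before the old ones stop accepting connections, which is why the service is never interrupted.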

Nginx modular management mechanism

Nginx is built from a core and a series of functional modules. This division keeps the function of each module relatively simple, which makes modules easy to develop and the system easy to extend. Nginx modules are independent of each other, with low coupling and high cohesion.

民工哥
