
Foreword

Nginx is famous; who in the world doesn't know it? It needs no introduction: it is renowned for handling connections at the million level. 👍👍👍

Gunicorn is a WSGI server for Python. It is an important part of the Python web ecosystem, playing roughly the role that Tomcat plays in the Java world. 💪💪💪

Today we will take a quantitative look at the concrete RPS (requests per second) numbers these two can put up.

Of course, this is not really a contest of which one is stronger; rather, each is a representative heavyweight player in its own category.

Introduction to the Test Platform

Results will differ a great deal across hardware platforms, so the platforms used are listed here.

The machines are deliberately heavyweight, to avoid the hardware itself becoming the bottleneck.


Client:

  • Linux operating system
  • AMD 5700G, 8 cores / 16 threads 👊👊👊👊👊👊
  • 32GB of RAM


Server:

  • Linux operating system
  • AMD 4800U, 8 cores / 16 threads 👏👏👏👏👏
  • 4GB of RAM

Introduction to Stress Testing Tools

We use the stress-testing tool wrk. It has 31k stars on GitHub, which is enough to show its authority and professionalism.


Reference article:

Installing and using the wrk stress-testing tool on Ubuntu 20.04

Pure Nginx

Server preparation

We need to install both Nginx and Gunicorn on the server. Let's start with Nginx.

Step 1: Install Nginx

sudo apt install nginx

Step 2: Replace the default introduction page

location / {
    # set the content type
    default_type text/html ;
    # HTTP status code and response body
    return 200  "hello world! ";
}
The barrel effect: a system's upper limit is set by its shortest stave.
Why replace the default page? Because Nginx's default welcome page is bloated; if we kept it, the bottleneck would be the network rather than the server. Think about it: with an HTML page of roughly 300 KB, a gigabit NIC saturates at around 400 requests per second. So we replace it with "hello world!", which is at most 1 KB, lifting the network ceiling to roughly 125,000 KB/s ÷ 1 KB ≈ 125,000 requests per second. 125k is a lot better than 400.
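
A quick back-of-the-envelope check of those two ceilings (a sketch that ignores HTTP headers and TCP overhead):

# bandwidth ceiling of a gigabit link for a given page size (overhead ignored)
GIGABIT_BYTES_PER_SEC = 1_000_000_000 / 8  # 1 Gbit/s = 125 MB/s

for page_kb in (300, 1):  # default welcome page vs our "hello world!"
    ceiling = GIGABIT_BYTES_PER_SEC / (page_kb * 1000)
    print(f"{page_kb} KB page -> ceiling ≈ {ceiling:,.0f} req/s")
# 300 KB page -> ceiling ≈ 417 req/s
# 1 KB page -> ceiling ≈ 125,000 req/s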

Reference article:
[Nginx] output/return HelloWorld

Step 3: Restart Nginx

sudo service nginx restart
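
After the restart, it's worth sanity-checking that the new page is actually served; a minimal sketch, assuming the server address 192.168.31.95 used in the tests below:

# expect: 200 hello world!
import urllib.request

with urllib.request.urlopen("http://192.168.31.95/") as resp:
    print(resp.status, resp.read().decode())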

Step 4: Check the status of Nginx:

Nginx uses a master-worker multi-process model. By default (worker_processes auto; in nginx.conf) the number of workers matches the number of logical CPUs; the 4800U has 8 cores / 16 threads, so there are 16 worker processes:

─$ sudo service nginx status 
● nginx.service - A high performance web server and a reverse proxy server
     Loaded: loaded (/lib/systemd/system/nginx.service; disabled; vendor preset: disabled)
     Active: active (running) since Tue 2022-01-04 22:44:13 CST; 1s ago
       Docs: man:nginx(8)
    Process: 11215 ExecStartPre=/usr/sbin/nginx -t -q -g daemon on; master_process on; (code=exited, status=0>
    Process: 11216 ExecStart=/usr/sbin/nginx -g daemon on; master_process on; (code=exited, status=0/SUCCESS)
   Main PID: 11217 (nginx)
      Tasks: 17 (limit: 3391)
     Memory: 17.6M
        CPU: 95ms
     CGroup: /system.slice/nginx.service
             ├─11217 "nginx: master process /usr/sbin/nginx -g daemon on; master_process on;"
             ├─11218 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             ├─11219 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             ├─11220 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             ├─11221 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             ├─11222 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             ├─11223 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             ├─11224 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             ├─11226 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             ├─11227 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             ├─11228 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             ├─11229 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             ├─11230 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             ├─11231 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             ├─11232 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             ├─11233 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">
             └─11234 "nginx: worker process" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" "" ">

1月 04 22:44:13 kali systemd[1]: Starting A high performance web server and a reverse proxy server...
1月 04 22:44:13 kali systemd[1]: Started A high performance web server and a reverse proxy server.

Start Testing

Test 1

Here we go: the client attacks! 🏄🏼‍♂️🏄🏼‍♂️🏄🏼‍♂️

wrk http://192.168.31.95 -t 16 -c 64 -d 10
  • The parameter -t means threads, i.e. the number of threads
  • The parameter -c means connections, i.e. the number of connections
  • The parameter -d means duration, i.e. how many seconds to run
Stress testing means hammering the server with highly concurrent requests, like a flood 🤯🤯🤯

Test Results

─➤  ./wrk http://192.168.31.95  -t16 -c64  -d 10                                                                                 
Running 10s test @ http://192.168.31.95
  16 threads and 64 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     5.85ms    3.72ms  48.92ms   82.33%
    Req/Sec   708.88    348.74     1.48k    67.47%
  113734 requests in 10.09s, 17.35MB read
Requests/sec:  11271.96
Transfer/sec:      1.72MB

Requests/sec: 11271.96 means that wrk on the client completed about 11,000 requests per second on average, i.e. an RPS of roughly 11k.
At the same time, Transfer/sec: 1.72MB shows that only 1.72 MB of bandwidth is used per second, still far below what the network can carry!

Of course, I am not sure at which layer this 1.72 MB is counted (application, network, data link, or physical); most likely it is the application layer.
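
We can cross-check with the numbers from the output above (a sketch; it assumes wrk's "MB" means 1024 × 1024 bytes):

# average bytes read per request in Test 1
total_bytes = 17.35 * 1024 * 1024  # "17.35MB read"
requests = 113_734
print(f"≈ {total_bytes / requests:.0f} bytes per response")  # ≈ 160 bytes

That is about the size of the response headers plus the "hello world!" body, which supports the application-layer guess.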

Test 2

Let's adjust the wrk parameters and see how the results differ:

wrk http://192.168.31.95  -t32 -c160  -d 10

We increase the number of threads and connections for an even more violent flood of concurrent requests 🤯🤯🤯🤯


Test Results

─➤  ./wrk http://192.168.31.95  -t32 -c160  -d 10
Running 10s test @ http://192.168.31.95
  32 threads and 160 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     8.00ms    6.31ms  69.27ms   84.35%
    Req/Sec   701.97    395.33     1.29k    47.11%
  225052 requests in 10.07s, 34.33MB read
Requests/sec:  22349.79
Transfer/sec:      3.41MB

This time the RPS rose to about 22k.

Test 3

wrk http://192.168.31.95  -t64 -c320  -d 10

Double it!!! Keep increasing the number of threads and connections for an even more violent flood of concurrent requests 🤯🤯🤯🤯


Test Results

─➤  ./wrk http://192.168.31.95  -t64 -c320  -d 10
Running 10s test @ http://192.168.31.95
  64 threads and 320 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    14.08ms    9.75ms  84.81ms   87.53%
    Req/Sec   384.02    157.25     1.03k    66.21%
  246408 requests in 10.07s, 37.59MB read
Requests/sec:  24458.27
Transfer/sec:      3.73MB

Hmm, this time the RPS did not double, which suggests we have basically found Nginx's upper limit: about 25k RPS.

Summary of Nginx

We can see that in this single-machine-to-single-machine stress test, Nginx tops out at roughly 25k RPS.

In fact, the server's CPU usage was still quite low at this point.

Now let's see how Gunicorn performs.

Pure Gunicorn

Below is the test code, written in Python. To run it yourself, use Python 3.6 or later:

# fapi.py -- the gunicorn commands below load this module as fapi:app
from flask import Flask

app = Flask(__name__)


@app.route('/', methods=['GET'])
def home():
    # return a tiny JSON body, comparable in size to Nginx's "hello world!"
    success: bool = False
    return {
        'status': success
    }


@app.route('/upload/', methods=['POST'])
def hello():
    success: bool = False
    return {
        'status': success
    }
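
A quick local smoke test of the app before load testing; a minimal sketch using Flask's built-in test client:

# expect: 200 {'status': False}
from fapi import app

with app.test_client() as client:
    resp = client.get("/")
    print(resp.status_code, resp.get_json())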

Gunicorn has several operating modes (see the config sketch after this list):

  • Method 1: like Nginx, Gunicorn supports a master-worker architecture, i.e. multi-process 👍
  • Method 2: single-process multi-threading, i.e. pure multi-threading 👍👍
  • Method 3: multi-process and multi-threading combined, i.e. methods 1 and 2 together: a master-worker architecture in which each worker process runs multiple threads 👍👍👍
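
In gunicorn.conf.py terms, the three methods differ only in the workers and threads settings; a sketch (the exact values are tuned per test below):

# gunicorn.conf.py: pick a mode by setting one or both of these
bind = "0.0.0.0:63000"

workers = 32    # Method 1: pure multi-process
# threads = 32  # Method 2: pure multi-threading (leave workers at its default of 1)
# workers = 8   # Method 3: hybrid,
# threads = 8   #   8 processes, each running 8 threads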

Let's test them one by one 🤪

Pure multi-process mode

gunicorn.conf.py file configuration

import multiprocessing

bind = "0.0.0.0:63000"
workers = 32  # pure multi-process; a common heuristic is multiprocessing.cpu_count() * 2 + 1

Reference article:
Configuration File


Run Gunicorn

─$ gunicorn fapi:app -c gunicorn.conf.py
[2022-01-04 22:53:08 +0800] [11519] [INFO] Starting gunicorn 20.1.0
[2022-01-04 22:53:09 +0800] [11519] [INFO] Listening at: http://0.0.0.0:63000 (11519)
[2022-01-04 22:53:09 +0800] [11519] [INFO] Using worker: sync
[2022-01-04 22:53:09 +0800] [11520] [INFO] Booting worker with pid: 11520
[2022-01-04 22:53:09 +0800] [11521] [INFO] Booting worker with pid: 11521
[2022-01-04 22:53:09 +0800] [11522] [INFO] Booting worker with pid: 11522
[2022-01-04 22:53:09 +0800] [11523] [INFO] Booting worker with pid: 11523
[2022-01-04 22:53:09 +0800] [11524] [INFO] Booting worker with pid: 11524
[2022-01-04 22:53:09 +0800] [11525] [INFO] Booting worker with pid: 11525
[2022-01-04 22:53:09 +0800] [11526] [INFO] Booting worker with pid: 11526
[2022-01-04 22:53:09 +0800] [11527] [INFO] Booting worker with pid: 11527
[2022-01-04 22:53:09 +0800] [11528] [INFO] Booting worker with pid: 11528
[2022-01-04 22:53:09 +0800] [11529] [INFO] Booting worker with pid: 11529
[2022-01-04 22:53:09 +0800] [11530] [INFO] Booting worker with pid: 11530
[2022-01-04 22:53:09 +0800] [11531] [INFO] Booting worker with pid: 11531
[2022-01-04 22:53:09 +0800] [11532] [INFO] Booting worker with pid: 11532
[2022-01-04 22:53:09 +0800] [11533] [INFO] Booting worker with pid: 11533
[2022-01-04 22:53:09 +0800] [11534] [INFO] Booting worker with pid: 11534
[2022-01-04 22:53:09 +0800] [11535] [INFO] Booting worker with pid: 11535
[2022-01-04 22:53:09 +0800] [11536] [INFO] Booting worker with pid: 11536
[2022-01-04 22:53:09 +0800] [11537] [INFO] Booting worker with pid: 11537
[2022-01-04 22:53:10 +0800] [11538] [INFO] Booting worker with pid: 11538
[2022-01-04 22:53:10 +0800] [11539] [INFO] Booting worker with pid: 11539
[2022-01-04 22:53:10 +0800] [11540] [INFO] Booting worker with pid: 11540
[2022-01-04 22:53:10 +0800] [11541] [INFO] Booting worker with pid: 11541
[2022-01-04 22:53:10 +0800] [11542] [INFO] Booting worker with pid: 11542
[2022-01-04 22:53:10 +0800] [11543] [INFO] Booting worker with pid: 11543
[2022-01-04 22:53:10 +0800] [11544] [INFO] Booting worker with pid: 11544
[2022-01-04 22:53:10 +0800] [11545] [INFO] Booting worker with pid: 11545
[2022-01-04 22:53:10 +0800] [11546] [INFO] Booting worker with pid: 11546
[2022-01-04 22:53:10 +0800] [11547] [INFO] Booting worker with pid: 11547
[2022-01-04 22:53:10 +0800] [11548] [INFO] Booting worker with pid: 11548
[2022-01-04 22:53:10 +0800] [11549] [INFO] Booting worker with pid: 11549
[2022-01-04 22:53:10 +0800] [11550] [INFO] Booting worker with pid: 11550
[2022-01-04 22:53:10 +0800] [11551] [INFO] Booting worker with pid: 11551

Do a stress test

wrk http://192.168.31.95:63000  -t64 -c320  -d 10

This time we will not ramp up the wrk parameters step by step; we go straight to the same parameters used in the final Nginx test.

Test Results

─➤  ./wrk http://192.168.31.95:63000  -t64 -c320  -d 10                                                                                             1 ↵
Running 10s test @ http://192.168.31.95:63000
  64 threads and 320 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    76.09ms   37.75ms 664.24ms   71.68%
    Req/Sec    33.70     17.98   191.00     75.18%
  21342 requests in 10.09s, 3.30MB read
Requests/sec:   2115.13
Transfer/sec:    334.65KB

As you can see, the RPS is a bit embarrassing: only about 2,100. Notice that the startup log said Using worker: sync, meaning each of the 32 workers handles one request at a time, so with 320 concurrent connections plenty of queuing is inevitable.
🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡🤡

Pure multi-threaded mode

gunicorn.conf.py file configuration

import multiprocessing

bind = "0.0.0.0:63000"
threads = 32  # one process with 32 threads; this switches Gunicorn to the gthread worker

Run Gunicorn

─$ gunicorn fapi:app -c gunicorn.conf.py
[2022-01-04 22:57:38 +0800] [11607] [INFO] Starting gunicorn 20.1.0
[2022-01-04 22:57:38 +0800] [11607] [INFO] Listening at: http://0.0.0.0:63000 (11607)
[2022-01-04 22:57:38 +0800] [11607] [INFO] Using worker: gthread
[2022-01-04 22:57:38 +0800] [11608] [INFO] Booting worker with pid: 11608

Do a stress test

wrk http://192.168.31.95:63000  -t64 -c320  -d 10

Test Results

─➤  ./wrk http://192.168.31.95:63000  -t64 -c320  -d 10
Running 10s test @ http://192.168.31.95:63000
  64 threads and 320 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   122.28ms  109.01ms   2.00s    98.58%
    Req/Sec    42.33     22.31   303.00     75.49%
  8888 requests in 10.08s, 1.42MB read
  Socket errors: connect 0, read 0, write 0, timeout 73
Requests/sec:    881.68
Transfer/sec:    143.80KB

Even more extreme!!! Only about 900 RPS, plus 73 request timeouts! 💩💩💩💩

Multi-threaded and multi-process hybrid mode

From the tests above, we can see that pure multi-threading and pure multi-processing both put up dismal numbers; even a bare hello world is this much of a struggle.

Now let's look at the performance of the hybrid mode!

I have prepared several groups of tests, each with a different number of processes and a different number of threads per process. Charge, charge, charge!

Test one:

gunicorn.conf.py file configuration

import multiprocessing

bind = "0.0.0.0:63000"

workers = 8
threads = 8

The first round's contestant: 8 processes, with 8 threads per process.

Run Gunicorn

─$ gunicorn fapi:app -c gunicorn.conf.py                                                               130 ⨯
[2022-01-04 22:59:45 +0800] [11668] [INFO] Starting gunicorn 20.1.0
[2022-01-04 22:59:45 +0800] [11668] [INFO] Listening at: http://0.0.0.0:63000 (11668)
[2022-01-04 22:59:45 +0800] [11668] [INFO] Using worker: gthread
[2022-01-04 22:59:45 +0800] [11669] [INFO] Booting worker with pid: 11669
[2022-01-04 22:59:45 +0800] [11670] [INFO] Booting worker with pid: 11670
[2022-01-04 22:59:45 +0800] [11671] [INFO] Booting worker with pid: 11671
[2022-01-04 22:59:45 +0800] [11672] [INFO] Booting worker with pid: 11672
[2022-01-04 22:59:46 +0800] [11673] [INFO] Booting worker with pid: 11673
[2022-01-04 22:59:46 +0800] [11674] [INFO] Booting worker with pid: 11674
[2022-01-04 22:59:46 +0800] [11675] [INFO] Booting worker with pid: 11675
[2022-01-04 22:59:46 +0800] [11676] [INFO] Booting worker with pid: 11676

Do a stress test

wrk http://192.168.31.95:63000  -t64 -c320  -d 10

Test Results

─➤  ./wrk http://192.168.31.95:63000  -t64 -c320  -d 10
Running 10s test @ http://192.168.31.95:63000
  64 threads and 320 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    74.25ms   24.12ms 304.79ms   72.75%
    Req/Sec    67.53     17.83   171.00     77.02%
  43368 requests in 10.10s, 6.91MB read
Requests/sec:   4293.70
Transfer/sec:    700.34KB

As you can see, the RPS is about 4,300. Compared with the previous two modes, that is real progress! 👏
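
As a rough cross-check, Little's law (throughput = concurrency / latency) lines up with this run, assuming wrk keeps all 320 connections busy:

# Little's law sanity check for this run
connections = 320
avg_latency_s = 0.07425  # 74.25 ms from the wrk output
print(connections / avg_latency_s)  # ≈ 4310 req/s, close to the measured 4293.70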

Test two:

gunicorn.conf.py file configuration

import multiprocessing

bind = "0.0.0.0:63000"

workers = 16
threads = 16

The second round's contestant: 16 processes, with 16 threads per process.

Do a stress test

wrk http://192.168.31.95:63000  -t64 -c320  -d 10

Test Results

─➤  ./wrk http://192.168.31.95:63000  -t64 -c320  -d 10
Running 10s test @ http://192.168.31.95:63000
  64 threads and 320 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    63.56ms   32.04ms 242.04ms   80.12%
    Req/Sec    82.27     29.26   180.00     69.46%
  51927 requests in 10.08s, 8.27MB read
Requests/sec:   5150.42
Transfer/sec:    840.00KB

As you can see, the RPS is about 5,100! 👏👏

Test three:

gunicorn.conf.py file configuration

import multiprocessing

bind = "0.0.0.0:63000"

workers = 8
threads = 32

The third round's contestant: 8 processes, with 32 threads per process.

Do a stress test

wrk http://192.168.31.95:63000  -t64 -c320  -d 10

Test Results

─➤  ./wrk http://192.168.31.95:63000  -t64 -c320  -d 10
Running 10s test @ http://192.168.31.95:63000
  64 threads and 320 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    80.44ms   31.75ms 515.60ms   90.45%
    Req/Sec    63.91     16.65   232.00     87.49%
  40686 requests in 10.07s, 6.48MB read
Requests/sec:   4039.57
Transfer/sec:    658.99KB

As you can see, the RPS is about 4,000! 👏👏

There is a slight drop here. I did not average multiple runs, so this kind of variance is normal.

Test four:

gunicorn.conf.py file configuration

import multiprocessing

bind = "0.0.0.0:63000"

workers = 16
threads = 32

The fourth round's contestant: 16 processes, with 32 threads per process.

Do a stress test

wrk http://192.168.31.95:63000  -t64 -c320  -d 10

Test Results

─➤  ./wrk http://192.168.31.95:63000  -t64 -c320  -d 10
Running 10s test @ http://192.168.31.95:63000
  64 threads and 320 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    67.43ms   35.35ms 529.30ms   82.06%
    Req/Sec    76.93     27.15   160.00     74.54%
  49099 requests in 10.10s, 7.82MB read
Requests/sec:   4861.03
Transfer/sec:    792.80KB

As you can see, the RPS is about 4,900! 👏👏👏

Gunicorn Summary

As you can see, in the business of serving HTTP, Gunicorn still falls well short of the professional HTTP server Nginx.

The final score: roughly 25k to 5k.

Why is Nginx so much better than Gunicorn?

  • Nginx is written in C, which is bound to be faster than the pure-Python Gunicorn; that is a language advantage. (Next time we can bring uwsgi in for comparison.)
  • Nginx is a product optimized to the extreme; after all, Nginx plays the high-level game of CPU affinity!
  • Nginx uses IO multiplexing, which is far more efficient than Gunicorn's multi-process and multi-threading model; see the sketch below. (Next time we can bring uvicorn in for comparison.)
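
For intuition only, here is a toy sketch of the IO-multiplexing idea using Python's selectors module (which sits on top of epoll on Linux). This is not how Nginx is implemented; it just shows one thread serving many connections through an event loop:

# toy event-loop server: a single thread multiplexes all connections
import selectors
import socket

RESPONSE = (b"HTTP/1.1 200 OK\r\nContent-Length: 13\r\n"
            b"Connection: close\r\n\r\nhello world! ")

sel = selectors.DefaultSelector()  # epoll-based on Linux

def accept(server):
    conn, _ = server.accept()
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ, handle)

def handle(conn):
    if conn.recv(4096):        # the request itself is ignored in this toy
        conn.send(RESPONSE)
    sel.unregister(conn)
    conn.close()

server = socket.socket()
server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server.bind(("0.0.0.0", 8080))  # port 8080 is arbitrary here
server.listen(128)
server.setblocking(False)
sel.register(server, selectors.EVENT_READ, accept)

while True:  # the event loop: wait for ready sockets, then run their callbacks
    for key, _ in sel.select():
        key.data(key.fileobj)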

Nginx and Gunicorn hybrid mode

todo

