Coroutine Library Performance Test (Benchbox)

Tested with https://github.com/tboox/benc...
Machine (lscpu):
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 4
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz
Stepping: 1
CPU MHz: 2397.224
BogoMIPS: 4794.44
Hypervisor vendor: VMware
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 25600K
NUMA node0 CPU(s): 0-3
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts nopl xtopology tsc_reliable nonstop_tsc cpuid aperfmperf pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 invpcid rtm rdseed adx smap xsaveopt dtherm ida arat pln pts

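libaco: shared stack, default 2M
libmill: shared stack (inferred from measured behavior), default 256K
libgo: virtual-memory stack, default 1M?

Coroutine switch performance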
➜ benchbox git:(master) ✗ xmake coroutine -n switch 1000
switch[1000]: tbox: 10000000 switches in 355 ms, 28169014 switches per second
switch[1000]: libaco: 10000000 switches in 403 ms, 24813895 switches per second
switch[1000]: coroutine(cloudwu): 10000000 switches in 10088 ms, 991276 switches per second
switch[1000]: libgo(boost): 10000000 switches in 586 ms, 17064846 switches per second
switch[1000]: libmill: 10000000 switches in 983 ms, 10172939 switches per second
switch[1000]: libco: 10000000 switches in 1256 ms, 7961783 switches per second
switch[1000]: libtask: 10000000 switches in 12189 ms, 820411 switches per second
switch[1000]: libfiber(acl): 10000000 switches in 645 ms, 15503875 switches per second
➜ benchbox git:(master) ✗ xmake coroutine -n switch
switch[2]: tbox: 10000000 switches in 211 ms, 47393364 switches per second
switch[2]: libaco: 10000000 switches in 430 ms, 23255813 switches per second
switch[2]: coroutine(cloudwu): 10000000 switches in 9886 ms, 1011531 switches per second
switch[2]: libgo(boost): 10000000 switches in 300 ms, 33333333 switches per second
switch[2]: libmill: 10000000 switches in 445 ms, 22471910 switches per second
switch[2]: libco: 10000000 switches in 616 ms, 16233766 switches per second
switch[2]: libtask: 10000000 switches in 9806 ms, 1019783 switches per second
switch[2]: libfiber(acl): 10000000 switches in 262 ms, 38167938 switches per second
➜ benchbox git:(master) ✗ xmake coroutine -n switch 10000
switch[10000]: tbox: 10000000 switches in 637 ms, 15698587 switches per second
switch[10000]: libaco: 10000000 switches in 445 ms, 22471910 switches per second
switch[10000]: coroutine(cloudwu): 10000000 switches in 11776 ms, 849184 switches per second
switch[10000]: libgo(boost): 10000000 switches in 1414 ms, 7072135 switches per second
switch[10000]: libmill: 10000000 switches in 899 ms, 11123470 switches per second
switch[10000]: libco: 10000000 switches in 4009 ms, 2494387 switches per second
switch[10000]: libtask: 10000000 switches in 16773 ms, 596196 switches per second
switch[10000]: libfiber(acl): 10000000 switches in 1334 ms, 7496251 switches per second
➜ benchbox git:(master) ✗ xmake coroutine -n switch 20000
switch[20000]: tbox: 10000000 switches in 753 ms, 13280212 switches per second
switch[20000]: libaco: 10000000 switches in 426 ms, 23474178 switches per second
switch[20000]: coroutine(cloudwu): 10000000 switches in 11823 ms, 845809 switches per second
switch[20000]: libgo(boost): 10000000 switches in 2116 ms, 4725897 switches per second
switch[20000]: libmill: 10000000 switches in 886 ms, 11286681 switches per second
switch[20000]: libco: 10000000 switches in 4359 ms, 2294104 switches per second
switch[20000]: libtask: 10000000 switches in 15731 ms, 635687 switches per second
switch[20000]: libfiber(acl): 10000000 switches in 2026 ms, 4935834 switches per second
➜ benchbox git:(master) ✗ xmake coroutine -n switch 50000
switch[50000]: tbox: 10000000 switches in 1721 ms, 5810575 switches per second
switch[50000]: libaco: 10000000 switches in 509 ms, 19646365 switches per second
switch[50000]: coroutine(cloudwu): 10000000 switches in 11627 ms, 860067 switches per second
switch[50000]: libgo(boost): 10000000 switches in 2625 ms, 3809523 switches per second
switch[50000]: libmill: 10000000 switches in 896 ms, 11160714 switches per second
switch[50000]: libco: 10000000 switches in 4363 ms, 2292000 switches per second
switch[50000]: libtask: 10000000 switches in 16106 ms, 620886 switches per second
switch[50000]: libfiber(acl): 10000000 switches in 2463 ms, 4060089 switches per second
➜ benchbox git:(master) ✗ xmake coroutine -n switch 100000
switch[100000]: tbox: 10000000 switches in 1854 ms, 5393743 switches per second
switch[100000]: libaco: 10000000 switches in 549 ms, 18214936 switches per second
switch[100000]: coroutine(cloudwu): 10000000 switches in 11622 ms, 860437 switches per second
switch[100000]: libgo(boost): 10000000 switches in 3102 ms, 3223726 switches per second
switch[100000]: libmill: 10000000 switches in 884 ms, 11312217 switches per second
switch[100000]: libco: 10000000 switches in 4322 ms, 2313743 switches per second
switch[100000]: libtask: 10000000 switches in 15200 ms, 657894 switches per second
switch[100000]: libfiber(acl): 10000000 switches in 2560 ms, 3906250 switches per second
➜ benchbox git:(master) ✗ xmake coroutine -n switch 1000000
switch[1000000]: tbox: 10000000 switches in 1909 ms, 5238344 switches per second
switch[1000000]: libaco: 10000000 switches in 501 ms, 19960079 switches per second
switch[1000000]: libgo(boost): 10000000 switches in 7854 ms, 1273236 switches per second
switch[1000000]: libmill: 10000000 switches in 622 ms, 16077170 switches per second
switch[1000000]: libfiber(acl): 10000000 switches in 3977 ms, 2514458 switches per second

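Consider a high-concurrency scenario within a single thread. With one coroutine per connection there are typically 10000-50000 coroutines, and switch performance ranks as follows:

libaco>libmill>tbox>libfiber(acl)>libgo>libco>coroutine(cloudwu)>libtask

(This is the order measured at 50000 coroutines; at 10000 and 20000, tbox is still ahead of libmill.)

With one coroutine per request, more than 100000 coroutines generally have to be supported; the switch performance ranking is the same as above.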
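
The ranking can be rechecked from the raw output with a small script. The following is only a sketch: the regular expression and the helper names `parse` and `ranking` are assumptions made for illustration and are not part of benchbox. It parses a few lines copied from the switch[50000] run, recomputes switches per second from the count and the elapsed milliseconds, and sorts the libraries by elapsed time.

```python
import re

# A few lines copied verbatim from the switch[50000] run.
SAMPLE = """\
switch[50000]: tbox: 10000000 switches in 1721 ms, 5810575 switches per second
switch[50000]: libaco: 10000000 switches in 509 ms, 19646365 switches per second
switch[50000]: libmill: 10000000 switches in 896 ms, 11160714 switches per second
switch[50000]: libgo(boost): 10000000 switches in 2625 ms, 3809523 switches per second
"""

# Hypothetical pattern for one result line; inferred from the output format only.
LINE = re.compile(
    r"^(?P<bench>\w+)\[(?P<param>\d+)\]: (?P<lib>.+?): "
    r"(?P<count>\d+) \w+ in (?P<ms>\d+) ms, (?P<rate>\d+) \w+ per second$"
)


def parse(text):
    """Return one dict per result line; shell prompts and blank lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m is None:
            continue
        count, ms = int(m["count"]), int(m["ms"])
        rows.append({
            "lib": m["lib"],
            "param": int(m["param"]),
            "ms": ms,
            "reported": int(m["rate"]),
            # operations per second = operations / (ms / 1000), truncated
            "computed": count * 1000 // ms,
        })
    return rows


def ranking(rows):
    """Fastest first: the same number of operations in less time."""
    return [r["lib"] for r in sorted(rows, key=lambda r: r["ms"])]


rows = parse(SAMPLE)
order = ranking(rows)
print(" > ".join(order))
```

For these four lines the recomputed rate matches the reported one when the division is truncated, which suggests the tool rounds down.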

Channel performance

➜ benchbox git:(master) ✗ xmake coroutine -n channel
channel[0]: tbox: 10000000 passes in 1053 ms, 9496676 passes per second
channel[0]: libgo(boost): 10000000 passes in 2926 ms, 3417634 passes per second
channel[0]: libmill: 10000000 passes in 2872 ms, 3481894 passes per second
channel[0]: libtask: 10000000 passes in 11472 ms, 871687 passes per second
channel[0]: libfiber(acl): 10000000 passes in 1097 ms, 9115770 passes per second
➜ benchbox git:(master) ✗ xmake coroutine -n channel 1000
channel[1000]: tbox: 10000000 passes in 356 ms, 28089887 passes per second
channel[1000]: libgo(boost): 10000000 passes in 820 ms, 12195121 passes per second
channel[1000]: libmill: 10000000 passes in 2912 ms, 3434065 passes per second
channel[1000]: libtask: 10000000 passes in 1745 ms, 5730659 passes per second
channel[1000]: libfiber(acl): 10000000 passes in 740 ms, 13513513 passes per second
➜ benchbox git:(master) ✗ xmake coroutine -n channel 10000
channel[10000]: tbox: 10000000 passes in 321 ms, 31152647 passes per second
channel[10000]: libgo(boost): 10000000 passes in 764 ms, 13089005 passes per second
channel[10000]: libmill: 10000000 passes in 3216 ms, 3109452 passes per second
channel[10000]: libtask: 10000000 passes in 1834 ms, 5452562 passes per second
channel[10000]: libfiber(acl): 10000000 passes in 760 ms, 13157894 passes per second
➜ benchbox git:(master) ✗ xmake coroutine -n channel 50000
channel[50000]: tbox: 10000000 passes in 306 ms, 32679738 passes per second
channel[50000]: libgo(boost): 10000000 passes in 829 ms, 12062726 passes per second
channel[50000]: libmill: 10000000 passes in 2652 ms, 3770739 passes per second
channel[50000]: libtask: 10000000 passes in 1854 ms, 5393743 passes per second
channel[50000]: libfiber(acl): 10000000 passes in 795 ms, 12578616 passes per second
➜ benchbox git:(master) ✗ xmake coroutine -n channel 100000
channel[100000]: tbox: 10000000 passes in 326 ms, 30674846 passes per second
channel[100000]: libgo(boost): 10000000 passes in 822 ms, 12165450 passes per second
channel[100000]: libmill: 10000000 passes in 2723 ms, 3672420 passes per second
channel[100000]: libtask: 10000000 passes in 1874 ms, 5336179 passes per second
channel[100000]: libfiber(acl): 10000000 passes in 1025 ms, 9756097 passes per second
➜ benchbox git:(master) ✗ xmake coroutine -n channel 1000000
channel[1000000]: tbox: 10000000 passes in 341 ms, 29325513 passes per second
channel[1000000]: libgo(boost): 10000000 passes in 1144 ms, 8741258 passes per second
channel[1000000]: libmill: 10000000 passes in 2778 ms, 3599712 passes per second
channel[1000000]: libtask: 10000000 passes in 1982 ms, 5045408 passes per second
channel[1000000]: libfiber(acl): 10000000 passes in 910 ms, 10989010 passes per second

Channel performance (one channel per connection): tbox>libfiber(acl)~libgo>libtask>libmill
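
Throughput figures can be easier to compare as a cost per operation. The sketch below converts the elapsed times of the channel[0] and channel[1000] runs into nanoseconds per pass. The helper name `ns_per_op` is made up for this example, and the elapsed times are copied from the output above.

```python
OPS = 10_000_000  # every run above performs this many passes


def ns_per_op(total_ops, elapsed_ms):
    """Average cost of one operation in nanoseconds."""
    return elapsed_ms * 1_000_000 / total_ops


# Elapsed milliseconds copied from the channel[0] and channel[1000] runs.
channel_0 = {
    "tbox": 1053,
    "libgo(boost)": 2926,
    "libmill": 2872,
    "libtask": 11472,
    "libfiber(acl)": 1097,
}
channel_1000 = {
    "tbox": 356,
    "libgo(boost)": 820,
    "libmill": 2912,
    "libtask": 1745,
    "libfiber(acl)": 740,
}

for lib in channel_0:
    before = ns_per_op(OPS, channel_0[lib])
    after = ns_per_op(OPS, channel_1000[lib])
    print(f"{lib:14s} {before:7.1f} ns/pass -> {after:7.1f} ns/pass")
```

Seen this way, libmill is the only library in these two runs whose per-pass cost does not drop when the parameter goes from 0 to 1000.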

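A shared stack requires you to avoid relying on local variables (for example, handing their addresses to other coroutines), so it carries a heavier mental burden than a virtual-memory stack. In summary, the best-performing C implementation is tbox, although libfiber(acl) has more complete hook support; for C++ the choice is libgo.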
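
Why locals are risky on a shared stack can be shown with a toy model. This is a simulation written for this note, not the implementation of libaco, libmill, or any other library: the class `SharedStack`, the list standing in for stack memory, and the integer "address" are all assumptions. The model copies the outgoing coroutine's stack contents to a private save area on every switch and copies the incoming coroutine's contents back into the same region.

```python
class SharedStack:
    """Toy model of a copy-on-switch shared stack (illustration only)."""

    def __init__(self, size):
        self.memory = [0] * size  # the single region every coroutine runs on
        self.saved = {}           # per-coroutine private copies of that region
        self.current = None

    def switch_to(self, name):
        # Copy the outgoing coroutine's stack contents to its save area...
        if self.current is not None:
            self.saved[self.current] = list(self.memory)
        # ...then copy the incoming coroutine's saved contents back in.
        self.memory = list(self.saved.get(name, [0] * len(self.memory)))
        self.current = name


stack = SharedStack(size=4)

stack.switch_to("A")
stack.memory[0] = 111       # A's local variable lives at "address" 0
address_of_a_local = 0      # A hands this address to B, e.g. through a channel

stack.switch_to("B")
stack.memory[0] = 222       # B's own local happens to occupy the same address
seen_by_b = stack.memory[address_of_a_local]

stack.switch_to("A")
seen_by_a = stack.memory[0]

print(seen_by_b)  # what B reads through A's "pointer"
print(seen_by_a)  # what A reads once it is running again
```

In this model B reads its own data through the address it received from A, while A still sees its original value after being switched back in. With a separate stack per coroutine the two locals would live at different addresses and the problem would not arise.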

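tbox coroutines use an optimized boost.context; acl coroutines are a secondary development on top of libtask, which Russ Cox (author of Go's coroutines) implemented in 2005; libgo uses boost.context directly.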
