Why tune
If introducing a new technology takes interest and enthusiasm, then taking it to production takes persistence and perseverance. That is true of Cloud Native, and it is true of Istio.
In the performance tests before going live, Istio brought observability and operational convenience, but it also brought pain: increased service response latency. Minimizing that pain became the urgent task at hand.
The numbers: SERVICE-A, which used to respond in 9 ms, now takes 14 ms. SERVICE-A depends on SERVICE-B.
Road to analysis
There were two paths in front of me:
- Directly tweak the configurations that look suspicious and disable some features, then load test and see what happens.
- CPU-profile the sidecar to locate the hot spots, then tune with some evidence in hand.
I chose the second.
Sidecar CPU Profile (taking an X-ray)
As a relatively mature open-source product, Istio has an official benchmark project:
https://github.com/istio/tools/tree/release-1.8/perf/benchmark
I followed: https://github.com/istio/tools/tree/release-1.8/perf/benchmark/flame#setup-perf-tool-envoy .
Install perf
Running the Linux perf tool inside the container to profile the sidecar involves a few hurdles. For example, the istio-proxy container's file system is read-only by default, so I modified it to be writable, and you need to enter the container as root. If that feels troublesome, you can also build a custom image based on the original one. The details are not the focus of this article, so I will skip them. After that, perf can be installed with a package manager (such as apt).
Here is an example of the istio-proxy container configuration:
spec:
  containers:
  - name: istio-proxy
    image: xyz
    securityContext:
      allowPrivilegeEscalation: true
      capabilities:
        add:
        - ALL
      privileged: true
      readOnlyRootFilesystem: false
      runAsGroup: 1337
      runAsNonRoot: false
      runAsUser: 1337
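With the file system writable and root access in hand, installing perf might look roughly like this (a sketch only; the exact package name depends on the base image of your istio-proxy container):

# Inside the istio-proxy container, as root; package name is an assumption and varies by distro
apt-get update
apt-get install -y linux-perf || apt-get install -y "linux-tools-$(uname -r)" linux-tools-generic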
Run the profile and generate a Flame Graph
Enter the istio-proxy container as root (yes, root saves some hassle):
perf record -g -F 19 -p `pgrep envoy` -o perf.data -- sleep 120
perf script --header -i perf.data > perf.stacks
Copy perf.stacks to the development machine and generate the Flame Graph there. Yes, a Perl script is needed: https://github.com/brendangregg/FlameGraph (proudly produced by my idol Brendan Gregg):
export FlameGraph=/xyz/FlameGraph
$FlameGraph/stackcollapse-perf.pl < perf.stacks | $FlameGraph/flamegraph.pl --hash > perf.svg
Finally, perf.svg is generated:
The figure above covers only one Envoy worker thread; there is another thread that looks much the same. So proxy_wasm::ContextBase::onLog accounts for roughly 14% of the CPU of the whole process. As the figure suggests, this is probably an Envoy extension filter. The questions are: which filter is it, and why can't some of the stack frames be resolved (the perf-18.map entries in the figure)?
Envoy Filter: the utopia of Wasm
What I knew: Wasm is a VM engine (analogous to the JVM). Envoy supports extensions implemented natively as well as extensions implemented in Wasm, and of course a VM engine carries some performance loss compared with native code.
Fortunately, some searching led me to this document:
https://istio.io/v1.8/docs/ops/deployment/performance-and-scalability/
One of its figures, together with a short passage, gave me a hint:
- baseline: Client pod directly calls the server pod, no sidecars are present.
- none_both: Istio proxy with no Istio specific filters configured.
- v2-stats-wasm_both: Client and server sidecars are present with telemetry v2 v8 configured.
- v2-stats-nullvm_both: Client and server sidecars are present with telemetry v2 nullvm configured by default.
- v2-sd-full-nullvm_both: Export Stackdriver metrics, access logs and edges with telemetry v2 nullvm configured.
- v2-sd-nologging-nullvm_both: Same as above, but does not export access logs.
Hmm (to borrow the Cantonese that is popular these days), why does a single performance test need so many lines? Translated into plain terms:
- baseline: no sidecars at all
- none_both: sidecars, but without Istio's filters
- v2-stats-wasm_both: the filter implemented in Wasm
- v2-stats-nullvm_both: the filter implemented natively
What are these lines trying to say? Foreigners can be rather reserved. Put bluntly: we want to promote Wasm, so it is the default; if you mind the extra 1 ms of latency and a bit more CPU, please switch back to the native implementation. Well, I admit it: I mind.
Note: I later discovered that the official Istio 1.8 release uses the native filter by default. Our environment is an internal customized build that defaults to the Wasm filter (or a utopia that puts security, isolation, and portability above performance). So for you, the native filter may well already be the default configuration.
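To see which runtime your sidecar's telemetry filters actually use, one quick, rough check is to grep Envoy's config dump (a sketch; it assumes the Envoy admin API is on its default port 15000 and that curl is available in the istio-proxy image; the pod name is hypothetical):

# Count the wasm runtimes referenced in the sidecar's configuration
kubectl exec my-service-pod -c istio-proxy -- curl -s localhost:15000/config_dump \
  | grep -o 'envoy\.wasm\.runtime\.[a-z0-9]*' | sort | uniq -c
# envoy.wasm.runtime.null -> filters compiled natively into Envoy (NullVm)
# envoy.wasm.runtime.v8   -> filters running as Wasm inside the v8 VM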
Exhausted worker threads and idling cores
Below is thread-level top output for the envoy process. Yes, as pthreads remind us, thread naming is not a patent of the Java world; the COMMAND column shows the thread name.
top -p `pgrep envoy` -H -b
top - 01:13:52 up 42 days, 14:01, 0 users, load average: 17.79, 14.09, 10.73
Threads: 28 total, 2 running, 26 sleeping, 0 stopped, 0 zombie
%Cpu(s): 42.0 us, 7.3 sy, 0.0 ni, 46.9 id, 0.0 wa, 0.0 hi, 3.7 si, 0.1 st
MiB Mem : 94629.32+total, 67159.44+free, 13834.21+used, 13635.66+buff/cache
MiB Swap: 0.000 total, 0.000 free, 0.000 used. 80094.03+avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
42 istio-p+ 20 0 0.274t 221108 43012 R 60.47 0.228 174:48.28 wrk:worker_1
41 istio-p+ 20 0 0.274t 221108 43012 R 55.81 0.228 149:33.37 wrk:worker_0
18 istio-p+ 20 0 0.274t 221108 43012 S 0.332 0.228 2:22.48 envoy
At the same time, increasing the client's concurrency did not push the CPU usage of these two worker threads up to 100%. The folk wisdom that a hyper-threaded core cannot deliver core * 2 performance shows up right here. What to do? Try adding workers.
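Before adding workers, it is worth confirming how many the sidecar currently runs with. pilot-agent starts Envoy with a --concurrency flag, so the current value can be read off the process command line (a sketch; the pod name is hypothetical):

# -a prints the full command line; look for "--concurrency 2"
kubectl exec my-service-pod -c istio-proxy -- pgrep -a envoy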
One word: tune
Istio provides the EnvoyFilter resource, so I played it like this:
kubectl apply -f - <<"EOF"
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  ...
  name: stats-filter-1.8
spec:
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: SIDECAR_OUTBOUND
      listener:
        filterChain:
          filter:
            name: envoy.http_connection_manager
            subFilter:
              name: envoy.router
      proxy:
        proxyVersion: ^1\.8.*
    patch:
      operation: INSERT_BEFORE
      value:
        name: istio.stats
        typed_config:
          '@type': type.googleapis.com/udpa.type.v1.TypedStruct
          type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
          value:
            config:
              configuration:
                '@type': type.googleapis.com/google.protobuf.StringValue
                value: |
                  {
                  }
              root_id: stats_outbound
              vm_config:
                allow_precompiled: true
                code:
                  local:
                    inline_string: envoy.wasm.stats
                runtime: envoy.wasm.runtime.null
                vm_id: stats_outbound
  ...
EOF
kubectl apply -f - <<"EOF"
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: metadata-exchange-1.8
spec:
  configPatches:
  - applyTo: HTTP_FILTER
    match:
      context: SIDECAR_INBOUND
      listener:
        filterChain:
          filter:
            name: envoy.http_connection_manager
      proxy:
        proxyVersion: ^1\.8.*
    patch:
      operation: INSERT_BEFORE
      value:
        name: istio.metadata_exchange
        typed_config:
          '@type': type.googleapis.com/udpa.type.v1.TypedStruct
          type_url: type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
          value:
            config:
              configuration:
                '@type': type.googleapis.com/google.protobuf.StringValue
                value: |
                  {}
              vm_config:
                allow_precompiled: true
                code:
                  local:
                    inline_string: envoy.wasm.metadata_exchange
                runtime: envoy.wasm.runtime.null
  ...
EOF
Note: I later discovered that the official Istio 1.8 release already uses the native filter, i.e. envoy.wasm.runtime.null. Our environment is an internal customized build that defaults to the Wasm filter (or a utopia that puts security, isolation, and portability above performance). So the optimization above may already be your default configuration, in which case you can simply ignore it...
The following changes the number of Envoy worker threads:
kubectl edit deployments.apps my-service-deployment

spec:
  template:
    metadata:
      annotations:
        proxy.istio.io/config: 'concurrency: 4'
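After the deployment rolls out and the sidecars restart, the effect can be confirmed with the same thread-level top used earlier (run inside the istio-proxy container):

top -p `pgrep envoy` -H -b -n 1
# Expect four worker threads now: wrk:worker_0 .. wrk:worker_3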
Sidecar CPU Profile (taking another X-ray)
Because the native Envoy filter is now used instead of the Wasm filter, the missing stack frames seen earlier are gone from the figure above. Measured CPU usage dropped by about 8%, and latency came down by 1 ms.
Summary
Rather than condemning the pitfall-laden customized build for its default Wasm Envoy filter and thread configuration, it is better to ask why it cost you several days to locate the problem. When we excitedly board the ship of a new technology, besides remembering to bring a life ring, we must not forget: you are the captain. Beyond knowing how to steer, you should understand how the ship works and how to maintain it, so that you can handle emergencies, rather than placing blind trust in it.
Original: https://blog.mygraphql.com/zh/posts/cloud/istio/istio-tunning/istio-filter-tunning-thread/