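A while ago I told a reader that "for Envoy to be genuinely good as a layer-7 gateway, only a few key technical problems remain unsolved": https://x.com/spacewander_lzx/status/1793292249155162207. So I decided to start a series, updated irregularly, about the places where Envoy still falls short.
Today, let's start with the problem that route configuration changes in Envoy are too coarse-grained.
Two kinds of configuration in Envoy are mainly involved in routing: LDS and RDS. LDS (listener) controls layer-4 configuration, such as which port to listen on and the TLS parameters; RDS (route) controls routing at the HTTP layer. LDS is the root of RDS: each LDS is the parent node of one or more RDS resources.
Take the following Gateway resource from the Kubernetes Gateway API as an example:
(Parts of the configuration below that are irrelevant to the example are replaced with ellipses. Control planes differ in the details of how they translate Gateway API resources into Envoy xDS; Istio is used here.)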
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: gateway
  namespace: default
spec:
  gatewayClassName: istio
  listeners:
  - name: http
    hostname: "*.exp.com"
    port: 80
    protocol: HTTP
  - name: http2
    hostname: "*.test.com"
    port: 80
    protocol: HTTP
  - name: https
    hostname: "*.exp.com"
    port: 443
    protocol: HTTPS
    tls:
      certificateRefs:
      - group: ""
        kind: Secret
        name: cert
      mode: Terminate
  - name: https2
    hostname: "*.httpbin.com"
    port: 443
    protocol: HTTPS
    tls:
      certificateRefs:
      - group: ""
        kind: Secret
        name: cert
      mode: Terminate
It maps to two LDS resources, one for port 80 and one for port 443. Their configuration is:
- accessLog:
  - filter:
      responseFlagFilter:
        flags:
        - NR
    name: envoy.access_loggers.file
    typedConfig:
      '@type': type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
      logFormat:
        textFormatSource:
          inlineString: |
            [%START_TIME%] "%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%" %RESPONSE_CODE% %RESPONSE_FLAGS% %RESPONSE_CODE_DETAILS% %CONNECTION_TERMINATION_DETAILS% "%UPSTREAM_TRANSPORT_FAILURE_REASON%" %BYTES_RECEIVED% %BYTES_SENT% %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% "%REQ(X-FORWARDED-FOR)%" "%REQ(USER-AGENT)%" "%REQ(X-REQUEST-ID)%" "%REQ(:AUTHORITY)%" "%UPSTREAM_HOST%" %UPSTREAM_CLUSTER% %UPSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_REMOTE_ADDRESS% %REQUESTED_SERVER_NAME% %ROUTE_NAME%
      path: /dev/stdout
  address:
    socketAddress:
      address: 0.0.0.0
      portValue: 80
  filterChains:
  - filters:
    - name: envoy.filters.network.http_connection_manager
      typedConfig:
        '@type': type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
        httpFilters:
        ...
        - name: envoy.filters.http.router
          typedConfig:
            '@type': type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
        httpProtocolOptions: {}
        normalizePath: true
        pathWithEscapedSlashesAction: KEEP_UNCHANGED
        rds:
          configSource:
            ads: {}
            initialFetchTimeout: 0s
            resourceApiVersion: V3
          routeConfigName: http.80
        ...
  listenerFiltersTimeout: 0s
  name: 0.0.0.0:80
  trafficDirection: OUTBOUND
- accessLog:
  - filter:
      responseFlagFilter:
        flags:
        - NR
    name: envoy.access_loggers.file
    typedConfig:
      '@type': type.googleapis.com/envoy.extensions.access_loggers.file.v3.FileAccessLog
      logFormat:
        textFormatSource:
          inlineString: |
            [%START_TIME%] "%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%" %RESPONSE_CODE% %RESPONSE_FLAGS% %RESPONSE_CODE_DETAILS% %CONNECTION_TERMINATION_DETAILS% "%UPSTREAM_TRANSPORT_FAILURE_REASON%" %BYTES_RECEIVED% %BYTES_SENT% %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% "%REQ(X-FORWARDED-FOR)%" "%REQ(USER-AGENT)%" "%REQ(X-REQUEST-ID)%" "%REQ(:AUTHORITY)%" "%UPSTREAM_HOST%" %UPSTREAM_CLUSTER% %UPSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_REMOTE_ADDRESS% %REQUESTED_SERVER_NAME% %ROUTE_NAME%
      path: /dev/stdout
  address:
    socketAddress:
      address: 0.0.0.0
      portValue: 443
  filterChains:
  - filterChainMatch:
      serverNames:
      - '*.exp.com'
    filters:
    - name: envoy.filters.network.http_connection_manager
      typedConfig:
        '@type': type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
        httpFilters:
        ...
        - name: envoy.filters.http.router
          typedConfig:
            '@type': type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
        httpProtocolOptions: {}
        normalizePath: true
        pathWithEscapedSlashesAction: KEEP_UNCHANGED
        rds:
          configSource:
            ads: {}
            initialFetchTimeout: 0s
            resourceApiVersion: V3
          routeConfigName: https.443.default.gateway-istio-autogenerated-k8s-gateway-https.default
        ...
    transportSocket:
      name: envoy.transport_sockets.tls
      typedConfig:
        '@type': type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
        commonTlsContext:
          alpnProtocols:
          - h2
          - http/1.1
          tlsCertificateSdsSecretConfigs:
          - name: kubernetes-gateway://default/cert
            sdsConfig:
              ads: {}
              resourceApiVersion: V3
        requireClientCertificate: false
  - filterChainMatch:
      serverNames:
      - '*.httpbin.com'
    filters:
    - name: envoy.filters.network.http_connection_manager
      typedConfig:
        '@type': type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
        httpFilters:
        ...
        - name: envoy.filters.http.router
          typedConfig:
            '@type': type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
        httpProtocolOptions: {}
        normalizePath: true
        pathWithEscapedSlashesAction: KEEP_UNCHANGED
        rds:
          configSource:
            ads: {}
            initialFetchTimeout: 0s
            resourceApiVersion: V3
          routeConfigName: https.443.default.gateway-istio-autogenerated-k8s-gateway-https2.default
        ...
    transportSocket:
      name: envoy.transport_sockets.tls
      typedConfig:
        '@type': type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
        commonTlsContext:
          alpnProtocols:
          - h2
          - http/1.1
          tlsCertificateSdsSecretConfigs:
          - name: kubernetes-gateway://default/cert
            sdsConfig:
              ads: {}
              resourceApiVersion: V3
        requireClientCertificate: false
  listenerFilters:
  - name: envoy.filters.listener.tls_inspector
    typedConfig:
      '@type': type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector
  listenerFiltersTimeout: 0s
  name: 0.0.0.0:443
  trafficDirection: OUTBOUND
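The Kubernetes Gateway API specifies that a Gateway can have multiple listeners, each with a unique name. When the protocol is HTTPS, the listener's hostname is matched against the TLS server name (SNI). As we can see, the listeners on the same port (443) of the same Gateway resource are translated into a single LDS. That LDS contains multiple filterChains, one per listener. Each filterChain matches a given server name, namely the hostname of its listener. Each filterChain has its own http_connection_manager (HCM for short). The HCM has an rds field that names the corresponding RDS, as follows: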
rds:
  configSource:
    ads: {}
    initialFetchTimeout: 0s
    resourceApiVersion: V3
  routeConfigName: https.443.default.gateway-istio-autogenerated-k8s-gateway-https.default
...
rds:
  configSource:
    ads: {}
    initialFetchTimeout: 0s
    resourceApiVersion: V3
  routeConfigName: https.443.default.gateway-istio-autogenerated-k8s-gateway-https2.default
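The routeConfigName contains the name of the original listener.
An LDS on a plain HTTP port, by contrast, has no server name to tell listeners apart. It has only one filterChain, hence only one HCM, and in the end only one RDS resource: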
rds:
  configSource:
    ads: {}
    initialFetchTimeout: 0s
    resourceApiVersion: V3
  routeConfigName: http.80
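Because listeners are no longer distinguished at the LDS level, the name of this RDS on HTTP port 80 carries only the protocol and the port.
Next, let's create two HTTPRoutes and look at the resulting RDS configuration: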
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: http
  namespace: default
spec:
  parentRefs:
  - name: gateway
    namespace: default
  hostnames: ["httpbin.exp.com"]
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /get
    backendRefs:
    - name: httpbin
      port: 8000
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: http2
  namespace: default
spec:
  parentRefs:
  - name: gateway
    namespace: default
  hostnames: ["httpbin.test.com"]
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /get
    backendRefs:
    - name: httpbin
      port: 8000
The RDS configuration is as follows.
RDS on port 80:
- ignorePortInHostMatching: true
  maxDirectResponseBodySizeBytes: 1048576
  name: http.80
  validateClusters: false
  virtualHosts:
  - domains:
    - httpbin.exp.com
    includeRequestAttemptCount: true
    name: httpbin.exp.com:80
    routes:
    - decorator:
        operation: httpbin.default.svc.cluster.local:8000/*
      match:
        caseSensitive: true
        pathSeparatedPrefix: /get
      metadata:
        filterMetadata:
          istio:
            config: /apis/networking.istio.io/v1alpha3/namespaces/default/virtual-service/http-0-istio-autogenerated-k8s-gateway
      name: default.http.0
      route:
        cluster: outbound|8000||httpbin.default.svc.cluster.local
        clusterNotFoundResponseCode: INTERNAL_SERVER_ERROR
        maxGrpcTimeout: 0s
        retryPolicy:
          hostSelectionRetryMaxAttempts: "5"
          numRetries: 2
          retriableStatusCodes:
          - 503
          retryHostPredicate:
          - name: envoy.retry_host_predicates.previous_hosts
            typedConfig:
              '@type': type.googleapis.com/envoy.extensions.retry.host.previous_hosts.v3.PreviousHostsPredicate
          retryOn: connect-failure,refused-stream,unavailable,cancelled,retriable-status-codes
        timeout: 0s
  - domains:
    - httpbin.test.com
    includeRequestAttemptCount: true
    name: httpbin.test.com:80
    routes:
    - decorator:
        operation: httpbin.default.svc.cluster.local:8000/*
      match:
        caseSensitive: true
        pathSeparatedPrefix: /get
      metadata:
        filterMetadata:
          istio:
            config: /apis/networking.istio.io/v1alpha3/namespaces/default/virtual-service/http2-0-istio-autogenerated-k8s-gateway
      name: default.http2.0
      route:
        cluster: outbound|8000||httpbin.default.svc.cluster.local
        clusterNotFoundResponseCode: INTERNAL_SERVER_ERROR
        maxGrpcTimeout: 0s
        retryPolicy:
          hostSelectionRetryMaxAttempts: "5"
          numRetries: 2
          retriableStatusCodes:
          - 503
          retryHostPredicate:
          - name: envoy.retry_host_predicates.previous_hosts
            typedConfig:
              '@type': type.googleapis.com/envoy.extensions.retry.host.previous_hosts.v3.PreviousHostsPredicate
          retryOn: connect-failure,refused-stream,unavailable,cancelled,retriable-status-codes
        timeout: 0s
RDS on port 443 for *.exp.com:
- ignorePortInHostMatching: true
  maxDirectResponseBodySizeBytes: 1048576
  name: https.443.default.gateway-istio-autogenerated-k8s-gateway-https.default
  validateClusters: false
  virtualHosts:
  - domains:
    - httpbin.exp.com
    includeRequestAttemptCount: true
    name: httpbin.exp.com:443
    routes:
    - decorator:
        operation: httpbin.default.svc.cluster.local:8000/*
      match:
        caseSensitive: true
        pathSeparatedPrefix: /get
      metadata:
        filterMetadata:
          istio:
            config: /apis/networking.istio.io/v1alpha3/namespaces/default/virtual-service/http-0-istio-autogenerated-k8s-gateway
      name: default.http.0
      route:
        cluster: outbound|8000||httpbin.default.svc.cluster.local
        clusterNotFoundResponseCode: INTERNAL_SERVER_ERROR
        maxGrpcTimeout: 0s
        retryPolicy:
          hostSelectionRetryMaxAttempts: "5"
          numRetries: 2
          retriableStatusCodes:
          - 503
          retryHostPredicate:
          - name: envoy.retry_host_predicates.previous_hosts
            typedConfig:
              '@type': type.googleapis.com/envoy.extensions.retry.host.previous_hosts.v3.PreviousHostsPredicate
          retryOn: connect-failure,refused-stream,unavailable,cancelled,retriable-status-codes
        timeout: 0s
RDS on port 443 for *.httpbin.com:
- ignorePortInHostMatching: true
  maxDirectResponseBodySizeBytes: 1048576
  name: https.443.default.gateway-istio-autogenerated-k8s-gateway-https2.default
  validateClusters: false
  virtualHosts:
  - domains:
    - '*'
    name: blackhole:443
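As you can see, each RDS has a virtualHosts field holding a list of domains, and each domain has a routes field holding a list of paths. Because we did not create an HTTPRoute for listener https2, its RDS https.443.default.gateway-istio-autogenerated-k8s-gateway-https2.default contains only a placeholder, the blackhole:443 virtual host. Port 80, on the other hand, carries both the *.exp.com and *.test.com wildcard hostnames, so its RDS http.80 contains the configuration for two domains.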
To summarize:
- Each port is one LDS.
- For HTTPS, each hostname gets its own filterChain; for HTTP, the whole port shares a single filterChain.
- Each filterChain has one HCM, and each HCM has one RDS.
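The three rules above can be put into a small Python sketch. It is not how Istio implements the translation; `translate_gateway` and its dict layout are hypothetical names of mine, and the RDS naming only reproduces the pattern visible in the dumps earlier in this post (the literal `default` segment is copied from those names, not derived from anything).

```python
from collections import defaultdict


def translate_gateway(gateway_name, namespace, listeners):
    """Group Gateway listeners by port and derive filter chains and RDS names.

    Hypothetical helper: the naming scheme only mirrors the names observed
    in the config dumps; it is not taken from any control plane's source.
    """
    by_port = defaultdict(list)
    for listener in listeners:
        by_port[listener["port"]].append(listener)

    lds = {}
    for port, group in sorted(by_port.items()):
        chains = []
        # All HTTP listeners on a port share one filter chain and one RDS.
        if any(l["protocol"] == "HTTP" for l in group):
            chains.append({"server_names": None, "rds": f"http.{port}"})
        # Each HTTPS listener gets its own filter chain, matched by server name.
        for l in group:
            if l["protocol"] == "HTTPS":
                rds = (
                    f"https.{port}.default.{gateway_name}"
                    f"-istio-autogenerated-k8s-gateway-{l['name']}.{namespace}"
                )
                chains.append({"server_names": [l["hostname"]], "rds": rds})
        # One listener resource (LDS) per port.
        lds[f"0.0.0.0:{port}"] = chains
    return lds


listeners = [
    {"name": "http", "hostname": "*.exp.com", "port": 80, "protocol": "HTTP"},
    {"name": "http2", "hostname": "*.test.com", "port": 80, "protocol": "HTTP"},
    {"name": "https", "hostname": "*.exp.com", "port": 443, "protocol": "HTTPS"},
    {"name": "https2", "hostname": "*.httpbin.com", "port": 443, "protocol": "HTTPS"},
]
lds = translate_gateway("gateway", "default", listeners)
for name, chains in lds.items():
    print(name, [c["rds"] for c in chains])
```

Feeding it the four listeners of the example Gateway yields one filter chain on port 80 and two on port 443, which matches the dumps.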
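As an aside, although the filterChains in an LDS and the domains and paths in an RDS are all configured as lists, the way a request is matched to its path differs in all three places.
- Multiple filterChains in an LDS: Envoy merges the match criteria of all filterChains. Taking the server name as an example, matching first uses a hash lookup for an exact-match name, then strips one subdomain label at a time and looks each result up in the wildcard maps.
- Multiple domains in an RDS: a hash lookup for an exact-match name comes first, then the wildcards are tried one by one from longest to shortest.
- Multiple paths under the same domain: this is the only place where matching follows the configured order strictly, one entry at a time. Envoy additionally supports tree-shaped matching through the generic matching API, and route scopes to narrow down what gets matched.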
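To make the three strategies concrete, here is a simplified Python sketch. It is a model of the behaviour described above, not Envoy's code: the function names and the plain dict/list inputs are my own assumptions, only `*.suffix` style wildcards are handled, and the path check imitates `pathSeparatedPrefix` without looking at query strings.

```python
def match_server_name(sni, chains):
    """Filter chain lookup: exact hash hit, then strip one label at a time."""
    if sni in chains:
        return chains[sni]
    labels = sni.split(".")
    for i in range(1, len(labels)):
        candidate = "*." + ".".join(labels[i:])
        if candidate in chains:
            return chains[candidate]
    return None


def match_virtual_host(host, vhosts):
    """Domain lookup: exact hash hit, then wildcards from longest to shortest."""
    if host in vhosts:
        return vhosts[host]
    wildcards = sorted(
        (d for d in vhosts if d.startswith("*") and d != "*"),
        key=len,
        reverse=True,
    )
    for domain in wildcards:
        if host.endswith(domain[1:]):
            return vhosts[domain]
    # A bare "*" acts as the catch-all in this model.
    return vhosts.get("*")


def match_route(path, routes):
    """Path lookup: walk the list in configured order, first match wins."""
    for prefix, name in routes:
        if path == prefix or path.startswith(prefix + "/"):
            return name
    return None


chains = {"*.exp.com": "https", "*.httpbin.com": "https2"}
vhosts = {"httpbin.exp.com": "exact", "*.exp.com": "wildcard", "*": "blackhole"}
routes = [("/get", "default.http.0"), ("/get/deep", "never-reached")]

print(match_server_name("httpbin.exp.com", chains))
print(match_virtual_host("foo.exp.com", vhosts))
print(match_route("/get/deep", routes))
```

The last call shows why ordering matters for paths only: `/get/deep` is captured by the earlier `/get` entry, so the more specific route after it is never reached.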
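Attentive readers will have noticed that RDS granularity is one resource per HCM, and this is what causes the problem named at the start of this post. A change to any single path forces the whole RDS to be pushed again. This is most visible for plain HTTP services, where every route of every domain on the port shares one RDS.
In the service mesh scenario that made Envoy's name, where route-level configuration is rare, this is not a big problem. But once you want Envoy to handle north-south gateway traffic, re-pushing the full route configuration on every route change is clearly unacceptable.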
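A rough way to see the cost is the sketch below. It assumes a toy control plane that decides what to push by comparing whole RDS resources, and it uses JSON size as a stand-in for the real protobuf payload. The helper names, the 1000 hypothetical `svcN.exp.com` hosts, and the `httpbin` cluster name are all made up for illustration.

```python
import copy
import hashlib
import json


def digest(resource):
    """Content hash of one RDS resource (JSON stands in for protobuf here)."""
    return hashlib.sha256(json.dumps(resource, sort_keys=True).encode()).hexdigest()


def changed_resources(old, new):
    """Names of RDS resources that differ; each one is resent in full."""
    return sorted(
        name
        for name, rc in new.items()
        if name not in old or digest(old[name]) != digest(rc)
    )


def make_route_config(name, hosts):
    """Build a toy RouteConfiguration with one /get route per host."""
    return {
        "name": name,
        "virtualHosts": [
            {
                "name": f"{h}:80",
                "domains": [h],
                "routes": [
                    {
                        "match": {"pathSeparatedPrefix": "/get"},
                        "route": {"cluster": "httpbin"},
                    }
                ],
            }
            for h in hosts
        ],
    }


hosts = [f"svc{i}.exp.com" for i in range(1000)]
old = {"http.80": make_route_config("http.80", hosts)}
new = copy.deepcopy(old)
# Change a single path on a single domain.
changed_route = new["http.80"]["virtualHosts"][42]["routes"][0]
changed_route["match"]["pathSeparatedPrefix"] = "/post"

to_push = changed_resources(old, new)
payload_size = sum(len(json.dumps(new[name])) for name in to_push)
changed_size = len(json.dumps(changed_route))
print(to_push, payload_size, changed_size)
```

Only one route changed, yet the unit of delivery is the whole `http.80` resource, so the bytes sent scale with the number of domains on the port rather than with the size of the change.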
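The Envoy community is not unaware of this. Quite a few solutions to the problem have been proposed.
To find out what happens next, stay tuned for the next installment.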