author
Yin Ye, Tencent expert engineer and Tencent Cloud TCM product lead, with many years of hands-on experience in K8s, Service Mesh, and related technologies.
Introduction
For many backend services, we want to obtain the client source IP. Load balancers on the cloud, such as Tencent Cloud CLB, support passing the client source IP through to backend services, and TKE/TCM integrate this capability well. When using istio, however, the istio ingressgateway and sidecar sit on the path between the client and the service, so obtaining the client source IP in the backend service, especially for Layer 4 protocols, becomes more complicated: the application itself only sees connections coming from Envoy.
Some common source IP retention methods
Let's first look at some common ways that load balancers/proxies preserve the client source IP. Application protocols are generally either Layer 4 or Layer 7.
Source IP retention for Layer 7 protocols
Preserving the client source IP at Layer 7 is relatively simple. The most representative mechanism is the HTTP header XFF (X-Forwarded-For): the proxy records the original client source IP in the header and passes it through to the backend, and the application parses the XFF header to obtain the client source IP. Common Layer 7 proxy components such as Nginx, Haproxy, and Envoy all support this.
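As a minimal illustration (a hypothetical Go backend, not part of the original example), an application behind a trusted proxy can take the left-most X-Forwarded-For entry as the client IP and fall back to the connection's peer address otherwise:

package main

import (
	"fmt"
	"net"
	"net/http"
	"strings"
)

// clientIP returns the left-most X-Forwarded-For entry (the original client,
// assuming the header was set by a trusted proxy), or the peer address of the
// TCP connection if the header is absent.
func clientIP(r *http.Request) string {
	if xff := r.Header.Get("X-Forwarded-For"); xff != "" {
		return strings.TrimSpace(strings.Split(xff, ",")[0])
	}
	host, _, _ := net.SplitHostPort(r.RemoteAddr)
	return host
}

func main() {
	http.HandleFunc("/ip", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "{\"origin\": %q}\n", clientIP(r))
	})
	http.ListenAndServe(":8080", nil)
}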
Source IP retention for Layer 4 protocols
DNAT
IPVS/iptables both support DNAT. The client accesses the LB through the VIP; when the request packet arrives at the LB, the LB selects a backend server according to its connection scheduling algorithm, rewrites the packet's destination address (the VIP) to the selected server's address and the destination port to the server's corresponding port, and then sends the modified packet to that server. Because the LB does not modify the packet's source IP when forwarding, the backend server can see the client's source IP.
Transparent Proxy
Nginx/Haproxy support transparent proxying (Transparent Proxy). With this configuration enabled, when the LB establishes a connection to the backend service it binds the socket's source IP to the client's IP address; this relies on the kernel's TPROXY support and the socket's IP_TRANSPARENT option.
In both of the above methods, the backend service's response must pass back through the LB before returning to the client, which generally requires the cooperation of policy routing.
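To make the transparent proxy mechanism concrete, here is a rough Go sketch (hypothetical code, assuming the golang.org/x/sys/unix package; it needs CAP_NET_ADMIN plus the TPROXY/policy-routing setup mentioned above) of a proxy binding its upstream socket to the client's IP before connecting to the backend:

package main

import (
	"log"
	"net"
	"syscall"

	"golang.org/x/sys/unix"
)

// dialAsClient connects to backend using clientIP as the source address of the
// socket, relying on IP_TRANSPARENT to allow binding to a non-local address.
// Return traffic only reaches the proxy if TPROXY/policy routing is in place.
func dialAsClient(clientIP, backend string) (net.Conn, error) {
	d := net.Dialer{
		LocalAddr: &net.TCPAddr{IP: net.ParseIP(clientIP)}, // client's IP, port chosen by kernel
		Control: func(network, address string, c syscall.RawConn) error {
			var serr error
			if err := c.Control(func(fd uintptr) {
				// IP_TRANSPARENT lets us bind to an address that is not local.
				serr = unix.SetsockoptInt(int(fd), unix.SOL_IP, unix.IP_TRANSPARENT, 1)
			}); err != nil {
				return err
			}
			return serr
		},
	}
	return d.Dial("tcp", backend)
}

func main() {
	// Example addresses only.
	conn, err := dialAsClient("203.0.113.7", "10.0.0.2:80")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()
	log.Printf("connected, local address %s", conn.LocalAddr())
}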
TOA
TOA (TCP Option Address) is a way to obtain the real source IP over a Layer 4 protocol (TCP). In essence, the source IP address is inserted into the Options field of the TCP header; this requires the corresponding TOA kernel module to be installed on the backend.
Proxy Protocol
Proxy Protocol is a Layer 4 source address preservation scheme originally implemented by Haproxy. The principle is simple: after the proxy establishes a TCP connection with the backend server, and before sending the actual application data, it first sends a Proxy Protocol header (containing the client source IP/port, destination IP/port, and other information). The backend server then obtains the real client source IP by parsing this header. Proxy Protocol requires both the proxy and the server to support the protocol, but it can preserve the source IP across multiple layers of intermediate proxies, which is similar in spirit to the Layer 7 XFF design.
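As an illustration of the protocol (a hypothetical Go sketch, not taken from Haproxy or Envoy), the sender writes a single text line before any application data, and the receiver parses it to recover the client IP:

package main

import (
	"fmt"
	"net"
	"strings"
)

// writeProxyV1 sends a Proxy Protocol v1 header on a newly established
// upstream connection, before any application bytes:
//   "PROXY TCP4 <client-ip> <proxy-ip> <client-port> <proxy-port>\r\n"
func writeProxyV1(upstream net.Conn, client, proxy *net.TCPAddr) error {
	h := fmt.Sprintf("PROXY TCP4 %s %s %d %d\r\n",
		client.IP, proxy.IP, client.Port, proxy.Port)
	_, err := upstream.Write([]byte(h))
	return err
}

// parseProxyV1 extracts the original client IP from a v1 header line.
func parseProxyV1(line string) (string, error) {
	f := strings.Fields(strings.TrimSpace(line))
	if len(f) != 6 || f[0] != "PROXY" {
		return "", fmt.Errorf("not a proxy protocol v1 header: %q", line)
	}
	return f[2], nil // source IP of the original client
}

func main() {
	ip, _ := parseProxyV1("PROXY TCP4 106.52.131.116 172.17.0.54 6093 8000\r\n")
	fmt.Println(ip) // 106.52.131.116
}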
Implementing source IP retention in istio
In istio, due to the istio ingressgateway and sidecar, it is harder for the application to obtain the client's source IP. However, Envoy itself supports transparent proxying and the Proxy Protocol, and combined with TPROXY we can obtain the source IP inside istio services.
east-west traffic
When an istio east-west service is accessed, because of sidecar injection, all traffic entering and leaving the service is intercepted and proxied by Envoy, and Envoy then forwards the request to the application. As a result, the source address of the request seen by the application is Envoy's local address, 127.0.0.6.
# kubectl -n foo apply -f samples/httpbin/httpbin.yaml
# kubectl -n foo apply -f samples/sleep/sleep.yaml
# kubectl -n foo get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
httpbin-74fb669cc6-qvlb5 2/2 Running 0 4m9s 172.17.0.57 10.206.2.144 <none> <none>
sleep-74b7c4c84c-9nbtr 2/2 Running 0 6s 172.17.0.58 10.206.2.144 <none> <none>
# kubectl -n foo exec -it deploy/sleep -c sleep -- curl http://httpbin:8000/ip
{
"origin": "127.0.0.6"
}
As you can see, the source IP seen by httpbin is 127.0.0.6. This can also be confirmed from the socket information.
# kubectl -n foo exec -it deploy/httpbin -c httpbin -- netstat -ntp | grep 80
tcp 0 0 172.17.0.57:80 127.0.0.6:56043 TIME_WAIT -
- Enable TPROXY in istio
We modify the httpbin deployment to use the TPROXY interception mode (note that the httpbin Pod IP changes to 172.17.0.59):
# kubectl patch deployment -n foo httpbin -p '{"spec":{"template":{"metadata":{"annotations":{"sidecar.istio.io/interceptionMode":"TPROXY"}}}}}'
# kubectl -n foo get pods -l app=httpbin -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
httpbin-6565f59ff8-plnn7 2/2 Running 0 43m 172.17.0.59 10.206.2.144 <none> <none>
# kubectl -n foo exec -it deploy/sleep -c sleep -- curl http://httpbin:8000/ip
{
"origin": "172.17.0.58"
}
As you can see, httpbin can now get the real IP of the sleep pod.
Socket status:
# kubectl -n foo exec -it deploy/httpbin -c httpbin -- netstat -ntp | grep 80
tcp 0 0 172.17.0.59:80 172.17.0.58:35899 ESTABLISHED 9/python3
tcp 0 0 172.17.0.58:35899 172.17.0.59:80 ESTABLISHED -
The first line is the receiver socket of httpbin, and the second line is the sender socket of envoy.
httpbin envoy log:
{"bytes_received":0,"upstream_local_address":"172.17.0.58:35899",
"downstream_remote_address":"172.17.0.58:46864","x_forwarded_for":null,
"path":"/ip","istio_policy_status":null,
"response_code":200,"upstream_service_time":"1",
"authority":"httpbin:8000","start_time":"2022-05-30T02:09:13.892Z",
"downstream_local_address":"172.17.0.59:80","user_agent":"curl/7.81.0-DEV","response_flags":"-",
"upstream_transport_failure_reason":null,"request_id":"2b2ab6cc-78da-95c0-b278-5b3e30b514a0",
"protocol":"HTTP/1.1","requested_server_name":null,"duration":1,"bytes_sent":30,"route_name":"default",
"upstream_cluster":"inbound|80||","upstream_host":"172.17.0.59:80","method":"GET"}
As can be seen:
- downstream_remote_address: 172.17.0.58:46864 ## sleep's address
- downstream_local_address: 172.17.0.59:80 ## destination address accessed by sleep
- upstream_local_address: 172.17.0.58:35899 ## local address httpbin's envoy uses to connect to httpbin (the IP of sleep)
- upstream_host: 172.17.0.59:80 ## destination address accessed by httpbin's envoy
The local address that httpbin's envoy uses to connect to httpbin is sleep's IP address.
North-South traffic
For north-south traffic, the client first reaches the CLB, which forwards the request to the ingressgateway and then on to the backend service. With the extra ingressgateway hop in the middle, obtaining the client source IP becomes even harder.
We expose httpbin over a Layer 4 TCP port:
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  namespace: foo
  labels:
    app: httpbin
    service: httpbin
spec:
  ports:
  - name: tcp
    port: 8000
    targetPort: 80
  selector:
    app: httpbin
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: httpbin-gw
  namespace: foo
spec:
  selector:
    istio: ingressgateway # use istio default controller
  servers:
  - port:
      number: 8000
      name: tcp
      protocol: TCP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
  namespace: foo
spec:
  hosts:
  - "*"
  gateways:
  - httpbin-gw
  tcp:
  - match:
    - port: 8000
    route:
    - destination:
        port:
          number: 8000
        host: httpbin
Access httpbin through ingressgateway:
# export GATEWAY_URL=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
# curl http://$GATEWAY_URL:8000/ip
{
"origin": "172.17.0.54"
}
As you can see, the address seen by httpbin is the address of the ingressgateway:
# kubectl -n istio-system get pods -l istio=ingressgateway -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
istio-ingressgateway-5d5b776b7b-pxc2g 1/1 Running 0 3d15h 172.17.0.54 10.206.2.144 <none> <none>
Although transparent proxying is enabled in httpbin's envoy, the ingressgateway does not pass the client's source address on to httpbin's envoy. Envoy's Proxy Protocol implementation can solve this problem, so we enable Proxy Protocol support on both the ingressgateway and httpbin via EnvoyFilters.
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: ingressgw-pp
  namespace: istio-system
spec:
  configPatches:
  - applyTo: CLUSTER
    patch:
      operation: MERGE
      value:
        transport_socket:
          name: envoy.transport_sockets.upstream_proxy_protocol
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.transport_sockets.proxy_protocol.v3.ProxyProtocolUpstreamTransport
            config:
              version: V1
            transport_socket:
              name: "envoy.transport_sockets.raw_buffer"
  workloadSelector:
    labels:
      istio: ingressgateway
---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: httpbin-pp
  namespace: foo
spec:
  configPatches:
  - applyTo: LISTENER
    match:
      context: SIDECAR_INBOUND
    patch:
      operation: MERGE
      value:
        listener_filters:
        - name: envoy.filters.listener.proxy_protocol
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.listener.proxy_protocol.v3.ProxyProtocol
        - name: envoy.filters.listener.original_dst
        - name: envoy.filters.listener.original_src
  workloadSelector:
    labels:
      app: httpbin
Access httpbin through the LB again:
# curl http://$GATEWAY_URL:8000/ip
{
"origin": "106.52.131.116"
}
httpbin gets the source IP of the client.
- ingressgateway envoy log
{"istio_policy_status":null,"protocol":null,"bytes_sent":262,"downstream_remote_address":"106.52.131.116:6093","start_time":"2022-05-30T03:33:33.759Z",
"upstream_service_time":null,"authority":null,"requested_server_name":null,"user_agent":null,"request_id":null,
"upstream_cluster":"outbound|8000||httpbin.foo.svc.cluster.local","upstream_transport_failure_reason":null,"duration":37,"response_code":0,
"method":null,"downstream_local_address":"172.17.0.54:8000","route_name":null,"upstream_host":"172.17.0.59:80","bytes_received":83,"path":null,
"x_forwarded_for":null,"upstream_local_address":"172.17.0.54:36162","response_flags":"-"}
As can be seen:
- downstream_remote_address: 106.52.131.116:6093 ## client source address
- downstream_local_address: 172.17.0.54:8000
- upstream_local_address: 172.17.0.54:36162 ## ingressgateway local address
- upstream_host: 172.17.0.59:80 ## httpbin address
- httpbin envoy log
{"istio_policy_status":null,"response_flags":"-","protocol":null,"method":null,"upstream_transport_failure_reason":null,"authority":null,"duration":37,
"x_forwarded_for":null,"user_agent":null,"downstream_remote_address":"106.52.131.116:6093","downstream_local_address":"172.17.0.59:80",
"bytes_sent":262,"path":null,"requested_server_name":null,"upstream_service_time":null,"request_id":null,"bytes_received":83,"route_name":null,
"upstream_local_address":"106.52.131.116:34431","upstream_host":"172.17.0.59:80","response_code":0,"start_time":"2022-05-30T03:33:33.759Z","upstream_cluster":"inbound|80||"}
As can be seen:
- downstream_remote_address: 106.52.131.116:6093 ## client source address
- downstream_local_address: 172.17.0.59:80 ## httpbin address
- upstream_local_address: 106.52.131.116:34431 ## the client IP is preserved; only the port differs
- upstream_host: 172.17.0.59:80 ## httpbin address
It is worth noting that upstream_local_address of httpbin's envoy retains the client's IP, so the source address seen by httpbin is the client's real IP.
- data flow
Related Implementation Analysis
TPROXY
The kernel implementation of TPROXY can be found in net/netfilter/xt_TPROXY.c.
istio-iptables sets the following iptables rules to mark packets:
-A PREROUTING -p tcp -j ISTIO_INBOUND
-A PREROUTING -p tcp -m mark --mark 0x539 -j CONNMARK --save-mark --nfmask 0xffffffff --ctmask 0xffffffff
-A OUTPUT -p tcp -m connmark --mark 0x539 -j CONNMARK --restore-mark --nfmask 0xffffffff --ctmask 0xffffffff
-A ISTIO_DIVERT -j MARK --set-xmark 0x539/0xffffffff
-A ISTIO_DIVERT -j ACCEPT
-A ISTIO_INBOUND -p tcp -m conntrack --ctstate RELATED,ESTABLISHED -j ISTIO_DIVERT
-A ISTIO_INBOUND -p tcp -j ISTIO_TPROXY
-A ISTIO_TPROXY ! -d 127.0.0.1/32 -p tcp -j TPROXY --on-port 15006 --on-ip 0.0.0.0 --tproxy-mark 0x539/0xffffffff
It is worth mentioning that TPROXY can redirect packets by itself, without relying on NAT. In addition, combined with policy routing, packets whose destination is not a local address are delivered via the local lo interface:
# ip rule list
0: from all lookup local
32765: from all fwmark 0x539 lookup 133
32766: from all lookup main
32767: from all lookup default
# ip route show table 133
local default dev lo scope host
More details about TPROXY can be found in the kernel's transparent proxy documentation (see the references below).
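To make the effect of TPROXY visible from userspace, here is a hedged Go sketch (hypothetical code, assuming golang.org/x/sys/unix) of a listener in the position of Envoy's port 15006: with IP_TRANSPARENT set on the listening socket and the iptables/policy-routing rules above in place, an accepted connection keeps the untouched client address as its remote address and reports the original destination as its local address:

package main

import (
	"context"
	"log"
	"net"
	"syscall"

	"golang.org/x/sys/unix"
)

func main() {
	lc := net.ListenConfig{
		Control: func(network, address string, c syscall.RawConn) error {
			var serr error
			if err := c.Control(func(fd uintptr) {
				// Allow accepting connections whose destination is not a local address.
				serr = unix.SetsockoptInt(int(fd), unix.SOL_IP, unix.IP_TRANSPARENT, 1)
			}); err != nil {
				return err
			}
			return serr
		},
	}
	ln, err := lc.Listen(context.Background(), "tcp", ":15006")
	if err != nil {
		log.Fatal(err)
	}
	for {
		conn, err := ln.Accept()
		if err != nil {
			log.Fatal(err)
		}
		// RemoteAddr: the real client; LocalAddr: the original destination (not :15006).
		log.Printf("client %s -> original destination %s", conn.RemoteAddr(), conn.LocalAddr())
		conn.Close()
	}
}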
Implementation of Proxy Protocol in Envoy
- proxy protocol header format
Version 1 (the human-readable header format) is used here, as follows:
0000 50 52 4f 58 59 20 54 43 50 34 20 31 30 36 2e 35 PROXY TCP4 106.5
0010 32 2e 31 33 31 2e 31 31 36 20 31 37 32 2e 31 37 2.131.116 172.17
0020 2e 30 2e 35 34 20 36 30 39 33 20 38 30 30 30 0d .0.54 6093 8000.
0030 0a .
As you can see, the header includes the IP:PORT information of both the client and the ingressgateway. For a more detailed introduction, see the Haproxy PROXY protocol documentation in the references below.
- ProxyProtocolUpstreamTransport
The ingressgateway, as the sender, uses ProxyProtocolUpstreamTransport to build the Proxy Protocol header:
/// source/extensions/transport_sockets/proxy_protocol/proxy_protocol.cc
void UpstreamProxyProtocolSocket::generateHeaderV1() {
// Default to local addresses (used if no downstream connection exists e.g. health checks)
auto src_addr = callbacks_->connection().addressProvider().localAddress();
auto dst_addr = callbacks_->connection().addressProvider().remoteAddress();
if (options_ && options_->proxyProtocolOptions().has_value()) {
const auto options = options_->proxyProtocolOptions().value();
src_addr = options.src_addr_;
dst_addr = options.dst_addr_;
}
Common::ProxyProtocol::generateV1Header(*src_addr->ip(), *dst_addr->ip(), header_buffer_);
}
- envoy.filters.listener.proxy_protocol
httpbin's envoy, as the receiver, configures a ListenerFilter (envoy.filters.listener.proxy_protocol) to parse the Proxy Protocol header:
/// source/extensions/filters/listener/proxy_protocol/proxy_protocol.cc
ReadOrParseState Filter::onReadWorker() {
Network::ConnectionSocket& socket = cb_->socket(); /// ConnectionHandlerImpl::ActiveTcpSocket
...
if (proxy_protocol_header_.has_value() && !proxy_protocol_header_.value().local_command_) {
...
// Only set the local address if it really changed, and mark it as address being restored.
if (*proxy_protocol_header_.value().local_address_ !=
*socket.addressProvider().localAddress()) { /// proxy protocol header: 172.17.0.54:8000
socket.addressProvider().restoreLocalAddress(proxy_protocol_header_.value().local_address_); /// => 172.17.0.54:8000
} /// Network::ConnectionSocket
    socket.addressProvider().setRemoteAddress(proxy_protocol_header_.value().remote_address_); /// change downstream_remote_address to 106.52.131.116
}
// Release the file event so that we do not interfere with the connection read events.
socket.ioHandle().resetFileEvents();
cb_->continueFilterChain(true); /// ConnectionHandlerImpl::ActiveTcpSocket
return ReadOrParseState::Done;
}
It is worth noting that when envoy.filters.listener.proxy_protocol parses the proxy protocol header, local_address is set to the header's dst_addr (172.17.0.54:8000) and remote_address is set to the header's src_addr (106.52.131.116); the order is simply reversed relative to the header.
After proxy_protocol has processed the connection, its downstream_remote_address has been changed to the client's source address.
- envoy.filters.listener.original_src
With sidecar.istio.io/interceptionMode: TPROXY, the virtualInbound listener adds envoy.filters.listener.original_src:
# istioctl -n foo pc listeners deploy/httpbin --port 15006 -o json
[
{
"name": "virtualInbound",
"address": {
"socketAddress": {
"address": "0.0.0.0",
"portValue": 15006
}
},
"filterChains": [...],
"listenerFilters": [
{
"name": "envoy.filters.listener.original_dst",
"typedConfig": {
"@type": "type.googleapis.com/envoy.extensions.filters.listener.original_dst.v3.OriginalDst"
}
},
{
"name": "envoy.filters.listener.original_src",
"typedConfig": {
"@type": "type.googleapis.com/envoy.extensions.filters.listener.original_src.v3.OriginalSrc",
"mark": 1337
}
}
...
]
"listenerFiltersTimeout": "0s",
"continueOnListenerFiltersTimeout": true,
"transparent": true,
"trafficDirection": "INBOUND",
"accessLog": [...]
}
]
envoy.filters.listener.original_src modifies upstream_local_address through a socket option so that the client IP (downstream_remote_address) is carried through to the upstream connection:
/// source/extensions/filters/listener/original_src/original_src.cc
Network::FilterStatus OriginalSrcFilter::onAccept(Network::ListenerFilterCallbacks& cb) {
auto& socket = cb.socket(); /// ConnectionHandlerImpl::ActiveTcpSocket.socket()
auto address = socket.addressProvider().remoteAddress(); /// get downstream_remote_address
ASSERT(address);
ENVOY_LOG(debug,
"Got a new connection in the original_src filter for address {}. Marking with {}",
address->asString(), config_.mark());
...
auto options_to_add =
Filters::Common::OriginalSrc::buildOriginalSrcOptions(std::move(address), config_.mark());
socket.addOptions(std::move(options_to_add)); /// Network::Socket::Options
return Network::FilterStatus::Continue;
}
- envoy.filters.listener.original_dst
In addition, httpbin's envoy, as the receiving end of the ingressgateway, also configures the ListenerFilter envoy.filters.listener.original_dst on the virtualInbound listener. Let's look at what it does:
// source/extensions/filters/listener/original_dst/original_dst.cc
Network::FilterStatus OriginalDstFilter::onAccept(Network::ListenerFilterCallbacks& cb) {
ENVOY_LOG(debug, "original_dst: New connection accepted");
Network::ConnectionSocket& socket = cb.socket();
if (socket.addressType() == Network::Address::Type::Ip) { /// socket SO_ORIGINAL_DST option
Network::Address::InstanceConstSharedPtr original_local_address = getOriginalDst(socket); /// origin dst address
// A listener that has the use_original_dst flag set to true can still receive
// connections that are NOT redirected using iptables. If a connection was not redirected,
// the address returned by getOriginalDst() matches the local address of the new socket.
// In this case the listener handles the connection directly and does not hand it off.
if (original_local_address) { /// change local address to origin dst address
// Restore the local address to the original one.
socket.addressProvider().restoreLocalAddress(original_local_address);
}
}
return Network::FilterStatus::Continue;
}
In istio, iptables intercepts the original request and redirects it to port 15006 (inbound) or 15001 (outbound), so the local address of the socket that handles the request is not the request's original destination address. The original_dst ListenerFilter is responsible for changing the socket's local address to the original destination address.
The virtualOutbound listener does not add envoy.filters.listener.original_dst directly; instead it sets use_original_dst to true, and envoy then adds envoy.filters.listener.original_dst for it. At the same time, the virtualOutbound listener forwards the request to the listener associated with the request's original destination address for processing.
For the virtualInbound listener, envoy.filters.listener.original_dst is added directly. Unlike the virtualOutbound listener, it only changes the address to the original destination address and does not forward the request to a listener for that address (for inbound requests there is no listener bound to the destination address); instead, the inbound request is handled by the filter chains of the virtualInbound listener itself.
Refer to the code that generates the virtualInbound listener:
// istio/istio/pilot/pkg/networking/core/v1alpha3/listener_builder.go
func (lb *ListenerBuilder) aggregateVirtualInboundListener(passthroughInspectors map[int]enabledInspector) *ListenerBuilder {
// Deprecated by envoyproxy. Replaced
// 1. filter chains in this listener
// 2. explicit original_dst listener filter
// UseOriginalDst: proto.BoolTrue,
lb.virtualInboundListener.UseOriginalDst = nil
lb.virtualInboundListener.ListenerFilters = append(lb.virtualInboundListener.ListenerFilters,
        xdsfilters.OriginalDestination, /// add envoy.filters.listener.original_dst
)
if lb.node.GetInterceptionMode() == model.InterceptionTproxy { /// TPROXY mode
lb.virtualInboundListener.ListenerFilters =
append(lb.virtualInboundListener.ListenerFilters, xdsfilters.OriginalSrc)
}
...
summary
Based on TPROXY and Proxy Protocol, we can preserve the client source IP for Layer 4 protocols in istio.
refer to
- istio doc: Configuring Gateway Network Topology
- IP Transparency and Direct Server Return with NGINX and NGINX Plus as Transparent Proxy
- Kernel doc: Transparent proxy support
- Haproxy doc: The PROXY protocol
- Envoy doc: IP Transparency
- 【IstioCon 2021】How to keep source address in Istio?
about us
For more cases and knowledge about cloud native, follow the WeChat official account of the same name, [Tencent Cloud Native].
Benefits:
① Reply [Manual] in the official account backend to get the "Tencent Cloud Native Roadmap Manual" & "Tencent Cloud Native Best Practices".
② Reply [Series] in the official account backend to get the "15 series of 100+ super practical cloud native original articles" collection, covering Kubernetes cost reduction and efficiency improvement, K8s performance optimization practices, best practices, and more.
③ Reply [White Paper] in the official account backend to get the "Tencent Cloud Container Security White Paper" & "The Source of Cost Reduction - Cloud Native Cost Management White Paper v1.0".
④ Reply [Introduction at the Speed of Light] in the official account backend to get a 50,000-word essential tutorial on Prometheus and Grafana by Tencent Cloud experts.