author

Yin Ye, expert engineer at Tencent and product lead for Tencent Cloud TCM. He has many years of hands-on experience with K8s, Service Mesh, and related technologies.

Introduction

In many business scenarios, backend services want to obtain the client source IP. Load balancers on the cloud, such as Tencent Cloud CLB, support passing the client source IP through to backend services, and TKE/TCM integrate this capability well.

However, when using istio, the istio ingressgateway and sidecar sit on the path between client and service, so it becomes more complicated for a backend service to obtain the client source IP, especially over Layer-4 protocols.

For the application service, it only sees connections from Envoy.

Some common source IP retention methods

Let's first look at how common Loadbalancer/Proxy implementations preserve the client source IP. Application protocols are generally either Layer-4 or Layer-7 protocols.

Source IP retention for Layer 7 protocols

Preserving the client source IP at Layer 7 is relatively simple. The most representative mechanism is the HTTP header XFF (X-Forwarded-For): the proxy records the original client source IP in the header and passes it through to the backend, and the application parses the XFF header to obtain the client source IP. Common Layer-7 proxy components such as Nginx, Haproxy, and Envoy all support this.
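As a minimal sketch of this idea (the helper name is illustrative; real deployments must only trust XFF entries appended by trusted proxies):

```python
# Sketch: recovering the client IP from an X-Forwarded-For header.
# Each proxy appends the address it saw, so the left-most entry is the
# original client, assuming all intermediate proxies are trusted.
def client_ip_from_xff(xff_header: str) -> str:
    # "client, proxy1, proxy2" -> "client"
    return xff_header.split(",")[0].strip()

print(client_ip_from_xff("106.52.131.116, 172.17.0.54"))  # 106.52.131.116
```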

Source IP retention for Layer 4 protocols

DNAT

IPVS/iptables both support DNAT. The client accesses the LB through a VIP; when the request packet arrives at the LB, the LB selects a backend server according to its connection scheduling algorithm, rewrites the packet's destination address from the VIP to the selected server's address, rewrites the destination port to the server's corresponding port, and finally sends the modified packet to the selected server. Because the LB does not modify the packet's source IP when forwarding, the backend server sees the client source IP directly.
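The rewrite described above can be sketched as a simple tuple transformation (names are hypothetical, this is not kernel code); the point is that only the destination side of the 4-tuple changes:

```python
# Sketch of the DNAT rewrite: destination VIP:VPort is rewritten to the
# chosen real server, while the source side is left untouched, so the
# backend still sees the client source IP.
def dnat(pkt: dict, rs_ip: str, rs_port: int) -> dict:
    rewritten = dict(pkt)
    rewritten["dst_ip"] = rs_ip      # VIP -> real server IP
    rewritten["dst_port"] = rs_port  # VPort -> real server port
    return rewritten                 # src_ip/src_port unchanged

pkt = {"src_ip": "106.52.131.116", "src_port": 6093,
       "dst_ip": "1.2.3.4", "dst_port": 8000}     # client -> VIP
print(dnat(pkt, "172.17.0.57", 80)["src_ip"])     # 106.52.131.116
```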

Transparent Proxy

Nginx/Haproxy support transparent proxying (Transparent Proxy). When this is enabled, the LB binds the source IP of the socket it opens to the backend service to the client's IP address; this relies on the kernel's TPROXY support and the socket's IP_TRANSPARENT option.
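As a sketch, in HAProxy this transparent binding is configured per backend server with the `usesrc` keyword (this assumes a kernel with TPROXY support and an HAProxy build with TPROXY enabled; the backend and server names are illustrative):

```
backend app
    # Bind the upstream socket's source address to the client's IP
    server s1 172.17.0.57:80 source 0.0.0.0 usesrc clientip
```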

In addition, with both of the above methods, the backend service's response must pass back through the LB before returning to the client, which generally requires policy routing to be configured.

TOA

TOA (TCP Option Address) is a way to carry the real source IP over a Layer-4 (TCP) connection. The essence is to insert the source IP address into the Options field of the TCP header; the backend kernel then needs the corresponding TOA kernel module installed to read it.
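A sketch of parsing such an option, assuming a common TOA layout (option kind 254, length 8, then a 2-byte port and a 4-byte IPv4 address, both in network byte order; the exact kind value varies between TOA modules, and the helper name is illustrative):

```python
import socket
import struct

TOA_KIND = 254  # assumed TOA option kind; varies by TOA module

def parse_toa(tcp_options: bytes):
    """Walk a raw TCP options byte string and return (ip, port) if a TOA
    option is present, else None."""
    i = 0
    while i < len(tcp_options):
        kind = tcp_options[i]
        if kind == 0:        # End-of-option-list
            break
        if kind == 1:        # NOP is a single byte
            i += 1
            continue
        length = tcp_options[i + 1]
        if kind == TOA_KIND and length == 8:
            port, = struct.unpack("!H", tcp_options[i + 2:i + 4])
            ip = socket.inet_ntoa(tcp_options[i + 4:i + 8])
            return ip, port
        i += length
    return None

opts = bytes([254, 8]) + struct.pack("!H", 6093) + socket.inet_aton("106.52.131.116")
print(parse_toa(opts))  # ('106.52.131.116', 6093)
```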

Proxy Protocol

Proxy Protocol is a Layer-4 source address preservation scheme designed by Haproxy. Its principle is simple: after the proxy establishes the TCP connection to the backend server, and before it sends any application data, it first sends a Proxy Protocol header containing the client source IP/port, the destination IP/port, and other information. The backend server then obtains the real client source IP by parsing this header.

Proxy Protocol requires both the proxy and the server to support the protocol, but it can preserve the source IP across multiple layers of intermediate proxies. This is similar in spirit to the Layer-7 XFF design.
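As a sketch of the sender side, a Proxy Protocol v1 header follows the layout `PROXY TCP4|TCP6 src_ip dst_ip src_port dst_port\r\n` from the haproxy specification (the helper name is illustrative):

```python
# Sketch: build a Proxy Protocol v1 (human-readable) header, which the
# proxy sends once on the upstream connection before any application data.
def build_pp_v1(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                family: str = "TCP4") -> bytes:
    return f"PROXY {family} {src_ip} {dst_ip} {src_port} {dst_port}\r\n".encode()

hdr = build_pp_v1("106.52.131.116", "172.17.0.54", 6093, 8000)
print(hdr)  # b'PROXY TCP4 106.52.131.116 172.17.0.54 6093 8000\r\n'
```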

Implementing source IP retention in istio

In istio, because of the istio ingressgateway and sidecar, it is harder for the application to obtain the client source IP. However, Envoy itself supports Proxy Protocol in order to support transparent proxying, and combined with TPROXY we can obtain the source IP inside an istio service.

East-west traffic

When an istio service is accessed east-west, because of sidecar injection, all traffic entering and leaving the service is intercepted and proxied by Envoy, and Envoy then forwards the request to the application. As a result, the source address of the request the application receives is Envoy's local address, 127.0.0.6.

 # kubectl -n foo apply -f samples/httpbin/httpbin.yaml
# kubectl -n foo apply -f samples/sleep/sleep.yaml
# kubectl -n foo get pods -o wide
NAME                       READY   STATUS    RESTARTS   AGE    IP            NODE           NOMINATED NODE   READINESS GATES
httpbin-74fb669cc6-qvlb5   2/2     Running   0          4m9s   172.17.0.57   10.206.2.144   <none>           <none>
sleep-74b7c4c84c-9nbtr     2/2     Running   0          6s     172.17.0.58   10.206.2.144   <none>           <none>


# kubectl -n foo exec -it deploy/sleep -c sleep -- curl http://httpbin:8000/ip
{
  "origin": "127.0.0.6"
}

As you can see, the source IP seen by httpbin is 127.0.0.6 . This can also be confirmed from the socket information.

 # kubectl -n foo exec -it deploy/httpbin -c httpbin -- netstat -ntp | grep 80
tcp        0      0 172.17.0.57:80          127.0.0.6:56043         TIME_WAIT   -
  • istio with TPROXY enabled

We modify the httpbin deployment to use TPROXY (note that the IP of httpbin changes to 172.17.0.59):

 # kubectl patch deployment -n foo httpbin -p '{"spec":{"template":{"metadata":{"annotations":{"sidecar.istio.io/interceptionMode":"TPROXY"}}}}}'
# kubectl -n foo get pods -l app=httpbin  -o wide
NAME                       READY   STATUS    RESTARTS   AGE   IP            NODE           NOMINATED NODE   READINESS GATES
httpbin-6565f59ff8-plnn7   2/2     Running   0          43m   172.17.0.59   10.206.2.144   <none>           <none>

# kubectl -n foo exec -it deploy/sleep -c sleep -- curl http://httpbin:8000/ip
{
  "origin": "172.17.0.58"
}

As you can see, httpbin can get the real IP of the sleep side.

Status of socket:

 # kubectl -n foo exec -it deploy/httpbin -c httpbin -- netstat -ntp | grep 80                  
tcp        0      0 172.17.0.59:80          172.17.0.58:35899       ESTABLISHED 9/python3           
tcp        0      0 172.17.0.58:35899       172.17.0.59:80          ESTABLISHED -

The first line is the receiver socket of httpbin, and the second line is the sender socket of envoy.

httpbin envoy log:

 {"bytes_received":0,"upstream_local_address":"172.17.0.58:35899",
"downstream_remote_address":"172.17.0.58:46864","x_forwarded_for":null,
"path":"/ip","istio_policy_status":null,
"response_code":200,"upstream_service_time":"1",
"authority":"httpbin:8000","start_time":"2022-05-30T02:09:13.892Z",
"downstream_local_address":"172.17.0.59:80","user_agent":"curl/7.81.0-DEV","response_flags":"-",
"upstream_transport_failure_reason":null,"request_id":"2b2ab6cc-78da-95c0-b278-5b3e30b514a0",
"protocol":"HTTP/1.1","requested_server_name":null,"duration":1,"bytes_sent":30,"route_name":"default",
"upstream_cluster":"inbound|80||","upstream_host":"172.17.0.59:80","method":"GET"}

As can be seen:

  • downstream_remote_address: 172.17.0.58:46864 ## sleep address
  • downstream_local_address: 172.17.0.59:80 ## Destination address accessed by sleep
  • upstream_local_address: 172.17.0.58:35899 ## httpbin envoy connects to the local address of httpbin (the IP of sleep)
  • upstream_host: 172.17.0.59:80 ## The destination address of httpbin envoy access

The local address of httpbin envoy connecting httpbin is the IP address of sleep.

North-South traffic

For north-south traffic, the client first reaches the CLB, the CLB forwards the request to the ingressgateway, and the ingressgateway forwards it to the backend service. With the extra ingressgateway hop in the middle, obtaining the client source IP becomes even harder.

We expose httpbin over the TCP protocol:

 apiVersion: v1
kind: Service
metadata:
  name: httpbin
  namespace: foo
  labels:
    app: httpbin
    service: httpbin
spec:
  ports:
  - name: tcp
    port: 8000
    targetPort: 80
  selector:
    app: httpbin
---
apiVersion: networking.istio.io/v1alpha3
kind: Gateway
metadata:
  name: httpbin-gw
  namespace: foo
spec:
  selector:
    istio: ingressgateway # use istio default controller
  servers:
  - port:
      number: 8000
      name: tcp
      protocol: TCP
    hosts:
    - "*"
---
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: httpbin
  namespace: foo
spec:
  hosts:
    - "*"
  gateways:
    - httpbin-gw
  tcp:
    - match:
      - port: 8000
      route:
        - destination:
            port:
              number: 8000
            host: httpbin

Access httpbin through ingressgateway:

 # export GATEWAY_URL=$(kubectl -n istio-system get service istio-ingressgateway -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
# curl http://$GATEWAY_URL:8000/ip
{
  "origin": "172.17.0.54"
}

As you can see, the address seen by httpbin is the address of ingressgateway :

 # kubectl -n istio-system get pods -l istio=ingressgateway -o wide
NAME                                    READY   STATUS    RESTARTS   AGE     IP            NODE           NOMINATED NODE   READINESS GATES
istio-ingressgateway-5d5b776b7b-pxc2g   1/1     Running   0          3d15h   172.17.0.54   10.206.2.144   <none>           <none>

Although we have enabled the transparent proxy in httpbin envoy, the ingressgateway cannot pass the client source address through to httpbin envoy. An implementation based on envoy's Proxy Protocol support can solve this problem.

We enable Proxy Protocol support on both the ingressgateway and httpbin through EnvoyFilter:

 apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: ingressgw-pp
  namespace: istio-system
spec:
  configPatches:
  - applyTo: CLUSTER
    patch:
      operation: MERGE
      value:
        transport_socket:
          name: envoy.transport_sockets.upstream_proxy_protocol
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.transport_sockets.proxy_protocol.v3.ProxyProtocolUpstreamTransport
            config:
              version: V1
            transport_socket:
              name: "envoy.transport_sockets.raw_buffer"
  workloadSelector:
    labels:
      istio: ingressgateway
---
apiVersion: networking.istio.io/v1alpha3
kind: EnvoyFilter
metadata:
  name: httpbin-pp
  namespace: foo
spec:
  configPatches:
  - applyTo: LISTENER
    match:
      context: SIDECAR_INBOUND
    patch:
      operation: MERGE
      value:
        listener_filters:
        - name: envoy.filters.listener.proxy_protocol
          typed_config:
            "@type": type.googleapis.com/envoy.extensions.filters.listener.proxy_protocol.v3.ProxyProtocol
        - name: envoy.filters.listener.original_dst
        - name: envoy.filters.listener.original_src
  workloadSelector:
    labels:
      app: httpbin

Access httpbin through LB again:

 # curl http://$GATEWAY_URL:8000/ip
{
  "origin": "106.52.131.116"
}

httpbin gets the source IP of the client.

  • ingressgateway envoy log
 {"istio_policy_status":null,"protocol":null,"bytes_sent":262,"downstream_remote_address":"106.52.131.116:6093","start_time":"2022-05-30T03:33:33.759Z",
"upstream_service_time":null,"authority":null,"requested_server_name":null,"user_agent":null,"request_id":null,
"upstream_cluster":"outbound|8000||httpbin.foo.svc.cluster.local","upstream_transport_failure_reason":null,"duration":37,"response_code":0,
"method":null,"downstream_local_address":"172.17.0.54:8000","route_name":null,"upstream_host":"172.17.0.59:80","bytes_received":83,"path":null,
"x_forwarded_for":null,"upstream_local_address":"172.17.0.54:36162","response_flags":"-"}

As can be seen:

  • downstream_remote_address: 106.52.131.116:6093 ## Client source address
  • downstream_local_address: 172.17.0.54:8000
  • upstream_local_address: 172.17.0.54:36162 ## ingressgateway local address
  • upstream_host: 172.17.0.59:80 ## httpbin address
  • httpbin envoy log
 {"istio_policy_status":null,"response_flags":"-","protocol":null,"method":null,"upstream_transport_failure_reason":null,"authority":null,"duration":37,
"x_forwarded_for":null,"user_agent":null,"downstream_remote_address":"106.52.131.116:6093","downstream_local_address":"172.17.0.59:80",
"bytes_sent":262,"path":null,"requested_server_name":null,"upstream_service_time":null,"request_id":null,"bytes_received":83,"route_name":null,
"upstream_local_address":"106.52.131.116:34431","upstream_host":"172.17.0.59:80","response_code":0,"start_time":"2022-05-30T03:33:33.759Z","upstream_cluster":"inbound|80||"}

As can be seen:

  • downstream_remote_address: 106.52.131.116:6093 ## Client source address
  • downstream_local_address: 172.17.0.59:80 ## httpbin address
  • upstream_local_address: 106.52.131.116:34431 ## The client IP is reserved, the port is different
  • upstream_host: 172.17.0.59:80 ## httpbin address

It is worth noting that the upstream_local_address of httpbin envoy preserves the client IP, so the source address httpbin sees is the client's real IP.

  • data flow

Related Implementation Analysis

TPROXY

The kernel implementation of TPROXY can be found in net/netfilter/xt_TPROXY.c.

istio-iptables sets the following iptables rules to mark packets:

 -A PREROUTING -p tcp -j ISTIO_INBOUND
-A PREROUTING -p tcp -m mark --mark 0x539 -j CONNMARK --save-mark --nfmask 0xffffffff --ctmask 0xffffffff
-A OUTPUT -p tcp -m connmark --mark 0x539 -j CONNMARK --restore-mark --nfmask 0xffffffff --ctmask 0xffffffff
-A ISTIO_DIVERT -j MARK --set-xmark 0x539/0xffffffff
-A ISTIO_DIVERT -j ACCEPT
-A ISTIO_INBOUND -p tcp -m conntrack --ctstate RELATED,ESTABLISHED -j ISTIO_DIVERT
-A ISTIO_INBOUND -p tcp -j ISTIO_TPROXY
-A ISTIO_TPROXY ! -d 127.0.0.1/32 -p tcp -j TPROXY --on-port 15006 --on-ip 0.0.0.0 --tproxy-mark 0x539/0xffffffff

It is worth mentioning that TPROXY can redirect packets by itself without relying on NAT. In addition, combined with policy routing, packets whose destination is not local are still delivered via the local lo device:

 # ip rule list
0:    from all lookup local 
32765:    from all fwmark 0x539 lookup 133 
32766:    from all lookup main 
32767:    from all lookup default 

# ip route show table 133
local default dev lo scope host

More details about TPROXY can be found here .
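The rules shown above can be created with commands like the following (illustrative only, and they require root): traffic marked 0x539 is looked up in table 133, whose single entry delivers everything locally via lo.

```shell
# Route packets carrying the TPROXY mark through table 133
ip rule add fwmark 0x539 lookup 133
# Table 133: treat every destination as local, delivered on lo
ip route add local default dev lo table 133
```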

Implementation of Proxy Protocol in Envoy

  • proxy protocol header format

Version 1 (the human-readable header format) is used here, as follows:

 0000   50 52 4f 58 59 20 54 43 50 34 20 31 30 36 2e 35   PROXY TCP4 106.5
0010   32 2e 31 33 31 2e 31 31 36 20 31 37 32 2e 31 37   2.131.116 172.17
0020   2e 30 2e 35 34 20 36 30 39 33 20 38 30 30 30 0d   .0.54 6093 8000.
0030   0a                                                .

As you can see, the header includes the IP:PORT information of the client and ingressgateway. For a more detailed introduction, refer to here .
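The receiver side can be sketched by parsing exactly the header shown in the dump above (the function name is illustrative):

```python
# Sketch: parse a Proxy Protocol v1 header line into its address fields.
header = b"PROXY TCP4 106.52.131.116 172.17.0.54 6093 8000\r\n"

def parse_pp_v1(data: bytes):
    line, _, _ = data.partition(b"\r\n")
    parts = line.decode("ascii").split(" ")
    assert parts[0] == "PROXY" and parts[1] in ("TCP4", "TCP6")
    # src_ip, dst_ip, src_port, dst_port
    return parts[2], parts[3], int(parts[4]), int(parts[5])

print(parse_pp_v1(header))  # ('106.52.131.116', '172.17.0.54', 6093, 8000)
```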

  • ProxyProtocolUpstreamTransport

As the sender, the ingressgateway uses ProxyProtocolUpstreamTransport to build the Proxy Protocol header:

 /// source/extensions/transport_sockets/proxy_protocol/proxy_protocol.cc

void UpstreamProxyProtocolSocket::generateHeaderV1() {
  // Default to local addresses (used if no downstream connection exists e.g. health checks)
  auto src_addr = callbacks_->connection().addressProvider().localAddress(); 
  auto dst_addr = callbacks_->connection().addressProvider().remoteAddress();

  if (options_ && options_->proxyProtocolOptions().has_value()) {
    const auto options = options_->proxyProtocolOptions().value();
    src_addr = options.src_addr_;
    dst_addr = options.dst_addr_;
  }

  Common::ProxyProtocol::generateV1Header(*src_addr->ip(), *dst_addr->ip(), header_buffer_);
}
  • envoy.filters.listener.proxy_protocol

As the receiver, httpbin envoy configures the ListenerFilter envoy.filters.listener.proxy_protocol to parse the Proxy Protocol header:

 /// source/extensions/filters/listener/proxy_protocol/proxy_protocol.cc

ReadOrParseState Filter::onReadWorker() {
  Network::ConnectionSocket& socket = cb_->socket(); /// ConnectionHandlerImpl::ActiveTcpSocket
...
  if (proxy_protocol_header_.has_value() && !proxy_protocol_header_.value().local_command_) {
...
    // Only set the local address if it really changed, and mark it as address being restored.
    if (*proxy_protocol_header_.value().local_address_ !=
        *socket.addressProvider().localAddress()) { /// proxy protocol header: 172.17.0.54:8000
      socket.addressProvider().restoreLocalAddress(proxy_protocol_header_.value().local_address_); /// => 172.17.0.54:8000
    } /// Network::ConnectionSocket
    socket.addressProvider().setRemoteAddress(proxy_protocol_header_.value().remote_address_); /// change downstream_remote_address to 106.52.131.116
  }

  // Release the file event so that we do not interfere with the connection read events.
  socket.ioHandle().resetFileEvents();
  cb_->continueFilterChain(true); /// ConnectionHandlerImpl::ActiveTcpSocket
  return ReadOrParseState::Done;
}

It is worth noting here that when envoy.filters.listener.proxy_protocol parses the proxy protocol header, local_address is set from the header's dst_addr (172.17.0.54:8000) and remote_address from its src_addr (106.52.131.116); the pairing is exactly reversed.

After processing by proxy_protocol, the connection's downstream_remote_address has been changed to the client source address.

  • envoy.filters.listener.original_src

With sidecar.istio.io/interceptionMode: TPROXY, the virtualInbound listener adds envoy.filters.listener.original_src:

 # istioctl -n foo pc listeners deploy/httpbin --port 15006 -o json
[
    {
        "name": "virtualInbound",
        "address": {
            "socketAddress": {
                "address": "0.0.0.0",
                "portValue": 15006
            }
        },
        "filterChains": [...],
        "listenerFilters": [
            {
                "name": "envoy.filters.listener.original_dst",
                "typedConfig": {
                    "@type": "type.googleapis.com/envoy.extensions.filters.listener.original_dst.v3.OriginalDst"
                }
            },
            {
                "name": "envoy.filters.listener.original_src",
                "typedConfig": {
                    "@type": "type.googleapis.com/envoy.extensions.filters.listener.original_src.v3.OriginalSrc",
                    "mark": 1337
                }
            }
        ...
        ]
        "listenerFiltersTimeout": "0s",
        "continueOnListenerFiltersTimeout": true,
        "transparent": true,
        "trafficDirection": "INBOUND",
        "accessLog": [...]
    }
]

envoy.filters.listener.original_src modifies upstream_local_address via socket options, so that the client IP (the downstream_remote_address) is carried through to the upstream connection.

 /// source/extensions/filters/listener/original_src/original_src.cc

Network::FilterStatus OriginalSrcFilter::onAccept(Network::ListenerFilterCallbacks& cb) {
  auto& socket = cb.socket(); /// ConnectionHandlerImpl::ActiveTcpSocket.socket()
  auto address = socket.addressProvider().remoteAddress();   /// get downstream_remote_address
  ASSERT(address);

  ENVOY_LOG(debug,
            "Got a new connection in the original_src filter for address {}. Marking with {}",
            address->asString(), config_.mark());

...
  auto options_to_add =
      Filters::Common::OriginalSrc::buildOriginalSrcOptions(std::move(address), config_.mark()); 
  socket.addOptions(std::move(options_to_add)); /// Network::Socket::Options
  return Network::FilterStatus::Continue;
}
  • envoy.filters.listener.original_dst

In addition, httpbin envoy, as the receiving end of the ingressgateway, also configures the ListenerFilter envoy.filters.listener.original_dst on its virtualInbound listener. Let's look at what it does.

 // source/extensions/filters/listener/original_dst/original_dst.cc

Network::FilterStatus OriginalDstFilter::onAccept(Network::ListenerFilterCallbacks& cb) {
  ENVOY_LOG(debug, "original_dst: New connection accepted");
  Network::ConnectionSocket& socket = cb.socket();

  if (socket.addressType() == Network::Address::Type::Ip) { /// socket SO_ORIGINAL_DST option
    Network::Address::InstanceConstSharedPtr original_local_address = getOriginalDst(socket); /// origin dst address

    // A listener that has the use_original_dst flag set to true can still receive
    // connections that are NOT redirected using iptables. If a connection was not redirected,
    // the address returned by getOriginalDst() matches the local address of the new socket.
    // In this case the listener handles the connection directly and does not hand it off.
    if (original_local_address) { /// change local address to origin dst address
      // Restore the local address to the original one.
      socket.addressProvider().restoreLocalAddress(original_local_address);
    }
  }

  return Network::FilterStatus::Continue;
}

For istio, iptables intercepts the original request and redirects it to port 15006 (inbound) or 15001 (outbound), so the local address of the socket handling the request is not the request's original destination address. The original_dst ListenerFilter is responsible for restoring the socket's local address to the original destination address.
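What getOriginalDst() reads can be sketched in Python: with iptables redirection, the pre-rewrite destination is available via getsockopt(SOL_IP, SO_ORIGINAL_DST), which returns a struct sockaddr_in (the parsing helper below is illustrative):

```python
import socket
import struct

SO_ORIGINAL_DST = 80  # from <linux/netfilter_ipv4.h>

def parse_sockaddr_in(raw: bytes):
    """Decode the (family, port, addr) head of a struct sockaddr_in as
    returned by getsockopt(SOL_IP, SO_ORIGINAL_DST, 16)."""
    family, port = struct.unpack("=HH", raw[:4])  # port is big-endian in memory
    return socket.inet_ntoa(raw[4:8]), socket.ntohs(port)

# On a real redirected connection one would do (illustrative):
#   raw = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, 16)
raw = (struct.pack("=HH", socket.AF_INET, socket.htons(8000))
       + socket.inet_aton("172.17.0.54") + b"\x00" * 8)
print(parse_sockaddr_in(raw))  # ('172.17.0.54', 8000)
```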

For the virtualOutbound listener, istio does not add envoy.filters.listener.original_dst directly; it sets use_original_dst to true, and envoy then adds envoy.filters.listener.original_dst itself. The virtualOutbound listener also hands the request off to the listener matching the request's original destination address for processing.

For the virtualInbound listener, envoy.filters.listener.original_dst is added directly. Unlike the virtualOutbound listener, it only changes the local address to the original destination address and does not hand the request off to another listener (for inbound requests there is no listener bound to the destination address); the inbound request is in fact handled by the listener's FilterChain.

Refer to the code that generates the virtualInbound listener:

 // istio/istio/pilot/pkg/networking/core/v1alpha3/listener_builder.go

func (lb *ListenerBuilder) aggregateVirtualInboundListener(passthroughInspectors map[int]enabledInspector) *ListenerBuilder {
    // Deprecated by envoyproxy. Replaced
    // 1. filter chains in this listener
    // 2. explicit original_dst listener filter
    // UseOriginalDst: proto.BoolTrue,
    lb.virtualInboundListener.UseOriginalDst = nil
    lb.virtualInboundListener.ListenerFilters = append(lb.virtualInboundListener.ListenerFilters,
        xdsfilters.OriginalDestination, /// add envoy.filters.listener.original_dst
    )
    if lb.node.GetInterceptionMode() == model.InterceptionTproxy { /// TPROXY mode
        lb.virtualInboundListener.ListenerFilters =
            append(lb.virtualInboundListener.ListenerFilters, xdsfilters.OriginalSrc)
    }
...

summary

Based on TPROXY and Proxy Protocol, we can preserve the client source IP of Layer-4 protocols in istio.

about us

For more cases and knowledge about cloud native, you can pay attention to the public account of the same name [Tencent Cloud Native]~

Welfare:

① Reply [Manual] in the official account background to get the "Tencent Cloud Native Roadmap Manual" & "Tencent Cloud Native Best Practices"~

② Reply [Series] in the official account background to get the "15 series of 100+ super practical cloud native original articles", covering Kubernetes cost reduction and efficiency, K8s performance optimization practices, best practices, and more.

③ Reply [White Paper] in the official account background to get the "Tencent Cloud Container Security White Paper" & "The Source of Cost Reduction - Cloud Native Cost Management White Paper v1.0".

④ Reply [Introduction at the Speed of Light] in the official account background to get a 50,000-word essence tutorial from Tencent Cloud experts, a speed-of-light introduction to Prometheus and Grafana.

