Author: Chengtan

This year, the K8s Ingress Nginx project has disclosed three high-risk security vulnerabilities in succession (CVE-2021-25745 [1], CVE-2021-25746 [2], CVE-2021-25748 [3]), and the project recently announced that it will stop accepting new feature PRs in order to focus on fixes and stability. Ingress Nginx, as a gateway component of the K8s project, is installed by default in the K8s clusters of a large number of users. As widely deployed basic software at the edge of the network, it is bound to become an ideal target for attackers, and once that line of defense is broken, the cost is painful. Consider another basic component that also sits at the network boundary: the Heartbleed vulnerability in OpenSSL was not so long ago.

Shortly after the Heartbleed vulnerability was disclosed in 2014, OpenBSD began to maintain LibreSSL on its own, and Google launched BoringSSL. Both provide a more secure alternative based on the same set of SSL/TLS protocol standards as OpenSSL. Similarly, is there a more secure K8s gateway, based on the same K8s Ingress API standard, that can replace Ingress Nginx?

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-example
spec:
  rules:
  - host: foo.bar.com
    http:
      paths:
      - pathType: Prefix
        path: "/"
        backend:
          service:
            name: service1
            port:
              number: 80
  - host: bar.foo.com
    http:
      paths:
      - pathType: Prefix
        path: "/"
        backend:
          service:
            name: service2
            port:
              number: 80

Ingress API standard example

In the article "K8s Gateway Selection: Nginx or Envoy", we proposed a new option: the MSE cloud native gateway. This article goes on to analyze why the MSE cloud native gateway offers better security guarantees.

Ingress Nginx Architecture Design Flaws

Ingress Nginx container architecture (image from kubernetes.io)

The frequent security vulnerabilities in Ingress Nginx stem from its insecure architecture: the control plane Ingress Controller component (a Go program) and the data plane Nginx component run in the same container. The control plane plays an admin role here and necessarily handles sensitive information, such as the credentials used to communicate with the K8s API Server. Because the data plane shares a container with the control plane, attackers get an opportunity to reach this sensitive information through the data plane. For example:

K8s uses the RBAC mechanism to control access to the API Server, and the credential used for this is mounted through a volume at the /var/run/secrets/kubernetes.io/serviceaccount directory of the container. CVE-2021-25745 exploits the way the control plane splices together the nginx.conf configuration file: it injects configuration through the Ingress path so that Nginx exposes a static file route from which this credential can be read. The figure below shows what can be done with this credential:

Ingress Nginx credential permissions (image from blog.lightspin.io)

The figure above shows the ClusterRole permissions of the ingress-nginx ServiceAccount. Because the gateway needs to load TLS certificates, this role is allowed to view all Secrets in the cluster. With this credential, an attacker can obtain not only the private keys of all TLS certificates, but also every other Secret stored in the cluster!
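To make the shared-container point concrete, here is a minimal Python sketch, not taken from Ingress Nginx itself. It checks which ServiceAccount files a process can read under the mount path mentioned above. The file names `token`, `ca.crt`, and `namespace` are the ones Kubernetes normally projects into that directory; the helper name `readable_credentials` is hypothetical, and the demo simulates the mount with a temporary directory rather than a real pod.

```python
import tempfile
from pathlib import Path

# Mount point for a pod's ServiceAccount credentials (path from the text above).
SA_DIR = Path("/var/run/secrets/kubernetes.io/serviceaccount")

# File names Kubernetes normally places in that directory.
SA_FILES = ("token", "ca.crt", "namespace")


def readable_credentials(base: Path = SA_DIR) -> list[str]:
    """Return the credential files under `base` that this process can read.

    Hypothetical helper: any process in the container sees the same files,
    whether it belongs to the control plane or the data plane.
    """
    found = []
    for name in SA_FILES:
        try:
            (base / name).read_bytes()
        except OSError:
            continue  # missing or unreadable
        found.append(name)
    return found


if __name__ == "__main__":
    # Simulate the mounted volume with a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        base = Path(tmp)
        (base / "token").write_text("dummy-token")
        (base / "namespace").write_text("ingress-nginx")
        print(readable_credentials(base))  # ['token', 'namespace']
```

When the control plane and data plane run in separate containers, only the control plane container needs this volume, so a data plane process would see none of these files.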

The architectural root cause of the vulnerability (image from blog.lightspin.io)

In fact, CVE-2021-25746 and CVE-2021-25748, as well as the earlier CVE-2021-25742 [4], share this root cause. CVE-2021-25742 obtains the credential through Nginx configuration fragments supplied via custom snippets; CVE-2021-25746 injects Nginx configuration fragments through various Ingress annotations to obtain the credential; CVE-2021-25748 bypasses the regular-expression check that was added as the fix for CVE-2021-25745. It is nearly impossible to guard against them all.

The Ingress Nginx community recognizes the seriousness of this architectural problem and has begun planning to separate the control plane from the data plane. If the existing architecture is continued, more serious security vulnerabilities may be exposed in the future.

It is worth noting that this architecture causes more than the security problems above. When the container's CPU load is high, the control plane and data plane processes also compete with each other for CPU time, leading to a series of stability problems, such as:

  1. The livenessProbe, which is served by the control plane, times out and fails, causing the container to restart repeatedly
  2. When Prometheus metrics collection is enabled, the control plane cannot get enough CPU under high load and an OOM occurs, so the container is killed. For details, see the related issue at the end of the article [5]

A Safer Alternative - MSE Cloud Native Gateway

Control plane and data plane architecture of MSE cloud native gateway

As the figure above shows, the MSE cloud native gateway isolates the data plane (Envoy) from the control plane (Istio), which fundamentally avoids the problems described above. The MSE cloud native gateway is deployed as a managed service rather than inside the user's own K8s cluster. Even if a security vulnerability does appear, users can fix it easily through a one-click smooth upgrade, and a professional security team that collects vulnerability intelligence can provide fixes faster and more reliably than the open source project.

From the explanation of the CVEs above, it is not hard to see that the way Ingress Nginx controls the data plane, by splicing together nginx.conf on the control plane, also carries significant security risks. For example, consider an Ingress with a specially crafted path:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ingress-example
spec:
  rules:
  - http:
      paths:
      - pathType: Prefix
        # The {...} ellipsis below hides the configuration that could trigger the vulnerability
        path: "/inject{...}location /abc"
        backend:
          service:
            name: service
            port:
              number: 80

The following configuration snippet will appear in the generated nginx.conf:

location /inject{...}location /abc {
  set $ingress_name "ingress-example";
  ...
  ...
}

Readers familiar with Nginx configuration will recognize two location matching rules here: location /abc corresponds to the Ingress routing configuration above, while location /inject is an additional, injected block. Inside {...}, an attacker can write arbitrary location-level Nginx configuration, or even use highly flexible Lua scripts, to achieve whatever the injection is meant to do.
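The splicing problem can be reproduced in a few lines. The Python sketch below is an illustration only, not the actual Ingress Nginx template code: `render_location` is a hypothetical stand-in for template-based generation, and `is_safe_path` is an assumed allow-list check, not the project's real validation. The crafted path keeps the `{...}` placeholder from the example above as a literal string.

```python
import re


def render_location(path: str, ingress_name: str) -> str:
    """Naively splice an Ingress path into an Nginx location block."""
    return (
        f"location {path} {{\n"
        f'  set $ingress_name "{ingress_name}";\n'
        "}\n"
    )


# Hypothetical allow-list: a leading slash followed by URL path characters.
# Braces, semicolons, and whitespace are rejected because they can open,
# close, or terminate Nginx directives.
SAFE_PATH = re.compile(r"/[A-Za-z0-9\-._~/%]*")


def is_safe_path(path: str) -> bool:
    """Return True only if the whole path matches the allow-list."""
    return SAFE_PATH.fullmatch(path) is not None


crafted = "/inject{...}location /abc"
rendered = render_location(crafted, "ingress-example")

# One Ingress path produced two location directives in the output.
print(rendered.count("location "))  # 2
print(is_safe_path("/abc"))         # True
print(is_safe_path(crafted))        # False
```

An allow-list like this narrows the attack surface, but every spliced field needs its own check, and a single missed field is enough for an injection.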

Unlike Ingress Nginx, which controls the data plane by splicing together nginx.conf on the control plane, the cloud native gateway uses the more secure and reliable xDS protocol. Configuration is delivered as structured messages parsed through the xDS API instead of spliced strings, which fundamentally avoids the configuration injection that splicing makes possible and keeps configuration behavior explicit and predictable. The following is the proto definition used to deliver route matching rules to Envoy. In contrast to the spliced location directive of Ingress Nginx, it clearly restricts what a route match can configure.

message RouteMatch {
  option (udpa.annotations.versioning).previous_message_type = "envoy.api.v2.route.RouteMatch";
  message GrpcRouteMatchOptions {
    option (udpa.annotations.versioning).previous_message_type =
        "envoy.api.v2.route.RouteMatch.GrpcRouteMatchOptions";
  }
  message TlsContextMatchOptions {
    option (udpa.annotations.versioning).previous_message_type =
        "envoy.api.v2.route.RouteMatch.TlsContextMatchOptions";
    google.protobuf.BoolValue presented = 1;
    google.protobuf.BoolValue validated = 2;
  }
  // An extensible message for matching CONNECT requests.
  message ConnectMatcher {
  }
  reserved 5, 3;
  reserved "regex";
  oneof path_specifier {
    option (validate.required) = true;
    string prefix = 1;
    string path = 2;
    type.matcher.v3.RegexMatcher safe_regex = 10 [(validate.rules).message = {required: true}];
    ConnectMatcher connect_matcher = 12;
    string path_separated_prefix = 14 [(validate.rules).string = {pattern: "^[^?#]+[^?#/]$"}];
    string path_template = 15
        [(validate.rules).string = {min_len: 1 max_len: 256 ignore_empty: true}];
  }
  google.protobuf.BoolValue case_sensitive = 4;
  core.v3.RuntimeFractionalPercent runtime_fraction = 9;
  repeated HeaderMatcher headers = 6;
  repeated QueryParameterMatcher query_parameters = 7;
  GrpcRouteMatchOptions grpc = 8;
  TlsContextMatchOptions tls_context = 11;
  repeated type.matcher.v3.MetadataMatcher dynamic_metadata = 13;
}
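The effect of typed configuration can be sketched in Python as well. The class below is a deliberately tiny stand-in for the `path_specifier` oneof in the proto above, covering only `prefix` and `path`. It is not Envoy's API, and JSON is used here only as a convenient structured encoding in place of protobuf.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RouteMatch:
    """Simplified model of the path_specifier oneof: exactly one field is set."""

    prefix: Optional[str] = None
    path: Optional[str] = None

    def __post_init__(self) -> None:
        set_fields = [v for v in (self.prefix, self.path) if v is not None]
        if len(set_fields) != 1:
            raise ValueError("exactly one of prefix or path must be set")

    def to_json(self) -> str:
        """Encode the match as structured data; the path is only a value."""
        field = "prefix" if self.prefix is not None else "path"
        return json.dumps({"match": {field: getattr(self, field)}})


crafted = "/inject{...}location /abc"
encoded = RouteMatch(prefix=crafted).to_json()
decoded = json.loads(encoded)

# The crafted string round-trips as one opaque prefix value and
# cannot introduce a second route or any extra directive.
print(decoded["match"]["prefix"] == crafted)  # True
print(list(decoded["match"].keys()))          # ['prefix']
```

Because the receiver parses fields rather than text, the worst a crafted path can do is fail to match any request.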

It is also worth mentioning that, at present, many routing policy features of Ingress Nginx take effect only by updating nginx.conf and then reloading Nginx. During the reload, client connections are disconnected, which affects business in long-lived connection scenarios such as WebSocket. With Envoy's xDS configuration, routing policies take effect through RDS/ECDS, with no impact on long-lived connections.

To help users migrate smoothly from Ingress Nginx to the MSE cloud native gateway, the gateway is not only fully compatible with the K8s Ingress API standard but also compatible with the commonly used Ingress Nginx annotations; see the documentation [6] at the end of this article.

In addition, the plug-in market of cloud native gateways provides a variety of authentication and security protection plug-ins, which can enhance network security protection capabilities:

Cloud Native Gateway Plugin Market

Users can also dynamically extend gateway functionality in multiple languages (Go, JS, Rust, etc.) based on Wasm technology, without restarting the gateway. Thanks to Wasm's sandbox mechanism, even if your code dereferences a null pointer, it will not cause the gateway to crash. Ingress Nginx does not offer a comparably safe and simple extension mechanism.

Reference links:

[1] CVE-2021-25745: https://github.com/kubernetes/ingress-nginx/issues/8502

[2] CVE-2021-25746: https://github.com/kubernetes/ingress-nginx/issues/8503

[3] CVE-2021-25748: https://github.com/kubernetes/ingress-nginx/issues/8686

[4] CVE-2021-25742: https://github.com/kubernetes/ingress-nginx/issues/7837

[5] Related issues: https://github.com/kubernetes/ingress-nginx/pull/8397

[6] Documentation: https://help.aliyun.com/document_detail/424813.html

