Introduction: Cloud Log Service (SLS) draws on the characteristics and application scenarios of Kubernetes logs to provide end-to-end practices for log collection, processing, and analysis in containerized microservice environments.

Go to the best practice: [ architecture log collection operation and maintenance management best practice ]

The importance of the Kubernetes logging system

Logging is a key pillar of cloud native observability for microservices. Collecting, storing, and analyzing logs is fundamental to building a modern platform: it helps teams diagnose problems, review quality after the fact, and monitor operational efficiency. With container/Kubernetes technology now widely adopted, the logging system plays an equally critical role for Kubernetes. DevOps, operations, and security teams all depend on complete, diverse, and effective log collection, storage management, and analysis, as shown in the figure below.

1.png

Challenges of log collection, operation and maintenance management under a microservice architecture

Compared with physical machines and VMs, container/Kubernetes technology makes application deployment and delivery simpler, lighter, and more cost-effective as microservices are put into practice. During the transition to microservices, however, containerized and non-containerized applications are often deployed side by side. For applications deployed on VMs or physical machines, log collection technology is relatively mature, with well-established tools such as Logstash, Fluentd, and Filebeat. Once applications are containerized, and especially when microservice applications are deployed on a Kubernetes cluster, log collection, operation and maintenance pose many challenges for users. The main reasons are:

  • There are many collection targets: host logs, container logs, and container stdout all need to be collected. Applications produce many kinds of data in many log formats, and there is no unified one-stop log collection solution;
  • Clusters scale elastically, the environment is highly dynamic, and services migrate dynamically. Keeping collection in step with these changes while preserving data integrity is a major challenge;
  • Operation and maintenance costs are very high. Some existing solutions can only collect logs by combining multiple pieces of software. The stability of a system assembled this way is hard to guarantee, and it lacks centralized management, configuration, and monitoring, which makes the operation and maintenance burden heavy;
  • Log collection agents are highly intrusive. Docker Driver extensions require modifying the underlying engine, and running one collection agent per container causes resource contention and waste;
  • Log collection performance is low. A single Docker Engine typically runs dozens or even hundreds of containers, and at that scale the collection performance and resource consumption of open source log collection agents are unsatisfactory;
  • Log analysis is inefficient and its methods are limited. Open source log analysis and display tools lack simple, effective visualization for real-time and intelligent log analysis.

2.png

Alibaba Cloud Kubernetes log collection solution

Based on the above analysis, Alibaba Cloud's Log Service addresses the needs and pain points of log collection, operation and maintenance management that users face when implementing microservice applications on Kubernetes. Combined with other Alibaba Cloud products, it offers a one-stop solution for log collection, operation and maintenance management, and analysis. The solution provides powerful log processing and analysis capabilities, such as real-time query over PB-level logs, log clustering analysis, Ingress log analysis reports, log analysis functions, and integration with upstream and downstream ecosystems. Together these give users a one-stop capability for log collection, operation and maintenance management as they adopt container/Kubernetes technology and move applications to microservices.

3.png

  • Ingress log solution
    Ingress in Kubernetes is an API resource declaration; an Ingress Controller is required to act on the Ingress definitions. Popular Ingress Controller implementations include Nginx, Traefik, Istio, and Kong. In China, the Nginx Ingress Controller is the most widely adopted.
    Logging and monitoring are basic functions provided by all Ingress Controllers. Logs generally include the Access Log, Controller Log, and Error Log. Monitoring mainly extracts metrics from the logs and from the Controller. Of this data, the access log has the largest volume, carries the most information, and is the most valuable. A layer-7 access log generally includes the URL, source IP, UserAgent, status code, inbound traffic, outbound traffic, response time, and so on. For an Ingress Controller, this kind of forwarding log also includes additional information such as the name of the service the request was forwarded to and the service response time. From this information we can derive many useful results, such as: PV and UV of website visits; geographical and device distribution of visits; error ratio of website visits; response latency of back-end services; and the distribution of visits across different URLs. However, building and operating a complete Ingress log analysis and monitoring system by hand is very complicated. Such a system requires many modules: deploying a log collection agent and configuring collection and parsing rules; deploying a real-time data analysis engine such as Elasticsearch or ClickHouse; deploying visualization components such as Grafana or Kibana and building reports; and deploying an alerting module such as ElastAlert and configuring alert rules. Because Kubernetes clusters carry relatively heavy traffic, a buffer queue such as Redis or Kafka is also needed.
    To lower the barrier to Ingress log analysis and monitoring, Alibaba Cloud Container Service and Log Service are integrated for Ingress: applying a single YAML resource deploys a complete Ingress log solution covering log collection, analysis, and visualization. 4.png
    5.png
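The kinds of analysis listed above (PV, UV, error ratio, back-end latency, visits per URL) can be illustrated with a minimal Python sketch. It is not the Log Service implementation. The space-separated line layout, the field order, the sample data, and the function names are all hypothetical, UV is approximated by the number of distinct source IPs, and only 5xx status codes are counted as errors.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical, simplified access log: ip method url status latency_seconds service
SAMPLE_LOG = """\
203.0.113.7 GET /api/orders 200 0.120 orders-svc
203.0.113.7 GET /api/orders 500 0.480 orders-svc
198.51.100.4 GET /api/users 200 0.060 users-svc
198.51.100.9 POST /api/orders 200 0.200 orders-svc
"""


def parse_line(line):
    """Split one line of the assumed format into a record dict."""
    ip, method, url, status, latency, service = line.split()
    return {
        "ip": ip,
        "method": method,
        "url": url,
        "status": int(status),
        "latency": float(latency),
        "service": service,
    }


def summarize(log_text):
    """Compute PV, UV, error ratio, visits per URL, and average latency per service."""
    records = [parse_line(line) for line in log_text.splitlines() if line.strip()]
    pv = len(records)                               # page views: one per request
    uv = len({r["ip"] for r in records})            # unique visitors: distinct source IPs (assumption)
    errors = sum(1 for r in records if r["status"] >= 500)  # 5xx counted as errors (assumption)

    latencies = defaultdict(list)
    for r in records:
        latencies[r["service"]].append(r["latency"])

    return {
        "pv": pv,
        "uv": uv,
        "error_ratio": errors / pv if pv else 0.0,
        "visits_by_url": dict(Counter(r["url"] for r in records)),
        "avg_latency_by_service": {s: mean(v) for s, v in latencies.items()},
    }


if __name__ == "__main__":
    print(summarize(SAMPLE_LOG))
```

A real Ingress access log has more fields and a different layout, so the parsing step would need to match the Controller's configured log format.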
  • Kubernetes container log collection analysis and monitoring
    Logs are an indispensable part of any system. The official Kubernetes documentation also introduces a variety of log collection methods. In summary, there are three main methods: native mode, DaemonSet mode, and SideCar mode.

    • Native method: Use kubectl logs to directly view the locally retained logs, or redirect the logs to files, syslog, fluentd and other systems through the logging driver of docker engine.
    • DaemonSet method: Deploy a log agent on each node of the Kubernetes cluster, and the agent collects logs from all containers to the server.
    • SideCar mode: Run a SideCar log agent container in a container group (Pod) to collect the logs generated by the main container of that Pod. Log collection in SideCar mode relies on a log directory shared between Logtail and the business container. The business container writes its logs to the shared directory, and Logtail monitors the log files in that directory for changes and collects the logs.
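The shared-directory idea behind SideCar mode can be sketched in Python as follows. This only illustrates the general "remember an offset, read what was appended" technique and makes no claim about how Logtail works internally. The function name `read_new_lines` and the offset-based approach are assumptions made for this example.

```python
import os


def read_new_lines(path, offset):
    """Return (lines, new_offset) for complete lines appended to `path` after `offset`.

    A collector would call this repeatedly on files in the shared directory,
    storing the returned offset between calls.
    """
    if not os.path.exists(path):
        return [], offset

    # If the file shrank, assume it was truncated or rotated and start over.
    if os.path.getsize(path) < offset:
        offset = 0

    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()

    # Consume only up to the last newline so a half-written line is left for later.
    end = chunk.rfind(b"\n")
    if end == -1:
        return [], offset

    complete = chunk[: end + 1]
    lines = complete.decode("utf-8", errors="replace").splitlines()
    return lines, offset + len(complete)
```

In a Pod, the business container and the agent container would both mount the same volume, so the path passed here would be a file under that shared mount.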

    The comparison of collection methods is shown in the table below. 8.png
    As the table above shows, the native method is relatively weak and is generally not recommended for production systems. The DaemonSet method has a much smaller resource footprint, but its scalability and tenant isolation are limited, so it is better suited to clusters with a single function or only a few business workloads. The SideCar method occupies more resources but offers strong flexibility and multi-tenant isolation, so it is recommended for large-scale Kubernetes clusters or for clusters that serve multiple business parties as a PaaS platform. In practice, collection can usually be deployed as follows:

    • Core applications: Use the SideCar method to collect.
    • Common applications/system logs: Use the DaemonSet method to collect.
    • Standard output: Use the DaemonSet method to collect. 6.png
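The three suggestions above can be written down as a small lookup, shown here as a Python sketch. The category keys and the function name `recommend_collection_mode` are hypothetical labels chosen for this example. They are not part of any Alibaba Cloud or Kubernetes API.

```python
# Mapping of log source category (hypothetical labels) to the suggested method.
RECOMMENDED_MODE = {
    "core_application": "SideCar",
    "common_application": "DaemonSet",
    "system_log": "DaemonSet",
    "stdout": "DaemonSet",
}


def recommend_collection_mode(log_source):
    """Return the suggested collection method for a log source category."""
    try:
        return RECOMMENDED_MODE[log_source]
    except KeyError:
        # Unknown categories are rejected rather than silently defaulted.
        raise ValueError(f"unknown log source category: {log_source!r}") from None
```

For example, `recommend_collection_mode("stdout")` returns `"DaemonSet"`, matching the suggestion for standard output.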

Summary

This article introduces log collection, operation and maintenance management solutions for moving applications to microservices on Kubernetes. Due to space limitations, it cannot cover specific implementation recommendations and every feature in detail. For details, see the best practice "architecture log collection operation and maintenance management best practices" in the Best Practice Channel on the Alibaba Cloud official website.

