1. Business Background
In today's era of information explosion, the Internet lets information flow freely around the world, giving rise to a wide variety of platforms and software systems. As the number of businesses grows, so does system complexity.
When a problem in a core business degrades the user experience and developers fail to notice it in time, it may be too late by the time it is discovered. Likewise, when a server's CPU usage keeps climbing or its disk fills up, operations staff need to find and handle the problem promptly. Both cases call for an effective system for monitoring and early warning.
How to monitor and maintain these services and servers is something developers and operations staff cannot ignore. This article is about 5,000 words long. It explains the principles behind vivo server monitoring and the evolution of its architecture, as a reference when selecting monitoring technology.
vivo server monitoring aims to provide one-stop data monitoring for server-side applications, covering system monitoring, JVM monitoring, and custom business indicator monitoring, together with real-time, multi-dimensional, multi-channel alarm services. It helps users keep track of the overall status of their applications, warns of faults early, and provides detailed data afterwards for tracing and locating problems, thereby improving service availability. To date, more than 200 business parties have onboarded to vivo server monitoring. This article covers server monitoring only; our company also has other excellent monitoring systems, including general monitoring, call chain monitoring, and client monitoring.
1.1 The basic process of a monitoring system
Whether it is an open source monitoring system or a self-developed monitoring system, the overall process is similar.
1) Data collection: This can include JVM monitoring data such as GC counts, thread counts, and the sizes of the old and young generations; system monitoring data such as disk usage, disk read/write throughput, network egress and ingress traffic, and the number of TCP connections; and business monitoring data such as error logs, access logs, video playback volume, PV, and UV.
2) Data transmission: The collected data is reported to the monitoring system as messages or over HTTP.
3) Data storage: Some systems use an RDBMS such as MySQL or Oracle, some use a time series database such as OpenTSDB or InfluxDB, and some store data directly in HBase.
4) Data visualization: Indicators are displayed graphically, as line charts, bar charts, pie charts, and so on.
5) Monitoring alarms: Alarm settings are flexible, and various notification channels such as email, SMS, and IM are supported.
1.2 How to use the monitoring system in a standardized way
Before using a monitoring system, we first need to understand how the monitored object works. For JVM monitoring, for example, we need to know the JVM memory structure and the common garbage collection mechanisms. Second, we need to decide how to describe and define the state of the monitored object. To monitor the interface performance of a business function, for example, we can track the interface's request volume, latency, and error count. Once the state to monitor is defined, we need to set reasonable alarm thresholds and alarm types, so that alarms help developers find faults in time. Finally, we need to establish a complete fault handling process, so that alarms get a quick response and online faults are handled promptly.
2. The architecture and evolution of vivo server monitoring system
Before introducing the vivo server monitoring system architecture, let's look at the OpenTSDB time series database, starting with why we chose it:
1) Each collected monitoring indicator has a single value at a given point in time, with no complex structure or relationships.
2) Monitoring indicators change over time.
3) OpenTSDB is a distributed, scalable time series database built on HBase, so the storage layer needs little extra effort and inherits HBase's high throughput and good scalability.
4) It is open source, implemented in Java, and provides an HTTP-based API, so problems can be diagnosed and the code modified quickly.
2.1 Introduction to OpenTSDB
1) OpenTSDB is a distributed, scalable time series database built on HBase, designed mainly for use in monitoring systems. It can, for example, collect monitoring data from large clusters (network devices, operating systems, and applications), store and query it, support second-level collection granularity and permanent storage, help with capacity planning, and integrate easily into an existing monitoring system. The system architecture diagram of OpenTSDB is as follows:
(from official documentation)
The basic unit of storage is the Data Point, that is, the value of a Metric at a certain point in time. A Data Point consists of the following parts:
- Metric, monitoring metric name;
- Tags, Metric tags, used to mark information such as machine names, including TagKey and TagValue;
- Value, the actual value corresponding to the Metric, an integer or a decimal;
- Timestamp, timestamp.
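To make the four parts concrete, here is a minimal sketch of a data point as a plain Java class. The class name `DataPointSketch`, the metric name `sys.cpu.usage`, and the tag values are hypothetical and chosen only for illustration; this is not OpenTSDB's own data point class.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative model of one data point (hypothetical, not an OpenTSDB class). */
final class DataPointSketch {
    final String metric;
    final Map<String, String> tags;
    final double value;
    final long timestampSeconds;

    DataPointSketch(String metric, Map<String, String> tags, double value, long timestampSeconds) {
        this.metric = metric;
        // Sort tags by key so that points with the same tags always render identically.
        this.tags = Collections.unmodifiableMap(new TreeMap<>(tags));
        this.value = value;
        this.timestampSeconds = timestampSeconds;
    }

    /** Renders the point as "metric timestamp value k1=v1 k2=v2". */
    String toLine() {
        StringBuilder sb = new StringBuilder();
        sb.append(metric).append(' ').append(timestampSeconds).append(' ').append(value);
        for (Map.Entry<String, String> e : tags.entrySet()) {
            sb.append(' ').append(e.getKey()).append('=').append(e.getValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> tags = new TreeMap<>();
        tags.put("host", "web01");
        tags.put("app", "demo");
        DataPointSketch p = new DataPointSketch("sys.cpu.usage", tags, 42.5, 1600000000L);
        // Prints: sys.cpu.usage 1600000000 42.5 app=demo host=web01
        System.out.println(p.toLine());
    }
}
```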
The core data lives in two tables: tsdb and tsdb-uid. The table tsdb stores the monitoring data, as shown below:
(Image source: https://www.jianshu.com )
The Row Key is composed of the Metric, the Timestamp truncated to the hour, and the TagKey and TagValue, each replaced by its corresponding byte mapping. Under column family t, the Qualifier is the number of seconds by which the Timestamp exceeds that hour, and the cell value is the Value.
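The split between the hour-aligned part of the timestamp (in the row key) and the remaining seconds (in the qualifier) can be sketched as follows. This is a simplified illustration: the 3-byte UID width and the sample UID bytes are assumptions for the example, and the real qualifier also carries encoding flags that are omitted here.

```java
import java.nio.ByteBuffer;

/** Simplified sketch of how a timestamp is split between row key and qualifier. */
final class RowKeyTimeSketch {
    static final int SECONDS_PER_HOUR = 3600;

    /** Hour-aligned base time, which goes into the row key. */
    static long baseHour(long timestampSeconds) {
        return timestampSeconds - (timestampSeconds % SECONDS_PER_HOUR);
    }

    /** Seconds past the base hour, which the column qualifier encodes. */
    static int offsetInHour(long timestampSeconds) {
        return (int) (timestampSeconds % SECONDS_PER_HOUR);
    }

    /** Concatenates metric UID + base hour (4 bytes) + tag key UID + tag value UID. */
    static byte[] rowKey(byte[] metricUid, long timestampSeconds, byte[] tagkUid, byte[] tagvUid) {
        ByteBuffer buf = ByteBuffer.allocate(metricUid.length + 4 + tagkUid.length + tagvUid.length);
        buf.put(metricUid);
        buf.putInt((int) baseHour(timestampSeconds));
        buf.put(tagkUid);
        buf.put(tagvUid);
        return buf.array();
    }

    public static void main(String[] args) {
        long ts = 1600000000L;
        System.out.println(baseHour(ts));     // 1599998400
        System.out.println(offsetInHour(ts)); // 1600
        // Hypothetical 3-byte UIDs standing in for the tsdb-uid mappings.
        byte[] key = rowKey(new byte[] {0, 0, 1}, ts, new byte[] {0, 0, 1}, new byte[] {0, 0, 2});
        System.out.println(key.length);       // 13
    }
}
```

All data points of one series within the same hour therefore share a row and differ only in their qualifier.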
The table tsdb-uid stores the byte mappings just mentioned, as shown below:
(Image source: https://www.jianshu.com )
In the figure, "001" maps to tagk=host or tagv=static, and the table supports both forward and reverse lookups.
2) How we use OpenTSDB:
- We do not use the REST interface provided by OpenTSDB; instead we connect to HBase directly through the client;
- The compaction thread (Thrd) is disabled on the engineering side;
- Data buffered in Redis is fetched and written to OpenTSDB in batches every 10 seconds.
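The batching policy in the last item can be sketched roughly as below. This is only an illustration under stated assumptions: an in-memory queue stands in for the Redis buffer, the `sink` callback stands in for the OpenTSDB write, and the class name `BatchWriterSketch` is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Buffers points and hands them to a sink in batches (in-memory stand-in for Redis). */
final class BatchWriterSketch {
    private final BlockingQueue<String> buffer = new LinkedBlockingQueue<>();
    private final Consumer<List<String>> sink;

    BatchWriterSketch(Consumer<List<String>> sink) {
        this.sink = sink;
    }

    void offer(String point) {
        buffer.offer(point);
    }

    /** Drains everything buffered so far and passes it to the sink as one batch. */
    int flushOnce() {
        List<String> batch = new ArrayList<>();
        buffer.drainTo(batch);
        if (!batch.isEmpty()) {
            sink.accept(batch);
        }
        return batch.size();
    }

    /** Schedules a flush every 10 seconds; the caller shuts the executor down. */
    ScheduledExecutorService start() {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor();
        ses.scheduleAtFixedRate(() -> flushOnce(), 10, 10, TimeUnit.SECONDS);
        return ses;
    }

    public static void main(String[] args) {
        BatchWriterSketch writer = new BatchWriterSketch(batch -> System.out.println("batch of " + batch.size()));
        writer.offer("point-1");
        writer.offer("point-2");
        writer.flushOnce(); // prints "batch of 2"; start() would do this on a timer
    }
}
```

Note that a task scheduled with `scheduleAtFixedRate` stops running if it throws, so a production sink would need its own error handling.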
2.2 Points that OpenTSDB needs to pay attention to in practice
1) Accuracy problem
// Bytes is from asynchbase; UniqueId is from OpenTSDB.
import org.hbase.async.Bytes;
import net.opentsdb.uid.UniqueId;

String value = "0.51";
float f = Float.parseFloat(value);
// Encode the float as its raw 4-byte IEEE 754 representation.
int raw = Float.floatToRawIntBits(f);
byte[] float_bytes = Bytes.fromInt(raw);
// Decode the bytes back; widening the float to a double exposes the rounding error.
int raw_back = Bytes.getInt(float_bytes, 0);
double decode = Float.intBitsToFloat(raw_back);
/**
 * Printed output:
 * Parsed Float: 0.51
 * Encode Raw: 1057132380
 * Encode Bytes: 3F028F5C
 * Decode Raw: 1057132380
 * Decoded Float: 0.5099999904632568
 */
System.out.println("Parsed Float: " + f);
System.out.println("Encode Raw: " + raw);
System.out.println("Encode Bytes: " + UniqueId.uidToString(float_bytes));
System.out.println("Decode Raw: " + raw_back);
System.out.println("Decoded Float: " + decode);
As the code above shows, OpenTSDB stores a floating-point value without knowing the precision the writer intended, so precision is lost when the stored float is decoded into a double: "0.51" is stored, and "0.5099999904632568" is read back.
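The effect can be reproduced with the JDK alone. The sketch below also shows one possible read-side workaround, which is our own illustration and not something the article's system is stated to do: converting through the float's shortest decimal string before widening.

```java
/** JDK-only reproduction of the float round-trip issue and one possible workaround. */
final class FloatPrecisionSketch {
    /** Widening a float directly exposes its binary approximation. */
    static double widen(float f) {
        return f;
    }

    /** Going through the float's shortest decimal string recovers the written value. */
    static double viaDecimalString(float f) {
        return Double.parseDouble(Float.toString(f));
    }

    public static void main(String[] args) {
        float f = Float.parseFloat("0.51");
        System.out.println(widen(f));            // 0.5099999904632568
        System.out.println(viaDecimalString(f)); // 0.51
    }
}
```

The workaround only restores the digits the float can represent; values that need more than float precision still have to be written as doubles.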
2) Aggregate function problem
Most OpenTSDB aggregation functions, including sum, avg, max, and min, use linear interpolation (LERP): missing values are filled in with interpolated ones, which is unfriendly when null values should be treated as such. For details, see the OpenTSDB documentation on interpolation.
At present, the OpenTSDB used by vmonitor server monitoring is built from source code that we modified: we added a nimavg function which, together with the built-in zimsum function, meets our needs for handling null values.
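The difference between interpolating and treating a missing point as zero can be seen in a small sketch. The method names are hypothetical and the code only mimics the arithmetic; it is not OpenTSDB's aggregator code. Series A has a value at the query time, while series B has points only before and after it.

```java
/** Arithmetic sketch of LERP-filled sums versus zero-if-missing sums. */
final class InterpolationSketch {
    /** Linear interpolation between (t0, v0) and (t1, v1), evaluated at t. */
    static double lerp(long t0, double v0, long t1, double v1, long t) {
        return v0 + (v1 - v0) * (t - t0) / (double) (t1 - t0);
    }

    /** Sum at time t where series B is missing at t and gets an interpolated value. */
    static double sumWithLerp(double a, long t0, double b0, long t1, double b1, long t) {
        return a + lerp(t0, b0, t1, b1, t);
    }

    /** Sum at time t where the missing point of series B counts as zero. */
    static double sumZeroIfMissing(double a) {
        return a + 0.0;
    }

    public static void main(String[] args) {
        // A reports 5 at t=60; B reports 10 at t=0 and 30 at t=120, nothing at t=60.
        System.out.println(sumWithLerp(5, 0, 10, 120, 30, 60)); // 25.0
        System.out.println(sumZeroIfMissing(5));                // 5.0
    }
}
```

With interpolation the sum at t=60 includes a value for B that was never reported, which is why a counter-like business indicator can look inflated.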
2.3 Principle of vivo server monitoring collector
1) Timer
There are three types of collectors: the OS collector, the JVM collector, and the business indicator collector. The OS and JVM collectors collect and aggregate once per minute, while the business indicator collector collects in real time and completes aggregation and reset within each 1-minute window. The data from the three collectors is packaged and reported to RabbitMQ; reporting is asynchronous and subject to a timeout.
2) Business indicator collector
Business indicators can be collected in two ways: log output filtering and code reporting (intrusive). Log output filtering works by extending the log4j filter to obtain the renderedMessage output by the Appender specified in the indicator configuration, and then collecting synchronously according to the keyword, aggregation method, and other settings configured for the indicator. Code reporting sends message information under the indicator code specified in the code; it is an intrusive collection method, implemented by calling the Util provided by monitoring. The business indicator configuration is refreshed from the CDN every 5 minutes. Several built-in aggregators are available: count, sum, average, max, and min.
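As a rough illustration of the five aggregators and the per-minute reset, consider the sketch below. The class `MetricAggregatorSketch` and its methods are hypothetical; the actual collector's classes are not shown in this article.

```java
/** Illustrative per-window aggregator (hypothetical; not the vmonitor implementation). */
final class MetricAggregatorSketch {
    /** Immutable result of one aggregation window. */
    static final class Snapshot {
        final long count;
        final double sum;
        final double average;
        final double max;
        final double min;

        Snapshot(long count, double sum, double max, double min) {
            this.count = count;
            this.sum = sum;
            // An empty window has no meaningful average; 0 is reported by convention here.
            this.average = count == 0 ? 0.0 : sum / count;
            this.max = max;
            this.min = min;
        }
    }

    private long count;
    private double sum;
    private double max = Double.NEGATIVE_INFINITY;
    private double min = Double.POSITIVE_INFINITY;

    /** Records one observed value in real time. */
    synchronized void record(double value) {
        count++;
        sum += value;
        max = Math.max(max, value);
        min = Math.min(min, value);
    }

    /** Returns the current window's aggregates and starts a fresh window. */
    synchronized Snapshot snapshotAndReset() {
        Snapshot s = new Snapshot(count, sum, max, min);
        count = 0;
        sum = 0.0;
        max = Double.NEGATIVE_INFINITY;
        min = Double.POSITIVE_INFINITY;
        return s;
    }

    public static void main(String[] args) {
        MetricAggregatorSketch agg = new MetricAggregatorSketch();
        agg.record(3);
        agg.record(9);
        agg.record(6);
        Snapshot s = agg.snapshotAndReset(); // would be called once per minute
        // Prints: 3 18.0 6.0 9.0 3.0
        System.out.println(s.count + " " + s.sum + " " + s.average + " " + s.max + " " + s.min);
    }
}
```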
2.4 The architecture design of the old version of vivo server monitoring
1) Data collection and reporting: The monitoring collector vmonitor-agent, integrated into the business application, collects data according to the monitoring indicator configuration and reports it to RabbitMQ once per minute. The indicator configuration is downloaded from the CDN every 5 minutes; the CDN content is uploaded by the monitoring background.
2) Computation and storage: The monitoring background receives data from RabbitMQ, splits it, and stores it in OpenTSDB for the visualization charts to query. The configuration of monitoring projects, applications, indicators, and alarms is stored in MySQL. A distributed task distribution module built on Zookeeper and Redis coordinates multiple monitoring services for distributed computing.
3) Alarm detection: Monitoring indicator data is fetched from OpenTSDB and checked for anomalies according to the alarm configuration. Anomalies are sent out through self-developed messages and SMS, both of which are third-party dependencies. Alarm detection is distributed across nodes by the distributed task distribution module.
2.5 vivo server monitoring old version deployment architecture
1) Self-built computer room A: Taking the deployment in China as an example, the monitoring project is deployed in self-built computer room A and listens for RabbitMQ messages in that room. Its dependencies (Redis, OpenTSDB, MySQL, Zookeeper, and so on) are all in the same room. The monitoring indicator configuration is uploaded to the CDN by the file service, for the monitored applications to pull.
2) Cloud computer room: Monitored applications in the cloud computer room report monitoring data to the local RabbitMQ there, which forwards the specified queue by routing to the RabbitMQ in self-built computer room A. The monitoring configuration for the cloud computer room is pulled through the CDN.
2.6 Architecture design of the new version of vivo server monitoring
1) Collection (access side): The business side integrates vmonitor-collector and configures the relevant monitoring items in the monitoring background of the corresponding environment to complete onboarding. vmonitor-collector periodically pulls the monitoring item configuration, collects service data, and reports it every minute.
2) Data aggregation: In the old version, RabbitMQ routed the collected data to the RabbitMQ in the monitoring room (no routing was needed within the same room), where it was consumed by the monitoring background service; the CDN carried each application's configuration for the applications to pull periodically. In the new version, vmonitor-gateway acts as the monitoring data gateway: monitoring data is reported and indicator configuration is pulled over HTTP, replacing RabbitMQ reporting and CDN configuration synchronization, so that a failure of either no longer affects monitoring reporting.
3) Visualization, alarms, and configuration (monitoring background vmonitor): Responsible for presenting data in the front end in various forms (business indicator data, per-room summary data, single-server data, and composite calculations over business indicators), data aggregation, alarms (currently SMS and self-developed messages), and so on.
4) Data storage: Storage uses an HBase cluster, with open source OpenTSDB as the intermediary for aggregation. After raw data is reported, it is persisted to the HBase cluster through OpenTSDB. Redis serves as distributed storage for scheduling information such as task allocation and alarm status. Indicator and alarm configurations are stored in MySQL.
3. Strategy for collecting, reporting, and storing monitoring data
To reduce the cost of onboarding and to avoid the impact of RabbitMQ reporting failures and CDN configuration synchronization failures on the monitoring system, the collection layer reports directly to the agent layer over HTTP, and queues in both the collection layer and the data agent layer preserve as much data as possible during a disaster.
The detailed process description is as follows:
1) The collector (vmonitor-collector) collects and compresses data every minute according to the monitoring configuration and stores it in a local queue (maximum length 100, that is, at most 100 minutes of data). It then triggers an HTTP report that sends the data to the gateway.
2) The gateway (vmonitor-gateway) authenticates the reported data and discards it if it is invalid. It also checks whether the circuit breaker for the downstream layer has tripped; if so, it notifies the collection layer to put the data back into its queue.
3) The gateway checks the version number of the monitoring configuration carried in the report. If it is out of date, the latest monitoring configuration is included in the response, and the collection layer is required to update its configuration.
4) The gateway stores the reported data in the Redis queue for the corresponding application (the cache queue for a single application's key holds at most 10,000 entries). Once the data is queued, the HTTP request returns immediately to indicate that the gateway has received the data, and the collection layer can remove that entry.
5) The gateway decompresses and aggregates the data in the Redis queue and then stores it to OpenTSDB over HTTP. If the circuit breaker has tripped, this processing is suspended; if a large number of storage operations fail, the circuit breaker is tripped.
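The collector-side queue from steps 1 and 4 can be sketched as a bounded buffer whose entries are removed only after the gateway acknowledges them. The class name is hypothetical, strings stand in for compressed payloads, and evicting the oldest entry when the queue is full is our assumption; the article does not say what happens on overflow.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Bounded local queue holding up to 100 one-minute batches awaiting acknowledgement. */
final class LocalReportQueueSketch {
    static final int MAX_LENGTH = 100;
    private final Deque<String> queue = new ArrayDeque<>();

    /** Adds one minute of data; assumption: the oldest batch is evicted when full. */
    synchronized void enqueue(String minuteBatch) {
        if (queue.size() >= MAX_LENGTH) {
            queue.pollFirst();
        }
        queue.addLast(minuteBatch);
    }

    /** Returns the next batch to report without removing it. */
    synchronized String peekOldest() {
        return queue.peekFirst();
    }

    /** Removes the oldest batch once the gateway has confirmed receipt. */
    synchronized void acknowledgeOldest() {
        queue.pollFirst();
    }

    synchronized int size() {
        return queue.size();
    }

    public static void main(String[] args) {
        LocalReportQueueSketch q = new LocalReportQueueSketch();
        for (int i = 0; i < 105; i++) {
            q.enqueue("minute-" + i);
        }
        // Prints: 100 minute-5
        System.out.println(q.size() + " " + q.peekOldest());
    }
}
```

Because a batch stays in the queue until it is acknowledged, a gateway outage shorter than the queue capacity loses no data when reporting resumes.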
4. Core Indicators
4.1 System monitoring alarms and service monitoring alarms
After the collected data is stored in HBase through OpenTSDB, alarm computation is distributed by the distributed task distribution module. When an alarm rule configured by the business party is met, the corresponding alarm is triggered, and the alarm information is grouped and routed to the right recipients. Alarms can be sent through SMS and self-developed messages, and recipients can be looked up by name, employee ID, or pinyin. When a large number of repeated alarms arrive, the duplicates can be suppressed. All alarm information is recorded in MySQL tables for later query and statistics. The purpose of alarming is not only to help developers find faults in time and establish an emergency response mechanism, but also to organize monitoring items and alarms around the characteristics of each business, drawing on the industry's best monitoring practices. The alarm flow chart is as follows:
4.2 Supported alarm types and calculation formulas
1) Maximum value: Triggers an alarm when the specified field exceeds this value (alarm threshold unit: number).
2) Minimum value: Triggers an alarm when the specified field falls below this value (alarm threshold unit: number).
3) Volatility: Compares the maximum or minimum value over the 15 minutes up to the current time with the average over the same 15 minutes, and alarms on the percentage deviation. Volatility requires a fluctuation baseline: the "alarm threshold" check is applied only when the value reaches the baseline, and no alarm is triggered if the value is below it (alarm threshold unit: percent).
Calculation formula:
Volatility - calculation formula for upward fluctuation: float rate = (float) (max - avg) / (float) avg;
Volatility - calculation formula for downward fluctuation: float rate = (float) (avg - min) / (float) avg;
Volatility - interval fluctuation calculation formula: float rate = (float) (max - min) / (float) max;
4) Day-over-day ratio: Compares the value at the current time with the value at the same time yesterday, and alarms on the percentage change (alarm threshold unit: percent).
Calculation formula: float rate = (current value - previous period value) / previous period value
5) Week-over-week ratio: Compares the value at the current time with the value at the same time on the same day last week, and alarms on the percentage change (alarm threshold unit: percent).
Calculation formula: float rate = (current value - previous period value) / previous period value
6) Hourly day-over-day ratio: Compares the sum of the values over the hour up to the current time with the sum over the same hour yesterday, and alarms on the percentage change (alarm threshold unit: percent).
Calculation formula: float rate = (float) (anHourTodaySum - anHourYesterdaySum) / (float) anHourYesterdaySum.
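The formulas above translate directly into code. The sketch below wraps each one in a method; the class and method names are hypothetical, and how the resulting rate is compared against the configured threshold and baseline is left out because the article does not spell it out.

```java
/** Direct transcription of the alarm rate formulas (names are hypothetical). */
final class AlarmRateSketch {
    /** Volatility, upward fluctuation: (max - avg) / avg. */
    static float upward(float max, float avg) {
        return (max - avg) / avg;
    }

    /** Volatility, downward fluctuation: (avg - min) / avg. */
    static float downward(float min, float avg) {
        return (avg - min) / avg;
    }

    /** Volatility, interval fluctuation: (max - min) / max. */
    static float interval(float max, float min) {
        return (max - min) / max;
    }

    /**
     * Day-over-day, week-over-week, and hourly day-over-day ratios all share
     * this shape: (current - previous) / previous.
     */
    static float periodOverPeriod(float current, float previous) {
        return (current - previous) / previous;
    }

    public static void main(String[] args) {
        // A zero denominator yields Infinity or NaN in float arithmetic,
        // so callers should guard against empty or all-zero windows.
        System.out.println(upward(150f, 100f));   // 0.5
        System.out.println(downward(50f, 100f));  // 0.5
        System.out.println(interval(150f, 50f));
        System.out.println(periodOverPeriod(120f, 100f));
    }
}
```

For example, a window with max 150 and average 100 gives an upward fluctuation of 0.5, that is, 50 percent.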
5. Demonstration effect
5.1 Business Indicator Data Query
1) In the query condition column "Indicator", you can select the specified indicator.
2) Double-click the indicator name on a chart to show an enlarged view; the figure at the bottom is the total of the indicator field from the selected start time.
3) The scroll wheel can zoom the chart.
5.2 System monitoring & JVM monitoring indicator data query
1) The page refreshes automatically every minute.
2) If an entire row, that is, the row for one machine, is shown in red, the machine has not reported data for more than half an hour. If the machine has gone offline abnormally, it should be investigated.
3) Click the Details button to query the system & JVM monitoring data in detail.
5.3 Configuration of business indicators
A single common monitoring indicator collects data from the log file of a single specified Appender.
[Required] [Indicator Type] There are two types: "common" and "composite". A composite indicator is a secondary aggregation of multiple common indicators, so normally you need to add the common indicators first.
[Required] [Chart Order] Ascending order; controls the display order of indicator charts on the data page.
[Required] [Indicator Code] A short UUID code, generated automatically by default.
[Optional] [Appender] The name of the log4j appender for the log file. The appender must be referenced by a logger's ref. Not needed if you use intrusive data collection.
[Optional] [Keyword] The keyword used to filter lines of the log file.
[Optional] [Delimiter] The symbol that separates the columns of a single log line, usually an ASCII comma "," or another symbol.
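How the Keyword and Delimiter settings might be applied to one log line is sketched below. The class name, the sample log line, and the `columnIndex` parameter are hypothetical and introduced only for illustration; the article does not describe how the value column is selected.

```java
import java.util.OptionalDouble;
import java.util.regex.Pattern;

/** Sketch of keyword filtering and delimiter splitting on a single log line. */
final class LogLineFilterSketch {
    /**
     * Returns the numeric value in the given column if the line contains the
     * keyword, or empty if the line does not match or the column is not a number.
     */
    static OptionalDouble extract(String line, String keyword, String delimiter, int columnIndex) {
        if (!line.contains(keyword)) {
            return OptionalDouble.empty();
        }
        // Quote the delimiter so characters such as "|" are not treated as regex syntax.
        String[] columns = line.split(Pattern.quote(delimiter));
        if (columnIndex < 0 || columnIndex >= columns.length) {
            return OptionalDouble.empty();
        }
        try {
            return OptionalDouble.of(Double.parseDouble(columns[columnIndex].trim()));
        } catch (NumberFormatException e) {
            return OptionalDouble.empty();
        }
    }

    public static void main(String[] args) {
        OptionalDouble v = extract("order_paid,1001,35.5", "order_paid", ",", 2);
        System.out.println(v.isPresent() ? v.getAsDouble() : "no match"); // 35.5
    }
}
```

The extracted value would then be fed to whichever aggregator the indicator is configured with.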
6. Comparison of mainstream monitoring
6.1 Zabbix
Zabbix was born in 1998. Its core components are written in C and its web front end in PHP. It is an outstanding representative of the veteran monitoring systems: it can monitor network parameters, server health, and software integrity, and is widely used.
Zabbix stores data in a relational database such as MySQL. It has no equivalent of OpenTSDB's tags, so it cannot aggregate statistics or configure alarms across multiple dimensions, which makes it inflexible. Zabbix does not provide a corresponding SDK, its support for application-layer monitoring is limited, and it lacks the intrusive instrumentation and collection that our self-developed monitoring provides.
In general, Zabbix is mature, but its high degree of integration makes it less flexible. As monitoring complexity grows, customization becomes harder, and the relational database it relies on becomes a problem for inserting and querying monitoring data at scale.
6.2 Open-Falcon
Open-Falcon is an enterprise-grade, highly available, scalable open source monitoring solution that provides real-time alarming, data monitoring, and other functions. It is developed in Go and Python and was open sourced by Xiaomi. With Open-Falcon it is easy to monitor the overall state of servers, such as disk space, port liveness, and network traffic. Through its Proxy-gateway, application-layer monitoring (such as interface request volume and latency) and other custom monitoring needs can be implemented with self-instrumented points, and integration is convenient.
The official architecture diagram is as follows:
6.3 Prometheus
Prometheus is an open source monitoring and alerting system with a built-in time series database (TSDB), originally developed at SoundCloud. It is written in Go and was inspired by Google's Borgmon monitoring system.
Like Xiaomi's Open-Falcon, Prometheus took OpenTSDB as a reference and introduced tags into its data model, which supports multi-dimensional aggregation and alarm rules and greatly improves usability. Monitoring data is stored directly in the local time series database of the Prometheus Server, and a single instance can handle millions of metrics. The architecture is simple and does not depend on external storage; a single server node can work on its own.
The official architecture diagram is as follows:
6.4 vivo server monitoring vmonitor
vmonitor is the monitoring background management system. It provides visual data viewing, alarm configuration, business indicator configuration, and so on, and covers JVM monitoring, system monitoring, and business monitoring. Thanks to the queues in the collection layer (the vmonitor-collector collector) and the data agent layer (the vmonitor-gateway gateway), as much data as possible is preserved during a disaster.
An SDK is provided for easy integration by business parties. It supports application-layer monitoring through log output filtering and intrusive code reporting. Storage is based on the open source OpenTSDB time series database with modified source code: a nimavg function was added which, together with the built-in zimsum function, meets the requirements for handling null values. With its strong data aggregation capabilities, vmonitor can provide real-time, multi-dimensional, multi-channel alarm services.
7. Summary
This article has introduced the design and evolution of the vivo server monitoring architecture, a real-time monitoring system built on the Java technology stack. It has also briefly surveyed several mainstream monitoring systems in the industry, in the hope of helping readers understand monitoring systems and make better choices during technology selection.
Monitoring covers a wide range of topics and is a large, complex field. This article only covers JVM monitoring, system monitoring, and business monitoring (including log monitoring and intrusive code reporting) within server monitoring; it does not cover client monitoring, full-link monitoring, and so on. To understand the subject thoroughly, you need to combine theory with practice and dig deeper.
Author: vivo internet server team - Deng Haibo