Hello everyone, my name is Yang Chenggong.
The previous article explained why the front end needs a monitoring system and what it is for. After reading it, some readers left messages asking for implementation details, so this article starts to introduce how front-end monitoring is actually built.
If you are still unsure why monitoring matters or what it is used for, I recommend reading the previous article first: Why can't there be no monitoring system in the front end?
Before implementing anything, you should have the overall picture in mind and understand the concrete steps for building front-end monitoring. A front-end monitoring system is really a complete full-stack project, not just front-end work, and most of the implementation actually revolves around data.
One more point to clarify: the implementation in this article targets ordinary business scenarios and self-built systems at small and medium-sized companies. I have seen the monitoring systems built by large companies. They are complex and powerful, routinely handle hundreds of millions of records, and eventually become big-data projects. Here I only cover how to implement the main functions and solve the core problems.
The construction process of front-end monitoring is divided into the following stages:
- Acquisition phase: data collection
- API stage: build an API application and receive the collected data
- Data storage stage: API application connects to the database and stores the collected data
- Query and statistics stage: query, count and analyze the collected data
- Visualization stage: The front-end queries statistical data through the API and displays it visually
- Alarm stage: the API integrates with alarm notification services, such as DingTalk
- Deployment stage: the overall deployment of the application goes live
Below I will sort out the key implementation ideas of each stage.
Collection phase: What data to collect?
The first step in monitoring is to collect data. Having data is the prerequisite for monitoring.
The point of collecting data is to record what users really do while using the product. Combined with our analysis in the previous article, the data generated by real operations can be divided into two categories: abnormal data and behavioral data.
Let's analyze the abnormal data first. Exceptions in a project can generally be divided into two categories: front-end exceptions and interface exceptions.
Front-end exception
Front-end exceptions can be roughly divided into:
- JS code execution exception
- Promise exception
- Static resource loading exception
- console.error exception
- Cross-domain exception
The most important category, and the one we encounter most often, is JS code execution exceptions, such as type errors and reference errors. Most of these are caused by careless coding, so collecting them helps us improve code quality.
Then there is the Promise exception. Promise is one of the most important features of ES6. It tests our asynchronous programming ability in JS, and it shows up mainly in interface requests. Capturing exceptions from both JS execution and Promises is therefore critical.
In addition, the abnormal loading of static resources generally refers to referencing some image addresses in html, third-party js addresses, etc., which cannot be loaded normally for various reasons, and this should also be monitored.
The console.error exception generally comes from third-party front-end frameworks, which define some custom errors and report them through console.error. Such exceptions also need to be captured.
As for cross-domain exceptions, we run into them often, usually during joint front-end and back-end debugging. However, we cannot rule out that a back-end configuration change made online suddenly causes cross-domain failures in the front end. To be safe, these should be monitored as well.
These five types make up front-end exception collection, and they basically cover more than 90% of the abnormal situations in the front end.
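As a rough illustration of the collection step, the sketch below normalizes three browser signals (the `error` event, the `unhandledrejection` event, and calls to `console.error`) into one record shape. The record fields and function names are assumptions made for this example, not a fixed format, and cross-domain request failures are left out because they normally surface through the request layer.

```typescript
// Hypothetical record shape for a collected front-end exception.
type FrontendErrorKind = "js" | "promise" | "resource" | "console";

interface FrontendErrorRecord {
  kind: FrontendErrorKind;
  message: string;
  source?: string; // script or resource URL, when known
  time: number; // epoch milliseconds
}

// Minimal structural view of the browser "error" event, so no DOM typings are needed.
interface ErrorEventLike {
  message?: string;
  filename?: string;
  target?: { tagName?: string; src?: string; href?: string } | null;
}

// Resource load failures arrive with an element as the event target;
// JS runtime errors carry a message and the file name instead.
function fromErrorEvent(ev: ErrorEventLike, now: number = Date.now()): FrontendErrorRecord {
  const url = ev.target?.src ?? ev.target?.href;
  if (ev.target?.tagName && url) {
    const tag = ev.target.tagName.toLowerCase();
    return { kind: "resource", message: `Failed to load ${tag}`, source: url, time: now };
  }
  return { kind: "js", message: ev.message ?? "Unknown error", source: ev.filename, time: now };
}

// The rejection reason may be an Error or any other value.
function fromRejection(reason: unknown, now: number = Date.now()): FrontendErrorRecord {
  const message = reason instanceof Error ? reason.message : String(reason);
  return { kind: "promise", message, time: now };
}

function fromConsoleError(args: unknown[], now: number = Date.now()): FrontendErrorRecord {
  return { kind: "console", message: args.map(String).join(" "), time: now };
}

// Browser wiring. It does nothing when no global event target exists.
function installCollectors(report: (record: FrontendErrorRecord) => void): void {
  const g = globalThis as any;
  if (typeof g.addEventListener !== "function") return;
  // Capture phase (third argument true) is needed to see resource load errors.
  g.addEventListener("error", (ev: ErrorEventLike) => report(fromErrorEvent(ev)), true);
  g.addEventListener("unhandledrejection", (ev: { reason: unknown }) => report(fromRejection(ev.reason)));
  const original = console.error;
  console.error = (...args: unknown[]) => {
    report(fromConsoleError(args));
    original.apply(console, args);
  };
}
```

Keeping the normalizing functions separate from the browser wiring makes them easy to test without a page, and `report` can later be pointed at whatever reporting API is built in the next stage.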
Interface exception
Interface exceptions belong to the back-end exceptions, but interface exceptions will directly cause front-end page errors. Therefore, such exceptions are an important basis for us to determine the root cause of online problems. Interface exceptions can be classified according to the response results:
- Unresponsive/Timeout response exception
- 4xx request exception
- 5xx server exception
- Insufficient permissions
Sometimes, because of network or server problems, the front end receives no response after sending a request, and the request hangs. This is a no-response/timeout exception. For this type of exception, we can set a maximum request time, actively abort the request after it times out, and add an interface timeout record.
Other types of interface exceptions can be identified by the HTTP status code or by a specific field returned by the backend, such as error_code.
Whether you use the status code or another method does not matter much, as long as the exception type can be distinguished.
The 4xx exception type is a request exception, which is generally a problem with parameters passed by the front end, or a problem with interface validation parameters. The key to handling such exceptions is to save the request parameters, which can facilitate front-end troubleshooting.
5xx errors are exceptions handled internally by the server. The key information of such exceptions is the time of the error report and the returned exception description. Saving these can facilitate the backend to find logs.
I think insufficient permissions is also an important type of error. The permission design of some management systems is fairly complicated, and sometimes an interface inexplicably stops responding to calls, which blocks the user's next operation. This also needs to be recorded and tracked.
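The classification above can be sketched as a small pure function. This is only an illustration under stated assumptions: treating 401 and 403 as "insufficient permissions" is my choice for the example, the type and field names are hypothetical, and a backend-specific field such as error_code could be mapped in the same way.

```typescript
type ApiErrorKind = "timeout" | "permission" | "request" | "server";

interface ApiOutcome {
  status?: number; // HTTP status code; undefined when no response arrived
  timedOut: boolean; // true when the front end aborted the request after the time limit
}

// Hypothetical description of the call that was made.
interface ApiCall {
  url: string;
  params: unknown;
  responseText?: string;
}

interface ApiErrorRecord {
  kind: ApiErrorKind;
  url: string;
  time: number;
  params?: unknown; // kept for request and permission errors, to help front-end troubleshooting
  description?: string; // kept for 5xx errors, to help the backend find logs
}

// Returns null when the outcome is not an exception.
function classifyApiOutcome(outcome: ApiOutcome): ApiErrorKind | null {
  if (outcome.timedOut || outcome.status === undefined) return "timeout";
  const status = outcome.status;
  if (status === 401 || status === 403) return "permission"; // assumption: these codes mean no permission
  if (status >= 400 && status < 500) return "request";
  if (status >= 500) return "server";
  return null;
}

function buildApiErrorRecord(call: ApiCall, outcome: ApiOutcome, now: number = Date.now()): ApiErrorRecord | null {
  const kind = classifyApiOutcome(outcome);
  if (kind === null) return null;
  const record: ApiErrorRecord = { kind, url: call.url, time: now };
  if (kind === "request" || kind === "permission") record.params = call.params;
  if (kind === "server") record.description = call.responseText;
  return record;
}
```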
Behavioral data
Behavioral data is relatively broad, and any meaningful operation of the user can be defined as behavioral data.
For example: when a button was clicked, how long the user stayed on a page, the click-through rate of a new feature, when a feature is used, and so on. One advantage of a self-developed monitoring system is flexibility. Any useful information you need can be designed in at this stage.
This stage is very critical and is the core of the monitoring system design, so I wrote it in detail, and everyone should consider more about which data to collect at this stage. The later stages are based on the specific implementation of this design.
API stage: build an API interface for reporting data
In the previous stage, a plan for data collection was prepared. After the data is collected, the next step is to report the data.
To put it bluntly, data reporting is to transfer the data by calling an API interface and then store it in the database. Therefore, the task of this stage is to build an API interface application for reporting data.
As a proud front-end engineer, I naturally choose Node.js, which belongs to the JS family, for developing the interface. Node.js has many frameworks. I prefer something lightweight and concise where I install only what I need, so I chose the simple, classic Express framework.
The things to do to build an API application are:
- Directory structure design
- Routing design
- Authentication
- Parameter validation
- Request/response encapsulation
- Error handling
There are also some details to deal with. This stage is a very good learning opportunity for students with weak back-end foundations.
I highly recommend that the front-end friends master some basic knowledge of the back-end, at least understand what is going on in terms of simple principles. This stage is mainly to understand how the API application is built, why each part is done, and what problems can be solved, so that your basic knowledge of the backend will be established.
After the framework is set up, the main work is to design the interface URLs and write the processing logic, making sure the interfaces designed in this step can be called successfully and can receive data.
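To make the pieces listed above more concrete, here is a framework-free sketch of a report handler that covers authentication, parameter validation, a uniform response envelope, and error handling. It deliberately avoids Express so that it stays self-contained; in an Express app the same function would be called from a route handler. The token value, field names, and envelope format are all assumptions for illustration.

```typescript
// Uniform response envelope (hypothetical format).
interface ApiResponse {
  status: number; // HTTP status to send
  body: { code: number; message: string; data: unknown };
}

function ok(data: unknown): ApiResponse {
  return { status: 200, body: { code: 0, message: "ok", data } };
}

function fail(status: number, message: string): ApiResponse {
  return { status, body: { code: status, message, data: null } };
}

interface ReportBody {
  type: string;
  message: string;
}

interface ReportRequest {
  token?: string; // for example, read from a request header
  body: unknown; // parsed JSON body
}

// Assumption: a single shared token; a real system would look this up per project.
const EXPECTED_TOKEN = "demo-token";

function isReportBody(value: unknown): value is ReportBody {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v["type"] === "string" && typeof v["message"] === "string";
}

function handleReport(req: ReportRequest, save: (record: ReportBody) => void): ApiResponse {
  // Authentication
  if (req.token !== EXPECTED_TOKEN) return fail(401, "invalid token");
  // Parameter validation
  if (!isReportBody(req.body)) return fail(400, "type and message are required strings");
  // Error handling: a storage failure must not crash the API process
  try {
    save(req.body);
    return ok({ saved: true });
  } catch (err) {
    return fail(500, err instanceof Error ? err.message : "internal error");
  }
}
```

Passing `save` in as a parameter keeps the handler independent of the database, which is connected in the next stage.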
Data storage stage: interface docking with database
In the previous step, we built the API interface and received the collected data. Then, in this step, we need to connect to the database and store the collected data in the database.
For the database, I choose the one that is friendliest to front-end developers: MongoDB, a document database in the NoSQL family.
The biggest feature of this database is that the stored data format is similar to JSON, and operating on it feels like calling JS functions and combining JSON data. It is very easy for front-end developers to understand and get started with, and you will find it quite elegant once you use it in practice.
The data storage stage mainly introduces the basic information and operations of the database, including the following aspects:
- How to connect to the database
- How to design fields
- How to validate
- How to write
- How to query
The key at this stage is data validation. After designing the database fields, we want all written data to conform to the format we expect. If a record fails validation, we can supplement or correct its fields, or simply refuse to write it. This ensures the reliability of the data and avoids unnecessary data cleaning.
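A minimal sketch of this "supplement, correct, or refuse" idea is shown below. It is written as a plain function rather than with any particular MongoDB library, and the field names and default values are assumptions chosen for the example.

```typescript
type RecordType = "error" | "behavior";

// Hypothetical shape of a record that is allowed into the database.
interface StoredRecord {
  type: RecordType;
  message: string;
  page: string;
  time: number; // epoch milliseconds
}

type ValidationResult =
  | { ok: true; record: StoredRecord }
  | { ok: false; reason: string };

function isRecordType(value: unknown): value is RecordType {
  return value === "error" || value === "behavior";
}

function validateRecord(input: Record<string, unknown>, now: number = Date.now()): ValidationResult {
  // Required fields: refuse to write when they are missing or malformed.
  const type = input["type"];
  if (!isRecordType(type)) return { ok: false, reason: "type must be 'error' or 'behavior'" };
  const message = input["message"];
  if (typeof message !== "string" || message.trim() === "") {
    return { ok: false, reason: "message is required" };
  }
  // Optional fields: supplement with defaults instead of rejecting.
  const page = input["page"];
  const time = input["time"];
  return {
    ok: true,
    record: {
      type,
      message: message.trim(), // correct: strip surrounding whitespace
      page: typeof page === "string" ? page : "unknown",
      time: typeof time === "number" ? time : now,
    },
  };
}
```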
After the data writing is done, some simple query and modification functions should be added. Because you want to see if the execution is successful after you write the data, you can check a list to see the results.
An update function is also necessary. A very common requirement in front-end monitoring is to calculate how long a user stays on a page. My plan is to create a record when the user enters a page, then update that record when they leave by adding an end time field, which requires the update function.
Finally, many people ask how to do data cleaning. In fact, this depends on how strictly you validate the data before storing it. If invalid data can still get in, you can write an interface for clearing data, implement your own cleanup logic, and run it on a schedule.
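The page stay time plan described above (create a record on entry, update it on exit) can be sketched with an in-memory store standing in for the database. The class and field names are hypothetical.

```typescript
interface PageVisit {
  id: number;
  userId: string;
  page: string;
  enterTime: number; // epoch milliseconds
  leaveTime?: number; // added when the user leaves the page
}

// In-memory stand-in for the database collection.
class PageVisitStore {
  private visits: PageVisit[] = [];
  private nextId = 1;

  // Called when the user enters a page; returns the id of the new record.
  enter(userId: string, page: string, now: number = Date.now()): number {
    const id = this.nextId++;
    this.visits.push({ id, userId, page, enterTime: now });
    return id;
  }

  // Called when the user leaves; updates the record and returns the stay time in ms.
  // Returns null if the record does not exist or was already closed.
  leave(id: number, now: number = Date.now()): number | null {
    const visit = this.visits.filter((v) => v.id === id)[0];
    if (!visit || visit.leaveTime !== undefined) return null;
    visit.leaveTime = now;
    return visit.leaveTime - visit.enterTime;
  }
}
```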
Query statistics stage: data query and statistical analysis
After a series of preparations, we have completed the API interface and data writing functions. Assuming that we have collected enough data and stored it in the database, this stage is the time to make good use of the data.
The main tasks of this stage are data retrieval and statistical analysis, which are basically "query" operations.
Querying here is not just about fetching records. How we query determines whether the data we collected can be put to effective use. My idea is to start from these two aspects:
- Behavioral data: query overall statistics to see the trend over a given time period
- Abnormal data: query individual records to pinpoint the specific error
Of course, this is only the general rule. Behavioral data is sometimes queried record by record too. For example, if I want to see what a user did at a certain time, that is an exact search. Abnormal data also has its statistics, such as ranking interfaces by how often they trigger exceptions.
The volume of behavioral data will be very large, since it is generated and written to the database continuously while users use the system. Therefore, in most cases this type of data is summarized across dimensions such as page and time by means of aggregate queries, and finally some percentage-style conclusions are drawn. These statistics roughly reflect how the product is actually used.
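To show what such an aggregate query computes, the sketch below groups behavioral events by page within a time window and derives each page's share. In MongoDB this kind of grouping is typically expressed as an aggregation pipeline (for example with `$match` and `$group` stages); the version here works on an in-memory array so that it stays self-contained, and the event fields are assumptions.

```typescript
interface BehaviorEvent {
  page: string;
  action: string;
  time: number; // epoch milliseconds
}

interface PageStat {
  page: string;
  count: number;
  percent: number; // share of all events in the window, one decimal place
}

// Counts events per page for from <= time < to, most active page first.
function countByPage(events: BehaviorEvent[], from: number, to: number): PageStat[] {
  const counts: Record<string, number> = {};
  let total = 0;
  for (const e of events) {
    if (e.time < from || e.time >= to) continue;
    counts[e.page] = (counts[e.page] || 0) + 1;
    total++;
  }
  return Object.keys(counts)
    .map((page) => ({
      page,
      count: counts[page],
      percent: Math.round((counts[page] / total) * 1000) / 10,
    }))
    .sort((a, b) => b.count - a.count);
}
```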
There is an optimization point here, because frequent requests will increase the burden on the interface, so a part of the data can also be stored locally, and after a certain amount is reached, the interface is requested and stored at one time.
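One possible shape for this batching optimization is a small buffer that sends its contents once a threshold is reached. The sketch assumes an in-memory buffer and an injected `send` function; both names are hypothetical, and a real page would probably also flush when the user leaves.

```typescript
class BatchReporter<T> {
  private buffer: T[] = [];

  constructor(
    private maxSize: number, // flush once this many items are buffered
    private send: (batch: T[]) => void // for example, one POST to the report API
  ) {}

  add(item: T): void {
    this.buffer.push(item);
    if (this.buffer.length >= this.maxSize) this.flush();
  }

  // Sends whatever is buffered in a single call; does nothing when empty.
  flush(): void {
    if (this.buffer.length === 0) return;
    const batch = this.buffer;
    this.buffer = [];
    this.send(batch);
  }
}
```

With a threshold of 3, four calls to `add` produce one request containing three items, and the fourth item waits for the next flush.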
Anomaly data is very important to developers, and it is a godsend for us to locate and solve bugs. Different from the multiple statistics of behavioral data, we are more concerned with the detailed information of each individual record for abnormal data, so that we can see errors at a glance.
Querying abnormal data is also relatively simple. Just like ordinary list query, it only needs to return the latest abnormal data. Of course, after we investigate the problem, we should also mark the handled exception as handled, which can prevent repeated investigation.
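As a small illustration of the list query and the "handled" mark, here is a sketch over an in-memory array. The `handled` field and the function names are assumptions for this example.

```typescript
interface ExceptionRecord {
  id: number;
  message: string;
  time: number; // epoch milliseconds
  handled: boolean;
}

// Returns the newest unhandled exceptions first, without modifying the input array.
function latestUnhandled(records: ExceptionRecord[], limit: number): ExceptionRecord[] {
  return records
    .filter((r) => !r.handled)
    .sort((a, b) => b.time - a.time)
    .slice(0, limit);
}

// Marks one exception as handled; returns false when the id is unknown.
function markHandled(records: ExceptionRecord[], id: number): boolean {
  const match = records.filter((r) => r.id === id)[0];
  if (!match) return false;
  match.handled = true;
  return true;
}
```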
It can be seen that the most important thing at this stage is to make a statistical interface to prepare for the visualization of chart display in the next stage.
Visualization Phase: Final Data Chart Presentation
In the previous stage, we developed statistical interfaces and obtained the data we wanted. Unfortunately, these results can only be understood by programmers, and others may not understand them. So in the end, to present the data more intuitively, we need front-end visualization charts to bring the data to life.
At this stage, we are finally back in the front-end domain we know best. The tasks here are relatively simple and smooth: build a new front-end application based on React, call the statistical interfaces from the previous step, and integrate a front-end chart library to display the statistical results as charts.
This new application is the front-end monitoring system that people actually see. It is used by developers and product colleagues within the team, so that they can view the data generated by the product in real time and solve their own problems.
In fact, there are no key issues to talk about at this stage. The main thing is to choose an easy-to-use chart library and connect the interface. There are also various types of charts. It is necessary to consider which data is suitable for which chart, and make a judgment based on the actual situation.
Finally, the front-end pages and interface data of the monitoring system should not be visible to everyone, so there must be a basic login page and login function. Once that is done, the task of this stage is complete.
Alarm stage: notify immediately when an exception is found
In the previous stage, after the front-end construction of the monitoring system is completed and the statistical data is displayed as a chart, the entire monitoring system is basically available.
But there is another situation, that is, the user suddenly reported an error when using our product, and the error information was also written into the database. If you don't actively refresh the page at this time, and in fact you can't refresh the page all the time, then we don't know this error at all.
If this is a very fatal bug with a wide-ranging impact, and we don't even know when the bug occurs, it will cause us great losses.
Therefore, to make sure we can fix bugs in time, an alarm notification function is very important. Its job is to push a message to developers the moment an exception occurs, so that everyone can find and fix the problem as quickly as possible and nothing gets missed.
Alarm notifications are now commonly implemented by connecting to a DingTalk or Enterprise WeChat robot; we use DingTalk here. Which platform to choose depends on which one your organization uses. My team works on DingTalk, so when sending an alarm notification we can @ any team member directly by mobile phone number to make the reminder more targeted.
This part is a supplement to the API application. After applying for DingTalk developer permission, integrate the relevant code into the API.
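As a hedged sketch of what the alarm code might build, the function below assembles a text message in the format used by DingTalk custom robots as I understand it (`msgtype`, `text.content`, and `at.atMobiles`). Please check the official DingTalk documentation before relying on these field names. The function name and message wording are hypothetical, and the resulting object would be sent as a JSON POST to the robot's webhook URL.

```typescript
interface DingTalkTextMessage {
  msgtype: "text";
  text: { content: string };
  at: { atMobiles: string[]; isAtAll: boolean };
}

// Builds the alarm payload; sending it over HTTP is left to the API application.
function buildAlarmMessage(summary: string, page: string, mobiles: string[]): DingTalkTextMessage {
  // Assumption: mentioned numbers are also written into the content as @<mobile>.
  const mentions = mobiles.map((m) => `@${m}`).join(" ");
  const content = `[Front-end alarm] ${summary}\nPage: ${page}` + (mentions ? `\n${mentions}` : "");
  return {
    msgtype: "text",
    text: { content },
    at: { atMobiles: mobiles, isAtAll: false },
  };
}
```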
Deployment stage: everything is ready, just waiting for the launch
In the previous stages, we completed data collection, API application construction, data storage, front-end visual display, and monitoring alarms. The functions of the entire front-end monitoring system are complete. The last step is to deploy the front end, the back end, and the database online for everyone to access.
The deployment is mainly nginx parsing, https configuration, database installation, and application deployment of nodejs, etc. The content of this stage will be a little bit more operation and maintenance. But don't worry, I'll go over the key operations in detail here as well.
When the system is online, you can try to use any of your front-end projects to save the collected data through the API according to the collection method in the first article, and then you can log in to the monitoring system to view the real usage data.
When this part is completed, congratulations, a small front-end monitoring system is built. In the future, we can continue to expand the functions based on this, and slowly make this self-developed monitoring system more powerful.
Summary
This article introduces the construction process of the front-end monitoring system, divides the overall process into several stages, briefly describes what to do in each stage, and what are the key issues, so as to help you clarify the idea of building a monitoring system.
The next article will introduce the implementation code of front-end monitoring. This article was first published on the public account "Programmer Chenggong". The author, Yang Chenggong, focuses on sharing front-end engineering and architecture. Follow me for more hard-core knowledge.
Any questions and suggestions in this article are welcome to communicate with me, thank you for reading🙏