Two demos showcasing real-time solutions built with Realtime Compute for Apache Flink

This article is compiled from a live session by GIN, an Alibaba Cloud intelligent industry solution expert.
Live link: https://developer.aliyun.com/learning/course/839

This article presents two real-time big data applications built on Flink. To better illustrate the value of the technology and the typical scenarios it addresses, the session uses two purpose-built cases that are close to real life.

The first is real-time analysis of API application service logs. The second uses simulated IoT telemetry data to analyze vehicle engines and perform real-time anomaly detection, with the goal of predictive maintenance.

Real-time application log analysis

Scenario description

The requirement in the first scenario is fairly common. It is built around a vehicle privacy protection API: a deep learning model, wrapped as an API, that applies privacy protection to photos of vehicles uploaded by users.

The model is packaged as an API and deployed on Alibaba Cloud ECS so that users around the world can access it. The first thing we want to know about this API is how often it is being called, from which countries or regions, and what the characteristics of each visit are, for example whether it is an attack or normal usage.

图片 1.png

To do this real-time analysis, we first need to collect, at scale and in real time, the application logs of each API instance scattered across the servers. Beyond collection, the logs must also be processed promptly; processing includes dimension table lookups, window aggregations, and other operations that are common in stream computing. Finally, the results are written to an environment with high throughput and low latency, so that downstream analysis systems can access the data in real time.

The entire pipeline is not complicated, but it represents a very important capability: with real-time computation and processing, as represented by Flink, business decision-makers can be given data-driven insight at second-level latency.

Demo solution architecture

Let's take a look at how this demo is implemented. There are several key components in this architecture.

image25.gif

First, at the upper right is the environment of the API itself. Flask and Python, combined with the mainstream Nginx and Gunicorn stack, are used to build the API. The API is packaged as a container image and deployed from that image to Alibaba Cloud ECS. For high concurrency and low latency, a layer-7 load balancer is placed in front of it, together with an API Gateway that makes it easy for users to call the API.
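As a rough illustration of this setup, the sketch below shows what such a Flask endpoint might look like; the route name, the anonymize placeholder, and the Gunicorn command are illustrative assumptions, not the demo's actual code.

```python
# app.py -- minimal sketch of the vehicle-privacy API (illustrative only;
# the real demo wraps a deep learning model, which is mocked out here).
import io

from flask import Flask, request, send_file

app = Flask(__name__)

def anonymize(image_bytes: bytes) -> bytes:
    """Placeholder for the deep learning model that blurs the background
    and masks license plates; returns the processed image bytes."""
    return image_bytes  # real model inference would go here

@app.route("/anonymize", methods=["POST"])
def anonymize_endpoint():
    uploaded = request.files["image"]       # photo uploaded by the user
    result = anonymize(uploaded.read())     # run privacy processing
    return send_file(io.BytesIO(result), mimetype="image/jpeg")

# In the demo this app is served by Gunicorn behind Nginx, e.g.:
#   gunicorn -w 4 -b 0.0.0.0:8000 app:app
```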

The demo also provides a web app, so that users can access the API not only through code but also through a graphical interface. Whenever front-end users call the API, Simple Log Service (SLS) collects the API application logs from the API servers in real time and, after light processing, delivers them to Realtime Compute for Apache Flink.
Flink has a very useful capability here: it can subscribe to the logs delivered from Simple Log Service and apply window aggregations, dimension table lookups, and similar operations to the log stream in a streaming fashion. Another advantage is that custom SQL can be used to express more complex business logic.
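To make this concrete, here is a minimal PyFlink sketch of the kind of streaming SQL involved: a source table for the parsed access logs and a one-minute tumbling-window aggregation by country. In the demo the source is the SLS Logstore via Alibaba Cloud's SLS connector; here the built-in datagen connector stands in, and the field names are hypothetical, so the snippet can run anywhere.

```python
# Minimal PyFlink sketch: in the demo Flink subscribes to the SLS Logstore;
# a datagen source with a hypothetical log schema stands in here.
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Hypothetical parsed access-log schema; datagen is a stand-in for the SLS source.
t_env.execute_sql("""
    CREATE TABLE api_access_log (
        client_ip STRING,
        country   STRING,
        status    INT,
        ts        TIMESTAMP(3),
        WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'datagen',
        'fields.country.length' = '2',
        'number-of-rows' = '100'
    )
""")

# Count API calls per country in 1-minute tumbling windows (standard Flink SQL).
t_env.execute_sql("""
    SELECT country,
           TUMBLE_START(ts, INTERVAL '1' MINUTE) AS window_start,
           COUNT(*) AS calls
    FROM api_access_log
    GROUP BY country, TUMBLE(ts, INTERVAL '1' MINUTE)
""").print()
```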

Once the data has been processed, Flink writes the stream into Hologres as structured tables. Hologres serves not only as storage but also as an OLAP-style engine that powers downstream BI queries. Strung together, these components form the framework of this real-time big data log collection and analysis solution.
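Continuing the sketch above, the aggregated results can be written out as a structured table. The demo uses Alibaba Cloud's dedicated Hologres connector for Flink; since Hologres speaks the PostgreSQL protocol, the sketch below uses the open-source JDBC connector instead, with placeholder connection values.

```python
# Continuing the sketch above: write the windowed results into Hologres.
# The demo uses the dedicated Hologres connector; because Hologres is
# PostgreSQL-compatible, the open-source JDBC connector is shown here.
# (Requires the flink-connector-jdbc and PostgreSQL driver jars.)
t_env.execute_sql("""
    CREATE TABLE api_calls_by_country (
        country      STRING,
        window_start TIMESTAMP(3),
        calls        BIGINT
    ) WITH (
        'connector'  = 'jdbc',
        'url'        = 'jdbc:postgresql://<hologres-endpoint>:80/<database>',
        'table-name' = 'api_calls_by_country',
        'username'   = '<access-key-id>',
        'password'   = '<access-key-secret>'
    )
""")

# Tumbling-window aggregations produce append-only output, so a plain sink works.
t_env.execute_sql("""
    INSERT INTO api_calls_by_country
    SELECT country,
           TUMBLE_START(ts, INTERVAL '1' MINUTE),
           COUNT(*)
    FROM api_access_log
    GROUP BY country, TUMBLE(ts, INTERVAL '1' MINUTE)
""")
```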

Solution analysis

Let's take a look at how each component is used.

Using the vehicle privacy API as a data source for real-time analysis
The web app lets users easily upload photos of their vehicles, and the API applies the blurring process to them. In the screen recording, you can see that after processing by the API the background of the photo is blurred, and the license plate and other private information are masked as well.

image28.gif

SLS Log Center
When a user accesses the API, Simple Log Service collects the access logs in the background in real time.

image30.png

After the logs are collected, Logtail's data transformation capability is used to parse and convert the raw logs, which includes resolving IP addresses into geographic information such as country, city, latitude, and longitude, so that this information is readily available for downstream analysis. Beyond collection, Simple Log Service also provides very powerful graphical data analysis capabilities.

image31.png

Realtime Compute for Apache Flink
Here you can do preliminary data analysis and exploration, to check whether the transformed logs meet the needs of downstream business support. Once the logs have been collected, transformed, and processed, they are delivered through the LogHub to the stream processing center, that is, Realtime Compute for Apache Flink.

image32.png

Strictly speaking, "delivery" is not the precise term: in fact, Flink actively subscribes to the processed log data in the Logstore within LogHub. A great strength of Flink is that you can write business logic in ordinary SQL, including transformations and conditional logic. Once the SQL is written, you simply click to go online; it is packaged into a Flink job and hosted on the Flink cluster, all conveniently managed through the console.

How heavily is the current Flink cluster being used? What is the CPU usage, are there any anomalies or errors, and what does the end-to-end delivery look like? All of this can be monitored directly from the Flink console, which is a big advantage: there is almost no operations and maintenance burden.

image33.png

Hologres (HSAP)

Once Flink has finished processing, the stream data can be written directly into the storage system, Hologres, as structured tables through the interface Flink provides. Hologres has one particularly notable characteristic: it handles both OLTP and OLAP workloads.

Specifically, it supports fast OLTP-style writes, and at the same time allows high-concurrency, low-latency analytical queries over the data that has just been written, which is what is usually meant by an OLAP engine. Because it merges the two into one system, Hologres is also described as HSAP (hybrid serving and analytical processing).
image34.png
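Because Hologres is compatible with the PostgreSQL protocol, the table that Flink keeps updating can be queried with an ordinary PostgreSQL client. The sketch below assumes the hypothetical api_calls_by_country table from the earlier snippets and uses placeholder connection values.

```python
# Query the table that Flink is continuously writing, using a standard
# PostgreSQL client (Hologres speaks the PostgreSQL protocol).
import psycopg2

conn = psycopg2.connect(
    host="<hologres-endpoint>",   # placeholder
    port=80,
    dbname="<database>",
    user="<access-key-id>",
    password="<access-key-secret>",
)
with conn, conn.cursor() as cur:
    # Low-latency OLAP query over data that is still arriving in real time.
    cur.execute("""
        SELECT country, SUM(calls) AS total_calls
        FROM api_calls_by_country
        WHERE window_start > now() - interval '1 hour'
        GROUP BY country
        ORDER BY total_calls DESC
        LIMIT 10
    """)
    for country, total_calls in cur.fetchall():
        print(country, total_calls)
```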

DataV Dashboard
In this architecture, DataV is mainly used to present the processed data to downstream consumers, that is, the end users. Business decision-makers can view the dashboard and consume the results in real time.

image35.png

This real-time dashboard is refreshed as the API is accessed and reflects the latest processed information with a delay of only seconds, which greatly reduces the lag before decision-makers see the data.

With a traditional batch processing approach, each run may need to process terabytes of data and can take several hours. With an end-to-end real-time computing solution centered on Flink, that delay can be compressed from hours to a few seconds, or even under one second.

Real-time predictive maintenance of vehicle engines

Scenario description

The second business scenario uses IoT data, simulated telemetry in this case, to determine whether the engines of cars on the road are showing signs of abnormality. The goal is to detect potential problems in advance: if nothing is done, a component might break three months from now. This is a requirement that comes up frequently in real-world applications; we call it predictive maintenance. Predictive maintenance can save customers a lot of money, because repairing something after it breaks is far less effective than replacing it before it fails.

图片 4.png

Demo solution architecture

To make the scenario closer to the real world, I looked into the on-board diagnostic system known as OBD II, which exposes typical engine data, and collected some of that data to process and model. I then wrote a program that simulates the engine data of a real car driving in a real environment.

图片 15png.png

Of course, since we cannot actually drive a car on the road for this demo, the simulation program uses various statistical methods to generate driving data that is as realistic as possible.

The program delivers the simulated engine telemetry to Kafka. Realtime Compute for Apache Flink subscribes to and consumes the Kafka topics, performing different streaming computations per topic. Part of the results is archived to OSS as historical data; the other part, as a hot data stream, is fed directly to the anomaly detection model, which is deployed on PAI-EAS and can be called from Flink.

After the machine learning model has judged whether the current engine data shows any signs of abnormality, the result is written into the database for downstream consumption. Once the data has been processed in real time by Flink, part of it is archived to OSS.

This archived data is used as historical data for modeling, or re-modeling. Driving characteristics may shift over time, a phenomenon commonly known as data drift, so the newly accumulated historical data can be used to retrain the model. The retrained model is then deployed as a web service on PAI-EAS for Flink to call. In this way, a Lambda-architecture big data solution is completed.

Solution analysis

Generate simulated driving data

First, we need to generate the simulated data: OBD-style engine telemetry, which is delivered to the cloud for analysis. Function Compute is used here, which is very convenient: it is a fully managed, serverless service.

image61.gif

Second, you can copy the locally developed Python script directly into Function Compute and let this managed compute service execute the simulation data generator, which is very convenient.

In this demo, the function is invoked once per minute, generating a batch of telemetry data, and a record is then delivered to Kafka every 3 seconds to approximate the frequency at which such data would occur in a real environment.
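A sketch of what such a function might look like is shown below: each invocation emits a small batch of OBD-style readings to Kafka, one every 3 seconds. The topic name, field names, value ranges, and the anomaly injection are illustrative assumptions rather than the demo's actual simulation logic.

```python
# Sketch of the simulation function: each scheduled invocation emits a
# minute's worth of OBD-style engine readings to Kafka at 3-second intervals.
import json
import random
import time

from kafka import KafkaProducer  # kafka-python

producer = KafkaProducer(
    bootstrap_servers="<kafka-bootstrap>:9092",   # placeholder
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def one_reading(vehicle_id: str) -> dict:
    """Generate a single simulated engine telemetry record (illustrative values)."""
    reading = {
        "vehicle_id": vehicle_id,
        "ts": int(time.time() * 1000),
        "rpm": random.gauss(2200, 400),
        "coolant_temp_c": random.gauss(90, 5),
        "engine_load_pct": random.gauss(45, 10),
    }
    if random.random() < 0.05:            # occasionally inject an anomaly
        reading["coolant_temp_c"] += 40
    return reading

def handler(event, context):
    """Function Compute entry point, invoked by a time trigger."""
    for _ in range(20):                   # roughly one minute at 3-second intervals
        producer.send("engine-telemetry", one_reading("demo-car-001"))
        time.sleep(3)
    producer.flush()
    return "ok"
```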

Collect/publish driving data

Kafka is a commonly used big data pub/sub system; it is very flexible and highly scalable. On Alibaba Cloud, you can build a Kafka cluster in EMR, or use the managed Message Queue for Apache Kafka service to set up a complete service.

image62.gif

For convenience, this demo uses a managed Kafka pub/sub system. Here it is used simply to buffer the data generated upstream, that is, the engine data delivered by the vehicles. In production there would not be just one car; there could be tens or even hundreds of thousands of vehicles. With Kafka, scaling out is easy: whether there are 10 or 100,000 vehicles at the front end, the overall architecture barely needs to change and can comfortably handle this elastic demand.
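On the consuming side, a Flink SQL source table over the telemetry topic might be declared as in the sketch below, using the open-source Kafka connector; the schema mirrors the hypothetical records above and the bootstrap address is a placeholder.

```python
# Illustrative Flink SQL source over the telemetry topic, declared through
# PyFlink with the open-source Kafka connector (connector jar required).
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

t_env.execute_sql("""
    CREATE TABLE engine_telemetry (
        vehicle_id      STRING,
        ts              BIGINT,
        rpm             DOUBLE,
        coolant_temp_c  DOUBLE,
        engine_load_pct DOUBLE,
        event_time AS TO_TIMESTAMP_LTZ(ts, 3),
        WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'engine-telemetry',
        'properties.bootstrap.servers' = '<kafka-bootstrap>:9092',
        'properties.group.id' = 'engine-demo',
        'scan.startup.mode' = 'latest-offset',
        'format' = 'json'
    )
""")
```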

Real-time computation and anomaly detection model calls

For the real-time computing part, Flink is still used, but this demo runs on a Blink exclusive cluster, the so-called semi-managed real-time computing platform. In practice it is used almost exactly the same way as the fully managed version in the previous scenario.

image63.gif

When this demo was built, the fully managed version of Flink had not yet launched in some regions, so I chose the Blink exclusive cluster, which also belongs to the real-time computing family and is used in almost exactly the same way as the fully managed version. Developers only need to focus on writing the script that implements the business logic and click to go online; the rest is essentially handled by the platform. You just monitor for anomalies and do some tuning, which is very convenient.

It is also worth mentioning that the call to the PAI-EAS model is embedded in Flink, so while Flink processes the streaming data in real time it can also send part of the data to PAI for model inference. The inference results are combined with the real-time stream and finally written to the downstream storage system, which demonstrates the extensibility of the Flink computing platform.
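One way to sketch this integration is a PyFlink scalar UDF that posts each record's features to the PAI-EAS HTTP endpoint and returns the predicted label. The endpoint URL, token, request/response JSON shapes, and function names below are all assumptions; the demo wires the call up through the platform rather than through this exact code.

```python
# Sketch: call the anomaly model from inside a Flink job via a Python UDF.
# Endpoint, token and payload format are placeholders for the EAS service.
import json

import requests
from pyflink.table import DataTypes
from pyflink.table.udf import udf

EAS_URL = "<pai-eas-service-url>"      # placeholder
EAS_TOKEN = "<pai-eas-token>"          # placeholder

@udf(result_type=DataTypes.INT())
def predict_anomaly(rpm, coolant_temp_c, engine_load_pct):
    """Return 1 if the deployed model flags the reading as abnormal, else 0."""
    payload = json.dumps({"features": [rpm, coolant_temp_c, engine_load_pct]})
    resp = requests.post(EAS_URL, data=payload,
                         headers={"Authorization": EAS_TOKEN}, timeout=2)
    return int(resp.json()["prediction"])   # assumed response shape

# Registered on the table environment, the UDF can then be used in SQL, e.g.:
#   t_env.create_temporary_function("predict_anomaly", predict_anomaly)
#   SELECT *, predict_anomaly(rpm, coolant_temp_c, engine_load_pct) AS is_anomaly
#   FROM engine_telemetry
```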

Anomaly detection model

This part shows how to use the visual machine learning platform to design and develop a very simple binary classification model.

image64.gif

The binary classification model learns, from the engine's historical data, which feature values indicate a problem and which are normal. The trained model then provides a basis for judging newly generated engine data, helping business staff spot problems in the current engine data earlier.
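The demo builds this classifier visually in the studio; purely to illustrate the equivalent logic, the sketch below trains a simple binary classifier with scikit-learn on a hypothetical table of labeled historical engine readings (the file name, feature names, and label column are assumptions).

```python
# Illustrative stand-in for the visually built model: a scikit-learn
# binary classifier trained on hypothetical labeled engine history.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hypothetical historical data: telemetry features plus a 0/1 anomaly label.
history = pd.read_csv("engine_history.csv")          # placeholder file
features = ["rpm", "coolant_temp_c", "engine_load_pct"]

X_train, X_test, y_train, y_test = train_test_split(
    history[features], history["is_anomaly"], test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```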

Model deployment and call service

The model has learned the relevant features and patterns from historical data. The studio used to develop it is built entirely by drag and drop, with almost no code written, which is very convenient and fast: a model can be developed essentially by clicking buttons. Even better, once the model is developed it can be deployed with one click through PAI, packaged as a REST API web service, and hosted on the PAI platform for users to call. After one-click deployment, it is also very easy to make a test call to the deployed model service.

image65.gif
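A test call to the deployed service can be as simple as the sketch below; the URL, token, and request/response JSON shapes are placeholders for the actual EAS deployment.

```python
# One-off test call against the deployed model service (all values are placeholders).
import requests

resp = requests.post(
    "<pai-eas-service-url>",                       # placeholder
    json={"features": [2300.0, 131.5, 47.2]},      # one simulated engine reading
    headers={"Authorization": "<pai-eas-token>"},  # placeholder
    timeout=5,
)
print(resp.status_code, resp.text)
```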

High Throughput Structured Data Storage (RDS)

Once the model is deployed, Flink can call it to judge whether there is any abnormality. After the stream data has been processed in real time, the results are finally written into a MySQL database.

image66.gif

This database serves as the data source backing the downstream real-time dashboard. In this way, business staff get a near real-time view: every few seconds they can see whether anything is wrong with the cars currently on the road.
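Continuing the earlier sketches (and assuming the UDF above has been registered with the table environment), the final step might look like the following: an RDS MySQL sink declared with the open-source JDBC connector and an INSERT that writes the scored stream into it. The table name, columns, and connection values are placeholders.

```python
# Sketch of the final sink: write the scored stream into an RDS MySQL table
# via the open-source JDBC connector (connector and MySQL driver jars required).
t_env.execute_sql("""
    CREATE TABLE engine_scores (
        vehicle_id      STRING,
        ts              BIGINT,
        coolant_temp_c  DOUBLE,
        is_anomaly      INT,
        PRIMARY KEY (vehicle_id, ts) NOT ENFORCED
    ) WITH (
        'connector'  = 'jdbc',
        'url'        = 'jdbc:mysql://<rds-endpoint>:3306/<database>',
        'table-name' = 'engine_scores',
        'username'   = '<user>',
        'password'   = '<password>'
    )
""")

# Score each reading with the model UDF and persist the result for the dashboard.
t_env.execute_sql("""
    INSERT INTO engine_scores
    SELECT vehicle_id, ts, coolant_temp_c,
           predict_anomaly(rpm, coolant_temp_c, engine_load_pct)
    FROM engine_telemetry
""")
```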

Near Realtime Dashboard

This real-time dashboard can be opened through the following link: https://datav.aliyuncs.com/share/9fff231ff81f409829180ee933e7bcee

image68.gif

The DataV dashboard refreshes every 5 seconds by default, which means that every 5 seconds the latest telemetry data from the database, including the anomaly judgment, is displayed on the dashboard.

Red indicates that the data collected at that point in time shows a problem, and blue indicates normal data. What counts as normal is entirely determined by the simulated data generated earlier by Function Compute; because its logic deliberately injects some readings that make the engine look faulty, the abnormal portion in this demo appears somewhat exaggerated.

These are the two demos shared this time. Interested readers can use Realtime Compute for Apache Flink to build their own applications.

