Abstract: This article is based on the talk given by Cao Jie and Yin Chunguang, senior development engineers at E-Pay, in the platform construction session of Flink Forward Asia 2021. It is divided into four parts:
- Company Profile
- Problems in Practice
- Case Practice
- Future Plans
1. Company Profile
E-Pay is a wholly-owned subsidiary of China Telecom. Its main businesses include everyday bill payment, consumer shopping, and wealth management, and it relies on cloud computing, big data, artificial intelligence, and other technologies to empower online and offline merchants.
The company's main business segments are digital life, digital finance, and fintech services. Digital life mainly refers to conventional payment services such as everyday bill payment, that is, paying residents' water, electricity, and gas bills; we also launch 5G rights-and-benefits packages jointly with China Telecom. Digital finance mainly covers insurance, wealth management, and credit, with Orange Installment and the enterprise "Baitiao" credit product as the key offerings. Technology services are divided into enterprise credit reporting and data intelligence services: enterprise credit reporting relies on existing core capabilities such as cloud computing, big data, artificial intelligence, and blockchain to deliver professional, efficient intelligent risk-management and credit-reporting solutions, while the data intelligence business builds on the Tianyi Cloud base platform, focusing on SaaS/PaaS services and financial security services to create end-to-end financial solutions.
At present, E-Pay has 50 million+ monthly active users, 500 million+ cumulative users, about 4,000 online servers, and processes hundreds of billions of records per day.
With the continuous expansion of the company's business scale, the business challenges we face are also increasing, mainly in two aspects:
- On the one hand, as requirements keep growing, the customized, case-by-case development approach has caused the number of applications to increase dramatically, making unified management difficult. Applications across business lines have evolved into silos, indicator calibers and calculations are not unified, and repeated processing wastes computing capacity;
- On the other hand, in some scenarios a single topic carries up to 2.2 million records per second, and for scenarios such as risk control the business requires a response latency within 200 milliseconds.
In response to the above problems, we began actively exploring a real-time processing system in 2018, drawing on practical experience in the industry. In 2019 we started building a real-time indicator processing system and introduced Spark Streaming as the computing engine. In early 2020, for better timeliness, we introduced Structured Streaming as the real-time computing engine. As business adoption grew, we received more and more demands for real-time decisions based on combinations of atomic metrics, so in September 2020 we started building a real-time decision-making system and introduced Flink CEP. In April this year, to handle some complex indicators, we introduced Flink SQL into the indicator processing link.
After continuous product iteration, this work eventually took shape as an enterprise-level intelligent decision-making system, the Xianjian platform.
The above figure shows the main functions of the Xianjian platform. The first is real-time indicator processing. We currently support a variety of data sources, mainly common middleware such as Kafka and Pulsar, and to lower the barrier for users we provide 23 algorithm templates as well as custom processing in SQL. The second is real-time decision-making: we support rich nesting and combination of rules and rule groups to meet complex decision-making needs. In addition, we integrate real-time, offline, and third-party tags to provide users with a unified data query service. For production stability, we provide comprehensive monitoring as well as fine-grained resource isolation, circuit breaking, and rate limiting strategies; for the running status of real-time computing jobs, we monitor the data volume and latency of sources and sinks with related metrics.
The above figure shows the logical architecture of the Xianjian platform, which is mainly divided into 4 layers.
- The top layer consists of the application callers, mainly the intelligent risk control, intelligent decision-making, and intelligent marketing systems;
- Next is the real-time decision-making module, which provides web-based configuration and management of decisions, a development center for verifying decision tasks, and real-time decision execution through the decision core;
- The third layer is the real-time indicator processing module: users configure different processing methods, which are dispatched to different execution engines, and an integrated data service provides users with query results;
- The bottom layer is the data layer. Data sources mainly include business data, user tracking (buried-point) data, and offline data processed at the group level. The calculation results are finally stored in the corresponding database according to the user's configuration.
The technical architecture of the real-time indicator processing system consists of three modules. The front-end interface is responsible for task configuration and permission management. The backend generates the corresponding task description in a custom DSL format and submits it to the kernel, where a Mode Selector chooses the execution engine according to the configuration method.
If the template processing mode is used, the DSL Parser parses the syntax and the job then performs data cleaning and algorithm calculation; in SQL mode, only the SQL syntax is parsed and the user's UDFs and related configuration are loaded. Either way, the corresponding task execution graph is generated and handed over to the Program Deployer, which selects the deployment environment and publishes the task.
The execution environment manages resources through YARN, and HDFS is responsible for metadata storage.
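To make the SQL mode described above concrete, here is a minimal, illustrative sketch (not the platform's actual kernel code) of what such a submission boils down to with the Flink Table API: the user's UDF is registered first, source and sink tables are created, and the INSERT statement is submitted, which builds the execution graph and deploys the job. The table names, the `mask_phone` UDF, and the connector options are assumptions for the example.

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.table.functions.ScalarFunction;

public class SqlModeJob {

    /** Hypothetical user UDF: masks the middle digits of a phone number. */
    public static class MaskPhone extends ScalarFunction {
        public String eval(String phone) {
            return phone == null ? null : phone.replaceAll("(\\d{3})\\d{4}(\\d{4})", "$1****$2");
        }
    }

    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        // Load the user's UDF before the SQL is planned.
        tEnv.createTemporarySystemFunction("mask_phone", MaskPhone.class);

        // Source and sink tables would normally come from the user's configuration.
        tEnv.executeSql("CREATE TABLE pay_events (phone STRING, amount DOUBLE) WITH ("
                + "'connector' = 'kafka', 'topic' = 'pay_events', "
                + "'properties.bootstrap.servers' = 'kafka:9092', "
                + "'properties.group.id' = 'sql-mode-demo', "
                + "'scan.startup.mode' = 'latest-offset', 'format' = 'json')");
        tEnv.executeSql("CREATE TABLE pay_metrics (phone STRING, total DOUBLE) WITH ("
                + "'connector' = 'print')");

        // Submitting the INSERT builds the execution graph and deploys the job.
        tEnv.executeSql("INSERT INTO pay_metrics "
                + "SELECT mask_phone(phone), SUM(amount) FROM pay_events GROUP BY phone");
    }
}
```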
The functions of Stream SQL are divided into basic functions and performance monitoring functions.
The basic functions mainly include the following:
- SQL syntax validation. Flink SQL syntax is currently supported, and the SQL is validated before the user submits it (see the validation sketch after this list);
- Sandbox testing. Users can pre-submit tasks to verify their correctness;
- Support for loading user-defined functions (UDFs).
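As a rough illustration of the SQL syntax validation step, one way to reject bad SQL before submission is to let a local TableEnvironment try to explain the statement and return the planner's error to the user. This is only a sketch and assumes the user's tables and UDFs have already been registered in that environment.

```java
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class SqlValidator {

    private final TableEnvironment tEnv =
            TableEnvironment.create(EnvironmentSettings.newInstance().inStreamingMode().build());

    /** Returns null if the statement is acceptable, otherwise the error message to show the user. */
    public String validate(String sql) {
        try {
            tEnv.explainSql(sql);   // parses, validates and plans the statement without executing it
            return null;
        } catch (Exception e) {     // syntax errors, unknown tables/functions, type mismatches, ...
            return e.getMessage();
        }
    }
}
```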
The performance monitoring functions mainly include the following:
- Fine-grained resource configuration. The community version of Flink SQL does not support operator-level resource configuration and can only apply a single global parallelism, which in production can overload one node and delay the whole task. We therefore obtain the JSON plan of the StreamGraph and set the parallelism of each node individually, achieving fine-grained resource configuration (a sketch follows this list);
- Task status monitoring. Considering task latency and the length of the processing link, we only monitor the rate of change of source and sink traffic. Once an abnormal rate of change is detected, it is reported to business users promptly so that business changes can be discovered as early as possible;
- Automatic recovery of failed tasks. Tasks are recovered from the most recent checkpoint. For tasks with a long checkpoint interval, to reduce the recovery time on restart, we force a savepoint before restarting, which shortens task recovery time.
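For the fine-grained resource configuration described in the first item, a hedged sketch of the idea looks roughly as follows. StreamGraph and StreamNode are internal Flink classes, so this relies on non-public APIs that can change between versions; the operator-name-to-parallelism map is supplied by the platform in our description and is purely illustrative here.

```java
import java.util.Map;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.graph.StreamGraph;
import org.apache.flink.streaming.api.graph.StreamNode;

public final class FineGrainedParallelism {

    /** Applies per-operator parallelism overrides to an already-built pipeline and executes it. */
    public static void applyAndExecute(StreamExecutionEnvironment env,
                                       Map<String, Integer> parallelismByOperatorName) throws Exception {
        StreamGraph graph = env.getStreamGraph();

        // Expose the JSON plan so the platform (or the user) can map node names to operators.
        System.out.println(graph.getStreamingPlanAsJSON());

        for (StreamNode node : graph.getStreamNodes()) {
            Integer p = parallelismByOperatorName.get(node.getOperatorName());
            if (p != null) {
                node.setParallelism(p);   // override the global parallelism for this node only
            }
        }
        env.execute(graph);               // submit the modified graph
    }
}
```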
The figure above shows the process of real-time indicator configuration:
- The first step is to configure the corresponding source, and either provide schema information or a sample message for automatic parsing;
- The second step is to select the data cleaning method; several simple cleaning rules are built in, and SQL is also supported;
- The third step is to select the algorithm template for the calculation; nesting of algorithms is also supported.
The above figure shows the SQL processing configuration flow: first create a task, specifying the user's resources and other parameters; then write the task SQL; and finally launch the task, which is submitted to the execution environment.
In the real-time decision-making module, the front-end page is responsible for configuring decision tasks and managing user permissions, and submits tasks to the backend. The backend publishes the online policy to the corresponding decision nodes through ZooKeeper. Each execution node has a ZK Watcher that monitors policy changes, loads the policy via the RuleLoader, compiles it via the RuleCompiler, and finally hands it to Flink CEP for decision execution. The decision results are then stored in a database or middleware.
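A minimal sketch of the ZK Watcher side of this flow, using Apache Curator's NodeCache recipe: each decision node watches the policy path and, on change, hands the new policy text to a callback (standing in for the RuleLoader/RuleCompiler). The ZooKeeper path and the payload format are assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.Consumer;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.cache.NodeCache;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class PolicyWatcher {

    /** Watches the policy node and invokes the callback with the new policy text on every change. */
    public static void watch(String zkQuorum, String policyPath,
                             Consumer<String> onPolicyChange) throws Exception {
        CuratorFramework client =
                CuratorFrameworkFactory.newClient(zkQuorum, new ExponentialBackoffRetry(1000, 3));
        client.start();

        NodeCache cache = new NodeCache(client, policyPath);
        cache.getListenable().addListener(() -> {
            if (cache.getCurrentData() != null) {
                String policy = new String(cache.getCurrentData().getData(), StandardCharsets.UTF_8);
                // In the platform this would go to the RuleLoader/RuleCompiler.
                onPolicyChange.accept(policy);
            }
        });
        cache.start(true);   // true: read the current value immediately
    }
}
```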
To configure a decision, you first create a task, then configure the rules and rule combinations, and finally test the task in the development center to verify the correctness of the decision.
2. Problems in Practice
In the process of practice, we also encountered many challenges, which can be summed up in the following aspects:
consistency of business state data, repeated calculation of indicators, dynamic rule configuration, and full-link monitoring.
The first is the consistency of business state data when upgrading indicator jobs to the indicator engine. Early indicator jobs were developed by hand, with some business state stored in HDFS, whereas jobs configured by the indicator engine do not manage business state separately, so data consistency problems arise when migrating old tasks to the platform.
The solution is to extend the old computing program to read all of its state data and store it externally, then stop the old task. The job configured by the indicator engine computes from the specified offset and backfills the original indicator data from external storage.
The figure above shows the job upgrade flow. The old task reads its business state and writes it to external storage when the function is opened. For keyed state, the State interface cannot enumerate all state of the current task, so the State object has to be downcast in order to read all of the state data. The indicator engine job is then configured to start from the specified offset and completes data recovery by backfilling the indicator data from external storage.
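The authors dump the state from inside the old job when the function is opened; as a related illustration of the same goal of reading all keyed state and storing it externally, the sketch below uses Flink's State Processor API to read keyed state offline from a savepoint of the old job. The state descriptor name, operator uid, and paths are assumptions.

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.memory.MemoryStateBackend;
import org.apache.flink.state.api.ExistingSavepoint;
import org.apache.flink.state.api.Savepoint;
import org.apache.flink.state.api.functions.KeyedStateReaderFunction;
import org.apache.flink.util.Collector;

public class StateDumpJob {

    /** Emits "key,value" for every key held in the old job's keyed "metric" state. */
    public static class MetricStateReader extends KeyedStateReaderFunction<String, String> {
        private ValueState<Long> metric;

        @Override
        public void open(Configuration parameters) {
            // Must match the descriptor used by the old job.
            metric = getRuntimeContext().getState(new ValueStateDescriptor<>("metric", Long.class));
        }

        @Override
        public void readKey(String key, Context ctx, Collector<String> out) throws Exception {
            out.collect(key + "," + metric.value());
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
        ExistingSavepoint savepoint =
                Savepoint.load(env, "hdfs:///savepoints/old-metric-job", new MemoryStateBackend());
        DataSet<String> dump = savepoint.readKeyedState("metric-operator-uid", new MetricStateReader());
        dump.writeAsText("hdfs:///tmp/metric-state-dump");
        env.execute("dump-old-job-state");
    }
}
```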
The second pain point arises as indicator jobs keep being added: multiple jobs repeatedly consume the same Kafka topics, which puts heavy consumption pressure on the upstream and causes indicators to be calculated repeatedly.
Our solution is to optimize all jobs in a unified way: all message sources are pre-cleaned once and distributed to the corresponding data-domain topics according to the business process, and indicator calibers are managed centrally so that indicators are not double-counted. We currently do not process real-time indicators in multiple layers, mainly to avoid long computing links that would hurt the timeliness of the service.
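A hedged sketch of such a unified pre-cleaning and routing job: one Flink job consumes the raw topic once, cleans each message, and writes it to a per-domain topic via a serialization schema that picks the target topic per record. Topic names, the "&lt;domain&gt;|payload" convention, and the cleaning logic are assumptions for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;
import org.apache.flink.streaming.connectors.kafka.KafkaSerializationSchema;
import org.apache.kafka.clients.producer.ProducerRecord;

public class DomainRouterJob {

    /** Routes each cleaned message to "domain_<domain>", assuming a "<domain>|payload" convention. */
    public static class DomainRouter implements KafkaSerializationSchema<String> {
        @Override
        public ProducerRecord<byte[], byte[]> serialize(String message, Long timestamp) {
            String domain = message.substring(0, message.indexOf('|'));
            return new ProducerRecord<>("domain_" + domain, message.getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "kafka:9092");
        props.setProperty("group.id", "domain-router");

        env.addSource(new FlinkKafkaConsumer<>("raw_events", new SimpleStringSchema(), props))
           .map(String::trim)   // placeholder for the real unified cleaning logic
           .addSink(new FlinkKafkaProducer<>("domain_default", new DomainRouter(), props,
                   FlinkKafkaProducer.Semantic.AT_LEAST_ONCE));

        env.execute("unified-domain-router");
    }
}
```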
Third, there are limitations in Flink CEP. The real-time decision-making module uses Flink CEP for rule matching. Initially rules were written in code, but as the number of rules grew this became hard to maintain and development costs rose, and native Flink CEP supports neither dynamic rule configuration nor parallel decisions over multiple rules.
In response, we extended Flink CEP to support dynamic rule configuration and decisions over multiple rules.
The above figure shows the logical architecture of our Flink CEP extension. The user configures rules through the RuleManager, which publishes rule change events to ZooKeeper. When the RuleListener detects a change event, a newly added rule is compiled into a RulePattern instance through the Groovy dynamic language. As the number of rules grows, the processing efficiency of a single CEP operator thread decreases, so rule groups are bound to dedicated workers to speed up rule processing. After the CEP operator thread receives an event, it distributes it to all workers; once a worker thread finishes processing, the result is passed back to the CEP operator thread through a queue and finally emitted downstream.
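A rough sketch of the dynamic compilation step: the rule body arrives as a Groovy script that builds and returns a Flink CEP Pattern, and a compiler evaluates it at runtime so new rules do not require redeploying the job. This is not the platform's actual RuleCompiler; the event type (a plain Map) and the sample rule are illustrative.

```java
import groovy.lang.GroovyShell;
import org.apache.flink.cep.pattern.Pattern;

public class GroovyRuleCompiler {

    /** Evaluates a Groovy script that is expected to return a Flink CEP Pattern instance. */
    @SuppressWarnings("unchecked")
    public static <T> Pattern<T, ?> compile(String groovyScript) {
        return (Pattern<T, ?>) new GroovyShell().evaluate(groovyScript);
    }

    public static void main(String[] args) {
        // Hypothetical rule: two large payments in a row for the same key.
        String script =
                "import org.apache.flink.cep.pattern.Pattern\n"
              + "import org.apache.flink.cep.pattern.conditions.SimpleCondition\n"
              + "Pattern.begin('first').where(new SimpleCondition<java.util.Map>() {\n"
              + "    boolean filter(java.util.Map e) { (e.amount as double) > 10000 }\n"
              + "}).next('second').where(new SimpleCondition<java.util.Map>() {\n"
              + "    boolean filter(java.util.Map e) { (e.amount as double) > 10000 }\n"
              + "})";
        Pattern<java.util.Map<String, Object>, ?> rule = compile(script);
        System.out.println("compiled rule ends with: " + rule.getName());
    }
}
```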
Finally, there is the issue of full-link data monitoring. The data stream is transmitted from the collection side through Flume, then to the message center and on to indicator calculation, and is finally sent downstream for real-time decision-making; neither significant data loss nor data delay can be tolerated along this path.
Based on these requirements, we monitor the overall data link and use Prometheus + Grafana to collect metrics and raise alarms. For Flume and the message middleware, we mainly monitor message accumulation and loss; for Flink indicator calculation, we mainly monitor running status and backpressure; downstream, we monitor the time taken by CEP decisions. Monitoring the data link helps operations staff quickly locate and resolve online problems.
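Beyond Flink's built-in numRecordsIn/numRecordsOut metrics, a custom rate metric can be exposed from inside an operator and scraped through the standard Flink Prometheus reporter. A small, illustrative sketch follows; the metric name and the cleaning logic are assumptions.

```java
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Meter;
import org.apache.flink.metrics.MeterView;

public class MeteredCleanFunction extends RichMapFunction<String, String> {

    private transient Meter cleanedPerSecond;

    @Override
    public void open(Configuration parameters) {
        // 60-second moving rate; exported by the configured Prometheus reporter like any other metric.
        cleanedPerSecond = getRuntimeContext().getMetricGroup()
                .meter("cleanedPerSecond", new MeterView(60));
    }

    @Override
    public String map(String value) {
        cleanedPerSecond.markEvent();
        return value.trim();   // placeholder for the actual cleaning step
    }
}
```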
3. Case Practice
The diagram above shows how the Xianjian platform works.
First, upstream user behavior and business events are transmitted to the Xianjian platform through the data channel, and the business side configures real-time indicators and business rules. When a business event triggers a business rule through the calculated indicators, the Xianjian platform immediately pushes the result to the downstream message center, which reaches users through the various business systems. For example, when a user visits the wealth-management homepage and does not subscribe to any product within 30 minutes, a corresponding push message is sent to the user according to the user's qualifications.
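As one possible, self-contained way to express the "30 minutes without a subscription" example outside of the platform's rule configuration, a keyed processing-time timer works as follows: register a 30-minute timer on the visit event, cancel it if a subscription arrives, and emit a push task if the timer fires. The event field names are assumptions.

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

/** Keyed by user id; emits the user id when a visit is not followed by a subscription in 30 minutes. */
public class IdlePushFunction extends KeyedProcessFunction<String, IdlePushFunction.UserEvent, String> {

    /** Minimal event type for the sketch. */
    public static class UserEvent {
        public String userId;
        public String type;   // "visit" or "subscribe"
    }

    private static final long THIRTY_MINUTES = 30 * 60 * 1000L;
    private transient ValueState<Long> pendingTimer;

    @Override
    public void open(Configuration parameters) {
        pendingTimer = getRuntimeContext().getState(
                new ValueStateDescriptor<>("pending-timer", Long.class));
    }

    @Override
    public void processElement(UserEvent event, Context ctx, Collector<String> out) throws Exception {
        if ("visit".equals(event.type) && pendingTimer.value() == null) {
            long fireAt = ctx.timerService().currentProcessingTime() + THIRTY_MINUTES;
            ctx.timerService().registerProcessingTimeTimer(fireAt);
            pendingTimer.update(fireAt);
        } else if ("subscribe".equals(event.type) && pendingTimer.value() != null) {
            ctx.timerService().deleteProcessingTimeTimer(pendingTimer.value());
            pendingTimer.clear();
        }
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) throws Exception {
        // 30 minutes passed without a subscription: emit a push task for this user.
        out.collect(ctx.getCurrentKey());
        pendingTimer.clear();
    }
}
```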
4. Future Plans
In the future, we plan to continue to explore in the following areas:
- First, unify the incremental database collection scheme. MySQL collection is currently implemented with Canal; in the future we plan to use Flink CDC for unified incremental collection of both Oracle and MySQL (see the sketch after this list);
- Second, unify batch and stream processing. At present the offline data warehouse is computed with Spark SQL and the real-time data warehouse with Flink SQL; maintaining two sets of metadata and two indicator calibers makes daily work heavy, so we hope to use Flink for unified batch-stream computation;
- Third, automatic scaling of Flink jobs. At present Flink cannot scale automatically, so the traffic differences between morning and evening lead to significant resource waste, and when computing power is insufficient, capacity can only be expanded manually. We hope to implement automatic scaling on top of Flink and reduce operation and maintenance costs.
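For the first item, a hedged sketch of what unified incremental collection with Flink CDC could look like, using the flink-cdc-connectors MySQL source; the host, database, table, and credentials are placeholders, and an Oracle source can be configured along the same lines.

```java
import com.ververica.cdc.connectors.mysql.source.MySqlSource;
import com.ververica.cdc.debezium.JsonDebeziumDeserializationSchema;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CdcCollectJob {
    public static void main(String[] args) throws Exception {
        MySqlSource<String> source = MySqlSource.<String>builder()
                .hostname("mysql-host")
                .port(3306)
                .databaseList("pay_db")
                .tableList("pay_db.orders")
                .username("flink_cdc")
                .password("******")
                .deserializer(new JsonDebeziumDeserializationSchema())   // change events as JSON strings
                .build();

        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000);   // the CDC source relies on checkpoints for consistent reads
        env.fromSource(source, WatermarkStrategy.noWatermarks(), "mysql-cdc")
           .print();                       // replace with the real cleaning/distribution logic
        env.execute("cdc-incremental-collection");
    }
}
```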