Author: Liu Dudu
Background
Competition in preschool enlightenment education is increasingly fierce. Good Future Group ran three independent product lines: "Little Monkey English", "Little Monkey Language", and "Little Monkey Thinking". After a strategic adjustment, the group merged their resources and launched a new brand, "Little Monkey Enlightenment", an enlightenment course designed for preschool children aged 2-6.
Current state
- Language: the main technology stack of Little Monkey English is Java; the main technology stack of Little Monkey Language and Little Monkey Thinking is PHP.
- Structure: the data models have different dependencies and very different types, and cannot be converted directly into one another.
- Storage: the storage structures for courses and for pre-class, in-class, and extra-curricular content differ greatly.
- Operations: each product line is operated independently, with its own target groups and strategies.
- Courses: the courses differ greatly, and so do their production processes.
Options
Common migration solutions in the industry include "downtime migration", "dual-write migration", and "real-time (one-way) migration". Each has its own characteristics, advantages, and disadvantages:
- Downtime migration: proactive, with controllable migration timing; low complexity, low risk, and low cost, but disruptive to the business and unfriendly to users.
- Dual-write migration: passive, with controllable migration timing; high complexity, high risk, and high cost, but non-disruptive to the business and user-friendly.
- Real-time migration: passive, with uncontrollable migration timing; high complexity, high risk, and medium cost, but non-disruptive to the business and user-friendly.
Challenges
Given the group's strategy, the current state of the three products, and our plans for the future, we decided to drop the historical baggage, travel light, and adopt the real-time migration solution. This brings many challenges:
- Migration timing: uncontrollable, because each user triggers their own migration.
- Irreversibility: once the migration is triggered, the process cannot be undone, so emergency plans are required.
- Consistency: data must stay consistent, and nothing may be lost between the state before and after migration.
- Changing the engine in flight: for the sake of user experience, the whole migration should be as imperceptible to the user as possible, with a smooth transition.
Migration plan
Little Monkey Enlightenment is a brand-new brand with a new architecture, a new storage structure, and a new course production and scheduling design. It is compatible with the product forms of the three old subjects and is designed to be flexible and extensible.
The first step: define the migration timing
The migration is triggered when the user downloads the Little Monkey Enlightenment app and logs in. Once it is triggered, the three old services are notified first and made unavailable to that user, which ensures that no new data is produced. The migration script is then started to migrate the data. The migration must be fast enough, and the data must stay consistent, to keep the user experience good. The whole process is irreversible, so an emergency plan is required in case something goes wrong. After a successful migration, the user takes all lessons (English, thinking, and Chinese) in the Little Monkey Enlightenment app and no longer has to switch between three apps.
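The trigger flow described above can be sketched roughly as follows. This is a minimal illustration, not the production implementation: `LegacyService`, `on_login`, and `MigrationState` are hypothetical names, the state is kept in an in-memory dict, and the "emergency plan" is reduced to marking the user as failed.

```python
from enum import Enum


class MigrationState(Enum):
    MIGRATING = "migrating"
    DONE = "done"
    FAILED = "failed"


class LegacyService:
    """Stand-in for one of the three old services (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.frozen_users = set()

    def freeze(self, user_id):
        # After this call the old service rejects writes for this user.
        self.frozen_users.add(user_id)

    def can_write(self, user_id):
        return user_id not in self.frozen_users


def on_login(user_id, states, legacy_services, migrate):
    """Trigger the migration the first time a user logs in to the new app."""
    if states.get(user_id) == MigrationState.DONE:
        return MigrationState.DONE  # already migrated, nothing to do
    # 1. Freeze the user in every old service so no new data appears.
    for service in legacy_services:
        service.freeze(user_id)
    states[user_id] = MigrationState.MIGRATING
    # 2. Run the migration; a failure is handed over to the emergency plan.
    try:
        migrate(user_id)
    except Exception:
        states[user_id] = MigrationState.FAILED
        return MigrationState.FAILED
    states[user_id] = MigrationState.DONE
    return MigrationState.DONE
```

Freezing before migrating is what makes a single pass sufficient: nothing can be written to the old services for that user while the script runs.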
The second step: data analysis and classification
Business and data analysis shows that the data falls into two major categories with different characteristics: basic data and user data.
- Basic data: public data, including merchandise for sale, points-redemption merchandise, lesson packs, picture books, and nursery rhymes.
- User data: personal, private data, including rights, points, reports, and in-class data.
On closer inspection, user data can be further divided into core data and transaction data:
- Core data: rights and points.
- Transaction data: detailed data such as transaction records, class attendance records, and class completion records.
The third step: choose a migration plan that fits each kind of data
The analysis above gives a picture of the current data and its characteristics:
- Basic data: the volume is small, and it is prerequisite data for everything else. Migration timing and duration are controllable, the risk is controllable, and the result can be verified manually.
- User data: this part is more delicate. Core data is small in volume but important. Detailed transaction data is large in volume and less important than core data. The overall migration complexity is high. Core data is migrated in real time. Transaction data does not change once it has been generated, so when the volume is large it can be migrated in advance, with a final migration of the remainder when the user triggers the migration. The timing and duration of the migration are uncontrollable because users trigger it, and so the risk is uncontrollable as well.
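The idea of migrating immutable transaction data ahead of time and topping it up at trigger time can be illustrated with a simple watermark. This is only a sketch under assumptions: the record layout, the `migrate_records` helper, and the monotonically increasing numeric `id` are all hypothetical, and plain lists and dicts stand in for the real stores.

```python
def migrate_records(source, target, user_id, after_id=0):
    """Copy a user's records with id > after_id and return the new watermark."""
    watermark = after_id
    for record in sorted(source, key=lambda r: r["id"]):
        if record["user_id"] == user_id and record["id"] > after_id:
            target[record["id"]] = record  # keyed by id, so re-copying is harmless
            watermark = record["id"]
    return watermark


source = [
    {"id": 1, "user_id": "u1", "event": "class_attended"},
    {"id": 2, "user_id": "u1", "event": "class_completed"},
]
target = {}

# Ahead of time: bulk-copy everything that already exists.
watermark = migrate_records(source, target, "u1")

# A new record arrives before the user triggers the migration.
source.append({"id": 3, "user_id": "u1", "event": "class_attended"})

# At trigger time: only the records above the watermark remain to be copied.
watermark = migrate_records(source, target, "u1", after_id=watermark)
```

Because the bulk of the data is already in place, the work left for the user-triggered migration is small, which is what keeps the perceived migration time short.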
The fourth step: technical analysis of the "live migration" solution
The design idea of the entire migration architecture is "transparent, problems found before users notice them, resolved within minutes, and data that can be traced back". After the user triggers the migration, the dispatch center uses the Invoke reflection mechanism to start the migration subtasks and monitors their status (elapsed time, milestones). Each subtask decides, based on its data volume, whether to split its work into shards.
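A dispatch center of this kind can be sketched as below. The original uses Java-style `invoke` reflection; this sketch substitutes Python's `getattr` lookup by name as the closest equivalent. The `Subtasks` class, its methods, and the returned row counts are made up for illustration, and milestones are omitted.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class Subtasks:
    """Hypothetical subtasks; each returns the number of rows it migrated."""

    @staticmethod
    def points(user_id):
        return 3

    @staticmethod
    def rights(user_id):
        return 2

    @staticmethod
    def reports(user_id):
        raise RuntimeError("source unavailable")


def run_subtask(name, user_id):
    """Resolve a subtask by name and run it, recording status and elapsed time."""
    started = time.monotonic()
    try:
        rows = getattr(Subtasks, name)(user_id)  # reflection-style lookup by name
        status = "ok"
    except Exception as exc:
        rows, status = 0, f"failed: {exc}"
    return {"task": name, "status": status, "rows": rows,
            "seconds": time.monotonic() - started}


def dispatch(user_id, names):
    """Start all subtasks in parallel and compute the overall migration state."""
    with ThreadPoolExecutor(max_workers=max(1, len(names))) as pool:
        results = list(pool.map(lambda n: run_subtask(n, user_id), names))
    return {
        "state": "ok" if all(r["status"] == "ok" for r in results) else "failed",
        "rows": sum(r["rows"] for r in results),
        "subtasks": results,
    }
```

Calling `dispatch("u1", ["points", "rights"])` yields an overall state plus a per-subtask breakdown, which is the information the monitoring side needs.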
Core architecture design points:
- RDS-TO-DRDS: for data such as transaction details or learning records, consider synchronizing the data to DRDS to speed up queries during migration, using the user ID as the sharding key for the database and table partitioning.
- Task decoupling: when business modules or task data depend on one another, reduce those dependencies by decoupling. For example, products, orders, and logistics are hierarchically related, and the conventional process is serial: migrate products first, then orders, then generate logistics data from the orders. The problem with the serial approach is efficiency, so the tasks are decoupled. Dependencies between data generally rest on IDs, so the new ID generated from a given old ID can be kept the same wherever it is needed. In addition, new data keeps arriving alongside historical data during the migration. It is recommended to separate the two by ID range to avoid collisions. This also makes troubleshooting easier, because new business data and historical data can be told apart by ID.
- Multi-task parallelism: subtasks run in parallel to speed up the migration.
- Backtracking: the associated primary key IDs of historical data are strings. The new ID generated from a given old ID must always be the same, and the mapping is recorded and stored. Alternatively, the new ID can be generated from the old ID by an algorithm, and the old ID recovered by inverting that algorithm.
- Data sharding: splitting the data into shards also improves migration speed.
- Idempotence: keeps the data consistent when a migration is interrupted or fails because of uncontrollable factors.
- Isolation: users are isolated from one another first, and tasks are isolated second, to ensure stability.
- Progress monitoring: monitoring is indispensable for understanding how each system is behaving online.
- State calculation: multiple subtasks run at the same time, so the final migration state, result, elapsed time, and so on are calculated by the dispatch center.
- Dispatch center: the migration involves data from every business module, and each module is a subtask. The dispatch center triggers all migration subtasks, schedules and monitors them in a unified way, and calculates the status of each subtask along with the overall migration progress, elapsed time, amount of migrated data, and core key data.
- Monitoring and alerting: make the content of each user's migration transparent through monitoring, so that problems are found before users notice them.
- Quality assurance system: the means provided for data consistency and emergency plans.
- Data sharding proxy: reduces migration time and improves user experience by solving the problem of migrating large volumes of data.
- Heterogeneous data conversion: takes data from the old business systems through format conversion, data modeling, result calculation, and post-processing to complete the data migration for a subtask. Milestones record the core, key steps so that problems can be located quickly.
- Business ID converter: the data ID type has changed from string to numeric. To ensure data consistency, every ID can be traced back, which makes it easy to follow the history of a record.
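Several of the points above (backtracking, ID ranges, and the business ID converter) come down to one small component. The sketch below is an assumption about how such a converter could look, not the patented design: `BusinessIdConverter` and its method names are hypothetical, the mapping lives in memory rather than in a persistent table, and the boundary of 1,000,000,000 for new data is an arbitrary example value.

```python
class BusinessIdConverter:
    """Maps old string IDs to new numeric IDs and back (illustrative only)."""

    def __init__(self, start=1, new_data_start=1_000_000_000):
        self._next = start
        self._limit = new_data_start  # IDs at or above this are reserved for new data
        self._old_to_new = {}
        self._new_to_old = {}

    def to_new(self, old_id):
        """Return the numeric ID for old_id; the same input always gives the same ID."""
        if old_id in self._old_to_new:
            return self._old_to_new[old_id]
        if self._next >= self._limit:
            raise OverflowError("historical ID range exhausted")
        new_id = self._next
        self._next += 1
        self._old_to_new[old_id] = new_id  # record both directions for backtracking
        self._new_to_old[new_id] = old_id
        return new_id

    def to_old(self, new_id):
        """Trace a new ID back to its old ID, or None if it has no history."""
        return self._new_to_old.get(new_id)

    def is_historical(self, new_id):
        """Migrated IDs stay below the boundary; IDs of new business data do not."""
        return new_id < self._limit
```

Because `to_new` returns the same value every time for a given old ID, independent subtasks (for example orders and logistics) can resolve each other's references without waiting for one another, and a re-run after an interruption does not create duplicates.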
The fifth step: quality assurance system design
The quality assurance system exists to protect user experience, migration speed, and data consistency, and to make sure an emergency plan is ready when something goes wrong.
The status and progress of every module (task) are monitored, and an automatic repair mechanism handles problems to a certain extent. When automatic repair cannot handle a problem, an alert is raised promptly, and we fix the problem quickly with the patch assistant.
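The "repair automatically first, alert a human second" behaviour can be sketched as a small wrapper. Everything here is illustrative: `run_with_repair`, the `repair` and `alert` callbacks, and the limit of two repair attempts are assumptions, and the patch assistant itself is outside the sketch.

```python
def run_with_repair(task, repair, alert, max_repairs=2):
    """Run task; on failure call repair and retry; alert once repairs are exhausted."""
    for attempt in range(max_repairs + 1):
        try:
            return task()
        except Exception as exc:
            if attempt == max_repairs:
                # Automatic repair could not fix it: hand over to a human.
                alert(f"automatic repair failed after {max_repairs} attempts: {exc}")
                return None
            repair(exc)


# Usage: a task that fails once and succeeds after one repair.
attempts = {"count": 0}
repairs, alerts = [], []


def flaky_task():
    attempts["count"] += 1
    if attempts["count"] < 2:
        raise RuntimeError("temporary failure")
    return "done"


result = run_with_repair(flaky_task, repairs.append, alerts.append)
```

In this run the first failure is repaired automatically and no alert is sent; an alert only fires when every repair attempt has been used up.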
Going live
Going live is the test and verification of all the preceding work, and it still calls for caution.
The testing phase is divided into two steps:
- Internal test: internal testing plus data drills.
- External test: sample different types of users and test with them as seed users.
The rollout then proceeds in stages:
- Natural stage: still relatively conservative, putting stability first and moving gradually.
- Suggested stage: with a certain degree of confidence established, a medium-sized group of users is notified that they can migrate.
- Mandatory stage: the large-scale migration stage.
- Silent stage: silent users are migrated; this is the last batch of user data.
Retrospective
At each stage of the launch, every issue must be reviewed and summarized to prevent the same problem from recurring. The reviews showed that the problems that did occur were very basic and should not have happened. For example:
Environment issue: a database address change made in the pre-release environment early on was missed when going live, so the migrated data was not up to date.
Redis issue: cache expiration times were not well controlled, so the cached data grew rapidly.
ID issue: the two sides did not communicate clearly beforehand, which led to mismatched data.
Summary
Method for establishing the data model
- Gain an in-depth understanding of the business design and data structure design on both sides
Sorting out the old-to-new mapping (point-to-point conversion)
- Work backwards from the new data structure design to the data structure design of the old system
- Field attribute mapping and relationship mapping
- Meaning of special field types
Organizing the new fields in the new structure
- The data source of each new field
- Type conversion for each new field
Sorting out the new data structure
- Preconditions for obtaining the data source
- Method of obtaining the data source
- Mapping of result attributes to the new data structure
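The point-to-point mapping described in this checklist can be written down as a table that names, for each new field, its source field in the old system and the type conversion to apply. The sketch below is a minimal illustration; all field names (`uid`, `score`, `nick_name`, and so on) are hypothetical and not taken from the real schemas.

```python
FIELD_MAP = {
    # new field: (old field, type conversion)
    "user_id": ("uid", int),
    "points": ("score", int),
    "nickname": ("nick_name", str.strip),
}


def convert_record(old, field_map=FIELD_MAP):
    """Build a new-style record from an old one; report old fields that are missing."""
    new, missing = {}, []
    for new_field, (old_field, convert) in field_map.items():
        if old_field not in old:
            missing.append(old_field)
            continue
        new[new_field] = convert(old[old_field])
    return new, missing


new_record, missing = convert_record({"uid": "42", "score": "7", "nick_name": " momo "})
```

Keeping the mapping as data rather than as scattered code makes it easy to review with the owners of both systems, which is where the mismatches tend to be caught.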
Results
More than 2 billion records were migrated.
No large-scale incidents occurred, and the failure rate was 0.1‰.
Patent output (granted): data migration method, device, electronic equipment and storage medium.
Personal growth
I grew a great deal through this project. On the technical side, architecture design means weighing the interests of all parties. There is no perfect architecture, only the most suitable one. It is not enough to focus on the architecture level alone. The details and the actual delivery need just as much attention, because the details determine success. On the project management side, my communication and coordination, sense of rhythm, and risk control also improved considerably. As project owner I was responsible for coordinating seven or eight roles and nearly 30 people. The results of this project are inseparable from the team and the constant support of every partner, and I also thank Mr. Liu Junhai for his guidance on the overall plan.