The Meituan Lodging data governance team has worked on data governance for many years, moving from passive, single-point governance to proactive, project-based governance, and then to today's systematic, automated governance. Along the way the team has kept accumulating experience, reflecting, and practicing. It has now achieved some interim results, which have been recognized by multiple Meituan business lines. We hope that sharing the experience and lessons from this process will offer some new ideas to colleagues working on data governance.
1. Preface
Drawing on many years of experience in data warehouse construction and data governance, and on the governance demands of the current stage of business development, the Meituan Lodging data governance team has gradually shifted its approach from project-based, symptom-oriented, problem-driven governance to automated and systematic governance. This approach has been put into practice along three directions: standardization, digitization, and systematization.
2. Background introduction
The Meituan lodging business has been developing since its launch in 2014. It has gone through an exploration period, an offensive period, and a development period, and is now gradually moving from the development period into a transformation period. The business has shifted from rapid expansion to relatively stable growth, and its operating model has shifted to refined operations. At the same time, the requirements for the cost, efficiency, security, and value of data keep rising, all of which place new demands on data governance.
On the other hand, the data center to which the lodging data team belongs serves multiple business lines, such as lodging and tickets and vacations. Each business line has a different business model, is at a different stage of its life cycle, and has a different level of understanding and experience in data governance. Reusing governance experience and capabilities efficiently, so that every business line in the data center can steadily improve the efficiency and effect of its governance and avoid known pitfalls, requires data governance that is more standardized, systematic, and automated.
We had already built up some experience in data governance. In the previous stage we moved from single-point, passive governance to proactive, project-based governance. Governance actions became deliberate, planned, and targeted, and achieved some good results (for the governance experience of that stage, see the article on the data governance practice of Meituan Hotel & Travel). Overall, however, that governance was still problem-driven and based on personal experience. Faced with new governance responsibilities and requirements, the old approach showed a number of problems, mainly in the following areas.
Large differences in how governance is understood
- Inconsistent understanding and approach: There are no general, systematic guidelines for governance. Different people differ widely in how deeply they understand data governance, how they break problems down, the approach and steps they follow, the methods they adopt, and how they track results.
- Repeated governance and missing information: Governance is incomplete, governance experience is not recorded, and different people carry out the same governance repeatedly.
- Overlapping scope, unclear boundaries, and effects that are hard to evaluate: Different people set up different special projects for different problems, while the underlying logic of those problems overlaps. Some governance projects did little yet reported good results, and for others the results were unclear.
Governance is not standardized
- Lack of process specifications: There is no theoretical guidance for governing each direction and each type of problem, so the methods, actions, processes, and steps depend on the experience and judgment of the person doing the governance.
- Problems are hard to track and quantify: There are no measurement standards for governance problems, which are mostly judged by people, and there is no evaluation system for governance effect.
- Solutions are hard to implement: Solutions live in documents that the person doing the governance has to find and understand. There is no tool support, and the cost is high.
Low governance efficiency and poor effect
- Little governance is done online: The asset information and governance actions that governance relies on are scattered across multiple systems, so information is fragmented and execution is inefficient.
- The process cannot be standardized and results are not guaranteed: The governance process relies on "manual guarantees" from the person doing the governance, which leads to deviations in understanding and execution.
Data governance is not systematic
- Lack of an overall top-level design for governance: The business and the data center need governance that is more comprehensive, refined, and effective. This calls for systematizing governance, thinking from a macro perspective, breaking problems down layer by layer, and designing the scheme from the top down.
- Problems are increasingly complex and hard to solve one at a time: In the past, problems were mostly tackled at the level of symptoms. The measurement indicators improved on the surface, but this was a case of "treating the head for a headache and the foot for foot pain", and the underlying problem was never solved. In other cases several problems share the same root cause. For example, a shortage of query resources may really be caused by insufficient construction or operation of the analysis subject models.
- The priority of different problems cannot be determined: There are no standards or methods for measuring the priority of different problems, so prioritization relies mainly on human judgment.
- Governance does not follow the MECE principle: Systematic governance needs to consider which problems make up each governance direction, which of them matter most, which have the highest ROI, which problems and governance actions can be merged, and how the measurement standards and governance methods for the same problem should differ across the subject areas and layers of the data warehouse.
3. Thinking about the System of Governance
From the background above, it is clear that we face different requirements and challenges for data construction and governance at different stages of the business life cycle. At the same time, the largely passive, problem-driven, project-based governance methods of the past have fallen behind, which directly makes it hard for the technical team to meet the business side's requirements for financial and business support.
Through continuous learning and reflection, we came to realize that data governance is a complex and comprehensive undertaking. Only by building a standard business data governance system can we ensure that governance is carried out effectively in every link: status assessment, goal setting, process and standard construction, governance monitoring and management, capability building, execution efficiency, and effect evaluation. The following sections describe our understanding of governance systematization.
3.1 What is data governance systematization?
For data management and governance, we want to build a set of core capabilities, including a management system, a method system, an evaluation system, a standard system, and a tool system, that continuously serve the implementation of data governance. An analogy is a typical e-commerce company: to operate well and serve its customers, it must first build a sales system, a product system, a supply system, a logistics system, a staffing system, and so on. Only when these systems work together can it achieve the goal of serving users well.
3.2 How does the systematization of data governance solve the current governance problems?
- In terms of method: First design the top-level governance framework. From the perspective of the team as a whole, define and plan the necessary parts of governance, such as scope, personnel, responsibilities, goals, methods, and tools, and only then implement it. Focus on the generality and effectiveness of the overall strategy rather than diving straight into a solution for one specific problem.
- In terms of technical means: Based on sound technical R&D specifications, and with metadata and the indicator system at the core, comprehensively evaluate and monitor business data warehouses and data applications, and provide supporting governance tools that help engineers implement governance strategies. This addresses the low governance efficiency of data developers.
- In terms of operation strategy: Evaluate the scope of impact and the benefit of each governance problem to determine its importance, and drive the resolution of problems of different importance from two perspectives: that of managers and that of the person responsible for the problem.
3.3 How to build a business data governance system framework?
Our approach is to take the team's data governance goals as the core orientation, design the combination of capabilities needed to achieve those goals, keep iterating and improving based on organizational requirements and the feedback gathered during implementation, and ultimately realize the vision of data governance.
The system framework mainly includes the following contents:
- Management layer: Legislate. Formulate the organizational process specifications, division of responsibilities, and reward and punishment measures that guide and safeguard data governance. These are the key factors for data governance to start and run successfully.
- Standard layer: Set standards. Formulate the technical specifications and solutions required in the governance process, such as R&D standards and specifications and standard solution SOPs. These are the basis for judging whether any technical work is correct, and they are an essential part of the before-the-event solutions in governance. Well-designed standards that are actually followed can greatly reduce the occurrence of data failures.
- Capability layer: Build capabilities. These are mainly the digitization capability to measure problems based on metadata, and the systematization capability to detect and resolve problems with tools. Both are important guarantees for the rigor, quality, and efficiency of governance.
- Execution layer: Set actions. For the specific goals to be achieved, solve the problems in each governance domain following the idea of before-the-event constraints, in-process monitoring, and after-the-event governance. Achieving a goal requires breaking it down into specific problems in the seven governance domains, so the achievement of a governance goal depends on how comprehensively and deeply the problems in each domain are described.
- Evaluation layer: Give evaluations. Monitor problems through indicators, a health evaluation system, and special evaluation reports, and evaluate the benefits and effects of governance. These are an important lever for monitoring the governance process and checking its results.
- Vision: The long-term governance goal, which guides data governance to keep moving in a consistent direction toward the ultimate goal.
System framework construction results: The business data governance system framework is the top-level design for the overall data governance work. It defines what business-line data governance should do, how to do it, what tools to use, and what goals to achieve. It aligns everyone's understanding of business data governance, standardizes the governance path, methods, and components, and guides data governance so that it is carried out in an orderly and effective way.
3.4 How is the system framework implemented?
Based on the characteristics of each component of the business-line data governance system framework, we implement the framework through standardization, digitization, and systematization, and ultimately obtain quantifiable results.
4. Systematic Practice of Governance
4.1 Standardization
Standardization is a key breakthrough point and an important means for an enterprise to manage its data assets. Policies, regulations, and plans have to be turned into standards and rules before they can be implemented effectively. Standardizing data governance helps establish and improve data management mechanisms and business processes, and it also helps improve data quality, ensure data security and compliance, and release the value of data. In the process of standardizing data governance, however, we often face the following three problems:
- Lack of process specifications: Individual links lack standards and constraints to guide standardized operation, so problems cannot be effectively prevented or resolved.
- Poor conditions for adoption: Standards, SOPs, and similar documents have nothing to enforce them. They rely on people's willingness to follow them, so they are not implemented effectively and the effect is poor.
- Unreasonable construction method: Specifications are built case by case without a systematic approach, so they are "constantly being built yet always lacking".
To address these three problems, we divide the data development process from a problem-solving perspective and identify the missing process specifications along the lines of before-the-event constraints, in-process monitoring, and after-the-event analysis and evaluation, so that standard process specifications cover every link of data governance. We also build system tools to ensure that the standards and specifications are actually followed. The following sections describe how we solve these problems from two angles: specification construction and tool guarantee.
4.1.1 Specification construction
Specifications are the foundation on which the rules of data governance are built. To address the unreasonable construction method and the lack of process specifications, we take a systematic approach: we divide the data development process and the data governance process as a whole, and establish corresponding specifications for each link of the entire process:
- Data governance management specifications: Clarify the responsibilities of the data governance organization and its staffing, and define the governance implementation process and the operation and maintenance process for governance problems, so that data governance proceeds smoothly.
- Data R&D specifications: Clarify the specifications to be followed in every link of data development. Starting from the source of problems, a complete set of R&D specifications guides development work to follow the standards, which reduces the occurrence of problems to some extent.
- Data standardization governance SOPs: Clarify the governance actions for each governance problem, so that the actions are standard and can be carried out.
- Data health evaluation specification: Clarify the criteria for evaluating governance effect, so that the data system can be measured with indicators in a long-term and stable way.
4.1.2 Tool Guarantee
Standard Specification Visualization - Knowledge Center
When it came to sharing standards and specifications, the technical team used to run into the following problems when putting specifications into practice:
- Specifications cannot be found: Important specification documents are scattered across various Wiki spaces, so they cannot be found quickly when needed, which is inefficient.
- Poor specification quality: Documents are not maintained in a unified way, are not continuously iterated and improved, and are not updated as the business and technology evolve.
- No permission to read specifications: Documents are scattered in members' private spaces without access granted to everyone, so high-quality content cannot be shared in time.
To address these problems, we collected and classified the existing specification documents, filled in the missing ones, and improved their content. We then added a Knowledge Center module that turns the knowledge system framework into a product, with a unified entry point and permission management at the product level and a strictly controlled release process. This solves the "cannot find", "poor quality", and "no permission" problems that arise when standards and specifications are put into practice.
Tooling of Test Specifications - Bagua Furnace
Data testing specifications used to be maintained in the Wiki, which could not constrain the actual execution process. The result was poor data quality and frequent data failures. To reduce data failures caused by non-standard testing during data development, and to improve data quality and business satisfaction, we use the ETL testing tool jointly built by the data center and the data platform tool group (Bagua Furnace, a Meituan internal tool) to enforce the test specification SOP. It requires everyone to test thoroughly without hurting testing efficiency, which provides before-the-event constraints for data governance, reduces the number of problems after the event, and safeguards data quality. The tool construction is shown in the following figure:
Governance efficiency improvement tool - SOP automation tool
In daily data development, data engineers take on part of the data governance work. In the past they governed problems by executing each step of the data governance SOP, but they often faced the following problems:
- Low governance efficiency: The governance actions described in the SOP have to be performed on each relevant platform. For SOPs with more complicated steps, engineers have to jump between multiple platforms, so governance is inefficient.
- The governance process cannot be constrained: Governance experience is tacit and hard to put into words, so it cannot constrain how data engineers execute, and some problems are governed incompletely.
To address these problems, we developed a governance efficiency tool, the SOP automation tool. It aggregates the governance tools of multiple platforms and implements each execution step of the data standardization governance SOP as a tool function, providing one-stop governance in a single tool. It constrains engineers' governance actions, ensures that the whole governance process is standard and its effect can be monitored, and thereby improves governance efficiency and quality.
For example, in the governance of invalid tasks, we first investigate how the problem is governed and record that experience in an SOP document, and then configure each execution step of the SOP document in the automation tool in order. During governance, data engineers only need to carry out all the governance actions in one interface. The following figure shows the SOP for invalid task governance and Meituan's automation tool:
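The core idea of such a tool, SOP steps configured in order and executed in one place so that none can be skipped, can be illustrated with a minimal sketch. This is not the Meituan tool: the `SopStep` and `SopRun` classes, the three step functions, and the context keys are all hypothetical, and in a real setting each step would call the scheduling, lineage, or storage platform.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SopStep:
    """One step of a governance SOP (hypothetical structure)."""
    name: str
    action: Callable[[Dict], bool]  # returns True when the step succeeded


@dataclass
class SopRun:
    """Executes the configured steps strictly in order and logs each one."""
    steps: List[SopStep]
    log: List[str] = field(default_factory=list)

    def execute(self, context: Dict) -> bool:
        for step in self.steps:
            ok = step.action(context)
            self.log.append(f"{step.name}: {'done' if ok else 'blocked'}")
            if not ok:
                # Stop here: later steps never run, so no step can be skipped.
                return False
        return True


# Hypothetical steps for governing an invalid task.
def confirm_no_downstream(ctx: Dict) -> bool:
    return ctx["downstream_count"] == 0


def take_task_offline(ctx: Dict) -> bool:
    ctx["task_online"] = False
    return True


def archive_output_table(ctx: Dict) -> bool:
    ctx["table_archived"] = True
    return True


invalid_task_sop = [
    SopStep("confirm no downstream consumers", confirm_no_downstream),
    SopStep("take task offline", take_task_offline),
    SopStep("archive output table", archive_output_table),
]

run = SopRun(invalid_task_sop)
finished = run.execute({"downstream_count": 0, "task_online": True})
```

The execution log doubles as a record for monitoring the governance effect, since it shows exactly which step each run reached.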
4.1.3 Standardization benefits and construction experience
Through the standardization of data governance, we solved several of the team's problems with governance specifications and achieved clear results:
- Data development and data governance have been standardized, which resolves the inconsistency of processes, methods, and standards for development, management, and operation and maintenance across the groups in the team.
- Standardized test specifications are enforced through the testing tool, which blocks problems before the event, improves data quality, and reduces failures.
- The SOP automation tool effectively guarantees a standardized governance process and addresses the problem of poor governance effect.
At the same time, in the actual construction process, we also summarized some standardized construction experience:
- How a specification will be put into practice should be part of building the standard process specification itself, preferably with concrete deliverables.
- Beyond the usual content, the formulation of standards and specifications needs to take into account organizational goals, organizational characteristics, existing tools, historical conditions, and user feedback. Otherwise the result will feel detached from reality.
- When formulating standards and specifications, priority should be given to using and adapting to the capabilities of existing tools, rather than requiring the tools to be adapted to the process specifications.
4.2 Digitization
In the past, data governance work relied mainly on experience and judgment. It lacked a scientific, quantifiable lever: we could not accurately perceive how severe a governance problem was, nor accurately assess the benefit that governance brought. We therefore carried out digitization work, describing the team's data development work with data and building an accurate picture of the entire data development process.
4.2.1 Digital Architecture Design Scheme
Construction idea: By analogy with how business objects are abstracted and described when building a business data warehouse, we abstract and describe metadata objects for each link of the data life cycle, and build a metadata warehouse and a governance indicator system for use in data governance scenarios.
The framework mainly includes the metadata warehouse, the indicator system, data asset levels, and various data applications built on the metadata warehouse. It uses metadata to drive data governance and daily team management, so that we avoid relying too heavily on experience to solve problems and can serve the business better. The following sections introduce the core data content of the digitization framework: the metadata warehouse, the indicator system, and data asset levels.
4.2.2 Metadata warehouse construction
Metadata is data that describes data, including data asset types, data storage size, data lineage, the data production process, and other information. This information comes in many types, is scattered across systems, and is often incomplete. Rich metadata helps us quickly understand the team's data assets and makes those assets more accurate and transparent, which supports the use of data and the release of its value.
We build the metadata warehouse following three ideas: data businessization, business digitization, and digital application.
- Data businessization: Describe the daily data development work of data engineers in business terms, abstracting it into multiple business processes such as requirement proposal, task development, data table output, data application, and requirement delivery.
- Business digitization: Using the ideas and methods of building a business data warehouse, build a metadata warehouse and an indicator measurement system for each business process and subject identified through data businessization, and improve ease of use and richness by applying metadata in real scenarios.
- Digital application: Develop data products on top of the metadata warehouse to drive the implementation of data governance.
Through data businessization, we abstract three major subject domains, the business domain, the management domain, and the technical domain, to describe the objects of the metadata warehouse, and subdivide each subject domain into multiple subjects:
- Business metadata: Metadata based on specific business logic. Common business metadata includes business definitions, business terms, business rules, and business indicators.
- Technical metadata: Data that describes the development, management, and maintenance of the data warehouse, including data source information, data warehouse models, data cleaning and update rules, data mappings, and access permissions. It is mainly used by the engineers who develop and manage the data warehouse.
- Management metadata: Data that describes concepts, relationships, and rules in the management field, including management processes, personnel organization, roles and responsibilities, and other information.
For the layering of the metadata warehouse, we use the most common four-layer structure: the source layer, the detail layer, the summary layer, and the application layer, plus dimension information. Unlike the layering of a business data warehouse, data is organized according to dimensional modeling from the detail layer onward. To avoid over-design, the detail layer only needs to get subject division and decoupling right. In the summary layer, data is joined according to analysis habits to improve ease of use. The application layer creates the interfaces needed to support applications on demand.
So far we have built part of the content in the technical, management, and business domains of the metadata warehouse, and it already supports the indicator system and multiple data applications on top of it. Further additions and improvements are still needed.
4.2.2 Construction of Indicator System
A problem has to be measured from several angles. A single indicator cannot fully describe it, so a set of logically related data indicators is needed. In data development, multiple indicators need to be developed to monitor and measure the problems of the data development team in terms of quality, security, efficiency, and cost.
Previously, the lodging data team had no mature, stable indicator system and could not measure the team's business support and technical capabilities accurately over the long term. In 2020, we built a data governance indicator system on top of the metadata warehouse. It comprehensively measures the various problems in the construction of the business data warehouse, and we use it to monitor the strengths and weaknesses of our work and to improve our ability to support the business.
Construction plan
The goal of the indicator system is to monitor the team's working status and trends, so it needs to cover every aspect of the work. When building it, we therefore classify indicators from several perspectives. The aim is not to cover more for its own sake, but to make the indicators suitable for different usage scenarios:
- Life cycle perspective: Starting from the data itself, measure each stage of data from production to destruction, including definition, access, processing, storage, use, and destruction.
- Team management goal perspective: Classify by the core goals of team management, including quality, efficiency, cost, security, ease of use, and value.
- Problem object perspective: Classify by the core concerns of governance problems, including security, resources, services, architecture, efficiency, value, and quality.
Construction results
So far we have built a total of 112 indicators in three categories, technology, demand, and failure, covering all aspects of data development:
- Technical indicators: 57 indicators covering 5 aspects: cost, quality, security, value, and ease of use.
- Demand indicators: 36 indicators covering 7 aspects, including new demands, response, development, launch, and acceptance.
- Fault indicators: 19 indicators covering fault discovery, cause location, and handling.
Application of metadata and indicator system:
- Team management: Help team managers quickly understand the team's situation and improve management efficiency.
- Data governance: Use metadata and the indicator system to drive data governance and provide a quantifiable lever for it.
- Project evaluation: Help project members accurately evaluate project problems, progress, and benefits.
Construction thinking
While building the indicators, we accumulated the following experience:
- The indicator system should not only give managers a handle on daily work, but also serve as a governance lever for the people who handle specific problems. It has to serve both managers and developers.
- The indicator system presents the overall picture, but indicators must also be used to solve practical problems. The indicator system and the data governance tools should form a closed loop: a continuous cycle of discovering problems, governing them, and measuring the results.
- Determine the team's overall development goals first and derive the indicators by breaking those goals down. The indicators should cover, as far as possible, the different development stages of different business lines.
- Each business needs to be clear about the stage it is in, and set assessment goals and measurement thresholds for each stage. This both unifies the measurement standards and balances the assessment standards applied to everyone.
- Indicators should be built in layers to avoid lumping everything together. Layering makes it easier to fit the current organizational structure and to divide responsibilities.
- Once the basic indicator system is in place, it can serve as a lever for daily management and work, as a basis for initiating projects, and as a means of evaluating project results.
4.2.3 Asset Grade Construction
As the business has grown rapidly, the scale of the data assets that the team is responsible for has also expanded. The team is currently responsible for 3000+ offline Hive tables and 2000+ ETL production tasks, with each person responsible for 100+ ETL production tasks. Faced with data assets at this scale, team managers and data engineers often run into the following problems:
- Which assets are core can only be judged from experience, and the priority of solving problems cannot be assessed.
- Safeguards for core links, such as the scope of SLA and DQC configuration, lack a scientific evaluation method.
- Managers lack an accurate picture of the team's core assets and cannot take management actions accurately and effectively.
To enrich the relationships and content of metadata, to mine and identify more valuable data information, and to use metadata to drive the daily work of data R&D and operation and maintenance, we built a derived capability on top of the metadata warehouse: asset levels. Asset levels provide a scientific and effective way to evaluate the importance of data. They also help refine the tiered monitoring scheme for data quality, so that key tasks get key safeguards.
The following figure shows the general calculation process for data asset levels. We first identify the impact factors and their weights according to the asset type and rank the importance of the impact factors. We then define score intervals over the value range of each impact factor. Finally, we aggregate the results to obtain the asset level score and the asset level, and verify the accuracy of the results by sampling.
Asset Grade Construction (Data Table)
The following figure is the method and flow chart for the construction of the asset level of the data table:
1) Determine the impact factors and evaluate their weights
Determining the impact factors is the most critical part of the asset grade calculation, and a reasonable assessment of them is crucial to the accuracy of the final asset grade. Based on experience in actual data development, the following key factors affect the importance of a data table:
- Downstream type: determines the importance of downstream assets. There are generally two types of downstream assets, ETL tasks and data products, and each is divided into ordinary and VIP according to its importance.
- Number of downstream assets: determines whether the table is a key node and how far its impact on downstream production reaches. The more downstream assets, the larger the scope of impact.
- Usage popularity: determines whether the table is actually used and how many query users it affects. The higher the popularity, the wider the range of users affected.
- Link depth and layering: determines how long a problem takes to repair. The deeper the link, the longer the repair may take.
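Two of these factors, the number of downstream assets and the link depth, can be derived from table lineage. The following is a minimal sketch, assuming lineage is available as a mapping from each table to its direct downstream tables. The table names, the `LINEAGE` structure, and the choice to count indirect as well as direct downstream tables are illustrative assumptions, not the team's actual implementation.

```python
from collections import deque

# Hypothetical lineage: each table maps to the tables that read from it directly.
LINEAGE = {
    "ods_order": ["dwd_order"],
    "dwd_order": ["dws_order_day", "dws_user_day"],
    "dws_order_day": ["app_order_report"],
    "dws_user_day": [],
    "app_order_report": [],
}


def downstream_tables(lineage, table):
    """Return every table reachable downstream of `table` (direct or indirect)."""
    seen = set()
    queue = deque(lineage.get(table, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        queue.extend(lineage.get(current, []))
    return seen


def downstream_depth(lineage, table):
    """Return the length of the longest downstream chain starting at `table`.

    Assumes the lineage graph is acyclic, as an ETL dependency graph normally is.
    """
    children = lineage.get(table, [])
    if not children:
        return 0
    return 1 + max(downstream_depth(lineage, child) for child in children)


if __name__ == "__main__":
    for name in LINEAGE:
        count = len(downstream_tables(LINEAGE, name))
        depth = downstream_depth(LINEAGE, name)
        print(f"{name}: downstream={count}, depth={depth}")
```

Both values are raw measurements; they still have to be mapped to score intervals before they can be combined with the other factors.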
After determining the impact factors, we need to determine the weight of each one. We use the analytic hierarchy process (AHP) to calculate the weights. AHP is mainly used for decision-making problems that involve uncertainty and multiple evaluation criteria; for the specific calculation steps, refer to the relevant literature. Its advantage is that it treats the research object as a system and makes decisions by decomposition, comparison, judgment, and synthesis, and its calculation process is simple and practical.
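To make the weighting step concrete, here is a minimal sketch of an AHP weight calculation. It uses the row geometric mean approximation rather than an exact eigenvector computation, and it checks the judgments with Saaty's consistency ratio. The pairwise comparison values in `PAIRWISE` are invented for illustration; the article does not publish the team's actual judgments or weights.

```python
import math

# Saaty's random consistency index for comparison matrices of size 1..9.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

FACTORS = ["downstream_type", "downstream_count", "popularity", "link_depth"]

# Hypothetical judgments: PAIRWISE[i][j] says how much more important
# factor i is than factor j on Saaty's 1-9 scale; PAIRWISE[j][i] is the reciprocal.
PAIRWISE = [
    [1,     2,     3,     4],
    [1 / 2, 1,     2,     3],
    [1 / 3, 1 / 2, 1,     2],
    [1 / 4, 1 / 3, 1 / 2, 1],
]


def ahp_weights(matrix):
    """Approximate the AHP priority vector with normalized row geometric means."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Return the consistency ratio (CR); values below 0.1 are usually accepted."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    weighted = [sum(a * w for a, w in zip(row, weights)) for row in matrix]
    lambda_max = sum(v / w for v, w in zip(weighted, weights)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


if __name__ == "__main__":
    weights = ahp_weights(PAIRWISE)
    for factor, weight in zip(FACTORS, weights):
        print(f"{factor}: {weight:.3f}")
    print(f"consistency ratio: {consistency_ratio(PAIRWISE, weights):.3f}")
```

If the consistency ratio is 0.1 or higher, the pairwise judgments contradict each other too much and should be revisited before the weights are used.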
2) Calculate the asset grade score
Based on the actual situation, the value range of each impact factor is divided into score intervals. Combined with the weight of each impact factor, the final asset grade score can then be calculated: the total score is the sum of the products of each impact factor's score and its weight.
3) Asset level mapping
We map the final asset grade score to levels L1 to L5, where L5 is the highest asset grade and L1 is the lowest.
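Steps 2) and 3) can be sketched as follows. All numbers here are assumptions made for illustration: the weights, the score intervals for each factor, the 0/1/2 encoding of downstream type, and the level thresholds are not taken from the article.

```python
from bisect import bisect_right

# Hypothetical weights; in practice they would come from the AHP step.
WEIGHTS = {
    "downstream_type": 0.4,
    "downstream_count": 0.3,
    "popularity": 0.2,
    "link_depth": 0.1,
}

# Hypothetical score intervals per factor: (inclusive lower bounds, scores).
# A raw value that reaches the i-th bound but not the next one gets scores[i].
FACTOR_BANDS = {
    "downstream_type": ([1, 2], [0, 50, 100]),            # 0 none, 1 ordinary, 2 VIP
    "downstream_count": ([1, 10, 50], [0, 40, 70, 100]),
    "popularity": ([10, 100, 1000], [0, 40, 70, 100]),    # e.g. recent query count
    "link_depth": ([2, 4, 6], [0, 40, 70, 100]),
}

# Hypothetical thresholds separating L1 | L2 | L3 | L4 | L5 on a 0-100 scale.
LEVEL_BOUNDS = [20, 40, 60, 80]


def factor_score(factor, value):
    """Map a raw factor value to the score of the interval it falls into."""
    bounds, scores = FACTOR_BANDS[factor]
    return scores[bisect_right(bounds, value)]


def total_score(raw_values):
    """Sum of each factor's interval score multiplied by its weight."""
    return sum(WEIGHTS[f] * factor_score(f, raw_values[f]) for f in WEIGHTS)


def asset_level(score):
    """Map a total score to a level from L1 (lowest) to L5 (highest)."""
    return f"L{bisect_right(LEVEL_BOUNDS, score) + 1}"


if __name__ == "__main__":
    table = {"downstream_type": 1, "downstream_count": 12,
             "popularity": 150, "link_depth": 3}
    score = total_score(table)
    print(f"score={score:.1f}, level={asset_level(score)}")
```

Keeping the intervals and thresholds in plain data structures makes them easy to adjust after the sampling check described above.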
Asset Grade Scenarios (Data Table)
At present, asset grades are applied in daily governance work and provide a powerful tool for graded data governance.
4.3 Systematization
4.3.1 Data 100 - Governance Center
Even with standardization and digitization in place, implementing our data governance system still faced many problems:
- Data assets cannot be counted or described. Managers and data engineers do not know what they have, and assets are not visualized.
- Managers lack a handle for spotting team problems, and problems are difficult to track down.
- Little of the governance work is done online. Engineers have to jump between multiple tools, governance efficiency is low, the governance process cannot be standardized, and results cannot be guaranteed.
In response, we built the Data 100 Governance Center (an internal Meituan product), a one-stop, full-coverage data governance platform that integrates asset management, problem analysis and monitoring, automated governance, process tracking, and result evaluation. The platform improves the quality and efficiency of governance and provides strong support for improving data quality. Following the concept of "management + governance", it monitors data, human efficiency, and other issues from the perspectives of both managers and R&D engineers, through three modules: asset panorama, management center, and governance center:
Asset Panorama
From the perspective of both managers and data RDs, the asset panorama presents the current state of data assets. It helps business-line managers and data RDs visualize data assets, gives managers a handle for technical management, and improves data RDs' efficiency in exploring and using data. It contains three sub-modules: asset market, asset catalog, and personal assets:
- Asset Market: from the perspective of business-line managers, shows an overview of the various assets in the business line, so that managers can quickly understand the group's data assets in one place without jumping between multiple platforms.
- Asset Catalog: displays the asset types and details of the team's data, supports data RDs with the information they need to use data, and improves the efficiency of data exploration.
- Personal Assets: from the perspective of the owner, displays the number of data assets, the asset types, and the data details owned by individual data RDs and by groups, describing personal asset information in detail.
Management Center
Data team managers often face two problems in day-to-day team management:
- Management relies heavily on experience and judgment. As the team's demand and headcount grow, management becomes harder, and managers lack a handle for quickly seeing the team's overall situation.
- Management actions take days. When a manager notices that a core team indicator is abnormal (for example, the number of failures), they have to find and ask the person in charge, and cannot quickly trace the anomaly and its cause from the system.
The management center addresses the question of how to manage from the manager's perspective. Built around the core indicators that managers care about, it gives them the ability to monitor team status, identify team problems, and support management decisions, moving managers from "experience-dependent management" to "data-driven management". It includes four modules: manager's market, demand management, fault management, and team operation:
- Manager's Market: provides an overview of the team's core indicators, problem trend analysis, abnormal detail tracking, and abnormal cause marking, so that managers can quickly understand the team's situation and act in time.
- Demand Management: provides detailed human efficiency analysis and demand management functions to support human efficiency management and improve efficiency.
- Fault Management: provides a detailed fault analysis dashboard and fault review management to improve the efficiency of fault management.
- Team Operation: provides the capabilities needed to run the team, such as monthly reports, on-call duty, and satisfaction questionnaires, to improve operational efficiency.
Governance Center
In the daily data governance process, the person responsible for a problem mainly faces the following pain points:
- They do not understand the context, goals, or importance of the problems assigned to them. Governance work becomes a task that is carried out blindly. Even when the governance action is completed, there is no guarantee that the governance goal has actually been achieved, and the results are especially poor when several types of governance problems are handled at the same time.
- Resolving a governance problem usually requires several tools working together. As the number of problems grows, governance turns into repeatedly switching between different tools, which seriously hurts efficiency and results.
From the perspective of the person responsible for the problem, the governance center addresses the question of how to resolve it. It gives front-line governance engineers one-stop governance capabilities, from problem evaluation and analysis, to governance, to progress monitoring. Governance work becomes refined and routine, which improves the quality and efficiency of data warehouse governance. It includes four modules: governance overview, analysis and evaluation, problem governance, and progress monitoring.
- Governance Overview: the home page of the governance center. It introduces the framework of the team's data governance system and the standardized governance results, so that users share the governance concepts of the governance center, and it provides excellent solutions for data governance.
- Analysis and Evaluation: quantitatively evaluates seven types of governance problems and provides governance priorities and problem rankings, so that users know what to do first.
- Problem Governance: provides rich governance indicators that measure governance problems comprehensively, sends timely notifications when problems are distributed, and uses SOP automation tools to standardize the problem-solving process, ensuring governance results and improving governance efficiency.
- Progress Monitoring: provides a governance progress board and monitoring of problem distribution progress, so that managers can control governance progress at a macro level and plan the distribution rhythm sensibly.
4.3.2 SOP Automation Tools
In the daily data governance process, each team accumulates a number of SOP documents to guide problem governance and reduce the occurrence of problems. However, implementing SOPs still runs into several problems:
- SOPs generally exist as wiki pages, so their execution cannot be constrained or tracked in practice.
- Executing an SOP action requires jumping between multiple platforms, so execution efficiency is low.
Construction Plan
Based on the above problems, we developed the SOP automation tool. It is an SOP configuration tool suited to problem governance SOPs: governance actions are configured through the tool, which improves governance efficiency and thereby safeguards the quality of both the process and the results. The goal is to solve the low execution efficiency and the lack of tracking and monitoring that SOP documents suffer from in practice, and to provide one-stop problem-solving capabilities.
The SOP automation tool consists of a basic component layer, a configuration layer, and an application layer. The following is the product architecture diagram and product interface:
- Basic component layer: the smallest-granularity modules of an SOP, including display components (rich text, table, IFrame) and logic control components (single choice, multiple choice). Users combine basic components according to the SOP content.
- Configuration layer: configures the parameter information and execution steps used in the SOP.
- Application layer: displays the final SOP and provides external services through a URL interface. For example, the governance center can call the SOP tool interface to achieve one-stop governance.
The actual operation steps of an SOP are as follows:
After creating the SOP, the user selects the data information to be displayed, then drags in basic components one by one according to the SOP's execution steps and fills in the operation to execute for each. This completes the SOP configuration, and the configured SOP is then available through a URL. The automation tool mainly provides external services by being embedded in other systems.
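The three layers described above can be pictured with a small data model. This is only a sketch under stated assumptions: the class names, field names, component kind strings, and the base URL `https://sop.example.com/embed` are hypothetical and do not describe the real tool's schema or interface.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode

# Basic component layer: the component kinds named in the article.
DISPLAY_KINDS = {"rich_text", "table", "iframe"}
CONTROL_KINDS = {"single_choice", "multi_choice"}


@dataclass
class Component:
    kind: str
    label: str
    options: list = field(default_factory=list)

    def __post_init__(self):
        if self.kind not in DISPLAY_KINDS | CONTROL_KINDS:
            raise ValueError(f"unknown component kind: {self.kind}")
        if self.kind in CONTROL_KINDS and not self.options:
            raise ValueError("choice components need at least one option")


# Configuration layer: parameters plus ordered execution steps.
@dataclass
class Step:
    title: str
    components: list
    action: str  # description of the operation to execute in this step


@dataclass
class Sop:
    sop_id: str
    name: str
    params: list
    steps: list = field(default_factory=list)

    # Application layer: the configured SOP is reached through a URL
    # that another system can embed.
    def embed_url(self, base_url, **values):
        missing = [p for p in self.params if p not in values]
        if missing:
            raise ValueError(f"missing parameters: {missing}")
        query = urlencode({"sop_id": self.sop_id, **values})
        return f"{base_url}?{query}"


if __name__ == "__main__":
    sop = Sop(
        sop_id="invalid_storage",
        name="Clean up invalid storage",
        params=["table_name"],
        steps=[
            Step(
                title="Confirm the table is unused",
                components=[
                    Component("table", "Recent access records"),
                    Component("single_choice", "Still needed?", ["yes", "no"]),
                ],
                action="Mark the table for deletion if it is not needed",
            )
        ],
    )
    print(sop.embed_url("https://sop.example.com/embed", table_name="dw.tmp_orders"))
```

Validating component kinds at construction time mirrors the idea that an SOP is assembled only from the predefined basic components.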
Application scenarios
With the SOP automation tool, the problem-solving process in data governance has moved online and its steps have been standardized, which safeguards governance results and improves governance efficiency. The following figure compares the governance process for the invalid storage indicator before and after adopting the SOP automation tool. Previously, engineers had to confirm some information manually and jump between multiple platforms to operate. Now all actions can be completed in a single interface, which greatly reduces the workload of R&D engineers.
At present, our team has built governance SOPs for more than 30 indicators in 7 governance areas, all implemented through the automation tool. In the future, we will continue to explore other special governance topics and use the SOP automation tool to assist data governance.
4.3.3 Experience Summary
From the systematic construction of data governance, we have drawn the following lessons:
- Systematization is an effective way to move problem solving from offline to online and from scattered actions to coherent ones.
- There is no perfect system, and there is no need to pursue perfection. Consider the input-output ratio, solve the main problem quickly, and apply the system to concrete problems.
- Product positioning and long-term planning of product capabilities are particularly important. Without them, it is easy to end up building as you go, not knowing what to build next or in which direction to develop.
5. Business Data Governance Implementation Process
The data governance implementation process is a general, standard process that we summarized and abstracted while solving specific data problems under the business data governance standardization framework. It applies to most governance scenarios. The benefit of a standard process is that it standardizes how data governance engineers operate and thereby safeguards the quality of implementation. The process consists of 5 steps:
- STEP 1: Identify problems and set goals. To find problems, start from the perspective of the business data development team: focus on serving the business, follow the data development specifications, collect user feedback, and discover and gather as many relevant problems as possible. The goals that are set must be achievable.
- STEP 2: Break down the problem and design measurable indicators, implemented through the collection and construction of metadata. The indicators further quantify the goal and serve as the handle for monitoring and governance during implementation.
- STEP 3: For each measured problem, formulate a solution SOP, check whether the corresponding R&D standards and specifications are complete, and build or improve tool-based problem-solving capabilities that cover the stages before, during, and after the problem occurs.
- STEP 4: Promote and operate. Take results as the core goal, use different strategies for different roles, pay attention to whether the problem-solving process conflicts with users' interests, control the rhythm, and solve problems in a planned way according to their importance.
- STEP 5: Summarize and consolidate the methodology, iterate on our understanding, keep exploring the optimal solution to each problem, and optimize the governance plan and capabilities.
6. Summary and Outlook
After continuous thinking and practice in the systematic construction of data governance, our systematic framework is basically in place. We have made great progress in the three directions of standardization, digitization, and systematization, and we have achieved some results in business applications. More importantly, we have helped the business solve practical problems in data cost, security, efficiency, and other areas, and the results, especially in cost, have been affirmed.
Compared with the "ideal end state", however, our work still has a long way to go. The blood vessels, bones, and internal organs in the huge "body" of the data governance system still need to be filled in. In building the process specifications, the metadata warehouse, the indicator system, and the asset levels, many steps still rely on expert experience and human judgment and are chained together by manual operations. Next, we will invest in intelligence (such as intelligent metadata services and intelligent data standard construction) and automation (such as building governance application scenarios online on top of the governance framework).
7. About the author
Wang Lei, Youwei, Wei Bin, etc. are all from the Data Science and Platform Department of Meituan.
This article is produced by the Meituan technical team, and the copyright belongs to Meituan. You are welcome to reprint or use the content of this article for non-commercial purposes such as sharing and communication; please indicate "The content is reproduced from the Meituan technical team". This article may not be reproduced or used commercially without permission. For any commercial activities, please send an email to tech@meituan.com to apply for authorization.