On September 7th, the fourth session of the ONES R&D Management Master Class officially launched. Zhang Le, a senior technical expert in DevOps and R&D efficiency at Tencent, shared "Industry Trends and the Five Improvements in R&D Performance Measurement". With the "Five Improvements" of performance measurement, we can analyze the bottlenecks in R&D team performance and help enterprises improve R&D performance better and faster.

The following is the core content shared by Mr. Zhang Le.

Digitizing R&D performance is a necessity of this era. How do we get started? Start with valid measurement. When we talk about performance measurement, we should think at different levels. Take a look at the "DIKW" model below:

This model helps us understand the relationship between data, information, knowledge, and wisdom. The bottom layer is "data": descriptions of the physical world, such as raw commit logs and records of requirements flowing through the process. Raw data only becomes valuable after processing and interpretation, so the second layer is "information". When one-sided information is not enough to answer a complex question, we aggregate and process multiple kinds of information, supplemented by analysis and judgment, to achieve understanding; this is the third layer, "knowledge". Our ultimate goal, however, is performance improvement, so on top of accumulated knowledge we apply expert experience to generate solutions, aggregating knowledge into the top layer, "wisdom".

To sum up in one sentence: digitization means extracting data from the physical world, refining it into information, distilling knowledge, and aggregating wisdom, ultimately improving R&D performance.

Five Flow Metrics for Value Stream Analysis

Digitizing R&D performance requires effective measurement; without it, neither improvement nor management is possible. Next, let's look at the five flow metrics of value stream analysis: flow efficiency, flow time, flow rate, flow load, and flow distribution.

  • Flow efficiency is value-added time divided by total elapsed time;
  • Flow time is the overall cycle from when a requirement is received to when it is delivered;
  • Flow rate, i.e. throughput, is how much effective value delivery is completed per unit of time;
  • Flow load is the number of work items in the delivery pipeline that are in progress in parallel but not yet completed;
  • Flow distribution measures how a team's work is allocated across different kinds of items, e.g. how much goes to business requirements versus defects or technical debt.
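The five flow metrics above can all be computed from basic work-item records. Here is a minimal sketch in Python over hypothetical work items; the item data, field names, and the 10-day window are illustrative, not part of any ONES Performance API.

```python
from collections import Counter
from datetime import date

# Hypothetical work items: received/completed dates, value-adding ("active") days,
# and a type. All values are made up for illustration.
items = [
    {"type": "feature", "received": date(2022, 9, 1), "done": date(2022, 9, 11), "active_days": 4},
    {"type": "defect",  "received": date(2022, 9, 3), "done": date(2022, 9, 8),  "active_days": 3},
    {"type": "feature", "received": date(2022, 9, 5), "done": None,              "active_days": 2},
]

done = [i for i in items if i["done"]]

# Flow time: overall cycle from receiving a requirement to delivering it.
flow_times = [(i["done"] - i["received"]).days for i in done]

# Flow efficiency: value-added time divided by total elapsed time.
flow_efficiency = sum(i["active_days"] for i in done) / sum(flow_times)

# Flow rate (throughput): items delivered per unit time (here, a 10-day window).
flow_rate = len(done) / 10

# Flow load: work started but not yet completed (work in progress).
flow_load = sum(1 for i in items if i["done"] is None)

# Flow distribution: share of each work-item type.
flow_distribution = Counter(i["type"] for i in items)

print(flow_times, round(flow_efficiency, 2), flow_rate, flow_load, dict(flow_distribution))
```

Real tooling would pull these records from the value stream network described later, but the arithmetic stays this simple.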

In the next course we will devote an entire class to breaking down these five flow metrics in detail and describing the specific analysis methods. Please stay tuned.

ONES Performance for Performance Management

Effective R&D requires good tools. ONES has developed a performance management tool, ONES Performance, with a very clear design logic. For measurement, beyond the indicators themselves, we should also consider scenarios, and scenarios are the more critical part. In other words, who is this indicator for? The boss, the PMO, or the R&D team? Each level focuses on different indicators, and the usage scenarios differ accordingly.

ONES Performance covers all scenarios of performance management, provides data support for team performance improvement, and effectively helps teams to continuously improve performance.

The Five Improvements of Performance Measurement

The five improvements of R&D performance measurement form the implementation framework for performance measurement.

  • The first is to build measurement infrastructure and establish a value stream network;
  • The second is to build a system of measurement indicators, covering result indicators, process indicators, leading indicators, lagging indicators, and so on;
  • The third is insight analysis models, for measuring and analyzing relatively complex problems;
  • The fourth is building insight products, which simplify complexity: the product shields us from the complexity of the underlying implementation and directly presents conclusions and analysis to users;
  • The fifth is being data-driven, using experimental thinking to keep performance measurement from going off course.

Next, let's look at each of these five improvements in turn:

(1) Build measurement infrastructure

First, we need to build a measurement foundation, but this foundation is not just the so-called data warehouse; it carries more capabilities. We divide it into three layers: the tool network, the artifact network, and the value stream network.

The nodes of the tool network are individual single-point tools, such as tools for managing requirements, code, and CI/CD; these tools are connected through interfaces to achieve interconnection. The artifact network concerns whether the artifacts these tools generate (requirements, code, packages, pipelines, etc.) can be linked to each other. The value stream network aligns the work flowing through the tools and teams with business-level value. This echoes Metcalfe's law: the value of a network grows with its connectivity.

(2) Build a system of measurement indicators

Let's look at the indicator panorama in the figure below. My original design intent was to organize key metrics into a reasonable structure. Based on value stream measurement, the R&D process can be viewed along the dimensions of delivery efficiency, delivery quality, and delivery capability, plus the dimension of business results; the R&D stages can be divided into requirements, development, testing, release, and operations. The figure is thus an indicator matrix formed from these two dimensions. I first shared it in 2019, and it is often cited in the industry.

Later, I found that some companies already had hundreds or even thousands of indicators and no longer knew which ones were key. So I upgraded the indicator panorama: rather than trying to list every key indicator, I established a structured model, the Cube model of measurement indicators, which organizes indicators into different levels and characterizes and interprets them from multiple dimensions.

This model is not a plane but a cube. Take the blue point in the middle as a handle and pull it out of the picture: you can see it is actually a cube with a "result face" and a "process face". On the result face we focus on value, efficiency, quality, and cost, which can be summarized as the "more, faster, better, cheaper" we often talk about. There are many result indicators, such as requirement lead time, requirement throughput, and online defect density. The result face is outcome-oriented and pulls and guides overall R&D value. To achieve those result indicators, there is still much process work to do, which depends on the "process face": improving collaboration, engineering, technical, and organizational capabilities.

To sum up, we use result indicators to guide and drive overall outcomes, and drive concrete improvement activities through the refinement and analysis of process indicators.

(3) Insight analysis models

The figure below shows the GQM (Goal, Question, Metric) analysis model. Its basic idea is that data collection and analysis should focus on clear, specific goals; each goal is then refined into a set of questions that can be answered quantitatively, and each question is in turn answered by specific metrics. The GQM model keeps measurement aligned with the goal, so that we begin with the end in mind.
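The goal-to-question-to-metric refinement can be sketched as a simple tree, which makes the traceability explicit. The goal, question, and metric texts below are illustrative examples, not taken from the course material.

```python
# A minimal sketch of the GQM (Goal-Question-Metric) structure: each goal is
# refined into quantitatively answerable questions, each backed by metrics.
gqm = {
    "goal": "Shorten requirement delivery time without hurting quality",
    "questions": [
        {
            "question": "Where in the value stream is time spent?",
            "metrics": ["stage duration breakdown", "flow efficiency"],
        },
        {
            "question": "Is quality stable while we speed up?",
            "metrics": ["online defect density", "defect escape rate"],
        },
    ],
}

# Walking the tree top-down keeps every collected metric traceable to a goal --
# measurement "begins with the end in mind".
for q in gqm["questions"]:
    for m in q["metrics"]:
        print(f"{gqm['goal']} <- {q['question']} <- {m}")
```

The payoff of this structure is that any metric without a path back to a goal is a candidate for removal.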

However, performance measurement is not just about indicators; you should also be familiar with the full range of measurement analysis methods.

I have listed twelve analysis methods here. Let's first look at the more common ones:

First, trend analysis. There is a saying in R&D performance that "the trend matters more than the absolute value". What we should measure is the vector, not the absolute value: we observe increments and fluctuations, changes and trends, which matter more than absolute values.

Second, composition analysis. For example, if the requirement delivery cycle is 20 days, we can see where the time is spent and which stage takes the longest; this also helps us analyze where R&D bottlenecks will emerge.
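Composition analysis is just a breakdown of the total into stages. A minimal sketch, with hypothetical stage names and durations:

```python
# Break a 20-day delivery cycle into stages and find where the time goes.
# Stage names and durations are made up for illustration.
stage_days = {
    "requirement": 3,
    "development": 5,
    "waiting for test": 7,
    "testing": 4,
    "release": 1,
}

total = sum(stage_days.values())                   # the full delivery cycle
shares = {s: d / total for s, d in stage_days.items()}
bottleneck = max(stage_days, key=stage_days.get)   # the longest stage

print(total, bottleneck, round(shares[bottleneck], 2))
```

In practice, waiting stages (queues) often dominate the breakdown, which is why composition analysis frequently points at hand-offs rather than at the coding or testing activity itself.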

Third, contribution analysis. This helps us avoid the "average trap": instead of looking only at the mean, we analyze along different dimensions to find the factors that contribute most to positive or negative fluctuations in the results.

Fourth, Pareto analysis, also known as the law of the vital few. The main outcome of a process often depends on only a small number of factors, and the core is to find that "vital few", which is where the room for optimization and improvement lies.
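Finding the "vital few" is a cumulative-share calculation. A sketch over hypothetical defect counts per module (the module names, counts, and the 80% threshold are illustrative):

```python
# Pareto analysis: rank defect sources and keep the smallest set of modules
# that accounts for ~80% of all defects. Data is made up for illustration.
defects = {"payment": 45, "search": 25, "profile": 15, "settings": 8, "about": 4, "help": 3}

total = sum(defects.values())
ranked = sorted(defects.items(), key=lambda kv: kv[1], reverse=True)

vital_few, cumulative = [], 0
for module, count in ranked:
    cumulative += count
    vital_few.append(module)
    if cumulative / total >= 0.8:
        break

print(vital_few)  # the small set of modules worth improving first
```

Here three of six modules cover over 80% of defects, so improvement effort concentrates there first.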

Fifth, cumulative flow diagram analysis. Cumulative flow diagrams reflect lead times, work in progress, delivery rates, and the collaboration patterns among the team's roles.
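The two lines of a cumulative flow diagram are just running counts of items started and items finished; the vertical gap between them is the flow load. A sketch over hypothetical start/finish days:

```python
# Compute the raw series behind a cumulative flow diagram from per-item
# (start_day, done_day) pairs. The item data is made up for illustration.
items = [(1, 4), (2, 6), (3, 5), (5, None), (6, None)]  # None = still in progress

days = range(1, 8)
started  = [sum(1 for s, d in items if s <= day) for day in days]
finished = [sum(1 for s, d in items if d is not None and d <= day) for day in days]
wip      = [s - f for s, f in zip(started, finished)]  # band width = flow load

print(wip)
```

A widening band means work is arriving faster than it is delivered; a flat "finished" line signals a stalled stage.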

(4) Build insight products

I think building insight products suits enterprises of a certain scale, because it is harder to do; some large companies even invest dozens of people year-round to develop and maintain a tool system for performance measurement, as shown below:

The bottom layer is the data warehouse: we collect data from the sources, aggregate it into the warehouse, establish data models, and allow flexible configuration of source data and indicators. The middle layer provides the presentation and service capabilities of the product and its functions. The top layer is the target user: a performance measurement product must consider who it serves, since the scenarios for managers, R&D leads, efficiency experts, and front-line engineers all differ.

(5) Data-driven, with experimental thinking

Improvement in R&D performance does not come from measurement itself, but from targeted improvement. Here I summarize five steps:

  • Step one: define an improvement goal;
  • Step two: use insight to uncover possible bottlenecks in the value stream;
  • Step three: choose appropriate practices from the R&D performance practice map (detailed in the first lecture of the master class);
  • Step four: start small and run pilot experiments;
  • Step five: judge whether the experimental results are valid; if so, turn them into a benchmark for wider rollout.

There are four key points in these five steps:

  • Solve only one problem at a time;
  • Apply systems thinking, considering constraints holistically;
  • Define scope, timing, metrics, and success criteria;
  • Base decisions on data and experiment continuously.
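Step five, judging whether a pilot worked, can be as simple as comparing a metric before and after against a success criterion that was defined before the experiment. A minimal sketch; the lead-time numbers and the 15% threshold are hypothetical:

```python
# Compare the median requirement lead time before and after a pilot against a
# pre-defined success criterion. All numbers are made up for illustration.
baseline_lead_times = [12, 14, 11, 15, 13]  # days, before the pilot
pilot_lead_times    = [10, 9, 11, 8, 10]    # days, after adopting the practice

def median(xs):
    xs = sorted(xs)
    n = len(xs)
    return xs[n // 2] if n % 2 else (xs[n // 2 - 1] + xs[n // 2]) / 2

improvement = 1 - median(pilot_lead_times) / median(baseline_lead_times)
success = improvement >= 0.15  # criterion fixed before the experiment started

print(round(improvement, 2), success)
```

Fixing the threshold up front is what keeps the evaluation honest; deciding the success bar after seeing the data invites the gaming behavior discussed in the Q&A below.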

- Q & A -

Q1: R&D work often involves tasks of varying difficulty. How can measurement be fair?

Mr. Zhang Le: Measurement is first of all a tool for self-improvement, that is, comparing yourself against yourself. As we say, the trend matters more than the absolute value. For a given person, team, department, or product line, the first consideration is whether performance is improving over time. Only then comes horizontal comparison, and note that only teams of the same type and nature can be compared horizontally.

Q2: Should performance metrics be tied to KPIs? If they are not, and we rely on employees' self-motivation, that feels even less reliable. How do we balance this contradiction?

Mr. Zhang Le: Many companies have tied performance measurement to KPIs or assessments, and there are many examples of failure. So I think we need to design a mechanism that addresses both orientation and team self-motivation, while eliminating the side effects of strongly binding measurement to assessment.

I think measurement should be divided into two parts. The first part is result indicators, which provide overall traction and guidance. This part can be connected via OKRs: for example, an "O" in an R&D team's OKR relates to performance, and the corresponding "KR"s are key results on efficiency, quality, and so on; this sets the overall orientation. The second part is process indicators, such as unit testing, code review, and scan pass rates, which are not suitable for direct binding to KPIs and assessments. These indicators should empower front-line teams, giving them more authority and choice, and letting them adopt effective improvement measures suited to their local conditions, as long as the goals are met.

So we have result indicators to align everyone with the goals, letting people know what the company and departments value; at the same time, process indicators give front-line teams flexibility, so there is no need for rigid measurement and assessment that leads to gaming the data and short-sighted behavior.

Q3: Sometimes when efficiency improves, quality is hard to guarantee; when quality is guaranteed, efficiency suffers. In performance measurement, how do we balance efficiency and quality?

Mr. Zhang Le: In fact, many successful cases show that efficiency and quality can be achieved together to a certain extent. Let me share an interesting saying I once heard. The first half is: "Speed comes from quality"; we get efficiency from quality. The second half is: "The biggest muda is re-do"; the biggest waste is rework. We need to control quality at the root and build it in. Building quality in takes more effort up front, but the end-to-end cost goes down. Doing well in engineering practices such as continuous delivery lets us achieve both efficiency and quality to a certain extent.

Q4: Our company implemented performance indicators for a while, but it felt less efficient than flat management, and performance measurement does not seem suitable for small teams. At what team size would you recommend starting performance measurement?

Mr. Zhang Le: This is a very interesting question. If the team is very small, just three to five people who all know what each other is doing, then there really is no need to measure. But once you grow to a small agile team, say a Scrum team of about ten people, you can actually start measuring. Of course, control the cost of measurement, that is, keep its ROI in check: at the beginning, don't spend too much or make it too complicated.

So I think measurement can be done at essentially any scale, and it is a very effective mechanism for quantitative feedback. Small teams can do it at low cost; large teams can build data indicators, measurement models, and so on, doing more sophisticated work to mine more valuable information.

Q5: Our company does game development. As we scale up, various problems are frequently exposed, such as a sharp increase in defects and heavy demand. If we want to introduce performance measurement now, where would you suggest we start?

Mr. Zhang Le: The most important question to think about is what matters most at this stage. For example, if the company is in a period of rapid growth, or needs to compete with another product and realize requirements as quickly as possible, can it deliberately take on technical debt and accept certain quality risks? If quality and reputation are what matter most for the current product, then defect removal and control become more important. In any given period there is always a most painful point, and that point is your staged goal. Once you have the goal and then look for indicators, the problem is solved.

Q6: How do we solve the problem of poor development and testing quality and a high defect escape rate?

Mr. Zhang Le: I recommend starting from the external quality of the product: pay attention to online defects, classify them, and conduct root cause analysis to clarify the stage where each problem was introduced and the stage where it could most easily have been detected.

Based on that analysis, you can re-examine the tiers of your existing quality gates, analyze whether each tier intercepts the problems it should, and continuously improve quality assurance methods such as code review, unit testing, test design, and supplementing test cases. At the same time, keep evolving toward shifting quality left: for example, by building developer self-test pipelines, MR pipelines, and integration pipelines, establish quality gates at key nodes of the code submission and delivery process, so that only code and artifacts that pass the gates flow downstream, assuring quality from the source and upstream.

Q7: Is change lead time the interval from the start of coding to deployment? If so, does the one-day change lead time in "Ali's 211" mean "start of coding to deployment" or "completion of coding to deployment"?

Mr. Zhang Le: Many cycle-time indicators are easy to confuse, including lead time, cycle time, flow time, and, from an engineering perspective, lead time for changes; these will be explained in detail in the next measurement course. Specifically, "lead time for changes" is officially defined as the time from code commit to deployment in the production environment. It measures efficiency from an engineering perspective, such as the capability and effectiveness of CI/CD pipelines.
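Under that definition, the metric is a straightforward commit-to-deploy interval. A sketch over hypothetical commit and deployment timestamps:

```python
# "Lead time for changes": time from code commit to production deployment.
# The timestamps below are made up for illustration.
from datetime import datetime

changes = [
    {"commit": datetime(2022, 9, 7, 10, 0), "deployed": datetime(2022, 9, 7, 10, 40)},
    {"commit": datetime(2022, 9, 7, 11, 0), "deployed": datetime(2022, 9, 7, 12, 30)},
]

lead_minutes = [(c["deployed"] - c["commit"]).total_seconds() / 60 for c in changes]

# Share of changes meeting a "within 1 hour" bar, like the "1" discussed below.
within_one_hour = sum(1 for m in lead_minutes if m <= 60) / len(lead_minutes)

print(lead_minutes, within_one_hour)
```

Note that the clock starts at commit, not at the start of coding, which is exactly the distinction the question raises.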

In "Ali's 211", the third "1" is defined precisely as "change integration and release time", which measures the time from commit and integration to going live after the code is completed, expected to be within 1 hour. It is essentially synonymous with "lead time for changes".

The first season of the ONES R&D Management Master Class is still ongoing. In the next class, we will invite Mr. Zhang Le again to explain in detail the value stream analysis method based on the Flow Framework, the five flow metrics of the value stream, and implementation cases from large companies at home and abroad. Follow the ONES video account; more live broadcasts are coming soon!
