Introduction to Fengshen-Problem Management | Interactive Robot


1. Project background

During platform operation and maintenance, users inevitably run into problems. In the early stage, users talk directly to operation and maintenance personnel to report problems or ask questions, which adds considerable communication cost, as shown in Figure 1. Over long-term operation, the following pain points emerge.

Figure 1

1.1 User pain points

① When a problem occurs, users do not know whom to ask, or cannot reach anyone;
② Users cannot see the progress of problem handling;
③ Communication costs are high, and important problems are not handled in time.

1.2 Operation and maintenance pain points

① Internal information cannot be shared effectively;
② Problems come in through multiple entry points, so tracking is chaotic and the same problem is solved repeatedly;
③ The processing cycle is long and the handover process is cumbersome, so problems are easily missed.

2. Business Architecture

2.1 Architecture description

The problem management robot helps operation and maintenance personnel and users establish handling processes for various types of problems. It manages all problems, tracks and records how each one is handled, and gives users a working platform for assigning, transferring, and coordinating problems.
The robot is oriented toward solving problems and unifies the problem entry points: the business entry for both the user side and the operation and maintenance side is a DingTalk group. The DingTalk groups are split into the Daji (妲己) group on the user side and the Zhouwang (纣王) group on the operation and maintenance side. Customers raise questions in the Daji group, the questions are transferred to the Zhouwang group, and operation and maintenance personnel take them over, as shown in Figure 2.
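
The article does not say how the robot pushes a question from the user-side group to the operation and maintenance group. As a minimal sketch, assuming the push goes through a DingTalk custom-robot webhook, the message body could be built as below. The function name, the problem ID format, and the sample values are hypothetical; the webhook token is a placeholder, and nothing is sent over the network.

```python
import json

# Assumption: the operation and maintenance group has a custom-robot webhook.
# "<token>" is a placeholder, not a real access token.
WEBHOOK_URL = "https://oapi.dingtalk.com/robot/send?access_token=<token>"


def build_forward_message(problem_id, reporter, content, at_mobiles=()):
    """Build the JSON body of a DingTalk custom-robot text message.

    problem_id, reporter and content are hypothetical field names used
    only for this illustration.
    """
    text = f"[New problem {problem_id}] raised by {reporter}: {content}"
    return {
        "msgtype": "text",
        "text": {"content": text},
        # atMobiles lists the phone numbers of the members to @-mention.
        "at": {"atMobiles": list(at_mobiles), "isAtAll": False},
    }


payload = build_forward_message(
    "P-0001", "alice", "Console login fails", ["13800000000"]
)
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

Posting this body to the webhook URL with a `Content-Type: application/json` header would deliver it to the group; that step is left out so the sketch stays runnable offline.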

Figure 2

2.2 Features

  1. Manage all problems centrally through the DingTalk robot, on both mobile and desktop clients;
  2. Record every problem and ensure it is handled promptly and finally resolved, so that problems are not ignored, delayed, or forgotten and do not pile up;
  3. From the moment a problem is entered until it is closed, someone is always responsible for it;
  4. Record all information about the handling process (handler, processing time, processing content, and so on);
  5. Cut down on communication work such as inquiries, follow-ups, and status reports.
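
Features 3 and 4 imply a record that always has an owner and keeps a log of who did what and when. The sketch below shows one way such a record might look. The class and field names are assumptions made for illustration; the article does not describe the robot's actual data model.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class HandlingEntry:
    """One step in the handling log: handler, time, and what was done."""
    handler: str
    handled_at: datetime
    content: str


@dataclass
class Problem:
    """Hypothetical problem record; field names are illustrative only."""
    problem_id: str
    reporter: str
    description: str
    owner: Optional[str] = None
    history: List[HandlingEntry] = field(default_factory=list)

    def record(self, handler, content, when=None):
        # Whoever handles the problem last becomes its current owner,
        # so there is always someone responsible once it is taken over.
        self.owner = handler
        self.history.append(
            HandlingEntry(handler, when or datetime.now(), content)
        )


problem = Problem("P-0001", "alice", "Console login fails")
problem.record("bob", "Taken over, checking the auth service")
problem.record("carol", "Received the forward, restarting the gateway")
print(problem.owner, len(problem.history))
```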

2.3 Function grouping

User side group
  1. Problem entry: with a standardized entry template, the user enters a problem directly by @-mentioning the robot;
  2. Problem query: query problems at any time and get their current progress;
  3. Problem modification: the user can specify the problem handler, rate the handling, return the problem, and expedite it;
  4. Problem export: in the personal dimension, multiple export options are supported and problems are exported to Excel for summary reports.
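
The standardized entry template is not reproduced in the article. Purely as an illustration, suppose the template is a set of `Key: value` lines with the three fields below; a parser that rejects incomplete entries might then look like this. The field names and the sample message are hypothetical.

```python
# Hypothetical template fields; the real template is not given in the article.
REQUIRED_FIELDS = ("platform", "product", "description")


def parse_entry(message: str) -> dict:
    """Parse 'Key: value' lines and check that every required field is set."""
    fields = {}
    for line in message.splitlines():
        if ":" not in line:
            continue  # skip the @-mention line and any free text
        key, value = line.split(":", 1)  # split on the first colon only
        fields[key.strip().lower()] = value.strip()
    missing = [name for name in REQUIRED_FIELDS if not fields.get(name)]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return {name: fields[name] for name in REQUIRED_FIELDS}


sample = """@robot
Platform: hybrid-cloud-a
Product: ECS
Description: Instance cannot start: error 500"""
print(parse_entry(sample))
```

Rejecting incomplete entries at input time is one way to keep later queries and reports consistent, since every stored problem then carries the same fields.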

Figure 3

Operation and maintenance side group
  1. Problem query: multiple query modes let personnel look up problems according to their own needs;
  2. Problem modification: operation and maintenance personnel can suspend, mark, change the status of, update the progress of, and forward a problem;
  3. Problem export: in the global dimension, multiple export options are supported and problems are exported to Excel for summary reports;
  4. Progress monitoring: timeout reminders on processing time speed up problem handling;
  5. Problem broadcast: the list of unresolved problems is broadcast regularly so that important problems are noticed in time.
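
Progress monitoring and broadcasting follow the schedule given in section 3.2: broadcasts at 10, 14, 18, and 20 o'clock, a reminder every 10 minutes while a problem has not been taken over, and timeout reminders 4, 8, 12, 24, and 48 hours after entry. The sketch below shows how a periodic checker could decide what is due. The function names and the polling approach are assumptions; the article does not describe the actual scheduler.

```python
from datetime import datetime, timedelta

BROADCAST_HOURS = (10, 14, 18, 20)         # daily broadcast times
TIMEOUT_HOURS = (4, 8, 12, 24, 48)         # hours after problem entry
UNCLAIMED_INTERVAL = timedelta(minutes=10)  # reminder gap while unclaimed


def timeout_reminders_due(created_at, last_check, now):
    """Return the timeout thresholds (in hours) crossed since last_check."""
    due = []
    for hours in TIMEOUT_HOURS:
        deadline = created_at + timedelta(hours=hours)
        if last_check < deadline <= now:
            due.append(hours)
    return due


def unclaimed_reminder_due(created_at, last_reminder, now):
    """True if 10 minutes have passed since entry or the last reminder."""
    reference = last_reminder or created_at
    return now - reference >= UNCLAIMED_INTERVAL


def is_broadcast_time(now):
    """True on the hour at one of the daily broadcast times."""
    return now.hour in BROADCAST_HOURS and now.minute == 0


created = datetime(2021, 4, 21, 9, 0)
print(timeout_reminders_due(
    created,
    created + timedelta(hours=3, minutes=55),
    created + timedelta(hours=4, minutes=5),
))
print(unclaimed_reminder_due(created, None, created + timedelta(minutes=12)))
print(is_broadcast_time(datetime(2021, 4, 21, 14, 0)))
```

Checking the interval between two polls, rather than an exact timestamp, means a threshold is not missed when the checker runs a little late.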

Figure 4

Problem dashboard
  1. Data visualization: reports show the distribution of problems by platform, product, and handler, along with problem counts.
  2. Problem details: search for problem details, processing time, and so on.
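
The dashboard reports boil down to counting problems along one dimension at a time. A minimal sketch of that aggregation follows; the record fields and sample data are invented for illustration and do not come from the real system.

```python
from collections import Counter

# Invented sample records; the real dashboard's schema is not documented here.
problems = [
    {"platform": "cloud-a", "product": "ECS", "handler": "bob"},
    {"platform": "cloud-a", "product": "OSS", "handler": "carol"},
    {"platform": "cloud-b", "product": "ECS", "handler": "bob"},
]


def distribution(records, dimension):
    """Count problems per value of the chosen dimension."""
    return Counter(record[dimension] for record in records)


for dim in ("platform", "product", "handler"):
    print(dim, dict(distribution(problems, dim)))
```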

Figure 5

3. Problem handling

3.1 Processing flow

Figure 6

3.2 Process description

Step | DingTalk group | Role | Description | Status change
1.1 | Customer group | User | @妲己; the robot auto-replies with the options for the next operation. | -
1.2 | Customer group | User | @妲己 and choose to enter a problem. Once the problem is entered successfully, it is automatically pushed to the on-site group to wait for someone to take it over. | Pending
1.3 | Customer group | Daji (robot) | @妲己 and select the query option. | -
1.4 | Customer group | User | The robot automatically assigns a problem ID and pushes the details of the entered problem. To modify a problem, close it and submit it again. | -
1.5 | Customer group | User | The robot replies with the currently unresolved problems; click a problem to view its details. | -
1.6 | Customer group | User | Choose whether to modify the problem status: no / temporarily closed / resolved. | -
1.7 | Customer group | User | Select "Temporarily closed": the problem is suspended. It no longer appears under "Query Issues Unresolved" but is still shown under "Query Issues All". Modify the problem through @妲己 to reopen it. | Processing -> Temporarily closed
1.8 | Customer group | Daji (robot) | Select "Resolved": the problem is closed. It can still be viewed through @妲己 by querying all problems. | Processing -> Resolved
1.9 | Customer group | User | After receiving a "problem handling update" push from the on-site group, select the processing status. | -
2.1 | On-site group | Operation and maintenance personnel | Receive the push notification that a user has entered a problem. | -
2.2 | On-site group | Operation and maintenance personnel | Take over the problem entered by the user. | Pending -> Processing
2.3 | On-site group | Operation and maintenance personnel | Choose whether to forward the problem to other on-site personnel. | -
2.4 | On-site group | Operation and maintenance personnel | ① Do not forward: handle the problem. ② If customer verification shows the problem is not resolved, change "Problem Processing Status" back to "Processing". | Resolved pending customer verification -> Processing
2.5 | On-site group | Operation and maintenance personnel | @纣王 to update the handling progress. The update is automatically pushed to the customer group and @-mentions the person who raised the problem. | -
2.6 | On-site group | Operation and maintenance personnel | Choose whether to "modify the problem handling status". If the problem has been resolved, change the status to "Resolved pending customer verification". | Processing -> Resolved pending customer verification
2.7 | On-site group | Operation and maintenance personnel | @纣王 and forward the problem by its ID. The current handler can actively hand the problem over to other operation and maintenance personnel. | -
2.8 | On-site group | Operation and maintenance personnel | Broadcast: the problem handling situation (cumulative handling and problems solved today) is broadcast automatically at 10:00, 14:00, 18:00, and 20:00 every day. Timeout: ① a notification is pushed every 10 minutes while a problem has not been taken over; ② a timeout reminder that @-mentions the TAM is pushed 4h/8h/12h/24h/48h after the problem was successfully entered. | -
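
The status changes listed in section 3.2 form a small state machine. The sketch below encodes only the transitions that the process description states explicitly. The enum and function names are assumptions, and the status a problem returns to when a temporarily closed problem is reopened is not stated in the article, so that transition is deliberately left out.

```python
from enum import Enum


class Status(Enum):
    PENDING = "Pending"
    PROCESSING = "Processing"
    TEMP_CLOSED = "Temporarily closed"
    PENDING_VERIFICATION = "Resolved pending customer verification"
    RESOLVED = "Resolved"


# Only transitions named in the process description are allowed here.
ALLOWED = {
    (Status.PENDING, Status.PROCESSING),               # step 2.2: taken over
    (Status.PROCESSING, Status.TEMP_CLOSED),           # step 1.7
    (Status.PROCESSING, Status.RESOLVED),              # step 1.8
    (Status.PROCESSING, Status.PENDING_VERIFICATION),  # step 2.6
    (Status.PENDING_VERIFICATION, Status.PROCESSING),  # step 2.4: not fixed
}


def transition(current: Status, target: Status) -> Status:
    """Return the new status, or raise if the change is not allowed."""
    if (current, target) not in ALLOWED:
        raise ValueError(
            f"illegal transition: {current.value} -> {target.value}"
        )
    return target


status = transition(Status.PENDING, Status.PROCESSING)
status = transition(status, Status.PENDING_VERIFICATION)
status = transition(status, Status.PROCESSING)  # customer says not fixed
print(status.value)
```

Keeping the allowed pairs in one set makes it easy to reject an out-of-order request, such as resolving a problem nobody has taken over.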

4. Conclusion

This article introduced the motivation behind the problem management robot and its current results. So far the robot has served several hybrid cloud projects: problem tracking has become significantly more efficient, the user experience has improved, and the communication cost of problem handling has dropped considerably.
Upcoming articles will introduce other Fengshen modules, including the operation and maintenance dashboard, report analysis, and the time series database. Stay tuned!

Reference documents

[1] DingTalk robot documentation: https://developers.dingtalk.com/document/tutorial

Related information

[1] Fengshen-Operation and Maintenance Brain | Log Detection Tool
[2] Fengshen-core function |

We are the SRE team of Alibaba Cloud Intelligent Global Technical Service. We are committed to being a technology-based, service-oriented engineering team focused on the high availability of business systems. We provide professional, systematic SRE services that help customers make better use of the cloud and build more stable and reliable business systems on it. We hope to share more technology that helps enterprise customers move to the cloud, use it well, and run their business on it more stably and reliably. You can join the Alibaba Cloud SRE Technical Institute DingTalk circle to talk with more cloud experts about cloud platforms.
