Customer service is the remedy that keeps the experience moving smoothly when a user's service experience falls short; it addresses problems after they occur. Intelligent customer service lets most simple problems be resolved quickly through self-service, leaving human agents free to handle complex problems efficiently. Across the user service journey, the Meituan Platform/Search and NLP Department provides six core intelligent customer service capabilities: question recommendation, question understanding, dialogue management, answer supply, script recommendation, and conversation summary. Together they aim to make communication with users low-cost, efficient, and high-quality. This article introduces the core technologies of Meituan's intelligent customer service and how they are applied in practice at Meituan.
1 Background
Meituan currently has 630 million annual transacting users and serves 7.7 million local-service merchants. In addition, the Meituan Youxuan business has a large community of group leaders. The Meituan platform covers more than 200 local-service categories, including food, accommodation, transportation, travel, shopping, and entertainment. Across the pre-sale, in-sale, and after-sale stages of these services, there is a large volume of communication needs such as information inquiries, order status checks, appeals, and complaints. As a listed company with tens of thousands of employees, Meituan also has substantial internal communication needs. Meeting all of these needs with human staff alone would clearly not fit the company's long-term development goals, which is why intelligent customer service is needed.
1.1 Applying intelligent customer service in different scenarios
First, let's look at some of the most common customer service scenarios in daily life.
- Pre-sale scenario: For example, a consumer choosing a hotel on the platform has a strong need to ask about room prices, hotel facilities, and check-in and check-out policies before placing an order.
- In-sale scenario: For example, asking about a takeaway order that has not arrived, adding a note such as "no spice", or requesting an invoice. Pre-sale and in-sale conversations mainly take place between consumers and merchants or the platform.
- After-sale scenario: For example, in takeaway, complaints about missing items or late delivery by the rider, and requests for a refund; in hotels, complaints about being unable to check in on arrival. After-sale issues often involve customer service agents, consumers, riders, and merchants, and require multi-party coordination to resolve.
- Office scenario: For example, IT, human resources, finance, and legal inquiries; questions from product, operations, and R&D staff about the interfaces and products provided to them; product Q&A for sales consultants; and sales consultants answering merchants' questions.
1.2 Applying intelligent customer service for different groups of people
Communication is a basic human need. In most scenarios, we want communication to be low-cost, efficient, and high-quality, and dialogue robots need to meet all three requirements at once. Grouped by the people being served, the scenarios where intelligent customer service is applied fall roughly into four categories:
- For users: Provide intelligent customer service robots that help users resolve most problems on their own.
- For agents: Use script recommendation and conversation summary capabilities to improve human agents' efficiency and working experience.
- For merchants: Build merchant assistants that reduce the effort merchants spend on replies and improve communication between consumers and merchants.
- For employees: Use dialogue robots to answer employees' questions through self-service, improving office efficiency.
1.3 What is Smart Customer Service
To explain what intelligent customer service is, start with customer service itself. In our view, customer service is the remedy that keeps the experience moving smoothly when a user's service experience falls short; it addresses problems after they occur. Intelligent customer service lets most simple problems be resolved quickly through self-service, leaving human agents free to handle complex problems efficiently.
The figure above shows the customer service journey. Users first reach the service by typing online or calling a hotline; online consultations account for more than 85% of traffic. After entering the service portal, the user states a need and the intelligent robot responds. The robot must first understand the problem (for example, adding a note, changing the address, or applying for a refund) and then try to resolve it on its own. If it cannot, it transfers the user to a human agent promptly. Finally, when the user leaves the service, the system sends a questionnaire asking the user to rate the service.
2 Core Technology of Smart Customer Service
2.1 Overview of Dialogue Interaction Technology
The technology behind intelligent customer service is mainly dialogue interaction technology. Common dialogue tasks can be divided into chit-chat, task-oriented, and question-answering types:
- Chit-chat: Usually not tied to a specific task. Its main goal is open-domain conversation with people, and the focus is on generating fluent, reasonable, and natural responses.
- Task-oriented: Usually helps the user complete a specific task, such as finding a hotel, checking an order's status, or handling a refund request. The user's needs are usually more complex and require multiple turns of interaction to collect the information the task needs; the system then makes decisions based on that information, performs different actions, and finally completes the user's request.
- Question answering: Focuses on single-turn question and answer, that is, giving an accurate answer directly from the user's question. The essential difference between the question-answering and task-oriented types is whether the system needs to maintain a representation of the user's goal state and whether it needs a decision-making process to complete the task.
In terms of technical implementation, dialogue systems are usually divided into retrieval-based, generative, and task-based approaches:
- Retrieval-based: The main idea is to find, in a dialogue corpus, the reply that best matches the input sentence. These replies are usually pre-stored data.
- Generative: The main idea is to use a deep learning Encoder-Decoder architecture that acquires language ability from a large corpus and generates the reply directly from the question and related real-time state information.
- Task-based: Used for task-oriented dialogue. The system usually maintains a dialogue state and decides the next action based on that state, such as querying a database or replying to the user.
Chit-chat, question answering, and task-oriented dialogue all respond passively to user needs. In real business settings, there are also question recommendations, product recommendations, and similar features that actively guide the interaction. Meituan's business scenarios are mainly task-oriented and question-answering, with some chit-chat interspersed. The chit-chat consists mainly of greetings or simple emotional reassurance, which helps smooth the human-machine dialogue.
As described in the user service journey above, a user may communicate with two kinds of counterparts: robots and humans. In a customer service scenario the human is a customer service agent; in a merchant scenario the human is a merchant. The robot's capabilities mainly include question recommendation, question understanding, dialogue management, and answer supply.
At present, the core metrics for robot performance are the dissatisfaction rate and the transfer-to-human rate, which measure how well problems are solved and how many problems the robot can take off human agents' hands, respectively. For manual assistance, we provide script recommendation and conversation summary capabilities, and the core metrics are reductions in ATT and ACW. ATT is the average time a human agent spends talking with a user, and ACW is the additional processing time after the conversation ends.
2.2 Intelligent robots: multi-turn dialogue
Here is an example of a real multi-turn dialogue. When the user enters the service portal, they first select a recommended question, "How to contact the rider", and the robot replies with the contact information for calling the rider. To clarify the situation further, the robot then asks whether the meal has been received. When the user selects "not received yet", the robot compares the estimated delivery time with the current time and finds that the order is not yet overdue. It offers two options, "OK, please urge the order for me" or "I will wait a bit longer", and the user selects "I will wait a bit longer".
How does the robot behind this example work? First, when the user inputs "how to contact the rider", the question understanding module matches it against the extended questions in the knowledge base and obtains the corresponding standard question, which is the intent "how to contact the rider". The dialogue management module then triggers the task flow for that intent: it queries the order interface to obtain the rider's phone number and passes the dialogue state to the answer generation module, which produces the final reply from a template, as shown in the red box on the right of the figure. This process requires an intent system, defined task flows, and an order query interface. These are all business-specific and are mainly maintained by each business's operations team. So what does the dialogue system itself do? First, it matches the user's input to the standard questions in the intent system; second, it handles scheduling across multiple turns of interaction.
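The three-module flow described above can be sketched roughly as follows. This is a minimal illustration, not Meituan's implementation: the knowledge base contents, the `query_order` stub, the placeholder phone number, and the reply template are all hypothetical.

```python
# Hypothetical knowledge base: extended question -> standard question (intent).
EXTENDED_TO_STANDARD = {
    "how to contact the rider": "contact_rider",
    "what is the rider's phone number": "contact_rider",
}

# Hypothetical reply templates, keyed by intent.
TEMPLATES = {
    "contact_rider": "You can reach the rider at {rider_phone}. Have you received your meal?",
}


def understand(query):
    """Question understanding: map user input to a standard question (intent)."""
    return EXTENDED_TO_STANDARD.get(query.strip().lower())


def query_order(order_id):
    """Stand-in for the business-owned order interface (assumed, not a real API)."""
    return {"order_id": order_id, "rider_phone": "000-0000"}


def manage_dialogue(intent, order_id):
    """Dialogue management: run the task flow for the intent, return a dialogue state."""
    if intent == "contact_rider":
        order = query_order(order_id)
        return {"intent": intent, "rider_phone": order["rider_phone"]}
    return {"intent": None}


def generate_answer(state):
    """Answer generation: fill the template for the intent from the dialogue state."""
    if state["intent"] is None:
        return "Transferring you to a human agent."
    return TEMPLATES[state["intent"]].format(**state)


def respond(query, order_id):
    return generate_answer(manage_dialogue(understand(query), order_id))
```

In this sketch the business team would own `EXTENDED_TO_STANDARD`, `TEMPLATES`, and `query_order`, while the dialogue system owns the matching and scheduling logic that connects them.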
Question understanding matches the user's question against the intent system: the standard question corresponding to the matched extended question is the user's intent. In practice the robot does this in two steps, recall and fine ranking. Recall is usually implemented with an existing search engine, so most of the technical effort goes into fine ranking.
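A toy version of the recall and fine-ranking steps might look like the following. It is a sketch under loud assumptions: token overlap stands in for the search engine, `difflib.SequenceMatcher` stands in for the semantic matching model, and the knowledge base entries are invented.

```python
from difflib import SequenceMatcher

# Hypothetical knowledge base of (extended question, standard question) pairs.
KNOWLEDGE_BASE = [
    ("how to contact the rider", "contact_rider"),
    ("what is the rider's phone number", "contact_rider"),
    ("how do i apply for a refund", "apply_refund"),
    ("i want my money back", "apply_refund"),
]


def recall(query, top_k=3):
    """Recall: cheap candidate retrieval by token overlap (search engine stand-in)."""
    q_tokens = set(query.lower().split())
    scored = [(len(q_tokens & set(ext.split())), ext, std) for ext, std in KNOWLEDGE_BASE]
    scored = [s for s in scored if s[0] > 0]
    scored.sort(key=lambda s: s[0], reverse=True)
    return [(ext, std) for _, ext, std in scored[:top_k]]


def fine_rank(query, candidates):
    """Fine ranking: rescore candidates with a stronger similarity (model stand-in)."""
    if not candidates:
        return None
    return max(candidates, key=lambda c: SequenceMatcher(None, query.lower(), c[0]).ratio())


def match_intent(query):
    """Return the standard question for the best-matching extended question, or None."""
    best = fine_rank(query, recall(query))
    return best[1] if best else None
```

Splitting the work this way keeps the expensive scorer off the full knowledge base: it only sees the handful of candidates that recall returns.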
Meituan's self-developed intelligent customer service system has been under construction since 2018. Throughout, we have kept introducing the industry's most advanced techniques into the system and adapting them to the characteristics of Meituan's business and of the question understanding task.
At the end of 2018 (see the article "Exploration and Practice of BERT"), we quickly replaced the original DSSM model with BERT across the board. Later, based on the characteristics of Meituan's customer service dialogues, we adapted BERT with further pre-training and online learning. To avoid interference between businesses and to reduce noise by making knowledge more distinguishable, we also added multi-task learning (each business is an independent task in the upper layers) and multi-field learning (instead of matching the query only against extended questions, the query is matched jointly against extended questions, standard questions, and answers). The final model is an Online Learning based Multi-task Multi-Field RoBERTa. Through this series of iterations, recognition accuracy rose from under 80% at the start to nearly 90% today.
Once the user's intent is understood, some questions can be answered directly, while others need further clarification. For example, for "how to apply for meal damage compensation", the robot does not simply describe the application method. It first clarifies which order is involved and whether the damage affected the meal, and then whether the user wants a partial refund or a replacement delivery, because each case has a different solution. A flow like this is closely tied to the business and must be defined by the business operations team. As shown in the task flow tree on the right of the figure, we provide a visual TaskFlow editing tool and package outbound calls, maps, and APIs as components, so that business operators can design task flows by dragging and dropping.
When the dialogue engine actually interacts with a user, it must match and schedule each step in the Task. In this example, what if the user does not click the "Yes but it affects the dining..." option and instead types "It's okay, I want a partial refund"? That intent was not defined in advance, so the dialogue engine must support fuzzy matching at each step of the Task. Our TaskFlow Engine, based on a Bayes network, supports exactly this combination of rules and probabilities. The fuzzy matching algorithm reuses the semantic matching capability of the question understanding model.
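To make the idea of combining rules with fuzzy matching concrete, here is a simplified sketch. It does not reproduce the Bayes-network engine: a Jaccard token overlap stands in for the semantic matching model, and the option labels and the threshold are hypothetical.

```python
# Hypothetical options offered at one TaskFlow step: option key -> button label.
STEP_OPTIONS = {
    "affects_dining": "yes but it affects the dining",
    "partial_refund": "i want a partial refund",
    "redelivery": "please arrange a supplementary delivery",
}


def jaccard(a, b):
    """Token-level Jaccard similarity, a crude stand-in for semantic matching."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def match_step(user_input, options=STEP_OPTIONS, threshold=0.3):
    """Return the matched option key, or None if the engine should ask to clarify."""
    text = user_input.strip().lower()
    # Rule: a clicked option arrives as the exact label.
    for key, label in options.items():
        if text == label:
            return key
    # Probability-like fallback: fuzzy match free text against each option.
    best_key, best_score = max(
        ((key, jaccard(text, label)) for key, label in options.items()),
        key=lambda pair: pair[1],
    )
    return best_key if best_score >= threshold else None
```

With these assumed labels, the typed reply "It's okay, I want a partial refund" falls through the exact-match rule and is routed to the `partial_refund` branch by the fuzzy fallback.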
Here is another example. The user asks "Can a membership be cancelled?" and the robot replies "It cannot be cancelled". The question has been answered, but the user is likely to be dissatisfied and turn to a human agent. Suppose that, in addition to the answer, the robot also probes the real reason behind the question by asking whether the issue is that "the takeaway red envelope cannot be used" or a "problem caused by changing the mobile phone". Because these follow-up situations are derived from succession relationship modeling, the user has very probably run into one of them and is likely to select it. The conversation can then continue toward a more tailored solution, and fewer users transfer directly to a human agent.
This guidance task is called multi-turn topic guidance. The approach is to model the co-occurrence and succession relationships between events in conversation logs. As shown in the figure on the right, the guidance was originally modeled at the sentence level; because sentences are sparse, we abstracted it to guidance between events. The co-occurrence relationship is modeled with classic collaborative filtering. In addition, because events have a direction, we model the succession relationship between events. The formula is as follows: [formula not preserved]
Through multi-objective learning, we consider click metrics and task metrics together: data from sessions that were not transferred to a human agent and sessions without dissatisfaction are also used to model the succession relationship. The formula is as follows: [formula not preserved]
In the end, this brought clear improvements in click-through rate, dissatisfaction rate, and transfer-to-human rate.
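As a rough intuition for what a directed succession relationship between events could look like, the sketch below estimates first-order transition frequencies from event sequences. This is an assumption made for illustration only; it is not the article's formula, and the event names and sessions are invented.

```python
from collections import Counter, defaultdict


def succession_probs(sessions):
    """Estimate P(next event | current event) from ordered event sequences."""
    counts = defaultdict(Counter)
    for events in sessions:
        for current, following in zip(events, events[1:]):
            counts[current][following] += 1
    return {
        current: {nxt: c / sum(nexts.values()) for nxt, c in nexts.items()}
        for current, nexts in counts.items()
    }


def guide(event, probs, top_k=2):
    """Return the follow-up events most likely to come after the given event."""
    ranked = sorted(probs.get(event, {}).items(), key=lambda p: p[1], reverse=True)
    return [name for name, _ in ranked[:top_k]]


# Hypothetical event sequences abstracted from conversation logs.
SESSIONS = [
    ["membership_cancel", "red_envelope_unusable"],
    ["membership_cancel", "red_envelope_unusable"],
    ["membership_cancel", "phone_changed"],
    ["phone_changed", "membership_cancel"],
]
```

Unlike a co-occurrence count, these estimates are directional: in the toy data, `phone_changed` follows `membership_cancel` with a different frequency than the reverse.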
The Meituan platform covers more than 200 local-service categories, including food, accommodation, transportation, travel, shopping, and entertainment. When a user enters the service from a general portal such as the Meituan App or the Dianping App, the system first has to determine which business the user wants to ask about. This task, judging which business the user's query belongs to, is called domain recognition. If the domain can be determined clearly, that domain's knowledge is used to answer directly; if not, multiple turns of interaction are needed to clarify with the user. For example, if the user enters "I want a refund", a refund intent exists in several businesses, so the system must first determine which business the refund relates to. If confidence is low, the user is shown a list of businesses and asked to choose.
The domain recognition model draws on three types of data: labeled data from each domain's knowledge base, a large amount of weakly supervised, unlabeled data from each domain, and personalized data.
- The question understanding model learned from the labeled data in each domain's knowledge base provides a signal for how likely the user's input is to match an intent in each business.
- Besides general portals such as the Meituan App and the Dianping App, which span multiple businesses, many portals identify the business unambiguously, such as the order page or a product detail page, so dialogue data from these portals carries a clear business label. This gives us a large amount of weakly supervised data for each business, on which we train a first-level classification model.
- Some queries need personalized data, such as the user's order status, to be disambiguated. For example, "I want a refund" could belong to several businesses. We therefore train a second-level model that adds user-state features and makes the final decision about which business the user's input belongs to.
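The two-level idea above can be sketched as follows. Everything here is a simplification for illustration: keyword counts stand in for the first-level classifier, an additive boost for businesses with an active order stands in for the second-level model, and the keywords, boost, and threshold are hypothetical.

```python
# Hypothetical per-business keywords standing in for a trained first-level classifier.
DOMAIN_KEYWORDS = {
    "takeaway": {"rider", "meal", "refund"},
    "hotel": {"room", "check-in", "refund"},
}


def first_level_scores(query):
    """First level: text-only scores for each business, normalized to sum to 1."""
    tokens = set(query.lower().split())
    raw = {domain: len(tokens & words) for domain, words in DOMAIN_KEYWORDS.items()}
    total = sum(raw.values())
    return {domain: (value / total if total else 0.0) for domain, value in raw.items()}


def recognize_domain(query, active_orders, boost=0.5, threshold=0.6):
    """Second level: add user-state evidence, then decide or fall back to clarifying.

    Returns (domain, []) when confident, or (None, candidate_list) when the user
    should be shown a list of businesses to choose from.
    """
    scores = first_level_scores(query)
    for domain in scores:
        if domain in active_orders:
            scores[domain] += boost
    total = sum(scores.values())
    if total == 0:
        return None, sorted(scores)
    normalized = {domain: score / total for domain, score in scores.items()}
    best = max(normalized, key=normalized.get)
    if normalized[best] >= threshold:
        return best, []
    return None, sorted(normalized)
```

For "I want a refund", the text alone is ambiguous in this toy setup, so the result depends on the user state: with an active hotel order the hotel business wins, and with no orders the function returns the clarification list.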
In the end, the two-level domain recognition model produced clear gains in satisfaction, transfer-to-human rate, and successful transfer rate.
2.3 Intelligent robots: question recommendation
Having covered the basic modules of multi-turn dialogue, question understanding and dialogue management, we now turn to the robot's other two modules: question recommendation and answer supply. As the multi-turn dialogue examples showed, when a user enters the service portal, the robot should first guide the user to state their need precisely. This reduces user drop-off and direct transfers to human agents, and it also cuts down on wasted interactions such as repeated clarification caused by the robot misunderstanding the user.
This is a standard exposure-and-click problem, which is in essence a recommendation problem. We use FM, a classic model for CTR prediction, as the base model. In line with business goals, we also want the solution behind the question a user clicks to actually resolve the user's problem, so the task is ultimately defined as an "exposure, click, solve" problem. The final model is ESSM-FM with multi-objective learning, which improved the effective-interaction conversion rate, the transfer-to-human rate, and the dissatisfaction rate.
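One plausible reading of the "exposure, click, solve" formulation is that the probability of a click that also solves the problem factors into a click probability and a solve-given-click probability, each scored by a factorization machine. The sketch below illustrates that reading only; the factoring, the parameter layout, and the function names are assumptions, not a description of the ESSM-FM model itself.

```python
import math


def fm_score(x, w0, w, V):
    """Second-order factorization machine score for a dense feature vector x.

    w0 is the bias, w the linear weights, and V[i] the latent vector of feature i.
    The pairwise term uses the standard identity
    sum_{i<j} <V[i], V[j]> x_i x_j = 0.5 * sum_f ((sum_i V[i][f] x_i)^2 - sum_i (V[i][f] x_i)^2).
    """
    linear = w0 + sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(V[0])):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return linear + pairwise


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def p_click_and_solve(x, click_params, solve_params):
    """Assumed factoring: P(click, solve | exposure) = P(click) * P(solve | click)."""
    p_click = sigmoid(fm_score(x, *click_params))
    p_solve_given_click = sigmoid(fm_score(x, *solve_params))
    return p_click * p_solve_given_click
```

Ranking candidate questions by this joint probability, rather than by click probability alone, favors questions whose solutions tend to resolve the issue once clicked.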
2.4 Intelligent robots: answer supply
In after-sale customer service, problems are usually concentrated on a limited set of issues, and resolving them mostly relies on the business's internal system data and rules. The business department typically maintains the knowledge base, including the intent system, task flows, and answers. In pre-sale scenarios, however, most knowledge comes from the merchant or the product itself, or from user experience and reviews. User questions are open-ended, the knowledge is dense, and answers are hard to curate by hand. Which city or attraction to visit, which hotels are nearby, whether a hotel has a bathtub, what a hotel's address is: these are all questions users ask to support a decision. To meet these needs, we use intelligent question answering to supply the answers.
Intelligent Q&A learns to supply answers from Meituan data so that users' questions can be answered quickly. We have built different Q&A techniques for different data sources.
- For basic merchant information, such as business hours, address, and prices, we use KBQA. We build a knowledge graph from the merchant's basic information, understand the question with the question understanding model, and then query the graph to obtain an accurate answer.
- For community data, that is, the user-generated Q&A in the "Ask Everyone" module on the merchant detail page, we build Community QA capability: a model scores the similarity between the user's question and the existing question-answer pairs, and the answer with the highest similarity is chosen to answer some of the users' open questions.
- For unstructured data such as UGC reviews and merchant policies, we build Document QA capability: machine reading comprehension extracts answers to user questions from documents, much like the reading comprehension questions in school Chinese exams, to answer more of the users' open questions.
Finally, the answers from the multiple Q&A modules are fused and ranked to select the final answer. The truthfulness of an answer is also checked at this stage, using a model that follows the principle that what the majority considers correct is taken as correct. For details, see "Intelligent Question Answering Technology Exploration and Practice".
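A very simple form of answer fusion with a majority check might look like this. It is an illustrative assumption rather than the production ranking model: answers are grouped by normalized text, and the group supported by the most sources wins, with summed scores as a tie-breaker. The source names and scores are invented.

```python
from collections import defaultdict


def fuse_answers(candidates):
    """Pick a final answer from (source, answer, score) candidates.

    Answers with the same normalized text count as agreeing votes; the answer
    with the most votes wins, and summed score breaks ties.
    """
    support = defaultdict(lambda: {"votes": 0, "score": 0.0, "text": None})
    for _source, answer, score in candidates:
        entry = support[answer.strip().lower()]
        entry["votes"] += 1
        entry["score"] += score
        entry["text"] = entry["text"] or answer
    if not support:
        return None
    best = max(support.values(), key=lambda e: (e["votes"], e["score"]))
    return best["text"]
```

In this sketch a single high-scoring answer does not override two sources that agree with each other, which is the behavior the majority principle asks for.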
3 Core Technology of Manual Assistance
3.1 Manual assistance: script recommendation
The sections above covered intelligent robot technology. Besides talking to robots, users may also talk to humans. When we surveyed customer service agents at their workplaces, we found that agents often give similar or even identical replies in conversations with users, and they all wanted a script recommendation capability to improve efficiency. Beyond asking customer service agents for help, users can often solve problems more efficiently by talking to merchants directly, and the efficiency of that communication affects not only the consumer experience but also the merchant's business. In the takeaway business, for example, consumers' order rate shows a fairly clear inverse relationship with merchants' reply time. Both customer service agents and merchants therefore have a strong need for script recommendation.
So how is script recommendation done? A common practice is to prepare a library of frequently used replies; some agents or merchants also keep a personal library. The system then retrieves the most suitable reply for the user's query and context and recommends it. Our investigation found that these libraries are poorly maintained: business knowledge changes frequently, so maintained content quickly becomes outdated, and agents and merchants have little motivation to maintain it. New agents and new merchants also have little experience to draw on. We therefore have the system automatically remember each agent's historical chats and those of the same skill group, and each merchant's historical chats and those of merchants in the same category. Based on the current input and context, it predicts the likely next reply. No manual curation is needed, which greatly improves efficiency.
We convert historical chat records into "N+1" QA pairs: the first N sentences are treated as the question Q, and the last sentence as the reply A. The whole framework can then be cast as a retrieval-based question answering model. In the recall stage, besides text-based recall, we add recall optimizations based on the slot labels and Topic labels of the preceding turns. Ranking uses a BERT-based model with added role information, where the role is user, merchant, or agent.
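The "N+1" construction can be illustrated with a short sketch. The turn format, the role names, and the choice to build a pair only for non-user replies are assumptions made for this example.

```python
def build_qa_pairs(turns, n=2):
    """Build "N+1" pairs from a chat history of (role, text) turns.

    For every reply by an agent or merchant, the preceding (up to) n turns form
    the question Q, with role prefixes kept, and the reply itself is the answer A.
    """
    pairs = []
    for i, (role, text) in enumerate(turns):
        if i == 0 or role == "user":
            continue  # only agent/merchant replies are recommendation targets
        context = turns[max(0, i - n):i]
        question = [f"{r}: {t}" for r, t in context]
        pairs.append((question, text))
    return pairs


# Hypothetical chat history.
HISTORY = [
    ("user", "Where is my order?"),
    ("agent", "Let me check."),
    ("user", "Thanks"),
    ("agent", "It arrives in 10 minutes."),
]
```

Keeping the role prefix in the context is one simple way to expose role information to a downstream ranking model; at serving time, the current context is matched against the stored Q side and the corresponding A is recommended.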
As shown in the figure above, the architecture is divided into offline and online parts. After launch, we also added a CTR prediction layer to raise the adoption rate. Currently, the average adoption rate of script recommendations across multiple businesses is about 24%, and coverage is about 85%. Script recommendation is especially valuable for new agents, who often find it hard to phrase replies; adopting recommended replies shortens the time they need to become proficient. We observed that the average adoption rate among agents with less than 3 months of tenure is 3 times that of agents with more than 3 months.
3.2 Manual assistance: conversation summary
After an agent in a customer service scenario finishes talking with a user, the agent still has to write a work-order summary of the key information: what the event was, what its background was, what the user requested, and what the final outcome was. Filling this in is tedious for agents. It requires summarizing, and for long conversations the agent has to scroll back and forth through the dialogue history to get the summary right. In addition, to keep improving service products, the relevant events need to be extracted from conversation logs and labeled for business analysis.
Some of these fields are like multiple-choice questions and others like fill-in-the-blank questions. Take the question of which event the conversation is about: we compiled a fairly complete event taxonomy in advance, so this can be treated as multiple choice and solved with a classification model or a semantic similarity model. The event background, by contrast, is open-ended: a takeaway refund because the meal was spilled, or a hotel refund because no room was available on arrival. Analysis showed that the background can usually be extracted from the conversation itself, so we use an extractive summarization model. The processing result depends not only on the conversation but also on other information, such as whether an outbound call was made, whether the merchant answered it, and whether a follow-up is needed. In our experiments a generative model worked better here. The specific models are shown in the figure above. For event selection, because new events are added frequently, we switched to a two-tower similarity task. Background extraction uses the BERT-Sum model, and the processing result uses Google's PEGASUS model.
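Why a two-tower similarity setup copes well with frequently added events can be seen in a small sketch. Bag-of-words vectors and cosine similarity stand in for the learned encoders here, and the event names and descriptions are hypothetical.

```python
import math
from collections import Counter


def encode(text):
    """Stand-in encoder: bag-of-words counts (a real tower would be a neural model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class EventIndex:
    """Event tower: events are encoded once, independently of any conversation."""

    def __init__(self):
        self.vectors = {}

    def add_event(self, name, description):
        # Adding an event only requires encoding its description; nothing is retrained.
        self.vectors[name] = encode(description)

    def select(self, conversation):
        """Conversation tower: encode the dialogue and pick the most similar event."""
        if not self.vectors:
            return None
        query = encode(conversation)
        return max(self.vectors, key=lambda name: cosine(query, self.vectors[name]))
```

A classifier with a fixed output layer would need retraining whenever the event taxonomy grows, whereas here a new event becomes selectable as soon as `add_event` is called.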
4 Summary and next steps
4.1 Summary: Interactive Cube
The sections above introduced some of the core technologies behind Meituan's intelligent customer service. Along the way we covered improving the efficiency of communication between customer service agents and consumers, merchants, riders, and group leaders, as well as between consumers and merchants. Beyond these two areas, the corporate office scenario also involves a great deal of communication between employees and between sales consultants and merchants. Building for each case one by one is costly and inefficient. The solution is to turn the capabilities accumulated in intelligent customer service into a platform, ideally a packaged solution that supports more business needs at a fixed cost. We therefore built Meituan's dialogue platform, the Moses Dialogue Platform, to meet each business's intelligent customer service needs with a packaged solution at a fixed cost.
4.2 Summary: the "Moses" dialogue platform
What kind of dialogue platform lets a team without NLP expertise build a good dialogue robot? First, dialogue capabilities must be turned into tools and workflows. As shown in the figure above, the system is divided into four layers: the application scenario layer, the solution layer, the dialogue capability layer, and the platform function layer.
- Application scenario layer: In pre-sale scenarios, one type of need is a merchant assistant, such as the Meituan Flash Purchase IM assistant and the integrated IM assistant shown in the figure, which assists the merchant's typing and lets the robot take over part of the high-frequency questions. Another type of need is intelligent Q&A to fill the consultation gap where there is no merchant IM, such as the hotel and scenic-spot inquiries shown in the figure. In-sale, after-sale, and corporate office scenarios each have different needs as well.
- Solution layer: This requires several sets of solutions, roughly divided into intelligent robots, intelligent Q&A, merchant assistance, and agent assistance. Each solution needs different dialogue capabilities. The solutions should assemble the basic dialogue capabilities easily, keep that assembly transparent to the business side, and work out of the box.
- Dialogue capability layer: As introduced earlier, the six core capabilities are question recommendation, question understanding, dialogue management, answer supply, script recommendation, and conversation summary.
- Platform function layer: We also need to provide supporting operational tools so that the business side's operators can carry out day-to-day knowledge base maintenance, data analysis, and so on.
Second, a packaged solution must also offer options suited to businesses at different stages.
- Some businesses only want to maintain common Q&A and answer high-frequency questions. They only need an entry-level robot, for which they maintain intents, common phrasings, and answers in the intent management module.
- Teams with operational resources want to keep enriching the knowledge base to improve Q&A capability. They can use the knowledge discovery module, which automatically discovers new intents and new phrasings of existing intents from daily logs. Operators only need to spend a little time each day confirming the additions and maintaining answers. These are the intermediate-level business teams.
- Some advanced business teams want to call their business APIs to solve complex problems. They can use the TaskFlow editing engine to register business APIs directly on the platform and build tasks by visual drag and drop.
To make it easier for more businesses to get started, we also provide official skill packs, such as chit-chat, general commands, and region lookup, which the business side can enable with a single selection. As we accumulate experience across businesses, there will be more and more official industry skill packs. The overall direction is to keep lowering the barrier to adoption.
4.3 Next steps
The dialogue system described above is a pipeline-style system: it is split into modules by function, and each module is modeled separately and connected in series. The advantage is a clear division of responsibilities between teams. For example, R&D engineers focus on building the question recommendation model, the question understanding model, and the Task engine, while business operations staff focus on maintaining the intent system and designing task flows and answers. The disadvantages are equally clear: the modules are coupled, errors accumulate from one module to the next, and joint optimization is difficult. Each module's owner may then patch their own module in isolation, which can easily distort the system's overall behavior.
The other modeling approach is End-to-End: the modules of the pipeline system are modeled jointly as a single model that maps language directly to language. This approach was first applied in chit-chat systems. Recently, with the rapid development of large-scale pre-trained models, academia has begun to study end-to-end task-oriented dialogue systems based on pre-trained models. The advantage is that the model can make full use of unlabeled human-human conversations and iterate quickly in a data-driven way; the disadvantage is that the model is hard to control, hard to explain, and hard to intervene in. At present this is mainly an academic research topic, and we have not seen mature applications.
Besides using large volumes of unlabeled human-human conversation logs, another idea is to build a rule-based user simulator on top of the rule-based TaskFlow, let it interact with the system to generate a large amount of dialogue data, and then train a dialogue model on that data. To make the dialogue system robust, it can also be hardened with methods similar to adversarial attacks: the simulator can imitate a "hard user" who executes the TaskFlow out of order, interrupts at random, jumps to an arbitrary dialogue node, and so on.
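A rule-based user simulator with "hard user" behavior could be sketched along these lines. The flow nodes, the action names, and the probabilities are hypothetical, and a real simulator would also generate utterances rather than just node-level actions.

```python
import random

# Hypothetical linear TaskFlow, listed as node names in their intended order.
TASK_FLOW = ["ask_order", "ask_received", "offer_solution", "confirm"]


def simulate_user(flow, hard_prob=0.0, seed=0, max_turns=20):
    """Generate one simulated dialogue trace as a list of (action, node) pairs.

    A cooperative user answers every node in order. With probability hard_prob
    per turn, the user instead jumps to a random node or interrupts the dialogue.
    """
    rng = random.Random(seed)
    trace, step = [], 0
    while step < len(flow) and len(trace) < max_turns:
        if rng.random() < hard_prob:
            if rng.random() < 0.5:
                step = rng.randrange(len(flow))  # jump, possibly out of order
                trace.append(("jump", flow[step]))
            else:
                trace.append(("interrupt", flow[step]))  # abandon the flow midway
                break
        else:
            trace.append(("answer", flow[step]))
            step += 1
    return trace
```

Running the simulator many times with different seeds and a non-zero `hard_prob` yields traces that leave the designed path, which is the kind of data a dialogue model would need in order to handle users who do not follow the flow.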
In addition, comparing human-machine dialogue logs with human-human dialogue logs shows that human-machine dialogue is relatively rigid and fails to capture the user's emotions, something humans are very good at. This matters a great deal in customer service: users often arrive with negative emotions, and robots need empathy. End-to-end data-driven dialogue and building empathy into dialogue will be our focus for the next period.
This article is produced by the Meituan technical team, and the copyright belongs to Meituan. You are welcome to reprint or use its content for non-commercial purposes such as sharing and communication; please indicate "the content is reproduced from the Meituan technical team". The article may not be reproduced or used commercially without permission. For any commercial use, please email tech@meituan.com to apply for authorization.