With more and more text data generated on the Internet, text information overload is becoming an increasingly serious problem. "Dimension reduction" of all kinds of text is therefore necessary, and text summarization is one of the important means to achieve it. This article first introduces classic text summarization methods, including extractive summarization methods and generative summarization methods, then analyzes dialogue summarization models, and finally shares the challenges Meituan faces in real dialogue summarization scenarios. We hope it brings some inspiration or help to readers engaged in related work.

1. Technical Background of Dialogue Summarization

Text summarization [65-74] aims to condense a text or a collection of texts into a short summary containing its key information, and it is an important means of alleviating text information overload. By input type, text summarization can be divided into single-document summarization and multi-document summarization: single-document summarization generates a summary from one given document, while multi-document summarization generates a summary from a given set of topic-related documents. By output type, it can be divided into extractive summarization and generative summarization: extractive summarization forms the summary from key sentences and keywords extracted from the source document, so all of the summary's content comes from the original text, while generative summarization is allowed to produce new words and phrases based on the original text. By whether supervised data is available, it can be divided into supervised and unsupervised summarization. By input domain, it can be divided into news summarization, patent summarization, paper summarization, dialogue summarization, and so on.

Automatic text summarization can be regarded as a process of information compression: one or more input documents are automatically compressed into a short summary. Some information is inevitably lost in this process, but as much important information as possible should be retained. Automatic summarization systems usually involve three main steps: understanding the input documents, selecting the key points, and synthesizing the summary. Document understanding can be shallow or deep. Most automatic summarization systems only need relatively shallow understanding, such as paragraph division, sentence segmentation, and lexical analysis, while some systems also rely on syntactic analysis, semantic role labeling, coreference resolution, or even deep semantic analysis.

Dialogue summarization is a special case of text summarization whose input is dialogue data. Dialogue data comes in many forms, such as meetings, chit-chat, emails, debates, customer service, and more. Different forms of dialogue summarization have their own application scenarios, but the core goal is the same as that of general summarization: to capture the key information in the dialogue and help readers quickly understand its core content. Unlike general text, the key information in a dialogue is often scattered across different places, and the speakers and topics keep changing. In addition, dialogue summarization datasets are still scarce, which further increases the difficulty of the task [64].

Based on our actual application scenario, this paper proposes a distantly supervised, span-level dialogue summarization scheme based on machine reading comprehension, "Distant Supervision based Machine Reading Comprehension for Extractive Summarization in Customer Service" (published at SIGIR 2021), which improves on the strong baseline by about 3% on both the ROUGE-L and BLEU metrics.

2. Introduction to classic models of text summarization and dialogue summarization

Text summarization has two modes: extractive summarization and generative summarization. Extractive summarization uses an algorithm to extract existing keywords and sentences from the source document as the summary. In terms of fluency it is generally better than generative summarization, but it tends to introduce redundant information and lacks the flexibility of a freely worded summary. Generative summarization is based on NLG (Natural Language Generation) technology: instead of directly extracting sentences from the original text, the model generates a natural language description according to the content of the source document.

Currently, much work on generative summarization is based on the Seq2Seq model from deep learning [44]. More recently, with the emergence of large pre-trained models represented by BERT [34], a lot of work has also focused on how to apply pre-trained models to NLG tasks. The classic models in these two modes are described below.

2.1 Extractive summarization model

Extractive summarization selects keywords and key sentences from the original text to form the summary. This approach naturally has a low error rate in grammar and syntax and guarantees a certain level of quality. Traditional extractive methods use graph-based algorithms, clustering, and similar techniques to perform unsupervised summarization. Popular neural-network-based extractive summarization usually models the problem as one of two tasks: sequence labeling or sentence ranking. The traditional extractive methods are introduced first, followed by a brief description of the neural-network-based methods.

Traditional extractive summarization method

Lead-3

In general, a document often states its topic in the title and at the beginning, so the simplest approach is to extract the first few sentences of the document as the summary. The commonly used method is Lead-3 [63], which takes the first three sentences of the document as its summary. Although simple and straightforward, Lead-3 is a very effective baseline.
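Lead-3 is simple enough to express in a few lines. A minimal sketch, assuming naive punctuation-based sentence splitting (which may not hold for every corpus):

```python
import re

def lead_3(document: str, k: int = 3) -> str:
    # Naive sentence splitting on common Chinese/English sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[。！？.!?])\s*", document) if s.strip()]
    # The summary is simply the first k sentences of the document.
    return " ".join(sentences[:k])
```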

TextRank

The TextRank [58] algorithm is modeled on PageRank: sentences are treated as nodes, and inter-sentence similarity is used to construct undirected weighted edges. Node values are updated iteratively using the edge weights, and finally the N highest-scoring nodes are selected as the summary.
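A minimal TextRank-style sketch, assuming sentences are already split and that word overlap normalized by sentence length is an acceptable stand-in for the similarity function used in [58]:

```python
import math
import networkx as nx

def textrank_summary(sentences, top_n=3):
    def similarity(a, b):
        wa, wb = set(a.split()), set(b.split())
        if not wa or not wb:
            return 0.0
        # word overlap normalized by (log) sentence lengths
        return len(wa & wb) / (math.log(len(wa) + 1) + math.log(len(wb) + 1))

    graph = nx.Graph()
    graph.add_nodes_from(range(len(sentences)))
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            w = similarity(sentences[i], sentences[j])
            if w > 0:
                graph.add_edge(i, j, weight=w)
    scores = nx.pagerank(graph, weight="weight")      # iterative node-value update
    best = sorted(scores, key=scores.get, reverse=True)[:top_n]
    return [sentences[i] for i in sorted(best)]        # keep original sentence order
```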

Clustering

Clustering-based methods treat each sentence in the document as a point and complete summarization by clustering. For example, Padmakumar and Saran [11] encode the sentences in a document using Skip-Thought Vectors and Paragram Embeddings to obtain sentence-level vector representations, then apply K-means clustering [59] and Mean-Shift clustering [60] to obtain N clusters. Finally, the sentence closest to the centroid of each cluster is selected, yielding N sentences as the final summary.
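A minimal clustering-based extraction sketch. For self-containedness, the sentence vectors here come from TF-IDF rather than the Skip-Thought/Paragram embeddings used in the cited work:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

def cluster_summary(sentences, n_clusters=3):
    vecs = TfidfVectorizer().fit_transform(sentences).toarray()
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(vecs)
    picked = []
    for c in range(n_clusters):
        idx = np.where(km.labels_ == c)[0]
        # pick the sentence closest to this cluster's centroid
        dists = np.linalg.norm(vecs[idx] - km.cluster_centers_[c], axis=1)
        picked.append(idx[dists.argmin()])
    return [sentences[i] for i in sorted(picked)]
```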

Extractive summarization method based on neural network

With the rise of neural networks in recent years, neural extractive summarization methods have clearly outperformed traditional ones. They fall mainly into two categories: sequence labeling methods and sentence ranking methods. The difference is that sentence ranking methods score each sentence by the gain it brings to the summary and take the relationships between sentences into account.

Sequence labeling

This approach models extractive summarization as a sequence labeling task. The core idea is to assign a binary label (0 or 1) to each sentence in the original text: 0 means the sentence does not belong to the summary, and 1 means it does. The final summary consists of all sentences labeled 1.

The key to this method is obtaining a representation of each sentence, i.e., encoding the sentence into a vector and performing binary classification on that vector. For example, the SummaRuNNer model [48] uses a bidirectional GRU to build word-level and sentence-level representations (the model is shown in Figure 1 below). The blue part is the word-level representation and the red part is the sentence-level representation. Each sentence representation produces a 0/1 label indicating whether the sentence belongs to the summary.

Figure 1 SummaRuNNer model structure

Training this model requires supervised data, but existing datasets often lack sentence-level labels; these can be obtained with a heuristic rule. Specifically, first compute the ROUGE score between each sentence in the original text and the reference summary, select the highest-scoring sentence, and add it to a candidate set; then keep greedily adding sentences from the original text as long as the ROUGE score of the selected set keeps increasing, stopping when no further improvement is possible. Sentences in the resulting candidate set are labeled 1 and the rest are labeled 0. A sketch of this greedy procedure follows.
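A sketch of the greedy heuristic for deriving sentence-level 0/1 labels from an abstractive reference summary. A unigram-recall proxy replaces a full ROUGE implementation so the example stays self-contained:

```python
def rouge_1_recall(candidate_tokens, reference_tokens):
    ref = set(reference_tokens)
    if not ref:
        return 0.0
    return len(ref & set(candidate_tokens)) / len(ref)

def greedy_oracle_labels(sentences, reference):
    ref_tokens = reference.split()
    selected, labels = [], [0] * len(sentences)
    best = 0.0
    while True:
        gains = []
        for i, sent in enumerate(sentences):
            if labels[i]:
                continue
            cand = " ".join(selected + [sent]).split()
            gains.append((rouge_1_recall(cand, ref_tokens), i))
        if not gains:
            break
        score, i = max(gains)
        if score <= best:        # stop when no sentence improves the score
            break
        best = score
        labels[i] = 1
        selected.append(sentences[i])
    return labels
```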

Sentence ranking

Extractive summarization can also be modeled as a sentence ranking task. The difference from sequence labeling is that, instead of assigning a 0/1 label to each sentence, the model outputs for each sentence the probability that it belongs to the summary, and the Top-K sentences by probability are selected as the final summary. Although the modeling (i.e., how the final summary is selected) differs, the core focus in both cases is still the modeling of sentence representations.

In the sequence labeling approach, sentences are scored after their representations are obtained, so scoring and selection are separated: scoring happens first, selection then follows the scores, and the relationships between sentences are not exploited. NeuSUM [49] proposes a new scoring method that uses the gain each sentence brings to the summary, thereby taking the interrelationships between sentences into account. The NeuSUM model is shown in Figure 2 below:

Figure 2 NeuSUM model structure

The sentence encoding part is essentially the same as before. The scoring and extraction part is implemented with a unidirectional GRU and a two-layer MLP: the GRU records which sentences have been extracted so far, and the MLP produces the score, as shown in the following formula:
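(The formula originally appeared as an image. The following is a hedged reconstruction based on the NeuSUM paper [49]; the notation may differ slightly from the original.)

```latex
h_t = \mathrm{GRU}\left(\tilde{s}_{t-1},\, h_{t-1}\right), \qquad
g_t(S_i) = W_s \tanh\left(W_q h_t + W_d \tilde{s}_i\right)
```

Here \tilde{s}_i is the vector of sentence S_i, h_t summarizes the sentences already extracted, and g_t(S_i) is the gain-based score of extracting S_i at step t.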

2.2 Generative Summary Model

Extractive summarization has certain guarantees in grammar and syntax, but it also faces problems such as incorrect content selection, poor coherence, and limited flexibility. Generative summarization allows new words or phrases to appear in the summary, which makes it more flexible. With the development of neural network models in recent years, sequence-to-sequence (Seq2Seq) models have been widely used for generative summarization and have achieved promising results. The classic Pointer-Generator [50] model and the key-point-based generative summarization model Leader+Writer [4] are introduced below.

Pointer-Generator Model

Using a plain Seq2Seq model for generative summarization has two main problems: ① out-of-vocabulary (OOV) words; ② repeated generation. Pointer-Generator [50] adds Copy and Coverage mechanisms to an attention-based Seq2Seq model, which effectively alleviates both problems. Its structure is shown in Figure 3 below:

Figure 3 Pointer-Generator model structure

The model is an attention-based Seq2Seq model: at each decoding step, the decoder hidden state and the encoder hidden states are used to compute attention weights, which yield a context vector; the context vector and the decoder hidden state are then used to compute the output probability.

Two innovations

  • Copy mechanism: at each decoding step, the model computes the probability of copying versus generating. Because the vocabulary is fixed, allowing words to be copied from the source text into the summary effectively alleviates the out-of-vocabulary (OOV) problem.
  • Coverage mechanism: at each decoding step, the attention weights of previous steps are taken into account, and a coverage loss penalizes attending again to parts that have already received high weight. This effectively alleviates the repeated-generation problem.
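For reference, the key quantities of the two mechanisms as given in [50], where h_t^* is the context vector, s_t the decoder state, x_t the decoder input, a^t the attention distribution, and c^t the coverage vector:

```latex
p_{\mathrm{gen}} = \sigma\left(w_{h^*}^{\top} h_t^{*} + w_s^{\top} s_t + w_x^{\top} x_t + b_{\mathrm{ptr}}\right)
P(w) = p_{\mathrm{gen}}\, P_{\mathrm{vocab}}(w) + \left(1 - p_{\mathrm{gen}}\right) \sum_{i:\, w_i = w} a_i^{t}
c^{t} = \sum_{t'=0}^{t-1} a^{t'}, \qquad \mathrm{covloss}_t = \sum_i \min\left(a_i^{t},\, c_i^{t}\right)
```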

Leader-Writer model

The Leader-Writer model generates summaries by mining the key points (such as background, conclusion, etc.) present in the dialogue. The authors summarize several problems with existing generative summarization: ① logic, e.g., in customer service dialogues the background should precede the conclusion; ② completeness, i.e., every key point in the dialogue should appear in the summary; ③ correctness of key information, e.g., "user agrees" and "user disagrees" differ by only one word but have opposite meanings; ④ overly long summaries. To address these problems, the paper proposes the following solutions:

  1. An auxiliary task of key-point sequence prediction is introduced, and the key-point sequence of the dialogue is used to guide the model toward logical, complete summaries with correct key information. As shown in Figure 4 below, the Leader-Writer model encodes each utterance with a Transformer encoder, uses the Leader decoder to predict the key point of each utterance, and uses the Writer decoder for summary generation; the Leader decoder's output serves as the initial state of the Writer decoder so that the key-point information of different dialogue segments can be exploited.
  2. The Pointer-Generator model is introduced to generate longer, more informative summaries.

Figure 4 Leader-Writer model

2.3 Dialogue Summary Model

Dialogue has characteristics such as scattered key information, low information density, multiple domains, topic shifts, and frequent speaker changes. As a result, general text summarization models are difficult to apply directly to dialogue summarization, and a number of research works are devoted to solving these problems. Two representative dialogue summarization models are introduced below: SPNet [53] and TDS-SATM [54].

Scaffold Pointer Network (SPNet)

SPNet targets three problems faced by dialogue summarization: (1) there are multiple speakers; (2) key entity information is difficult to summarize correctly; (3) dialogues span many domains with strong domain-specific characteristics. To address them, the paper proposes three solutions:

  1. Use the Pointer-Generator for generative summarization, while introducing separate encoders for different speaker roles.
  2. Replace entity information such as place names and times in the encoder input with unified placeholder symbols, e.g., a time expression is replaced with [time] (see the sketch after this list).
  3. Introduce an auxiliary loss for dialogue domain classification, adding the cross-entropy loss of multi-domain classification as an auxiliary objective.
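A hedged sketch of the entity delexicalization step in item 2: entity mentions are replaced with unified placeholder symbols before encoding. The regexes and placeholder names here are illustrative, not SPNet's exact rules:

```python
import re

# Illustrative placeholder -> pattern mapping (not the actual SPNet configuration).
PATTERNS = {
    "[time]":  r"\b\d{1,2}:\d{2}\b",   # e.g. "18:30"
    "[phone]": r"\b\d{11}\b",           # e.g. an 11-digit phone number
}

def delexicalize(utterance: str) -> str:
    for placeholder, pattern in PATTERNS.items():
        utterance = re.sub(pattern, placeholder, utterance)
    return utterance
```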

TDS-SATM

The important information in a dialogue is often scattered across different sentences, most utterances are unimportant common expressions, and noise and transcription errors are also frequent. To address these problems, the authors propose the following two solutions:

  1. A saliency-aware neural topic model (SATM), built on neural topic models, is proposed to infer the topic distribution of the dialogue. The authors divide topics into informative topics and other topics. During SATM's generative process, each word that appears in the reference summary is constrained to be generated from informative topics, so that SATM produces more topic-relevant words.
  2. To capture role information and extract semantic topics from the conversation, the authors use SATM to perform topic modeling separately for customer utterances, agent utterances, and the overall conversation. They then use a two-stage summary generator that first extracts salient sentences and then generates the summary from them; the topic information obtained by SATM is incorporated into the generator so that the summary covers the important information in the conversation.

The overall architecture of the model is shown in Figure 5 below:

Figure 5 Overall architecture of TDS-SATM

3. Span-level extractive summarization scheme DSMRC-S based on reading comprehension (published in SIGIR 2021)

3.1 Background introduction

To ensure a good user experience, Meituan employs a large number of human customer service agents to handle users' incoming calls. After a call, the agent has to write up its content manually, which is time-consuming and labor-intensive. An effective dialogue summarization model can greatly improve agents' work efficiency and reduce the average handling time of each incoming call.

Although the classic methods above achieve good results on datasets such as CNN/Daily Mail and LCSTS, they still face many challenges in practical scenarios. For example, generative summarization still lacks stability (repetition or odd wording) and logical consistency; extractive summarization, lacking explicit annotations to train on, usually labels sentences automatically by treating "sentences with high ROUGE-L scores" as positive examples, but this coarse sentence-level labeling is prone to noise. In addition, the output of existing dialogue summarization models is uncontrollable, making it difficult to obtain specific information elements.

To make the approach applicable in practice, we introduce a span-level extractive dialogue summarization scheme based on reading comprehension. The related work was published at SIGIR 2021, and the method is described in detail below.

3.2 Method introduction

To address the problems that existing dialogue summarization struggles to extract specified information elements and lacks labeled data, we propose a more flexible Distant Supervision based Machine Reading Comprehension model for extractive Summarization, abbreviated DSMRC-S. Its overall structure is shown in Figure 6 below:

Figure 6 Overall structure of the DSMRC-S model

DSMRC-S consists of a BERT-based MRC (Machine Reading Comprehension) module, a distant supervision module, and a density-based extraction strategy. In the preprocessing stage, the tokens in the dialogue are annotated automatically, and the model is trained to predict, for each token in the dialogue, the probability of it appearing in the answer. Then, based on these predicted probabilities, a density-based extraction strategy selects the most suitable span as the answer.
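A hedged sketch of a density-based span extraction strategy: given the per-token probabilities predicted by the MRC module, pick the contiguous span with the highest average probability. The exact scoring used in DSMRC-S may differ; this only approximates the idea described above:

```python
def extract_span(tokens, probs, max_len=50):
    # probs[i] is the predicted probability that tokens[i] appears in the answer.
    best_span, best_density = (0, 1), float("-inf")
    for start in range(len(tokens)):
        running = 0.0
        for end in range(start + 1, min(start + 1 + max_len, len(tokens) + 1)):
            running += probs[end - 1]
            density = running / (end - start)      # average probability of the span
            if density > best_density:
                best_density, best_span = density, (start, end)
    return tokens[best_span[0]:best_span[1]]
```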

Our method can be mainly divided into two parts: ① Converting the dialogue summarization task into reading comprehension; ② A reading comprehension scheme without additional annotation.

Converting dialogue summarization into a reading comprehension task

After a call, the agent needs to write a summary whose content usually covers several fixed key elements, such as "background of the user's call", "the user's request", "the solution", and so on. Based on this characteristic, we convert the automatic summarization task into a reading comprehension task, where each key element of the summary corresponds to a question in the reading comprehension task.
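Purely for illustration, the mapping from key elements to questions could look like the hypothetical templates below; the actual questions used in production are not specified in this article:

```python
# Hypothetical key-element -> question templates for the MRC formulation.
QUESTION_TEMPLATES = {
    "call_background": "What is the background of the user's call?",
    "user_request":    "What does the user ask for?",
    "solution":        "What solution did the agent provide?",
}
```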

The benefits of this conversion are:

  • The powerful language understanding capabilities of pre-trained language models can be more effectively utilized.
  • Compared with the uncontrollable content generated by Seq2Seq, reading comprehension can be guided more specifically through questions, so the answers are more focused and the information elements of interest can be obtained.

Reading comprehension solution without additional annotation

Reading comprehension tasks usually require a large amount of labeled data. Fortunately, human agents record a large amount of key information (such as "background of the user's call", "the user's request", "the solution", etc.), which can serve as the answers to the reading comprehension questions. However, these records are not verbatim fragments of the conversation and cannot be used directly for extractive reading comprehension. To solve this problem, we designed the two stages outlined above, namely automatic token annotation via distant supervision and density-based span extraction, forming a reading comprehension scheme that does not rely on additional annotation.
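A hedged sketch of the distant-supervision labeling idea: since the agent-written record is not an exact fragment of the dialogue, the dialogue tokens are weakly labeled, for instance by whether they also appear in the agent's record. The precise labeling rule in DSMRC-S may be more elaborate than this:

```python
def distant_labels(dialogue_tokens, agent_record_tokens):
    record = set(agent_record_tokens)
    # 1 if the dialogue token also occurs in the hand-written record, else 0
    return [1 if tok in record else 0 for tok in dialogue_tokens]
```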

3.3 Experiment

In this section, we evaluate the model performance of DSMRC-S, and the experimental setup and experimental results are detailed below.

Dataset

We evaluate on in-house Meituan data containing 400,000 conversations, each paired with four key elements handwritten by agents (such as the user's request and the agent's solution).

Experimental details

Evaluation Metrics

We use the BLEU and ROUGE-L (F1) metrics, commonly used in machine translation and text summarization, to measure how close the output is to the reference text (the agent's handwritten summary): BLEU evaluates n-gram precision against the reference, and ROUGE-L computes an F1 score based on the longest common subsequence. We also report the Distinct metric to measure the diversity of the output summaries.
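Distinct-n (reference [42]) is simple to compute: the ratio of unique n-grams to the total number of n-grams in the generated summaries. A minimal sketch:

```python
def distinct_n(texts, n=1):
    ngrams, total = set(), 0
    for text in texts:
        tokens = text.split()
        for i in range(len(tokens) - n + 1):
            ngrams.add(tuple(tokens[i:i + n]))
            total += 1
    return len(ngrams) / total if total else 0.0
```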

Comparison methods

  1. S2S+Att : A Sequence-to-Sequence [44] model based on the RNN+Attention [45] mechanism.
  2. S2S+Att+Pointer : Added the Pointer mechanism [50] , allowing the model to decide whether to generate a Token or copy a Token from the dialogue.
  3. S2S+Att+Pointer(w) : (w) refers to predicting the entire summary as a whole, rather than predicting multiple key elements and then combining them in the end.
  4. Trans+Att+Pointer : Replace RNN with Transformer [46] .
  5. Trans+Att+Pointer(w) : Replace RNN with Transformer, (w) refers to predicting the entire summary as a whole, rather than predicting multiple key elements and finally combining them.
  6. Leader+Writer : A hierarchical Transformer structure [4] , the Leader module first predicts the sequence of key elements, and the Writer module generates the final summary according to the sequence of key elements.
  7. TDS+SATM : A two-stage approach for sentence-level summary extraction and character-level summary generation using Transformer structures [54] , and neural topic models for topic enhancement.
  8. DSMRC-S : Our proposed Span-level extractive summarization method based on reading comprehension.

Experimental results

Main experiment

Table 1 Comparison of DSMRC-S and other baseline methods (%)

The performances of DSMRC-S and other Baseline methods are shown in Table 1. We can draw the following conclusions:

  • Our model achieves the best performance, outperforming the best Baseline method by about 3% on both BLEU and ROUGE-L.
  • Predicting each key element separately works significantly better than predicting the entire summary as a whole. For example, Trans+Att+Pointer is 3.62% higher on ROUGE-L than Trans+Att+Pointer(w). This suggests that in the customer service scenario it is necessary to split the summary into elements for prediction.
  • In terms of summary diversity, our model also achieves the best performance, outperforming the best baseline method by 3.9% on the Distinct-1 metric.

Performance on different key elements

Figure 7 Performance (%) of DSMRC-S and baseline methods on predicting different key elements

As shown in the figure above, we report the model's performance on predicting each key element. Our method DSMRC-S outperforms the baseline methods on every key element, showing that it helps extract the content of different key elements. In particular, on the second key element (the user's request), our method is significantly better, possibly because the user's request is usually mentioned in the conversation verbatim.

Performance on conversations of different lengths

Figure 8 Performance of DSMRC-S and baseline methods on samples with different numbers of dialogue turns and summary lengths

As shown in the figure above, we also report performance on samples with different numbers of dialogue turns and summary lengths. ROUGE-L decreases for almost all methods as the number of turns and the summary length increase, because prediction becomes harder. Nevertheless, our method DSMRC-S outperforms the baseline methods across all dialogue-turn and summary-length buckets.

4. Summary and Outlook

This article first introduced the classic methods of text summarization, including extractive and generative approaches, and then introduced a more flexible span-level scheme based on distantly supervised reading comprehension, which outperforms the strong baseline method by about 3% on both the ROUGE-L and BLEU metrics. Going forward, we will continue to explore and practice dialogue summarization in the following directions:

  • Abstract extraction method for multi-span answers;
  • Exploration of Prompt-based Generative Dialogue Summarization Method;
  • Deep modeling of dialogue structure to capture richer dialogue information.

5. References

  • [1] AM Rush, S. Chopra, and J. Weston, “A neural attention model for abstractive sentence summarization,” in Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015.
  • [2] A. See, PJ Liu, and CD Manning, “Get to the point: Summarization with pointer-generator networks,” in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, ACL 2017.
  • [3] S. Gehrmann, Y. Deng, and AM Rush, “Bottom-up abstractive summarization,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018.
  • [4] C. Liu, P. Wang, J. Xu, Z. Li, and J. Ye, “Automatic dialogue summary generation for customer service,” in Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD 2019.
  • [5] S. Chopra, M. Auli, and AM Rush, “Abstractive sentence summarization with attentive recurrent neural networks,” in NAACL HLT 2016.
  • [6] Y. Miao and P. Blunsom, “Language as a latent variable: Discrete generative models for sentence compression,” in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, EMNLP 2016.
  • [7] D. Wang, P. Liu, Y. Zheng, X. Qiu, and X. Huang, “Heterogeneous graph neural networks for extractive document summarization,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020.
  • [8] M. Zhong, D. Wang, P. Liu, X. Qiu, and X. Huang, “A closer look at data bias in neural extractive summarization models.”
  • [9] Q. Zhou, N. Yang, F. Wei, S. Huang, M. Zhou, and T. Zhao, “Neural document summarization by jointly learning to score and select sentences,” in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018,
  • [10] J. Cheng and M. Lapata, “Neural summarization by extracting sentences and words,” in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, ACL 2016
  • [11] R. Nallapati, F. Zhai, and B. Zhou, “Summarunner: A recurrent neural network based sequence model for extractive summarization of documents,” in Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence,
  • [12] H. Pan, J. Zhou, Z. Zhao, Y. Liu, D. Cai, and M. Yang, “Dial2desc: End-to-end dialogue description generation,” CoRR, vol. abs/1811.00185, 2018 .
  • [13] C. Goo and Y. Chen, “Abstractive dialogue summarization with sentence-gated modeling optimized by dialogue acts,” in 2018 IEEE Spoken Language Technology Workshop, SLT 2018
  • [14] J. Gu, T. Li, Q. Liu, Z. Ling, Z. Su, S. Wei, and X. Zhu, “Speaker-aware BERT for multi-turn response selection in retrieval-based chatbots,” in CIKM '20
  • [15] K. Filippova, E. Alfonseca, CA Colmenares, L. Kaiser, and O. Vinyals, “Sentence compression by deletion with lstms,” in Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, EMNLP 2015.
  • [16] R. Nallapati, B. Zhou, C. N. dos Santos, Ç. Gülçehre, and B. Xiang, “Abstractive text summarization using sequence-to-sequence rnns and beyond,” in Proceedings of the 20th SIGNLL Conference on Computational Natural Language Learning, CoNLL 2016,
  • [17] A. Celikyilmaz, A. Bosselut, X. He, and Y. Choi, “Deep communicating agents for abstractive summarization,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics
  • [18] R. Paulus, C. Xiong, and R. Socher, “A deep reinforced model for abstractive summarization,” in 6th International Conference on Learning Representations, ICLR 2018
  • [19] L. Zhao, W. Xu, and J. Guo, “Improving abstractive dialogue summarization with graph structures and topic words,” in Proceedings of the 28th International Conference on Computational Linguistics, COLING 2020,
  • [20] Y. Zou, L. Zhao, Y. Kang, J. Lin, M. Peng, Z. Jiang, C. Sun, Q. Zhang, X. Huang, and X. Liu, “Topic-oriented spoken dialogue summarization for customer service with saliency-aware topic modeling,” in Thirty-Fifth AAAI Conference on Artificial Intelligence, AAAI 2021
  • [21] Q. Zhou, N. Yang, F. Wei, S. Huang, M. Zhou, and T. Zhao, “A joint sentence scoring and selection framework for neural extractive document summarization,” IEEE ACM Trans. Audio Speech Lang . Process., vol. 28, pp. 671–681, 2020.
  • [22] Y. Chen and M. Bansal, “Fast abstractive summarization with reinforce-selected sentence rewriting,” in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018.
  • [23] A. Jadhav and V. Rajan, “Extractive summarization with SWAP-NET: sentences and words from alternating pointer networks,” in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018,
  • [24] S. Narayan, SB Cohen, and M. Lapata, “Ranking sentences for extractive summarization with reinforcement learning,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL- HLT 2018,
  • [25] X. Zhang, M. Lapata, F. Wei, and M. Zhou, “Neural latent extractive document summarization,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing,
  • [26] Y. Liu, I. Titov, and M. Lapata, “Single document summarization as tree induction,” in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019,
  • [27] J. Xu, Z. Gan, Y. Cheng, and J. Liu, “Discourse-aware neural extractive text summarization,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020
  • [28] M. Zhong, P. Liu, Y. Chen, D. Wang, X. Qiu, and X. Huang, “Extractive summarization as text matching,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020
  • [29] Y. Wu, W. Wu, C. Xing, M. Zhou, and Z. Li, “Sequential matching network: A new architecture for multi-turn response selection in retrieval-based chatbots,” in ACL 2017,
  • [30] Z. Zhang, J. Li, P. Zhu, H. Zhao, and G. Liu, “Modeling multi-turn conversation with deep utterance aggregation,” in COLING 2018,
  • [31] X. Zhou, L. Li, D. Dong, Y. Liu, Y. Chen, WX Zhao, D. Yu, and H. Wu, “Multi-turn response selection for chatbots with deep attention matching network,” in ACL 2018
  • [32] C. Tao, W. Wu, C. Xu, W. Hu, D. Zhao, and R. Yan, “One time of interaction may not be enough: Go deep with an interaction-over-interaction network for response selection in dialogues,” in ACL 2019
  • [33] M. Henderson, I. Vulic, D. Gerz, I. Casanueva, P. Budzianowski, S. Coope, G. Spithourakis, T. Wen, N. Mrksic, and P. Su, “Training neural response selection for task-oriented dialogue systems,” in Proceedings of the 57th Conference of the Association for Computational Linguistics, ACL 2019
  • [34] J. Devlin, M. Chang, K. Lee, and K. Toutanova, “BERT: pre-training of deep bidirectional transformers for language understanding,” in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019,
  • [35] J. Dong and J. Huang, “Enhance word representation for out-of-vocabulary on ubuntu dialogue corpus,” CoRR, vol. abs/1802.02614, 2018.
  • [36] C. Goo and Y. Chen, “Abstractive dialogue summarization with sentence-gated modeling optimized by dialogue acts,” in 2018 IEEE Spoken Language Technology Workshop, SLT 2018,
  • [37] Q. Chen, Z. Zhuo, and W. Wang, “BERT for joint intent classification and slot filling,” CoRR, vol. abs/1902.10909, 2019.
  • [38] L. Song, K. Xu, Y. Zhang, J. Chen, and D. Yu, “ZPR2: joint zero pronoun recovery and resolution using multi-task learning and BERT,” in Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, ACL 2020
  • [39] S. Chuang, AH Liu, T. Sung, and H. Lee, “Improving automatic speech recognition and speech translation via word embedding prediction,” IEEE ACM Trans. Audio Speech Lang. Process., vol. 29, pp. 93–105, 2021.
  • [40] C.-Y. Lin, “ROUGE: A package for automatic evaluation of summaries,” in Text Summarization Branches Out. Barcelona, Spain: Association for Computational Linguistics, Jul. 2004, pp. 74–81.
  • [41] K. Papineni, S. Roukos, T. Ward, and W. Zhu, “Bleu: a method for automatic evaluation of machine translation,” in Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics,
  • [42] J. Li, M. Galley, C. Brockett, J. Gao, and B. Dolan, “A diversity-promoting objective function for neural conversation models,” in NAACL HLT 2016, The 2016 Conference of the North American Chapter of the Association for Computational Linguistics.
  • [43] Y. Liu and M. Lapata, “Text summarization with pretrained encoders,” in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019,
  • [44] I. Sutskever, O. Vinyals, and Q. V. Le, “Sequence-to-sequence learning with neural networks,” in Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014
  • [45] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation by jointly learning to align and translate,” in 3rd International Conference on Learning Representations, ICLR 2015,
  • [46] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, AN Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017,
  • [47] C. Raffel, N. Shazeer, A. Roberts, K. Lee, S. Narang, M. Matena, Y. Zhou, W. Li, and P. J. Liu, “Exploring the limits of transfer learning with a unified text-to-text transformer,” J. Mach. Learn. Res., vol. 21, pp. 140:1–140:67, 2020.
  • [48] R.Nallapati, F. Zhai, B. Zhou, “SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents.” AAAI 2017.
  • [49] Q. Zhou, N. Yang, F. Wei, S. Huang, M. Zhou, T. Zhao, “Nerual Document Summarization by Jointly Learning to Score and Select Sentences,” ACL 2018.
  • [50] Abigail See, Peter J Liu, and Christopher D Manning. Get to the point: Summarization with pointer-generator networks. arXiv preprint arXiv:1704.04368, 2017.
  • [51] Abdelrahman Mohamed, Omer Levy, Veselin Stoyanov and Luke Zettlemoyer. “BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension.” ACL (2020).
  • [52] Zhang, Jingqing, Yao Zhao, Mohammad Saleh and Peter J. Liu. “PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization.” ArXiv abs/1912.08777 (2020): n. pag.
  • [53] Yuan, Lin and Zhou Yu. “Abstractive Dialog Summarization with Semantic Scaffolds.” ArXiv abs/1910.00825 (2019): n. pag.
  • [54] Zou, Yicheng, Lujun Zhao, Yangyang Kang, Jun Lin, Minlong Peng, Zhuoren Jiang, Changlong Sun, Qi Zhang, Xuanjing Huang and Xiaozhong Liu. “Topic-Oriented Spoken Dialogue Summarization for Customer Service with Saliency-Aware Topic Modeling .” AAAI (2021).
  • [55] Brown, Tom B. et al. “Language Models are Few-Shot Learners.” ArXiv abs/2005.14165 (2020): n. pag.
  • [56] Radford, Alec, Jeff Wu, Rewon Child, David Luan, Dario Amodei and Ilya Sutskever. “Language Models are Unsupervised Multitask Learners.” (2019).
  • [57] Radford, Alec and Karthik Narasimhan. “Improving Language Understanding by Generative Pre-Training.” (2018).
  • [58] Mihalcea, Rada and Paul Tarau. “TextRank: Bringing Order into Text.” EMNLP (2004).
  • [59] Hartigan, JA and M. Anthony. Wong. “A k-means clustering algorithm.” (1979).
  • [60] Comaniciu, Dorin and Peter Meer. “Mean Shift: A Robust Approach Toward Feature Space Analysis.” IEEE Trans. Pattern Anal. Mach. Intell. 24 (2002): 603-619.
  • [61] Lin, Chin-Yew. “ROUGE: A Package for Automatic Evaluation of Summaries.” ACL 2004 (2004).
  • [62] Papineni, Kishore, Salim Roukos, Todd Ward and Wei-Jing Zhu. “Bleu: a Method for Automatic Evaluation of Machine Translation.” ACL (2002).
  • [63] Ishikawa, Kai, Shinichi Ando and Akitoshi Okumura. “Hybrid Text Summarization Method based on the TF Method and the Lead Method.” NTCIR (2001).
  • [64] Feng, Xiachong, Xiaocheng Feng and Bing Qin. “A Survey on Dialogue Summarization: Recent Advances and New Frontiers.” ArXiv abs/2107.03175 (2021): n. pag.
  • [65] El-Kassas, Wafaa S., Cherif R. Salama, Ahmed A. Rafea and Hoda Korashy Mohamed. “Automatic text summarization: A comprehensive survey.” Expert Syst. Appl. 165 (2021): 113679.
  • [66] Nallapati, Ramesh, Bowen Zhou, Cícero Nogueira dos Santos, Çaglar Gülçehre and Bing Xiang. “Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond.” CoNLL (2016).
  • [67] Shi, Tian, Yaser Keneshloo, Naren Ramakrishnan and Chandan K. Reddy. “Neural Abstractive Text Summarization with Sequence-to-Sequence Models.” ACM Transactions on Data Science 2 (2021): 1 - 37.
  • [68] Fabbri, Alexander R., Irene Li, Tianwei She, Suyi Li and Dragomir R. Radev. “Multi-News: A Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model.” ArXiv abs/1906.01749 (2019) : n. pag.
  • [69] Li, Wei and Hai Zhuge. “Abstractive Multi-Document Summarization Based on Semantic Link Network.” IEEE Transactions on Knowledge and Data Engineering 33 (2021): 43-54.
  • [70] DeYoung, Jay, Iz Beltagy, Madeleine van Zuylen, Bailey Kuehl and Lucy Lu Wang. “MSˆ2: Multi-Document Summarization of Medical Studies.” EMNLP (2021).
  • [71] Nallapati, Ramesh, Feifei Zhai and Bowen Zhou. “SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents.” AAAI (2017).
  • [72] Narayan, Shashi, Shay B. Cohen and Mirella Lapata. “Ranking Sentences for Extractive Summarization with Reinforcement Learning.” NAACL (2018).
  • [73] Zhong, Ming, Pengfei Liu, Yiran Chen, Danqing Wang, Xipeng Qiu and Xuanjing Huang. “Extractive Summarization as Text Matching.” ACL (2020).
  • [74] Zhang, Jingqing, Yao Zhao, Mohammad Saleh and Peter J. Liu. “PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization.” ArXiv abs/1912.08777 (2020): n. pag.

6. About the authors

Ma Bing, Liu Cao, Jinxiong, Shujie, Jiansong, Yang Fan, Guanglu, etc. are all from the Meituan Platform/Voice Interaction Department.

7. Recruitment information

The Voice Interaction Department is responsible for the R&D of Meituan's voice and intelligent interaction technologies and products, providing large-scale voice and spoken-language data processing and intelligent response capabilities for Meituan's businesses and ecosystem partners. After years of R&D and accumulation, the team has built large-scale platform services for technologies such as speech recognition, speech synthesis, spoken language understanding, intelligent question answering, and multi-turn interaction, and has developed solutions including outbound-call robots, intelligent customer service, and voice content analysis, which have been widely deployed across the company's business scenarios. At the same time, we attach great importance to close cooperation with the industry: through the Meituan voice application platform, we have connected with many partners such as third-party mobile voice assistants, smart speakers, and smart in-car systems, bringing voice-based life-service applications to more users.

The Voice Interaction Department has long-standing openings for natural language processing algorithm engineers and algorithm experts. Interested candidates can send their resumes to chenjiansong@meituan.com.


| This article is produced by Meituan technical team, and the copyright belongs to Meituan. Welcome to reprint or use the content of this article for non-commercial purposes such as sharing and communication, please indicate "The content is reproduced from the Meituan technical team". This article may not be reproduced or used commercially without permission. For any commercial activities, please send an email to tech@meituan.com to apply for authorization.

