As a cornerstone of the digitalization of the new retail industry, the commodity knowledge graph provides a precise, structured understanding of commodities and plays a vital role in business applications. Compared with the merchant-centered graph originally built in Meituan Brain, the commodity graph must handle more scattered, complex, and massive data and business scenarios, and it faces challenges such as low-quality information sources, many data dimensions, and reliance on common sense and professional knowledge. This article focuses on the knowledge graph of retail commodities and introduces Meituan's exploration of commodity hierarchy construction, attribute system construction, and improving the labor efficiency of graph construction. We hope it will be helpful or inspiring.

Background

Meituan Brain

In recent years, artificial intelligence has been rapidly changing people's lives. Two technical driving forces lie behind it: deep learning and knowledge graphs. We characterize deep learning as an implicit model, usually trained for a specific task such as playing Go, recognizing cats, face recognition, or speech recognition. It achieves excellent results on many tasks, but it also has limitations: it requires large amounts of training data and powerful computing resources, it is difficult to transfer across tasks, and it lacks good interpretability. The knowledge graph, as an explicit model, is the other major driving force of artificial intelligence and can be applied widely across tasks. Compared with deep learning, the knowledge in a knowledge graph can be accumulated over time, is strongly interpretable, and is closer to human thinking. It supplies accumulated human knowledge to implicit deep models, so the two approaches complement each other. For this reason, many large Internet companies around the world are actively investing in knowledge graphs.

Figure 1: The two driving forces of artificial intelligence

Meituan connects hundreds of millions of users with tens of millions of merchants, and behind these connections lies a wealth of knowledge about daily life. In 2018, the Meituan knowledge graph team began building Meituan Brain, with the aim of using knowledge graph technology to empower the business and further improve the user experience. Specifically, Meituan Brain deeply understands and structures the tens of millions of merchants, hundreds of millions of dishes and commodities, billions of user reviews, and millions of scenarios behind Meituan's business. Through knowledge modeling, it builds associations among people, stores, commodities, and scenarios, forming a large-scale knowledge graph for the life services domain. At this stage, Meituan Brain covers billions of entities and tens of billions of triples, and the effectiveness of the knowledge graph has been verified in catering, food delivery, hotel, finance, and other scenarios.

Figure 2: Meituan Brain

Exploration in the field of new retail

Meituan has gradually moved beyond its original boundaries to explore new businesses in life services. No longer limited to helping people "eat better" through food delivery and catering, it has in recent years expanded into retail, travel, and other areas to help everyone "live better". In retail, Meituan has successively launched businesses such as Meituan Flash Sale, Meituan Shopping, Meituan Optimal, and Tuanhaohu, gradually realizing the vision of "everything at home". To better support Meituan's new retail business, we need to build a knowledge graph of the retail commodities behind it, accumulate structured data, and develop a deep understanding of the commodities, users, attributes, and scenarios in the retail domain, so that we can better serve users in this domain.

Compared with merchant-centered domains such as catering, food delivery, and hotels, the retail commodity domain poses greater challenges for building and applying knowledge graphs. On the one hand, the number of commodities is larger and the coverage is broader. On the other hand, the information displayed for a commodity is often sparse, and much of it has to be inferred with everyday common sense in order to fill in the dozens of attribute dimensions hidden behind it and reach a complete understanding of the commodity. In the example in the figure below, a product description as simple as "Lay's Cucumber Flavor" actually corresponds to a wealth of hidden information. Only after this knowledge has been extracted into structured form and reasoned over can it better support the optimization of downstream modules such as search and recommendation.

Figure 3: Applications of structured commodity information

Goals of commodity graph construction

Based on the characteristics of Meituan's retail business, we have developed a multi-level, multi-dimensional, cross-business knowledge graph system for retail commodities.

Figure 4: The commodity knowledge graph system

Multi-level

Different application scenarios in different businesses define "commodity" differently, so commodities need to be understood at different granularities. Our retail commodity knowledge graph therefore establishes a five-layer hierarchy:

  • L1-Commodity SKU/SPU: Corresponds to the granularity at which commodities are sold in the business. It is the object of user transactions and is usually tied to a merchant, such as "Mengniu low-fat high-calcium milk 250ml box sold by Wangjing Carrefour". This level is the bottom-level cornerstone of the commodity graph, linking the business commodity library to the graph knowledge.
  • L2-Standard product: Describes the granularity of objective facts about the product itself, such as "Mengniu low-fat high-calcium milk 250ml box". Whichever merchant or channel it is purchased from, the product itself is the same. The commodity barcode is the objective basis for a standard product. At this level we can model objective knowledge around standard products; for example, the same standard product always has the same brand, taste, packaging, and other attributes.
  • L3-Abstract product: We further abstract standard products into product series, such as "Mengniu low-fat high-calcium milk". At this level we no longer care about specific packaging and specifications. Products of the same series are aggregated into an abstract product, which carries users' subjective perceptions of the product, including aliases, brand perception, and subjective evaluations of the series.
  • L4-Main category: Describes the essential category of the main body of the product, such as "eggs", "cream strawberry", or "Taiwanese sausage". This layer serves as the back-end category system of the commodity graph. It models the categories of the commodity domain objectively and carries users' needs for products; for example, eggs of any brand or origin can meet a user's need for the category "eggs".
  • L5-Business category: In contrast to the back-end category system of main categories, business categories form the front-end category system, which is manually defined and adjusted according to the current stage of business development. Each business establishes its own front-end category system according to the characteristics and needs of its current stage.
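
As a minimal sketch of how the five layers could link together, the following Python snippet models each business-independent level as a node pointing at the next, more abstract level. The class `GraphNode`, the function `lineage`, and the mapping `BUSINESS_CATEGORY` (including the business name "business_a" and the front-end category "Milk & Dairy") are hypothetical names chosen for illustration; they are not taken from Meituan's system.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class GraphNode:
    level: str                            # "L1" .. "L4"
    name: str
    parent: Optional["GraphNode"] = None  # next, more abstract level


def lineage(node: Optional[GraphNode]) -> List[Tuple[str, str]]:
    """Walk from a node up through the business-independent levels."""
    chain = []
    while node is not None:
        chain.append((node.level, node.name))
        node = node.parent
    return chain


main_category = GraphNode("L4", "high-calcium milk")
abstract = GraphNode("L3", "Mengniu low-fat high-calcium milk", main_category)
standard = GraphNode("L2", "Mengniu low-fat high-calcium milk 250ml box", abstract)
sku = GraphNode(
    "L1",
    "Mengniu low-fat high-calcium milk 250ml box sold by Wangjing Carrefour",
    standard,
)

# L5 is defined per business, so it is kept as a separate mapping
# (hypothetical business name and front-end category).
BUSINESS_CATEGORY = {("business_a", "high-calcium milk"): "Milk & Dairy"}

for level, name in lineage(sku):
    print(level, name)
```

Keeping the L5 mapping outside the node chain reflects the point that front-end categories are adjusted per business, while L2 to L4 stay decoupled from any one business.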

Multi-dimensional

  • Product attribute perspective: Describing a product requires a large number of attribute dimensions. These fall into two main types: general attribute dimensions, including brand, specification, packaging, and origin; and category-specific attribute dimensions. For milk, for example, we pay attention to fat content (whole, low-fat, or skimmed) and storage method (shelf-stable or refrigerated). Commodity attributes mainly describe objective knowledge about commodities and are usually built at the standard product level.
  • User perception perspective: Beyond the objective dimensions of commodity attributes, users often hold a range of subjective perceptions of commodities, such as aliases ("little black bottle", "happy water"), evaluations ("sweet and delicious", "melts in the mouth", "good value for money"), and lists or rankings ("Imported Food List", "Summer Essentials"). These subjective perceptions are usually built at the abstract product level.
  • Category perspective: Different categories, at both the essential and the broader level, have different concerns. At this level we model which brands are typical of each category, which attributes users typically care about, and how long the repurchase cycle is for each category.

Cross-business

The goal of the Meituan Brain commodity knowledge graph is to model commodity knowledge about the objective world, rather than being limited to a single business. In the five-layer system, standard products, abstract products, and the main category system are all decoupled from the business and built around objective commodities. The data dimensions built on these levels likewise describe objective knowledge of the commodity domain.

When applying the graph to a business, we link the objective graph knowledge upward to the business's front-end categories and downward to its SPUs/SKUs. This completes the integration of each business's data and connects it with objective knowledge, providing a more comprehensive, cross-business panoramic view of the data. With such data, on the user side we can more comprehensively model and analyze users' preferences for businesses and categories and their sensitivity to price and quality; on the commodity side we can more accurately model each category's repurchase cycle and its regional, seasonal, and holiday preferences.

Challenges of building a commodity graph

The challenges of building a commodity knowledge graph come mainly from the following three aspects:

  1. Low quality of information sources: The information available on a commodity itself is relatively scarce, often consisting mainly of a title and pictures. Especially in LBS e-commerce scenarios such as Meituan Flash Sale, merchants have to upload large amounts of commodity data, and the entered information is frequently incomplete. Beyond titles and pictures, commodity detail pages also contain a lot of knowledge, but their quality is uneven and their structure varies, which makes mining knowledge from them extremely difficult.
  2. Many data dimensions: The commodity domain has many data dimensions to build. Taking commodity attributes as an example, we need to build not only general attributes such as brand, specification, packaging, and taste, but also the specific attributes that matter in each category, such as fat content, sugar content, and battery capacity. Overall, hundreds of attribute dimensions are involved, so the efficiency of data construction is also a major challenge.
  3. Reliance on common sense and professional knowledge: Because people have rich common sense about daily life, they can recover hidden commodity information from a very short description. Seeing "Lay's Cucumber", for example, they know it is actually cucumber-flavored potato chips; seeing "Tang Monk Meat", they know it is a snack rather than a kind of meat. We therefore need to explore semantic understanding methods that incorporate common sense knowledge. Meanwhile, in domains such as medicine and personal care, graph construction relies on strong professional knowledge, such as the relations between diseases and drugs. These relations demand extremely high accuracy, and all of the knowledge must be correct, so efficient graph construction also requires a good combination of experts and algorithms.

Commodity graph construction

With the goals and challenges of graph construction in mind, we now introduce the specific approach to building the commodity graph data.

Hierarchical system construction

Category system construction

An essential category describes the finest-grained category to which a product belongs. It aggregates one kind of product and carries the user's ultimate consumption need, such as "high-calcium milk" or "beef jerky". An essential category also differs from a broader category: a broader category is a collection of several essential categories. It is an abstract category concept that cannot be pinned to one specific kind of goods, such as "dairy products" or "fruits".

Category marking: A key step in building the commodity graph is establishing the relation between products and categories, that is, labeling each product with its category. Through this association we can link the products in the commodity library to user needs and then display specific products to users. The category marking method is briefly introduced below:

  1. Category vocabulary construction: Category marking first requires a preliminary vocabulary of product categories. We obtain candidate words by applying word segmentation, NER, and new word discovery to data sources such as the commodity libraries, search logs, and merchant tags of Meituan's e-commerce businesses. We then label a small number of samples and train a binary classification model that judges whether a word is a category. In addition, we use active learning: from the predicted results we select the samples that are hard to distinguish, label them, and keep iterating until the model converges.
  2. Category marking: First, we identify candidate categories in the product title using the category vocabulary from the previous step, for example recognizing "skimmed milk" and "milk" in "Mengniu skimmed milk 500ml". Then, given a product and its candidate categories, we train a binary classification model on supervised data. Its input is a pair <SPU_ID, TAG> consisting of the product's SPU_ID and a candidate category TAG, and it predicts whether the two match. Specifically, we use the rich semi-structured corpora in the business to build statistical features around the tag words, and we use models such as named entity recognition and BERT-based semantic matching to produce higher-level relevance features. On this basis, all of these features are fed into the final decision model for training.
  3. Category label post-processing: In this step we post-process the categories assigned by the model, for example with cleaning strategies based on image relevance and on the named entity recognition results for the product title.

Through these three steps, we establish the association between products and categories.
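
The candidate generation part of the category marking step can be sketched as follows. This is a simplified illustration under stated assumptions: plain substring matching stands in for the word segmentation and NER used in practice, and the function name `candidate_pairs`, the SPU IDs, and the small vocabulary are hypothetical. The resulting <SPU_ID, TAG> pairs are what a downstream binary classifier would then accept or reject.

```python
from typing import Iterable, List, Tuple


def candidate_pairs(spu_id: str, title: str,
                    vocabulary: Iterable[str]) -> List[Tuple[str, str]]:
    """Return <SPU_ID, TAG> pairs for every vocabulary entry found in the title."""
    lowered = title.lower()
    tags = [tag for tag in vocabulary if tag.lower() in lowered]
    # Longest match first, then alphabetical, so the output order is stable.
    tags.sort(key=lambda tag: (-len(tag), tag))
    return [(spu_id, tag) for tag in tags]


vocabulary = {"milk", "skimmed milk", "yogurt"}  # hypothetical category vocabulary
pairs = candidate_pairs("spu_001", "Mengniu skimmed milk 500ml", vocabulary)
print(pairs)
```

Both "skimmed milk" and "milk" come back as candidates here; deciding which one is the right essential category is left to the classification and post-processing steps.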

Category system: The category system consists of categories and the relations among them. Common category relations include synonymy and hypernym-hyponym relations. To complete these relations while building the category system, we mainly use the following methods:

  1. Rule-based category relation mining. In general corpora such as encyclopedias, some categories are described with fixed patterns, such as "corn is also known as maize, pearl rice, etc." or "durian is one of the famous tropical fruits", so rules can be used to extract synonyms and hypernym-hyponym relations from them.
  2. Classification-based category relation mining. Similar to the category marking method above, we construct synonym and hypernym-hyponym candidates as <TAG, TAG> samples. Using statistical features mined from commodity libraries, search logs, encyclopedia data, and UGC, together with semantic features obtained from Sentence-BERT, a binary classification model judges whether the category relation holds. For the trained classifier, we again use active learning to select hard-to-separate samples from the results, label them, and keep iterating on the data to improve model performance.
  3. Graph-based category relation inference. After obtaining preliminary synonym and hypernym-hyponym relations, we build a network from these existing relations and use methods such as GAE and VGAE for link prediction on it, thereby completing the edges of the graph.
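
To make the rule-based method concrete, here is a minimal sketch that extracts relations from sentences following two fixed patterns. The English patterns, the relation names "synonym" and "hyponym_of", and the function `extract_relations` are illustrative assumptions; a production system would use patterns written for the actual (Chinese) encyclopedia corpus.

```python
import re
from typing import List, Tuple

# Illustrative English analogues of fixed encyclopedia patterns.
ALIAS_RE = re.compile(r"(.+?) is also known as (.+)")
HYPERNYM_RE = re.compile(r"(.+?) is one of the (?:famous )?(.+)")


def extract_relations(sentence: str) -> List[Tuple[str, str, str]]:
    """Return (head, relation, tail) triples matched by the two patterns."""
    text = sentence.strip().rstrip(".")
    match = ALIAS_RE.fullmatch(text)
    if match:
        head, rest = match.groups()
        aliases = [part.strip() for part in rest.split(",")]
        # Drop the trailing "etc" left over from the enumeration.
        return [(head, "synonym", a) for a in aliases if a and a != "etc"]
    match = HYPERNYM_RE.fullmatch(text)
    if match:
        head, parent = match.groups()
        return [(head, "hyponym_of", parent)]
    return []


print(extract_relations("corn is also known as maize, pearl rice, etc."))
print(extract_relations("durian is one of the famous tropical fruits"))
```

Rules like these tend to be precise but cover few sentences, which is why the classification-based and graph-based methods are used alongside them.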

Figure 5: Construction of the category system in the commodity graph

Standard/abstract products

A standard product describes the granularity of objective facts about the product itself, independent of sales channel and merchant, and the commodity barcode is its objective basis. Standard product association means correctly linking each business SKU/SPU to the commodity barcode it belongs to, so that the corresponding objective knowledge, such as the brand, taste, and packaging of the standard product, can be modeled at the standard product level. The following case illustrates the specific task and approach of standard product association.

Case: The figure below shows one standard product, a Bull 3-meter power strip. When merchants enter product information, they can directly associate the product with its commodity barcode. Part of the standard product association is completed through this merchant-entered data, but the proportion is relatively small, and there are many missing and incorrect links. In addition, different merchants describe the same standard product in wildly different titles. Our goal is to fill in the missing links and associate each product with the correct standard product.

Figure 6: The standard product association task

For the standard product association task, we built a synonym discrimination model for the commodity domain. Following the distant supervision approach, the small amount of association data provided by merchants serves as an existing knowledge graph from which training samples are constructed. Positive examples are associations to standard product codes with relatively high confidence; negative examples are SPUs in the original data that have a similar product name or image but do not belong to the same standard product. After constructing training samples of relatively high accuracy, we train the synonym model with BERT. Finally, with the model's self-denoising, the final accuracy exceeds 99%. Overall, the model is sensitive to dimensions such as brand, specification, and packaging.
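
The sample construction described above can be sketched as follows. This is a minimal illustration, not the production pipeline: token-level Jaccard similarity stands in for the name/image similarity used to find hard negatives, and the function names, the threshold `neg_threshold`, the titles, and the barcodes are all hypothetical.

```python
from itertools import combinations
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity between two titles."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def build_pairs(spus: List[Tuple[str, str]],
                neg_threshold: float = 0.4) -> List[Tuple[str, str, int]]:
    """spus holds (title, barcode); returns (title_a, title_b, label) samples."""
    samples = []
    for (t1, b1), (t2, b2) in combinations(spus, 2):
        if b1 == b2:
            samples.append((t1, t2, 1))   # same barcode: positive pair
        elif jaccard(t1, t2) >= neg_threshold:
            samples.append((t1, t2, 0))   # similar title, other barcode: hard negative
    return samples


spus = [
    ("Bull power strip 3m", "111"),          # hypothetical barcodes
    ("Bull 3m power strip white", "111"),
    ("Bull power strip 5m", "222"),
    ("Mengniu milk 250ml", "333"),
]
for sample in build_pairs(spus):
    print(sample)
```

Restricting negatives to similar-looking titles is what forces the downstream model to attend to small differences such as "3m" versus "5m", in line with the sensitivity to specification mentioned above.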

Figure 7: The standard product association method

Abstract products are the level at which users perceive products. As the object of user reviews, this level is more effective for modeling user preferences, and when decision information is displayed, the abstract product granularity better matches user perception. For example, the ice cream ranking in the figure below lists the SKUs corresponding to the abstract products users have in mind, and then shows the characteristics of each abstract product and the reasons for recommending it. The abstract product layer is built in much the same way as the standard product layer: it adopts the standard product association model pipeline and adjusts the rules in the data construction step.

Figure 8: Abstract product aggregation in the commodity graph

Attribute dimension construction

A comprehensive understanding of a commodity needs to cover every attribute dimension. For "Lay's Cucumber Flavored Potato Chips", for example, we need to mine its brand, category, taste, packaging specification, tags, origin, user review characteristics, and other attributes in order to reach users accurately in scenarios such as product search and recommendation. The source data for attribute mining comes mainly from three places: product titles, product pictures, and semi-structured data.

Figure 9: Attribute construction in the commodity graph

The product title carries the most important information about a product. In addition, a product title parsing model can be applied to query understanding, where it quickly and deeply parses user queries and provides high-level features for downstream recall and ranking. We therefore focus here on extracting attributes from product titles.

Product title parsing as a whole can be modeled as a text sequence labeling task. For the title "Lay's Cucumber Potato Chips", for example, the goal is to understand each component of the title text: "Lay's" is the brand, "cucumber" is the flavor, and "potato chips" is the category. We therefore use a named entity recognition (NER) model to parse product titles. Product title parsing faces three major challenges, however: (1) little contextual information; (2) reliance on common sense knowledge; (3) annotated data that is usually quite noisy. To address the first two challenges, we first tried introducing graph information into the model, which mainly covers the following three dimensions:

  • Node information: Graph entities are used as a dictionary and incorporated in Soft-Lexicon fashion to alleviate NER boundary segmentation errors.
  • Relation information: Product title parsing relies on common sense knowledge. Without common sense, for example, we cannot tell from the title "Lay's Cucumber Chips" alone whether "cucumber" is a product category or a taste attribute. We therefore introduce the relation data of the knowledge graph: in the graph there is a "brand-sells-category" relation between Lay's and potato chips, but no direct relation between Lay's and cucumber. The graph structure can thus compensate for the NER model's lack of common sense knowledge. Specifically, we use graph embedding to learn representations of the graph, using its structural information to represent the characters and words in it. We then concatenate these structure-aware embeddings with the text semantic representations and feed the fused representation into the NER model, so that the model takes both semantics and common sense knowledge into account.
  • Node type information: The same word can represent different attributes; "cucumber", for example, can be either a category or an attribute. So when modeling the graph with graph embedding, we split entity nodes by type. When the node representations are fed into the NER model, an attention mechanism selects, according to the context, the representation of the entity type that best fits the semantics. This alleviates the problem of a word having different meanings under different types and allows entities of different types to be fused.
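
The type-aware selection in the last bullet can be illustrated with plain dot-product attention. The sketch below is an assumption-laden toy: the two-dimensional vectors are made up, the function `attend` is hypothetical, and the actual model may compute attention differently. It only shows how a context vector can weight the "category" and "flavor" embeddings of the same word.

```python
import math
from typing import Dict, List, Tuple


def attend(context: List[float],
           typed_embeddings: Dict[str, List[float]]
           ) -> Tuple[Dict[str, float], List[float]]:
    """Softmax over dot-product scores, then a weighted sum of the embeddings."""
    scores = {t: sum(c * e for c, e in zip(context, emb))
              for t, emb in typed_embeddings.items()}
    top = max(scores.values())                      # subtract max for stability
    exps = {t: math.exp(s - top) for t, s in scores.items()}
    total = sum(exps.values())
    weights = {t: v / total for t, v in exps.items()}
    fused = [sum(weights[t] * typed_embeddings[t][i] for t in typed_embeddings)
             for i in range(len(context))]
    return weights, fused


# Toy embeddings for the word "cucumber" split by node type (made-up values).
cucumber = {"category": [0.0, 1.0], "flavor": [1.0, 0.0]}
# Toy context vector for a title that also contains "potato chips".
context = [1.0, 0.0]

weights, fused = attend(context, cucumber)
print(weights, fused)
```

With this context, the "flavor" embedding receives the larger weight, so the fused representation passed on to the NER model leans toward the flavor reading of "cucumber".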

Figure 10: Product title parsing in the commodity graph

Next, we discuss how to alleviate annotation noise. During annotation, incomplete, missing, or wrong labels are unavoidable, especially because labeling product titles for NER is relatively complex. To handle noise in the labeled data, we no longer use the original hard training with 0-or-1 labels. Instead we use soft training based on confidence and iterate with bootstrapped cross-validation, adjusting according to the confidence of the current training set. Our experiments verify that soft training combined with multiple bootstrapping iterations significantly improves model performance on data sets with a relatively high noise ratio. For the specific method, please refer to our paper "Iterative Strategy for Named Entity Recognition with Imperfect Annotations" from the NLPCC 2020 competition.
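
One way to read the difference between hard and soft training is through the loss on a single token. The sketch below is an illustration under assumptions, not the formulation from the paper: the annotated label keeps a probability equal to its confidence, the remainder is spread uniformly over the other labels, and the loss is the cross-entropy against that soft target. The label set, the confidence values, and the function names are hypothetical.

```python
import math
from typing import Dict, List


def soft_target(label: str, confidence: float, labels: List[str]) -> Dict[str, float]:
    """Put `confidence` on the annotated label and share the rest uniformly."""
    rest = (1.0 - confidence) / (len(labels) - 1)
    return {l: (confidence if l == label else rest) for l in labels}


def soft_cross_entropy(pred: Dict[str, float], target: Dict[str, float]) -> float:
    """Cross-entropy between a predicted distribution and a soft target."""
    return -sum(q * math.log(pred[l]) for l, q in target.items() if q > 0)


labels = ["B-brand", "B-flavor", "B-category", "O"]
pred = {"B-brand": 0.1, "B-flavor": 0.6, "B-category": 0.2, "O": 0.1}

hard = soft_target("B-flavor", 1.0, labels)   # confidence 1.0 recovers hard training
soft = soft_target("B-flavor", 0.7, labels)   # a less trusted annotation

print(soft_cross_entropy(pred, hard), soft_cross_entropy(pred, soft))
```

In an iterative scheme, the confidence of each annotation would be re-estimated between rounds (for example from cross-validated predictions), so that labels the models repeatedly disagree with contribute less to the next round of training.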

Figure 11: NER optimization with noisy annotations

Efficiency improvement

Knowledge graph construction often means designing a separate mining method for the data of each domain and each dimension. This is labor-intensive and relatively inefficient: for every domain and every data dimension, we need to customize task-specific features and labeled data. In the commodity scenario there are many dimensions to mine, so improving efficiency is crucial. We first model knowledge mining as three types of classification tasks: node modeling, relation modeling, and node association. Across the whole model training process, efficiency optimization then targets the two steps just mentioned: (1) feature extraction for the task; (2) data labeling for the task.

Figure 12: Modeling knowledge mining tasks

For feature extraction, we abandoned customized feature mining for each mining task. Instead, we decouple features from tasks and build a cross-task, general-purpose feature system for graph mining. A massive feature library characterizes the target node, relation, or association, and supervised training data is used for feature combination and selection. Specifically, the graph feature system we built consists of four types of feature groups:

  1. Rule template features draw on human prior knowledge and incorporate the capabilities of rule-based models.
  2. Statistical distribution features make full use of various corpora, computing statistics over different corpora and at different levels of granularity.
  3. Syntactic analysis features use modeling capabilities from NLP to introduce features such as word segmentation, part of speech, and syntax.
  4. Embedding representation features use high-level model capabilities, introducing semantic understanding models such as BERT.

Figure 13: The knowledge mining feature system

For data labeling, we improve efficiency mainly from three angles.

  1. Semi-supervised learning makes full use of unlabeled data for pre-training.
  2. Active learning selects for labeling the samples that provide the most information gain for the model.
  3. Distant supervision constructs training samples from existing knowledge, extracting as much value from that knowledge as possible.
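
For the active learning angle, a common and simple selection rule is uncertainty sampling: send to annotators the samples whose predicted probability is closest to 0.5. The sketch below uses this rule as a stand-in for "most information gain"; whether the production system uses this exact criterion is an assumption, and the function name and scores are hypothetical.

```python
from typing import List, Tuple


def select_for_labeling(scored: List[Tuple[str, float]], k: int) -> List[Tuple[str, float]]:
    """Pick the k samples whose binary-classifier score is closest to 0.5."""
    return sorted(scored, key=lambda item: abs(item[1] - 0.5))[:k]


# (candidate word, predicted probability of being a category); made-up scores.
scored = [
    ("potato chips", 0.95),
    ("cucumber", 0.52),
    ("500ml", 0.10),
    ("Tang Monk Meat", 0.45),
]
print(select_for_labeling(scored, k=2))
```

Confident predictions such as "potato chips" and "500ml" are skipped, so the labeling budget goes to borderline cases where a human judgment changes the model the most.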

Human-machine collaboration: professional graph construction

The structure of the medical and health industry is changing. Consumers increasingly prefer online medical solutions and drug delivery services, so the medical business has gradually become one of Meituan's important businesses. Compared with general commodity knowledge graphs, knowledge in the medical domain has two characteristics: (1) it is highly specialized, and judging attribute dimensions such as the symptoms a medicine is indicated for requires relevant background knowledge; (2) it demands extremely high accuracy, and errors in strongly professional knowledge are not acceptable because they are more likely to cause serious consequences. We therefore combine intelligent models with expert knowledge to build the drug knowledge graph.

Knowledge in the drug graph can be divided into weak professional knowledge and strong professional knowledge. Weak professional knowledge can be easily acquired and understood by ordinary people, such as how to take a drug and whom it is suitable for. Strong professional knowledge can only be judged by people with a professional background, such as the diseases a drug treats and the symptoms it is indicated for. Because the two types of data depend on experts to different degrees, we use different mining pipelines:

  • Weak professional knowledge: To mine the weak professional knowledge of the drug graph, we extract the relevant information from data sources such as drug instruction leaflets and encyclopedia knowledge, combining the rules and strategies distilled from expert knowledge with general semantic models, and complete the data construction through batch inspections by experts.
  • Strong professional knowledge: To mine the strong professional knowledge of the drug graph while ensuring that the knowledge is 100% accurate, we use models to extract candidates for the drug-related attribute dimensions and then hand all of the candidates to experts for full quality inspection. Here the algorithms mainly serve to minimize the effort professional pharmacists spend on basic data and to improve the efficiency with which experts extract professional knowledge from semi-structured corpora.

In highly specialized domains such as pharmaceuticals, professional expressions often differ from the way users habitually speak. So in addition to mining strong and weak professional knowledge, we need to bridge the gap between professional knowledge and users so that the drug graph integrates well with downstream applications. To this end, we mine aliases for diseases, symptoms, and efficacy, as well as common names of drugs, from data sources such as user behavior logs and everyday in-domain conversations, connecting user habits with professional expressions.

Figure 14: Human-machine collaborative mining of professional knowledge

Applications of the commodity graph

Since Google applied the knowledge graph to its search engine and significantly improved search quality and user experience, knowledge graphs have played an important role in many vertical domains. In Meituan's commodity domain, we likewise apply the commodity graph to multiple downstream scenarios around commodity search, recommendation, the merchant side, and the user side. A few typical cases follow.

Structured recall

Commodity graph data is very helpful for understanding commodities. In product search, for example, when a user searches for "headache" or "backache", structured graph knowledge tells us which medicines relieve pain. When a user searches for "Cornetto strawberry" or "cucumber potato chips", the common sense knowledge in the graph is needed to understand that the user actually wants ice cream and potato chips, not strawberries and cucumbers.
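
A toy version of this kind of query understanding is sketched below. It assumes a small dictionary of graph entities with the types each can take; the dictionary `ENTITY_TYPES`, the function `parse_query`, and the disambiguation heuristic (an unambiguous category wins, ambiguous terms become flavors) are illustrative assumptions rather than the method used in production.

```python
from typing import Dict, List, Optional, Set

# Hypothetical slice of the graph: entity -> types it can take.
ENTITY_TYPES: Dict[str, Set[str]] = {
    "potato chips": {"category"},
    "ice cream": {"category"},
    "cucumber": {"category", "flavor"},
    "strawberry": {"category", "flavor"},
}


def parse_query(query: str,
                entity_types: Dict[str, Set[str]] = ENTITY_TYPES) -> Dict[str, object]:
    """Turn a free-text query into a structured category + flavor filter."""
    lowered = query.lower()
    found = sorted(e for e in entity_types if e in lowered)
    sure = [e for e in found if entity_types[e] == {"category"}]
    ambiguous = [e for e in found if entity_types[e] != {"category"}]
    category: Optional[str]
    flavors: List[str]
    if sure:                       # an unambiguous category is present
        category, flavors = sure[0], ambiguous
    elif ambiguous:                # fall back to reading the term as a category
        category, flavors = ambiguous[0], ambiguous[1:]
    else:
        category, flavors = None, []
    return {"category": category, "flavor": flavors}


print(parse_query("cucumber potato chips"))
print(parse_query("cucumber"))
```

The structured result can then drive recall directly, for example by retrieving products whose category is "potato chips" and whose flavor attribute is "cucumber" instead of matching the raw query text.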

Figure 15: Graph-based structured recall

Ranking model generalization

The essential category, broader category, and attribute information in the graph can serve, on the one hand, as a relatively strong means of relevance judgment and intervention. On the other hand, it provides commodity aggregation at different granularities, coarse and fine, which can be supplied to the ranking model as generalization features. This effectively improves the generalization ability of the ranking model and is especially valuable in the commodity domain, where user behavior is particularly sparse. Specific features include:

  1. Aggregate commodities through various granularities, and access the ranking model with ID features.
  2. The construction of statistical characteristics is carried out after the aggregation of each particle size.
  3. By means of graph embedding representation, the high-dimensional vector representation of commodities is combined with the ranking model.

Figure 16: Graph-based ranking optimization
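
The first two feature types in the list above can be sketched as follows. This is only an illustration under stated assumptions: the product table, the log format `(sku, impressions, clicks)`, the choice of click-through rate as the statistic, and the function names `aggregate_ctr` and `build_features` are all hypothetical. The embedding-based feature (item 3) is not covered here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical graph labels for three SKUs; "sku3" is new and has no logs yet.
PRODUCTS: Dict[str, Dict[str, str]] = {
    "sku1": {"category": "snacks", "product_type": "potato chips"},
    "sku2": {"category": "snacks", "product_type": "biscuits"},
    "sku3": {"category": "snacks", "product_type": "potato chips"},
}
# Hypothetical behavior logs: (sku, impressions, clicks).
LOGS: List[Tuple[str, int, int]] = [
    ("sku1", 100, 10),
    ("sku2", 50, 2),
]


def aggregate_ctr(products, logs, key: str) -> Dict[str, float]:
    """Click-through rate per group, where groups are defined by `key`."""
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for sku, imp, clk in logs:
        group = products[sku][key]
        impressions[group] += imp
        clicks[group] += clk
    return {g: clicks[g] / impressions[g] for g in impressions if impressions[g] > 0}


def build_features(sku: str, products, logs,
                   granularities=("category", "product_type")) -> Dict[str, object]:
    """ID and statistical features at each aggregation granularity."""
    features: Dict[str, object] = {}
    for key in granularities:
        group = products[sku][key]
        features[f"{key}_id"] = group  # ID feature for the ranking model
        features[f"{key}_ctr"] = aggregate_ctr(products, logs, key).get(group, 0.0)
    return features


if __name__ == "__main__":
    # sku3 has no behavior of its own but inherits group-level statistics.
    print(build_features("sku3", PRODUCTS, LOGS))
```

The point of the sketch is that a product with no behavior data still receives informative features from the groups it belongs to, which is where the generalization benefit for sparse data comes from.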

Multi-modal graph embedding

Existing research in many fields has shown that embedding knowledge graph data and combining it with a ranking model as high-dimensional vector representations introduces external knowledge and can effectively alleviate data sparsity and cold-start problems in ranking/recommendation scenarios. However, traditional graph embedding work often ignores the multi-modal information in the knowledge graph. For example, in the commodity domain we have knowledge that is not a simple graph node, such as product images, product titles, and merchant introductions. Introducing this information can further improve the information gain that graph embedding brings to recommendation/ranking.

Figure 17: Recommendation based on the multi-modal graph (background)

Existing graph embedding methods run into problems when applied to multi-modal graphs, because in multi-modal scenarios the edges in the graph no longer represent purely semantic reasoning relations but also relations in which one modality supplements another. Based on these characteristics of the multi-modal graph, we therefore proposed the MKG Entity Encoder and the MKG Attention Layer to better model the multi-modal knowledge graph and to feed its representations effectively into recommendation/ranking models. For the specific methods, please refer to our paper "Multi-Modal Knowledge Graphs for Recommender Systems", published at CIKM 2020.

Figure 18: Graph-based ranking optimization (model)
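
To give a rough feel for the idea of encoding each modality into a shared space and then attending over an entity's neighbors, here is a minimal sketch in plain Python. It is not the MKG Entity Encoder or MKG Attention Layer from the paper: it uses a fixed linear projection per modality and generic dot-product attention with a residual sum, and all names (`encode_entity`, `attention_aggregate`, the projection matrices) are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[Vector]


def linear(weights: Matrix, x: Vector) -> Vector:
    """Multiply a weight matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def encode_entity(modality: str, raw: Vector, projections: Dict[str, Matrix]) -> Vector:
    """Stand-in encoder: project a modality-specific feature into a shared space."""
    return linear(projections[modality], raw)


def attention_aggregate(head: Vector, neighbors: List[Vector]) -> Tuple[Vector, Vector]:
    """Weight neighbors by dot-product similarity to the head entity,
    then add the weighted neighbor summary to the head embedding."""
    scores = [sum(h * n for h, n in zip(head, nb)) for nb in neighbors]
    weights = softmax(scores)
    summary = [
        sum(w * nb[i] for w, nb in zip(weights, neighbors))
        for i in range(len(head))
    ]
    return [h + s for h, s in zip(head, summary)], weights


if __name__ == "__main__":
    # Hypothetical projections: 3-d image features and 2-d text features,
    # both mapped into a shared 2-d space.
    projections = {
        "image": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
        "text": [[0.5, 0.0], [0.0, 0.5]],
    }
    product = [1.0, 0.0]
    image_node = encode_entity("image", [0.9, 0.1, 0.3], projections)
    title_node = encode_entity("text", [0.4, 1.2], projections)
    vector, weights = attention_aggregate(product, [image_node, title_node])
    print(vector, weights)
```

The resulting vector could then be passed to a downstream ranking or recommendation model as an additional feature; how the real system does this is described in the paper rather than here.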

User-side and merchant-side optimization

On the user side, the commodity graph provides explicit and interpretable information to help users make decisions. Specific presentation forms include filter items, feature tags, ranking lists, and recommendation reasons. The dimensions of the filter items are determined by the attribute types that users pay attention to under the category corresponding to the current query. For example, when a user searches for potato chips, they usually care about flavor, packaging, net content, and so on, so we display filter items based on the enumerated values of the supply data in these dimensions. A product's feature tags are extracted from its title, product detail page, and review data, presenting the product's characteristics as concise, clearly structured data. Recommendation reasons are obtained through two channels, review extraction and text generation, and are tied to the query to explain, from the user's perspective, why the product is worth buying. Ranking list data is more objective, reflecting product quality through real data such as sales volume.
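
The filter-item logic described above can be sketched as follows. The attention dimensions per product type, the supply records, and the function `build_filter_items` are all hypothetical, and the query is assumed to have already been resolved to a product type.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical: attribute dimensions users care about for each product type.
ATTENTION_DIMENSIONS: Dict[str, List[str]] = {
    "potato chips": ["flavor", "packaging", "net content"],
}
# Hypothetical supply data with structured attributes from the graph.
SUPPLY: List[Dict[str, str]] = [
    {"product_type": "potato chips", "flavor": "cucumber", "packaging": "bag", "net content": "70g"},
    {"product_type": "potato chips", "flavor": "cucumber", "packaging": "can", "net content": "104g"},
    {"product_type": "potato chips", "flavor": "original", "packaging": "bag"},
    {"product_type": "biscuits", "flavor": "butter", "packaging": "box"},
]


def build_filter_items(product_type: str,
                       supply: List[Dict[str, str]],
                       dimensions: Dict[str, List[str]],
                       max_values: int = 5) -> Dict[str, List[str]]:
    """For each attended dimension, list the most common values in supply."""
    filters: Dict[str, List[str]] = {}
    for dim in dimensions.get(product_type, []):
        counts = Counter(
            item[dim]
            for item in supply
            if item.get("product_type") == product_type and dim in item
        )
        if counts:
            filters[dim] = [value for value, _ in counts.most_common(max_values)]
    return filters


if __name__ == "__main__":
    print(build_filter_items("potato chips", SUPPLY, ATTENTION_DIMENSIONS))
```

Ordering values by how often they occur in supply is one simple design choice; a production system might instead order by user interest or sales.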

On the merchant side, that is, the side where merchants publish products, the commodity graph provides real-time prediction based on product titles, helping merchants assign categories and complete attribute information. For example, after a merchant fills in the title "12 boxes of German imported Deya skimmed milk", the online category prediction service provided by the commodity graph can assign it to the category "Food & Beverage-Dairy Products-Pure Milk", and the entity recognition service can extract the product's attribute information: "Origin-Germany", "Import-Import", "Brand-Deya", "Fat Content-Skim", and "Specifications-12 Boxes". Once prediction is complete, the merchant confirms the result and publishes the product. This reduces the merchant's cost of maintaining product information and improves the information quality of published products.
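
The input and output of such a title-based prediction can be illustrated with a toy stand-in. The real services are online models; the lookup tables and regular expression below (`CATEGORY_KEYWORDS`, `ATTRIBUTE_LEXICON`, `SPEC_PATTERN`) and the function `predict_from_title` are hypothetical simplifications that only reproduce the example from the text.

```python
import re
from typing import Dict, Optional

# Hypothetical keyword rule standing in for the category prediction model.
CATEGORY_KEYWORDS: Dict[str, str] = {
    "milk": "Food & Beverage-Dairy Products-Pure Milk",
}
# Hypothetical lexicon standing in for the entity recognition model.
ATTRIBUTE_LEXICON: Dict[str, Dict[str, str]] = {
    "Origin": {"german": "Germany"},
    "Import": {"imported": "Import"},
    "Brand": {"deya": "Deya"},
    "Fat Content": {"skimmed": "Skim"},
}
SPEC_PATTERN = re.compile(r"(\d+)\s+(boxes|bottles|bags)")


def predict_from_title(title: str) -> Dict[str, object]:
    """Predict a category and attributes from a product title (toy version)."""
    lowered = title.lower()
    tokens = re.findall(r"[a-z]+|\d+", lowered)
    category: Optional[str] = next(
        (cat for keyword, cat in CATEGORY_KEYWORDS.items() if keyword in tokens),
        None,
    )
    attributes: Dict[str, str] = {}
    for attribute, lexicon in ATTRIBUTE_LEXICON.items():
        for surface, value in lexicon.items():
            if surface in tokens:
                attributes[attribute] = value
    match = SPEC_PATTERN.search(lowered)
    if match:
        attributes["Specifications"] = f"{match.group(1)} {match.group(2).capitalize()}"
    return {"category": category, "attributes": attributes}


if __name__ == "__main__":
    # The merchant would review this prediction before publishing.
    print(predict_from_title("12 boxes of German imported Deya skimmed milk"))
```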

About the Authors

Xuezhi, Fengjiao, Ziwen, Kuang Jun, Lin Sen, Wuwei, and others, all from the NLP Center of Meituan's Platform Search and NLP Department.

Job Openings

The Meituan Brain knowledge graph team is continuously hiring for a large number of positions, including internships, campus recruitment, and experienced hires, based in Beijing and Shanghai. Interested candidates are welcome to join us and use natural language and knowledge graph technology to help everyone eat better and live better. Resumes can be sent to: caoxuezhi@meituan.com.

This article is produced by the Meituan technical team, and the copyright belongs to Meituan. You are welcome to reprint or use the content of this article for non-commercial purposes such as sharing and communication; please indicate "the content is reproduced from the Meituan technical team". This article may not be reproduced or used for commercial purposes without permission. For any commercial activity, please send an email to tech@meituan.com to apply for authorization.

