Intelligent search is critical for e-commerce platforms. This article analyzes the search characteristics and business needs specific to the e-commerce industry and introduces the intelligent search capabilities of the [e-commerce industry template] provided by OpenSearch, in the hope of giving enterprises more ideas and solutions for improving business conversion.

Alibaba Cloud OpenSearch, the intelligent search solution for the e-commerce industry:

https://www.aliyun.com/page-source//data-intelligence/activity/opensearch

1. The business logic of search

"Search Query→Recall→Sort→Search Results"

When a user enters a Query in the search box, the system first recalls related documents or products based on its semantic understanding of the Query, then orders them with a ranking algorithm according to the user's actual search intent. The goal is to satisfy the search need and turn it into business conversion.

Of these stages, [Recall] and [Sort] matter most for the business goals that search drives.
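The two stages can be illustrated with a minimal sketch. It is not how OpenSearch is implemented: the catalog, the whitespace tokenizer, and sorting by sales are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical catalog: product id -> (title tokens, sales count).
PRODUCTS = {
    1: (["spring", "slim", "dress"], 120),
    2: (["summer", "dress"], 300),
    3: (["hot", "pot", "base"], 80),
}


def build_inverted_index(products):
    """Map each token to the set of product ids whose title contains it."""
    index = defaultdict(set)
    for doc_id, (tokens, _sales) in products.items():
        for token in tokens:
            index[token].add(doc_id)
    return index


def recall(index, query_tokens):
    """Recall stage: ids of products that contain every query token."""
    postings = [index.get(token, set()) for token in query_tokens]
    return set.intersection(*postings) if postings else set()


def sort_results(products, doc_ids):
    """Sort stage: order recalled products, here simply by sales, descending."""
    return sorted(doc_ids, key=lambda doc_id: products[doc_id][1], reverse=True)


def search(products, index, query):
    """Query -> Recall -> Sort -> Results, with a naive whitespace tokenizer."""
    return sort_results(products, recall(index, query.lower().split()))


INDEX = build_inverted_index(PRODUCTS)
print(search(PRODUCTS, INDEX, "dress"))       # [2, 1]
print(search(PRODUCTS, INDEX, "slim dress"))  # [1]
```

Everything discussed later (segmentation, entity recognition, category prediction) changes either which tokens reach the recall stage or which signals feed the sort stage.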

2. Application of Natural Language Processing Technology (NLP) in Search

1. Concept introduction

To improve search quality you need some understanding of natural language processing, because the Query a user types is, from an academic perspective, natural language. Natural language processing studies how humans and computers can communicate effectively through language. It is a discipline that integrates linguistics, psychology, computer science, mathematics, and statistics.

Scholars have called natural language processing "the jewel in the crown of artificial intelligence". Its research spans perceptual intelligence, cognitive intelligence, and creative intelligence, and it is a necessary technology for achieving complete artificial intelligence.

2. Alibaba DAMO Academy NLP search analysis pipeline

3. Characteristics of e-commerce search

1. Keyword stuffing

For example: "Yang Mi same style summer dress free shipping".

2. Word order has little effect on semantics

For example: "Yang Mi same style women's summer dress free shipping" and "women's summer dress free shipping Yang Mi same style" express the same intent.

3. Category prediction problem

For example: when a user searches for "Apple", the intent may be the fruit or the mobile phone brand.

4. Poor relevance of recalled documents

This is caused by inaccurate recognition of core words and inaccurate word segmentation.

5. Search drives a large share of business conversion

According to statistics, search-led conversions account for more than 40% of conversions on comprehensive e-commerce platforms and more than 60% on vertical e-commerce platforms.

6. Higher stability requirements and support for elastic scaling

During campaigns and major promotions, QPS may be a hundred or even a thousand times higher than usual, so the system must scale out and back in smoothly to remain stable.

4. Core functions of e-commerce search optimization

1. Word segmentation (focus!)

1.1 Better word segmentation directly affects the number of recalled results, lowers the no-result rate, and improves recall quality

For example:

"Hot pot nine yuan nine free shipping"

  • Poor segmentation: "hot pot, pot, nine, yuan, nine, package, post"; "hot pot, nine, yuan, nine, free shipping"
  • OpenSearch segmentation: "hot pot, nine yuan nine, free shipping"

"925 silver earrings"

  • Poor segmentation: "925, white fungus, silver, earrings"
  • OpenSearch segmentation: "925, silver, earrings"
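The "925 silver earrings" case can be reproduced with a small sketch. It assumes the original Chinese query was 925银耳环, in which the characters for "silver" (银) and "ear" (耳) also happen to form the word for white fungus (银耳). The dictionary and its counts below are invented for illustration; they are not OpenSearch data, and OpenSearch's actual segmenter is not described in this article. The sketch contrasts greedy forward maximum matching with a frequency-weighted best-path search.

```python
import math

# Hypothetical term counts, standing in for statistics from an e-commerce corpus.
FREQ = {"925": 50, "银": 400, "耳环": 500, "银耳": 30, "环": 20, "耳": 10}


def forward_max_match(text, vocab, max_len=4):
    """Greedy segmentation: always take the longest dictionary word at each position."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:  # single characters are always accepted
                tokens.append(piece)
                i += size
                break
    return tokens


def best_path_segment(text, freq, max_len=4):
    """Pick the segmentation with the highest summed log-probability (dynamic programming)."""
    total = sum(freq.values())
    n = len(text)
    best = [(-math.inf, 0)] * (n + 1)  # best[end] = (score, start of the last token)
    best[0] = (0.0, 0)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            # Unknown single characters get a count of 1; unknown longer strings are skipped.
            count = freq.get(piece, 1 if end - start == 1 else 0)
            if count == 0:
                continue
            score = best[start][0] + math.log(count / total)
            if score > best[end][0]:
                best[end] = (score, start)
    tokens, end = [], n
    while end > 0:
        start = best[end][1]
        tokens.append(text[start:end])
        end = start
    return tokens[::-1]


print(forward_max_match("925银耳环", FREQ))  # ['925', '银耳', '环']
print(best_path_segment("925银耳环", FREQ))  # ['925', '银', '耳环']
```

With these counts the greedy matcher produces the "white fungus" token, so the product would be recalled for the wrong term, while the weighted search recovers "silver" and "earrings". The quality of the result depends entirely on the counts, which is why industry-specific corpus data matters.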

1.2 Different segmentation methods change which keywords take part in recall, and therefore the accuracy of recall

At present, many self-built systems based on open source software struggle to achieve good word segmentation. The main reason is that their training corpus is limited and cannot grow into industry data that is continuously refined. In e-commerce in particular, the product range is very broad, Chinese characters and words carry many meanings, and polyphonic characters and synonyms are common. An in-house team of algorithm engineers and developers can rarely optimize this quickly; it is a long process of continuous accumulation and training.

2. Named entity recognition

2.1 What entity recognition means in e-commerce search

Entity words in e-commerce queries and titles are tagged and recognized. Entity types include brand, category, category modifier, model, and style.

2.2 Advantages of OpenSearch entity recognition

  • Built on Taobao's full data and knowledge base, entity recognition is deeply optimized for the e-commerce industry and handles fast-changing brands, high ambiguity, category-modifier relationships, and brand-category matching relationships.

2.3 What OpenSearch entity recognition is used for

2.3.1 Query rewriting:

OpenSearch query analysis can rewrite a query into two queries. The first is more precise; the second uses fewer terms for recall. When the precise query does not recall enough results, the second query is used to expand recall. Rewriting is driven mainly by entity importance: entity words of high importance are kept for recall, while low-importance words do not affect recall and are used only for ranking.

Implementation:

Entity importance is currently divided into three levels: high, medium, and low. "Brand, category" are in the high level and matter most. "Style, color, season, target group, location..." are in the medium level. "Size, modifiers, influence service, series, unit..." are in the low level and can be dropped from recall.
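A minimal sketch of this two-query rewrite follows. The importance table mirrors the three levels above, but the exact split is an assumption: here the precise query keeps high and medium terms and the relaxed query keeps only high terms. The function names, the `min_results` threshold, and the `recall_fn` callback are hypothetical and are not part of the OpenSearch API.

```python
# Importance level per entity type, following the three levels described above.
IMPORTANCE = {
    "brand": "high", "category": "high",
    "style": "medium", "color": "medium", "season": "medium",
    "size": "low", "modifier": "low", "series": "low", "unit": "low",
}


def rewrite_query(tagged_terms):
    """Split (term, entity_type) pairs into precise, relaxed, and ranking-only term lists."""
    precise, relaxed, ranking_only = [], [], []
    for term, entity_type in tagged_terms:
        level = IMPORTANCE.get(entity_type, "low")
        if level == "high":
            precise.append(term)
            relaxed.append(term)
        elif level == "medium":
            precise.append(term)  # assumption: medium terms only constrain the precise query
        else:
            ranking_only.append(term)  # never constrains recall, only ranking
    return precise, relaxed, ranking_only


def recall_with_fallback(recall_fn, tagged_terms, min_results=10):
    """Run the precise query first; fall back to the relaxed one if too few results come back."""
    precise, relaxed, _ = rewrite_query(tagged_terms)
    results = recall_fn(precise)
    if len(results) < min_results and relaxed != precise:
        results = recall_fn(relaxed)
    return results


tagged = [("nike", "brand"), ("red", "color"), ("sneakers", "category"), ("42", "unit")]
print(rewrite_query(tagged))
# (['nike', 'red', 'sneakers'], ['nike', 'sneakers'], ['42'])
```

`recall_fn` stands for whatever recall call the search engine exposes; keeping it as a parameter makes the fallback logic testable without a running engine.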

2.3.2 Use with category prediction

Different entities in a query influence the category to different degrees. When the original query yields no category prediction, words that are irrelevant or only weakly relevant to the category intent are removed according to certain rules, and category prediction is run again. This helps a great deal with category prediction for long-tail queries.

Example:

For "Yang Mi (person name) same style (suffix) spring (season) slim (style element) dress", the rewritten queries are tried in the following priority order:

Spring slim dress

Spring dress

dress

The system looks up category prediction results in the order above.
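The fallback order in this example can be sketched as follows. The removal steps, the entity type names, and the lookup table are assumptions chosen to reproduce the example; the real rules are not described in this article.

```python
# Assumed removal steps: entity types least tied to category intent are dropped first.
REMOVAL_STEPS = [{"person", "suffix"}, {"style_element"}, {"season"}]


def candidate_queries(tagged_terms, removal_steps=REMOVAL_STEPS):
    """Return the original query followed by progressively shorter rewrites."""
    remaining = list(tagged_terms)
    candidates = [" ".join(term for term, _ in remaining)]
    for entity_types in removal_steps:
        shorter = [(term, etype) for term, etype in remaining if etype not in entity_types]
        if shorter and len(shorter) < len(remaining):
            remaining = shorter
            candidates.append(" ".join(term for term, _ in remaining))
    return candidates


def predict_with_fallback(tagged_terms, lookup):
    """Return (query, category) for the first candidate that has a prediction."""
    for query in candidate_queries(tagged_terms):
        category = lookup.get(query)
        if category is not None:
            return query, category
    return None, None


QUERY = [
    ("yang mi", "person"), ("same style", "suffix"), ("spring", "season"),
    ("slim", "style_element"), ("dress", "category"),
]
# Hypothetical table of queries that already have a category prediction.
KNOWN = {"spring dress": "dresses", "dress": "dresses"}

print(candidate_queries(QUERY))
# ['yang mi same style spring slim dress', 'spring slim dress', 'spring dress', 'dress']
print(predict_with_fallback(QUERY, KNOWN))  # ('spring dress', 'dresses')
```

Stopping at the first hit keeps as much of the original intent as possible while still giving long-tail queries a category.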

3. Category prediction

3.1 Examples:

  • A user who searches for "Apple" may want the fruit or an Apple phone;
  • A user searches for "Huawei" and the recalled results are sorted by sales. The best-selling "Huawei Watch" and "Huawei Accessories" will probably rank first, while "Huawei mobile phones", the actual search intent, end up further down.

3.2 OpenSearch category prediction capability

Category prediction is an OpenSearch algorithm feature that uses the category information of items or content to improve search results. It predicts from the user's query words which category the user wants results from. Combined with the ranking expression, results that better match the search intent can then be ranked higher.

Basic principle: collect historical search queries, combine them with the click behavior that followed each query, and link them to the information of the items under each category. These data are used to train a model that captures the relationship between queries and categories.

Different users have different search intents: some users' behavior shows they are looking for "accessories", others for "mobile phones". Based on user behavior data, the category can be judged per user, so that the ranking is personalized.
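The basic principle can be sketched with a simple count-based estimate of P(category | query) from click logs. This is a deliberately simplified stand-in, not the model OpenSearch trains; the click log and item-to-category mapping are invented for illustration.

```python
from collections import Counter, defaultdict


def train_category_model(click_log, item_category):
    """Count, for each query, how often clicks landed on items of each category."""
    counts = defaultdict(Counter)
    for query, item_id in click_log:
        counts[query][item_category[item_id]] += 1
    return counts


def predict_categories(model, query):
    """Return [(category, probability)] for a query, most likely category first."""
    counter = model.get(query)
    if not counter:
        return []
    total = sum(counter.values())
    return [(category, n / total) for category, n in counter.most_common()]


# Hypothetical data: which category each item belongs to, and (query, clicked item) pairs.
ITEM_CATEGORY = {"p1": "mobile phones", "p2": "mobile phones", "w1": "watches"}
CLICK_LOG = [("huawei", "p1"), ("huawei", "p2"), ("huawei", "p1"), ("huawei", "w1")]

model = train_category_model(CLICK_LOG, ITEM_CATEGORY)
print(predict_categories(model, "huawei"))
# [('mobile phones', 0.75), ('watches', 0.25)]
```

Keeping one such table per user segment, instead of one global table, is one simple way to approximate the personalization described above.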

4. Sorting Algorithm

4.1 Common problems in e-commerce sorting

  • Unsatisfactory ranking of query results: leads to a low click-through rate and a high bounce rate, which directly hurts business conversion;
  • Data lacks timeliness: it is hard to balance established high-quality products against newly released ones;
  • Merchants gaming the ranking: some merchants find loopholes in the sorting and reach top positions through keyword stuffing, which degrades the user experience;
  • Staffing is tight: 2-3 professional algorithm engineers are required, and suitable talent is hard to find.

4.2 OpenSearch e-commerce sorting capability

Based on the application structure template and the index structure template, OpenSearch provides common basic sorting and business sorting expressions for e-commerce, which meet the sorting requirements of most e-commerce scenarios without additional configuration. Users can also customize sorting through Cava scripts.
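To show how a ranking expression can combine text relevance with category prediction, here is a minimal sketch in Python. It is neither Cava nor OpenSearch's expression syntax; the weights, field names, and scores are assumptions chosen to reproduce the "Huawei" example from section 3.1.

```python
import math


def rank_score(doc, category_scores, w_text=1.0, w_category=2.0, w_sales=0.2):
    """Weighted sum of text relevance, predicted-category match, and damped sales."""
    return (
        w_text * doc["text_score"]
        + w_category * category_scores.get(doc["category"], 0.0)
        + w_sales * math.log1p(doc["sales"])  # log damping keeps sales from dominating
    )


docs = [
    {"title": "Huawei Watch", "category": "watches", "text_score": 1.0, "sales": 9000},
    {"title": "Huawei phone", "category": "mobile phones", "text_score": 1.0, "sales": 3000},
]
# Hypothetical category prediction for the query "huawei".
category_scores = {"mobile phones": 0.75, "watches": 0.25}

by_sales = sorted(docs, key=lambda d: d["sales"], reverse=True)
ranked = sorted(docs, key=lambda d: rank_score(d, category_scores), reverse=True)

print([d["title"] for d in by_sales])  # ['Huawei Watch', 'Huawei phone']
print([d["title"] for d in ranked])    # ['Huawei phone', 'Huawei Watch']
```

Sorting by sales alone puts the watch first; adding the category term moves the phone, the more likely intent, to the top.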

5. Manual intervention for bad cases

5.1 Common bad cases

  • When "iPhone11" was first launched and users searched for "Apple/iphone", the newest product had to rank first. Where the regular ranking algorithm cannot achieve this, manual intervention in category prediction is required;
  • "Spray bubble" is a nickname for a basketball shoe, not its mainstream name. The full name is "Air Jordan AirFoamposite series". In such cases, the specialized vocabulary accumulated in day-to-day operations needs to be synced to OpenSearch through its visual interface as a patch for query semantic understanding, so the case is resolved through flexible intervention;
  • In cross-border e-commerce, a Query sometimes involves foreign languages such as Japanese, Korean, or Thai. When the built-in segmentation dictionary cannot segment them well, the word segmentation intervention function can also solve this;
  • A user searches for the Query "Chanel Cushion". By default, entity recognition classifies "Chanel" as a "common word" and "Cushion" as "material", so manual intervention in entity recognition is required to mark "Chanel" as a brand.

5.2 OpenSearch manual intervention features

  • A built-in intervention dictionary is provided, and custom intervention dictionaries can be added on top of it;
  • Intervention dictionaries are supported for query analysis (stop words, spelling correction, synonyms, entity recognition, term weight, category prediction).
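The "Chanel Cushion" case shows the idea behind an intervention dictionary: a manually maintained entry overrides the default result. The sketch below is only an illustration; the two dictionaries and the function name are hypothetical, and the default table merely stands in for the built-in recognition model.

```python
# Stand-in for the default entity recognition result (hypothetical).
DEFAULT_ENTITIES = {"chanel": "common_word", "cushion": "material"}

# Custom intervention dictionary maintained by operations staff (hypothetical).
INTERVENTIONS = {"chanel": "brand"}


def recognize_entities(tokens, default=DEFAULT_ENTITIES, interventions=INTERVENTIONS):
    """Tag each token, letting intervention entries override the default label."""
    tagged = []
    for token in tokens:
        key = token.lower()
        label = interventions.get(key, default.get(key, "common_word"))
        tagged.append((token, label))
    return tagged


print(recognize_entities(["Chanel", "Cushion"], interventions={}))
# [('Chanel', 'common_word'), ('Cushion', 'material')]
print(recognize_entities(["Chanel", "Cushion"]))
# [('Chanel', 'brand'), ('Cushion', 'material')]
```

Because the override is a lookup rather than a retrained model, a bad case can be patched as soon as the entry is added.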

6. Search guide function

6.1 Business value of the search guide function

6.1.1 Hot search and shading words

  • Popular queries are a weathervane of users' interests. Analyzing them reveals interest trends and provides a basis for operational decisions;
  • Recommending high-quality queries to users helps reach business goals;
  • Recommending popular queries to users improves the user experience and gives the most popular queries more exposure;
  • By analyzing user behavior and recommending queries that match the user's interests, the system anticipates what the user wants and raises the chance of conversion.

6.1.2 Drop-down prompt

  • Improve input efficiency: help users find what they want sooner, reduce the number of queries each user issues, and reduce load on the server;
  • Recommend better queries.
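A drop-down prompt can be sketched as prefix lookup over historical queries, ordered by popularity. This is a minimal illustration under assumed data, not the OpenSearch drop-down model; the `Suggester` class and the query counts are hypothetical.

```python
import bisect


class Suggester:
    """Prefix suggestions over a sorted list of known queries."""

    def __init__(self, query_counts):
        self._counts = dict(query_counts)
        self._queries = sorted(self._counts)  # sorted order makes prefix matches contiguous

    def suggest(self, prefix, limit=5):
        """Return up to `limit` queries starting with `prefix`, most popular first."""
        start = bisect.bisect_left(self._queries, prefix)
        matches = []
        for query in self._queries[start:]:
            if not query.startswith(prefix):
                break
            matches.append(query)
        matches.sort(key=lambda q: (-self._counts[q], q))
        return matches[:limit]


# Hypothetical search counts per historical query.
suggester = Suggester({
    "huawei phone": 900,
    "huawei watch": 400,
    "huawei case": 150,
    "hot pot": 700,
})
print(suggester.suggest("hua", limit=2))  # ['huawei phone', 'huawei watch']
```

Ordering by popularity is the simplest choice; a production system would typically also weigh conversion and the individual user's interests, as described in 6.1.1.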

6.2 Advantages of OpenSearch search guidance

Hot search, shading words, drop-down prompts, and several other search guidance algorithm models are built in. No development is needed: the system retrains the models automatically every day. These features play an important role in guiding users' search intent, greatly reduce the later tuning effort for query intent understanding, relevance, ranking, and operational intervention, and help improve overall business goals.

5. The OpenSearch e-commerce industry template

1. Search architecture

OpenSearch is the first to offer a search template for the e-commerce industry, helping companies quickly build higher-quality search services and drive exponential business growth.

2. One-click configuration

E-commerce search capabilities are built in and configuration is simple, so there is no barrier for newcomers.

3. Advantages of e-commerce industry templates

  • Industry best practices that reduce trial-and-error costs

Best practices for building e-commerce search are built into the product. Users do not need to explore in every direction; integrating according to the template is enough to get a better service;

  • Built-in higher-quality algorithm models that save training costs

Users optimizing search from scratch can skip large amounts of data annotation and model training. Alibaba Group's search algorithm capabilities are built in, saving dozens of person-months of algorithm work;

  • Support for personalized search and service capabilities

Multi-channel recall on the engine side powers important services such as search results, drop-down prompts, and shading words, improving search conversion;

  • Open architecture: developers' custom models can be fed back in real time

NLP models trained by users can be imported into OpenSearch to flexibly meet the needs of business developers;

  • Recall engine performance that leads across the board

Alibaba's self-developed Ha3 engine handles massive data, high concurrency, and massive user requests, and its performance is several times better than open source solutions;

  • Efficient industry iteration capability

As the e-commerce industry changes, existing capabilities are iteratively updated to provide more timely service guarantees.

4. Core metric improvements in the e-commerce industry enhanced edition

4.1 Open source search vs. the e-commerce industry enhanced edition

4.2 Capability comparison: general edition vs. the e-commerce industry enhanced edition

4.3 Offline data processing

A single cluster supports real-time data synchronization at millions of TPS.


Get expert guidance:

https://survey.aliyun.com/apps/zhiliao/uzhnOt_g9

E-commerce industry template configuration process:

https://help.aliyun.com/document_detail/208651.html

Copyright Statement: The content of this article was contributed voluntarily by a real-name registered Alibaba Cloud user, and the copyright belongs to the original author. The Alibaba Cloud Developer Community does not own the copyright and does not assume the corresponding legal responsibility. For specific rules, please refer to the "Alibaba Cloud Developer Community User Service Agreement" and the "Alibaba Cloud Developer Community Intellectual Property Protection Guidelines". If you find suspected plagiarism in this community, fill in the infringement complaint form to report it. Once verified, the community will immediately delete the suspected infringing content.

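Author: Alibaba Cloud Developer, the official Alibaba technology account, which shares technical innovation, hands-on experience, and engineers' growth insights from across the Alibaba ecosystem.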