Introduction: A search service requires a large amount of data computation and processing in the backend to recall the results that meet the user's needs. This talk draws on the common problems and difficulties of query analysis in self-built search services, introduces the query analysis capabilities and solutions of Alibaba Cloud OpenSearch, and explains in depth how Alibaba's query analysis service architecture and its Elasticsearch-compatible architecture are implemented.
Special guest:
Xiang Zhaogui (Xiang Gong)--Alibaba Senior Technical Expert
Video address: https://yqh.aliyun.com/live/opensearch
Introduction to Query Analysis
The role of query analysis in search
In engineering terms, the processing of a search request can be divided into two stages: recall and sorting. In the recall stage, the engine needs to find as many of the documents the user wants as possible. In the sorting stage, the documents that best meet the requirements need to be ranked first and returned to the user.
Query analysis processes and analyzes the query quickly before these stages run. For example, in a real production environment users often make input mistakes, so the query needs error correction. The query also needs to be segmented and the importance of each word identified, which is used in both recall and sorting. Because the same meaning can be expressed in several ways, synonyms must be expanded. The user's query is then rewritten to help the engine perform the recall more efficiently. The query processing stage also outputs information that is used when sorting, such as category relevance for documents and a vectorized form of the text for calculating semantic relevance.
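The two stages can be illustrated with a minimal sketch. The toy corpus, the function names, and the term-overlap score are all assumptions made for illustration; they are not taken from OpenSearch.

```python
from collections import defaultdict

# Toy corpus: document id -> text. Purely illustrative data.
DOCS = {
    1: "red running shoes for men",
    2: "blue running jacket",
    3: "red leather shoes",
}


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


def recall(index, query_terms):
    """Recall stage: collect every document matching at least one term."""
    candidates = set()
    for term in query_terms:
        candidates |= index.get(term, set())
    return candidates


def sort_candidates(docs, candidates, query_terms):
    """Sorting stage: rank candidates by how many query terms they contain."""
    def score(doc_id):
        doc_terms = set(docs[doc_id].split())
        return sum(1 for term in query_terms if term in doc_terms)

    # Higher score first; ties are broken by document id for a stable order.
    return sorted(candidates, key=lambda doc_id: (-score(doc_id), doc_id))


def search(docs, query):
    """Run recall, then sorting, and return ranked document ids."""
    terms = query.split()
    index = build_inverted_index(docs)
    return sort_candidates(docs, recall(index, terms), terms)
```

Calling `search(DOCS, "red running shoes")` recalls all three documents and then ranks the one containing every query term first. A production engine would use a far richer relevance score, but the split between finding candidates and ordering them is the same.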
Query analysis pipeline
In general, the role of query analysis is to analyze and rewrite the query entered by the user in order to improve the accuracy of the system's recall and the relevance of its sorting. The following is a simple example to introduce the query analysis function of OpenSearch.
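As a rough illustration of such a pipeline, the sketch below chains segmentation, spelling correction, stop-word removal, term weighting, synonym expansion, and query rewriting. It is not the OpenSearch implementation: the dictionaries, the `AnalyzedQuery` type, the default weight, and the boolean output syntax are all hypothetical, and whitespace splitting stands in for a real word segmenter.

```python
from dataclasses import dataclass, field

# Hypothetical dictionaries; a real service would load models and data instead.
SPELLING = {"iphnoe": "iphone"}
STOP_WORDS = {"the", "a", "for"}
SYNONYMS = {"case": ["cover"]}
TERM_WEIGHTS = {"iphone": 1.0, "case": 0.6}
DEFAULT_WEIGHT = 0.3  # assumed weight for terms without an entry


@dataclass
class AnalyzedQuery:
    raw: str
    terms: list = field(default_factory=list)
    weights: dict = field(default_factory=dict)
    synonyms: dict = field(default_factory=dict)


def analyze(raw_query):
    """Run the analysis steps in order and collect their outputs."""
    # 1. Segmentation (whitespace split stands in for a real segmenter).
    tokens = raw_query.lower().split()
    # 2. Spelling correction, applied token by token.
    tokens = [SPELLING.get(token, token) for token in tokens]
    # 3. Stop-word removal.
    terms = [token for token in tokens if token not in STOP_WORDS]
    # 4. Term weighting: how important each remaining word is.
    weights = {term: TERM_WEIGHTS.get(term, DEFAULT_WEIGHT) for term in terms}
    # 5. Synonym expansion.
    synonyms = {term: SYNONYMS[term] for term in terms if term in SYNONYMS}
    return AnalyzedQuery(raw_query, terms, weights, synonyms)


def rewrite(analyzed):
    """Render each term OR its synonyms, with the groups joined by AND."""
    groups = []
    for term in analyzed.terms:
        alternatives = [term] + analyzed.synonyms.get(term, [])
        groups.append("(" + " OR ".join(alternatives) + ")")
    return " AND ".join(groups)
```

For the input `"Iphnoe case for the kids"`, the misspelling is corrected, the stop words are dropped, and `case` is expanded with `cover` before the query is rewritten for the engine. The weights are kept on the result so that the sorting stage can use them.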
Problems faced by self-built search services
- Requires continuous accumulation of industry domain knowledge;
- A lack of large volumes of industry sample data makes it difficult to train models in-house;
- Algorithm tuning, engineering development, and daily operation and maintenance require continuous human investment;
OpenSearch query analysis features
- Provides complete query analysis solutions for specific industries
OpenSearch provides algorithm functions for specific domains and optimizes particular functions for them. For example, for the e-commerce industry it provides entity recognition. In the education industry, queries are often not plain text alone but also include subtext or pictures, so the query is vectorized. Other functions, such as spelling correction and synonym mining, are also optimized for different industries.
- Every query analysis function supports intervention
Interventions take effect in real time and cover entity recognition, spelling correction, stop words, word weights, synonyms, category prediction, and so on.
- Lightweight service customization
Query analysis capabilities are configured according to each customer's business scenario. OpenSearch provides the complete set of capabilities, and users select the ones they need in their production environment. Users can also run several different types of query analysis, or several different query analysis configurations.
- No operation and maintenance required
Users no longer need to invest continuously in daily operation and maintenance.
Query analysis service architecture
Algorithm Service Center
- Release and iteration of algorithm functions;
- Create, delete, update, and query user models;
- Algorithm model training;
- Reflow of algorithm model;
Intervention function
- Creation, deletion, modification, and querying of user intervention data;
- Real-time synchronization of intervention data to query analysis services;
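One way to picture the intervention function is as an override store that the analysis step consults before falling back to the model output. The sketch below is an assumption-laden illustration: `InterventionStore`, its method names, and the in-process dictionary are hypothetical, and a real deployment would synchronize the data to the query analysis service rather than share memory.

```python
import threading


class InterventionStore:
    """In-memory user overrides, keyed by (function name, input term)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._rules = {}

    def put(self, function, term, value):
        """Add or modify an intervention; visible to the next lookup."""
        with self._lock:
            self._rules[(function, term)] = value

    def delete(self, function, term):
        """Remove an intervention if it exists."""
        with self._lock:
            self._rules.pop((function, term), None)

    def get(self, function, term):
        """Return the intervention value, or None when there is none."""
        with self._lock:
            return self._rules.get((function, term))


def model_synonyms(term):
    """Stand-in for an algorithm model's synonym output (made-up data)."""
    return {"sneakers": ["trainers"]}.get(term, [])


def synonyms_with_intervention(store, term):
    """Prefer the user's intervention; fall back to the model output."""
    override = store.get("synonym", term)
    return override if override is not None else model_synonyms(term)
```

Because the lookup reads the store on every call, a `put` or `delete` changes the result of the very next query, which is the behavior that "effective in real time" describes.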
Query analysis and category prediction services
- Load dictionary, model, data, configuration;
- Different industries are supported through different service chain configurations;
- Load user intervention data;
Query process
- Execute the corresponding query analysis chain according to the functions configured by the user;
- The rewritten query is sent to the engine to execute the query;
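The query process above can be sketched as a chain that is assembled from the user's configuration and then handed to the engine. Everything named here is hypothetical: the function names in `REGISTRY`, the shape of the `state` dictionary, and the `engine` callable are assumptions for illustration only.

```python
def correct_spelling(state):
    fixes = {"labtop": "laptop"}  # hypothetical correction dictionary
    state["terms"] = [fixes.get(term, term) for term in state["terms"]]
    return state


def remove_stop_words(state):
    stop_words = {"a", "the", "of"}  # hypothetical stop-word list
    state["terms"] = [t for t in state["terms"] if t not in stop_words]
    return state


def expand_synonyms(state):
    synonyms = {"laptop": ["notebook"]}  # hypothetical synonym dictionary
    state["synonyms"] = {
        term: synonyms[term] for term in state["terms"] if term in synonyms
    }
    return state


# Registry of available analysis functions, keyed by configuration name.
REGISTRY = {
    "spell_check": correct_spelling,
    "stop_word": remove_stop_words,
    "synonym": expand_synonyms,
}


def build_chain(enabled_functions):
    """Resolve the configured function names into an ordered list of steps."""
    return [REGISTRY[name] for name in enabled_functions]


def run_query(raw_query, enabled_functions, engine):
    """Execute the configured chain, then send the rewritten query on."""
    state = {"raw": raw_query, "terms": raw_query.lower().split(), "synonyms": {}}
    for step in build_chain(enabled_functions):
        state = step(state)
    return engine(state)
```

Two users with different configurations, for example `["spell_check", "stop_word", "synonym"]` and `["stop_word"]`, run different chains over the same code, and each step only has to implement its own logic.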
DIIRuntime framework
- Supports multiple index types so that algorithms can access various kinds of data efficiently;
- Index construction, distribution, loading, and querying are unified, which reduces development, operation, and maintenance costs;
- A chain service framework with flexible chain composition supports the functions needed in different scenarios;
- Algorithm developers only need to implement the logic of the algorithm function itself, which keeps development simple and fast;
Elasticsearch-compatible architecture
Query analysis functions of the OpenSearch Elasticsearch engine
- Largely aligned with the query analysis capabilities of OpenSearch;
- Industry word segmentation
  - Supports intervention
  - Supports extended word segmentation
- Industry query analysis
  - Configurable
  - Supports intervention
Implementation architecture
1. Create an instance
- Create an OpenSearch instance and associate it with an Alibaba Cloud Elasticsearch instance
- Install the plugin
2. Configure query analysis
- Set the corresponding analyzer in the Mapping
- Plugin functions
  - Provide general and industry word segmentation
  - Call the query analysis service to get the query rewrite result
  - Rewrite the Elasticsearch query
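The configuration and rewrite steps can be pictured with standard Elasticsearch request bodies. This is a sketch under assumptions, not the plugin's actual behavior: the field name and analyzer name passed in are placeholders (the real analyzer name provided by the plugin is not given here), and the way synonyms and weights are mapped onto `bool`, `match`, and `boost` is one plausible choice.

```python
import json


def build_mapping(field_name, analyzer_name):
    """Index mapping body that assigns an analyzer to a text field."""
    return {
        "mappings": {
            "properties": {
                field_name: {"type": "text", "analyzer": analyzer_name}
            }
        }
    }


def rewrite_to_es_query(field_name, terms, synonyms, weights):
    """Turn query analysis output into an Elasticsearch bool query.

    Every term must match, either directly or through one of its synonyms;
    the term weight is applied as a boost on each alternative.
    """
    must = []
    for term in terms:
        alternatives = [term] + synonyms.get(term, [])
        should = [
            {"match": {field_name: {"query": alt, "boost": weights.get(term, 1.0)}}}
            for alt in alternatives
        ]
        must.append({"bool": {"should": should, "minimum_should_match": 1}})
    return {"query": {"bool": {"must": must}}}


if __name__ == "__main__":
    # "title" and "hypothetical_industry_analyzer" are placeholder names.
    print(json.dumps(build_mapping("title", "hypothetical_industry_analyzer"), indent=2))
    body = rewrite_to_es_query(
        "title", ["phone", "case"], {"case": ["cover"]}, {"phone": 1.0, "case": 0.6}
    )
    print(json.dumps(body, indent=2))
```

The resulting dictionaries can be sent as the JSON bodies of an index creation request and a search request, respectively.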