Introduction: Every search request requires a large amount of data computation and processing in the backend to recall the results that meet the user's needs. Drawing on the common problems and difficulties that query analysis poses for self-built search services, this talk introduces the query analysis capabilities and solutions of Alibaba Cloud OpenSearch and explains in depth how Alibaba's query analysis service architecture and its Elasticsearch-compatible architecture are implemented.

Special guest:

Xiang Zhaogui (Xiang Gong), Alibaba Senior Technical Expert

Video address: https://yqh.aliyun.com/live/opensearch

Introduction to Query Analysis

The role of query analysis in search

In engineering terms, the processing of a search request is divided into two stages: recall and ranking. The recall stage must find as many of the documents the user wants as possible in the engine, and the ranking stage must place the documents that best meet the user's needs at the top of the results returned to the user.
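The two stages can be illustrated with a minimal sketch. The toy corpus, the inverted index, and the term-overlap score below are assumptions made for illustration only; they are not how the OpenSearch engine recalls or ranks documents.

```python
from collections import defaultdict

# Toy corpus; document ids and texts are made up for illustration.
DOCS = {
    1: "red running shoes for men",
    2: "blue running jacket",
    3: "red leather shoes",
}


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


def recall(index, query_terms):
    """Recall stage: collect every document that matches at least one query term."""
    candidates = set()
    for term in query_terms:
        candidates |= index.get(term, set())
    return candidates


def rank(docs, candidates, query_terms):
    """Ranking stage: order candidates by the number of distinct query terms they contain."""
    wanted = set(query_terms)

    def score(doc_id):
        return len(wanted & set(docs[doc_id].split()))

    # Highest score first; ties are broken by document id for a stable order.
    return sorted(candidates, key=lambda doc_id: (-score(doc_id), doc_id))


index = build_inverted_index(DOCS)
terms = "red running shoes".split()
candidates = recall(index, terms)
ranked = rank(DOCS, candidates, terms)
print(ranked)
```

Recall here is deliberately generous (any matching term qualifies a document), and ranking then decides what the user sees first.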

Query analysis allows this processing and analysis to be done quickly. In a real production environment, users often mistype their input, so the query needs spelling correction. Next, the query is segmented into words and the importance of each word is identified, which helps in both recall and ranking. Because the same concept can be expressed in more than one way in practice, synonyms must also be expanded. The user's query is then rewritten so that the engine can perform recall more efficiently. The query processing stage also outputs information that is used at ranking time, such as signals for computing document relevance and category relevance, and a vector representation of the text for computing semantic relevance.
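The steps described above can be sketched as a small pipeline. Everything in it is a stand-in chosen for illustration: the dictionaries are hand-written, whitespace splitting replaces a real word segmenter, and the `AND`/`OR` string is a hypothetical rewrite format rather than OpenSearch's actual query syntax.

```python
from dataclasses import dataclass, field

# Hypothetical hand-written dictionaries; a production system would mine these from data.
SPELLING_FIXES = {"iphnoe": "iphone"}
TERM_WEIGHTS = {"iphone": 0.9, "case": 0.6}
SYNONYMS = {"case": ["cover"]}
DEFAULT_WEIGHT = 0.3


@dataclass
class AnalyzedQuery:
    original: str
    terms: list = field(default_factory=list)
    weights: dict = field(default_factory=dict)
    synonyms: dict = field(default_factory=dict)
    rewritten: str = ""


def analyze(query):
    # Segmentation: whitespace split stands in for a real word segmenter.
    tokens = query.lower().split()
    # Spelling correction: applied per token from a fixed lookup table.
    terms = [SPELLING_FIXES.get(tok, tok) for tok in tokens]
    # Term weighting: more important words get a higher weight.
    weights = {t: TERM_WEIGHTS.get(t, DEFAULT_WEIGHT) for t in terms}
    # Synonym expansion: keep alternatives for the terms that have them.
    synonyms = {t: SYNONYMS[t] for t in terms if t in SYNONYMS}
    # Rewrite: every term is required, and each term may match any of its synonyms.
    groups = ["(" + " OR ".join([t] + synonyms.get(t, [])) + ")" for t in terms]
    rewritten = " AND ".join(groups)
    return AnalyzedQuery(query, terms, weights, synonyms, rewritten)


result = analyze("iphnoe case")
print(result.rewritten)
```

The weights are not used in the rewrite itself; they are the kind of by-product that the ranking stage could consume later.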


Query analysis pipeline

In general, the role of query analysis is to analyze and rewrite the query entered by the user in order to improve the accuracy of recall and the relevance of ranking. The following simple example introduces the query analysis function of OpenSearch.

image

Problems faced by self-built search services

  1. Industry domain knowledge has to be accumulated continuously;
  2. A lack of large volumes of industry sample data makes self-learning difficult;
  3. Algorithm tuning, engineering development, and daily operation and maintenance require a continuous investment of staff.

OpenSearch query analysis features

  • Provides a complete, industry-oriented query analysis solution

OpenSearch provides algorithm functions for specific domains and optimizes certain algorithm functions for them. For example, it provides entity recognition for the e-commerce industry. Queries in the education industry often contain more than plain text, such as pictures, so text vectorization is applied to the query. Some functions, such as spelling correction and synonym mining, are also optimized separately for different industries.

  • Every query analysis function supports intervention

Interventions take effect in real time and cover entity recognition, spelling correction, stop words, term weights, synonyms, category prediction, and more.
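One way to picture intervention is as a user-maintained override layer that is consulted before the algorithm's own output. The sketch below assumes an in-memory store and a stubbed synonym model; the class and function names are hypothetical and do not come from the OpenSearch API.

```python
import threading


class InterventionStore:
    """Holds user-supplied overrides that take priority over algorithm output."""

    def __init__(self):
        self._lock = threading.Lock()
        self._synonyms = {}

    def set_synonyms(self, term, synonyms):
        # The change is visible to the next lookup, with no restart or reload.
        with self._lock:
            self._synonyms[term] = list(synonyms)

    def remove_synonyms(self, term):
        with self._lock:
            self._synonyms.pop(term, None)

    def get_synonyms(self, term):
        with self._lock:
            value = self._synonyms.get(term)
            return None if value is None else list(value)


def model_synonyms(term):
    # Stand-in for synonyms mined by an algorithm model.
    return {"sneakers": ["trainers"]}.get(term, [])


def resolve_synonyms(term, store):
    """Prefer the user's intervention; fall back to the model when there is none."""
    override = store.get_synonyms(term)
    return override if override is not None else model_synonyms(term)


store = InterventionStore()
before = resolve_synonyms("sneakers", store)
store.set_synonyms("sneakers", ["running shoes"])
after = resolve_synonyms("sneakers", store)
store.remove_synonyms("sneakers")
restored = resolve_synonyms("sneakers", store)
print(before, after, restored)
```

The same override-then-fallback pattern would apply to the other intervention types listed above, such as stop words or term weights.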

  • Lightweight customized services

Query analysis capabilities are configured according to each customer's business scenario. OpenSearch provides the complete set of capabilities, and users select the ones they need in their production environment. Users can also work with several different types of query analysis, or with several different query analysis configurations.

  • No operation and maintenance required

Users no longer need to invest continuously in daily operation and maintenance.


Query analysis service architecture

Algorithm Service Center

  • Release and iteration of algorithm functions;
  • Creation, retrieval, update, and deletion of user models;
  • Algorithm model training;
  • Reflow of algorithm model;

Intervention function

  • Creation, retrieval, update, and deletion of user intervention data;
  • Real-time synchronization of intervention data to query analysis services;

Query analysis and category prediction services

  • Load dictionaries, models, data, and configuration;
  • Different industries are served through different service chain configurations;
  • Load user intervention data;

Query process

  • Execute the query analysis chain that corresponds to the functions configured by the user;
  • Send the rewritten query to the engine for execution;
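The idea of executing a chain according to the user's configuration can be sketched as follows. The step names, the registry, and the per-application chains are all hypothetical; the sketch only shows how a list of configured functions could drive which analysis steps run and in what order.

```python
from typing import Callable, Dict, List

Step = Callable[[dict], dict]
REGISTRY: Dict[str, Step] = {}


def register(name: str):
    """Register an analysis step under the name used in chain configurations."""
    def decorator(func: Step) -> Step:
        REGISTRY[name] = func
        return func
    return decorator


@register("spell_check")
def spell_check(ctx: dict) -> dict:
    fixes = {"labtop": "laptop"}  # toy correction table
    ctx["terms"] = [fixes.get(t, t) for t in ctx["terms"]]
    return ctx


@register("stop_words")
def stop_words(ctx: dict) -> dict:
    ctx["terms"] = [t for t in ctx["terms"] if t not in {"a", "the", "for"}]
    return ctx


@register("synonyms")
def synonyms(ctx: dict) -> dict:
    table = {"laptop": ["notebook"]}  # toy synonym table
    ctx["expansions"] = {t: table[t] for t in ctx["terms"] if t in table}
    return ctx


# Hypothetical per-application configuration: each app lists the steps it wants.
APP_CHAINS: Dict[str, List[str]] = {
    "ecommerce_app": ["spell_check", "stop_words", "synonyms"],
    "minimal_app": ["stop_words"],
}


def run_chain(app: str, query: str) -> dict:
    """Run only the steps configured for this app, in the configured order."""
    ctx = {"query": query, "terms": query.lower().split(), "expansions": {}}
    for name in APP_CHAINS[app]:
        ctx = REGISTRY[name](ctx)
    return ctx


full = run_chain("ecommerce_app", "a labtop for students")
minimal = run_chain("minimal_app", "a labtop for students")
print(full["terms"], full["expansions"], minimal["terms"])
```

In this arrangement, supporting a new industry or scenario means adding a chain entry rather than changing the steps themselves, and the resulting context is what would be turned into the rewritten query sent to the engine.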


DIIRuntime framework

  • Supports many different index types so that algorithms can access all kinds of data efficiently;
  • Unifies index construction, distribution, loading, and querying, which reduces development and operation and maintenance costs;
  • Chain-based service framework with flexible chain composition, supporting functions in different scenarios;
  • Algorithm developers only need to focus on the logic of the algorithm function itself, which keeps development simple and fast;


Elasticsearch-compatible architecture

Query analysis functions of the OpenSearch Elasticsearch engine

  • Largely on par with the query analysis capabilities of OpenSearch;
  • Industry word segmentation capability
    • Supports intervention
    • Supports extended word segmentation
  • Industry query analysis capability
    • Configurable
    • Supports intervention


Implementation architecture

1. Create an instance

  • Create an OpenSearch instance and associate it with an Aliyun Elasticsearch instance
  • Install the plugin

2. Configure query analysis

  • Set the corresponding analyzer in the Mapping
  • Plugin functions
    • Provide general and industry word segmentation capabilities
    • Call the query analysis service and obtain the query rewrite results
    • Rewrite the Elasticsearch query
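To make the configuration step more concrete, the sketch below builds an Elasticsearch mapping that assigns an analyzer to a text field, and a `bool` query of the kind a rewrite step might produce from terms and synonyms. The mapping and query shapes follow the standard Elasticsearch DSL, but the analyzer name, the field name, and the rewrite logic are assumptions; the real analyzer name is whatever the installed plugin registers, and the plugin's actual rewrite output is not described in this article.

```python
import json

# Hypothetical analyzer name, used here only as a placeholder.
ANALYZER_NAME = "example_industry_analyzer"


def build_mapping(field_name, analyzer):
    """Index mapping that tells Elasticsearch which analyzer to use for a text field."""
    return {
        "mappings": {
            "properties": {
                field_name: {"type": "text", "analyzer": analyzer},
            }
        }
    }


def rewrite_to_bool_query(field_name, terms, synonyms):
    """Require every term, letting each term also match any of its synonyms."""
    must_clauses = []
    for term in terms:
        alternatives = [term] + synonyms.get(term, [])
        must_clauses.append({
            "bool": {
                "should": [{"match": {field_name: alt}} for alt in alternatives],
                "minimum_should_match": 1,
            }
        })
    return {"query": {"bool": {"must": must_clauses}}}


mapping = build_mapping("title", ANALYZER_NAME)
query = rewrite_to_bool_query("title", ["iphone", "case"], {"case": ["cover"]})
print(json.dumps(mapping, indent=2))
print(json.dumps(query, indent=2))
```

The mapping body would be supplied when the index is created, and the query body would be sent as a search request; both are plain JSON, so they can be inspected before anything is sent to a cluster.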



If you need in-depth optimization of your search results, you can fill in the expert consultation questionnaire and take part in the trial to get the OpenSearch general word segmentation capability for free. Questionnaire address: https://c.tb.cn/F3.05Srxl


