Introduction: Alibaba Cloud offers a unified search recall engine. The search recall pipeline supports both Alibaba Cloud's self-developed Ha3 engine and the Alibaba Cloud Elasticsearch engine, and provides search algorithm capabilities for multiple industries to help companies efficiently optimize their search results in depth.
Special guest:
Xing Shaomin (Duo Yu), Alibaba Senior Technical Expert
Video address: https://yqh.aliyun.com/live/opensearch
Search challenges
Engineering challenges
- Millions of QPS
  - Highly concurrent access during 618, Double 11, and other major promotions
- Data at the 100-billion scale
  - Retrieval over SKU, order, logistics, and other large datasets
- High timeliness
  - Order and logistics data have extremely strict timeliness requirements
- High availability
  - Even a few minutes of unavailability can cause huge business losses
- Low latency
  - Search is a traffic entry point, so high latency reduces transaction volume
Algorithm challenges
- Low degree of information standardization
In the e-commerce industry, for example, sellers pack many keywords into product names to make their products rank higher. The resulting names are irregular and often not even grammatical, which makes them very difficult to analyze.
For example:
- Baby cotton clothing suits, winter clothing for infants and young children, 0-1 years old, 3 boys and infants, autumn and winter women’s warmth, padded jacket, padded jacket;
- Fresh 5 catties of edamame, green beans, edamame, sweet beans, fresh vegetables, peas, and freshly picked pods;
- Rich query intent
A single query term can carry many different intents.
For example:
- Water--(Mineral water? Toilet water? Shampoo?)
- Apple--(Apple to eat? Apple phone?)
- Marco pineapple--(pineapple? Marco pineapple ham sausage?)
- Stockings milk tea--(stockings? milk tea?)
- Large recall volume, difficult to rank
  - A single query can recall tens of millions of documents, which are hard to rank accurately with limited resources
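One common way to cope with a very large recall set under a fixed compute budget is to rank in stages: a cheap score trims the candidates, and a costlier score reorders only the survivors. The sketch below illustrates that general idea. It is not taken from the talk, and the function names, document fields, and cutoffs are all hypothetical.

```python
import heapq
from typing import Any, Callable, Dict, List

Doc = Dict[str, Any]


def two_stage_rank(
    candidates: List[Doc],
    cheap_score: Callable[[Doc], float],
    costly_score: Callable[[Doc], float],
    keep: int,
    top_k: int,
) -> List[Doc]:
    """Rank a large candidate set in two passes (illustrative only)."""
    # Stage 1: keep the `keep` best documents by the cheap score.
    # heapq.nlargest avoids fully sorting the whole candidate set.
    shortlist = heapq.nlargest(keep, candidates, key=cheap_score)
    # Stage 2: apply the expensive score to the shortlist only.
    return sorted(shortlist, key=costly_score, reverse=True)[:top_k]


# Hypothetical documents and scoring functions.
docs = [
    {"id": 1, "text_match": 0.9, "sales": 10},
    {"id": 2, "text_match": 0.8, "sales": 500},
    {"id": 3, "text_match": 0.2, "sales": 9000},
    {"id": 4, "text_match": 0.7, "sales": 50},
]


def cheap(d: Doc) -> float:
    return d["text_match"]


def costly(d: Doc) -> float:
    return 0.5 * d["text_match"] + 0.5 * min(d["sales"] / 1000, 1.0)


top = two_stage_rank(docs, cheap, costly, keep=3, top_k=2)
```

The trade-off is that a document dropped in the first stage can never be recovered by the second, so the cheap score has to be a reasonable proxy for the costly one.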
What happens if these problems are not handled well? Poorly handled engineering and algorithm challenges lead to user churn.
User churn observations:
- Users who search for a keyword more than twice and still get no results conclude that the platform does not carry such products;
- Users who browse the search results for more than half a minute without finding what they want leave directly;
- Users who browse more than 4 pages of search results without finding what they want leave directly;
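The three observations above can be turned into simple session-level flags. The following is a minimal sketch, assuming a per-session record with the fields shown. The `SearchSession` type, its field names, and the signal labels are hypothetical; only the thresholds (more than 2 searches, more than half a minute, more than 4 pages) come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SearchSession:
    zero_result_searches: int  # searches for the same keyword that returned nothing
    browse_seconds: float      # time spent browsing the search results
    pages_viewed: int          # number of result pages the user opened
    clicked_result: bool       # False means the user left without clicking anything


def churn_signals(s: SearchSession) -> List[str]:
    """Return the churn-risk signals that apply to one search session."""
    signals = []
    if s.zero_result_searches > 2:
        signals.append("repeated_no_results")
    if s.browse_seconds > 30 and not s.clicked_result:
        signals.append("long_browse_then_exit")
    if s.pages_viewed > 4 and not s.clicked_result:
        signals.append("deep_paging_then_exit")
    return signals
```

For example, `churn_signals(SearchSession(3, 45.0, 5, False))` raises all three signals, while a session that ends in a click after a short browse raises none.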
Search products and solutions
About Elasticsearch
Elasticsearch is the industry's most mainstream information retrieval and analysis engine. DB-Engines ranks it the No. 7 database and the No. 1 search engine worldwide by popularity. It is widely used in a variety of business scenarios.
Alibaba Cloud Elasticsearch product introduction
Alibaba Cloud Elasticsearch provides a fully managed Elastic Stack service that is 100% compatible with the open source version and includes the X-Pack commercial plug-in for free. It is ready to use out of the box and billed on demand. In-depth functional and kernel performance optimizations add richer analysis and retrieval capabilities and a more secure, highly available service.
Features and advantages
- Low cost
  - Free X-Pack commercial plug-ins worth 6,000 USD per node
  - Intelligent operation and maintenance, advanced monitoring and alerting, disaster recovery deployment, and more, for very low O&M costs
  - Scenario-specific tuning that improves resource utilization, plus multiple pricing options
- Function and performance
  - Log-enhanced kernel, 100% cost reduction, 100% performance improvement
  - Comprehensive information retrieval capabilities across text, video, audio, and images
  - Fully aligned with the requirements of China's Multi-Level Protection Scheme (MLPS) 2.0, with enterprise-level data security capabilities
  - Open secondary development capabilities to support packaging for various business scenarios
- Brand endorsement
  - Strategic cooperation between Alibaba Cloud and Elastic
- Rich industry experience
  - Services for 30 industries, including e-commerce, retail, education, finance, media, and logistics
- Global service
  - Services cover all Alibaba Cloud data centers and support localized private cloud delivery and hybrid cloud solutions
Alibaba Cloud OpenSearch product introduction
OpenSearch is a one-stop platform for developing intelligent search services, built on a large-scale distributed search engine developed in-house by Alibaba. It currently supports search for Alibaba Group's core businesses, including Taobao and Tmall. With built-in capabilities such as query semantic understanding and machine-learned ranking algorithms for various industries, it offers fully open engine capabilities that help developers quickly build intelligent search services.
Application scenarios
- E-commerce industry: product search, order search, store search, database acceleration and analysis scenarios
- Content industry: news search, community search, video search, gallery search
- Other: multimedia industry, game industry, enterprise big data, and more
Core advantages
- Engineering advantages: high performance (millisecond end-to-end latency), high stability (99.99% stability), high timeliness (updates take effect in milliseconds);
- Algorithm advantages: NLP technology accumulated by DAMO Academy over many years, plus query analysis and ranking capabilities refined across multiple industries;
- Product advantages: low barrier to entry, no operation and maintenance burden, open platform;
Search within Alibaba Group
- The core search engine, HA3, was incubated in Taobao and Tmall search
- 1,000+ businesses within the group are onboarded, 700 billion+ products/documents are indexed, and daily search PV reaches tens of billions
- During Double 11 in 2020, peak QPS exceeded 1.1 million and peak real-time data update TPS exceeded 550,000
Productization of OpenSearch algorithms
OpenSearch is an intelligent search product. In recent years it has productized many of its algorithms, covering query analysis, multi-channel recall, intelligent ranking, user behavior, business development, and effect evaluation.
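As a rough illustration of what multi-channel recall involves, the sketch below merges hits from several recall channels, removes duplicates, and keeps each document's best score. It is a generic sketch, not OpenSearch's actual implementation. The function name and channel names are hypothetical, and it assumes scores from different channels are already on a comparable scale.

```python
from typing import Dict, List, Tuple

Hit = Tuple[str, float]  # (document id, score)


def merge_channels(channels: Dict[str, List[Hit]], limit: int) -> List[Hit]:
    """Merge per-channel hits, keeping each document's highest score."""
    best: Dict[str, float] = {}
    for hits in channels.values():
        for doc_id, score in hits:
            if score > best.get(doc_id, float("-inf")):
                best[doc_id] = score
    # Sort by score descending, breaking ties by document id for determinism.
    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:limit]


# Hypothetical channels: one keyword-based, one vector-based.
merged = merge_channels(
    {
        "text": [("a", 0.9), ("b", 0.4)],
        "vector": [("b", 0.7), ("c", 0.6)],
    },
    limit=2,
)
```

In practice the merged list would then be handed to the ranking stage rather than returned to the user directly.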
Alibaba Cloud search service selection: product ecosystem
Product model
Open source product: Alibaba Cloud Elasticsearch
- Well known in the industry and a preferred platform for search;
- The open source ecosystem gives it a low learning threshold and makes it easy to master;
- The plug-in mechanism allows free customization to meet different business needs;
Alibaba self-developed product: OpenSearch
- One-stop search engine platform service;
- The core engine, HA3, is Alibaba Group's core search technology, supporting queries at millions of QPS and indexes of hundreds of billions of documents;
- Built-in QP and ranking algorithm capabilities, together with industry templates, deliver high-quality search results in vertical industries;
Application ecosystem
Performance difference
Unified recall engine
The unified recall engine is designed to suit the usage habits of different users. Customers who query through Elasticsearch can call the QP function in OpenSearch to gain query analysis capabilities. Customers who query through OpenSearch can use its query analysis capabilities natively.
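The Elasticsearch path can be pictured as two steps: run the raw query through query analysis, then build an Elasticsearch request from the analyzed terms. The sketch below shows only that shape. `analyze_query` is a hypothetical stand-in for the OpenSearch QP call (here it merely lowercases and splits on whitespace), and the `title` field is an assumed index field. The request body uses the standard Elasticsearch `bool`/`should`/`match` query DSL.

```python
from typing import Any, Dict, List


def analyze_query(raw: str) -> List[str]:
    """Hypothetical placeholder for the OpenSearch QP (query analysis) call.

    A real integration would call the service; this stub only lowercases
    the query and splits it on whitespace.
    """
    return raw.lower().split()


def build_es_query(raw: str, field: str = "title") -> Dict[str, Any]:
    """Build an Elasticsearch request body from the analyzed query terms."""
    terms = analyze_query(raw)
    return {
        "query": {
            "bool": {
                # One match clause per analyzed term; at least one must match.
                "should": [{"match": {field: term}} for term in terms],
                "minimum_should_match": 1,
            }
        }
    }


body = build_es_query("Apple Phone")
```

The resulting `body` dictionary could then be sent as the request body of an Elasticsearch search call.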
If you need in-depth optimization of search results, you can fill in the expert consultation questionnaire and join the trial to get the OpenSearch general word segmentation capability for free. Questionnaire address: https://c.tb.cn/F3.05Srxl