Baiyu
For operations and maintenance (O&M) engineers, any vote on the five most frantic support scenarios would surely include promotional campaigns. Every promotion launch means a sleepless night: large volumes of content updates, a flood of customers, and heavy data reads and writes. Various technical solutions and tooling services exist to keep a big promotion running smoothly, yet complaints still arrive from users everywhere, such as "the product image cannot be loaded", "the page opens slowly", and "cannot complete the order payment". When poor user experience and website performance lead to low conversion and slow business growth, it is the O&M engineers who end up taking the blame.
On the question of user experience and website performance, we interviewed many enterprise O&M engineers and independent webmasters, and found that their views clustered around the following areas:
(1) Performance and experience problems caused by "the gap between product and user experience"
As the Internet dividend fades, competition over product features and user experience design keeps intensifying. A gap opens between the functional logic a product is designed with and what users understand when they use it. Flash sales, promotional campaigns, and UGC content make the product logic more complex, and even with guidance and documentation, users still need time to understand the product and form usage habits. At the same time, to enrich the functional modules further, rich media, third-party components, and customer advertisements are continually added. Excessive and poorly planned external content increases system load and drags down product performance. The business wants all of it at once, and the ultimate price is sacrificing some website performance and user experience.
(2) Performance and experience problems caused by "complex network environment"
As we all know, first- and second-tier network operators of all kinds exist across the country, which greatly increases the complexity of the national network environment. Slow upgrades to operator infrastructure and frequent human-caused incidents lead to frequent IDC failures. Companies can only reassure users and wait for repairs, resigned to the time that troubleshooting takes. At the same time, wide geographic coverage, scattered users, and varied access methods make the access network complicated, and enterprises cannot reliably estimate the environment their users are in. Even with widely distributed data centers and multi-line BGP access, network environment problems remain hard to solve, which makes network optimization harder still and the experience of real users even more unpredictable.
(3) Performance and experience problems caused by "widely varying PC-side environments"
China has more internet users than any other country, and behind that massive user base lies a huge spread in client hardware. Some users watch 4K live video on bilibili with an i9-11900K and an RTX 3080 Ti, while others browse text news on portal sites with a Pentium 4 from the turn of the millennium and integrated graphics of the same era. Different browser versions, rendering mechanisms, and local host performance therefore produce different experiences, including access errors, slow loading, and heavy local resource consumption. Understanding what most users actually experience, weighing those differences, and making trade-offs among them is a hard problem that every website operations and development team must face.
(4) System availability problems caused by "the aftereffects of chasing iteration speed"
Because of frenzied competition on the Internet, products forced to choose between hitting a feature launch window and careful tuning end up selectively neglecting architecture and stability. An architecture that lacks rigor, combined with business growth beyond what it can support, leads to system overload, crashes, response timeouts, and other problems. Several factors contribute:
First, the business iterates very quickly and intrusive monitoring cannot be rolled out in a short time, yet failures in the business system still need to be detected quickly.
Second, development resources are tight or poorly coordinated, infrastructure monitoring cannot directly reflect business problems, and application monitoring is too costly to implement.
Finally, the application calls third-party API interfaces whose availability cannot be guaranteed, and when they fail the team cannot respond and handle the failure in time.
Taken individually, these look like isolated problems, but as business volume grows, chain reactions amplify them and they directly affect the user experience.
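The third-party API problem above is the kind of failure an external, non-intrusive check can surface. The following is a minimal sketch of such a check using only the Python standard library. The names `ProbeResult` and `probe`, the 3-second timeout, and the example URL are our own assumptions for illustration and are not part of any product API.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProbeResult:
    url: str
    ok: bool
    status: Optional[int]
    elapsed_ms: float
    error: Optional[str] = None


def probe(url: str, timeout_s: float = 3.0,
          opener: Callable = urllib.request.urlopen) -> ProbeResult:
    """Send one GET from outside the application and record the outcome.

    `opener` is injectable so the function can be exercised without a network.
    """
    start = time.perf_counter()

    def elapsed_ms() -> float:
        return (time.perf_counter() - start) * 1000

    try:
        with opener(url, timeout=timeout_s) as resp:
            status = resp.status
        return ProbeResult(url, 200 <= status < 400, status, elapsed_ms())
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (4xx/5xx).
        return ProbeResult(url, False, exc.code, elapsed_ms(), str(exc))
    except OSError as exc:
        # Covers URLError, connection failures, and timeouts.
        return ProbeResult(url, False, None, elapsed_ms(), str(exc))
```

Calling `probe("https://api.example.com/health")` on a schedule (the URL is a placeholder) gives a record of status, latency, and error text for each attempt, without touching the application code.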
(5) "Lack of monitoring means from the user's perspective" leads to passive response to customer complaints
Product features go through various tests before launch, and the operations team keeps watching how users use them. The O&M team, however, often learns that the system has a problem only after a customer complains, which leaves it reacting passively. Reproducing the anomaly and locating the problem may even take a full day, which seriously hurts NPS. Common monitoring methods mostly start from the provider's own perspective and cannot directly reflect what the user is experiencing.
So, with so many influencing factors, how do we test our website from the perspective of real users, quantify the user experience, and locate performance bottlenecks? Here we take marketing campaigns in the e-commerce industry as an example. As competition intensifies, promotions such as Double Eleven and 618 have become important annual marketing events for e-commerce and other transaction-oriented industries. However, a short-term influx of large numbers of users causes page loading delays, stalled business services, and other problems that affect the user experience.
Specific problems include:
Before going live, there is no way to simulate real users and test the actual product experience under highly concurrent access at peak load.
There is no accurate picture of users' actual browsing paths, so conversion bottlenecks cannot be located and it is unclear what to optimize.
During the big promotion, product information is updated frequently, and after each update complaints such as "the product image cannot be loaded" and "the page opens slowly" often arrive from users across the country.
The performance of competitors' campaigns in the same industry cannot be observed, so changes in their marketing situation cannot be understood.
In the past, these problems were difficult to solve, for the following reasons:
Although task walls and similar methods exist, the O&M team cannot find enough real traffic that matches actual needs to test the product experience, and purchasing such traffic is time-consuming and expensive.
The launch window for a marketing promotion is usually very tight, and the R&D team has limited delivery time. Adding intrusive monitoring probes would slow down delivery and could affect product stability.
The O&M team cannot proactively run the relevant tests, so problems are discovered only while real users are experiencing them and can only be fixed reactively. Reproducing the problem and locating the fault can then tie up the entire O&M team and prolong the repair time indefinitely.
Therefore, the operations team and the O&M team need a product or solution that addresses the problems above. Cloud Dial Test, a business-oriented, non-intrusive, cloud-native monitoring product, is the best fit. Through Alibaba Cloud's global service network, it simulates real user behavior and continuously monitors the availability and performance of websites and their networks, services, and API endpoints around the clock. It locates problems at fine granularity: the page element level, the network request level, and the network link level. A rich set of monitoring items and analysis models helps companies find and locate performance bottlenecks and experience blind spots in time, reduce operational risk, and improve service experience and efficiency.
(1) Global monitoring node coverage
More than 200,000 last-mile (LM) nodes worldwide, more than 500 IDC monitoring nodes, 400+ operators in China and abroad, and hundreds of thousands of registered members ensure that the monitoring scale keeps up with growing business scale.
(2) No need to embed code, ready to use out of the box
Monitoring is zero-intrusion: you only need to enter a URL and complete a simple configuration, with no R&D support required. A complete website performance analysis report is available within a few minutes. Both resource packs and pay-as-you-go purchasing are offered to meet O&M testing needs.
(3) Business-oriented, with a variety of preset analysis models
The monitoring interval can be as short as one minute, with more than 20 monitoring parameters in 7 categories and support for a variety of mainstream protocols. The service provides 7×24 real-time monitoring, alerting, and performance analysis for sites and service ports, with fine-grained fault detection. From the end customer's perspective, it combines dimensions such as region and operator, drills down to the details of a single sample, and uses a rich indicator system and chart types to show the problem, the affected scope, and the root cause visually. This shortens analysis time, improves O&M efficiency, and enables fine-grained monitoring.
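To make the region-and-operator analysis concrete, here is a small sketch of how probe samples could be grouped along those two dimensions. It illustrates the idea only and is not the product's analysis model. The `Sample` fields, the function names, and the metrics chosen (availability and average first screen time) are assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Sample:
    region: str
    operator: str
    ok: bool
    first_screen_ms: float


def summarize(samples: Iterable[Sample]) -> Dict[Tuple[str, str], dict]:
    """Group samples by (region, operator) and compute per-group metrics."""
    groups: Dict[Tuple[str, str], List[Sample]] = defaultdict(list)
    for s in samples:
        groups[(s.region, s.operator)].append(s)

    report = {}
    for key, items in groups.items():
        ok_times = [s.first_screen_ms for s in items if s.ok]
        report[key] = {
            "samples": len(items),
            "availability": len(ok_times) / len(items),
            # Average only over successful loads; None if every load failed.
            "avg_first_screen_ms": mean(ok_times) if ok_times else None,
        }
    return report


def worst_groups(report: Dict[Tuple[str, str], dict], n: int = 3):
    """Return the n (region, operator) pairs with the lowest availability."""
    return sorted(report, key=lambda k: report[k]["availability"])[:n]
```

Sorting by availability is one possible starting point for drill-down; a real analysis would likely also look at latency percentiles and at individual failed samples.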
(4) Intelligent alarm, precise positioning
Real-time alarms cover first screen time, overall performance, and availability. Rich alarm policy settings and deep integration with Alibaba Cloud Alarm Center effectively reduce MTTR. Errors can be detected at the page element level and traced to a single network request, which makes locating problems more efficient.
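As an illustration of what an alarm policy on first screen time and availability might look like, the sketch below fires only after several consecutive bad checks, a common way to avoid alerting on a single blip. The rule, the 3000 ms threshold, and the names `Check` and `should_alert` are hypothetical and do not describe the actual policy options of the product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Check:
    ok: bool                          # did the page load at all?
    first_screen_ms: Optional[float]  # None when the load failed


def is_bad(check: Check, max_first_screen_ms: float) -> bool:
    """A check is bad if it failed or its first screen time is over the limit."""
    if not check.ok or check.first_screen_ms is None:
        return True
    return check.first_screen_ms > max_first_screen_ms


def should_alert(history: List[Check],
                 max_first_screen_ms: float = 3000.0,
                 consecutive: int = 3) -> bool:
    """Fire only when the last `consecutive` checks were all bad."""
    if len(history) < consecutive:
        return False
    return all(is_bad(c, max_first_screen_ms) for c in history[-consecutive:])
```

Requiring consecutive failures trades a little detection delay for fewer false alarms; the right value depends on the probe interval.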
Take the marketing promotion of one e-commerce company as an example. The website has more than one million monthly active users, mostly in third-, fourth-, and fifth-tier cities across the country, and annual website O&M spending exceeds 2 million yuan. During the big promotion, however, product information was updated frequently, and after each update users across the country often complained that "the product pictures cannot be loaded" and "the page opens slowly". The result was low user conversion and complaints directed at the O&M team.
Faced with this dilemma, we used Cloud Dial Test to solve the problem and further optimized website performance to support the promotion.
(1) Stress test
Before a marketing campaign or a new system goes live, use Cloud Dial Test to select monitoring points across operators and cities nationwide and set up browsing and network tasks. This yields real-time access experience data from frontline users and pinpoints the problematic page elements, helping the technical team fix problems in time. Simulate highly concurrent access by peak users: raise the peak load step by step, observe how the main performance indicators change, and uncover performance bottlenecks.
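The "raise the load and watch the indicators" step can be sketched as a simple concurrency ramp. This is a local illustration using Python threads, not how the product generates load. `request_fn` is a hypothetical callable that performs one request against a system you own and returns `True` on success; the concurrency levels are arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean
from typing import Callable, Iterable, List


def run_step(request_fn: Callable[[], bool], concurrency: int,
             requests_per_worker: int = 5) -> dict:
    """Run one load level and return error rate and average latency."""
    def worker(_index: int):
        timings = []
        for _ in range(requests_per_worker):
            start = time.perf_counter()
            ok = request_fn()
            timings.append((ok, (time.perf_counter() - start) * 1000))
        return timings

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = [t for batch in pool.map(worker, range(concurrency))
                   for t in batch]

    ok_ms = [ms for ok, ms in results if ok]
    return {
        "concurrency": concurrency,
        "requests": len(results),
        "error_rate": 1 - len(ok_ms) / len(results),
        "avg_ms": mean(ok_ms) if ok_ms else None,
    }


def ramp(request_fn: Callable[[], bool],
         levels: Iterable[int] = (5, 10, 20)) -> List[dict]:
    """Increase concurrency level by level so a bottleneck shows up as a knee."""
    return [run_step(request_fn, c) for c in levels]
```

Comparing `avg_ms` and `error_rate` across levels shows where latency starts to climb faster than load, which is usually the first sign of a bottleneck.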
(2) User experience optimization
First screen monitoring and real-time monitoring make it possible to verify problems and reproduce faults immediately, and to evaluate and optimize website performance. Transaction flow analysis reveals the process users actually go through, so browsing paths can be optimized, conversion bottlenecks found, and the conversion rate raised.
(3) Competitor analysis and iteration
Because monitoring is zero-intrusion, it can also collect and analyze the performance of competitors in the same industry. This helps the team understand changes in competitors' marketing and their countermeasures, target IT investment and tuning iterations accordingly, make up for marketing shortcomings, and hold a leading position.
After these measures, website performance improved greatly, and the quantitative indicators related to user experience rose by more than 30%, effectively driving business growth. Beyond the scenarios above, Cloud Dial Test can also be used for network interface and service availability monitoring, CDN service monitoring and selection, DNS resolution status, hijacking analysis, and many other scenarios.
To meet the dial test needs of more enterprises and independent webmasters, Cloud Dial Test is launching monthly resource packs of different specifications together with a limited-time promotion. New purchasers get a 10% discount.
Copyright Statement: The content of this article is contributed voluntarily by real-name registered users of Alibaba Cloud. The copyright belongs to the original author. The Alibaba Cloud Developer Community does not own the copyright and does not assume the corresponding legal responsibilities. For specific rules, please refer to the "Alibaba Cloud Developer Community User Service Agreement" and the "Alibaba Cloud Developer Community Intellectual Property Protection Guidelines". If you find suspected plagiarism in this community, fill in the infringement complaint form to report it. Once verified, the community will immediately delete the suspected infringing content.