Introduction: PTS draws on Alibaba's more than 10 years of experience in full-link stress testing. Alibaba Cloud users can enjoy a complete set of standard full-link stress testing, like sitting down to a full Manchu-Han banquet, or choose the approach that best suits their own needs.

Author: Zizhen

Review & Proofreading: Fengyun, Yufu

Editing & Typesetting: Wen Yan

Customer story

Full-link stress testing is hailed as the "nuclear weapon" of big-promotion preparation. If you have followed the technical summaries of Alibaba's Double 11, the term will be familiar: its appearance rate is almost 100%. Judging by its value to the stability of Double 11, calling full-link stress testing a "nuclear weapon" is no exaggeration.

Ahead of one big promotion, a well-known e-commerce platform also wanted to use full-link stress testing to eliminate risks in advance. But it ran into several difficulties:

  1. Full-link stress testing requires the participation of multiple roles: the business side, testing, operation and maintenance, R&D, and database teams all need to be involved. A mature organizational system like Alibaba's, one that can forcefully drive so many different roles, takes a long time to build up.
  2. Full-link stress testing often involves transforming the framework. This e-commerce platform's business is complex, so reorganizing its architecture and transforming its business is not realistic.

So, is there any way for this well-known e-commerce platform to use the full-link stress test within one week without performing business transformation or changing business deployment?

In the following content, we start with the principle of full-link stress testing and then introduce an "agile version" built on the same principle, a scheme that let this well-known e-commerce platform put full-link stress testing to use within 2 weeks.

Full-link stress test

First, let's look at Alibaba's full-link stress test and the problems it solves:

The problem that full-link stress testing actually solves is stress testing online. Online stress testing finds online problems fastest and most directly. However, it brings the problem of data pollution: how to distinguish stress test data from real data is a crucial point. So how does Alibaba do it? Let's look at the following picture:

[Figure 1]

Alibaba's full-link stress test is a mature and complex system covering scenario sorting, request construction, environment preparation, and load generation. However, for a cloud user, such a system takes a long time to build. So how can we let users benefit from this technology quickly and with agility?

Here, PTS distills the entire process and provides it as a standardized offering to users on the cloud. Users can adopt the complete full-link stress testing system directly, or customize the key steps (scenario sorting, request construction, test environment, load generation, and so on) to get the stress test effect they want.

Scenario sorting

A business scenario corresponds to the input requests of the stress test. Sorting it out is the first and most important step of stress testing. The most common approach is to sort out and summarize the URLs involved in the business. For example, the following figure summarizes a common scenario:

[Figure 2]

However, this is not enough. When several URLs are aggregated into a scenario, the ratios and time intervals between the URLs are also key to how well the scenario reflects the business. Take a common scenario as an analogy: one order may correspond to 10 user logins, each user views 4 products on average, each product view leads to 5 review views on average, and finally one user buys a product at 10 o'clock, when the big promotion starts.
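The ratios in this example can be worked out with a little arithmetic. The sketch below is only an illustration: the step names and the helpers `funnel_counts` and `request_mix` are hypothetical, and the numbers are the averages quoted above.

```python
from fractions import Fraction

def funnel_counts(logins, products_per_user, reviews_per_product, orders):
    """Return the number of requests of each type for one pass through the scenario."""
    product_views = logins * products_per_user
    review_views = product_views * reviews_per_product
    return {
        "login": logins,
        "view_product": product_views,
        "view_review": review_views,
        "place_order": orders,
    }

def request_mix(counts):
    """Convert absolute counts into each request type's share of total traffic."""
    total = sum(counts.values())
    return {name: Fraction(n, total) for name, n in counts.items()}

counts = funnel_counts(logins=10, products_per_user=4, reviews_per_product=5, orders=1)
mix = request_mix(counts)
for name, share in mix.items():
    print(f"{name}: {counts[name]} requests, {float(share):.1%} of traffic")
```

Under these assumptions, one order corresponds to 251 requests in total, and review views make up roughly 80% of them, which is why getting the ratios right matters as much as listing the URLs.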

Sorting out the relationships and timing between these URLs clearly requires a wealth of business knowledge. For this reason, PTS provides server-side traffic recording, which lets users record traffic and easily obtain the proportional relationships along different dimensions:

[Figure 3]

As shown in the figure above, users can clearly see the proportional relationships between URLs, the timing of each user's requests across URLs, and so on. Users can then tailor this data model to their needs.
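To make the idea of such a data model concrete, here is a minimal sketch of how URL proportions and per-user think time could be derived from recorded traffic. The record format (user, timestamp in seconds, URL) and the sample records are hypothetical; the article does not describe the format that PTS traffic recording actually produces.

```python
from collections import Counter, defaultdict

# Hypothetical recorded requests: (user_id, timestamp_seconds, url)
RECORDS = [
    ("u1", 0, "/login"),
    ("u1", 3, "/product"),
    ("u1", 8, "/review"),
    ("u2", 1, "/login"),
    ("u2", 5, "/product"),
    ("u1", 10, "/review"),
    ("u2", 9, "/order"),
]

def url_ratios(records):
    """Share of each URL among all recorded requests."""
    hits = Counter(url for _, _, url in records)
    total = sum(hits.values())
    return {url: n / total for url, n in hits.items()}

def mean_think_time(records):
    """Average gap in seconds between consecutive requests of the same user."""
    by_user = defaultdict(list)
    for user, ts, _ in records:
        by_user[user].append(ts)
    gaps = []
    for stamps in by_user.values():
        stamps.sort()
        gaps.extend(b - a for a, b in zip(stamps, stamps[1:]))
    return sum(gaps) / len(gaps) if gaps else 0.0

print(url_ratios(RECORDS))
print(mean_think_time(RECORDS))
```

The two results correspond to the two things the text calls out: the ratio between URLs and the time behavior between a user's requests.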

Test data construction

The next step is to construct user data. This step involves the most roles and is the most cumbersome. The whole data construction process consists of three steps, as shown in the following figure:

[Figure 4]

The first step is data discovery. Usually, we sort out the business manually to find all the tables it involves and analyze them. To avoid this trouble, PTS is integrated with DMS and provides a preview of the table structure, so that testers can easily see the structures associated with the scenario, which greatly improves efficiency.

[Figure 5(1)]

If this is still too complicated, PTS also provides a data recording tool. Once its agent is installed, all the tables involved in the business are recorded:

[Figure 5]

With these tools, testers can easily get the table information associated with the scene without the assistance of the DBA.

Data closure

With these data tables, and after analyzing the data closure on this basis, we can start to produce stress test data. Generally, there are three ways to create shadow data:

  1. Shadow library: map the full library to a shadow library. The advantage of this method is simplicity; the disadvantage is that it consumes more resources.
  2. Shadow table: each table in the table closure is associated with a shadow table through a naming rule. The advantage of this method is that it saves resources; the disadvantage is that the tables must be fully sorted out and mapped one-to-one.
  3. No new table: shadow data is stored in the same table as real data, with a large offset applied. This is introduced in the agile version later.

[Figure 6]

These three methods can be used in combination according to requirements.
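As a rough illustration of the first two approaches, the sketch below maps real database and table names to shadow counterparts by a naming rule. The `shadow_` prefix and all function names are assumptions made for this example, not a convention stated in the article.

```python
SHADOW_PREFIX = "shadow_"  # hypothetical naming rule

def shadow_database(db_name):
    """Approach 1: every table lives in a mirrored shadow database."""
    return SHADOW_PREFIX + db_name

def shadow_table(table_name, closure):
    """Approach 2: only tables in the data closure get a shadow twin."""
    if table_name not in closure:
        raise KeyError(f"{table_name} is not in the data closure; sort it out first")
    return SHADOW_PREFIX + table_name

def resolve(db_name, table_name, is_stress_test, closure, use_shadow_db=False):
    """Pick the (database, table) pair a statement should target."""
    if not is_stress_test:
        return db_name, table_name
    if use_shadow_db:
        return shadow_database(db_name), table_name
    return db_name, shadow_table(table_name, closure)

closure = {"orders", "users", "products"}
print(resolve("shop", "orders", True, closure))
print(resolve("shop", "orders", True, closure, use_shadow_db=True))
print(resolve("shop", "orders", False, closure))
```

The `KeyError` mirrors the drawback of the shadow-table approach: a table that was never sorted into the closure has no shadow to route to, whereas the shadow-library approach needs no per-table mapping at the cost of duplicating everything.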

Data import/hybrid

With these prerequisites in place, we can use DMS to import and generate data.

[Figure 7]

At this point, we have completed the two most complex steps in the full-link stress test: sorting out stress test scenarios and creating stress test data.

Next, we use data processing to turn these two elements into the final stress test data.

Data processing

Here we take the last step on the stress test data: data processing. That is, we make final adjustments to the business scenarios and stress test data according to our business model.

At this point, the stress test requests for the full-link stress test have taken shape. Next, we can begin to design how the stress test traffic behaves in the system under test.

Test environment

The stress test can be carried out in a simulation environment or an online environment. Each environment calls for different considerations in selecting and generating data, as shown below:

[Figure 8]

Simply put, the test environment focuses on stress testing single components, such as a microservice, an interface, or a single protocol (SQL, Redis); the pre-release environment (usually a VPC environment) focuses on link integration; and the production environment is the closest to real scenarios. Here, we only discuss the online production environment.

Traditional full-link stress test

The following figure briefly explains how the traditional full-link stress test works:

[Figure 9]

We can see that the traditional full-link stress test mainly uses traffic marking to distinguish stress test traffic from real traffic. For this to work, the stress test mark must be passed through transparently, layer by layer. When the traffic reaches the "write" layer, the deployed agent decides the "write" behavior based on the stress test mark: should it write to the real database, or to the shadow area? The principle is simple, but implementation still has many difficulties. The main issues are:

  1. If the framework used by the application is not standard, it needs to be adapted;
  2. Getting R&D teams to install the agent is a complicated process;
  3. Verifying the agent's coverage is complex.
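The marking-and-routing idea can be sketched as follows. This is a simplified, in-process illustration rather than the actual agent: the header name `x-stress-test`, the `shadow_` table prefix, and both functions are hypothetical.

```python
STRESS_HEADER = "x-stress-test"  # hypothetical mark carried by stress test traffic
SHADOW_PREFIX = "shadow_"        # hypothetical shadow table naming rule

def propagate(incoming_headers, outgoing_headers=None):
    """Copy the stress test mark onto a downstream call so it passes through every layer."""
    outgoing = dict(outgoing_headers or {})
    if incoming_headers.get(STRESS_HEADER) == "1":
        outgoing[STRESS_HEADER] = "1"
    return outgoing

def route_write(headers, table):
    """At the write layer, send marked traffic to the shadow table and real traffic to the real one."""
    if headers.get(STRESS_HEADER) == "1":
        return SHADOW_PREFIX + table
    return table

# A marked request travels gateway -> order service -> database layer.
gateway_headers = {STRESS_HEADER: "1", "user-agent": "load-generator"}
service_headers = propagate(gateway_headers, {"content-type": "application/json"})
print(route_write(service_headers, "orders"))            # marked traffic
print(route_write({"user-agent": "browser"}, "orders"))  # real traffic
```

If any layer fails to do what `propagate` does here, the mark is lost and the write lands in the real table. That is why adapting non-standard frameworks and verifying agent coverage take so much effort.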

Agile version of full-link stress test

If we want neither to transform the business nor to install an agent, how can we do this?

Let's look at the principle of sample testing. A common way to verify a program's correctness is to select a few specific real users' data for testing. If we replace these real users with fake users, one key condition must be met: the fake users, the business data involved in their business scenarios, and the related data those scenarios produce must all be identifiable.

For example, we simulate a fake user purchasing a fake product. The user and the product each carry a specific feature, and the browsing and purchase records that the fake user generates carry that user's ID in the database. On this premise, we can tell the dirty data apart from the real data.

[Figure 10]

For this kind of stress test, two things need attention:

  1. Find all the data tables involved in the business: see the PTS SQL recording function described earlier;
  2. Create shadow data: unlike the traditional full-link stress test, here we choose the third method, which is to apply a large offset within the same table instead of creating a shadow table or a shadow library. After the stress test is over, the database is inspected and cleaned up according to the characteristics of the shadow data.

This method relies on the user having a clear understanding of the business. The stress test data carries an obvious stress test identifier (an offset much larger than normal data), and every write involved in the stress test is marked with this offset. In this way, all the data generated by the stress test can be identified. After the stress test is over, the stress test data is cleaned up according to this feature.
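A minimal sketch of the offset approach, using an in-memory SQLite database so that it runs anywhere. The table layout, the offset value, and the function names are all assumptions for illustration; a real system would pick an offset that its own ID ranges can never reach.

```python
import sqlite3

ID_OFFSET = 9_000_000_000  # hypothetical: far above any real user or product ID

def is_shadow(entity_id):
    """Shadow data is recognizable purely by its ID range."""
    return entity_id >= ID_OFFSET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (user_id INTEGER, product_id INTEGER)")

# Real users and stress test users write to the same table.
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [
        (1001, 42),                       # real purchase
        (1002, 43),                       # real purchase
        (ID_OFFSET + 1, ID_OFFSET + 42),  # stress test purchase
        (ID_OFFSET + 2, ID_OFFSET + 43),  # stress test purchase
    ],
)

def count_shadow_rows(connection):
    """Inspect the table for rows produced by the stress test."""
    row = connection.execute(
        "SELECT COUNT(*) FROM purchases WHERE user_id >= ?", (ID_OFFSET,)
    ).fetchone()
    return row[0]

def clean_shadow_rows(connection):
    """Delete stress test rows after the test; returns the number of rows removed."""
    cursor = connection.execute(
        "DELETE FROM purchases WHERE user_id >= ?", (ID_OFFSET,)
    )
    connection.commit()
    return cursor.rowcount

print("shadow rows before cleanup:", count_shadow_rows(conn))
print("rows removed:", clean_shadow_rows(conn))
print("rows remaining:", conn.execute("SELECT COUNT(*) FROM purchases").fetchone()[0])
```

Because the fake user's ID is written into every record it generates, a single range condition is enough for both inspection and cleanup. The trade-off is that this only holds if every table touched by the scenario is known in advance.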

Choice of traffic engine

To better simulate user behavior, we often need to customize the regions that stress test traffic comes from. However, deploying stress test engines all over the country yourself is unrealistic. PTS lets users conveniently choose the regions from which to initiate traffic, as shown in the following figure:

[Figure 11]

Summary

PTS combines Alibaba's more than 10 years of experience in full-link stress testing. Alibaba Cloud users can enjoy a complete set of standard full-link stress testing, like a full Manchu-Han banquet, or choose the approach that best suits their own needs.

In addition, PTS has recently made its pricing more "agile" as well.

