Introduction: When an enterprise moves to the cloud, its cloud budget directly affects the priority, pace, and depth of the migration. The size of that budget depends on business development and on the capacity assessment of resource requirements. Accurate capacity evaluation makes cloud budget planning more rigorous and better matches the needs of each stage of business development. This article shares how enterprises should plan and implement capacity after the business goes to the cloud.
Author of this article: Alibaba Cloud technical expert Li Yuqian
Abstract
With the rapid progress of enterprise digital transformation and the cloud-native transformation of enterprise IT services, customers are going to the cloud at a faster pace, and the cloud budget directly affects the priority, pace, and depth of that move. How much budget to invest depends on business development; the other key factor is the capacity assessment of resource requirements.
Accurate capacity evaluation makes cloud budget planning more rigorous and better matches the needs of each stage of business development. This article shares how enterprises should plan and implement capacity after the business goes to the cloud.
1. Why do we need to plan for capacity
Enterprise digital transformation and the cloud-native transformation of enterprise IT services are advancing rapidly. For companies that are already on the cloud or are about to move to it, regular budget expenditures include digitalization and IT software service spending, part of which is the budget for cloud resources. One basis for calculating that budget is cloud capacity planning and implementation.
In everyday life, scenarios that need "capacity" planning are very common. For example, reservoir water storage is a typical dynamic "capacity" planning process: the storage volume has to be adjusted according to upstream and downstream water conditions. Likewise, during the epidemic, scenic spots required visitors to reserve in advance before buying tickets and entering, and the daily visitor cap was adjusted according to prevention and control requirements.
In the same way, business on the cloud develops and changes dynamically, and the computing resources that cloud products and services rely on need to be adjusted accordingly. We abstract the usage planning of computing resources as capacity planning.
Capacity planning is necessary after an enterprise goes to the cloud because its business develops dynamically, so the cloud computing resources the business depends on must be adjusted dynamically as well. Too many computing resources leave resources idle and waste money; too few degrade the response performance of business services and hold back business growth. So what problems arise if an enterprise goes to the cloud without capacity planning?
First, cost and business development may fall out of step. For example, when the business is growing rapidly, its demand for computing resources rises too. Without capacity planning, back-end service capacity may well fail to keep up when the business surges, which disrupts the continuous and steady development of the business and may even cause the golden window for growth to be missed.
In addition, Internet technology has greatly shortened the distance between service consumers and service providers, and cross-region high availability and stable performance have become a standing goal for service providers. One of the most direct ways to reach this goal is to keep redundant capacity across regions, so that when software or hardware fails, or in other emergencies, traffic can be switched to achieve disaster recovery.
To sum up: after an enterprise goes to the cloud, business capacity planning is a hard requirement, and it must be done continuously. Accurate capacity planning helps the business grow quickly and keeps computing power from becoming a bottleneck or obstacle. It also safeguards the high availability and stability of the enterprise's cross-region services.
2. Turning business needs into capacity planning
Capacity planning serves the business, and capacity planning divorced from the actual state of the business is meaningless. A reasonable approach is to formulate a capacity plan that matches business development, based on business characteristics and the goals of the current development stage.
For example, in company A, department B needs one office computer per person and currently purchases Alibaba Cloud's cloud desktop product. If department B's headcount is expected to grow by 10% this year, the planned number of cloud desktops also needs to grow by 10%. This example is intuitive and easy to understand. In practice, cloud capacity planning has to consider many more factors, which vary by industry and business characteristics. The following is a general breakdown, as shown in Figure 1, subdivided step by step from bottom to top.
Figure 1-Business-driven capacity planning
Factor 1: Overall development assessment of business needs
The overall development trend and evaluation of the enterprise's business is the foundation of all demand. Without a full evaluation of overall business development, no reasonable and effective capacity plan can be produced. Enterprises do not do capacity planning for its own sake; capacity planning serves business development. The overall business development assessment therefore sits at the bottom of the "pyramid."
Factor 2: Evaluation of the development of the cloud-native part of business requirements
Near the bottom of the "pyramid" is another layer: the development evaluation of the cloud-native part. Cloud-native service development directly determines the share of the budget that goes to cloud capacity planning. For the Internet industry, the main business may be cloud native; for traditional industries, if only the enterprise management information systems go to the cloud, the cloud-native part accounts for a small share.
Factor 3: Assessment of which cloud needs to protect first under a limited budget
For enterprises, the budget for each item is always limited, and limited resources should go first to key businesses so as to maximize the return on investment. For all cloud services, storage, database, and computing services are basic dependencies. Planning and investment for these three are generally guaranteed with high priority.
Factor 4: Assessment of the continuity requirements of the cloud-native part of the business
For enterprises, business continuity is crucial at every stage of development and operation, especially for critical business services. The capacity planning process therefore needs to evaluate how business continuity is reflected in the budget. For example, the computing resources that the core business relies on can be planned for deterministic delivery through yearly or monthly subscription instances, elastic resource guarantee services, and resource reservation services, so as to ensure service continuity.
Reference material: Resource Protection Service
https://help.aliyun.com/document_detail/193626.html
Factor 5: Assessment of the regional disaster recovery needs of the cloud-native part of the business
For enterprises, the regional focus of business services may differ at different development stages, so capacity planning needs to be region-aware. At the same time, high availability often relies on building disaster recovery capability across regions. The budget therefore needs to balance the needs of regional development.
Factor 6: Independent planning vs. comprehensive planning for business cloud-native requirements
Based on the previous five factors, the capacity assessment becomes more and more specific. Starting with factor 6, planning needs to consider the impact of specific operations. Independent planning and comprehensive planning rely on different inputs and produce different plans. For example, in the employee office scenario mentioned earlier, cloud desktops are relatively independent of one another, so they can be planned and delivered independently.
By contrast, large-scale web services rely on cloud databases, cloud storage, network bandwidth, and other services, so capacity must be evaluated and delivered as a whole package to avoid the weakest-link ("short board") effect. The assessment tools and plans to rely on also differ. For independent planning, a general estimate is relatively easy to give; for comprehensive planning, Alibaba Cloud's capacity planning service provides a complete set of solutions.
Reference: Capacity Planning Service
https://www.aliyun.com/service/capacity_planning
Factor 7: Evaluation of current discount information of different cloud service providers
Once business capacity planning has been broken down and the products and tools it depends on are clear, the next step is to take discount information into account.
Different cloud service providers run promotions and discounts for different regions and computing products. Evaluating them makes it possible to buy more, and more cost-effective, computing resources with the same budget. For example, the SavingPlan + CapacityReservation service launched by Alibaba Cloud delivers both cost savings and deterministic delivery of resources.
Factor 8: Evaluation of planned capacity delivery schedule
The capacity delivery schedule evaluation outputs specific planning information: when and where to deliver which computing resources, and with what budget. Delivering too early or too late may fail to match business development, and can even prevent the capacity plan from being carried out at all.
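The output of this step can be sketched as a small data structure. This is a minimal illustration only: the `DeliveryItem` fields, the region and instance names, and the budget figures are hypothetical placeholders, not values from Alibaba Cloud.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DeliveryItem:
    """One row of a capacity delivery schedule (hypothetical fields)."""
    deliver_on: date      # when to deliver
    region: str           # where to deliver
    instance_type: str    # which computing resource
    count: int            # how many instances
    unit_budget: float    # placeholder budget per instance

    @property
    def budget(self) -> float:
        return self.count * self.unit_budget


def budget_by_region(plan):
    """Sum the planned budget per region."""
    totals = {}
    for item in plan:
        totals[item.region] = totals.get(item.region, 0.0) + item.budget
    return totals


# Placeholder plan: all names and numbers are made up for illustration.
plan = [
    DeliveryItem(date(2022, 3, 1), "region-a", "4vcpu-16g", 20, 100.0),
    DeliveryItem(date(2022, 6, 1), "region-a", "4vcpu-16g", 10, 100.0),
    DeliveryItem(date(2022, 6, 1), "region-b", "2vcpu-8g", 8, 50.0),
]

print(budget_by_region(plan))
```

Keeping the time, place, resource, and budget in one record makes it easy to check a schedule against the regional and continuity requirements from the earlier factors.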
3. Mapping capacity planning to resource purchase volume
In the previous section, we described the factors to consider in capacity planning layer by layer, from the bottom up. The essence of planning and evaluation is to plan out the computing power required at the right time and place to meet the development needs of the business.
As shown in Figure 2, there are many ways to map specific requirements to computing power. The following assumes that the cloud service capacity needed for future business development is predictable. The predicted value is converted into demand for a specific number of resource instances, which in turn forms a specific purchase plan. The commonly used technical approaches for mapping planned capacity to resource purchase volume are described below.
Figure 2-Mapping business requirements to computing power requirements
Method 1: Linear mapping-horizontal scaling
The classic evaluation method is: total number of resource instances = total service request QPS / QPS supported by a single resource instance. When business growth requires more computing power, the total QPS changes, and the number of resource instances to add = additional QPS / single-instance QPS. This method corresponds to "horizontal scaling" in the field of resource scheduling. Alibaba Cloud services such as Auto Scaling support automatic horizontal scaling.
Reference material: Auto Scaling
https://help.aliyun.com/document_detail/25857.html
For more information about horizontal scaling, see the K8s HPA (Horizontal Pod Autoscaling):
https://kubernetes.io/zh/docs/tasks/run-application/horizontal-pod-autoscale/
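The linear mapping of Method 1 can be sketched in a few lines. Rounding up to a whole instance is an assumption added here (the formula in the text does not mention rounding), and the function names are hypothetical. The last function mirrors the replica formula documented for the Kubernetes HPA, ignoring its tolerance and stabilization settings.

```python
import math


def instances_needed(total_qps: float, qps_per_instance: float) -> int:
    """Total instances = total QPS / single-instance QPS, rounded up."""
    if qps_per_instance <= 0:
        raise ValueError("qps_per_instance must be positive")
    return math.ceil(total_qps / qps_per_instance)


def instances_to_add(additional_qps: float, qps_per_instance: float) -> int:
    """Instances to add = additional QPS / single-instance QPS, rounded up."""
    return instances_needed(additional_qps, qps_per_instance)


def hpa_desired_replicas(current_replicas: int,
                         current_metric: float,
                         target_metric: float) -> int:
    """desired = ceil(current_replicas * current_metric / target_metric)."""
    return math.ceil(current_replicas * (current_metric / target_metric))


print(instances_needed(10_000, 800))      # 12.5 rounds up to 13
print(instances_to_add(2_000, 800))       # 2.5 rounds up to 3
print(hpa_desired_replicas(4, 150, 100))  # 4 * 1.5 = 6
```

The single-instance QPS is the key input; it is usually measured by load testing one instance rather than taken from a specification sheet.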
Method 2: Linear mapping-vertical scaling
From the perspective of resources, vertical scaling is the counterpart of horizontal scaling. It adjusts the computing power of a single instance, and therefore the QPS a single instance can support (for example, shrinking an instance's resources reduces the QPS it supports), thereby adjusting the total number of resource instances and the total service request QPS. Vertical scaling of single instances is generally used in scenarios with fine-grained resource scheduling and mixed deployment of service loads.
Vertical scaling takes two forms. One is fixed (the specification stays unchanged after adjustment): for example, an instance is scaled down from 4 vCPUs to 2 vCPUs, and instances are then scaled horizontally at the 2 vCPU size. The other is non-fixed (a single computing resource scales elastically over a short period): for example, a running instance is "throttled" along some dimension, which adjusts the computing power of a single instance in specific scenarios.
For the business side, the visible instance specification does not change. In a typical resource model such as K8s, a CPU resource specification has two parameters, request and limit, which allow CPU usage to burst elastically. Another example is the Alibaba Cloud burstable performance instance, an instance type that uses CPU credits to guarantee computing performance. It suits scenarios where CPU usage is normally low but occasionally spikes.
Reference: Burstable performance instances
https://help.aliyun.com/document_detail/59977.html
For more information about vertical scaling, see GKE's VPA (vertical-pod-autoscaler): https://cloud.google.com/kubernetes-engine/docs/concepts/verticalpodautoscaler
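The fixed form of vertical scaling (4 vCPUs down to 2 vCPUs, then horizontal scaling at the new size) can be sketched as follows. The sketch assumes that single-instance QPS grows linearly with the vCPU count, which is a simplification; the per-vCPU figure of 250 QPS is a made-up placeholder.

```python
import math


def instances_for(total_qps: float, vcpus: int, qps_per_vcpu: float) -> int:
    """Instances needed at a given size, assuming QPS scales linearly with vCPUs."""
    qps_per_instance = vcpus * qps_per_vcpu
    return math.ceil(total_qps / qps_per_instance)


total_qps = 10_500    # placeholder demand
qps_per_vcpu = 250    # placeholder, would come from load testing

for vcpus in (4, 2):
    count = instances_for(total_qps, vcpus, qps_per_vcpu)
    print(f"{vcpus} vCPU: {count} instances, {count * vcpus} vCPUs in total")
```

Under these assumptions the 4 vCPU size needs 11 instances (44 vCPUs) and the 2 vCPU size needs 21 instances (42 vCPUs), which shows how a smaller instance size can reduce rounding waste in fine-grained scheduling.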
Method 3: Non-linear mapping-full-link evaluation
Large-scale Internet services, such as e-commerce transaction systems, have many business scenarios, many dependencies between businesses, and a large service scale. It is difficult to evaluate system capacity application by application; an overall capacity evaluation under a full-link stress test is required.
Alibaba Cloud's capacity planning service provides a full set of services, including:
- Service planning, providing business traffic analysis, data capacity analysis, message capacity analysis, database capacity analysis, and cluster capacity analysis;
- Execution after service planning, providing a full-link stress test plan, scenario traffic ratios and a scheduling plan, a rate limiting and degradation plan, and a drill plan.
The core value of full-link evaluation: it helps customers find the optimal, maximum, and failure pressure points of the system on the cloud, and protect it with degradation and rate limiting. Full-link evaluation is especially suitable for large-scale, complex scenarios.
Reference: Capacity Planning Service
https://www.aliyun.com/service/capacity_planning
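Why per-application evaluation falls short can be illustrated with a simplified bottleneck model. This is not how the Alibaba Cloud capacity planning service works; it is only a sketch that assumes each entry request fans out into a fixed number of calls to each downstream service. The service names and numbers are hypothetical.

```python
def max_entry_qps(capacity_qps: dict, calls_per_request: dict):
    """Return (sustainable entry QPS, bottleneck service) for a call chain.

    capacity_qps: QPS each service can handle on its own.
    calls_per_request: calls to each service caused by one entry request.
    """
    limits = {
        name: capacity_qps[name] / calls
        for name, calls in calls_per_request.items()
        if calls > 0
    }
    bottleneck = min(limits, key=limits.get)
    return limits[bottleneck], bottleneck


# Placeholder figures for illustration only.
capacity = {"gateway": 12_000, "order": 5_000, "inventory": 9_000, "database": 20_000}
calls = {"gateway": 1, "order": 1, "inventory": 2, "database": 5}

print(max_entry_qps(capacity, calls))
```

Here the database has the largest standalone capacity yet limits the whole link to 4,000 entry QPS, because each request touches it five times. Real systems add caching, retries, and shifting traffic ratios, which is why the limit is measured with full-link stress tests rather than computed.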
Method 4: Capacity prediction-automatic deployment
Unlike methods 1, 2, and 3, this method does not accurately assess future capacity changes beforehand. Instead, based on system load balancing and monitoring of the system QPS water level, resources are delivered automatically, including automatic horizontal scale-out and scale-in and delivery of instances across specifications. For example, Alibaba Cloud Elastic Container Instance (ECI) supports multi-specification instance delivery. Alibaba Cloud Operation Orchestration Service (OOS) provides automated operations on the cloud and can manage and execute tasks automatically. Customers define the tasks, their order, and their inputs and outputs in templates, and then run the templates to complete the tasks automatically. OOS works across products: you can use it to manage cloud products such as ECS, RDS, SLB, and VPC.
Reference material: Elastic Container Instance
https://help.aliyun.com/product/87486.html
Operation Orchestration Service
https://help.aliyun.com/document_detail/120556.html
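A water-level-driven scaling decision of the kind Method 4 describes can be sketched as follows. The thresholds, the target utilization, and the function name are assumptions chosen for illustration; they are not defaults of ECI, OOS, or Auto Scaling.

```python
import math


def scaling_decision(current_instances: int,
                     observed_qps: float,
                     qps_per_instance: float,
                     high_water: float = 0.8,
                     low_water: float = 0.3,
                     target: float = 0.6,
                     min_instances: int = 1) -> int:
    """Return the instance count to run, given the monitored QPS water level."""
    utilization = observed_qps / (current_instances * qps_per_instance)
    if low_water <= utilization <= high_water:
        return current_instances  # inside the band: do nothing
    # Outside the band: resize so utilization returns to the target level.
    desired = math.ceil(observed_qps / (qps_per_instance * target))
    return max(min_instances, desired)


print(scaling_decision(10, 9_000, 1_000))  # above high water: scale out
print(scaling_decision(10, 5_000, 1_000))  # within band: unchanged
print(scaling_decision(10, 2_000, 1_000))  # below low water: scale in
```

Using a band between the low and high water levels, instead of a single threshold, avoids repeated scale-out and scale-in when the load hovers around one value.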
In summary, the path from business requirements to resource capacity planning to resource capacity execution is shown in Figure 3.
Figure 3-Demand to capacity execution
4. Resource purchase plan
Once the resource purchase volume is clear, the specific purchase plan is as shown in Figure 4: computing resources are delivered deterministically along the timeline of business development.
Figure 4-Deterministic delivery of computing resources in the business development process
As introduced in the previous article, "Virtual IDC (Private Pool) Selection Guide on the Cloud in Three Typical Scenarios", business resource delivery involves daily stable demand, daily elastic demand, and sudden demand. Enterprises need to choose resource purchase plans that fit their business development characteristics and specific resource requirements in order to achieve cost savings and deterministic delivery of resources. For example, for periodic demand, occasional demand, or demand in a specific period, you can purchase elastic resource guarantee products and services. That article can take you from detailed capacity planning to a final purchase plan.
After the enterprise business goes to the cloud, resource capacity planning is required. Alibaba Cloud provides a wealth of product capabilities that support accurate capacity assessment and flexible purchasing. In particular, resource guarantee services, such as elastic guarantee and capacity reservation with immediate effect, support the deterministic delivery of resources, which strongly guarantees the continuity of business development.
This concludes this issue's sharing of best practices. The last article in the private pool on the cloud series is coming soon, so stay tuned.
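One way to separate the three kinds of demand before choosing purchase options is sketched below. The split rule (floor of demand as the stable layer, median daily peak as the usual elastic ceiling, anything above as sudden demand) is an assumption made for illustration, not a method from the referenced article, and the sample numbers are made up.

```python
from statistics import median


def split_demand(daily_profiles):
    """Split instance demand into stable, elastic, and burst layers.

    daily_profiles: one list of instance counts per day (for example, hourly samples).
    """
    stable = min(min(day) for day in daily_profiles)   # needed at all times
    daily_peaks = [max(day) for day in daily_profiles]
    typical_peak = median(daily_peaks)                  # usual daily high
    return {
        "stable": stable,
        "elastic": typical_peak - stable,               # recurring daily swing
        "burst": max(daily_peaks) - typical_peak,       # rare spikes above the usual high
    }


# Placeholder samples: three days, four samples per day.
days = [
    [10, 14, 18, 12],
    [10, 15, 18, 11],
    [11, 16, 30, 12],
]

print(split_demand(days))
```

Each layer can then be matched to a purchase option that fits its pattern, for instance long-term instances for the stable layer and resource guarantee products for the recurring or occasional layers.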
Related reading:
Best Practice丨Virtual IDC (Private Pool) Selection Guide on the Cloud in Three Typical Scenarios