Abstract: Traditional technology has already made an all-out effort on energy conservation. Yet as technology keeps evolving, traditional energy-saving techniques still run into problems. How can we break through them?

This article is shared from the Huawei Cloud Community post "Data Center Energy Saving? Come and try Huawei's NAIE data center energy-saving technology!", original author: Qiming.

1. Three years of electricity bills could build another data center!

1.1 Technology drives sustained, rapid growth of the data center market

As is customary, let's first introduce (courtesy of Baidu Baike) the "data center": a data center is a globally coordinated network of specific equipment used to transmit, accelerate, display, compute, and store data over Internet infrastructure. The main purpose of a data center is to run applications that process business and operational data.

Today, we live in a fully connected world. According to data from Huawei GIV, between 2015 and 2025 the number of connected smart terminals worldwide will surge from 7 billion to 40 billion, and the number of global connections will surge from 20 billion to 100 billion. Behind this rapid increase in hardware and connections is explosive growth in data traffic: annual data traffic will grow 20-fold, from 9 ZB to 180 ZB (see Figure 1).
image.png

Figure 1: Data from Huawei GIV

With the rapid growth of data traffic, coupled with strong government support for emerging industries, data center construction is entering a period of rapid development. According to MarketsAndMarkets, the value of the global data center market will grow from US$13.07 billion in 2017 to US$46.50 billion in 2022 (see Figure 2), a CAGR (Compound Annual Growth Rate) of 28.9%. Its market size and market value speak for themselves.
image.png

Figure 2: Data from MarketsAndMarkets
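
As a quick sanity check on the growth rate quoted above, the CAGR can be recomputed from the two market values. This is only a sketch of the standard compound-growth formula; the function name `cagr` is chosen here for illustration.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the text: US$13.07 billion (2017) to US$46.50 billion (2022).
rate = cagr(13.07, 46.50, 2022 - 2017)
print(f"CAGR: {rate:.1%}")  # prints "CAGR: 28.9%"
```

The result agrees with the 28.9% figure attributed to MarketsAndMarkets.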

1.2 High power consumption, the "shadow behind" the data center industry

"There is always a shadow behind the sun." Behind the high industrial value lies high power consumption. Picture a data center: a large machine room densely packed with cabinets, servers, and other equipment. The up-front infrastructure investment is already huge, and once the facility is running, the electricity bill is astronomical. We can look at the 10-year operating cost of a large data center to see how much goes to power:
image.png

As can be seen from the table above, the data center costs nearly 36 million a year to run, of which 70% goes to electricity; of that 70%, 19% is used for cooling. According to 2017 statistics, data centers accounted for 3% of global electricity consumption, with an annual growth rate of more than 6%, equivalent to the output of 30 nuclear power plants. Data centers in China alone consume 120 billion kWh per year, more than the Three Gorges Power Station generated in 2017 (100 billion kWh). By this calculation, three years of a data center's electricity bills would pay for building another data center!

1.3 With external policies and operational challenges, energy conservation has become inevitable for the data center industry

The electricity figures behind data centers are so striking that national-level policies now set strict requirements for energy efficiency. For example, the Ministry of Industry and Information Technology's "Green Data Center Guidance" requires new data centers to have a PUE below 1.4. Beijing, Shanghai, Shenzhen, and other cities also have their own PUE planning requirements; in particular, the Shenzhen Development and Reform Commission encourages newly built data centers to reach a PUE below 1.25, which is a very challenging number. The European Union and the United States have their own PUE regulations as well. After all, saving energy means reducing costs and increasing profits.

To solve the energy consumption problem, we first need to express it as a formula, and then lower or raise one of the terms in that formula to reduce energy consumption. That formula is the PUE calculation mentioned earlier.

PUE stands for Power Usage Effectiveness. PUE = total energy consumption of the data center / energy consumption of IT equipment, where the total energy consumption includes the IT equipment itself plus cooling, power distribution, and other systems. In practice the PUE is always greater than 1. For example, PUE = 2 means that for every 1 watt consumed by IT equipment, an additional 1 watt is consumed by power distribution and cooling. In the ideal case, where all electricity goes to IT equipment, that is, to production, the PUE equals 1.
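
The definition above translates directly into a few lines of code. This is a minimal sketch; the function names and the sample energy figures are illustrative assumptions, not part of any NAIE API.

```python
def pue(total_energy_kwh: float, it_energy_kwh: float) -> float:
    """PUE = total data center energy / IT equipment energy."""
    if it_energy_kwh <= 0:
        raise ValueError("IT energy must be positive")
    if total_energy_kwh < it_energy_kwh:
        raise ValueError("total energy includes IT energy, so it cannot be smaller")
    return total_energy_kwh / it_energy_kwh


def overhead_per_it_watt(pue_value: float) -> float:
    """Extra watts spent on cooling, power distribution, etc. per watt of IT load."""
    return pue_value - 1.0


# Hypothetical example: 2000 kWh consumed in total, 1000 kWh of it by IT equipment.
example_pue = pue(2000.0, 1000.0)
print(example_pue, overhead_per_it_watt(example_pue))  # prints "2.0 1.0"
```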

The following figure shows the energy-consuming units of a data center in detail:
image.png

It can be seen that the energy-consuming units of a data center include chillers, water pumps, IT equipment, fans, fresh-air systems, lighting, and so on. The energy consumption of all these units sits in the numerator of the PUE formula. The closer the PUE is to 1, the higher the efficiency and the more electricity and money saved. So to save electricity, we naturally start with the numerator, specifically its non-IT part (mainly cooling).
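
To make the "start with the numerator" argument concrete, the sketch below computes PUE from a per-unit power breakdown and then recomputes it after trimming chiller power. All power figures are made-up illustrative values, not measurements from a real site.

```python
# Hypothetical average power draw per unit, in kW (illustrative numbers only).
power_kw = {
    "it_equipment": 1000.0,
    "chillers": 250.0,
    "water_pumps": 60.0,
    "fans": 50.0,
    "fresh_air_and_lighting": 20.0,
    "power_distribution": 40.0,
}


def pue_from_breakdown(breakdown: dict) -> float:
    """Sum every unit for the numerator; IT equipment alone is the denominator."""
    return sum(breakdown.values()) / breakdown["it_equipment"]


baseline = pue_from_breakdown(power_kw)

# Assume tuning lets the chillers draw 20% less power while the IT load is unchanged.
tuned = dict(power_kw, chillers=power_kw["chillers"] * 0.8)
improved = pue_from_breakdown(tuned)

print(f"baseline PUE {baseline:.2f}, after chiller tuning {improved:.2f}")
```

Because the IT load is fixed by the business, every kilowatt removed from the cooling side lowers the PUE directly.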

1.4 Back to principles: how a data center is cooled

Before thinking of a solution, let's take a look at the cooling principle of the data center (the figure below is a schematic diagram of cooling).
image.png

The whole system can be divided into two parts: the chiller plant and the end-side equipment room. To the left of the dotted line is the chiller plant, which includes cooling towers, chillers, water pumps with various functions, and storage tanks for chilled water. To the right of the dotted line is our IT equipment room, which, in addition to the server cabinets, also has air conditioners that blow out cold air. The cold source for these air conditioners is the chiller plant on the left.

Put simply, the job of the whole refrigeration system is to move the heat emitted by the servers in the IT equipment room to the outdoors. Its power-consuming units are also easy to spot in the diagram: the cooling tower, cooling pump, chiller, and air conditioner.

Of course, the picture above is only a simplified schematic; a real refrigeration system is far more complicated. So how can we save energy in such a complex system?

1.5 The limitations of traditional energy-saving technologies as technology evolves

In fact, traditional technology has already made an all-out effort on energy conservation. However, as technology keeps evolving, traditional energy-saving technologies still have the following problems:

  • Product-level energy-saving technology is close to its ceiling;
  • The system is complex and has many devices whose energy consumption is intricately interrelated, which makes it hard to model with traditional engineering formulas. Traditional control methods operate independently of one another, and the gains from expert experience have reached their limit;
  • Each data center has a unique environment and architecture. Although many engineering practices and rules of thumb apply broadly, a model customized to one system's operation does not guarantee success on another system.

2. How NAIE data center energy-saving technology helps save energy

2.1 Industry consensus, AI helps data center energy saving

As mentioned earlier, traditional energy-saving technologies can no longer meet the energy-saving needs of data centers, so the industry began looking for new approaches.

Today, the industry consensus is to use AI to tune the entire refrigeration system so that the operating states of all devices match one another and the system as a whole reaches its best state. According to a Gartner user survey, by 2020, 30% of data centers that are not prepared for artificial intelligence will no longer be economically viable. The survey also lists three ways in which artificial intelligence can improve the daily operations of data centers:

  • Use predictive analysis to optimize workload distribution and to balance storage and computing load;
  • Use machine learning algorithms to handle transactions in the best way, and use artificial intelligence to optimize data center energy consumption;
  • Use artificial intelligence to alleviate staff shortages and to apply system updates and security patches automatically.

The best-known example of using AI to tune a refrigeration system is the collaboration between Jim Gao and the DeepMind team. They used separate neural networks to predict the PUE, the data center temperature, and the load pressure, and used these predictions to control about 120 data center variables and thereby reduce the PUE.

The industry has thus already seen a very successful application of AI to data center energy saving. Next, let's look at how NAIE data center energy-saving technology contributes!

2.2 Huawei NAIE Data Center Energy Saving Technology

"Energy saving" is a very big topic, and NAIE data center energy saving covers many aspects. This introduction focuses on energy saving in the refrigeration system, for which NAIE has the following 4 "means":

2.2.1 Original data feature engineering

A data center's refrigeration system generally has a complicated piping layout and many refrigeration devices (water pumps, cooling towers, and so on), plus countless sensors. In addition, data centers differ from one another in many respects depending on their location, which leads to different piping and equipment.

We can shield the AI algorithms from these differences in the data. First, feature engineering handles complex piping structures such as single pipe, main pipe, and ring pipe layouts. Second, we look for ways to extract unified features across different control schemes. Third, for different equipment such as cooling towers, chillers, heat exchangers, water pumps, and air conditioners, we extract features that are as similar as possible. Finally, the data is verified: missing data is filled in, wrong data is corrected, and abnormal samples are deleted.

Through feature engineering, then, we can turn the data collected on site into a more uniform form and pass it on to the subsequent AI algorithms.
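
The final verification step described above (fill missing data, correct wrong data, delete abnormal samples) could look roughly like the sketch below. It is not NAIE's implementation: the sensor names, the valid ranges, the use of the median as a replacement value, and the `max_bad_fields` rule are all assumptions made for illustration.

```python
from statistics import median

# Hypothetical sensor fields and plausible physical ranges (assumptions for illustration).
VALID_RANGE = {
    "chilled_water_temp_c": (5.0, 25.0),
    "pump_freq_hz": (0.0, 50.0),
}


def is_valid(sample, field):
    """A reading is usable if it is present and inside its plausible range."""
    value = sample.get(field)
    low, high = VALID_RANGE[field]
    return value is not None and low <= value <= high


def clean_samples(samples, max_bad_fields=1):
    """Fill or correct isolated bad readings; drop samples with too many of them."""
    # Median of the usable readings per field, used as the replacement value.
    # Assumes every field has at least one usable reading.
    medians = {
        field: median(s[field] for s in samples if is_valid(s, field))
        for field in VALID_RANGE
    }
    cleaned = []
    for sample in samples:
        bad_fields = [f for f in VALID_RANGE if not is_valid(sample, f)]
        if len(bad_fields) > max_bad_fields:
            continue  # treat as an abnormal sample and delete it
        fixed = dict(sample)
        for field in bad_fields:
            fixed[field] = medians[field]  # fill missing or correct wrong data
        cleaned.append(fixed)
    return cleaned


raw = [
    {"chilled_water_temp_c": 10.0, "pump_freq_hz": 40.0},
    {"chilled_water_temp_c": None, "pump_freq_hz": 42.0},   # missing reading
    {"chilled_water_temp_c": 12.0, "pump_freq_hz": 500.0},  # out-of-range reading
    {"chilled_water_temp_c": None, "pump_freq_hz": -3.0},   # nothing usable
]
cleaned = clean_samples(raw)
```

A production pipeline would likely use time-aware interpolation rather than a global median, but the three outcomes (fill, correct, delete) are the same.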

2.2.2 Energy consumption prediction and safety assurance model

To save energy, you first need an energy consumption prediction model. A good model is a good starting point for predicting how adjustments to the refrigeration system will affect energy use. However, a predictive model for industrial control differs in one important way from a model that predicts stock trends or subway passenger flow: safety control. After all, production safety is the first priority; saving electricity and money comes second.

Therefore, the NAIE data center energy-saving prediction model is not a single, independent model but a set of models: it predicts not only the energy consumption after an adjustment, but also the state of each system. All systems must be confirmed to be in a normal state before any energy saving is attempted.
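
One way to picture "a set of models with safety first" is a gate that only returns an energy prediction when every predicted system state is within its limits. The sketch below uses toy stand-in functions in place of trained models; the parameter name, the state name, and the limits are hypothetical.

```python
def evaluate_candidate(params, energy_model, state_models, safe_limits):
    """Return predicted energy if every predicted state is within limits, else None."""
    for name, model in state_models.items():
        low, high = safe_limits[name]
        if not low <= model(params) <= high:
            return None  # unsafe: reject before energy is even considered
    return energy_model(params)


# Toy stand-ins for trained models (assumptions, not real physics).
def toy_energy_kw(params):
    # A higher chilled-water setpoint means less chiller work in this toy model.
    return 500.0 - 10.0 * params["chilled_water_setpoint_c"]


def toy_cold_aisle_temp_c(params):
    return params["chilled_water_setpoint_c"] + 10.0


state_models = {"cold_aisle_temp_c": toy_cold_aisle_temp_c}
safe_limits = {"cold_aisle_temp_c": (18.0, 27.0)}

safe_result = evaluate_candidate(
    {"chilled_water_setpoint_c": 12.0}, toy_energy_kw, state_models, safe_limits
)
unsafe_result = evaluate_candidate(
    {"chilled_water_setpoint_c": 20.0}, toy_energy_kw, state_models, safe_limits
)
print(safe_result, unsafe_result)  # prints "380.0 None"
```

The second candidate would use less energy, but it is rejected because the predicted cold-aisle temperature leaves the safe range.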

2.2.3 Optimization of control parameters

The first two "means" lay a solid foundation for the energy-saving algorithm; with the third, results start to appear. How good the control parameters we find are depends entirely on the quality of this third "means". The "energy consumption prediction and safety assurance model" provides good predictions of energy consumption and system state, which can be pictured as a hypersurface (as shown in the figure below). Of course, its shape cannot really be drawn and is hard to imagine, because we are solving a problem in a high-dimensional space, and the hypersurface has many holes that represent unsafe control parameters. The purpose of the third "means" is to find better, or optimal, control parameters quickly and reliably, and to send them to the equipment for execution.
[Figure not available: illustration of the energy consumption hypersurface]
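
The idea of searching a surface with unsafe "holes" can be illustrated with a plain random search that skips unsafe candidates. The article does not say which optimizer NAIE uses, so random search is a stand-in here, and the two-parameter energy surface and safety rule below are toy assumptions.

```python
import random


def predicted_energy_kw(setpoint_c, pump_hz):
    """Toy energy surface with its unconstrained minimum at (14.0, 35.0)."""
    return (setpoint_c - 14.0) ** 2 + 0.5 * (pump_hz - 35.0) ** 2 + 300.0


def is_safe(setpoint_c, pump_hz):
    """Toy 'hole': a warm setpoint combined with a slow pump is treated as unsafe."""
    return not (setpoint_c > 13.0 and pump_hz < 38.0)


def search_best(n_trials=2000, seed=0):
    """Random search that discards unsafe candidates and keeps the lowest energy."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        candidate = (rng.uniform(7.0, 18.0), rng.uniform(25.0, 50.0))
        if not is_safe(*candidate):
            continue
        energy = predicted_energy_kw(*candidate)
        if best is None or energy < best[0]:
            best = (energy, candidate)
    return best


best = search_best()
print(f"best safe energy: {best[0]:.1f} kW at setpoint/pump {best[1]}")
```

In this toy problem the unconstrained optimum falls inside the unsafe hole, so the search settles on the edge of the safe region instead, which is the behaviour the third "means" needs in a real, high-dimensional setting.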

2.2.4 NAIE Cloud-Ground Collaboration

Cloud-ground collaboration is an automated service that connects the cloud and the ground. It handles data collection and upload to the cloud, daily model evaluation, retraining, and model updates.

In brief: data collection supplies new samples; daily model evaluation decides when an update is needed; and retraining rebuilds the model, so that model updates end up fully automatic. (See the framework diagram below.)
image.png

On the cloud side, NAIE's cloud-ground collaboration includes the NAIE data lake, the data center PUE optimization model generation service, and the AI market (which manages the generated model packages). On the customer network side, there is a network AI framework (the platform that runs the models produced by the model generation service). This local network AI framework is responsible for collecting and managing samples, and for continuously evaluating the generated model against new samples. If it finds that the distribution of the collected samples has changed significantly, or that the model accuracy keeps falling short of the standard, it triggers a rebuild of the model.
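
The two retraining triggers described above (a shift in sample distribution, or accuracy that keeps missing the standard) might be sketched as follows. The drift test (mean shift measured in training standard deviations), the thresholds, and the function names are assumptions chosen for illustration, not the framework's actual logic.

```python
from statistics import mean, pstdev


def distribution_shifted(train_values, new_values, z_threshold=2.0):
    """Flag drift when the new mean moves too far from the training mean."""
    spread = pstdev(train_values)
    shift = abs(mean(new_values) - mean(train_values))
    if spread == 0:
        return shift > 0
    return shift / spread > z_threshold


def accuracy_below_standard(daily_errors, max_error=0.05):
    """True only if every recent daily evaluation missed the accuracy target."""
    return bool(daily_errors) and all(e > max_error for e in daily_errors)


def should_rebuild_model(train_values, new_values, daily_errors):
    return distribution_shifted(train_values, new_values) or accuracy_below_standard(
        daily_errors
    )


# Hypothetical outdoor temperatures (deg C) seen in training vs. newly collected.
train_temps = [20.0, 21.0, 19.0, 20.0, 22.0]
stable = should_rebuild_model(train_temps, [20.0, 21.0, 20.0], [0.02, 0.03])
drifted = should_rebuild_model(train_temps, [30.0, 31.0, 29.0], [0.02, 0.03])
inaccurate = should_rebuild_model(train_temps, [20.0, 21.0, 20.0], [0.08, 0.09, 0.07])
print(stable, drifted, inaccurate)  # prints "False True True"
```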

At the same time, the network AI framework is connected with the actual control system of the data center through Huawei's Cloud Opera Neteco system. In this way, the control parameters generated by the model can be directly delivered to the actual group control system.

2.3 NAIE makes data center energy saving hard to beat

With NAIE's support, the annual PUE of one Huawei data center dropped by 0.12 compared with before AI was used. Converted into power, this means a reduction of 328.6 kilowatts in each sampling period. On that basis, it can save 5.8 million yuan in electricity bills a year, a considerable figure.

NAIE model generation service
Different data centers may differ in cooling mode (water-cooled, air-cooled, AHU, etc.), pipe type (main pipe, single pipe, mixed pipe), and so on. Where should we start?

This is where the "feature engineering" mentioned earlier comes in. As noted before, its purpose is to shield the AI algorithms from these many differences and to form unified features as far as possible.

Ordinary modeling (as shown in the figure below) is aimed at developers: going from energy-saving modeling to model application requires 4 developers and takes 6 months.
image.png

Backed by "feature engineering" and the experience of veteran experts, NAIE has prepared the groundwork for everyone. Let's look at NAIE's highlights and advantages:

  1. Zero-coding efficient modeling: Based on Huawei's data center topology templates, AI model training platform, and PUE feature/algorithm library, energy engineers only need to provide infrastructure operating data and refrigeration equipment process parameters; an AI model matched to the data center can then be generated online without any coding. Model development time drops from 6 months to 1 person-month, and the overall investment in model development is reduced by more than 95%;
  2. Flexible and visual parameter configuration: Based on Huawei's visual parameter configuration for the data center domain, adjusting the parameters generates PUE models for data centers under different topology templates;
  3. Comprehensive control strategy: By importing all PUE-related parameters of the data center infrastructure, the model can infer a control strategy for the complete set of refrigeration equipment, such as chillers, cooling pumps, cooling towers, chilled-water pumps, and plate heat exchangers, helping energy engineers adjust the refrigeration system flexibly and accurately to reach the best energy consumption state;
  4. Good optimization effect: Thanks to professional feature recognition and processing, the model fits well. Provided the data is sufficient in quantity and quality, the PUE prediction accuracy reaches 95%.

Through the official website of the data center PUE optimization model generation service ( https://console.huaweicloud.com/naie/products/dpo ), you can quickly experience the service: Click "Function Demo":
image.png

Enter the service introduction page and follow the instructions step by step to quickly and conveniently experience the data center PUE optimization model generation service.
image.png

The data center PUE optimization model generation service combines AI technology with data center engineering experience and provides automated modeling tools (such as data center topology templates, a PUE feature/algorithm library, and a model training platform). With no prior background and no coding, data center engineers only need to enter the operating data of the data center infrastructure to get an effective PUE optimization model online. Why not give it a try?


