
In Greek mythology, Poseidon, god of the sea and of the harvest, wielded a trident to escort ships and bring clear springs to farmers. Since then, the trident has often been used to describe three things combining closely into a single force. Many football teams, for example, have fielded a classic three-striker "trident".

In the wave of enterprise cloud migration and industrial intelligence, there is a trident of cutting-edge technology trends as well. Moving to the cloud is becoming the digital development path for most businesses and organizations. Computing power is gradually becoming a strategic resource for companies, and the near-infinite cluster computing power of cloud computing lets more and more industries and scenarios build their innovations on cloud-based high-performance computing. AI is changing production methods across thousands of industries and becoming a pioneer in scientific research and industrial exploration, while machine learning and deep learning have brought an explosion in demand for specialized AI computing power.

In many people's minds, the three prongs of high-performance computing (HPC), AI, and cloud services are still developing independently and in parallel. Running high-performance computing in the cloud, in particular, still seems too avant-garde.

From the standpoint of industrial efficiency, however, high-performance computing in the cloud that enables high-quality, efficient AI training and deployment is the general trend toward more intensive industry and lower-cost innovation across society. Only when these three technologies are tightly integrated can the digital trident required by the intelligent age be forged.

On the question of how to forge this trident of the times, Amazon Cloud Technology already has some answers.

Going to the Cloud: Industry Trends and Challenges of High-Performance Computing

Are cloud computing and high-performance computing really incompatible? Probably not.

According to Hyperion Research market data, 18.8% of HPC workloads are projected to be running in the cloud by the end of 2022, up from 12.3% in 2021. Although most HPC tasks still rely on supercomputing centers and on-premises hardware, obtaining high-performance computing from the cloud is the general trend of industrial development. At this stage, customers who want HPC in the cloud are concerned about several challenges. The first is management: large-scale computing clusters are difficult to create and manage, so customers ask whether there are fast deployment methods and efficient, convenient management tools. The second is energy efficiency, or cost performance: how to get the most out of HPC in the cloud is a worry for many users. The third is security: a large share of HPC tasks are inseparable from their data, so data security concerns are inevitable, and the question is how to deliver data security in the cloud in an environment users can trust.

Judging by how the high-performance computing industry is developing, however, these problems can be solved one by one in practice. In terms of basic computing economics, obtaining high-performance computing from the cloud is more economical, and users can flexibly obtain heterogeneous computing resources so that computing power truly matches the task. At the level of a single node, cloud computing resources offer better performance; in a cluster scenario, the cloud lets users obtain linearly increasing computing performance and avoid wasting computing power.
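Whether a cluster really delivers "linearly increasing" performance can be checked by measuring speedup and parallel efficiency. The following is a minimal sketch, not a benchmark: the function names and all timings are hypothetical and chosen only to illustrate the calculation.

```python
def speedup(t_single: float, t_cluster: float) -> float:
    """How many times faster the cluster run is than the single-node run."""
    return t_single / t_cluster


def parallel_efficiency(t_single: float, t_cluster: float, nodes: int) -> float:
    """Speedup divided by node count; 1.0 means perfectly linear scaling."""
    return speedup(t_single, t_cluster) / nodes


# Hypothetical wall-clock times (seconds) for the same job on 1, 2, 4 and 8 nodes.
timings = {1: 800.0, 2: 400.0, 4: 200.0, 8: 125.0}

for nodes, seconds in timings.items():
    s = speedup(timings[1], seconds)
    e = parallel_efficiency(timings[1], seconds, nodes)
    print(f"{nodes} node(s): speedup {s:.2f}x, efficiency {e:.0%}")
```

In this made-up data set, scaling is linear up to 4 nodes and drops to 80% efficiency at 8 nodes, which is the kind of wasted computing power the text refers to.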

So high-performance computing in the cloud is not impossible. On the contrary, the cloud's massively scalable computing power, the ever-increasing performance of its nodes, convenient and efficient ways of managing computing power, and the security of cloud-native systems and data are what enable high-performance computing in many industries to run in the cloud.

In exploring how to obtain reliable HPC in the cloud, Amazon Cloud Technology has done industry-leading work.

Technology Convergence and Industry Balance: High Performance Computing Exploration of Amazon Cloud Technology

At present, Amazon Cloud Technology already provides a highly customizable HPC computing platform, offering users a variety of heterogeneous computing resources and customized computing instances. It is particularly worth noting that Amazon Cloud Technology, known for its rich software ecosystem, also provides a large amount of readily available, low-cost software in the HPC field to help users solve problems in areas such as management and scheduling.

In general, Amazon Cloud Technology's HPC exploration stands out in two core ways: a high degree of integration of technical experience across chips, cloud, storage, software, AI, and other fields; and a large, highly industry-oriented software and hardware ecosystem built around industry needs and user pain points.

Across the compute, network, storage, and application software ecosystem that high-performance computing customers care about, Amazon Cloud Technology provides mature HPC-related services.

At the computing power layer, Amazon Cloud Technology provides diversified heterogeneous computing support, including CPU, GPU, and ARM, as well as customized elastic computing instances, to meet users' computing resource needs in high-intensity HPC tasks such as AI.

At the storage layer, the demands of clustered computing lead to massive, highly concurrent access to storage, which makes storage performance critical. Amazon Cloud Technology provides storage support for high-performance computing scenarios and can implement multi-tier file storage strategies in the cloud, helping users plan storage usage flexibly according to their computing needs, thereby reducing the storage cost of cloud HPC and improving the efficiency of data access and management.

In the cloud network, Amazon Cloud Technology provides the sustained low-latency, high-bandwidth network environment that supercomputing applications require. To accelerate communication between nodes, users can take advantage of up to 100 Gbps of bandwidth, the EFA (Elastic Fabric Adapter) network interface, which supports MPI, and the SRD (Scalable Reliable Datagram) protocol, which offers low latency and reduced network jitter.
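To see why both latency and bandwidth matter for inter-node communication, a simple first-order model treats the time to send one message as a fixed latency plus the payload size divided by the link bandwidth. The sketch below uses the 100 Gbps figure from the text; the 20-microsecond latency and the message sizes are assumptions made up for illustration, not measured EFA or SRD numbers.

```python
def transfer_time_s(message_bytes: int, bandwidth_gbps: float, latency_us: float) -> float:
    """First-order estimate: fixed latency plus serialization time on the link."""
    bits = message_bytes * 8
    return latency_us * 1e-6 + bits / (bandwidth_gbps * 1e9)


BANDWIDTH_GBPS = 100.0   # peak figure quoted in the text
LATENCY_US = 20.0        # hypothetical per-message latency

# Hypothetical message sizes: a tiny control message and a large data exchange.
for size in (8, 1_000_000, 1_000_000_000):
    t = transfer_time_s(size, BANDWIDTH_GBPS, LATENCY_US)
    print(f"{size:>13,d} bytes: {t * 1e3:.4f} ms")
```

Under this model, small messages are dominated by latency and large ones by bandwidth, which is why tightly coupled MPI jobs, which exchange many small messages, care about both.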

At the software layer, Amazon Cloud Technology provides rich, low-cost software tools for HPC scenarios such as migration, scheduling, and visualization. For example, Amazon Cloud Technology ParallelCluster can quickly build an HPC computing environment and simplify the deployment and management of HPC clusters. Amazon Cloud Technology Step Functions is a low-code, visual workflow service that helps developers build distributed applications, automate IT and business processes, and build data and machine learning pipelines, reducing overall development costs, which matters a great deal for high-performance computing tasks in fields such as AI. This rich, professional, low-threshold software ecosystem helps high-performance computing users avoid large custom software development costs and realize industrial-grade high-performance computing applications.
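As an illustration of what a Step Functions pipeline looks like, the sketch below builds a small workflow definition in Amazon States Language, the JSON format Step Functions uses, and checks that its states chain together. The three step names and the resource ARNs are hypothetical placeholders; the original article does not describe any specific workflow.

```python
import json

# Hypothetical three-step HPC pipeline. State names and ARNs are placeholders.
definition = {
    "Comment": "Illustrative HPC pipeline (hypothetical)",
    "StartAt": "PrepareInput",
    "States": {
        "PrepareInput": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:prepare-input",
            "Next": "RunSimulation",
        },
        "RunSimulation": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:run-simulation",
            "Next": "PublishResults",
        },
        "PublishResults": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCOUNT_ID:function:publish-results",
            "End": True,
        },
    },
}


def execution_order(defn: dict) -> list[str]:
    """Follow Next links from StartAt and return state names in order."""
    states = defn["States"]
    order: list[str] = []
    name = defn["StartAt"]
    while True:
        if name not in states:
            raise KeyError(f"undefined state: {name}")
        if name in order:
            raise ValueError(f"loop detected at state: {name}")
        order.append(name)
        if states[name].get("End"):
            return order
        name = states[name]["Next"]


print(" -> ".join(execution_order(definition)))
print(json.dumps(definition, indent=2))
```

The printed JSON is the kind of document that would be supplied as a state machine definition; a real pipeline would replace the placeholder ARNs with actual resources.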

Amazon Cloud Technology's diversified high-performance computing work has made it possible to obtain surging cluster computing power from the cloud. The direct effect is to lay the foundation for a wave of large-scale AI applications.

Smart Dawn: The wave of computing brought about by the AI big voyage

As pre-trained large models and AI for scientific computing have become mainstream in the industry, the computing power required for AI training and deployment has surged, and AI tasks depend more and more on high-performance computing. It is fair to say that the dawn of industrial intelligence must be built on a solid base of HPC computing power.

AI tasks with complex structures and huge amounts of data, such as new drug research and development, scientific research, and geological exploration, are on the rise, and they place a series of new demands on HPC. For example, requirements for cluster scale keep growing, requirements for heterogeneous computing capabilities are more stringent, and requirements for data throughput and throughput efficiency keep rising. In such an era of the "AI big voyage", if enterprises and research institutions continue to build their own hardware computing pools to realize HPC, industrial efficiency will clearly be low and the overall cost huge, and a physical cluster takes a long time to go from hardware procurement through installation and deployment. That clearly cannot meet the needs of high-performance computing tasks with extremely tight time requirements.

Facing the computing power demand brought by machine learning, deep learning, and other AI tasks, Amazon Cloud Technology not only provides computing resources equipped with enterprise-grade GPUs in the cloud, but has also independently developed chips tailored to the working characteristics of machine learning and deep learning, which are delivered to customers as cloud services. At this stage, Amazon Cloud Technology can provide customers' machine learning and deep learning tasks with ultra-large-scale computing clusters equipped with 4,000 NVIDIA A100 GPUs, 400 Gbps non-blocking networking infrastructure, and high-throughput, low-latency storage through FSx for Lustre. A computing cluster of this scale is very difficult to achieve in a physical supercomputing center. In the era of the "AI big voyage", obtaining high-performance computing power for machine learning and deep learning from the cloud is clearly the most reasonable solution.

Facing the inevitable rapid surge in HPC demand in the intelligent era, Amazon Cloud Technology draws on its accumulated industry knowledge and service experience to integrate the three star technologies of AI, HPC, and cloud computing into a trident. This trident will continue to evolve, helping users set sail on the wave of intelligence and gain value in the digital field.

At the global ISC2022 conference in early June this year, Amazon Cloud Technology launched a series of cloud services for high-performance computing. These include Hpc6a, a computing instance dedicated to HPC workloads and optimized to efficiently run compute-intensive, high-performance computing workloads such as computational fluid dynamics, reservoir modeling, weather simulation, and finite element analysis. Compared to comparable Amazon EC2 x86-based compute-optimized instances, Hpc6a instances offer up to 65% better price/performance. With Hpc6a instances, users can dramatically reduce the cost of HPC workloads while taking advantage of the elasticity and scalability of the Amazon cloud. Among GPU instances, Amazon EC2 P4de, a new instance in preview, delivers the best-in-class performance required for machine learning (ML) training and high-performance computing (HPC) applications such as object detection, semantic segmentation, natural language processing, seismic analysis, and computational fluid dynamics. In the Graviton series of ARM-based chips, which Amazon Cloud Technology has been developing for a long time, the third-generation Graviton3 processors were also released this year. Compared to Amazon Graviton2 processors, they offer 25% higher compute performance, 2X better floating-point performance, and 2X better performance for cryptographic workloads.
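It can help to translate these percentages into cost and runtime ratios. The sketch below assumes that "65% better price/performance" means 1.65 times as much work per unit of spend, and that "25% higher compute performance" means 1.25 times the throughput; both readings are interpretations rather than statements from the source, and the baseline cost and runtime are hypothetical.

```python
def ratio_after_gain(gain_pct: float) -> float:
    """New cost (or runtime) as a fraction of the old one, for a given gain.

    Assumes a gain of X% means (1 + X/100) times as much work per unit of
    cost (or time), so the same work needs 1 / (1 + X/100) of the original.
    """
    return 1.0 / (1.0 + gain_pct / 100.0)


# "Up to" figures quoted in the text, taken at their maximum.
hpc6a_cost_ratio = ratio_after_gain(65.0)          # price/performance gain
graviton3_runtime_ratio = ratio_after_gain(25.0)   # compute performance gain

baseline_cost = 1000.0   # hypothetical cost of a job on the comparison instance
baseline_hours = 10.0    # hypothetical runtime of a job on Graviton2

print(f"Hpc6a: same work for about {hpc6a_cost_ratio:.1%} of the cost "
      f"({baseline_cost * hpc6a_cost_ratio:.2f} instead of {baseline_cost:.2f})")
print(f"Graviton3: same work in about {graviton3_runtime_ratio:.0%} of the time "
      f"({baseline_hours * graviton3_runtime_ratio:.1f} h instead of {baseline_hours:.1f} h)")
```

Under these assumptions, a 65% price/performance gain corresponds to a cost reduction of roughly 39%, not 65%, and a 25% performance gain shortens runtime by 20%.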

If you want to understand how Amazon Cloud Technology continues to evolve in the field of high-performance computing, how high-performance computing combines with cutting-edge technologies such as machine learning and quantum computing, or what computing potential lies hidden in various industries, you may want to follow the "Amazon Cloud Technology HPC + Cloud Business Acceleration Innovation Forum", which will be held at 13:30 on August 24 in Conference Hall AB on the third floor of the Westin Beijing Hotel.

This event will bring together technical experts from Amazon Cloud Technology and various industries to jointly sort out the development trajectory of computing and intelligence, and reveal the innovation opportunities in the "HPC+" era.

We will see you on August 24th.





六一

SegmentFault new media operations