
By Piyush Singh, Mostafa Mokhtar, Shankar Sivadasan
April 18, 2022

Today, we're excited to announce the public preview of Databricks support for AWS Graviton2-based Amazon Elastic Compute Cloud (Amazon EC2) instances. Graviton processors are custom designed and optimized by AWS to provide the best price/performance for cloud workloads running on Amazon EC2. When used with Photon, the high-performance Databricks query engine, Graviton2-based Amazon EC2 instances can provide 3-4x better price/performance for your data lakehouse workloads than comparable Amazon EC2 instances. In this blog post, we'll describe the price/performance of Photon on Graviton2 and offer other suggestions for further reducing your AWS infrastructure costs.

Price/performance of Photon and Graviton2

To measure the price/performance of Photon and Graviton2, we ran a simple test on Graviton2-based R6gd EC2 instances and comparable I3 EC2 instances with two different workloads: TPC-DS and a standard ETL workload with bulk insert and merge statements. We found that on EC2 instances, the Photon engine alone provides a significant price/performance improvement. Photon on Graviton2-based instances goes a step further, delivering 3.3x better price/performance for the ETL workload and 3.7x better price/performance for the TPC-DS workload than the previous Databricks runtime on I3 instances. Customers who tried Graviton2-based instances reported similar results and shared our excitement. Here's a quote from a Databricks customer who happens to know Arm-based instances well.

"Cloud computing is driving significant innovation in semiconductor design, and by offloading our design workloads to Arm-based AWS Graviton2 instances, we have seen firsthand the benefits of the Arm Neoverse N1 platform in delivering significant price/performance improvements," said Mark Galbraith, vice president of Productivity Engineering at Arm. "This is especially true for Databricks on Graviton2, and we look forward to migrating our production use of Databricks to Graviton2 to further improve the user experience and reduce costs."

Figure: Price/performance comparison of Photon and Graviton2
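
To make the "Nx better price/performance" figures concrete, here is a minimal sketch of one common way such a ratio can be computed: the total cost of running a fixed workload on the baseline cluster divided by the total cost of running it on the candidate cluster. The function names and all numbers below are hypothetical illustrations, not the measurements from the test described above, and the hourly price is assumed to already include both the EC2 charge and the Databricks charge per node.

```python
def run_cost(runtime_hours: float, hourly_price_per_node: float, nodes: int) -> float:
    """Total cost of one workload run: runtime x per-node hourly price x node count."""
    if runtime_hours <= 0 or hourly_price_per_node <= 0 or nodes <= 0:
        raise ValueError("runtime, price and node count must all be positive")
    return runtime_hours * hourly_price_per_node * nodes


def price_performance_gain(baseline_cost: float, candidate_cost: float) -> float:
    """How many times cheaper the candidate is for the same amount of work."""
    if baseline_cost <= 0 or candidate_cost <= 0:
        raise ValueError("costs must be positive")
    return baseline_cost / candidate_cost


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to keep the arithmetic easy to follow.
    baseline = run_cost(runtime_hours=2.0, hourly_price_per_node=1.00, nodes=4)
    candidate = run_cost(runtime_hours=1.0, hourly_price_per_node=0.50, nodes=4)
    gain = price_performance_gain(baseline, candidate)
    print(f"baseline=${baseline:.2f} candidate=${candidate:.2f} gain={gain:.1f}x")
```

Under this definition, a gain can come from a shorter runtime, a lower hourly price, or both, which is why a faster engine and a cheaper instance type compound.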

Additional cost savings with Amazon EC2 Spot Instance and Amazon EBS gp3 volume support

Besides Graviton2 and Photon, there are other ways to improve the price/performance of Databricks workloads on AWS. These measures include:

Amazon EC2 Spot Instances - Spot Instances let you take advantage of spare EC2 capacity at discounts of up to 90% compared to On-Demand prices. Depending on the nature of your workload, you can save costs by replacing On-Demand or Reserved EC2 instances in your Databricks cluster with Spot Instances.

Amazon EBS gp3 volumes - Storage can be a large part of the cost of cloud infrastructure. Databricks supports gp3 volumes ( https://databricks.com/blog/2021/08/10/introducing-support-for-gp3-amazons-new-general-purpose-ssd-volume.html ). Amazon Elastic Block Store (Amazon EBS) gp3 SSD volumes let you provision performance independently of storage capacity, at a 20% lower price per gigabyte than existing gp2 volumes.
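
The options above can be combined in a single cluster definition. The following is a rough sketch of a request body for the Databricks Clusters API that pairs a Graviton2 instance type with Photon and Spot Instances. The helper name `build_cluster_spec`, the cluster name, the runtime version string, and the instance size are hypothetical, and the field names reflect our understanding of the API, so check them against the current Clusters API reference before use. How gp3 is enabled is not shown here; it is assumed to be configured outside the cluster definition.

```python
import json


def build_cluster_spec(name: str, num_workers: int, spark_version: str) -> dict:
    """Assemble a cluster definition as a plain dict (no API call is made)."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    return {
        "cluster_name": name,
        "spark_version": spark_version,   # assumption: a runtime that supports Graviton and Photon
        "node_type_id": "r6gd.xlarge",    # Graviton2-based R6gd family; size chosen arbitrarily
        "runtime_engine": "PHOTON",       # assumption: field used to select the Photon engine
        "num_workers": num_workers,
        "aws_attributes": {
            # Keep the first node (the driver) On-Demand, run workers on Spot,
            # and fall back to On-Demand if Spot capacity is unavailable.
            "first_on_demand": 1,
            "availability": "SPOT_WITH_FALLBACK",
            "spot_bid_price_percent": 100,
        },
    }


if __name__ == "__main__":
    spec = build_cluster_spec("graviton-photon-demo", 4, "10.4.x-scala2.12")
    print(json.dumps(spec, indent=2))
```

Keeping the driver On-Demand is a common design choice: losing a Spot worker usually costs only some recomputation, whereas losing the driver ends the job.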

To learn more about price/performance optimization, read our cluster configuration best practices documentation ( https://docs.databricks.com/clusters/cluster-config-best-practices.html?_ga=2.39323047.586000877.1650811897-1256218973.1650811879 ).

Get started with Graviton

Public preview support for AWS Graviton2-based instances is currently rolling out and will be available in all supported regions in the coming weeks. To get started, and for guidance on migrating to Graviton2 and Photon, read our Graviton documentation ( https://docs.databricks.com/clusters/graviton.html?_ga=2.5702327.586000877.1650811897-1256218973.1650811879 ).

