Amazon Redshift is a fast, popular, fully managed cloud data warehouse that allows you to process exabytes of data across data warehouses, operational databases, and data lakes using standard SQL. It offers different node types to suit various workloads; you can choose from Amazon Redshift RA3, DC2, and DS2 node types depending on your needs. Amazon Redshift RA3 is the latest instance type: it lets you scale and pay for compute and storage independently, and it supports advanced features such as cross-cluster data sharing and cross-AZ cluster relocation. For node count and type recommendations when upgrading, see Upgrade to Amazon Redshift RA3 node type.
- Amazon Redshift: http://aws.amazon.com/redshift
- Upgrade to Amazon Redshift RA3 node type: https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html#rs-upgrading-to-ra3
The blog posts Scale Your Cloud Data Warehouse and Reduce Costs with the New Amazon Redshift RA3 Nodes with Managed Storage and Amazon Redshift Benchmarking: Comparison of RA3 vs. DS2 Instance Types detail the advantages of migrating from Amazon Redshift DS2 to Amazon Redshift RA3. After learning about the benefits of Amazon Redshift RA3, many of our existing Amazon Redshift DC2 customers have evaluated its performance and migrated to Amazon Redshift RA3. However, this evaluation is mostly done manually, which requires you to replicate the workload in order to evaluate performance on the new nodes.
- Scale Your Cloud Data Warehouse and Reduce Costs with the New Amazon Redshift RA3 Nodes with Managed Storage: https://aws.amazon.com/blogs/big-data/scale-your-cloud-data-warehouse-and-reduce-costs-with-the-new-amazon-redshift-ra3-nodes-with-managed-storage/
- Amazon Redshift Benchmarking: Comparison of RA3 vs. DS2 Instance Types: https://aws.amazon.com/blogs/apn/amazon-redshift-benchmarking-comparison-of-ra3-vs-ds2-instance-types/
You can use the Amazon Redshift Simple Replay utility to perform what-if analysis and evaluate how your workload performs in different scenarios. For example, you can use the tool to benchmark real workloads, evaluate new features, or assess different cluster configurations on new instance types such as Amazon Redshift RA3. It also includes enhanced support for replaying data ingestion and export pipelines that use COPY and UNLOAD statements. To get started, download the tool from the Amazon Redshift GitHub repository.
- Amazon Redshift GitHub repository: https://github.com/awslabs/amazon-redshift-utils/tree/master/src/SimpleReplay
In this article, we walk through the steps to automate the evaluation of an Amazon Redshift RA3 instance using the Amazon Redshift Simple Replay utility. If you run production workloads on Amazon Redshift with the older-generation DS2 or DC2 node types, you can use this solution to automatically extract the workload logs from the source production cluster and replay them in an isolated environment, so that you can directly and seamlessly compare the two Amazon Redshift clusters.
Prerequisites
This solution uses Amazon CloudFormation to automatically provision all required resources in your Amazon Cloud Technology account. For more information, see Getting Started with Amazon CloudFormation.
- Amazon CloudFormation: http://aws.amazon.com/cloudformation
- Getting Started with Amazon CloudFormation: https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/GettingStarted.html
As a prerequisite for this solution, you need to complete the following steps. You may need administrator access to the Amazon Cloud Technology account to perform these steps and subsequently deploy this solution. A scripted sketch of these prerequisite steps follows the reference links below.
- Enable audit logging on the source Amazon Redshift cluster in the Amazon Redshift console, and then specify the Amazon Simple Storage Service (Amazon S3) bucket location to save the log files. For more information, see Database Audit Logging.
- Set the enable_user_activity_logging parameter in the cluster's parameter group to true. For more information, see Managing parameter groups using the console.
- Restart the cluster.
- Create an Amazon Elastic Compute Cloud (Amazon EC2) key pair in the Amazon Cloud Tech account where you plan to deploy the Amazon CloudFormation template. For more information, see Creating a Key Pair with Amazon EC2.
- Amazon Simple Storage Service: http://aws.amazon.com/s3
- Database audit logging: https://docs.aws.amazon.com/redshift/latest/mgmt/db-auditing.html
- Managing parameter groups using the console: https://docs.aws.amazon.com/redshift/latest/mgmt/managing-parameter-groups-console.html
- Amazon Elastic Compute Cloud: http://aws.amazon.com/ec2
- Creating a key pair with Amazon EC2: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html#having-ec2-create-your-key-pair
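If you prefer to script these prerequisites instead of using the console, the following is a minimal sketch of roughly equivalent boto3 calls. The cluster identifier, parameter group name, audit log bucket, and key pair name are placeholders; substitute your own values.

```python
# Sketch of the prerequisite steps using boto3 (names below are placeholders).
import boto3

redshift = boto3.client("redshift")
ec2 = boto3.client("ec2")

cluster_id = "my-source-cluster"                # placeholder: your source cluster
audit_bucket = "my-redshift-audit-logs"         # placeholder: existing S3 bucket for audit logs
parameter_group = "my-cluster-parameter-group"  # placeholder: the cluster's parameter group

# 1. Enable audit logging to the S3 bucket.
redshift.enable_logging(
    ClusterIdentifier=cluster_id,
    BucketName=audit_bucket,
    S3KeyPrefix="audit-logs/",
)

# 2. Turn on user activity logging in the parameter group.
redshift.modify_cluster_parameter_group(
    ParameterGroupName=parameter_group,
    Parameters=[
        {"ParameterName": "enable_user_activity_logging", "ParameterValue": "true"}
    ],
)

# 3. Reboot the cluster so the parameter change takes effect.
redshift.reboot_cluster(ClusterIdentifier=cluster_id)

# 4. Create an EC2 key pair in the account where the templates will be deployed.
key = ec2.create_key_pair(KeyName="ra3-evaluation-key")
print(key["KeyMaterial"])  # save this private key securely
```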
Solution Overview
The solution includes two Amazon CloudFormation templates for performing ingest and replay of workloads. You can deploy both templates in the same account that hosts the Amazon Redshift cluster, which is our recommended approach. Alternatively, you can run the extraction template in the production account and the replay template in a separate development account to perform this evaluation, as shown in the following figure.
The process uses Amazon Step Functions and Amazon Lambda to orchestrate the end-to-end extraction and replay workflows. The first template deploys the extraction application in the source account. It pulls the audit logs from the Amazon S3 bucket used for Amazon Redshift audit logging and stores them in a new Amazon S3 bucket created for the extract and replay process. It also creates a manual snapshot of your cluster and authorizes the replay account to restore that snapshot.
- Amazon Step Functions: http://aws.amazon.com/step-functions
- Amazon Lambda: http://aws.amazon.com/lambda
The second template deploys the replay application in the development account (or in the source account, if you choose not to use a separate account). It uses the Amazon S3 files produced by the extraction application and generates an automated evaluation summary of the replay.
End-to-End Workflow: Extraction Process
The extraction process starts when you provide the time interval for which you want to run the process. It automatically pulls the audit logs from the source cluster and stores them in a new Amazon S3 bucket in the account. It also deploys an m5.large Amazon EC2 instance with the Simple Replay utility installed. The following diagram shows the solution architecture.
The following diagram shows the Amazon Step Functions state machine for the extraction process.
The state machine performs the following steps to extract the source cluster metadata into the extraction Amazon S3 bucket:
- Wait for the source cluster to be available.
- Create a manual snapshot of the source cluster with an identifier formed by concatenating ra3-migration-evaluation-snapshot- with the cluster identifier.
- Authorize the snapshot to the target account in which you plan to run the replay process (a boto3 sketch of these two steps follows this list).
- Upload the source cluster configuration parameters to the extraction Amazon S3 bucket.
- Run the extraction process to copy the audit logs from the source cluster into the extraction Amazon S3 bucket.
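For reference, the snapshot and authorization steps (steps 2 and 3 above) correspond roughly to the following boto3 calls. This is a sketch only; the cluster identifier and target account ID are placeholders, and the actual state machine performs the equivalent calls through Amazon Lambda.

```python
# Minimal sketch of the snapshot and authorization steps, assuming a
# placeholder cluster identifier and target account ID.
import boto3

redshift = boto3.client("redshift")

cluster_id = "my-source-cluster"    # placeholder
replay_account_id = "123456789012"  # placeholder: account that runs the replay

snapshot_id = f"ra3-migration-evaluation-snapshot-{cluster_id}"

# Step 2: create the manual snapshot of the source cluster.
redshift.create_cluster_snapshot(
    SnapshotIdentifier=snapshot_id,
    ClusterIdentifier=cluster_id,
)

# Wait until the snapshot is available before sharing it.
redshift.get_waiter("snapshot_available").wait(SnapshotIdentifier=snapshot_id)

# Step 3: authorize the replay account to restore from the snapshot.
redshift.authorize_snapshot_access(
    SnapshotIdentifier=snapshot_id,
    AccountWithRestoreAccess=replay_account_id,
)
```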
The extraction Amazon CloudFormation template automatically starts the first iteration of the extraction process, but you can rerun the process at any time by starting the state machine with start_time and end_time input parameters, as shown in the following code:
{"input": {"start_time": "<<Extract_Start_Time>>", "end_time": "<<Extract_End_Time>>"}}
Replace the start_time and end_time values with actual timestamps in ISO 8601 format (for example, 2021-03-05T12:30:00+00:00). The following screenshot shows the state machine run input.
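If you prefer to start a rerun programmatically rather than through the Amazon Step Functions console, a sketch like the following works; the state machine ARN shown is a placeholder for the ARN created by the extraction stack.

```python
# Sketch: start another iteration of the extraction state machine with a new
# time window. The state machine ARN is a placeholder.
import json
import boto3

sfn = boto3.client("stepfunctions")

state_machine_arn = (
    "arn:aws:states:us-east-1:123456789012:stateMachine:ExtractStateMachine"  # placeholder
)

execution_input = {
    "input": {
        "start_time": "2021-03-05T09:30:00+00:00",
        "end_time": "2021-03-05T12:30:00+00:00",
    }
}

response = sfn.start_execution(
    stateMachineArn=state_machine_arn,
    input=json.dumps(execution_input),
)
print(response["executionArn"])
```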
You need to deploy the Amazon CloudFormation template for the extraction process in the same account that hosts the source cluster. This template requires you to provide the following parameters:
- SourceRedshiftClusterEndpoint — The non-Amazon Redshift RA3 source cluster endpoint, including port number and database name.
- AccountIdForReplay — If you plan to run the replay process in a different account, enter that account's 12-digit Amazon Cloud Technology account ID in this parameter. Enter N/A if you are running the extraction and replay processes in the same account.
- SimpleReplayStartTime — The start date and time, in ISO 8601 format (for example, 2021-01-20T21:41:16+00:00), of the first iteration of the extraction process that you want to run against the source cluster. You can change it later in the input JSON of the extraction state machine.
- SimpleReplayEndTime — The end date and time, in ISO 8601 format, of the workload that you want to extract from the source cluster and replay on the destination Amazon Redshift RA3 cluster. You can change it later in the input JSON of the extraction state machine. Make sure the interval between the start time and end time is no more than 24 hours.
- ExtractSystemTables — An optional setting, if you want to extract the source cluster system tables for reference. We recommend keeping the default value No for this parameter, because setting it to Yes attaches an Amazon Identity and Access Management (IAM) role to the source cluster in order to unload system tables from it.
- EndUserIamRoleName — The name of an existing Amazon IAM role for the end user who will run the extract and replay evaluation. You can use this parameter to allow non-admin users to run the extract and replay state machines without granting them any additional permissions on Amazon Cloud Technology resources. Enter N/A if you do not want to grant any end-user permissions.
- EC2InstanceAMI — The Amazon Linux 2 based Amazon Machine Image (AMI) for the Amazon EC2 instance. We recommend keeping the default AMI for this parameter unless your compliance requirements dictate otherwise.
- Amazon Identity and Access Management: http://aws.amazon.com/iam
After the template is deployed, navigate to the Outputs tab of the stack, which lists the relevant values required to deploy the replay process.
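You can also collect those output values programmatically. The following sketch reads the Outputs of the extraction stack with boto3; the stack name is a placeholder for whatever you named the extraction stack.

```python
# Sketch: read the extraction stack's Outputs, to collect the values needed as
# parameters for the replay template. The stack name is a placeholder.
import boto3

cfn = boto3.client("cloudformation")

stack_name = "redshift-ra3-extraction"  # placeholder

outputs = cfn.describe_stacks(StackName=stack_name)["Stacks"][0]["Outputs"]
for item in outputs:
    print(f'{item["OutputKey"]}: {item["OutputValue"]}')
```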
End-to-End Workflow: Replay Process
The second part of this solution deploys the replay Amazon CloudFormation template, either in the same account that runs the extraction process or in a separate account in the same Region.
This process provisions two Amazon Redshift clusters: a replica cluster configured identically to the source cluster, and a target cluster with the Amazon Redshift RA3 configuration. It deploys two M5-series Amazon EC2 instances with the Simple Replay utility installed and replays the extracted workload on both clusters simultaneously. Because the replay process preserves the time interval between queries and transactions to simulate the exact workload of the source cluster, the time this process takes is roughly the same as the interval between the start_time and end_time you provided when running the extraction process. The following diagram shows the solution architecture.
The following diagram shows the Amazon Step Functions state machine for the replay process.
The state machine performs the following steps to replay the extracted workload from the extraction Amazon S3 bucket:
- Update the Amazon Redshift parameter group to the same configuration as the source cluster parameter group, which was saved in the extraction Amazon S3 bucket as part of the extraction process.
- If the replica and target clusters do not exist, start their cluster creation processes in parallel. The replica cluster is created with exactly the same configuration as the source cluster, while the target cluster is created with the Amazon Redshift RA3 configuration you specified when deploying the Amazon CloudFormation template (provided the source cluster is compatible with elastic resize to that configuration). If the target Amazon Redshift RA3 configuration cannot be reached by elastic resize, the target cluster is created with the same configuration as the replica cluster (a sketch of this elastic/classic resize fallback follows the links after this list).
- If the previous step created the target cluster with a non-Amazon Redshift RA3 configuration because elastic resize was not possible, perform a classic resize on that cluster when it becomes available.
- If the target cluster or replica cluster is in a paused state, resume the cluster.
- If the target cluster or replica cluster is available and any resize operations (if applicable) on that cluster have completed, run a SQL script that sets up Amazon Redshift objects in a common schema on the cluster; these objects are used for automatic performance comparison between the clusters.
- Once the setup of the target and replica clusters is complete, run the replay process on both clusters simultaneously, executing all the SQL statements extracted from the source cluster while maintaining the same transaction order and time intervals as on the source cluster.
- Once the replay process is complete, unload the query statistics from the replica cluster and load them into the target Amazon Redshift RA3 cluster, enabling a direct performance comparison between the two environments from within the Amazon Redshift RA3 cluster.
- Elastic resize: https://docs.aws.amazon.com/redshift/latest/mgmt/managing-cluster-operations.html#elastic-resize
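The following is a minimal sketch of the elastic/classic resize fallback idea referenced in the cluster-creation step above, assuming placeholder cluster and node settings. The solution implements the equivalent logic inside its Step Functions state machine; this snippet only illustrates the concept.

```python
# Sketch of the elastic-vs-classic resize fallback concept. Identifiers and the
# target configuration are placeholders, not the solution's actual code.
import boto3
from botocore.exceptions import ClientError

redshift = boto3.client("redshift")

cluster_id = "ra3-migration-target"  # placeholder target cluster
node_type = "ra3.4xlarge"            # placeholder RA3 node type
node_count = 4                       # placeholder node count

try:
    # Attempt an elastic resize to the RA3 configuration.
    redshift.resize_cluster(
        ClusterIdentifier=cluster_id,
        NodeType=node_type,
        NumberOfNodes=node_count,
        Classic=False,
    )
except ClientError:
    # If the configuration is not reachable by elastic resize,
    # fall back to a classic resize.
    redshift.resize_cluster(
        ClusterIdentifier=cluster_id,
        NodeType=node_type,
        NumberOfNodes=node_count,
        Classic=True,
    )
```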
The Amazon CloudFormation template for the replay process automatically starts the first iteration of the replay process, but you can rerun the process at any time by starting the replay state machine with no input parameters. This template requires you to provide the following parameters:
- SourceAccountNumber — The source account in which the extraction process runs. You can find it on the Outputs tab of the extraction stack.
- SourceAccountSimpleReplayS3Bucket — The extraction Amazon S3 bucket created by the extraction template (found on the stack's Outputs tab).
- SourceRedshiftClusterEndpoint — Non-Amazon Redshift RA3 source cluster endpoint, including port number and database name (found on the stack output tab).
- SourceRedshiftClusterKMSKeyARN — The Amazon Resource Name (ARN) of the Amazon Key Management Service (KMS) key, if your source Amazon Redshift cluster is encrypted (found on the stack's Outputs tab). If the source cluster is encrypted, you need to run the extraction and replay processes in the same account.
- SourceRedshiftClusterMasterUsername — The username associated with the master user account of the source cluster (found on the stack output tab).
- SourceRedshiftClusterPrimaryDatabase — The name of the primary database in the source cluster whose workload you want to replay. Amazon Redshift automatically creates a default database named dev, which may not be your primary database; enter the correct value for your deployment. If you have multiple databases, you need to run the extraction and replay processes on one database at a time.
- TargetRedshiftClusterNodeType — The Amazon Redshift RA3 node type to provision. We recommend using the node type and node count as recommended in Upgrading to Amazon Redshift RA3 Node Types.
- TargetRedshiftClusterNumberOfNodes — The number of compute nodes in the cluster.
- EndUserIamRoleName — The name of an existing Amazon IAM role for the end user who will run the extract and replay evaluation. You can use this parameter to allow non-admin users to run the extract and replay state machines without granting them any additional permissions on Amazon Cloud Technology resources. Enter N/A if you do not want to grant any end-user permissions.
- GrantS3ReadOnlyAccessToRedshift — If you deploy the extraction and replay processes in the same account, you can enter Yes for this parameter, which grants AmazonS3ReadOnlyAccess to the target and replica Amazon Redshift clusters so that COPY statements can be replayed within the account. Otherwise, you need to manually copy the source files (see the sketch after this parameter list), adjust the copy_replacement.csv file in the latest extraction folder of the extraction Amazon S3 bucket, and set the COPY statement parameter in the config/replay.yaml file of the replay Amazon S3 bucket to true.
- VPC — An existing Amazon Virtual Private Cloud (Amazon VPC) where you want to deploy the cluster and Amazon EC2 instances.
- SubnetId — The existing subnet within the Amazon VPC where you deploy your cluster and Amazon EC2 instances.
- KeyPairName — An existing key pair that allows SSH connections to replay Amazon EC2 instances.
- OnPremisesCIDR — The IP range (CIDR notation) of your existing infrastructure that is allowed to access the target and replica clusters from a SQL client. If you are unsure, enter the CIDR address of your company desktop. For example, if the IP address of your desktop is 10.156.87.45, enter 10.156.87.45/32.
- EC2InstanceType — The Amazon EC2 instance type that hosts the Simple Replay utility codebase. If the data in the cluster is less than 1 TB, an instance of the large size may be sufficient. We recommend larger instance types for larger workloads so that the Amazon EC2 instances do not become a bottleneck when fetching query results from the clusters.
- EC2InstanceVolumeGiB — Amazon EC2 instance volume size in GiB. We recommend keeping it at 30 GiB or more.
- EC2InstanceAMI — The Amazon Linux 2 based Amazon Machine Image (AMI) for the Amazon EC2 instances. Do not change this parameter unless your compliance requirements dictate otherwise.
- Amazon Key Management Service (KMS): https://aws.amazon.com/kms/
- Upgrade to RA3 node type: https://docs.aws.amazon.com/redshift/latest/mgmt/working-with-clusters.html#rs-upgrading-to-ra3
- Amazon Virtual Private Cloud: http://aws.amazon.com/vpc
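If you run the extraction and replay processes in different accounts and therefore need to copy the extraction output manually (see GrantS3ReadOnlyAccessToRedshift above), a sketch along these lines can help, assuming your replay account has cross-account read access to the extraction bucket. The bucket names and the extraction folder prefix are placeholders.

```python
# Sketch: copy the latest extraction output from the extraction S3 bucket into
# the replay S3 bucket. Bucket names and the folder prefix are placeholders.
import boto3

s3 = boto3.client("s3")

extract_bucket = "my-extraction-bucket"  # placeholder (source account)
replay_bucket = "my-replay-bucket"       # placeholder (replay account)
extract_prefix = "extracts/2021-03-05/"  # placeholder: latest extraction folder

paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=extract_bucket, Prefix=extract_prefix):
    for obj in page.get("Contents", []):
        s3.copy_object(
            Bucket=replay_bucket,
            Key=obj["Key"],
            CopySource={"Bucket": extract_bucket, "Key": obj["Key"]},
        )
```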
Access Rights and Security
To deploy this solution using Amazon CloudFormation, you need administrator access to the Amazon Cloud Technology accounts where you plan to deploy the extraction and replay processes. Both templates provide an input parameter, EndUserIamRoleName, that you can use to allow non-admin users to run the processes without any extensive permissions on system resources.
The Amazon CloudFormation templates use security best practices based on the principle of least privilege to provision all required resources, and they host all resources in the account's VPC. The Amazon EC2 instances and Amazon Redshift clusters share the same security group, and SSH access to the Amazon EC2 instances is not allowed. Access to the Amazon Redshift target and replica clusters is controlled by the Amazon CloudFormation template parameter OnPremisesCIDR, which you provide to allow on-premises users to connect to the new clusters with a SQL client on the Amazon Redshift port.
Access to all resources is controlled through Amazon IAM roles, which grant the appropriate permissions to Amazon Redshift, Amazon Lambda, Amazon Step Functions, and Amazon EC2. Read and write access to the Amazon S3 bucket created by the extraction process is granted to the Amazon Cloud Technology account used for the replay process, so that it can read and update configuration in that bucket.
- Amazon IAM roles: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles.html
Evaluating Amazon Redshift RA3 Performance
After the first run of the replay state machine completes, you should be able to see the Amazon Redshift RA3 target and non-RA3 replica clusters on the Amazon Redshift console of the Amazon Cloud Technology account where the replay template is deployed. Each iteration of the replay automatically populates the following tables and views in the common schema of the target Amazon Redshift RA3 cluster, allowing you to directly compare performance between the clusters (a query sketch follows the links after this list):
- source_target_comparison — Provides a comparative summary of the time it takes for the two clusters to replay the workload. It provides a column total_query_time_saved_seconds grouped by Amazon Redshift queue and username, which may be useful in your final evaluation.
- source_target_comparison_raw — Provides a detailed comparison of the time taken by both clusters for each query.
- replica_cluster_query_stats — Stores replayed query-level metrics running on replica clusters.
- target_cluster_query_stats — Stores replayed query-level metrics running on an Amazon Redshift RA3 cluster.
- source_cluster_query_stats — Stores query-level metrics from the source cluster. This table may be empty because it relies on STL log views in the source cluster, which are only kept for 2-5 days. For more information, see STL Views for Logging.
- detailed_query_stats — A view used to populate the query stats tables above; it contains the logic used to derive these statistics from the STL log views.
- STL views for logging: https://docs.amazonaws.cn/en_us/redshift/latest/dg/c_intro_STL_tables.html
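One way to pull the comparison summary without a SQL client is to query the target cluster through the Amazon Redshift Data API. The following sketch assumes placeholder cluster, database, user, and schema names; point the schema at wherever the solution created the comparison objects.

```python
# Sketch: query the replay comparison summary on the target RA3 cluster through
# the Amazon Redshift Data API. All names below are placeholders.
import time
import boto3

data_api = boto3.client("redshift-data")

cluster_id = "ra3-migration-target"  # placeholder
database = "dev"                     # placeholder
db_user = "awsuser"                  # placeholder
schema = "public"                    # placeholder: schema holding the comparison views

stmt = data_api.execute_statement(
    ClusterIdentifier=cluster_id,
    Database=database,
    DbUser=db_user,
    Sql=f"SELECT * FROM {schema}.source_target_comparison",
)

# Poll until the statement finishes, then print the rows.
while True:
    desc = data_api.describe_statement(Id=stmt["Id"])
    if desc["Status"] in ("FINISHED", "FAILED", "ABORTED"):
        break
    time.sleep(1)

if desc["Status"] == "FINISHED":
    result = data_api.get_statement_result(Id=stmt["Id"])
    for row in result["Records"]:
        print([list(col.values())[0] for col in row])
```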
Cost and Timeline Considerations
Running these templates has cost implications, because they provision new Amazon Redshift clusters and Amazon EC2 instances in your Amazon Cloud Technology account; unless you have Reserved Instances, these instances are billed at On-Demand rates. After completing the evaluation, we recommend deleting the Amazon CloudFormation stacks, which removes all associated resources except the two Amazon S3 buckets used for extraction and replay. We also recommend pausing the clusters when they are not in use (a cleanup sketch follows the pricing links below). For details, see Amazon Redshift pricing and Amazon EC2 pricing.
- Reserved Instances: https://aws.amazon.com/ec2/pricing/reserved-instances/pricing/
- Amazon Redshift pricing: https://aws.amazon.com/redshift/pricing/#On-demand_pricing
- Amazon EC2 pricing: https://aws.amazon.com/ec2/pricing/on-demand/
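A minimal cleanup sketch along those lines is shown below, with placeholder stack and cluster names: pause the clusters between test runs, and delete the stacks when the evaluation is finished.

```python
# Sketch: pause the evaluation clusters between test runs, and delete the
# CloudFormation stacks when the evaluation is complete. Names are placeholders.
import boto3

redshift = boto3.client("redshift")
cfn = boto3.client("cloudformation")

# Pause the replica and target clusters while they are not being used.
for cluster_id in ("ra3-migration-replica", "ra3-migration-target"):  # placeholders
    redshift.pause_cluster(ClusterIdentifier=cluster_id)

# When the evaluation is done, remove the stacks (the extraction and replay
# S3 buckets are retained, as noted above).
for stack_name in ("redshift-ra3-extraction", "redshift-ra3-replay"):  # placeholders
    cfn.delete_stack(StackName=stack_name)
```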
Limitations
There are some known limitations of Simple Replay and this automated process:
- If there is a lag in delivering audit logs to Amazon S3, the extraction process might fail. In this case, rerun the extraction state machine with a different time interval in the past.
- Dependent SQL queries across connections are not guaranteed to run in the original order.
- Amazon Redshift Spectrum queries will not be replayed if the target cluster does not have access to the external table.
- Queries with BIND variables are not replayed.
- Replay using JDBC is not supported.
- A highly concurrent workload can stress the Amazon EC2 client; in such cases, a larger Amazon EC2 instance may be required.
- Audit logs may contain SQL statements that were not committed on the production cluster; these statements are still replayed.
You can minimize the impact of these limitations by using the replica and target clusters for comparison, rather than comparing directly against the production Amazon Redshift cluster.
Conclusion
Amazon Redshift RA3 instances offer many additional advantages over previous instances. If you're trying to migrate to an Amazon Redshift RA3 instance type but are concerned about the assessment effort, the Amazon Redshift Simple Replay utility makes it easy and seamless to perform this assessment and successfully migrate to Amazon Redshift RA3 nodes.
If you are satisfied with the performance of the Amazon Redshift RA3 instance type, you can perform a resize operation on your production cluster to move it to the Amazon Redshift RA3 platform. Resizing the production cluster takes a similar amount of time to creating the test Amazon Redshift RA3 cluster, depending on whether elastic resize or classic resize is used. We recommend creating a manual snapshot before performing a resize operation on a production cluster.
Authors of this article
Manash Deb
Amazon Cloud Technology
Senior Analytics Specialist Solution Architect
He has been building end-to-end data-driven solutions with different database and data warehouse technologies for over 15 years. He enjoys learning new technologies and solving, automating, and simplifying customer problems with easy-to-use cloud data solutions on Amazon Cloud Technology.
Sinath Madabushi
Amazon Redshift
Senior Database Engineer
He has over 16 years of experience across various areas of data warehousing, business intelligence, and database technologies. He is an avid big data enthusiast who works with global customers to meet their data warehousing needs.