statement:
This article is transferred from the DEV Community website, and the article translation is provided by the developer community;
Click the link below to view the original English text:

some historical background

About two years ago, I started trying to use QuickSight to solve some of the BI problems at the company I work for. I valued QuickSight's first-class integration with many of Amazon's cloud technology data services at a very low cost compared to other similar tools. With this service, we can quickly build analytical prototypes and dashboards. However, there is a question that we have been puzzled by: why is DynamoDB not supported as a data source?

I asked Hive-mind how to visualize DynamoDB data in Amazon Quicksight on StackOverflow , and this has become the most supported QuickSight question on the platform, because everyone needs this feature, but it has not been answered...until recently.

Most workaround solutions require exporting (eg copying) the DynamoDB data to another location, such as S3 or RDS, which can then be added to QuickSight as a data source. Finally, we created a Glue scheduled job to move DynamoDB data to S3. The S3 data was then crawled using AWS Glue Crawler and added to QuickSight as a dataset in the form of an Amazon Athena table. This approach works, but actually requires more customized infrastructure, and also doesn't allow direct real-time querying.

At the end of 2019, Amazon Cloud Technology announced that users will be able to query any data source through the new federated query function of Amazon Athena . It was a cool feature, they even showed the concept of querying DynamoDB data with a schematic diagram, but it required us to develop and maintain our own connector, and the feature was still in preview at the time, and the integration of Amazon QuickSight had not been updated to be usable. This new feature of Athena.

In March 2020, I found out that Amazon Athena announced a pre-built data connector for DynamoDB . This excites me because I can finally quickly set up a data connector in Athena and use it to actually view and query DynamoDB data without writing any custom code. But the feature is still in preview, and QuickSight is not yet supported.

A year later, the Athena Data Connector was officially released in several regions, the Athena Engine Version 2 was also available , and QuickSight's integration allowed us to choose the Athena workgroups and data sources we wanted.

These capabilities were sufficient for my needs before DynamoDB became a "first-class" data source in Amazon QuickSight.

Detailed implementation process

Using the Athena data connector provided by the Athena engine version 2, I was finally able to visualize my DynamoDB data in QuickSight without having to create any custom resources or copy the data to another data source.

Prerequisites/Assumptions

  • You want to visualize data in DynamoDB tables with QuickSight
  • You can access users/roles in your own account IAM
  • Create a bucket to store Athena query result data and data connector overflow data
  • If you are going to use it for large scale queries in a production environment, it is recommended to add an S3 lifecycle policy to the bucket to ensure that the bucket does not grow in an uncontrolled way
  • The region you use must support both:
  • Athena v2 engine
  • QuickSight

Switch to use the Athena engine version 2

If you don't already have an Athena working group that is already using the Athena engine version 2, create a working group that meets the requirements.

As of this writing, the default primary groups are still using version 1 of the Athena engine, but these groups will be automatically upgraded to version 2 at some point in the future.

No one has time to wait for Amazon Cloud Technologies to upgrade, so we need to create (if needed) workgroups ahead of time and switch to version 2 of the Athena engine.

1. Select the Workgroups tab in the Athena console:

image.png

2. Make sure the "Athena Engine version 2" workgroup exists and switch to it.

image.png

a) You most likely don't have such a workgroup, so you need to click the "Create Workgroup" button to create one. You can use the default settings for almost all options, but be sure to select "Manually choose an engine version now", and when creating a workgroup, be sure to select "Athena engine version 2 (recommended)."

Create a DynamoDB Athena data connector

Now that we've created a working group for the Athena engine version 2, it's time to create our DynamoDB Athena data connector.

1. Open the Data Sources tab of the Athena console and select the "Connect data source" button.

image.png

2. Select the "Query a data source" option in the first step of the Data Sources wizard, then select "Amazon DynamoDB" and click Next.

3. In the second and last step, Amazon Cloud Technology requires us to specify the connector Lambda. The Lambda doesn't exist yet, so we need to click the "Configure new AWS Lambda function" button to deploy it.

Clicking the button opens a new window and displays the Lambda console, where we can deploy the prebuilt AthenaDynamoDBConnector application.

b) Under Application settings, at least the SpillBucket and AthenaCatalogName parameters need to be set as they do not contain any default values.

image.png

c) Click Deploy.

d) After the Lambda deployment is complete, go back to the Athena console window.

4. So far our AthenaDynamoDBConnector function has been deployed, click the refresh button next to the "Choose Lambda function" drop-down menu list, and then you should be able to see the Lambda function you just deployed. Select the function, enter a name for the category, and click the "Connect" button.

5. At this point, we should be able to use Athena to query DynamoDB data through the data source and category name set for the resource.

image.png

Assign Lambda permissions to the QuickSight IAM role

The Athena data connector needs to invoke Lambda to query and return DynamoDB data, so we need to give QuickSight's service role permission to invoke the Lambda function.

1. Open the IAM console and select "Roles".

2. Find and click the QuickSight service role. The role name should start with something like "aws-quicksight-service-role", eg "aws-quicksight-service-role-v0".

3. Click "Attach Policies".

4. Select "AWSLambdaRole" and click "Attach Policy".

Configure QuickSight to use the new connector

Some configuration of QuickSight is then required!

1. Open QuickSight and display the Data Sets menu.

2. Click "New Dataset" and select Athena as the data source.

image.png

3. Select the workgroup you just created that uses version 2 of the Athena engine.

image.png

4. Then select the classification created for the DynamoDB connector in the previous step.

image.png

a) If you see the "Insufficient permissions to connect to this dataset or run this query" error, it is likely that the Lambda invocation permissions were not assigned correctly to the QuickSight role. Please refer to the previous step for the specific operation method.

5. Then choose whether you want to cache the data through SPICE, or query the data directly. Direct queries will be able to view real-time data in DynamoDB, but performance and cost may be impacted because it is not cached.

a) You can also add additional DynamoDB table datasets and continue to reuse the version 2 Athena engine data source created in step 2 above.

to visualize

Now that we've created the DynamoDB dataset (via Athena and the DynamoDB data connector), it's finally time to visualize the DynamoDB data through analytics and dashboards.

image.png

For the examples in this article I used a small sample DynamoDB dataset, so the visualizations are not very interesting and are for reference only.

Summarize

After waiting for so long, it is finally possible to process DynamoDB data with QuickSight without the need for custom resources and duplication of data. This is just one example of the many data sources that we can now easily add via the Athena Engine Version 2 Data Connector and select the Athena Workgroup and Athena Taxonomy to use directly in QuickSight.
I believe that as Amazon Cloud Technology continues to release new features and the technology continues to innovate, the technology introduced in this article will become outdated sooner or later, but you can use the method described in this article as an exercise to explore more new features.
If this article is helpful to you, or if you have any suggestions for improvements and additions to this method, please feel free to leave a message.

Article by: Jared Donboch
Jared Donboch for AWS Community Builders.


亚马逊云开发者
2.9k 声望9.6k 粉丝

亚马逊云开发者社区是面向开发者交流与互动的平台。在这里,你可以分享和获取有关云计算、人工智能、IoT、区块链等相关技术和前沿知识,也可以与同行或爱好者们交流探讨,共同成长。