AWS Certified Data Analytics – Specialty — Question 57

A company wants to research user turnover by analyzing the past 3 months of user activities. With millions of users, 1.5 TB of uncompressed data is generated each day. A 30-node Amazon Redshift cluster with 2.56 TB of solid state drive (SSD) storage for each node is required to meet the query performance goals.
The company wants to run an additional analysis on a year's worth of historical data to examine trends indicating which features are most popular. This analysis will be done once a week.
What is the MOST cost-effective solution?

Answer options

Correct answer: B

Explanation

Option B is the most cost-effective because it keeps the most recent data in Redshift, where query performance matters, and offloads older data to Amazon S3, where storage is cheaper. The historical analysis runs only once a week, so it does not justify paying for always-on cluster capacity to hold a full year of data. Option A is costly because it requires a significantly larger cluster, and Option C adds the cost of maintaining an EMR cluster. Option D also increases cost without addressing the storage efficiency needed for the historical data.
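
The storage arithmetic behind this reasoning can be sketched roughly as follows. The daily volume, node count, and per-node SSD size come from the question. Treating 3 months as 90 days and a year as 365 days is an assumption made here for illustration, as is the idea that a year-long cluster would scale linearly with data volume. The function name `estimate_sizing` is hypothetical.

```python
import math

# Figures taken from the question text.
DAILY_RAW_TB = 1.5        # uncompressed data generated per day
NODES = 30                # Redshift nodes needed for query performance
SSD_PER_NODE_TB = 2.56    # SSD storage per node

# Assumptions for illustration only (not stated in the question).
HOT_DAYS = 90             # "past 3 months" approximated as 90 days
YEAR_DAYS = 365


def estimate_sizing():
    """Return rough storage figures (in TB) for the scenario."""
    cluster_capacity = NODES * SSD_PER_NODE_TB
    hot_raw = DAILY_RAW_TB * HOT_DAYS      # recent data kept in Redshift
    year_raw = DAILY_RAW_TB * YEAR_DAYS    # full year of history
    cold_raw = year_raw - hot_raw          # candidate for offloading to S3
    # If the whole year stayed in Redshift and storage scaled linearly
    # with data volume, the cluster would need roughly this many nodes.
    nodes_for_year = math.ceil(NODES * YEAR_DAYS / HOT_DAYS)
    return {
        "cluster_capacity_tb": round(cluster_capacity, 2),
        "hot_raw_tb": round(hot_raw, 2),
        "year_raw_tb": round(year_raw, 2),
        "cold_raw_tb": round(cold_raw, 2),
        "nodes_for_year": nodes_for_year,
    }


if __name__ == "__main__":
    for name, value in estimate_sizing().items():
        print(f"{name}: {value}")
```

Under these assumptions, a year of history is about four times the volume of the 3-month working set. Keeping all of it on cluster storage would therefore mean roughly quadrupling the node count for a query that runs once a week. The 3-month raw volume also exceeds the cluster's raw SSD capacity, which suggests the data is stored compressed; the question does not give a compression ratio, so none is assumed here.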