AWS Certified Solutions Architect – Professional — Question 389

A company is running an Apache Hadoop cluster on Amazon EC2 instances. The Hadoop cluster stores approximately 100 TB of data for weekly operational reports and allows occasional access for data scientists to retrieve data. The company needs to reduce the cost and operational complexity for storing and serving this data.
Which solution meets these requirements in the MOST cost-effective manner?

Answer options

Correct answer: A

Explanation

Migrating the self-managed Apache Hadoop cluster from Amazon EC2 to Amazon EMR significantly reduces operational complexity by utilizing a fully managed service while preserving current data access patterns and tools. Scripting EC2 resizing (Option B) does not eliminate the administrative overhead of managing a raw Hadoop cluster. Although options like Amazon S3 with Athena (Option C) or DynamoDB (Option D) are highly scalable, they would require substantial re-engineering of existing reports and workflows, making Amazon EMR the most straightforward and cost-effective transition.