AWS Certified Data Engineer – Associate (DEA-C01) — Question 23

A company is planning to use a provisioned Amazon EMR cluster that runs Apache Spark jobs to perform big data analysis. The company requires high reliability. A big data team must follow best practices for running cost-optimized and long-running workloads on Amazon EMR. The team must find a solution that will maintain the company's current level of performance.
Which combination of resources will meet these requirements MOST cost-effectively? (Choose two.)

Answer options

Correct answer: B, D

Explanation

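Using Amazon S3 as the persistent data store (Option B) is both cost-effective and reliable. S3 storage scales independently of the cluster and costs less than keeping the same data in HDFS, which on Amazon EMR consumes the cluster's own instance-store or EBS volumes and is lost when the cluster terminates. Graviton instances (Option D) offer better price-performance than comparable x86-based instances, so the team can maintain its current level of performance at a lower cost. The other options, such as HDFS for persistent storage and x86-based instances, either cost more for the same result or weaken reliability in this scenario.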
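
As an illustration of how the two correct options might fit together, the sketch below builds the keyword arguments for boto3's EMR `run_job_flow` call. It is a minimal sketch, not a recommended configuration: the bucket name, cluster name, job script path, release label, instance type (`m6g.xlarge`), instance counts, and the choice of On-Demand capacity are all assumptions made for the example and do not come from the question.

```python
def build_cluster_config(bucket: str) -> dict:
    """Build keyword arguments for boto3's EMR run_job_flow call.

    All names and sizes here are illustrative assumptions.
    """
    graviton_type = "m6g.xlarge"  # assumed Graviton-based type (Option D)

    def group(name: str, role: str, count: int) -> dict:
        # One instance group; On-Demand is an assumption for this sketch.
        return {
            "Name": name,
            "InstanceRole": role,
            "Market": "ON_DEMAND",
            "InstanceType": graviton_type,
            "InstanceCount": count,
        }

    return {
        "Name": "spark-analytics",       # hypothetical cluster name
        "ReleaseLabel": "emr-6.15.0",    # assumed release label
        "Applications": [{"Name": "Spark"}],
        # Logs, job code, input, and output all live in S3 (Option B),
        # so nothing that must persist depends on the cluster's HDFS.
        "LogUri": f"s3://{bucket}/emr-logs/",
        "Instances": {
            "InstanceGroups": [
                group("primary", "MASTER", 1),
                group("core", "CORE", 2),
                group("task", "TASK", 2),
            ],
            # Keep the provisioned cluster running between jobs.
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        "Steps": [
            {
                "Name": "spark-analysis",
                "ActionOnFailure": "CONTINUE",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": [
                        "spark-submit",
                        f"s3://{bucket}/jobs/analysis.py",  # hypothetical script
                        "--input", f"s3://{bucket}/input/",
                        "--output", f"s3://{bucket}/output/",
                    ],
                },
            }
        ],
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


# To launch (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("emr").run_job_flow(**build_cluster_config("example-bucket"))
```

The function only assembles a dictionary, so it can be inspected or adjusted before any AWS call is made.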