AWS Certified Data Analytics – Specialty — Question 82

A company is migrating from an on-premises Apache Hadoop cluster to an Amazon EMR cluster. The cluster runs only during business hours. Due to a company requirement to avoid intraday cluster failures, the EMR cluster must be highly available. When the cluster is terminated at the end of each business day, the data must persist.
Which configurations would enable the EMR cluster to meet these requirements? (Choose three.)

Answer options

Correct answer: A, C, E

Explanation

The correct answer includes A, C, and E because EMR File System (EMRFS) allows for data to persist beyond the lifecycle of the cluster, and using AWS Glue Data Catalog provides a managed metastore, enhancing availability and integration. Multiple master nodes in a single Availability Zone (E) ensures high availability, whereas options B, D, and F do not fulfill the requirements as effectively as the selected choices.