AWS Certified Data Engineer – Associate (DEA-C01) — Question 250

A company is planning to migrate on-premises Apache Hadoop clusters to Amazon EMR. The company also needs to migrate a data catalog into a persistent storage solution.
The company currently stores the data catalog in an on-premises Apache Hive metastore on the Hadoop clusters. The company requires a serverless solution to migrate the data catalog.
Which solution will meet these requirements MOST cost-effectively?

Answer options

Correct answer: B

Explanation

Option B is correct because Amazon EMR can use the AWS Glue Data Catalog as a Hive-compatible metastore, so the existing Hive metastore can be migrated into a catalog that is serverless and persists independently of any cluster. Option A relies on Amazon S3, which is object storage rather than a metastore, so it does not give EMR the direct catalog integration that AWS Glue provides. Option C stores the catalog in Amazon Aurora MySQL, which adds a database to run and pay for and therefore adds unnecessary complexity and cost. Option D creates a new metastore instead of using a serverless catalog, so it does not reuse the existing metadata efficiently.
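
As a rough illustration of the integration described for Option B, the sketch below builds the configuration classifications that tell Hive and Spark on an EMR cluster to use the AWS Glue Data Catalog as their metastore. The helper name `glue_metastore_configurations` is hypothetical and chosen only for this example; the property name and factory class are the ones documented for EMR. The sketch covers only the cluster configuration. Moving the existing on-premises metadata into the catalog is a separate step that is not shown here.

```python
import json

# Documented client factory class that points Hive on EMR at the Glue Data Catalog.
GLUE_FACTORY = (
    "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
)


def glue_metastore_configurations():
    """Return EMR configuration classifications for Hive and Spark SQL.

    Hypothetical helper for illustration; the returned list has the shape
    expected by the Configurations setting when an EMR cluster is created.
    """
    props = {"hive.metastore.client.factory.class": GLUE_FACTORY}
    return [
        # Hive reads its metastore client factory from hive-site.
        {"Classification": "hive-site", "Properties": dict(props)},
        # Spark SQL reads the same property from spark-hive-site.
        {"Classification": "spark-hive-site", "Properties": dict(props)},
    ]


if __name__ == "__main__":
    # Print the JSON that could be saved to a file and supplied at cluster creation.
    print(json.dumps(glue_metastore_configurations(), indent=2))
```

With this configuration the catalog lives outside the cluster, so table definitions survive cluster termination and no metastore database has to be operated, which is the reasoning behind the cost argument above.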