Question

A company is planning to create a data lake in Amazon S3. The company wants to create tiered storage based on access patterns and cost objectives. The solution must include support for JDBC connections from legacy clients, metadata management that allows federation for access control, and batch-based ETL using PySpark and Scala. Operational management should be limited.
Which combination of components can meet these requirements? (Choose three.)

Accepted Answer

Correct answer: A, C, E.

A. AWS Glue Data Catalog for metadata management
C. AWS Glue for Scala-based ETL
E. Amazon Athena for querying data in Amazon S3 using JDBC drivers

The AWS Glue Data Catalog is a managed metadata store, so there is no metastore database to run. AWS Glue runs batch ETL jobs written in PySpark or Scala without clusters to provision. Amazon Athena queries data in S3 serverlessly and accepts JDBC connections from legacy clients. Options B and D do not meet the requirement for limited operational management, and option F adds the complexity of operating a MySQL-compatible metastore.
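To make the JDBC requirement concrete, here is a minimal sketch of how a legacy client's connection string for Athena might be assembled. It assumes the URL format of the Simba-based Athena JDBC driver (2.x); other driver versions use different property names. The helper name `athena_jdbc_url`, the region, and the bucket are hypothetical and chosen only for illustration.

```python
def athena_jdbc_url(region: str, s3_output: str) -> str:
    """Build an Athena JDBC URL (assumed Simba 2.x driver format).

    region    -- AWS region that hosts the Athena endpoint
    s3_output -- S3 location where Athena writes query results
    """
    if not s3_output.startswith("s3://"):
        raise ValueError("query result location must be an s3:// URI")
    # Properties are semicolon-separated key=value pairs after the scheme.
    return f"jdbc:awsathena://AwsRegion={region};S3OutputLocation={s3_output};"


# Hypothetical region and results bucket, for illustration only.
url = athena_jdbc_url("us-east-1", "s3://example-query-results/athena/")
print(url)
```

A legacy client would pass a URL like this, together with credentials, to the JDBC driver. Athena resolves table definitions through the Glue Data Catalog, so the client needs no separate metastore endpoint.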

AWS Certified Data Analytics – Specialty — Question 12
