AWS Certified Machine Learning – Specialty — Question 18

A financial services company is building a robust serverless data lake on Amazon S3. The data lake should be flexible and meet the following requirements:
✑ Support querying old and new data on Amazon S3 through Amazon Athena and Amazon Redshift Spectrum.
✑ Support event-driven ETL pipelines.
✑ Provide a quick and easy way to understand metadata.
Which approach meets these requirements?

Answer options

Correct answer: A

Explanation

Option A is correct because it combines three services that together cover every requirement: an AWS Glue crawler scans the data in Amazon S3 and infers its schema, an AWS Lambda function triggers the AWS Glue ETL jobs in response to events, and the AWS Glue Data Catalog makes the metadata easy to search and understand. Both Amazon Athena and Amazon Redshift Spectrum can use the Glue Data Catalog, so tables registered by the crawler can be queried from either service. The other options either use AWS Batch jobs, which do not fit the event-driven ETL requirement, or rely on an external Apache Hive metastore, which is harder to manage than the native AWS Glue Data Catalog.
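
The event-driven part of this approach can be sketched as a Lambda handler that starts a Glue job run whenever S3 reports a new object. This is a minimal illustration, not the exam's reference solution: the job name `example-etl-job`, the `GLUE_JOB_NAME` environment variable, and the `--source_path` job argument are hypothetical, and it assumes the Lambda function is subscribed to S3 object-created notifications and has permission to call `glue:StartJobRun`.

```python
import os
from urllib.parse import unquote_plus

# Hypothetical job name; in a real deployment this would be set on the function.
GLUE_JOB_NAME = os.environ.get("GLUE_JOB_NAME", "example-etl-job")


def job_arguments(event):
    """Turn an S3 event notification into one Glue argument dict per object."""
    args = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        # "--source_path" is an assumed argument name the ETL script would read.
        args.append({"--source_path": f"s3://{bucket}/{key}"})
    return args


def start_runs(glue_client, event, job_name=GLUE_JOB_NAME):
    """Start one Glue job run per new object and return the run IDs."""
    run_ids = []
    for arguments in job_arguments(event):
        response = glue_client.start_job_run(JobName=job_name, Arguments=arguments)
        run_ids.append(response["JobRunId"])
    return run_ids


def lambda_handler(event, context):
    # Imported here so the helpers above can be exercised without AWS access.
    import boto3

    return {"job_runs": start_runs(boto3.client("glue"), event)}
```

Passing the Glue client into `start_runs` keeps the event parsing testable with a stub. Whether to start one job run per object or to batch several objects into one run is a design choice that depends on file sizes and arrival rate.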