A financial services company is building a robust serverless data lake on Amazon S3. The…

Question

A financial services company is building a robust serverless data lake on Amazon S3. The data lake should be flexible and meet the following requirements:
- Support querying old and new data on Amazon S3 through Amazon Athena and Amazon Redshift Spectrum.
- Support event-driven ETL pipelines.
- Provide a quick and easy way to understand metadata.
Which approach meets these requirements?

Accepted Answer

Correct answer: A. Use an AWS Glue crawler to crawl S3 data, an AWS Lambda function to trigger an AWS Glue ETL job, and an AWS Glue Data Catalog to search and discover metadata.

Option A meets all three requirements. The AWS Glue crawler scans the S3 data and records its schema in the AWS Glue Data Catalog, which both Amazon Athena and Amazon Redshift Spectrum can use to query old and new data. The AWS Lambda function starts the AWS Glue ETL job in response to an event, which makes the pipeline event-driven. The Data Catalog also gives a single place to search and discover metadata. The other options either use AWS Batch jobs, which do not fit the event-driven ETL requirement, or rely on an external Apache Hive metastore, which makes metadata management more complicated than the native AWS Glue Data Catalog.
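The event-driven part of option A can be sketched as a Lambda handler that starts a Glue job run for each object in an S3 event notification. This is only a minimal sketch under assumptions: the job name `example-etl-job`, the `GLUE_JOB_NAME` environment variable, and the `--source_bucket` / `--source_key` job arguments are hypothetical and do not come from the question. The only AWS call used is the boto3 Glue client's `start_job_run`.

```python
import os
from urllib.parse import unquote_plus

# Hypothetical job name (assumption, not part of the question).
GLUE_JOB_NAME = os.environ.get("GLUE_JOB_NAME", "example-etl-job")


def start_etl_for_event(event, glue_client, job_name=GLUE_JOB_NAME):
    """Start one Glue job run per S3 object in the event; return the run IDs."""
    run_ids = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        response = glue_client.start_job_run(
            JobName=job_name,
            # Hypothetical argument names; Glue job arguments use "--" prefixed keys.
            Arguments={"--source_bucket": bucket, "--source_key": key},
        )
        run_ids.append(response["JobRunId"])
    return run_ids


def lambda_handler(event, context):
    # boto3 is imported here so the helper above can be exercised without AWS access.
    import boto3

    return {"job_run_ids": start_etl_for_event(event, boto3.client("glue"))}
```

Passing the Glue client in as a parameter keeps the event-parsing logic testable with a stub. In a real deployment the S3 bucket would need an event notification configured to invoke the function, and the function's role would need permission to start the Glue job.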

AWS Certified Machine Learning – Specialty — Question 18
