AWS Certified Data Engineer – Associate (DEA-C01) — Question 243

A data engineer at a company is optimizing extract, transform, and load (ETL) workflows. The current architecture uses Amazon EMR and Apache Spark for large-scale transformations and AWS Glue for other ETL tasks. The workflows load processed data into an Amazon S3-based data lake.

The company wants to move to a fully managed serverless solution that can orchestrate multiple ETL jobs and automate execution. The new solution must continue to use Spark to process data. The company needs to orchestrate and automate the ETL workflows with minimal manual intervention.

Which solution will meet these requirements?

Answer options

Correct answer: A

Explanation

The correct answer is A because AWS Glue is a fully managed, serverless ETL service that runs Apache Spark jobs and can orchestrate and automate them through Glue workflows and triggers. This covers all three requirements: serverless operation, continued use of Spark, and orchestration with minimal manual intervention. Option B, while it integrates multiple services, is not a fully managed serverless solution focused on ETL. Option C focuses on event-driven processing rather than orchestration. Option D, although valid for scheduling, is not entirely serverless for ETL orchestration.
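
As an illustration of the orchestration described above, the following is a minimal sketch of how a chain of Glue Spark jobs could be wired into one Glue workflow using the boto3 `create_workflow` and `create_trigger` calls. The workflow name, job names, schedule, and region are hypothetical placeholders, and the jobs are assumed to already exist in AWS Glue. The request payloads are built separately from the AWS calls so they can be inspected without credentials.

```python
from typing import Any


def build_workflow_requests(
    workflow: str, jobs: list[str], cron: str
) -> list[tuple[str, dict[str, Any]]]:
    """Return (Glue client method, kwargs) pairs that run `jobs` in order."""
    if not jobs:
        raise ValueError("at least one job name is required")

    requests: list[tuple[str, dict[str, Any]]] = [
        ("create_workflow", {"Name": workflow}),
        # A scheduled trigger starts the first job with no manual step.
        ("create_trigger", {
            "Name": f"{workflow}-start",
            "WorkflowName": workflow,
            "Type": "SCHEDULED",
            "Schedule": cron,
            "Actions": [{"JobName": jobs[0]}],
            "StartOnCreation": True,
        }),
    ]

    # Each later job starts only after the previous job has succeeded.
    for prev_job, next_job in zip(jobs, jobs[1:]):
        requests.append(("create_trigger", {
            "Name": f"{workflow}-{prev_job}-to-{next_job}",
            "WorkflowName": workflow,
            "Type": "CONDITIONAL",
            "Predicate": {"Conditions": [{
                "LogicalOperator": "EQUALS",
                "JobName": prev_job,
                "State": "SUCCEEDED",
            }]},
            "Actions": [{"JobName": next_job}],
            "StartOnCreation": True,
        }))
    return requests


def apply_requests(requests: list[tuple[str, dict[str, Any]]], region: str) -> None:
    """Send the prepared requests to AWS Glue (needs boto3 and credentials)."""
    import boto3  # imported here so the builder above works without boto3

    glue = boto3.client("glue", region_name=region)
    for method, kwargs in requests:
        getattr(glue, method)(**kwargs)


# Hypothetical names, for illustration only.
plan = build_workflow_requests(
    "nightly-etl", ["clean-raw", "aggregate"], "cron(0 2 * * ? *)"
)
# apply_requests(plan, "us-east-1")  # uncomment to create the resources
```

Keeping the payload construction apart from `apply_requests` is a design choice for this sketch: the trigger chain can be reviewed or unit tested before anything is created in the account.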