AWS Certified Data Engineer – Associate (DEA-C01) — Question 152
A data engineer wants to orchestrate a set of extract, transform, and load (ETL) jobs that run on AWS. The ETL jobs contain tasks that must run Apache Spark jobs on Amazon EMR, make API calls to Salesforce, and load data into Amazon Redshift.
The ETL jobs need to handle failures and retries automatically. The data engineer needs to use Python to orchestrate the jobs.
Which service will meet these requirements?
Answer options
- A. Amazon Managed Workflows for Apache Airflow (Amazon MWAA)
- B. AWS Step Functions
- C. AWS Glue
- D. Amazon EventBridge
Correct answer: A
Explanation
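Amazon Managed Workflows for Apache Airflow (Amazon MWAA) is the managed Apache Airflow service. Airflow workflows are directed acyclic graphs (DAGs) written in Python, which satisfies the Python requirement. Each task can be configured with automatic retries and failure handling, and Airflow provider packages exist for Amazon EMR, Amazon Redshift, and Salesforce, so one DAG can cover all three kinds of task. AWS Step Functions also supports retries and error handling, but its workflows are defined declaratively in Amazon States Language (JSON) rather than as Python code. AWS Glue is primarily a data integration service; its workflows orchestrate Glue jobs and crawlers, not Spark jobs on Amazon EMR or Salesforce API calls. Amazon EventBridge routes events and triggers targets on a schedule or in response to events, but it does not manage dependencies, state, or retries across the steps of a multi-step ETL workflow.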
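To make the "failures and retries" requirement concrete, here is a minimal standard-library sketch of what a Python orchestrator does for each task: run tasks in dependency order and retry a failing task a fixed number of times. It is a toy illustration, not Airflow's implementation or API. The helper names (`run_with_retries`, `run_pipeline`) and the three task functions are hypothetical stand-ins for the EMR, Salesforce, and Redshift steps in the question. In a real Airflow DAG the same behavior is declared through the `retries` and `retry_delay` task arguments instead of being hand-written.

```python
import time
from typing import Callable, Dict, List


def run_with_retries(task: Callable[[], str], retries: int, retry_delay: float = 0.0) -> str:
    """Run a task; on failure, retry up to `retries` extra times, then re-raise."""
    attempt = 0
    while True:
        try:
            return task()
        except Exception:
            if attempt >= retries:
                raise  # retries exhausted: the task (and the pipeline) fails
            attempt += 1
            time.sleep(retry_delay)


def run_pipeline(tasks: Dict[str, Callable[[], str]], order: List[str], retries: int = 2) -> Dict[str, str]:
    """Run tasks one after another in the given order, collecting their results."""
    results: Dict[str, str] = {}
    for name in order:
        results[name] = run_with_retries(tasks[name], retries)
    return results


# Hypothetical stand-ins for the three kinds of task named in the question.
calls = {"salesforce": 0}


def spark_on_emr() -> str:
    return "spark step finished"


def call_salesforce() -> str:
    # Fails on the first attempt to simulate a transient API error.
    calls["salesforce"] += 1
    if calls["salesforce"] < 2:
        raise ConnectionError("transient API failure")
    return "salesforce records fetched"


def load_redshift() -> str:
    return "redshift load finished"


order = ["spark_on_emr", "call_salesforce", "load_redshift"]
results = run_pipeline(
    {
        "spark_on_emr": spark_on_emr,
        "call_salesforce": call_salesforce,
        "load_redshift": load_redshift,
    },
    order=order,
)
```

Here the simulated Salesforce call fails once and succeeds on its retry, so the pipeline still completes. With Amazon MWAA, the engineer writes only the DAG and its retry settings in Python, and the Airflow scheduler carries out this loop.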