AWS Certified Data Engineer – Associate (DEA-C01) — Question 14

A company is migrating on-premises workloads to AWS. The company wants to reduce overall operational overhead. The company also wants to explore serverless options.
The company's current workloads use Apache Pig, Apache Oozie, Apache Spark, Apache HBase, and Apache Flink. The on-premises workloads process petabytes of data in seconds. The company must maintain similar or better performance after the migration to AWS.
Which extract, transform, and load (ETL) service will meet these requirements?

Answer options

Correct answer: B

Explanation

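Amazon EMR meets these requirements because it is a managed big data platform that runs all of the frameworks the company already uses: Apache Pig, Apache Oozie, Apache Spark, Apache HBase, and Apache Flink. The workloads can therefore move to AWS with little rework, and cluster provisioning and scaling are managed by AWS, which reduces operational overhead. Amazon EMR also offers a serverless deployment option (EMR Serverless) for Spark and Hive jobs, which lets the company explore serverless. AWS Glue is a serverless ETL service, but its jobs are Spark-based or Python-based and it does not run Pig, Oozie, HBase, or Flink, so the existing workloads would have to be rewritten, and it may not match Amazon EMR at the scale and speed described. AWS Lambda is a serverless compute service with a 15-minute execution limit per invocation and is not designed for ETL at this scale. Amazon Redshift is a data warehouse service, not a platform for running these ETL frameworks.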
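
To make the framework match concrete, the following is a minimal sketch of a request that asks Amazon EMR for a cluster with the five frameworks from the question installed. It only builds the parameter dictionary for the boto3 EMR `run_job_flow` call and does not contact AWS. The cluster name, release label, instance type, node count, and the default role names are assumptions chosen for illustration, not values from the question.

```python
from typing import Any, Dict, List

# Frameworks named in the question, written as EMR application names.
FRAMEWORKS: List[str] = ["Pig", "Oozie", "Spark", "HBase", "Flink"]


def build_emr_request(
    name: str = "migrated-etl-cluster",  # hypothetical cluster name
    release_label: str = "emr-6.15.0",   # assumed release; pick one that ships all five apps
    instance_type: str = "m5.xlarge",    # assumed instance type, far too small for petabytes
    core_nodes: int = 2,                 # assumed node count for the sketch
) -> Dict[str, Any]:
    """Build keyword arguments for the boto3 EMR client's run_job_flow call."""
    return {
        "Name": name,
        "ReleaseLabel": release_label,
        # One entry per framework the on-premises workloads rely on.
        "Applications": [{"Name": app} for app in FRAMEWORKS],
        "Instances": {
            "InstanceGroups": [
                {
                    "Name": "Primary",
                    "InstanceRole": "MASTER",
                    "InstanceType": instance_type,
                    "InstanceCount": 1,
                },
                {
                    "Name": "Core",
                    "InstanceRole": "CORE",
                    "InstanceType": instance_type,
                    "InstanceCount": core_nodes,
                },
            ],
            # Keep the cluster running after steps finish (long-lived HBase, Oozie).
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        # Default role names created by `aws emr create-default-roles` (assumed to exist).
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


if __name__ == "__main__":
    request = build_emr_request()
    print([app["Name"] for app in request["Applications"]])
```

With AWS credentials configured, the dictionary could be passed as `boto3.client("emr").run_job_flow(**build_emr_request())`. Real sizing for a petabyte-scale workload would need much larger instance groups or instance fleets than the placeholder values shown here.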