AWS Certified Data Engineer – Associate (DEA-C01) — Question 124
A data engineer is processing and analyzing multiple terabytes of raw data that is in Amazon S3. The data engineer needs to clean and prepare the data. Then the data engineer needs to load the data into Amazon Redshift for analytics.
The data engineer needs a solution that will give data analysts the ability to perform complex queries. The solution must eliminate the need to perform complex extract, transform, and load (ETL) processes or to manage infrastructure.
Which solution will meet these requirements with the LEAST operational overhead?
Answer options
- A. Use Amazon EMR to prepare the data. Use AWS Step Functions to load the data into Amazon Redshift. Use Amazon QuickSight to run queries.
- B. Use AWS Glue DataBrew to prepare the data. Use AWS Glue to load the data into Amazon Redshift. Use Amazon Redshift to run queries.
- C. Use AWS Lambda to prepare the data. Use Amazon Kinesis Data Firehose to load the data into Amazon Redshift. Use Amazon Athena to run queries.
- D. Use AWS Glue to prepare the data. Use AWS Database Migration Service (AWS DMS) to load the data into Amazon Redshift. Use Amazon Redshift Spectrum to run queries.
Correct answer: B
Explanation
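B is correct because every step runs on a serverless, managed service. AWS Glue DataBrew provides a visual, low-code interface for cleaning and preparing the raw data in Amazon S3, so no transformation code has to be written. AWS Glue then loads the prepared data into Amazon Redshift without any servers or clusters to manage, and analysts run their complex SQL queries directly in Amazon Redshift, where the data now resides.
The other options add operational overhead or use a service that does not fit its step. A: Amazon EMR typically means provisioning and tuning clusters, AWS Step Functions orchestrates workflows rather than loading data, and Amazon QuickSight is a visualization tool, not an engine for complex queries. C: AWS Lambda has execution time and memory limits that make it a poor fit for multi-terabyte preparation, and Amazon Athena queries data in Amazon S3, not the data loaded into Amazon Redshift. D: AWS DMS is designed for database migration and replication rather than loading prepared files, and Amazon Redshift Spectrum queries external data in Amazon S3 rather than the loaded tables.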
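To make the load step in option B more concrete, here is a minimal sketch of the kind of Redshift COPY statement that a load from S3 generally comes down to. It assumes the DataBrew job wrote its cleaned output to S3 as Parquet; the table name, bucket prefix, and IAM role ARN are hypothetical placeholders, and in the actual solution AWS Glue would manage this step rather than hand-written SQL.

```python
def build_copy_statement(table: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Return a Redshift COPY statement that loads Parquet files from an S3 prefix."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS PARQUET;"
    )


# All names below are hypothetical and exist only for illustration.
sql = build_copy_statement(
    table="analytics.sales_clean",
    s3_prefix="s3://example-prepared-bucket/sales/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftLoadRole",
)
print(sql)
```

Once the data is in a Redshift table such as the hypothetical `analytics.sales_clean`, analysts can query it with standard SQL in Amazon Redshift, which is the last step of option B.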