Question

A data engineer needs to build an extract, transform, and load (ETL) job. The ETL job will process daily incoming .csv files that users upload to an Amazon S3 bucket. The size of each S3 object is less than 100 MB.
Which solution will meet these requirements MOST cost-effectively?

Accepted Answer

Correct answer: D. Write an AWS Glue Python shell job. Use pandas to transform the data. Each file is under 100 MB, so it fits comfortably in memory on a single node and does not need a distributed engine. AWS Glue Python shell jobs run serverlessly and can be allocated as little as a fraction of a data processing unit (DPU), which makes them well suited to lightweight pandas transformations of small datasets. Options A and B involve more complex setups with higher operational costs. Option C would also work, but it is less cost-efficient for processing small .csv files.
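To illustrate answer D, here is a minimal sketch of what the body of such a Python shell job might look like. It is not an official template. The function names `transform_csv` and `run_job`, the bucket and key arguments, and the cleanup steps (normalizing headers, dropping empty rows) are all assumptions made for illustration, since the question does not specify the transformation.

```python
import io

import pandas as pd


def transform_csv(raw: bytes) -> bytes:
    """Apply a small, illustrative pandas transformation to one CSV file."""
    # The whole file is loaded into memory, which is reasonable for objects < 100 MB.
    df = pd.read_csv(io.BytesIO(raw))
    # Hypothetical cleanup steps: normalize header names, drop fully empty rows.
    df.columns = [c.strip().lower() for c in df.columns]
    df = df.dropna(how="all")
    return df.to_csv(index=False).encode("utf-8")


def run_job(s3_client, src_bucket, src_key, dst_bucket, dst_key):
    """Read one object from S3, transform it in memory, and write the result back."""
    body = s3_client.get_object(Bucket=src_bucket, Key=src_key)["Body"].read()
    s3_client.put_object(Bucket=dst_bucket, Key=dst_key, Body=transform_csv(body))
```

Inside a Glue Python shell job, `s3_client` would typically be created with `boto3.client("s3")`, and the bucket and key values would come from job parameters. Passing the client in as an argument keeps the transformation easy to test locally without AWS access.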

AWS Certified Data Engineer – Associate (DEA-C01) — Question 62
