AWS Certified Data Engineer – Associate (DEA-C01) — Question 174

A car sales company maintains data about cars that are listed for sale in an area. The company receives data about new car listings from vendors who upload the data daily as compressed files into Amazon S3. The compressed files are up to 5 KB in size. The company wants to see the most up-to-date listings as soon as the data is uploaded to Amazon S3.

A data engineer must automate and orchestrate the data processing workflow of the listings to feed a dashboard. The data engineer must also provide the ability to perform one-time queries and analytical reporting. The query solution must be scalable.

Which solution will meet these requirements MOST cost-effectively?

Answer options

Correct answer: D

Explanation

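The correct answer is D. AWS Glue processes the data serverlessly, so the company pays only while jobs run, and AWS Lambda triggered by S3 Event Notifications starts processing as soon as a vendor uploads a file. This event-driven orchestration keeps the dashboard current without polling or an always-on scheduler, and it scales with the number of uploads. Because the files are small (up to 5 KB) and arrive daily, a long-running cluster would sit mostly idle. Options A and B rely on Amazon EMR, which typically costs more and adds cluster management overhead for a workload of this size. Option C uses Amazon Redshift Spectrum, which depends on Amazon Redshift compute and is usually less cost-effective for one-time queries than Amazon Athena, which is serverless and bills per query based on the data scanned.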
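
As a rough illustration of the event-driven orchestration described above, the following is a minimal sketch of a Lambda handler that reacts to an S3 Event Notification and starts an AWS Glue job run for each uploaded object. The job name `process-car-listings`, the job arguments `--source_bucket` and `--source_key`, and the optional `glue_client` parameter (added so the handler can be exercised without AWS access) are assumptions made for this example; they do not come from the question.

```python
import urllib.parse

# Hypothetical Glue job name; replace with the job defined in your account.
GLUE_JOB_NAME = "process-car-listings"


def extract_s3_objects(event):
    """Return (bucket, key) pairs found in an S3 Event Notification payload."""
    objects = []
    for record in event.get("Records", []):
        s3_info = record.get("s3", {})
        bucket = s3_info.get("bucket", {}).get("name")
        key = s3_info.get("object", {}).get("key")
        if bucket and key:
            # Object keys arrive URL-encoded in S3 event notifications.
            objects.append((bucket, urllib.parse.unquote_plus(key)))
    return objects


def lambda_handler(event, context, glue_client=None):
    """Start one Glue job run per uploaded object and return the run IDs."""
    if glue_client is None:
        # Imported lazily so the module loads even where boto3 is absent.
        import boto3

        glue_client = boto3.client("glue")

    run_ids = []
    for bucket, key in extract_s3_objects(event):
        response = glue_client.start_job_run(
            JobName=GLUE_JOB_NAME,
            # Hypothetical argument names that the Glue script would read.
            Arguments={"--source_bucket": bucket, "--source_key": key},
        )
        run_ids.append(response["JobRunId"])
    return {"started_job_runs": run_ids}
```

In a deployment along these lines, the S3 bucket would be configured to send object-created notifications to this function, and the Glue job would write its output to a location that Athena can query.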