AWS Certified Solutions Architect – Professional — Question 827
A company receives clickstream data files in Amazon S3 every five minutes. A Python script runs as a cron job once a day on an Amazon EC2 instance to process each file and load it into a database hosted on Amazon RDS. The cron job takes 15 to 30 minutes to process 24 hours of data. The data consumers ask that the data be available as soon as possible.
Which solution would accomplish the desired outcome?
Answer options
- A. Increase the size of the instance to speed up processing and update the schedule to run once an hour.
- B. Convert the cron job to an AWS Lambda function and trigger this new function using a cron job on an EC2 instance.
- C. Convert the cron job to an AWS Lambda function and schedule it to run once an hour using Amazon CloudWatch Events.
- D. Create an AWS Lambda function that runs when a file is delivered to Amazon S3 using S3 event notifications.
Correct answer: D
Explanation
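An AWS Lambda function triggered by Amazon S3 event notifications processes each clickstream file as soon as it arrives, so the data reaches the database in near real time instead of waiting for the next scheduled run. This best meets the requirement that the data be available as soon as possible. The workload also fits Lambda: 24 hours of data is about 288 files (one every five minutes), so a daily run of 15 to 30 minutes works out to roughly 3 to 6 seconds per file, far below Lambda's 15-minute maximum execution time. Options A, B, and C all keep schedule-based execution, so each file still waits for the next run (up to an hour in A and C), and A and B also keep an EC2 instance to manage. Option D is event-driven and eliminates the overhead of managing Amazon EC2 instances.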
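
The event-driven approach in Option D can be sketched as a Python Lambda handler. This is a minimal illustration under stated assumptions, not the company's actual script: `extract_objects` and `load_into_database` are hypothetical names, and the parse-and-load logic from the existing cron script is left as a placeholder. The only parts taken from AWS are the shape of the S3 event notification payload (`Records[].s3.bucket.name` and `Records[].s3.object.key`) and the boto3 `get_object` call.

```python
import urllib.parse


def extract_objects(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (a space becomes '+'), so decode them.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def load_into_database(bucket, key, body):
    """Hypothetical placeholder for the existing parse-and-load-to-RDS logic."""
    raise NotImplementedError("port the cron script's per-file logic here")


def lambda_handler(event, context):
    # boto3 is imported lazily so the parsing helper can be tested without it.
    import boto3

    s3 = boto3.client("s3")
    objects = extract_objects(event)
    for bucket, key in objects:
        # Read the newly delivered clickstream file and hand it to the loader.
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        load_into_database(bucket, key, body)
    return {"processed": len(objects)}
```

With an S3 event notification for object-created events configured on the bucket, each delivered file would invoke `lambda_handler` once, so the function only ever handles a single five-minute file (or a small batch of records) per invocation.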