AWS Certified Solutions Architect – Professional — Question 972
A company deploys a new web application. As part of the setup, the company configures AWS WAF to log to Amazon S3 through Amazon Kinesis Data Firehose.
The company develops an Amazon Athena query that runs once daily to return AWS WAF log data from the previous 24 hours. The volume of daily logs is constant. However, over time, the same query is taking more time to run.
A solutions architect needs to design a solution to prevent the query time from continuing to increase. The solution must minimize operational overhead.
Which solution will meet these requirements?
Answer options
- A. Create an AWS Lambda function that consolidates each day's AWS WAF logs into one log file.
- B. Reduce the amount of data scanned by configuring AWS WAF to send logs to a different S3 bucket each day.
- C. Update the Kinesis Data Firehose configuration to partition the data in Amazon S3 by date and time. Create external tables for Amazon Redshift. Configure Amazon Redshift Spectrum to query the data source.
- D. Modify the Kinesis Data Firehose configuration and Athena table definition to partition the data by date and time. Change the Athena query to view the relevant partitions.
Correct answer: D
Explanation
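The query slows down because the table is unpartitioned: Athena has to scan every log object that has accumulated in the bucket, so the amount of data scanned grows each day even though the daily log volume is constant. Partitioning the data in Amazon S3 by date and time through Amazon Kinesis Data Firehose, and declaring the same partitions in the Athena table, lets Athena read only the partitions that hold the last 24 hours of data. Query runtime and cost then stay roughly flat as the total log volume grows. The other options do not meet the requirements. Option A adds a custom Lambda function to maintain and still leaves Athena scanning the full history. Option B adds a new bucket to manage every day. Option C adds Amazon Redshift and Redshift Spectrum, which is more operational overhead than changing the existing Firehose and Athena configuration.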
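
As an illustration of restricting a query to the relevant partitions, here is a minimal Python sketch that builds the daily query text. It is an assumption-laden example, not the configuration from the question: the table name `waf_logs`, the single string partition column `datehour`, and its `yyyy/MM/dd/HH` layout are all hypothetical, chosen so that a plain string range comparison selects the right hours.

```python
from datetime import datetime, timedelta, timezone

# Assumed layout of the hypothetical `datehour` partition column (yyyy/MM/dd/HH).
# Zero-padded fields make lexicographic order match chronological order.
PARTITION_FORMAT = "%Y/%m/%d/%H"


def build_daily_query(now, table="waf_logs", hours=24):
    """Return Athena SQL limited to the partitions covering the last `hours` hours."""
    start = now - timedelta(hours=hours)
    lower = start.strftime(PARTITION_FORMAT)  # oldest hour partition to read
    upper = now.strftime(PARTITION_FORMAT)    # newest hour partition to read
    return (
        f"SELECT * FROM {table} "
        f"WHERE datehour BETWEEN '{lower}' AND '{upper}'"
    )


if __name__ == "__main__":
    # Firehose time-based prefixes use UTC, so the window is computed in UTC too.
    print(build_daily_query(datetime.now(timezone.utc)))
```

Because the filter works at hour granularity, the result can include up to one extra hour of records at the start of the window. A second predicate on the log record's own timestamp could trim that if an exact 24-hour window matters.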