AWS Certified Solutions Architect – Professional (SAP-C02) — Question 391
A company deploys a new web application. As part of the setup, the company configures AWS WAF to log to Amazon S3 through Amazon Kinesis Data Firehose. The company develops an Amazon Athena query that runs once daily to return AWS WAF log data from the previous 24 hours. The volume of daily logs is constant. However, over time, the same query is taking more time to run.
A solutions architect needs to design a solution to prevent the query time from continuing to increase. The solution must minimize operational overhead.
Which solution will meet these requirements?
Answer options
- A. Create an AWS Lambda function that consolidates each day's AWS WAF logs into one log file.
- B. Reduce the amount of data scanned by configuring AWS WAF to send logs to a different S3 bucket each day.
- C. Update the Kinesis Data Firehose configuration to partition the data in Amazon S3 by date and time. Create external tables for Amazon Redshift. Configure Amazon Redshift Spectrum to query the data source.
- D. Modify the Kinesis Data Firehose configuration and Athena table definition to partition the data by date and time. Change the Athena query to view the relevant partitions.
Correct answer: D
Explanation
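Without partitions, Athena scans every log object under the table location on each run, so query time grows with the accumulated data even though the daily volume is constant. Partitioning the data in Amazon S3 by date and time through Kinesis Data Firehose, and updating the Athena table definition and query to target those partitions, limits each scan to the relevant 24-hour subset, so query time stays flat as the total data grows. Option C also partitions the data, but it adds Amazon Redshift and Redshift Spectrum, which is unnecessary operational overhead when Athena can query the partitioned data directly. Option A requires building and maintaining a custom Lambda function, and consolidating files does not stop Athena from scanning all historical data. Option B requires managing a new S3 bucket and a changed logging destination every day, which is far more operational work than standard partitioning.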
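The partition-targeting idea in option D can be sketched in Python. This is a minimal illustration, not a definitive implementation: it assumes the objects are laid out under Firehose's default UTC `YYYY/MM/dd/HH` prefix and that the Athena table exposes that prefix as a string partition column. The column name `datehour`, the table name `waf_logs`, and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Mirrors the UTC yyyy/MM/dd/HH layout assumed for the S3 prefix.
PARTITION_FORMAT = "%Y/%m/%d/%H"


def hourly_partitions(end: datetime, hours: int = 24) -> list[str]:
    """Return the hourly partition values covering the `hours` before `end`.

    `end` must be timezone-aware; it is converted to UTC first.
    """
    end_utc = end.astimezone(timezone.utc)
    end_hour = end_utc.replace(minute=0, second=0, microsecond=0)
    start_hour = (end_utc - timedelta(hours=hours)).replace(
        minute=0, second=0, microsecond=0
    )
    count = int((end_hour - start_hour) / timedelta(hours=1)) + 1
    return [
        (start_hour + timedelta(hours=i)).strftime(PARTITION_FORMAT)
        for i in range(count)
    ]


def build_query(table: str, end: datetime, hours: int = 24) -> str:
    """Build a query whose WHERE clause filters on the partition column only.

    `datehour` is a hypothetical partition column name. Zero-padded
    yyyy/MM/dd/HH strings sort chronologically, so BETWEEN is safe here.
    """
    parts = hourly_partitions(end, hours)
    return (
        f"SELECT * FROM {table} "
        f"WHERE datehour BETWEEN '{parts[0]}' AND '{parts[-1]}'"
    )


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(build_query("waf_logs", now))
```

Because the predicate is on the partition column, Athena can skip every prefix outside the window, so the amount of data scanned depends on the window size rather than on how long logs have been accumulating. The window is rounded out to whole hours, so a query that needs an exact 24-hour cutoff would add a second filter on the log record's own timestamp.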