AWS Certified Data Analytics – Specialty — Question 131

A reseller that has thousands of AWS accounts receives AWS Cost and Usage Reports in an Amazon S3 bucket. The reports are delivered to the S3 bucket in the following format:
<example-report-prefix>/<example-report-name>/yyyymmdd-yyyymmdd/<example-report-name>.parquet
An AWS Glue crawler crawls the S3 bucket and populates an AWS Glue Data Catalog with a table. Business analysts use Amazon Athena to query the table and create monthly summary reports for the AWS accounts. The business analysts are experiencing slow queries because of the accumulation of reports from the last 5 years. The business analysts want the operations team to make changes to improve query performance.
Which action should the operations team take to meet these requirements?

Answer options

Correct answer: D

Explanation

The correct answer is D. Partitioning the data by account ID, year, and month lets Athena prune partitions, so a monthly summary for one account reads only that account's data for that month instead of scanning 5 years of reports. Options A, B, and C do not provide the same level of optimization. Changing the file format to .csv.zip may reduce storage size, but it gives up Parquet's columnar layout and will not improve query performance. Partitioning only by date or month may still scan unnecessary data, because every query for a single account would read the rows of all the other accounts in that period.
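
As an illustration of the partitioning scheme described above, here is a minimal Python sketch. It makes several assumptions that are not in the question: the partition names `account_id`, `year`, and `month`, the table name, the `cost` column, and both helper functions are hypothetical. The first helper derives a Hive-style S3 key from the report's `yyyymmdd-yyyymmdd` folder name. The second builds an Athena query whose WHERE clause filters on the partition columns, which is what allows Athena to skip every other partition.

```python
from datetime import datetime


def partitioned_key(prefix: str, report_name: str,
                    account_id: str, billing_period: str) -> str:
    """Build a Hive-style S3 key (hypothetical layout) for one report file.

    billing_period is the 'yyyymmdd-yyyymmdd' folder name from the
    original report path; the start date decides the year and month.
    """
    start = datetime.strptime(billing_period.split("-")[0], "%Y%m%d")
    return (
        f"{prefix}/{report_name}/"
        f"account_id={account_id}/year={start.year}/month={start.month:02d}/"
        f"{report_name}.parquet"
    )


def monthly_summary_query(table: str, account_id: str,
                          year: int, month: int) -> str:
    """Build an Athena query that filters on the partition columns.

    'cost' is a placeholder column name, not a confirmed report field.
    Partition values are compared as strings, assuming the crawler
    registered the partition columns as strings.
    """
    return (
        f"SELECT SUM(cost) AS total_cost FROM {table} "
        f"WHERE account_id = '{account_id}' "
        f"AND year = '{year}' AND month = '{month:02d}'"
    )


key = partitioned_key("cur", "daily", "123456789012", "20230101-20230201")
query = monthly_summary_query("cur_table", "123456789012", 2023, 1)
```

With this layout, a query for one account and one month touches a single `account_id=.../year=.../month=...` prefix. Whether the existing reports are rewritten into this layout by an ETL job or by another mechanism is outside what the question states.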