An online retail company stores Application Load Balancer (ALB) access logs in an Amazon…

Question

An online retail company stores Application Load Balancer (ALB) access logs in an Amazon S3 bucket. The company wants to use Amazon Athena to query the logs to analyze traffic patterns. A data engineer creates an unpartitioned table in Athena. As the amount of the data gradually increases, the response time for queries also increases. The data engineer wants to improve the query performance in Athena. Which solution will meet these requirements with the LEAST operational effort?

Accepted Answer

Correct answer: B. Create an AWS Glue crawler that includes a classifier that determines the schema of all ALB access logs and writes the partition metadata to the AWS Glue Data Catalog.

Option B is correct because partitioning lets Athena scan only the S3 prefixes that match a query's filter on the partition columns instead of reading the entire dataset, and a Glue crawler discovers the schema and registers the partition metadata in the Data Catalog automatically, with minimal manual intervention. Options A and D involve more complex setups and more operational overhead. Option C is effective, but it requires additional steps to transform and save the data, which increases operational effort.
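As a rough illustration of option B, the sketch below builds the request for the Glue CreateCrawler API call with boto3. The bucket name, account ID, role ARN, database name, crawler name, and classifier name are all hypothetical placeholders, and the sketch assumes a custom classifier with that name has already been created. The helper functions build_crawler_request and create_crawler are illustrative, not part of any AWS library.

```python
def build_crawler_request(name, role_arn, database, s3_path,
                          classifier_names, schedule=None):
    """Build keyword arguments for the Glue CreateCrawler API call."""
    request = {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        # The crawler walks this prefix and registers each sub-prefix
        # (for ALB logs: year/month/day folders) as a partition.
        "Targets": {"S3Targets": [{"Path": s3_path}]},
        "Classifiers": list(classifier_names),
        # Update the catalog table on schema changes; only log deletions.
        "SchemaChangePolicy": {
            "UpdateBehavior": "UPDATE_IN_DATABASE",
            "DeleteBehavior": "LOG",
        },
    }
    if schedule:
        # Glue cron expression, so new partitions are picked up regularly.
        request["Schedule"] = schedule
    return request


def create_crawler(request):
    """Create and start the crawler. Requires AWS credentials."""
    import boto3  # imported lazily so the builder works without AWS access

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


# All values below are hypothetical placeholders.
request = build_crawler_request(
    name="alb-access-logs-crawler",
    role_arn="arn:aws:iam::123456789012:role/GlueCrawlerRole",
    database="alb_logs_db",
    s3_path="s3://example-alb-logs/AWSLogs/123456789012/"
            "elasticloadbalancing/us-east-1/",
    classifier_names=["alb-access-log-classifier"],
    schedule="cron(0 1 * * ? *)",
)
# create_crawler(request)  # uncomment to run against a real AWS account
```

Because ALB log prefixes are not in Hive-style key=value form, the crawler typically names the partition columns partition_0, partition_1, and so on. Queries then improve only when they filter on those columns, since that is what lets Athena skip the other prefixes.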

AWS Certified Data Engineer – Associate (DEA-C01) — Question 86
