
Question

A monitoring service generates 1 TB of scale metrics record data every minute. A research team performs queries on this data using Amazon Athena. The queries run slowly due to the large volume of data, and the team requires better performance.
How should the records be stored in Amazon S3 to improve query performance?

Accepted Answer

Correct answer: B. Parquet files. Parquet is a columnar storage format, so Athena can read only the columns a query references instead of every full record, and values stored column by column compress well. Both effects reduce the amount of data scanned, which improves query performance. CSV files and compressed JSON are row-oriented text formats, so a query still has to read and parse entire records even when it needs only a few fields. RecordIO is used mainly as a training-input format in Amazon SageMaker and is not among the data formats Athena is documented to query.
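The effect of a columnar layout on data scanned can be illustrated with a small, self-contained sketch. This is a toy model only: it is not Parquet's actual file format (which adds row groups, encodings, compression, and statistics), and every name in it (`make_records`, `row_layout`, `column_layout`, the `cpu` field, and so on) is hypothetical. It simply compares how many bytes a single-column aggregate has to read when the same records are stored row by row (CSV) versus column by column.

```python
import csv
import io
import random


def make_records(n, seed=0):
    """Generate n synthetic metric records (hypothetical schema)."""
    rng = random.Random(seed)
    return [
        {
            "timestamp": 1_700_000_000 + i,
            "host": f"host-{rng.randrange(50):03d}",
            "cpu": round(rng.uniform(0, 100), 2),
            "mem": round(rng.uniform(0, 100), 2),
        }
        for i in range(n)
    ]


def row_layout(records):
    """Row-oriented storage: one CSV blob where each line holds all fields."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue().encode("utf-8")


def column_layout(records):
    """Column-oriented storage: one contiguous byte chunk per column."""
    return {
        name: "\n".join(str(r[name]) for r in records).encode("utf-8")
        for name in records[0]
    }


def avg_cpu_from_rows(blob):
    """Average of 'cpu' from the CSV blob; the whole blob must be read."""
    reader = csv.DictReader(io.StringIO(blob.decode("utf-8")))
    values = [float(row["cpu"]) for row in reader]
    return sum(values) / len(values)


def avg_cpu_from_columns(chunks):
    """Average of 'cpu' from the columnar store; only one chunk is read."""
    values = [float(v) for v in chunks["cpu"].decode("utf-8").split("\n")]
    return sum(values) / len(values)


if __name__ == "__main__":
    records = make_records(10_000)
    blob = row_layout(records)
    chunks = column_layout(records)

    # Both layouts answer the query identically; they differ in bytes read.
    print("avg cpu (rows):   ", avg_cpu_from_rows(blob))
    print("avg cpu (columns):", avg_cpu_from_columns(chunks))
    print("bytes read, row layout:     ", len(blob))
    print("bytes read, columnar layout:", len(chunks["cpu"]))
```

In this sketch the row layout has to read every field of every record to answer a question about one field, while the columnar layout touches only the `cpu` chunk. The same idea, applied at Parquet's scale and combined with compression, is why answer B reduces the data Athena scans.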

AWS Certified Machine Learning – Specialty — Question 59
