AWS Certified Data Analytics – Specialty — Question 52

A transportation company uses IoT sensors attached to trucks to collect vehicle data for its global delivery fleet. The company currently sends the sensor data in small .csv files to Amazon S3. The files are then loaded into a 10-node Amazon Redshift cluster with two slices per node and queried using both Amazon Athena and Amazon Redshift. The company wants to optimize the files to reduce the cost of querying and also improve the speed of data loading into the Amazon Redshift cluster.
Which solution meets these requirements?

Answer options

Correct answer: D

Explanation

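Option D is correct because converting the small .csv files into multiple Apache Parquet files addresses both requirements. Parquet is a compressed, columnar format, so Athena, which bills by the amount of data scanned, reads only the columns a query needs and scans fewer bytes per query. Splitting the data into multiple files also lets the Amazon Redshift COPY command load in parallel across the cluster's 20 slices (10 nodes × 2 slices per node). Options A and C each produce a single file, which does not take the same advantage of parallel loading, and option B uses a different format that may be less efficient than Parquet for these queries.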
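
The arithmetic behind the "multiple files" choice can be sketched as follows. The question describes a 10-node cluster with two slices per node, and Redshift's COPY command distributes input files across slices, so a file count that is a multiple of the slice count keeps every slice busy. This is only an illustrative sketch: the function names and the row count are hypothetical and are not part of any AWS API.

```python
from typing import List


def total_slices(nodes: int, slices_per_node: int) -> int:
    """Number of slices available for a parallel COPY."""
    return nodes * slices_per_node


def plan_file_sizes(total_rows: int, file_count: int) -> List[int]:
    """Split total_rows into file_count near-equal parts (sizes differ by at most 1)."""
    base, extra = divmod(total_rows, file_count)
    # The first `extra` files take one additional row to absorb the remainder.
    return [base + 1 if i < extra else base for i in range(file_count)]


# Values from the question: 10 nodes, 2 slices per node.
slices = total_slices(10, 2)

# Hypothetical row count, chosen only to show the even split.
sizes = plan_file_sizes(1_000_003, slices)

print(slices)      # number of output files to aim for (or a multiple of it)
print(sizes[:4])   # rows per file for the first few files
```

Keeping the files close to the same size matters as much as the count, because the load finishes only when the slice with the largest file is done.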