AWS Certified Data Analytics – Specialty — Question 17

A streaming application is reading data from Amazon Kinesis Data Streams and immediately writing the data to an Amazon S3 bucket every 10 seconds. The application is reading data from hundreds of shards. The batch interval cannot be changed due to a separate requirement. The data is being accessed by Amazon Athena. Users are seeing degradation in query performance as time progresses.
Which action can help improve query performance?

Answer options

Correct answer: A

Explanation

Merging the files in Amazon S3 into larger files can significantly improve query performance. With hundreds of shards being written out every 10 seconds, the bucket accumulates a very large number of small files, and Athena spends a growing share of each query listing and opening files rather than reading data. Fewer, larger files reduce that per-file overhead. Increasing the number of shards in Kinesis Data Streams, or adding more memory and CPU to the application, does not address the small-file problem, and more shards could produce even more files per interval. Writing files to multiple S3 buckets spreads the same small files across more locations and complicates data retrieval without reducing their number.
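To make the scale of the small-file problem concrete, here is a minimal Python sketch. It rests on assumptions that are not stated in the question: that each shard produces one file per batch interval, that there are 200 shards (the question only says "hundreds"), and that roughly 128 MB is the target size for merged files. The function names `objects_per_day` and `plan_merge_groups` are hypothetical, and the sketch only plans which files to merge; it does not call any AWS API.

```python
from typing import List, Tuple


def objects_per_day(shards: int, interval_seconds: int) -> int:
    """Files written per day, assuming one file per shard per batch interval."""
    batches_per_day = (24 * 60 * 60) // interval_seconds
    return shards * batches_per_day


def plan_merge_groups(
    files: List[Tuple[str, int]], target_bytes: int
) -> List[List[str]]:
    """Greedily pack (key, size_in_bytes) pairs, in order, into merge groups.

    A group is closed as soon as adding the next file would push it past
    target_bytes, so each group stays at or below the target unless a
    single file is already larger than the target.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for key, size in files:
        if current and current_size + size > target_bytes:
            groups.append(current)
            current, current_size = [], 0
        current.append(key)
        current_size += size
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    # Assumed numbers: 200 shards, 10-second batches.
    print(objects_per_day(shards=200, interval_seconds=10))

    # Hypothetical keys and sizes (bytes); target of 128 MB per merged file.
    small_files = [(f"part-{i:04d}", 40 * 1024 * 1024) for i in range(10)]
    for group in plan_merge_groups(small_files, target_bytes=128 * 1024 * 1024):
        print(group)
```

Under these assumptions the application writes about 1.7 million files per day, which is why query performance degrades over time. A periodic compaction job that merges each planned group into one file would bring that count down without changing the 10-second batch interval.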