AWS Certified Data Analytics – Specialty — Question 105

A global pharmaceutical company receives test results for new drugs from various testing facilities worldwide. The results are sent in millions of 1 KB-sized JSON objects to an Amazon S3 bucket owned by the company. The data engineering team needs to process those files, convert them into Apache Parquet format, and load them into Amazon Redshift for data analysts to perform dashboard reporting. The engineering team uses AWS Glue to process the objects, AWS Step Functions for process orchestration, and Amazon CloudWatch for job scheduling.
More testing facilities were recently added, and the time to process files is increasing.
What will MOST efficiently decrease the data processing time?

Answer options

Correct answer: B

Explanation

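The correct answer is B. With millions of 1 KB objects, the AWS Glue job spends most of its time on per-file overhead (listing objects and scheduling a separate read for each one) rather than on the conversion itself. The AWS Glue dynamic frame file grouping option lets a single task read many small files as one group, which cuts that overhead and shortens processing time without changing the rest of the pipeline. Options A and D introduce additional steps and complexity, while option C does not address the need to process the files before loading them into Redshift.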
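
The file grouping described above can be sketched in a Glue PySpark script roughly as follows. This is a minimal illustration, not the company's actual job: the helper names, the bucket path, and the 128 MB group size are assumptions made for the example. The `groupFiles` and `groupSize` keys are the S3 connection options AWS Glue documents for grouping small files.

```python
# Minimal sketch, assuming a Glue PySpark job; helper names and defaults are hypothetical.

def build_grouped_s3_options(s3_path, group_size_bytes=128 * 1024 * 1024):
    """Build S3 connection options that ask Glue to group small files."""
    return {
        "paths": [s3_path],
        "recurse": True,
        # Group files within each S3 partition so one task reads many objects.
        "groupFiles": "inPartition",
        # Target size of each group in bytes; Glue expects this as a string.
        "groupSize": str(group_size_bytes),
    }


def read_grouped_json(glue_context, s3_path):
    """Read small JSON objects into a DynamicFrame using file grouping.

    glue_context is an awsglue.context.GlueContext created by the job script.
    """
    return glue_context.create_dynamic_frame.from_options(
        connection_type="s3",
        connection_options=build_grouped_s3_options(s3_path),
        format="json",
    )
```

In a job, the call would look something like `read_grouped_json(glue_context, "s3://example-bucket/test-results/")`, where the bucket name is a placeholder. The group size is a tuning choice: larger groups mean fewer tasks, each reading more data.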