AWS Certified Data Analytics – Specialty — Question 112

A company receives data from its vendor in JSON format with a timestamp in the file name. The vendor uploads the data to an Amazon S3 bucket, and the data is registered into the company's data lake for analysis and reporting. The company has configured an S3 Lifecycle policy to archive all files to S3 Glacier after 5 days.
The company wants to ensure that its AWS Glue crawler catalogs data only from S3 Standard storage and ignores the archived files. A data analytics specialist must implement a solution to achieve this goal without changing the current S3 bucket configuration.
Which solution meets these requirements?

Answer options

Correct answer: C

Explanation

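The correct answer is C. The excludeStorageClasses property tells AWS Glue to skip objects in the listed storage classes, such as S3 Glacier, so only files still in S3 Standard are cataloged. It works with the existing bucket and its S3 Lifecycle policy, so nothing in the current S3 configuration has to change. Option A is incorrect because exclude patterns match object paths, not storage classes, so they cannot single out files that have been archived. Option B is incorrect because moving files to another location adds unnecessary complexity and departs from the current bucket setup. Option D proposes including S3 Standard files, but it gives no reliable way to exclude files that have already transitioned to S3 Glacier.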
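
To make the effect of excludeStorageClasses concrete, here is a minimal sketch in plain Python. It does not call AWS. It builds an options dictionary in the shape that the AWS Glue documentation shows for additional_options, then applies the same kind of storage-class filter to a made-up object listing. The helper names, the sample keys, and the default of "STANDARD" for a missing storage class are assumptions for illustration, not part of the question or of the Glue API.

```python
# Storage classes to skip. "GLACIER" and "DEEP_ARCHIVE" are S3 storage class names.
EXCLUDED_STORAGE_CLASSES = ["GLACIER", "DEEP_ARCHIVE"]


def build_additional_options(excluded):
    """Return an options dict keyed by excludeStorageClasses (hypothetical helper)."""
    return {"excludeStorageClasses": list(excluded)}


def filter_objects(objects, excluded):
    """Keep objects whose storage class is not excluded.

    Illustration only: this mimics the filtering effect locally and is
    not how AWS Glue implements it.
    """
    excluded_set = set(excluded)
    return [
        obj for obj in objects
        # Assumption: treat a missing StorageClass as STANDARD.
        if obj.get("StorageClass", "STANDARD") not in excluded_set
    ]


# Hypothetical listing: one archived file, one recent file.
sample_objects = [
    {"Key": "vendor/data_2024-01-01.json", "StorageClass": "GLACIER"},
    {"Key": "vendor/data_2024-01-09.json", "StorageClass": "STANDARD"},
]

options = build_additional_options(EXCLUDED_STORAGE_CLASSES)
visible = filter_objects(sample_objects, options["excludeStorageClasses"])
print([obj["Key"] for obj in visible])  # prints ['vendor/data_2024-01-09.json']
```

In a real setup the same key and list of values would be supplied to AWS Glue rather than applied by hand. The point of the sketch is that the filter acts on the storage class of each object, which a path-based pattern cannot see.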