AWS Certified Data Engineer – Associate (DEA-C01) — Question 29

A company needs to partition the Amazon S3 storage that the company uses for a data lake. The partitioning will use a path of the S3 object keys in the following format: s3://bucket/prefix/year=2023/month=01/day=01.
A data engineer must ensure that the AWS Glue Data Catalog synchronizes with the S3 storage when the company adds new partitions to the bucket.
Which solution will meet these requirements with the LEAST latency?

Answer options

Correct answer: C

Explanation

Option C is correct because the code that writes data to S3 can call the Boto3 AWS Glue create_partition API in the same step, so the AWS Glue Data Catalog learns about each new partition as soon as it is written. Options A and B rely on scheduled or manual processes, so the catalog lags behind S3 until the next run. Option D recovers partitions that already exist in S3 but has to be triggered separately, so it does not provide real-time updates.
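As an illustration of option C, the following is a minimal sketch of how a writer job might register a partition right after it writes the objects. The helper names (partition_values, register_partition) and the database and table names in the usage note are hypothetical. The sketch assumes the table already exists in the Data Catalog, that its partition keys are year, month, and day in that order, and that glue_client is a client created with boto3.client("glue"). Copying the table's storage descriptor for the partition is one common approach, not the only one.

```python
import copy
import re

# Matches the Hive-style layout from the question: year=YYYY/month=MM/day=DD
PARTITION_RE = re.compile(r"year=(\d{4})/month=(\d{2})/day=(\d{2})")


def partition_values(s3_path):
    """Extract [year, month, day] as strings from a partition path."""
    match = PARTITION_RE.search(s3_path)
    if match is None:
        raise ValueError(f"no year=/month=/day= partition in {s3_path!r}")
    return list(match.groups())


def register_partition(glue_client, database, table, partition_location):
    """Register one S3 partition in the Glue Data Catalog.

    glue_client is assumed to be boto3.client("glue"); it is passed in
    so the caller controls region and credentials.
    """
    values = partition_values(partition_location)

    # Reuse the table's storage descriptor (formats, SerDe, columns) and
    # point its Location at the new partition prefix.
    table_sd = glue_client.get_table(DatabaseName=database, Name=table)[
        "Table"
    ]["StorageDescriptor"]
    partition_sd = copy.deepcopy(table_sd)
    partition_sd["Location"] = partition_location

    glue_client.create_partition(
        DatabaseName=database,
        TableName=table,
        PartitionInput={"Values": values, "StorageDescriptor": partition_sd},
    )
    return values
```

A writer would call something like register_partition(boto3.client("glue"), "my_database", "my_table", "s3://bucket/prefix/year=2023/month=01/day=01/") immediately after its S3 write succeeds. If the partition is already registered, Glue raises AlreadyExistsException, which a production job would likely want to catch and ignore.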