AWS Certified Data Engineer – Associate (DEA-C01) — Question 105

A data engineer needs to debug an AWS Glue job that reads from Amazon S3 and writes to Amazon Redshift. The data engineer enabled the bookmark feature for the AWS Glue job and has set the maximum concurrency for the job to 1.

The AWS Glue job is successfully writing the output to Amazon Redshift. However, the Amazon S3 files that were loaded during previous runs of the AWS Glue job are being reprocessed by subsequent runs.

What is the likely reason the AWS Glue job is reprocessing the files?

Answer options

Correct answer: D

Explanation

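The correct answer is D. A job bookmark is updated only when the script calls job.commit() at the end of a successful run, after job.init() at the start. Without the commit statement, the job never records which Amazon S3 objects it has already processed, so each subsequent run starts from the same bookmark state and reads the same files again, even though the writes to Amazon Redshift succeed.

Option A is incorrect because the job is already reading from Amazon S3 and writing to Amazon Redshift successfully, so missing permissions are not the likely cause of the reprocessing. Option B is incorrect because a maximum concurrency of 1 is the setting that job bookmarks expect; it is concurrent runs of the same job that can interfere with bookmarks. Option C is incorrect because the Glue version does not determine whether bookmark state is saved at the end of a run.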
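
The effect of a missing commit statement can be illustrated with a toy model in plain Python. This is only a sketch: ToyBookmark and run_job are hypothetical names invented for illustration and do not use the AWS Glue API. In a real Glue script, the corresponding call is job.commit() on the Job object from awsglue.job, made after job.init().

```python
class ToyBookmark:
    """Hypothetical stand-in for persisted bookmark state (not the Glue API)."""

    def __init__(self):
        # Names of input files recorded as processed by earlier runs.
        self.committed = set()


def run_job(files, bookmark, commit):
    """Simulate one job run and return the files it processed.

    Only files absent from the bookmark are processed. The bookmark is
    updated only when commit is True, mirroring a script that ends with
    a commit statement.
    """
    pending = [f for f in files if f not in bookmark.committed]
    # A real job would transform `pending` and write it to the target here.
    if commit:
        bookmark.committed.update(pending)
    return pending


# Run 1 sees two files; run 2 sees the same two plus one new file.
first_batch = ["a.csv", "b.csv"]
second_batch = first_batch + ["c.csv"]

# Without a commit, nothing is recorded, so run 2 reprocesses everything.
no_commit = ToyBookmark()
run_job(first_batch, no_commit, commit=False)
reprocessed = run_job(second_batch, no_commit, commit=False)

# With a commit, run 2 picks up only the new file.
with_commit = ToyBookmark()
run_job(first_batch, with_commit, commit=True)
incremental = run_job(second_batch, with_commit, commit=True)

print(reprocessed)
print(incremental)
```

The first list contains all three files, which matches the symptom in the question; the second contains only the file that arrived after the first run.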