AWS Certified Data Analytics – Specialty — Question 29

A company has developed several AWS Glue jobs to validate and transform its data from Amazon S3 and load it into Amazon RDS for MySQL in batches once every day. The ETL jobs read the S3 data using a DynamicFrame. Currently, the ETL developers are experiencing challenges in processing only the incremental data on every run, as the AWS Glue job processes all the S3 input data on each run.
Which approach would allow the developers to solve the issue with minimal coding effort?

Answer options

Correct answer: B

Explanation

Job bookmarks let AWS Glue persist state about which data a previous run has already processed, so each subsequent run handles only the new, incremental data. Enabling them is a job-level setting that requires little or no change to the existing ETL logic, which is why this approach meets the "minimal coding effort" requirement. The other options either require more custom code or do not address the problem: switching from a DynamicFrame to a DataFrame does not by itself provide incremental processing, and deleting processed data from S3 may not be feasible or desirable.
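
As a rough illustration of how small the change is, the sketch below builds a job definition that turns bookmarks on through the `--job-bookmark-option` job parameter. The helper `build_job_definition`, the job name, the role ARN, and the script location are hypothetical placeholders and not part of the question; the dictionary keys follow the shape that the boto3 Glue `create_job` call accepts.

```python
# Minimal sketch: a Glue job definition with job bookmarks enabled.
# All names, ARNs, and paths below are hypothetical placeholders.

BOOKMARK_KEY = "--job-bookmark-option"
BOOKMARK_VALUES = {
    "job-bookmark-enable",   # track processed data between runs
    "job-bookmark-disable",  # always process the full input
    "job-bookmark-pause",    # process incrementally without updating state
}


def build_job_definition(name, role_arn, script_location,
                         bookmark="job-bookmark-enable"):
    """Return keyword arguments describing a Glue ETL job."""
    if bookmark not in BOOKMARK_VALUES:
        raise ValueError(f"unknown bookmark option: {bookmark!r}")
    return {
        "Name": name,
        "Role": role_arn,
        "Command": {"Name": "glueetl", "ScriptLocation": script_location},
        # The bookmark setting is passed as a default job argument.
        "DefaultArguments": {BOOKMARK_KEY: bookmark},
    }


job_definition = build_job_definition(
    name="daily-s3-to-mysql",                                # hypothetical
    role_arn="arn:aws:iam::123456789012:role/GlueJobRole",  # hypothetical
    script_location="s3://example-bucket/scripts/etl.py",   # hypothetical
)

# With credentials configured, this could be submitted as:
#   boto3.client("glue").create_job(**job_definition)
print(job_definition["DefaultArguments"])
```

For bookmarks to take effect, the job script itself also has to call `job.init(...)` at the start and `job.commit()` at the end, and each DynamicFrame read from S3 needs a `transformation_ctx` value so that Glue can key the saved state to that source. Scripts generated by AWS Glue typically include these already.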