A data engineer is building a data pipeline. A large data file is uploaded to an Amazon S…

Question

A data engineer is building a data pipeline. A large data file is uploaded to an Amazon S3 bucket once each day at unpredictable times. An AWS Glue workflow uses hundreds of workers to process the file and load the data into Amazon Redshift. The company wants to process the file as quickly as possible. Which solution will meet these requirements?

Accepted Answer

Correct answer: B. Create an event-based AWS Glue trigger to start the workflow. Configure Amazon S3 to log events to AWS CloudTrail. Create a rule in Amazon EventBridge to forward PutObject events to the AWS Glue trigger.

B is correct because an event-based AWS Glue trigger starts the workflow as soon as the file is uploaded to the S3 bucket, so processing begins without waiting for a poll or a schedule. This matters because the upload time is unpredictable. Option A relies on a Lambda function that checks periodically, which can delay the start until the next check. Option C uses a schedule, which cannot react promptly to an upload at an unknown time. Option D involves AWS DMS, which this workflow does not need.
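The setup described in option B can be sketched with boto3. This is a minimal illustration, not a definitive implementation: the helper names, the trigger and rule names, and the arguments (bucket, workflow, job, ARNs) are all hypothetical, and it assumes a CloudTrail trail already records S3 data events for the bucket. The batch size of 1 is an assumption chosen so that a single upload starts the workflow.

```python
import json


def build_event_pattern(bucket_name, event_names=("PutObject",)):
    """Build an EventBridge pattern for S3 API calls recorded by CloudTrail."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["AWS API Call via CloudTrail"],
        "detail": {
            "eventSource": ["s3.amazonaws.com"],
            "eventName": list(event_names),
            "requestParameters": {"bucketName": [bucket_name]},
        },
    }


def build_glue_trigger_request(workflow_name, job_name):
    """Build the arguments for an event-based start trigger on a Glue workflow."""
    return {
        "Name": f"{workflow_name}-start-on-upload",  # hypothetical trigger name
        "WorkflowName": workflow_name,
        "Type": "EVENT",
        "Actions": [{"JobName": job_name}],
        # Assumption: fire on the first matching event rather than batching.
        "EventBatchingCondition": {"BatchSize": 1},
    }


def wire_up(bucket_name, workflow_name, job_name, workflow_arn, role_arn):
    """Create the Glue trigger and the EventBridge rule that targets the workflow."""
    import boto3  # imported here so the builders above run without AWS access

    glue = boto3.client("glue")
    events = boto3.client("events")

    glue.create_trigger(**build_glue_trigger_request(workflow_name, job_name))

    rule_name = f"{workflow_name}-s3-upload"  # hypothetical rule name
    events.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(build_event_pattern(bucket_name)),
        State="ENABLED",
    )
    # The role must allow EventBridge to notify the Glue workflow.
    events.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "glue-workflow", "Arn": workflow_arn, "RoleArn": role_arn}],
    )
```

Calling `build_event_pattern("my-bucket")` returns a pattern that matches only PutObject calls on that bucket. Large files are often sent as multipart uploads, which CloudTrail records as CompleteMultipartUpload rather than PutObject, so passing `event_names=("PutObject", "CompleteMultipartUpload")` may be worth considering.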

AWS Certified Data Engineer – Associate (DEA-C01) — Question 222
