AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 75
A company regularly receives new training data from the vendor of an ML model. The vendor delivers cleaned and prepared data to the company's Amazon S3 bucket every 3-4 days.
The company has an Amazon SageMaker pipeline to retrain the model. An ML engineer needs to implement a solution to run the pipeline when new data is uploaded to the S3 bucket.
Which solution will meet these requirements with the LEAST operational effort?
Answer options
- A. Create an S3 Lifecycle rule to transfer the data to the SageMaker training instance and to initiate training.
- B. Create an AWS Lambda function that scans the S3 bucket. Program the Lambda function to initiate the pipeline when new data is uploaded.
- C. Create an Amazon EventBridge rule that has an event pattern that matches the S3 upload. Configure the pipeline as the target of the rule.
- D. Use Amazon Managed Workflows for Apache Airflow (Amazon MWAA) to orchestrate the pipeline when new data is uploaded.
Correct answer: C
Explanation
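The correct answer is C. Once EventBridge notifications are enabled on the bucket, Amazon S3 sends "Object Created" events to Amazon EventBridge, and a rule can name a SageMaker pipeline directly as its target, so the trigger needs only configuration and no custom code. Option A is incorrect because S3 Lifecycle rules only transition objects between storage classes or expire them; they cannot move data to a training instance or start a training job. Option B can work, but the ML engineer would have to write, deploy, schedule, and maintain the Lambda function, along with the logic that decides which objects are new. Option D also works, but Amazon MWAA requires provisioning and operating an Airflow environment, which is more orchestration than this single-trigger requirement needs.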
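
The setup described in option C can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names, rule name, target ID, and key prefix are hypothetical, and the sketch assumes an EventBridge client (for example `boto3.client("events")`) is passed in, that the bucket already has EventBridge notifications enabled, and that `role_arn` refers to an IAM role EventBridge can assume to start the pipeline.

```python
import json


def build_event_pattern(bucket_name, key_prefix=""):
    """Return an EventBridge pattern that matches S3 "Object Created" events."""
    detail = {"bucket": {"name": [bucket_name]}}
    if key_prefix:
        # Optional: only match objects under a given prefix (hypothetical layout).
        detail["object"] = {"key": [{"prefix": key_prefix}]}
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": detail,
    }


def build_pipeline_target(pipeline_arn, role_arn, target_id="retrain-pipeline"):
    """Return a rule target that points at the SageMaker pipeline.

    target_id is an arbitrary label chosen for this sketch.
    """
    return {"Id": target_id, "Arn": pipeline_arn, "RoleArn": role_arn}


def create_retrain_rule(events_client, rule_name, bucket_name,
                        pipeline_arn, role_arn, key_prefix=""):
    """Create the rule and attach the pipeline as its target.

    events_client is assumed to expose put_rule and put_targets,
    as the boto3 EventBridge ("events") client does.
    """
    pattern = build_event_pattern(bucket_name, key_prefix)
    events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(pattern),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[build_pipeline_target(pipeline_arn, role_arn)],
    )
    return pattern
```

Because every uploaded object emits its own event, a vendor delivery made up of many files would start one pipeline run per file. Narrowing the pattern with `key_prefix` to a single marker object per delivery is one possible way to avoid that, assuming the vendor's upload layout allows it.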