AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 63
A company has an ML model that generates text descriptions based on images that customers upload to the company's website. The images can be up to 50 MB in total size.
An ML engineer decides to store the images in an Amazon S3 bucket. The ML engineer must implement a processing solution that can scale to accommodate changes in demand.
Which solution will meet these requirements with the LEAST operational overhead?
Answer options
- A. Create an Amazon SageMaker batch transform job to process all the images in the S3 bucket.
- B. Create an Amazon SageMaker Asynchronous Inference endpoint and a scaling policy. Run a script to make an inference request for each image.
- C. Create an Amazon Elastic Kubernetes Service (Amazon EKS) cluster that uses Karpenter for auto scaling. Host the model on the EKS cluster. Run a script to make an inference request for each image.
- D. Create an AWS Batch job that uses an Amazon Elastic Container Service (Amazon ECS) cluster. Specify a list of images to process for each AWS Batch job.
Correct answer: B
Explanation
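The correct answer is B. An Amazon SageMaker Asynchronous Inference endpoint queues incoming requests, reads each input from Amazon S3, and writes the result back to S3. It accepts payloads of up to 1 GB, so images of up to 50 MB fit within its limits, and a scaling policy lets the endpoint add or remove instances as the request backlog changes. SageMaker manages the underlying infrastructure, which keeps operational overhead low. Option A is less suitable because a batch transform job processes a fixed set of objects each time it runs; it does not respond continuously to new uploads, so the engineer would have to schedule or trigger jobs as demand changes. Options C and D require the engineer to build and operate additional infrastructure (an EKS cluster with Karpenter, or AWS Batch compute on an ECS cluster, along with container images and job definitions), which adds operational overhead.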
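The approach in option B can be sketched with boto3 as follows. This is a minimal illustration, not a reference implementation: the endpoint name, variant name, bucket, prefix, capacity limits, cooldowns, and target backlog value are all hypothetical, and the sketch assumes the asynchronous endpoint already exists. The scaling policy tracks the `ApproximateBacklogSizePerInstance` metric, which is a common choice for asynchronous endpoints.

```python
"""Sketch of option B: one async inference request per uploaded image,
plus a backlog-based scaling policy. All names below are hypothetical."""

ENDPOINT_NAME = "image-caption-async"  # hypothetical endpoint name
VARIANT_NAME = "AllTraffic"            # hypothetical production variant name
BUCKET = "example-upload-bucket"       # hypothetical S3 bucket
PREFIX = "uploads/"                    # hypothetical key prefix


def resource_id(endpoint_name, variant_name):
    """Resource ID format that Application Auto Scaling uses for endpoint variants."""
    return f"endpoint/{endpoint_name}/variant/{variant_name}"


def scaling_policy_args(endpoint_name, variant_name, target_backlog=5.0):
    """Build keyword arguments for put_scaling_policy (target tracking on backlog)."""
    return {
        "PolicyName": f"{endpoint_name}-backlog-tracking",
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id(endpoint_name, variant_name),
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_backlog,  # assumed queued requests per instance
            "CustomizedMetricSpecification": {
                "MetricName": "ApproximateBacklogSizePerInstance",
                "Namespace": "AWS/SageMaker",
                "Dimensions": [{"Name": "EndpointName", "Value": endpoint_name}],
                "Statistic": "Average",
            },
            "ScaleInCooldown": 600,   # assumed values, in seconds
            "ScaleOutCooldown": 300,
        },
    }


def submit_images(runtime_client, endpoint_name, bucket, keys,
                  content_type="image/jpeg"):
    """Queue one asynchronous request per image; return the S3 output locations."""
    output_locations = []
    for key in keys:
        response = runtime_client.invoke_endpoint_async(
            EndpointName=endpoint_name,
            InputLocation=f"s3://{bucket}/{key}",  # input is read from S3
            ContentType=content_type,
        )
        output_locations.append(response["OutputLocation"])
    return output_locations


def main():
    """Wire the helpers to real AWS clients (requires boto3 and credentials)."""
    import boto3

    autoscaling = boto3.client("application-autoscaling")
    autoscaling.register_scalable_target(
        ServiceNamespace="sagemaker",
        ResourceId=resource_id(ENDPOINT_NAME, VARIANT_NAME),
        ScalableDimension="sagemaker:variant:DesiredInstanceCount",
        MinCapacity=1,  # assumed bounds
        MaxCapacity=4,
    )
    autoscaling.put_scaling_policy(**scaling_policy_args(ENDPOINT_NAME, VARIANT_NAME))

    # Collect the uploaded image keys, then queue one request per image.
    s3 = boto3.client("s3")
    keys = []
    for page in s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET, Prefix=PREFIX):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))

    runtime = boto3.client("sagemaker-runtime")
    for location in submit_images(runtime, ENDPOINT_NAME, BUCKET, keys):
        print(location)
```

Calling `main()` in an environment with AWS credentials would register the scaling policy and queue the requests. Because each request only passes an S3 location, the script never loads the image bytes itself; the endpoint reads the input from S3 and writes each result to the returned output location.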