AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 171
A company needs to deploy a custom-trained classification ML model on AWS. The model must make near real-time predictions with low latency and must handle variable request volumes.
Which solution will meet these requirements?
Answer options
- A. Create an Amazon SageMaker AI batch transform job to process inference requests in batches.
- B. Use Amazon API Gateway to receive prediction requests. Use an Amazon S3 bucket to host and serve the model.
- C. Deploy an Amazon SageMaker AI endpoint. Configure auto scaling for the endpoint.
- D. Launch AWS Deep Learning AMIs (DLAMI) on two Amazon EC2 instances. Run the instances behind an Application Load Balancer.
Correct answer: C
Explanation
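The correct answer is C. An Amazon SageMaker AI real-time endpoint hosts the model behind a persistent HTTPS endpoint and returns predictions synchronously, which meets the low-latency, near real-time requirement. Configuring auto scaling lets the endpoint add or remove instances as request volume changes, which covers the variable load. Option A is incorrect because batch transform runs as an offline job over a whole dataset and does not serve individual requests in near real time. Option B is incorrect because Amazon S3 is object storage: it can hold the model artifact but cannot run inference, so API Gateway would have no compute behind it to produce predictions. Option D is incorrect because two fixed EC2 instances behind an Application Load Balancer do not scale with demand as described, and the company would have to build and operate the model serving stack and any scaling logic itself.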
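To make option C concrete, here is a minimal sketch of how auto scaling might be attached to an existing SageMaker endpoint variant through Application Auto Scaling. The endpoint name, the `AllTraffic` variant name, the capacity limits, the policy name, and the target of 100 invocations per instance are all illustrative assumptions, not values from the question. The helper functions are hypothetical; only the request fields and the `register_scalable_target` and `put_scaling_policy` calls belong to the Application Auto Scaling API.

```python
def scalable_target_params(endpoint_name, variant_name="AllTraffic",
                           min_instances=1, max_instances=4):
    """Build the request that registers an endpoint variant as a scalable target.

    The variant name and capacity limits are assumed values for illustration.
    """
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }


def target_tracking_policy_params(target, invocations_per_instance=100.0):
    """Build a target-tracking policy keyed on invocations per instance.

    The target value and policy name are assumptions; tune them from load tests.
    """
    return {
        "PolicyName": "invocations-target-tracking",  # hypothetical name
        "ServiceNamespace": target["ServiceNamespace"],
        "ResourceId": target["ResourceId"],
        "ScalableDimension": target["ScalableDimension"],
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
        },
    }


def configure_auto_scaling(autoscaling_client, endpoint_name, **target_options):
    """Register the variant, then attach the scaling policy to it.

    `autoscaling_client` is expected to be an Application Auto Scaling client.
    """
    target = scalable_target_params(endpoint_name, **target_options)
    policy = target_tracking_policy_params(target)
    autoscaling_client.register_scalable_target(**target)
    autoscaling_client.put_scaling_policy(**policy)
    return target, policy
```

With boto3 installed and credentials configured, this could be called as `configure_auto_scaling(boto3.client("application-autoscaling"), "my-endpoint", max_instances=6)`, where `my-endpoint` stands in for an endpoint that has already been deployed. Target tracking is only one possible policy type; it is used here because it adjusts the instance count automatically as request volume rises and falls.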