AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 66
A company needs to host a custom ML model to perform forecast analysis. The forecast analysis will occur with predictable and sustained load during the same 2-hour period every day.
Multiple invocations during the analysis period will require quick responses. The company needs AWS to manage the underlying infrastructure and any auto scaling activities.
Which solution will meet these requirements?
Answer options
- A. Schedule an Amazon SageMaker batch transform job by using AWS Lambda.
- B. Configure an Auto Scaling group of Amazon EC2 instances to use scheduled scaling.
- C. Use Amazon SageMaker Serverless Inference with provisioned concurrency.
- D. Run the model on an Amazon Elastic Kubernetes Service (Amazon EKS) cluster on Amazon EC2 with pod auto scaling.
Correct answer: C
Explanation
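C is correct. Amazon SageMaker Serverless Inference is fully managed: AWS provisions and scales the compute, so the company operates no instances. Adding provisioned concurrency keeps a set number of execution environments initialized, which avoids cold starts and gives low-latency responses to multiple invocations during the predictable 2-hour window. Option A is incorrect because batch transform processes a dataset offline and does not return low-latency responses to individual requests. Option B is incorrect because, although scheduled scaling fits the predictable window, the company would still have to manage the EC2 instances, the Auto Scaling group, and the model-serving stack itself. Option D is incorrect for the same reason: an EKS cluster on EC2 leaves node management and pod auto scaling configuration to the company.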
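
To make option C concrete, here is a minimal Python sketch of the endpoint configuration it implies. It builds the request for the boto3 SageMaker `create_endpoint_config` call, with a `ServerlessConfig` that sets `ProvisionedConcurrency`. The model name, config name, memory size, and concurrency values are hypothetical placeholders and do not come from the question. The validation limits reflect the documented memory sizes and the rule that provisioned concurrency cannot exceed maximum concurrency, as I understand them; check the current service quotas before relying on them.

```python
from typing import Any, Dict

# Memory sizes accepted by SageMaker Serverless Inference (assumed from the docs).
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def build_serverless_endpoint_config(
    model_name: str,
    config_name: str,
    memory_mb: int = 2048,
    max_concurrency: int = 20,
    provisioned_concurrency: int = 10,
) -> Dict[str, Any]:
    """Build kwargs for sagemaker.create_endpoint_config (serverless variant)."""
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {ALLOWED_MEMORY_MB}")
    if not 1 <= provisioned_concurrency <= max_concurrency:
        raise ValueError("provisioned_concurrency must be between 1 and max_concurrency")

    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                # No InstanceType or InitialInstanceCount: AWS manages the compute.
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                    # Pre-initialized capacity that avoids cold starts.
                    "ProvisionedConcurrency": provisioned_concurrency,
                },
            }
        ],
    }


def create_serverless_endpoint(model_name: str, config_name: str, endpoint_name: str) -> None:
    """Create the endpoint. Requires boto3, AWS credentials, and an existing model."""
    import boto3  # imported lazily so the builder above works without boto3 installed

    sagemaker = boto3.client("sagemaker")
    sagemaker.create_endpoint_config(
        **build_serverless_endpoint_config(model_name, config_name)
    )
    sagemaker.create_endpoint(
        EndpointName=endpoint_name, EndpointConfigName=config_name
    )
```

Calling `create_serverless_endpoint("forecast-model", "forecast-config", "forecast-endpoint")` (all three names hypothetical) would create the endpoint against a real account. The builder is kept separate so the configuration can be inspected or validated without making any AWS calls.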