Question

An ML engineer needs to implement a solution to host a trained ML model. The rate of requests to the model will be inconsistent throughout the day.
The ML engineer needs a scalable solution that minimizes costs when the model is not in use. The solution also must maintain the model's capacity to respond to requests during times of peak usage.
Which solution will meet these requirements?

Accepted Answer

Correct answer: D. Deploy the model to an Amazon SageMaker endpoint. Create SageMaker endpoint auto scaling policies that are based on Amazon CloudWatch metrics to adjust the number of instances dynamically.

Option D is correct because auto scaling adjusts the number of instances behind the endpoint to match actual demand: the endpoint scales in during quiet periods to reduce cost and scales out during peaks to stay responsive. Option A is incorrect because fixed concurrency does not adapt to varying request volumes. Option B is incorrect because it maintains a static number of tasks and therefore cannot scale. Option C is incorrect because it provides no auto scaling, which is essential for handling fluctuating request rates.
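SageMaker endpoint auto scaling is configured through the Application Auto Scaling API: the endpoint variant is registered as a scalable target, and a scaling policy is then attached to it. The following is a minimal sketch of that setup, not a definitive implementation. The endpoint name, variant name, capacity limits, target value, cooldowns, and policy name are all illustrative assumptions, and the helper functions are hypothetical. It uses a target-tracking policy on the predefined CloudWatch metric `SageMakerVariantInvocationsPerInstance`; other policy types and metrics are possible.

```python
from typing import Any, Dict

SERVICE_NAMESPACE = "sagemaker"
SCALABLE_DIMENSION = "sagemaker:variant:DesiredInstanceCount"


def variant_resource_id(endpoint_name: str, variant_name: str) -> str:
    """Build the resource ID that Application Auto Scaling expects."""
    return f"endpoint/{endpoint_name}/variant/{variant_name}"


def scalable_target_request(
    resource_id: str, min_instances: int, max_instances: int
) -> Dict[str, Any]:
    """Parameters for register_scalable_target (capacity limits are assumptions)."""
    return {
        "ServiceNamespace": SERVICE_NAMESPACE,
        "ResourceId": resource_id,
        "ScalableDimension": SCALABLE_DIMENSION,
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }


def target_tracking_policy_request(
    resource_id: str, invocations_per_instance: float
) -> Dict[str, Any]:
    """Parameters for put_scaling_policy using a predefined CloudWatch metric."""
    return {
        "PolicyName": "invocations-target-tracking",  # hypothetical name
        "ServiceNamespace": SERVICE_NAMESPACE,
        "ResourceId": resource_id,
        "ScalableDimension": SCALABLE_DIMENSION,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": invocations_per_instance,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            "ScaleInCooldown": 300,  # seconds; illustrative value
            "ScaleOutCooldown": 60,  # seconds; illustrative value
        },
    }


def apply_auto_scaling(
    client: Any,
    endpoint_name: str,
    variant_name: str,
    min_instances: int = 1,
    max_instances: int = 4,
    invocations_per_instance: float = 100.0,
) -> None:
    """Register the variant as a scalable target, then attach the policy.

    `client` is expected to be an Application Auto Scaling client.
    """
    resource_id = variant_resource_id(endpoint_name, variant_name)
    client.register_scalable_target(
        **scalable_target_request(resource_id, min_instances, max_instances)
    )
    client.put_scaling_policy(
        **target_tracking_policy_request(resource_id, invocations_per_instance)
    )
```

With AWS credentials configured, the sketch could be applied by passing `boto3.client("application-autoscaling")` as `client`, along with the name of an existing endpoint and its production variant. The request builders are kept separate from the API calls so the parameters can be inspected without touching AWS.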

AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 48
