A company uses 10 Reserved Instances of accelerated instance types to serve the current v…

Question

A company uses 10 Reserved Instances of accelerated instance types to serve the current version of an ML model. An ML engineer needs to deploy a new version of the model to an Amazon SageMaker real-time inference endpoint. The solution must use the original 10 instances to serve both versions of the model. The solution also must include one additional Reserved Instance that is available to use in the deployment process. The transition between versions must occur with no downtime or service interruptions. Which solution will meet these requirements?

Accepted Answer

Correct answer: D. Configure a rolling deployment with a rolling batch size of 1.

A rolling deployment with a batch size of 1 updates the endpoint one instance at a time: SageMaker brings up one instance running the new model version, shifts traffic to it, and then retires one instance running the old version. The single additional Reserved Instance covers that one-instance batch, so the fleet never exceeds 11 instances, both versions serve traffic during the transition, and the endpoint stays available throughout. Options A and B shift traffic in ways that could lead to interruptions, and option C focuses on shadow testing rather than deploying to production, so none of them satisfies the requirement to serve both versions during the transition.
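The rolling policy described above could be expressed roughly as follows with boto3. This is a minimal sketch, not the exam's reference solution: the helper names, the 300-second wait interval, and the endpoint and endpoint-config names passed to `start_rolling_update` are assumptions made for illustration. `peak_instance_count` is a simplified model of the rollout (launch a batch, then retire a batch) that shows why one spare instance is enough.

```python
def build_rolling_deployment_config(batch_size=1, wait_seconds=300):
    """Build a DeploymentConfig dict for a SageMaker rolling update.

    wait_seconds is an assumed baking period between batches.
    """
    return {
        "RollingUpdatePolicy": {
            # Replace this many instances per step.
            "MaximumBatchSize": {"Type": "INSTANCE_COUNT", "Value": batch_size},
            # Time to wait after each batch before starting the next one.
            "WaitIntervalInSeconds": wait_seconds,
        }
    }


def peak_instance_count(fleet_size, batch_size):
    """Simplified model: launch a batch of new instances, then retire
    the same number of old ones. Returns the largest fleet size seen."""
    old, new, peak = fleet_size, 0, fleet_size
    while old > 0:
        step = min(batch_size, old)
        new += step                    # new-version instances come up first
        peak = max(peak, old + new)
        old -= step                    # then old-version instances are retired
    return peak


def start_rolling_update(endpoint_name, new_endpoint_config_name):
    """Start the update. Both arguments are hypothetical resource names;
    calling this requires AWS credentials and an existing endpoint."""
    import boto3  # imported here so the helpers above work without boto3

    client = boto3.client("sagemaker")
    return client.update_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=new_endpoint_config_name,
        DeploymentConfig=build_rolling_deployment_config(batch_size=1),
    )
```

Under this model, `peak_instance_count(10, 1)` is 11, which matches the 10 original instances plus the one additional Reserved Instance, whereas replacing the whole fleet in a single batch (`peak_instance_count(10, 10)`) would need 20.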

AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 92
