AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 92
A company uses 10 Reserved Instances of accelerated instance types to serve the current version of an ML model. An ML engineer needs to deploy a new version of the model to an Amazon SageMaker real-time inference endpoint.
The solution must use the original 10 instances to serve both versions of the model. The solution also must include one additional Reserved Instance that is available to use in the deployment process. The transition between versions must occur with no downtime or service interruptions.
Which solution will meet these requirements?
Answer options
- A. Configure a blue/green deployment with all-at-once traffic shifting.
- B. Configure a blue/green deployment with canary traffic shifting and a size of 10%.
- C. Configure a shadow test with a traffic sampling percentage of 10%.
- D. Configure a rolling deployment with a rolling batch size of 1.
Correct answer: D
Explanation
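The correct answer is D because a rolling deployment with a batch size of 1 replaces the fleet one instance at a time. SageMaker provisions one instance with the new model version, shifts traffic to it, and then terminates one old instance, repeating until the whole fleet is updated. At any moment the endpoint runs at most 11 instances (the original 10 plus the 1 additional Reserved Instance), both versions serve traffic during the transition, and the endpoint stays in service throughout. Options A and B do not fit the capacity constraint: a blue/green deployment provisions a complete green fleet alongside the blue fleet before any traffic shifts, so it would need about 10 extra instances regardless of whether traffic moves all at once or through a 10% canary. Option C is a validation technique rather than a deployment: a shadow test mirrors a sample of requests to a shadow variant that runs on its own instances, the shadow variant's responses are not returned to callers, and production traffic never transitions to the new version.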
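The capacity reasoning above can be sketched in Python. This is a minimal illustration, not an official AWS sample: `peak_instances`, `rolling_deployment_config`, and `start_rolling_update` are hypothetical helper names, the 300-second wait interval is an arbitrary assumption, and the peak-capacity model is deliberately simplified (blue/green doubles the fleet, rolling adds one batch). The dictionary keys follow the `DeploymentConfig` shape accepted by the boto3 SageMaker `update_endpoint` call as I understand it; check them against the current API reference before relying on them.

```python
from typing import Optional


def peak_instances(current: int, strategy: str, batch_size: int = 1) -> int:
    """Simplified peak fleet size while a deployment is in progress."""
    if strategy == "blue_green":
        # A full green fleet runs next to the blue fleet until traffic has moved.
        return current * 2
    if strategy == "rolling":
        # Only one batch of new instances exists beyond the original fleet.
        return current + batch_size
    raise ValueError(f"unknown strategy: {strategy}")


def rolling_deployment_config(
    batch_size: int = 1,
    wait_seconds: int = 300,          # assumed bake time between batches
    alarm_name: Optional[str] = None,  # hypothetical CloudWatch alarm name
) -> dict:
    """Build a DeploymentConfig dict for a rolling update (option D)."""
    config = {
        "RollingUpdatePolicy": {
            "MaximumBatchSize": {"Type": "INSTANCE_COUNT", "Value": batch_size},
            "WaitIntervalInSeconds": wait_seconds,
        }
    }
    if alarm_name is not None:
        # Optional: roll back automatically if this alarm fires.
        config["AutoRollbackConfiguration"] = {"Alarms": [{"AlarmName": alarm_name}]}
    return config


def start_rolling_update(endpoint_name: str, new_config_name: str) -> dict:
    """Apply the rolling update; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the helpers above run without it

    client = boto3.client("sagemaker")
    return client.update_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=new_config_name,
        DeploymentConfig=rolling_deployment_config(batch_size=1),
    )


if __name__ == "__main__":
    print(peak_instances(10, "rolling", batch_size=1))  # 11
    print(peak_instances(10, "blue_green"))             # 20
```

With 10 instances in service, the rolling strategy peaks at 11 instances, which matches the single spare Reserved Instance in the scenario, while either blue/green variant peaks at 20.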