AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 153
An ML engineer uses one ML framework to train multiple ML models. The ML engineer needs to optimize the inference costs and host the models on Amazon SageMaker AI.
Which solution will meet these requirements MOST cost-effectively?
Answer options
- A. Create a multi-container inference endpoint for direct invocation.
- B. Create a multi-model inference endpoint for all the models.
- C. Create a multi-container inference endpoint for sequential invocation.
- D. Create a separate single-model inference endpoint for each model.
Correct answer: B
Explanation
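The correct answer is B. Because all the models are trained with the same framework, they can share one serving container, which is what a multi-model endpoint requires. A multi-model endpoint hosts the models behind a single endpoint on shared instances and loads each model from Amazon S3 on demand, so the instance cost is shared across models instead of paid once per model. Options A and C use multi-container endpoints, which are intended for models that need different containers (direct invocation) or for chaining containers into an inference pipeline (sequential invocation); neither is needed when one framework serves every model. Option D provisions separate instances for each model, so cost grows with the number of models.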
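To make option B concrete, the following is a minimal sketch of the request payloads that boto3's `create_model`, `create_endpoint_config`, `create_endpoint`, and `invoke_endpoint` calls would take for a multi-model endpoint. The helper names (`build_multi_model_requests`, `build_invoke_request`, `deploy`), the `-config` and `-endpoint` name suffixes, the variant name, and the default instance type are illustrative assumptions, not part of the question. The key details are `"Mode": "MultiModel"` on the container, a `ModelDataUrl` that points at an S3 prefix holding every model archive, and `TargetModel` at invocation time.

```python
"""Sketch: request payloads for a SageMaker multi-model endpoint.

All helper names and defaults here are hypothetical; only the dictionary
keys correspond to boto3 SageMaker request parameters.
"""


def build_multi_model_requests(name, image_uri, role_arn, s3_model_prefix,
                               instance_type="ml.m5.xlarge", instance_count=1):
    """Build kwargs for create_model, create_endpoint_config, create_endpoint."""
    model = {
        "ModelName": name,
        "ExecutionRoleArn": role_arn,
        "PrimaryContainer": {
            "Image": image_uri,               # one framework container for all models
            "ModelDataUrl": s3_model_prefix,  # S3 prefix that holds every model archive
            "Mode": "MultiModel",             # enables multi-model hosting
        },
    }
    endpoint_config = {
        "EndpointConfigName": f"{name}-config",
        "ProductionVariants": [{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InstanceType": instance_type,
            "InitialInstanceCount": instance_count,
        }],
    }
    endpoint = {
        "EndpointName": f"{name}-endpoint",
        "EndpointConfigName": f"{name}-config",
    }
    return {"model": model, "endpoint_config": endpoint_config, "endpoint": endpoint}


def build_invoke_request(endpoint_name, target_model, body, content_type="text/csv"):
    """Build kwargs for invoke_endpoint; TargetModel selects which archive to use."""
    return {
        "EndpointName": endpoint_name,
        "TargetModel": target_model,  # path relative to the S3 prefix
        "ContentType": content_type,
        "Body": body,
    }


def deploy(reqs):
    """Send the requests to AWS. Requires boto3 and valid credentials."""
    import boto3  # imported lazily so the builders work without boto3 installed

    sm = boto3.client("sagemaker")
    sm.create_model(**reqs["model"])
    sm.create_endpoint_config(**reqs["endpoint_config"])
    sm.create_endpoint(**reqs["endpoint"])
```

With this layout, adding another model is a matter of uploading a new archive under the same S3 prefix and passing its file name as `TargetModel`; no additional endpoint or instance is created, which is where the cost saving in option B comes from.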