A company runs Amazon SageMaker ML models that use accelerated instances. The models requ…

Question

A company runs Amazon SageMaker ML models that use accelerated instances. The models require real-time responses. Each model has different scaling requirements. The company must not allow a cold start for the models. Which solution will meet these requirements?

Accepted Answer

Correct answer: C. Create a SageMaker endpoint. Create an inference component for each model. In the inference component settings, specify the newly created endpoint. Create an auto scaling policy for each inference component. Set the parameter for the minimum number of copies to at least 1.

Option C meets every requirement. A SageMaker real-time endpoint returns synchronous, low-latency responses. Giving each model its own inference component means each model gets its own auto scaling policy, so the models scale independently of one another. Setting the minimum number of copies to at least 1 keeps one copy of every model loaded at all times, so no request has to wait for a cold start. The other options either do not guarantee real-time responses or rely on designs, such as serverless or asynchronous inference, that can introduce cold starts.
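The configuration in option C can be sketched with boto3 roughly as follows. This is a minimal illustration, not a reference setup: the endpoint and model names, the `-ic` naming suffix, the `AllTraffic` variant name, the resource sizes, and the target value are all assumptions made for the example. The helper functions only build the request parameters; `apply_requests` is a hypothetical wrapper showing where they would be passed to the SageMaker and Application Auto Scaling clients.

```python
from typing import Any, Dict


def inference_component_request(endpoint_name: str, model_name: str,
                                accelerators: int = 1,
                                min_memory_mb: int = 1024) -> Dict[str, Any]:
    """Parameters for sagemaker.create_inference_component (one per model)."""
    return {
        "InferenceComponentName": f"{model_name}-ic",  # assumed naming scheme
        "EndpointName": endpoint_name,                 # the shared endpoint
        "VariantName": "AllTraffic",                   # assumed variant name
        "Specification": {
            "ModelName": model_name,
            "ComputeResourceRequirements": {
                "NumberOfAcceleratorDevicesRequired": accelerators,
                "MinMemoryRequiredInMb": min_memory_mb,
            },
        },
        "RuntimeConfig": {"CopyCount": 1},  # start with one loaded copy
    }


def scalable_target_request(component_name: str, min_copies: int = 1,
                            max_copies: int = 4) -> Dict[str, Any]:
    """Parameters for application-autoscaling.register_scalable_target."""
    if min_copies < 1:
        # A minimum of 0 would let the model scale in to zero copies,
        # which reintroduces the cold start the question forbids.
        raise ValueError("min_copies must be at least 1 to avoid cold starts")
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"inference-component/{component_name}",
        "ScalableDimension": "sagemaker:inference-component:DesiredCopyCount",
        "MinCapacity": min_copies,
        "MaxCapacity": max_copies,
    }


def scaling_policy_request(component_name: str,
                           invocations_per_copy: float) -> Dict[str, Any]:
    """Parameters for application-autoscaling.put_scaling_policy."""
    return {
        "PolicyName": f"{component_name}-target-tracking",
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"inference-component/{component_name}",
        "ScalableDimension": "sagemaker:inference-component:DesiredCopyCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType":
                    "SageMakerInferenceComponentInvocationsPerCopy",
            },
            "TargetValue": invocations_per_copy,  # differs per model
        },
    }


def apply_requests(endpoint_name: str, model_name: str,
                   invocations_per_copy: float) -> None:
    """Hypothetical wrapper: sends the requests above to AWS."""
    import boto3  # imported here so the builders work without AWS access

    sagemaker = boto3.client("sagemaker")
    autoscaling = boto3.client("application-autoscaling")
    component = inference_component_request(endpoint_name, model_name)
    name = component["InferenceComponentName"]
    sagemaker.create_inference_component(**component)
    # In practice, wait until the component is in service before scaling it.
    autoscaling.register_scalable_target(**scalable_target_request(name))
    autoscaling.put_scaling_policy(
        **scaling_policy_request(name, invocations_per_copy))
```

Calling `apply_requests` once per model, each with its own target value, gives every model an independent scaling policy on the same endpoint, while `MinCapacity` of at least 1 is what keeps a copy warm.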

AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 104
