A company uses an Amazon SageMaker AI ML model to make real-time inferences. The company…

Question

A company uses an Amazon SageMaker AI ML model to make real-time inferences. The company has configured auto scaling for the Amazon EC2 instances that SageMaker AI uses for the inferences. During times of peak usage, new instances launch before existing instances are fully ready. As a result, the model experiences inefficiencies and delays. Which solution will optimize the scaling process without affecting response times?

Accepted Answer

Correct answer: D. Increase the cooldown period after scale-out activities. During the cooldown period that follows a scale-out activity, auto scaling does not launch additional instances, so a longer cooldown gives the instances that were just launched time to become fully operational before more are added. This reduces the likelihood that instances which are not yet ready end up handling requests, and it avoids launching capacity the endpoint does not need. The other options either do not address the timing issue or introduce changes that complicate the setup without solving the problem.
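As a rough illustration of where this setting lives, the sketch below builds a target tracking policy for a SageMaker endpoint variant through Application Auto Scaling and sets a longer `ScaleOutCooldown`. The endpoint name, variant name, target value, and cooldown values are hypothetical and not taken from the question. The helper functions `build_scaling_policy` and `apply_scaling_policy` are likewise illustrative names, and the variant is assumed to already be registered as a scalable target.

```python
from typing import Any, Dict


def build_scaling_policy(
    endpoint_name: str,
    variant_name: str,
    target_invocations: float = 70.0,   # hypothetical target, tune per workload
    scale_out_cooldown: int = 600,      # seconds; hypothetical "increased" value
    scale_in_cooldown: int = 300,       # seconds; hypothetical value
) -> Dict[str, Any]:
    """Build keyword arguments for Application Auto Scaling's put_scaling_policy."""
    return {
        "PolicyName": f"{endpoint_name}-{variant_name}-target-tracking",
        "ServiceNamespace": "sagemaker",
        "ResourceId": f"endpoint/{endpoint_name}/variant/{variant_name}",
        "ScalableDimension": "sagemaker:variant:DesiredInstanceCount",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_invocations,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
            },
            # A longer scale-out cooldown delays the next scale-out activity,
            # giving recently launched instances time to come into service.
            "ScaleOutCooldown": scale_out_cooldown,
            "ScaleInCooldown": scale_in_cooldown,
        },
    }


def apply_scaling_policy(policy: Dict[str, Any]) -> Dict[str, Any]:
    """Send the policy to AWS. Needs boto3, credentials, and a variant that was
    already registered with register_scalable_target."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("application-autoscaling")
    return client.put_scaling_policy(**policy)


if __name__ == "__main__":
    # "my-endpoint" and "AllTraffic" are placeholder names for illustration.
    policy = build_scaling_policy("my-endpoint", "AllTraffic")
    print(policy["TargetTrackingScalingPolicyConfiguration"]["ScaleOutCooldown"])
```

Keeping the policy construction separate from the AWS call lets the cooldown values be reviewed or unit tested before anything is applied to a live endpoint.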

AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 145

Answer options

Correct answer: D

Explanation