Question

An airline company uses an ML model to adjust ticket prices based on demand. The model runs on Amazon SageMaker real-time endpoints. During previous deployments, the model failed to scale quickly enough when website traffic increased, which caused delays in price adjustments. An ML engineer needs to configure auto scaling for the SageMaker endpoints to respond rapidly to traffic changes. The solution must use target tracking scaling policies. Which configuration will be MOST responsive to sudden changes in traffic?

Accepted Answer

Correct answer: D. Configure auto scaling based on the SageMaker InvocationsPerInstance metric. Configure high-resolution 10-second intervals, and set the default 300-second scale-in cooldown period.

Option D is correct because the target tracking policy follows the SageMaker InvocationsPerInstance metric at high-resolution 10-second intervals, so a traffic spike is detected and a scale-out is triggered sooner. The 300-second scale-in cooldown applies only to removing instances: it does not delay scale-out, and it keeps capacity from being released too early after a spike. The other options either use a standard metric or longer cooldown periods, which would slow the response to sudden traffic increases.
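As a rough illustration, a target tracking policy of this kind could be attached to an endpoint variant through Application Auto Scaling with boto3. This is a minimal sketch under stated assumptions, not the exam's reference configuration: the endpoint name, variant name, policy name, capacity limits, and target value of 100 invocations per instance are all hypothetical. It uses the predefined `SageMakerVariantInvocationsPerInstance` metric type and sets only the 300-second scale-in cooldown. The 10-second high-resolution interval is not configured here, because the source does not say how it is enabled.

```python
from typing import Any, Dict


def endpoint_resource_id(endpoint_name: str, variant_name: str) -> str:
    """Build the Application Auto Scaling resource ID for an endpoint variant."""
    return f"endpoint/{endpoint_name}/variant/{variant_name}"


def build_target_tracking_config(
    target_invocations_per_instance: float,
    scale_in_cooldown_s: int = 300,
) -> Dict[str, Any]:
    """Return a target tracking configuration for InvocationsPerInstance.

    The target value is an assumption; it should come from load testing.
    """
    return {
        "TargetValue": target_invocations_per_instance,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance",
        },
        # Wait this long after a scale-in before removing more instances.
        "ScaleInCooldown": scale_in_cooldown_s,
    }


def apply_policy(
    endpoint_name: str,
    variant_name: str,
    min_capacity: int,
    max_capacity: int,
    config: Dict[str, Any],
) -> None:
    """Register the variant as a scalable target and attach the policy."""
    import boto3  # imported here so the helpers above work without AWS access

    client = boto3.client("application-autoscaling")
    resource_id = endpoint_resource_id(endpoint_name, variant_name)

    client.register_scalable_target(
        ServiceNamespace="sagemaker",
        ResourceId=resource_id,
        ScalableDimension="sagemaker:variant:DesiredInstanceCount",
        MinCapacity=min_capacity,
        MaxCapacity=max_capacity,
    )
    client.put_scaling_policy(
        PolicyName="pricing-invocations-target-tracking",  # hypothetical name
        ServiceNamespace="sagemaker",
        ResourceId=resource_id,
        ScalableDimension="sagemaker:variant:DesiredInstanceCount",
        PolicyType="TargetTrackingScaling",
        TargetTrackingScalingPolicyConfiguration=config,
    )


if __name__ == "__main__":
    # Hypothetical values; printing the config does not call AWS.
    print(build_target_tracking_config(100.0))
```

Calling `apply_policy("pricing-endpoint", "AllTraffic", 1, 10, build_target_tracking_config(100.0))` would register the variant and attach the policy, assuming valid AWS credentials and an existing endpoint with those (hypothetical) names.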

AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 199

Answer options

Correct answer: D

Explanation