A machine learning specialist is running an Amazon SageMaker endpoint using the built-in…

Question

A machine learning specialist is running an Amazon SageMaker endpoint using the built-in object detection algorithm on a P3 instance for real-time predictions in a company's production application. When evaluating the model's resource utilization, the specialist notices that the model is using only a fraction of the GPU.
Which architecture changes would ensure that provisioned resources are being utilized effectively?

Accepted Answer

Correct answer: B. Redeploy the model on an M5 instance. Attach Amazon Elastic Inference to the instance. Option B is correct because the model needs only a fraction of the P3 instance's GPU. An M5 instance is CPU-based, and attaching an Elastic Inference accelerator adds just the amount of GPU-powered inference acceleration the model requires, so the endpoint no longer pays for a full GPU that sits mostly idle. The other options (A, C, D) either leave the provisioned GPU underutilized or switch to an instance type that may not benefit from Elastic Inference.
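As a rough illustration of what option B looks like in practice, the sketch below builds the request for SageMaker's CreateEndpointConfig API with an M5 instance and an Elastic Inference accelerator on the same production variant. The model name, the config-name suffix, the instance size (ml.m5.xlarge), and the accelerator size (ml.eia2.medium) are assumptions chosen for the example, not values from the question. The right sizes depend on the model's measured latency and memory needs.

```python
from typing import Any, Dict


def build_endpoint_config(
    model_name: str,
    instance_type: str = "ml.m5.xlarge",      # assumed CPU instance size
    accelerator_type: str = "ml.eia2.medium",  # assumed Elastic Inference size
) -> Dict[str, Any]:
    """Return keyword arguments for SageMaker's CreateEndpointConfig call.

    The config name suffix is a hypothetical naming choice for this sketch.
    """
    return {
        "EndpointConfigName": f"{model_name}-m5-ei-config",
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InitialInstanceCount": 1,
                # CPU host instance replaces the underused P3 GPU instance.
                "InstanceType": instance_type,
                # Accelerator attached to the variant for inference.
                "AcceleratorType": accelerator_type,
            }
        ],
    }


def create_endpoint_config(model_name: str) -> None:
    """Send the request; needs AWS credentials and an existing SageMaker model."""
    import boto3  # imported lazily so the builder above works without boto3

    client = boto3.client("sagemaker")
    client.create_endpoint_config(**build_endpoint_config(model_name))
```

Calling build_endpoint_config("object-detection-model") returns the request dictionary without contacting AWS, which makes it easy to inspect before an endpoint is created or updated to use the new configuration.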

AWS Certified Machine Learning – Specialty — Question 100
