AWS Certified Machine Learning – Specialty — Question 100
A machine learning specialist is running an Amazon SageMaker endpoint using the built-in object detection algorithm on a P3 instance for real-time predictions in a company's production application. When evaluating the model's resource utilization, the specialist notices that the model is using only a fraction of the GPU.
Which architecture changes would ensure that provisioned resources are being utilized effectively?
Answer options
- A. Redeploy the model as a batch transform job on an M5 instance.
- B. Redeploy the model on an M5 instance. Attach Amazon Elastic Inference to the instance.
- C. Redeploy the model on a P3dn instance.
- D. Deploy the model onto an Amazon Elastic Container Service (Amazon ECS) cluster using a P3 instance.
Correct answer: B
Explanation
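Option B is correct because the model needs only a fraction of a full GPU. Redeploying the endpoint on a CPU-based M5 instance and attaching an Amazon Elastic Inference accelerator provides a smaller, right-sized amount of GPU-powered inference acceleration at lower cost than a dedicated P3 GPU, while still serving real-time predictions. Option A does not meet the real-time requirement, because batch transform is meant for offline inference, and an M5 instance on its own has no GPU acceleration. Option C moves to a larger, more expensive GPU instance, which would leave even more capacity idle. Option D keeps the same P3 instance and only changes where the model is hosted, so the GPU stays underutilized.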
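As a rough illustration of option B, the sketch below builds the request for the SageMaker `CreateEndpointConfig` API, where the `AcceleratorType` field on a production variant is what attaches an Elastic Inference accelerator. The model name, endpoint config name, variant name, and the specific instance and accelerator sizes are assumptions chosen for illustration; they are not part of the question.

```python
from typing import Any, Dict


def build_endpoint_config(
    model_name: str,
    config_name: str = "object-detection-ei-config",  # hypothetical name
    instance_type: str = "ml.m5.large",               # CPU host (assumed size)
    accelerator_type: str = "ml.eia2.medium",         # EI accelerator (assumed size)
) -> Dict[str, Any]:
    """Return keyword arguments for SageMaker's create_endpoint_config call.

    The variant runs on a CPU instance, and AcceleratorType requests an
    Elastic Inference accelerator sized closer to the model's real GPU needs.
    """
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": 1,
                "AcceleratorType": accelerator_type,
            }
        ],
    }


def create_endpoint_config(sagemaker_client: Any, model_name: str) -> Any:
    """Send the request using a boto3 SageMaker client supplied by the caller."""
    return sagemaker_client.create_endpoint_config(**build_endpoint_config(model_name))
```

With AWS credentials configured, this could be called as `create_endpoint_config(boto3.client("sagemaker"), "my-object-detection-model")`, where the model name is again hypothetical. Keeping the request builder separate from the API call lets the configuration be inspected without touching an AWS account.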