AWS Certified Machine Learning – Specialty — Question 227

An analytics company has an Amazon SageMaker hosted endpoint for an image classification model. The model is a custom-built convolutional neural network (CNN) and uses the PyTorch deep learning framework. The company wants to increase throughput and decrease latency for customers that use the model.

Which solution will meet these requirements MOST cost-effectively?

Answer options

Correct answer: A

Explanation

Amazon Elastic Inference lets the company attach low-cost, GPU-powered inference acceleration to its SageMaker endpoint, which raises throughput and lowers latency without paying for a full GPU instance. Retraining the CNN with more layers, whether on a larger or a smaller dataset, adds training cost and complexity and does not address throughput or latency; a deeper network generally needs more computation per request, so inference would likely get slower. Choosing an instance with multiple GPUs may improve performance, but it would cost more than adding Elastic Inference.
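
As a rough illustration of answer A, the sketch below shows how an Elastic Inference accelerator might be attached when deploying a PyTorch model with the SageMaker Python SDK. It assumes Elastic Inference is available in the account and Region and that the model has been saved as TorchScript. The helper names (endpoint_deploy_kwargs, deploy_with_elastic_inference), the instance type ml.m5.large, the accelerator type ml.eia2.medium, and the framework version are illustrative assumptions, not values taken from the question.

```python
from typing import Optional


def endpoint_deploy_kwargs(
    instance_type: str = "ml.m5.large",                 # assumed CPU host instance
    accelerator_type: Optional[str] = "ml.eia2.medium",  # assumed accelerator size
    instance_count: int = 1,
) -> dict:
    """Build keyword arguments for a SageMaker Model.deploy call.

    Hypothetical helper: it only assembles a dict and makes no AWS calls.
    """
    kwargs = {
        "initial_instance_count": instance_count,
        "instance_type": instance_type,
    }
    # Omit the accelerator entirely when none is requested.
    if accelerator_type is not None:
        kwargs["accelerator_type"] = accelerator_type
    return kwargs


def deploy_with_elastic_inference(model_data: str, role: str, entry_point: str):
    """Deploy a PyTorch model to an endpoint with an accelerator attached.

    Hypothetical helper: needs the sagemaker package, AWS credentials,
    an S3 model archive, and an IAM role, none of which are given here.
    """
    # Imported lazily so the kwargs helper above works without the SDK.
    from sagemaker.pytorch import PyTorchModel

    model = PyTorchModel(
        model_data=model_data,      # S3 URI of the model archive
        role=role,                  # IAM role ARN used by SageMaker
        entry_point=entry_point,    # inference script with the model loader
        framework_version="1.5.1",  # assumed; pick a version the service supports
        py_version="py3",
    )
    return model.deploy(**endpoint_deploy_kwargs())


if __name__ == "__main__":
    # Only prints the deployment arguments; no endpoint is created.
    print(endpoint_deploy_kwargs())
```

The point of the design is that the host stays a small CPU instance and only the accelerator size changes, which is where the cost advantage over a multi-GPU instance comes from.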