AWS Certified Machine Learning Engineer – Associate (MLA-C01) — Question 119

A company is developing a new online application to gather information from customers. An ML engineer has developed a new ML model that will determine a score for each customer. The model will use the score to determine which product to display to the customer. The ML engineer needs to minimize response-time latency for the model.

How should the ML engineer deploy the application in Amazon SageMaker to meet these requirements?

Answer options

Correct answer: B

Explanation

B is correct because a real-time inference endpoint in Amazon SageMaker is built for low-latency, synchronous responses, which suits an application that must score each customer immediately. Options A and D do not serve real-time requests. Option C may suit some workloads, but it does not offer latency as low as a real-time endpoint.
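
To illustrate what "real-time" means in practice, here is a minimal sketch of a synchronous call to a deployed endpoint through the `sagemaker-runtime` client's `invoke_endpoint` operation. The helper name `score_customer`, the endpoint name, and the JSON payload shape are assumptions made for illustration; the actual request and response formats depend on the model container.

```python
import json


def score_customer(runtime_client, endpoint_name, features):
    """Send one customer's features to a real-time endpoint and return the parsed reply.

    runtime_client is expected to be a boto3 "sagemaker-runtime" client.
    The {"features": [...]} payload is a hypothetical format, not one the
    question specifies.
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=json.dumps({"features": features}),
    )
    # The response body is a stream; read it and decode the JSON reply.
    return json.loads(response["Body"].read())


# Example usage (requires boto3, AWS credentials, and a deployed endpoint;
# "customer-score-endpoint" is a hypothetical name):
#
#   import boto3
#   runtime = boto3.client("sagemaker-runtime")
#   result = score_customer(runtime, "customer-score-endpoint", [34, 2, 0.7])
```

The call blocks until the model returns, so the application can pick a product to display within the same request.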